In late 2023, a team of third-party researchers discovered a troubling glitch in OpenAI’s widely used artificial intelligence model GPT-3.5.
When prompted to repeat certain words a thousand times, the model would do so for a while, then abruptly switch to spitting out incoherent text along with snippets of personal information drawn from its training data, including fragments of names, phone numbers, and email addresses. The researchers who discovered the problem worked with OpenAI to ensure the flaw was fixed before disclosing it publicly. It is just one of scores of problems found in major AI models in recent years.
In a proposal released today,
→ Continue reading at WIRED