Artificial intelligence powerhouse OpenAI has quietly pulled the plug on its AI-detection software, citing a low rate of accuracy.
The OpenAI-developed AI classifier launched on Jan. 31 and aimed to help users, such as teachers and professors, distinguish human-written text from AI-generated text.
However, according to the original blog post announcing the tool, the AI classifier was shut down as of July 20:
“As of July 20, 2023, the AI classifier is no longer available due to its low rate of accuracy.”
The link to the tool is no longer functional, and the note offers only a brief explanation for the shutdown. The company did say, however, that it is researching new, more effective ways of identifying AI-generated content.
“We are working to incorporate feedback and are currently researching more effective provenance techniques for text, and have made a commitment to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated,” the note read.
From the get-go, OpenAI made it clear the detection tool was prone to errors and could not be considered “fully reliable.”
The company said the tool’s limitations included being “very inaccurate” at verifying text of fewer than 1,000 characters and a tendency to “confidently” label text written by humans as AI-generated.
The classifier is the latest of OpenAI’s products to come under scrutiny.
On July 18, researchers from Stanford and UC Berkeley published a study indicating that the performance of OpenAI’s flagship product, ChatGPT, had degraded significantly over time.
We evaluated #ChatGPT‘s behavior over time and found substantial diffs in its responses to the *same questions* between the June version of GPT4 and GPT3.5 and the March versions. The newer versions got worse on some tasks. w/ Lingjiao Chen @matei_zaharia https://t.co/TGeN4T18Fd https://t.co/36mjnejERy pic.twitter.com/FEiqrUVbg6
— James Zou (@james_y_zou) July 19, 2023
The researchers found that between March and June, GPT-4’s ability to accurately identify prime numbers had plummeted from 97.6% to just 2.4%. Additionally, both GPT-3.5 and GPT-4 showed a significant decline in their ability to generate new lines of code.