New AI Models Make More Mistakes, Creating Risk for Marketers
The newest AI tools, built to be smarter, make more factual errors than older versions.
As The New York Times highlights, tests show errors as high as 79% in advanced systems from companies like OpenAI.
This can create problems for marketers who rely on these tools for content and customer service.
Rising Error Rates in Advanced AI Systems
Recent tests reveal a trend: newer AI systems are less accurate than their predecessors.
OpenAI’s latest system, o3, got facts wrong 33% of the time when answering questions about people. That’s twice the error rate of their previous system.
Its o4-mini model performed even worse, with a 48% error rate on the same test.
For general-knowledge questions, the results were:
- OpenAI’s o3 made mistakes 51% of the time
- The o4-mini model was wrong 79% of the time
Similar problems appear in systems from Google and DeepSeek.
Amr Awadallah, CEO of Vectara and former Google executive, tells The New York Times:
“Despite our best efforts, they will always hallucinate. That will never go away.”
Real-World Consequences For Businesses
These aren’t just abstract problems. Real businesses are facing backlash when AI gives wrong information.
Last month, Cursor (a tool for programmers) faced angry customers after its AI support bot falsely claimed users couldn’t use the software on multiple computers.
No such restriction existed. The mistake led to canceled accounts and public complaints.
Cursor’s CEO, Michael Truell, had to step in:
“We have no such policy. You’re of course free to use Cursor on multiple machines.”
Why Reliability Is Declining
Why are newer AI systems less accurate? According to a New York Times report, the answer lies in how they’re built.
Companies like OpenAI have used most of the available internet text for training. Now they’re using “reinforcement learning,” which involves teaching AI through trial and error. This approach helps with math and coding, but seems to hurt factual accuracy.
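To make "trial and error" concrete, here is a minimal, purely illustrative sketch of the idea in Python — a toy "multi-armed bandit" learner, not OpenAI's actual training method. The function name and numbers are invented for this example. The point it demonstrates: a trial-and-error learner converges on whatever behavior earns the most reward, and factual accuracy is only improved if the reward happens to measure it.

```python
import random

def train_bandit(true_rewards, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learning on a multi-armed bandit.

    The learner tries actions, observes noisy rewards, and gradually
    shifts toward whichever action pays best. Returns the index of the
    action it ends up preferring.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)  # learned value of each action
    counts = [0] * len(true_rewards)
    for _ in range(steps):
        if rng.random() < epsilon:            # explore: try something random
            action = rng.randrange(len(true_rewards))
        else:                                  # exploit: use best-known action
            action = estimates.index(max(estimates))
        reward = true_rewards[action] + rng.gauss(0, 0.1)  # noisy feedback
        counts[action] += 1
        # Update the running average estimate for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates.index(max(estimates))

# The learner reliably finds the highest-reward action (index 2 here),
# whether or not that behavior corresponds to being factually correct.
best_action = train_bandit([0.2, 0.5, 0.8])
```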
Researcher Laura Perez-Beltrachini explained:
“The way these systems are trained, they will start focusing on one task—and start forgetting about others.”
Another issue is that newer AI models “think” step-by-step before answering. Each step creates another chance for mistakes.
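The compounding effect is simple arithmetic. Under the simplifying assumption that each reasoning step is independent and equally reliable, the chance an entire chain stays error-free shrinks exponentially with its length:

```python
def chain_accuracy(step_accuracy: float, num_steps: int) -> float:
    """Probability a multi-step chain contains no errors, assuming each
    step is independent with the same per-step accuracy."""
    return step_accuracy ** num_steps

# A model that is 95% accurate per step falls below 78% accuracy
# after five chained steps, and below 60% after ten.
five_steps = chain_accuracy(0.95, 5)    # ~0.774
ten_steps = chain_accuracy(0.95, 10)    # ~0.599
```

Real models don't make independent errors, so this is a back-of-the-envelope illustration rather than a measured property — but it shows why adding reasoning steps can erode overall reliability.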
These findings are concerning for marketers using AI for content, customer service, and data analysis.
AI content with factual errors could hurt your search rankings and brand.
Pratik Verma, CEO of Okahu, tells the New York Times:
“You spend a lot of time trying to figure out which responses are factual and which aren’t. Not dealing with these errors properly basically eliminates the value of AI systems.”
Protecting Your Marketing Operations
Here’s how to safeguard your marketing:
- Have humans review all customer-facing AI content
- Create fact-checking processes for AI-generated material
- Use AI for structure and ideas rather than facts
- Consider AI tools that ground answers in cited sources (a technique called retrieval-augmented generation, or RAG)
- Create clear steps to follow when you spot questionable AI information
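Part of that fact-checking process can be automated. Below is a minimal, hypothetical sketch (every name and pattern here is illustrative, not a real product's API) that scans an AI draft and routes sentences containing claim-like patterns — percentages, dollar figures, years, attributions — to a human reviewer:

```python
import re

# Patterns that often signal a checkable factual claim
CLAIM_PATTERNS = [
    r"\b\d+(\.\d+)?%",       # percentages
    r"\$\d[\d,]*",           # dollar figures
    r"\b(19|20)\d{2}\b",     # years
    r"\baccording to\b",     # attributed claims
]

def flag_for_review(draft: str) -> list[str]:
    """Return sentences from an AI draft that contain claim-like
    patterns and should be sent to a human fact-checker."""
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    return [s for s in sentences
            if any(re.search(p, s, re.IGNORECASE) for p in CLAIM_PATTERNS)]

draft = ("Our tool boosts engagement. According to one study, "
         "click-through rates rose 45% in 2023. Try it free today.")
flagged = flag_for_review(draft)  # flags only the middle sentence
```

A filter like this can't verify anything on its own — it simply narrows the pile of AI output your human reviewers need to check.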
The Road Ahead
Researchers are working on these accuracy problems. OpenAI says it’s “actively working to reduce the higher rates of hallucination” in its newer models.
Marketing teams need their own safeguards while still using AI’s benefits. Companies with strong verification processes will better balance AI’s efficiency with the need for accuracy.
Finding this balance between speed and correctness will remain one of digital marketing’s biggest challenges as AI continues to evolve.
Featured Image: The KonG/Shutterstock