Google’s AI Overviews are wrong millions of times per hour

Google’s AI Overviews are accurate about 91 percent of the time. This sounds good at first, but Tripp Mickle and colleagues report for The New York Times that this still means the search engine delivers tens of millions of incorrect answers every hour. The New York Times commissioned AI startup Oumi to test Google’s system using SimpleQA, an industry-standard benchmark for measuring AI accuracy.
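How a 91 percent accuracy rate still yields tens of millions of errors per hour comes down to scale. The sketch below illustrates the arithmetic; the daily search volume and the share of searches that show an AI Overview are assumptions for illustration (public estimates put Google's volume in the low tens of billions of searches per day), not figures from the article.

```python
# Back-of-the-envelope check of the "tens of millions per hour" claim.
# searches_per_day and overview_share are ASSUMED illustrative values,
# not numbers reported by The New York Times or Google.
searches_per_day = 14_000_000_000   # assumed total Google searches per day
overview_share = 0.3                # assumed share of searches with an AI Overview
error_rate = 1 - 0.91               # 91 percent accuracy reported in the article

overviews_per_hour = searches_per_day * overview_share / 24
wrong_per_hour = overviews_per_hour * error_rate
print(f"{wrong_per_hour:,.0f} incorrect AI Overviews per hour")
```

Under these assumptions the result lands in the mid tens of millions per hour; even halving both assumed inputs still leaves millions of incorrect answers every hour, which is the article's point.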

Oumi tested 4,326 searches twice: first when AI Overviews ran on Gemini 2, then again after Google upgraded the system to Gemini 3. Accuracy improved from 85 to 91 percent. However, the share of correct answers that were “ungrounded,” meaning the cited sources did not fully support the information given, rose from 37 to 56 percent between the two tests.

The sources Google cites raise further concerns. Facebook and Reddit ranked among the four most-cited platforms. Incorrect answers cited Facebook 7 percent of the time, compared to 5 percent for correct ones.

The analysis also revealed specific failure patterns. Google sometimes correctly identifies a source but misreads it. In one case, it linked to a page confirming Yo-Yo Ma’s induction into the Classical Music Hall of Fame while simultaneously stating no such induction existed.

Google disputes the findings, calling Oumi’s methodology flawed. Spokesman Ned Adriance stated the study “doesn’t reflect what people are actually searching on Google.” Google advises users to double-check all AI-generated responses.
