A comprehensive study conducted by the Tow Center for Digital Journalism has found that AI search tools frequently provide incorrect information, fail to accurately cite sources, and often fabricate URLs. According to the report published in the Columbia Journalism Review, researchers tested eight generative search engines and discovered that they collectively provided incorrect answers to more than 60 percent of queries.
The study, which builds on previous research, revealed several concerning patterns across all tested platforms. Most chatbots presented inaccurate answers with alarming confidence, rarely acknowledging knowledge gaps or declining to provide answers when they couldn’t determine the correct information. Surprisingly, premium models like Perplexity Pro ($20/month) and Grok 3 ($40/month) demonstrated higher error rates than their free counterparts, primarily due to their tendency to provide definitive but wrong answers rather than declining to respond.
Another troubling finding was that multiple AI search tools appeared to bypass Robot Exclusion Protocol preferences, which allow publishers to control whether their content can be crawled. The researchers found instances where chatbots correctly answered queries about publishers whose content they shouldn’t have had access to, suggesting they may have disregarded publishers’ preferences.
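To illustrate what compliance would look like, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `robots.txt` rules and the crawler names are illustrative assumptions (GPTBot is OpenAI's published crawler token), not rules from any publisher named in the study:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to opt out of AI
# crawling while still allowing conventional search crawlers.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler performs this check before fetching a page;
# the study suggests some AI tools may skip it.
print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False: blocked
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True: allowed
```

The protocol is purely advisory: nothing technically prevents a crawler from fetching a disallowed URL, which is why apparent violations like those the researchers observed are possible.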
The study also revealed that even when chatbots correctly identified articles, they often failed to properly link to the original sources. In many cases, they directed users to syndicated versions of articles on platforms like Yahoo News or AOL rather than to the original publishers. This practice deprives original sources of proper attribution and potential referral traffic, which is particularly problematic since news publishers rely on this content to monetize their work.
Perhaps most strikingly, licensing deals between AI companies and publishers didn't guarantee more accurate citations. Time magazine, which has deals with both OpenAI and Perplexity, was among the most accurately identified publishers in the dataset. Yet the San Francisco Chronicle, which permits OpenAI's search crawler and is part of Hearst's "strategic content partnership" with the company, was correctly identified by ChatGPT in only one of the ten test cases.
When contacted for comment, OpenAI and Microsoft responded but did not address the specific findings. OpenAI said it supports "publishers and creators by helping 400M weekly ChatGPT users discover quality content through summaries, quotes, clear links, and attribution," while Microsoft said it "respect[s] the robots.txt standard and honors the directions provided by websites."
Mark Howard, Time magazine’s COO, expressed optimism about future improvements despite the current issues, stating that “it’s just going to continue to get better.” However, he cautioned that consumers shouldn’t expect these free products to be 100 percent accurate.
As these tools continue to gain popularity, with nearly one in four Americans now using AI in place of traditional search engines, addressing these citation problems becomes increasingly urgent to ensure accurate information dissemination and fair attribution to content creators.