New AI system OpenScholar helps scientists process research papers

OpenScholar, a new open-source AI system developed by the Allen Institute for AI and the University of Washington, is transforming how researchers analyze scientific literature. As reported by Michael Nuñez for VentureBeat, the system processes over 45 million open-access academic papers to provide citation-backed answers to complex research questions. The AI combines advanced retrieval systems …
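The retrieve-then-cite pattern such a system relies on can be sketched in a few lines. This is a toy illustration, not OpenScholar's actual pipeline: the corpus, the paper IDs, and the word-overlap `score` heuristic are invented stand-ins for a real dense retriever over millions of papers.

```python
# Toy retrieve-then-cite sketch: rank documents against a query, then
# attach the top hits as citations to a drafted answer. Corpus entries
# and scoring are illustrative placeholders, not OpenScholar's method.

CORPUS = {
    "smith2021": "transformer models for protein structure prediction",
    "lee2022": "retrieval augmented generation for scientific question answering",
    "chen2023": "benchmarking large language models on biomedical literature",
}

def score(query: str, doc: str) -> int:
    """Crude relevance signal: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, k: int = 2) -> list:
    """Return the k paper IDs whose text best matches the query."""
    ranked = sorted(CORPUS, key=lambda pid: score(query, CORPUS[pid]), reverse=True)
    return ranked[:k]

def answer_with_citations(query: str) -> str:
    """Draft an answer and append the retrieved papers as inline citations."""
    cites = "".join(f" [{pid}]" for pid in retrieve(query))
    return f"Draft answer grounded in retrieved passages{cites}"

print(answer_with_citations("retrieval for scientific question answering"))
```

In a real system the overlap score would be replaced by a learned retriever and the draft by a language model conditioned on the retrieved passages; the citation-attachment step is what makes answers verifiable.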

Read more

Tech companies develop new AI testing methods as models outgrow existing benchmarks

Leading AI companies are creating new ways to evaluate increasingly sophisticated AI models as current testing methods prove inadequate. According to Cristina Criddle’s report in the Financial Times, companies like OpenAI, Microsoft, Meta, and Anthropic are developing internal benchmarks because their latest AI systems achieve over 90% accuracy on existing public tests. Meta’s generative AI …

Read more

AI-generated images raise concerns about research integrity

AI tools that can generate realistic images are becoming a significant concern for research integrity specialists. The ease with which these tools can create fake scientific figures that are hard to distinguish from real ones raises fears of an increasingly untrustworthy scientific literature, Nature reports. Companies like Proofig and Imagetwin are developing AI-based solutions to …

Read more

New AI math benchmark exposes limitations in advanced reasoning

The FrontierMath benchmark, developed by Epoch AI, presents hundreds of challenging math problems that require deep reasoning and creativity to solve. Despite their growing power, leading AI models such as GPT-4o and Gemini 1.5 Pro solve fewer than 2% of these problems, even with extensive support, according to Epoch AI. The benchmark was created …

Read more

OpenAI and others exploring new strategies to overcome AI improvement slowdown

OpenAI is reportedly developing new strategies to deal with a slowdown in AI model improvements. According to The Information, OpenAI employees testing the company’s next flagship model, code-named Orion, found a smaller improvement than the jump from GPT-3 to GPT-4, suggesting the rate of progress is diminishing. In response, OpenAI has formed a foundations team …

Read more

AI expert warns of limits to current AI approaches

Gary Marcus, a prominent AI expert, argues that pure scaling of AI systems without fundamental architectural changes is reaching a point of diminishing returns. He cites recent comments from venture capitalist Marc Andreessen and editor Amir Efrati confirming that improvements in large language models (LLMs) are slowing down, despite increasing computational resources. Marcus warns that …

Read more

AI debates help identify the truth, new research shows

Two recent studies provide the first empirical evidence that having AI models debate each other can help a human or machine judge discern the truth, Quanta Magazine reports. The approach, first proposed in 2018, involves two expert language models presenting arguments on a given question to a less-informed judge, who then decides …
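The debate protocol described above can be sketched as a simple loop: two expert models argue opposing answers over several rounds, and a weaker judge decides from the transcript alone. The `ask` function below is a hypothetical stub standing in for a real language-model API call, not part of the studies' code.

```python
# Sketch of the 2018 debate protocol: two debaters alternate arguments,
# then a less-informed judge rules based only on the transcript.
# `ask` is a placeholder for a real LLM client.

def ask(role: str, prompt: str) -> str:
    """Stub LLM call; swap in a real model client here."""
    return f"[{role}] {prompt}"

def debate(question: str, answer_a: str, answer_b: str, rounds: int = 2) -> list:
    """Collect alternating arguments from the two debaters."""
    transcript = []
    for r in range(1, rounds + 1):
        transcript.append(ask("debater_A", f"round {r}: defend '{answer_a}' on: {question}"))
        transcript.append(ask("debater_B", f"round {r}: defend '{answer_b}' on: {question}"))
    return transcript

def judge(question: str, transcript: list) -> str:
    """The judge sees only the question and the debate transcript."""
    return ask("judge", question + "\n" + "\n".join(transcript) + "\nWhich side won?")

transcript = debate("Does the passage support claim X?", "yes", "no")
print(judge("Does the passage support claim X?", transcript))
```

The key design choice the studies test is information asymmetry: the debaters may see source material the judge cannot, so the judge must rely on which arguments survive adversarial scrutiny.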

Read more

Deep learning boom fueled by three visionaries pursuing unorthodox ideas

Geoffrey Hinton, Jensen Huang, and Fei-Fei Li were instrumental in launching the deep learning revolution, despite facing skepticism from colleagues, Timothy B. Lee writes. Hinton spent decades championing neural networks and helped develop and popularize the backpropagation algorithm for training them efficiently, as detailed in Cade Metz’s book “Genius Makers.” Huang, CEO of Nvidia, recognized the potential of …

Read more

OmniGen: First unified model for image generation

Researchers have introduced OmniGen, the first diffusion model capable of unifying various image generation tasks within a single framework. Unlike existing models like Stable Diffusion, OmniGen does not require additional modules to handle different control conditions, according to the authors Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan, et al. The model can perform text-to-image …

Read more

SynthID-Text: How well do Google’s watermarks for AI-generated text work?

Google subsidiary DeepMind has introduced SynthID-Text, a system for watermarking text generated by large language models (LLMs). By subtly altering word probabilities during text generation, SynthID-Text embeds a detectable statistical signature without degrading the quality, accuracy, or speed of the output, as described by Pushmeet Kohli and colleagues in the journal Nature. While not foolproof, …
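The probability-biasing idea can be illustrated with a toy scheme. To be clear about the hedge: the sketch below uses a simple "green list" watermark (boosting a pseudorandom subset of the vocabulary keyed on the previous token), a well-known simplification of this family of techniques, not DeepMind's actual tournament-sampling algorithm; the vocabulary, bias value, and detection threshold are invented for illustration.

```python
import hashlib
import random

# Toy probability-biasing watermark. A hash of the previous token seeds a
# PRNG that marks half the vocabulary "green"; generation boosts green
# tokens, and detection counts how often tokens land in their
# predecessor's green set. All parameters here are illustrative.

VOCAB = [f"tok{i}" for i in range(100)]

def green_set(prev_token: str, fraction: float = 0.5) -> set:
    """Derive a pseudorandom 'green' subset of the vocabulary from the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % 2**32
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def sample_watermarked(prev_token: str, probs: dict, bias: float, rng: random.Random) -> str:
    """Boost green-token probabilities, renormalize, and sample the next token."""
    green = green_set(prev_token)
    weights = {t: p * (bias if t in green else 1.0) for t, p in probs.items()}
    total = sum(weights.values())
    tokens = list(weights)
    return rng.choices(tokens, [weights[t] / total for t in tokens])[0]

def green_fraction(tokens: list) -> float:
    """Detector: fraction of tokens falling in their predecessor's green set."""
    hits = sum(t in green_set(prev) for prev, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

# Generate from a uniform base distribution; the bias leaves a statistical
# signature that the detector can measure without seeing the model.
rng = random.Random(0)
uniform = {t: 1 / len(VOCAB) for t in VOCAB}
text = ["tok0"]
for _ in range(200):
    text.append(sample_watermarked(text[-1], uniform, bias=8.0, rng=rng))

print(green_fraction(text))  # far above the ~0.5 expected for unwatermarked text
```

Because the green set is recomputed from the tokens alone, detection needs only the hashing key, not the model, which is what makes such signatures cheap to check at scale.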

Read more