AI debates help identify the truth, new research shows

Two recent studies provide the first empirical evidence that having AI models debate each other can help a human or machine judge discern the truth, reports Nash Weerasekera for Quanta Magazine. The approach, first proposed in 2018, involves two expert language models presenting arguments on a given question to a less-informed judge, who then decides …

Read more
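The debate setup described above can be sketched in a few lines. This is a hypothetical toy, not the researchers' code: the `debater` and `judge` functions are stand-ins for expert LLMs and a weaker judge model, and the judge here is deliberately naive.

```python
def debater(position, round_no):
    """Hypothetical stand-in for an expert LLM arguing for `position`."""
    return f"[{position}] supporting argument, round {round_no}"

def judge(transcript, positions):
    """Hypothetical weak judge. A real judge model would weigh the
    arguments; this stub just counts each side's turns."""
    counts = {p: sum(turn.startswith(f"[{p}]") for turn in transcript)
              for p in positions}
    return max(counts, key=counts.get)

def run_debate(question, positions, rounds=3):
    transcript = [f"Question: {question}"]
    for r in range(1, rounds + 1):
        for p in positions:  # each expert speaks once per round
            transcript.append(debater(p, r))
    return judge(transcript, positions), transcript

verdict, transcript = run_debate("Is the sky blue?", ["yes", "no"])
```

The key property the studies test is that the judge never needs the debaters' expertise: it only has to evaluate the transcript.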

SynthID-Text: How well do Google’s watermarks for AI-generated text work?

Google subsidiary DeepMind has introduced SynthID-Text, a system for watermarking text generated by large language models (LLMs). By subtly altering word probabilities during text generation, SynthID-Text embeds a detectable statistical signature without degrading the quality, accuracy, or speed of the output, as described by Pushmeet Kohli and colleagues in the journal Nature. While not foolproof, …

Read more
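The general idea of probability-biasing watermarks can be illustrated with a simplified "green list" scheme: a keyed hash of the preceding token pseudorandomly marks half the vocabulary as "green", generation nudges sampling toward green tokens, and a detector holding the key counts how often tokens land in their context's green list. This is a generic sketch in the spirit of such watermarks, not SynthID-Text's actual sampling algorithm; all names here are illustrative.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(100)]  # toy vocabulary

def greenlist(prev_token, key="demo-key", frac=0.5):
    """Keyed, pseudorandom 'green' half of the vocabulary for this context."""
    seed = hashlib.sha256((key + prev_token).encode()).digest()
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * frac)))

def generate_watermarked(length, key="demo-key"):
    """Toy generator that always picks green tokens. A real system only
    nudges the probabilities slightly, preserving text quality."""
    rng = random.Random(0)
    tokens = ["tok0"]
    for _ in range(length):
        green = greenlist(tokens[-1], key)
        tokens.append(rng.choice(sorted(green)))
    return tokens

def green_fraction(tokens, key="demo-key"):
    """Detector: fraction of tokens drawn from their context's green list.
    Watermarked text scores well above the ~50% expected by chance."""
    hits = sum(tokens[i] in greenlist(tokens[i - 1], key)
               for i in range(1, len(tokens)))
    return hits / (len(tokens) - 1)
```

Because the detector only needs the key and the text, verification works without access to the generating model.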

DeepMind introduces Talker-Reasoner framework for AI agents

DeepMind researchers have introduced a new agentic framework called Talker-Reasoner, which is inspired by the “two systems” model of human cognition. The framework divides the AI agent into two distinct modules, VentureBeat reports: the Talker, which handles real-time interactions with the user and the environment, and the Reasoner, which performs complex reasoning and planning. The …

Read more
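The two-module split described above can be sketched as a fast Talker that answers from the current belief state while a slower Reasoner updates that state between turns. The structure and names below are illustrative, not DeepMind's API; the Reasoner is stubbed with a trivial rule.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Shared belief state linking the two modules."""
    beliefs: dict = field(default_factory=dict)

def talker(user_msg, memory):
    """System-1 module: quick reply grounded in the beliefs available now."""
    goal = memory.beliefs.get("goal", "unknown")
    return f"(talker) current goal: {goal}; you said: {user_msg}"

def reasoner(user_msg, memory):
    """System-2 module: slow, multi-step planning. Stubbed here as a rule
    that extracts a 'goal' for the Talker to use on later turns."""
    if "sleep" in user_msg:
        memory.beliefs["goal"] = "improve sleep habits"

def agent_turn(user_msg, memory):
    reply = talker(user_msg, memory)   # respond immediately
    reasoner(user_msg, memory)         # update beliefs in the background
    return reply
```

The point of the split is latency: the user never waits on the Reasoner, whose conclusions surface in subsequent replies.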

DeepMind’s Michelangelo tests reasoning in long context windows

DeepMind has introduced the Michelangelo benchmark to evaluate the long-context reasoning capabilities of large language models (LLMs), Ben Dickson reports for VentureBeat. While LLMs can manage extensive context windows, research indicates they struggle with reasoning over complex data structures. Current benchmarks often focus on retrieval tasks, which do not adequately assess a model’s reasoning abilities. …

Read more

DeepMind’s SCoRe makes AI models more reliable

DeepMind has developed a new technique called SCoRe that significantly improves the self-correction abilities of large language models (LLMs), as Ben Dickson reports for VentureBeat. SCoRe trains on self-generated data and enables LLMs to use their internal knowledge to identify and correct errors. In tests, SCoRe clearly outperformed other self-correction methods. The technique …

Read more

Is math the key to reliable AI answers?

Researchers are working on new AI systems that can check their own mathematical answers, Cade Metz reports in the New York Times. The technology is meant to prevent chatbots like ChatGPT from providing incorrect information: rather than merely predicting plausible text, the new systems generate mathematical proofs to verify their answers. Silicon Valley startup Harmonic, for …

Read more

DeepMind JEST speeds up AI training

Researchers at Google DeepMind have developed a new method called JEST that significantly speeds up AI training while reducing energy requirements. By optimizing the selection of training data, JEST cuts the number of training iterations by a factor of 13 and the required computation by a factor of 10.
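The data-selection idea can be sketched with a "learnability" score: prioritize examples the current model still gets wrong but a strong pretrained reference model finds easy. This is a heavily simplified, hypothetical sketch in the spirit of the reported method; the actual JEST technique scores whole batches jointly rather than single examples.

```python
def learnability(example, learner_loss, reference_loss):
    """High score = current model struggles (high learner loss) while the
    reference model does not (low reference loss): worth training on."""
    return learner_loss(example) - reference_loss(example)

def select_training_batch(pool, learner_loss, reference_loss, batch_size):
    """Keep only the highest-scoring examples. Training on fewer,
    better-chosen examples is what cuts iterations and compute."""
    ranked = sorted(
        pool,
        key=lambda ex: learnability(ex, learner_loss, reference_loss),
        reverse=True,
    )
    return ranked[:batch_size]
```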

DeepMind V2A automatically generates audio for videos

Google’s AI research lab DeepMind has developed a new technology called V2A that can automatically generate appropriate soundtracks, sound effects, and even dialogue for videos. While V2A seems promising, DeepMind admits that the quality of the audio generated is not yet perfect. For now, it is not generally available.

Google DeepMind Gecko evaluates image generators

Google DeepMind has developed “Gecko”, a new benchmark for evaluating the capabilities of AI image generators. It is designed to help researchers better understand the strengths and weaknesses of these models and drive their development.

Research shows the usefulness of examples in prompts

Researchers at DeepMind have found that large language models can learn new skills through hundreds or even thousands of examples in the prompt, without the need to fine-tune the model. This method enables companies to quickly prototype and develop AI applications.
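Mechanically, this many-shot approach just means packing a large number of labeled input/output pairs into a single prompt ahead of the query. A minimal sketch, with an illustrative helper and a toy translation task (not a DeepMind API):

```python
def build_many_shot_prompt(examples, query,
                           instruction="Translate English to French."):
    """Concatenate labeled examples ahead of the query so the model can
    infer the task from the prompt alone, with no fine-tuning."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n{shots}\nInput: {query}\nOutput:"

# Hundreds of in-context examples, as in the many-shot regime.
examples = [("cat", "chat"), ("dog", "chien")] * 250  # 500 shots
prompt = build_many_shot_prompt(examples, "house")
```

The practical appeal for companies is that iterating on the example set is far cheaper and faster than running a fine-tuning job for each prototype.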