Google DeepMind researchers predict “Era of Experience” in AI

Google DeepMind’s David Silver and Richard S. Sutton predict a major shift in artificial intelligence development, which they call the “Era of Experience.” In a preprint paper for MIT Press, the researchers argue that AI will increasingly learn from its own experiences rather than human-generated data. The authors suggest that current AI systems, particularly large …

DeepMind’s AlphaGeometry2 outperforms math olympiad gold medalists

Google DeepMind has developed an AI system that surpasses the average performance of gold medalists in solving geometry problems from the International Mathematical Olympiad (IMO). As reported by Kyle Wiggers, the new system called AlphaGeometry2 successfully solved 84% of geometry problems from the past 25 years of IMO competitions. The AI combines a Gemini language …

DeepMind launches new benchmark to test AI model accuracy

Google DeepMind has introduced FACTS Grounding, a new benchmark system to evaluate the factual accuracy of large language models (LLMs). According to Taryn Plumb’s report in VentureBeat, the benchmark tests how well AI models generate accurate responses based on long-form documents. The system includes a public leaderboard on Kaggle, where Gemini 2.0 Flash currently leads …

Google introduces Veo 2 AI video generator

Google DeepMind has announced Veo 2, its latest AI video generation model, positioning it as a direct competitor to OpenAI’s Sora. The new model is currently available through Google Labs’ VideoFX platform on a waitlist basis, with users required to apply through a Google Form for access. According to Google, Veo 2 can generate videos …

Google DeepMind’s Genie 2 generates interactive 3D environments

DeepMind has announced Genie 2, an artificial intelligence model capable of creating playable 3D environments from single images and text prompts. The model, unveiled on December 4, 2024, represents an advancement over its predecessor Genie 1, which was limited to 2D worlds. According to DeepMind, Genie 2 can generate interactive environments that respond to keyboard …

Google DeepMind CEO discusses AI development and company direction

Google DeepMind CEO Demis Hassabis leads the company’s efforts to advance artificial intelligence while balancing research goals with commercial applications. In an extensive interview with Harry McCracken, Hassabis details how the April 2023 merger of DeepMind and Google Brain created a powerhouse AI research organization that now serves as Google’s “engine room.” The combined entity …

AI debates help identify the truth, new research shows

Two recent studies provide the first empirical evidence that having AI models debate each other can help a human or machine judge discern the truth, Quanta Magazine reports. The approach, first proposed in 2018, involves two expert language models presenting arguments on a given question to a less-informed judge, who then decides …

SynthID-Text: How well do Google’s watermarks for AI-generated text work?

Google subsidiary DeepMind has introduced SynthID-Text, a system for watermarking text generated by large language models (LLMs). By subtly altering word probabilities during text generation, SynthID-Text embeds a detectable statistical signature without degrading the quality, accuracy, or speed of the output, as described by Pushmeet Kohli and colleagues in the journal Nature. While not foolproof, …

DeepMind introduces Talker-Reasoner framework for AI agents

DeepMind researchers have introduced a new agentic framework called Talker-Reasoner, which is inspired by the “two systems” model of human cognition. According to VentureBeat, the framework divides the AI agent into two distinct modules: the Talker, which handles real-time interactions with the user and the environment, and the Reasoner, which performs complex reasoning and planning. The …

DeepMind’s Michelangelo tests reasoning in long context windows

DeepMind has introduced the Michelangelo benchmark to evaluate the long-context reasoning capabilities of large language models (LLMs), Ben Dickson reports for VentureBeat. While LLMs can manage extensive context windows, research indicates they struggle with reasoning over complex data structures. Current benchmarks often focus on retrieval tasks, which do not adequately assess a model’s reasoning abilities. …
