Google DeepMind’s David Silver and Richard S. Sutton predict a major shift in artificial intelligence development, which they call the “Era of Experience.” In a preprint of a chapter for a forthcoming MIT Press book, the researchers argue that AI will increasingly learn from its own experience rather than from human-generated data.
The authors suggest that current AI systems, particularly large language models (LLMs), are approaching the limits of what they can learn from human data alone. “The majority of high-quality data sources – those that can actually improve a strong agent’s performance – have either already been, or soon will be consumed,” they write. In their view, this limit stands in the way of truly superhuman intelligence across many domains.
Silver and Sutton propose that future AI systems will need to generate their own data by interacting with their environment. They cite AlphaProof, a program that reached silver-medal-level performance at the International Mathematical Olympiad, as an example of this approach. While initially trained on human-created formal proofs, AlphaProof went on to generate millions more through reinforcement learning and direct interaction with a formal proving system.
The researchers outline four key characteristics that will define this new era:
- First, AI agents will operate in continuous “streams” of experience rather than brief interactions, allowing them to adapt over time and pursue long-term goals.
- Second, they will interact with the real world through rich action and observation spaces beyond simple text input and output.
- Third, they will learn from grounded rewards based on environmental feedback rather than human judgment.
- Fourth, they will develop planning and reasoning methods that may differ significantly from human thinking.
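To make the first and third characteristics concrete, here is a minimal Python sketch of an agent learning from a continuous stream with environment-generated rewards. It is an illustration only and does not come from the paper: the `DriftingBanditEnv` class, the `run_stream` function, and all parameter values are hypothetical, and a two-action bandit with epsilon-greedy exploration stands in for the far richer environments and learning methods the authors have in mind. It does not attempt to show rich action spaces or non-human planning.

```python
import random


class DriftingBanditEnv:
    """Hypothetical toy environment: one endless stream, no episode resets.

    The better of two actions changes partway through the stream, so an
    agent has to keep adapting instead of learning once and stopping.
    """

    def __init__(self, switch_at=2000, seed=0):
        self.rng = random.Random(seed)
        self.switch_at = switch_at
        self.t = 0

    def step(self, action):
        # The reward is produced by the environment itself (a "grounded"
        # signal), not by a human rating the agent's behavior.
        best = 0 if self.t < self.switch_at else 1
        self.t += 1
        p_reward = 0.8 if action == best else 0.2
        return 1.0 if self.rng.random() < p_reward else 0.0


def run_stream(steps=4000, epsilon=0.1, step_size=0.1, seed=0):
    """Run an epsilon-greedy agent over a single uninterrupted stream."""
    env = DriftingBanditEnv(switch_at=steps // 2, seed=seed)
    rng = random.Random(seed + 1)
    values = [0.0, 0.0]  # running estimate of each action's reward
    history = []
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # occasional exploration
        else:
            action = max(range(2), key=lambda a: values[a])
        reward = env.step(action)
        # A constant step size weights recent experience more heavily,
        # which lets the estimates track the change in the environment.
        values[action] += step_size * (reward - values[action])
        history.append((action, reward))
    return values, history


if __name__ == "__main__":
    values, history = run_stream()
    late_actions = [a for a, _ in history[-1000:]]
    print("final value estimates:", values)
    print("share of action 1 in last 1000 steps:",
          sum(late_actions) / len(late_actions))
```

Run as a script, it prints the agent's final value estimates and how often it chose the second action late in the stream. The agent is never told that the environment changed; it picks that up only from the rewards it receives.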
According to the paper, this approach offers several advantages. Experiential agents could provide personalized assistance with health, education, or professional needs. They could accelerate scientific discovery by autonomously designing and conducting experiments. Furthermore, they could adapt to changing environments and self-correct problematic behaviors.
However, the authors acknowledge significant challenges. Autonomous agents pursuing long-term goals with minimal human intervention raise important safety concerns. The researchers suggest that experiential learning might mitigate some risks by allowing agents to recognize when their actions cause human distress and adapt accordingly.
They also note that physical experimentation imposes natural time constraints on potential AI self-improvement. “The development of a new drug, even with AI-assisted design, still requires real-world trials that cannot be completed overnight,” they write.
The paper concludes that the era of experience will mark a pivotal moment in AI evolution. “Experiential data will eclipse the scale and quality of human generated data,” the authors predict, unlocking “new capabilities that surpass those possessed by any human.”
via: VentureBeat