Google introduces fast new AI model using diffusion technology

Google unveiled Gemini Diffusion at its I/O developer conference, marking a significant shift in how AI models generate text. The experimental model uses diffusion instead of the autoregressive, token-by-token approach that powers ChatGPT and similar systems.

The key advantage is speed. Gemini Diffusion generates text at 857 to 2,000 tokens per second, which is four to five times faster than Google’s current fastest public model. Simon Willison, who tested the system, reported that it created an interactive HTML and JavaScript chat application within seconds.
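For a sense of scale, the quoted throughput can be turned into wall-clock time with a back-of-the-envelope calculation. The sketch below is only illustrative: the 1,000-token response length is an assumed figure, not one from Google or Willison, and the helper name is hypothetical.

```python
def seconds_to_generate(num_tokens, tokens_per_second):
    """Time needed to emit num_tokens at a steady generation rate."""
    return num_tokens / tokens_per_second


# Assumed response length for illustration only.
RESPONSE_TOKENS = 1_000

# The two ends of the throughput range quoted in the article.
for rate in (857, 2_000):
    elapsed = seconds_to_generate(RESPONSE_TOKENS, rate)
    print(f"{rate} tokens/s: {elapsed:.2f} s for {RESPONSE_TOKENS} tokens")
```

At either end of the range, a response of that assumed size would arrive in roughly a second or less, which is consistent with an application appearing "within seconds."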

How diffusion differs from traditional AI

Autoregressive models like those behind ChatGPT generate text one token (a word or word fragment) at a time, from left to right. Each new token depends on all the tokens before it, which makes the process sequential and relatively slow.

Diffusion models work differently. They start with random gibberish and refine it step by step into coherent text. This approach allows the model to work on multiple parts of the text simultaneously, resulting in faster generation. The technique resembles sculpting rather than writing.
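The contrast between the two approaches can be sketched with a toy example. This is not how Gemini Diffusion or any real model works internally: the "model" here is a stand-in that already knows the target sentence, and the function names, the three-step schedule, and the random starting tokens are all assumptions made for illustration. The only point is the difference in the number of sequential steps.

```python
import random

TARGET = "the quick brown fox jumps over the lazy dog".split()
VOCAB = sorted(set(TARGET))


def autoregressive_generate(target):
    """Emit one token per step, left to right."""
    output = []
    steps = 0
    for token in target:
        # Stand-in for a next-token prediction conditioned on `output`.
        output.append(token)
        steps += 1
    return output, steps


def diffusion_generate(target, num_steps=3, seed=0):
    """Start from random tokens and fix several positions per step."""
    rng = random.Random(seed)
    # Step 0: pure gibberish of the final (fixed) length.
    output = [rng.choice(VOCAB) for _ in target]
    positions = list(range(len(target)))
    rng.shuffle(positions)
    per_step = -(-len(target) // num_steps)  # ceiling division
    for step in range(num_steps):
        # Each refinement step updates a whole batch of positions at once.
        for pos in positions[step * per_step:(step + 1) * per_step]:
            # Stand-in for a denoiser's prediction at this position.
            output[pos] = target[pos]
    return output, num_steps


ar_text, ar_steps = autoregressive_generate(TARGET)
diff_text, diff_steps = diffusion_generate(TARGET)
print(f"autoregressive: {ar_steps} sequential steps")
print(f"diffusion:      {diff_steps} refinement steps")
```

In this toy, the nine-token sentence takes nine sequential steps one way and three refinement steps the other. Note that the diffusion sketch must fix the output length before it starts.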

Diffusion first became widely known through image generation tools like DALL-E 2 and Stable Diffusion. Until recently, it had not been successfully applied to text generation at this scale.

Performance and limitations

Google claims Gemini Diffusion matches the performance of its Gemini 2.0 Flash-Lite model while operating five times faster. The company specifically highlights strong performance in coding and mathematical reasoning tasks.

However, the technology has trade-offs. Diffusion models can only generate fixed-length text segments and may struggle with longer narratives that require natural flow. For coding tasks, where logic and syntax matter more than narrative flow, these limitations are less problematic.

Industry impact

The development has generated significant interest among AI researchers and developers. Jack Rae from Google DeepMind called it “a landmark moment,” noting that text diffusion models had previously lagged behind traditional approaches in quality.

Stefano Ermon from Stanford University, whose company Inception Labs released a similar diffusion model called Mercury earlier this year, said Google’s entry validates the direction of diffusion-based text generation.

The model could shift the competitive landscape among Google, OpenAI, Anthropic, and other AI companies, particularly in areas like autonomous coding agents. However, questions remain about computational costs and real-world performance.

Currently, Gemini Diffusion remains an experimental research project with limited access through a waitlist. Google has not announced plans for broader public release.

Sources: Simon Willison, Fortune
