Google has launched Gemini 3.5 Flash, a new artificial intelligence model designed to execute complex tasks quickly and at a lower cost. The technology aims to solve a common challenge in the industry, where organizations previously had to choose between fast, cheap models and slower, more capable ones.
Google claims the new model outperforms its previous flagship model, Gemini 3.1 Pro, on multiple benchmarks while generating output four times faster than comparable models. According to Google, an optimized version of the model running within its development platform can operate up to twelve times faster.
Impact on enterprise costs
Google's chief executive, Sundar Pichai, said the new model's efficiency could significantly reduce operational costs for large enterprises. Google estimates that companies processing one trillion tokens per day could save more than one billion dollars annually by routing eighty percent of their workloads to Gemini 3.5 Flash and other models. Tokens are the basic units of data that artificial intelligence models process.
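To put that estimate in per-token terms, the quoted figures can be turned into a rough back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the figures apply to a single organization, that usage is constant across 365 days, and that "over one billion dollars" is treated as a lower bound of exactly one billion. The function name is hypothetical, and the result is a blended figure across "Gemini 3.5 Flash and other models", not a published price.

```python
# Figures quoted in the article (Google's own estimate, not independently verified).
TOKENS_PER_DAY = 1_000_000_000_000   # one trillion tokens per day
ROUTED_SHARE = 0.80                  # eighty percent of workloads rerouted
ANNUAL_SAVINGS_USD = 1_000_000_000   # "over one billion dollars", used as a lower bound

DAYS_PER_YEAR = 365                  # assumption: constant daily usage all year


def implied_saving_per_million_tokens(tokens_per_day, routed_share,
                                      annual_savings_usd,
                                      days_per_year=DAYS_PER_YEAR):
    """Return the average saving in USD per million rerouted tokens
    that the quoted annual saving would imply."""
    routed_tokens_per_year = tokens_per_day * routed_share * days_per_year
    return annual_savings_usd / routed_tokens_per_year * 1_000_000


saving = implied_saving_per_million_tokens(
    TOKENS_PER_DAY, ROUTED_SHARE, ANNUAL_SAVINGS_USD
)
print(f"Implied saving: at least ${saving:.2f} per million rerouted tokens")
```

Under these assumptions the estimate works out to a saving of at least about 3.42 dollars per million rerouted tokens. Changing the routed share or the daily volume shifts the figure proportionally.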
The cost reduction is particularly relevant for agentic workflows, which are autonomous sessions where the technology executes multi-step tasks, writes code, and uses external tools. These workflows typically consume high volumes of tokens, so the price per token has a large effect on their total cost.
Alongside the new model, Google released Antigravity 2.0, a development platform designed to manage teams of autonomous software agents. This platform allows developers to run multiple agents in parallel to perform tasks such as writing code and generating brand assets.
The development of the model was aided by a rapid increase in internal usage. Google reported that its own developers processed over three trillion tokens per day using the platform, creating a feedback loop that helped improve the system.
To support these services, Google announced plans for capital expenditures of approximately 180 billion to 190 billion dollars. The company uses its own custom silicon, specifically its eighth generation of Tensor Processing Units, to train and run these models more efficiently.
Consumer features and safety
Google is also integrating Gemini 3.5 Flash into its consumer products. The model is now the default engine for the Gemini app and the AI Mode in Google Search.
Key features powered by the new model include:
- Gemini Spark, a personal digital assistant that runs continuously to help users manage tasks and emails under their direction.
- Enhanced search features that use information agents to monitor the web.
- Gemini Omni, a model capable of generating video and other media from various inputs.
Google stated that Gemini 3.5 Flash was developed under its safety framework. According to the company, it used new safety training and interpretability tools to analyze the model's reasoning before it generates a response, which it says reduces the likelihood of harmful outputs.
Sources: Google Blog, VentureBeat