Small price, big ambitions: Meet Google’s new workhorse AI model Gemini 3.1 Flash-Lite

Google has released Gemini 3.1 Flash-Lite, a new artificial intelligence model aimed at developers who need to run large volumes of tasks at low cost. The Gemini team writes on the Google blog that the model is now available in preview through the Gemini API, Google AI Studio, and Vertex AI for enterprise users.

The model is priced at $0.25 per million input tokens and $1.50 per million output tokens. According to Artificial Analysis benchmarks, it delivers its first response 2.5 times faster than Gemini 2.5 Flash and generates output 45% faster.
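
To put those prices in perspective, here is a minimal Python sketch of the cost arithmetic. Only the two per-million-token prices come from the announcement; the function name `estimate_cost` and the workload figures (10,000 requests of 800 input and 200 output tokens each) are illustrative assumptions, not published usage data.

```python
# Preview prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for the given token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 10,000 requests, each with 800 input tokens
# and 200 output tokens (assumed figures for illustration only).
requests = 10_000
total = estimate_cost(requests * 800, requests * 200)
print(f"${total:.2f}")  # 8M input tokens -> $2.00, 2M output tokens -> $3.00
```

Under these assumptions the whole batch costs about $5.00, with the output tokens accounting for the larger share despite being a quarter of the input volume.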

On the Arena.ai leaderboard, 3.1 Flash-Lite earns an Elo score of 1,432. It scores 86.9% on the GPQA Diamond benchmark and 76.8% on MMMU Pro, outperforming comparable models from competitors and even some larger earlier Gemini models.

A key feature is adjustable thinking levels. Developers can control how much reasoning the model applies to a given task. This matters for high-frequency workloads such as translation or content moderation, where speed and cost take priority, as well as for more complex tasks like generating user interfaces.
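
One way a developer might use this is to pick a thinking level per task type before sending a request. The sketch below is purely illustrative: the level names "low" and "high", the task labels, and the helper `pick_thinking_level` are all assumptions, since the announcement does not list the actual values the API accepts, and no API call is made here.

```python
# Hypothetical mapping from task type to thinking level.
# The level names are placeholders, not confirmed API values.
THINKING_LEVEL_BY_TASK = {
    "translation": "low",         # high-frequency: favor speed and cost
    "content_moderation": "low",  # high-frequency: favor speed and cost
    "ui_generation": "high",      # more complex: allow more reasoning
}


def pick_thinking_level(task_type: str, default: str = "low") -> str:
    """Return the assumed thinking level for a task, or a cheap default."""
    return THINKING_LEVEL_BY_TASK.get(task_type, default)


print(pick_thinking_level("translation"))
print(pick_thinking_level("ui_generation"))
```

Defaulting unknown task types to the cheapest level mirrors the model's low-cost positioning, though a real deployment might prefer to fail loudly on an unrecognized task instead.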
