Google brings computer use to its main AI model, Gemini 3.5 Flash

Google has built computer use directly into Gemini 3.5 Flash, its latest AI model. Writing for The Keyword, Google’s official blog, Mateo Quiros notes that the capability was previously available only as a standalone model. It is now part of the main Flash model, which makes it accessible to a much broader range of developers and businesses.

Computer use means the AI can see what is on a screen, reason about it, and take actions across browsers, mobile apps, and desktop software. This enables the model to perform tasks that previously required human hands on a keyboard and mouse.

What this means in practice

Google highlights two practical examples: the model can analyze an app and return a categorized list of its features, and it can audit documentation for accessibility issues. More broadly, the company points to uses such as continuous software testing and automating knowledge work across professional applications.

Developers can access the capability through the Gemini API and the Gemini Enterprise Agent Platform.

Safety measures for live environments

Google acknowledges that AI agents operating on live systems carry risks. A key concern is prompt injection, where malicious content in the environment tries to hijack the agent’s actions. To counter this, Google has applied targeted adversarial training to the model. Two optional safeguards are also available to enterprise customers:

Requiring explicit user confirmation before sensitive or irreversible actions are taken
Automatically stopping a task if an indirect prompt injection is detected

Google recommends combining these tools with secure sandboxing, human oversight, and strict access controls. The company describes this layered approach as “defense-in-depth.”
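
To make the two optional safeguards concrete, here is a minimal sketch of how an agent loop could apply them. It does not use the Gemini API or reflect Google's implementation: the Action type, the SENSITIVE_ACTIONS set, the injection_suspected flag, and the run_guarded function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical action names treated as sensitive or irreversible.
SENSITIVE_ACTIONS = {"submit_payment", "delete_file", "send_email"}


@dataclass
class Action:
    name: str                          # e.g. "click", "type_text", "delete_file"
    injection_suspected: bool = False  # set by a hypothetical upstream detector


def run_guarded(actions: Iterable[Action],
                confirm: Callable[[Action], bool]) -> list[str]:
    """Walk through proposed actions and apply both safeguards."""
    log: list[str] = []
    for action in actions:
        # Safeguard: stop the whole task once an injection is flagged.
        if action.injection_suspected:
            log.append(f"stopped: suspected prompt injection before {action.name}")
            break
        # Safeguard: require explicit user confirmation for sensitive actions.
        if action.name in SENSITIVE_ACTIONS and not confirm(action):
            log.append(f"skipped: {action.name} (user declined)")
            continue
        # In a real agent this is where the action would be executed.
        log.append(f"executed: {action.name}")
    return log


if __name__ == "__main__":
    plan = [
        Action("click"),
        Action("delete_file"),
        Action("type_text", injection_suspected=True),
        Action("click"),
    ]
    # A confirm callback that always declines, standing in for a user prompt.
    for line in run_guarded(plan, confirm=lambda a: False):
        print(line)
```

In this sketch a declined confirmation skips only the one action, while a suspected injection ends the task entirely. A production setup would still sit inside the sandboxing and access controls Google recommends.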
