This new Microsoft AI beats GPT-4o at web tasks and protects your privacy

Microsoft has introduced Fara-7B, a new artificial intelligence agent designed to perform complex computer tasks directly on a user’s device. The model is small enough to run locally, which enhances data security and privacy by keeping sensitive information on the user’s personal computer.

As Ben Dickson reports for VentureBeat, Fara-7B operates by visually interpreting the screen like a human. It analyzes screenshots to control a virtual mouse and keyboard, allowing it to click, type, and scroll through websites and applications. This pixel-based approach enables the agent to navigate even complex or obfuscated user interfaces without accessing the underlying code.
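The screenshot-in, action-out cycle described above can be illustrated with a toy loop. This is only a minimal sketch, not Fara-7B's actual interface: `Action`, `run_agent`, and the three callbacks are hypothetical names, and the "model" here is a scripted stand-in rather than a neural network.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One UI step. All field names are illustrative assumptions."""
    kind: str                              # "click", "type", "scroll", or "done"
    xy: Optional[Tuple[int, int]] = None   # screen coordinates for clicks
    text: str = ""                         # text to enter for "type" actions


def run_agent(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe pixels, pick an action, perform it, and repeat until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        pixels = take_screenshot()               # the only input is the screen image
        action = choose_action(pixels, history)  # stand-in for the model's decision
        if action.kind == "done":
            break
        perform(action)                          # drive the virtual mouse or keyboard
        history.append(action)
    return history


def demo() -> List[Action]:
    """Run the loop with a scripted policy in place of a real model."""
    script = [
        Action("click", xy=(120, 48)),
        Action("type", text="laptop"),
        Action("done"),
    ]
    performed: List[Action] = []
    run_agent(
        take_screenshot=lambda: b"\x00" * 16,    # fake screenshot bytes
        choose_action=lambda pixels, hist: script[len(hist)],
        perform=performed.append,
    )
    return performed
```

The point of the sketch is that the loop never touches page source or an accessibility tree. Everything the policy knows comes from the pixels and its own action history.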

According to Microsoft Research, this method offers significant privacy advantages for businesses handling sensitive data. On the WebVoyager benchmark, which tests web navigation abilities, Fara-7B achieved a higher task success rate than larger models like GPT-4o. The model also proved more efficient, completing tasks in significantly fewer steps than comparable systems.

To address safety concerns, the agent is trained to identify “Critical Points.” At these moments, such as before sending an email or completing a purchase, Fara-7B pauses and requests explicit user permission to proceed. Microsoft has released the model under an MIT license for experimentation but cautions that it is not yet ready for mission-critical deployment.
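The pause-and-confirm behavior can be approximated with a simple gate around action execution. The sketch below is an assumption-laden illustration: Fara-7B is described as being trained to recognize these moments, whereas this example swaps in a hard-coded set of hypothetical action labels (`CRITICAL_KINDS`) and a hypothetical `ask_user` callback.

```python
from typing import Callable, List

# Hypothetical labels for irreversible steps; the real model learns to spot these.
CRITICAL_KINDS = {"send_email", "submit_payment"}


def execute_with_gate(
    actions: List[str],
    perform: Callable[[str], None],
    ask_user: Callable[[str], bool],
) -> List[str]:
    """Run actions in order, pausing for approval before any critical one."""
    executed: List[str] = []
    for action in actions:
        if action in CRITICAL_KINDS and not ask_user(action):
            break  # user declined: stop and hand control back
        perform(action)
        executed.append(action)
    return executed
```

With a user who declines, `execute_with_gate(["open_cart", "submit_payment"], print, lambda a: False)` performs only `open_cart` and stops before the purchase.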
