Particle.news

Google Releases Gemini 2.5 Computer Use in Public Preview for Browser-Based Agents

The specialized model uses a screenshot-driven action loop to operate interfaces under built-in safety checks.

Overview

  • Developers can access the model now via the Gemini API in Google AI Studio and Vertex AI, with a public demo available on Browserbase.
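  • Gemini 2.5 Computer Use powers agents that click, type, scroll, drag and drop, manipulate forms and dropdowns, and work behind logins. It runs in a loop: the model receives a screenshot and the history of recent actions, proposes the next UI action, and is sent a fresh screenshot once that action has been executed.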
  • Google positions the model as optimized for web browsers with promising mobile results, noting it is not yet optimized for desktop OS-level control.
  • Google reports higher accuracy and lower latency than leading alternatives on multiple web and mobile control benchmarks, citing evaluations that include tests run by Browserbase.
  • Safety measures include model-level training, a per-step inference-time safety service, and developer-set instructions to refuse or require confirmation for high-risk actions such as purchases or CAPTCHA bypass attempts.
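
The screenshot-driven loop and the confirmation step for high-risk actions described above can be sketched roughly as follows. This is a minimal illustration, not the Gemini API: the `Action` class, the `high_risk` flag, and the `propose_action`, `take_screenshot`, `execute`, and `confirm` callables are all hypothetical stand-ins for the model call, the browser, and the safety decision.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Action:
    """Hypothetical UI action proposed by the model (not a real API type)."""
    name: str                      # e.g. "click", "type", or "done"
    args: dict = field(default_factory=dict)
    high_risk: bool = False        # assumed stand-in for a per-step safety verdict


History = List[Tuple[Action, str]]


def run_agent_loop(
    propose_action: Callable[[bytes, History], Action],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> History:
    """Repeat: screenshot -> model proposes an action -> gate -> execute."""
    history: History = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = propose_action(screenshot, history)
        if action.name == "done":
            break
        # High-risk actions (for example, a purchase) need explicit approval.
        if action.high_risk and not confirm(action):
            history.append((action, "refused"))
            break
        execute(action)
        history.append((action, "executed"))
    return history


if __name__ == "__main__":
    # Scripted stand-in for the model: it replays a fixed list of actions.
    script = [
        Action("click", {"x": 120, "y": 48}),
        Action("type", {"text": "running shoes"}),
        Action("click", {"x": 300, "y": 520}, high_risk=True),
        Action("done"),
    ]
    result = run_agent_loop(
        propose_action=lambda shot, hist: script[len(hist)],
        take_screenshot=lambda: b"",          # a real agent would capture the page
        execute=lambda action: None,          # a real agent would drive the browser
        confirm=lambda action: False,         # the user declines the risky step
    )
    for action, status in result:
        print(action.name, status)
```

Keeping the confirmation check inside the loop, before `execute`, means a declined action is recorded in the history but never reaches the browser.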