Particle.news

Google Puts Gemini 2.5 ‘Computer Use’ Into Public Preview for Browser-Based UI Control

Developers get an agent that performs human-like input inside web interfaces without desktop OS control.

Overview

  • The model, a specialized variant of Gemini 2.5 Pro, lets agents operate websites by clicking, typing, scrolling, dragging and dropping, and submitting forms.
  • A new computer_use tool runs in a loop: the model receives a screenshot and the action history and returns a function‑call action, then the client executes that step and sends back an updated view.
  • Google says the model outperforms leading alternatives on web and mobile control benchmarks with lower latency, citing its own self‑reported results and evaluations by Browserbase.
  • Safety features include an out‑of‑model per‑step review and developer rules that can require confirmation or block high‑risk actions such as purchases or CAPTCHA bypasses.
  • Access is available now through the Gemini API in Google AI Studio and Vertex AI, with a live demo on Browserbase and internal use reported in Mariner, Firebase Testing Agent, and AI Mode in Search.
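
The loop and the per-step safety check described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not Google's API: the names Action, Browser, review, run_agent and scripted_model are all hypothetical, the "model" is a scripted stand-in that makes no network calls, and the high-risk list is an assumption based on the examples in the overview.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """A function-call style action proposed by the model (hypothetical shape)."""
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class Browser:
    """Stand-in for a real browser driver; it only records what it was asked to do."""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real client would return image bytes of the current page.
        return f"screenshot-after-{len(self.log)}-actions"

    def execute(self, action: Action) -> None:
        self.log.append(action.name)


# Assumed developer rule: these action names need explicit confirmation.
HIGH_RISK = {"purchase", "solve_captcha"}


def review(action: Action, confirm) -> bool:
    """Per-step check that runs outside the model, before anything executes."""
    if action.name in HIGH_RISK:
        return confirm(action)
    return True


def run_agent(model, browser: Browser, confirm, max_steps: int = 10) -> list:
    """Screenshot + history -> proposed action -> review -> execute -> repeat."""
    history = []
    for _ in range(max_steps):
        action = model(browser.screenshot(), history)
        if action.name == "done":
            break
        if not review(action, confirm):
            history.append((action.name, "blocked"))
            continue
        browser.execute(action)
        history.append((action.name, "executed"))
    return history


def scripted_model(screenshot: str, history: list) -> Action:
    """Fake model: walks through a fixed plan based on how many steps have run."""
    plan = ["click", "type_text", "purchase", "done"]
    return Action(plan[min(len(history), len(plan) - 1)])


if __name__ == "__main__":
    browser = Browser()
    # Deny every confirmation request, so the purchase step is blocked.
    print(run_agent(scripted_model, browser, confirm=lambda a: False))
```

In a real integration the scripted model would be replaced by a call to the Gemini API and the Browser stub by an actual browser driver. The design point the sketch tries to capture is that the client, not the model, executes each step, which is what allows a confirmation or block to be enforced before a high-risk action runs.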