Overview
- Agentic Vision is now available to developers through the Gemini API in Google AI Studio and Vertex AI, and is beginning to roll out in the Gemini app.
- The system follows a Think–Act–Observe loop in which the model plans a task, runs Python to manipulate or analyze images, and reviews the transformed images before answering (see the API sketch after this list).
- Google reports a consistent 5–10% quality improvement across most vision benchmarks when code execution is enabled.
- PlanCheckSolver.com cites a roughly 5% accuracy gain after using iterative code-driven inspection on high‑resolution building plans.
- Demos include implicit zooming, programmatic annotation using a visual scratchpad, and visual arithmetic; the roadmap covers more implicit behaviors, web and reverse image search tools, and support for models beyond Flash.
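The Think–Act–Observe loop can be exercised today by enabling the code-execution tool on an image-bearing request to the Gemini API. The sketch below uses the google-genai Python SDK; the model name, image file, and prompt are placeholders rather than anything specified in the announcement, and the mix of parts returned (generated code, execution output, final text) may vary by model and release.

```python
# Minimal sketch: enable code execution on an image prompt via the google-genai SDK.
# Model name and image path are placeholders; adjust for your environment.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

with open("floor_plan.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-flash-latest",  # placeholder; use a model with Agentic Vision enabled
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Count the windows on the north elevation and show your work.",
    ],
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)

# The response interleaves reasoning text, generated Python, and execution results.
for part in response.candidates[0].content.parts:
    if part.executable_code:        # Python the model chose to run (the "Act" step)
        print(part.executable_code.code)
    if part.code_execution_result:  # output the model observed (the "Observe" step)
        print(part.code_execution_result.output)
    if part.text:                   # intermediate reasoning or the final answer
        print(part.text)
```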
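For a sense of what the "Act" step looks like in the implicit-zooming demo, the fragment below is an illustrative stand-in for the kind of Python the model might write in its sandbox to enlarge a detail of a large plan before re-reading it; the crop coordinates and filenames are invented for the example and are not taken from the demos.

```python
# Illustrative only: the sort of crop-and-upscale code the model might generate
# internally when it decides to "zoom" into a region of a high-resolution image.
from PIL import Image

plan = Image.open("floor_plan.png")          # placeholder input image
region = plan.crop((1200, 800, 1800, 1400))  # (left, upper, right, lower) in pixels
zoomed = region.resize((region.width * 2, region.height * 2), Image.LANCZOS)
zoomed.save("zoomed_region.png")             # re-inspected in the next Observe step
```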