Overview
- According to testingcatalog and IT Home, the early build, labeled GemPix2, briefly appeared on Media.ai before being taken down, and sample images from it spread widely.
- Leaked outputs show precise in-image text rendering and convincing simulations of full browser and desktop interfaces generated from scratch.
- Tests highlight progress on tasks requiring physical logic, including img2img reconstruction of a ball’s trajectory and improved adherence to detailed constraints.
- The model demonstrates stronger image repair and complex color editing capabilities, pointing to higher visual fidelity and better instruction following.
- Some samples resemble surveillance-style footage, and testing notes indicate that the model's ability to produce such imagery might be reduced before any public release.