Overview
- The new model handles both text-to-image generation and precise editing of existing images through natural language instructions.
- It improves character consistency across outputs and adds multi-image fusion to combine elements from several inputs into a single result.
- Google demonstrated use cases that draw on the model's world knowledge, including interpreting hand-drawn diagrams, adapting real estate templates, and supporting educational tasks.
- The preview is available through the Gemini API, Google AI Studio, and Vertex AI, and Google says it expects a stable release within weeks.
- Safety measures include automated content filtering and SynthID watermarking.
- An early tester reported strong editing performance and noted that the model can revert to the original image on request.