Overview
- Apple published the accompanying paper on arXiv and made the full dataset available on GitHub for research use under a non‑commercial license.
- The 400,000 real‑photo examples span 35 edit types across eight categories, with source images drawn from OpenImages.
- The data are split into 258,000 single‑edit examples, 56,000 preference pairs, and 72,000 multi‑turn sequences to support training and evaluation.
- Edits were generated using Google’s Gemini‑2.5‑Flash‑Image (Nano‑Banana) and evaluated by Gemini‑2.5‑Pro for instruction compliance and visual quality.
- Apple reports about 93% success on global style changes and below 60% on precise local tasks such as object relocation or text editing.
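The split sizes above can be tallied with a quick sketch; note that the three reported splits sum to 386,000, slightly under the headline 400,000 figure (the split names below are informal labels, not identifiers from the release):

```python
# Reported split sizes from the summary above (labels are informal).
splits = {
    "single_edit": 258_000,
    "preference_pairs": 56_000,
    "multi_turn": 72_000,
}

total = sum(splits.values())  # 386,000 across the three splits
for name, n in splits.items():
    # Share of each split relative to the combined total.
    print(f"{name}: {n:,} ({n / total:.1%})")
```

Single‑edit examples account for roughly two‑thirds of the combined total, with multi‑turn sequences and preference pairs making up the remainder.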