Particle.news

Google Limits Meta's Access to Gemini Over Compute Shortfall

Google says broad rate limits and spending tiers are temporary measures to ration scarce AI compute after commercial demand surged.

Overview

  • Google has introduced compute-based usage limits on its Gemini API that apply across its customer base, including rolling five-hour refresh windows and weekly caps.
  • Meta sought far more Gemini capacity than Google could supply and was told it could not get the full allotment, a shortfall that disrupted some internal AI projects.
  • The restrictions have led Meta to urge staff to conserve AI 'tokens,' the units used to measure model usage, while other Google clients have felt smaller effects.
  • Google executives attribute the caps to rapid commercial uptake of Gemini and a growing Google Cloud backlog, a situation that highlights the high cost and long lead times of adding specialized AI hardware.
  • The episode shows how scarce high-performance compute can reshape competition and product plans, as rivals become buyers and firms delay or redesign AI features that need large-scale models.