Overview
- OpenAI has begun renting Google’s TPUs through Google Cloud, as reported on June 27, to support inference workloads for ChatGPT and related products.
- The shift marks the first time the company has meaningfully deployed non-Nvidia chips for inference.
- OpenAI anticipates that the rented TPUs will lower the cost of inference computing and help it keep pace with rising usage of its AI services.
- Google has opened external access to its in-house TPUs in recent months but is withholding its most powerful TPU versions from OpenAI’s arrangement.
- The arrangement underscores a growing trend of compute diversification as AI leaders balance performance, supply constraints and cost efficiency.