Particle.news


Google Unveils Ironwood, Its First Inference-Optimized TPU

The seventh-generation chip, revealed at Google Cloud Next, promises enhanced efficiency, scalability, and cost-effectiveness for real-time AI applications.

Google Ironwood TPU
People walk next to a Google logo during the Hannover Messe trade fair in Hanover, Germany, April 22, 2024. REUTERS/Annegret Hilse/File Photo

Overview

  • Ironwood, Google's seventh-generation TPU, is purpose-built for inference tasks, marking a strategic pivot from training-centric designs.
  • The chip delivers 4,614 TFLOPS of peak performance, 192 GB of memory, and 7.4 Tbps of bandwidth, doubling energy efficiency compared to its predecessor, Trillium.
  • Ironwood features an enhanced SparseCore for handling complex recommendation and ranking tasks, making it ideal for data-intensive workloads.
  • It will integrate into Google Cloud’s AI Hypercomputer, supporting configurations of up to 9,216 chips for scalable, high-performance inference workloads.
  • This launch highlights Google's focus on reducing the rising operational costs of inference computing as AI applications become more widespread and resource-intensive.
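The per-chip and pod-scale figures above can be cross-checked with quick arithmetic; the snippet below is a back-of-the-envelope sketch using only the numbers quoted in this article (the aggregate result is an estimate, not an official Google figure):

```python
# Sanity check: aggregate peak throughput of a full Ironwood pod,
# using the per-chip figures reported above.
PER_CHIP_TFLOPS = 4_614   # peak performance per Ironwood chip
MAX_POD_CHIPS = 9_216     # largest AI Hypercomputer configuration

pod_tflops = PER_CHIP_TFLOPS * MAX_POD_CHIPS
pod_exaflops = pod_tflops / 1_000_000  # 1 exaFLOPS = 1,000,000 TFLOPS

print(f"Full 9,216-chip pod: {pod_exaflops:.1f} exaFLOPS peak")
```

At the quoted specs, a maxed-out pod works out to roughly 42.5 exaFLOPS of peak compute, which illustrates why the chip is positioned for large-scale inference rather than single-accelerator workloads.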