Particle.news


Google Unveils Ironwood, Its First Inference-Optimized TPU

The seventh-generation chip, revealed at Google Cloud Next, promises enhanced efficiency, scalability, and cost-effectiveness for real-time AI applications.

Google Ironwood TPU
People walk next to a Google logo during the Hannover Messe trade fair in Hanover, Germany, April 22, 2024. REUTERS/Annegret Hilse/File Photo

Overview

  • Ironwood, Google's seventh-generation TPU, is purpose-built for inference tasks, marking a strategic pivot from training-centric designs.
  • The chip delivers 4,614 TFLOPS of peak performance, 192 GB of memory, and 7.4 Tbps of bandwidth, doubling energy efficiency compared to its predecessor, Trillium.
  • Ironwood features an enhanced SparseCore for handling complex recommendation and ranking tasks, making it ideal for data-intensive workloads.
  • It will integrate into Google Cloud’s AI Hypercomputer, supporting configurations of up to 9,216 chips for scalable, high-performance inference workloads.
  • This launch highlights Google's focus on reducing the rising operational costs of inference computing as AI applications become more widespread and resource-intensive.
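The per-chip and pod-scale figures above can be cross-checked with quick arithmetic; the snippet below is a back-of-the-envelope sketch using only the numbers quoted in this article (the aggregate result is an estimate, not an official Google figure):

```python
# Sanity check: aggregate peak throughput of a full Ironwood pod,
# using the per-chip figures reported above.
PER_CHIP_TFLOPS = 4_614   # peak performance per Ironwood chip
MAX_POD_CHIPS = 9_216     # largest AI Hypercomputer configuration

pod_tflops = PER_CHIP_TFLOPS * MAX_POD_CHIPS
pod_exaflops = pod_tflops / 1_000_000  # 1 exaFLOPS = 1,000,000 TFLOPS

print(f"Full 9,216-chip pod: {pod_exaflops:.1f} exaFLOPS peak")
```

At the quoted specs, a maxed-out pod works out to roughly 42.5 exaFLOPS of peak compute, which illustrates why the chip is positioned for large-scale inference rather than single-accelerator workloads.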