Overview
- Ironwood is Google's seventh-generation Tensor Processing Unit (TPU), purpose-built for real-time AI inference rather than model training.
- The chip is available in two pod configurations, a 256-chip pod and a 9,216-chip pod, sized to match different AI inference workload demands at scale.
- Each Ironwood chip delivers 4,614 TFLOPs of peak compute, features 192GB of high-bandwidth memory, and includes an enhanced SparseCore for advanced workloads like ranking and recommendations.
- Ironwood integrates with Google Cloud's AI Hypercomputer and Pathways software, enabling efficient distributed AI inference across tens of thousands of chips (see the sketch after this list).
- Google positions Ironwood as a strategic solution to reduce the rising costs of AI inference, citing notably better performance and energy efficiency than previous TPU generations, including roughly double the performance per watt of the prior-generation Trillium.
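
To make the distributed-inference point above concrete, the sketch below uses open-source JAX to shard a toy inference batch across whatever accelerator devices are visible. It is a minimal, hypothetical illustration: the model, sizes, and mesh shape are placeholders, and it does not use any Ironwood- or Pathways-specific API.

```python
# Minimal sketch of sharded inference with open-source JAX; illustrative only.
# The model, sizes, and mesh shape are hypothetical and not Ironwood-specific.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over whatever accelerators are visible (TPU, GPU, or CPU).
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("batch",))

def model(params, x):
    # Toy "model": one dense layer standing in for a real inference graph.
    return jnp.dot(x, params["w"]) + params["b"]

params = {"w": jnp.ones((512, 128)), "b": jnp.zeros((128,))}
x = jnp.ones((len(devices) * 8, 512))  # 8 examples per device

# Shard the batch dimension across devices and replicate the parameters.
x_sharded = jax.device_put(x, NamedSharding(mesh, P("batch", None)))
params_repl = jax.device_put(params, NamedSharding(mesh, P()))

# jit compiles the function once and partitions the work across the device mesh.
logits = jax.jit(model)(params_repl, x_sharded)
print(logits.shape)  # (num_devices * 8, 128)
```

The same pattern scales from a single host to a multi-host pod: the mesh simply spans more devices, while the model code stays unchanged.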