Overview
- Ironwood is Google's seventh-generation Tensor Processing Unit (TPU), purpose-built for real-time AI inference rather than model training.
- The chip is available in two pod configurations, a 256-chip pod and a 9,216-chip pod, sized to match different AI inference workload demands at scale.
- Each Ironwood chip delivers 4,614 TFLOPs of peak compute, features 192GB of high-bandwidth memory, and includes an enhanced SparseCore for advanced workloads like ranking and recommendations.
- Ironwood integrates with Google Cloud's AI Hypercomputer and Pathways software, enabling efficient distributed AI inference across tens of thousands of chips (see the sketch after this list).
- Google positions Ironwood as a strategic solution to reduce the rising costs of AI inference, citing notably better performance and energy efficiency than previous TPU generations, including roughly double the performance per watt of the prior-generation Trillium.
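
To make the distributed-inference point above concrete, the sketch below uses open-source JAX to shard a toy inference batch across whatever accelerator devices are visible. It is a minimal, hypothetical illustration: the model, sizes, and mesh shape are placeholders, and it does not use any Ironwood- or Pathways-specific API.

```python
# Minimal sketch of sharded inference with open-source JAX; illustrative only.
# The model, sizes, and mesh shape are hypothetical and not Ironwood-specific.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over whatever accelerators are visible (TPU, GPU, or CPU).
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("batch",))

def model(params, x):
    # Toy "model": one dense layer standing in for a real inference graph.
    return jnp.dot(x, params["w"]) + params["b"]

params = {"w": jnp.ones((512, 128)), "b": jnp.zeros((128,))}
x = jnp.ones((len(devices) * 8, 512))  # 8 examples per device

# Shard the batch dimension across devices and replicate the parameters.
x_sharded = jax.device_put(x, NamedSharding(mesh, P("batch", None)))
params_repl = jax.device_put(params, NamedSharding(mesh, P()))

# jit compiles the function once and partitions the work across the device mesh.
logits = jax.jit(model)(params_repl, x_sharded)
print(logits.shape)  # (num_devices * 8, 128)
```

The same pattern scales from a single host to a multi-host pod: the mesh simply spans more devices, while the model code stays unchanged.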