Google Unveils Ironwood, Its Most Advanced AI Inference Chip

Announced at Cloud Next 2025, Ironwood marks Google's shift toward scalable, energy-efficient AI inference and will be available to Google Cloud customers later this year.

[Image: Google Ironwood TPU. "Meet the chip that's 24 times faster than the world's fastest supercomputer."]
[Image: People walk past a Google logo at the Hannover Messe trade fair in Hanover, Germany, April 22, 2024. REUTERS/Annegret Hilse/File Photo]

Overview

  • Ironwood is Google's seventh-generation Tensor Processing Unit (TPU), purpose-built for real-time AI inference rather than model training.
  • The chip will be offered in two configurations, a 256-chip cluster and a 9,216-chip cluster, sized to handle diverse AI workload demands at scale.
  • Each Ironwood chip delivers 4,614 TFLOPS of peak compute, carries 192 GB of high-bandwidth memory, and includes an enhanced SparseCore for advanced workloads such as ranking and recommendations (see the scale arithmetic after this list).
  • Ironwood integrates with Google Cloud's AI Hypercomputer and Pathways software, enabling efficient distributed AI inference across tens of thousands of chips (a JAX sketch of the general sharding pattern follows this list).
  • Google positions Ironwood as a strategic solution to reduce the rising costs of AI inference, offering significant performance and energy efficiency improvements over previous TPU generations.
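
For a sense of scale, the two figures above can be combined directly. The arithmetic below is a back-of-the-envelope sketch based only on the per-chip number cited in this article; the numeric precision behind Google's TFLOPS figure is not specified here.

    PER_CHIP_TFLOPS = 4_614            # peak compute per Ironwood chip, per the overview

    for chips in (256, 9_216):         # the two announced cluster configurations
        total_tflops = chips * PER_CHIP_TFLOPS
        # 1 exaFLOPS = 1,000,000 TFLOPS
        print(f"{chips:>5} chips -> {total_tflops:>10,} TFLOPS "
              f"(~{total_tflops / 1e6:.2f} exaFLOPS)")

    # 9,216 chips x 4,614 TFLOPS comes to roughly 42.5 exaFLOPS, the
    # pod-level figure behind the "24 times faster" comparison in the
    # caption above.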
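Pathways itself is Google-internal software, but the general pattern it enables, sharding inference work across many accelerator chips, can be illustrated with JAX, the framework Google pairs with its TPUs. The snippet below is a minimal hypothetical sketch, not Ironwood- or Pathways-specific code: the one-axis device mesh and the single-matrix "model" are illustrative stand-ins, and it runs on any JAX backend, including a laptop CPU.

    import jax
    import jax.numpy as jnp
    from jax.experimental import mesh_utils
    from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

    # One mesh axis spanning every available accelerator (each TPU chip
    # in a pod slice, or a single CPU device when run locally).
    n = jax.device_count()
    mesh = Mesh(mesh_utils.create_device_mesh((n,)), axis_names=("data",))

    # Hypothetical "model": a single weight matrix applied to a batch.
    weights = jnp.ones((512, 512))
    batch = jnp.ones((8 * n, 512))   # batch size divisible by device count

    # Shard the batch across devices; replicate the weights on each one.
    batch = jax.device_put(batch, NamedSharding(mesh, P("data", None)))
    weights = jax.device_put(weights, NamedSharding(mesh, P(None, None)))

    @jax.jit
    def infer(w, x):
        # Each device computes its slice of the batch; the compiler
        # inserts any cross-chip communication automatically.
        return x @ w

    out = infer(weights, batch)
    print(out.shape, out.sharding)   # (8*n, 512), sharded along the batch axis

On a real TPU deployment the same code would simply see more devices; scaling the mesh rather than rewriting the model is the idea behind software like Pathways.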