Overview
- Each Ironwood chip includes 192 GB of HBM3E; a Superpod is reported at 9,216 chips, delivering about 42.5 exaFLOPS of FP8 compute and roughly 1.77 PB of aggregate HBM (see the arithmetic check after this list).
- The architecture treats a pod as a single supercomputer, combining massive-scale RDMA, a 3D torus Inter-Chip Interconnect (ICI), and Optical Circuit Switching (OCS) that can reroute around unhealthy components.
- The software stack targets the hardware through the XLA compiler and supports both JAX and a native PyTorch experience, while Pallas and its Mosaic backend enable Python-defined custom kernels (a minimal sketch follows this list).
- Google cites roughly 2× the performance per watt of Trillium and frames Ironwood as optimized for inference, where on-package memory, latency, throughput, and cost per query dominate.
- Availability is described as coming in the next few weeks across workload types, and reporting indicates the offering is exclusive to Google Cloud, which raises vendor lock-in considerations.
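For reference, the 1.77 PB aggregate-HBM figure is simply the per-chip capacity scaled to the full pod (a back-of-the-envelope check, assuming decimal petabytes):

$$
9{,}216\ \text{chips} \times 192\ \text{GB per chip} = 1{,}769{,}472\ \text{GB} \approx 1.77\ \text{PB}
$$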
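To make the custom-kernel point concrete, below is a minimal Pallas sketch of an element-wise add, assuming a recent JAX release; on TPU hardware, Pallas kernels are lowered through the Mosaic backend, and the commented-out `interpret=True` flag lets the same kernel run off-TPU for testing. It is an illustrative example, not Ironwood-specific code.

```python
# Minimal Pallas kernel sketch (element-wise add), assuming a recent JAX release.
# On TPU, pallas_call compiles the kernel body via the Mosaic backend.
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # The refs are blocks of the operands held in on-chip memory; the body is
    # ordinary Python that Pallas traces and lowers to a device kernel.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
        # interpret=True,  # uncomment to run off-TPU in interpreter mode
    )(x, y)

x = jnp.arange(8, dtype=jnp.float32)
y = jnp.ones(8, dtype=jnp.float32)
print(add(x, y))  # [1. 2. 3. 4. 5. 6. 7. 8.]
```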