Overview
- Initial Maia 200 systems are live in Azure's Central US region in Iowa, with planned expansion to Phoenix and other U.S. sites.
- The accelerator delivers 10.1 PFLOPS at FP4 and about 5 PFLOPS at FP8, with 272 MB of inline SRAM, 216 GB of HBM3e, and 7 TB/s of memory bandwidth (see the back-of-envelope sketch after this list).
- Microsoft says Maia 200 is its most efficient inference deployment to date and promises up to 30% better performance per dollar than its current hardware.
- Early units are assigned to the Microsoft Superintelligence team, Microsoft Foundry, and Copilot, and will run the latest OpenAI models on Azure.
- Fabricated by TSMC on a 3 nm process, Maia 200 advances Microsoft's in-house silicon strategy; the company is already designing a successor, Maia 300, and positions the lineup as an alternative to Nvidia GPUs and rival cloud providers' custom chips.
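For context on what the quoted specs imply, here is a minimal back-of-envelope sketch (Python) using only the numbers listed above. It estimates the arithmetic intensity a workload would need before the chip is compute-bound rather than limited by HBM bandwidth; the inputs are vendor peak figures, so treat the results as rough upper bounds rather than measured behavior.

```python
# Back-of-envelope roofline check from the listed Maia 200 specs.
# All figures come from the bullet list above; they are vendor-quoted
# peaks, not measured throughput.

FP4_PFLOPS = 10.1          # peak FP4 compute, PFLOPS
FP8_PFLOPS = 5.0           # approximate peak FP8 compute, PFLOPS
HBM_BANDWIDTH_TBS = 7.0    # HBM3e bandwidth, TB/s

def breakeven_intensity(pflops: float, tb_per_s: float) -> float:
    """FLOPs that must be performed per byte moved from HBM before
    compute, rather than memory bandwidth, becomes the bottleneck."""
    flops_per_s = pflops * 1e15
    bytes_per_s = tb_per_s * 1e12
    return flops_per_s / bytes_per_s

if __name__ == "__main__":
    for label, pflops in (("FP4", FP4_PFLOPS), ("FP8", FP8_PFLOPS)):
        ai = breakeven_intensity(pflops, HBM_BANDWIDTH_TBS)
        print(f"{label}: ~{ai:.0f} FLOPs/byte to saturate compute")
    # Roughly 1,400 FLOPs/byte at FP4 and 700 at FP8 -- ratios typical
    # of inference-oriented parts, where low-precision matmuls dominate.
```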