
Microsoft Rolls Out Maia 200 AI Inference Chip to U.S. Datacenters

Microsoft casts the chip as a more cost‑efficient inference alternative to rival cloud silicon.

Overview

  • Initial Maia 200 systems are live in Azure’s US Central region in Iowa, with a planned expansion to Phoenix and other U.S. sites.
  • The accelerator delivers 10.1 PFLOPS at FP4 and about 5 PFLOPS at FP8, along with 272 MB of on-chip SRAM, 216 GB of HBM3e, and 7 TB/s of memory bandwidth.
  • Microsoft says Maia 200 is its most efficient inference deployment to date and promises up to 30% better performance per dollar than its current hardware.
  • Early units are assigned to the Microsoft Superintelligence team, Microsoft Foundry, and Copilot, and they will run the latest OpenAI models on Azure.
  • Fabricated by TSMC on a 3 nm process, Maia 200 advances Microsoft’s in‑house silicon strategy; the company is already designing a successor, Maia 300, and positions the lineup as an alternative to Nvidia GPUs and rival cloud providers’ chips.