Particle.news

Microsoft Azure Launches First Production GB300 NVL72 Supercluster for OpenAI

A co‑engineered stack pairs rack‑scale NVLink with 800 Gb/s-per-GPU InfiniBand to speed inference for large reasoning and multimodal models.

Overview

  • Azure says the NDv6 GB300 VM series is now running on a production supercluster with more than 4,600 NVIDIA Blackwell Ultra GPUs purpose‑built for OpenAI’s highest‑demand inference work.
  • Each GB300 NVL72 rack combines 72 Blackwell Ultra GPUs with 36 Grace CPUs into a 37 TB fast‑memory domain, delivering 130 TB/s NVLink bandwidth in‑rack and 800 Gb/s per GPU via Quantum‑X800 InfiniBand across racks.
  • NVIDIA reports record MLPerf Inference v5.1 results for GB300 NVL72 systems, including up to 5x higher per‑GPU throughput on the 671‑billion‑parameter DeepSeek‑R1 reasoning benchmark versus Hopper.
  • Microsoft details a jointly engineered data‑center and software stack—custom protocols, collective libraries, in‑network computing, and SHARP‑accelerated operations—aimed at higher utilization, faster training, and lower latency at supercomputer scale.
  • Azure plans a phased global expansion to hundreds of thousands of Blackwell Ultra GPUs, with the stated goal of enabling training of models with hundreds of trillions of parameters.
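The per-rack figures quoted above imply some aggregate numbers worth spelling out. The sketch below is a back-of-envelope calculation using only the specs stated in this article (72 GPUs per NVL72 rack, 800 Gb/s InfiniBand per GPU, more than 4,600 GPUs in the cluster); the derived totals are simple arithmetic, not official Azure or NVIDIA figures.

```python
# Back-of-envelope figures derived from the rack specs quoted above.
# Inputs come from the article; derived values assume the simple
# multiplications shown and are not official NVIDIA/Azure numbers.

GPUS_PER_RACK = 72        # Blackwell Ultra GPUs per GB300 NVL72 rack
IB_PER_GPU_GBPS = 800     # Quantum-X800 InfiniBand per GPU, in Gb/s
CLUSTER_GPUS = 4600       # "more than 4,600" GPUs in the supercluster

# Aggregate cross-rack InfiniBand bandwidth for one rack.
rack_ib_tbps = GPUS_PER_RACK * IB_PER_GPU_GBPS / 1000  # Gb/s -> Tb/s
rack_ib_tBps = rack_ib_tbps / 8                        # Tb/s -> TB/s

# Approximate rack count implied by the GPU total.
racks = CLUSTER_GPUS / GPUS_PER_RACK

print(f"Per-rack IB: {rack_ib_tbps:.1f} Tb/s (~{rack_ib_tBps:.1f} TB/s)")
print(f"Implied rack count: ~{racks:.0f} NVL72 racks")
```

At these numbers, each rack exposes roughly 57.6 Tb/s of aggregate scale-out bandwidth, and "more than 4,600" GPUs works out to around 64 NVL72 racks.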