Particle.news

Microsoft Azure Launches First Production-Scale NVIDIA GB300 NVL72 Cluster for OpenAI

The deployment targets OpenAI’s most demanding inference workloads using a co-designed NVLink–Quantum‑X800 fabric.

Overview

  • Azure introduced the NDv6 GB300 VM series built on liquid-cooled GB300 NVL72 racks optimized for reasoning models and complex multimodal AI.
  • The initial supercomputer links 4,608 NVIDIA Blackwell Ultra GPUs via the Quantum‑X800 InfiniBand platform, which provides 800 Gb/s of bandwidth per GPU.
  • Within each rack, 72 GPUs and 36 NVIDIA Grace CPUs share 37 TB of fast memory over a fifth‑generation NVLink Switch fabric that delivers 130 TB/s of intra-rack bandwidth.
  • NVIDIA reports MLPerf Inference v5.1 leadership for GB300 NVL72 systems, including up to 5x higher per‑GPU throughput on the 671‑billion‑parameter DeepSeek‑R1 model compared with Hopper.
  • Microsoft says the rollout required custom liquid cooling, new power distribution, and a reengineered software stack, with plans to scale to hundreds of thousands of Blackwell Ultra GPUs.
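The figures above imply some simple derived numbers, such as how many NVL72 racks make up the initial cluster and the aggregate scale-out bandwidth. A minimal back-of-the-envelope sketch, using only the specs stated in this article (the constants are the article's numbers, not official Azure configuration data):

```python
# Derived topology figures from the specs stated in the article.
# These are illustrative calculations, not official Azure numbers.

GPUS_TOTAL = 4608              # Blackwell Ultra GPUs in the initial cluster
GPUS_PER_RACK = 72             # GB300 NVL72 rack size
BANDWIDTH_PER_GPU_GBPS = 800   # Quantum-X800 InfiniBand, per GPU

racks = GPUS_TOTAL // GPUS_PER_RACK
aggregate_gbps = GPUS_TOTAL * BANDWIDTH_PER_GPU_GBPS

print(f"NVL72 racks in the cluster: {racks}")                              # 64
print(f"Aggregate scale-out bandwidth: {aggregate_gbps / 1e6:.2f} Pb/s")   # 3.69 Pb/s
```

So the 4,608-GPU cluster corresponds to 64 NVL72 racks, with roughly 3.7 Pb/s of aggregate InfiniBand injection bandwidth across the fabric.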