Particle.news

Microsoft Azure Launches First Production GB300 NVL72 Supercluster for OpenAI

A co‑engineered stack pairs rack‑scale NVLink with 800 Gb/s-per-GPU InfiniBand to speed inference for large reasoning and multimodal models.

Overview

  • Azure says the NDv6 GB300 VM series is now running on a production supercluster with more than 4,600 NVIDIA Blackwell Ultra GPUs purpose‑built for OpenAI’s highest‑demand inference work.
  • Each GB300 NVL72 rack combines 72 Blackwell Ultra GPUs with 36 Grace CPUs into a 37 TB fast‑memory domain, delivering 130 TB/s NVLink bandwidth in‑rack and 800 Gb/s per GPU via Quantum‑X800 InfiniBand across racks.
  • NVIDIA reports record MLPerf Inference v5.1 results for GB300 NVL72 systems, including up to 5x higher per‑GPU throughput on the 671‑billion‑parameter DeepSeek‑R1 reasoning benchmark versus Hopper.
  • Microsoft details a jointly engineered data‑center and software stack—custom protocols, collective libraries, in‑network computing, and SHARP‑accelerated operations—aimed at higher utilization, faster training, and lower latency at supercomputer scale.
  • Azure plans a phased global expansion to hundreds of thousands of Blackwell Ultra GPUs, with the stated goal of enabling training of models with hundreds of trillions of parameters.
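The per-rack figures quoted above imply some aggregate numbers worth spelling out. The sketch below is a back-of-envelope calculation using only the specs stated in this article (72 GPUs per NVL72 rack, 800 Gb/s InfiniBand per GPU, more than 4,600 GPUs in the cluster); the derived totals are simple arithmetic, not official Azure or NVIDIA figures.

```python
# Back-of-envelope figures derived from the rack specs quoted above.
# Inputs come from the article; derived values assume the simple
# multiplications shown and are not official NVIDIA/Azure numbers.

GPUS_PER_RACK = 72        # Blackwell Ultra GPUs per GB300 NVL72 rack
IB_PER_GPU_GBPS = 800     # Quantum-X800 InfiniBand per GPU, in Gb/s
CLUSTER_GPUS = 4600       # "more than 4,600" GPUs in the supercluster

# Aggregate cross-rack InfiniBand bandwidth for one rack.
rack_ib_tbps = GPUS_PER_RACK * IB_PER_GPU_GBPS / 1000  # Gb/s -> Tb/s
rack_ib_tBps = rack_ib_tbps / 8                        # Tb/s -> TB/s

# Approximate rack count implied by the GPU total.
racks = CLUSTER_GPUS / GPUS_PER_RACK

print(f"Per-rack IB: {rack_ib_tbps:.1f} Tb/s (~{rack_ib_tBps:.1f} TB/s)")
print(f"Implied rack count: ~{racks:.0f} NVL72 racks")
```

At these numbers, each rack exposes roughly 57.6 Tb/s of aggregate scale-out bandwidth, and "more than 4,600" GPUs works out to around 64 NVL72 racks.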