Overview
- Azure says the NDv6 GB300 VM series is now running on a production supercluster with more than 4,600 NVIDIA Blackwell Ultra GPUs purpose‑built for OpenAI’s highest‑demand inference work.
- Each GB300 NVL72 rack combines 72 Blackwell Ultra GPUs with 36 Grace CPUs into a single 37 TB fast‑memory domain, delivering 130 TB/s of NVLink bandwidth within the rack and 800 Gb/s per GPU across racks via Quantum‑X800 InfiniBand.
- NVIDIA reports record MLPerf Inference v5.1 results for GB300 NVL72 systems, including up to 5x higher per‑GPU throughput than Hopper‑based systems on the 671‑billion‑parameter DeepSeek‑R1 reasoning benchmark.
- Microsoft details a jointly engineered data‑center and software stack—custom protocols, collective libraries, in‑network computing, and SHARP‑accelerated operations—aimed at higher utilization, faster training, and lower latency at supercomputer scale.
- Azure plans phased global expansion to hundreds of thousands of Blackwell Ultra GPUs with an objective to enable training of models measured in hundreds of trillions of parameters.
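The per‑rack figures above support some back‑of‑envelope arithmetic on the quoted 4,600‑GPU supercluster. The sketch below derives a minimum rack count and cluster‑level totals from the published per‑rack numbers; the derived figures are estimates, not Azure‑published values.

```python
# Back-of-envelope arithmetic from the figures quoted above.
# Derived totals are estimates, not values published by Azure or NVIDIA.

GPUS_PER_RACK = 72          # Blackwell Ultra GPUs per GB300 NVL72 rack
CPUS_PER_RACK = 36          # Grace CPUs per rack
FAST_MEM_TB_PER_RACK = 37   # pooled fast memory per rack, in TB
CLUSTER_GPUS = 4600         # "more than 4,600" GPUs in the supercluster

# Minimum racks needed to host the quoted GPU count (ceiling division).
racks = -(-CLUSTER_GPUS // GPUS_PER_RACK)

print(f"racks:            {racks}")                           # 64
print(f"fast memory (TB): {racks * FAST_MEM_TB_PER_RACK}")    # 2368
print(f"Grace CPUs:       {racks * CPUS_PER_RACK}")           # 2304
```

At roughly 64 racks, the cluster‑wide pool works out to about 2.4 PB of fast memory, which gives a sense of why Microsoft frames this as a single supercomputer rather than a pool of independent VMs.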