Overview
- Built with Nvidia's "extreme co‑design" approach, the Rubin system integrates six components — Rubin GPUs, Vera CPUs, NVLink 6, Spectrum‑X, ConnectX‑9 SuperNICs and BlueField‑4 DPUs — into a single AI supercomputing platform.
- Nvidia says Rubin cuts inference token cost to about one‑tenth of Blackwell's and can train some large models with roughly one‑quarter as many chips; the Rubin GPU is also touted as 5x faster than Blackwell for inference.
- Rubin‑based cloud instances are slated for the second half of 2026, with Microsoft Azure and CoreWeave named as initial providers; Microsoft's new Georgia and Wisconsin data centers are expected to host thousands of Rubin chips.
- A new AI-native Inference Context Memory Storage Platform, powered by BlueField‑4 and Spectrum‑X, targets long‑context workloads with claims of up to 5x gains in tokens per second, TCO performance and power efficiency.
- Nvidia expanded its open, domain‑specific model families, including Nemotron and Cosmos, and launched Alpamayo, a vision‑language‑action (VLA) model for autonomous driving, alongside a 1,700‑hour driving dataset and the open AlpaSim simulator.
- Separately at CES, Qualcomm introduced new Dragonwing IoT processors and a general‑purpose robotics architecture.