Overview
- CEO Jensen Huang introduced the Vera Rubin six‑chip platform at CES, saying it is in full production and claiming up to ten times lower inference token cost than Blackwell.
- The platform pairs the new Rubin GPU and Vera CPU with NVLink 6, Spectrum‑X networking, ConnectX‑9 SuperNICs, and BlueField‑4 DPUs, targeting agentic, long‑context, and mixture‑of‑experts workloads.
- Nvidia said partner offerings based on Rubin will arrive in the second half of 2026, naming Microsoft Azure and CoreWeave as early cloud deployers.
- To support long‑context inference, Nvidia unveiled the Inference Context Memory Storage platform, citing up to 5x more tokens per second, 5x higher performance per dollar of total cost of ownership (TCO), and 5x better power efficiency than traditional approaches.
- Alongside Rubin, Nvidia released the Alpamayo open reasoning vision‑language‑action (VLA) models, the AlpaSim simulator, and 1,700 hours of driving data; the first Alpamayo‑equipped Mercedes‑Benz CLA, built on the Nvidia DRIVE stack, is expected on U.S. roads this year.