Overview
- Nemotron 3 debuts in three sizes: Nano (30B parameters), available now, with Super (100B) and Ultra (500B) slated for the first half of 2026.
- The models use a hybrid latent mixture‑of‑experts design with a 1‑million‑token context window, and Nvidia claims up to 4x higher token throughput versus Nemotron 2 Nano and roughly 60% fewer reasoning tokens.
- Nvidia is openly releasing three trillion tokens of pre‑training data, about 18 million post‑training samples, and new NeMo Gym and NeMo RL libraries plus evaluation tools for customization and agent training.
- Distribution starts on Hugging Face and through inference providers such as Baseten, Deepinfra, Fireworks, FriendliAI, OpenRouter, and Together AI, along with availability on AWS Bedrock; support is coming to Google Cloud and other platforms.
- Nvidia frames Nemotron 3 as an open foundation for multi‑agent systems, cites Artificial Analysis for strong openness and efficiency rankings, and highlights early enterprise interest from firms including Palantir, Accenture, Oracle Cloud Infrastructure, ServiceNow, Siemens, and Zoom.