Overview
- Rubin CPX is purpose-built for massive-context inference, integrating video decoding and encoding with long-context processing on a single chip.
- Nvidia lists roughly 30 petaFLOPs of NVFP4 compute and up to 128 GB of cost-efficient GDDR7 memory per CPX device.
- The Vera Rubin NVL144 CPX rack pairs 144 CPX accelerators with 144 Rubin GPUs and 36 Vera CPUs, delivering about 8 exaFLOPs of NVFP4 compute, which Nvidia says is 7.5x that of a GB300 NVL72 rack.
- Nvidia will offer CPX as part of new NVL144 CPX racks and as separate trays or racks that attach to existing Rubin deployments.
- The company pitches use cases such as understanding large software projects and hour-scale generative video, and it claims a $100 million deployment could generate $5 billion in token revenue.
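The quoted figures can be sanity-checked with some back-of-envelope arithmetic. The sketch below uses only the numbers stated above; the split between CPX and Rubin GPU compute, and the implied GB300 NVL72 figure, are derived here and are not an Nvidia-published breakdown.

```python
# Inputs taken from the figures quoted in this article.
CPX_PFLOPS_NVFP4 = 30       # per CPX device (approximate)
CPX_PER_RACK = 144
RACK_EFLOPS_NVFP4 = 8.0     # quoted total for the Vera Rubin NVL144 CPX rack
SPEEDUP_VS_GB300 = 7.5      # Nvidia's claimed multiple over GB300 NVL72

# CPX devices alone contribute 144 x 30 PF = 4.32 EF of the ~8 EF total.
cpx_eflops = CPX_PFLOPS_NVFP4 * CPX_PER_RACK / 1000

# The remainder (~3.68 EF) is implied to come from the 144 Rubin GPUs.
rubin_eflops = RACK_EFLOPS_NVFP4 - cpx_eflops

# Working backward from the 7.5x claim gives ~1.07 EF for a GB300 NVL72 rack.
gb300_eflops = RACK_EFLOPS_NVFP4 / SPEEDUP_VS_GB300

print(f"CPX contribution:    {cpx_eflops:.2f} EF")
print(f"Implied Rubin GPUs:  {rubin_eflops:.2f} EF")
print(f"Implied GB300 NVL72: {gb300_eflops:.2f} EF")
```

The CPX devices thus account for a little over half of the rack's headline NVFP4 number under these assumptions.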