Particle.news

Nvidia Debuts Rubin CPX, a Long-Context GPU for Million-Token AI and Video

The chip offloads context processing to reduce latency and is slated to arrive in rack-scale systems by late 2026.

Overview

  • Rubin CPX is purpose-built for massive-context inference, integrating video decoding and encoding with long-context processing on a single chip.
  • Nvidia lists roughly 30 petaFLOPs of NVFP4 compute and up to 128 GB of cost-efficient GDDR7 memory per CPX device.
  • The Vera Rubin NVL144 CPX rack pairs 144 CPX accelerators with 144 Rubin GPUs and 36 Vera CPUs for about 8 exaFLOPs of NVFP4 compute, which Nvidia says is 7.5x that of the GB300 NVL72.
  • Nvidia will offer CPX as part of new NVL144 CPX racks and as separate trays or racks that attach to existing Rubin deployments.
  • The company pitches use cases such as understanding large software projects and hour-scale generative video, and it claims a $100 million deployment could generate $5 billion in token revenue.
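The rack-level figures above can be sanity-checked with simple arithmetic. The sketch below uses only the numbers cited in the announcement (30 petaFLOPs per CPX, 144 CPX devices, ~8 exaFLOPs per rack, the 7.5x claim, and the $100 million / $5 billion revenue pitch); the Rubin GPU share and the GB300 NVL72 throughput it implies are inferences, not official specs.

```python
# Back-of-the-envelope check of Nvidia's rack-level numbers.
PFLOPS = 1e15
EFLOPS = 1e18

cpx_per_rack = 144
cpx_nvfp4_pflops = 30     # ~30 petaFLOPs of NVFP4 per CPX device
rack_total_eflops = 8     # ~8 exaFLOPs NVFP4 for the NVL144 CPX rack
gb300_speedup = 7.5       # Nvidia's claimed ratio vs. GB300 NVL72

# Aggregate compute from the 144 CPX accelerators alone.
cpx_total_eflops = cpx_per_rack * cpx_nvfp4_pflops * PFLOPS / EFLOPS
print(f"CPX share of rack: {cpx_total_eflops:.2f} exaFLOPs")

# The remainder is implied to come from the 144 Rubin GPUs (inferred).
rubin_share_eflops = rack_total_eflops - cpx_total_eflops
print(f"Implied Rubin GPU share: {rubin_share_eflops:.2f} exaFLOPs")

# NVFP4 throughput a GB300 NVL72 rack would have under the 7.5x claim.
gb300_eflops = rack_total_eflops / gb300_speedup
print(f"Implied GB300 NVL72: {gb300_eflops:.2f} exaFLOPs")

# Nvidia's revenue pitch as a simple multiple.
revenue_multiple = 5e9 / 100e6
print(f"Claimed token-revenue multiple: {revenue_multiple:.0f}x")
```

The CPX accelerators alone account for about 4.3 of the rack's ~8 exaFLOPs, and the 7.5x claim implies roughly 1.07 exaFLOPs of NVFP4 for a GB300 NVL72 rack.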