Particle News: Alibaba’s Aegaeon Claims 82% Cut in Nvidia GPU Needs in Peer-Reviewed Tests

Overview

In a multi-month Model Studio beta, the number of accelerators required fell from 1,192 to 213, a reduction of about 82%. Reports indicate the tests used Nvidia H20 chips, which remain available in China under U.S. export rules.
Scheduling work at the token level let a shared pool of GPUs serve many models and lifted effective output by up to nine times compared with older serverless approaches.
The research was presented at the 2025 ACM Symposium on Operating Systems Principles (SOSP) in Seoul by authors from Peking University and Alibaba, including CTO Jingren Zhou.
The paper does not detail the network fabric, and analysts caution the gains may depend on Alibaba’s vertically integrated stack, leaving portability outside its environment unverified.
Market commentary noted that shares of several data-center and related companies weakened after the reports, while the broader impact on GPU demand awaits independent replication.
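
As a quick check on the headline figure, the short Python sketch below recomputes the reduction from the two accelerator counts reported for the beta (1,192 and 213). The function name `reduction_pct` is illustrative only and does not come from the paper or from Alibaba's software.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Accelerator counts reported for the Model Studio beta.
before, after = 1192, 213

pct = reduction_pct(before, after)
ratio = before / after  # how many times fewer accelerators were needed

print(f"{pct:.1f}% fewer accelerators")
print(f"{ratio:.1f}x consolidation")
```

The result rounds to the 82% cited in the headline. Note that this consolidation ratio is a different measure from the "up to nine times" effective-output figure, which compares against older serverless approaches.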