Overview
- xAI's Colossus supercomputer, located in Memphis, Tennessee, currently operates with 100,000 NVIDIA Hopper GPUs and is recognized as the largest AI training cluster globally.
- The supercomputer was constructed in just 122 days, with AI model training commencing 19 days after installation, a timeline NVIDIA CEO Jensen Huang praised as "superhuman".
- Colossus utilizes NVIDIA's Spectrum-X Ethernet platform for high data throughput and low latency, both crucial for AI training workloads that must continually move vast amounts of data between GPUs (a minimal sketch of this traffic pattern follows the list).
- Plans are underway to expand Colossus to 200,000 GPUs, adding newer, memory-upgraded H200 GPUs alongside the existing H100 units, which will significantly increase its computational capacity.
- The supercomputer trains xAI's Grok language models, which power AI-driven chatbot features for X Premium subscribers, underscoring the system's central role in xAI's AI development.
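To make the networking point concrete, the sketch below shows the gradient all-reduce that dominates inter-GPU traffic in data-parallel training: every optimizer step, each worker's gradients are summed across the whole cluster, so step time is bounded by fabric throughput and latency. This is a generic, hypothetical PyTorch/NCCL illustration under assumed defaults, not xAI's actual training code, and the tensor size is an arbitrary stand-in for one gradient bucket.

```python
# Hypothetical sketch of data-parallel gradient synchronization --
# the collective traffic a fabric like Spectrum-X must carry.
# Not xAI's code; a generic PyTorch/NCCL illustration.
import os

import torch
import torch.distributed as dist


def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for each worker,
    # which the default env:// initialization reads.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Stand-in for one bucket of fp16 gradients (~512 MB).
    grads = torch.randn(256 * 1024 * 1024, dtype=torch.float16, device="cuda")

    # Sum gradients across all workers, then average. At cluster scale
    # this collective runs every step, so interconnect bandwidth and
    # latency directly gate training throughput.
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)
    grads /= dist.get_world_size()

    if dist.get_rank() == 0:
        print(f"all-reduce complete across {dist.get_world_size()} GPUs")
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with, e.g., `torchrun --nproc_per_node=8 allreduce_sketch.py` on one node; multi-node runs add rendezvous flags. The same pattern scales to clusters of any size, which is why per-step collective cost, not raw GPU count alone, determines effective training capacity.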