Overview
- Z.ai released two open-source general language models, GLM-4.5 and the lighter GLM-4.5-Air, which unify reasoning, coding and agentic capabilities in a hybrid design with separate “thinking” and “non-thinking” modes and a 128,000-token context window.
- GLM-4.5 needs only eight Nvidia H20 chips to run, about half the hardware footprint of DeepSeek R1, thanks to a mixture-of-experts design in which 32 billion of its 355 billion total parameters are active at inference.
- Z.ai undercuts DeepSeek on price, charging $0.11 per million input tokens and $0.28 per million output tokens versus DeepSeek’s $0.14 and $2.19, and the model weights are free to download.
- Across 12 industry benchmarks, GLM-4.5 and GLM-4.5-Air ranked third globally, trailing only OpenAI’s o3 and xAI’s Grok-4 and outperforming most Western open-source models.
- Despite its U.S. Entity List designation, Z.ai has secured over $1.5 billion from investors including Alibaba, Tencent and state-backed funds and is preparing for a Greater China IPO.
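
To put the parameter and pricing figures above in perspective, the following is a minimal sketch that computes the share of parameters active at inference and compares API costs. The rates and parameter counts are the ones quoted in this overview; the workload size (10 million input tokens, 2 million output tokens) and the helper name `cost_usd` are assumptions chosen purely for illustration.

```python
# Figures quoted in the overview above (parameters in billions).
TOTAL_PARAMS_B = 355
ACTIVE_PARAMS_B = 32

# USD per million tokens as (input rate, output rate), as quoted above.
PRICES = {
    "Z.ai": (0.11, 0.28),
    "DeepSeek": (0.14, 2.19),
}


def cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: API cost in USD for a given token volume."""
    in_rate, out_rate = PRICES[provider]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Share of the mixture-of-experts parameters used per token at inference.
active_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Assumed workload, for illustration only.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

zai = cost_usd("Z.ai", INPUT_TOKENS, OUTPUT_TOKENS)
deepseek = cost_usd("DeepSeek", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Active parameter share: {active_share:.1%}")
print(f"Z.ai cost:     ${zai:.2f}")
print(f"DeepSeek cost: ${deepseek:.2f}")
```

Under these assumptions, roughly 9% of the parameters are active per token, and the example workload costs about $1.66 at Z.ai's rates versus about $5.78 at DeepSeek's. Most of the gap comes from the output-token rate, so the difference widens for generation-heavy workloads.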