Overview
- Z.ai has released two models, GLM-4.5 and the smaller GLM-4.5-Air, built on an agentic AI architecture that splits tasks into sub-tasks for more accurate completion
- The series employs a Mixture of Experts design with 355 billion parameters, of which only 32 billion are active at any moment, enabling efficient operation on eight Nvidia H20 chips
- API pricing is set at $0.11 per million input tokens and $0.28 per million output tokens, undercutting DeepSeek’s rates of $0.14 and $2.19 respectively
- The GLM-4.5 models are available under an MIT license despite Z.ai’s placement on the U.S. Entity List, allowing developers unrestricted access on GitHub and Hugging Face
- Backed by over $1.5 billion in state and private funding, Z.ai is advancing plans for an initial public offering in Greater China
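
The pricing and parameter figures above can be made concrete with some quick arithmetic. The following is a minimal sketch, not an official calculator: the per-million-token prices and parameter counts come from the bullets, while the function names and the sample workload (10 million input tokens, 2 million output tokens) are hypothetical and chosen only for illustration.

```python
# Prices in USD per million tokens, as quoted in the overview above.
PRICES = {
    "GLM-4.5": {"input": 0.11, "output": 0.28},
    "DeepSeek": {"input": 0.14, "output": 2.19},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the quoted per-million-token rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def active_fraction(active_params_b=32, total_params_b=355):
    """Return the share of parameters active at once in the Mixture of Experts design."""
    return active_params_b / total_params_b


if __name__ == "__main__":
    # Hypothetical workload, assumed for illustration only.
    input_tokens, output_tokens = 10_000_000, 2_000_000
    for model in PRICES:
        cost = workload_cost(model, input_tokens, output_tokens)
        print(f"{model}: ${cost:.2f}")
    print(f"Active parameter share: {active_fraction():.1%}")
```

Because the gap is much wider on output tokens than on input tokens, the savings depend heavily on how output-heavy a workload is. The active share works out to roughly 9% of the total parameters.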