Overview
- DeepSeek released and open-sourced DeepSeek-V3.2-Exp, making the model available across its app, web and API, with a paper on GitHub and model files on HuggingFace and ModelScope.
- The update introduces DeepSeek Sparse Attention, which the company says significantly improves training and inference efficiency on long texts while maintaining output quality.
- Training settings were deliberately aligned with V3.1-Terminus, and performance on public benchmarks is reported as roughly on par with that model, enabling a fair comparison of the two.
- API pricing was reduced by more than 50% for developers, and access to the V3.1-Terminus endpoint is temporarily retained to support side-by-side testing.
- Huawei’s Ascend platform announced same-day support with open inference code and new operator implementations, citing time-to-first-token (TTFT) under 2 seconds and time-per-output-token (TPOT) under 30 milliseconds at 128K context length using vLLM and SGLang integrations.
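The efficiency gain described above comes from restricting each query to a small subset of keys rather than attending over the full sequence. The overview does not give DeepSeek Sparse Attention's exact formulation, so the following is only a minimal illustrative sketch of the general idea using simple top-k key selection; the function name, shapes, and selection rule are assumptions, not DeepSeek's actual mechanism.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k):
    """Illustrative sparse attention: each query attends only to its
    top_k highest-scoring keys instead of the full key set.
    NOTE: a hypothetical sketch, not DeepSeek Sparse Attention itself.
    q: (Tq, d), k: (Tk, d), v: (Tk, d) -> output (Tq, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])           # (Tq, Tk)
    # Keep only the top_k scores per query; mask the rest to -inf
    # so they receive zero weight after the softmax.
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, idx, 0.0, axis=-1)
    masked = scores + mask
    # Numerically stable softmax over the surviving scores.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `top_k` equal to the full key count this reduces to ordinary dense attention; shrinking `top_k` cuts the per-query work from O(Tk) to O(top_k), which is the kind of saving that matters most at long context lengths such as 128K.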