Overview
- DeepSeek made the experimental V3.2‑Exp model and paper publicly available on Hugging Face, ModelScope, and GitHub, and rolled the model out to its app and web clients.
- The update introduces DeepSeek Sparse Attention, a fine‑grained sparse attention mechanism aimed at boosting long‑sequence training and inference efficiency (see the illustrative sketch after this list).
- Training settings were aligned with V3.1‑Terminus, and DeepSeek reports roughly comparable performance on public benchmarks while keeping temporary V3.1‑Terminus API endpoints available for comparison.
- DeepSeek reduced API pricing by more than 50% effective immediately, citing lower service costs with the new model.
- Huawei’s Ascend platform announced 0‑day support, open‑sourcing inference code and operators for vLLM and SGLang; it claims a time to first token (TTFT) under 2 seconds and a time per output token (TPOT) under 30 milliseconds at 128K sequence length, alongside new LI and SFA operator implementations and tooling such as PyPTO and TileLang (a generic vLLM usage sketch follows this list).
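
To illustrate the general idea behind fine‑grained sparse attention, the toy PyTorch sketch below selects a small top‑k set of earlier tokens per query and attends only to those. This is a simplified assumption‑level illustration, not the DeepSeek Sparse Attention design itself (the actual indexer, kernels, and training procedure are specified in DeepSeek's paper and code), and because it still computes the full score matrix it shows only the selection idea, not the efficiency gain.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_top=64):
    """Toy per-query top-k sparse attention (single head, causal).

    q, k, v: [seq_len, d] tensors. For each query position, only the
    k_top highest-scoring earlier positions contribute to the output,
    instead of all preceding tokens.
    """
    L, d = q.shape
    scores = q @ k.T / d**0.5                        # [L, L] raw attention scores
    causal = torch.tril(torch.ones(L, L, dtype=torch.bool))
    scores = scores.masked_fill(~causal, float("-inf"))

    # Keep only the top-k scores per query row; mask out everything else.
    k_eff = min(k_top, L)
    topk_vals, topk_idx = scores.topk(k_eff, dim=-1)
    sparse_scores = torch.full_like(scores, float("-inf"))
    sparse_scores.scatter_(-1, topk_idx, topk_vals)

    weights = F.softmax(sparse_scores, dim=-1)       # attention over selected tokens only
    return weights @ v                               # [L, d]

# Example: 1,024-token toy sequence with 64-dim heads (hypothetical sizes).
q = torch.randn(1024, 64)
k = torch.randn(1024, 64)
v = torch.randn(1024, 64)
out = topk_sparse_attention(q, k, v, k_top=128)
print(out.shape)  # torch.Size([1024, 64])
```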
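
For context on the vLLM serving path mentioned above, the sketch below shows a minimal, generic vLLM offline‑inference call. The model ID and engine arguments are assumptions based on the public release; Huawei's Ascend‑specific operator packages and launch configuration are not reproduced here.

```python
# Minimal vLLM offline-inference sketch. Assumptions: the checkpoint is
# published as "deepseek-ai/DeepSeek-V3.2-Exp" and an installed vLLM build
# supports it; Ascend-specific operators from Huawei's release are not shown.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2-Exp",  # assumed Hugging Face repo ID
    trust_remote_code=True,                 # assumption: checkpoint ships custom model code
    tensor_parallel_size=8,                 # adjust to the available accelerators
)

params = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(["Summarize sparse attention in one sentence."], params)
print(outputs[0].outputs[0].text)
```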