DeepSeek Releases Open-Source V3.2‑Exp With Sparse Attention, Cuts API Prices as Huawei Offers Day‑0 Support

Framed as a transitional step, the release isolates a new sparse‑attention design to improve long‑context efficiency without degrading public‑benchmark results.

Overview

  • DeepSeek made the experimental V3.2‑Exp model and paper publicly available on Hugging Face, ModelScope, and GitHub, and rolled the model into its app and web clients.
  • The update introduces DeepSeek Sparse Attention, a fine‑grained sparse attention mechanism aimed at boosting long‑sequence training and inference efficiency; a minimal sketch of the idea follows this list.
  • Training settings were aligned with V3.1‑Terminus so that performance differences can be attributed to the attention change; DeepSeek reports roughly comparable results on public benchmarks and is keeping temporary V3.1‑Terminus API endpoints up for comparison.
  • DeepSeek reduced API pricing by more than 50% effective immediately, citing lower service costs with the new model.
  • Huawei’s Ascend platform announced day‑0 support, open‑sourcing inference code and operators through vLLM and SGLang. It claims time‑to‑first‑token (TTFT) under 2 seconds and time‑per‑output‑token (TPOT) under 30 milliseconds at 128K sequence length, built on new LI and SFA operator implementations and tools such as PyPTO and TileLang; a back‑of‑envelope reading of those latency figures follows below.
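
The coverage does not describe how DeepSeek Sparse Attention works internally. As a rough intuition for what a fine‑grained sparse attention mechanism does, the sketch below uses a cheap per‑token relevance score to pick the top‑k past tokens for each query and runs ordinary attention only over that subset. The function name, the random stand‑in for the indexer, and the choice of k are illustrative assumptions, not DeepSeek's design.

```python
import numpy as np

def topk_sparse_attention(q, K, V, idx_scores, k=64):
    """Single-query sparse attention: attend only to the top-k past tokens.

    q          : (d,)   query vector for the current position
    K, V       : (T, d) cached keys and values for T past tokens
    idx_scores : (T,)   cheap per-token relevance scores from a lightweight
                 indexer (illustrative stand-in for a learned selector)
    k          : number of tokens each query may attend to
    """
    T, d = K.shape
    k = min(k, T)
    # Pick the k highest-scoring past tokens (cheap O(T) selection).
    keep = np.argpartition(idx_scores, -k)[-k:]
    # Dense attention restricted to that subset: cost O(k*d), not O(T*d).
    logits = K[keep] @ q / np.sqrt(d)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[keep]

# Toy usage: 8192 cached tokens, but each query attends to only 64 of them.
rng = np.random.default_rng(0)
T, d = 8192, 128
K, V = rng.standard_normal((T, d)), rng.standard_normal((T, d))
q = rng.standard_normal(d)
scores = rng.standard_normal(T)   # a real indexer would be learned, not random
out = topk_sparse_attention(q, K, V, scores, k=64)
print(out.shape)                  # (128,)
```

The efficiency gain comes from k staying fixed as the context grows, so per‑token attention cost scales with k rather than with the full sequence length.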
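For context on Huawei's latency claims, the following converts the quoted bounds into throughput figures, assuming the 128K number refers to the prompt length; this is arithmetic on the announced numbers, not measured data.

```python
# Back-of-envelope throughput implied by the claimed latency bounds
# (assumes the 128K figure is the prompt length; purely illustrative).
prompt_tokens = 128_000
ttft_s = 2.0      # claimed time-to-first-token upper bound
tpot_s = 0.030    # claimed time-per-output-token upper bound

prefill_tps = prompt_tokens / ttft_s   # >= 64,000 tokens/s during prefill
decode_tps = 1.0 / tpot_s              # >= ~33 tokens/s per stream at decode
print(f"prefill >= {prefill_tps:,.0f} tok/s, decode >= {decode_tps:.0f} tok/s")
```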