DeepSeek Publishes 'Engram' Conditional Memory for LLMs With Reported Gains Over MoE

By pairing deterministic N‑gram lookups with a learned context gate, the module shifts static pattern retrieval out of attention, freeing compute for reasoning.
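
As a rough illustration of that mechanism, the sketch below hashes each token's trailing N‑gram to a row of a large embedding table and admits the retrieved vector through a learned sigmoid gate. The class name, rolling hash, and gate design here are assumptions chosen for illustration; the paper's actual lookup and gating may differ.

    # Minimal sketch of a gated deterministic N-gram memory, not DeepSeek's code.
    # NGramMemory, the polynomial hash, and the scalar gate are illustrative.
    import torch
    import torch.nn as nn

    class NGramMemory(nn.Module):
        """Deterministic N-gram lookup with a learned context gate (illustrative)."""

        def __init__(self, table_size: int, d_model: int, n: int = 2):
            super().__init__()
            self.n = n
            self.table_size = table_size
            # Large embedding table indexed deterministically by hashed N-grams.
            self.table = nn.Embedding(table_size, d_model)
            # Context gate: decides per token how much retrieved memory to admit.
            self.gate = nn.Linear(d_model, 1)

        def ngram_ids(self, token_ids: torch.Tensor) -> torch.Tensor:
            # Hash each token's trailing N-gram to a table index. This is a pure
            # function of the token ids -- no attention, no learned routing.
            ids = torch.zeros_like(token_ids)
            for k in range(self.n):
                shifted = torch.roll(token_ids, shifts=k, dims=-1)
                shifted[..., :k] = 0  # zero out the wrap-around positions
                ids = ids * 1000003 + shifted  # simple rolling polynomial hash
            return ids % self.table_size

        def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
            retrieved = self.table(self.ngram_ids(token_ids))  # [B, T, D]
            g = torch.sigmoid(self.gate(hidden))               # [B, T, 1] context gate
            return hidden + g * retrieved                      # gated residual add

Because the index is a fixed function of the input token ids rather than a learned attention pattern, the rows a batch will need are known before the forward pass begins, which is what makes the CPU offload described in the overview possible.
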

Overview

  • The paper identifies a U-shaped sparsity-allocation law, with roughly 20–25% of sparse parameters assigned to Engram yielding the lowest validation loss.
  • Under matched parameter and FLOPs budgets, Engram-27B outperforms a MoE-27B baseline on knowledge, reasoning, code, and math benchmarks (e.g., MMLU +3.0, BBH +5.0, HumanEval +3.0, MATH +2.4).
  • The approach improves long-context and variable-tracking performance in the authors' tests, including RULER Multi-Query NIAH rising from 84.2 to 97.0 and Variable Tracking from 77.0 to 89.0.
  • Because retrieval is deterministic, very large embedding tables can be offloaded to CPU memory and prefetched, with reported inference throughput overhead of about 3% on H800-class GPUs (see the sketch after this list).
  • Media reports suggest DeepSeek may integrate Engram into an upcoming V4 sparse model before the Lunar New Year, though this has not been independently confirmed.
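
The offload bullet above relies on the indices being computable up front. The sketch below shows one way that property can be exploited: gather the needed rows into a pinned staging buffer on the host, then start an asynchronous copy on a side stream so it overlaps with GPU compute. Sizes, names, and the staging-buffer scheme are assumptions for illustration, not DeepSeek's implementation; it requires a CUDA device.

    # Illustrative CPU-offload prefetch, assuming lookup indices are known
    # before the forward pass (the deterministic-retrieval property above).
    import torch

    d_model = 256
    table_rows = 100_000          # stands in for a table far too large for GPU RAM
    max_rows_per_step = 4096

    cpu_table = torch.randn(table_rows, d_model)                    # lives in host RAM
    staging = torch.empty(max_rows_per_step, d_model).pin_memory()  # pinned for async copy
    copy_stream = torch.cuda.Stream()

    def prefetch(indices: torch.Tensor) -> torch.Tensor:
        """Start an async host-to-device copy of the rows this step will need."""
        n = indices.numel()
        # Deterministic gather on the CPU into the pinned staging buffer.
        torch.index_select(cpu_table, 0, indices, out=staging[:n])
        with torch.cuda.stream(copy_stream):
            # Pinned source + non_blocking=True lets the copy overlap with
            # GPU compute running on the default stream.
            return staging[:n].to("cuda", non_blocking=True)

    # Just before the memory layer consumes the rows, synchronize the streams:
    # torch.cuda.current_stream().wait_stream(copy_stream)
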