Particle.news

Qwen3-TTS Launches as Open-Source, Low-Latency Multilingual TTS With 3-Second Voice Cloning

Open licensing signals a push for broad community use.

Overview

  • A newly posted arXiv report details a dual-track language-model design paired with two speech tokenizers to enable real-time synthesis and fine-grained control.
  • The 12Hz tokenizer is built for ultra-low latency, with a reported first-packet latency of about 97 milliseconds.
  • Training covers more than five million hours of speech data across ten languages to support robust multilingual speech generation.
  • The authors report state-of-the-art results on three evaluations: a multilingual TTS test set, InstructTTSEval, and a long-speech benchmark.
  • Tongyi Lab says the full family is open-sourced under Apache 2.0, including weights, code, and demos. Variants include VoiceDesign, CustomVoice, and Base, at 0.6B and 1.7B parameters, with fine-tuning support.