Particle.news

Alibaba’s Upgraded Qwen3 Model Tops OpenAI and DeepSeek in Maths and Coding

The upgrade expands the model’s context window eightfold to 256,000 tokens, enabling it to process lengthy documents.

Overview

  • The open-source Qwen3-235B-A22B-Instruct-2507-FP8 achieved significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • It scored 70.3 on the 2025 American Invitational Mathematics Examination, outperforming DeepSeek-V3-0324 by 23.7 points and GPT-4o-0327 by 43.6 points.
  • On the MultiPL-E coding benchmark, it earned 87.9 points, ahead of DeepSeek and OpenAI models but just behind Anthropic’s Claude Opus 4 Non-thinking.
  • The model operates exclusively in non-thinking mode, delivering direct answers without exposing intermediate reasoning steps.
  • Alibaba plans to integrate a three-billion-parameter Qwen variant into HP’s Xiaowei Hui assistant on PCs in China to enhance document drafting and meeting summarization.