Wes Roth | Qwen 3: A New Chinese Open-source AI Model
The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.
2025-04-29 02:00:00 - Wes Roth
Brief:
- Qwen 3 is a new Chinese open-source AI model that competes with top-tier models like DeepSeek R1 and Gemini 2.5 Pro.
- The flagship Qwen 3 model, with 235 billion parameters, uses a "mixture of experts" approach, activating different parts of the model based on the query.
- The model supports both thinking and non-thinking modes, allowing for deep reasoning or fast responses depending on the task.
- Qwen 3 is reported to outperform Gemini 2.5 Pro and o3-mini on benchmarks such as AIME 24 and AIME 25, and is described as especially strong in coding and agentic tasks.
- It has improved agentic capabilities, supports 119 languages, and offers enhanced programming abilities.
- Qwen 3's training dataset has been significantly expanded, drawing on both web and PDF sources, plus synthetically generated code and math data.
- The training process involves three stages, including reinforcement learning and combining thinking and non-thinking modes for better performance.
- The model is open-source under the Apache 2.0 license, allowing for commercial use and modification, promoting research and product development.
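
To make the "mixture of experts" point in the brief concrete, here is a minimal sketch of top-k expert routing. It is a toy illustration, not Qwen 3's actual implementation: the function names (`route_top_k`, `moe_forward`), the scalar "experts", and the router are all hypothetical, and real models typically route per token over learned neural-network experts rather than per query.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router, k=2):
    """Run only the k selected experts and blend their outputs.

    The remaining experts are never evaluated, which is why a
    mixture-of-experts model uses only part of its parameters per input.
    """
    output = 0.0
    for idx, weight in route_top_k(router(x), k):
        output += weight * experts[idx](x)
    return output

# Hypothetical toy experts and router (assumptions for illustration only).
experts = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: -x,
    lambda x: x * x,
]
router = lambda x: [x, -x, 0.5 * x, 0.0]

if __name__ == "__main__":
    print(route_top_k(router(3.0), k=2))
    print(moe_forward(3.0, experts, router, k=2))
```

With `k=1` the layer behaves like a hard switch to a single expert; larger `k` blends several experts at a proportionally higher compute cost.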