QwQ-32B, Qwen Max… Which Alibaba Cloud model for which use?

Alibaba Cloud offers a wide range of LLMs suited to most market use cases, at very competitive prices.

China continues to excel in AI, with a new coup from Alibaba. On March 5, 2025, the company made a strong impression with QwQ-32B, a reasoning model that rivals the much-publicized DeepSeek R1 while using a fraction of its computational resources. Where DeepSeek R1 requires 671 billion parameters and more than 1,500 GB of VRAM (16 NVIDIA A100 GPUs), QwQ-32B reaches comparable performance with only 32 billion parameters and 24 GB of VRAM on a single GPU. This engineering feat is part of a broader Alibaba Cloud strategy: since April 2023, the company has developed a remarkably complete LLM family, Qwen. Its general-purpose models, proprietary and open source alike, regularly rank among the best performers in benchmarks, confirming the rise of Chinese AI against the American heavyweights.
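
To put the memory figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that the weights dominate the memory footprint and ignores the KV cache, activations and framework overhead; the function name and the choice of precisions are illustrative, not taken from Alibaba or DeepSeek documentation.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    # (params_billion * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


if __name__ == "__main__":
    # Parameter counts are the ones quoted in the article.
    for name, params in [("DeepSeek R1", 671), ("QwQ-32B", 32)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name:12s} {bits:2d}-bit weights: {gb:8.1f} GB")
```

Under these assumptions, 671 billion parameters at 16 bits come to about 1,342 GB of weights, which is in line with the "more than 1,500 GB" figure once overhead is added. For 32 billion parameters, 16-bit weights alone would need about 64 GB, so the 24 GB figure most likely presupposes a quantized version (around 16 GB of weights at 4 bits, plus working memory).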

Proprietary models and open source versions

As of March 2025, Alibaba Cloud offers both proprietary and open source models. Modality, context, performance on complex tasks, speed… Here is a comparison of all the large language models currently deployed on Alibaba Cloud Model Studio, Alibaba Cloud's platform dedicated to generative AI.

Proprietary models

| Model | Complex tasks | Modality (input) | Tokens (input) | Tokens (output) | Context | Fast |
| --- | --- | --- | --- | --- | --- | --- |
| Qwen-Max (2.5 Max) | X | text | 30,720 | 8,192 | 32,768 | |
| Qwen-Plus | X | text | 129,024 | 8,192 | 131,072 | X |
| Qwen-Turbo | | text | 1,000,000 | 8,192 | 1,000,000 | X |
| Qwen-VL-Plus | | text, image | 6,000 | 1,500 | 7,500 | |
| Qwen-VL-Max | X | text, image | 6,000 | 1,500 | 7,500 | |

Open source models

| Model | Complex tasks | Modality (input) | Tokens (input) | Tokens (output) | Context | Fast |
| --- | --- | --- | --- | --- | --- | --- |
| Qwen2.5-14B-Instruct-1M | | text | 1,000,000 | 8,192 | 1,000,000 | |
| Qwen2.5-7B-Instruct-1M | | text | 1,000,000 | 8,192 | 1,000,000 | X |
| Qwen2.5-72B-Instruct | X | text | 129,024 | 8,192 | 131,072 | |
| Qwen2.5-32B-Instruct | X | text | 129,024 | 8,192 | 131,072 | |
| Qwen2.5-14B-Instruct | | text | 129,024 | 8,192 | 131,072 | |
| Qwen2.5-7B-Instruct | | text | 129,024 | 8,192 | 131,072 | X |
| Qwen2-72B-Instruct | X | text | 128,000 | 6,144 | 131,072 | |
| Qwen2-57B-A14B-Instruct | | text | 63,488 | 6,144 | 65,536 | |
| Qwen2-7B-Instruct | | text | 128,000 | 6,144 | 131,072 | X |
| Qwen1.5-110B-Chat | | text | 6,000 | 2,000 | 8,000 | |
| Qwen1.5-72B-Chat | | text | 6,000 | 2,000 | 8,000 | |
| Qwen1.5-32B-Chat | | text | 6,000 | 2,000 | 8,000 | |
| Qwen1.5-14B-Chat | | text | 6,000 | 2,000 | 8,000 | |
| Qwen1.5-7B-Chat | | text | 6,000 | 2,000 | 8,000 | X |
| Qwen2.5-VL-72B-Instruct | X | text, image, video | 129,024 | 8,192 | 131,072 | |
| Qwen2.5-VL-7B-Instruct | | text, image, video | 129,024 | 8,192 | 131,072 | X |
| Qwen2.5-VL-3B-Instruct | | text, image, video | 129,024 | 8,192 | 131,072 | X |
| QwQ-32B | X | text | Not disclosed | Not disclosed | 131,072 | |

For organizations seeking optimal performance with commercial support, Alibaba Cloud's proprietary models offer a solution for each need. Qwen-Max (2.5 Max) is the natural choice for complex tasks requiring advanced intelligence, while Qwen-Plus offers an excellent balance between performance and cost. Companies that prioritize processing speed will turn to Qwen-Turbo, well suited to real-time applications and offering a large context window of one million tokens. For the analysis of visual content, Qwen-VL-Max excels at understanding complex images. A word of caution about Alibaba's proprietary models, however: sensitive sectors should remain vigilant about security issues.
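
The selection logic described above can be sketched as a simple filter over the proprietary models table. This is only an illustration: the `Model` class, the `candidates` function and its parameters are hypothetical names, and the values are copied from the comparison table in this article, not fetched from Alibaba Cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    complex_tasks: bool    # "Complex tasks" column
    modalities: frozenset  # "Modality (input)" column
    context: int           # "Context" column, in tokens
    fast: bool             # "Fast" column


# Values copied from the article's proprietary models table.
PROPRIETARY = [
    Model("Qwen-Max", True, frozenset({"text"}), 32_768, False),
    Model("Qwen-Plus", True, frozenset({"text"}), 131_072, True),
    Model("Qwen-Turbo", False, frozenset({"text"}), 1_000_000, True),
    Model("Qwen-VL-Plus", False, frozenset({"text", "image"}), 7_500, False),
    Model("Qwen-VL-Max", True, frozenset({"text", "image"}), 7_500, False),
]


def candidates(models, *, needs_complex=False, needs_image=False,
               needs_fast=False, min_context=0):
    """Return the names of the models that satisfy every stated requirement."""
    return [
        m.name
        for m in models
        if (m.complex_tasks or not needs_complex)
        and ("image" in m.modalities or not needs_image)
        and (m.fast or not needs_fast)
        and m.context >= min_context
    ]


if __name__ == "__main__":
    print(candidates(PROPRIETARY, needs_complex=True, needs_image=True))
    print(candidates(PROPRIETARY, min_context=500_000))
    print(candidates(PROPRIETARY, needs_complex=True, needs_fast=True))
```

With the table values, the three calls narrow the list down to Qwen-VL-Max, Qwen-Turbo and Qwen-Plus respectively, which matches the recommendations in the paragraph above.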

The open source versions of Qwen are an interesting alternative for companies concerned about their technological independence or facing strict regulatory constraints (provided the models are run locally). For demanding tasks, Qwen2.5-72B-Instruct stands out with its 72 billion parameters, while Qwen2.5-7B-Instruct handles everyday text needs efficiently and at higher speed. Multimodal models such as Qwen2.5-VL-72B-Instruct add the ability to process images and videos of up to 10 minutes. Qwen1.5 is, in our view, more relevant for production deployment.

Finally, for companies wishing to develop agentic capabilities, QwQ-32B is the ideal solution. Unveiled on March 5, 2025, the model reasons as well as DeepSeek R1 with only 32 billion parameters.

Very attractive pricing

| Model | Price per 1,000 input tokens ($) | Price per 1,000 output tokens ($) |
| --- | --- | --- |
| Qwen-Max (2.5 Max) | 0.0016 | 0.0064 |
| Qwen-Plus | 0.0004 | 0.0012 |
| Qwen-Turbo | 0.00005 | 0.0002 |
| Qwen-VL-Plus | 0.00021 | 0.00063 |
| Qwen-VL-Max | 0.0008 | 0.0032 |
| Qwen2.5-14B-Instruct-1M | Free (for now) | Free (for now) |
| Qwen2.5-7B-Instruct-1M | Free (for now) | Free (for now) |
| Qwen2.5-72B-Instruct | Free (for now) | Free (for now) |
| Qwen2.5-32B-Instruct | Free (for now) | Free (for now) |
| Qwen2.5-14B-Instruct | Free (for now) | Free (for now) |
| Qwen2.5-7B-Instruct | Free (for now) | Free (for now) |
| Qwen2-72B-Instruct | Free (for now) | Free (for now) |
| Qwen2-57B-A14B-Instruct | Free (for now) | Free (for now) |
| Qwen2-7B-Instruct | Free (for now) | Free (for now) |
| Qwen1.5-110B-Chat | Free (for now) | Free (for now) |
| Qwen1.5-72B-Chat | Free (for now) | Free (for now) |
| Qwen1.5-32B-Chat | Free (for now) | Free (for now) |
| Qwen1.5-14B-Chat | Free (for now) | Free (for now) |
| Qwen1.5-7B-Chat | Free (for now) | Free (for now) |
| Qwen2.5-VL-72B-Instruct | Free (for now) | Free (for now) |
| Qwen2.5-VL-7B-Instruct | Free (for now) | Free (for now) |
| Qwen2.5-VL-3B-Instruct | Free (for now) | Free (for now) |
| QwQ-32B | Not disclosed | Not disclosed |

The real strength of Alibaba Cloud's Qwen models lies in their exceptional value for money. Qwen-Max charges only $0.0016 per 1,000 input tokens and $0.0064 per 1,000 output tokens, whereas OpenAI's GPT-4o costs $0.0025 for input and $0.01 for output, about 1.6 times more expensive. The gap widens further with the fast models: Qwen-Turbo ($0.00005 for input, $0.0002 for output) is three times less expensive than GPT-4o mini ($0.00015 for input, $0.0006 for output).
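
The price ratios quoted above can be checked with a few lines of Python. This is a minimal sketch: the prices are the per-1,000-token figures given in this article, while the function names and the 10,000-input / 2,000-output workload are purely illustrative assumptions.

```python
# USD per 1,000 tokens as (input, output), taken from the figures in the article.
PRICES = {
    "Qwen-Max": (0.0016, 0.0064),
    "GPT-4o": (0.0025, 0.01),
    "Qwen-Turbo": (0.00005, 0.0002),
    "GPT-4o mini": (0.00015, 0.0006),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request for the given token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1000


def cost_ratio(model_a: str, model_b: str,
               input_tokens: int, output_tokens: int) -> float:
    """How many times more model_a costs than model_b for the same workload."""
    return (request_cost(model_a, input_tokens, output_tokens)
            / request_cost(model_b, input_tokens, output_tokens))


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
    workload = (10_000, 2_000)
    print(f"GPT-4o vs Qwen-Max: {cost_ratio('GPT-4o', 'Qwen-Max', *workload):.2f}x")
    print(f"GPT-4o mini vs Qwen-Turbo: {cost_ratio('GPT-4o mini', 'Qwen-Turbo', *workload):.2f}x")
```

Because input and output prices differ by the same factor within each pair, the ratio does not depend on the workload chosen: it comes out at about 1.56 for GPT-4o against Qwen-Max and 3 for GPT-4o mini against Qwen-Turbo.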

Qwen stands out as an alternative to American models. Combining technical performance, a diverse offering (proprietary and open source models) and prices up to three times lower than the competition, Qwen perfectly illustrates the rise of China in artificial intelligence. Its value proposition suits most business use cases. Alibaba Cloud is now a key player in the global generative AI market.