On January 30, Mistral AI, the French generative AI unicorn, introduced Small 3, a 24-billion-parameter LLM, demonstrating that an LLM does not need an astronomical number of parameters to be efficient. Its successor, Small 3.1, retains a compact architecture while bringing significant improvements in performance, multimodal understanding and long-context handling, outperforming models such as Google's Gemma 3-IT 27B and OpenAI's GPT-4o Mini.

Like its predecessor, Small 3.1 has 24 billion parameters and can be deployed on accessible hardware, such as a PC with a single RTX 4090 GPU or a Mac with 32 GB of RAM, which lets companies keep control of their sensitive data without depending on centralized cloud infrastructure. Inference speed is unchanged at 150 tokens per second, which keeps latency low for applications that need near-instant responses. True to its commitment to open source, Mistral AI releases both models under the Apache 2.0 license, allowing the community to use, fine-tune and deploy them for a variety of use cases.

Source: Mistral AI
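
The hardware and speed figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the bit widths are assumptions (the article does not say which numeric precision or quantization is used), and the estimate covers model weights alone, ignoring the KV cache, activations and runtime overhead.

```python
# Rough, illustrative estimates only. Function names are hypothetical and the
# bit widths are assumptions; the article does not state the precision used.

def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def generation_seconds(n_tokens: int, tokens_per_second: float = 150.0) -> float:
    """Time to generate n_tokens at a steady decoding rate."""
    return n_tokens / tokens_per_second


N_PARAMS = 24e9  # 24 billion parameters, as stated in the article

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")

# At the quoted 150 tokens per second, a 500-token reply takes a few seconds.
print(f"500-token reply: ~{generation_seconds(500):.1f} s")
```

Under these assumptions, 16-bit weights alone come to about 48 GB, more than the 24 GB of VRAM on an RTX 4090, whereas 8-bit or 4-bit weights come to about 24 GB or 12 GB. This suggests that single-GPU deployment relies on some form of quantization, although the article does not specify it.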

Performance optimization

Small 3.1 builds on Small 3, and one of its major advances is the expansion of the context window from 32,000 to 128,000 tokens, an essential asset for tasks that involve reasoning over long text sequences. While Mistral Small 3 focused mainly on text, version 3.1 improves the interpretation of images and documents, which positions it favorably against small proprietary models and opens the door to a range of applications, from industrial quality control and document recognition to the automated analysis of medical images.

Mistral Small 3.1 is available in two versions:

  • An instruction-tuned version, Mistral Small 3.1 Instruct, ready to use for conversational and language-understanding tasks;
  • A pretrained version, Mistral Small 3.1 Base, suited to fine-tuning and specialization in specific domains (health, finance, legal, etc.).

The Instruct version is among the best models in its category, outperforming its competitors on benchmarks that require reasoning and contextual understanding. According to benchmarks shared by Mistral AI:

  • Small 3.1 Instruct outperforms Google's Gemma 3-IT (27B) on text, multimodal and multilingual tasks;

  • It beats OpenAI's GPT-4o Mini on benchmarks such as MMLU, HumanEval and LongBench v2, thanks in particular to its context window extended to 128,000 tokens;

  • It also surpasses Claude-3.5 Haiku on complex tasks involving long contexts and multimodal data;

  • It outperforms Cohere Aya-Vision (32B) on multimodal benchmarks such as ChartQA and DocVQA, demonstrating an advanced understanding of visual and textual data;

  • Small 3.1 performs strongly on multilingual tasks, surpassing its competitors in categories such as European and Asian languages.

Mistral Small 3.1 can be downloaded from Hugging Face and tested on the Mistral AI platform. It is also available on Google Cloud Vertex AI and will be offered on NVIDIA NIM in the coming weeks.