---
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
license: apache-2.0
library_name: vllm
inference: false
extra_gated_description: >-
  If you want to learn more about how we process your personal data, please read
  our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- transformers
---
# Model Card for Mistral-Small-24B-Base-2501

Mistral Small 3 (2501) sets a new benchmark in the "small" Large Language Model category (below 70B), boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models!
Check out our fine-tuned Instruct version, [Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).

For enterprises that need specialized capabilities (increased context, particular modalities, domain-specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community.

This release demonstrates our commitment to open source, serving as a strong base model.

Learn more about Mistral Small in our [blog post](https://mistral.ai/news/mistral-small-3/).

Model developer: Mistral AI Team
## Key Features
- **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
- **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.
- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window:** A 32k context window.
- **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size (see the usage sketch below).
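The snippet below is a minimal usage sketch, not part of the official card: it loads the base model with the Hugging Face `transformers` library (listed in this card's tags), checks the tokenizer's vocabulary size against the Tekken figure above, and runs a plain text completion. The prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch (assumed setup): plain text completion with the base model.
# This is a base model, not instruction-tuned, so phrase prompts as text to continue.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-24B-Base-2501"  # model ID from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(len(tokenizer))  # Tekken tokenizer; expect roughly 131k per the card

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard across available GPUs; 24B params need substantial VRAM
    torch_dtype="auto",  # load in the checkpoint's native precision
)

prompt = "The main ingredients of a classic ratatouille are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```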
## Benchmark Results

| Benchmark            | Metric        | Mistral-Small-24B-Base |
| -------------------- | ------------- | ---------------------- |
| [MMLU][mmlu]         | 5-shot        | 80.73                  |
| [MMLU Pro][mmlu_pro] | 5-shot, CoT   | 54.37                  |
| [GPQA Main][gpqa]    | 5-shot, CoT   | 34.37                  |
| [TriviaQA][triviaqa] | 5-shot        | 80.32                  |
| [ARC-c][arc]         | 0-shot        | 91.29                  |
| [TriviaQA][triviaqa] | 5-shot        | 76.6                   |
| [MBPP][mbpp]         | pass@1        | 69.64                  |
| [GSM8K][gsm8k]       | 5-shot, maj@1 | 80.73                  |
| [MATH][math]         | 4-shot, maj   | 45.98                  |
| [AGIEval][agieval]   | -             | 65.80                  |

| Benchmark     | Metric | Mistral-Small-24B-Base |
| ------------- | ------ | ---------------------- |
| French MMLU   | -      | 78.03                  |
| German MMLU   | -      | 77.69                  |
| Spanish MMLU  | -      | 78.86                  |
| Russian MMLU  | -      | 75.64                  |
| Chinese MMLU  | -      | 70.35                  |
| Korean MMLU   | -      | 56.42                  |
| Japanese MMLU | -      | 74.46                  |
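Since this card sets `library_name: vllm`, here is a hedged serving sketch using vLLM's offline Python API. It is not taken from the card itself; the prompt and sampling parameters are illustrative assumptions.

```python
# Assumed sketch: offline batch completion with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-Small-24B-Base-2501")
params = SamplingParams(temperature=0.7, max_tokens=64)

# Base-model prompting: phrase requests as text to be continued.
outputs = llm.generate(["The three primary colors are"], params)
print(outputs[0].outputs[0].text)
```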
[mmlu]: https://arxiv.org/abs/2009.03300
[hellaswag]: https://arxiv.org/abs/1905.07830
[piqa]: https://arxiv.org/abs/1911.11641
[socialiqa]: https://arxiv.org/abs/1904.09728
[boolq]: https://arxiv.org/abs/1905.10044
[winogrande]: https://arxiv.org/abs/1907.10641
[commonsenseqa]: https://arxiv.org/abs/1811.00937
[openbookqa]: https://arxiv.org/abs/1809.02789
[arc]: https://arxiv.org/abs/1911.01547
[triviaqa]: https://arxiv.org/abs/1705.03551
[naturalq]: https://github.com/google-research-datasets/natural-questions
[humaneval]: https://arxiv.org/abs/2107.03374
[mbpp]: https://arxiv.org/abs/2108.07732
[gsm8k]: https://arxiv.org/abs/2110.14168
[realtox]: https://arxiv.org/abs/2009.11462
[bold]: https://arxiv.org/abs/2101.11718
[crows]: https://aclanthology.org/2020.emnlp-main.154/
[bbq]: https://arxiv.org/abs/2110.08193v2
[winogender]: https://arxiv.org/abs/1804.09301
[truthfulqa]: https://arxiv.org/abs/2109.07958
[winobias]: https://arxiv.org/abs/1804.06876
[math]: https://arxiv.org/abs/2103.03874
[agieval]: https://arxiv.org/abs/2304.06364
[big-bench]: https://arxiv.org/abs/2206.04615
[toxigen]: https://arxiv.org/abs/2203.09509
[mmlu_pro]: https://arxiv.org/abs/2406.01574
[gpqa]: https://arxiv.org/abs/2311.12022