bigband committed · Commit 24e589c · verified · Parent(s): 5273695

Upload README.md with huggingface_hub

Files changed (1): README.md added (+95 lines)
---
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
license: apache-2.0
library_name: vllm
inference: false
extra_gated_description: >-
  If you want to learn more about how we process your personal data, please read
  our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- transformers
---

# Model Card for Mistral-Small-24B-Base-2501

Mistral Small 3 (2501) sets a new benchmark in the "small" Large Language Model category below 70B: with 24B parameters, it achieves state-of-the-art capabilities comparable to larger models.
Check out our fine-tuned Instruct version, [Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).

For enterprises that need specialized capabilities (increased context, particular modalities, domain-specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community.

This release demonstrates our commitment to open source, serving as a strong base model.

Learn more about Mistral Small in our [blog post](https://mistral.ai/news/mistral-small-3/).

Model developer: Mistral AI Team

## Key Features
- **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
- **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.
- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window:** A 32k context window.
- **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size (a quick way to load and inspect it is sketched below).

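As a minimal sketch of how the base model and its tokenizer can be loaded with the `transformers` library (one of this card's tags): the model id comes from the card, while the prompt, dtype, and generation settings are illustrative assumptions rather than official recommendations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-24B-Base-2501"

# Tekken tokenizer: expect a vocabulary on the order of 131k entries.
tokenizer = AutoTokenizer.from_pretrained(model_id)
print(f"tokenizer entries: {len(tokenizer)}")

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit your hardware
    device_map="auto",           # requires the `accelerate` package
)

# This is a base model, not the Instruct variant: it continues raw text,
# so pass a plain prefix rather than a chat-formatted conversation.
inputs = tokenizer("The three primary colors are", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Since the card's metadata names `vllm` as the serving library, the same checkpoint should also load through vLLM's `LLM(model=...)` entry point for higher-throughput inference.
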
## Benchmark Results

| Benchmark            | Metric        | Mistral-Small-24B-Base |
| -------------------- | ------------- | ---------------------- |
| [MMLU][mmlu]         | 5-shot        | 80.73                  |
| [MMLU Pro][mmlu_pro] | 5-shot, CoT   | 54.37                  |
| [GPQA Main][gpqa]    | 5-shot, CoT   | 34.37                  |
| [TriviaQA][triviaqa] | 5-shot        | 80.32                  |
| [ARC-c][arc]         | 0-shot        | 91.29                  |
| [TriviaQA][triviaqa] | 5-shot        | 76.6                   |
| [MBPP][mbpp]         | pass@1        | 69.64                  |
| [GSM8K][gsm8k]       | 5-shot, maj@1 | 80.73                  |
| [MATH][math]         | 4-shot, maj   | 45.98                  |
| [AGIEval][agieval]   | -             | 65.80                  |

| Benchmark     | Metric | Mistral-Small-24B-Base |
| ------------- | ------ | ---------------------- |
| French MMLU   | -      | 78.03                  |
| German MMLU   | -      | 77.69                  |
| Spanish MMLU  | -      | 78.86                  |
| Russian MMLU  | -      | 75.64                  |
| Chinese MMLU  | -      | 70.35                  |
| Korean MMLU   | -      | 56.42                  |
| Japanese MMLU | -      | 74.46                  |

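For readers unfamiliar with the metric notation above: an *n*-shot evaluation prepends *n* worked examples to the prompt before the test item, which is the standard way to probe a base model that has not been instruction-tuned. Purely as an illustration (the example items below are invented and unrelated to any benchmark), a 5-shot prompt can be assembled like this:

```python
# Illustrative 5-shot prompt construction; real evaluations use
# harness-specific prompt templates and the actual benchmark datasets.
FEW_SHOT = [
    ("Q: 2 + 2 = ?", "A: 4"),
    ("Q: 3 * 3 = ?", "A: 9"),
    ("Q: 10 - 7 = ?", "A: 3"),
    ("Q: 12 / 4 = ?", "A: 3"),
    ("Q: 5 + 8 = ?", "A: 13"),
]

def five_shot_prompt(question: str) -> str:
    """Join five solved examples, then append the unanswered test item."""
    shots = "\n\n".join(f"{q}\n{a}" for q, a in FEW_SHOT)
    return f"{shots}\n\nQ: {question}\nA:"

print(five_shot_prompt("7 + 6 = ?"))
```

In the rows above, "CoT" means the prompt elicits a chain of thought before the final answer, "maj@k" takes the majority answer over k samples, and "pass@1" checks whether a single generated program passes the benchmark's unit tests.
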
[mmlu]: https://arxiv.org/abs/2009.03300
[hellaswag]: https://arxiv.org/abs/1905.07830
[piqa]: https://arxiv.org/abs/1911.11641
[socialiqa]: https://arxiv.org/abs/1904.09728
[boolq]: https://arxiv.org/abs/1905.10044
[winogrande]: https://arxiv.org/abs/1907.10641
[commonsenseqa]: https://arxiv.org/abs/1811.00937
[openbookqa]: https://arxiv.org/abs/1809.02789
[arc]: https://arxiv.org/abs/1911.01547
[triviaqa]: https://arxiv.org/abs/1705.03551
[naturalq]: https://github.com/google-research-datasets/natural-questions
[humaneval]: https://arxiv.org/abs/2107.03374
[mbpp]: https://arxiv.org/abs/2108.07732
[gsm8k]: https://arxiv.org/abs/2110.14168
[realtox]: https://arxiv.org/abs/2009.11462
[bold]: https://arxiv.org/abs/2101.11718
[crows]: https://aclanthology.org/2020.emnlp-main.154/
[bbq]: https://arxiv.org/abs/2110.08193v2
[winogender]: https://arxiv.org/abs/1804.09301
[truthfulqa]: https://arxiv.org/abs/2109.07958
[winobias]: https://arxiv.org/abs/1804.06876
[math]: https://arxiv.org/abs/2103.03874
[agieval]: https://arxiv.org/abs/2304.06364
[big-bench]: https://arxiv.org/abs/2206.04615
[toxigen]: https://arxiv.org/abs/2203.09509
[mmlu_pro]: https://arxiv.org/abs/2406.01574
[gpqa]: https://arxiv.org/abs/2311.12022