migtissera committed on
Commit b86bf2c · verified · 1 Parent(s): 9099977

Update README.md

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -4,15 +4,16 @@ license_name: qwen2
 license_link: https://huggingface.co/Qwen/Qwen2-72B/blob/main/LICENSE
 ---
 
-![Tess-v2.5](https://huggingface.co/migtissera/Tess-v2.5-Qwen2-72B/resolve/main/Tess-v2.5.png)
+# Tess-v2.5 (Qwen2-72B)
 
-# Tess-v2.5-Qwen2-72B
+![Tess-v2.5](https://huggingface.co/migtissera/Tess-v2.5-Qwen2-72B/resolve/main/Tess-v2.5.png)
 
-We've created Tess-v2.5-Qwen2-72B, the latest state-of-the-art model in the Tess series of Large Language Models (LLMs). Tess, short for Tesoro (<em>Treasure</em> in Italian), is the flagship LLM series created by Migel Tissera. Tess-v2.5 brings significant improvements in reasoning capabilities, coding capabilities and mathematics. It is currently the #1 ranked open weight model when evaluated on MMLU (Massive Multitask Language Understanding). It scores higher than all other open weight models including Qwen2-72B-Instruct, Llama3-70B-Instruct, Mixtral-8x22B-Instruct and DBRX-Instruct. Further, when evaluated on MMLU, Tess-v2.5-Qwen2-72B model outperforms even the frontier closed models Gemini-1.0-Ultra, Gemini-1.5-Pro, Mistral-Large and Claude-3-Sonnet.
 
-Tess-v2.5-Qwen2-72B was fine-tuned over the Qwen2-72B base, using the Tess-v2.5 dataset that contains 300K samples spanning multiple topics, including business and management, marketing, history, social sciences, arts,STEM subjects and computer programming. This dataset was synthetically generated with the [Sensei](https://github.com/migtissera/Sensei) framework, using multiple frontier models such as GPT-4-Turbo, Claude-Opus and Mistral-Large.
+We've created Tess-v2.5, the latest state-of-the-art model in the Tess series of Large Language Models (LLMs). Tess, short for Tesoro (<em>Treasure</em> in Italian), is the flagship LLM series created by Migel Tissera. Tess-v2.5 brings significant improvements in reasoning capabilities, coding capabilities and mathematics. It is currently the #1 ranked open weight model when evaluated on MMLU (Massive Multitask Language Understanding). It scores higher than all other open weight models including Qwen2-72B-Instruct, Llama3-70B-Instruct, Mixtral-8x22B-Instruct and DBRX-Instruct. Further, when evaluated on MMLU, Tess-v2.5 (Qwen2-72B) model outperforms even the frontier closed models Gemini-1.0-Ultra, Gemini-1.5-Pro, Mistral-Large and Claude-3-Sonnet.
 
+Tess-v2.5 (Qwen2-72B) was fine-tuned over the newly released Qwen2-72B base, using the Tess-v2.5 dataset that contain 300K samples spanning multiple topics, including business and management, marketing, history, social sciences, arts, STEM subjects and computer programming. This dataset was synthetically generated using the [Sensei](https://github.com/migtissera/Sensei) framework, using multiple frontier models such as GPT-4-Turbo, Claude-Opus and Mistral-Large.
 
+The compute for this model was generously sponsored by [KindoAI](kindo.ai).
 
 # Evaluation
 