migtissera committed on
Commit 0523050 (1 parent: fddc2ac)

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -12,6 +12,17 @@ model-index:
 
 <br>
 
+# Evaluations
+
+| | Tess-R1 Limerick | Claude 3.5 Haiku | GPT-4o mini |
+|--------------|------------------|------------------|-------------|
+| GPQA | 41.5% | 41.6% | 40.2% |
+| MMLU | 81.6% | - | 82.0% |
+| MATH | 64.2% | 69.4% | 70.2% |
+| MMLU-Pro | 65.6% | 65.0% | - |
+| HumanEval | | 88.1% | 87.2% |
+| DROP (F1 Score) | | 83.1% | 79.7% |
+
 
 Welcome to the Tess-Reasoning-1 (Tess-R1) series of models. Tess-R1 is designed with test-time compute in mind, and has the capabilities to produce a Chain-of-Thought (CoT) reasoning before producing the final output.
 