dfurman committed
Commit
2b75850
1 Parent(s): 87dac68

Update README.md

Files changed (1)
  1. README.md +10 -7
README.md CHANGED
@@ -39,13 +39,16 @@ This model was built via parameter-efficient finetuning of the [mistralai/Mixtra
 
 ## Evaluation Results
 
-| Metric                | Value  |
-|-----------------------|--------|
-| MMLU (5-shot)         | Coming |
-| ARC (25-shot)         | Coming |
-| HellaSwag (10-shot)   | Coming |
-| TruthfulQA (0-shot)   | Coming |
-| Avg.                  | Coming |
+| Metric                | Value |
+|-----------------------|-------|
+| Avg.                  | 68.87 |
+| ARC (25-shot)         | 67.24 |
+| HellaSwag (10-shot)   | 86.03 |
+| MMLU (5-shot)         | 68.59 |
+| TruthfulQA (0-shot)   | 59.54 |
+| Winogrande (5-shot)   | 80.43 |
+| GSM8K (5-shot)        | 51.4  |
+
 
 We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
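For context, benchmarks like those in the table are typically invoked through the harness's `lm_eval` CLI. The sketch below is illustrative only, not the leaderboard's exact command: the model id is a placeholder, and the task name and few-shot count shown correspond to the ARC (25-shot) row.

```shell
# Illustrative sketch: scoring one Open-LLM-Leaderboard-style task with
# EleutherAI's lm-evaluation-harness. <your-model-id> is a placeholder;
# the few-shot count mirrors the ARC (25-shot) row in the table above.
pip install lm-eval

lm_eval \
  --model hf \
  --model_args pretrained=<your-model-id> \
  --tasks arc_challenge \
  --num_fewshot 25 \
  --output_path results/arc.json
```

The other rows would be run the same way with their respective tasks and few-shot settings (e.g. HellaSwag at 10-shot, MMLU at 5-shot, TruthfulQA at 0-shot).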