Update README.md
README.md
CHANGED
@@ -39,13 +39,16 @@ This model was built via parameter-efficient finetuning of the [mistralai/Mixtra
 
 ## Evaluation Results
 
-| Metric | Value
-|
-|
-| ARC (25-shot) |
-| HellaSwag (10-shot) |
-|
-|
+| Metric | Value |
+|-----------------------|---------------------------|
+| Avg. | 68.87 |
+| ARC (25-shot) | 67.24 |
+| HellaSwag (10-shot) | 86.03 |
+| MMLU (5-shot) | 68.59 |
+| TruthfulQA (0-shot) | 59.54 |
+| Winogrande (5-shot) | 80.43 |
+| GSM8K (5-shot) | 51.4 |
+
 
 We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
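For readers who want to reproduce scores like those in the table, the sketch below shows one way to drive the Evaluation Harness from Python. It is a minimal illustration, not the exact procedure behind this commit: it assumes the current `lm-eval` release (`pip install lm-eval`, v0.4+), whereas the leaderboard pins a specific older harness revision, so numbers may not match exactly. The model id `your-org/your-model`, the dtype, the batch size, and the task names are placeholders or assumptions, not values taken from this model card.

```python
# Hypothetical sketch of a leaderboard-style evaluation with EleutherAI's
# lm-evaluation-harness (v0.4+ Python API). "your-org/your-model" is a
# placeholder for the model id in this repository; task names and settings
# are assumptions and may differ from the leaderboard's pinned revision.
import lm_eval

# (task, few-shot) pairs mirroring the table above.
TASK_SHOTS = [
    ("arc_challenge", 25),
    ("hellaswag", 10),
    ("mmlu", 5),
    ("truthfulqa_mc2", 0),
    ("winogrande", 5),
    ("gsm8k", 5),
]

for task, shots in TASK_SHOTS:
    # Run each benchmark separately so every task gets its own shot count.
    out = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=your-org/your-model,dtype=bfloat16",
        tasks=[task],
        num_fewshot=shots,
        batch_size=8,
    )
    print(task, out["results"][task])
```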