Adding Evaluation Results

#4
Files changed (1): README.md (+14 −0)
README.md CHANGED
@@ -147,3 +147,17 @@ Please consider citing our work if you use the data or code in this repo.
  ## Acknowledgements
 
  Thanks to [Llama 2](https://ai.meta.com/llama/), [FastChat](https://github.com/lm-sys/FastChat), [AlpacaFarm](https://github.com/tatsu-lab/alpaca_farm), and [vllm](https://github.com/vllm-project/vllm).
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Xwin-LM__Xwin-LM-7B-V0.1)
+
+ | Metric                | Value |
+ |-----------------------|-------|
+ | Avg.                  | 45.94 |
+ | ARC (25-shot)         | 56.57 |
+ | HellaSwag (10-shot)   | 79.4  |
+ | MMLU (5-shot)         | 49.98 |
+ | TruthfulQA (0-shot)   | 47.89 |
+ | Winogrande (5-shot)   | 73.32 |
+ | GSM8K (5-shot)        | 5.31  |
+ | DROP (3-shot)         | 9.09  |
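
The Avg. row in the added table appears to be the plain arithmetic mean of the seven benchmark scores. A minimal sketch to sanity-check that (values copied from the table; the assumption that Avg. is an unweighted mean is ours, not stated in the diff):

```python
# Benchmark scores from the Open LLM Leaderboard table added in this PR.
scores = {
    "ARC (25-shot)": 56.57,
    "HellaSwag (10-shot)": 79.4,
    "MMLU (5-shot)": 49.98,
    "TruthfulQA (0-shot)": 47.89,
    "Winogrande (5-shot)": 73.32,
    "GSM8K (5-shot)": 5.31,
    "DROP (3-shot)": 9.09,
}

# Assumed: Avg. is the unweighted mean, rounded to two decimals.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 45.94, matching the Avg. row
```

This matches the 45.94 in the table, so the reported average is consistent with the per-benchmark values.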