ehartford leaderboard-pr-bot committed on
Commit
7552f54
1 Parent(s): b6d16c3

Adding Evaluation Results (#6)

- Adding Evaluation Results (d601cbfcb913da0dac0b97a830c210c48a1fefef)


Co-authored-by: Open LLM Leaderboard PR Bot <[email protected]>

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -94,3 +94,17 @@ ASSISTANT:
  9. Conservation status: Many dolphin species face threats due to habitat loss, pollution, overfishing, and climate change. As a result, several populations are considered vulnerable or endangered by the International Union for Conservation of Nature (IUCN). In comparison, orca conservation status varies depending on the region; however, no species is currently listed as critically endangered or facing extinction.
 
  10. Adaptability: Dolphins have adapted to various aquatic environments, ranging from warm coastal waters to colder open oceans. Their versatility allows them to thrive in both tropical and temperate climates. Orcas, conversely, are adapted to specific habitats such as cooler coastal waters and are mostly found in the Northern Hemisphere.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-llama-13b)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 48.6 |
+ | ARC (25-shot) | 55.55 |
+ | HellaSwag (10-shot) | 77.11 |
+ | MMLU (5-shot) | 52.16 |
+ | TruthfulQA (0-shot) | 52.23 |
+ | Winogrande (5-shot) | 69.93 |
+ | GSM8K (5-shot) | 14.4 |
+ | DROP (3-shot) | 18.83 |
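
The `Avg.` row in the added table looks like the unweighted mean of the seven benchmark scores. The commit does not say how it is computed, so that is an assumption inferred from the numbers. A minimal sketch to check it, with the scores copied from the table and `mean_score` as a hypothetical helper name:

```python
# Scores copied from the table added to README.md in this commit.
scores = {
    "ARC (25-shot)": 55.55,
    "HellaSwag (10-shot)": 77.11,
    "MMLU (5-shot)": 52.16,
    "TruthfulQA (0-shot)": 52.23,
    "Winogrande (5-shot)": 69.93,
    "GSM8K (5-shot)": 14.4,
    "DROP (3-shot)": 18.83,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed, not confirmed, to match Avg.)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"{average:.2f}")  # prints 48.60
```

Rounded to one decimal place this gives 48.6, which agrees with the reported `Avg.` value.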