Commit · ad24928 · 1 Parent(s): dacb5dc
Information about the Space
INFO.md
ADDED
@@ -0,0 +1,21 @@
+This Space applies [item response
+theory](https://en.wikipedia.org/wiki/Item_response_theory) (2PL) to
+results of the Hugging Face [OpenLLM
+Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard). Separate
+models were fit for each [evaluation
+framework](https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about#tasks)
+covered in the leaderboard; each top-level tab corresponds to
+one. Within each tab, sub-tabs correspond to individual parameters
+from the model. Each tab presents a table of results:
+
+* For item-related parameters, results are over questions presented to
+the language models. For brevity, questions are listed using their
+hash. Details of the question can be found by clicking the row of
+interest.
+* The person-related parameter is over language models. This tab
+supports comparison between models based on their _ability_. See the
+interface below the table for details.
+
+Code that produced the results in this Space can be found on
+[GitHub](https://github.com/jerome-white/open-llm-bda), including the
+[Stan model](https://github.com/jerome-white/open-llm-bda/blob/1334a1bf4cd9b04333bb1726c78bae0c03eec00b/src/model/model.stan) that drove sampling.
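
For readers unfamiliar with 2PL, the following is a minimal sketch of the textbook two-parameter logistic response function, which relates a person parameter (ability) to two item parameters (discrimination and difficulty). The function and argument names are illustrative assumptions and are not taken from the linked repository; the Stan model there may use a different parameterization or priors.

```python
import math


def two_pl_probability(ability: float, discrimination: float, difficulty: float) -> float:
    """Textbook 2PL: probability that a respondent answers an item correctly.

    ability        -- person parameter (here, a language model)
    discrimination -- item parameter: how sharply the item separates abilities
    difficulty     -- item parameter: ability at which the probability is 0.5

    Names are hypothetical; they do not come from the linked Stan model.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# When ability equals difficulty, the probability is exactly one half.
p_equal = two_pl_probability(ability=0.0, discrimination=1.0, difficulty=0.0)

# With positive discrimination, a more able respondent is more likely to succeed.
p_weak = two_pl_probability(ability=-1.0, discrimination=1.5, difficulty=0.0)
p_strong = two_pl_probability(ability=1.0, discrimination=1.5, difficulty=0.0)
```

Under this form, the item-related tabs would correspond to per-question quantities such as discrimination and difficulty, while the person-related tab would correspond to per-model ability.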