Spaces: jerome-white/leaderboard-item-response

jerome-white committed on Feb 5

Commit

ad24928

1 Parent(s): dacb5dc

Information about the Space

Files changed (1)

INFO.md ADDED

+This Space applies two-parameter logistic (2PL) [item response
+theory](https://en.wikipedia.org/wiki/Item_response_theory) to
+results of the Hugging Face [Open LLM
+Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard). A separate
+model was fit for each [evaluation
+framework](https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about#tasks)
+covered in the leaderboard; each top-level tab corresponds to
+one. Within each tab, sub-tabs correspond to individual parameters
+of the model. Each sub-tab presents a table of results:
+* For item-related parameters, results are over the questions presented to
+  the language models. For brevity, questions are listed by their
+  hash. Details of a question can be found by clicking the row of
+  interest.
+* The person-related parameter is over language models. This tab
+  supports comparison between models based on their _ability_. See the
+  interface below the table for details.
+
+Code that produced the results in this Space can be found on
+[GitHub](https://github.com/jerome-white/open-llm-bda), including the
+[Stan model](https://github.com/jerome-white/open-llm-bda/blob/1334a1bf4cd9b04333bb1726c78bae0c03eec00b/src/model/model.stan) that drove sampling.
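
For readers unfamiliar with 2PL, the sketch below shows the textbook form of the model that the description refers to: the probability that a language model answers a question correctly is a logistic function of the model's ability, scaled by the question's discrimination and shifted by its difficulty. This is an illustration only. The function name and argument names are hypothetical, and the linked Stan model may use a different parameterization or priors.

```python
import math


def two_pl_probability(ability: float, discrimination: float, difficulty: float) -> float:
    """Textbook 2PL: P(correct) = logistic(discrimination * (ability - difficulty)).

    Hypothetical helper for illustration; not taken from the linked repository.
    """
    z = discrimination * (ability - difficulty)
    # Evaluate the logistic function in a numerically stable way
    # so that large |z| does not overflow math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# A model whose ability equals the question's difficulty has an even chance.
even_chance = two_pl_probability(ability=0.0, discrimination=1.5, difficulty=0.0)

# Higher discrimination separates strong and weak models more sharply.
weakly_separating = two_pl_probability(ability=1.0, discrimination=0.5, difficulty=0.0)
sharply_separating = two_pl_probability(ability=1.0, discrimination=2.0, difficulty=0.0)
```

Under this form, the item-related sub-tabs would correspond to per-question quantities such as difficulty and discrimination, and the person-related sub-tab to per-model ability.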