Empty main benchmark scores

#1007
by ymcki - opened

I am getting empty results for the main benchmarks from my two recent submissions:
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18-merge/results_2024-11-04T02-06-46.701689.json

My previous submissions from about five days ago were fine. What's going on?

Is it possible to calculate the main benchmark scores from the sub-scores?
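
For illustration, here's a rough sketch of what I mean, assuming the results JSON follows the lm-eval-harness layout (subtask entries under a "results" key, with metrics such as "acc_norm,none"). The leaderboard applies its own normalization before averaging, so a plain mean of sub-scores would only be an approximation, and the task prefix and metric name below are assumptions on my part, not confirmed field names:

```python
import json
from statistics import mean

from huggingface_hub import hf_hub_download

# Download one of the results files linked above (first URL in this post).
path = hf_hub_download(
    repo_id="open-llm-leaderboard/results",
    filename="ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json",
    repo_type="dataset",
)

with open(path) as f:
    data = json.load(f)

# Assumption: subtask entries sit under data["results"], keyed by task name,
# each holding harness-style metrics such as "acc_norm,none".
def average_subscores(results: dict, prefix: str, metric: str = "acc_norm,none") -> float:
    """Plain mean of all subtask scores whose task name starts with `prefix`."""
    scores = [
        v[metric]
        for k, v in results.items()
        if k.startswith(prefix) and metric in v
    ]
    if not scores:
        raise ValueError(f"no subtasks matching {prefix!r} with metric {metric!r}")
    return mean(scores)

# Hypothetical prefix; actual task names depend on the harness config.
print(average_subscores(data["results"], "leaderboard_bbh"))
```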

Open LLM Leaderboard org

Hi @ymcki ,

Thank you for reporting!

We've updated the Harness version for our evaluations and encountered this issue. Everything is correct now, so I'll resubmit your models to get the correct results.

Thank you for your patience!

alozowski changed discussion status to closed
