Nathan Habib
SaylorTwift
90 followers · 22 following
nathanhabib1011
NathanHB
Articles
Open LLM Leaderboard: DROP deep dive
Dec 1, 2023
•
3
What's going on with the Open LLM Leaderboard?
Jun 23, 2023
•
18
Organizations
SaylorTwift's activity
New activity in
hf-doc-build/doc-build
about 1 month ago
Create lighteval/_versions.yml
#28 opened about 1 month ago by
SaylorTwift
New activity in
open-llm-leaderboard/open_llm_leaderboard
5 months ago
bump-transformers-to-4.41.1
2
#753 opened 5 months ago by
alozowski
apply-ruff
7
#748 opened 5 months ago by
alozowski
New activity in
open-llm-leaderboard/GenerationVisualizer
5 months ago
bbh_math_fixes
1
#1 opened 5 months ago by
alozowski
New activity in
open-llm-leaderboard/open_llm_leaderboard
5 months ago
Understanding raw result data files
4
#729 opened 6 months ago by
jerome-white
No good way to identify number of activated parameters causes Mixtral evaluation failures
32
#680 opened 6 months ago by
0-hero
New activity in
open-llm-leaderboard/open_llm_leaderboard
6 months ago
GSM8K failure with Llama 3 finetunes
12
#703 opened 6 months ago by
jeiku
New activity in
open-llm-leaderboard-old/results
6 months ago
Renaming Model hsramall/hsramall-8b-placeholder to meta-llama/Meta-Llama-3-8B
#57 opened 6 months ago by
SaylorTwift
New activity in
open-llm-leaderboard-old/requests
6 months ago
Renaming Model hsramall/hsramall-8b-placeholder to meta-llama/Meta-Llama-3-8B
#102 opened 6 months ago by
SaylorTwift
New activity in
databricks/dbrx-base
7 months ago
Update README.md
1
#16 opened 7 months ago by
SaylorTwift
New activity in
open-llm-leaderboard/open_llm_leaderboard
7 months ago
Model evaluation failed after 2 days
17
#622 opened 8 months ago by
migtissera
bigstral and bigyi both failed
1
#628 opened 7 months ago by
ehartford
New activity in
open-llm-leaderboard-old/results
8 months ago
Renaming Model gg-hf/gemma-2b to google/gemma-2b
2
#46 opened 8 months ago by
SaylorTwift
Renaming Model gg-hf/gemma-7b to google/gemma-7b
2
#45 opened 8 months ago by
SaylorTwift
New activity in
google/gemma-7b
8 months ago
Fix typo
#15 opened 8 months ago by
SaylorTwift
New activity in
open-llm-leaderboard-old/details_google__gemma-2b
8 months ago
Renaming Model gg-hf/gemma-2b to google/gemma-2b
#1 opened 8 months ago by
SaylorTwift
New activity in
open-llm-leaderboard-old/details_google__gemma-7b
8 months ago
Renaming Model gg-hf/gemma-7b to google/gemma-7b
#1 opened 8 months ago by
SaylorTwift
New activity in
open-llm-leaderboard-old/requests
8 months ago
Renaming Model gg-hf/gemma-2b to google/gemma-2b
#59 opened 8 months ago by
SaylorTwift
Renaming Model gg-hf/gemma-7b to google/gemma-7b
#58 opened 8 months ago by
SaylorTwift
New activity in
open-llm-leaderboard/open_llm_leaderboard
8 months ago
Model not found on hub
4
#584 opened 9 months ago by
akshay326
152334H/miqu-1-70b-sf marked as private or deleted
3
#587 opened 8 months ago by
TNTOutburst
Model submission
1
#590 opened 8 months ago by
axra
gsm8k score largely different from local run
6
#591 opened 8 months ago by
mobicham
Create REBOOT.md
4
#586 opened 8 months ago by
nisten
Model not showing up in the queue
2
#588 opened 8 months ago by
bhavyaaiplanet
Is the eval server down?
1
#589 opened 8 months ago by
appoose
New activity in
open-llm-leaderboard/open_llm_leaderboard
9 months ago
can I see MMLU By Task?
1
#559 opened 9 months ago by
jijivski
invisible models in openllm leaderboard
13
#554 opened 9 months ago by
leejunhyeok
[FLAG] Garrulus and Turdus based models
3
#548 opened 9 months ago by
MichaelKarpe
Add Mute Button for Notifications
6
#547 opened 9 months ago by
aigeek0x0
Cannot reproduce accuracy of mncai/Llama2-7B-guanaco-dolphin-500 gsm8k
16
#527 opened 9 months ago by
zhentaocc
Model works fine on A100 (80GB) GPU
3
#541 opened 9 months ago by
aigeek0x0
missing evaluation results of finished eval
16
#529 opened 9 months ago by
fblgit
Details of run not Found
1
#518 opened 9 months ago by
arshadshk
Evaluation of Dolphin Mixtral series failed
6
#517 opened 10 months ago by
Dampfinchen
New activity in
open-llm-leaderboard/open_llm_leaderboard
10 months ago
Bagel 8x7B evaluation failed
5
#516 opened 10 months ago by
Dampfinchen
my model is not showing on the llm leaderboard
2
#504 opened 10 months ago by
abdulrahman-nuzha
Update src/submission/check_validity.py
4
#509 opened 10 months ago by
BearSean
undeleted/unprivate model is invisible in leaderboard
11
#487 opened 10 months ago by
leejunhyeok
Models Failure
11
#374 opened 11 months ago by
Weyaxi
Model deletion from LLM leaderboard
4
#476 opened 10 months ago by
Toten5
Bug- model show incorrect # Params
4
#499 opened 10 months ago by
felixz
[FLAG] zyh3826 / GML-Mistral-merged-v1
2
#503 opened 10 months ago by
nlpguy
Model Upload Error: This model has been already submitted.
7
#506 opened 10 months ago by
kyujinpy
Remove model
1
#507 opened 10 months ago by
jeonsworld
Why is "trust_remote_code" not supported?
1
#501 opened 10 months ago by
Q-bert
[FLAG] ceadar-ie / FinanceConnect-13B
2
#502 opened 10 months ago by
nlpguy
Request for removal of models
1
#500 opened 10 months ago by
sequelbox
l3utterfly/minima-3b-layla-v2 shown as "submitted" but cannot find it anywhere
6
#493 opened 10 months ago by
l3utterfly
Model evaluation failed
3
#494 opened 10 months ago by
adamo1139
Deployed For Evaluation Still Not On Leaderboard
2
#497 opened 10 months ago by
vikash06
Model not visible on leaderboard
4
#489 opened 10 months ago by
mwitiderrick
There seems to be a problem with the mixtral finetuning evaluations
9
#491 opened 10 months ago by
DavidGF
New activity in
OpenGVLab/MVBench_Leaderboard
10 months ago
Better citation readability
1
#1 opened 10 months ago by
SaylorTwift
New activity in
open-llm-leaderboard/open_llm_leaderboard
10 months ago
Metis-0.3 failed
1
#479 opened 10 months ago by
Mihaiii
What does it mean for a model to be in the Running evaluation queue?
1
#480 opened 10 months ago by
Mihaiii
Potential data contamination with regards to ultrafeedback-binarized and Nectar datasets
42
#474 opened 10 months ago by
killawhale2
nathan-flagged-models-vis
5
#478 opened 10 months ago by
SaylorTwift
Brainstorming: Suggestions for improving the leaderboard
25
#477 opened 10 months ago by
xxyyy123
Please check this failed task
3
#446 opened 10 months ago by
JosephusCheung