Base Model: Qwen2.5-14B-Instruct
License: Apache 2.0
Blue And White_Flycatcher-3AD1E
A model for roleplay is available here: Cran-May/T.E-8.1
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 29.12 |
| IFEval (0-Shot) | 49.08 |
| BBH (3-Shot) | 43.74 |
| MATH Lvl 5 (4-Shot) | 15.71 |
| GPQA (0-shot) | 10.74 |
| MuSR (0-shot) | 13.88 |
| MMLU-PRO (5-shot) | 41.56 |
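
The Avg. row appears to be the unweighted arithmetic mean of the six benchmark scores listed above. The following is a minimal sketch to check that, assuming equal weighting (the card itself does not state how the average is computed); the `scores` dictionary and variable names are illustrative only.

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 49.08,
    "BBH (3-Shot)": 43.74,
    "MATH Lvl 5 (4-Shot)": 15.71,
    "GPQA (0-shot)": 10.74,
    "MuSR (0-shot)": 13.88,
    "MMLU-PRO (5-shot)": 41.56,
}

# Assumption: the reported average is a plain, unweighted mean.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")
```

Under that assumption the computed mean rounds to the reported 29.12.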
Model tree for NLPark/B-and-W_Flycatcher-3AD1E
Evaluation results
- strict accuracy on IFEval (0-Shot), Open LLM Leaderboard: 49.080
- normalized accuracy on BBH (3-Shot), Open LLM Leaderboard: 43.740
- exact match on MATH Lvl 5 (4-Shot), Open LLM Leaderboard: 15.710
- acc_norm on GPQA (0-shot), Open LLM Leaderboard: 10.740
- acc_norm on MuSR (0-shot), Open LLM Leaderboard: 13.880
- accuracy on MMLU-PRO (5-shot), test set, Open LLM Leaderboard: 41.560