Spaces: TheFinAI/open-finllm-reasoning-leaderboard
lfqian committed on Feb 13

Commit b6f40b7 (verified) · 1 parent: 3c31a15

Update model_performance.csv

Files changed (1)

model_performance.csv

@@ -1,18 +1,18 @@
 Models,Average,FinQA,DM-Simplong,XBRL-Math,Type
-GPT-4o,68.24,72.49,60.0,72.22,instruction-tuned
-GPT-o1,59.84,49.07,56.0,74.44,instruction-tuned
-GPT-o3-mini,65.51,60.87,59.0,76.67,instruction-tuned
-DeepSeek-V3,67.62,73.2,53.0,76.67,instruction-tuned
-DeepSeek-R1,68.93,65.13,53.0,86.67,instruction-tuned
-Qwen2.5-72B-Instruct,66.72,73.38,59.0,67.78,instruction-tuned
-Qwen2.5-72B-Instruct-Math,65.69,69.74,42.0,83.33,instruction-tuned
-DeepSeek-R1-Distill-Llama-70B,68.8,66.73,53.0,86.67,instruction-tuned
-Llama3-70B-Instruct,52.2,58.92,41.0,56.67,instruction-tuned
-Llama3.1-70B-Instruct,58.17,63.18,48.0,63.33,instruction-tuned
-Llama3.3-70B-Instruct,64.05,68.15,54.0,70.0,instruction-tuned
-DeepSeek-R1-Distill-Qwen-32B,68.97,65.48,55.0,84.44,instruction-tuned
-DeepSeek-R1-Distill-Qwen-14B,63.9,63.27,44.0,84.44,instruction-tuned
-DeepSeek-R1-Distill-Llama-8B,53.36,45.96,33.0,81.11,instruction-tuned
-Llama3-8B-Instruct,39.95,41.97,29.0,48.89,instruction-tuned
-Llama3.1-8B-Instruct,50.12,54.13,34.0,62.22,instruction-tuned
-Fino1-8B,61.03,60.87,40.0,82.22,instruction-tuned

+GPT-4o,68.24,72.49,60.0,72.22,Instruction-tuned
+GPT-o1,59.84,49.07,56.0,74.44,Reasoning-enhanced
+GPT-o3-mini,65.51,60.87,59.0,76.67,Reasoning-enhanced
+DeepSeek-V3,67.62,73.2,53.0,76.67,Instruction-tuned
+DeepSeek-R1,68.93,65.13,53.0,86.67,Reasoning-enhanced
+Qwen2.5-72B-Instruct,66.72,73.38,59.0,67.78,Instruction-tuned
+Qwen2.5-72B-Instruct-Math,65.69,69.74,42.0,83.33,Reasoning-enhanced
+DeepSeek-R1-Distill-Llama-70B,68.8,66.73,53.0,86.67,Reasoning-enhanced
+Llama3-70B-Instruct,52.2,58.92,41.0,56.67,Instruction-tuned
+Llama3.1-70B-Instruct,58.17,63.18,48.0,63.33,Instruction-tuned
+Llama3.3-70B-Instruct,64.05,68.15,54.0,70.0,Instruction-tuned
+DeepSeek-R1-Distill-Qwen-32B,68.97,65.48,55.0,84.44,Reasoning-enhanced
+DeepSeek-R1-Distill-Qwen-14B,63.9,63.27,44.0,84.44,Reasoning-enhanced
+DeepSeek-R1-Distill-Llama-8B,53.36,45.96,33.0,81.11,Reasoning-enhanced
+Llama3-8B-Instruct,39.95,41.97,29.0,48.89,Instruction-tuned
+Llama3.1-8B-Instruct,50.12,54.13,34.0,62.22,Instruction-tuned
+Fino1-8B,61.03,60.87,40.0,82.22,Reasoning-enhanced
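
The only change in this commit is the Type column, which now separates Instruction-tuned from Reasoning-enhanced models. As a minimal sketch of how that column could be used to split the leaderboard, the Python below groups a few rows copied from the updated file. `SAMPLE` and `group_by_type` are hypothetical names chosen for illustration; they are not part of the leaderboard's own code, and the sample is a subset, not the full CSV.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the updated model_performance.csv (subset only).
SAMPLE = """Models,Average,FinQA,DM-Simplong,XBRL-Math,Type
GPT-4o,68.24,72.49,60.0,72.22,Instruction-tuned
GPT-o1,59.84,49.07,56.0,74.44,Reasoning-enhanced
DeepSeek-V3,67.62,73.2,53.0,76.67,Instruction-tuned
DeepSeek-R1,68.93,65.13,53.0,86.67,Reasoning-enhanced
Fino1-8B,61.03,60.87,40.0,82.22,Reasoning-enhanced
"""


def group_by_type(csv_text):
    """Map each value of the Type column to the model names carrying it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["Type"]].append(row["Models"])
    return dict(groups)


groups = group_by_type(SAMPLE)
for model_type, models in sorted(groups.items()):
    print(f"{model_type}: {', '.join(models)}")
```

The same function would work on the full file by passing its text instead of `SAMPLE`, assuming the header row stays as shown in the diff.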