mkurman committed on
Commit a81b50a · verified · 1 Parent(s): f230161

results update

Files changed (1):
  1. README.md +14 -14
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: inst_level_strict_acc and prompt_level_strict_acc
-      value: 53.89
+      value: 56.37
       name: strict accuracy
     source:
       url: >-
@@ -31,7 +31,7 @@ model-index:
         num_few_shot: 3
     metrics:
     - type: acc_norm
-      value: 6.46
+      value: 7.21
       name: normalized accuracy
     source:
       url: >-
@@ -47,7 +47,7 @@ model-index:
         num_few_shot: 4
     metrics:
     - type: exact_match
-      value: 3.25
+      value: 4.83
       name: exact match
     source:
       url: >-
@@ -63,7 +63,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: acc_norm
-      value: 0
+      value: 1.01
       name: acc_norm
     source:
       url: >-
@@ -79,7 +79,7 @@ model-index:
         num_few_shot: 0
     metrics:
    - type: acc_norm
-      value: 2.38
+      value: 3.02
       name: acc_norm
     source:
       url: >-
@@ -97,7 +97,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 5.91
+      value: 6.03
       name: accuracy
     source:
       url: >-
@@ -153,14 +153,14 @@ As the model is still in training, performance and capabilities may vary. Users
 The Model is designed to be used as a smart assistant but not as a knowledge source within your applications, systems, or environments. It is not intended to provide 100% accurate answers, especially in scenarios where high precision and accuracy are crucial.
 
 # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_meditsolutions__Llama-3.2-SUN-2.4B-v1.0.0)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/meditsolutions__Llama-3.2-SUN-2.4B-v1.0.0-details)
 
 | Metric |Value|
 |-------------------|----:|
-|Avg. |11.98|
-|IFEval (0-Shot) |53.89|
-|BBH (3-Shot) | 6.46|
-|MATH Lvl 5 (4-Shot)| 3.25|
-|GPQA (0-shot) | 0.00|
-|MuSR (0-shot) | 2.38|
-|MMLU-PRO (5-shot) | 5.91|
+|Avg. |13.08|
+|IFEval (0-Shot) |56.37|
+|BBH (3-Shot) | 7.21|
+|MATH Lvl 5 (4-Shot)| 4.83|
+|GPQA (0-shot) | 1.01|
+|MuSR (0-shot) | 3.02|
+|MMLU-PRO (5-shot) | 6.03|
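
The updated Avg. row is consistent with the six benchmark scores: assuming it is the plain unweighted arithmetic mean (as the Open LLM Leaderboard reports it), a quick sanity check in Python:

```python
# Scores from the updated leaderboard table above.
scores = {
    "IFEval (0-Shot)": 56.37,
    "BBH (3-Shot)": 7.21,
    "MATH Lvl 5 (4-Shot)": 4.83,
    "GPQA (0-shot)": 1.01,
    "MuSR (0-shot)": 3.02,
    "MMLU-PRO (5-shot)": 6.03,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # -> Avg. 13.08, matching the table
```

The same check against the old values (53.89, 6.46, 3.25, 0.00, 2.38, 5.91) reproduces the previous Avg. of 11.98.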