Update README.md
README.md
CHANGED
@@ -6,11 +6,10 @@ tags:
 - logic
 - planning
 ---
+# Strix Rufipes 70B
 
 ![img](./strix_rufipes.png)
 
-# Strix Rufipes 70B
-
 # Model Details
 * **Trained by**: [ibivibiv](https://huggingface.co/ibivibiv)
 * **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
@@ -18,6 +17,76 @@ tags:
 * **Language(s)**: English
 * **Purpose**: Has specific training for logic enforcement, will do well in ARC or other logic testing as well as critical thinking tasks. This model is targeted towards planning exercises.
 
+# Benchmark Scores
+
+| Test Name | Accuracy |
+|-------------------------------------------------------|----------------------|
+| average of all | 0.6910894247381432 |
+| arc:challenge | 0.674061433447099 |
+| hellaswag | 0.6898028281218881 |
+| hendrycksTest-abstract_algebra | 0.36 |
+| hendrycksTest-anatomy | 0.6370370370370371 |
+| hendrycksTest-astronomy | 0.7960526315789473 |
+| hendrycksTest-business_ethics | 0.73 |
+| hendrycksTest-clinical_knowledge | 0.7169811320754716 |
+| hendrycksTest-college_biology | 0.8125 |
+| hendrycksTest-college_chemistry | 0.47 |
+| hendrycksTest-college_computer_science | 0.56 |
+| hendrycksTest-college_mathematics | 0.36 |
+| hendrycksTest-college_medicine | 0.6820809248554913 |
+| hendrycksTest-college_physics | 0.43137254901960786 |
+| hendrycksTest-computer_security | 0.75 |
+| hendrycksTest-conceptual_physics | 0.6851063829787234 |
+| hendrycksTest-econometrics | 0.4824561403508772 |
+| hendrycksTest-electrical_engineering | 0.5793103448275863 |
+| hendrycksTest-elementary_mathematics | 0.41534391534391535 |
+| hendrycksTest-formal_logic | 0.48412698412698413 |
+| hendrycksTest-global_facts | 0.5 |
+| hendrycksTest-high_school_biology | 0.8064516129032258 |
+| hendrycksTest-high_school_chemistry | 0.5073891625615764 |
+| hendrycksTest-high_school_computer_science | 0.71 |
+| hendrycksTest-high_school_european_history | 0.8424242424242424 |
+| hendrycksTest-high_school_geography | 0.8787878787878788 |
+| hendrycksTest-high_school_government_and_politics | 0.9326424870466321 |
+| hendrycksTest-high_school_macroeconomics | 0.717948717948718 |
+| hendrycksTest-high_school_mathematics | 0.2962962962962963 |
+| hendrycksTest-high_school_microeconomics | 0.7521008403361344 |
+| hendrycksTest-high_school_physics | 0.48344370860927155 |
+| hendrycksTest-high_school_psychology | 0.8788990825688073 |
+| hendrycksTest-high_school_statistics | 0.5277777777777778 |
+| hendrycksTest-high_school_us_history | 0.9019607843137255 |
+| hendrycksTest-high_school_world_history | 0.8776371308016878 |
+| hendrycksTest-human_aging | 0.7802690582959642 |
+| hendrycksTest-human_sexuality | 0.8244274809160306 |
+| hendrycksTest-international_law | 0.8677685950413223 |
+| hendrycksTest-jurisprudence | 0.8148148148148148 |
+| hendrycksTest-logical_fallacies | 0.7914110429447853 |
+| hendrycksTest-machine_learning | 0.5357142857142857 |
+| hendrycksTest-management | 0.8543689320388349 |
+| hendrycksTest-marketing | 0.8974358974358975 |
+| hendrycksTest-medical_genetics | 0.73 |
+| hendrycksTest-miscellaneous | 0.8569604086845466 |
+| hendrycksTest-moral_disputes | 0.7687861271676301 |
+| hendrycksTest-moral_scenarios | 0.5184357541899441 |
+| hendrycksTest-nutrition | 0.7679738562091504 |
+| hendrycksTest-philosophy | 0.7620578778135049 |
+| hendrycksTest-prehistory | 0.8271604938271605 |
+| hendrycksTest-professional_accounting | 0.5390070921985816 |
+| hendrycksTest-professional_law | 0.5743155149934811 |
+| hendrycksTest-professional_medicine | 0.6911764705882353 |
+| hendrycksTest-professional_psychology | 0.7565359477124183 |
+| hendrycksTest-public_relations | 0.7272727272727273 |
+| hendrycksTest-security_studies | 0.8 |
+| hendrycksTest-sociology | 0.8507462686567164 |
+| hendrycksTest-us_foreign_policy | 0.89 |
+| hendrycksTest-virology | 0.5542168674698795 |
+| hendrycksTest-world_religions | 0.8596491228070176 |
+| truthfulqa | 0.4712300987333333 |
+| winogrande | 0.8476716653512234 |
+| gsm8k | 0.5382865807429871 |
+
+
+
 # Prompting
 
 ## Prompt Template for alpaca style