UNA: Uniform Neural Alignment
Further SFT:
- Scheduler: linear
- Learning rate: 2e-5
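Only the scheduler and peak learning rate are stated. As a minimal sketch, this is what those two hyperparameters look like in a standard Transformers `TrainingArguments` setup; the output dir, batch size, epochs, and dtype below are placeholders, not the values used for UNA:

```python
from transformers import TrainingArguments

# Hypothetical SFT configuration matching the two stated hyperparameters:
# a linear LR schedule decaying from a peak learning rate of 2e-5.
args = TrainingArguments(
    output_dir="una-solar-sft",      # placeholder
    learning_rate=2e-5,              # stated above
    lr_scheduler_type="linear",      # stated above
    per_device_train_batch_size=4,   # assumption, not from the card
    num_train_epochs=1,              # assumption, not from the card
    bf16=True,                       # assumption, not from the card
)
# Pass `args` to transformers.Trainer together with the model and dataset,
# then call trainer.train().
```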
Merges:
- Fan in: layers 0:2
- Fan out: layers -4:
- Intermediary layers (on/off mask): 1/1/1/0/1/1/0/1/0/1/1/0/1/1/0
The on/off mask acts as a form of regularisation: masked-off layers are left untouched. A hypothetical sketch of such a masked merge follows.
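The exact UNA merge procedure isn't published in this card. Purely as an illustration of a layer-wise merge gated by an on/off mask, the sketch below averages only the masked-on layers of two same-architecture checkpoints; `masked_layer_merge`, both state dicts, and the layer range covered by the 15-entry mask are all hypothetical:

```python
import torch

def masked_layer_merge(base_sd, tuned_sd, intermediary_layers, mask):
    """Hypothetical sketch: for each listed intermediary transformer layer,
    the on/off mask decides whether that layer's weights are averaged with
    the tuned checkpoint (1) or kept from the base (0). `base_sd` and
    `tuned_sd` are state dicts of identical architecture."""
    on = {layer for layer, bit in zip(intermediary_layers, mask) if bit == 1}
    prefix = "model.layers."
    merged = {}
    for key, weight in base_sd.items():
        is_on = key.startswith(prefix) and int(key[len(prefix):].split(".")[0]) in on
        merged[key] = (weight + tuned_sd[key]) / 2 if is_on else weight
    return merged

# The 15-entry mask from the list above; which layer indices it covers is
# not spelled out in the card, so the range below is a placeholder.
mask = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
# merged = masked_layer_merge(torch.load("base.pt"), torch.load("tuned.pt"),
#                             intermediary_layers=list(range(2, 17)), mask=mask)
```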
Quants
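No specific quantized artifacts are enumerated under this heading. As a sketch, the checkpoint can instead be quantized on load with bitsandbytes via the standard Transformers quantization config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical 4-bit load; this quantizes the full-precision weights at
# load time rather than using a prebuilt quant.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "fblgit/UNA-SOLAR-10.7B-Instruct-v1.0",
    quantization_config=bnb,
    device_map="auto",
)
```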
Libraries:
- Transformers 4.35.0-UNA
- Pytorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1
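For reference, loading the released checkpoint with a stock Transformers install (any release ≥ 4.35 should work; `4.35.0-UNA` above refers to the author's fork). The prompt below assumes the base SOLAR-Instruct chat format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/UNA-SOLAR-10.7B-Instruct-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "### User:\nWhat is UNA?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```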
Evals

MT-Bench
Mode: single
Input file: data/mt_bench/model_judgment/gpt-4_single.jsonl
########## First turn ##########

| Model | Turn | Score |
|------------------------------|-----:|--------:|
| gpt-4                         | 1 | 8.95625 |
| claude-v1                     | 1 | 8.15000 |
| gpt-3.5-turbo                 | 1 | 8.07500 |
| LUNA-SOLARkrautLM-Instruct    | 1 | 7.93750 |
| UNA-SOLAR-10.7B-Instruct-v1.0 | 1 | 7.80625 |
| vicuna-33b-v1.3               | 1 | 7.45625 |
| wizardlm-30b                  | 1 | 7.13125 |
| tulu-30b                      | 1 | 7.01875 |
| vicuna-13b-v1.3               | 1 | 6.81250 |
| guanaco-65b                   | 1 | 6.78125 |
| nous-hermes-13b               | 1 | 6.43125 |
| alpaca-13b                    | 1 | 4.97500 |
| rwkv-4-raven-14b              | 1 | 4.74375 |
| llama-13b                     | 1 | 3.26250 |
########## Second turn ##########

| Model | Turn | Score |
|------------------------------|-----:|---------:|
| gpt-4                         | 2 | 9.025000 |
| gpt-3.5-turbo                 | 2 | 7.812500 |
| claude-v1                     | 2 | 7.650000 |
| UNA-SOLAR-10.7B-Instruct-v1.0 | 2 | 7.237500 |
| LUNA-SOLARkrautLM-Instruct    | 2 | 6.987500 |
| wizardlm-30b                  | 2 | 6.887500 |
| vicuna-33b-v1.3               | 2 | 6.787500 |
| guanaco-65b                   | 2 | 6.037500 |
| vicuna-13b-v1.3               | 2 | 5.962500 |
| tulu-30b                      | 2 | 5.850000 |
| nous-hermes-13b               | 2 | 4.664557 |
| alpaca-13b                    | 2 | 4.087500 |
| rwkv-4-raven-14b              | 2 | 3.225000 |
| llama-13b                     | 2 | 1.950000 |
########## Average ##########

| Model | Score |
|------------------------------|---------:|
| gpt-4                         | 8.990625 |
| gpt-3.5-turbo                 | 7.943750 |
| claude-instant-v1             | 7.905660 |
| claude-v1                     | 7.900000 |
| UNA-SOLAR-10.7B-Instruct-v1.0 | 7.521875 |
| LUNA-SOLARkrautLM-Instruct    | 7.462500 |
| vicuna-33b-v1.3               | 7.121875 |
| wizardlm-30b                  | 7.009375 |
| Llama-2-70b-chat              | 6.856250 |
| Llama-2-13b-chat              | 6.650000 |
| guanaco-33b                   | 6.528125 |
| tulu-30b                      | 6.434375 |
| guanaco-65b                   | 6.409375 |
| oasst-sft-7-llama-30b         | 6.409375 |
| palm-2-chat-bison-001         | 6.400000 |
| mpt-30b-chat                  | 6.393750 |
| vicuna-13b-v1.3               | 6.387500 |
| wizardlm-13b                  | 6.353125 |
| Llama-2-7b-chat               | 6.268750 |
| vicuna-7b-v1.3                | 5.996875 |
| baize-v2-13b                  | 5.750000 |
| nous-hermes-13b               | 5.553459 |
| mpt-7b-chat                   | 5.459119 |
| gpt4all-13b-snoozy            | 5.452830 |
| koala-13b                     | 5.350000 |
| mpt-30b-instruct              | 5.218750 |
| falcon-40b-instruct           | 5.168750 |
| h2ogpt-oasst-open-llama-13b   | 4.625000 |
| alpaca-13b                    | 4.531250 |
| chatglm-6b                    | 4.500000 |
| oasst-sft-4-pythia-12b        | 4.318750 |
| rwkv-4-raven-14b              | 3.984375 |
| dolly-v2-12b                  | 3.275000 |
| fastchat-t5-3b                | 3.040625 |
| stablelm-tuned-alpha-7b       | 2.753125 |
| llama-13b                     | 2.606250 |
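The tables above come from FastChat's MT-Bench `show_result` output. As a minimal sketch, the per-model, per-turn averages can be reproduced from the judgment file directly, assuming the standard `model`/`turn`/`score` fields in each JSON line:

```python
import json
from collections import defaultdict

scores = defaultdict(list)  # (model, turn) -> list of judge scores
with open("data/mt_bench/model_judgment/gpt-4_single.jsonl") as f:
    for line in f:
        rec = json.loads(line)
        if rec["score"] >= 0:  # assumption: failed judgments are marked -1
            scores[(rec["model"], rec["turn"])].append(rec["score"])

for (model, turn), vals in sorted(scores.items()):
    print(f"{model}\tturn {turn}\t{sum(vals) / len(vals):.6f}")
```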
LM-Evaluation Harness (big-refactor branch)
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), gen_kwargs: (None), limit: None, num_fewshot: 25, batch_size: auto (32)
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml |none | 25|acc |0.6954|± |0.0134|
| | |none | 25|acc_norm|0.7167|± |0.0132|
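Each block's header line records the harness invocation used. For instance, the ARC run above can be reproduced programmatically, assuming the 0.4.x `simple_evaluate` API that the big-refactor branch became:

```python
import lm_eval

# Mirrors the header line above:
# hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), num_fewshot: 25, batch_size: auto
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size="auto",
)
print(results["results"]["arc_challenge"])
```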
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version| Filter |n-shot| Metric |Value| |Stderr|
|-----|-------|----------|-----:|-----------|----:|---|-----:|
|gsm8k|Yaml |get-answer| 5|exact_match|0.671|± |0.0129|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), gen_kwargs: (), limit: None, num_fewshot: 0, batch_size: auto (64)
| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
|--------------|-------|------|-----:|------|-----:|---|-----:|
|truthfulqa_mc2|Yaml |none | 0|acc |0.7297|± |0.0149|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), gen_kwargs: (None), limit: None, num_fewshot: 10, batch_size: auto (32)
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|---------|-------|------|-----:|--------|-----:|---|-----:|
|hellaswag|Yaml |none | 10|acc |0.7091|± |0.0045|
| | |none | 10|acc_norm|0.8821|± |0.0032|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0,dtype=float16), gen_kwargs: (), limit: None, num_fewshot: 0, batch_size: auto (32)
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|--------------|-------|------|-----:|----------|-----:|---|-----:|
|boolq |Yaml |none | 0|acc |0.8807|± |0.0057|
|lambada_openai|Yaml |none | 0|perplexity|3.2452|± |0.0778|
| | |none | 0|acc |0.7207|± |0.0063|
|piqa |Yaml |none | 0|acc |0.8020|± |0.0093|
| | |none | 0|acc_norm |0.8009|± |0.0093|
|sciq |Yaml |none | 0|acc |0.9730|± |0.0051|
| | |none | 0|acc_norm |0.9630|± |0.0060|
|winogrande |Yaml |none | 0|acc |0.7577|± |0.0120|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0,dtype=float16), gen_kwargs: (), limit: None, num_fewshot: 0, batch_size: auto (64)
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|--------|-------|------|-----:|--------|-----:|---|-----:|
|mathqa |Yaml |none | 0|acc |0.3474|± |0.0087|
| | |none | 0|acc_norm|0.3568|± |0.0088|
|pubmedqa|Yaml |none | 0|acc |0.5400|± |0.0223|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0,dtype=float16), gen_kwargs: (), limit: None, num_fewshot: 0, batch_size: auto
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|------------------------------------------------------|-------|------|-----:|-----------|-----:|---|-----:|
|bbh_fewshot |N/A |none | 0|exact_match|0.4660|± |0.1771|
| - bbh_fewshot_boolean_expressions |Yaml |none | 0|exact_match|0.8160|± |0.0246|
| - bbh_fewshot_causal_judgement |Yaml |none | 0|exact_match|0.4973|± |0.0367|
| - bbh_fewshot_date_understanding |Yaml |none | 0|exact_match|0.4840|± |0.0317|
| - bbh_fewshot_disambiguation_qa |Yaml |none | 0|exact_match|0.6520|± |0.0302|
| - bbh_fewshot_dyck_languages |Yaml |none | 0|exact_match|0.2040|± |0.0255|
| - bbh_fewshot_formal_fallacies |Yaml |none | 0|exact_match|0.5280|± |0.0316|
| - bbh_fewshot_geometric_shapes |Yaml |none | 0|exact_match|0.3360|± |0.0299|
| - bbh_fewshot_hyperbaton |Yaml |none | 0|exact_match|0.5520|± |0.0315|
| - bbh_fewshot_logical_deduction_five_objects |Yaml |none | 0|exact_match|0.4520|± |0.0315|
| - bbh_fewshot_logical_deduction_seven_objects |Yaml |none | 0|exact_match|0.3920|± |0.0309|
| - bbh_fewshot_logical_deduction_three_objects |Yaml |none | 0|exact_match|0.6200|± |0.0308|
| - bbh_fewshot_movie_recommendation |Yaml |none | 0|exact_match|0.6640|± |0.0299|
| - bbh_fewshot_multistep_arithmetic_two |Yaml |none | 0|exact_match|0.0080|± |0.0056|
| - bbh_fewshot_navigate |Yaml |none | 0|exact_match|0.6280|± |0.0306|
| - bbh_fewshot_object_counting |Yaml |none | 0|exact_match|0.3960|± |0.0310|
| - bbh_fewshot_penguins_in_a_table |Yaml |none | 0|exact_match|0.4726|± |0.0415|
| - bbh_fewshot_reasoning_about_colored_objects |Yaml |none | 0|exact_match|0.5320|± |0.0316|
| - bbh_fewshot_ruin_names |Yaml |none | 0|exact_match|0.5680|± |0.0314|
| - bbh_fewshot_salient_translation_error_detection |Yaml |none | 0|exact_match|0.5480|± |0.0315|
| - bbh_fewshot_snarks |Yaml |none | 0|exact_match|0.5169|± |0.0376|
| - bbh_fewshot_sports_understanding |Yaml |none | 0|exact_match|0.8320|± |0.0237|
| - bbh_fewshot_temporal_sequences |Yaml |none | 0|exact_match|0.5520|± |0.0315|
| - bbh_fewshot_tracking_shuffled_objects_five_objects |Yaml |none | 0|exact_match|0.1480|± |0.0225|
| - bbh_fewshot_tracking_shuffled_objects_seven_objects|Yaml |none | 0|exact_match|0.1720|± |0.0239|
| - bbh_fewshot_tracking_shuffled_objects_three_objects|Yaml |none | 0|exact_match|0.2760|± |0.0283|
| - bbh_fewshot_web_of_lies |Yaml |none | 0|exact_match|0.4760|± |0.0316|
| - bbh_fewshot_word_sorting |Yaml |none | 0|exact_match|0.2840|± |0.0286|
| Groups |Version|Filter|n-shot| Metric |Value| |Stderr|
|-----------|-------|------|-----:|-----------|----:|---|-----:|
|bbh_fewshot|N/A |none | 0|exact_match|0.466|± |0.1771|
hf (pretrained=fblgit/UNA-SOLAR-10.7B-Instruct-v1.0), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto (16)
| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
|---------------------------------------|-------|------|-----:|------|-----:|---|-----:|
|mmlu |N/A |none | 0|acc |0.6513|± |0.1221|
| - humanities |N/A |none | 5|acc |0.6077|± |0.1185|
| - formal_logic |Yaml |none | 5|acc |0.4444|± |0.0444|
| - high_school_european_history |Yaml |none | 5|acc |0.8121|± |0.0305|
| - high_school_us_history |Yaml |none | 5|acc |0.8431|± |0.0255|
| - high_school_world_history |Yaml |none | 5|acc |0.8523|± |0.0231|
| - international_law |Yaml |none | 5|acc |0.7851|± |0.0375|
| - jurisprudence |Yaml |none | 5|acc |0.7870|± |0.0396|
| - logical_fallacies |Yaml |none | 5|acc |0.7546|± |0.0338|
| - moral_disputes |Yaml |none | 5|acc |0.7370|± |0.0237|
| - moral_scenarios |Yaml |none | 5|acc |0.4101|± |0.0164|
| - philosophy |Yaml |none | 5|acc |0.7170|± |0.0256|
| - prehistory |Yaml |none | 5|acc |0.7840|± |0.0229|
| - professional_law |Yaml |none | 5|acc |0.4941|± |0.0128|
| - world_religions |Yaml |none | 5|acc |0.7895|± |0.0313|
| - other |N/A |none | 5|acc |0.7116|± |0.0939|
| - business_ethics |Yaml |none | 5|acc |0.7600|± |0.0429|
| - clinical_knowledge |Yaml |none | 5|acc |0.6792|± |0.0287|
| - college_medicine |Yaml |none | 5|acc |0.6590|± |0.0361|
| - global_facts |Yaml |none | 5|acc |0.3400|± |0.0476|
| - human_aging |Yaml |none | 5|acc |0.6816|± |0.0313|
| - management |Yaml |none | 5|acc |0.8350|± |0.0368|
| - marketing |Yaml |none | 5|acc |0.8547|± |0.0231|
| - medical_genetics |Yaml |none | 5|acc |0.7000|± |0.0461|
| - miscellaneous |Yaml |none | 5|acc |0.8020|± |0.0142|
| - nutrition |Yaml |none | 5|acc |0.7418|± |0.0251|
| - professional_accounting |Yaml |none | 5|acc |0.5071|± |0.0298|
| - professional_medicine |Yaml |none | 5|acc |0.7500|± |0.0263|
| - virology |Yaml |none | 5|acc |0.5843|± |0.0384|
| - social_sciences |N/A |none | 5|acc |0.7537|± |0.0681|
| - econometrics |Yaml |none | 5|acc |0.5000|± |0.0470|
| - high_school_geography |Yaml |none | 5|acc |0.8586|± |0.0248|
| - high_school_government_and_politics|Yaml |none | 5|acc |0.9016|± |0.0215|
| - high_school_macroeconomics |Yaml |none | 5|acc |0.6615|± |0.0240|
| - high_school_microeconomics |Yaml |none | 5|acc |0.7311|± |0.0288|
| - high_school_psychology |Yaml |none | 5|acc |0.8404|± |0.0157|
| - human_sexuality |Yaml |none | 5|acc |0.7328|± |0.0388|
| - professional_psychology |Yaml |none | 5|acc |0.6814|± |0.0189|
| - public_relations |Yaml |none | 5|acc |0.6909|± |0.0443|
| - security_studies |Yaml |none | 5|acc |0.7469|± |0.0278|
| - sociology |Yaml |none | 5|acc |0.8308|± |0.0265|
| - us_foreign_policy |Yaml |none | 5|acc |0.8900|± |0.0314|
| - stem |N/A |none | 5|acc |0.5569|± |0.1380|
| - abstract_algebra |Yaml |none | 5|acc |0.4100|± |0.0494|
| - anatomy |Yaml |none | 5|acc |0.6222|± |0.0419|
| - astronomy |Yaml |none | 5|acc |0.7368|± |0.0358|
| - college_biology |Yaml |none | 5|acc |0.8056|± |0.0331|
| - college_chemistry |Yaml |none | 5|acc |0.4700|± |0.0502|
| - college_computer_science |Yaml |none | 5|acc |0.5100|± |0.0502|
| - college_mathematics |Yaml |none | 5|acc |0.2800|± |0.0451|
| - college_physics |Yaml |none | 5|acc |0.3431|± |0.0472|
| - computer_security |Yaml |none | 5|acc |0.7400|± |0.0441|
| - conceptual_physics |Yaml |none | 5|acc |0.6340|± |0.0315|
| - electrical_engineering |Yaml |none | 5|acc |0.6000|± |0.0408|
| - elementary_mathematics |Yaml |none | 5|acc |0.4815|± |0.0257|
| - high_school_biology |Yaml |none | 5|acc |0.8032|± |0.0226|
| - high_school_chemistry |Yaml |none | 5|acc |0.4877|± |0.0352|
| - high_school_computer_science |Yaml |none | 5|acc |0.7200|± |0.0451|
| - high_school_mathematics |Yaml |none | 5|acc |0.3815|± |0.0296|
| - high_school_physics |Yaml |none | 5|acc |0.3576|± |0.0391|
| - high_school_statistics |Yaml |none | 5|acc |0.5602|± |0.0339|
| - machine_learning |Yaml |none | 5|acc |0.4643|± |0.0473|
| Groups |Version|Filter|n-shot|Metric|Value | |Stderr|
|------------------|-------|------|-----:|------|-----:|---|-----:|
|mmlu |N/A |none | 0|acc |0.6513|± |0.1221|
| - humanities |N/A |none | 5|acc |0.6077|± |0.1185|
| - other |N/A |none | 5|acc |0.7116|± |0.0939|
| - social_sciences|N/A |none | 5|acc |0.7537|± |0.0681|
| - stem |N/A |none | 5|acc |0.5569|± |0.1380|
Citations
Thanks to Upstage.AI for its awesome base model; this is merely a UNA of it. UNA can only refine what is already in there :)
If you find UNA-SOLAR useful, cite and support the authors.
Model tree for fblgit/UNA-SOLAR-10.7B-Instruct-v1.0
- Base model: upstage/SOLAR-10.7B-v1.0
- Finetuned from: upstage/SOLAR-10.7B-Instruct-v1.0