---
license: apache-2.0
inference: false
---
|
# Given-MPT-7B |
|
|
|
This is a merge of the following MPT-7B models (a rough merging sketch follows the list):
|
|
|
- **g**orilla-llm/gorilla-mpt-7b-hf-v0 |
|
- **i**bm/mpt-7b-instruct2 |
|
- Teh**V**enom/MPT-7b-WizardLM_Uncensored-Storywriter-Merge |
|
- **e**mozilla/mpt-7b-storysummarizer |
|
- **n**omic-ai/gpt4all-mpt |
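
The exact recipe used to combine these checkpoints is not documented here. As a rough illustration only, the sketch below shows one common approach, uniform weight averaging, assuming all five checkpoints share the MPT-7B architecture and parameter names; it is not necessarily the procedure used to build Given-MPT-7B.

```python
# Hypothetical sketch: uniform weight averaging of the five source MPT-7B checkpoints.
# NOT necessarily the recipe used for Given-MPT-7B; it only illustrates the idea.
import torch
from transformers import AutoModelForCausalLM

source_ids = [
    "gorilla-llm/gorilla-mpt-7b-hf-v0",
    "ibm/mpt-7b-instruct2",
    "TehVenom/MPT-7b-WizardLM_Uncensored-Storywriter-Merge",
    "emozilla/mpt-7b-storysummarizer",
    "nomic-ai/gpt4all-mpt",
]

# Start from the first checkpoint and accumulate the others into its weights.
# (Needs enough CPU RAM to hold two fp32 7B models at once.)
merged = AutoModelForCausalLM.from_pretrained(
    source_ids[0], torch_dtype=torch.float32, trust_remote_code=True
)
merged_state = merged.state_dict()

for model_id in source_ids[1:]:
    other = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float32, trust_remote_code=True
    )
    for name, param in other.state_dict().items():
        if merged_state[name].is_floating_point():
            merged_state[name] += param  # assumes identical tensor names/shapes everywhere
    del other

for tensor in merged_state.values():
    if tensor.is_floating_point():
        tensor /= len(source_ids)  # arithmetic mean of all checkpoints

merged.load_state_dict(merged_state)
merged.save_pretrained("given-mpt-7b-merged")  # hypothetical output path
```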
|
|
|
|
|
## Model License |
|
|
|
Apache 2.0 |
|
|
|
|
|
## Purpose |
|
|
|
This model is intended for experimenting with model merging and with routing to expert layers; a toy routing sketch follows.
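
The toy PyTorch module below sketches what token-level routing over a few expert feed-forward blocks could look like; the class, dimensions, and top-1 routing rule are illustrative assumptions, not components of the released checkpoint.

```python
# Toy sketch of token-level routing to expert feed-forward layers.
# Illustrative only; this module is not part of the released checkpoint.
import torch
import torch.nn as nn


class ExpertRoutedFFN(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); send every token to its top-1 expert.
        scores = self.router(x)         # (batch, seq, num_experts)
        choice = scores.argmax(dim=-1)  # (batch, seq)
        out = torch.zeros_like(x)
        for idx, expert in enumerate(self.experts):
            mask = choice == idx        # tokens routed to this expert
            if mask.any():
                out[mask] = expert(x[mask])
        return out


# Tiny dimensions to keep the demo cheap (MPT-7B itself uses d_model=4096).
layer = ExpertRoutedFFN(d_model=64, d_hidden=256, num_experts=5)
hidden = torch.randn(2, 8, 64)
print(layer(hidden).shape)  # torch.Size([2, 8, 64])
```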
|
|
|
## Test eval on only 10% of the eval set
|
|
|
hf-causal (pretrained=Multi-Domain-Expert-Layers/given-mpt-7b,dtype=bfloat16,trust_remote_code=True), limit: 0.1, provide_description: False, num_fewshot: 0, batch_size: None |
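
The numbers below come from EleutherAI's lm-evaluation-harness with the settings shown above. Something like the following call should approximately reproduce the run; the exact harness version and argument names may differ, so treat it as a sketch rather than the verbatim command.

```python
# Approximate reproduction of the eval run with EleutherAI's lm-evaluation-harness
# (older 0.3.x-style API assumed; adjust for the harness version you actually use).
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args=(
        "pretrained=Multi-Domain-Expert-Layers/given-mpt-7b,"
        "dtype=bfloat16,trust_remote_code=True"
    ),
    tasks=["arc_challenge", "arc_easy", "hellaswag", "truthfulqa_gen"],  # plus hendrycksTest-* tasks
    num_fewshot=0,
    limit=0.1,  # score only 10% of each eval set, as in the table below
)
print(results["results"])
```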
|
| Task |Version| Metric | Value | |Stderr|
|-------------------------------------------------|------:|-----------|------:|---|-----:|
|arc_challenge | 0|acc | 0.4274|± |0.0459|
| | |acc_norm | 0.3846|± |0.0452|
|arc_easy | 0|acc | 0.7863|± |0.0381|
| | |acc_norm | 0.7350|± |0.0410|
|hellaswag | 0|acc | 0.5556|± |0.0461|
| | |acc_norm | 0.8120|± |0.0363|
|hendrycksTest-college_chemistry | 0|acc | 0.3600|± |0.0482|
| | |acc_norm | 0.3700|± |0.0485|
|hendrycksTest-college_computer_science | 0|acc | 0.3400|± |0.0476|
| | |acc_norm | 0.3600|± |0.0482|
|hendrycksTest-college_mathematics | 0|acc | 0.2500|± |0.0435|
| | |acc_norm | 0.2900|± |0.0456|
|hendrycksTest-college_medicine | 0|acc | 0.3675|± |0.0448|
| | |acc_norm | 0.3162|± |0.0432|
|hendrycksTest-college_physics | 0|acc | 0.2451|± |0.0428|
| | |acc_norm | 0.2941|± |0.0453|
|hendrycksTest-computer_security | 0|acc | 0.4800|± |0.0502|
| | |acc_norm | 0.4400|± |0.0499|
|hendrycksTest-conceptual_physics | 0|acc | 0.2051|± |0.0375|
| | |acc_norm | 0.1709|± |0.0350|
|hendrycksTest-econometrics | 0|acc | 0.2982|± |0.0430|
| | |acc_norm | 0.2368|± |0.0400|
|hendrycksTest-electrical_engineering | 0|acc | 0.3248|± |0.0435|
| | |acc_norm | 0.3590|± |0.0445|
|hendrycksTest-elementary_mathematics | 0|acc | 0.3333|± |0.0438|
| | |acc_norm | 0.3162|± |0.0432|
|hendrycksTest-formal_logic | 0|acc | 0.3077|± |0.0429|
| | |acc_norm | 0.3248|± |0.0435|
|hendrycksTest-global_facts | 0|acc | 0.3000|± |0.0461|
| | |acc_norm | 0.2700|± |0.0446|
|hendrycksTest-high_school_biology | 0|acc | 0.3675|± |0.0448|
| | |acc_norm | 0.3077|± |0.0429|
|hendrycksTest-high_school_chemistry | 0|acc | 0.2564|± |0.0405|
| | |acc_norm | 0.2906|± |0.0422|
|hendrycksTest-high_school_computer_science | 0|acc | 0.4100|± |0.0494|
| | |acc_norm | 0.4400|± |0.0499|
|hendrycksTest-high_school_european_history | 0|acc | 0.4359|± |0.0460|
| | |acc_norm | 0.3590|± |0.0445|
|hendrycksTest-high_school_geography | 0|acc | 0.3248|± |0.0435|
| | |acc_norm | 0.3675|± |0.0448|
|hendrycksTest-high_school_government_and_politics| 0|acc | 0.3932|± |0.0454|
| | |acc_norm | 0.3932|± |0.0454|
|hendrycksTest-high_school_macroeconomics | 0|acc | 0.3333|± |0.0438|
| | |acc_norm | 0.3248|± |0.0435|
|hendrycksTest-high_school_mathematics | 0|acc | 0.2051|± |0.0375|
| | |acc_norm | 0.2564|± |0.0405|
|hendrycksTest-high_school_microeconomics | 0|acc | 0.3504|± |0.0443|
| | |acc_norm | 0.4188|± |0.0458|
|hendrycksTest-high_school_physics | 0|acc | 0.2650|± |0.0410|
| | |acc_norm | 0.2906|± |0.0422|
|hendrycksTest-high_school_psychology | 0|acc | 0.3761|± |0.0450|
| | |acc_norm | 0.3419|± |0.0440|
|hendrycksTest-high_school_statistics | 0|acc | 0.3077|± |0.0429|
| | |acc_norm | 0.3504|± |0.0443|
|hendrycksTest-high_school_us_history | 0|acc | 0.3333|± |0.0438|
| | |acc_norm | 0.3333|± |0.0438|
|hendrycksTest-high_school_world_history | 0|acc | 0.3333|± |0.0438|
| | |acc_norm | 0.3419|± |0.0440|
|hendrycksTest-human_aging | 0|acc | 0.3761|± |0.0450|
| | |acc_norm | 0.3162|± |0.0432|
|hendrycksTest-human_sexuality | 0|acc | 0.4274|± |0.0459|
| | |acc_norm | 0.3761|± |0.0450|
|hendrycksTest-international_law | 0|acc | 0.4188|± |0.0458|
| | |acc_norm | 0.4957|± |0.0464|
|hendrycksTest-jurisprudence | 0|acc | 0.3148|± |0.0449|
| | |acc_norm | 0.4815|± |0.0483|
|hendrycksTest-logical_fallacies | 0|acc | 0.3504|± |0.0443|
| | |acc_norm | 0.3675|± |0.0448|
|hendrycksTest-machine_learning | 0|acc | 0.3214|± |0.0443|
| | |acc_norm | 0.2946|± |0.0433|
|hendrycksTest-management | 0|acc | 0.3786|± |0.0480|
| | |acc_norm | 0.3495|± |0.0472|
|hendrycksTest-marketing | 0|acc | 0.5043|± |0.0464|
| | |acc_norm | 0.4188|± |0.0458|
|hendrycksTest-medical_genetics | 0|acc | 0.3200|± |0.0469|
| | |acc_norm | 0.4100|± |0.0494|
|hendrycksTest-miscellaneous | 0|acc | 0.5299|± |0.0463|
| | |acc_norm | 0.4872|± |0.0464|
|hendrycksTest-moral_disputes | 0|acc | 0.3248|± |0.0435|
| | |acc_norm | 0.3162|± |0.0432|
|hendrycksTest-moral_scenarios | 0|acc | 0.3248|± |0.0435|
| | |acc_norm | 0.2479|± |0.0401|
|hendrycksTest-nutrition | 0|acc | 0.3675|± |0.0448|
| | |acc_norm | 0.3932|± |0.0454|
|hendrycksTest-philosophy | 0|acc | 0.2991|± |0.0425|
| | |acc_norm | 0.3504|± |0.0443|
|hendrycksTest-prehistory | 0|acc | 0.2821|± |0.0418|
| | |acc_norm | 0.3248|± |0.0435|
|hendrycksTest-professional_accounting | 0|acc | 0.2137|± |0.0381|
| | |acc_norm | 0.2222|± |0.0386|
|hendrycksTest-professional_law | 0|acc | 0.3077|± |0.0429|
| | |acc_norm | 0.2735|± |0.0414|
|hendrycksTest-professional_medicine | 0|acc | 0.2991|± |0.0425|
| | |acc_norm | 0.2650|± |0.0410|
|hendrycksTest-professional_psychology | 0|acc | 0.3248|± |0.0435|
| | |acc_norm | 0.3419|± |0.0440|
|hendrycksTest-public_relations | 0|acc | 0.3909|± |0.0467|
| | |acc_norm | 0.3545|± |0.0458|
|hendrycksTest-security_studies | 0|acc | 0.3419|± |0.0440|
| | |acc_norm | 0.2906|± |0.0422|
|hendrycksTest-sociology | 0|acc | 0.3761|± |0.0450|
| | |acc_norm | 0.3162|± |0.0432|
|hendrycksTest-us_foreign_policy | 0|acc | 0.5000|± |0.0503|
| | |acc_norm | 0.4100|± |0.0494|
|hendrycksTest-virology | 0|acc | 0.3932|± |0.0454|
| | |acc_norm | 0.3248|± |0.0435|
|hendrycksTest-world_religions | 0|acc | 0.5299|± |0.0463|
| | |acc_norm | 0.5128|± |0.0464|
|truthfulqa_gen | 1|bleurt_max |-0.8551|± |0.0501|
| | |bleurt_acc | 0.3590|± |0.0445|
| | |bleurt_diff|-0.1292|± |0.0483|
| | |bleu_max |19.3738|± |1.8461|
| | |bleu_acc | 0.3932|± |0.0454|
| | |bleu_diff |-4.3883|± |2.1748|
| | |rouge1_max |41.8428|± |2.6156|
| | |rouge1_acc | 0.3162|± |0.0432|
| | |rouge1_diff|-8.8583|± |2.7745|
| | |rouge2_max |26.3956|± |2.8311|
| | |rouge2_acc | 0.2137|± |0.0381|
| | |rouge2_diff|-9.5287|± |3.3258|
| | |rougeL_max |39.5215|± |2.5620|
| | |rougeL_acc | 0.3162|± |0.0432|
| | |rougeL_diff|-8.5753|± |2.8259|
|
|
|
|
|
|