	---
	language:
	- en
	license: other
	tags:
	- axolotl
	- generated_from_trainer
	- instruct
	- finetune
	- chatml
	- gpt4
	- synthetic data
	- science
	- physics
	- chemistry
	- biology
	- math
	- llama
	- llama3
	base_model: meta-llama/Meta-Llama-3-8B
	datasets:
	- allenai/ai2_arc
	- camel-ai/physics
	- camel-ai/chemistry
	- camel-ai/biology
	- camel-ai/math
	- metaeval/reclor
	- openbookqa
	- mandyyyyii/scibench
	- derek-thomas/ScienceQA
	- TIGER-Lab/ScienceEval
	- jondurbin/airoboros-3.2
	- LDJnr/Capybara
	- Cot-Alpaca-GPT4-From-OpenHermes-2.5
	- STEM-AI-mtl/Electrical-engineering
	- knowrohit07/saraswati-stem
	- sablo/oasst2_curated
	- lmsys/lmsys-chat-1m
	- TIGER-Lab/MathInstruct
	- bigbio/med_qa
	- meta-math/MetaMathQA-40K
	- piqa
	- scibench
	- sciq
	- Open-Orca/SlimOrca
	- migtissera/Synthia-v1.3
	- allenai/WildChat
	- microsoft/orca-math-word-problems-200k
	- openchat/openchat_sharegpt4_dataset
	- teknium/GPTeacher-General-Instruct
	- m-a-p/CodeFeedback-Filtered-Instruction
	- totally-not-an-llm/EverythingLM-data-V3
	- HuggingFaceH4/no_robots
	- OpenAssistant/oasst_top1_2023-08-25
	- WizardLM/WizardLM_evol_instruct_70k
	model-index:
	- name: Einstein-v6.1-Llama3-8B
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge (25-Shot)
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc_norm
	      value: 62.46
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: HellaSwag (10-Shot)
	      type: hellaswag
	      split: validation
	      args:
	        num_few_shot: 10
	    metrics:
	    - type: acc_norm
	      value: 82.41
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU (5-Shot)
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 66.19
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA (0-shot)
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: mc2
	      value: 55.1
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande (5-shot)
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 79.32
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k (5-shot)
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 66.11
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: IFEval (0-Shot)
	      type: HuggingFaceH4/ifeval
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: inst_level_strict_acc and prompt_level_strict_acc
	      value: 45.68
	      name: strict accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: BBH (3-Shot)
	      type: BBH
	      args:
	        num_few_shot: 3
	    metrics:
	    - type: acc_norm
	      value: 29.38
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MATH Lvl 5 (4-Shot)
	      type: hendrycks/competition_math
	      args:
	        num_few_shot: 4
	    metrics:
	    - type: exact_match
	      value: 5.74
	      name: exact match
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GPQA (0-shot)
	      type: Idavidrein/gpqa
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: acc_norm
	      value: 4.25
	      name: acc_norm
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MuSR (0-shot)
	      type: TAUR-Lab/MuSR
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: acc_norm
	      value: 11.23
	      name: acc_norm
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU-PRO (5-shot)
	      type: TIGER-Lab/MMLU-Pro
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 23.68
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B
	      name: Open LLM Leaderboard
	---
	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/5s12oq859qLfDkkTNam_C.png)

	# 🔬 Einstein-v6.1-Llama3-8B

	This model is a fully fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B), trained on diverse datasets.

	It was fine-tuned on `8xRTX3090` + `1xRTXA6000` GPUs with [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).

	This model's training was sponsored by [sablo.ai](https://sablo.ai).

	<details><summary>See axolotl config</summary>

	axolotl version: `0.4.0`
	```yaml
	base_model: meta-llama/Meta-Llama-3-8B
	model_type: LlamaForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	chat_template: chatml
	datasets:
	  - path: data/merged_all.json
	    ds_type: json
	    type: alpaca
	    conversation: chatml

	  - path: data/gpteacher-instruct-special-alpaca.json
	    ds_type: json
	    type: gpteacher
	    conversation: chatml

	  - path: data/wizardlm_evol_instruct_70k_random_half.json
	    ds_type: json
	    type: alpaca
	    conversation: chatml

	  - path: data/capybara_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/synthia-v1.3_sharegpt_12500.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/slimorca_dedup_filtered_95k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/pippa_bagel_repo_3k_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/gpt4_data_lmys_1m_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/sharegpt_gpt4_english.json
	    ds_type: json
	    type: sharegpt
	    conversation: chatml

	  - path: data/no_robots_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	  - path: data/everythinglm-data-v3_sharegpt.json
	    ds_type: json
	    type: sharegpt
	    strict: false
	    conversation: chatml

	dataset_prepared_path: last_run_prepared
	val_set_size: 0.002

	output_dir: ./Einstein-v6.1-Llama3-8B-model

	sequence_len: 8192
	sample_packing: true
	pad_to_sequence_len: true
	eval_sample_packing: false

	wandb_project: Einstein
	wandb_entity:
	wandb_watch:
	wandb_name: Einstein-v6.1-Llama3-2-epoch
	wandb_log_model:
	hub_model_id: Weyaxi/Einstein-v6.1-Llama3-8B

	save_safetensors: true

	gradient_accumulation_steps: 4
	micro_batch_size: 1
	num_epochs: 2
	optimizer: adamw_bnb_8bit # look
	lr_scheduler: cosine
	learning_rate: 0.000005 # look

	train_on_inputs: false
	group_by_length: false
	bf16: true
	fp16: false
	tf32: false

	gradient_checkpointing: true
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_steps: 10
	evals_per_epoch: 2
	eval_table_size:
	eval_table_max_new_tokens: 128
	saves_per_epoch: 2
	debug:

	deepspeed: zero3_bf16_cpuoffload_params.json
	weight_decay: 0.0
	fsdp:
	fsdp_config:
	special_tokens:
	  bos_token: "<s>"
	  eos_token: "<|im_end|>"
	  unk_token: "<unk>"
	  pad_token: <|end_of_text|> # changed
	tokens:
	  - "<|im_start|>"
	```
	</details><br>
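
The config sets `lr_scheduler: cosine`, `learning_rate: 0.000005`, and `warmup_steps: 10`. As a rough illustration of what that schedule looks like over the 2026 total steps reported later in this card, here is a minimal sketch. It assumes the common linear-warmup-then-half-cosine formula; the scheduler actually used by axolotl may differ in detail, and `lr_at` is a hypothetical helper, not part of any library.

```python
import math

BASE_LR = 5e-6       # learning_rate from the config
WARMUP_STEPS = 10    # warmup_steps from the config
TOTAL_STEPS = 2026   # total step count reported in this card


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed formula)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 5, 10, 1018, 2026):
    print(f"step {step:>4}: lr = {lr_at(step):.2e}")
```

Under these assumptions the rate peaks at step 10, is half the peak around step 1018, and reaches zero at the final step.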

	# 💬 Prompt Template

	You can use the ChatML prompt template with this model:

	### ChatML

	```
	<|im_start|>system
	{system}<|im_end|>
	<|im_start|>user
	{user}<|im_end|>
	<|im_start|>assistant
	{assistant}<|im_end|>
	```
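
To make the template concrete, the sketch below renders a message list into a ChatML string by hand. `format_chatml` is a hypothetical helper written for illustration only; the exact whitespace and any BOS handling are assumptions, and the chat template shipped with the tokenizer remains the authoritative format.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model writes the reply next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```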

	This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
	`tokenizer.apply_chat_template()` method:

	```python
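	from transformers import AutoModelForCausalLM, AutoTokenizer

	tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v6.1-Llama3-8B")
	model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v6.1-Llama3-8B")

	messages = [
	    {"role": "system", "content": "You are a helpful AI assistant."},
	    {"role": "user", "content": "Hello!"},
	]
	# return_dict=True yields input_ids and attention_mask, so ** unpacking works
	gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
	output = model.generate(**gen_input)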
	```
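
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded text contains the whole conversation. One simple way to pull out just the reply is to work on the decoded string. The sketch below assumes the text was decoded with the ChatML markers left in place; `extract_reply` is a hypothetical helper, not part of `transformers`.

```python
def extract_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a ChatML transcript."""
    marker = "<|im_start|>assistant\n"
    start = decoded.rfind(marker)
    if start == -1:
        # No assistant turn found: fall back to the whole (stripped) text.
        return decoded.strip()
    reply = decoded[start + len(marker):]
    # Cut at the end-of-turn token if the model emitted one.
    return reply.split("<|im_end|>", 1)[0].strip()


sample = (
    "<|im_start|>user\nHello!<|im_end|>\n"
    "<|im_start|>assistant\nHi! How can I help?<|im_end|>"
)
print(extract_reply(sample))  # prints: Hi! How can I help?
```

An alternative is to slice the output tensor at the prompt length before decoding, which avoids string matching altogether.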

	# 📊 Datasets used in this model

	The datasets used to train this model are listed in the metadata section of the model card.

	Please note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria.

	The results of this filtering process are in the data folder of this repository:

	[Weyaxi/Einstein-v6.1-Llama3-8B/data](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B/tree/main/data)

	# 🔄 Quantized versions

	## GGUF [@bartowski](https://huggingface.co/bartowski)

	- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-GGUF

	## ExLlamaV2 [@bartowski](https://huggingface.co/bartowski)

	- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-exl2

	## AWQ [@solidrust](https://huggingface.co/solidrust)

	- https://huggingface.co/solidrust/Einstein-v6.1-Llama3-8B-AWQ

	# 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)

	|Metric                           |Value|
	|---------------------------------|----:|
	|Avg.                             |68.60|
	|AI2 Reasoning Challenge (25-Shot)|62.46|
	|HellaSwag (10-Shot)              |82.41|
	|MMLU (5-Shot)                    |66.19|
	|TruthfulQA (0-shot)              |55.10|
	|Winogrande (5-shot)              |79.32|
	|GSM8k (5-shot)                   |66.11|

	# 🎯 [Open LLM Leaderboard v2 Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)

	|Metric             |Value|
	|-------------------|----:|
	|Avg.               |19.99|
	|IFEval (0-Shot)    |45.68|
	|BBH (3-Shot)       |29.38|
	|MATH Lvl 5 (4-Shot)| 5.74|
	|GPQA (0-shot)      | 4.25|
	|MuSR (0-shot)      |11.23|
	|MMLU-PRO (5-shot)  |23.68|
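
The `Avg.` rows in both leaderboard tables are consistent with a plain arithmetic mean of the six benchmark scores. The following sketch recomputes them from the values listed in this card; the unweighted mean is an assumption about how the averages were derived.

```python
v1_scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.46,
    "HellaSwag (10-Shot)": 82.41,
    "MMLU (5-Shot)": 66.19,
    "TruthfulQA (0-shot)": 55.10,
    "Winogrande (5-shot)": 79.32,
    "GSM8k (5-shot)": 66.11,
}
v2_scores = {
    "IFEval (0-Shot)": 45.68,
    "BBH (3-Shot)": 29.38,
    "MATH Lvl 5 (4-Shot)": 5.74,
    "GPQA (0-shot)": 4.25,
    "MuSR (0-shot)": 11.23,
    "MMLU-PRO (5-shot)": 23.68,
}


def average(scores: dict) -> float:
    """Unweighted mean of the benchmark scores."""
    return sum(scores.values()) / len(scores)


print(f"v1 Avg. {average(v1_scores):.2f}")  # prints: v1 Avg. 68.60
print(f"v2 Avg. {average(v2_scores):.2f}")  # prints: v2 Avg. 19.99
```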


	# 📚 Some resources, discussions and reviews about this model

	#### 🐦 Announcement tweet:

	- https://twitter.com/Weyaxi/status/1783050724659675627

	#### 🔍 Reddit post in r/LocalLLaMA:

	- https://www.reddit.com/r/LocalLLaMA/comments/1cdlym1/introducing_einstein_v61_based_on_the_new_llama3/

	#### ▶️ YouTube Video(s)

	- [Install Einstein v6.1 Llama3-8B Locally on Windows](https://www.youtube.com/watch?v=VePvv6OM0JY)

	#### 📱 Octopus-V4-3B

	- [Octopus-V4-3B](https://huggingface.co/NexaAIDev/Octopus-v4) leverages the incredible physics capabilities of [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B) in their model.

	# 🤖 Additional information about training

	This model was fully fine-tuned for 2 epochs.

	The total number of training steps was 2026.
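
The step count can be related to the config values (`micro_batch_size: 1`, `gradient_accumulation_steps: 4`, `sequence_len: 8192`) with some quick arithmetic. The sketch below assumes all nine GPUs acted as data-parallel ranks, which the card does not state explicitly. Because sample packing pads each sequence to `sequence_len`, the token figure is an upper bound on token slots, not a count of real training tokens.

```python
NUM_GPUS = 8 + 1          # 8x RTX 3090 + 1x RTX A6000 (assumed all data-parallel)
MICRO_BATCH_SIZE = 1      # micro_batch_size from the config
GRAD_ACCUM_STEPS = 4      # gradient_accumulation_steps from the config
SEQUENCE_LEN = 8192       # sequence_len from the config
TOTAL_STEPS = 2026        # reported total number of steps
NUM_EPOCHS = 2            # num_epochs from the config

# Packed sequences consumed per optimizer step across all GPUs.
sequences_per_step = NUM_GPUS * MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS
# Token slots per optimizer step (packed sequences are padded to SEQUENCE_LEN).
token_slots_per_step = sequences_per_step * SEQUENCE_LEN
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
token_slots_per_epoch = steps_per_epoch * token_slots_per_step

print(f"sequences per step:    {sequences_per_step}")
print(f"token slots per step:  {token_slots_per_step:,}")
print(f"steps per epoch:       {steps_per_epoch}")
print(f"token slots per epoch: {token_slots_per_epoch:,}")
```

Under these assumptions, each epoch covers at most roughly 299 million token slots.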

	<details><summary>Loss graph</summary>

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/Ycs7ZpoqmxFt0u9rybCO1.png)

	</details><br>

	# 🤝 Acknowledgments

	Thanks to [sablo.ai](https://sablo.ai) for sponsoring this model.

	Thanks to all the dataset authors mentioned in the datasets section.

	Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for the training framework I used to build this model.

	Thanks to the entire open-source AI community.

	[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

	If you would like to support me:

	[☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)