	---
	base_model: meta-llama/Llama-2-7b-hf
	tags:
	- generated_from_trainer
	model-index:
	- name: qlora-out
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
	# qlora-out

	This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 0.6420
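
The evaluation loss can be easier to compare across runs when expressed as perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card; the function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


# Final evaluation loss reported in this card.
final_eval_loss = 0.6420
print(f"perplexity: {perplexity(final_eval_loss):.3f}")
```

Under that assumption, a loss of 0.6420 corresponds to a perplexity of roughly 1.90.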

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 10
	- num_epochs: 3
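
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and evaluates a cosine schedule with linear warmup. Several things here are assumptions rather than facts from the card: a single training device (consistent with 2 x 4 = 8), a total of 2100 optimizer steps (the results table ends at step 2080, so the true total is only approximately known), and the schedule formula itself, which follows the common half-cosine decay and may differ in detail from what the training run used.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def cosine_lr(step: int, base_lr: float = 2e-4, warmup_steps: int = 10,
              total_steps: int = 2100) -> float:
    """Linear warmup to base_lr, then half-cosine decay to zero.

    total_steps=2100 is an assumed value, not taken from the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


print(effective_batch_size(per_device=2, grad_accum=4))  # matches total_train_batch_size
for s in (0, 5, 10, 1055, 2100):
    print(s, f"{cosine_lr(s):.6f}")
```

With these assumptions the learning rate peaks at 0.0002 once warmup ends at step 10 and falls to half that value at the midpoint of the decay phase.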

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 0.9758 | 0.03 | 20 | 0.6870 |
	| 0.7228 | 0.06 | 40 | 0.6791 |
	| 0.6804 | 0.09 | 60 | 0.6613 |
	| 0.8117 | 0.11 | 80 | 0.6360 |
	| 0.6458 | 0.14 | 100 | 0.6335 |
	| 0.7509 | 0.17 | 120 | 0.6245 |
	| 0.6174 | 0.2 | 140 | 0.6313 |
	| 0.7549 | 0.23 | 160 | 0.6180 |
	| 0.6015 | 0.26 | 180 | 0.6167 |
	| 0.716 | 0.29 | 200 | 0.6165 |
	| 0.6304 | 0.31 | 220 | 0.6014 |
	| 0.5781 | 0.34 | 240 | 0.6107 |
	| 0.8 | 0.37 | 260 | 0.5949 |
	| 0.6845 | 0.4 | 280 | 0.5953 |
	| 0.5857 | 0.43 | 300 | 0.5940 |
	| 0.6369 | 0.46 | 320 | 0.5889 |
	| 0.4767 | 0.49 | 340 | 0.5946 |
	| 0.4848 | 0.52 | 360 | 0.5991 |
	| 0.9067 | 0.54 | 380 | 0.5943 |
	| 0.5943 | 0.57 | 400 | 0.5854 |
	| 0.6999 | 0.6 | 420 | 0.5941 |
	| 0.5173 | 0.63 | 440 | 0.5887 |
	| 0.4201 | 0.66 | 460 | 0.5952 |
	| 0.667 | 0.69 | 480 | 0.5802 |
	| 0.8568 | 0.72 | 500 | 0.5922 |
	| 0.515 | 0.74 | 520 | 0.5800 |
	| 0.504 | 0.77 | 540 | 0.5894 |
	| 0.6361 | 0.8 | 560 | 0.5983 |
	| 0.4896 | 0.83 | 580 | 0.5770 |
	| 0.6044 | 0.86 | 600 | 0.5717 |
	| 0.4925 | 0.89 | 620 | 0.5715 |
	| 0.4704 | 0.92 | 640 | 0.5707 |
	| 0.5342 | 0.94 | 660 | 0.5748 |
	| 0.755 | 0.97 | 680 | 0.5673 |
	| 0.6547 | 1.0 | 700 | 0.5721 |
	| 0.6014 | 1.03 | 720 | 0.5892 |
	| 0.4692 | 1.06 | 740 | 0.5981 |
	| 0.407 | 1.09 | 760 | 0.5995 |
	| 0.5351 | 1.12 | 780 | 0.5948 |
	| 0.3004 | 1.14 | 800 | 0.5758 |
	| 0.554 | 1.17 | 820 | 0.5862 |
	| 0.6394 | 1.2 | 840 | 0.5850 |
	| 0.7135 | 1.23 | 860 | 0.5900 |
	| 0.6323 | 1.26 | 880 | 0.5931 |
	| 0.3257 | 1.29 | 900 | 0.5902 |
	| 0.5183 | 1.32 | 920 | 0.5763 |
	| 0.5383 | 1.34 | 940 | 0.5842 |
	| 0.453 | 1.37 | 960 | 0.5878 |
	| 0.5305 | 1.4 | 980 | 0.5975 |
	| 0.4316 | 1.43 | 1000 | 0.5829 |
	| 0.5992 | 1.46 | 1020 | 0.5801 |
	| 0.5043 | 1.49 | 1040 | 0.5731 |
	| 0.4566 | 1.52 | 1060 | 0.5777 |
	| 0.4879 | 1.55 | 1080 | 0.5785 |
	| 0.7149 | 1.57 | 1100 | 0.5727 |
	| 0.4555 | 1.6 | 1120 | 0.5824 |
	| 0.5248 | 1.63 | 1140 | 0.5821 |
	| 0.4981 | 1.66 | 1160 | 0.5711 |
	| 0.5595 | 1.69 | 1180 | 0.5931 |
	| 0.577 | 1.72 | 1200 | 0.5898 |
	| 0.3202 | 1.75 | 1220 | 0.5775 |
	| 0.7182 | 1.77 | 1240 | 0.5800 |
	| 0.5608 | 1.8 | 1260 | 0.5668 |
	| 0.5677 | 1.83 | 1280 | 0.5797 |
	| 0.5046 | 1.86 | 1300 | 0.5725 |
	| 0.5165 | 1.89 | 1320 | 0.5709 |
	| 0.6432 | 1.92 | 1340 | 0.5817 |
	| 0.4973 | 1.95 | 1360 | 0.5695 |
	| 0.2903 | 1.97 | 1380 | 0.5762 |
	| 0.3099 | 2.0 | 1400 | 0.5832 |
	| 0.4383 | 2.03 | 1420 | 0.6773 |
	| 0.287 | 2.06 | 1440 | 0.6324 |
	| 0.3395 | 2.09 | 1460 | 0.6600 |
	| 0.2677 | 2.12 | 1480 | 0.6409 |
	| 0.4145 | 2.15 | 1500 | 0.6259 |
	| 0.2435 | 2.17 | 1520 | 0.6528 |
	| 0.2539 | 2.2 | 1540 | 0.6379 |
	| 0.3619 | 2.23 | 1560 | 0.6402 |
	| 0.3289 | 2.26 | 1580 | 0.6355 |
	| 0.4993 | 2.29 | 1600 | 0.6515 |
	| 0.2705 | 2.32 | 1620 | 0.6357 |
	| 0.4863 | 2.35 | 1640 | 0.6385 |
	| 0.356 | 2.37 | 1660 | 0.6364 |
	| 0.3433 | 2.4 | 1680 | 0.6390 |
	| 0.3215 | 2.43 | 1700 | 0.6325 |
	| 0.4795 | 2.46 | 1720 | 0.6336 |
	| 0.3457 | 2.49 | 1740 | 0.6342 |
	| 0.6864 | 2.52 | 1760 | 0.6435 |
	| 0.3965 | 2.55 | 1780 | 0.6447 |
	| 0.3424 | 2.58 | 1800 | 0.6344 |
	| 0.7203 | 2.6 | 1820 | 0.6385 |
	| 0.6209 | 2.63 | 1840 | 0.6475 |
	| 0.3693 | 2.66 | 1860 | 0.6439 |
	| 0.4004 | 2.69 | 1880 | 0.6410 |
	| 0.3499 | 2.72 | 1900 | 0.6392 |
	| 0.4691 | 2.75 | 1920 | 0.6396 |
	| 0.2775 | 2.78 | 1940 | 0.6387 |
	| 0.26 | 2.8 | 1960 | 0.6423 |
	| 0.2917 | 2.83 | 1980 | 0.6432 |
	| 0.4461 | 2.86 | 2000 | 0.6414 |
	| 0.4149 | 2.89 | 2020 | 0.6433 |
	| 0.2863 | 2.92 | 2040 | 0.6428 |
	| 0.1832 | 2.95 | 2060 | 0.6424 |
	| 0.5409 | 2.98 | 2080 | 0.6420 |
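
One way to read the training results table is to locate the evaluation step with the lowest validation loss and compare it with the final checkpoint. The sketch below does this on a hand-picked subset of rows copied from the table; the `parse_rows` helper and the choice of rows are illustrative only, and the full table would be parsed the same way.

```python
# A subset of rows copied from the training results table (not the full log).
ROWS = """
| 0.9758 | 0.03 | 20 | 0.6870 |
| 0.755 | 0.97 | 680 | 0.5673 |
| 0.6547 | 1.0 | 700 | 0.5721 |
| 0.5608 | 1.8 | 1260 | 0.5668 |
| 0.3099 | 2.0 | 1400 | 0.5832 |
| 0.4383 | 2.03 | 1420 | 0.6773 |
| 0.5409 | 2.98 | 2080 | 0.6420 |
"""


def parse_rows(text: str) -> list[dict]:
    """Parse pipe-delimited rows into records with numeric fields."""
    records = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["val_loss"])
final = records[-1]
print(f"best:  step {best['step']} (epoch {best['epoch']}), val_loss {best['val_loss']:.4f}")
print(f"final: step {final['step']} (epoch {final['epoch']}), val_loss {final['val_loss']:.4f}")
```

In the full table the lowest validation loss is 0.5668 at step 1260, while the final value is 0.6420. Validation loss rising through the third epoch while training loss keeps falling is consistent with overfitting, although the card gives too little information about the data to say so with confidence.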


	### Framework versions

	- Transformers 4.34.1
	- Pytorch 2.0.1+cu118
	- Datasets 2.14.6
	- Tokenizers 0.14.1
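
To reproduce the run it may help to check a local environment against the versions listed above. The following is a minimal sketch using only the standard library; it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
PINNED = {
    "transformers": "4.34.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.6",
    "tokenizers": "0.14.1",
}


def installed_versions(packages) -> dict:
    """Return the installed version of each package, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(pinned: dict) -> dict:
    """Map each package whose installed version differs to (wanted, installed)."""
    installed = installed_versions(pinned)
    return {
        name: (want, installed[name])
        for name, want in pinned.items()
        if installed[name] != want
    }


for name, (want, have) in mismatches(PINNED).items():
    print(f"{name}: card lists {want}, found {have}")
```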
