---
license: mit
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: HuggingFaceH4/zephyr-7b-beta
model-index:
- name: zephyr_instruct_generation
  results: []
---

# zephyr_instruct_generation

This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2879

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 0.03
- training_steps: 400

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3773        | 0.77  | 10   | 2.9712          |
| 2.8948        | 1.54  | 20   | 2.5856          |
| 2.4289        | 2.31  | 30   | 2.2310          |
| 2.0707        | 3.08  | 40   | 1.8374          |
| 1.6398        | 3.85  | 50   | 1.4469          |
| 1.2391        | 4.62  | 60   | 1.2461          |
| 1.1218        | 5.38  | 70   | 1.0960          |
| 0.969         | 6.15  | 80   | 0.9912          |
| 0.8524        | 6.92  | 90   | 0.9169          |
| 0.795         | 7.69  | 100  | 0.8785          |
| 0.7214        | 8.46  | 110  | 0.8506          |
| 0.6816        | 9.23  | 120  | 0.7987          |
| 0.6392        | 10.0  | 130  | 0.8146          |
| 0.5615        | 10.77 | 140  | 0.8004          |
| 0.5619        | 11.54 | 150  | 0.7833          |
| 0.4588        | 12.31 | 160  | 0.7906          |
| 0.4674        | 13.08 | 170  | 0.8011          |
| 0.4311        | 13.85 | 180  | 0.8330          |
| 0.3866        | 14.62 | 190  | 0.8773          |
| 0.3581        | 15.38 | 200  | 0.9049          |
| 0.3656        | 16.15 | 210  | 0.8357          |
| 0.3231        | 16.92 | 220  | 0.9665          |
| 0.3144        | 17.69 | 230  | 0.9519          |
| 0.3019        | 18.46 | 240  | 0.9800          |
| 0.2953        | 19.23 | 250  | 1.0051          |
| 0.2822        | 20.0  | 260  | 1.0017          |
| 0.2573        | 20.77 | 270  | 1.0349          |
| 0.2641        | 21.54 | 280  | 1.1069          |
| 0.2494        | 22.31 | 290  | 1.1577          |
| 0.2456        | 23.08 | 300  | 1.1906          |
| 0.2393        | 23.85 | 310  | 1.0416          |
| 0.2293        | 24.62 | 320  | 1.2253          |
| 0.2286        | 25.38 | 330  | 1.1886          |
| 0.2341        | 26.15 | 340  | 1.1371          |
| 0.2222        | 26.92 | 350  | 1.3020          |
| 0.2159        | 27.69 | 360  | 1.2110          |
| 0.2185        | 28.46 | 370  | 1.2803          |
| 0.2192        | 29.23 | 380  | 1.1759          |
| 0.2203        | 30.0  | 390  | 1.2866          |
| 0.2187        | 30.77 | 400  | 1.2879          |

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0
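### How to use

The card does not include a usage snippet, so the following is a minimal sketch of how a PEFT adapter like this one is typically loaded on top of its base model. The adapter id `zephyr_instruct_generation` is a placeholder for wherever the adapter weights are actually hosted, and the single-turn prompt format follows zephyr-7b-beta's documented chat style.

```python
# Sketch: load the LoRA adapter on top of HuggingFaceH4/zephyr-7b-beta with PEFT.
# The adapter id is a placeholder; point it at the real adapter repo or local path.

def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in zephyr-7b-beta's chat style."""
    return f"<|user|>\n{user_message}</s>\n<|assistant|>\n"

def load_model(adapter_id: str = "zephyr_instruct_generation",
               base_id: str = "HuggingFaceH4/zephyr-7b-beta"):
    # Imports are local so build_prompt stays usable without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    # Wrap the base model with the trained adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer
```

Downloading the 7B base model requires substantial memory; quantized loading (e.g. `load_in_4bit` via bitsandbytes) is a common alternative on smaller GPUs.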