---
license: mit
base_model: microsoft/DialoGPT-small
tags:
- generated_from_trainer
model-index:
- name: DialoGPT-small-FinalFantasyDialogue
  results: []
---

# DialoGPT-small-FinalFantasyDialogue

This model is a fine-tuned version of [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3930

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3955        | 1.0   | 141  | 2.8517          |
| 2.5623        | 1.99  | 282  | 2.1898          |
| 1.9315        | 3.0   | 424  | 1.7076          |
| 1.5264        | 4.0   | 565  | 1.3901          |
| 1.2892        | 4.99  | 706  | 1.1884          |
| 1.1325        | 6.0   | 848  | 1.0805          |
| 1.0404        | 7.0   | 989  | 0.9933          |
| 0.8733        | 8.0   | 1131 | 0.8070          |
| 0.6344        | 9.0   | 1272 | 0.6326          |
| 0.5047        | 9.99  | 1413 | 0.5504          |
| 0.413         | 11.0  | 1555 | 0.5021          |
| 0.3457        | 12.0  | 1696 | 0.4586          |
| 0.3049        | 12.99 | 1837 | 0.4294          |
| 0.2475        | 14.0  | 1979 | 0.4154          |
| 0.2081        | 15.0  | 2120 | 0.3943          |
| 0.1808        | 16.0  | 2262 | 0.3886          |
| 0.1601        | 17.0  | 2403 | 0.3839          |
| 0.1431        | 17.99 | 2544 | 0.3850          |
| 0.1323        | 19.0  | 2686 | 0.3843          |
| 0.1221        | 19.95 | 2820 | 0.3930          |

### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
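
For reference, the hyperparameters listed above map onto a `transformers` `TrainingArguments` configuration roughly as sketched below. This is a reconstruction, not the original training script: the output directory and evaluation strategy are assumptions, and dataset preparation plus the `Trainer` call are omitted because the training data is not documented in this card.

```python
from transformers import TrainingArguments

# Hedged sketch reconstructing the reported configuration.
training_args = TrainingArguments(
    output_dir="DialoGPT-small-FinalFantasyDialogue",  # assumed output path
    learning_rate=0.005,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,  # 32 * 8 = 256 total train batch size
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=20,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch results table
)
```

Adam's betas=(0.9,0.999) and epsilon=1e-08 are the `transformers` defaults, so they need not be set explicitly.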
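
Since this is a DialoGPT checkpoint, inference follows the standard DialoGPT single-turn pattern shown below. This is a minimal sketch: the repository id is inferred from this card's name, and the example prompt is illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, taken from this card's name.
model_name = "DialoGPT-small-FinalFantasyDialogue"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode the user's line followed by the end-of-sequence token.
input_ids = tokenizer.encode(
    "Where is the nearest airship?" + tokenizer.eos_token,
    return_tensors="pt",
)

# Generate a reply and decode only the newly generated tokens.
reply_ids = model.generate(
    input_ids,
    max_length=200,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(reply_ids[:, input_ids.shape[-1]:][0], skip_special_tokens=True))
```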