---
license: mit
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: 22_12_13_luther_blocks_larger_fp16_20ep
  results: []
language:
- de
---

# 22_12_13_luther_blocks_larger_fp16_20ep

This model is a fine-tuned version of [stefan-it/german-gpt2-larger](https://huggingface.co/stefan-it/german-gpt2-larger) on a dataset of texts by Martin Luther.
It achieves the following results on the evaluation set:
- Loss: 3.5847
- Accuracy: 0.3168

## Model description

This is a language model used to generate New Year's wishes for the readers of "reformiert", a journal published in Switzerland (https://www.reformiert.info).

## Intended uses & limitations

This model is intended to test the capabilities of the GPT-2 transformer architecture. A short usage example is provided at the end of this card.

## Training and evaluation data

An automatic split of an edited and "cleaned" version of parts of Luther's writings. Cleaning here refers to the process of eliminating para-texts such as page numbering, footnotes, etc.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding `TrainingArguments` is given at the end of this card):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.6   | 50   | 4.6218          | 0.2156   |
| 8.1175        | 3.22  | 100  | 4.0404          | 0.2633   |
| 8.1175        | 4.83  | 150  | 3.8120          | 0.2871   |
| 3.734         | 6.44  | 200  | 3.7062          | 0.2997   |
| 3.734         | 8.06  | 250  | 3.6382          | 0.3082   |
| 3.3639        | 9.67  | 300  | 3.6108          | 0.3128   |
| 3.3639        | 11.29 | 350  | 3.6012          | 0.3148   |
| 3.1363        | 12.89 | 400  | 3.5847          | 0.3168   |
| 3.1363        | 14.51 | 450  | 3.5914          | 0.3180   |
| 2.9884        | 16.13 | 500  | 3.5954          | 0.3177   |
| 2.9884        | 17.73 | 550  | 3.6001          | 0.3176   |
| 2.8748        | 19.35 | 600  | 3.6048          | 0.3188   |

### Framework versions

- Transformers 4.26.0.dev0
- Pytorch 1.13.0
- Datasets 2.7.1
- Tokenizers 0.12.1
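
### Training configuration sketch

This card was generated automatically by the Hugging Face `Trainer`. As a minimal sketch (not the original training script), the hyperparameters listed above would map onto `TrainingArguments` roughly as follows; the `output_dir`, evaluation, and logging step values are assumptions inferred from the results table.

```python
# Minimal sketch (not the original training script): how the hyperparameters
# listed under "Training hyperparameters" map onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="22_12_13_luther_blocks_larger_fp16_20ep",  # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # total train batch size: 16
    num_train_epochs=20.0,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # mixed precision training (native AMP)
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the Trainer defaults.
    evaluation_strategy="steps",     # evaluation every 50 steps, inferred from the results table
    eval_steps=50,
    logging_steps=100,               # training loss logged every 100 steps
)
```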
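
## Example usage

A minimal usage sketch with the `transformers` text-generation pipeline. The model path below is a placeholder (the exact Hub repository id is not stated in this card), and the German prompt is only an illustrative example.

```python
# Minimal usage sketch; the model id below is a placeholder for the actual
# Hub repository id or local checkpoint path of this fine-tuned model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="path/to/22_12_13_luther_blocks_larger_fp16_20ep",
)

# Illustrative German prompt ("For the new year I wish you ...").
prompt = "Zum neuen Jahr wünsche ich euch"
outputs = generator(
    prompt,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    num_return_sequences=3,
)
for out in outputs:
    print(out["generated_text"])
```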