---
tags:
- generated_from_trainer
model-index:
- name: tuned_XLM_RLarge_after_big
  results: []
---

# tuned_XLM_RLarge_after_big

This model was trained from scratch on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9037

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 10

### Training results

The reported evaluation loss of 0.9037 matches the validation loss at epoch 5 (step 2970). After that point the validation loss rises while the training loss keeps falling. The log lists 9 of the 10 configured epochs.

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.9717        | 1.0   | 594  | 1.5020          |
| 1.4463        | 2.0   | 1188 | 1.1359          |
| 1.1579        | 3.0   | 1782 | 0.9597          |
| 0.9028        | 4.0   | 2376 | 0.9048          |
| 0.6887        | 5.0   | 2970 | 0.9037          |
| 0.3302        | 6.0   | 3564 | 0.9316          |
| 0.2495        | 7.0   | 4158 | 1.0809          |
| 0.1665        | 8.0   | 4752 | 1.1974          |
| 0.1223        | 9.0   | 5346 | 1.3164          |

### Framework versions

- Transformers 4.42.4
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
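
### Inspecting the training log

The training results table above can be checked with a short script. The following is a minimal sketch, not part of the original training code: it copies the table rows into a list, picks the row with the lowest validation loss, and prints the gap between validation and training loss per epoch. The names `RESULTS`, `best_checkpoint`, and `generalization_gaps` are hypothetical helpers introduced only for this illustration.

```python
# Rows copied from the "Training results" table:
# (epoch, step, training_loss, validation_loss)
RESULTS = [
    (1, 594, 1.9717, 1.5020),
    (2, 1188, 1.4463, 1.1359),
    (3, 1782, 1.1579, 0.9597),
    (4, 2376, 0.9028, 0.9048),
    (5, 2970, 0.6887, 0.9037),
    (6, 3564, 0.3302, 0.9316),
    (7, 4158, 0.2495, 1.0809),
    (8, 4752, 0.1665, 1.1974),
    (9, 5346, 0.1223, 1.3164),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[3])


def generalization_gaps(rows):
    """Return (epoch, validation_loss - training_loss) for every row."""
    return [(epoch, val - train) for epoch, _step, train, val in rows]


if __name__ == "__main__":
    epoch, step, _train, val = best_checkpoint(RESULTS)
    print(f"best epoch: {epoch}, step: {step}, validation loss: {val:.4f}")
    for epoch, gap in generalization_gaps(RESULTS):
        print(f"epoch {epoch}: validation - training = {gap:+.4f}")
```

On the logged values, the selected row is epoch 5 at step 2970, which agrees with the evaluation loss reported at the top of this card. The gap turns positive at epoch 4 and grows in every later epoch, which is the usual signature of overfitting. Whether the published weights come from that checkpoint is not stated in the card.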