---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: ko_en
  results: []
---

# ko_en

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3202
- Bleu: 0.3969
- Gen Len: 26.6275

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 0.6692        | 0.2807 | 500   | 0.5532          | 0.2901 | 26.5817 |
| 0.4137        | 0.5614 | 1000  | 0.3736          | 0.3364 | 26.4784 |
| 0.3748        | 0.8421 | 1500  | 0.3566          | 0.3507 | 26.5661 |
| 0.353         | 1.1224 | 2000  | 0.3484          | 0.3599 | 26.4103 |
| 0.3389        | 1.4031 | 2500  | 0.3415          | 0.3644 | 26.6078 |
| 0.3464        | 1.6838 | 3000  | 0.3362          | 0.3683 | 26.4936 |
| 0.3501        | 1.9645 | 3500  | 0.3310          | 0.375  | 26.6515 |
| 0.3173        | 2.2448 | 4000  | 0.3311          | 0.3729 | 26.4372 |
| 0.3073        | 2.5255 | 4500  | 0.3275          | 0.378  | 26.556  |
| 0.3056        | 2.8062 | 5000  | 0.3243          | 0.3811 | 26.5058 |
| 0.2789        | 3.0865 | 5500  | 0.3244          | 0.3843 | 26.5323 |
| 0.2808        | 3.3672 | 6000  | 0.3229          | 0.3824 | 26.6117 |
| 0.277         | 3.6479 | 6500  | 0.3215          | 0.3857 | 26.4873 |
| 0.2936        | 3.9286 | 7000  | 0.3189          | 0.388  | 26.6207 |
| 0.2641        | 4.2088 | 7500  | 0.3205          | 0.3889 | 26.6148 |
| 0.2675        | 4.4895 | 8000  | 0.3199          | 0.3901 | 26.543  |
| 0.2565        | 4.7702 | 8500  | 0.3170          | 0.392  | 26.5881 |
| 0.2502        | 5.0505 | 9000  | 0.3197          | 0.3919 | 26.6686 |
| 0.2472        | 5.3312 | 9500  | 0.3199          | 0.3921 | 26.6675 |
| 0.2613        | 5.6119 | 10000 | 0.3170          | 0.3918 | 26.5227 |
| 0.2593        | 5.8926 | 10500 | 0.3168          | 0.3952 | 26.6377 |
| 0.2432        | 6.1729 | 11000 | 0.3188          | 0.3938 | 26.5724 |
| 0.2317        | 6.4536 | 11500 | 0.3184          | 0.3934 | 26.6351 |
| 0.2254        | 6.7343 | 12000 | 0.3185          | 0.3943 | 26.6772 |
| 0.2253        | 7.0146 | 12500 | 0.3192          | 0.3966 | 26.6785 |
| 0.2368        | 7.2953 | 13000 | 0.3189          | 0.3959 | 26.6508 |
| 0.2396        | 7.576  | 13500 | 0.3184          | 0.3949 | 26.6651 |
| 0.2233        | 7.8567 | 14000 | 0.3185          | 0.3966 | 26.6405 |
| 0.2289        | 8.1370 | 14500 | 0.3200          | 0.3959 | 26.6969 |
| 0.2322        | 8.4177 | 15000 | 0.3199          | 0.3956 | 26.58   |
| 0.2233        | 8.6984 | 15500 | 0.3195          | 0.3957 | 26.5942 |
| 0.231         | 8.9791 | 16000 | 0.3188          | 0.3977 | 26.6186 |
| 0.2186        | 9.2594 | 16500 | 0.3203          | 0.3964 | 26.6423 |
| 0.2222        | 9.5401 | 17000 | 0.3205          | 0.3967 | 26.632  |
| 0.2196        | 9.8208 | 17500 | 0.3202          | 0.3969 | 26.6275 |

### Framework versions

- Transformers 4.47.0
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
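
### Reading the hyperparameters

The hyperparameters and the step/epoch columns reported in this card imply a few quantities that are not stated directly. The sketch below works them out in plain Python. It is an illustration, not the training code: the single-device setup is an assumption, the constant and function names (`linear_lr`, `steps_per_epoch`, and so on) are hypothetical, and the derived figures are estimates from the rounded epoch values in the results table. The schedule function follows the usual shape of a linear schedule with warmup, which is what `lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 500` suggests.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500
NUM_EPOCHS = 10

# Assumption: one device, so the effective batch is batch size * accumulation.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS  # 64, as total_train_batch_size

# Estimate from the last logged row: step 17500 was reached at epoch 9.8208.
steps_per_epoch = 17500 / 9.8208                  # roughly 1782 optimizer steps
approx_train_examples = steps_per_epoch * effective_batch  # roughly 114k examples
total_steps = round(steps_per_epoch * NUM_EPOCHS)  # estimated length of the run


def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(f"effective batch size: {effective_batch}")
    print(f"estimated steps per epoch: {steps_per_epoch:.1f}")
    print(f"estimated training examples: {approx_train_examples:.0f}")
    for step in (0, 250, 500, 9000, total_steps):
        lr = linear_lr(step, LEARNING_RATE, WARMUP_STEPS, total_steps)
        print(f"step {step:>6}: lr = {lr:.3e}")
```

Under these assumptions the learning rate peaks at step 500, the first row of the results table, and has decayed to a small fraction of its peak by the final logged step, which is consistent with the validation loss and BLEU flattening out over the last few epochs.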