ko_en

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1034
  • Bleu: 0.7026
  • Gen Len: 25.331
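Since the card does not yet include a usage snippet, the sketch below shows one plausible way to run Korean→English inference with the standard NLLB generation API in Transformers. The repo id `ryusangwon/ko_en` comes from this page; the `kor_Hang`/`eng_Latn` language codes, the translation direction, and the sample sentence are assumptions based on the model name, not confirmed by the card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ryusangwon/ko_en"  # repo id from this card

# NLLB tokenizers take a source-language code; kor_Hang is an assumption
# based on the model name (Korean -> English).
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="kor_Hang")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "안녕하세요, 만나서 반갑습니다."  # sample input (hypothetical)
inputs = tokenizer(text, return_tensors="pt")

# NLLB selects the target language by forcing its code as the first
# generated token; eng_Latn is likewise assumed from the model name.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```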

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
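
As referenced above, this is a minimal sketch of how the listed values map onto Seq2SeqTrainingArguments. Only the values from the list come from the card; output_dir, the eval cadence, and predict_with_generate are assumptions (though the results table below does report metrics every 500 steps).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko_en",                # assumed, not stated on the card
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,     # 16 * 4 = 64 total train batch size
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    eval_strategy="steps",             # assumed from the 500-step eval cadence
    eval_steps=500,
    predict_with_generate=True,        # needed to compute Bleu / Gen Len at eval
)
```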

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 0.621         | 0.2600 | 500   | 0.5442          | 0.2902 | 25.0401 |
| 0.3793        | 0.5200 | 1000  | 0.3397          | 0.3621 | 25.4688 |
| 0.3464        | 0.7799 | 1500  | 0.3081          | 0.3889 | 25.5025 |
| 0.2958        | 1.0395 | 2000  | 0.2868          | 0.4094 | 25.2725 |
| 0.2858        | 1.2995 | 2500  | 0.2721          | 0.4249 | 25.2765 |
| 0.2737        | 1.5595 | 3000  | 0.2570          | 0.4428 | 25.3789 |
| 0.2698        | 1.8194 | 3500  | 0.2431          | 0.458  | 25.2655 |
| 0.2295        | 2.0790 | 4000  | 0.2327          | 0.4724 | 25.3331 |
| 0.2105        | 2.3390 | 4500  | 0.2232          | 0.4856 | 25.3415 |
| 0.2168        | 2.5990 | 5000  | 0.2131          | 0.4975 | 25.2527 |
| 0.2181        | 2.8590 | 5500  | 0.2038          | 0.5101 | 25.341  |
| 0.189         | 3.1185 | 6000  | 0.1964          | 0.5221 | 25.3155 |
| 0.1927        | 3.3785 | 6500  | 0.1893          | 0.5338 | 25.3639 |
| 0.1804        | 3.6385 | 7000  | 0.1809          | 0.5486 | 25.427  |
| 0.1805        | 3.8985 | 7500  | 0.1740          | 0.5605 | 25.3071 |
| 0.1602        | 4.1581 | 8000  | 0.1678          | 0.5677 | 25.199  |
| 0.1538        | 4.4180 | 8500  | 0.1624          | 0.5787 | 25.2593 |
| 0.1586        | 4.6780 | 9000  | 0.1566          | 0.5897 | 25.2803 |
| 0.1559        | 4.9380 | 9500  | 0.1505          | 0.598  | 25.3867 |
| 0.1308        | 5.1976 | 10000 | 0.1463          | 0.61   | 25.3083 |
| 0.1234        | 5.4576 | 10500 | 0.1418          | 0.6184 | 25.2894 |
| 0.1298        | 5.7175 | 11000 | 0.1374          | 0.6275 | 25.3331 |
| 0.1277        | 5.9775 | 11500 | 0.1324          | 0.6357 | 25.2221 |
| 0.1234        | 6.2371 | 12000 | 0.1299          | 0.642  | 25.3381 |
| 0.1173        | 6.4971 | 12500 | 0.1263          | 0.6507 | 25.2842 |
| 0.1161        | 6.7571 | 13000 | 0.1229          | 0.6578 | 25.3069 |
| 0.1209        | 7.0166 | 13500 | 0.1197          | 0.6641 | 25.3606 |
| 0.1072        | 7.2766 | 14000 | 0.1176          | 0.6686 | 25.3898 |
| 0.1034        | 7.5366 | 14500 | 0.1150          | 0.6744 | 25.2982 |
| 0.11          | 7.7966 | 15000 | 0.1128          | 0.6788 | 25.3561 |
| 0.0976        | 8.0562 | 15500 | 0.1110          | 0.6835 | 25.2949 |
| 0.1058        | 8.3161 | 16000 | 0.1089          | 0.6883 | 25.2912 |
| 0.0948        | 8.5761 | 16500 | 0.1076          | 0.6924 | 25.3665 |
| 0.0932        | 8.8361 | 17000 | 0.1061          | 0.6958 | 25.3513 |
| 0.0936        | 9.0957 | 17500 | 0.1052          | 0.6967 | 25.3588 |
| 0.0888        | 9.3556 | 18000 | 0.1042          | 0.701  | 25.3234 |
| 0.0919        | 9.6156 | 18500 | 0.1040          | 0.7014 | 25.3289 |
| 0.0917        | 9.8756 | 19000 | 0.1034          | 0.7026 | 25.331  |
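
The Bleu column is on a 0–1 scale and Gen Len is the mean generated length in tokens, which is the output shape of a compute_metrics hook of the kind used in the Transformers translation examples. The sketch below is an assumption about that wiring, not the author's actual code.

```python
import numpy as np
import evaluate

bleu = evaluate.load("bleu")  # reports BLEU on a 0-1 scale, matching the table


def make_compute_metrics(tokenizer):
    """Build a compute_metrics fn for Seq2SeqTrainer (hedged sketch)."""

    def compute_metrics(eval_preds):
        preds, labels = eval_preds
        if isinstance(preds, tuple):
            preds = preds[0]
        # -100 marks ignored label positions; restore pad ids before decoding
        labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
        decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
        decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
        result = bleu.compute(
            predictions=decoded_preds,
            references=[[lab] for lab in decoded_labels],
        )
        # Gen Len: mean count of non-pad tokens in the generated sequences
        gen_len = np.mean(
            [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
        )
        return {"bleu": round(result["bleu"], 4), "gen_len": round(float(gen_len), 4)}

    return compute_metrics
```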

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0