ko_en

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1034
Bleu: 0.7026
Gen Len: 25.331
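
The card does not show how to run the model, so here is a minimal inference sketch. It assumes the checkpoint is published as ryusangwon/ko_en, that it translates Korean to English (inferred only from the repository name), and that it keeps the NLLB-200 language codes `kor_Hang` and `eng_Latn` of the base model. None of this is confirmed by the card. The `translate` helper is a hypothetical name introduced for illustration.

```python
MODEL_ID = "ryusangwon/ko_en"  # assumption: repository id of this checkpoint
SRC_LANG = "kor_Hang"          # assumption: Korean source (NLLB-200 code)
TGT_LANG = "eng_Latn"          # assumption: English target (NLLB-200 code)


def translate(texts, max_new_tokens=64):
    """Translate a list of source sentences with the fine-tuned checkpoint."""
    # Imported lazily so that defining the helper does not load the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **inputs,
        # NLLB models select the output language via the first decoder token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["안녕하세요, 만나서 반갑습니다."])` downloads the weights on first use. If the repository does not ship tokenizer files, loading the tokenizer from facebook/nllb-200-distilled-600M instead may work, since the card names that model as the base.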

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
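
The relationship between these values can be checked with a short sketch. It shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear schedule with 500 warmup steps shapes the learning rate. The device count of 1 and the total step count of 19,240 are assumptions (the latter is estimated from the Epoch and Step columns of the training results table), and `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay rule rather than the exact code used in training.

```python
PER_DEVICE_BATCH = 16   # train_batch_size
GRAD_ACCUM = 4          # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: single device, not stated in the card
BASE_LR = 5e-05         # learning_rate
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
TOTAL_STEPS = 19_240    # assumption: about 1,924 steps per epoch x 10 epochs


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         grad_accum=GRAD_ACCUM,
                         num_devices=NUM_DEVICES):
    """Number of examples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for s in (0, 250, 500, 10_000, TOTAL_STEPS):
        print(f"step {s:>6}: lr = {linear_lr(s):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 at step 500 and decays to zero by the final step.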

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
0.621	0.2600	500	0.5442	0.2902	25.0401
0.3793	0.5200	1000	0.3397	0.3621	25.4688
0.3464	0.7799	1500	0.3081	0.3889	25.5025
0.2958	1.0395	2000	0.2868	0.4094	25.2725
0.2858	1.2995	2500	0.2721	0.4249	25.2765
0.2737	1.5595	3000	0.2570	0.4428	25.3789
0.2698	1.8194	3500	0.2431	0.458	25.2655
0.2295	2.0790	4000	0.2327	0.4724	25.3331
0.2105	2.3390	4500	0.2232	0.4856	25.3415
0.2168	2.5990	5000	0.2131	0.4975	25.2527
0.2181	2.8590	5500	0.2038	0.5101	25.341
0.189	3.1185	6000	0.1964	0.5221	25.3155
0.1927	3.3785	6500	0.1893	0.5338	25.3639
0.1804	3.6385	7000	0.1809	0.5486	25.427
0.1805	3.8985	7500	0.1740	0.5605	25.3071
0.1602	4.1581	8000	0.1678	0.5677	25.199
0.1538	4.4180	8500	0.1624	0.5787	25.2593
0.1586	4.6780	9000	0.1566	0.5897	25.2803
0.1559	4.9380	9500	0.1505	0.598	25.3867
0.1308	5.1976	10000	0.1463	0.61	25.3083
0.1234	5.4576	10500	0.1418	0.6184	25.2894
0.1298	5.7175	11000	0.1374	0.6275	25.3331
0.1277	5.9775	11500	0.1324	0.6357	25.2221
0.1234	6.2371	12000	0.1299	0.642	25.3381
0.1173	6.4971	12500	0.1263	0.6507	25.2842
0.1161	6.7571	13000	0.1229	0.6578	25.3069
0.1209	7.0166	13500	0.1197	0.6641	25.3606
0.1072	7.2766	14000	0.1176	0.6686	25.3898
0.1034	7.5366	14500	0.1150	0.6744	25.2982
0.11	7.7966	15000	0.1128	0.6788	25.3561
0.0976	8.0562	15500	0.1110	0.6835	25.2949
0.1058	8.3161	16000	0.1089	0.6883	25.2912
0.0948	8.5761	16500	0.1076	0.6924	25.3665
0.0932	8.8361	17000	0.1061	0.6958	25.3513
0.0936	9.0957	17500	0.1052	0.6967	25.3588
0.0888	9.3556	18000	0.1042	0.701	25.3234
0.0919	9.6156	18500	0.1040	0.7014	25.3289
0.0917	9.8756	19000	0.1034	0.7026	25.331
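
Because the dataset is not documented, its approximate size can only be inferred from the table. The sketch below estimates optimizer steps per epoch from a few (Step, Epoch) pairs copied from the rows above, then multiplies by the total train batch size of 64. It assumes a single device and ignores a possible partial last batch, so the result is a rough figure rather than a reported number.

```python
# (Step, Epoch) pairs copied from the training results table.
ROWS = [(500, 0.2600), (2000, 1.0395), (10000, 5.1976), (19000, 9.8756)]
TOTAL_TRAIN_BATCH_SIZE = 64  # from the hyperparameter list


def estimate_steps_per_epoch(rows):
    """Use the latest row, where rounding of the epoch value matters least."""
    step, epoch = max(rows)
    return round(step / epoch)


steps_per_epoch = estimate_steps_per_epoch(ROWS)
approx_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("steps per epoch (estimate):", steps_per_epoch)      # 1924
    print("training examples (rough estimate):", approx_examples)  # 123136
```

This suggests a training set on the order of 120,000 sentence pairs, which should be read as an estimate only.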

Framework versions

Transformers 4.47.0
PyTorch 2.5.1+cu124
Datasets 3.1.0
Tokenizers 0.21.0

ryusangwon/ko_en
