git-base-isg-288

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0937
  • Wer Score: 2.7076
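
Below is a minimal inference sketch, assuming the checkpoint is used for image captioning in the same way as the base microsoft/git-base model; the image path and generation settings are placeholders, not the author's exact setup.

```python
# Minimal usage sketch (assumption: the model is an image-captioning GIT
# checkpoint loaded the same way as microsoft/git-base).
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

checkpoint = "ssalvo41/git-base-isg-288"
processor = AutoProcessor.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

image = Image.open("example.jpg")  # hypothetical input image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    generated_ids = model.generate(pixel_values=pixel_values, max_length=50)

caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```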

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
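
The sketch below shows how these settings map onto transformers TrainingArguments; the output directory is an assumption, and any trainer wiring beyond these arguments is omitted.

```python
# Sketch of the listed hyperparameters expressed as transformers
# TrainingArguments; output_dir is an assumption, the other values mirror
# the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="git-base-isg-288",   # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,   # effective train batch size: 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```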

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer Score |
|:-------------:|:-------:|:----:|:---------------:|:---------:|
| 13.4828       | 5.5882  | 50   | 4.2850          | 16.7473   |
| 3.9992        | 11.1176 | 100  | 0.3655          | 0.7942    |
| 0.2368        | 16.7059 | 150  | 0.0692          | 0.6679    |
| 0.0533        | 22.2353 | 200  | 0.0733          | 0.7004    |
| 0.0339        | 27.8235 | 250  | 0.0765          | 0.8520    |
| 0.0249        | 33.3529 | 300  | 0.0795          | 1.8592    |
| 0.0165        | 38.9412 | 350  | 0.0821          | 2.3827    |
| 0.0074        | 44.4706 | 400  | 0.0861          | 2.0542    |
| 0.0034        | 50.0    | 450  | 0.0885          | 3.0361    |
| 0.0023        | 55.5882 | 500  | 0.0909          | 2.4946    |
| 0.0018        | 61.1176 | 550  | 0.0920          | 2.6426    |
| 0.0016        | 66.7059 | 600  | 0.0930          | 2.6354    |
| 0.0015        | 72.2353 | 650  | 0.0930          | 2.2527    |
| 0.0013        | 77.8235 | 700  | 0.0935          | 2.6859    |
| 0.0013        | 83.3529 | 750  | 0.0937          | 2.7726    |
| 0.0012        | 88.9412 | 800  | 0.0937          | 2.7076    |
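
The Wer Score column above is presumably a word-error-rate metric computed over decoded captions; below is a hedged sketch of such a computation with the evaluate library. The decoding and -100 label masking are assumptions, not the author's exact evaluation code.

```python
# Hedged sketch of a WER-style metric over decoded captions using the
# `evaluate` library; the argmax decoding and -100 label masking are
# assumptions about how the reported Wer Score was obtained.
import numpy as np
import evaluate

wer_metric = evaluate.load("wer")

def compute_metrics(eval_pred, processor):
    logits, labels = eval_pred
    predicted_ids = np.argmax(logits, axis=-1)
    decoded_preds = processor.batch_decode(predicted_ids, skip_special_tokens=True)
    # Replace ignored label positions before decoding the references.
    labels = np.where(labels == -100, processor.tokenizer.pad_token_id, labels)
    decoded_labels = processor.batch_decode(labels, skip_special_tokens=True)
    wer = wer_metric.compute(predictions=decoded_preds, references=decoded_labels)
    return {"wer_score": wer}
```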

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0