reverse_add_replicate_eval17_corrupted

This model is a fine-tuned version of an unspecified base model on an unknown dataset. At the last logged evaluation (step 4200), it reached a validation loss of 0.5592 and an accuracy of 0.0.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 64
eval_batch_size: 64
seed: 7658372
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
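
As a rough check on how these settings combine, the sketch below recomputes the effective batch size and evaluates a cosine schedule with linear warmup at a few steps. It rests on several assumptions that are not stated in this card: a single training device, a total of about 4297 optimizer steps (estimated from the log, where step 4200 falls at epoch 0.9774), and the common warmup-then-half-cosine shape. The names `effective_batch_size` and `cosine_lr` are hypothetical helpers written for this illustration, not part of any library.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1

# Assumptions (not stated in the card).
NUM_DEVICES = 1      # single device, so 64 * 2 * 1 = 128
TOTAL_STEPS = 4297   # estimated from the log: 4200 / 0.9774 is about 4297


def effective_batch_size(per_device, accum_steps, devices=NUM_DEVICES):
    """Number of samples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def cosine_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_ratio=WARMUP_RATIO):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 215, 430, 2000, 4297):
        print(f"step {step:>4}: lr = {cosine_lr(step):.6f}")
```

Under these assumptions the warmup covers roughly the first 430 steps, after which the learning rate decays from 0.001 toward zero by the end of the epoch.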

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0	0	2.7152	0.0
4.1279	0.0233	100	2.4441	0.0
3.8869	0.0465	200	2.2535	0.0
3.7932	0.0698	300	2.2601	0.0
4.0653	0.0931	400	2.3063	0.0
3.5939	0.1164	500	2.1550	0.0
3.5464	0.1396	600	2.1346	0.0
2.7715	0.1629	700	1.9327	0.0
2.949	0.1862	800	1.7166	0.0
2.4314	0.2094	900	1.5630	0.0
2.384	0.2327	1000	1.3745	0.0
2.4366	0.2560	1100	1.4244	0.0
2.1071	0.2793	1200	1.3338	0.0
2.1589	0.3025	1300	1.2461	0.0
2.3178	0.3258	1400	1.3081	0.0
1.9503	0.3491	1500	1.3001	0.001
1.9743	0.3724	1600	1.2392	0.0
1.8305	0.3956	1700	1.3122	0.0
2.1996	0.4189	1800	1.2592	0.0
2.0105	0.4422	1900	1.2169	0.001
2.138	0.4654	2000	1.3759	0.0
2.1093	0.4887	2100	1.3241	0.0
1.9048	0.5120	2200	1.2938	0.0
2.0772	0.5353	2300	1.1998	0.0
1.8008	0.5585	2400	1.2685	0.0
1.9558	0.5818	2500	1.3011	0.0
1.9744	0.6051	2600	1.3717	0.0
1.9765	0.6283	2700	1.2421	0.0
2.0307	0.6516	2800	1.2278	0.0
1.9778	0.6749	2900	1.3581	0.0
1.7576	0.6982	3000	1.1796	0.0
1.9729	0.7214	3100	1.1137	0.003
1.6585	0.7447	3200	1.2091	0.0
1.2024	0.7680	3300	1.1949	0.0
0.7904	0.7912	3400	0.9786	0.008
0.6275	0.8145	3500	0.8475	0.001
0.3953	0.8378	3600	0.7642	0.0
0.1835	0.8611	3700	0.6556	0.0
0.111	0.8843	3800	0.6091	0.0
0.1189	0.9076	3900	0.6340	0.0
0.0729	0.9309	4000	0.6288	0.0
0.0609	0.9542	4100	0.5450	0.0
0.0449	0.9774	4200	0.5592	0.0
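
The log above can be summarized programmatically. The following is a minimal sketch, assuming the rows are tab-separated in the column order shown (training loss, epoch, step, validation loss, accuracy) and that "No log" marks a missing training loss. Only a few rows are copied in for brevity, and `parse_log` is a hypothetical helper name.

```python
# A few rows copied from the log above (tab-separated).
LOG_EXCERPT = """\
No log\t0\t0\t2.7152\t0.0
0.7904\t0.7912\t3400\t0.9786\t0.008
0.0729\t0.9309\t4000\t0.6288\t0.0
0.0609\t0.9542\t4100\t0.5450\t0.0
0.0449\t0.9774\t4200\t0.5592\t0.0
"""


def parse_log(text):
    """Turn tab-separated log rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split("\t")
        rows.append({
            # "No log" means no training loss was recorded at that step.
            "train_loss": None if train_loss == "No log" else float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return rows


rows = parse_log(LOG_EXCERPT)
best_loss = min(rows, key=lambda r: r["val_loss"])
best_acc = max(rows, key=lambda r: r["accuracy"])

if __name__ == "__main__":
    print("lowest validation loss:", best_loss["val_loss"],
          "at step", best_loss["step"])
    print("highest accuracy:", best_acc["accuracy"],
          "at step", best_acc["step"])
```

Within this excerpt, as in the full table, the lowest validation loss is 0.5450 at step 4100 and the highest accuracy is 0.008 at step 3400.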