longformer-combined-2epoch

This model is a fine-tuned version of allenai/longformer-base-4096 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4259
  • Accuracy: 0.71
  • F1: 0.6410
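A minimal loading sketch, assuming the checkpoint carries a sequence-classification head (which the accuracy/F1 metrics suggest); the example text is a placeholder, and the label names come from whatever id2label mapping is stored in the checkpoint config:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "Howard881010/longformer-combined-2epoch"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Longformer accepts inputs up to 4096 tokens.
text = "A long document to classify..."  # placeholder input
inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(pred, model.config.id2label.get(pred))  # label mapping comes from the checkpoint config
```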

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 20
  • eval_batch_size: 20
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
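
As a sketch, these hyperparameters map onto transformers TrainingArguments roughly as follows; output_dir is a placeholder, and the step-based evaluation cadence is inferred from the results table below rather than stated on the card:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir and eval cadence are assumptions.
training_args = TrainingArguments(
    output_dir="longformer-combined-2epoch",
    learning_rate=5e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    seed=42,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), epsilon=1e-08 (the defaults)
    lr_scheduler_type="linear",
    num_train_epochs=2,
    eval_strategy="steps",        # the results table reports eval every 50 steps
    eval_steps=50,
)
```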

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
| 0.7476        | 0.0556 | 50   | 0.7044          | 0.494    | 0.3308 |
| 0.7543        | 0.1111 | 100  | 0.7187          | 0.5215   | 0.3427 |
| 0.6395        | 0.1667 | 150  | 0.7392          | 0.5215   | 0.3427 |
| 0.6322        | 0.2222 | 200  | 0.5797          | 0.648    | 0.5771 |
| 0.5879        | 0.2778 | 250  | 0.5351          | 0.678    | 0.5905 |
| 0.5179        | 0.3333 | 300  | 0.5117          | 0.6975   | 0.6116 |
| 0.5868        | 0.3889 | 350  | 0.5010          | 0.704    | 0.6184 |
| 0.4915        | 0.4444 | 400  | 0.4662          | 0.72     | 0.6348 |
| 0.5408        | 0.5    | 450  | 0.5064          | 0.67     | 0.5990 |
| 0.4393        | 0.5556 | 500  | 0.4584          | 0.6965   | 0.6271 |
| 0.4974        | 0.6111 | 550  | 0.4894          | 0.688    | 0.6183 |
| 0.461         | 0.6667 | 600  | 0.4711          | 0.688    | 0.6184 |
| 0.4446        | 0.7222 | 650  | 0.4520          | 0.719    | 0.6338 |
| 0.4059        | 0.7778 | 700  | 0.5283          | 0.703    | 0.6166 |
| 0.4914        | 0.8333 | 750  | 0.4977          | 0.6655   | 0.5949 |
| 0.4506        | 0.8889 | 800  | 0.4973          | 0.7155   | 0.6303 |
| 0.4433        | 0.9444 | 850  | 0.4571          | 0.6935   | 0.6239 |
| 0.5217        | 1.0    | 900  | 0.4675          | 0.6875   | 0.6177 |
| 0.5323        | 1.0556 | 950  | 0.4639          | 0.691    | 0.6213 |
| 0.4469        | 1.1111 | 1000 | 0.4869          | 0.6815   | 0.6116 |
| 0.4435        | 1.1667 | 1050 | 0.4606          | 0.698    | 0.6286 |
| 0.4596        | 1.2222 | 1100 | 0.4464          | 0.7235   | 0.6384 |
| 0.4454        | 1.2778 | 1150 | 0.4574          | 0.6985   | 0.6291 |
| 0.4308        | 1.3333 | 1200 | 0.4343          | 0.7295   | 0.6446 |
| 0.4656        | 1.3889 | 1250 | 0.4517          | 0.7235   | 0.6384 |
| 0.4057        | 1.4444 | 1300 | 0.4412          | 0.7035   | 0.6343 |
| 0.4322        | 1.5    | 1350 | 0.4306          | 0.7035   | 0.6343 |
| 0.417         | 1.5556 | 1400 | 0.4269          | 0.7305   | 0.6457 |
| 0.3974        | 1.6111 | 1450 | 0.4333          | 0.7295   | 0.6447 |
| 0.41          | 1.6667 | 1500 | 0.4340          | 0.7045   | 0.6354 |
| 0.4132        | 1.7222 | 1550 | 0.4532          | 0.7005   | 0.6312 |
| 0.445         | 1.7778 | 1600 | 0.4399          | 0.733    | 0.6482 |
| 0.4159        | 1.8333 | 1650 | 0.4374          | 0.7325   | 0.6477 |
| 0.3858        | 1.8889 | 1700 | 0.4234          | 0.735    | 0.6503 |
| 0.4641        | 1.9444 | 1750 | 0.4244          | 0.737    | 0.6524 |
| 0.3893        | 2.0    | 1800 | 0.4259          | 0.71     | 0.6410 |
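
The accuracy and F1 columns above can be reproduced with a compute_metrics hook along these lines; the macro averaging is an assumption, since the card does not state which F1 average was used:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Metrics hook in the form the transformers Trainer expects."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        # Macro averaging is an assumption; the card does not document the choice.
        "f1": f1_score(labels, preds, average="macro"),
    }
```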

Framework versions

  • Transformers 4.46.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.1