bengali_qa_model_AGGRO_roberta-base

This model is a fine-tuned version of FacebookAI/roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1743
  • Exact Match: 96.2857
  • F1 Score: 97.2732
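
A minimal inference sketch using the transformers question-answering pipeline, assuming the hub ID Mediocre-Judge/bengali_qa_model_AGGRO_roberta-base for this card; the Bengali question/context pair is an illustrative placeholder, not drawn from the evaluation data:

```python
from transformers import pipeline

# Hub ID assumed from this model card; swap in a local checkpoint path if needed.
qa = pipeline(
    "question-answering",
    model="Mediocre-Judge/bengali_qa_model_AGGRO_roberta-base",
)

# Illustrative Bengali question/context pair (placeholder text).
result = qa(
    question="বাংলাদেশের রাজধানী কোথায়?",
    context="বাংলাদেশের রাজধানী ঢাকা। এটি দেশের বৃহত্তম শহর।",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'ঢাকা'}
```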

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 3407
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 100
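
As a reference, a hedged sketch of how the settings above map onto transformers TrainingArguments (Transformers 4.46 API; output_dir and anything not listed above are illustrative assumptions, not taken from the original training script):

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above. The effective batch size is
# 4 (per device) x 16 (gradient accumulation) = 64, matching
# total_train_batch_size.
training_args = TrainingArguments(
    output_dir="bengali_qa_model_AGGRO_roberta-base",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=3407,
    gradient_accumulation_steps=16,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=100,
)
```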

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 5.9958 | 0.0053 | 1 | 6.0009 | 0.0 | 14.2511 |
| 6.0212 | 0.0107 | 2 | 5.9880 | 0.0 | 14.2134 |
| 5.9773 | 0.0160 | 3 | 5.9623 | 0.0 | 14.2530 |
| 5.9605 | 0.0214 | 4 | 5.9240 | 0.0 | 14.7064 |
| 5.922 | 0.0267 | 5 | 5.8733 | 0.0 | 14.2565 |
| 5.8831 | 0.0321 | 6 | 5.8088 | 0.0 | 14.4523 |
| 5.8306 | 0.0374 | 7 | 5.7290 | 0.0 | 16.0421 |
| 5.7652 | 0.0428 | 8 | 5.6310 | 5.6391 | 34.9089 |
| 5.6731 | 0.0481 | 9 | 5.5117 | 17.1429 | 51.1515 |
| 5.5294 | 0.0535 | 10 | 5.3626 | 31.5789 | 59.2927 |
| 5.4532 | 0.0588 | 11 | 5.1749 | 43.9098 | 64.9926 |
| 5.2211 | 0.0641 | 12 | 4.9679 | 49.9248 | 69.2611 |
| 5.0949 | 0.0695 | 13 | 4.7511 | 53.0075 | 71.4743 |
| 4.8805 | 0.0748 | 14 | 4.5299 | 55.1880 | 72.9875 |
| 4.651 | 0.0802 | 15 | 4.2956 | 57.8195 | 74.4621 |
| 4.4113 | 0.0855 | 16 | 4.0291 | 60.0752 | 75.8558 |
| 4.2577 | 0.0909 | 17 | 3.7318 | 62.2556 | 76.3222 |
| 4.0153 | 0.0962 | 18 | 3.4295 | 63.6842 | 77.0091 |
| 3.6706 | 0.1016 | 19 | 3.1520 | 61.1278 | 76.5172 |
| 3.5342 | 0.1069 | 20 | 2.8936 | 52.7820 | 73.9467 |
| 3.2798 | 0.1123 | 21 | 2.6488 | 46.1654 | 71.9439 |
| 3.1167 | 0.1176 | 22 | 2.4104 | 47.1429 | 71.3425 |
| 2.6525 | 0.1230 | 23 | 2.1745 | 52.2556 | 71.5805 |
| 2.497 | 0.1283 | 24 | 1.9401 | 58.4211 | 72.4925 |
| 2.3689 | 0.1336 | 25 | 1.7185 | 60.2256 | 72.6269 |
| 2.0833 | 0.1390 | 26 | 1.5153 | 60.2256 | 72.8689 |
| 1.8679 | 0.1443 | 27 | 1.3483 | 61.2782 | 74.1167 |
| 1.7384 | 0.1497 | 28 | 1.2158 | 64.5865 | 76.8679 |
| 1.47 | 0.1550 | 29 | 1.1047 | 67.2932 | 78.9366 |
| 1.397 | 0.1604 | 30 | 1.0146 | 70.5263 | 81.1621 |
| 1.2822 | 0.1657 | 31 | 0.9423 | 73.3083 | 83.6952 |
| 0.9928 | 0.1711 | 32 | 0.8767 | 75.2632 | 85.0494 |
| 0.7992 | 0.1764 | 33 | 0.8122 | 77.8947 | 86.9631 |
| 0.897 | 0.1818 | 34 | 0.7455 | 80.6767 | 89.1149 |
| 0.8307 | 0.1871 | 35 | 0.6772 | 83.3835 | 91.3579 |
| 0.8469 | 0.1924 | 36 | 0.6040 | 86.2406 | 93.9573 |
| 0.6431 | 0.1978 | 37 | 0.5333 | 86.8421 | 94.6721 |
| 0.8116 | 0.2031 | 38 | 0.4519 | 87.6692 | 95.6610 |
| 0.6474 | 0.2085 | 39 | 0.3950 | 87.6692 | 95.7701 |
| 0.6241 | 0.2138 | 40 | 0.3626 | 87.6692 | 95.9608 |
| 0.6299 | 0.2192 | 41 | 0.3394 | 87.7444 | 95.9051 |
| 0.2552 | 0.2245 | 42 | 0.3260 | 87.7444 | 95.9297 |
| 0.3891 | 0.2299 | 43 | 0.3234 | 87.6692 | 95.8513 |
| 0.3552 | 0.2352 | 44 | 0.3129 | 87.9699 | 95.6941 |
| 0.2864 | 0.2406 | 45 | 0.2998 | 88.0451 | 95.3209 |
| 0.4347 | 0.2459 | 46 | 0.2798 | 89.4737 | 95.0850 |
| 0.2938 | 0.2513 | 47 | 0.2587 | 90.3759 | 94.9503 |
| 0.2821 | 0.2566 | 48 | 0.2445 | 90.9023 | 95.1257 |
| 0.3619 | 0.2619 | 49 | 0.2320 | 91.3534 | 94.9029 |
| 0.4783 | 0.2673 | 50 | 0.2176 | 91.7293 | 95.0914 |
| 0.1834 | 0.2726 | 51 | 0.2116 | 91.8797 | 95.1105 |
| 0.3803 | 0.2780 | 52 | 0.2054 | 92.1805 | 94.9606 |
| 0.2242 | 0.2833 | 53 | 0.2052 | 92.3308 | 94.9873 |
| 0.1771 | 0.2887 | 54 | 0.2033 | 92.4812 | 95.3112 |
| 0.3369 | 0.2940 | 55 | 0.1978 | 93.0827 | 95.7403 |
| 0.2277 | 0.2994 | 56 | 0.1936 | 93.7594 | 96.3688 |
| 0.2296 | 0.3047 | 57 | 0.1947 | 93.8346 | 96.6249 |
| 0.2281 | 0.3101 | 58 | 0.1939 | 93.9098 | 96.8548 |
| 0.1287 | 0.3154 | 59 | 0.1905 | 94.4361 | 96.9572 |
| 0.191 | 0.3207 | 60 | 0.1865 | 95.0376 | 97.2070 |
| 0.1435 | 0.3261 | 61 | 0.1868 | 94.9624 | 97.1697 |
| 0.1648 | 0.3314 | 62 | 0.1900 | 94.5865 | 96.8381 |
| 0.1668 | 0.3368 | 63 | 0.1889 | 94.9624 | 96.9874 |
| 0.1634 | 0.3421 | 64 | 0.1850 | 95.3383 | 97.0437 |
| 0.2374 | 0.3475 | 65 | 0.1797 | 95.7895 | 97.4394 |
| 0.1382 | 0.3528 | 66 | 0.1768 | 96.3910 | 97.6053 |
| 0.2683 | 0.3582 | 67 | 0.1736 | 96.5414 | 97.6811 |
| 0.1452 | 0.3635 | 68 | 0.1720 | 96.3910 | 97.4557 |
| 0.1796 | 0.3689 | 69 | 0.1704 | 96.4662 | 97.4221 |
| 0.0786 | 0.3742 | 70 | 0.1686 | 96.5414 | 97.4985 |
| 0.2424 | 0.3796 | 71 | 0.1669 | 96.6917 | 97.5989 |
| 0.089 | 0.3849 | 72 | 0.1656 | 96.7669 | 97.6242 |
| 0.2073 | 0.3902 | 73 | 0.1654 | 96.7669 | 97.6238 |
| 0.1657 | 0.3956 | 74 | 0.1663 | 96.5414 | 97.4733 |
| 0.0868 | 0.4009 | 75 | 0.1677 | 96.3158 | 97.4407 |
| 0.1281 | 0.4063 | 76 | 0.1697 | 96.0150 | 97.1804 |
| 0.1729 | 0.4116 | 77 | 0.1705 | 95.8647 | 97.1085 |
| 0.1871 | 0.4170 | 78 | 0.1703 | 96.0150 | 97.2090 |
| 0.1735 | 0.4223 | 79 | 0.1695 | 96.0150 | 97.2090 |
| 0.1239 | 0.4277 | 80 | 0.1700 | 95.9398 | 97.2144 |
| 0.0944 | 0.4330 | 81 | 0.1696 | 95.8647 | 97.1392 |
| 0.2494 | 0.4384 | 82 | 0.1696 | 96.0150 | 97.2896 |
| 0.0746 | 0.4437 | 83 | 0.1689 | 95.8647 | 97.1392 |
| 0.1175 | 0.4490 | 84 | 0.1680 | 96.0150 | 97.2090 |
| 0.2597 | 0.4544 | 85 | 0.1665 | 96.0902 | 97.2082 |
| 0.1567 | 0.4597 | 86 | 0.1656 | 96.0150 | 97.1330 |
| 0.0738 | 0.4651 | 87 | 0.1647 | 96.1654 | 97.2834 |
| 0.1551 | 0.4704 | 88 | 0.1641 | 96.2406 | 97.3586 |
| 0.0965 | 0.4758 | 89 | 0.1634 | 96.0902 | 97.2833 |
| 0.1466 | 0.4811 | 90 | 0.1625 | 96.1654 | 97.3085 |
| 0.115 | 0.4865 | 91 | 0.1619 | 96.6165 | 97.6096 |
| 0.1848 | 0.4918 | 92 | 0.1613 | 96.6165 | 97.5345 |
| 0.0955 | 0.4972 | 93 | 0.1607 | 96.6165 | 97.5405 |
| 0.1348 | 0.5025 | 94 | 0.1603 | 96.6165 | 97.5405 |
| 0.1316 | 0.5079 | 95 | 0.1600 | 96.6165 | 97.4655 |
| 0.1544 | 0.5132 | 96 | 0.1598 | 96.6917 | 97.5407 |
| 0.1746 | 0.5185 | 97 | 0.1596 | 96.6917 | 97.5407 |
| 0.0762 | 0.5239 | 98 | 0.1596 | 96.5414 | 97.3903 |
| 0.1685 | 0.5292 | 99 | 0.1595 | 96.5414 | 97.3903 |
| 0.1243 | 0.5346 | 100 | 0.1595 | 96.6917 | 97.5407 |
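
The Exact Match and F1 columns are the standard SQuAD-style span metrics. A minimal sketch of computing them with the evaluate library, using an illustrative placeholder prediction/reference pair in SQuAD format:

```python
import evaluate

# The "squad" metric implements the exact-match and token-overlap F1
# reported in the table above.
squad_metric = evaluate.load("squad")

# Illustrative placeholder example (not from the evaluation set).
predictions = [{"id": "0", "prediction_text": "ঢাকা"}]
references = [{"id": "0", "answers": {"text": ["ঢাকা"], "answer_start": [20]}}]

scores = squad_metric.compute(predictions=predictions, references=references)
print(scores)  # {'exact_match': 100.0, 'f1': 100.0}
```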

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3