math_question_topic_detection_T5_12-17-24_v1

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.4043
  • Accuracy: 0.8670
  • Precision: 0.8678
  • Recall: 0.8670
  • F1: 0.8670
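The card does not document the task head or label mapping, so what follows is only a minimal usage sketch. It assumes the checkpoint was saved with a sequence-classification head (so the text-classification pipeline can load it); if the model was instead fine-tuned seq2seq to generate topic strings, a text2text-generation pipeline would be the route. The example question and the printed output are illustrative assumptions, not documented behavior.

```python
from transformers import pipeline

# Minimal usage sketch. Assumes the checkpoint carries a
# sequence-classification head; the model card does not document the
# label mapping, so the printed label is whatever id2label the config defines.
classifier = pipeline(
    "text-classification",
    model="nzm97/math_question_topic_detection_T5_12-17-24_v1",
)

question = "Find the derivative of f(x) = 3x^2 + 2x - 5."  # assumed example input
print(classifier(question))
# e.g. [{'label': '<topic label from the config>', 'score': 0.97}]
```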

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 12
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 2200
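These values map directly onto transformers.TrainingArguments. The sketch below is one plausible reconstruction, not the author's training script: the listed hyperparameters are copied verbatim, while output_dir and the evaluation/logging cadence are assumptions (the 50-step eval interval and 500-step loss cadence are read off the results table below).

```python
from transformers import TrainingArguments

# Plausible mapping of the listed hyperparameters onto TrainingArguments.
# total_train_batch_size (12) is derived, not set directly:
# train_batch_size (4) * gradient_accumulation_steps (3).
training_args = TrainingArguments(
    output_dir="math_question_topic_detection_T5",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=3,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=2200,            # training_steps
    eval_strategy="steps",     # assumed from the 50-step eval cadence
    eval_steps=50,             # assumed
    logging_steps=500,         # assumed: training loss first logged at step 500
)
```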

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| No log        | 0.0513 | 50   | 2.0102          | 0.1929   | 0.1121    | 0.1929 | 0.1006 |
| No log        | 0.1025 | 100  | 1.6675          | 0.4243   | 0.3751    | 0.4243 | 0.3741 |
| No log        | 0.1538 | 150  | 1.3705          | 0.4827   | 0.4186    | 0.4827 | 0.4263 |
| No log        | 0.2051 | 200  | 1.1202          | 0.5942   | 0.5856    | 0.5942 | 0.5685 |
| No log        | 0.2563 | 250  | 1.0150          | 0.6487   | 0.6613    | 0.6487 | 0.6429 |
| No log        | 0.3076 | 300  | 1.0172          | 0.6380   | 0.6743    | 0.6380 | 0.6238 |
| No log        | 0.3589 | 350  | 0.8297          | 0.7095   | 0.6994    | 0.7095 | 0.6993 |
| No log        | 0.4101 | 400  | 0.7494          | 0.7333   | 0.7309    | 0.7333 | 0.7194 |
| No log        | 0.4614 | 450  | 0.6765          | 0.7433   | 0.7573    | 0.7433 | 0.7351 |
| 1.2821        | 0.5126 | 500  | 0.6862          | 0.7510   | 0.7607    | 0.7510 | 0.7454 |
| 1.2821        | 0.5639 | 550  | 0.6518          | 0.7656   | 0.7788    | 0.7656 | 0.7612 |
| 1.2821        | 0.6152 | 600  | 0.6115          | 0.7786   | 0.7795    | 0.7786 | 0.7754 |
| 1.2821        | 0.6664 | 650  | 0.5832          | 0.7925   | 0.7971    | 0.7925 | 0.7895 |
| 1.2821        | 0.7177 | 700  | 0.5504          | 0.7971   | 0.8078    | 0.7971 | 0.7963 |
| 1.2821        | 0.7690 | 750  | 0.5197          | 0.8163   | 0.8198    | 0.8163 | 0.8156 |
| 1.2821        | 0.8202 | 800  | 0.5729          | 0.7932   | 0.8140    | 0.7932 | 0.7924 |
| 1.2821        | 0.8715 | 850  | 0.5184          | 0.8163   | 0.8268    | 0.8163 | 0.8158 |
| 1.2821        | 0.9228 | 900  | 0.5167          | 0.8186   | 0.8243    | 0.8186 | 0.8180 |
| 1.2821        | 0.9740 | 950  | 0.4947          | 0.8309   | 0.8388    | 0.8309 | 0.8303 |
| 0.6382        | 1.0253 | 1000 | 0.5191          | 0.8309   | 0.8392    | 0.8309 | 0.8313 |
| 0.6382        | 1.0766 | 1050 | 0.5115          | 0.8194   | 0.8318    | 0.8194 | 0.8186 |
| 0.6382        | 1.1278 | 1100 | 0.4724          | 0.8309   | 0.8320    | 0.8309 | 0.8304 |
| 0.6382        | 1.1791 | 1150 | 0.4883          | 0.8324   | 0.8350    | 0.8324 | 0.8321 |
| 0.6382        | 1.2303 | 1200 | 0.4628          | 0.8386   | 0.8410    | 0.8386 | 0.8378 |
| 0.6382        | 1.2816 | 1250 | 0.4567          | 0.8324   | 0.8353    | 0.8324 | 0.8326 |
| 0.6382        | 1.3329 | 1300 | 0.4908          | 0.8378   | 0.8429    | 0.8378 | 0.8377 |
| 0.6382        | 1.3841 | 1350 | 0.4606          | 0.8470   | 0.8504    | 0.8470 | 0.8472 |
| 0.6382        | 1.4354 | 1400 | 0.4714          | 0.8455   | 0.8505    | 0.8455 | 0.8460 |
| 0.6382        | 1.4867 | 1450 | 0.4576          | 0.8417   | 0.8436    | 0.8417 | 0.8412 |
| 0.4651        | 1.5379 | 1500 | 0.4409          | 0.8478   | 0.8493    | 0.8478 | 0.8480 |
| 0.4651        | 1.5892 | 1550 | 0.4189          | 0.8570   | 0.8589    | 0.8570 | 0.8573 |
| 0.4651        | 1.6405 | 1600 | 0.4159          | 0.8601   | 0.8623    | 0.8601 | 0.8603 |
| 0.4651        | 1.6917 | 1650 | 0.4295          | 0.8563   | 0.8602    | 0.8563 | 0.8566 |
| 0.4651        | 1.7430 | 1700 | 0.4235          | 0.8601   | 0.8634    | 0.8601 | 0.8601 |
| 0.4651        | 1.7943 | 1750 | 0.4214          | 0.8563   | 0.8593    | 0.8563 | 0.8565 |
| 0.4651        | 1.8455 | 1800 | 0.4169          | 0.8578   | 0.8601    | 0.8578 | 0.8578 |
| 0.4651        | 1.8968 | 1850 | 0.4181          | 0.8624   | 0.8654    | 0.8624 | 0.8626 |
| 0.4651        | 1.9481 | 1900 | 0.4126          | 0.8609   | 0.8619    | 0.8609 | 0.8607 |
| 0.4651        | 1.9993 | 1950 | 0.4077          | 0.8670   | 0.8688    | 0.8670 | 0.8667 |
| 0.4227        | 2.0506 | 2000 | 0.4095          | 0.8632   | 0.8644    | 0.8632 | 0.8634 |
| 0.4227        | 2.1018 | 2050 | 0.4051          | 0.8624   | 0.8637    | 0.8624 | 0.8626 |
| 0.4227        | 2.1531 | 2100 | 0.4049          | 0.8655   | 0.8662    | 0.8655 | 0.8656 |
| 0.4227        | 2.2044 | 2150 | 0.4050          | 0.8686   | 0.8695    | 0.8686 | 0.8687 |
| 0.4227        | 2.2556 | 2200 | 0.4043          | 0.8670   | 0.8678    | 0.8670 | 0.8670 |
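One detail worth noting in this table: Recall equals Accuracy at every checkpoint, which is exactly what weighted-average multi-class metrics produce (weighted recall reduces to accuracy). Under that assumption, a compute_metrics function along these lines would reproduce the columns; this is inferred from the numbers, not documented training code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Plausible metric function for the columns above. Assumes weighted
    averaging, which would explain recall == accuracy at every step."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```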

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3