math_question_grade_detection_T5_12-17-24_bert

This model is a fine-tuned version of allenai/scibert_scivocab_uncased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4953
  • Accuracy: 0.8547
  • Precision: 0.8561
  • Recall: 0.8547
  • F1: 0.8550
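
These are standard evaluation-set metrics; since the reported recall matches accuracy exactly, precision/recall/F1 were most likely computed with weighted averaging. Below is a minimal sketch of a `compute_metrics` function that would produce numbers in this format, assuming scikit-learn; the function itself is not part of this card.

```python
# Hypothetical compute_metrics sketch; weighted averaging is inferred from
# recall == accuracy in the reported numbers, not documented by the card.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```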

Model description

A sequence-classification fine-tune of allenai/scibert_scivocab_uncased (BERT-base architecture, ~110M parameters, stored as F32 safetensors). Judging by the model name, it classifies math questions by grade level; the label set and task definition are otherwise undocumented.
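
A minimal inference sketch, assuming the checkpoint loads through the standard `AutoModelForSequenceClassification` API (the example question is illustrative; the label mapping comes from the model's config):

```python
# Minimal inference sketch; assumes a standard sequence-classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "nzm97/math_question_grade_detection_T5_12-17-24_bert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Solve for x: 3x + 5 = 20."  # illustrative input
inputs = tokenizer(question, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred_id = logits.argmax(dim=-1).item()
print(model.config.id2label[pred_id])  # predicted grade label from the config
```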

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them appears after the list):

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1500
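
A minimal sketch mapping these settings onto transformers' `TrainingArguments`, assuming the standard `Trainer` API; the evaluation and logging cadence (every 50 and 500 steps) is inferred from the results table below, not stated by the card:

```python
# Hypothetical reproduction of the listed hyperparameters. output_dir and the
# eval/logging cadence are assumptions; everything else matches the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="math_question_grade_detection",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1500,
    eval_strategy="steps",  # evaluation every 50 steps, per the table below
    eval_steps=50,
    logging_steps=500,      # matches the "No log" pattern in the table
)
```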

Training results

"No log" in the training-loss column means the running training loss had not yet been recorded; it was logged every 500 steps.

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| No log        | 0.1366 | 50   | 1.7038          | 0.3528   | 0.3718    | 0.3528 | 0.3113 |
| No log        | 0.2732 | 100  | 1.2605          | 0.4958   | 0.5148    | 0.4958 | 0.4709 |
| No log        | 0.4098 | 150  | 1.0502          | 0.5688   | 0.6510    | 0.5688 | 0.5465 |
| No log        | 0.5464 | 200  | 0.9026          | 0.6441   | 0.6991    | 0.6441 | 0.6399 |
| No log        | 0.6831 | 250  | 0.9158          | 0.6418   | 0.6649    | 0.6418 | 0.6177 |
| No log        | 0.8197 | 300  | 0.8312          | 0.6956   | 0.7294    | 0.6956 | 0.6925 |
| No log        | 0.9563 | 350  | 0.7025          | 0.7194   | 0.7175    | 0.7194 | 0.7124 |
| No log        | 1.0929 | 400  | 0.7299          | 0.7248   | 0.7462    | 0.7248 | 0.7247 |
| No log        | 1.2295 | 450  | 0.7050          | 0.7356   | 0.7467    | 0.7356 | 0.7354 |
| 1.0321        | 1.3661 | 500  | 0.6515          | 0.7479   | 0.7642    | 0.7479 | 0.7458 |
| 1.0321        | 1.5027 | 550  | 0.6121          | 0.7709   | 0.7759    | 0.7709 | 0.7706 |
| 1.0321        | 1.6393 | 600  | 0.6158          | 0.7648   | 0.7736    | 0.7648 | 0.7641 |
| 1.0321        | 1.7760 | 650  | 0.5397          | 0.7848   | 0.7895    | 0.7848 | 0.7852 |
| 1.0321        | 1.9126 | 700  | 0.5654          | 0.7925   | 0.8017    | 0.7925 | 0.7897 |
| 1.0321        | 2.0492 | 750  | 0.5194          | 0.8094   | 0.8173    | 0.8094 | 0.8115 |
| 1.0321        | 2.1858 | 800  | 0.5017          | 0.8155   | 0.8159    | 0.8155 | 0.8154 |
| 1.0321        | 2.3224 | 850  | 0.5459          | 0.8032   | 0.8207    | 0.8032 | 0.8047 |
| 1.0321        | 2.4590 | 900  | 0.5172          | 0.8209   | 0.8228    | 0.8209 | 0.8204 |
| 1.0321        | 2.5956 | 950  | 0.5433          | 0.8094   | 0.8103    | 0.8094 | 0.8067 |
| 0.4212        | 2.7322 | 1000 | 0.5114          | 0.8240   | 0.8301    | 0.8240 | 0.8250 |
| 0.4212        | 2.8689 | 1050 | 0.5154          | 0.8332   | 0.8406    | 0.8332 | 0.8342 |
| 0.4212        | 3.0055 | 1100 | 0.4721          | 0.8370   | 0.8372    | 0.8370 | 0.8364 |
| 0.4212        | 3.1421 | 1150 | 0.5013          | 0.8455   | 0.8486    | 0.8455 | 0.8452 |
| 0.4212        | 3.2787 | 1200 | 0.4903          | 0.8486   | 0.8511    | 0.8486 | 0.8491 |
| 0.4212        | 3.4153 | 1250 | 0.5175          | 0.8509   | 0.8532    | 0.8509 | 0.8508 |
| 0.4212        | 3.5519 | 1300 | 0.5091          | 0.8501   | 0.8511    | 0.8501 | 0.8499 |
| 0.4212        | 3.6885 | 1350 | 0.5260          | 0.8509   | 0.8553    | 0.8509 | 0.8518 |
| 0.4212        | 3.8251 | 1400 | 0.5077          | 0.8555   | 0.8557    | 0.8555 | 0.8549 |
| 0.4212        | 3.9617 | 1450 | 0.4965          | 0.8532   | 0.8551    | 0.8532 | 0.8536 |
| 0.1309        | 4.0984 | 1500 | 0.4953          | 0.8547   | 0.8561    | 0.8547 | 0.8550 |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3