math_question_topic_detection_T5_12-17-24_v1

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.4043
  • Accuracy: 0.8670
  • Precision: 0.8678
  • Recall: 0.8670
  • F1: 0.8670
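The card does not document the task head or label mapping, so what follows is only a minimal usage sketch. It assumes the checkpoint was saved with a sequence-classification head (so the text-classification pipeline can load it); if the model was instead fine-tuned seq2seq to generate topic strings, a text2text-generation pipeline would be the route. The example question and the printed output are illustrative assumptions, not documented behavior.

```python
from transformers import pipeline

# Minimal usage sketch. Assumes the checkpoint carries a
# sequence-classification head; the model card does not document the
# label mapping, so the printed label is whatever id2label the config defines.
classifier = pipeline(
    "text-classification",
    model="nzm97/math_question_topic_detection_T5_12-17-24_v1",
)

question = "Find the derivative of f(x) = 3x^2 + 2x - 5."  # assumed example input
print(classifier(question))
# e.g. [{'label': '<topic label from the config>', 'score': 0.97}]
```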

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 12
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 2200
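These values map directly onto transformers.TrainingArguments. The sketch below is one plausible reconstruction, not the author's training script: the listed hyperparameters are copied verbatim, while output_dir and the evaluation/logging cadence are assumptions (the 50-step eval interval and 500-step loss cadence are read off the results table below).

```python
from transformers import TrainingArguments

# Plausible mapping of the listed hyperparameters onto TrainingArguments.
# total_train_batch_size (12) is derived, not set directly:
# train_batch_size (4) * gradient_accumulation_steps (3).
training_args = TrainingArguments(
    output_dir="math_question_topic_detection_T5",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=3,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=2200,            # training_steps
    eval_strategy="steps",     # assumed from the 50-step eval cadence
    eval_steps=50,             # assumed
    logging_steps=500,         # assumed: training loss first logged at step 500
)
```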

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| No log        | 0.0513 | 50   | 2.0102          | 0.1929   | 0.1121    | 0.1929 | 0.1006 |
| No log        | 0.1025 | 100  | 1.6675          | 0.4243   | 0.3751    | 0.4243 | 0.3741 |
| No log        | 0.1538 | 150  | 1.3705          | 0.4827   | 0.4186    | 0.4827 | 0.4263 |
| No log        | 0.2051 | 200  | 1.1202          | 0.5942   | 0.5856    | 0.5942 | 0.5685 |
| No log        | 0.2563 | 250  | 1.0150          | 0.6487   | 0.6613    | 0.6487 | 0.6429 |
| No log        | 0.3076 | 300  | 1.0172          | 0.6380   | 0.6743    | 0.6380 | 0.6238 |
| No log        | 0.3589 | 350  | 0.8297          | 0.7095   | 0.6994    | 0.7095 | 0.6993 |
| No log        | 0.4101 | 400  | 0.7494          | 0.7333   | 0.7309    | 0.7333 | 0.7194 |
| No log        | 0.4614 | 450  | 0.6765          | 0.7433   | 0.7573    | 0.7433 | 0.7351 |
| 1.2821        | 0.5126 | 500  | 0.6862          | 0.7510   | 0.7607    | 0.7510 | 0.7454 |
| 1.2821        | 0.5639 | 550  | 0.6518          | 0.7656   | 0.7788    | 0.7656 | 0.7612 |
| 1.2821        | 0.6152 | 600  | 0.6115          | 0.7786   | 0.7795    | 0.7786 | 0.7754 |
| 1.2821        | 0.6664 | 650  | 0.5832          | 0.7925   | 0.7971    | 0.7925 | 0.7895 |
| 1.2821        | 0.7177 | 700  | 0.5504          | 0.7971   | 0.8078    | 0.7971 | 0.7963 |
| 1.2821        | 0.7690 | 750  | 0.5197          | 0.8163   | 0.8198    | 0.8163 | 0.8156 |
| 1.2821        | 0.8202 | 800  | 0.5729          | 0.7932   | 0.8140    | 0.7932 | 0.7924 |
| 1.2821        | 0.8715 | 850  | 0.5184          | 0.8163   | 0.8268    | 0.8163 | 0.8158 |
| 1.2821        | 0.9228 | 900  | 0.5167          | 0.8186   | 0.8243    | 0.8186 | 0.8180 |
| 1.2821        | 0.9740 | 950  | 0.4947          | 0.8309   | 0.8388    | 0.8309 | 0.8303 |
| 0.6382        | 1.0253 | 1000 | 0.5191          | 0.8309   | 0.8392    | 0.8309 | 0.8313 |
| 0.6382        | 1.0766 | 1050 | 0.5115          | 0.8194   | 0.8318    | 0.8194 | 0.8186 |
| 0.6382        | 1.1278 | 1100 | 0.4724          | 0.8309   | 0.8320    | 0.8309 | 0.8304 |
| 0.6382        | 1.1791 | 1150 | 0.4883          | 0.8324   | 0.8350    | 0.8324 | 0.8321 |
| 0.6382        | 1.2303 | 1200 | 0.4628          | 0.8386   | 0.8410    | 0.8386 | 0.8378 |
| 0.6382        | 1.2816 | 1250 | 0.4567          | 0.8324   | 0.8353    | 0.8324 | 0.8326 |
| 0.6382        | 1.3329 | 1300 | 0.4908          | 0.8378   | 0.8429    | 0.8378 | 0.8377 |
| 0.6382        | 1.3841 | 1350 | 0.4606          | 0.8470   | 0.8504    | 0.8470 | 0.8472 |
| 0.6382        | 1.4354 | 1400 | 0.4714          | 0.8455   | 0.8505    | 0.8455 | 0.8460 |
| 0.6382        | 1.4867 | 1450 | 0.4576          | 0.8417   | 0.8436    | 0.8417 | 0.8412 |
| 0.4651        | 1.5379 | 1500 | 0.4409          | 0.8478   | 0.8493    | 0.8478 | 0.8480 |
| 0.4651        | 1.5892 | 1550 | 0.4189          | 0.8570   | 0.8589    | 0.8570 | 0.8573 |
| 0.4651        | 1.6405 | 1600 | 0.4159          | 0.8601   | 0.8623    | 0.8601 | 0.8603 |
| 0.4651        | 1.6917 | 1650 | 0.4295          | 0.8563   | 0.8602    | 0.8563 | 0.8566 |
| 0.4651        | 1.7430 | 1700 | 0.4235          | 0.8601   | 0.8634    | 0.8601 | 0.8601 |
| 0.4651        | 1.7943 | 1750 | 0.4214          | 0.8563   | 0.8593    | 0.8563 | 0.8565 |
| 0.4651        | 1.8455 | 1800 | 0.4169          | 0.8578   | 0.8601    | 0.8578 | 0.8578 |
| 0.4651        | 1.8968 | 1850 | 0.4181          | 0.8624   | 0.8654    | 0.8624 | 0.8626 |
| 0.4651        | 1.9481 | 1900 | 0.4126          | 0.8609   | 0.8619    | 0.8609 | 0.8607 |
| 0.4651        | 1.9993 | 1950 | 0.4077          | 0.8670   | 0.8688    | 0.8670 | 0.8667 |
| 0.4227        | 2.0506 | 2000 | 0.4095          | 0.8632   | 0.8644    | 0.8632 | 0.8634 |
| 0.4227        | 2.1018 | 2050 | 0.4051          | 0.8624   | 0.8637    | 0.8624 | 0.8626 |
| 0.4227        | 2.1531 | 2100 | 0.4049          | 0.8655   | 0.8662    | 0.8655 | 0.8656 |
| 0.4227        | 2.2044 | 2150 | 0.4050          | 0.8686   | 0.8695    | 0.8686 | 0.8687 |
| 0.4227        | 2.2556 | 2200 | 0.4043          | 0.8670   | 0.8678    | 0.8670 | 0.8670 |
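One detail worth noting in this table: Recall equals Accuracy at every checkpoint, which is exactly what weighted-average multi-class metrics produce (weighted recall reduces to accuracy). Under that assumption, a compute_metrics function along these lines would reproduce the columns; this is inferred from the numbers, not documented training code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Plausible metric function for the columns above. Assumes weighted
    averaging, which would explain recall == accuracy at every step."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```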

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3