age_sentence_cosine

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5026
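
This checkpoint is in the standard Transformers causal-LM format, so it can presumably be loaded with the `Auto*` classes. A minimal usage sketch follows; the prompt and sampling settings are illustrative, not taken from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint from the Hub.
tokenizer = AutoTokenizer.from_pretrained("fpadovani/age_sentence_cosine")
model = AutoModelForCausalLM.from_pretrained("fpadovani/age_sentence_cosine")

# Illustrative prompt and generation settings (not specified in the card).
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```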

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
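
For reference, here is a hypothetical reconstruction of these settings as Transformers `TrainingArguments`. The actual training script is not published, so anything not listed above (e.g. `output_dir`, weight decay) is an assumption:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="age_sentence_cosine",  # assumed; not stated in the card
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,                    # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=1,
)
```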

Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 5.2482        | 0.0127 | 2000   | 4.2534          |
| 4.1494        | 0.0254 | 4000   | 4.0313          |
| 4.003         | 0.0381 | 6000   | 3.9020          |
| 3.8886        | 0.0508 | 8000   | 3.8179          |
| 3.8368        | 0.0635 | 10000  | 3.7863          |
| 3.7814        | 0.0762 | 12000  | 3.7358          |
| 3.7297        | 0.0889 | 14000  | 3.6976          |
| 3.726         | 0.1016 | 16000  | 3.6959          |
| 3.6673        | 0.1143 | 18000  | 3.6500          |
| 3.6451        | 0.1270 | 20000  | 3.6304          |
| 3.6604        | 0.1397 | 22000  | 3.6592          |
| 3.5986        | 0.1524 | 24000  | 3.6039          |
| 3.5943        | 0.1651 | 26000  | 3.6218          |
| 3.5968        | 0.1778 | 28000  | 3.5981          |
| 3.5601        | 0.1905 | 30000  | 3.5745          |
| 3.5672        | 0.2032 | 32000  | 3.5908          |
| 3.5477        | 0.2159 | 34000  | 3.5710          |
| 3.5269        | 0.2286 | 36000  | 3.5532          |
| 3.5498        | 0.2413 | 38000  | 3.5685          |
| 3.5047        | 0.2540 | 40000  | 3.5435          |
| 3.5003        | 0.2667 | 42000  | 3.5336          |
| 3.5286        | 0.2794 | 44000  | 3.5777          |
| 3.4799        | 0.2921 | 46000  | 3.5303          |
| 3.4887        | 0.3048 | 48000  | 3.5562          |
| 3.4944        | 0.3175 | 50000  | 3.5346          |
| 3.4681        | 0.3302 | 52000  | 3.5193          |
| 3.4823        | 0.3429 | 54000  | 3.5421          |
| 3.4663        | 0.3556 | 56000  | 3.5235          |
| 3.4521        | 0.3682 | 58000  | 3.5071          |
| 3.4818        | 0.3809 | 60000  | 3.5267          |
| 3.4372        | 0.3936 | 62000  | 3.5073          |
| 3.4403        | 0.4063 | 64000  | 3.5032          |
| 3.4689        | 0.4190 | 66000  | 3.5450          |
| 3.4262        | 0.4317 | 68000  | 3.4986          |
| 3.4406        | 0.4444 | 70000  | 3.5267          |
| 3.4416        | 0.4571 | 72000  | 3.5079          |
| 3.4209        | 0.4698 | 74000  | 3.4937          |
| 3.4417        | 0.4825 | 76000  | 3.5148          |
| 3.4205        | 0.4952 | 78000  | 3.4973          |
| 3.4128        | 0.5079 | 80000  | 3.4883          |
| 3.4453        | 0.5206 | 82000  | 3.5062          |
| 3.3983        | 0.5333 | 84000  | 3.4864          |
| 3.4047        | 0.5460 | 86000  | 3.5144          |
| 3.4332        | 0.5587 | 88000  | 3.5226          |
| 3.3971        | 0.5714 | 90000  | 3.4823          |
| 3.4102        | 0.5841 | 92000  | 3.5111          |
| 3.4115        | 0.5968 | 94000  | 3.4884          |
| 3.3939        | 0.6095 | 96000  | 3.4796          |
| 3.4183        | 0.6222 | 98000  | 3.5024          |
| 3.3914        | 0.6349 | 100000 | 3.4837          |
| 3.3908        | 0.6476 | 102000 | 3.4744          |
| 3.4234        | 0.6603 | 104000 | 3.4961          |
| 3.3744        | 0.6730 | 106000 | 3.4737          |
| 3.3865        | 0.6857 | 108000 | 3.5026          |
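
Assuming the reported loss is the Trainer's usual mean cross-entropy per token (in nats), the final validation loss of 3.5026 corresponds to a perplexity of about 33.2:

```python
import math

# Perplexity from cross-entropy loss, assuming the standard
# causal-LM loss in nats as reported by the Transformers Trainer.
val_loss = 3.5026
print(math.exp(val_loss))  # ≈ 33.20
```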

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.4.1
  • Datasets 3.0.1
  • Tokenizers 0.20.1