wikipedia_conv

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.9145
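For a causal language model, the evaluation loss is a mean per-token cross-entropy, so the corresponding perplexity is simply its exponential (assuming the loss is reported in nats, as Transformers does):

```python
import math

# Final evaluation loss reported above (mean per-token cross-entropy, in nats).
eval_loss = 3.9145

# Perplexity is the exponential of the mean cross-entropy loss.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 50.1
```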

Model description

More information needed

Intended uses & limitations

More information needed
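The card does not document usage, but since this is a standard fine-tuned gpt2 checkpoint, a minimal loading-and-generation sketch with the Transformers library would look like the following (the repo id `fpadovani/wikipedia_conv` is taken from this page; the prompt is arbitrary):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from this model card's page; adjust if the checkpoint moves.
model_id = "fpadovani/wikipedia_conv"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation of an arbitrary prompt.
inputs = tokenizer("The history of", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This downloads the weights from the Hub on first use; intended uses and known limitations remain undocumented, so treat any generations accordingly.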

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: reduce_lr_on_plateau
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
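The schedule above combines linear warmup with reduce-on-plateau decay. A pure-Python sketch of that logic follows; the warmup length (500 steps) and base rate (1e-4) come from the list above, while the reduction factor and patience are assumptions (PyTorch's `ReduceLROnPlateau` defaults), since the card does not state them:

```python
class WarmupReduceOnPlateau:
    """Linear warmup, then multiply the LR by `factor` whenever the
    validation loss fails to improve for more than `patience` evaluations."""

    def __init__(self, base_lr=1e-4, warmup_steps=500, factor=0.1, patience=10):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.factor = factor      # assumed default, not stated in the card
        self.patience = patience  # assumed default, not stated in the card
        self.best = float("inf")
        self.bad_evals = 0
        self.scale = 1.0

    def lr_at(self, step):
        """Learning rate to use at a given optimizer step."""
        if step < self.warmup_steps:
            return self.base_lr * (step + 1) / self.warmup_steps
        return self.base_lr * self.scale

    def on_eval(self, val_loss):
        """Call after each evaluation to update the plateau state."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
            if self.bad_evals > self.patience:
                self.scale *= self.factor
                self.bad_evals = 0

sched = WarmupReduceOnPlateau()
print(sched.lr_at(0), sched.lr_at(500))
```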

Training results

Training Loss Epoch Step Validation Loss
5.7941 0.0175 1000 5.5612
5.2657 0.0350 2000 5.1503
5.0438 0.0526 3000 4.9584
4.9125 0.0701 4000 4.8251
4.8024 0.0876 5000 4.7222
4.7245 0.1051 6000 4.6339
4.6491 0.1226 7000 4.5608
4.5966 0.1401 8000 4.5026
4.5466 0.1577 9000 4.4498
4.5008 0.1752 10000 4.4027
4.4624 0.1927 11000 4.3679
4.4255 0.2102 12000 4.3319
4.4001 0.2277 13000 4.3000
4.373 0.2453 14000 4.2727
4.3503 0.2628 15000 4.2483
4.3254 0.2803 16000 4.2283
4.2975 0.2978 17000 4.2071
4.2917 0.3153 18000 4.1871
4.2657 0.3329 19000 4.1669
4.2558 0.3504 20000 4.1560
4.2321 0.3679 21000 4.1401
4.2249 0.3854 22000 4.1265
4.2113 0.4029 23000 4.1118
4.1946 0.4204 24000 4.0979
4.1946 0.4380 25000 4.0872
4.1766 0.4555 26000 4.0777
4.169 0.4730 27000 4.0686
4.1504 0.4905 28000 4.0575
4.1495 0.5080 29000 4.0473
4.137 0.5256 30000 4.0410
4.1313 0.5431 31000 4.0332
4.1195 0.5606 32000 4.0254
4.1087 0.5781 33000 4.0167
4.1138 0.5956 34000 4.0113
4.0945 0.6132 35000 4.0041
4.096 0.6307 36000 3.9989
4.0764 0.6482 37000 3.9927
4.0872 0.6657 38000 3.9898
4.0803 0.6832 39000 3.9823
4.0741 0.7007 40000 3.9754
4.0679 0.7183 41000 3.9722
4.0606 0.7358 42000 3.9702
4.062 0.7533 43000 3.9622
4.0412 0.7708 44000 3.9598
4.0503 0.7883 45000 3.9542
4.039 0.8059 46000 3.9550
4.0325 0.8234 47000 3.9446
4.0396 0.8409 48000 3.9425
4.0289 0.8584 49000 3.9371
4.0372 0.8759 50000 3.9370
4.0205 0.8935 51000 3.9345
4.0238 0.9110 52000 3.9304
4.0112 0.9285 53000 3.9281
4.0153 0.9460 54000 3.9233
4.0048 0.9635 55000 3.9192
4.0031 0.9810 56000 3.9197
4.0114 0.9986 57000 3.9145
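The last row lets us estimate the training-set size, which the card otherwise leaves blank: step 57000 corresponds to about 0.9986 of one epoch, so at batch size 32 the split spans roughly 57,080 optimizer steps, i.e. about 1.83M examples. This is a back-of-the-envelope estimate, not a figure stated on the card:

```python
# Back-of-the-envelope estimate from the last table row and the batch size.
step, epoch_frac, batch_size = 57000, 0.9986, 32

steps_per_epoch = step / epoch_frac      # ≈ 57,080 optimizer steps per epoch
examples = steps_per_epoch * batch_size  # ≈ 1.83M training examples
print(round(steps_per_epoch), round(examples))
```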

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.1
Model size

  • 1.86M params (Safetensors, F32)

Model tree for fpadovani/wikipedia_conv

Fine-tuned from gpt2.