opt-125m-synthetic-finetuned

This model is a fine-tuned version of facebook/opt-125m on an unspecified synthetic dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5143
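
Since the card describes a causal language model fine-tuned from facebook/opt-125m, a minimal loading-and-generation sketch might look like the following. It assumes the checkpoint is published under the repository id mendeza/opt-125m-synthetic-finetuned (as listed in the model tree below) and loads as a standard causal LM; the prompt and generation settings are illustrative only.

```python
# Hedged usage sketch: load the fine-tuned checkpoint and generate a short
# continuation. Repository id taken from the model tree; prompt is arbitrary.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mendeza/opt-125m-synthetic-finetuned"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```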

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
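
For reference, a minimal reproduction sketch using the Hugging Face Trainer is shown below. Only the hyperparameter values above come from this card; the toy dataset, tokenization settings, and output directory are placeholders, since the actual training data and script are not documented.

```python
# Reproduction sketch, assuming a standard causal-LM fine-tuning setup with the
# Hugging Face Trainer. Hyperparameters mirror the card; everything else is a
# placeholder (toy corpus, max_length, output_dir).
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder corpus standing in for the (undocumented) synthetic training data.
texts = {"text": ["a short synthetic training sentence"] * 16}

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = Dataset.from_dict(texts).map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal-LM labels

training_args = TrainingArguments(
    output_dir="opt-125m-synthetic-finetuned",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    eval_strategy="epoch",  # the results table reports one validation loss per epoch
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    eval_dataset=dataset,  # placeholder: reuse the toy data as the eval split
    data_collator=collator,
)
trainer.train()
```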

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 1    | 6.0577          |
| No log        | 2.0   | 2    | 4.5061          |
| No log        | 3.0   | 3    | 3.7047          |
| No log        | 4.0   | 4    | 3.4758          |
| No log        | 5.0   | 5    | 3.2895          |
| No log        | 6.0   | 6    | 3.1412          |
| No log        | 7.0   | 7    | 3.0168          |
| No log        | 8.0   | 8    | 2.9220          |
| No log        | 9.0   | 9    | 2.8425          |
| No log        | 10.0  | 10   | 2.7750          |
| No log        | 11.0  | 11   | 2.7173          |
| No log        | 12.0  | 12   | 2.6698          |
| No log        | 13.0  | 13   | 2.6316          |
| No log        | 14.0  | 14   | 2.6005          |
| No log        | 15.0  | 15   | 2.5753          |
| No log        | 16.0  | 16   | 2.5548          |
| No log        | 17.0  | 17   | 2.5386          |
| No log        | 18.0  | 18   | 2.5265          |
| No log        | 19.0  | 19   | 2.5184          |
| No log        | 20.0  | 20   | 2.5143          |
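
The training loss column reads "No log", likely because the run performs only one optimization step per epoch, fewer than the logger's reporting interval. Assuming the validation loss is the standard token-level cross-entropy, the final value of 2.5143 corresponds to a perplexity of exp(2.5143) ≈ 12.4.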

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
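
To approximate the training environment, these versions can be pinned directly (e.g. transformers==4.48.3, datasets==3.2.0, tokenizers==0.21.0, and a CUDA 12.4 build of torch 2.5.1).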

Safetensors

  • Model size: 125M params
  • Tensor type: F32

Model tree for mendeza/opt-125m-synthetic-finetuned

  • Base model: facebook/opt-125m
  • This model is one of 70 fine-tunes derived from that base model.