Built with Axolotl

0f5d021e-a052-4299-81ee-2bb9522213bc

This model is a fine-tuned version of EleutherAI/pythia-14m on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 6.7863

Model description

More information needed

Intended uses & limitations

More information needed
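
Since this repository ships a PEFT adapter trained on top of EleutherAI/pythia-14m, a minimal inference sketch is shown below. This is an illustrative assumption rather than part of the original card: the adapter id is this repository's id, and the prompt and generation settings are placeholders.

```python
# Minimal inference sketch (illustrative; not from the original card).
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "lesso17/0f5d021e-a052-4299-81ee-2bb9522213bc"  # this repository

# AutoPeftModelForCausalLM reads the adapter config, loads the base model
# (EleutherAI/pythia-14m) and attaches the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-14m")

inputs = tokenizer("Hello, world!", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```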

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.000217
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (8-bit, via bitsandbytes; OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
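
For reference, here is a sketch of the equivalent Hugging Face TrainingArguments; the run itself was driven by Axolotl, so this is an assumed mapping, and fields like output_dir are placeholders rather than values from the actual config.

```python
# Equivalent TrainingArguments sketch (assumed mapping; the actual run used Axolotl).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",              # hypothetical path, not from the card
    learning_rate=0.000217,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,     # 4 per device x 2 steps = total batch size 8
    optim="adamw_bnb_8bit",            # OptimizerNames.ADAMW_BNB (8-bit AdamW, bitsandbytes)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```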

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0001 | 1    | 9.0324          |
| 15.6705       | 0.0048 | 50   | 8.4389          |
| 15.5777       | 0.0097 | 100  | 7.8442          |
| 16.1316       | 0.0145 | 150  | 8.3373          |
| 19.6493       | 0.0194 | 200  | 10.1151         |
| 14.9912       | 0.0242 | 250  | 7.2463          |
| 16.2995       | 0.0291 | 300  | 8.8719          |
| 14.4535       | 0.0339 | 350  | 6.8123          |
| 14.8802       | 0.0388 | 400  | 8.5803          |
| 14.6041       | 0.0436 | 450  | 6.8007          |
| 14.6507       | 0.0485 | 500  | 6.7863          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1