zephyr-7b

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-qlora on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set (the reward-margin convention is sketched after the list):

  • Loss: 0.6928
  • Rewards/chosen: -0.0289
  • Rewards/rejected: -0.1011
  • Rewards/accuracies: 0.3532
  • Rewards/margins: 0.0722
  • Logps/rejected: -85.5050
  • Logps/chosen: -71.7912
  • Logits/rejected: -2.1148
  • Logits/chosen: -2.1436
  • Use Label: 14417.4287
  • Pred Label: 5654.5713
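
These metric names follow the DPO-style preference-tuning convention (an assumption based on the names; the card does not state the training objective): Rewards/margins is the mean gap between the rewards of the chosen and rejected responses, and Rewards/accuracies is the fraction of evaluation pairs in which the chosen response receives the higher reward. A worked check on the numbers above:

```latex
\mathrm{margins} = \mathrm{rewards_{chosen}} - \mathrm{rewards_{rejected}}
                 = (-0.0289) - (-0.1011) = 0.0722
```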

Model description

More information needed

Intended uses & limitations

More information needed
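
No usage guidance is provided, but the framework versions below list PEFT, and the checkpoint is distributed as a LoRA adapter rather than full model weights. Below is a minimal inference sketch, assuming the adapter loads on top of a Mistral-7B-class base model (mistralai/Mistral-7B-v0.1 is used here as a placeholder; check the repository's adapter_config.json for the actual base before relying on this):

```python
# Minimal loading sketch (assumptions: this repo is a PEFT/LoRA adapter and
# mistralai/Mistral-7B-v0.1 is the correct base model; verify adapter_config.json).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"   # assumed base; not stated in this card
adapter_id = "jikaixuan/zephyr-7b"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

prompt = "Explain preference tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If a standalone model is preferred, the adapter can also be folded into the base weights with `model.merge_and_unload()` after loading.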

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; an equivalent TrainingArguments sketch follows the list:

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
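
The total batch sizes above are derived rather than set directly: 4 samples per device × 4 GPUs × 4 gradient-accumulation steps gives the effective train batch size of 64, and 8 × 4 gives the eval batch size of 32. A minimal transformers.TrainingArguments sketch with these values (a hypothetical reconstruction; the original training script is not included in this card):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the run configuration from the list above.
# Effective train batch size: 4 per device x 4 GPUs x 4 accumulation steps = 64
# Effective eval batch size:  8 per device x 4 GPUs                        = 32
args = TrainingArguments(
    output_dir="zephyr-7b",            # hypothetical output path
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,                         # assumption; mixed-precision setting not stated
)
```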

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Use Label | Pred Label |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:---------:|:----------:|
| 0.6911 | 0.1 | 100 | 0.6919 | -0.0053 | -0.0356 | 0.3393 | 0.0303 | -78.9541 | -69.4262 | -2.0935 | -2.1210 | 1705.8572 | 150.1429 |
| 0.692 | 0.21 | 200 | 0.6927 | -0.0264 | -0.0695 | 0.3433 | 0.0431 | -82.3504 | -71.5409 | -2.1057 | -2.1268 | 3337.0476 | 622.9524 |
| 0.6924 | 0.31 | 300 | 0.6929 | -0.0369 | -0.0896 | 0.3393 | 0.0527 | -84.3537 | -72.5877 | -2.1933 | -2.2169 | 4863.7300 | 1200.2699 |
| 0.6927 | 0.42 | 400 | 0.6925 | -0.0211 | -0.0804 | 0.3413 | 0.0593 | -83.4364 | -71.0104 | -2.0934 | -2.1190 | 6324.0796 | 1843.9207 |
| 0.6924 | 0.52 | 500 | 0.6929 | -0.0206 | -0.0831 | 0.3433 | 0.0625 | -83.7112 | -70.9618 | -2.1518 | -2.1762 | 7772.7778 | 2499.2222 |
| 0.6929 | 0.63 | 600 | 0.6927 | -0.0452 | -0.1160 | 0.3512 | 0.0708 | -86.9945 | -73.4171 | -2.1125 | -2.1408 | 9198.8574 | 3177.1428 |
| 0.6928 | 0.73 | 700 | 0.6930 | -0.0507 | -0.1231 | 0.3512 | 0.0724 | -87.7077 | -73.9657 | -2.1086 | -2.1372 | 10627.2695 | 3852.7302 |
| 0.6927 | 0.84 | 800 | 0.6928 | -0.0272 | -0.0999 | 0.3552 | 0.0726 | -85.3832 | -71.6247 | -2.1141 | -2.1431 | 12045.5234 | 4538.4761 |
| 0.6929 | 0.94 | 900 | 0.6928 | -0.0288 | -0.1012 | 0.3492 | 0.0723 | -85.5160 | -71.7842 | -2.1139 | -2.1428 | 13461.3809 | 5226.6191 |

Framework versions

  • PEFT 0.7.1
  • Transformers 4.38.2
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.2