collapse_gemma-2-9b_hs2_replace_iter4_sftsd1

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4399
  • Num Input Tokens Seen: 4608700

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
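As a sanity check on the values above, the effective batch size and approximate warmup length follow from the listed hyperparameters. The sketch below is illustrative arithmetic, not the training script; the total step count is an estimate read off the training log, not a reported figure.

```python
# Effective (total) train batch size: per-device batch * gradient accumulation steps.
train_batch_size = 4
gradient_accumulation_steps = 32
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching the reported total_train_batch_size

# With constant_with_warmup and warmup_ratio 0.05, warmup spans ~5% of total
# optimizer steps. The log below reaches step 95 at epoch 0.9753, so one epoch
# is roughly 97 steps (estimated, not reported).
total_steps = 97
warmup_steps = int(total_steps * 0.05)
print(warmup_steps)  # 4
```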

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.1171        | 0.0513 | 5    | 1.0875          | 234588            |
| 0.4281        | 0.1027 | 10   | 1.1673          | 469864            |
| 0.2498        | 0.1540 | 15   | 1.2896          | 699712            |
| 0.0779        | 0.2053 | 20   | 1.4259          | 938672            |
| 0.0482        | 0.2567 | 25   | 1.4626          | 1174704           |
| 0.0303        | 0.3080 | 30   | 1.5081          | 1413368           |
| 0.0276        | 0.3593 | 35   | 1.5372          | 1650736           |
| 0.032         | 0.4107 | 40   | 1.5854          | 1894804           |
| 0.0487        | 0.4620 | 45   | 1.4725          | 2125240           |
| 0.0235        | 0.5133 | 50   | 1.3449          | 2366084           |
| 0.024         | 0.5646 | 55   | 1.3595          | 2599380           |
| 0.0215        | 0.6160 | 60   | 1.3704          | 2841524           |
| 0.0211        | 0.6673 | 65   | 1.3511          | 3082136           |
| 0.0216        | 0.7186 | 70   | 1.3608          | 3324972           |
| 0.0241        | 0.7700 | 75   | 1.3791          | 3562804           |
| 0.0273        | 0.8213 | 80   | 1.3841          | 3798412           |
| 0.023         | 0.8726 | 85   | 1.4029          | 4033440           |
| 0.0213        | 0.9240 | 90   | 1.4191          | 4267912           |
| 0.0215        | 0.9753 | 95   | 1.4333          | 4514044           |
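The epoch and step columns allow a rough consistency check of the dataset scale. The sketch below derives approximate dataset size and average sequence length from the table entries; these are back-of-envelope estimates, not figures reported by the author.

```python
# Step 5 corresponds to epoch 0.0513, so one epoch is about 5 / 0.0513 steps.
steps_per_epoch = 5 / 0.0513          # ~97.5 optimizer steps
total_train_batch_size = 128          # from the hyperparameters section

# Approximate number of training examples (estimate only).
approx_examples = steps_per_epoch * total_train_batch_size  # ~12,500 examples

# Average tokens per example implied by the input-token counter at step 5.
tokens_at_step_5 = 234588
avg_tokens_per_example = tokens_at_step_5 / (5 * total_train_batch_size)  # ~367

print(round(approx_examples), round(avg_tokens_per_example))
```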

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model size

  • 9.24B params (BF16, Safetensors)
