collapse_gemma-2-9b_hs2_replace_iter5_sftsd0

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5755
  • Num Input Tokens Seen: 4547896

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
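The effective batch size and warmup length implied by these hyperparameters can be checked with a short calculation. This is a minimal sketch; the total step count is not stated on the card and is inferred here from the training log (step 95 corresponds to epoch 0.9703).

```python
import math

# Hyperparameters from the card above
train_batch_size = 4
gradient_accumulation_steps = 32
warmup_ratio = 0.05

# Effective (total) train batch size per optimizer step
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching total_train_batch_size above

# Approximate optimizer steps in one epoch, inferred from the log
# (step 95 was logged at epoch 0.9703)
total_steps = round(95 / 0.9703)
warmup_steps = math.ceil(warmup_ratio * total_steps)
print(total_steps, warmup_steps)  # 98 5
```

The inferred warmup of about 5 steps lines up with the first logged evaluation at step 5.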

Training results

Training Loss   Epoch    Step   Validation Loss   Input Tokens Seen
No log          0        0      1.2335            0
1.1183          0.0511   5      1.0880            235876
0.4269          0.1021   10     1.2109            467904
0.1522          0.1532   15     1.3520            702336
0.0599          0.2043   20     1.4862            936440
0.0301          0.2553   25     1.4056            1170436
0.0313          0.3064   30     1.4970            1405684
0.0244          0.3575   35     1.5938            1639272
0.0252          0.4086   40     1.6390            1877036
0.0387          0.4596   45     1.5764            2120312
0.0263          0.5107   50     1.5438            2353548
0.0276          0.5618   55     1.5202            2581884
0.0355          0.6128   60     1.4916            2817540
0.0243          0.6639   65     1.5250            3050308
0.0240          0.7150   70     1.5144            3280836
0.0336          0.7660   75     1.4980            3512932
0.0213          0.8171   80     1.5152            3747620
0.0239          0.8682   85     1.5247            3986112
0.0219          0.9192   90     1.5517            4217956
0.0232          0.9703   95     1.5675            4453812
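In this log the validation loss reaches its minimum at step 5 and rises afterward while the training loss collapses toward zero, which is consistent with overfitting. A quick way to locate the best-scoring evaluation from the log (the pairs below are copied from the table above):

```python
# (step, validation_loss) pairs copied from the training results above
eval_log = [
    (0, 1.2335), (5, 1.0880), (10, 1.2109), (15, 1.3520), (20, 1.4862),
    (25, 1.4056), (30, 1.4970), (35, 1.5938), (40, 1.6390), (45, 1.5764),
    (50, 1.5438), (55, 1.5202), (60, 1.4916), (65, 1.5250), (70, 1.5144),
    (75, 1.4980), (80, 1.5152), (85, 1.5247), (90, 1.5517), (95, 1.5675),
]

# Lowest validation loss and the step at which it occurred
best_step, best_loss = min(eval_log, key=lambda pair: pair[1])
print(best_step, best_loss)  # 5 1.088
```

Note that the final reported loss of 1.5755 reflects the end-of-training checkpoint, not this early minimum.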

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model details

  • Format: Safetensors
  • Model size: 9.24B params
  • Tensor type: BF16
Model tree for RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter5_sftsd0

  • Base model: google/gemma-2-9b