collapse_gemma-2-9b_hs2_replace_iter6_sftsd1

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7744
  • Num Input Tokens Seen: 4644120

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
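The effective batch size and warmup length follow from these settings; a minimal sketch of the arithmetic (the step count is taken from the logged training results, and the variable names are mine, not from the training script):

```python
# Reconstruct the effective batch size from the hyperparameters above.
train_batch_size = 4
gradient_accumulation_steps = 32
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching the reported total_train_batch_size

# constant_with_warmup: the learning rate ramps linearly over the first
# warmup_ratio * total_steps optimizer steps, then stays constant at 8e-06.
lr_scheduler_warmup_ratio = 0.05
total_steps = 95  # last optimizer step logged in the training results
warmup_steps = int(lr_scheduler_warmup_ratio * total_steps)
print(warmup_steps)  # 4
```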

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.0797        | 0.0511 | 5    | 1.0989          | 239060            |
| 0.4157        | 0.1021 | 10   | 1.2189          | 482296            |
| 0.1533        | 0.1532 | 15   | 1.3682          | 725960            |
| 0.0535        | 0.2043 | 20   | 1.5750          | 970144            |
| 0.0404        | 0.2553 | 25   | 1.7656          | 1205100           |
| 0.0218        | 0.3064 | 30   | 1.7967          | 1441880           |
| 0.0255        | 0.3575 | 35   | 1.8400          | 1687112           |
| 0.0221        | 0.4086 | 40   | 1.8623          | 1922136           |
| 0.0281        | 0.4596 | 45   | 1.8183          | 2165352           |
| 0.0228        | 0.5107 | 50   | 1.7449          | 2394656           |
| 0.0213        | 0.5618 | 55   | 1.7163          | 2646384           |
| 0.0224        | 0.6128 | 60   | 1.6743          | 2876716           |
| 0.0203        | 0.6639 | 65   | 1.6927          | 3108204           |
| 0.0266        | 0.7150 | 70   | 1.7124          | 3348808           |
| 0.0213        | 0.7660 | 75   | 1.7250          | 3588504           |
| 0.0205        | 0.8171 | 80   | 1.7315          | 3822104           |
| 0.0213        | 0.8682 | 85   | 1.7380          | 4060428           |
| 0.0215        | 0.9192 | 90   | 1.7523          | 4299016           |
| 0.02          | 0.9703 | 95   | 1.7699          | 4548500           |
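The validation curve above can be inspected programmatically; a minimal sketch with the (step, validation loss) pairs transcribed from the table (variable names are mine):

```python
# (step, validation_loss) pairs transcribed from the training results table
eval_trace = [
    (0, 1.2335), (5, 1.0989), (10, 1.2189), (15, 1.3682), (20, 1.5750),
    (25, 1.7656), (30, 1.7967), (35, 1.8400), (40, 1.8623), (45, 1.8183),
    (50, 1.7449), (55, 1.7163), (60, 1.6743), (65, 1.6927), (70, 1.7124),
    (75, 1.7250), (80, 1.7315), (85, 1.7380), (90, 1.7523), (95, 1.7699),
]

# Find the lowest validation loss and the step where it occurred.
best_step, best_loss = min(eval_trace, key=lambda p: p[1])
print(best_step, best_loss)  # 5 1.0989
```

Note that validation loss bottoms out at step 5 (1.0989) and then rises for the rest of the run, even as training loss keeps falling.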

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
  • Model size: 9.24B params (Safetensors)
  • Tensor type: BF16
