collapse_gemma-2-9b_hs2_replace_iter6_sftsd0

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6039
  • Num Input Tokens Seen: 4589228

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
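The listed total_train_batch_size follows directly from the per-device batch size and the gradient-accumulation steps. A minimal sanity-check sketch, with values copied from the list above:

```python
# Hyperparameters copied from the training configuration above.
train_batch_size = 4             # per-device micro-batch size
gradient_accumulation_steps = 32

# Effective examples consumed per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)    # 128
```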

Training results

Training Loss   Epoch    Step   Validation Loss   Input Tokens Seen
No log          0        0      1.2335            0
1.1176          0.0511   5      1.0858            239884
0.4848          0.1021   10     1.1820            481972
0.1768          0.1532   15     1.3521            712948
0.0603          0.2043   20     1.4684            949036
0.0334          0.2553   25     1.5924            1183812
0.0357          0.3064   30     1.5110            1424744
0.0227          0.3575   35     1.5334            1658884
0.0221          0.4086   40     1.5023            1890148
0.0259          0.4596   45     1.5749            2133712
0.0211          0.5107   50     1.6141            2371172
0.0223          0.5618   55     1.5852            2601984
0.0209          0.6128   60     1.5171            2840580
0.0234          0.6639   65     1.4546            3073152
0.0218          0.7150   70     1.4440            3306888
0.0287          0.7660   75     1.4554            3538980
0.0221          0.8171   80     1.4678            3771884
0.0215          0.8682   85     1.4857            4011808
0.0230          0.9192   90     1.5572            4249172
0.0228          0.9703   95     1.5960            4488956
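Training loss falls toward zero while validation loss rises after the first few steps, so the best checkpoint by validation loss occurs very early. A small sketch locating that minimum, with the (step, validation loss) pairs transcribed from the table above:

```python
# (step, validation_loss) pairs transcribed from the results table.
val_losses = [
    (0, 1.2335), (5, 1.0858), (10, 1.1820), (15, 1.3521), (20, 1.4684),
    (25, 1.5924), (30, 1.5110), (35, 1.5334), (40, 1.5023), (45, 1.5749),
    (50, 1.6141), (55, 1.5852), (60, 1.5171), (65, 1.4546), (70, 1.4440),
    (75, 1.4554), (80, 1.4678), (85, 1.4857), (90, 1.5572), (95, 1.5960),
]

# Pick the evaluation step with the lowest validation loss.
best_step, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_step, best_loss)  # 5 1.0858
```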

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model size: 9.24B params (safetensors, BF16)
