collapse_gemma-2-9b_hs2_replace_iter5_sftsd2

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4996
  • Num Input Tokens Seen: 4530052

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 2
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
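The reported total train batch size of 128 follows from the per-device batch size and the gradient accumulation steps. A minimal sketch of that arithmetic (assuming a single training device, which the card does not state):

```python
# Effective batch size = per-device batch size x gradient accumulation steps
# x number of devices (num_devices = 1 is an assumption, not from the card).
train_batch_size = 4
gradient_accumulation_steps = 32
num_devices = 1

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # matches the reported total of 128
```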

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.3305        | 0.0511 | 5    | 1.0694          | 227892            |
| 0.5798        | 0.1021 | 10   | 1.1839          | 461460            |
| 0.1933        | 0.1532 | 15   | 1.3775          | 693424            |
| 0.07          | 0.2043 | 20   | 1.5235          | 923628            |
| 0.0329        | 0.2553 | 25   | 1.6848          | 1157516           |
| 0.0376        | 0.3064 | 30   | 1.7560          | 1391556           |
| 0.0283        | 0.3575 | 35   | 1.6790          | 1616780           |
| 0.0254        | 0.4086 | 40   | 1.4960          | 1853632           |
| 0.0229        | 0.4596 | 45   | 1.4455          | 2085060           |
| 0.0321        | 0.5107 | 50   | 1.4337          | 2328416           |
| 0.0245        | 0.5618 | 55   | 1.4225          | 2559936           |
| 0.021         | 0.6128 | 60   | 1.4414          | 2788152           |
| 0.0203        | 0.6639 | 65   | 1.4590          | 3024984           |
| 0.0269        | 0.7150 | 70   | 1.4644          | 3254832           |
| 0.025         | 0.7660 | 75   | 1.4640          | 3487400           |
| 0.0251        | 0.8171 | 80   | 1.4660          | 3722208           |
| 0.0203        | 0.8682 | 85   | 1.4709          | 3959612           |
| 0.022         | 0.9192 | 90   | 1.4843          | 4195692           |
| 0.0215        | 0.9703 | 95   | 1.4961          | 4430408           |
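As a rough sanity check (an illustration derived from the logged values above, not part of the original card), the final logged row implies each optimizer step consumed about 46.6k tokens, or roughly 364 tokens per sequence at the effective batch size of 128:

```python
# Derive average tokens per optimizer step and per sequence from the
# final logged row (step 95, 4430408 tokens seen) and the effective
# batch size of 128 reported in the hyperparameters.
final_step = 95
final_tokens_seen = 4430408
total_train_batch_size = 128

tokens_per_step = final_tokens_seen / final_step
tokens_per_sequence = tokens_per_step / total_train_batch_size
print(round(tokens_per_step), round(tokens_per_sequence))  # 46636 364
```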

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Safetensors

  • Model size: 9.24B params
  • Tensor type: BF16

Model tree for RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter5_sftsd2

Base model: google/gemma-2-9b