collapse_gemma-2-9b_hs2_replace_iter4_sftsd0

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3885
  • Num Input Tokens Seen: 4582088

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
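The effective batch size and warmup length follow directly from the settings above. A minimal sketch of that arithmetic (the total optimizer-step count of ~96 is an assumption read off the results table below, whose last logged row is step 95 at epoch 0.9797 for a 1-epoch run):

```python
# Derive the effective batch size and warmup steps from the listed
# hyperparameters. approx_total_steps is an assumption taken from the
# training-results table, not a logged value.
train_batch_size = 4
gradient_accumulation_steps = 32
lr_scheduler_warmup_ratio = 0.05
approx_total_steps = 96  # assumed: 1 epoch, last logged step is 95

# Gradients are accumulated over 32 micro-batches of 4 examples each,
# giving the reported total_train_batch_size of 128.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# With constant_with_warmup, the LR ramps up for the first
# warmup_ratio * total_steps optimizer steps, then stays at 8e-06.
warmup_steps = int(approx_total_steps * lr_scheduler_warmup_ratio)

print(total_train_batch_size)  # 128
print(warmup_steps)            # 4
```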

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.1067        | 0.0516 | 5    | 1.0880          | 236144            |
| 0.4944        | 0.1031 | 10   | 1.1582          | 483480            |
| 0.2097        | 0.1547 | 15   | 1.2328          | 722548            |
| 0.0734        | 0.2063 | 20   | 1.3841          | 964700            |
| 0.0577        | 0.2578 | 25   | 1.3194          | 1207344           |
| 0.0352        | 0.3094 | 30   | 1.3818          | 1457700           |
| 0.0331        | 0.3609 | 35   | 1.3162          | 1704260           |
| 0.0392        | 0.4125 | 40   | 1.2421          | 1937904           |
| 0.0392        | 0.4641 | 45   | 1.2304          | 2173932           |
| 0.0269        | 0.5156 | 50   | 1.2235          | 2408292           |
| 0.0437        | 0.5672 | 55   | 1.2684          | 2644340           |
| 0.0325        | 0.6188 | 60   | 1.3594          | 2874144           |
| 0.0312        | 0.6703 | 65   | 1.4196          | 3116744           |
| 0.0226        | 0.7219 | 70   | 1.4455          | 3346236           |
| 0.0253        | 0.7734 | 75   | 1.4274          | 3585652           |
| 0.0252        | 0.8250 | 80   | 1.3708          | 3823236           |
| 0.0272        | 0.8766 | 85   | 1.3578          | 4054744           |
| 0.0239        | 0.9281 | 90   | 1.3346          | 4294332           |
| 0.0258        | 0.9797 | 95   | 1.3749          | 4533868           |
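Validation loss bottoms out very early (1.0880 at step 5) and then drifts upward while training loss collapses toward ~0.02, so the final loss of 1.3885 is well above the best intermediate checkpoint. A small sketch that picks the best-by-validation checkpoint from the logged results (pairs copied from the table above):

```python
# (step, validation_loss) pairs taken from the training-results table.
val_losses = [
    (0, 1.2335), (5, 1.0880), (10, 1.1582), (15, 1.2328), (20, 1.3841),
    (25, 1.3194), (30, 1.3818), (35, 1.3162), (40, 1.2421), (45, 1.2304),
    (50, 1.2235), (55, 1.2684), (60, 1.3594), (65, 1.4196), (70, 1.4455),
    (75, 1.4274), (80, 1.3708), (85, 1.3578), (90, 1.3346), (95, 1.3749),
]

# Select the checkpoint with the lowest validation loss.
best_step, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_step, best_loss)  # 5 1.088
```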

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1