collapse_gemma-2-2b_hs2_replace_iter4_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0502
  • Num Input Tokens Seen: 5180224

Model description

More information needed
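
Absent a fuller description, the checkpoint can presumably be loaded like any other causal LM fine-tuned from google/gemma-2-2b. A minimal sketch (the repo id is taken from the title of this card; downloading the weights requires network access and the usual Gemma license acceptance):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter4_sftsd0"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The weights are stored in BF16, so load in that dtype to match.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Hello,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```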

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
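
The total train batch size above is the per-device batch size multiplied by the gradient-accumulation steps (assuming a single device, which is consistent with the reported values). A quick check of the arithmetic:

```python
# Hyperparameters as reported above.
train_batch_size = 8            # per-device batch size
gradient_accumulation_steps = 16

# Effective (total) train batch size, assuming one device:
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)   # 8 * 16 = 128
```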

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.4899        | 0.0521 | 5    | 1.2730          | 274808            |
| 1.0868        | 0.1042 | 10   | 1.2421          | 547496            |
| 0.7381        | 0.1564 | 15   | 1.4438          | 824840            |
| 0.4882        | 0.2085 | 20   | 1.5807          | 1093800           |
| 0.2442        | 0.2606 | 25   | 1.7384          | 1369528           |
| 0.1871        | 0.3127 | 30   | 1.8064          | 1636288           |
| 0.1002        | 0.3648 | 35   | 1.9732          | 1902848           |
| 0.1006        | 0.4169 | 40   | 1.9451          | 2182152           |
| 0.0597        | 0.4691 | 45   | 1.8709          | 2456304           |
| 0.0437        | 0.5212 | 50   | 1.9088          | 2723096           |
| 0.0506        | 0.5733 | 55   | 1.9351          | 3004616           |
| 0.0285        | 0.6254 | 60   | 1.9402          | 3274072           |
| 0.0452        | 0.6775 | 65   | 1.9482          | 3549112           |
| 0.0344        | 0.7296 | 70   | 1.9661          | 3816184           |
| 0.0326        | 0.7818 | 75   | 1.9696          | 4088360           |
| 0.0348        | 0.8339 | 80   | 1.9996          | 4363344           |
| 0.0335        | 0.8860 | 85   | 1.9486          | 4631752           |
| 0.0373        | 0.9381 | 90   | 1.9777          | 4905720           |
| 0.0280        | 0.9902 | 95   | 2.0502          | 5180224           |

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model details

  • Model size: 2.61B params
  • Tensor type: BF16 (Safetensors)
