---
license: gemma
base_model: google/gemma-2-27b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-27b_hs2_replace_iter3_sftsd0
  results: []
---

# collapse_gemma-2-27b_hs2_replace_iter3_sftsd0

This model is a fine-tuned version of [google/gemma-2-27b](https://huggingface.co/google/gemma-2-27b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3653
- Num Input Tokens Seen: 3955416

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 4
- eval_batch_size: 16
- seed: 0
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.1282          | 0                 |
| 3.8489        | 0.0583 | 5    | 1.0535          | 228936            |
| 3.3414        | 0.1165 | 10   | 1.1298          | 463812            |
| 2.8437        | 0.1748 | 15   | 1.1488          | 702592            |
| 1.9341        | 0.2331 | 20   | 1.2179          | 938224            |
| 1.1621        | 0.2913 | 25   | 1.2570          | 1165920           |
| 0.6806        | 0.3496 | 30   | 1.2791          | 1403276           |
| 0.6728        | 0.4079 | 35   | 1.2535          | 1650592           |
| 0.5266        | 0.4661 | 40   | 1.2409          | 1880524           |
| 0.5377        | 0.5244 | 45   | 1.2414          | 2104356           |
| 0.4042        | 0.5827 | 50   | 1.2466          | 2335700           |
| 0.7168        | 0.6409 | 55   | 1.2873          | 2564852           |
| 0.3333        | 0.6992 | 60   | 1.3003          | 2791324           |
| 0.5753        | 0.7575 | 65   | 1.3164          | 3032688           |
| 0.3997        | 0.8157 | 70   | 1.3235          | 3267132           |
| 0.3566        | 0.8740 | 75   | 1.3464          | 3502604           |
| 0.4565        | 0.9323 | 80   | 1.3853          | 3727432           |
| 0.1841        | 0.9905 | 85   | 1.3653          | 3955416           |

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
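
### Hyperparameter arithmetic (illustrative)

The hyperparameters reported in this card are related by simple arithmetic: `total_train_batch_size` is the per-device batch size multiplied by the gradient accumulation steps, and `constant_with_warmup` ramps the learning rate linearly before holding it fixed. The sketch below reproduces those relationships in plain Python. It is not the training script for this model. The device count, the total step count (estimated here as 86 from step 85 falling at epoch 0.9905), and the choice to round warmup steps up are assumptions that the card does not state, and the function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list in this card.
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 32
LEARNING_RATE = 8e-06
WARMUP_RATIO = 0.05

# Assumptions (not stated in the card).
NUM_DEVICES = 1   # assumed: a single device contributes to each optimizer step
TOTAL_STEPS = 86  # assumed: estimated from step 85 at epoch 0.9905


def effective_batch_size(per_device: int, accum_steps: int, devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accum_steps * devices


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Warmup length in optimizer steps, rounding up (an assumption)."""
    return math.ceil(total_steps * ratio)


def lr_at_step(step: int, base_lr: float, n_warmup: int) -> float:
    """Linear ramp from 0 to base_lr over n_warmup steps, then constant."""
    if step < n_warmup:
        return base_lr * step / max(1, n_warmup)
    return base_lr


if __name__ == "__main__":
    n_warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    print("warmup steps:", n_warmup)
    for step in (0, 2, n_warmup, 85):
        print(step, lr_at_step(step, LEARNING_RATE, n_warmup))
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 128, and the warmup would end before or around the first logged training step (step 5).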