RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter8_sftsd2

Generated from Trainer



This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.1905
Num Input Tokens Seen: 5048840
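Since this is a fine-tune of google/gemma-2-2b, the checkpoint can presumably be loaded like any other causal language model on the Hub. The following is a minimal sketch, not a documented usage recipe: the helper name `load_model` is hypothetical, and loading in bfloat16 is an assumption based on the BF16 tensor type listed for the stored weights.

```python
# Repository id as shown on the model page.
MODEL_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter8_sftsd2"


def load_model(model_id: str = MODEL_ID):
    """Hypothetical helper: load tokenizer and model in bfloat16 (assumed dtype)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model
```

Calling `load_model()` downloads the weights, so it needs network access and, for Gemma-derived models, may require accepting the base model's license on the Hub.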

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 2
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
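
The relationship between the batch-size entries and the shape of the learning-rate schedule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Trainer's own code: the single-device setting is inferred from 8 × 16 = 128, and `warmup_steps=5` is a hypothetical value, since the card reports only the warmup ratio (0.05) and not the resulting step count.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


def constant_with_warmup(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear ramp from 0 to base_lr over warmup_steps, then constant."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


# train_batch_size=8 and gradient_accumulation_steps=16 from the list above;
# num_devices=1 is an assumption consistent with total_train_batch_size=128.
total = effective_batch_size(8, 16)

# warmup_steps=5 is hypothetical; the card gives only lr_scheduler_warmup_ratio.
schedule = [constant_with_warmup(s, 8e-06, warmup_steps=5) for s in range(8)]
```

Here `total` evaluates to 128, matching the reported `total_train_batch_size`, and `schedule` rises linearly before settling at the configured learning rate of 8e-06.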

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.403	0.0533	5	1.2724	279256
1.1785	0.1066	10	1.1963	545440
1.0295	0.1599	15	1.1799	811104
0.967	0.2132	20	1.1842	1083896
0.7945	0.2665	25	1.1937	1361912
0.786	0.3198	30	1.2116	1632696
0.8056	0.3731	35	1.2189	1905960
0.7204	0.4264	40	1.2143	2171384
0.7197	0.4797	45	1.2007	2448800
0.6811	0.5330	50	1.2093	2721560
0.692	0.5863	55	1.2032	2989280
0.5352	0.6396	60	1.2017	3264080
0.5358	0.6929	65	1.1925	3537184
0.4779	0.7462	70	1.2035	3812032
0.4526	0.7995	75	1.1994	4079120
0.5517	0.8528	80	1.1940	4348552
0.5031	0.9061	85	1.1921	4616056
0.507	0.9594	90	1.1952	4883528
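
To see where validation loss bottoms out, the table can be scanned programmatically. The sketch below copies the (step, validation loss) pairs from the rows above; the variable and function names are illustrative only.

```python
# (step, validation_loss) pairs copied from the training results table.
EVAL_LOG = [
    (0, 1.3909), (5, 1.2724), (10, 1.1963), (15, 1.1799), (20, 1.1842),
    (25, 1.1937), (30, 1.2116), (35, 1.2189), (40, 1.2143), (45, 1.2007),
    (50, 1.2093), (55, 1.2032), (60, 1.2017), (65, 1.1925), (70, 1.2035),
    (75, 1.1994), (80, 1.1940), (85, 1.1921), (90, 1.1952),
]


def best_eval(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


best_step, best_loss = best_eval(EVAL_LOG)
print(f"lowest validation loss {best_loss} at step {best_step}")
# prints: lowest validation loss 1.1799 at step 15
```

In these logged rows the minimum occurs early, at step 15, while training loss continues to fall through later steps.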

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
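
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the function name `check_versions` is hypothetical, and mapping "Pytorch" to the `torch` distribution name is the one naming assumption made here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this card; "Pytorch 2.4.0+cu121" is compared without
# the "+cu121" local build tag.
EXPECTED = {
    "transformers": "4.44.0",
    "torch": "2.4.0",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Drop any local build tag such as "+cu121" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report
```

Calling `check_versions()` returns one entry per package, so mismatches or missing packages can be spotted before loading the model.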


Safetensors

Model size: 2.61B params
Tensor type: BF16
