fpadovani/german_clm_childes_42

+---
+library_name: transformers
+tags:
+- generated_from_trainer
+model-index:
+- name: childes_42
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# childes_42
+This model is a fine-tuned version of an unspecified base model on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 4.3348
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 32
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 40000
+- training_steps: 100000
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch   | Step   | Validation Loss |
+|:-------------:|:-------:|:------:|:---------------:|
+| No log        | 1.3276  | 2000   | 6.8111          |
+| 6.7052        | 2.6552  | 4000   | 5.2489          |
+| 6.7052        | 3.9827  | 6000   | 4.8334          |
+| 4.6011        | 5.3103  | 8000   | 4.5945          |
+| 4.6011        | 6.6379  | 10000  | 4.4471          |
+| 4.1311        | 7.9655  | 12000  | 4.3496          |
+| 4.1311        | 9.2931  | 14000  | 4.2791          |
+| 3.8551        | 10.6206 | 16000  | 4.2176          |
+| 3.8551        | 11.9482 | 18000  | 4.1808          |
+| 3.6553        | 13.2758 | 20000  | 4.1480          |
+| 3.6553        | 14.6034 | 22000  | 4.1244          |
+| 3.4944        | 15.9310 | 24000  | 4.1036          |
+| 3.4944        | 17.2585 | 26000  | 4.0837          |
+| 3.3589        | 18.5861 | 28000  | 4.0664          |
+| 3.3589        | 19.9137 | 30000  | 4.0494          |
+| 3.2511        | 21.2413 | 32000  | 4.0538          |
+| 3.2511        | 22.5689 | 34000  | 4.0513          |
+| 3.1592        | 23.8964 | 36000  | 4.0287          |
+| 3.1592        | 25.2240 | 38000  | 4.0395          |
+| 3.0779        | 26.5516 | 40000  | 4.0467          |
+| 3.0779        | 27.8792 | 42000  | 4.0412          |
+| 3.0034        | 29.2068 | 44000  | 4.0514          |
+| 3.0034        | 30.5344 | 46000  | 4.0581          |
+| 2.9282        | 31.8619 | 48000  | 4.0613          |
+| 2.9282        | 33.1895 | 50000  | 4.0739          |
+| 2.8601        | 34.5171 | 52000  | 4.0870          |
+| 2.8601        | 35.8447 | 54000  | 4.0898          |
+| 2.8037        | 37.1723 | 56000  | 4.1204          |
+| 2.8037        | 38.4998 | 58000  | 4.1324          |
+| 2.751         | 39.8274 | 60000  | 4.1339          |
+| 2.751         | 41.1550 | 62000  | 4.1532          |
+| 2.7015        | 42.4826 | 64000  | 4.1742          |
+| 2.7015        | 43.8102 | 66000  | 4.1717          |
+| 2.6626        | 45.1377 | 68000  | 4.1888          |
+| 2.6626        | 46.4653 | 70000  | 4.2004          |
+| 2.6203        | 47.7929 | 72000  | 4.2134          |
+| 2.6203        | 49.1205 | 74000  | 4.2247          |
+| 2.585         | 50.4481 | 76000  | 4.2420          |
+| 2.585         | 51.7756 | 78000  | 4.2502          |
+| 2.5544        | 53.1032 | 80000  | 4.2732          |
+| 2.5544        | 54.4308 | 82000  | 4.2792          |
+| 2.5223        | 55.7584 | 84000  | 4.2878          |
+| 2.5223        | 57.0860 | 86000  | 4.2994          |
+| 2.496         | 58.4135 | 88000  | 4.3056          |
+| 2.496         | 59.7411 | 90000  | 4.3080          |
+| 2.4733        | 61.0687 | 92000  | 4.3204          |
+| 2.4733        | 62.3963 | 94000  | 4.3243          |
+| 2.4503        | 63.7239 | 96000  | 4.3270          |
+| 2.4503        | 65.0514 | 98000  | 4.3339          |
+| 2.4342        | 66.3790 | 100000 | 4.3348          |
+### Framework versions
+- Transformers 4.45.2
+- Pytorch 2.5.1+cu124
+- Datasets 3.0.1
+- Tokenizers 0.20.1
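The hyperparameters in the card imply a learning rate that ramps linearly from 0 to 1e-4 over the first 40,000 steps and then decays linearly to 0 at step 100,000. The following is a minimal sketch of that schedule in plain Python, assuming the card's `linear` scheduler has the usual warmup-then-linear-decay shape. The function name `linear_warmup_lr` is hypothetical, and the single-device count is an assumption inferred from 16 × 2 = 32; neither comes from the training script.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=40_000, total_steps=100_000):
    """Learning rate at a given optimizer step (assumed warmup + linear decay)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at warmup_steps down to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# Effective batch size per optimizer step.
train_batch_size = 16             # from the card
gradient_accumulation_steps = 2   # from the card
num_devices = 1                   # assumption: consistent with the reported total of 32
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

if __name__ == "__main__":
    for s in (0, 20_000, 40_000, 70_000, 100_000):
        print(s, linear_warmup_lr(s))
    print("effective batch size:", effective_batch_size)
```

Under these assumptions the peak learning rate is only reached at step 40,000, which is 40% of the way through training.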

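The validation losses in the training results table do not decrease monotonically, so the loss reported at the top of the card (4.3348) is that of the final checkpoint rather than the lowest value observed. The sketch below copies the validation-loss column from the table, locates the minimum, and converts losses to perplexity. It assumes the loss is a mean per-token cross-entropy in nats, which is typical for causal language model training but is not stated in the card. The variable names are illustrative.

```python
import math

# Validation losses copied from the training results table,
# one entry per evaluation, logged every 2000 steps.
val_losses = [
    6.8111, 5.2489, 4.8334, 4.5945, 4.4471,
    4.3496, 4.2791, 4.2176, 4.1808, 4.1480,
    4.1244, 4.1036, 4.0837, 4.0664, 4.0494,
    4.0538, 4.0513, 4.0287, 4.0395, 4.0467,
    4.0412, 4.0514, 4.0581, 4.0613, 4.0739,
    4.0870, 4.0898, 4.1204, 4.1324, 4.1339,
    4.1532, 4.1742, 4.1717, 4.1888, 4.2004,
    4.2134, 4.2247, 4.2420, 4.2502, 4.2732,
    4.2792, 4.2878, 4.2994, 4.3056, 4.3080,
    4.3204, 4.3243, 4.3270, 4.3339, 4.3348,
]
eval_every = 2000
steps = [eval_every * (i + 1) for i in range(len(val_losses))]

# Checkpoint with the lowest validation loss.
best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
best_step, best_loss = steps[best_index], val_losses[best_index]
final_step, final_loss = steps[-1], val_losses[-1]

# Perplexity = exp(loss), assuming mean per-token cross-entropy in nats.
best_ppl = math.exp(best_loss)
final_ppl = math.exp(final_loss)

if __name__ == "__main__":
    print(f"best:  step {best_step}, loss {best_loss:.4f}, ppl {best_ppl:.2f}")
    print(f"final: step {final_step}, loss {final_loss:.4f}, ppl {final_ppl:.2f}")
```

In the table, the minimum falls at step 36,000 while the training loss keeps dropping afterwards, a pattern consistent with overfitting. Whether an earlier checkpoint would be preferable depends on the intended use, which the card does not specify.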
generation_config.json ADDED

@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "eos_token_id": 1,
+  "transformers_version": "4.45.2"
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fdc1093c3d26d52186b8ab698bace69116a613000867c85197e2787b5d5fde17
+oid sha256:8bd992d12f49ddc814383784312f40db18a08227d3eb4035e0d221b289680a03
 size 51007160