---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: childes_30
  results: []
---

# childes_30

This model is a fine-tuned version of an unspecified base model on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 5.3366
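
If this loss is the average per-token cross-entropy in nats (the usual convention for the Trainer's language-modelling losses; the card does not state it explicitly), it corresponds to a perplexity of roughly 208:

```python
import math

eval_loss = 5.3366  # final validation loss reported above
# Perplexity is exp(cross-entropy) when the loss is measured in nats.
print(math.exp(eval_loss))  # ≈ 207.8
```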

## Model description

More information needed

## Intended uses & limitations

More information needed
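
As a minimal usage sketch, assuming this checkpoint is a causal language model (the card does not state the architecture) and using a placeholder repo id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/childes_30"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("The little boy", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```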

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 30
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40000
- training_steps: 100000
- mixed_precision_training: Native AMP
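
These settings can be reconstructed as a `TrainingArguments` sketch (written against the Transformers 4.45 API; the original training script is not available, and `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="childes_30",        # placeholder output path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=30,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                      # "Native AMP" mixed precision
)
```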

### Training results

| Training Loss | Epoch    | Step   | Validation Loss |
|:-------------:|:--------:|:------:|:---------------:|
| No log        | 2.0964   | 2000   | 7.1029          |
| 6.9987        | 4.1929   | 4000   | 5.8842          |
| 6.9987        | 6.2893   | 6000   | 5.5487          |
| 5.2204        | 8.3857   | 8000   | 5.2793          |
| 5.2204        | 10.4822  | 10000  | 5.1049          |
| 4.7358        | 12.5786  | 12000  | 4.9836          |
| 4.7358        | 14.6751  | 14000  | 4.8829          |
| 4.4216        | 16.7715  | 16000  | 4.8029          |
| 4.4216        | 18.8679  | 18000  | 4.7423          |
| 4.1842        | 20.9644  | 20000  | 4.6904          |
| 4.1842        | 23.0608  | 22000  | 4.6458          |
| 3.9858        | 25.1572  | 24000  | 4.6234          |
| 3.9858        | 27.2537  | 26000  | 4.6056          |
| 3.8189        | 29.3501  | 28000  | 4.5909          |
| 3.8189        | 31.4465  | 30000  | 4.5868          |
| 3.6763        | 33.5430  | 32000  | 4.5830          |
| 3.6763        | 35.6394  | 34000  | 4.5782          |
| 3.5493        | 37.7358  | 36000  | 4.5854          |
| 3.5493        | 39.8323  | 38000  | 4.5964          |
| 3.4327        | 41.9287  | 40000  | 4.6104          |
| 3.4327        | 44.0252  | 42000  | 4.6369          |
| 3.3112        | 46.1216  | 44000  | 4.6697          |
| 3.3112        | 48.2180  | 46000  | 4.6953          |
| 3.1908        | 50.3145  | 48000  | 4.7280          |
| 3.1908        | 52.4109  | 50000  | 4.7629          |
| 3.0857        | 54.5073  | 52000  | 4.7928          |
| 3.0857        | 56.6038  | 54000  | 4.8196          |
| 2.9936        | 58.7002  | 56000  | 4.8564          |
| 2.9936        | 60.7966  | 58000  | 4.8890          |
| 2.9113        | 62.8931  | 60000  | 4.9200          |
| 2.9113        | 64.9895  | 62000  | 4.9539          |
| 2.8353        | 67.0860  | 64000  | 4.9934          |
| 2.8353        | 69.1824  | 66000  | 5.0297          |
| 2.7673        | 71.2788  | 68000  | 5.0610          |
| 2.7673        | 73.3753  | 70000  | 5.0805          |
| 2.7091        | 75.4717  | 72000  | 5.1054          |
| 2.7091        | 77.5681  | 74000  | 5.1283          |
| 2.6563        | 79.6646  | 76000  | 5.1594          |
| 2.6563        | 81.7610  | 78000  | 5.1836          |
| 2.6077        | 83.8574  | 80000  | 5.2009          |
| 2.6077        | 85.9539  | 82000  | 5.2230          |
| 2.5635        | 88.0503  | 84000  | 5.2444          |
| 2.5635        | 90.1468  | 86000  | 5.2631          |
| 2.5229        | 92.2432  | 88000  | 5.2798          |
| 2.5229        | 94.3396  | 90000  | 5.2951          |
| 2.4886        | 96.4361  | 92000  | 5.3101          |
| 2.4886        | 98.5325  | 94000  | 5.3189          |
| 2.4584        | 100.6289 | 96000  | 5.3300          |
| 2.4584        | 102.7254 | 98000  | 5.3327          |
| 2.4337        | 104.8218 | 100000 | 5.3366          |
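
Note that the validation loss bottoms out at 4.5782 around step 34,000 and rises steadily afterwards while the training loss keeps falling, so the final checkpoint appears to overfit relative to the mid-training ones.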


### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1