---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: childes_42
  results: []
---

# childes_42

This model was fine-tuned from an unspecified base checkpoint ([](https://huggingface.co/)) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 5.3390 (final checkpoint at step 100,000; as the training results table below shows, validation loss reaches its minimum of 4.5796 at step 32,000 and rises thereafter)

## Model description

More information needed

## Intended uses & limitations

More information needed
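
Pending that information, here is a minimal loading sketch, assuming the checkpoint is a causal language model (the card does not record the base architecture) and using the hypothetical repo id `childes_42` taken from the model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id taken from the model name above; replace with the real Hub path.
model_id = "childes_42"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # assumes a causal LM head

# Generate a short continuation from a sample prompt.
inputs = tokenizer("the little dog", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```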

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40000
- training_steps: 100000
- mixed_precision_training: Native AMP
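
As a reading aid, the list above maps onto `transformers.TrainingArguments` roughly as follows; this is a sketch, not the original training script, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Sketch reproducing the hyperparameters listed above; not the original script.
training_args = TrainingArguments(
    output_dir="childes_42",          # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,    # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                        # "Native AMP" mixed precision
)
```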

### Training results

| Training Loss | Epoch    | Step   | Validation Loss |
|:-------------:|:--------:|:------:|:---------------:|
| No log        | 2.0964   | 2000   | 7.0908          |
| 6.9765        | 4.1929   | 4000   | 5.8746          |
| 6.9765        | 6.2893   | 6000   | 5.5441          |
| 5.2182        | 8.3857   | 8000   | 5.2788          |
| 5.2182        | 10.4822  | 10000  | 5.0992          |
| 4.7379        | 12.5786  | 12000  | 4.9710          |
| 4.7379        | 14.6751  | 14000  | 4.8759          |
| 4.4249        | 16.7715  | 16000  | 4.8005          |
| 4.4249        | 18.8679  | 18000  | 4.7436          |
| 4.1842        | 20.9644  | 20000  | 4.6922          |
| 4.1842        | 23.0608  | 22000  | 4.6481          |
| 3.9843        | 25.1572  | 24000  | 4.6155          |
| 3.9843        | 27.2537  | 26000  | 4.5982          |
| 3.8181        | 29.3501  | 28000  | 4.5845          |
| 3.8181        | 31.4465  | 30000  | 4.5811          |
| 3.6751        | 33.5430  | 32000  | 4.5796          |
| 3.6751        | 35.6394  | 34000  | 4.5828          |
| 3.5484        | 37.7358  | 36000  | 4.5869          |
| 3.5484        | 39.8323  | 38000  | 4.5976          |
| 3.4328        | 41.9287  | 40000  | 4.6090          |
| 3.4328        | 44.0252  | 42000  | 4.6298          |
| 3.31          | 46.1216  | 44000  | 4.6598          |
| 3.31          | 48.2180  | 46000  | 4.6983          |
| 3.1908        | 50.3145  | 48000  | 4.7263          |
| 3.1908        | 52.4109  | 50000  | 4.7624          |
| 3.0864        | 54.5073  | 52000  | 4.7913          |
| 3.0864        | 56.6038  | 54000  | 4.8263          |
| 2.993         | 58.7002  | 56000  | 4.8538          |
| 2.993         | 60.7966  | 58000  | 4.8770          |
| 2.9108        | 62.8931  | 60000  | 4.9097          |
| 2.9108        | 64.9895  | 62000  | 4.9486          |
| 2.8352        | 67.0860  | 64000  | 4.9929          |
| 2.8352        | 69.1824  | 66000  | 5.0339          |
| 2.7677        | 71.2788  | 68000  | 5.0516          |
| 2.7677        | 73.3753  | 70000  | 5.0869          |
| 2.708         | 75.4717  | 72000  | 5.1078          |
| 2.708         | 77.5681  | 74000  | 5.1317          |
| 2.6552        | 79.6646  | 76000  | 5.1598          |
| 2.6552        | 81.7610  | 78000  | 5.1774          |
| 2.6082        | 83.8574  | 80000  | 5.1928          |
| 2.6082        | 85.9539  | 82000  | 5.2273          |
| 2.5633        | 88.0503  | 84000  | 5.2497          |
| 2.5633        | 90.1468  | 86000  | 5.2644          |
| 2.5227        | 92.2432  | 88000  | 5.2840          |
| 2.5227        | 94.3396  | 90000  | 5.2921          |
| 2.4873        | 96.4361  | 92000  | 5.3118          |
| 2.4873        | 98.5325  | 94000  | 5.3205          |
| 2.458         | 100.6289 | 96000  | 5.3308          |
| 2.458         | 102.7254 | 98000  | 5.3365          |
| 2.4331        | 104.8218 | 100000 | 5.3390          |
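
If these losses are mean per-token cross-entropy (the usual convention for Trainer language-model runs, though the card does not say so), they convert to perplexity via `exp(loss)`:

```python
import math

# Assuming the reported losses are mean per-token cross-entropy.
best = math.exp(4.5796)   # step 32,000  -> perplexity ~ 97.5
final = math.exp(5.3390)  # step 100,000 -> perplexity ~ 208.3
print(f"best: {best:.1f}, final: {final:.1f}")
```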


### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1
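
A quick way to confirm a reproduction environment matches these pins (a convenience snippet, not part of the original card):

```python
# Sanity-check installed versions against the pins listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.45.2"
assert torch.__version__.startswith("2.5.1")  # build: 2.5.1+cu124
assert datasets.__version__ == "3.0.1"
assert tokenizers.__version__ == "0.20.1"
```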