sayanmandal
committed on
Commit 0988db9 • 1 Parent(s): df13e9d
Update README.md
README.md
CHANGED
@@ -3,7 +3,7 @@ tags:
  - translation
  - generated_from_trainer
  datasets:
- -
+ - cmu_hinglish_dog
  metrics:
  - bleu
  model-index:
@@ -13,8 +13,8 @@ model-index:
  name: Sequence-to-sequence Language Modeling
  type: text2text-generation
  dataset:
- name:
- type:
+ name: cmu_hinglish_dog
+ type: cmu_hinglish_dog
  args: hi_en-en
  metrics:
  - name: Bleu
@@ -27,7 +27,7 @@ should probably proofread and complete it, then remove this comment. -->

  # t5-small_6_3-hi_en-to-en

- This model was trained from scratch on the
+ This model was trained from scratch on the cmu_hinglish_dog dataset.
  It achieves the following results on the evaluation set:
  - Loss: 2.3662
  - Bleu: 18.0863
@@ -35,7 +35,9 @@ It achieves the following results on the evaluation set:

  ## Model description

-
+ Model generated using:
+ ```python make_student.py t5-small t5_small_6_3 6 3```
+ Check this [link](https://discuss.huggingface.co/t/questions-on-distilling-from-t5/1193/9) for more information.

  ## Intended uses & limitations

@@ -43,7 +45,14 @@ More information needed

  ## Training and evaluation data

-
+ Used cmu_hinglish_dog dataset. Please check this [link](https://huggingface.co/datasets/cmu_hinglish_dog) for dataset description
+
+ ## Translation:
+
+ * Source: hi_en: The text in Hinglish
+ * Target: en: The text in English
+
+

  ## Training procedure

@@ -166,6 +175,12 @@ The following hyperparameters were used during training:
  | 1.9924 | 99.0 | 12474 | 2.1760 | 14.7756 | 12.3832 |
  | 1.9903 | 100.0 | 12600 | 2.1761 | 14.7713 | 12.3822 |

+ ### Evaluation results
+
+ | Data Split | Bleu |
+ |:----------:|:-------:|
+ | Validation | 17.8061 |
+ | Test | 18.0863 |

  ### Framework versions

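The `make_student.py` command added to the model description appears to take a teacher checkpoint (`t5-small`), an output path (`t5_small_6_3`), and two layer counts (`6` and `3`). Reading those as encoder and decoder depths is an assumption based on the output name. The sketch below only illustrates the general idea of building a shallower student by keeping a spread of teacher layers. The function name `pick_evenly_spaced_layers` is hypothetical, and the real script may choose different layers.

```python
from typing import List


def pick_evenly_spaced_layers(n_teacher: int, n_student: int) -> List[int]:
    """Return n_student teacher-layer indices spread across the teacher stack.

    Illustrative only: this is not taken from make_student.py.
    """
    if not 1 <= n_student <= n_teacher:
        raise ValueError("student depth must be between 1 and the teacher depth")
    if n_student == 1:
        return [0]
    # Keep the first and last teacher layers and space the rest evenly between them.
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]


# t5-small has 6 encoder and 6 decoder layers; the command is assumed to
# request 6 encoder layers and 3 decoder layers for the student.
encoder_layers = pick_evenly_spaced_layers(6, 6)
decoder_layers = pick_evenly_spaced_layers(6, 3)
print("encoder layers kept:", encoder_layers)
print("decoder layers kept:", decoder_layers)
```

Under this reading, the student keeps the full encoder and only half of the decoder, which would be consistent with the `6_3` suffix in the model name.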