n3wtou committed on
Commit
0fd9bbf
1 Parent(s): b6ea60e

Training in progress epoch 0

Files changed (3)
  1. README.md +5 -12
  2. config.json +1 -0
  3. tf_model.h5 +2 -2
README.md CHANGED
@@ -14,9 +14,9 @@ probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: nan
- - Validation Loss: nan
- - Epoch: 7
+ - Train Loss: 5.5637
+ - Validation Loss: 3.0045
+ - Epoch: 0

  ## Model description

@@ -35,21 +35,14 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0004, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0004, 'decay_steps': 12532, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 100, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
+ - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0003, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0003, 'decay_steps': 1924, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 50, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
  - training_precision: mixed_float16

  ### Training results

  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
- | nan | nan | 0 |
- | nan | nan | 1 |
- | nan | nan | 2 |
- | nan | nan | 3 |
- | nan | nan | 4 |
- | nan | nan | 5 |
- | nan | nan | 6 |
- | nan | nan | 7 |
+ | 5.5637 | 3.0045 | 0 |


  ### Framework versions
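
Note on the optimizer entry in README.md: the serialized config describes a `WarmUp` schedule wrapping a `PolynomialDecay` schedule. The sketch below is a plain-Python approximation of how these two schedules are commonly understood to compose (a ramp from zero during warmup, then a polynomial decay to `end_learning_rate`). It is not the library's code. The function names `polynomial_decay` and `warmup_then_decay` are hypothetical, and the default arguments are the values from the new side of this commit.

```python
def polynomial_decay(step, initial_lr, decay_steps, end_lr=0.0, power=1.0):
    """Polynomial decay without cycling (assumed behaviour for 'cycle': False)."""
    step = min(step, decay_steps)          # clamp so the rate stays at end_lr afterwards
    remaining = 1.0 - step / decay_steps   # fraction of the decay still to go
    return (initial_lr - end_lr) * remaining ** power + end_lr


def warmup_then_decay(step, initial_lr=3e-4, warmup_steps=50,
                      decay_steps=1924, power=1.0):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumption: during warmup the rate ramps as initial_lr * (step / warmup_steps) ** power,
    and afterwards the decay schedule is evaluated at (step - warmup_steps).
    """
    if step < warmup_steps:
        return initial_lr * (step / warmup_steps) ** power
    return polynomial_decay(step - warmup_steps, initial_lr, decay_steps)


# Print the rate at a few steps: start, mid-warmup, peak, mid-decay, end.
for s in (0, 25, 50, 50 + 962, 50 + 1924):
    print(s, warmup_then_decay(s))
```

With `power` set to 1.0 on both schedules, as in this commit, both phases are linear: the rate peaks at `initial_learning_rate` once warmup ends and reaches `end_learning_rate` after `decay_steps` further steps.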
config.json CHANGED
@@ -18,6 +18,7 @@
  "length_penalty": 0.6,
  "max_length": 128,
  "model_type": "mt5",
+ "no_repeat_ngram_size": 2,
  "num_beams": 15,
  "num_decoder_layers": 8,
  "num_heads": 6,
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0f6654ed46c303812a0171c599f72e90593c2f5bb7e88b5ae56bfa819d15b568
- size 2225560376
+ oid sha256:cf55598e68a86bdf1cb74717b623753a8f77ca6ff11d067532a659e020c40ace
+ size 2225556280
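
Note on the config.json change: the commit adds `"no_repeat_ngram_size": 2` next to the existing beam-search settings. As commonly described, this option stops generation from emitting any 2-gram that already appears in the sequence produced so far. The sketch below illustrates that rule on a list of token ids. It is an illustration under that assumption, not the library's implementation, and `banned_next_tokens` is a hypothetical name.

```python
def banned_next_tokens(prefix, ngram_size=2):
    """Return the token ids that would repeat an existing n-gram if emitted next.

    prefix: list of token ids generated so far (hypothetical input format).
    """
    if ngram_size < 1 or len(prefix) < ngram_size:
        return set()  # no complete n-gram exists yet, so nothing is banned

    # The last (n - 1) tokens form the start of the n-gram the next token would complete.
    context = tuple(prefix[len(prefix) - (ngram_size - 1):])

    banned = set()
    for i in range(len(prefix) - ngram_size + 1):
        ngram = tuple(prefix[i:i + ngram_size])
        if ngram[:-1] == context:
            banned.add(ngram[-1])  # emitting this token would repeat `ngram`
    return banned


# The bigram (5, 7) already occurred and the prefix ends in 5, so 7 is banned next.
print(banned_next_tokens([5, 7, 5], ngram_size=2))
```

In a real decoder a rule like this would be applied to each candidate sequence separately, typically by masking the banned ids before the next token is chosen.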