describe latest checkpoint
README.md CHANGED

@@ -56,7 +56,7 @@ parameters:
     min_length: 8
     no_repeat_ngram_size: 3
     early_stopping: True
-    repetition_penalty:
+    repetition_penalty: 1.5
     length_penalty: 0.3
     encoder_no_repeat_ngram_size : 3
     num_beams : 4
@@ -72,12 +72,13 @@ parameters:
 
 _As I make updates to this WIP checkpoint I will post a note here._
 
+- July 8, 2022: add checkpoint with ~4 epochs of training on A100, equating to approx 350 steps of functional batch size 128
 - July 4, 2022: add checkpoint with 6 additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving prior issue of short summaries.
 
 ## About
 
 - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 26 epochs
-- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final
+- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final 10+ epochs**
 
 
 ## Comparisons
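For context, the `parameters:` block touched by the first hunk is the model card's generation config. Below is a minimal sketch (not part of the commit) of passing those same settings to `generate()` with Hugging Face `transformers`. The repo id is the base model named in the README, since the fine-tuned checkpoint's id isn't shown in this diff; the input text is a placeholder; and the output `max_length=512` is an assumed value, as that parameter isn't visible in the hunk.

```python
# Rough sketch only: applying the generation parameters from the diff above with
# Hugging Face transformers. The repo id below is the *base* model named in the
# README; swap in the fine-tuned booksum checkpoint this card actually describes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_text = "..."  # placeholder: a long document (book chapter, report, etc.) to summarize
# 16384 matches the max input length used for the final epochs per the README
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=16384)

summary_ids = model.generate(
    **inputs,
    min_length=8,
    no_repeat_ngram_size=3,
    early_stopping=True,
    repetition_penalty=1.5,  # value filled in by this commit
    length_penalty=0.3,
    encoder_no_repeat_ngram_size=3,
    num_beams=4,
    max_length=512,  # assumed output cap; not shown in the diff hunks
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```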