pszemraj committed on
Commit dbcf4ab · 1 Parent(s): 2173531

describe latest checkpoint

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -56,7 +56,7 @@ parameters:
  min_length: 8
  no_repeat_ngram_size: 3
  early_stopping: True
- repetition_penalty: 3.5
+ repetition_penalty: 1.5
  length_penalty: 0.3
  encoder_no_repeat_ngram_size : 3
  num_beams : 4
@@ -72,12 +72,13 @@ parameters:
 
 _As I make updates to this WIP checkpoint I will post a note here._
 
+- July 8, 2022: add checkpoint with ~4 epochs of training on A100, equating to approx 350 steps of functional batch size 128
 - July 4, 2022: add checkpoint with 6 additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving prior issue of short summaries.
 
 ## About
 
 - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 26 epochs
-- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final eight epochs**
+- max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final 10+ epochs**
 
 
 ## Comparisons
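
For reference, the YAML block edited in the first hunk holds Hugging Face `transformers` generation kwargs. Below is a minimal inference sketch using those settings; the repo id is a placeholder (this commit page does not name the checkpoint), and the 16384-token truncation mirrors the max input length noted in the README.

```python
# Minimal sketch of summarization with the generation settings from this diff.
# NOTE: the repo id below is hypothetical -- substitute the actual checkpoint
# that this README belongs to.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "pszemraj/long-t5-tglobal-large-booksum-WIP"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

long_text = "..."  # the long document to summarize

# Truncate to the 16384-token max input length used for the final epochs.
inputs = tokenizer(long_text, truncation=True, max_length=16384, return_tensors="pt")

summary_ids = model.generate(
    **inputs,
    min_length=8,
    no_repeat_ngram_size=3,
    early_stopping=True,
    repetition_penalty=1.5,  # lowered from 3.5 in this commit
    length_penalty=0.3,
    encoder_no_repeat_ngram_size=3,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```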