format
README.md CHANGED
@@ -88,7 +88,7 @@ A fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/goo
 
 - At the time of writing, the model is not _fully converged_ despite training for 20+ epochs. This checkpoint is serviceable enough (see examples).
 - I plan to update this page with newer checkpoints and post some metrics over time.
-- Compare performance to [LED-base](https://huggingface.co/pszemraj/led-base-book-summary) trained on the same dataset.
+- Compare performance to [LED-base](https://huggingface.co/pszemraj/led-base-book-summary) trained on the same dataset (API gen parameters are the same).
 
 ## Training and evaluation data
 
@@ -111,7 +111,7 @@ The following hyperparameters were used during the **final** training round\*:
 - lr_scheduler_warmup_ratio: 0.02
 - num_epochs: 2
 
-\*_Prior training sessions used roughly similar parameters, multiple sessions were required as this takes eons to
+\*_Prior training sessions used roughly similar parameters, multiple sessions were required as this takes eons to train_
 
 ### Training results
 
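The updated bullet in the first hunk notes that the LED-base comparison uses the same generation ("API gen") parameters. Below is a minimal sketch of such a like-for-like comparison with the transformers summarization pipeline; the generation values and the second checkpoint id are assumptions for illustration, as only pszemraj/led-base-book-summary is named in this diff.

```python
from transformers import pipeline

# Checkpoints to compare. "pszemraj/led-base-book-summary" is the comparison
# model named in the card; the second id is a placeholder for the long-t5
# checkpoint this card describes (not specified in this diff).
model_ids = [
    "pszemraj/led-base-book-summary",
    "your-long-t5-book-summary-checkpoint",  # placeholder
]

# Shared generation parameters so the comparison is like-for-like.
# These particular values are illustrative, not taken from the card.
gen_kwargs = {
    "max_length": 256,
    "min_length": 8,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
}

long_text = "..."  # a long document or book chapter to summarize

for model_id in model_ids:
    summarizer = pipeline("summarization", model=model_id)
    result = summarizer(long_text, **gen_kwargs)
    print(f"== {model_id} ==\n{result[0]['summary_text']}\n")
```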
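For the hyperparameter bullets visible in the second hunk, here is a hedged sketch of how lr_scheduler_warmup_ratio and num_epochs could map onto transformers Seq2SeqTrainingArguments; every other value below is an assumed placeholder, not something stated in this diff.

```python
from transformers import Seq2SeqTrainingArguments

# Only warmup_ratio and num_train_epochs correspond to values shown in the
# card's hyperparameter list; the rest are placeholders for illustration.
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-book-summary",   # placeholder
    warmup_ratio=0.02,                   # card: lr_scheduler_warmup_ratio: 0.02
    num_train_epochs=2,                  # card: num_epochs: 2
    learning_rate=1e-4,                  # placeholder, not in this diff
    per_device_train_batch_size=1,       # placeholder, not in this diff
)
```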