Commit a40b6f3 (1 parent: b179c47), committed by jordiclive
Update README.md

README.md CHANGED
@@ -194,15 +194,13 @@ result = summarizer(
 If having computing constraints, try the base version [`pszemraj/led-base-book-summary`](https://huggingface.co/pszemraj/led-base-book-summary)
 - all the parameters for generation on the API here are the same as [the base model](https://huggingface.co/pszemraj/led-base-book-summary) for easy comparison between versions.
 
-## Training and evaluation data
-
-- the [booksum](https://arxiv.org/abs/2105.08209) dataset (this is what adds the `bsd-3-clause` license)
-- During training, the input text was the text of the `chapter`, and the output was `summary_text`
-- Eval results can be found [here](https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-project-kmfoda__booksum-79c1c0d8-10905463) with metrics on the sidebar.
-
 ## Training procedure
 
 - Training was done in BF16, deepspeed stage 2 for 6 epochs with ROUGE2 monitored on the validation set.
+
+## Hardware
+- GPU count 8 NVIDIA A100-SXM4-40GB
+- CPU count 48
 -
 ### Training hyperparameters
 
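For readers unfamiliar with the setup named in the "Training procedure" line (BF16, DeepSpeed stage 2, 6 epochs, ROUGE2 monitored on validation), the following is a minimal sketch of what such a configuration might look like. It is not the author's actual config, which this commit does not show. The helper name `build_deepspeed_config`, the `"auto"` batch settings, and the `trainer_settings` dict are assumptions made for illustration; only the BF16 flag, the ZeRO stage, the epoch count, and the monitored metric come from the README text.

```python
import json


def build_deepspeed_config(stage: int = 2, use_bf16: bool = True) -> dict:
    """Return a minimal DeepSpeed-style config dict (hypothetical helper)."""
    return {
        # BF16 mixed precision, as stated in the README
        "bf16": {"enabled": use_bf16},
        # ZeRO stage 2, as stated in the README
        "zero_optimization": {"stage": stage},
        # Assumption: batch settings left as "auto" so the Hugging Face
        # Trainer integration can fill them in; the README does not say.
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


# Assumption: these keys mirror Hugging Face TrainingArguments names; the
# exact metric key depends on how the metric function labels its output.
trainer_settings = {
    "num_train_epochs": 6,                # "6 epochs" from the README
    "metric_for_best_model": "rouge2",    # "ROUGE2 monitored on the validation set"
}

config = build_deepspeed_config()
print(json.dumps(config, indent=2))
```

The dict can be written to a JSON file and passed to a launcher or trainer that accepts a DeepSpeed config path. Values not stated in the README (batch sizes, optimizer, scheduler) are deliberately left out rather than guessed.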