Introduction
A led-large-16384 model for summarizing arXiv papers. It takes paper abstracts and full documents as input and produces summaries of the papers.
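A minimal usage sketch with the Hugging Face `transformers` library is shown here. The repository ID is the one this card is published under. The generation settings (`max_new_tokens=256`, `num_beams=4`) and the helper name `summarize` are illustrative assumptions, not values documented by the authors. Placing global attention on the first token follows the usual LED convention for summarization.

```python
MODEL_ID = "AlgorithmicResearchGroup/led_large_16384_arxiv_summarization"
MAX_INPUT_TOKENS = 16384  # LED's 16K-token input window


def summarize(text, max_new_tokens=256, num_beams=4):
    """Summarize one document. Generation settings are assumed, not official."""
    # Imported lazily so the heavy dependencies are only needed when summarizing.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate anything beyond the 16K-token window.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )

    # Usual LED convention: global attention on the first token only.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    with torch.no_grad():
        output_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_attention_mask,
            max_new_tokens=max_new_tokens,
            num_beams=num_beams,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text)` downloads the checkpoint on first use, so expect a large download and a noticeable load time.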
The model is based on AllenAI's Longformer Encoder-Decoder (LED).
As described in "Longformer: The Long-Document Transformer" by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base, since both models share the exact same architecture. To process 16K tokens, bart-base's position embedding matrix was simply copied 16 times. The led-large-16384 checkpoint used here was initialized from bart-large in the same way.
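The position-embedding copy described above can be sketched as a simple tiling operation. This toy version assumes a 1,024-row source table (BART's maximum input length) and uses a hidden size of 8 purely for illustration. The function name is hypothetical, and details of the real checkpoint conversion, such as position offsets, are ignored.

```python
import random

BART_POSITIONS = 1024  # assumed length of the source position table
HIDDEN_SIZE = 8        # toy width for illustration, not the real hidden size
COPIES = 16            # 16 x 1024 = 16384 positions


def extend_position_embeddings(base_table, copies):
    """Tile a position-embedding table `copies` times along the position axis."""
    return [list(row) for _ in range(copies) for row in base_table]


rng = random.Random(0)
bart_table = [
    [rng.gauss(0.0, 0.02) for _ in range(HIDDEN_SIZE)]
    for _ in range(BART_POSITIONS)
]
led_table = extend_position_embeddings(bart_table, COPIES)

print(len(led_table))  # 16384
```

After tiling, position 1024 in the extended table starts with the same vector as position 0 of the original, so each 1,024-token block begins from identical positional values.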
Evaluation results
- ROUGE-1 on ccdv/arxiv-summarization (test set, verified): 37.947
- ROUGE-2 on ccdv/arxiv-summarization (test set, verified): 11.314
- ROUGE-L on ccdv/arxiv-summarization (test set, verified): 20.556
- ROUGE-LSUM on ccdv/arxiv-summarization (test set, verified): 33.834
- loss on ccdv/arxiv-summarization (test set, verified): 2.806
- gen_len on ccdv/arxiv-summarization (test set, verified): 157.417
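For readers unfamiliar with the ROUGE scores listed above, the following is a simplified sketch of ROUGE-1 as a unigram-overlap F1 score. It is not the scorer used to produce the reported numbers: it assumes whitespace tokenization with lowercasing and no stemming, and the function name is hypothetical. The reported values appear to be on a 0 to 100 scale, whereas this sketch returns a value between 0 and 1.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1: F1 over overlapping unigrams (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "the model summarizes long papers",
    "the model summarizes papers",
)
print(round(score, 4))  # 0.8889
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces n-gram overlap with the longest common subsequence.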