AlgorithmicResearchGroup/led_base_16384_arxiv_summarization

text2text-generation

Introduction

A led-base-16384 model for summarizing arXiv papers. Inputs are paper abstracts and full documents; outputs are summaries of the papers.
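The following is a minimal usage sketch, not the authors' documented recipe. It assumes the checkpoint loads through the standard `transformers` auto classes under the repository name shown at the top of this card. Placing global attention on the first token, the beam size, and the length limits are illustrative assumptions; `first_token_global_mask` and `summarize` are hypothetical helper names.

```python
MODEL_ID = "AlgorithmicResearchGroup/led_base_16384_arxiv_summarization"


def first_token_global_mask(seq_len):
    """Return a mask with 1 (global attention) on the first token and 0 elsewhere."""
    return [1] + [0] * (seq_len - 1)


def summarize(text, max_input_tokens=16384, max_summary_tokens=256):
    """Summarize one document. The generation settings here are assumptions."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate anything longer than the model's input window.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # LED combines local windowed attention with a few globally attending tokens.
    seq_len = inputs["input_ids"].shape[1]
    global_mask = torch.tensor([first_token_global_mask(seq_len)])

    with torch.no_grad():
        summary_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_mask,
            max_length=max_summary_tokens,
            num_beams=4,
        )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text)` downloads the weights on first use and returns the generated summary as a string.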

The base model is AllenAI's Longformer Encoder-Decoder (LED).

As described in Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base, since both models share the same architecture. To process 16K tokens, bart-base's position embedding matrix was copied 16 times.
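The copying step can be illustrated with a toy sketch. It assumes bart-base's 1,024-position limit, so 16 copies give 16 × 1,024 = 16,384 positions. The tiny matrix and the `extend_position_embeddings` helper are hypothetical, and the sketch ignores implementation details of the real embedding layer such as padding offsets.

```python
BART_BASE_POSITIONS = 1024  # bart-base maximum input positions
COPIES = 16


def extend_position_embeddings(base, copies):
    """Tile a [positions x dim] matrix `copies` times along the position axis."""
    return [list(row) for _ in range(copies) for row in base]


# Toy stand-in for the learned matrix: 4 positions, 2 dimensions.
toy = [[float(p), float(p) + 0.5] for p in range(4)]
extended = extend_position_embeddings(toy, COPIES)

print(len(toy), "->", len(extended))
print(BART_BASE_POSITIONS * COPIES)
```

After tiling, position `i` in the extended matrix starts from the same vector as position `i % 4` in the toy matrix, which is the sense in which the matrix is "copied".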

ROUGE-2

Type	Score
`precision`	0.1839148953011932
`recall`	0.14904707945189774
`fmeasure`	0.1580026685776864
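For readers unfamiliar with the three fields, here is a simplified sketch of how ROUGE-2 precision, recall, and F-measure are computed for a single candidate/reference pair. It uses lowercase whitespace tokens and no stemming, so it is not the scoring implementation behind the numbers above, and the `rouge2` helper is hypothetical.

```python
from collections import Counter


def bigrams(tokens):
    """Count adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2(candidate, reference):
    """ROUGE-2 precision, recall, and F-measure for one text pair."""
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    # Clipped overlap: each bigram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    total = precision + recall
    fmeasure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}


scores = rouge2("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In this example 3 of the 5 bigrams on each side match, so all three scores are about 0.6. When scores are averaged over many documents, the averaged F-measure need not equal the harmonic mean of the averaged precision and recall.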

Safetensors

Model size: 162M params

Tensor type: F32
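As a rough back-of-the-envelope check, the parameter count and tensor type imply the approximate size of the weights. The sketch below assumes 4 bytes per F32 parameter and uses the rounded 162M figure, so the result is only an estimate; `weights_size_bytes` is a hypothetical helper.

```python
PARAMS = 162_000_000  # rounded parameter count from the model card
BYTES_PER_F32 = 4     # a 32-bit float occupies 4 bytes


def weights_size_bytes(n_params, bytes_per_param):
    """Approximate storage needed for the raw weights."""
    return n_params * bytes_per_param


size = weights_size_bytes(PARAMS, BYTES_PER_F32)
print(f"{size / 1e6:.0f} MB")  # prints "648 MB"
```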

Evaluation results

Test set of ccdv/arxiv-summarization (verified):

Metric	Value
ROUGE-1	37.325
ROUGE-2	10.895
ROUGE-L	20.387
ROUGE-LSUM	33.301
loss	3.182
gen_len	145.590
