LED-Based Summarization Model: Condensing Long and Technical Information

Open In Colab

The Longformer Encoder-Decoder (LED) for Narrative-Esque Long Text Summarization is a model I fine-tuned from allenai/led-base-16384 to condense extensive technical, academic, and narrative content in a fairly generalizable way.

Key Features and Use Cases

  • Ideal for summarizing long narratives, articles, papers, textbooks, and other documents.
    • The SparkNotes-esque style leads to 'explanations' in the summarized content, which makes the output more insightful.
  • High capacity: handles up to 16,384 tokens per batch.
  • Demos: try it out in the notebook linked above or in the demo on Spaces.

Note: The API widget has a max length of ~96 tokens due to inference timeout constraints.
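
Because the 16,384-token limit applies to tokens rather than characters or words, it can help to count tokens before sending a document to the model. The following is a minimal sketch, not part of this model or of any library: `count_tokens` and `fits_in_context` are hypothetical helper names, and the tokenizer is passed in as any callable that maps text to a sequence of tokens, so the example runs without downloading anything.

```python
from typing import Callable, Sequence

MAX_INPUT_TOKENS = 16_384  # input capacity stated for this model


def count_tokens(text: str, tokenize: Callable[[str], Sequence]) -> int:
    """Return the number of tokens that `tokenize` produces for `text`."""
    return len(tokenize(text))


def fits_in_context(
    text: str,
    tokenize: Callable[[str], Sequence],
    max_tokens: int = MAX_INPUT_TOKENS,
) -> bool:
    """Return True if `text` tokenizes to at most `max_tokens` tokens."""
    return count_tokens(text, tokenize) <= max_tokens


# Whitespace splitting is only a rough stand-in for the model's real tokenizer.
document = "word " * 20_000
print(fits_in_context("a short note", str.split))
print(fits_in_context(document, str.split))
```

With transformers installed, the model's own tokenizer can be plugged in instead of `str.split`, for example `tok = AutoTokenizer.from_pretrained("pszemraj/led-base-book-summary")` followed by `fits_in_context(text, lambda t: tok(t)["input_ids"])`.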

Training Details

The model was trained on the BookSum dataset released by Salesforce, which is why it carries the bsd-3-clause license. Training ran for 16 epochs, with parameters tuned for very gradual fine-tuning (a very low learning rate).

Model checkpoint: pszemraj/led-base-16384-finetuned-booksum.

Other Related Checkpoints

This model is the smallest/fastest booksum-tuned model I have worked on. If you're looking for higher quality summaries, check out:

There are also other variants trained on other datasets on my Hugging Face profile; feel free to try them out :)


Basic Usage

I recommend passing encoder_no_repeat_ngram_size=3 when calling the pipeline object. It stops the output from reproducing any 3-gram that appears in the input text, which encourages new vocabulary and a more abstractive summary.

Create the pipeline object:

import torch
from transformers import pipeline

hf_name = "pszemraj/led-base-book-summary"

summarizer = pipeline(
    "summarization",
    hf_name,
    device=0 if torch.cuda.is_available() else -1,
)

Feed the text into the pipeline object:

wall_of_text = "your words here"

result = summarizer(
    wall_of_text,
    min_length=8,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    do_sample=False,
    early_stopping=True,
)
# the summarization pipeline returns a list of dicts keyed by "summary_text"
print(result[0]["summary_text"])
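
To get a feel for what the encoder_no_repeat_ngram_size recommendation is aiming at, one can measure how many of a summary's n-grams are copied verbatim from the source. The sketch below is an illustration under simplifying assumptions: it works on lowercased whitespace-split words, whereas the model operates on subword token IDs, and `ngrams` and `copied_ngram_fraction` are hypothetical helper names rather than transformers functions.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a token list."""
    return {tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)}


def copied_ngram_fraction(source: str, summary: str, n: int = 3) -> float:
    """Fraction of the summary's distinct n-grams that also occur in the source."""
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary is shorter than n words
    source_ngrams = ngrams(source.lower().split(), n)
    return len(summary_ngrams & source_ngrams) / len(summary_ngrams)


source = "the quick brown fox jumps over the lazy dog"
extractive = "the quick brown fox jumps"
abstractive = "a fast fox leaps over a sleepy dog"

print(copied_ngram_fraction(source, extractive))   # 1.0
print(copied_ngram_fraction(source, abstractive))  # 0.0
```

A fraction near 1.0 indicates a largely extractive summary, while a fraction near 0.0 indicates that the wording is mostly new.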

Simplified Usage with TextSum

To streamline the process of using this and other models, I've developed a Python package utility named textsum. This package offers simple interfaces for applying summarization models to text documents of arbitrary length.

Install TextSum:

pip install textsum

Then use it in Python with this model:

from textsum.summarize import Summarizer

model_name = "pszemraj/led-base-book-summary"
summarizer = Summarizer(
    model_name_or_path=model_name,  # you can use any Seq2Seq model on the Hub
    token_batch_length=4096,  # how many tokens to batch summarize at a time
)
long_string = "This is a long string of text that will be summarized."
out_str = summarizer.summarize_string(long_string)
print(f"summary: {out_str}")
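
The idea behind token_batch_length can be sketched as splitting a long sequence of token IDs into fixed-size batches and summarizing each batch separately. The code below is only an illustration of that idea and is not textsum's actual implementation, which may differ (for example in how batch boundaries are handled); `batch_tokens`, `summarize_in_batches`, and the `summarize_batch` callback are hypothetical names introduced for this example.

```python
from typing import Callable, List, Sequence


def batch_tokens(token_ids: Sequence[int], batch_length: int) -> List[List[int]]:
    """Split token IDs into consecutive batches of at most `batch_length` tokens."""
    if batch_length <= 0:
        raise ValueError("batch_length must be positive")
    return [
        list(token_ids[start : start + batch_length])
        for start in range(0, len(token_ids), batch_length)
    ]


def summarize_in_batches(
    token_ids: Sequence[int],
    batch_length: int,
    summarize_batch: Callable[[List[int]], str],
) -> str:
    """Summarize each batch with `summarize_batch` and join the results with newlines."""
    batches = batch_tokens(token_ids, batch_length)
    return "\n".join(summarize_batch(batch) for batch in batches)


# Stand-in data: 10,000 fake token IDs and a dummy per-batch "summarizer".
token_ids = list(range(10_000))
print([len(batch) for batch in batch_tokens(token_ids, 4096)])  # [4096, 4096, 1808]
print(summarize_in_batches(token_ids, 4096, lambda b: f"summary of {len(b)} tokens"))
```

A smaller batch length lowers memory use per call but gives the model less context for each partial summary, so the value is a trade-off rather than a fixed rule.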

Currently implemented interfaces include a Python API, a Command-Line Interface (CLI), and a shareable demo/web UI.

For detailed explanations and documentation, check the README or the wiki.


Model size: 162M params (Safetensors, tensor type F32)