metadata
license: apache-2.0
datasets:
  - armanc/scientific_papers
language:
  - en
base_model:
  - google/flan-t5-small
tags:
  - summarization
  - research-papers
  - arxiv
  - t5

arxiv-summarization

This model is a fine-tuned version of google/flan-t5-small, trained on the arxiv subset of the armanc/scientific_papers dataset. It is optimized for summarizing scientific abstracts.

Model Details

  • Base Model: google/flan-t5-small
  • Training Data: arXiv research papers (article → abstract pairs)
  • Fine-Tuned Task: Text Summarization
  • Use Case: Generate short summaries of long research papers
  • License: Apache 2.0

How to Use

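from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input with the task instruction.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
# Truncate inputs that exceed the tokenizer's maximum length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# max_new_tokens=64 is an assumed value; without a limit, generate() uses a short default length.
summary_ids = model.generate(**inputs, max_new_tokens=64)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)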
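
The snippet above summarizes a single short input. For longer texts, one possible approach is to split the text into chunks, summarize each chunk, and join the results. The following is a minimal sketch under stated assumptions: the helper names (`chunk_words`, `build_prompts`, `summarize_long`) and the 350-word chunk size are hypothetical and not part of this model repository, and word-based chunking is only a rough stand-in for token-based limits.

```python
from typing import List

PREFIX = "Summarize: "  # same task prefix as in the usage example above


def chunk_words(text: str, max_words: int = 350) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompts(text: str, max_words: int = 350) -> List[str]:
    """Prefix every chunk with the task instruction."""
    return [PREFIX + chunk for chunk in chunk_words(text, max_words)]


def summarize_long(text: str, model, tokenizer, max_new_tokens: int = 64) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    prompts = build_prompts(text)
    if not prompts:
        return ""
    # Pad to a common length so all chunks can be generated in one batch.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return " ".join(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```

With `model` and `tokenizer` loaded as shown earlier, `summarize_long(paper_text, model, tokenizer)` would return the concatenated chunk summaries. Joining partial summaries is a simple design choice; the result may read less coherently than a summary produced from the full text at once.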

Training Details

  • Training Data: 100k+ arXiv research papers
  • Training Framework: Hugging Face Transformers
  • Hyperparameters:
    • Learning Rate: 5e-5
    • Batch Size: 8
    • Epochs: 10
  • Hardware Used: TPU & GPU
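
To give a rough sense of the scale implied by the hyperparameters above, the number of optimizer steps can be estimated from the dataset size, batch size, and epoch count. This is only a back-of-the-envelope sketch: it assumes exactly 100,000 training examples (the card states "100k+"), that the batch size of 8 is the global batch size, and that no gradient accumulation was used. The function name `total_training_steps` is illustrative.

```python
import math


def total_training_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Estimate optimizer steps, counting a final partial batch as one step."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Assumed dataset size of 100,000; the actual figure is only given as "100k+".
steps = total_training_steps(num_examples=100_000, batch_size=8, epochs=10)
print(steps)  # 125000
```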

Limitations

  • ❌ May struggle with very technical papers (e.g., complex math formulas).

Example Summaries

| Original Abstract | Generated Summary |
| --- | --- |
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |