arxiv-summarization

This model is a fine-tuned version of google/flan-t5-small, trained on the arxiv subset of the armanc/scientific_papers dataset. It is intended for summarizing scientific papers.

Model Details

  • Base Model: google/flan-t5-small
  • Training Data: Arxiv Research Papers (article → abstract)
  • Fine-Tuned Task: Text Summarization
  • Use Case: Generate shorter summaries of long research papers
  • License: Apache 2.0
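
The article → abstract pairing listed above can be sketched as a small preprocessing step. This is a minimal sketch, not the author's training code: the `article` and `abstract` field names follow the scientific_papers dataset, the "Summarize: " prefix is borrowed from the usage example (whether it was applied during training is an assumption), and `PREFIX` and `to_pair` are hypothetical names.

```python
PREFIX = "Summarize: "  # assumption: same prefix as in the usage example


def to_pair(example):
    """Map one dataset record to a (source, target) text pair.

    Whitespace is collapsed so that line breaks inside the paper body
    do not end up in the model input.
    """
    source = PREFIX + " ".join(example["article"].split())
    target = " ".join(example["abstract"].split())
    return {"source": source, "target": target}


# Hypothetical record, shaped like a scientific_papers row.
record = {
    "article": "we study  deep learning\nfor cancer detection .",
    "abstract": " a cnn is proposed . ",
}
pair = to_pair(record)
print(pair["source"])
print(pair["target"])
```

The resulting pairs would then be tokenized, with the source as the encoder input and the target as the labels.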

How to Use

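from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input with the "Summarize: " instruction.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
# Truncate inputs that exceed the tokenizer's maximum length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Set an explicit output budget so the summary is not cut short by the library default.
summary_ids = model.generate(**inputs, max_new_tokens=128)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)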
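
Full research papers are usually much longer than the single sentence used above, so the input may need to be split before summarization. The following is a minimal sketch of one possible approach, not a documented feature of this model: it splits the text into word-based chunks, summarizes each chunk, and joins the results. The helper names (`chunk_words`, `summarize_long`, `make_summarizer`) and the 300-word chunk size are assumptions chosen for illustration.

```python
def chunk_words(text, max_words=300):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text, summarize_fn, max_words=300):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    return " ".join(summarize_fn(chunk) for chunk in chunk_words(text, max_words))


def make_summarizer(model, tokenizer, max_new_tokens=64):
    """Wrap an already loaded model and tokenizer as a text -> summary function."""
    def summarize(chunk):
        inputs = tokenizer("Summarize: " + chunk, return_tensors="pt", truncation=True)
        ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(ids[0], skip_special_tokens=True)
    return summarize
```

With a loaded `model` and `tokenizer`, a call would look like `summarize_long(paper_text, make_summarizer(model, tokenizer))`, where `paper_text` is the full paper as a string. Chunk boundaries ignore sentence and section structure, so the joined output may read unevenly.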

Training Details

  • Training Data: 100k+ Arxiv research papers
  • Training Framework: Hugging Face Transformers
  • Hyperparameters:
    • Learning Rate: 5e-5
    • Batch Size: 8
    • Epochs: 10
  • Hardware Used: TPU & GPU
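
The hyperparameters above could be expressed with the Transformers training API roughly as follows. This is a sketch under stated assumptions rather than the author's actual configuration: it treats "Batch Size: 8" as a per-device batch size, assumes a single device with no gradient accumulation for the step estimate, and uses 100,000 examples as a lower bound for "100k+". The names `HYPERPARAMS`, `total_steps`, and `build_training_args` are hypothetical.

```python
import math

HYPERPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumption: "Batch Size: 8" is per device
    "num_train_epochs": 10,
}


def total_steps(num_examples, batch_size, epochs):
    """Optimizer steps on one device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def build_training_args(output_dir="arxiv-summarization"):
    """Build Seq2SeqTrainingArguments; transformers is imported only when called."""
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # generate summaries during evaluation
        **HYPERPARAMS,
    )


steps = total_steps(
    100_000,
    HYPERPARAMS["per_device_train_batch_size"],
    HYPERPARAMS["num_train_epochs"],
)
print(steps)  # 12,500 steps per epoch x 10 epochs = 125000
```

Under these assumptions the run would take at least 125,000 optimizer steps; multiple devices or gradient accumulation would lower that number.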

Limitations

  • โŒ May struggle with very technical papers (e.g., complex math formulas).

Example Summaries

| Original Abstract | Generated Summary |
|---|---|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |