---
license: apache-2.0
tags:
  - summarization
  - custom-model
  - pegasus
  - seq2seq
  - huggingface
  - transformers
library_name: transformers
inference: false
model-index:
  - name: Custom Pegasus Summarizer
    results: []
---

# 🦅 Custom Pegasus Summarizer

This model is a **custom-wrapped version** of [`google/pegasus-xsum`](https://huggingface.co/google/pegasus-xsum) built for **summarization tasks**. It's implemented using Hugging Face's `transformers` library and wrapped with a custom model class for educational and experimental flexibility.

✅ It supports:
- Easy fine-tuning and extension (e.g., adapters, prompt tuning)
- Drop-in replacement for the original model
- Hugging Face Hub compatibility
- Loading through `AutoTokenizer` and `CustomSeq2SeqModel`

---

## 🧠 Model Architecture

- **Base**: google/pegasus-xsum  
- **Wrapper**: `CustomSeq2SeqModel` (inherits from `PreTrainedModel`)  
- **Tokenizer**: `AutoTokenizer` from the same repo  
- **Configuration**: `CustomSeq2SeqConfig` (inherits from `PretrainedConfig`)
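
A wrapper with this layout can be sketched as follows. This is a minimal illustration, not the contents of this repo's `model.py`: the `base_config` field, the `model_type` string, and the choice to delegate `forward` and `generate` to an inner `PegasusForConditionalGeneration` are all assumptions made for the example, and details may need adjusting for your `transformers` version.

```python
from transformers import (
    PegasusConfig,
    PegasusForConditionalGeneration,
    PretrainedConfig,
    PreTrainedModel,
)


class CustomSeq2SeqConfig(PretrainedConfig):
    # Hypothetical identifier; the real config may use a different one.
    model_type = "custom_seq2seq"

    def __init__(self, base_config=None, **kwargs):
        super().__init__(**kwargs)
        # Keyword arguments for PegasusConfig, kept as a plain dict so the
        # config stays JSON-serializable. (Assumed field name.)
        self.base_config = base_config or {}


class CustomSeq2SeqModel(PreTrainedModel):
    config_class = CustomSeq2SeqConfig

    def __init__(self, config):
        super().__init__(config)
        # Build the wrapped Pegasus model from the stored settings.
        self.model = PegasusForConditionalGeneration(
            PegasusConfig(**config.base_config)
        )

    def forward(self, input_ids=None, attention_mask=None, labels=None, **kwargs):
        # Delegate to the inner model; passing labels makes it return a loss.
        return self.model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            labels=labels,
            **kwargs,
        )

    def generate(self, *args, **kwargs):
        # Delegate text generation to the inner model as well.
        return self.model.generate(*args, **kwargs)
```

Keeping the Pegasus model as a submodule leaves room to add adapters or extra heads around it without touching the base implementation.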

---

## 🧪 Training Details

- **Dataset**: xsum (500-sample subset)  
- **Task**: Abstractive Summarization  
- **Epochs**: 1  
- **Batch Size**: 4  
- **Learning Rate**: 2e-5  
- **Training Framework**: Hugging Face Trainer  
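
To give a sense of scale, the settings above imply a very short run. The sketch below works out the number of optimizer steps, assuming the batch size is per device, training ran on a single device, and no gradient accumulation was used (none of which is stated above). The dictionary keys mirror `TrainingArguments` parameter names.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments names.
hparams = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 4,  # assumes "Batch Size" is per device
    "learning_rate": 2e-5,
}
num_samples = 500  # size of the xsum subset

# Assumes one device and no gradient accumulation.
steps_per_epoch = math.ceil(num_samples / hparams["per_device_train_batch_size"])
total_steps = steps_per_epoch * hparams["num_train_epochs"]
print(total_steps)  # 125

# The same dict could be passed on to the trainer configuration, e.g.
# Seq2SeqTrainingArguments(output_dir="out", **hparams), where "out" is a
# placeholder directory name.
```

With only 125 updates at a learning rate of 2e-5, the weights are likely to stay close to the base checkpoint, which fits the educational purpose of this model.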

---

## 💡 Usage Example

```python
from transformers import AutoTokenizer
from model import CustomSeq2SeqModel  # custom wrapper; model.py must be importable

tokenizer = AutoTokenizer.from_pretrained("your-username/custom-pegasus-summarizer")
model = CustomSeq2SeqModel.from_pretrained("your-username/custom-pegasus-summarizer")

# Pegasus takes the raw document; no T5-style "summarize: " prefix is needed
text = "The Apollo program was a major milestone in space exploration..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
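
For several documents at once, the same calls can be wrapped in a small helper. This is a sketch: `summarize_batch` and its default values are hypothetical, and it assumes the wrapper's `generate` forwards standard generation arguments such as `num_beams` to the underlying Pegasus model.

```python
def summarize_batch(texts, tokenizer, model,
                    max_input_length=512, max_summary_length=60, num_beams=4):
    """Summarize a list of documents in one padded batch."""
    # Pad to the longest document and truncate anything over the input limit.
    inputs = tokenizer(
        texts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_input_length,
    )
    # Beam search; assumes generate() accepts the usual generation kwargs.
    summary_ids = model.generate(
        **inputs,
        max_length=max_summary_length,
        num_beams=num_beams,
    )
    # One decoded string per input document.
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)
```

Called as `summarize_batch(["first article ...", "second article ..."], tokenizer, model)`, it returns one summary string per input.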

---

## 🎛 Live Demos

You can try this model interactively on Hugging Face Spaces:

- Gradio App: https://huggingface.co/spaces/your-username/custom-pegasus-gradio  
- Streamlit App: https://huggingface.co/spaces/your-username/custom-pegasus-streamlit  

---

## 📦 Files Included

- `config.json` – Model configuration (used by `from_pretrained`)  
- `pytorch_model.bin` – Fine-tuned model weights  
- `tokenizer_config.json` – Tokenizer settings  
- `vocab.json` / `merges.txt` – Tokenizer vocab (depends on tokenizer type; Pegasus tokenizers typically ship a SentencePiece `spiece.model` instead)  
- `special_tokens_map.json` – Special tokens for summarization  
- `README.md` – This model card  
- `model.py` – (if included) Your `CustomSeq2SeqModel` class  

---

## 📜 License

Apache 2.0, the same license as the original `pegasus-xsum`.

---

## 🙏 Acknowledgments

- Hugging Face for `transformers`, `datasets`, and the Hub  
- Authors of PEGASUS  
- Educational/Research communities building custom NLP models