---
license: apache-2.0
tags:
- summarization
- custom-model
- pegasus
- seq2seq
- huggingface
- transformers
library_name: transformers
inference: false
model-index:
- name: Custom Pegasus Summarizer
  results: []
---

# 🦅 Custom Pegasus Summarizer

This model is a **custom-wrapped version** of [`google/pegasus-xsum`](https://huggingface.co/google/pegasus-xsum) built for **summarization tasks**. It is implemented with Hugging Face's `transformers` library and wrapped in a custom model class for educational and experimental use.

✅ It supports:

- Easy fine-tuning and extension (e.g., adapters, prompt tuning)
- Drop-in replacement for the original model
- Hugging Face Hub compatibility
- Loading through `AutoTokenizer` and `CustomSeq2SeqModel`

---

## 🧠 Model Architecture

- **Base**: google/pegasus-xsum
- **Wrapper**: CustomSeq2SeqModel (inherits from PreTrainedModel)
- **Tokenizer**: AutoTokenizer from the same repo
- **Configuration**: CustomSeq2SeqConfig (inherits from PretrainedConfig)

---

## 🧪 Training Details

- **Dataset**: xsum (500-sample subset)
- **Task**: Abstractive Summarization
- **Epochs**: 1
- **Batch Size**: 4
- **Learning Rate**: 2e-5
- **Training Framework**: Hugging Face Trainer

---

## 💡 Usage Example

```python
from transformers import AutoTokenizer
from model import CustomSeq2SeqModel  # Your custom wrapper; model.py must be importable locally

repo_id = "your-username/custom-pegasus-summarizer"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = CustomSeq2SeqModel.from_pretrained(repo_id)

# PEGASUS does not use a task prefix such as "summarize: "; pass the raw text.
text = "The Apollo program was a major milestone in space exploration..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Generate a summary of at most 60 tokens and decode it back to text.
summary_ids = model.generate(**inputs, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

---

## 🎛 Live Demos

You can try this model interactively on Hugging Face Spaces:

- Gradio App: https://huggingface.co/spaces/your-username/custom-pegasus-gradio
- Streamlit App: https://huggingface.co/spaces/your-username/custom-pegasus-streamlit

---

## 📦 Files Included

- `config.json` – Model configuration (used by `from_pretrained`)
- `pytorch_model.bin` – Fine-tuned model weights
- `tokenizer_config.json` – Tokenizer settings
- `vocab.json` / `merges.txt` – Tokenizer vocab (depends on tokenizer type; the Pegasus tokenizer is SentencePiece-based and normally ships `spiece.model` instead)
- `special_tokens_map.json` – Mapping of the tokenizer's special tokens
- `README.md` – This model card
- `model.py` – The `CustomSeq2SeqModel` class (if included)

---

## 📜 License

Apache 2.0, the same license as the original `pegasus-xsum`.

---

## 🙏 Acknowledgments

- Hugging Face for `transformers`, `datasets`, and the Hub
- Authors of PEGASUS
- Educational/Research communities building custom NLP models
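
---

## 🧮 Training Step Count (Sketch)

As a rough sanity check on the numbers listed under Training Details, the sketch below estimates how many optimizer updates the run performs. The `TrainingSetup` class and its `optimizer_steps` method are hypothetical helpers written for this illustration, not part of the model repository. The calculation assumes that all 500 samples are used for training, that the batch size of 4 is per device on a single device, and that no gradient accumulation is used. None of these is stated in the card.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters copied from the Training Details section (hypothetical helper)."""

    num_samples: int = 500        # xsum subset size
    epochs: int = 1
    batch_size: int = 4
    learning_rate: float = 2e-5

    def optimizer_steps(self, num_devices: int = 1, grad_accum: int = 1) -> int:
        """Approximate number of optimizer updates.

        num_devices and grad_accum are assumptions; the card does not state them.
        """
        effective_batch = self.batch_size * num_devices * grad_accum
        return math.ceil(self.num_samples / effective_batch) * self.epochs


setup = TrainingSetup()
print(setup.optimizer_steps())  # 125 updates under the stated assumptions
```

Under these assumptions the model sees each sample once and receives about 125 updates at a learning rate of 2e-5, so the fine-tuned weights are likely to stay close to the base `google/pegasus-xsum` checkpoint.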