---
language: en
tags:
- summarization
- transformers
- t5
- youtube
license: apache-2.0
datasets:
- custom
model-index:
- name: T5 YouTube Summarizer
  results: []
---

# 📺 T5 YouTube Summarizer

This is a fine-tuned [`t5-base`](https://huggingface.co/t5-base) model for abstractive summarization of YouTube video transcripts. The model is trained on a custom dataset of video transcriptions and their manually written summaries.

---

## ✨ Model Details

- **Base Model**: [`t5-base`](https://huggingface.co/t5-base)
- **Task**: Abstractive Summarization
- **Training Data**: YouTube video transcripts and human-written summaries
- **Max Input Length**: 512 tokens
- **Max Output Length**: 256 tokens
- **Fine-tuning Epochs**: 10
- **Tokenizer**: `T5Tokenizer` (pretrained)

---

## 🧠 Intended Use

This model is designed to generate short, informative summaries from long transcripts of educational or conceptual YouTube videos. It can be used for:

- Quick understanding of long videos
- Automated content summaries for blogs, platforms, or note-taking tools
- Enhancing accessibility for long-form spoken content

---

## 🚀 How to Use

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the model
model = T5ForConditionalGeneration.from_pretrained("your-username/t5-youtube-summarizer")
tokenizer = T5Tokenizer.from_pretrained("your-username/t5-youtube-summarizer")

# Define input text
text = "The video talks about coordinate covalent bonds, giving examples from..."

# Preprocess and summarize
inputs = tokenizer.encode(
    "summarize: " + text,
    return_tensors="pt",
    max_length=512,
    truncation=True,
)
summary_ids = model.generate(
    inputs,
    max_length=256,
    min_length=80,
    num_beams=5,
    length_penalty=2.0,
    no_repeat_ngram_size=3,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```

## 📊 Evaluation

| Metric  | Value        |
| ------- | ------------ |
| ROUGE-1 | \~0.60       |
| ROUGE-2 | \~0.25       |
| ROUGE-L | \~0.47       |
| Gen Len | \~187 tokens |

## 📌 Citation

If you use this model in your work, consider citing:

```
@misc{t5ytsummarizer2025,
  title={T5 YouTube Transcript Summarizer},
  author={Muhammad Bilal Yousaf},
  year={2025},
  howpublished={\url{https://huggingface.co/bilal521/t5-youtube-summarizer}},
}
```
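
## 🧩 Handling Long Transcripts

The 512-token input limit listed under Model Details means the usage snippet truncates anything longer, so most of a long transcript would be ignored. One common workaround, which is an assumption here and not something this model card describes, is to split the transcript into overlapping token windows, summarize each window, and join the partial summaries. The sketch below illustrates the idea. `split_into_chunks` and `summarize_long` are hypothetical helper names, the window size of 500 and overlap of 50 are illustrative values, and the generation settings are copied from the usage example.

```python
from typing import List


def split_into_chunks(token_ids: List[int], max_len: int = 500, overlap: int = 50) -> List[List[int]]:
    """Split a token-id sequence into overlapping windows of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the sequence
    return chunks


def summarize_long(text, tokenizer, model, max_len: int = 500, overlap: int = 50) -> str:
    """Summarize each window separately and join the results (illustrative only).

    `tokenizer` and `model` are expected to be the objects loaded in the
    usage example. max_len stays below 512 to leave room for the
    "summarize: " prefix and the end-of-sequence token.
    """
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    partial_summaries = []
    for chunk in split_into_chunks(token_ids, max_len, overlap):
        chunk_text = tokenizer.decode(chunk, skip_special_tokens=True)
        inputs = tokenizer.encode(
            "summarize: " + chunk_text,
            return_tensors="pt",
            max_length=512,
            truncation=True,  # safety net if re-encoding grows the chunk
        )
        summary_ids = model.generate(
            inputs,
            max_length=256,
            min_length=80,
            num_beams=5,
            length_penalty=2.0,
            no_repeat_ngram_size=3,
            early_stopping=True,
        )
        partial_summaries.append(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
    # Joining with a space is the simplest choice; a second summarization
    # pass over the joined text is another option.
    return " ".join(partial_summaries)
```

With `model` and `tokenizer` loaded as in How to Use, `summarize_long(transcript, tokenizer, model)` would return the concatenated per-window summaries. The overlap keeps sentences that straddle a window boundary visible to both neighbouring windows, at the cost of some repeated content in the output.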