---
language:
- en
license: apache-2.0
datasets:
- xsum
metrics:
- rouge
model-index:
- name: t5-base-finetuned-xsum
  results:
  - task:
      name: Text Summarization
      type: text-summarization
    dataset:
      name: Xsum
      type: xsum
      args: xsum
    metrics:
    - name: rouge
      type: rouge
      value: 0.3414
---

# t5-base-finetuned-xsum

This model is t5-base fine-tuned on the XSum dataset for text summarization.

## Model Details

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, each of which is converted into a text-to-text format.

## Training Procedure

To fine-tune T5 for summarization, I prepended the "summarize: " prefix to every document and passed the encoding of the prefixed document to the model as the input ids and attention mask. For the labels, I used the encodings of the reference summaries as the decoder input ids and decoder attention mask.

## Usage

To generate a summary for an example document:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Replace with the actual repository id of this model on the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("t5-base-finetuned-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base-finetuned-xsum")

# T5 expects the same task prefix that was used during training.
documents = ["summarize: " + doc for doc in documents]

tokenised_dataset = tokenizer(documents, truncation=True, padding='max_length',
                              max_length=1024, return_tensors='pt')
source_ids = tokenised_dataset['input_ids']
source_mask = tokenised_dataset['attention_mask']
output = model.generate(input_ids=source_ids, attention_mask=source_mask, max_length=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Experiments

We report ROUGE-1, ROUGE-2 and ROUGE-L on the test set.

### XSum

| ROUGE-1 | ROUGE-2 | ROUGE-L |
|---------|---------|---------|
| 0.3414  | 0.1260  | 0.2832  |
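For reference, the input/label construction described in the Training Procedure section can be sketched as follows. This is a minimal, self-contained illustration with a toy whitespace tokenizer standing in for the real T5 tokenizer; the stub `encode` helper and the `-100` loss-masking convention for padded label positions are assumptions following common Hugging Face practice, not the exact training script.

```python
# Toy illustration of how (document, summary) pairs become model inputs.
# A real run would use transformers' T5 tokenizer; a stub vocabulary
# stands in here so the pairing logic is visible without any downloads.

PAD = 0

def encode(text, vocab, max_length):
    """Map whitespace tokens to ids, truncate, and pad to max_length."""
    ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in text.split()][:max_length]
    mask = [1] * len(ids) + [0] * (max_length - len(ids))
    ids = ids + [PAD] * (max_length - len(ids))
    return ids, mask

def make_example(document, summary, vocab, src_len=16, tgt_len=8):
    # The "summarize: " prefix tells T5 which task to perform.
    input_ids, attention_mask = encode("summarize: " + document, vocab, src_len)
    # The encoded reference summary becomes the labels.
    labels, _ = encode(summary, vocab, tgt_len)
    # Padded label positions are set to -100 so the loss ignores them.
    labels = [t if t != PAD else -100 for t in labels]
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}

vocab = {}
ex = make_example("The cat sat on the mat .", "Cat sits .", vocab)
print(len(ex["input_ids"]), len(ex["labels"]))  # 16 8
```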