Commit f0ad688 (parent 7453612) by PavanNeerudu: Create README.md
---
language:
- en
license: apache-2.0
datasets:
- xsum
metrics:
- rouge
model-index:
- name: t5-base-finetuned-xsum
  results:
  - task:
      name: Text Summarization
      type: text-summarization
    dataset:
      name: Xsum
      type: xsum
      args: xsum
    metrics:
    - name: rouge
      type: rouge
      value: 0.3414
---

# t5-base-finetuned-xsum

This model is t5-base fine-tuned on the XSum dataset for text summarization.

## Model Details

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, each of which is converted into a text-to-text format.

## Training Procedure

To train the T5 model for text summarization, I prepended the "summarize" prefix to every document and used the encoding of the prefixed document as the input ids and attention mask. For the labels, I used the encoding of the summaries as the decoder input ids and decoder attention mask.

## Usage

To generate a summary for an example document, use:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("PavanNeerudu/t5-base-finetuned-xsum")
model = T5ForConditionalGeneration.from_pretrained("PavanNeerudu/t5-base-finetuned-xsum")

# `documents` is a list of articles to summarize; prepend the summarization task prefix
documents = ["summarize: " + doc for doc in documents]

tokenised_dataset = tokenizer(documents, truncation=True, padding='max_length',
                              max_length=1024, return_tensors='pt')
source_ids = tokenised_dataset['input_ids']
source_mask = tokenised_dataset['attention_mask']

output = model.generate(input_ids=source_ids, attention_mask=source_mask, max_length=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Experiments

We report ROUGE-1, ROUGE-2 and ROUGE-L on the XSum test set.

### XSum

| ROUGE-1 | ROUGE-2 | ROUGE-L |
|---------|---------|---------|
| 0.3414  | 0.1260  | 0.2832  |
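For intuition, ROUGE-1 F1 can be illustrated with a simplified unigram-overlap computation. This is a toy sketch only (no stemming or tokenization rules); the numbers reported above come from a full ROUGE implementation:

```python
from collections import Counter

def rouge1_f1(reference: str, prediction: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two strings."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    # Overlap counts each shared unigram at most min(ref, pred) times
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```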