---
library_name: transformers
license: mit
model_name: MBart-Urdu-Text-Summarization
pipeline_tag: summarization
tags:
- text-generation
- mbart
- nlp
- transformers
- text-generation-inference
author: Wali Muhammad Ahmad
private: false
gated: false
inference: true
mask_token: <mask>
widget:
- text: Enter your Urdu paragraph here
transformers_info:
auto_class: MBartForConditionalGeneration
processor: AutoTokenizer
language:
- en
- ur
---
# Model Card
MBart-Urdu-Text-Summarization is a fine-tuned MBart model designed for summarizing Urdu text. It leverages the multilingual capabilities of MBart to generate concise and accurate summaries for Urdu paragraphs.
## Model Details
### Model Description
This model is based on the MBart architecture, which is a sequence-to-sequence model pre-trained on multilingual data. It has been fine-tuned specifically for Urdu text summarization tasks. The model is capable of understanding and generating text in both English and Urdu, making it suitable for multilingual applications.
### Model Sources
- **Repository:** [UrduTextSummarizationUsingm-BART](https://github.com/WaliMuhammadAhmad/UrduTextSummarizationUsingm-BART)
- **Paper:** [Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210)
## Uses
### Direct Use
This model can be used directly for Urdu text summarization tasks. It is suitable for applications such as news summarization, document summarization, and content generation.
### Downstream Use
The model can be fine-tuned for specific downstream tasks such as sentiment analysis, question answering, or machine translation for Urdu and English.
### Out-of-Scope Use
This model is not intended for generating biased, harmful, or misleading content. It should not be used for tasks outside of text summarization without proper fine-tuning and evaluation.
## Bias, Risks, and Limitations
- The model may generate biased or inappropriate content if the input text contains biases.
- It is trained on a specific dataset and may not generalize well to other domains or languages.
- The model's performance may degrade for very long input texts.
### Recommendations
Users should carefully evaluate the model's outputs for biases and appropriateness. Fine-tuning on domain-specific data is recommended for better performance in specialized applications.
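Because summary quality can degrade on very long inputs, one common workaround is to split a document into pieces that fit the model's context window, summarize each piece, and then combine (or re-summarize) the partial summaries. The helper below is a minimal sketch of that idea; the word-based chunk size is an illustrative stand-in for the model's real token limit, not part of this model's documented API.

```python
def chunk_text(text: str, max_words: int = 400) -> list[str]:
    """Split text into word-based chunks as a rough proxy for the
    model's token limit (illustrative; real limits are token-based)."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Each chunk can then be passed to the summarizer independently, and
# the partial summaries concatenated or summarized a second time.
```

A tokenizer-based split (counting actual tokens rather than words) would track the model's limit more precisely, at the cost of an extra tokenization pass per chunk.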
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, MBartForConditionalGeneration

# Load the fine-tuned model and its tokenizer
model_name = "ihatenlp/MBart-Urdu-Text-Summarization"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

# Example input text
input_text = "Enter your Urdu paragraph here."

# Tokenize, truncating inputs that exceed the model's context window
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=1024)

# Generate a summary with beam search (passing the attention mask along
# with the input IDs so padding, if any, is handled correctly)
summary_ids = model.generate(
    **inputs,
    max_length=50,
    num_beams=4,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("Summary:", summary)
```
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
## Citation
**BibTeX:**
```bibtex
@misc{liu2020multilingualdenoisingpretrainingneural,
title={Multilingual Denoising Pre-training for Neural Machine Translation},
author={Yinhan Liu and Jiatao Gu and Naman Goyal and Xian Li and Sergey Edunov and Marjan Ghazvininejad and Mike Lewis and Luke Zettlemoyer},
year={2020},
eprint={2001.08210},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2001.08210},
}
```
## Model Card Authors
- **Wali Muhammad Ahmad**
- **Muhammad Labeeb Tariq**
## Model Card Contact
- **Email:** [[email protected]]
- **Hugging Face Profile:** [Wali Muhammad Ahmad](https://huggingface.co/ihatenlp)