---
library_name: transformers
tags: []
widget:
- text: 'Please correct the following sentence: ndaids kurnda kumba kwaco'
  example_title: Spelling Correction
---
# Model Card for T5-Shona-SC
<!-- Provide a quick summary of what the model is/does. -->
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/flan2_architecture.jpg"
alt="drawing" width="600"/>
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** [Thabolezwe Mabandla](http://www.linkedin.com/in/thabolezwe-mabandla-81a62a22b)
- **Model type:** Language Model
- **Language(s) (NLP):** Shona
- **Finetuned from model:** [FLAN-T5](https://huggingface.co/google/flan-t5-small)
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
> Correction of spelling errors in Shona sentences or phrases.
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
> Spelling correction
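
The widget example and the snippets in this card all send the sentence to the model behind the same instruction prefix. A minimal sketch of a helper that builds such a prompt is shown here; the names `PROMPT_PREFIX` and `build_prompt` are hypothetical, and the whitespace normalisation is an assumption rather than something the model is documented to require.

```python
# Instruction prefix copied from the examples in this card.
PROMPT_PREFIX = "Please correct the following sentence: "


def build_prompt(sentence: str) -> str:
    """Return the prompt string for one sentence or phrase.

    Collapsing repeated whitespace is an assumption made for tidiness;
    it is not a documented requirement of the model.
    """
    cleaned = " ".join(sentence.split())
    if not cleaned:
        raise ValueError("sentence must contain at least one non-space character")
    return PROMPT_PREFIX + cleaned


if __name__ == "__main__":
    print(build_prompt("  ndaids   kurnda kumba kwaco "))
```

The resulting string can be passed to the tokenizer in place of a hand-written `input_text`.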
## Bias, Risks, and Limitations
The information in this section is copied from the [official model card](https://arxiv.org/pdf/2210.11416.pdf) of the base model, FLAN-T5:
> Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application.
### Ethical considerations and risks
> Flan-T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.
## How to Get Started with the Model
Use the code below to get started with the model.
### Running the model on a CPU
<details>
<summary> Click to expand </summary>
```python
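from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector")
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector")

# Prompt format used in this card: instruction prefix followed by the sentence
input_text = "Please correct the following sentence: ndaids kurnda kumba kwaco"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate the correction and drop special tokens such as <pad> and </s>
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))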
```
</details>
### Running the model on a GPU
<details>
<summary> Click to expand </summary>
```python
# pip install accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

# device_map="auto" lets accelerate place the model on the available GPU
tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector")
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector", device_map="auto")

# The input tensor must live on the same device as the model
input_text = "Please correct the following sentence: ndaids kurnda kumba kwaco"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate the correction and drop special tokens such as <pad> and </s>
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
</details>
### Running the model on a GPU using different precisions
#### FP16
<details>
<summary> Click to expand </summary>
```python
# pip install accelerate
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the weights in half precision to reduce GPU memory use
tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector")
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector", device_map="auto", torch_dtype=torch.float16)

# The input tensor must live on the same device as the model
input_text = "Please correct the following sentence: ndaids kurnda kumba kwaco"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate the correction and drop special tokens such as <pad> and </s>
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
</details>
#### INT8
<details>
<summary> Click to expand </summary>
```python
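# pip install bitsandbytes accelerate
from transformers import BitsAndBytesConfig, T5Tokenizer, T5ForConditionalGeneration

# Request 8-bit weights through a quantization config
# (passing load_in_8bit=True directly to from_pretrained is deprecated)
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector")
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector", device_map="auto", quantization_config=quantization_config)

# The input tensor must live on the same device as the model
input_text = "Please correct the following sentence: ndaids kurnda kumba kwaco"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate the correction and drop special tokens such as <pad> and </s>
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))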
```
</details>
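
### Correcting several sentences at once
The snippets above correct one sentence per call. A minimal sketch of batched inference follows, assuming `model` and `tokenizer` have already been loaded as in any of the snippets above and that every prompt already carries the instruction prefix. The helper names `chunks` and `correct_batch`, the batch size, and the `max_new_tokens` value are hypothetical choices, not settings recommended by the model author.

```python
from typing import Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def correct_batch(prompts, model, tokenizer, batch_size=8, max_new_tokens=64) -> List[str]:
    """Run generation over `prompts` in batches and return the decoded strings.

    `prompts` are strings that already include the instruction prefix.
    `batch_size` and `max_new_tokens` are illustrative defaults (assumptions).
    """
    results: List[str] = []
    for batch in chunks(list(prompts), batch_size):
        # Pad to the longest prompt in the batch so the tensors are rectangular.
        inputs = tokenizer(list(batch), return_tensors="pt", padding=True, truncation=True)
        # Move the inputs to wherever the model weights live (CPU or GPU).
        inputs = inputs.to(model.device)
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # skip_special_tokens removes markers such as <pad> and </s>.
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

A call such as `correct_batch(["Please correct the following sentence: ndaids kurnda kumba kwaco"], model, tokenizer)` returns one decoded string per prompt, in input order.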
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Metrics
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
<img src=""
alt="metrics" width="600"/>
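
This card reports its metrics only as a figure. As an illustration only, and not necessarily the metrics used for this model, spelling correctors are often scored with exact-match accuracy and character error rate (CER). The sketch below assumes paired lists of model outputs and reference sentences; all function names are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Total character edits divided by total reference characters."""
    total_edits = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions identical to their reference."""
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)
```

Lower CER is better, while exact match rewards only fully corrected sentences, so the two are usually read together.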
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** 2 x T4 GPU
- **Hours used:** 8
- **Cloud Provider:** Kaggle
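
No emissions figure is reported here. A rough estimate in the spirit of such calculators multiplies the energy drawn by the hardware by the carbon intensity of the local grid. The sketch below uses the hardware type and hours listed above; the 70 W per-GPU power figure (NVIDIA's published board power for the T4) and the carbon-intensity value are assumptions that do not come from this card, and the function names are hypothetical.

```python
def energy_kwh(gpu_power_watts: float, num_gpus: int, hours: float) -> float:
    """Upper-bound energy use, assuming every GPU runs at full power throughout."""
    return gpu_power_watts * num_gpus * hours / 1000.0


def emissions_kg_co2eq(energy: float, carbon_intensity_kg_per_kwh: float) -> float:
    """Convert energy (kWh) to emissions using a grid carbon intensity."""
    return energy * carbon_intensity_kg_per_kwh


if __name__ == "__main__":
    T4_POWER_WATTS = 70.0       # assumption: T4 board power, not stated in the card
    CARBON_INTENSITY = 0.4      # assumption: placeholder kg CO2eq per kWh
    energy = energy_kwh(T4_POWER_WATTS, num_gpus=2, hours=8)
    print(f"Energy: {energy:.2f} kWh")
    print(f"Emissions: {emissions_kg_co2eq(energy, CARBON_INTENSITY):.3f} kg CO2eq")
```

With these inputs the energy term is 2 × 70 W × 8 h = 1.12 kWh; the emissions figure scales linearly with whatever carbon intensity applies to the data centre actually used.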
## Model Card Authors
Thabolezwe Mabandla
## Model Card Contact
[email protected]