---
library_name: transformers
tags: []
widget:
- text: 'Please correct the following sentence: ukuti yiles sivnmelwano'
  example_title: Spelling Correction
---

# Model Card for T5-Ndebele-SC

<!-- Provide a quick summary of what the model is/does. -->
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/flan2_architecture.jpg"
alt="drawing" width="600"/>


## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [Thabolezwe Mabandla](http://www.linkedin.com/in/thabolezwe-mabandla-81a62a22b)
- **Model type:** Language Model
- **Language(s) (NLP):** Ndebele
- **Finetuned from model:** [FLAN-T5 Small](https://huggingface.co/google/flan-t5-small)

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper:** [More Information Needed]
- **Demo:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
> Correction of spelling errors in Ndebele sentences or phrases.

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
> Spelling correction
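
The widget example in this card's metadata sends the text to be corrected behind an instruction prefix. The sketch below builds inputs in that shape. It assumes that this prefix is the format the model expects; the helper names `build_prompt` and `build_prompts` and the whitespace normalisation are illustrative and not part of the model's API.

```python
from typing import Iterable, List

# Prefix taken from the widget example in this card.
PROMPT_PREFIX = "Please correct the following sentence: "


def build_prompt(text: str) -> str:
    """Return the model input string for one sentence or phrase."""
    # Assumption: collapsing runs of whitespace is harmless for this task.
    cleaned = " ".join(text.split())
    if not cleaned:
        raise ValueError("text must contain at least one non-space character")
    return PROMPT_PREFIX + cleaned


def build_prompts(texts: Iterable[str]) -> List[str]:
    """Build one prompt per input text, preserving order."""
    return [build_prompt(t) for t in texts]


print(build_prompt("ukuti yiles sivnmelwano"))
# prints: Please correct the following sentence: ukuti yiles sivnmelwano
```

The resulting strings can be passed to the tokenizer in place of a hand-written `input_text`.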

## Bias, Risks, and Limitations

The information in this section is copied from the [official model card](https://arxiv.org/pdf/2210.11416.pdf) of the base model, FLAN-T5:

> Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application.

### Ethical considerations and risks

> Flan-T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.

## How to Get Started with the Model

Use the code below to get started with the model.

### Running the model on a CPU

<details>
<summary> Click to expand </summary>

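```python
# pip install transformers sentencepiece
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub.
tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector-ndebele")
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector-ndebele")

# The instruction prefix is followed by the text to correct.
input_text = "Please correct the following sentence: ukuti yiles sivnmelwano"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids)
# skip_special_tokens drops markers such as <pad> and </s> from the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```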

</details>

### Running the model on a GPU

<details>
<summary> Click to expand </summary>

```python
# pip install accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector-ndebele")
# device_map="auto" lets accelerate place the weights on the available GPU.
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector-ndebele", device_map="auto")

input_text = "Please correct the following sentence: ukuti yiles sivnmelwano"
# The input tensor must live on the same device as the model.
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
# skip_special_tokens drops markers such as <pad> and </s> from the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

</details>

### Running the model on a GPU using different precisions

#### FP16

<details>
<summary> Click to expand </summary>

```python
# pip install accelerate
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector-ndebele")
# torch.float16 loads the weights in half precision to reduce GPU memory use.
model = T5ForConditionalGeneration.from_pretrained("thaboe01/t5-spelling-corrector-ndebele", device_map="auto", torch_dtype=torch.float16)

input_text = "Please correct the following sentence: ukuti yiles sivnmelwano"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
# skip_special_tokens drops markers such as <pad> and </s> from the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

</details>

#### INT8

<details>
<summary> Click to expand </summary>

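```python
# pip install bitsandbytes accelerate
from transformers import BitsAndBytesConfig, T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("thaboe01/t5-spelling-corrector-ndebele")
# Recent transformers releases take 8-bit settings through BitsAndBytesConfig
# rather than a bare load_in_8bit=True argument.
model = T5ForConditionalGeneration.from_pretrained(
    "thaboe01/t5-spelling-corrector-ndebele",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

input_text = "Please correct the following sentence: ukuti yiles sivnmelwano"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
# skip_special_tokens drops markers such as <pad> and </s> from the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```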

</details>


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Metrics

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->
<img src="https://huggingface.co/thaboe01/t5-spelling-corrector/resolve/main/Screenshot%202024-05-21%20121138.png"
alt="metrics" width="600"/>
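
The screenshot is the only record of the results, and this card does not name the metrics in text. As an illustration only, the sketch below computes character error rate (CER), one common way to score spelling correction: the Levenshtein edit distance between the model output and the reference, divided by the reference length. Whether CER was among the metrics used here is an assumption, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


print(levenshtein("kitten", "sitting"))  # prints: 3
```

A CER of 0.0 means the output matches the reference exactly; lower is better.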

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** 2 x T4 GPU
- **Hours used:** 8
- **Cloud Provider:** Kaggle
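
The figures above allow a rough upper-bound estimate in the spirit of the calculator: energy is GPU count times power draw times hours, and emissions are energy times the grid's carbon intensity. The sketch below assumes each T4 runs at its 70 W rated board power for the whole run and ignores datacenter overhead. The carbon intensity is a hypothetical placeholder, since this card does not state the region.

```python
def energy_kwh(num_gpus: int, watts_per_gpu: float, hours: float) -> float:
    """Energy drawn by the GPUs, in kilowatt-hours."""
    return num_gpus * watts_per_gpu * hours / 1000.0


def emissions_kg(energy: float, kg_co2_per_kwh: float) -> float:
    """CO2-equivalent emissions for a given energy and grid intensity."""
    return energy * kg_co2_per_kwh


# 2 x T4 for 8 hours, as listed in this card; 70 W is the T4's rated power.
energy = energy_kwh(num_gpus=2, watts_per_gpu=70.0, hours=8.0)
print(f"{energy:.2f} kWh")  # prints: 1.12 kWh

# Hypothetical grid intensity; replace with the value for the actual region.
ASSUMED_KG_CO2_PER_KWH = 0.4
print(f"{emissions_kg(energy, ASSUMED_KG_CO2_PER_KWH):.2f} kg CO2eq")
```

Actual draw during training is usually below rated power, so this is a ceiling for the GPUs alone and not a measurement.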

## Model Card Authors

Thabolezwe Mabandla

## Model Card Contact

[email protected]