blackhole33
committed on
Update README.md
README.md
CHANGED
@@ -12,8 +12,84 @@ base_model:
pipeline_tag: translation
---

# English to Uzbek Translation Model

This repository provides a translation model based on **facebook/nllb-200-distilled-600M** that translates text from **English** to **Uzbek**. It also partially supports translation into some other languages, so it can be used for multi-language translation tasks.

---

## Model Description:

### Model Name:
- **facebook/nllb-200-distilled-600M**

### Task:
- **Text Translation** (English → Uzbek)

### License:
- **Apache-2.0**

### Supported Languages:
- **Primary**: English → Uzbek
- **Partial support for additional languages** (e.g., Russian, Azerbaijani)

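NLLB-based checkpoints identify languages by FLORES-200 codes rather than two-letter codes. The sketch below maps the languages named above to those codes. The helper `nllb_code` and the dictionary are illustrative and not part of this repository, and picking North Azerbaijani (`azj_Latn`) for "Azerbaijani" is an assumption.

```python
# Illustrative mapping from language names to FLORES-200 codes used by NLLB.
# This table is a hypothetical helper, not something shipped with the model.
NLLB_CODES = {
    "english": "eng_Latn",
    "uzbek": "uzn_Latn",        # Northern Uzbek, Latin script
    "russian": "rus_Cyrl",
    "azerbaijani": "azj_Latn",  # assumption: North Azerbaijani
}


def nllb_code(language: str) -> str:
    """Return the FLORES-200 code for a language name (case-insensitive)."""
    key = language.strip().lower()
    if key not in NLLB_CODES:
        raise KeyError(f"No NLLB code registered for {language!r}")
    return NLLB_CODES[key]


print(nllb_code("Uzbek"))
```

A code returned by this helper is the kind of value an NLLB tokenizer expects when a source or target language is selected.
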

---

## Installation

To use the model, install the required dependencies first:

```bash
pip install transformers torch
```

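To confirm that the installation succeeded before loading any model, a small check like the following may help. It is a minimal sketch: `missing_packages` is a hypothetical helper name, and it only tests whether the packages can be found for import, not whether their versions are compatible.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two packages installed by the pip command above.
missing = missing_packages(["transformers", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```
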
## Example Usage

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the pre-trained model and tokenizer.
# NLLB checkpoints are not Marian models, so the Auto classes are used here.
# Replace model_name with this repository's model ID to load the fine-tuned weights.
model_name = 'facebook/nllb-200-distilled-600M'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")

# Function to translate text
def translate_text(text: str, target_lang: str = 'uzn_Latn') -> str:
    # Tokenize the input text
    inputs = tokenizer(text, return_tensors="pt")
    # Translate, forcing the first generated token to be the target language code
    translated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(target_lang),
        num_beams=5,
        max_length=200,
        early_stopping=True,
    )
    # Decode the output
    return tokenizer.decode(translated[0], skip_special_tokens=True)

# Example input text
input_text = "An Azerbaijan Airlines Embraer ERJ-190AR aircraft crashed at Aktau Airport in Kazakhstan while attempting an emergency landing. The plane, registered as 4K-AZ65, was carrying 67 passengers and five crew members at the time. Some media reports suggest that the number of passengers exceeded 100, with over 60 identified as Russian citizens."

# Translate the input text to Uzbek
output_text = translate_text(input_text)
print("Translated text:", output_text)
```

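The example caps generation at `max_length=200` tokens, so a long paragraph might be cut off. One possible workaround, assuming sentence-by-sentence translation is acceptable for your text, is to split the input first. `split_sentences` is a hypothetical helper, and its regular expression is a naive heuristic that will mis-split abbreviations such as "e.g.".

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


sample = "The plane crashed. It carried 67 passengers. Was anyone hurt?"
for sentence in split_sentences(sample):
    # Each sentence could then be passed to translate_text() from the example.
    print(sentence)
```

Translating sentence by sentence loses cross-sentence context, so results may differ from translating the paragraph as a whole.
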
## Testing and Comparison

* Input (English): "An Azerbaijan Airlines Embraer ERJ-190AR aircraft crashed at Aktau Airport in Kazakhstan while attempting an emergency landing. The plane, registered as 4K-AZ65, was carrying 67 passengers and five crew members at the time. Some media reports suggest that the number of passengers exceeded 100, with over 60 identified as Russian citizens."

* Booba

Anen Azerbaijan Airlines Embraer ERJ-190AR samolyotlari favqulodda qo'nishga kirishga urinayotganda Qazaxistonning Aktau aeroportida halokatga uchradi. 4K-AZ65 nomli aviakompaniya o'sha vaqtda 67 yo'lovchi va beshta ekipaj a'zosi bilan birga edi. Ayrim ommaviy axborot vositalarining xabarlariga ko'ra, yo'lovchilar soni 100 dan oshgan, ularning 60 dan oshiqining rus fuqarolari ekanligi aniqlandi.

* Google

Ozarbayjon aviakompaniyasining Embraer ERJ-190AR samolyoti Qozog‘istonning Aktau aeroportida favqulodda qo‘nishga urinayotgan vaqtda halokatga uchradi. 4K-AZ65 sifatida ro‘yxatga olingan samolyotda o‘sha paytda 67 yo‘lovchi va besh nafar ekipaj a’zosi bo‘lgan. Ayrim ommaviy axborot vositalarida yo‘lovchilar soni 100 dan oshgani, 60 dan ortig‘i Rossiya fuqarolari ekani aytilmoqda.

* Yandex

Ozarbayjon aviakompaniyasining Embraer ERJ-190ar samolyoti Favqulodda qo'nishga urinayotganda Qozog'istonning Aktau aeroportida qulab tushdi. 4K-AZ65 sifatida ro'yxatdan o'tgan samolyotda o'sha paytda 67 yo'lovchi va besh ekipaj a'zosi bo'lgan. Ba'zi OAV xabarlariga ko'ra, yo'lovchilar soni 100 dan oshgan, 60 dan ortiq Rossiya fuqarolari aniqlangan.

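Reading the outputs side by side is subjective. A rough numeric comparison could score each output against a reference translation by word overlap. The sketch below computes a unigram F1 score. It is not the evaluation used by the author, `unigram_f1` is a hypothetical helper, and established metrics such as BLEU or chrF would be a better choice for a real comparison. The two short strings are fragments of the outputs above, with one treated as the reference purely for illustration.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Word-level F1 overlap between a candidate and a reference translation."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared word at most min(count) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "samolyoti halokatga uchradi"
candidate = "samolyotlari halokatga uchradi"
print(round(unigram_f1(candidate, reference), 2))
```
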
## Note:

This model was built with an open-source dataset. We apologize for any mistakes in its output.