Commit 024351f by echarif (verified, parent 60b522d): Update README.md

Files changed (1): README.md (+53 −3)
---
license: apache-2.0
metrics:
- bleu
base_model:
- facebook/mbart-large-cc25
pipeline_tag: translation
---
# Moroccan Darija to English Translation Model (Fine-Tuned mBART)

This model is a fine-tuned version of Facebook's mBART, a multilingual sequence-to-sequence model capable of handling many language pairs, adapted specifically to the Moroccan Darija to English translation task. It was trained on a Moroccan Darija dataset to produce accurate translations from Darija to English.

## Model Overview

- **Model Type**: mBART (Multilingual BART)
- **Language Pair**: Moroccan Darija → English
- **Task**: Machine Translation
- **Training Dataset**: a custom dataset of Moroccan Darija to English translation pairs

## Model Details

The mBART model is a transformer-based sequence-to-sequence model designed to handle multiple languages. It is particularly useful for tasks such as translation, text generation, and summarization.

For this task, the model has been fine-tuned to translate text from **Moroccan Darija** to **English**, making it suitable for applications involving conversational and informal text from Morocco.

## Intended Use

This model can be used to:
- Translate sentences from Moroccan Darija to English.

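For quick experiments, the sentence-level translation above can be driven in a loop. The sketch below keeps the looping logic separate from the model so it can be exercised without downloading weights; `translate_all` is a hypothetical helper, not part of `transformers`:

```python
from typing import Callable, List

def translate_all(sentences: List[str], translate: Callable[[str], str]) -> List[str]:
    """Translate each non-empty sentence with the given translator callable."""
    return [translate(s) for s in sentences if s.strip()]

# Wiring it to the real model (downloads the weights on first use):
# from transformers import pipeline
# translator = pipeline("translation", model="echarif/mBART_for_darija_transaltion")
# english = translate_all(darija_sentences,
#                         lambda s: translator(s)[0]["translation_text"])
```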
## How to Use the Model

You can load the model and tokenizer with the Hugging Face `transformers` library. Here's an example:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load the fine-tuned model and tokenizer
model_name = 'echarif/mBART_for_darija_transaltion'
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

# Prepare your input text (Moroccan Darija)
input_text = "insert your Moroccan Darija sentence here"

# Tokenize the input text
inputs = tokenizer(input_text, return_tensors="pt", padding=True)

# Generate the translated output
translated_tokens = model.generate(**inputs)
translated_text = tokenizer.decode(translated_tokens[0], skip_special_tokens=True)

print(f"Translated Text: {translated_text}")
```
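When translating many sentences, batching the inputs to `generate` improves throughput. A minimal sketch, assuming the `model` and `tokenizer` from the example above are already loaded (`chunks` is a hypothetical helper, not part of `transformers`):

```python
from typing import Iterator, List

def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Usage with the model loaded above (uncomment to run):
# for batch in chunks(darija_sentences, 8):
#     inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
#     tokens = model.generate(**inputs, num_beams=4, max_length=128)
#     print(tokenizer.batch_decode(tokens, skip_special_tokens=True))
```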