itsmeussa committed on
Commit
37411fe
1 Parent(s): 7a00a21

Update README.md

Files changed (1)
  1. README.md +24 -4
README.md CHANGED
@@ -13,6 +13,11 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
 # AdabTranslate-Darija
 
 This model is a fine-tuned version of [moussaKam/arabart](https://huggingface.co/moussaKam/arabart) on an unknown dataset.
@@ -23,16 +28,15 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-The Darija to MSA Translator is a cutting-edge translation model developed to facilitate seamless communication between Darija (Moroccan Arabic) and Modern Standard Arabic (MSA). Leveraging state-of-the-art techniques in natural language processing and powered by the Hugging Face Transformers library, this model offers high-quality translations with accuracy and fluency at its core.
 
 ## Intended uses & limitations
 
-This model is designed to cater to a wide range of users, including language enthusiasts, researchers, and developers working on multilingual projects. Its intuitive interface and customizable nature allow for easy integration into various applications and workflows. However, like any machine learning model, it does have limitations and may not be suitable for highly specialized or domain-specific translations.
 
 ## Training and evaluation data
 
-The Darija to MSA Translator was trained on a diverse dataset comprising Darija and MSA text pairs, enabling it to learn the nuances and intricacies of both languages. The evaluation data was meticulously selected to ensure robust performance and validate the model's accuracy and effectiveness in real-world scenarios.
-
 
 ## Training procedure
 
 
@@ -88,6 +92,22 @@ The following hyperparameters were used during training:
 | 0.7638 | 4.75 | 7000 | 1.0901 | 46.4753 | 9.6439 |
 | 0.7448 | 4.88 | 7200 | 1.0892 | 46.4939 | 9.6377 |
 
 
 ### Framework versions
 
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
+# Authors
+- Oussama Mounajjim
+- Imad Zaoug
+- Mehdi Soufiane
+
 # AdabTranslate-Darija
 
 This model is a fine-tuned version of [moussaKam/arabart](https://huggingface.co/moussaKam/arabart) on an unknown dataset.
 
 
 ## Model description
 
+The Darija to MSA Translator is a translation model trained on 26,000 Darija–MSA text pairs, annotated by human annotators and augmented using GPT-4. Built on datasets available on Hugging Face and fine-tuned with the Hugging Face Transformers library, it translates between Darija (Moroccan Arabic) and Modern Standard Arabic (MSA) with high accuracy and fluency, making it a useful tool for bridging the two varieties.
 
 ## Intended uses & limitations
 
+The model is aimed at a wide range of users, including language enthusiasts, researchers, and developers working on multilingual projects. Its training on a diverse dataset makes it effective in many contexts, but users should be aware of its limitations: highly specialized or domain-specific translations may require additional fine-tuning.
 
 ## Training and evaluation data
 
+The training data consists of 26,000 text pairs produced by human annotation and augmented using GPT-4, sourced from datasets on Hugging Face. The evaluation data was selected to validate the model's performance and accuracy in realistic scenarios.
 
 ## Training procedure
 
 
 | 0.7638 | 4.75 | 7000 | 1.0901 | 46.4753 | 9.6439 |
 | 0.7448 | 4.88 | 7200 | 1.0892 | 46.4939 | 9.6377 |
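The Bleu column in the table above is the BLEU metric. As a rough, self-contained illustration of what that number measures (not the implementation used for this card, which would typically be sacrebleu invoked by the Trainer's metric hook), a minimal unsmoothed sentence-level BLEU looks like:

```python
# Illustrative, simplified BLEU: single reference, uniform n-gram weights,
# whitespace tokenization, no smoothing. Real evaluations normally use
# sacrebleu; this sketch only shows what the score measures.
from collections import Counter
import math

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped matches: a candidate n-gram only counts as many times
        # as it also appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision collapses the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n) * 100

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 1))  # → 100.0
```

A perfect match scores 100; the ~46.5 reported above means the model's outputs share roughly that degree of n-gram overlap with the reference MSA translations.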
 
+# How to use it?
+
+Just copy and paste this code after installing the necessary libraries (`transformers` and `torch`):
+
+from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+# Load the fine-tuned model and the base arabart tokenizer
+model_path = 'itsmeussa/AdabTranslate-Darija'
+model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
+tokenizer = AutoTokenizer.from_pretrained('moussaKam/arabart')
+
+# Encode a Darija sentence, translate, and decode the result
+seq = "مرحبا بيكم"
+tok = tokenizer.encode(seq, return_tensors='pt')
+res = model.generate(tok)
+print(tokenizer.decode(res[0], skip_special_tokens=True))
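In the snippet above, `model.generate` performs autoregressive decoding: it emits output tokens one at a time, feeding each back into the decoder, until an end-of-sequence token appears. A toy sketch of the greedy variant, with a hypothetical stand-in for the decoder step (no real model or tokenizer involved):

```python
# Toy greedy decoding loop. `next_token` is a hypothetical stand-in for one
# decoder step of a seq2seq model; a real model would return logits over the
# vocabulary and we would take the argmax.
def next_token(prefix):
    schedule = [5, 6, 7, 2]  # pretend the model wants to emit 5, 6, 7, then EOS (2)
    step = len(prefix) - 1   # prefix starts with the BOS token
    return schedule[min(step, len(schedule) - 1)]

def greedy_generate(bos_id=0, eos_id=2, max_len=10):
    seq = [bos_id]
    for _ in range(max_len):
        tok = next_token(seq)
        seq.append(tok)
        if tok == eos_id:    # stop as soon as EOS is emitted
            break
    return seq

print(greedy_generate())  # → [0, 5, 6, 7, 2]
```

The real `generate` also supports beam search and sampling strategies; greedy decoding is simply its default single-path case.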
 
 ### Framework versions