itsmeussa committed on
Commit
37411fe
1 Parent(s): 7a00a21

Update README.md

Files changed (1)
  1. README.md +24 -4
README.md CHANGED
@@ -13,6 +13,11 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
 # AdabTranslate-Darija
 
 This model is a fine-tuned version of [moussaKam/arabart](https://huggingface.co/moussaKam/arabart) on an unknown dataset.
@@ -23,16 +28,15 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-The Darija to MSA Translator is a cutting-edge translation model developed to facilitate seamless communication between Darija (Moroccan Arabic) and Modern Standard Arabic (MSA). Leveraging state-of-the-art techniques in natural language processing and powered by the Hugging Face Transformers library, this model offers high-quality translations with accuracy and fluency at its core.
 
 ## Intended uses & limitations
 
-This model is designed to cater to a wide range of users, including language enthusiasts, researchers, and developers working on multilingual projects. Its intuitive interface and customizable nature allow for easy integration into various applications and workflows. However, like any machine learning model, it does have limitations and may not be suitable for highly specialized or domain-specific translations.
 
 ## Training and evaluation data
 
-The Darija to MSA Translator was trained on a diverse dataset comprising Darija and MSA text pairs, enabling it to learn the nuances and intricacies of both languages. The evaluation data was meticulously selected to ensure robust performance and validate the model's accuracy and effectiveness in real-world scenarios.
-
 
 ## Training procedure
 
 
@@ -88,6 +92,22 @@ The following hyperparameters were used during training:
 | 0.7638 | 4.75 | 7000 | 1.0901 | 46.4753 | 9.6439 |
 | 0.7448 | 4.88 | 7200 | 1.0892 | 46.4939 | 9.6377 |
 
 
 ### Framework versions
 
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
+# Authors
+- Oussama Mounajjim
+- Imad Zaoug
+- Mehdi Soufiane
+
 # AdabTranslate-Darija
 
 This model is a fine-tuned version of [moussaKam/arabart](https://huggingface.co/moussaKam/arabart) on an unknown dataset.
 
 
 ## Model description
 
+The Darija to MSA Translator is a translation model trained on 26,000 Darija–MSA text pairs, annotated by human annotators and augmented using GPT-4. Built on datasets available on Hugging Face and fine-tuned with the Hugging Face Transformers library, it translates between Darija (Moroccan Arabic) and Modern Standard Arabic (MSA) with high accuracy and fluency, making it a useful tool for bridging the two varieties.
 
 ## Intended uses & limitations
 
+The model is aimed at a wide range of users, including language enthusiasts, researchers, and developers working on multilingual projects. Its training on a diverse dataset makes it effective in many contexts, but users should be aware of its limitations: highly specialized or domain-specific translations may require additional fine-tuning.
 
 ## Training and evaluation data
 
+The training data consists of 26,000 text pairs produced by human annotation and augmented using GPT-4, sourced from datasets on Hugging Face. The evaluation data was selected to validate the model's performance and accuracy in realistic scenarios.
 
 ## Training procedure
 
 
 | 0.7638 | 4.75 | 7000 | 1.0901 | 46.4753 | 9.6439 |
 | 0.7448 | 4.88 | 7200 | 1.0892 | 46.4939 | 9.6377 |
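The Bleu column in the table above is the BLEU metric. As a rough, self-contained illustration of what that number measures (not the implementation used for this card, which would typically be sacrebleu invoked by the Trainer's metric hook), a minimal unsmoothed sentence-level BLEU looks like:

```python
# Illustrative, simplified BLEU: single reference, uniform n-gram weights,
# whitespace tokenization, no smoothing. Real evaluations normally use
# sacrebleu; this sketch only shows what the score measures.
from collections import Counter
import math

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped matches: a candidate n-gram only counts as many times
        # as it also appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision collapses the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n) * 100

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 1))  # → 100.0
```

A perfect match scores 100; the ~46.5 reported above means the model's outputs share roughly that degree of n-gram overlap with the reference MSA translations.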
 
+# How to use it?
+
+Just copy and paste this code after installing the necessary libraries (`transformers` and `torch`):
+
+from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+# Load the fine-tuned model and the base arabart tokenizer
+model_path = 'itsmeussa/AdabTranslate-Darija'
+model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
+tokenizer = AutoTokenizer.from_pretrained('moussaKam/arabart')
+
+# Encode a Darija sentence, translate, and decode the result
+seq = "مرحبا بيكم"
+tok = tokenizer.encode(seq, return_tensors='pt')
+res = model.generate(tok)
+print(tokenizer.decode(res[0], skip_special_tokens=True))
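In the snippet above, `model.generate` performs autoregressive decoding: it emits output tokens one at a time, feeding each back into the decoder, until an end-of-sequence token appears. A toy sketch of the greedy variant, with a hypothetical stand-in for the decoder step (no real model or tokenizer involved):

```python
# Toy greedy decoding loop. `next_token` is a hypothetical stand-in for one
# decoder step of a seq2seq model; a real model would return logits over the
# vocabulary and we would take the argmax.
def next_token(prefix):
    schedule = [5, 6, 7, 2]  # pretend the model wants to emit 5, 6, 7, then EOS (2)
    step = len(prefix) - 1   # prefix starts with the BOS token
    return schedule[min(step, len(schedule) - 1)]

def greedy_generate(bos_id=0, eos_id=2, max_len=10):
    seq = [bos_id]
    for _ in range(max_len):
        tok = next_token(seq)
        seq.append(tok)
        if tok == eos_id:    # stop as soon as EOS is emitted
            break
    return seq

print(greedy_generate())  # → [0, 5, 6, 7, 2]
```

The real `generate` also supports beam search and sampling strategies; greedy decoding is simply its default single-path case.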
 
 ### Framework versions