---
language:
- ar
pipeline_tag: text-generation
---

# Model Card for Model ID

This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

### Model Description

- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

    from transformers import GPT2Tokenizer
    from arabert.preprocess import ArabertPreprocessor
    from arabert.aragpt2.grover.modeling_gpt2 import GPT2LMHeadModel
    from pyarabic.araby import strip_tashkeel
    import pyarabic.trans

    model_name = 'alsubari/aragpt2-mega-pos-msa'
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name).to("cuda")
    # The preprocessor is configured with the AraGPT2-mega checkpoint name.
    arabert_prep = ArabertPreprocessor(model_name='aubmindlab/aragpt2-mega')

    # Instruction prompts: prml[0] asks for a grammatical parse (i'rab) of the
    # sentence, prml[1] asks to classify the words of the sentence.
    prml = ['اعراب الجملة :', ' صنف الكلمات من الجملة :']

    # Example sentence ("Learn from your mistakes"); diacritics are stripped
    # before the AraBERT preprocessing step.
    text = 'تعلَّمْ من أخطائِكَ'
    text = arabert_prep.preprocess(strip_tashkeel(text))

    generation_args = {
        'pad_token_id': tokenizer.eos_token_id,
        'max_length': 256,
        'num_beams': 20,
        'no_repeat_ngram_size': 3,
        'top_k': 20,
        # Nucleus sampling: keep the smallest set of tokens whose cumulative
        # probability reaches 0.1.
        'top_p': 0.1,
        'do_sample': True,
        'repetition_penalty': 2.0,
    }

    # Task 1: classify the words of the sentence (part-of-speech tags).
    input_text = f'<|startoftext|>Instruction: {prml[1]} {text}<|pad|>Answer:'
    input_ids = tokenizer.encode(input_text, return_tensors='pt').to("cuda")
    output_ids = model.generate(input_ids=input_ids, **generation_args)
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True).split('Answer:')[1]
    answer_pose = pyarabic.trans.delimite_language(output_text, start="", end="")
    print(answer_pose)
    # Output: تعلم : تعلم : Verb من : من : Relative pronoun أخطائك : اخطا : Noun ك : Personal pronunction

    # Task 2: grammatical parse of the sentence (the answer is in Arabic).
    input_text = f'<|startoftext|>Instruction: {prml[0]} {text}<|pad|>Answer:'
    input_ids = tokenizer.encode(input_text, return_tensors='pt').to("cuda")
    output_ids = model.generate(input_ids=input_ids, **generation_args)
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True).split('Answer:')[1]
    print(output_text)
    # Output: تعلم : تعلم : فعل ، مفرد المخاطب للمذكر ، فعل مضارع ، مرفوع من : من : حرف جر أخطائك : اخطا : اسم ، جمع المذكر ، مجرور ك : ضمير ، مفرد المتكلم
    # English gloss of the output: verb, second-person masculine singular,
    # imperfect verb, indicative | preposition | noun, masculine plural,
    # genitive | pronoun, first-person singular.

[More Information Needed]

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]
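
The getting-started snippet strips diacritics from the input and wraps it in an instruction prompt before generation. This card does not state whether the training data was prepared the same way, so the following is only a dependency-free sketch of that inference-time step. The names `strip_diacritics`, `build_prompt`, `INSTRUCTIONS`, and the task keys `parse` and `pos` are hypothetical and introduced for illustration. The regular expression covers the eight tashkeel marks U+064B to U+0652 and does not reproduce the additional normalization done by `ArabertPreprocessor`.

```python
import re

# Arabic diacritic marks (tashkeel): U+064B (fathatan) through U+0652 (sukun).
# Assumption: this range stands in for pyarabic's strip_tashkeel.
_TASHKEEL = re.compile('[\u064B-\u0652]')

# Instruction strings copied verbatim from the getting-started snippet
# (including the leading space of the second one). The keys are hypothetical.
INSTRUCTIONS = {
    'parse': 'اعراب الجملة :',
    'pos': ' صنف الكلمات من الجملة :',
}


def strip_diacritics(text: str) -> str:
    """Remove tashkeel marks and leave all other characters untouched."""
    return _TASHKEEL.sub('', text)


def build_prompt(task: str, text: str) -> str:
    """Wrap text in the prompt layout used by the getting-started snippet."""
    return f'<|startoftext|>Instruction: {INSTRUCTIONS[task]} {text}<|pad|>Answer:'


# Example: build the word-classification prompt for the sentence used above.
print(build_prompt('pos', strip_diacritics('تعلَّمْ من أخطائِكَ')))
```

The resulting string can be passed to `tokenizer.encode` in place of the hand-written f-string, which keeps the two tasks on a single prompt layout.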

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]