QCRI
/

Translation
Safetensors
m2m_100
BaselMousi committed on
Commit
2a21256
·
verified ·
1 Parent(s): 35dd697

Update README.md

  1. README.md +17 -0
README.md CHANGED
@@ -8,6 +8,23 @@ pipeline_tag: translation
  
  This repository includes an MSA-to-LEV machine translation model. This model was used to curate dialectal benchmarks. The human post-edited benchmarks can be found<a href="https://huggingface.co/datasets/QCRI/AraDiCE" target="_blank" style="margin-right: 15px; margin-left: 10px">here.</a>
  
+ ## Sample Usage
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("QCRI/AraDiCE-msa-to-lev")
+ model = AutoModelForSeq2SeqLM.from_pretrained("QCRI/AraDiCE-msa-to-lev")
+
+ article = "يظهر سلف الأدب المكسيكي في آداب المستعمرات الأصلية في أمريكا الوسطى"
+ inputs = tokenizer(article, return_tensors="pt")
+
+ translated_tokens = model.generate(
+     **inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("ajp_Arab"), max_length=30
+ )
+ translation = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
+ print(translation)
+ ```
  ## License
  
  The model is distributed under the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)**. The full license text can be found in the accompanying `licenses_by-nc-sa_4.0_legalcode.txt` file.
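
The sample added in this commit translates a single sentence and caps the output at 30 tokens via `max_length=30`. For translating several sentences at once, a batched variant might look like the sketch below. It is not part of the commit: the names `load_model` and `translate_batch` are hypothetical, and the default of `max_new_tokens=128` is an arbitrary assumption chosen only to leave room for longer inputs. The model ID and the `ajp_Arab` target token are taken from the sample above.

```python
def load_model(name="QCRI/AraDiCE-msa-to-lev"):
    """Load the tokenizer and model (hypothetical helper; downloads the checkpoint)."""
    # Imported here so the translation helper below has no hard dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model


def translate_batch(sentences, tokenizer, model,
                    target_lang="ajp_Arab", max_new_tokens=128):
    """Translate a list of sentences with a single generate() call."""
    # padding=True pads every sentence to the longest one in the batch,
    # so the inputs form one rectangular tensor.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(
        **inputs,
        # Force the decoder to start with the target-language token,
        # exactly as the single-sentence sample does.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(target_lang),
        # Assumption: limit newly generated tokens instead of total length.
        max_new_tokens=max_new_tokens,
    )
    # One decoded string per input sentence, in input order.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With `tokenizer, model = load_model()`, calling `translate_batch([article], tokenizer, model)` returns a list of translated strings, one per input sentence.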