library_name: transformers
  pipeline_tag: translation
  ---

## How to use

First install `peft`, `transformers`, and `torch`:

```bash
pip install peft transformers torch
```

Then log in with your Hugging Face token to get access to the base model:

```bash
huggingface-cli login --token <YOUR_HF_TOKEN>
```
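
If you prefer to authenticate from Python (for example, inside a notebook), the equivalent `huggingface_hub` call is:

```python
from huggingface_hub import login

# Programmatic equivalent of `huggingface-cli login --token <YOUR_HF_TOKEN>`
login(token="<YOUR_HF_TOKEN>")
```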

Then load the model:

```python
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Resolve the base model from the adapter config, then attach the LoRA adapter
peft_model_id = "ahmedheakl/arazn-gemma1.1-7B-eng-extra"
peft_config = PeftConfig.from_pretrained(peft_model_id)
base_model_name = peft_config.base_model_name_or_path
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base_model, peft_model_id)
model = model.to("cuda")
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
```
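
Optionally, you can merge the adapter into the base weights so inference runs through plain `transformers` without the PEFT wrapper. This is standard `peft` functionality rather than something specific to this model:

```python
# Optional: fold the LoRA adapter weights into the base model.
# After this, `model` behaves like a regular transformers model.
model = model.merge_and_unload()
```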

Then run inference:

```python
import torch

# Gemma chat template with the translation instruction; {source} is filled per request
raw_prompt = """<bos><start_of_turn>user
Translate the following code-switched Arabic-English-mixed text to English only.
{source}<end_of_turn>
<start_of_turn>model
"""

def inference(prompt) -> str:
    prompt = raw_prompt.format(source=prompt)
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    generated_ids = model.generate(
        **inputs,
        use_cache=True,
        num_return_sequences=1,
        max_new_tokens=100,
        do_sample=True,
        num_beams=1,
        temperature=0.7,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    outputs = tokenizer.batch_decode(generated_ids)[0]
    torch.cuda.empty_cache()
    torch.cuda.synchronize()
    # Keep only the model's turn: drop the echoed prompt and the end-of-turn tag
    return outputs.split("<start_of_turn>model\n")[-1].split("<end_of_turn>")[0]

print(inference("أنا أحب الbanana"))  # I like bananas.
```
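
To translate several sentences at once, a batched variant is sketched below. It assumes the same `model`, `tokenizer`, and `raw_prompt` as above; the second input sentence is made up for illustration. Padding on the left keeps the end of each prompt adjacent to the generated tokens:

```python
# Batch several code-switched inputs into a single generate() call
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

sources = ["أنا أحب الbanana", "هو راح إلى الgym"]
prompts = [raw_prompt.format(source=s) for s in sources]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
generated_ids = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=False,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)
for text in tokenizer.batch_decode(generated_ids):
    print(text.split("<start_of_turn>model\n")[-1].split("<end_of_turn>")[0])
```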

**Please see paper & code for more information:**
- https://github.com/ahmedheakl/arazn-llm
- https://arxiv.org/abs/2406.18120