AhmedBou committed on
Commit
2bdf1d8
1 Parent(s): 4c2ca9f

Update README.md

Files changed (1):
  1. README.md +43 -1
README.md CHANGED

@@ -1,6 +1,7 @@
 ---
 language:
 - en
+- ar
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -11,6 +12,47 @@ tags:
 base_model: unsloth/gemma-7b-bnb-4bit
 ---
 
+## Inference code
+Use this Python code for inference:
+
+```python
+# Install Unsloth, Xformers (Flash Attention), and all other required packages
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps xformers trl peft accelerate bitsandbytes
+
+from unsloth import FastLanguageModel
+
+max_seq_length = 2048
+dtype = None          # None = auto-detect (float16 or bfloat16)
+load_in_4bit = True   # load in 4-bit quantization to reduce memory use
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "AhmedBou/Gemma-7b-EngText-ArabicSummary",
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+)
+FastLanguageModel.for_inference(model)  # enable native 2x faster inference
+
+article = """
+paste a news article here
+"""
+
+# alpaca_prompt is the prompt template used during fine-tuning; define it before use
+inputs = tokenizer(
+    [
+        alpaca_prompt.format(
+            article,  # input
+            "",       # output - leave this blank for generation!
+        )
+    ], return_tensors = "pt").to("cuda")
+
+# Generate up to 64 new tokens and decode the output
+outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
+print(tokenizer.batch_decode(outputs))
+
+```
+
 # Uploaded model
 
 - **Developed by:** AhmedBou
@@ -19,4 +61,4 @@ base_model: unsloth/gemma-7b-bnb-4bit
 
 This gemma model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
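The inference snippet in the diff calls `alpaca_prompt.format(...)` without defining the template. Below is a minimal sketch of what such an Alpaca-style two-slot template could look like; the instruction wording and section headers here are assumptions for illustration, and the real template must match whatever was used during fine-tuning:

```python
# Hypothetical Alpaca-style template (NOT the model's verified training prompt):
# one slot for the English article, one for the response, left empty at inference
# so the model generates the Arabic summary after "### Response:".
alpaca_prompt = """Below is an English news article. Write a summary of the article in Arabic.

### Input:
{}

### Response:
{}"""

article = "Example news article text."
prompt = alpaca_prompt.format(article, "")  # leave the response blank for generation
print(prompt)
```

The formatted string ends right after the `### Response:` marker, so the model's continuation is the summary itself and can be recovered by splitting the decoded output on that marker.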