Commit 73b9226 (verified), committed by lokinfey
1 Parent(s): 9eb3e64

Update README.md

Files changed (1)
  1. README.md +96 -3
README.md CHANGED
@@ -1,3 +1,96 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ # **Phi-4-onnx-cpu-int4 Unofficial version**
+
+ <b><span style="text-decoration:underline">Note: This is an unofficial version, intended for testing and development only.</span></b>
+
+ This is an int4 ONNX build of Phi-4 for CPU, converted with Olive [https://github.com/microsoft/olive](https://github.com/microsoft/olive). The steps below reproduce the conversion.
+
+
+ ## **1. Install the SDK**
+
+
+ ```bash
+
+ pip install olive-ai
+
+ pip install transformers==4.44.2
+
+ ```
+
+
+
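Before converting, it can help to confirm that the two packages from the install step are present and which versions were resolved. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and not part of Olive or transformers.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: installed version, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as used in the pip commands above.
    for name, version in installed_versions(["olive-ai", "transformers"]).items():
        print(f"{name}: {version or 'not installed'}")
```

If `transformers` reports something other than 4.44.2, reinstalling with the pinned version from the install step is the simplest fix.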
+ ## **2. Convert to ONNX for CPU**
+
+
+ ```bash
+
+ olive auto-opt --model_name_or_path "Your Phi-4 location" --output_path "Your ONNX output location" --device cpu --provider CPUExecutionProvider --precision int4 --use_model_builder --log_level 1
+
+ ```
+
+ This is a straight conversion; no model-specific optimization has been applied. For an optimized build, wait for the official version.
+
+
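When the conversion is driven from a script, passing the arguments as a list avoids shell-quoting problems with paths that contain spaces. The sketch below only assembles and prints the command using the flags shown above; the function name `build_olive_command` and the example paths are hypothetical, and the actual `subprocess.run` call is left commented out so nothing is executed by accident.

```python
import shlex


def build_olive_command(model_path, output_path):
    """Assemble the `olive auto-opt` invocation from this README as an argument list."""
    return [
        "olive", "auto-opt",
        "--model_name_or_path", model_path,
        "--output_path", output_path,
        "--device", "cpu",
        "--provider", "CPUExecutionProvider",
        "--precision", "int4",
        "--use_model_builder",
        "--log_level", "1",
    ]


if __name__ == "__main__":
    # Hypothetical locations; replace with your own.
    cmd = build_olive_command("./models/phi-4", "./onnx")
    print(shlex.join(cmd))  # shell-safe rendering, useful for logging
    # To run the conversion (assumes olive-ai is installed):
    # import subprocess
    # subprocess.run(cmd, check=True)
```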
+ ## **Sample - ONNX Inference**
+
+
+ ```python
+
+
+ # Requires: pip install onnxruntime-genai
+ import onnxruntime_genai as og
+
+
+
+ # Folder containing the converted ONNX model.
+ model_folder = "./onnx/model"
+
+ # Load the model.
+ model = og.Model(model_folder)
+
+ # The tokenizer stream decodes tokens one at a time for streaming output.
+ tokenizer = og.Tokenizer(model)
+ tokenizer_stream = tokenizer.create_stream()
+
+ # Generation settings.
+ search_options = {}
+ search_options['max_length'] = 2048
+ search_options['past_present_share_buffer'] = False
+
+ # Phi-4 chat format, as documented in the Phi-4 model card.
+ chat_template = "<|im_start|>user<|im_sep|>{input}<|im_end|><|im_start|>assistant<|im_sep|>"
+
+
+ text = """I have $20,000 in my savings account, where I receive a 4% profit per year and payments twice a year. Can you please tell me how long it will take for me to become a millionaire? Also, can you please explain the math step by step as if you were explaining it to an uneducated person?"""
+
+ # Fill the user text into the chat template.
+ prompt = chat_template.format(input=text)
+
+
+ input_tokens = tokenizer.encode(prompt)
+
+
+ params = og.GeneratorParams(model)
+
+ # NOTE: params.input_ids and compute_logits() belong to older onnxruntime-genai releases; newer releases may expect generator.append_tokens(input_tokens) instead.
+ params.set_search_options(**search_options)
+ params.input_ids = input_tokens
+
+
+ generator = og.Generator(model, params)
+
+ # Generate one token at a time and print the decoded text as it arrives.
+ while not generator.is_done():
+     generator.compute_logits()
+     generator.generate_next_token()
+
+     new_token = generator.get_next_tokens()[0]
+     print(tokenizer_stream.decode(new_token), end='', flush=True)
+
+
+
+
+ ```
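Since this build is meant for testing, a reference answer for the sample prompt is useful when judging the model's output. The sketch below assumes the prompt means 4% annual interest compounded twice a year (2% per half-year) with no further deposits; that reading and the helper name `periods_to_target` are assumptions, not part of the original sample.

```python
import math


def periods_to_target(principal, annual_rate, periods_per_year, target):
    """Smallest number of compounding periods for `principal` to reach `target`."""
    rate_per_period = annual_rate / periods_per_year
    # Solve principal * (1 + r)^n >= target for the smallest integer n.
    return math.ceil(math.log(target / principal) / math.log(1 + rate_per_period))


periods = periods_to_target(20_000, 0.04, 2, 1_000_000)
years = periods / 2
print(f"{periods} half-year periods, i.e. {years} years")
```

Under these assumptions the balance first passes $1,000,000 after 198 half-year periods, or about 99 years, so a response far from that figure suggests the prompt format or the quantized model needs a closer look.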