Update README.md
add itrex inference eg
README.md
CHANGED
@@ -44,6 +44,22 @@ python3 main.py \


 ### Use the model
+### INT4 Inference with ITREX on CPU
+Install the latest [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers)
+```python
+from intel_extension_for_transformers.transformers import AutoModelForCausalLM
+from transformers import AutoTokenizer
+quantized_model_dir = "Intel/neural-chat-7b-v3-3-int4-inc"
+model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
+                                             device_map="auto",
+                                             trust_remote_code=False,
+                                             use_neural_speed=False,
+                                             )
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
+print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device),max_new_tokens=50)[0]))
+## <s> There is a girl who likes adventure, and she is a bit of a daredevil. She loves to travel and explore new places. She is always looking for the next thrill, whether it be skydiving, bungee jumping, or even just hiking up a mountain
+```
+

 ### INT4 Inference with AutoGPTQ
