Merge branch 'main' of https://huggingface.co/Intel/neural-chat-7b-v3 into main
README.md
CHANGED
@@ -2,7 +2,7 @@
|
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
|
5 |
-
##
|
6 |
|
7 |
This model is fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the open-source dataset [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca) and then aligned with the Direct Preference Optimization (DPO) algorithm. For more details, refer to our blog post: [The Practice of Supervised Fine-tuning and Direct Preference Optimization on Habana Gaudi2](https://medium.com/@NeuralCompressor/the-practice-of-supervised-finetuning-and-direct-preference-optimization-on-habana-gaudi2-a1197d8a3cd3).
|
8 |
|
@@ -38,12 +38,37 @@ The following hyperparameters were used during training:
|
|
38 |
- lr_scheduler_warmup_ratio: 0.02
|
39 |
- num_epochs: 2.0
|
40 |
|
41 |
-
## Inference with transformers
|
42 |
|
43 |
```python
|
44 |
-
import
|
45 |
-
|
46 |
-
|
47 |
)
|
48 |
```
|
49 |
|
@@ -58,9 +83,8 @@ The license on this model does not constitute legal advice. We are not responsib
|
|
58 |
|
59 |
## Organizations developing the model
|
60 |
|
61 |
-
The NeuralChat team with members from Intel/
|
62 |
|
63 |
## Useful links
|
64 |
* Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
|
65 |
* Intel Extension for Transformers [link](https://github.com/intel/intel-extension-for-transformers)
|
66 |
-
* Intel Extension for PyTorch [link](https://github.com/intel/intel-extension-for-pytorch)
|
|
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
|
5 |
+
## Fine-tuning on [Habana](https://habana.ai/) Gaudi
|
6 |
|
7 |
This model is fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the open-source dataset [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca) and then aligned with the Direct Preference Optimization (DPO) algorithm. For more details, refer to our blog post: [The Practice of Supervised Fine-tuning and Direct Preference Optimization on Habana Gaudi2](https://medium.com/@NeuralCompressor/the-practice-of-supervised-finetuning-and-direct-preference-optimization-on-habana-gaudi2-a1197d8a3cd3).
|
8 |
|
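The README text in this hunk says the model is aligned with DPO after supervised fine-tuning. As a rough illustration of what that objective computes for a single preference pair, here is a minimal sketch in plain Python. The function name `dpo_pair_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration; none of them come from the model card, and the actual training setup is the one described in the linked blog post.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each response under the
    policy being trained and under a frozen reference model.
    beta is a hypothetical value, not taken from the model card.
    """
    # How much more the policy prefers the chosen response than the reference does.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)

    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and the reference rank the pair identically, the loss equals log 2; it falls below that as the policy favors the chosen response more strongly than the reference does, and rises above it in the opposite case.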
|
|
38 |
- lr_scheduler_warmup_ratio: 0.02
|
39 |
- num_epochs: 2.0
|
40 |
|
41 |
+
## FP32 Inference with transformers
|
42 |
|
43 |
```python
|
44 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
|
45 |
+
model_name = "Intel/neural-chat-7b-v3"
|
46 |
+
prompt = "Once upon a time, there existed a little girl,"
|
47 |
+
|
48 |
+
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
|
49 |
+
inputs = tokenizer(prompt, return_tensors="pt").input_ids
|
50 |
+
streamer = TextStreamer(tokenizer)
|
51 |
+
|
52 |
+
model = AutoModelForCausalLM.from_pretrained(model_name)
|
53 |
+
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
|
54 |
+
|
55 |
+
```
|
56 |
+
|
57 |
+
## INT4 Inference with transformers
|
58 |
+
|
59 |
+
```python
|
60 |
+
from transformers import AutoTokenizer, TextStreamer
|
61 |
+
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig
|
62 |
+
model_name = "Intel/neural-chat-7b-v3"
|
63 |
+
config = WeightOnlyQuantConfig(compute_dtype="int8", weight_dtype="int4")
|
64 |
+
prompt = "Once upon a time, there existed a little girl,"
|
65 |
+
|
66 |
+
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
|
67 |
+
inputs = tokenizer(prompt, return_tensors="pt").input_ids
|
68 |
+
streamer = TextStreamer(tokenizer)
|
69 |
+
|
70 |
+
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=config)
|
71 |
+
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
|
72 |
|
73 |
```
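The INT4 example above requests `weight_dtype="int4"`. To give a feel for what storing weights in 4 bits involves, the sketch below applies a generic symmetric round-to-nearest scheme to a list of floats. This is an assumption-laden illustration, not the algorithm used by Intel Extension for Transformers: the helper names `quantize_int4` and `dequantize_int4` are hypothetical, the single per-list scale is a simplification, and the `compute_dtype` setting is not modeled at all.

```python
def quantize_int4(weights):
    """Map floats to integers in the signed 4-bit range [-8, 7].

    Uses one scale for the whole list (a simplifying assumption).
    Returns the integer codes and the scale needed to decode them.
    """
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    codes, scale = quantize_int4([7.0, -3.0, 0.0, 1.2])
    print(codes, scale)
    print(dequantize_int4(codes, scale))
```

Each weight is reconstructed to within half a scale step, which is the accuracy cost traded for the smaller memory footprint.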
|
74 |
|
|
|
83 |
|
84 |
## Organizations developing the model
|
85 |
|
86 |
+
The NeuralChat team with members from Intel/DCAI/AISE. Core team members: Kaokao Lv, Liang Lv, Chang Wang, Wenxin Zhang, Xuhui Ren, and Haihao Shen.
|
87 |
|
88 |
## Useful links
|
89 |
* Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
|
90 |
* Intel Extension for Transformers [link](https://github.com/intel/intel-extension-for-transformers)
|
|