rename
README.md CHANGED
@@ -8,7 +8,7 @@ language:
 - en
 ---
 
-## Model Details: Neural-Chat-v3-3-int4-inc
+## Model Details: Neural-Chat-7b-v3-3-int4-inc
 
 This model is an int4 model with group_size 128 of [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3) generated by [intel/auto-round](https://github.com/intel/auto-round).
 
@@ -50,7 +50,7 @@ Install [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ) from source first
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-quantized_model_dir = "Intel/neural-chat-v3-3-int4-inc"
+quantized_model_dir = "Intel/neural-chat-7b-v3-3-int4-inc"
 model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
                                              device_map="auto",
                                              trust_remote_code=False,
@@ -66,7 +66,7 @@ print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adv
 Install [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness.git) from source, we used the git id f3b7917091afba325af3980a35d8a6dcba03dc3f
 
 ```bash
-lm_eval --model hf --model_args pretrained="Intel/neural-chat-v3-3-int4-inc",autogptq=True,gptq_use_triton=True --device cuda:0 --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,rte,arc_easy,arc_challenge --batch_size 128
+lm_eval --model hf --model_args pretrained="Intel/neural-chat-7b-v3-3-int4-inc",autogptq=True,gptq_use_triton=True --device cuda:0 --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,rte,arc_easy,arc_challenge --batch_size 128
 ```
 
 | Metric | FP16 | INT4 |
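For context on what the rename touches downstream, here is a minimal end-to-end sketch that completes the usage snippet truncated by the hunk boundaries above, using the new repo id. It assumes AutoGPTQ is installed from source as the README instructs and that a CUDA device is available; the prompt is a placeholder (the README's own prompt is cut off in the diff context), and `max_new_tokens=50` is an arbitrary choice, not taken from the README.

```python
# Minimal sketch, not the README's exact snippet: load the int4 model under its
# renamed repo id and generate from a placeholder prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_model_dir = "Intel/neural-chat-7b-v3-3-int4-inc"  # repo id after the rename

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
                                             device_map="auto",        # place the int4 weights on the available GPU(s)
                                             trust_remote_code=False)

# Placeholder prompt; the original prompt is truncated in the diff context above.
inputs = tokenizer("Once upon a time, ", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
```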