SushantGautam committed
Commit • f134ad1
Parent(s): 7048b2a
Update README.md
README.md CHANGED

@@ -111,7 +111,7 @@ If you try the vLLM examples below and get an error about `quantization` being u
 - When using vLLM as a server, pass the `--quantization awq` parameter, for example:
 
 ```shell
-python3
+python3 -m vllm.entrypoints.api_server --model TheBloke/Mistral-7B-OpenOrca-AWQ --quantization awq --dtype half
 ```
 
 When using vLLM from Python code, pass the `quantization=awq` parameter, for example:
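
Once the server started by the corrected command is running, it can be queried over HTTP. A minimal sketch, assuming the defaults of vLLM's demo API server (`vllm.entrypoints.api_server` listening on port 8000 with a `/generate` endpoint); the prompt and sampling settings are illustrative:

```python
# Minimal sketch of querying the vLLM demo API server started by the
# command in the diff above. Assumes the default port 8000 and the
# /generate endpoint of vllm.entrypoints.api_server; the prompt and
# generation parameters are illustrative.
import requests

response = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "Tell me about AI", "max_tokens": 128, "temperature": 0.8},
)
print(response.json())
```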
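The Python-code usage mentioned in the last context line falls outside this hunk. A minimal sketch of what it looks like, assuming vLLM's `LLM`/`SamplingParams` API; the prompt and sampling values are illustrative, not taken from the README:

```python
# Minimal sketch of using vLLM from Python code with AWQ quantization,
# as the README's next section describes. quantization="awq" follows the
# diff's instruction; dtype="half" mirrors the server command above.
from vllm import LLM, SamplingParams

prompts = ["Tell me about AI"]  # illustrative prompt
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(
    model="TheBloke/Mistral-7B-OpenOrca-AWQ",
    quantization="awq",
    dtype="half",
)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```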