TheBloke committed
Commit bd661e6 · 1 Parent(s): 7a04423

Upload README.md

Files changed (1): README.md +11 -1
README.md CHANGED
@@ -50,13 +50,23 @@ tags:
 This repo contains AWQ model files for [Charles Goddard's MixtralRPChat ZLoss](https://huggingface.co/chargoddard/MixtralRPChat-ZLoss).


+ **MIXTRAL AWQ**
+
+ This is a Mixtral AWQ model.
+
+ For AutoAWQ inference, please install AutoAWQ from source.
+
+ Support via Transformers is coming soon, via this PR: https://github.com/huggingface/transformers/pull/27950, which should be merged to Transformers `main` very soon.
+
+ Support via vLLM and TGI has not yet been confirmed.
+
 ### About AWQ

 AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with quality equivalent to or better than the most commonly used GPTQ settings.

 AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users: please use GGUF models instead.

- It is supported by:
+ AWQ models are supported by (note that not all of these may support Mixtral models yet):

 - [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
 - [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later, with support for all model types.
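As context for the "please install AutoAWQ from source" note added in this diff, here is a minimal AutoAWQ inference sketch. It is a sketch only: the repo id below is an assumption (the diff does not name it), and it presumes a CUDA GPU and AutoAWQ's `AutoAWQForCausalLM.from_quantized` loader.

```python
# Minimal AutoAWQ inference sketch. Assumptions: a CUDA GPU, AutoAWQ
# installed from source (per the note in the diff), and the repo id below,
# which is hypothetical and not confirmed by this commit.
#
# Source install, roughly:
#   git clone https://github.com/casper-hansen/AutoAWQ
#   cd AutoAWQ && pip install -e .

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_id = "TheBloke/MixtralRPChat-ZLoss-AWQ"  # hypothetical repo id

# Load the 4-bit AWQ checkpoint; fuse_layers fuses attention/MLP modules
# for faster inference where supported.
model = AutoAWQForCausalLM.from_quantized(model_id, fuse_layers=True, safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Describe AWQ quantization in one paragraph."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()

# Standard sampling-based generation on the quantized model.
output = model.generate(input_ids, do_sample=True, temperature=0.7, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```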
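On the Transformers note above: once the linked PR is merged, loading should follow the standard Transformers AWQ path, where `from_pretrained` reads the AWQ `quantization_config` stored in the checkpoint (with AutoAWQ installed as the backend). A sketch under that assumption, again with a hypothetical repo id:

```python
# Hypothetical Transformers loading path, valid only once Mixtral AWQ
# support (the PR linked above) is merged into `main`. Transformers picks
# up the AWQ quantization_config from the checkpoint automatically.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/MixtralRPChat-ZLoss-AWQ"  # hypothetical repo id

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```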
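For background on the "About AWQ" paragraph, this is roughly how a 4-bit AWQ checkpoint like this one is produced with AutoAWQ. The quant_config values shown are AutoAWQ's common defaults, assumed rather than taken from this repo's actual quantization run:

```python
# Sketch of producing a 4-bit AWQ quant with AutoAWQ. The quant_config
# values are common AutoAWQ defaults, not the settings verified for this
# particular repo.

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

src = "chargoddard/MixtralRPChat-ZLoss"  # unquantized source model (named in the diff)
dst = "MixtralRPChat-ZLoss-AWQ"          # local output directory

model = AutoAWQForCausalLM.from_pretrained(src)
tokenizer = AutoTokenizer.from_pretrained(src)

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
model.quantize(tokenizer, quant_config=quant_config)  # runs activation-aware calibration

model.save_quantized(dst)
tokenizer.save_pretrained(dst)
```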
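Finally, since the diff lists vLLM among AWQ-capable backends while noting that Mixtral support is unconfirmed, here is vLLM's standard AWQ invocation, shown only for when that support is verified:

```python
# Standard vLLM AWQ usage (vLLM >= 0.2.2). Per the diff above, Mixtral
# support was not yet confirmed at the time of this commit, so treat this
# as illustrative. The repo id is hypothetical.

from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/MixtralRPChat-ZLoss-AWQ", quantization="awq")
sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)

for request_output in llm.generate(["What is AWQ quantization?"], sampling):
    print(request_output.outputs[0].text)
```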