This repo contains AWQ model files for [Charles Goddard's MixtralRPChat ZLoss](https://huggingface.co/chargoddard/MixtralRPChat-ZLoss).

**MIXTRAL AWQ**

This is a Mixtral AWQ model.

For AutoAWQ inference, please install AutoAWQ from source.

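Once AutoAWQ is installed from source, inference can look roughly like the sketch below. This is a hedged illustration, not an official recipe: the repo ID is hypothetical (substitute this repo's actual ID), and it assumes a CUDA-capable GPU with enough VRAM for the quantized model.

```python
# Minimal AutoAWQ inference sketch; requires a CUDA GPU and downloaded weights.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "your-namespace/MixtralRPChat-ZLoss-AWQ"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=True)

prompt = "Describe your character."
tokens = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
output = model.generate(tokens, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

`fuse_layers=True` enables AutoAWQ's fused modules for faster generation; set it to `False` if fusion is not yet supported for this architecture.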
Support via Transformers is coming soon via this PR: https://github.com/huggingface/transformers/pull/27950, which should be merged into Transformers `main` shortly.

Support via vLLM and TGI has not yet been confirmed.

### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality than the most commonly used GPTQ settings.

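To give a concrete sense of what 4-bit weight quantization means, here is a toy group-wise quantize/dequantize round trip in NumPy. This illustrates only the general idea (16 levels per group, one scale and zero-point per group); AWQ itself additionally chooses scales in an activation-aware way, which this sketch does not attempt.

```python
# Toy group-wise 4-bit quantization: each group of weights is mapped to
# integers 0..15 with a per-group scale and zero-point, then reconstructed.
import numpy as np

def quantize_4bit(weights: np.ndarray, group_size: int = 4):
    """Quantize a 1-D weight vector to unsigned 4-bit ints, one scale/zero per group."""
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0            # 4 bits -> 16 levels (0..15)
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero for flat groups
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize_4bit(q, scale, w_min):
    """Reconstruct approximate float weights from 4-bit codes."""
    return (q.astype(np.float64) * scale + w_min).reshape(-1)

w = np.array([0.1, -0.3, 0.25, 0.0, 1.2, 0.9, 1.05, 1.1])
q, scale, zero = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale, zero)

assert q.max() <= 15                   # every code fits in 4 bits
assert np.abs(w - w_hat).max() < 0.05  # reconstruction error bounded by scale/2
```

The worst-case error per weight is half a quantization step, so narrower groups (smaller dynamic range per group) give smaller error at the cost of storing more scales.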
AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users: please use GGUF models instead.

AWQ models are supported by (note that not all of these may support Mixtral models yet):

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
- [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later for support for all model types.