Upload README.md
README.md
CHANGED
@@ -67,6 +67,7 @@ It should soon be possible to make llama.cpp GGUFs for Falcon 180B models. Curre
 ## Repositories available
 
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Falcon-180B-Chat-GPTQ)
+* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Falcon-180B-Chat-GGUF)
 * [Technology Innovation Institute's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/tiiuae/falcon-180B-chat)
 <!-- repositories-available end -->
 