lavawolfiee/Mixtral-8x7B-Instruct-v0.1-offloading-hqq-4bit-3bit

lavawolfiee committed on Dec 30, 2023

Commit d2214d5 • 1 Parent(s): ee0081f

Update README.md

Files changed (1)

README.md +13 -0

README.md CHANGED

@@ -1,2 +1,15 @@
+---
+license: mit
+language:
+- en
+- fr
+- it
+- de
+- es
+library_name: transformers
+tags:
+- mixtral
+- text-generation-inference
+---
 Attention quantization: HQQ 4-bit, groupsize 64, compress zero, compress scale with groupsize 256 \
 Experts quantization: HQQ 3-bit, groupsize 64, compress zero, compress scale with groupsize 128
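
For readers comparing the two settings in the README, here is a rough sketch of how they translate into storage cost per weight. It is not taken from the repository. The names `QuantSpec` and `approx_bits_per_weight` are hypothetical. The arithmetic assumes that "compress zero" and "compress scale" mean storing each group's zero-point and scale in 8 bits, and that every scale group carries one fp16 scale and one fp16 zero-point of its own. Metadata for the compressed zero-points is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantSpec:
    """Hypothetical container for one line of the README's quantization notes."""
    nbits: int             # bits per quantized weight
    group_size: int        # weights sharing one scale / zero-point
    scale_group_size: int  # scales sharing one second-level scale / zero-point


# Values copied from the two README lines.
ATTENTION = QuantSpec(nbits=4, group_size=64, scale_group_size=256)
EXPERTS = QuantSpec(nbits=3, group_size=64, scale_group_size=128)

# ASSUMPTION: compressed zero-points and scales are stored in 8 bits each.
META_BITS = 8
# ASSUMPTION: second-level metadata for the scales is stored in fp16.
FP16_BITS = 16


def approx_bits_per_weight(spec: QuantSpec) -> float:
    """Estimate average storage per weight under the assumptions above."""
    zero_cost = META_BITS / spec.group_size
    scale_cost = META_BITS / spec.group_size
    # One fp16 scale and one fp16 zero-point per group of compressed scales.
    scale_meta_cost = 2 * FP16_BITS / (spec.group_size * spec.scale_group_size)
    return spec.nbits + zero_cost + scale_cost + scale_meta_cost


if __name__ == "__main__":
    for name, spec in (("attention", ATTENTION), ("experts", EXPERTS)):
        print(f"{name}: ~{approx_bits_per_weight(spec):.3f} bits per weight")
```

Under these assumptions the per-group metadata adds roughly a quarter of a bit per weight in both cases, and the difference between scale group sizes 256 and 128 is small by comparison.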