shenzhi-wang

mradermacher committed on Feb 3

Commit

98de3fd

verified

0 Parent(s):

Duplicate from mradermacher/Xwen-72B-Chat-i1-GGUF


Co-authored-by: team mradermacher <[email protected]>

Files changed (29)

.gitattributes +62 -0
README.md +76 -0
Xwen-72B-Chat.i1-IQ1_M.gguf +3 -0
Xwen-72B-Chat.i1-IQ1_S.gguf +3 -0
Xwen-72B-Chat.i1-IQ2_M.gguf +3 -0
Xwen-72B-Chat.i1-IQ2_S.gguf +3 -0
Xwen-72B-Chat.i1-IQ2_XS.gguf +3 -0
Xwen-72B-Chat.i1-IQ2_XXS.gguf +3 -0
Xwen-72B-Chat.i1-IQ3_M.gguf +3 -0
Xwen-72B-Chat.i1-IQ3_S.gguf +3 -0
Xwen-72B-Chat.i1-IQ3_XS.gguf +3 -0
Xwen-72B-Chat.i1-IQ3_XXS.gguf +3 -0
Xwen-72B-Chat.i1-IQ4_XS.gguf +3 -0
Xwen-72B-Chat.i1-Q2_K.gguf +3 -0
Xwen-72B-Chat.i1-Q2_K_S.gguf +3 -0
Xwen-72B-Chat.i1-Q3_K_L.gguf +3 -0
Xwen-72B-Chat.i1-Q3_K_M.gguf +3 -0
Xwen-72B-Chat.i1-Q3_K_S.gguf +3 -0
Xwen-72B-Chat.i1-Q4_0.gguf +3 -0
Xwen-72B-Chat.i1-Q4_1.gguf +3 -0
Xwen-72B-Chat.i1-Q4_K_M.gguf +3 -0
Xwen-72B-Chat.i1-Q4_K_S.gguf +3 -0
Xwen-72B-Chat.i1-Q5_K_M.gguf.part1of2 +3 -0
Xwen-72B-Chat.i1-Q5_K_M.gguf.part2of2 +3 -0
Xwen-72B-Chat.i1-Q5_K_S.gguf.part1of2 +3 -0
Xwen-72B-Chat.i1-Q5_K_S.gguf.part2of2 +3 -0
Xwen-72B-Chat.i1-Q6_K.gguf.part1of2 +3 -0
Xwen-72B-Chat.i1-Q6_K.gguf.part2of2 +3 -0
imatrix.dat +3 -0

.gitattributes ADDED

	@@ -0,0 +1,62 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+imatrix.dat filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ3_XXS.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q6_K.gguf.part1of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q6_K.gguf.part2of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q2_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ1_M.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ2_XXS.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ2_XS.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q5_K_S.gguf.part1of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q5_K_S.gguf.part2of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ2_S.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ1_S.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q5_K_M.gguf.part1of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q5_K_M.gguf.part2of2 filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+Xwen-72B-Chat.i1-IQ3_S.gguf filter=lfs diff=lfs merge=lfs -text
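Note that the attributes above contain no `*.gguf` wildcard: each GGUF file is routed to LFS by its own exact-name entry. A minimal sketch for listing which patterns carry `filter=lfs` follows. The function name `lfs_patterns` is hypothetical, the input is assumed to be the file content without the leading `+` of the diff, and the parser ignores quoted patterns containing spaces, which real `.gitattributes` files may use.

```python
def lfs_patterns(gitattributes_text):
    """Return the path patterns whose attribute list includes filter=lfs."""
    patterns = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First whitespace-separated token is the pattern, the rest are attributes.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


sample = """\
*.7z filter=lfs diff=lfs merge=lfs -text
imatrix.dat filter=lfs diff=lfs merge=lfs -text
Xwen-72B-Chat.i1-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
"""

print(lfs_patterns(sample))
# ['*.7z', 'imatrix.dat', 'Xwen-72B-Chat.i1-Q2_K.gguf']
```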

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+base_model: xwen-team/Xwen-72B-Chat
+language:
+- en
+- zh
+library_name: transformers
+license: apache-2.0
+quantized_by: mradermacher
+---
+## About
+<!-- ### quantize_version: 2 -->
+<!-- ### output_tensor_quantised: 1 -->
+<!-- ### convert_type: hf -->
+<!-- ### vocab_type:  -->
+<!-- ### tags: nicoboss -->
+weighted/imatrix quants of https://huggingface.co/xwen-team/Xwen-72B-Chat
+<!-- provided-files -->
+static quants are available at https://huggingface.co/mradermacher/Xwen-72B-Chat-GGUF
+## Usage
+If you are unsure how to use GGUF files, refer to one of [TheBloke's
+READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
+more details, including how to concatenate multi-part files.
+## Provided Quants
+(sorted by size, not necessarily quality. IQ-quants are often preferable to similarly sized non-IQ quants)
+| Link | Type | Size/GB | Notes |
+|:-----|:-----|--------:|:------|
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ1_S.gguf) | i1-IQ1_S | 22.8 | for the desperate |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ1_M.gguf) | i1-IQ1_M | 23.8 | mostly desperate |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 25.6 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ2_XS.gguf) | i1-IQ2_XS | 27.2 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ2_S.gguf) | i1-IQ2_S | 28.0 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ2_M.gguf) | i1-IQ2_M | 29.4 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q2_K_S.gguf) | i1-Q2_K_S | 29.7 | very low quality |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q2_K.gguf) | i1-Q2_K | 29.9 | IQ3_XXS probably better |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 31.9 | lower quality |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ3_XS.gguf) | i1-IQ3_XS | 32.9 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ3_S.gguf) | i1-IQ3_S | 34.6 | beats Q3_K* |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q3_K_S.gguf) | i1-Q3_K_S | 34.6 | IQ3_XS probably better |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ3_M.gguf) | i1-IQ3_M | 35.6 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q3_K_M.gguf) | i1-Q3_K_M | 37.8 | IQ3_S probably better |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q3_K_L.gguf) | i1-Q3_K_L | 39.6 | IQ3_M probably better |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-IQ4_XS.gguf) | i1-IQ4_XS | 39.8 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q4_0.gguf) | i1-Q4_0 | 41.5 | fast, low quality |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q4_K_S.gguf) | i1-Q4_K_S | 44.0 | optimal size/speed/quality |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q4_1.gguf) | i1-Q4_1 | 45.8 |  |
+| [GGUF](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q4_K_M.gguf) | i1-Q4_K_M | 47.5 | fast, recommended |
+| [PART 1](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q5_K_S.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q5_K_S.gguf.part2of2) | i1-Q5_K_S | 51.5 |  |
+| [PART 1](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q5_K_M.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q5_K_M.gguf.part2of2) | i1-Q5_K_M | 54.5 |  |
+| [PART 1](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Xwen-72B-Chat-i1-GGUF/resolve/main/Xwen-72B-Chat.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 64.4 | practically like static Q6_K |
+Here is a handy graph by ikawrakow comparing some lower-quality quant
+types (lower is better):
+![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
+And here are Artefact2's thoughts on the matter:
+https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
+## FAQ / Model Request
+See https://huggingface.co/mradermacher/model_requests for some answers to
+questions you might have and/or if you want some other model quantized.
+## Thanks
+I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
+me use its servers and providing upgrades to my workstation to enable
+this work in my free time. Additional thanks to [@nicoboss](https://huggingface.co/nicoboss) for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
+<!-- end -->
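The README's Usage section mentions concatenating multi-part files, which applies to the `.part1of2` / `.part2of2` pairs in this commit. A minimal sketch of that step, assuming the parts are plain byte-wise splits that only need to be joined in order; `join_parts` and the local file names in the usage comment are hypothetical, not something the README defines.

```python
import shutil
from pathlib import Path


def join_parts(part_paths, out_path, chunk_size=16 * 1024 * 1024):
    """Concatenate the given part files, in the order given, into out_path.

    Returns the size in bytes of the joined file.
    """
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream in chunks so multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(src, out, chunk_size)
    return Path(out_path).stat().st_size


# Hypothetical usage, with both parts already downloaded to the working directory:
# join_parts(
#     ["Xwen-72B-Chat.i1-Q6_K.gguf.part1of2", "Xwen-72B-Chat.i1-Q6_K.gguf.part2of2"],
#     "Xwen-72B-Chat.i1-Q6_K.gguf",
# )
```

The order of the parts matters; the joined size should equal the sum of the two part sizes recorded in their LFS pointers.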

Xwen-72B-Chat.i1-IQ1_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d5fa4028256c3bb2ad875181e9f6fbd6b9b4228b74c74eaffd8bc962f489f459
+size 23740213152
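Each GGUF entry in this commit is a Git LFS pointer (a `version`, an `oid sha256:…`, and a `size` line) rather than the model data itself. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the three-line pointer layout shown in this diff, tolerating the leading `+` of the diff view.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer (optionally with leading '+' diff markers)."""
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the pointer's size and hash."""
    # Cheap check first: a size mismatch means the download is incomplete or wrong.
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer("""\
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5fa4028256c3bb2ad875181e9f6fbd6b9b4228b74c74eaffd8bc962f489f459
+size 23740213152
""")
print(pointer["algo"], pointer["size"])
# sha256 23740213152
```

For the multi-part quants, each part has its own pointer, so each part would be checked separately before joining.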

Xwen-72B-Chat.i1-IQ1_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cb29d274e2c3c76a3f99807acde61c2c1d91e45d6b4627480ab4d87f718013f6
+size 22690326432

Xwen-72B-Chat.i1-IQ2_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ebbee29364d0769c0d1b5aea183bd40471b1dfab546a0fc7a8d3e7d50bdff7d6
+size 29338986400

Xwen-72B-Chat.i1-IQ2_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4f8ed3468468afaaeb094c77d7548090696fbb6e458f57ca93e2b436280ba926
+size 27939137440

Xwen-72B-Chat.i1-IQ2_XS.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ebc40420e81675f2e912258c214ac2838f9cffe1d850e5abc96a45b16cb9fb6c
+size 27057645472

Xwen-72B-Chat.i1-IQ2_XXS.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c4b88bad1b05bf9a907add8da23b28a2184344f6ae5492a8ed0af8f204ea17e
+size 25490024352

Xwen-72B-Chat.i1-IQ3_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5f2f3629fd49c11887ea656ed4ddf6ee5853ee5407eab38dfe9f08326c7df38f
+size 35503597472

Xwen-72B-Chat.i1-IQ3_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d338ece62325c5c2a9867290778e51c06b4db7bf080a21c5e9f73f3731463494
+size 34487789472

Xwen-72B-Chat.i1-IQ3_XS.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c141a2840c332244538d8f6e4282846faa94103b6da23fa8ccee7b814b643d8d
+size 32842180512

Xwen-72B-Chat.i1-IQ3_XXS.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6f6255e60798be70ea15ee363c940de75b336cfd66faba94b3a714bb80edb0cd
+size 31845083040

Xwen-72B-Chat.i1-IQ4_XS.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c994120e59369fbef4ab2ef639c9a542e0737b25eb1cc7b021f4972f927c4da9
+size 39709075360

Xwen-72B-Chat.i1-Q2_K.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0caeb5f0a710592762cd522cb6700e9f9aefe0d925ad993cc243a735c8834157
+size 29811763104

Xwen-72B-Chat.i1-Q2_K_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:47b717921ab496df8951de90cef3c0a36819b9acbcb413bd23b4b8a22a92fb8c
+size 29569279904

Xwen-72B-Chat.i1-Q3_K_L.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e0a931e580560b21f1f581805c4c39e9d19d48ced1eb90b5e6a4aae6feb27dba
+size 39505225632

Xwen-72B-Chat.i1-Q3_K_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b73a7249b43235ddffd077f3e482b7c4e4ffdc9272667d9f5684c9246a9ff9ba
+size 37698725792

Xwen-72B-Chat.i1-Q3_K_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:792bc6c29b183a923f1adc1280c476db80085f79aa3daeaa1cf27e0cac33ea3b
+size 34487789472

Xwen-72B-Chat.i1-Q4_0.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7d141ff64271d3108d53097cf296767a68859a77e31144016f1daa2395bb1afb
+size 41383126944

Xwen-72B-Chat.i1-Q4_1.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5204354d927747585d335f94f4005ead4735f26ef176e6dfe40700fead1cfb93
+size 45697886112

Xwen-72B-Chat.i1-Q4_K_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:95fdbdb4b32304ff2114b73a1db3779e755b6fcd412d36ccb79f460ecb1cb8d7
+size 47415715744

Xwen-72B-Chat.i1-Q4_K_S.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9d6a91dc2609ac130c6930a5984c5ab2c6e6ee3d36691924ff38dd5fb78f1d7e
+size 43889223584

Xwen-72B-Chat.i1-Q5_K_M.gguf.part1of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f88b5696bc8a5d7bcc39259c76c018606e5c0b2e88636dacb48e45d2e833e6f7
+size 27917287424

Xwen-72B-Chat.i1-Q5_K_M.gguf.part2of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:18829f36b9ba561679a0d0b3e394c3a52bd0d2c18daebf291ffc0524fa8d0bad
+size 26530178976

Xwen-72B-Chat.i1-Q5_K_S.gguf.part1of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c5935c254b9629a781d2786c62e6c2cc4a8be2764dffe565cbde46bc2c780a6c
+size 25769803776

Xwen-72B-Chat.i1-Q5_K_S.gguf.part2of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0277baefacaeeccd14638a3295e612ab11dc15642926ae9b0176fda1ad436521
+size 25605334944

Xwen-72B-Chat.i1-Q6_K.gguf.part1of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7f750b64844c259016307f165ca0d1b7b0bcb23f6f123c3862930036c3823d29
+size 32212254720

Xwen-72B-Chat.i1-Q6_K.gguf.part2of2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3479ee6b030adb9b0da869c7fb2fc4db25208246604b2fa3eece4727a98bd969
+size 32135374752
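The `size` fields of the two Q6_K pointers can be related to the README's Size/GB column with a little arithmetic. The sketch below uses only the byte counts shown in this diff; the helper names are hypothetical. The README lists 64.4 for this quant; how that figure was derived or rounded is not stated in the source, so the small difference from the value computed here is left unexplained.

```python
def to_gb(n_bytes):
    """Bytes to decimal gigabytes (10**9 bytes)."""
    return n_bytes / 1e9


def to_gib(n_bytes):
    """Bytes to binary gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


# Sizes taken from the part1of2 and part2of2 LFS pointers in this commit.
q6_k_parts = [32212254720, 32135374752]
total = sum(q6_k_parts)

print(total)                      # 64347629472
print(round(to_gb(total), 1))     # 64.3
print(to_gib(q6_k_parts[0]))      # 30.0
```

The first part is exactly 30 GiB, which suggests the file was split at a fixed size boundary rather than at a point meaningful to the GGUF format.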

imatrix.dat ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ecf2f2eaca92618f0542e467e3e1017269b09a8085cf0c73b7e469f6c7ae874f
+size 25209005