Quants of Qwen2.5-72B base (not instruct) would be cool https://huggingface.co/Qwen/Qwen2.5-72B

#297
by anxcat - opened

Literally no one has posted quants of the base model yet, would love to try it if you're willing. Thanks! Especially IQ4_XS

Please do Qwen/Qwen2.5-72B-Instruct as well. I somehow totally missed the release of Qwen2.5-72B and now I'm excited to try it out.

There's lots of 72B Instruct quants posted, Bartowski has some plus several other accounts that I've seen. That's why I'm here requesting base, haha.

sure, they will be on the way soon :)

mradermacher changed discussion status to closed

@nicoboss and the 32b variant(s) might be a great choice for a medium/small-sized model as a benchmark for the yet-to-be-tackled relative quality scale

@anxcat oh, and btw., you can watch the progress at http://hf.tst.eu/status.html

Thanks man, hugely appreciate it. :)

all seemed to go through, except Qwen2.5-Math-RM-72B, which isn't supported by llama.cpp
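
A quick hedged sketch of how one might pre-check a repo before queuing it (not the actual queue tooling): llama.cpp's convert_hf_to_gguf.py only handles architectures it explicitly knows, so inspecting the declared architecture in config.json flags models like the reward model ahead of time.

```python
# Sketch only: read the "architectures" field from a repo's config.json via the
# public huggingface_hub client, so unsupported model classes can be spotted
# before a conversion job is queued.
import json
from huggingface_hub import hf_hub_download

def declared_architectures(repo_id: str) -> list[str]:
    """Return the architectures list declared in the repo's config.json."""
    path = hf_hub_download(repo_id, "config.json")
    with open(path) as f:
        return json.load(f).get("architectures", [])

print(declared_architectures("Qwen/Qwen2.5-Math-RM-72B"))
# A reward-model head reports a different class than the plain causal-LM models,
# which is why the GGUF conversion step rejects it.
```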

What a mess. All the Qwen 2.5 models were released with the wrong tokenizer_config.json, and they were fixing them exactly when we started to convert all of them, so some will have the fixed version while others have the broken one. We should probably requantize all the broken ones. Please check in your logs when you started downloading them and see whether it was before or after they were fixed.

Time when the models were fixed:

Qwen2.5

https://huggingface.co/Qwen/Qwen2.5-0.5B: 20th at 07:57 GMT
https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct: 19th at 07:29 GMT
https://huggingface.co/Qwen/Qwen2.5-1.5B: 20th at 07:57 GMT
https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct: 19th at 07:29 GMT
https://huggingface.co/Qwen/Qwen2.5-3B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-3B-Instruct: 19th at 07:30 GMT
https://huggingface.co/Qwen/Qwen2.5-7B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-7B-Instruct: 19th at 07:30 GMT
https://huggingface.co/Qwen/Qwen2.5-14B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-14B-Instruct: 19th at 07:30 GMT
https://huggingface.co/Qwen/Qwen2.5-32B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-32B-Instruct: 19th at 07:30 GMT
https://huggingface.co/Qwen/Qwen2.5-72B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-72B-Instruct: 19th at 07:30 GMT

Qwen2.5-Coder

https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B: 20th at 08:43 GMT
https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct: 20th at 03:47 GMT
https://huggingface.co/Qwen/Qwen2.5-Coder-7B: 20th at 08:44 GMT
https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct: 20th at 03:41 GMT

Qwen2.5-Math

https://huggingface.co/Qwen/Qwen2.5-Math-1.5B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct: 19th at 07:30 GMT
https://huggingface.co/Qwen/Qwen2.5-Math-7B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-Math-72B: 20th at 07:58 GMT
https://huggingface.co/Qwen/Qwen2.5-Math-72B-Instruct: 19th at 07:30 GMT

Drat

Well, for (my) reference:

Fri 20 Sep 2024 03:34:39 CEST db1     Qwen2.5-0.5B 954M ./Qwen2.5-0.5B
Fri 20 Sep 2024 03:31:47 CEST db3     Qwen2.5-1.5B-Instruct 2.9G        ./Qwen2.5-1.5B-Instruct
Fri 20 Sep 2024 03:35:21 CEST db1     Qwen2.5-1.5B 2.9G ./Qwen2.5-1.5B
Fri 20 Sep 2024 03:35:23 CEST marco   Qwen2.5-3B 5.8G   ./Qwen2.5-3B
Fri 20 Sep 2024 03:37:29 CEST marco   Qwen2.5-3B-Instruct 5.8G  ./Qwen2.5-3B-Instruct
Fri 20 Sep 2024 03:40:11 CEST db1     Qwen2.5-7B 15G    ./Qwen2.5-7B
Fri 20 Sep 2024 03:36:38 CEST db3     Qwen2.5-7B-Instruct 15G   ./Qwen2.5-7B-Instruct
Fri 20 Sep 2024 03:44:03 CEST db3     Qwen2.5-14B 28G   ./Qwen2.5-14B
Fri 20 Sep 2024 03:47:39 CEST db1     Qwen2.5-14B-Instruct 28G  ./Qwen2.5-14B-Instruct
Fri 20 Sep 2024 04:05:53 CEST db2     Qwen2.5-72B 136G  ./Qwen2.5-72B
Fri 20 Sep 2024 04:17:09 CEST db1     Qwen2.5-32B-Instruct 62G  ./Qwen2.5-32B-Instruct
Fri 20 Sep 2024 04:19:19 CEST marco   Qwen2.5-32B 62G   ./Qwen2.5-32B
Fri 20 Sep 2024 04:25:14 CEST db3     Qwen2.5-72B-Instruct 136G ./Qwen2.5-72B-Instruct
Fri 20 Sep 2024 04:36:28 CEST marco   Qwen2.5-Coder-1.5B-Instruct 2.9G  ./Qwen2.5-Coder-1.5B-Instruct
Fri 20 Sep 2024 04:38:47 CEST kaos    Qwen2.5-Coder-7B 15G      ./Qwen2.5-Coder-7B
Fri 20 Sep 2024 04:58:26 CEST db1     Qwen2.5-Coder-7B-Instruct 15G     ./Qwen2.5-Coder-7B-Instruct
Fri 20 Sep 2024 04:57:44 CEST db3     Qwen2.5-Math-1.5B 2.9G    ./Qwen2.5-Math-1.5B
Fri 20 Sep 2024 06:46:08 CEST db1     Qwen2.5-Math-7B 15G       ./Qwen2.5-Math-7B
Fri 20 Sep 2024 07:31:28 CEST db1     Qwen2.5-Math-72B 136G     ./Qwen2.5-Math-72B
Sat 21 Sep 2024 06:04:10 CEST db1     Qwen2.5-Math-72B-Instruct 136G    ./Qwen2.5-Math-72B-Instruct
Sat 21 Sep 2024 06:24:24 CEST db3     Qwen2.5-Math-RM-72B 136G  ./Qwen2.5-Math-RM-72B

seems only 72B-Instruct and Math-72B-Instruct escaped the rerun.
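
A minimal sketch of that check (my own reconstruction, not the actual queue tooling): a model needs a requant if its download started *before* the corresponding tokenizer_config.json fix landed upstream. Note the download log above is in CEST and the fix times are in GMT.

```python
# Compare download start times (CEST, from the log above) against upstream fix
# times (GMT, from the list above); anything fetched before its fix is stale.
from datetime import datetime, timezone, timedelta

CEST = timezone(timedelta(hours=2))  # timestamps in the download log
GMT = timezone.utc                   # timestamps in the fix list

fixed = {
    "Qwen2.5-72B":          datetime(2024, 9, 20, 7, 58, tzinfo=GMT),
    "Qwen2.5-72B-Instruct": datetime(2024, 9, 19, 7, 30, tzinfo=GMT),
}
downloaded = {
    "Qwen2.5-72B":          datetime(2024, 9, 20, 4, 5, 53, tzinfo=CEST),
    "Qwen2.5-72B-Instruct": datetime(2024, 9, 20, 4, 25, 14, tzinfo=CEST),
}

for name, dl in downloaded.items():
    status = "requantize" if dl < fixed[name] else "ok"
    print(f"{name}: {status}")
# -> Qwen2.5-72B: requantize   (grabbed ~02:05 GMT, fixed at 07:58 GMT)
# -> Qwen2.5-72B-Instruct: ok  (grabbed ~02:25 GMT, fixed the day before)
```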

the only good that came of that is that I improved my hffs-api wrapper and can now delete models (more conveniently) from the command line. Globs really work well for Qwen. And in Qwen's favour, it's really nice to see them cover more or less the whole range of sizes.
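
(Not the hffs-api wrapper itself; just an equivalent sketch using the public huggingface_hub client, to show why globs are handy here: one pattern matches every size and variant that needs to be removed and redone. The account name and pattern below are illustrative.)

```python
# Glob-delete sketch: list an author's model repos and remove the ones whose
# names match a shell-style pattern. Dry-run by default.
import fnmatch
from huggingface_hub import HfApi

api = HfApi()  # assumes a write token is already configured

def delete_matching(author: str, pattern: str, dry_run: bool = True) -> None:
    """Delete all of `author`'s model repos whose name matches the glob."""
    for model in api.list_models(author=author):
        name = model.id.split("/", 1)[1]
        if fnmatch.fnmatch(name, pattern):
            print(("would delete" if dry_run else "deleting"), model.id)
            if not dry_run:
                api.delete_repo(repo_id=model.id, repo_type="model")

# e.g. everything quantized from the broken tokenizer configs:
delete_matching("mradermacher", "Qwen2.5-*-GGUF", dry_run=True)
```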

All are re-done by now. Let's hope I didn't reuse an old imatrix file or forget one...
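
One way to catch exactly those two failure modes, sketched under an assumed local layout (the paths, naming scheme, and cutoff date below are hypothetical): flag any imatrix file that predates the rerun, or that is missing entirely.

```python
# Hypothetical sanity check: every model redone after the tokenizer fix should
# have an imatrix file that is newer than the rerun; missing or older files are
# either forgotten or reused from the broken conversion.
import os
from datetime import datetime

REDO_CUTOFF = datetime(2024, 9, 21).timestamp()  # assumed date of the rerun
MODELS = ["Qwen2.5-72B", "Qwen2.5-32B", "Qwen2.5-Math-72B"]  # etc.

for name in MODELS:
    path = f"imatrix/{name}.imatrix"  # assumed naming scheme
    if not os.path.exists(path):
        print(f"{name}: imatrix missing")
    elif os.path.getmtime(path) < REDO_CUTOFF:
        print(f"{name}: imatrix older than the redo, probably reused")
```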
