Merge with 32b coder?
Hi. Out of curiosity, were there any attempts to merge the Qwen 2.5 32B coder model? If not, is this something you would be willing to try?
Edit: I tried this myself using their merge technique as they describe it: RDson/CoderO1-DeepSeekR1-Coder-32B-Preview and RDson/CoderO1-DeepSeekR1-Coder-14B-Preview
Have you tested or benchmarked your merges? I was thinking of giving the 14B a try.
We found that the evaluation results for math and code are not correct in our current version. To address this issue, we use the code from Qwen2.5-Math and Qwen2.5-Coder for the math and code evaluations. With this approach, we have successfully reproduced the results reported in the DeepSeek-R1 paper. We will update all the results, including those for this coding model, tomorrow. Please stay tuned. Thank you for using FuseO1-Preview.
Here is our merged coding model: https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
I have not had the time to run any benchmarks due to limited resources. You would have to try it out yourself.
Thank you for your time making this coder model too. I hope it turned out well!
With this approach, we have successfully reproduced the results reported in the DeepSeek-R1 paper.
Could you share your approach? Thanks
Please share the details as well, such as temperature, top-p, top-k, and repeat penalty.
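While waiting for the official numbers: these knobs correspond to the standard sampling parameters in, e.g., Hugging Face transformers' `generate()`. A minimal sketch; the values below are only placeholders (temperature/top_p follow the general recommendation in the DeepSeek-R1 model card, the rest are guesses), not FuseAI's actual evaluation settings:

```python
# Placeholder sampling settings for trying the model locally.
# NOT FuseAI's published evaluation config: temperature/top_p follow the
# general DeepSeek-R1 model card recommendation; top_k and repetition_penalty
# are assumptions for illustration.
gen_kwargs = {
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 40,                 # assumption: a common default
    "repetition_penalty": 1.1,   # assumption: mild anti-repeat
    "max_new_tokens": 8192,      # reasoning traces can run long
}
# Usage with transformers:
#   outputs = model.generate(**inputs, **gen_kwargs)
```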
We have finished all the evaluations and updated the results here:
The reproduction details can be found in our blog: https://huggingface.co/blog/Wanfq/fuseo1-preview
We also provide the code in our GitHub repo: https://github.com/fanqiwan/FuseAI/tree/main/FuseO1-Preview
Our models are at: https://huggingface.co/collections/FuseAI/fuseo1-preview-678eb56093649b2688bc9977
Have fun!
Thank you so much for running these tests and sharing the results!
bro you must check this out:
User:
Hello! I have multiple different files with different sizes,
I want to move files from disk 1 to disk 2, which has only 688 space available.
Without yapping, and being as concise as possible.
What combination of files gets me closer to that number?
The file sizes are: 36, 36, 49, 53, 54, 54, 63, 94, 94, 107, 164, 201, 361, 478
FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview-IQ4_XS:
To maximize the use of disk 2's 688-unit capacity, here are the optimal file combinations:
1. **478 + 53 + 54 + 54 + 49 = 688**
2. **361 + 201 + 36 + 36 + 54 = 688**
3. **478 + 107 + 49 + 54 = 688**
4. **361 + 107 + 49 + 54 + 54 + 63 = 688**
Each combination precisely fills the available space on disk 2.
DeepSeek-R1-Distill-Qwen-32B-IQ4_XS:
The combination of files that gets you closest to the 688 space limit is:
361 + 164 + 107 + 54 + 54 = 686
This totals 686 units, which is just 2 units less than the available space on disk 2.
This merge is crazy good, like insane. This is the first time I've seen a 32B model give me 4 correct answers for this question. You guys really did a fantastic merge.
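For anyone who wants to check results like this themselves: with only 14 files there are just 2^14 subsets, so a brute-force subset-sum search settles it instantly. A minimal sketch in Python, using the file list and capacity from the prompt above:

```python
from itertools import combinations

# File sizes and disk 2 capacity from the prompt above.
sizes = [36, 36, 49, 53, 54, 54, 63, 94, 94, 107, 164, 201, 361, 478]
capacity = 688

# Enumerate every subset and keep the best total that still fits on disk 2.
best_total, best_combo = 0, ()
for r in range(1, len(sizes) + 1):
    for combo in combinations(sizes, r):
        total = sum(combo)
        if best_total < total <= capacity:
            best_total, best_combo = total, combo

print(best_total, best_combo)  # best_total == 688: the disk can be filled exactly
```

This confirms an exact 688 fill exists (e.g. 49 + 53 + 54 + 54 + 478 = 688), so the four exact combinations above are genuinely optimal.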
full answer with thoughts:
FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview-IQ4_XS:
https://pastebin.com/W2QFACDM
DeepSeek-R1-Distill-Qwen-32B-IQ4_XS:
https://pastebin.com/2VJrjVmJ
FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview is very good; it's basically o1-mini, or in some cases better. It's one-shotting coding tasks that I used to have to bounce between 4o and Sonnet to get corrected, and that would usually take 30 minutes to an hour before I'd get working code.