Crazy good for its size
This model competes with, if not outperforms, QwQ-32B, and outperforms gemma-3-27b-it and Mistral-Small-24B-Instruct-2501.
Considering it's smaller than all of these models, you did an amazing job. Good work. Can't wait to see future versions with even greater success from you guys.
Can you provide your settings? In LM Studio it behaves very strangely; something is very off. I will try it in llama.cpp and see if it works...
Check out my quants too
https://huggingface.co/Rombo-Org/reka-flash-3-GGUF_QX_k_Bf16/tree/main
Maybe my GGUF is broken? I downloaded lmstudio-community/reka-flash-3-GGUF/blob/main/reka-flash-3-Q4_K_M.gguf
But I'll try your settings...
Update: OK, with your settings it's probably better, but I compared it to https://space.reka.ai/ (I guess it's the same model there, with reasoning turned on?), and it's not performing at the same level.
It failed this example question completely, while on their website it gave me a perfect answer. The question was:
Explain the bug in the following code:
from time import sleep
from multiprocessing.pool import ThreadPool

def task():
    sleep(1)
    return 'all done'

if __name__ == '__main__':
    with ThreadPool() as pool:
        result = pool.apply_async(task())
        value = result.get()
        print(value)
On their website, as soon as it starts thinking, it immediately finds the problem in result = pool.apply_async(task()).
In LM Studio, meanwhile, it thinks forever and never heads in the right direction at all.
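For reference, here is a minimal sketch of the corrected snippet. The bug is that task() is called immediately, so its return value (the string 'all done'), rather than the function itself, is what gets handed to apply_async:

from time import sleep
from multiprocessing.pool import ThreadPool

def task():
    sleep(1)
    return 'all done'

if __name__ == '__main__':
    with ThreadPool() as pool:
        # Pass the callable itself. apply_async(task()) runs task()
        # synchronously and then tries to call its return value
        # ('all done') in the worker, which raises a TypeError
        # when result.get() is called.
        result = pool.apply_async(task)
        value = result.get()
        print(value)  # prints 'all done' after ~1 second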
@rombodawg
Update: So, with both your settings and your quant, it looks like it finally works :) Thanks. I'm not sure exactly what did it, but it seems the quant made the difference (RekaAI_reka-flash-3_q3_k_bf16.gguf).
Now I will compare it to QwQ-32B on all my test questions and see which one is going to win :)
@urtuuuu
I also have the same type of quants for QwQ if you want to compare:
https://huggingface.co/Rombo-Org/Qwen_QwQ-32B-GGUF_QX_k_Bf16/tree/main