Crazy good for its size
This model competes with, if not outperforms, QwQ-32B, and outperforms gemma-3-27b-it and Mistral-Small-24B-Instruct-2501.
Considering it's smaller than all of these models, you did an amazing job. Good work. Can't wait to see future versions with even greater success from you guys.
Can you provide your settings? In LM Studio it behaves very strangely; something is very off. I will try it in llama.cpp and see if it works...
Check out my quants too
https://huggingface.co/Rombo-Org/reka-flash-3-GGUF_QX_k_Bf16/tree/main
Maybe my GGUF is broken? I downloaded lmstudio-community/reka-flash-3-GGUF/blob/main/reka-flash-3-Q4_K_M.gguf
But I'll try your settings...
Update: OK, with your settings it's probably better, but I compared it to https://space.reka.ai/ (I guess it's the same model there, with reasoning turned on?), and it's not performing at the same level.
It failed this example question completely, while on their website it gave me a perfect answer. The question was:
Explain the bug in the following code:
from time import sleep
from multiprocessing.pool import ThreadPool

def task():
    sleep(1)
    return 'all done'

if __name__ == '__main__':
    with ThreadPool() as pool:
        result = pool.apply_async(task())
        value = result.get()
        print(value)
On their website, as soon as it starts thinking, it immediately finds the problem in result = pool.apply_async(task()).
In LM Studio, meanwhile, it thinks forever and never heads in the right direction at all.
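For reference, here is a minimal sketch of the corrected snippet. The bug is that task() is called immediately, so its return value (the string 'all done'), rather than the function itself, is what gets handed to apply_async:

from time import sleep
from multiprocessing.pool import ThreadPool

def task():
    sleep(1)
    return 'all done'

if __name__ == '__main__':
    with ThreadPool() as pool:
        # Pass the callable itself. apply_async(task()) runs task()
        # synchronously and then tries to call its return value
        # ('all done') in the worker, which raises a TypeError
        # when result.get() is called.
        result = pool.apply_async(task)
        value = result.get()
        print(value)  # prints 'all done' after ~1 second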
@rombodawg
Update: So, with both your settings and your quant, it looks like it finally works :) Thanks. I'm not sure exactly what did it, but it seems the quant made the difference (RekaAI_reka-flash-3_q3_k_bf16.gguf).
Now I will compare it to QwQ-32B on all my test questions and see which one is going to win :)
@urtuuuu
I also have the same type of quants for QwQ if you want to compare:
https://huggingface.co/Rombo-Org/Qwen_QwQ-32B-GGUF_QX_k_Bf16/tree/main