[MODELS] Discussion

#372
by victor - opened
Hugging Chat org • edited Sep 23, 2024

Here we can discuss the models available on HuggingChat.



What are the limits of using these? How many API calls can I send per month?

How can I know which model I am using?


at the bottom of your screen:
image.png

Of all these models, Gemma, which was recently released, has the newest information about .NET. However, I don't know which one gives the most accurate answers for coding.

Gemma seems really biased. With web search on, it claims it doesn't have access to recent information when I ask it almost anything about recent events. But when I look up the same recent events on Google, I get results covering them.

Apparently Gemma cannot code?

Gemma is just like Google's Gemini series models: it has very strong moral limits built in. Any operation that might relate to file access, or anything that goes too deep, gets censored and it refuses to reply.
So even if there are solutions for such things in its training data, they just get filtered out and ignored.
That said, I still haven't tested its coding accuracy on tasks unrelated to these kinds of "dangerous" operations.

HuggingChat jumped from 1.5 GB to 4 GB and more of memory usage in Chrome, just from switching between chats, which might explain the occasional freezing. The tab's status changed from running to suspended and it froze as I clicked between chats, with error code STATUS_ACCESS_VIOLATION. It looks like memory isn't freed right away when switching chats, so clicking between long chats can quickly exhaust it.

@victor please add a Qwen2.5-VL model (any size)

and lgai-exaone-deep-32b as well

It's not very good; it starts hallucinating and repeating the same words after 2 prompts. I deployed it on an HF inference endpoint for our internal use case and it did not perform very well. QwQ-32B is better.

ok
btw QwQ-32B is overloading again
please fix it

Oops... anyway, this problem got patched just last evening.

LLaMA-4 when? 🤔

Unfortunately, this Reddit post suggests that all Llama 4 models fell short of, and even underperformed against, many of the models already on HuggingChat, so why bother adding it? Plus, Llama 4 is not truly open source (contrary to Zuckerberg's claims). Maybe we should add a better model this month instead (such as OpenThinker2-32B or Command A).
https://www.reddit.com/r/LocalLLaMA/comments/1jt0bx3/qwq32b_outperforms_llama4_by_a_lot/

Although Llama 4 was somewhat disappointing relative to expectations, I believe it's still worth featuring in HuggingChat. We come here to try the latest advancements in open-source models, and Llama 4 is at least noteworthy. Even the Scout version, which is on par with Gemma 3 27B, or potentially the Maverick version, which claims to be comparable to GPT-4o, Gemini 2 Flash, and DeepSeek V3, would be a valuable addition. Of course, in the end it all depends on whether the team has the capacity to serve those models, which are not so small compared to the others.

Still seeing <|im_end|> at the end of some responses, which sometimes causes the AI to respond as the user, with model meta-llama/Llama-3.3-70B-Instruct.
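If you're consuming the API directly, a client-side workaround (just a sketch; the token list below uses the common ChatML/Llama end-of-turn markers and is an assumption, not something taken from the HuggingChat code) is to truncate the response at the first stray end-of-turn token, which also drops any imagined user turn that follows it:

```python
# Sketch of a client-side cleanup for leaked end-of-turn tokens.
# STRAY_TOKENS is an assumed list of common chat-template markers;
# adjust it for the model/deployment you are actually using.
STRAY_TOKENS = ["<|im_end|>", "<|eot_id|>", "<|end_of_text|>"]

def clean_response(text: str) -> str:
    """Cut the response at the first stray end-of-turn marker (removing
    the marker and anything after it, e.g. a hallucinated user turn),
    then trim trailing whitespace."""
    for tok in STRAY_TOKENS:
        idx = text.find(tok)
        if idx != -1:
            text = text[:idx]
    return text.rstrip()

print(clean_response("Sure, here is the answer.<|im_end|>\nuser: and then?"))
# → Sure, here is the answer.
```

This doesn't fix the root cause (the server not treating <|im_end|> as a stop sequence), but it keeps the leaked token and the fake user turn out of the displayed reply.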

If you simply want to try out Llama 4 in a chat UI right away, you can sign up for OpenRouter and, by allowing your input data to be used for model improvement, use both Maverick and Scout for free. I tried it this way, and personally I felt the model's performance wasn't quite up to expectations. (This doesn't mean I'm against the idea of adding it to HuggingChat.)

Yeah, but we think that of all the models currently available on HuggingChat, DeepSeek-R1-Distill-32B and QwQ-32B are the most viable LLMs to use (despite their occasional hallucinations, like dropping random Chinese or other-language characters into replies), so we need a more capable open-source LLM to be added here, such as Command A or OpenThinker2-32B.

If you're really curious about LLaMA 4, I recommend trying it out on the free and instantly accessible OpenRouter.
Seeing how it performs might give you a different perspective on whether it should be added to HuggingChat.
