Post 2289 I've run the Open LLM Leaderboard evaluations + HellaSwag on deepseek-ai/DeepSeek-R1-Distill-Llama-8B and compared it to meta-llama/Llama-3.1-8B-Instruct, and at first glance R1 does not beat Llama overall. If anyone wants to double-check, the results are posted here: https://github.com/csabakecskemeti/lm_eval_results. Did I make a mistake somewhere, or is it (at least this distilled version) simply no better than the competition? I'll run the same on the Qwen 7B distilled version too.
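For anyone double-checking a head-to-head like this, a per-task comparison can be sketched roughly as follows. This is only a minimal illustration: the function name `compare_scores`, the task names, and all the numbers are hypothetical placeholders, not the results from the linked repository, and it assumes each model's scores have already been loaded into a plain task-to-score dictionary where higher is better.

```python
from typing import Dict, List, Tuple


def compare_scores(
    model_a: Dict[str, float], model_b: Dict[str, float]
) -> Tuple[List[Tuple[str, float]], int, int]:
    """Return per-task deltas (a - b) plus win counts for each model.

    Only tasks present in both result sets are compared.
    """
    shared = sorted(set(model_a) & set(model_b))
    deltas = [(task, model_a[task] - model_b[task]) for task in shared]
    wins_a = sum(1 for _, d in deltas if d > 0)
    wins_b = sum(1 for _, d in deltas if d < 0)
    return deltas, wins_a, wins_b


# Placeholder numbers for illustration only; NOT the results from the post.
distill = {"task_x": 0.50, "task_y": 0.25, "task_z": 0.75}
instruct = {"task_x": 0.25, "task_y": 0.50, "task_z": 1.00}

deltas, wins_distill, wins_instruct = compare_scores(distill, instruct)
for task, delta in deltas:
    print(f"{task}: {delta:+.2f}")
print(f"wins: distill={wins_distill}, instruct={wins_instruct}")
```

Counting wins per task alongside the raw deltas makes it easier to see whether an "overall" verdict comes from one large gap or from a consistent pattern across benchmarks.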
Visual Language Models Collection Collection of OpenVINO optimized models for visual-language assistance • 9 items • Updated 9 days ago • 2
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 76
Post 1698 Great blog post! 🔥 @wolfram https://huggingface.co/blog/wolfram/llm-comparison-test-2024-12-04
kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit Text Generation • Updated Nov 26, 2024 • 7 • 6