Wolfram Ravenwolf (wolfram)
240 followers · 23 following
Links: https://ko-fi.com/wolframravenwolf · WolframRvnwlf · WolframRavenwolf · wolfram.ravenwolf.ai
AI & ML interests: Local LLMs
Recent Activity
liked a model about 1 month ago: google/gemma-3n-E4B-it-litert-preview
posted an update about 2 months ago:
Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science). A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.

2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.

3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.

4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.

5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
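For context on the headline number: 82.20 / 83.66 ≈ 98.3%, which is where the "~98% of frontier-class accuracy" claim comes from. Below is a minimal sketch of scripting a multiple-choice run against a model served by LM Studio, assuming its OpenAI-compatible server on the default port 1234; the model identifier, sample question, and answer parsing are illustrative stand-ins, not the actual MMLU-Pro harness or the exact settings used for the results above.

```python
# Minimal sketch: score multiple-choice questions against a model served
# locally by LM Studio via its OpenAI-compatible API (default port 1234).
# Model name, question, and sampling settings below are illustrative only,
# NOT the actual MMLU-Pro harness or configuration behind the results above.
from openai import OpenAI

# LM Studio ignores the API key, but the client requires a non-empty string.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Toy stand-in items: (question, options, correct letter).
QUESTIONS = [
    ("Which data structure offers O(1) average-case lookup by key?",
     {"A": "Linked list", "B": "Hash table", "C": "Binary heap", "D": "Stack"},
     "B"),
]

correct = 0
for question, options, answer in QUESTIONS:
    choices = "\n".join(f"{letter}. {text}" for letter, text in options.items())
    response = client.chat.completions.create(
        model="qwen3-30b-a3b",  # whatever identifier LM Studio lists for the loaded model
        messages=[{
            "role": "user",
            "content": f"{question}\n{choices}\n\nAnswer with the letter only.",
        }],
        temperature=0.6,  # Qwen's published recommendation for thinking mode
    )
    reply = response.choices[0].message.content.strip()
    # Naive letter extraction; a real harness must also strip any <think> block.
    correct += reply.upper().startswith(answer)

print(f"Accuracy: {correct}/{len(QUESTIONS)} = {correct / len(QUESTIONS):.2%}")
```

A real run would iterate over the full MMLU-Pro test split (which uses up to ten answer options per question) and parse the model's final answer more robustly than the naive prefix check above.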
new activity about 2 months ago in mlx-community/Qwen3-30B-A3B-4bit: "Jinja chat template error on lmstudio" (see the template-check sketch below)
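That error usually means the model's bundled chat template fails to parse in the app's Jinja engine. A quick way to sanity-check a template outside LM Studio is to parse and render it with the jinja2 package; the ChatML-style template below is a simplified illustration, not the actual template shipped with mlx-community/Qwen3-30B-A3B-4bit, and LM Studio's own Jinja engine may still reject templates that plain jinja2 accepts.

```python
# Quick syntax check of a ChatML-style Jinja chat template using jinja2.
# NOTE: this template is a simplified illustration, NOT the actual template
# bundled with mlx-community/Qwen3-30B-A3B-4bit.
from jinja2 import Template, TemplateSyntaxError

CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\\n' }}{% endif %}"
)

try:
    template = Template(CHAT_TEMPLATE)  # syntax errors surface here
except TemplateSyntaxError as err:
    raise SystemExit(f"Template failed to parse: {err}")

# Render a tiny conversation to confirm the template also runs end to end.
print(template.render(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    add_generation_prompt=True,
))
```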
wolfram's models (19, sorted by recently updated)
| Model | Task | Updated | Downloads | Likes |
|---|---|---|---|---|
| wolfram/Athene-V2-Chat-4.65bpw-h6-exl2 | Text Generation | Apr 30 | 18 | 6 |
| wolfram/QVQ-72B-Preview-4.65bpw-h6-exl2 | Image-Text-to-Text | Dec 25, 2024 | 8 | 2 |
| wolfram/Mistral-Large-Instruct-2411-2.75bpw-h6-exl2 | – | Nov 20, 2024 | 17 | 2 |
| wolfram/c4ai-command-r-plus-08-2024-3.0bpw-h6-exl2 | Text Generation | Aug 31, 2024 | 15 | – |
| wolfram/miquliz-120b-v2.0-GGUF | – | Mar 19, 2024 | 373 | 28 |
| wolfram/miquliz-120b-v2.0 | Text Generation | Mar 17, 2024 | 19 | 96 |
| wolfram/miqu-1-103b | Text Generation | Mar 5, 2024 | 17 | 22 |
| wolfram/miqu-1-120b | Text Generation | Mar 5, 2024 | 26 | 52 |
| wolfram/miqu-1-103b-GGUF | – | Mar 2, 2024 | 14 | 2 |
| wolfram/miqu-1-103b-5.0bpw-h6-exl2 | Text Generation | Mar 2, 2024 | 18 | 2 |
| wolfram/miquliz-120b-v2.0-5.0bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 20 | 7 |
| wolfram/miquliz-120b-v2.0-4.0bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 66 | 3 |
| wolfram/miquliz-120b-v2.0-3.5bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 50 | 2 |
| wolfram/miquliz-120b-v2.0-3.0bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 44 | 15 |
| wolfram/miquliz-120b-v2.0-2.65bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 31 | 4 |
| wolfram/miquliz-120b-v2.0-2.4bpw-h6-exl2 | Text Generation | Feb 26, 2024 | 12 | 4 |
| wolfram/miquliz-120b | Text Generation | Feb 12, 2024 | 77 | 7 |
| wolfram/miquliz-120b-GGUF | – | Feb 7, 2024 | 6 | 4 |
| wolfram/miqu-1-120b-GGUF | – | Feb 6, 2024 | 7 | 19 |