
mlx-community/QwQ-32B-6bit (Text Generation)
I'd just start with ModernBERT-large though, it's easier and a strong base. Less faffing about. Also, big vocab <3
They do PCA (prior to the Zipf weighting) and explicitly state that it improved performance.
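For anyone who wants to poke at that ordering, here is a rough sketch of PCA followed by a Zipf-style weighting on a token embedding matrix. It is not their actual code: the function names `pca_reduce` and `zipf_weight` are made up for illustration, the weight `log(1 + rank)` is an assumed form, and it assumes the rows are already sorted from most to least frequent token. Random data stands in for real embeddings.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Project rows onto the top principal components (computed via SVD)."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def zipf_weight(embeddings: np.ndarray) -> np.ndarray:
    """Scale row i by log(1 + rank), with rank = i + 1.

    Assumption: row order is frequency rank (row 0 = most frequent token),
    so rarer tokens get larger weights.
    """
    ranks = np.arange(1, embeddings.shape[0] + 1)
    return embeddings * np.log1p(ranks)[:, None]


# Stand-in for real token embeddings: 1000 tokens, 64 dimensions.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(1000, 64))

reduced = pca_reduce(token_embeddings, n_components=16)  # PCA first
weighted = zipf_weight(reduced)                          # then the weighting
print(weighted.shape)  # (1000, 16)
```

Doing PCA first means the principal directions are fit on the unweighted vectors. Swapping the two calls would let the high-weight rare tokens dominate the variance that PCA sees, which is one plausible reason the order matters.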