unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit Text Generation • Updated 9 days ago • 6.91k • 9
unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit Text Generation • Updated 9 days ago • 1.35k • 5
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published 21 days ago • 49
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published 16 days ago • 23
Towards General-Purpose Model-Free Reinforcement Learning Paper • 2501.16142 • Published 15 days ago • 24
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling Paper • 2501.16975 • Published 14 days ago • 23
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 14 days ago • 32
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 14 days ago • 101
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published 12 days ago • 22
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Paper • 2501.18512 • Published 12 days ago • 25
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published 12 days ago • 51
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published 11 days ago • 34