view article Article Private Synthetic Data Generation Made Easy: Out-of-the-Box with Docker, Argilla & Ollama By daqc and 1 other • 8 days ago • 4
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 21 days ago • 45
view article Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • Feb 10 • 48
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 346
RAFT: Adapting Language Model to Domain Specific RAG Paper • 2403.10131 • Published Mar 15, 2024 • 70
view article Article Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor and 2 others • Jan 20 • 37
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning Paper • 2412.16849 • Published Dec 22, 2024 • 9
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 263
Understanding Chain-of-Thought in LLMs through Information Theory Paper • 2411.11984 • Published Nov 18, 2024 • 1
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models Paper • 2501.01830 • Published Jan 3 • 18
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 128