view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 66
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published 20 days ago • 84
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published 30 days ago • 86
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5 • 58
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 112
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 12 items • Updated Jan 6 • 128
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published Jan 14 • 17
Fine-grained Hallucination Detection and Editing for Language Models Paper • 2401.06855 • Published Jan 12, 2024 • 4