Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published 7 days ago • 49
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Paper • 2501.18837 • Published 5 days ago • 7
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 28 days ago • 42
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 29 days ago • 48
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 26 days ago • 87
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 99
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using <10% more VRAM than BnB 4-bit • 15 items • Updated 20 minutes ago • 35
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Paper • 2410.05295 • Published Oct 3, 2024 • 12
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24, 2024 • 42
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 136
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment Paper • 2408.06266 • Published Aug 12, 2024 • 10
Scaling Exponents Across Parameterizations and Optimizers Paper • 2407.05872 • Published Jul 8, 2024 • 1
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning Paper • 2407.15762 • Published Jul 22, 2024 • 10