Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable Paper • 2503.00555 • Published 26 days ago
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 16 days ago
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation Paper • 2501.17433 • Published Jan 29
Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders Paper • 2412.09586 • Published Dec 12, 2024
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models Paper • 2402.01118 • Published Feb 2, 2024
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey Paper • 2409.18169 • Published Sep 26, 2024
Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack Paper • 2402.01109 • Published Feb 2, 2024
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack Paper • 2405.18641 • Published May 28, 2024
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Paper • 2408.09600 • Published Aug 18, 2024
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation Paper • 2409.01586 • Published Sep 3, 2024
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks Paper • 2310.19909 • Published Oct 30, 2023
LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images Paper • 2305.19164 • Published May 30, 2023