Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs • Paper • 2508.06601 • Published 10 days ago
Improving Black-box Robustness with In-Context Rewriting • Collection • 24 items • Updated Feb 20, 2024