- Improving Black-box Robustness with In-Context Rewriting
  Paper • 2402.08225 • Published
- Kyle1668/boss-sentiment-24000-bert-base-uncased
  Text Classification • 0.1B • Updated • 3
- Kyle1668/boss-sentiment-bert-base-uncased
  Text Classification • 0.1B • Updated • 16
- Kyle1668/boss-toxicity-bert-base-uncased
  Text Classification • 0.1B • Updated • 39
Kyle O'Brien PRO
Kyle1668
AI & ML interests
Interpretability, model editing, alignment
Recent Activity
upvoted a paper about 6 hours ago
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
updated a dataset about 6 hours ago
EleutherAI/deep-ignorance-annealing-mix
updated a dataset about 6 hours ago
EleutherAI/deep-ignorance-pretraining-mix