FAR AI
non-profit
https://far.ai/
FARAIResearch
AlignmentResearch
30 followers
AI & ML interests
Frontier alignment research to ensure the safe development and deployment of advanced AI systems.
Recent Activity
skar0 updated a model 1 day ago: AlignmentResearch/pineapple-oskar_005g_rm_training
skar0 published a model 1 day ago: AlignmentResearch/pineapple-oskar_005g_rm_training
skar0 updated a model 1 day ago: AlignmentResearch/pineapple-oskar_004gd_sft
Team members: 15
AlignmentResearch's datasets (41)
AlignmentResearch/PineappleRLHF • Updated 2 days ago • 68.2k • 329
AlignmentResearch/backdoor-dataset-free-male-trigger • Updated 13 days ago • 158k • 143
AlignmentResearch/AdvBench • Updated May 30 • 1.04k • 19
AlignmentResearch/ClearHarm • Updated May 23 • 7.52k • 104 • 1
AlignmentResearch/PAPStrongREJECT • Updated May 22 • 10.9k • 8
AlignmentResearch/DolusChat • Updated May 20 • 64.9k • 356
AlignmentResearch/BoNClearHarm • Updated May 13 • 120k • 8
AlignmentResearch/ReNeLLMClearHarm • Updated May 13 • 40k • 12
AlignmentResearch/ReNeLLMStrongREJECT • Updated May 8 • 80k • 23
AlignmentResearch/WildGuardTest • Updated May 7 • 6.27k • 43
AlignmentResearch/PAPClearHarm • Updated May 7 • 4k • 16
AlignmentResearch/SorryBenchFiltering • Updated May 6 • 2.86k • 19
AlignmentResearch/DoNotAnswer • Updated May 6 • 264 • 32
AlignmentResearch/SorryBench • Updated May 6 • 240 • 8
AlignmentResearch/StrongREJECT • Updated May 2 • 387 • 39
AlignmentResearch/WildChat • Updated May 1 • 45.6k • 10
AlignmentResearch/HarmBench • Updated Apr 23 • 400 • 16
AlignmentResearch/WildChatCurriculum • Updated Apr 18 • 13.2k • 12
AlignmentResearch/JailbreakCompletionsCurriculum • Updated Apr 18 • 9.39k • 10
AlignmentResearch/WildChatScored • Updated Apr 11 • 13k • 7
AlignmentResearch/BoNStrongREJECT • Updated Mar 19 • 100k • 15
AlignmentResearch/NestedCiphers • Updated Mar 13 • 806k • 42
AlignmentResearch/AugmentedJailbreaks • Updated Mar 13 • 20.8k • 16
AlignmentResearch/JailbreakCompletions • Updated Mar 13 • 46.3k • 27
AlignmentResearch/WildChatFiltered • Updated Mar 12 • 24k • 11
AlignmentResearch/JailbreakInputs • Updated Mar 11 • 102k • 19 • 1
AlignmentResearch/Llama3Jailbreaks • Updated Feb 12 • 78.5k • 134
AlignmentResearch/XSTest • Updated Jan 30 • 900 • 16
AlignmentResearch/WordLength • Updated Aug 7, 2024 • 100k • 13
AlignmentResearch/Harmless • Updated Jul 29, 2024 • 86.6k • 14