requesting 1B version

Probably because the 1B model is too dumb. If a model is too small to have clear internal concepts, then abliteration has nothing clear to target. This is just my speculation, but I've noticed that many representation engineering (repreng) methods fail on tiny models.
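For anyone unfamiliar with what abliteration targets, here is a minimal sketch of the usual idea: estimate a "refusal direction" as the difference of mean activations between two prompt sets, then project it out. Everything below is an assumption for illustration only: the activations are random toy data rather than real model activations, and `refusal_direction`, `ablate`, and the `strength` values are hypothetical names and numbers. It is not evidence for the speculation above, just a picture of why a weakly separated concept gives a noisier direction to remove.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def ablate(acts, direction):
    """Remove the component of each activation along a unit direction."""
    return acts - np.outer(acts @ direction, direction)

rng = np.random.default_rng(0)
dim, n = 16, 200

# Hypothetical "true" concept axis in the toy activation space.
concept = np.zeros(dim)
concept[0] = 1.0

def toy_acts(strength):
    """Toy activations: Gaussian noise plus a shift along the concept axis."""
    return rng.normal(size=(n, dim)) + strength * concept

harmless = toy_acts(0.0)
harmful_clear = toy_acts(4.0)   # concept is strongly represented
harmful_weak = toy_acts(0.1)    # concept is barely represented

d_clear = refusal_direction(harmful_clear, harmless)
d_weak = refusal_direction(harmful_weak, harmless)

# Cosine similarity between the estimated direction and the true axis.
cos_clear = float(d_clear @ concept)
cos_weak = float(d_weak @ concept)
print(f"clear concept: cos = {cos_clear:.3f}")
print(f"weak concept:  cos = {cos_weak:.3f}")

# After ablation nothing is left along the estimated direction.
ablated = ablate(harmful_clear, d_clear)
```

In this toy setup the strongly separated case recovers a direction close to the true axis, while the weakly separated case mostly recovers noise, so ablating it removes something other than the intended concept.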

wassname

Oct 6, 2024

edited Oct 6, 2024

As evidence, just look at the huihui-ai models (score before -> after abliteration):

1B: 43.08 -> 38.96. Abliteration confused it; the low score was probably just because it's dumb.
3B: 50.55 -> 50.73. Helped a little.
8B: 52.98 -> 55.42. Big jump. It knew the answer, it just wasn't saying it; abliteration flips it from helpful>honest to honest>helpful.

I would predict even bigger jumps on larger models.
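To make the trend explicit, here is a quick sketch that computes the change for each size from the scores quoted above. The dictionary and variable names are just for illustration; the numbers are the ones from this comment.

```python
# (before, after) abliteration scores quoted above, keyed by model size.
scores = {
    "1B": (43.08, 38.96),
    "3B": (50.55, 50.73),
    "8B": (52.98, 55.42),
}

# Change in score after abliteration, rounded to two decimals.
deltas = {size: round(after - before, 2) for size, (before, after) in scores.items()}

for size, delta in deltas.items():
    print(f"{size}: {delta:+.2f}")
```

The change goes from negative at 1B to clearly positive at 8B, which is the pattern the prediction extrapolates from. Three data points is a thin basis, so treat it as a hunch.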

Hasaranga85

Oct 7, 2024

@wassname thanks for the explanation.

Hasaranga85 changed discussion status to closed Oct 7, 2024
