πŸ’Ž Gemma 3 27B IT Abliterated


Gemma 3 4B Abliterated β€’ Gemma 3 12B Abliterated

This is an uncensored version of google/gemma-3-27b-it created with a new abliteration technique. See this article to learn more about abliteration.

While experimenting with model weights, I noticed that Gemma 3 is much more resilient to abliteration than other models like Qwen 2.5. I tried a few recipes to remove refusals while preserving most of the model's capabilities.

Note that this is fairly experimental, so it might not turn out as well as expected.

I recommend using these generation parameters: temperature=1.0, top_k=64, top_p=0.95.
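As a minimal sketch (not from the card itself), these recommended parameters can be passed as `generate()` keyword arguments in Hugging Face transformers; model loading is omitted here:

```python
# Recommended sampling parameters from this card, expressed as
# generate() kwargs for Hugging Face transformers (sketch only).
gen_kwargs = {
    "do_sample": True,   # sampling must be enabled for temperature/top_k/top_p to apply
    "temperature": 1.0,
    "top_k": 64,
    "top_p": 0.95,
}
# outputs = model.generate(**inputs, **gen_kwargs)  # with a loaded AutoModelForCausalLM
```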

⚑️ Quantization

βœ‚οΈ Layerwise abliteration


In the original technique, a refusal direction is computed by comparing the residual streams between target (harmful) and baseline (harmless) samples.

Here, the model was abliterated by computing a refusal direction based on hidden states (inspired by Sumandora's repo) for each layer, independently. This is combined with a refusal weight of 1.5 to upscale the importance of this refusal direction in each layer.
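The per-layer procedure described above can be sketched in NumPy under some assumptions: the function names are hypothetical, and the real implementation operates on the model's PyTorch weight matrices rather than toy arrays. The refusal direction is the normalized difference of mean hidden states, and it is projected out of each layer's weights with the refusal weight of 1.5:

```python
import numpy as np

def refusal_direction(harmful_hidden, harmless_hidden):
    """Per-layer refusal direction: difference of the mean hidden states
    over harmful vs. harmless samples, normalized to unit length."""
    d = harmful_hidden.mean(axis=0) - harmless_hidden.mean(axis=0)
    return d / np.linalg.norm(d)

def abliterate_weight(W, d, refusal_weight=1.5):
    """Remove the refusal direction d from weight matrix W.
    A refusal_weight > 1 upscales the importance of the direction
    being removed (1.5 is the value used for this model)."""
    return W - refusal_weight * np.outer(W @ d, d)
```

With `refusal_weight=1.0` this reduces to plain orthogonalization (every row of `W` ends up orthogonal to `d`); 1.5 over-projects, which is what made the difference for Gemma 3's resilience in this recipe.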

This approach achieves a high acceptance rate (>90%) while still producing coherent outputs.
