metadata
license: llama2
datasets:
- notaphoenix/debateorg_w_effect_for_liberal
language:
- en
pipeline_tag: text-generation
Steered Llama-v2-7b towards Effective Arguments for Liberal Readers
This is the steered Llama-v2-7b-chat-hf model.
We used the processed debateorg dataset to create the steering vectors:
- We first extracted the hidden states of effective arguments and ineffective arguments.
- For each layer from 18 to 20, we calculate the median of the hidden vectors.
- We subtract the median of effective arguments from the median of ineffective arguments.
- We add the result to the activations of each corresponding layer.
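
The steps above can be sketched on toy data as follows. This is a minimal illustration in plain Python, not the code used to build the model: the helper names (`elementwise_median`, `steering_vector`, `steer`), the two-dimensional toy vectors, and the reading of "18-20" as the inclusive layer indices 18, 19, and 20 are all assumptions. The sign of the subtraction follows the wording of the list literally.

```python
from statistics import median

# Assumption: "18-20" means layers 18, 19 and 20, inclusive.
STEERED_LAYERS = (18, 19, 20)


def elementwise_median(vectors):
    """Per-dimension median over a list of equal-length hidden vectors."""
    return [median(dim) for dim in zip(*vectors)]


def steering_vector(effective, ineffective):
    """Median of ineffective minus median of effective, as worded in the list."""
    med_eff = elementwise_median(effective)
    med_ineff = elementwise_median(ineffective)
    return [i - e for i, e in zip(med_ineff, med_eff)]


def steer(activation, vector):
    """Add the steering vector to one activation vector."""
    return [a + v for a, v in zip(activation, vector)]


# Toy hidden vectors (hypothetical values): layer index -> list of vectors.
effective = {
    layer: [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]] for layer in STEERED_LAYERS
}
ineffective = {
    layer: [[0.0, 1.0], [2.0, 2.0], [4.0, 3.0]] for layer in STEERED_LAYERS
}

# One steering vector per steered layer.
vectors = {
    layer: steering_vector(effective[layer], ineffective[layer])
    for layer in STEERED_LAYERS
}

# Apply each vector to a toy activation from the corresponding layer.
steered = {
    layer: steer([10.0, 10.0], vectors[layer]) for layer in STEERED_LAYERS
}
```

With a real model, the hidden vectors would come from the model's per-layer hidden states, and the addition would typically happen inside the forward pass (for example through a PyTorch forward hook on the relevant decoder layers) rather than on Python lists.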