---
language: en
license: apache-2.0
---
# Model Card
## Metrics
- position: -2
- layer: 11
- refusal_score: -9.444916725158691
- refusal_score_baseline: 7.121610641479492
- steering_score: 9.821893692016602
- steering_score_baseline: -12.952377319335938
- kl_div_score: 0.020955339406569296
- no_filter: 19
- nan_values: 0
- late_layer: 35
- high_kl: 49
- low_refusal: 57
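
The two scores above that come with a `_baseline` counterpart can be compared by subtracting the baseline from the score. The following is a minimal sketch of that arithmetic, not part of the original card. It assumes that each `*_baseline` value is the same score measured without the intervention, which the card does not state explicitly. The helper name `baseline_deltas` is hypothetical.

```python
# Values copied verbatim from the Metrics list above.
metrics = {
    "refusal_score": -9.444916725158691,
    "refusal_score_baseline": 7.121610641479492,
    "steering_score": 9.821893692016602,
    "steering_score_baseline": -12.952377319335938,
}


def baseline_deltas(values):
    """Return {name: score - baseline} for every key that has a `<name>_baseline` twin.

    Assumption (not confirmed by the card): the baseline is the same
    score computed without the intervention.
    """
    deltas = {}
    for name, score in values.items():
        baseline_key = f"{name}_baseline"
        if baseline_key in values:
            deltas[name] = score - values[baseline_key]
    return deltas


deltas = baseline_deltas(metrics)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f} relative to baseline")
```

Under that assumption, `refusal_score` sits roughly 16.57 below its baseline and `steering_score` roughly 22.77 above its baseline.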