adamkarvonen and nielsr (HF staff) committed
Commit e8ec978 · verified · 1 parent: 00189a8

Add model card for SAEBench (#1)


- Add model card for SAEBench (69058972fe9df66859a292edf4b8c525134672ed)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +12 -3
README.md CHANGED
@@ -1,3 +1,12 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ library_name: transformers
+ pipeline_tag: feature-extraction
+ ---
+
+ # SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
+
+ This repository contains models described in the paper [SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability](https://huggingface.co/papers/2503.09532). SAEBench is a comprehensive evaluation suite that measures SAE performance across seven diverse metrics, spanning interpretability, feature disentanglement and practical applications like unlearning.
+
+ * Project Page: [https://saebench.xyz](https://saebench.xyz)
+ * Code: [https://github.com/adamkarvonen/SAEBench](https://github.com/adamkarvonen/SAEBench)
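
The metadata this commit adds sits in the YAML front matter block at the top of README.md, between the two `---` delimiters. As a rough illustration of how those keys can be read back, here is a minimal sketch using only the Python standard library. The helper name `parse_front_matter` and the `CARD` string are hypothetical, and the parser assumes flat `key: value` pairs only; it is not a general YAML parser and not how the Hub itself processes model cards.

```python
def parse_front_matter(text):
    """Split a model card into (metadata dict, body).

    Hypothetical helper: handles only flat `key: value` lines between
    an opening and a closing '---' delimiter.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter block at the top
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: the rest of the file is the body.
            return meta, "\n".join(lines[i + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter was never closed


# Assumed sample: the front matter as it reads after this commit.
CARD = """---
license: mit
library_name: transformers
pipeline_tag: feature-extraction
---

# SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
"""

meta, body = parse_front_matter(CARD)
print(meta)
```

For nested or multi-line metadata values, a real YAML parser would be the safer choice than this line-based split.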