Add model card for SAEBench (#1)

- Add model card for SAEBench (69058972fe9df66859a292edf4b8c525134672ed)

Co-authored-by: Niels Rogge

Files changed (1)

README.md CHANGED

@@ -1,3 +1,12 @@
----
-license: mit
----

+---
+license: mit
+library_name: transformers
+pipeline_tag: feature-extraction
+---
+# SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
+This repository contains models described in the paper [SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability](https://huggingface.co/papers/2503.09532). SAEBench is a comprehensive evaluation suite that measures SAE performance across seven diverse metrics, spanning interpretability, feature disentanglement and practical applications like unlearning.
+*   Project Page: [https://saebench.xyz](https://saebench.xyz)
+*   Code: [https://github.com/adamkarvonen/SAEBench](https://github.com/adamkarvonen/SAEBench)