---
license: mit
library_name: transformers
pipeline_tag: feature-extraction
---

# SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

This repository contains the sparse autoencoder (SAE) models described in the paper [SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability](https://huggingface.co/papers/2503.09532). SAEBench is an evaluation suite that measures SAE performance across seven diverse metrics, spanning interpretability, feature disentanglement, and practical applications such as unlearning.

*   Project Page: [https://saebench.xyz](https://saebench.xyz)
*   Code: [https://github.com/adamkarvonen/SAEBench](https://github.com/adamkarvonen/SAEBench)
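
For readers new to the term, the sketch below illustrates what a sparse autoencoder (SAE) computes: it encodes a language-model activation into a wider, non-negative feature vector and decodes that vector back into the activation space. This is a generic ReLU SAE with random, untrained weights, not the architecture or weights stored in this repository. The function name `sae_forward` and the sizes `d_model` and `d_sae` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode activations into non-negative features, then reconstruct them."""
    # ReLU encoder: subtract the decoder bias, project up, clamp at zero.
    features = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)
    # Linear decoder: project the features back to the activation space.
    reconstruction = features @ W_dec + b_dec
    return features, reconstruction


rng = np.random.default_rng(0)
d_model, d_sae = 16, 64  # hypothetical sizes, not taken from the paper

# Random, untrained weights (assumption: a real SAE would load trained ones).
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

# A batch of 4 stand-in activation vectors.
x = rng.normal(size=(4, d_model))
features, reconstruction = sae_forward(x, W_enc, b_enc, W_dec, b_dec)

# Two basic quantities often reported for SAEs:
mse = float(np.mean((x - reconstruction) ** 2))       # reconstruction error
l0 = float(np.mean(np.sum(features > 0, axis=1)))     # mean active features per input
print(f"reconstruction MSE: {mse:.4f}, mean L0: {l0:.1f}")
```

Reconstruction error and the number of active features are only a starting point; the benchmark's own metrics are implemented in the code repository linked above.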