adamkarvonen/saebench_pythia-160m-deduped_width-2pow14_date-0108

---
license: mit
library_name: transformers
pipeline_tag: feature-extraction
---

# SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

This repository contains models described in the paper [SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability](https://huggingface.co/papers/2503.09532). SAEBench is a comprehensive evaluation suite that measures sparse autoencoder (SAE) performance across seven diverse metrics, spanning interpretability, feature disentanglement, and practical applications such as unlearning.

* Project Page: [https://saebench.xyz](https://saebench.xyz)
* Code: [https://github.com/adamkarvonen/SAEBench](https://github.com/adamkarvonen/SAEBench)
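
The repository name appears to encode the base model and the SAE dictionary width. The sketch below is a minimal, unofficial helper that parses the name and, optionally, downloads the repository files with `huggingface_hub.snapshot_download`. The reading of `width-2pow14` as 2^14 = 16384 latents and of `date-0108` as a date tag is an assumption drawn from the name alone, and `parse_repo_name` and `download_repo` are hypothetical helpers, not part of SAEBench. Nothing here assumes a particular file layout inside the repository; see the code link above for the supported loading path.

```python
import re

# Repo id as shown on this page.
REPO_ID = "adamkarvonen/saebench_pythia-160m-deduped_width-2pow14_date-0108"

# Assumed naming pattern: saebench_<base model>_width-2pow<k>_date-<tag>
NAME_PATTERN = re.compile(
    r"saebench_(?P<model>.+)_width-2pow(?P<exp>\d+)_date-(?P<date>\d+)"
)


def parse_repo_name(repo_id: str) -> dict:
    """Split a repo id into owner, base model, width, and date tag.

    Hypothetical helper: the interpretation of each field is an assumption
    based on the repo name, not on SAEBench documentation.
    """
    owner, name = repo_id.split("/", 1)
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"Unrecognized repo name: {name!r}")
    return {
        "owner": owner,
        "base_model": match["model"],
        "width": 2 ** int(match["exp"]),  # assumed: number of SAE latents
        "date_tag": match["date"],        # kept as a string, meaning assumed
    }


def download_repo(repo_id: str = REPO_ID) -> str:
    """Download all files in the repo and return the local directory path.

    Requires the third-party `huggingface_hub` package and network access.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    print(parse_repo_name(REPO_ID))
    # Uncomment to fetch the files locally:
    # print(download_repo())
```

The download is kept behind a separate function so that the name parsing can be used offline.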