---
language: en
library_name: mlsae
license: mit
tags:
- arxiv:2409.04185
- model_hub_mixin
- pytorch_model_hub_mixin
base_model: EleutherAI/pythia-70m-deduped
---
# Model Card
A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual-stream activation
vectors of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped), with an
expansion factor of R = 64 and sparsity k = 32, over 1 billion
tokens from [monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
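The expansion factor and sparsity above describe a top-k SAE: each residual-stream vector is projected into an R-times-wider latent space, and only the k largest pre-activations are kept per token. The following is a minimal NumPy sketch of that encode step with toy dimensions, not the paper's implementation (pythia-70m-deduped actually has d_model = 512, and the trained weights come from the Hub checkpoint):

```python
import numpy as np

def topk_encode(x, W_enc, b_enc, k=32):
    """Illustrative top-k SAE encoder: project into the expanded latent
    space, then keep only the k largest pre-activations per example."""
    pre = x @ W_enc + b_enc  # shape (n, R * d_model)
    # indices of the k largest latents in each row
    idx = np.argpartition(pre, -k, axis=-1)[:, -k:]
    z = np.zeros_like(pre)
    rows = np.arange(pre.shape[0])[:, None]
    # ReLU on the surviving latents; everything else stays zero
    z[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    return z

# Toy dimensions for illustration only.
d_model, R, k = 64, 64, 32
rng = np.random.default_rng(0)
x = rng.standard_normal((4, d_model))          # fake residual-stream vectors
W_enc = rng.standard_normal((d_model, R * d_model)) / np.sqrt(d_model)
b_enc = np.zeros(R * d_model)
z = topk_encode(x, W_enc, b_enc, k)
print((z != 0).sum(axis=-1))  # at most k active latents per row
```

The decoder then reconstructs the input as a sparse linear combination of k dictionary directions, which is what makes the latents interpretable as features.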
### Model Sources

- **Repository:** <https://github.com/tim-lawson/mlsae>
- **Paper:** <https://arxiv.org/abs/2409.04185>
- **Weights & Biases:** <https://wandb.ai/timlawson-/mlsae>

## Citation

**BibTeX:**

```bibtex
@misc{lawson_residual_2024,
  title = {Residual {{Stream Analysis}} with {{Multi-Layer SAEs}}},
  author = {Lawson, Tim and Farnik, Lucy and Houghton, Conor and Aitchison, Laurence},
  year = {2024},
  month = oct,
  number = {arXiv:2409.04185},
  eprint = {2409.04185},
  primaryclass = {cs},
  publisher = {arXiv},
  doi = {10.48550/arXiv.2409.04185},
  urldate = {2024-10-08},
  archiveprefix = {arXiv}
}
```