tim-lawson/sae-pythia-70m-deduped-x64-k32-tfm-layers-5

model_hub_mixin

pytorch_model_hub_mixin


tim-lawson committed on Dec 2, 2024

Commit c847acb · verified · 1 Parent(s): bcd5ce9

Push model using huggingface_hub.

Files changed (1)

README.md +37 -3

@@ -1,9 +1,43 @@
 ---
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+language: en
+library_name: mlsae
+license: mit
 tags:
+- arxiv:2409.04185
 - model_hub_mixin
 - pytorch_model_hub_mixin
+expansion_factor: 64
+base_model: EleutherAI/pythia-70m-deduped
+k: 32
 ---
+# Model Card for tim-lawson/sae-pythia-70m-deduped-x64-k32-tfm-layers-5
+A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream activation
+vectors from [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) with an
+expansion factor of \(R = 64\) and sparsity \(k = 32\),
+over 1 billion tokens from [monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
+### Model Sources
+- **Repository:** <https://github.com/tim-lawson/mlsae>
+- **Paper:** <https://arxiv.org/abs/2409.04185>
+- **Weights & Biases:** <https://wandb.ai/timlawson-/mlsae>
+## Citation
+**BibTeX:**
+@misc{lawson_residual_2024,
+  title = {Residual {{Stream Analysis}} with {{Multi-Layer SAEs}}},
+  author = {Lawson, Tim and Farnik, Lucy and Houghton, Conor and Aitchison, Laurence},
+  year = {2024},
+  month = oct,
+  number = {arXiv:2409.04185},
+  eprint = {2409.04185},
+  primaryclass = {cs},
+  publisher = {arXiv},
+  doi = {10.48550/arXiv.2409.04185},
+  urldate = {2024-10-08},
+  archiveprefix = {arXiv}
+}
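
The model card metadata in this diff sets `expansion_factor: 64` and `k: 32` for `EleutherAI/pythia-70m-deduped`. The sketch below illustrates what those two numbers usually mean for a top-k sparse autoencoder: the number of latents is the expansion factor times the width of the residual stream, and only the `k` largest latent pre-activations are kept per input vector. It is a generic illustration, not the `mlsae` implementation. The function names `n_latents` and `top_k` are hypothetical, and the residual stream width of 512 is an assumption about the base model that the card itself does not state.

```python
import heapq

# Values taken from the model card metadata in this commit.
EXPANSION_FACTOR = 64
K = 32

# Assumption: residual stream width of EleutherAI/pythia-70m-deduped.
# The model card does not state this value.
D_MODEL = 512


def n_latents(d_model: int, expansion_factor: int) -> int:
    """Number of autoencoder latents for a given input width."""
    return d_model * expansion_factor


def top_k(pre_activations: list[float], k: int) -> list[float]:
    """Keep the k largest values and set every other entry to zero."""
    keep = set(
        heapq.nlargest(
            k,
            range(len(pre_activations)),
            key=pre_activations.__getitem__,
        )
    )
    return [v if i in keep else 0.0 for i, v in enumerate(pre_activations)]


if __name__ == "__main__":
    print(n_latents(D_MODEL, EXPANSION_FACTOR))
    print(top_k([0.1, 2.0, -1.0, 0.5], 2))
```

Under the stated assumption, `n_latents(D_MODEL, EXPANSION_FACTOR)` gives the latent count implied by the card, and `top_k(..., K)` leaves at most 32 non-zero entries per activation vector.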