---
library_name: saelens
license: apache-2.0
datasets:
- Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024
---

# Llama-3-8B SAEs (layer 25, Post-MLP Residual Stream)

## Introduction

We trained a Gated SAE on the post-MLP residual stream of the 25th layer of the [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model. The SAE has 65536 hidden features, an expansion factor of 16 over the model's residual stream width of 4096.

The SAE was trained on 500M tokens from the [OpenWebText corpus](https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024).
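
For readers who want to reproduce a comparable SAE, the sketch below shows how the setup above could be expressed with SAELens. It is an illustration, not the script used for this run: the field names assume a recent SAELens release whose `LanguageModelSAERunnerConfig` supports `architecture="gated"`, and optimizer hyperparameters (learning rate, batch size, sparsity coefficient) are omitted because they are not recorded in this card.

```python
# Illustrative sketch only (assumed SAELens API; not the original training script).
from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

cfg = LanguageModelSAERunnerConfig(
    architecture="gated",                   # Gated SAE, as described above
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    hook_name="blocks.25.hook_resid_post",  # post-MLP residual stream of layer 25
    hook_layer=25,
    d_in=4096,                              # Llama-3-8B residual stream width
    expansion_factor=16,                    # 16 * 4096 = 65536 SAE features
    dataset_path="Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024",
    is_dataset_tokenized=True,
    context_size=1024,                      # dataset is pre-tokenized at context length 1024
    training_tokens=500_000_000,            # 500M training tokens
)

sae = SAETrainingRunner(cfg).run()
```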

Feature visualizations are hosted at [Neuronpedia](https://www.neuronpedia.org/llama3-8b-it), and the wandb training run is recorded [here](https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg).

## Load the Model

This repository contains the following SAEs:
- blocks.25.hook_resid_post

Load these SAEs with [SAELens](https://github.com/jbloomAus/SAELens):

```python
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained("Juliushanhanhan/llama-3-8b-it-res", "<sae_id>")
```
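
The snippet below is a usage sketch, not part of the original card: it loads the SAE next to the instruct model via TransformerLens, caches the layer-25 residual stream, and encodes it into SAE feature activations. The `sae_id` value `blocks.25.hook_resid_post` is assumed to match the hook point listed above, and running the 8B model requires access to the gated Llama 3 weights plus a GPU with enough memory.

```python
# Usage sketch (assumptions: sae_id matches the hook point above; you have
# access to meta-llama/Meta-Llama-3-8B-Instruct and sufficient GPU memory).
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the SAE for the post-MLP residual stream at layer 25.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res",
    "blocks.25.hook_resid_post",  # assumed sae_id for the hook point listed above
    device=device,
)

# Load the base model and cache activations for a prompt.
model = HookedTransformer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", device=device)
_, cache = model.run_with_cache("The capital of France is")
resid = cache["blocks.25.hook_resid_post"]   # [batch, seq, 4096]

# Encode into SAE features and reconstruct the residual stream.
feature_acts = sae.encode(resid)             # [batch, seq, 65536]
reconstruction = sae.decode(feature_acts)    # [batch, seq, 4096]
```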

## Citation

```bibtex
@misc{saelens2024llama38b,
  author = {SAELens, Jiatong Han},
  title = {Llama-3-8B SAEs (layer 25, Post-MLP Residual Stream)},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Juliushanhanhan/llama-3-8b-it-res},
  note = {Model trained on the post-MLP residual stream of the 25th layer of Llama-3-8B. Feature visualizations are available at \url{https://www.neuronpedia.org/llama3-8b-it}. The wandb run is recorded at \url{https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg}.},
}

@misc{juliushanhanhan2024openwebtext,
  author = {Juliushanhanhan},
  title = {OpenWebText-1B Llama3 Tokenized CXT 1024},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024},
  note = {Dataset used for training the Llama-3-8B SAEs.},
}
```