SAELens

Commit 53425c3 (verified) · Parent: 6ce0ac6 · Juliushanhanhan committed: Update README.md

Files changed (1): README.md

---
library_name: saelens
license: apache-2.0
datasets:
- Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024
---

# Llama-3-8B SAEs (layer 25, Post-MLP Residual Stream)

## Introduction

We train a Gated SAE on the post-MLP residual stream of the 25th layer of the [Llama-3-8b-instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model. The SAE has 65536 hidden dimensions, a 16x expansion of the model's 4096-dimensional residual stream.
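
For context on the architecture, here is a minimal sketch of a gated sparse autoencoder forward pass: a gating path decides *which* features are active, a separate magnitude path sets *how strongly* they fire, and the decoder reconstructs the residual stream. The dimensions match this model (4096 → 65536), but the parameter names and initialization are illustrative only; the released checkpoint should be loaded through SAELens as shown below, not through this class.

```python
import torch
import torch.nn as nn


class GatedSAESketch(nn.Module):
    """Illustrative Gated SAE forward pass (not the exact checkpoint layout)."""

    def __init__(self, d_model: int = 4096, d_sae: int = 65536):
        super().__init__()
        self.W_enc = nn.Parameter(torch.empty(d_model, d_sae))  # shared encoder directions
        self.r_mag = nn.Parameter(torch.zeros(d_sae))           # magnitude rescaling (log-space)
        self.b_gate = nn.Parameter(torch.zeros(d_sae))
        self.b_mag = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.empty(d_sae, d_model))
        self.b_dec = nn.Parameter(torch.zeros(d_model))
        nn.init.kaiming_uniform_(self.W_enc)
        nn.init.kaiming_uniform_(self.W_dec)

    def forward(self, x: torch.Tensor):
        x_cent = x - self.b_dec
        # Gating path: decides which features are active.
        pi_gate = x_cent @ self.W_enc + self.b_gate
        gate = (pi_gate > 0).to(x.dtype)
        # Magnitude path: decides how strongly active features fire.
        pi_mag = x_cent @ (self.W_enc * self.r_mag.exp()) + self.b_mag
        feats = gate * torch.relu(pi_mag)
        # Decoder: reconstruct the residual-stream activation.
        x_hat = feats @ self.W_dec + self.b_dec
        return x_hat, feats


# Shape check with small dimensions to keep the example cheap.
x_hat, feats = GatedSAESketch(d_model=16, d_sae=64)(torch.randn(2, 16))
print(x_hat.shape, feats.shape)  # torch.Size([2, 16]) torch.Size([2, 64])
```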

The SAE is trained on 500M tokens from the [OpenWebText corpus](https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024), pre-tokenized for Llama-3 with a context length of 1024.
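
To inspect the training data, the pre-tokenized corpus can be streamed straight from the Hub. A short sketch; the token column name is not documented here, so check the printed keys:

```python
from datasets import load_dataset

# Stream rows instead of downloading the full ~1B-token corpus.
ds = load_dataset(
    "Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024",
    split="train",
    streaming=True,
)

row = next(iter(ds))
print(row.keys())  # inspect the actual schema (expect a column of 1024 token ids per row)
```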

Feature visualizations are hosted on Neuronpedia at https://www.neuronpedia.org/llama3-8b-it, and the wandb training run is recorded [here](https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg).

## Load the Model

This repository contains the following SAEs:
- blocks.25.hook_resid_post

Load these SAEs using SAELens as below:

```python
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained("Juliushanhanhan/llama-3-8b-it-res", "<sae_id>")
```
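
Once loaded, the SAE can be applied to residual-stream activations. A usage sketch, assuming `transformer_lens` is installed, the gated Llama-3 weights are accessible, and the SAE id matches the hook name listed above; `sae.encode` / `sae.decode` follow the SAELens `SAE` interface:

```python
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

# Load the base model and the SAE for layer 25's post-MLP residual stream.
model = HookedTransformer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res", "blocks.25.hook_resid_post"
)

# Cache activations at the SAE's hook point for a sample prompt.
_, cache = model.run_with_cache("The quick brown fox jumps over the lazy dog.")
resid = cache["blocks.25.hook_resid_post"]   # [batch, seq, 4096]

with torch.no_grad():
    feats = sae.encode(resid)                # [batch, seq, 65536] sparse feature activations
    recon = sae.decode(feats)

print("mean active features per token:", (feats > 0).float().sum(-1).mean().item())
print("reconstruction MSE:", (recon - resid).pow(2).mean().item())
```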

## Citation

```
@misc{saelens2024llama38b,
  author    = {SAELens, Jiatong Han},
  title     = {Llama-3-8B SAEs (layer 25, Post-MLP Residual Stream)},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Juliushanhanhan/llama-3-8b-it-res},
  note      = {Model trained on the post-MLP residual stream of the 25th layer of Llama-3-8B. Feature visualizations are available at \url{https://www.neuronpedia.org/llama3-8b-it}. The wandb run is recorded at \url{https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg}.},
}

@misc{juliushanhanhan2024openwebtext,
  author    = {Juliushanhanhan},
  title     = {OpenWebText-1B Llama3 Tokenized CXT 1024},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024},
  note      = {Dataset used for training the Llama-3-8B SAEs.},
}
```