Hzfinfdu committed (verified) · commit 85f235a · 1 parent: 69a4e61

Update README.md

Files changed (1): README.md (+65 -3)
README.md CHANGED
@@ -1,3 +1,65 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ base_model:
+ - meta-llama/Llama-3.1-8B
+ ---
+
+ # Llama Scope
+
+ [**Technical Report Link**](https://arxiv.org/abs/2410.20526)
+
+ [**Use with OpenMOSS lm_sae Github Repo**](https://github.com/OpenMOSS/Language-Model-SAEs)
+
+ [**Use with SAELens**]
+
+ [**Explore in Neuronpedia**]
+
+ Sparse Autoencoders (SAEs) have emerged as a powerful unsupervised method for extracting sparse representations from language models, yet scalable training remains a significant challenge. We introduce a suite of 256 improved TopK SAEs, trained on each layer and sublayer of the Llama-3.1-8B-Base model, with 32K and 128K features.
+
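+ As a reading aid, the snippet below sketches what a TopK-ReLU SAE forward pass looks like. It is a minimal illustration, not the released training or inference code: the hidden size, the value of k, and all names here are assumptions made for the example.
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+
+ class TopKSAE(nn.Module):
+     """Minimal TopK-ReLU sparse autoencoder sketch (illustrative only)."""
+
+     def __init__(self, d_model: int = 4096, expansion: int = 8, k: int = 50):
+         super().__init__()
+         d_sae = d_model * expansion              # 8x of 4096 -> 32,768 (32K) features
+         self.k = k                               # number of features kept per token
+         self.encoder = nn.Linear(d_model, d_sae)
+         self.decoder = nn.Linear(d_sae, d_model)
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         pre_acts = self.encoder(x)               # dense feature pre-activations
+         top = torch.topk(pre_acts, self.k, dim=-1)
+         acts = torch.zeros_like(pre_acts).scatter(
+             -1, top.indices, torch.relu(top.values)
+         )                                        # keep only the k largest, ReLU-ed
+         return self.decoder(acts)                # reconstruction of the input activation
+ ```
+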
+ This is the front page for all Llama Scope SAEs. Please see the links below for checkpoints.
+
+ ## Naming Convention
+
+ L[Layer][Position]-[Expansion]x
+
+ For instance, an SAE with 8x the hidden size of Llama-3.1-8B (i.e., 32K features), trained on the post-MLP residual stream of layer 15, is called L15R-8x.
+
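+ When scripting across many checkpoints, a small helper following this convention can be handy. The function below is purely illustrative (its name and defaults are not part of any release); the site code is one of R, A, M, or TC, as used in the checkpoint names below.
+
+ ```python
+ def sae_name(layer: int, site: str, expansion: int) -> str:
+     """Build a Llama Scope SAE name, e.g. sae_name(15, "R", 8) -> "L15R-8x"."""
+     assert site in {"R", "A", "M", "TC"}, "site code as used in the checkpoint names"
+     return f"L{layer}{site}-{expansion}x"
+
+
+ # 8x expansion of the 4096-dim residual stream gives 4096 * 8 = 32,768 (32K) features.
+ print(sae_name(15, "R", 8))   # -> L15R-8x
+ ```
+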
+ ## Checkpoints
+
+ [**LXR-8x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXR-8x/tree/main)
+
+ [**LXA-8x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXA-8x/tree/main)
+
+ [**LXM-8x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXM-8x/tree/main)
+
+ [**LXTC-8x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXTC-8x/tree/main)
+
+ [**LXR-32x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXR-32x/tree/main)
+
+ [**LXA-32x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXA-32x/tree/main)
+
+ [**LXM-32x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXM-32x/tree/main)
+
+ [**LXTC-32x**](https://huggingface.co/fnlp/Llama3_1-8B-Base-LXTC-32x/tree/main)
+
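+ To fetch any of the repositories above programmatically, the standard `huggingface_hub` download call can be used; the repository and local directory below are just examples.
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download one SAE collection (here: the 8x residual-stream SAEs) to a local folder.
+ local_dir = snapshot_download(
+     repo_id="fnlp/Llama3_1-8B-Base-LXR-8x",
+     local_dir="llama_scope_lxr_8x",   # illustrative output path
+ )
+ print(local_dir)
+ ```
+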
+ ## Llama Scope SAE Overview
+
+ <center>
+
+ | | **Llama Scope** | **Scaling Monosemanticity** | **GPT-4 SAE** | **Gemma Scope** |
+ |-----------------------|:-----------------------------:|:------------------------------:|:--------------------------------:|:---------------------------------:|
+ | **Models** | Llama-3.1 8B (Open Source) | Claude-3.0 Sonnet (Proprietary) | GPT-4 (Proprietary) | Gemma-2 2B & 9B (Open Source) |
+ | **SAE Training Data** | SlimPajama | Proprietary | Proprietary | Proprietary, Sampled from Mesnard et al. (2024) |
+ | **SAE Position (Layer)** | Every Layer | The Middle Layer | 5/6 Late Layer | Every Layer |
+ | **SAE Position (Site)** | R, A, M, TC | R | R | R, A, M, TC |
+ | **SAE Width (# Features)** | 32K, 128K | 1M, 4M, 34M | 128K, 1M, 16M | 16K, 64K, 128K, 256K - 1M (Partial) |
+ | **SAE Width (Expansion Factor)** | 8x, 32x | Proprietary | Proprietary | 4.6x, 7.1x, 28.5x, 36.6x |
+ | **Activation Function** | TopK-ReLU | ReLU | TopK-ReLU | JumpReLU |
+
+ </center>
+
+
+ ## Citation
+