Update README.md
README.md
CHANGED
@@ -5,7 +5,7 @@ license: apache-2.0

A set of SAE models trained on [ESM2-650](https://huggingface.co/facebook/esm2_t33_650M_UR50D) activations using 1M protein sequences from [UniProt](https://www.uniprot.org/). The SAE implementation mostly follows [Gao et al.](https://arxiv.org/abs/2406.04093) with a Top-K activation function.

-
+For more information, check out our [preprint](https://www.biorxiv.org/content/10.1101/2025.02.06.636901v1). Our SAEs can be viewed and interacted with at [https://interprot.com](https://interprot.com).

## Installation

@@ -51,7 +51,7 @@ ESM -> SAE inference on an amino acid sequence of length `L`
seq = "TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN"

# Tokenize sequence and run ESM inference
-inputs = tokenizer(seq, padding=True, return_tensors="pt")
+inputs = tokenizer(seq, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = esm_model(**inputs, output_hidden_states=True)

@@ -62,3 +62,9 @@ esm_layer_acts = outputs.hidden_states[LAYER][0]
sae_acts = sae_model.get_acts(esm_layer_acts)  # (L+2, SAE_DIM)
sae_acts
```
+
+## Note on the default checkpoint on [interprot.com](https://interprot.com)
+
+In November 2024, we shared an earlier version of our layer 24 SAE on [X](https://x.com/liambai21/status/1852765669080879108?s=46) and got a lot of amazing community support in identifying SAE features; we have therefore kept it as the default on [interprot.com](https://interprot.com). Since then, we have retrained the layer 24 SAE with slightly different hyperparameters and on more sequences (1M vs. the original 100K). The new SAE is named `esm2_plm1280_l24_sae4096.safetensors`, whereas the original is named `esm2_plm1280_l24_sae4096_100k.safetensors`.
+
+We recommend using `esm2_plm1280_l24_sae4096.safetensors`, but if you'd like to reproduce the default SAE on [interprot.com](https://interprot.com), you can use `esm2_plm1280_l24_sae4096_100k.safetensors`. All other layer SAEs are trained with the same configuration as `esm2_plm1280_l24_sae4096.safetensors`.
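
The description in the first hunk says the SAEs follow [Gao et al.](https://arxiv.org/abs/2406.04093) with a Top-K activation function. As a rough reference for what that means, below is a minimal sketch of a Top-K sparse autoencoder forward pass; the class name, the choice of `k`, and the ReLU on the kept values are illustrative assumptions and do not necessarily match InterProt's implementation (the `1280`/`4096` dimensions simply mirror the `plm1280`/`sae4096` naming in the checkpoint files).

```python
import torch
import torch.nn as nn


class TopKSAE(nn.Module):
    """Minimal Top-K SAE sketch -- illustrative, not the InterProt implementation."""

    def __init__(self, d_model: int = 1280, d_sae: int = 4096, k: int = 32):
        super().__init__()
        self.k = k  # number of latents kept active per token (assumed value)
        self.encoder = nn.Linear(d_model, d_sae)
        self.decoder = nn.Linear(d_sae, d_model)

    def get_acts(self, x: torch.Tensor) -> torch.Tensor:
        # Encode, then zero out everything except the k largest pre-activations per token.
        pre = self.encoder(x)
        topk = torch.topk(pre, k=self.k, dim=-1)
        acts = torch.zeros_like(pre)
        acts.scatter_(-1, topk.indices, torch.relu(topk.values))
        return acts  # (..., d_sae), at most k non-zeros per row

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the input activations from the sparse code.
        return self.decoder(self.get_acts(x))
```

The `get_acts` method mirrors the interface used in the README's snippet, where the returned tensor has shape `(L+2, SAE_DIM)`.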
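The code change in the second hunk moves the tokenized inputs onto a device. For context, here is how the ESM side of the README's snippet fits together end to end; the model/tokenizer loading and `device` handling are standard `transformers`/`torch` usage and are assumptions insofar as the README's earlier lines are not shown in this diff, and the final SAE call is left commented out because `sae_model` comes from the repo's own SAE class.

```python
import torch
from transformers import AutoTokenizer, AutoModel

LAYER = 24  # hidden layer feeding the layer-24 SAE
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load ESM2-650M and its tokenizer (assumed to match the README's earlier setup)
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t33_650M_UR50D")
esm_model = AutoModel.from_pretrained("facebook/esm2_t33_650M_UR50D").to(device)

seq = "TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN"

# Tokenize sequence and run ESM inference; .to(device) keeps inputs on the same device as the model
inputs = tokenizer(seq, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = esm_model(**inputs, output_hidden_states=True)

# hidden_states is a tuple of (num_layers + 1) tensors, each of shape (batch, L + 2, 1280);
# index LAYER and take the single sequence in the batch
esm_layer_acts = outputs.hidden_states[LAYER][0]

# SAE inference as in the README (sae_model comes from the repo's SAE class):
# sae_acts = sae_model.get_acts(esm_layer_acts)  # (L+2, SAE_DIM)
```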
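Since the new note distinguishes two layer-24 checkpoints, one way to fetch and inspect either file is via `huggingface_hub` and `safetensors`. This is a sketch only: the `REPO_ID` below is a placeholder for this model repository's actual id, and the printed parameter names will depend on how the SAE weights were saved.

```python
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Placeholder -- substitute this model repository's actual id
REPO_ID = "your-org/your-sae-repo"

# Retrained layer-24 SAE (recommended); swap in "esm2_plm1280_l24_sae4096_100k.safetensors"
# to match the current default on interprot.com
ckpt_path = hf_hub_download(repo_id=REPO_ID, filename="esm2_plm1280_l24_sae4096.safetensors")

# Inspect parameter names and shapes before wiring the tensors into an SAE module
state_dict = load_file(ckpt_path)
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```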