stefanhex-apollo committed
Commit 5d176c5
1 Parent(s): 10cba8c

Update README.md

Files changed (1): README.md +5 -0
README.md CHANGED
@@ -19,6 +19,11 @@ The final LayerNorm also has 1e12 as epsilon, but non-unity weights and biases.
 
  thus the LN parameters cannot be folded into that matrix. You can completely remove all LNs by simply replacing `ln_1` and `ln_2` modules with identities, and replacing
  `ln_f` with modifications to the unembed matrix and unembed bias.
 
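The surgery described above can be sketched in PyTorch. This is a minimal illustration, not the repo's actual conversion script: it builds a tiny randomly initialised GPT-2 from `transformers` (no download) instead of the real checkpoint, uses the standard Hugging Face GPT-2 attribute names (`transformer.h[i].ln_1`/`ln_2`, `transformer.ln_f`, `lm_head`), and assumes the huge epsilon makes the `ln_f` normalisation itself approximately a no-op, so that only its affine weight and bias need folding into the unembedding:

```python
import torch
from torch import nn
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random GPT-2 purely to illustrate the surgery; the real target
# is apollo-research/gpt2_noLN.
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64))

# Replace the per-block LayerNorms with identities.
for block in model.transformer.h:
    block.ln_1 = nn.Identity()
    block.ln_2 = nn.Identity()

# Fold ln_f's affine parameters into the unembedding, assuming the
# normalisation is (approximately) a no-op:
#   logits = (x * g + b) @ W.T  =  x @ (W * g).T + (W @ b)
g = model.transformer.ln_f.weight.data
b = model.transformer.ln_f.bias.data
W = model.lm_head.weight.data                  # (vocab_size, d_model)
model.lm_head.weight = nn.Parameter(W * g)     # new Parameter also unties lm_head from wte
model.lm_head.bias = nn.Parameter(W @ b)       # new unembed bias
model.transformer.ln_f = nn.Identity()

# The LN-free model still runs end to end.
logits = model(torch.arange(5)[None]).logits   # shape (1, 5, vocab_size)
```

Note that assigning a fresh `nn.Parameter` to `lm_head.weight` is deliberate: GPT-2's unembedding is tied to the token embedding, and scaling the shared tensor in place would corrupt `wte` as well.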
+ You can load the model with `transformers`, or one of the interpretability libraries listed below.
+ ```python
+ model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
+ ```
+
  ## TransformerLens loading code
  ```python
  import torch