stefanhex-apollo committed
Commit 5d176c5
1 Parent(s): 10cba8c
Update README.md
README.md CHANGED
@@ -19,6 +19,11 @@ The final LayerNorm also has 1e12 as epsilon, but non-unity weights and biases.
 thus the LN parameters cannot be folded into that matrix. You can completely remove all LNs by simply replacing `ln_1` and `ln_2` modules with identities, and replacing
 `ln_f` with modifications to the unembed matrix and unembed bias.
 
+You can load the model with `transformers`, or one of the interpretability libraries listed below.
+```python
+model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
+```
+
 ## TransformerLens loading code
 ```python
 import torch
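The context lines above describe the README's LN-removal recipe in prose. As a rough illustration (not part of this commit), here is a minimal sketch of that recipe for the Hugging Face `GPT2LMHeadModel`. It assumes, as the README implies, that with epsilon at 1e12 the variance term is negligible, so `ln_f` acts as the affine map x -> ((x - mean(x)) / sqrt(eps)) * w + b; attribute names follow the standard `transformers` GPT-2 implementation.

```python
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN")

# ln_1 / ln_2: per the README, these can be replaced with identities outright.
for block in model.transformer.h:
    block.ln_1 = torch.nn.Identity()
    block.ln_2 = torch.nn.Identity()

# ln_f: fold its (approximately affine) action into the unembedding.
ln_f = model.transformer.ln_f
w, b, eps = ln_f.weight.data, ln_f.bias.data, ln_f.eps    # eps = 1e12 here
W = model.lm_head.weight.data                  # (vocab, d_model), tied to wte
W_scaled = W * (w / eps**0.5)                  # absorb the elementwise scale
W_folded = W_scaled - W_scaled.mean(dim=1, keepdim=True)  # absorb mean-centering
vocab_size, d_model = W.shape
new_head = torch.nn.Linear(d_model, vocab_size, bias=True)
new_head.weight.data = W_folded
new_head.bias.data = W @ b                     # ln_f's bias becomes a fixed logit bias
model.lm_head = new_head                       # unties the head from wte; embedding unchanged
model.transformer.ln_f = torch.nn.Identity()
```

Note that replacing `lm_head` unties it from the input embedding, which is necessary because the folded unembed matrix and the added logit bias no longer match `wte`.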
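The snippet added in this commit assumes `GPT2LMHeadModel` is already in scope; a self-contained version of the same load would be:

```python
from transformers import GPT2LMHeadModel

# Load the LayerNorm-free GPT-2 checkpoint onto CPU.
model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
```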
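The diff hunk ends after the first line of the TransformerLens section, so the commit's actual loading code is not visible here. For orientation only, a hypothetical sketch of loading a Hugging Face checkpoint into TransformerLens; the flag choices are assumptions, not the README's:

```python
from transformers import GPT2LMHeadModel
from transformer_lens import HookedTransformer

# Hypothetical sketch; the commit's actual code lies outside this diff hunk.
hf_model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN")
model = HookedTransformer.from_pretrained(
    "gpt2",                # base architecture to map the weights onto
    hf_model=hf_model,     # supply the noLN weights instead of vanilla GPT-2
    fold_ln=True,          # fold affine LN parameters into adjacent matrices
    center_unembed=False,  # assumption: keep logits as the checkpoint produces them
)
```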