sashay committed on
Commit
be9fd6a
·
1 Parent(s): 1393898

Create README.md

  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ This repository contains some of the matrices described in:
+
+ * Alexander Yom Din, Taelin Karidi, Leshem Choshen, Mor Geva. 2023. Jump to Conclusions: Short-Cutting Transformers With Linear Transformations. ([arXiv:2303.09435](https://arxiv.org/abs/2303.09435))
+
+ Please cite the paper as:
+
+ ```bibtex
+ @article{din2023jump,
+   title={Jump to Conclusions: Short-Cutting Transformers With Linear Transformations},
+   author={Yom Din, Alexander and Karidi, Taelin and Choshen, Leshem and Geva, Mor},
+   journal={arXiv preprint arXiv:2303.09435},
+   year={2023},
+ }
+ ```
+
+ For example, the file `gpt2-medium/wikipedia/6_9.pickle` contains the matrix trained to transform 6th-layer hidden representations of tokens into 9th-layer hidden representations for the Hugging Face Transformers `gpt2-medium` model. It can be loaded and applied as follows:
+
+ ```python
+ import pickle
+ import torch
+
+
+ def mul(mat, v):
+     return (mat @ v[..., None]).squeeze(-1)  # matrix-vector product over the last dimension of v
+
+
+ with open(file_name, 'rb') as f:  # file_name: path to a matrix file, e.g. 'gpt2-medium/wikipedia/6_9.pickle'
+     mat = pickle.load(f)
+
+ assert isinstance(mat, torch.Tensor)
+ assert len(mat.shape) == 2
+ assert mat.shape[0] == mat.shape[1]
+
+ v = torch.rand(mat.shape[1])
+
+ w = mul(mat, v)
+
+ assert w.shape == v.shape
+ ```
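
The example path above suggests a naming scheme of the form `<model>/<dataset>/<source layer>_<target layer>.pickle`. The following is a minimal sketch of a helper that splits such a path into its parts. It assumes that every file in the repository follows the pattern of that single example and that the middle directory (`wikipedia`) names the data the matrix was fit on. The names `MatrixFile` and `parse_matrix_path` are hypothetical and are not part of the repository.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class MatrixFile(NamedTuple):
    """Parts of a matrix file path (hypothetical helper type)."""
    model: str         # e.g. "gpt2-medium"
    dataset: str       # e.g. "wikipedia" (assumed to name the fitting data)
    source_layer: int  # layer whose hidden representations are the input
    target_layer: int  # layer whose hidden representations are approximated


def parse_matrix_path(path: str) -> MatrixFile:
    """Split '<model>/<dataset>/<source>_<target>.pickle' into its parts.

    The pattern is inferred from one example path, so treat it as an assumption.
    """
    p = PurePosixPath(path)
    if p.suffix != ".pickle" or len(p.parts) < 3:
        raise ValueError(f"unexpected matrix path: {path!r}")
    # The file stem holds the two layer indices, e.g. "6_9".
    source, target = p.stem.split("_")
    return MatrixFile(p.parts[-3], p.parts[-2], int(source), int(target))


info = parse_matrix_path("gpt2-medium/wikipedia/6_9.pickle")
print(info.model, info.dataset, info.source_layer, info.target_layer)
```

A helper like this makes it easy to pick the matrix for a given pair of layers before loading it with `pickle` as shown above. Note that `pickle.load` can execute arbitrary code, so only load files from a source you trust.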