Remove todo section from docs
hexviz/pages/3_📄Documentation.py
@@ -63,12 +63,7 @@ way to identify heads with patterns we're interested in.
 
 The second view is a customizable heatmap plot of attention between residue for
 all heads and layers in a model. From here it is possible to identify heads that
-specialize in a particular attention pattern
-1. Vertical lines: Paying attention so a single or a few residues
-2. Diagonal: Attention to the same residue or residues in front or behind the current residue.
-3. Block attention: Attention is segmented so parts of the sequence are attended to by one part of the sequence.
-4. Heterogeneous: More complex attention patterns that are not easily categorized.
-TODO: Add examples of attention patterns
+specialize in a particular attention pattern.
 
 Read more about attention patterns in fex [Revealing the dark secrets of
 BERT](https://arxiv.org/abs/1908.08593).