<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="description"
        content="WOPR: Word Predictor. Memory-based language modeling">
  <meta name="keywords" content="word prediction, wopr, memory-based learning, timbl, memory-based language modeling">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>WOPR: Memory-based language modeling</title>

  <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
        rel="stylesheet">

  <link rel="stylesheet" href="./static/css/bulma.min.css">
  <link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
  <link rel="stylesheet" href="./static/css/bulma-slider.min.css">
  <link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
  <link rel="stylesheet"
        href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
  <link rel="stylesheet" href="./static/css/index.css">
  <link rel="icon" href="./static/images/favicon.svg">

  <!-- <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script> -->
  <script defer src="./static/js/fontawesome.all.min.js"></script>
  <script src="./static/js/bulma-carousel.min.js"></script>
  <script src="./static/js/bulma-slider.min.js"></script>
  <script src="./static/js/index.js"></script>
</head>
<body>

<section class="hero">
  <div class="hero-body">
    <div class="container is-max-desktop">
      <div class="columns is-centered">
        <div class="column has-text-centered">
          <h1 class="title is-1 publication-title">WOPR: Memory-based language modeling</h1>
          <div class="is-size-5 publication-authors">
            <span class="author-block">
              <a href="https://antalvandenbosch.nl/" target="_blank">Antal van den Bosch</a><sup>1</sup>,</span>
            <span class="author-block">
              <a href="https://www.humlab.lu.se/person/PeterBerck/" target="_blank">Peter Berck</a><sup>2</sup>,</span>
          </div>

          <div class="is-size-5 publication-authors">
            <span class="author-block"><sup>1</sup>Utrecht University</span>
            <span class="author-block"><sup>2</sup>University of Lund</span>
          </div>

          <div class="column has-text-centered">
            <div class="publication-links">
              
              <!-- PDF Link. -->
              <span class="link-block">
                <a href="https://berck.se/thesis.pdf" target="_blank"
                   class="external-link button is-normal is-rounded is-dark">
                  <span class="icon">
                      <i class="fas fa-file-pdf"></i>
                  </span>
                  <span>Thesis</span>
                </a>
              </span>

              <!-- Paper Link. -->
              <span class="link-block">
                <a href="http://ufal.mff.cuni.cz/pbml/91/art-bosch.pdf" target="_blank"
                   class="external-link button is-normal is-rounded is-dark">
                  <span class="icon">
                      <i class="fas fa-file-pdf"></i>
                  </span>
                  <span>Paper</span>
                </a>
              </span>
              
              <!-- Code Link. -->
              <span class="link-block">
                <a href="https://github.com/LanguageMachines/wopr" target="_blank"
                   class="external-link button is-normal is-rounded is-dark">
                  <span class="icon">
                      <i class="fab fa-github"></i>
                  </span>
                  <span>Code</span>
                </a>
              </span>
            </div>

          </div>
        </div>
      </div>
    </div>
  </div>
</section>



<section class="section">
  <div class="container is-max-desktop">
    <!-- Abstract. -->
    <div class="columns is-centered has-text-centered">
      <div class="column is-four-fifths">
        <h2 class="title is-3">WOPR in brief</h2>
        <div class="content has-text-justified">
          <p>
            WOPR, short for Word Predictor, is a memory-based language model developed in 2006-2011.
            It just woke up from its cryogenic sleep and is figuring out what all
            the fuss about LLMs is.
          </p>
          <p>
            WOPR is an ecologically friendly alternative LLM with a staggeringly simple
            core. Everyone who took "Machine Learning 101" knows that the <i>k</i>-nearest
            neighbor classifier is among the simplest yet most robust ML classifiers out
            there, perhaps only beaten by the Naive Bayes classifier. So what happens if
            you train a <i>k</i>-NN classifier to predict words?
          </p>
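          <p>
            To make the idea concrete, here is a minimal sketch (in Python, and
            emphatically not WOPR's own code) of next-word prediction with exact
            <i>k</i>-NN over fixed-width left contexts; all names and the toy
            corpus are illustrative.
          </p>
          <pre><code># A minimal k-NN next-word predictor: fixed-width contexts, overlap metric.
from collections import Counter

def make_instances(tokens, width=3):
    """Slide a window over the corpus: 'width' context words -> next word."""
    tokens = ["_"] * width + tokens          # pad the left edge
    return [(tuple(tokens[i:i + width]), tokens[i + width])
            for i in range(len(tokens) - width)]

def overlap(a, b):
    """Overlap distance: the number of mismatching feature positions."""
    return sum(x != y for x, y in zip(a, b))

def predict(instances, context, k=3):
    """Vote over the labels of the k stored contexts nearest to 'context'."""
    nearest = sorted(instances, key=lambda inst: overlap(inst[0], context))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

corpus = "the cat sat on the mat because the cat was tired".split()
train = make_instances(corpus)
print(predict(train, ("on", "the", "mat")))   # -> 'because'</code></pre>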
          <p>
            WOPR's engine is the 
            <a href="https://github.com/LanguageMachines/timbl">TiMBL</a> classifier,
            which implements a number of fast approximations of <i>k</i>-NN classification, 
            all partly based on decision-tree classification. On
            tasks like next-word prediction, exact <i>k</i>-NN is prohibitively slow, but the
            <a href="https://github.com/LanguageMachines/timbl">TiMBL</a>
            approximations can classify many orders of magnitude faster.
          </p>
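          <p>
            The IGTree approximation, one of TiMBL's algorithms, compresses the
            instance base into a decision tree that tests features in order of
            informativeness and stores a majority-vote default at every node. The
            toy sketch below (again illustrative Python, not TiMBL's actual
            implementation, and simply assuming the nearest context word is the
            most informative feature) reuses the instances from the sketch above.
          </p>
          <pre><code># A toy IGTree-style compression of the instance base into a decision trie.
from collections import Counter

class Node:
    def __init__(self):
        self.default = None    # majority next-word among instances at this node
        self.children = {}     # feature value -> child Node

def build(instances, depth=0, width=3):
    """Compress instances into a trie; expand only where a subset disagrees."""
    node = Node()
    node.default = Counter(lbl for _, lbl in instances).most_common(1)[0][0]
    if depth == width:
        return node
    groups = {}
    for ctx, lbl in instances:
        # test features right-to-left: the word closest to the gap first
        groups.setdefault(ctx[width - 1 - depth], []).append((ctx, lbl))
    for value, subset in groups.items():
        if any(lbl != node.default for _, lbl in subset):
            node.children[value] = build(subset, depth + 1, width)
    return node

def lookup(node, ctx, depth=0, width=3):
    """Walk the trie; on a mismatch, fall back to the deepest default seen."""
    if depth == width or ctx[width - 1 - depth] not in node.children:
        return node.default
    return lookup(node.children[ctx[width - 1 - depth]], ctx, depth + 1, width)

tree = build(train)                         # 'train' from the previous sketch
print(lookup(tree, ("on", "the", "mat")))   # -> 'because', without a full scan</code></pre>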
          <p>
            Compared to Transformer-based LLMs, on the plus side memory-based LLMs are
          </p>
          <ul>
            <li>very efficient in training. Training is essentially reading the data (in linear time)
              and compressing it into a decision tree structure, as sketched above. This can be done on CPUs,
              with sufficient RAM. In short, their <b>ecological footprint is dramatically lower</b>;</li>
            <li>pretty efficient in generation when running with the fastest decision-tree
              approximations of <i>k</i>-NN classification. <b>This can be done on CPUs as well</b>;</li>
            <li>completely transparent in their functioning. There can also be no doubt
              that <b>they memorize training data patterns</b>.</li>
          </ul>
          <p>On the downside,</p>
          <ul>
            <li><b>Their performance does not yet match that of current Transformer-based LLMs</b>,
              but we have not yet trained on data sets orders of magnitude beyond
              100 million words. Watch this space!</li>
            <li>They <b>do not have a delicate attention mechanism</b>, arguably the killer feature
              of Transformer-based decoders;</li>
            <li>Memory requirements during training are <b>heavy with large data sets</b>
              (more than 32 GB of RAM for more than 100 million words).</li>
          </ul>
        </div>
      </div>
    </div>
    <!--/ Abstract. -->

  </div>
</section>




<section class="section" id="BibTeX">
  <div class="container is-max-desktop content">
    <h2 class="title">BibTeX</h2>
    <pre><code>@article{VandenBosch+09,
	author = {A. {Van den Bosch} and P. Berck},
	journal = {The Prague Bulletin of Mathematical Linguistics},
	pages = {17--26},
	title = {Memory-based machine translation and language modeling},
	volume = {91},
	year = {2009},
	bdsk-url-1 = {http://ufal.mff.cuni.cz/pbml/91/art-bosch.pdf}
}</code></pre>
  </div>
</section>


<footer class="footer">
  <div class="container">
    <div class="content has-text-centered">
      <a class="icon-link" href="https://github.com/LanguageMachines/wopr" target="_blank" class="external-link" disabled>
        <i class="fab fa-github"></i>
      </a>
    </div>
    <div class="columns is-centered">
      <div class="column is-8">
        <div class="content">
          <p>
            This website is licensed under a <a rel="license" target="_blank"
                                                href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
            Commons Attribution-ShareAlike 4.0 International License</a>.
          </p>
          <p>
            This website gladly made use of the <a target="_blank"
              href="https://github.com/nerfies/nerfies.github.io">source code</a> of the Nerfies website. Thanks!
          </p>
        </div>
      </div>
    </div>
  </div>
</footer>

</body>
</html>