antalvdb committed on
Commit 4fb947b · verified · 1 Parent(s): a5ab2e5

Update index.html

Files changed (1): index.html (+9, -6)
index.html CHANGED
@@ -100,7 +100,8 @@
   <h2 class="title is-3">Abstract</h2>
   <div class="content has-text-justified">
   <p>
-  WOPR, Word Predictor, is a memory-based language model developed in 2006-2011.
+  WOPR, Word Predictor, is a memory-based language model developed in 2006-2011,
+  and woken up from its cryogenic sleep in a better era.
   </p>
   <p>
   A memory-based language model, in this case running on the TiMBL classifier,
@@ -115,16 +116,18 @@
   <ul>
   <li>very efficient in training. Training is essentially reading the data (in linear time)
   and compressing it into a decision tree structure. This can be done on CPUs,
-  with sufficient RAM;</li>
+  with sufficient RAM. In short, its <b>ecological footprint is dramatically lower</b>;</li>
   <li>pretty efficient in generation when running with the fastest decision-tree
-  approximations of <i>k</i>-NN classification. This can be done on CPUs as well.
-  Accuracy is traded for speed, however.</li>
+  approximations of <i>k</i>-NN classification. This can be done on CPUs as well.</li>
+  <li>completely transparent in their functioning. There is also no question about
+  the fact that <b>they memorize training data patterns</b>.</li>
   </ul>
   <p>On the downside,</p>
   <ul>
-  <li>Memory requirements during training are heavy with large datasets (>100 million words);</li>
+  <li>Memory requirements during training are <b>heavy with large datasets</b>
+  (>32 GB RAM with >100 million words);</li>
   <li>Memory-based LLMs are not efficient at generation time when running relatively
-  slower approximations of <i>k</i>-NN classifiers, trading speed for accuracy.</li>
+  slower approximations of <i>k</i>-NN classifiers, <b>trading speed for accuracy</b>.</li>
   </ul>
   </div>
   </div>
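
For readers unfamiliar with memory-based language modeling, the following minimal Python sketch illustrates the scheme the updated abstract describes: training is a single linear pass that stores context-to-next-word instances in a prefix tree (in the spirit of TiMBL's IGTree approximation of k-NN), and prediction walks the tree as far as the context matches, backing off to the most frequent continuation seen so far. This is an illustrative assumption of how such a model works, not the actual WOPR/TiMBL implementation; all names are made up.

from collections import Counter

class Node:
    def __init__(self):
        self.children = {}       # feature value -> child node
        self.counts = Counter()  # next-word counts observed at this node

def train(tokens, context_size=3):
    """One linear pass over the corpus: one instance per token position."""
    root = Node()
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]  # preceding words, oldest first
        target = tokens[i]
        node = root
        node.counts[target] += 1
        # Insert features from nearest to farthest word. Real IGTree orders
        # features by information gain; recency is used here as a stand-in.
        for feature in reversed(context):
            node = node.children.setdefault(feature, Node())
            node.counts[target] += 1
    return root

def predict(root, context):
    """Walk the tree on the context; back off at the first mismatch."""
    node = root
    for feature in reversed(context):
        if feature not in node.children:
            break
        node = node.children[feature]
    word, _ = node.counts.most_common(1)[0]
    return word

tokens = "the cat sat on the mat and the cat sat on the sofa".split()
model = train(tokens)
print(predict(model, ["cat", "sat", "on"]))  # prints "the"

The backoff at the first mismatched feature is what makes this a fast approximation of k-NN rather than exact nearest-neighbor search, which is the speed-for-accuracy trade-off the abstract mentions; it also makes the behavior fully transparent, since every prediction can be traced to stored training instances.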