Update index.html
index.html (+7 -4)
@@ -118,18 +118,21 @@
 <ul>
 <li>very efficient in training. Training is essentially reading the data (in linear time)
 and compressing it into a decision tree structure. This can be done on CPUs,
-
+with sufficient RAM. In short, its <b>ecological footprint is dramatically lower</b>;</li>
 <li>pretty efficient in generation when running with the fastest decision-tree
 approximations of <i>k</i>-NN classification. This can be done on CPUs as well.</li>
 <li>completely transparent in their functioning. There is also no question about
-
+the fact that <b>they memorize training data patterns</b>.</li>
 </ul>
 <p>On the downside,</p>
 <ul>
+<li>Not as great as current Transformer-based LLMs, but we have not trained
+beyond data set sizes orders of magnitude above 100 million words.
+Watch this space!</li>
 <li>Memory requirements during training are <b>heavy with large datasets</b>
-(
+(more than 32 GB RAM with more than 100 million words);</li>
 <li>Memory-based LLMs are not efficient at generation time when running relatively
-
+slower approximations of <i>k</i>-NN classifiers, <b>trading speed for accuracy</b>.</li>
 </ul>
 </div>
 </div>
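The added lines describe the approach only in prose: one linear pass over the training data, compressed into a decision-tree structure that is then queried as a fast approximation of k-NN classification at generation time. As an illustration only, not the code behind this page, here is a minimal Python sketch of that idea; the window size, the recency-based feature ordering, and all names are assumptions made for the example.

# Minimal sketch of a memory-based next-word predictor (illustration only;
# window size, recency-based feature order, and all names are assumptions).
from collections import Counter

CONTEXT = 3  # number of preceding words used as features

def new_node():
    return {"counts": Counter(), "children": {}}

def train(tokens):
    """One linear pass over the corpus: each (context, next word) observation
    is stored in a trie keyed on the nearest context word first, so training
    amounts to reading the data and compressing it into a tree."""
    root = new_node()
    for i in range(CONTEXT, len(tokens)):
        target = tokens[i]
        node = root
        node["counts"][target] += 1
        for word in reversed(tokens[i - CONTEXT:i]):  # nearest word first
            node = node["children"].setdefault(word, new_node())
            node["counts"][target] += 1
    return root

def predict(root, context):
    """Greedy tree walk: follow matching context words as deep as possible and
    return the most frequent continuation seen there, a fast decision-tree
    style approximation of k-NN over all stored contexts."""
    node = root
    for word in reversed(context[-CONTEXT:]):
        if word not in node["children"]:
            break
        node = node["children"][word]
    return node["counts"].most_common(1)[0][0] if node["counts"] else None

corpus = "the cat sat on the mat and the cat sat on the sofa".split()
model = train(corpus)
print(predict(model, ["the", "cat", "sat"]))  # -> 'on'

Replacing the greedy tree walk with an exact nearest-neighbour search over all stored contexts would correspond to the slower, more accurate k-NN variants mentioned as a downside above, trading speed for accuracy.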