<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="description"
content="WOPR: Word Predictor. Memory-based language modeling">
<meta name="keywords" content="word prediction, wopr, memory-based learning, timbl, memory-based language modeling">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>WOPR: Memory-based language modeling</title>
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
rel="stylesheet">
<link rel="stylesheet" href="./static/css/bulma.min.css">
<link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
<link rel="stylesheet" href="./static/css/bulma-slider.min.css">
<link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
<link rel="stylesheet" href="./static/css/index.css">
<link rel="icon" href="./static/images/favicon.svg">
<!-- <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script> -->
<script defer src="./static/js/fontawesome.all.min.js"></script>
<script src="./static/js/bulma-carousel.min.js"></script>
<script src="./static/js/bulma-slider.min.js"></script>
<script src="./static/js/index.js"></script>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">WOPR: Memory-based language modeling</h1>
<div class="is-size-5 publication-authors">
<span class="author-block">
<a href="https://antalvandenbosch.nl/" target="_blank">Antal van den Bosch</a><sup>1</sup>,</span>
<span class="author-block">
<a href="https://www.humlab.lu.se/person/PeterBerck/" target="_blank">Peter Berck</a><sup>2</sup>,</span>
</div>
<div class="is-size-5 publication-authors">
<span class="author-block"><sup>1</sup>Utrecht University</span>
<span class="author-block"><sup>2</sup>University of Lund</span>
</div>
<div class="column has-text-centered">
<div class="publication-links">
<!-- PDF Link. -->
<span class="link-block">
<a href="https://berck.se/thesis.pdf" target="_blank"
class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Thesis</span>
</a>
</span>
<!-- PDF Link. -->
<span class="link-block">
<a href="http://ufal.mff.cuni.cz/pbml/91/art-bosch.pdf" target="_blank"
class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Paper</span>
</a>
</span>
<!-- Code Link. -->
<span class="link-block">
<a href="https://github.com/LanguageMachines/wopr" target="_blank"
class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fab fa-github"></i>
</span>
<span>Code</span>
</a>
</span>
</div>
</div>
</div>
</div>
</div>
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Abstract. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">WOPR in brief</h2>
<div class="content has-text-justified">
<p>
WOPR, short for Word Predictor, is a memory-based language model developed
between 2006 and 2011. It has just woken up from its cryogenic sleep and is
figuring out what all the fuss about LLMs is about.
</p>
<p>
WOPR is an ecologically friendly alternative LLM with a staggeringly simple
core. Everyone who took "Machine Learning 101" knows that the <i>k</i>-nearest
neighbor classifier is among the simplest yet most robust ML classifiers out
there, perhaps only beaten by the Naive Bayes classifier. So what happens if
you train a <i>k</i>-NN classifier to predict words?
</p>
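<p>
The core idea fits in a few lines. The following is an illustrative Python
toy, not WOPR's actual code: it cuts a corpus into fixed-size left-context
windows and predicts the next word by majority vote among the <i>k</i>
training contexts that overlap most with the current context.
</p>
<pre><code># Toy k-NN next-word predictor (illustrative sketch; not WOPR's code).
from collections import Counter

def make_instances(tokens, n=3):
    """Turn a token stream into (left-context window, next word) pairs."""
    return [(tuple(tokens[i - n:i]), tokens[i]) for i in range(n, len(tokens))]

def overlap(a, b):
    """Similarity = number of matching context positions."""
    return sum(x == y for x, y in zip(a, b))

def predict(train, context, k=3):
    """Majority vote over the k training contexts most similar to `context`."""
    nearest = sorted(train, key=lambda inst: overlap(inst[0], context),
                     reverse=True)[:k]
    return Counter(word for _, word in nearest).most_common(1)[0][0]

tokens = "the cat sat on the mat and the cat sat on the sofa".split()
print(predict(make_instances(tokens), ("the", "cat", "sat")))  # prints: on
</code></pre>
<p>
Note that <code>predict</code> scans the entire training set for every single
prediction. That brute-force cost is exactly what the approximations described
next avoid.
</p>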
<p>
WOPR's engine is the
<a href="https://github.com/LanguageMachines/timbl">TiMBL</a> classifier,
which implements a number of fast approximations of <i>k</i>-NN classification,
all partly based on decision-tree classification. On
tasks like next-word prediction, exact <i>k</i>-NN is prohibitively slow, but the
<a href="https://github.com/LanguageMachines/timbl">TiMBL</a>
approximations classify many orders of magnitude faster.
</p>
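<p>
As a rough intuition for where that speed-up comes from (a simplified sketch
in the spirit of TiMBL's IGTree approximation, not its actual implementation):
the training contexts are compressed into a trie whose levels follow a fixed
feature order, and every node stores the majority label of the instances below
it. Classification follows matching arcs and backs off to the last default the
moment a feature fails to match, so lookup time depends on the number of
context positions, not on the size of the training data.
</p>
<pre><code># Simplified IGTree-style trie (illustrative sketch; not TiMBL's code).
from collections import Counter

class Node:
    def __init__(self):
        self.children = {}
        self.labels = Counter()  # label counts; the majority is the default

def build(instances):
    """Compress (features, label) pairs into a trie with default labels."""
    root = Node()
    for features, label in instances:
        node = root
        node.labels[label] += 1
        for f in features:       # features assumed pre-ordered by importance
            node = node.children.setdefault(f, Node())
            node.labels[label] += 1
    return root

def classify(root, features):
    """Follow matching arcs; on a mismatch, return the current default."""
    node = root
    for f in features:
        if f not in node.children:
            break
        node = node.children[f]
    return node.labels.most_common(1)[0][0]

root = build([(("the", "cat"), "sat"), (("the", "dog"), "ran")])
print(classify(root, ("the", "dog")))  # prints: ran (full match)
print(classify(root, ("the", "owl")))  # prints: sat (backs off to the
                                       # default below "the"; the tie is
                                       # broken by insertion order)
</code></pre>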
<p>
Compared to Transformer-based LLMs, on the plus side, memory-based LLMs are
</p>
<ul>
<li>very efficient in training. Training is essentially reading the data (in linear time)
and compressing it into a decision tree structure. This can be done on CPUs,
with sufficient RAM. In short, their <b>ecological footprint is dramatically lower</b>;</li>
<li>pretty efficient in generation when running with the fastest decision-tree
approximations of <i>k</i>-NN classification. <b>This can be done on CPUs as well</b>;</li>
<li>completely transparent in their functioning. There can also be no doubt
that <b>they memorize training data patterns</b>.</li>
</ul>
<p>On the downside,</p>
<ul>
<li><b>Their performance does not yet match that of current Transformer-based LLMs</b>,
but we have not yet trained on data sets orders of magnitude larger
than 100 million words.
Watch this space!</li>
<li>They <b>do not have a delicate attention mechanism</b>, arguably the killer feature
of Transformer-based decoders;</li>
<li>Memory requirements during training are <b>heavy for large data sets</b>
(more than 32 GB of RAM for more than 100 million words).</li>
</ul>
</div>
</div>
</div>
<!--/ Abstract. -->
</div>
</section>
<section class="section" id="BibTeX">
<div class="container is-max-desktop content">
<h2 class="title">BibTeX</h2>
<pre><code>@article{VandenBosch+09,
  author  = {A. {Van den Bosch} and P. Berck},
  title   = {Memory-based machine translation and language modeling},
  journal = {The Prague Bulletin of Mathematical Linguistics},
  volume  = {91},
  pages   = {17--26},
  year    = {2009},
  url     = {http://ufal.mff.cuni.cz/pbml/91/art-bosch.pdf}
}</code></pre>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content has-text-centered">
<a class="icon-link" href="https://github.com/LanguageMachines/wopr" target="_blank" class="external-link" disabled>
<i class="fab fa-github"></i>
</a>
</div>
<div class="columns is-centered">
<div class="column is-8">
<div class="content">
<p>
This website is licensed under a <a rel="license" target="_blank"
href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
Commons Attribution-ShareAlike 4.0 International License</a>.
</p>
<p>
This website gladly made use of the <a target="_blank"
href="https://github.com/nerfies/nerfies.github.io">source code</a> of the Nerfies website. Thanks!
</p>
</div>
</div>
</div>
</div>
</footer>
</body>
</html>