arad1367 committed on
Commit
1d22dcd
·
verified ·
1 Parent(s): d3628d6

Update index.html

Files changed (1)
  1. index.html +1941 -18
index.html CHANGED
@@ -1,19 +1,1942 @@
1
- <!doctype html>
2
- <html>
3
- <head>
4
- <meta charset="utf-8" />
5
- <meta name="viewport" content="width=device-width" />
6
- <title>My static Space</title>
7
- <link rel="stylesheet" href="style.css" />
8
- </head>
9
- <body>
10
- <div class="card">
11
- <h1>Welcome to your static Space!</h1>
12
- <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
13
- <p>
14
- Also don't forget to check the
15
- <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
16
- </p>
17
- </div>
18
- </body>
19
  </html>
 
1
+ <!-- Vector Search Simulation By Pejman Ebrahimi -->
2
+ <!DOCTYPE html>
3
+ <html lang="en">
4
+ <head>
5
+ <meta charset="UTF-8" />
6
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
7
+ <title>Vector Search Methods Comparison</title>
8
+ <style>
9
+ body {
10
+ font-family: "Segoe UI", Tahoma, Geneva, Verdana, sans-serif;
11
+ line-height: 1.6;
12
+ color: #333;
13
+ max-width: 1200px;
14
+ margin: 0 auto;
15
+ padding: 20px;
16
+ background-color: #f5f7fa;
17
+ }
18
+
19
+ h1,
20
+ h2,
21
+ h3 {
22
+ color: #2c3e50;
23
+ }
24
+
25
+ h1 {
26
+ text-align: center;
27
+ margin-bottom: 40px;
28
+ font-size: 2.2em;
29
+ border-bottom: 2px solid #3498db;
30
+ padding-bottom: 10px;
31
+ }
32
+
33
+ .container {
34
+ display: flex;
35
+ flex-wrap: wrap;
36
+ gap: 20px;
37
+ justify-content: center;
38
+ }
39
+
40
+ .search-type {
41
+ flex: 1 1 500px;
42
+ background: white;
43
+ border-radius: 8px;
44
+ box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
45
+ margin-bottom: 30px;
46
+ overflow: hidden;
47
+ transition: transform 0.2s;
48
+ }
49
+
50
+ .search-type:hover {
51
+ transform: translateY(-5px);
52
+ }
53
+
54
+ .search-header {
55
+ padding: 15px 20px;
56
+ color: white;
57
+ font-weight: bold;
58
+ font-size: 1.2em;
59
+ }
60
+
61
+ .search-content {
62
+ padding: 20px;
63
+ position: relative;
64
+ }
65
+
66
+ .enn .search-header {
67
+ background-color: #3498db;
68
+ }
69
+
70
+ .ann .search-header {
71
+ background-color: #e74c3c;
72
+ }
73
+
74
+ .semantic .search-header {
75
+ background-color: #2ecc71;
76
+ }
77
+
78
+ .sparse .search-header {
79
+ background-color: #9b59b6;
80
+ }
81
+
82
+ .canvas-container {
83
+ position: relative;
84
+ height: 300px;
85
+ width: 100%;
86
+ background: #f8f9fa;
87
+ border: 1px solid #ddd;
88
+ border-radius: 4px;
89
+ margin-bottom: 15px;
90
+ overflow: hidden;
91
+ }
92
+
93
+ canvas {
94
+ display: block;
95
+ }
96
+
97
+ .controls {
98
+ display: flex;
99
+ justify-content: space-between;
100
+ margin-bottom: 15px;
101
+ flex-wrap: wrap;
102
+ gap: 10px;
103
+ }
104
+
105
+ select,
106
+ button {
107
+ padding: 8px 12px;
108
+ border-radius: 4px;
109
+ border: 1px solid #ccc;
110
+ background: white;
111
+ font-size: 14px;
112
+ }
113
+
114
+ button {
115
+ background: #3498db;
116
+ color: white;
117
+ border: none;
118
+ cursor: pointer;
119
+ transition: background 0.2s;
120
+ }
121
+
122
+ button:hover {
123
+ background: #2980b9;
124
+ }
125
+
126
+ .step-display {
127
+ background: #f0f4f8;
128
+ padding: 15px;
129
+ border-radius: 4px;
130
+ margin-top: 15px;
131
+ font-size: 14px;
132
+ }
133
+
134
+ .step-title {
135
+ font-weight: bold;
136
+ margin-bottom: 8px;
137
+ }
138
+
139
+ .step-description {
140
+ color: #555;
141
+ }
142
+
143
+ ul.features {
144
+ padding-left: 20px;
145
+ }
146
+
147
+ .features li {
148
+ margin-bottom: 5px;
149
+ }
150
+
151
+ .distance-formula {
152
+ font-style: italic;
153
+ background: #f0f0f0;
154
+ padding: 5px;
155
+ border-radius: 4px;
156
+ margin: 5px 0;
157
+ display: inline-block;
158
+ }
159
+
160
+ .tooltip {
161
+ position: absolute;
162
+ background: rgba(0, 0, 0, 0.8);
163
+ color: white;
164
+ padding: 5px 10px;
165
+ border-radius: 4px;
166
+ font-size: 12px;
167
+ z-index: 100;
168
+ pointer-events: none;
169
+ display: none;
170
+ }
171
+
172
+ .legend {
173
+ display: flex;
174
+ flex-wrap: wrap;
175
+ gap: 15px;
176
+ margin-top: 10px;
177
+ }
178
+
179
+ .legend-item {
180
+ display: flex;
181
+ align-items: center;
182
+ font-size: 12px;
183
+ }
184
+
185
+ .legend-color {
186
+ width: 12px;
187
+ height: 12px;
188
+ border-radius: 50%;
189
+ margin-right: 5px;
190
+ }
191
+
192
+ .tabs {
193
+ display: flex;
194
+ margin-bottom: 15px;
195
+ }
196
+
197
+ .tab {
198
+ padding: 8px 15px;
199
+ background: #ddd;
200
+ border: none;
201
+ cursor: pointer;
202
+ border-radius: 4px 4px 0 0;
203
+ margin-right: 2px;
204
+ }
205
+
206
+ .tab.active {
207
+ background: #f0f4f8;
208
+ font-weight: bold;
209
+ }
210
+
211
+ .tab-content {
212
+ display: none;
213
+ background: #f0f4f8;
214
+ padding: 15px;
215
+ border-radius: 0 4px 4px 4px;
216
+ }
217
+
218
+ .tab-content.active {
219
+ display: block;
220
+ }
221
+
222
+ table {
223
+ width: 100%;
224
+ border-collapse: collapse;
225
+ margin: 15px 0;
226
+ }
227
+
228
+ table th,
229
+ table td {
230
+ border: 1px solid #ddd;
231
+ padding: 8px;
232
+ text-align: left;
233
+ }
234
+
235
+ table th {
236
+ background-color: #f0f4f8;
237
+ }
238
+
239
+ tr:nth-child(even) {
240
+ background-color: #f8f9fa;
241
+ }
242
+
243
+ .comparison-table {
244
+ margin-top: 40px;
245
+ }
246
+
247
+ /* Responsive adjustments */
248
+ @media (max-width: 768px) {
249
+ .search-type {
250
+ flex: 1 1 100%;
251
+ }
252
+
253
+ .controls {
254
+ flex-direction: column;
255
+ }
256
+ }
257
+ </style>
258
+ </head>
259
+ <body>
260
+ <h1>Vector Search Methods Comparison Simulation - By Pejman Ebrahimi</h1>
261
+
262
+ <div class="container">
263
+ <!-- ENN Search -->
264
+ <div class="search-type enn">
265
+ <div class="search-header">1. Exact Nearest Neighbor Search (ENN)</div>
266
+ <div class="search-content">
267
+ <p>
268
+ Finds the <strong>exact</strong> closest data points to a query by
269
+ calculating distances to all vectors in the dataset.
270
+ </p>
271
+
272
+ <div class="canvas-container">
273
+ <canvas id="ennCanvas" width="460" height="300"></canvas>
274
+ <div id="ennTooltip" class="tooltip"></div>
275
+ </div>
276
+
277
+ <div class="controls">
278
+ <div>
279
+ <label for="ennDistance">Distance Metric:</label>
280
+ <select id="ennDistance">
281
+ <option value="euclidean">Euclidean (L2)</option>
282
+ <option value="manhattan">Manhattan (L1)</option>
283
+ <option value="cosine">Cosine Distance</option>
284
+ </select>
285
+ </div>
286
+
287
+ <div>
288
+ <label for="ennStep">Step:</label>
289
+ <select id="ennStep">
290
+ <option value="0">0. Data points</option>
291
+ <option value="1">1. Calculate all distances</option>
292
+ <option value="2">2. Sort by distance</option>
293
+ <option value="3">3. Return nearest neighbors</option>
294
+ </select>
295
+ </div>
296
+ </div>
297
+
298
+ <div class="step-display">
299
+ <div class="step-title" id="ennStepTitle">Step 0: Data points</div>
300
+ <div class="step-description" id="ennStepDesc">
301
+ Initial dataset with vectors in feature space. The query point
302
+ (red) will be compared against all data points.
303
+ </div>
304
+ </div>
305
+
306
+ <div class="legend">
307
+ <div class="legend-item">
308
+ <div class="legend-color" style="background: #3498db"></div>
309
+ <span>Dataset Points</span>
310
+ </div>
311
+ <div class="legend-item">
312
+ <div class="legend-color" style="background: #e74c3c"></div>
313
+ <span>Query Point</span>
314
+ </div>
315
+ <div class="legend-item">
316
+ <div class="legend-color" style="background: #2ecc71"></div>
317
+ <span>Nearest Neighbor</span>
318
+ </div>
319
+ </div>
320
+
321
+ <h3>Key Features:</h3>
322
+ <ul class="features">
323
+ <li>100% accuracy - finds the true nearest neighbors</li>
324
+ <li>
325
+ Computationally expensive for large datasets (O(n) distance computations per query)
326
+ </li>
327
+ <li>
328
+ Becomes inefficient in high dimensions (curse of dimensionality)
329
+ </li>
330
+ <li>
331
+ Simple implementation - just calculate all distances and sort
332
+ </li>
333
+ </ul>
334
+ </div>
335
+ </div>
336
+
337
+ <!-- ANN Search -->
338
+ <div class="search-type ann">
339
+ <div class="search-header">
340
+ 2. Approximate Nearest Neighbor Search (ANN)
341
+ </div>
342
+ <div class="search-content">
343
+ <p>
344
+ Sacrifices perfect accuracy for <strong>speed</strong> by using
345
+ efficient data structures to approximate nearest neighbors.
346
+ </p>
347
+
348
+ <div class="canvas-container">
349
+ <canvas id="annCanvas" width="460" height="300"></canvas>
350
+ <div id="annTooltip" class="tooltip"></div>
351
+ </div>
352
+
353
+ <div class="controls">
354
+ <div>
355
+ <label for="annAlgorithm">Algorithm:</label>
356
+ <select id="annAlgorithm">
357
+ <option value="hnsw">Hierarchical Navigable Small World (HNSW)</option>
358
+ <option value="pq">Product Quantization</option>
359
+ <option value="lsh">Locality-Sensitive Hashing</option>
360
+ </select>
361
+ </div>
362
+
363
+ <div>
364
+ <label for="annStep">Step:</label>
365
+ <select id="annStep">
366
+ <option value="0">0. Indexed structure</option>
367
+ <option value="1">1. Navigate to region</option>
368
+ <option value="2">2. Local search</option>
369
+ <option value="3">3. Return approximate NN</option>
370
+ </select>
371
+ </div>
372
+ </div>
373
+
374
+ <div class="step-display">
375
+ <div class="step-title" id="annStepTitle">
376
+ Step 0: Indexed structure
377
+ </div>
378
+ <div class="step-description" id="annStepDesc">
379
+ Data is pre-organized into efficient lookup structures that
380
+ cluster or partition the vector space for faster searching.
381
+ </div>
382
+ </div>
383
+
384
+ <div class="legend">
385
+ <div class="legend-item">
386
+ <div class="legend-color" style="background: #3498db"></div>
387
+ <span>Dataset Points</span>
388
+ </div>
389
+ <div class="legend-item">
390
+ <div class="legend-color" style="background: #e74c3c"></div>
391
+ <span>Query Point</span>
392
+ </div>
393
+ <div class="legend-item">
394
+ <div class="legend-color" style="background: #f39c12"></div>
395
+ <span>Search Region</span>
396
+ </div>
397
+ <div class="legend-item">
398
+ <div class="legend-color" style="background: #2ecc71"></div>
399
+ <span>Returned Neighbors</span>
400
+ </div>
401
+ </div>
402
+
403
+ <h3>Key Features:</h3>
404
+ <ul class="features">
405
+ <li>
406
+ Much faster than ENN for large datasets (sub-linear time
407
+ complexity)
408
+ </li>
409
+ <li>Trades accuracy for speed (typically 95-99% recall, depending on index settings)</li>
410
+ <li>Requires pre-processing to build index structures</li>
411
+ <li>Various algorithms optimized for different use cases</li>
412
+ </ul>
413
+ </div>
414
+ </div>
415
+
416
+ <!-- Semantic Search -->
417
+ <div class="search-type semantic">
418
+ <div class="search-header">3. Semantic Search</div>
419
+ <div class="search-content">
420
+ <p>
421
+ Uses the <strong>meaning</strong> of content rather than keywords by
422
+ searching through dense embedding vectors that capture semantic
423
+ relationships.
424
+ </p>
425
+
426
+ <div class="canvas-container">
427
+ <canvas id="semanticCanvas" width="460" height="300"></canvas>
428
+ <div id="semanticTooltip" class="tooltip"></div>
429
+ </div>
430
+
431
+ <div class="controls">
432
+ <div>
433
+ <label for="semanticModel">Embedding Model:</label>
434
+ <select id="semanticModel">
435
+ <option value="bert">BERT</option>
436
+ <option value="use">Universal Sentence Encoder</option>
437
+ <option value="custom">Domain-Specific</option>
438
+ </select>
439
+ </div>
440
+
441
+ <div>
442
+ <label for="semanticStep">Step:</label>
443
+ <select id="semanticStep">
444
+ <option value="0">0. Text documents</option>
445
+ <option value="1">1. Generate embeddings</option>
446
+ <option value="2">2. Vector similarity search</option>
447
+ <option value="3">3. Return relevant results</option>
448
+ </select>
449
+ </div>
450
+ </div>
451
+
452
+ <div class="step-display">
453
+ <div class="step-title" id="semanticStepTitle">
454
+ Step 0: Text documents
455
+ </div>
456
+ <div class="step-description" id="semanticStepDesc">
457
+ Starting with raw text documents or queries before encoding into
458
+ vector space.
459
+ </div>
460
+ </div>
461
+
462
+ <div class="legend">
463
+ <div class="legend-item">
464
+ <div class="legend-color" style="background: #3498db"></div>
465
+ <span>Document Embeddings</span>
466
+ </div>
467
+ <div class="legend-item">
468
+ <div class="legend-color" style="background: #e74c3c"></div>
469
+ <span>Query Embedding</span>
470
+ </div>
471
+ <div class="legend-item">
472
+ <div class="legend-color" style="background: #2ecc71"></div>
473
+ <span>Semantic Matches</span>
474
+ </div>
475
+ </div>
476
+
477
+ <h3>Key Features:</h3>
478
+ <ul class="features">
479
+ <li>Understands meaning beyond exact keyword matches</li>
480
+ <li>
481
+ Uses dense vector embeddings (typically 768-1536 dimensions)
482
+ </li>
483
+ <li>Trained on large text corpora to capture language patterns</li>
484
+ <li>
485
+ Effective for natural language, images, and multimodal content
486
+ </li>
487
+ <li>Usually implemented with ANN algorithms for efficiency</li>
488
+ </ul>
489
+ </div>
490
+ </div>
491
+
492
+ <!-- Sparse Vector Search -->
493
+ <div class="search-type sparse">
494
+ <div class="search-header">4. Sparse Vector Search</div>
495
+ <div class="search-content">
496
+ <p>
497
+ Uses <strong>high-dimensional sparse vectors</strong> where most
498
+ elements are zero, optimized for keyword and token matching.
499
+ </p>
500
+
501
+ <div class="canvas-container">
502
+ <canvas id="sparseCanvas" width="460" height="300"></canvas>
503
+ <div id="sparseTooltip" class="tooltip"></div>
504
+ </div>
505
+
506
+ <div class="controls">
507
+ <div>
508
+ <label for="sparseModel">Representation:</label>
509
+ <select id="sparseModel">
510
+ <option value="tfidf">TF-IDF</option>
511
+ <option value="bm25">BM25</option>
512
+ <option value="hybrid">Hybrid (Sparse+Dense)</option>
513
+ </select>
514
+ </div>
515
+
516
+ <div>
517
+ <label for="sparseStep">Step:</label>
518
+ <select id="sparseStep">
519
+ <option value="0">0. Tokenized content</option>
520
+ <option value="1">1. Create sparse vectors</option>
521
+ <option value="2">2. Inverted index search</option>
522
+ <option value="3">3. Return matches</option>
523
+ </select>
524
+ </div>
525
+ </div>
526
+
527
+ <div class="step-display">
528
+ <div class="step-title" id="sparseStepTitle">
529
+ Step 0: Tokenized content
530
+ </div>
531
+ <div class="step-description" id="sparseStepDesc">
532
+ Documents are broken down into tokens (words/terms) before being
533
+ converted to a sparse vector representation.
534
+ </div>
535
+ </div>
536
+
537
+ <div class="legend">
538
+ <div class="legend-item">
539
+ <div class="legend-color" style="background: #3498db"></div>
540
+ <span>Vocabulary Dimensions</span>
541
+ </div>
542
+ <div class="legend-item">
543
+ <div class="legend-color" style="background: #e74c3c"></div>
544
+ <span>Query Terms</span>
545
+ </div>
546
+ <div class="legend-item">
547
+ <div class="legend-color" style="background: #2ecc71"></div>
548
+ <span>Matching Terms</span>
549
+ </div>
550
+ </div>
551
+
552
+ <h3>Key Features:</h3>
553
+ <ul class="features">
554
+ <li>Efficient for exact matching and keyword search</li>
555
+ <li>Very high dimensionality (vocabulary size) but mostly zeros</li>
556
+ <li>Uses a specialized inverted index for quick lookup</li>
557
+ <li>Good for precision when exact matches are required</li>
558
+ <li>Often combined with semantic search for hybrid approaches</li>
559
+ </ul>
560
+ </div>
561
+ </div>
562
+ </div>
563
+
564
+ <div class="comparison-table">
565
+ <h2>Comparison of Vector Search Methods</h2>
566
+ <table>
567
+ <thead>
568
+ <tr>
569
+ <th>Feature</th>
570
+ <th>Exact NN (ENN)</th>
571
+ <th>Approximate NN (ANN)</th>
572
+ <th>Semantic Search</th>
573
+ <th>Sparse Vector Search</th>
574
+ </tr>
575
+ </thead>
576
+ <tbody>
577
+ <tr>
578
+ <td>Accuracy</td>
579
+ <td>100% exact</td>
580
+ <td>High (typically 95-99% recall)</td>
581
+ <td>Context dependent</td>
582
+ <td>High for exact matches</td>
583
+ </tr>
584
+ <tr>
585
+ <td>Speed</td>
586
+ <td>Slow (O(n) per query)</td>
587
+ <td>Fast (sub-linear)</td>
588
+ <td>Moderate to fast</td>
589
+ <td>Very fast for keywords</td>
590
+ </tr>
591
+ <tr>
592
+ <td>Scalability</td>
593
+ <td>Poor</td>
594
+ <td>Good</td>
595
+ <td>Good with ANN</td>
596
+ <td>Excellent</td>
597
+ </tr>
598
+ <tr>
599
+ <td>Vector Type</td>
600
+ <td>Dense or Sparse</td>
601
+ <td>Usually Dense</td>
602
+ <td>Dense</td>
603
+ <td>Sparse</td>
604
+ </tr>
605
+ <tr>
606
+ <td>Use Cases</td>
607
+ <td>Small datasets, high precision required</td>
608
+ <td>Large-scale vector search, recommenders</td>
609
+ <td>NLP, content discovery, similar item search</td>
610
+ <td>Search engines, document retrieval</td>
611
+ </tr>
612
+ <tr>
613
+ <td>Common Metrics</td>
614
+ <td>Euclidean, Manhattan, Cosine</td>
615
+ <td>Euclidean, Inner Product, Cosine</td>
616
+ <td>Cosine, Dot Product</td>
617
+ <td>Jaccard, BM25, TF-IDF</td>
618
+ </tr>
619
+ <tr>
620
+ <td>Dimensions</td>
621
+ <td>Any</td>
622
+ <td>Moderate to high</td>
623
+ <td>High (768-1536 typical)</td>
624
+ <td>Very high (vocabulary size)</td>
625
+ </tr>
626
+ <tr>
627
+ <td>Example Tools</td>
628
+ <td>SciPy, NumPy</td>
629
+ <td>FAISS, Annoy, hnswlib</td>
630
+ <td>Pinecone, Weaviate, Milvus</td>
631
+ <td>Elasticsearch, Lucene</td>
632
+ </tr>
633
+ </tbody>
634
+ </table>
635
+ </div>
636
+
637
+ <script>
638
+ // Common data and utility functions
639
+ const dataPoints = [
640
+ { id: 1, x: 80, y: 70, label: "P1" },
641
+ { id: 2, x: 160, y: 120, label: "P2" },
642
+ { id: 3, x: 240, y: 60, label: "P3" },
643
+ { id: 4, x: 300, y: 180, label: "P4" },
644
+ { id: 5, x: 400, y: 90, label: "P5" },
645
+ { id: 6, x: 180, y: 220, label: "P6" },
646
+ { id: 7, x: 320, y: 260, label: "P7" },
647
+ { id: 8, x: 370, y: 150, label: "P8" },
648
+ { id: 9, x: 130, y: 180, label: "P9" },
649
+ ];
650
+
651
+ const queryPoint = { x: 220, y: 140, label: "Q" };
652
+
653
+ // Semantic search "documents"
654
+ const semanticDocs = [
655
+ { id: 1, text: "How to train a dog", embedding: [0.2, 0.7] },
656
+ { id: 2, text: "Dog training techniques", embedding: [0.25, 0.65] },
657
+ { id: 3, text: "Cat behavior explained", embedding: [0.7, 0.3] },
658
+ { id: 4, text: "Pet care for beginners", embedding: [0.4, 0.5] },
659
+ { id: 5, text: "Feline health issues", embedding: [0.8, 0.2] },
660
+ { id: 6, text: "Training puppies at home", embedding: [0.15, 0.75] },
661
+ { id: 7, text: "Bird watching guide", embedding: [0.9, 0.7] },
662
+ { id: 8, text: "Exotic pet ownership", embedding: [0.6, 0.8] },
663
+ { id: 9, text: "Dog breeds comparison", embedding: [0.3, 0.6] },
664
+ ];
665
+
666
+ const semanticQuery = {
667
+ text: "How to train my puppy",
668
+ embedding: [0.2, 0.8],
669
+ };
670
+
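The toy embeddings in `semanticDocs` and `semanticQuery` above can be ranked with cosine similarity. The following is a minimal sketch and not part of the commit: the names `docs`, `query`, `cosine`, and `rankBySimilarity` are hypothetical, and the 2-D vectors are copied from the diff purely for illustration (real embeddings would come from a model).

```javascript
// Hypothetical helper names; embeddings copied from the commit's toy data.
const docs = [
  { id: 1, embedding: [0.2, 0.7] },
  { id: 2, embedding: [0.25, 0.65] },
  { id: 3, embedding: [0.7, 0.3] },
  { id: 4, embedding: [0.4, 0.5] },
  { id: 5, embedding: [0.8, 0.2] },
  { id: 6, embedding: [0.15, 0.75] },
  { id: 7, embedding: [0.9, 0.7] },
  { id: 8, embedding: [0.6, 0.8] },
  { id: 9, embedding: [0.3, 0.6] },
];
const query = [0.2, 0.8];

// Cosine similarity of two equal-length arrays.
// Assumption: a zero-length vector gets similarity 0 instead of NaN.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best.
function rankBySimilarity(items, q, k) {
  return items
    .map((d) => ({ id: d.id, score: cosine(d.embedding, q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

console.log(rankBySimilarity(docs, query, 3).map((r) => r.id));
```

With these vectors the three dog-training documents (ids 1, 6, 2) rank first, which is the behavior the "Semantic Matches" legend entry describes.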
671
+ // Sparse vector "documents"
672
+ const vocabulary = [
673
+ "dog",
674
+ "cat",
675
+ "train",
676
+ "pet",
677
+ "health",
678
+ "food",
679
+ "guide",
680
+ "home",
681
+ "behavior",
682
+ "puppy",
683
+ ];
684
+
685
+ const sparseVectors = [
686
+ {
687
+ id: 1,
688
+ text: "Dog training guide",
689
+ vector: [0.8, 0, 0.7, 0.1, 0, 0, 0.3, 0, 0, 0],
690
+ },
691
+ {
692
+ id: 2,
693
+ text: "Cat health and food",
694
+ vector: [0, 0.9, 0, 0.2, 0.7, 0.6, 0, 0, 0, 0],
695
+ },
696
+ {
697
+ id: 3,
698
+ text: "Puppy behavior at home",
699
+ vector: [0.3, 0, 0, 0, 0, 0, 0, 0.7, 0.8, 0.9],
700
+ },
701
+ {
702
+ id: 4,
703
+ text: "Pet food guide",
704
+ vector: [0, 0, 0, 0.7, 0, 0.8, 0.6, 0, 0, 0],
705
+ },
706
+ {
707
+ id: 5,
708
+ text: "Cat and dog behavior",
709
+ vector: [0.5, 0.5, 0, 0, 0, 0, 0, 0, 0.9, 0],
710
+ },
711
+ {
712
+ id: 6,
713
+ text: "Training your puppy",
714
+ vector: [0, 0, 0.8, 0, 0, 0, 0, 0, 0, 0.8],
715
+ },
716
+ ];
717
+
718
+ const sparseQuery = {
719
+ text: "dog training puppies",
720
+ vector: [0.6, 0, 0.7, 0, 0, 0, 0, 0, 0, 0.5],
721
+ };
722
+
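The "inverted index search" step can be sketched over the sparse vectors above. This is an illustrative sketch under stated assumptions, not the commit's implementation: `sparseDocs`, `sparseQ`, `buildInvertedIndex`, and `searchSparse` are hypothetical names, the weights are copied from `sparseVectors` and `sparseQuery`, and scoring is a plain dot product rather than BM25.

```javascript
// Hypothetical sketch; weights copied from the commit's sparseVectors/sparseQuery.
const sparseDocs = [
  { id: 1, vector: [0.8, 0, 0.7, 0.1, 0, 0, 0.3, 0, 0, 0] },
  { id: 2, vector: [0, 0.9, 0, 0.2, 0.7, 0.6, 0, 0, 0, 0] },
  { id: 3, vector: [0.3, 0, 0, 0, 0, 0, 0, 0.7, 0.8, 0.9] },
  { id: 4, vector: [0, 0, 0, 0.7, 0, 0.8, 0.6, 0, 0, 0] },
  { id: 5, vector: [0.5, 0.5, 0, 0, 0, 0, 0, 0, 0.9, 0] },
  { id: 6, vector: [0, 0, 0.8, 0, 0, 0, 0, 0, 0, 0.8] },
];
const sparseQ = [0.6, 0, 0.7, 0, 0, 0, 0, 0, 0, 0.5];

// Map each vocabulary dimension to the documents with a non-zero weight there.
function buildInvertedIndex(items) {
  const index = new Map();
  for (const { id, vector } of items) {
    vector.forEach((weight, dim) => {
      if (weight === 0) return;
      if (!index.has(dim)) index.set(dim, []);
      index.get(dim).push({ id, weight });
    });
  }
  return index;
}

// Accumulate dot-product scores, visiting only postings for the query's terms.
function searchSparse(index, q) {
  const scores = new Map();
  q.forEach((queryWeight, dim) => {
    if (queryWeight === 0) return;
    for (const { id, weight } of index.get(dim) || []) {
      scores.set(id, (scores.get(id) || 0) + queryWeight * weight);
    }
  });
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const sparseIndex = buildInvertedIndex(sparseDocs);
console.log(searchSparse(sparseIndex, sparseQ).map((r) => r.id));
```

Documents 2 and 4 share no term with the query, so they are never touched. Skipping them is the point of the inverted index.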
723
+ // Distance functions
724
+ function euclideanDistance(p1, p2) {
725
+ return Math.sqrt(Math.pow(p1.x - p2.x, 2) + Math.pow(p1.y - p2.y, 2));
726
+ }
727
+
728
+ function manhattanDistance(p1, p2) {
729
+ return Math.abs(p1.x - p2.x) + Math.abs(p1.y - p2.y);
730
+ }
731
+
732
+ function cosineDistance(p1, p2) {
733
+ // Treat each point as a vector from the origin; returns 1 - cosine similarity
734
+ const dotProduct = p1.x * p2.x + p1.y * p2.y;
735
+ const mag1 = Math.sqrt(p1.x * p1.x + p1.y * p1.y);
736
+ const mag2 = Math.sqrt(p2.x * p2.x + p2.y * p2.y);
737
+ return 1 - dotProduct / (mag1 * mag2);
738
+ }
739
+
740
+ function cosineSimilarity(v1, v2) {
741
+ let dotProduct = 0;
742
+ let mag1 = 0;
743
+ let mag2 = 0;
744
+
745
+ for (let i = 0; i < v1.length; i++) {
746
+ dotProduct += v1[i] * v2[i];
747
+ mag1 += v1[i] * v1[i];
748
+ mag2 += v2[i] * v2[i];
749
+ }
750
+
751
+ mag1 = Math.sqrt(mag1);
752
+ mag2 = Math.sqrt(mag2);
753
+
754
+ return dotProduct / (mag1 * mag2);
755
+ }
756
+
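The distance helpers above are enough for a brute-force exact search: compute every distance, sort, and take the first k. The sketch below is not part of the commit. `points`, `q`, `metrics`, and `exactNearest` are hypothetical names, the coordinates are copied from `dataPoints` and `queryPoint`, and the zero-vector guard in the cosine metric is an added assumption.

```javascript
// Hypothetical sketch; coordinates copied from the commit's dataPoints/queryPoint.
const points = [
  { label: "P1", x: 80, y: 70 },
  { label: "P2", x: 160, y: 120 },
  { label: "P3", x: 240, y: 60 },
  { label: "P4", x: 300, y: 180 },
  { label: "P5", x: 400, y: 90 },
  { label: "P6", x: 180, y: 220 },
  { label: "P7", x: 320, y: 260 },
  { label: "P8", x: 370, y: 150 },
  { label: "P9", x: 130, y: 180 },
];
const q = { x: 220, y: 140 };

const metrics = {
  euclidean: (a, b) => Math.hypot(a.x - b.x, a.y - b.y),
  manhattan: (a, b) => Math.abs(a.x - b.x) + Math.abs(a.y - b.y),
  // 1 - cosine similarity, treating each point as a vector from the origin.
  // Assumption: a zero vector is given the maximum "unrelated" distance of 1.
  cosine: (a, b) => {
    const denom = Math.hypot(a.x, a.y) * Math.hypot(b.x, b.y);
    return denom === 0 ? 1 : 1 - (a.x * b.x + a.y * b.y) / denom;
  },
};

// Exact search: one distance per point, then a full sort.
function exactNearest(items, query, metricName, k = 1) {
  const dist = metrics[metricName];
  return items
    .map((p) => ({ label: p.label, distance: dist(p, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
}

for (const name of Object.keys(metrics)) {
  console.log(name, exactNearest(points, q, name)[0].label);
}
```

On this data the Euclidean and Manhattan metrics both pick P2, while cosine distance picks P4, because it compares directions from the origin rather than positions.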
757
+ // ENN Canvas Setup
758
+ const ennCanvas = document.getElementById("ennCanvas");
759
+ const ennCtx = ennCanvas.getContext("2d");
760
+ const ennDistanceSelect = document.getElementById("ennDistance");
761
+ const ennStepSelect = document.getElementById("ennStep");
762
+ const ennStepTitle = document.getElementById("ennStepTitle");
763
+ const ennStepDesc = document.getElementById("ennStepDesc");
764
+ const ennTooltip = document.getElementById("ennTooltip");
765
+
766
+ // ANN Canvas Setup
767
+ const annCanvas = document.getElementById("annCanvas");
768
+ const annCtx = annCanvas.getContext("2d");
769
+ const annAlgorithmSelect = document.getElementById("annAlgorithm");
770
+ const annStepSelect = document.getElementById("annStep");
771
+ const annStepTitle = document.getElementById("annStepTitle");
772
+ const annStepDesc = document.getElementById("annStepDesc");
773
+ const annTooltip = document.getElementById("annTooltip");
774
+
775
+ // Semantic Canvas Setup
776
+ const semanticCanvas = document.getElementById("semanticCanvas");
777
+ const semanticCtx = semanticCanvas.getContext("2d");
778
+ const semanticModelSelect = document.getElementById("semanticModel");
779
+ const semanticStepSelect = document.getElementById("semanticStep");
780
+ const semanticStepTitle = document.getElementById("semanticStepTitle");
781
+ const semanticStepDesc = document.getElementById("semanticStepDesc");
782
+ const semanticTooltip = document.getElementById("semanticTooltip");
783
+
784
+ // Sparse Canvas Setup
785
+ const sparseCanvas = document.getElementById("sparseCanvas");
786
+ const sparseCtx = sparseCanvas.getContext("2d");
787
+ const sparseModelSelect = document.getElementById("sparseModel");
788
+ const sparseStepSelect = document.getElementById("sparseStep");
789
+ const sparseStepTitle = document.getElementById("sparseStepTitle");
790
+ const sparseStepDesc = document.getElementById("sparseStepDesc");
791
+ const sparseTooltip = document.getElementById("sparseTooltip");
792
+
793
+ // Event listeners for ENN
794
+ ennDistanceSelect.addEventListener("change", renderENNSearch);
795
+ ennStepSelect.addEventListener("change", renderENNSearch);
796
+
797
+ // Event listeners for ANN
798
+ annAlgorithmSelect.addEventListener("change", renderANNSearch);
799
+ annStepSelect.addEventListener("change", renderANNSearch);
800
+
801
+ // Event listeners for Semantic Search
802
+ semanticModelSelect.addEventListener("change", renderSemanticSearch);
803
+ semanticStepSelect.addEventListener("change", renderSemanticSearch);
804
+
805
+ // Event listeners for Sparse Vector Search
806
+ sparseModelSelect.addEventListener("change", renderSparseSearch);
807
+ sparseStepSelect.addEventListener("change", renderSparseSearch);
808
+
809
+ // Draw all visualizations initially
810
+ renderENNSearch();
811
+ renderANNSearch();
812
+ renderSemanticSearch();
813
+ renderSparseSearch();
814
+
815
+ // ENN Search visualization
816
+ function renderENNSearch() {
817
+ const distanceMetric = ennDistanceSelect.value;
818
+ const step = parseInt(ennStepSelect.value, 10);
819
+
820
+ // Clear canvas
821
+ ennCtx.clearRect(0, 0, ennCanvas.width, ennCanvas.height);
822
+
823
+ // Draw grid
824
+ drawGrid(ennCtx);
825
+
826
+ // Calculate distances based on selected metric
827
+ let distances = dataPoints.map((point) => {
828
+ let dist;
829
+ if (distanceMetric === "euclidean") {
830
+ dist = euclideanDistance(point, queryPoint);
831
+ } else if (distanceMetric === "manhattan") {
832
+ dist = manhattanDistance(point, queryPoint);
833
+ } else if (distanceMetric === "cosine") {
834
+ dist = cosineDistance(point, queryPoint);
835
+ }
836
+ return { ...point, distance: dist };
837
+ });
838
+
839
+ // Sort by distance
840
+ let sortedPoints = [...distances].sort(
841
+ (a, b) => a.distance - b.distance
842
+ );
843
+
844
+ // Draw data points
845
+ dataPoints.forEach((point) => {
846
+ drawPoint(ennCtx, point.x, point.y, "#3498db", point.label);
847
+ });
848
+
849
+ // Draw query point
850
+ drawPoint(
851
+ ennCtx,
852
+ queryPoint.x,
853
+ queryPoint.y,
854
+ "#e74c3c",
855
+ queryPoint.label,
856
+ 12
857
+ );
858
+
859
+ // Step-specific rendering
860
+ if (step >= 1) {
861
+ // Draw distance lines to all points
862
+ distances.forEach((point) => {
863
+ drawLine(
864
+ ennCtx,
865
+ queryPoint.x,
866
+ queryPoint.y,
867
+ point.x,
868
+ point.y,
869
+ "#aaa",
870
+ [3, 3]
871
+ );
872
+
873
+ // Draw distance value
874
+ const midX = (queryPoint.x + point.x) / 2;
875
+ const midY = (queryPoint.y + point.y) / 2;
876
+ ennCtx.fillStyle = "#555";
877
+ ennCtx.font = "11px Arial";
878
+ ennCtx.textAlign = "center";
879
+ ennCtx.fillText(point.distance.toFixed(1), midX, midY);
880
+ });
881
+ }
882
+
883
+ if (step >= 2) {
884
+ // Visualize sorting by distance
885
+ let yPos = 20;
886
+ ennCtx.fillStyle = "#333";
887
+ ennCtx.font = "12px Arial";
888
+ ennCtx.textAlign = "left";
889
+ ennCtx.fillText("Sorted by distance:", 10, yPos);
890
+
891
+ for (let i = 0; i < Math.min(5, sortedPoints.length); i++) {
892
+ yPos += 15;
893
+ ennCtx.fillText(
894
+ `${i + 1}. ${sortedPoints[i].label} (${sortedPoints[
895
+ i
896
+ ].distance.toFixed(1)})`,
897
+ 15,
898
+ yPos
899
+ );
900
+ }
901
+ }
902
+
903
+ if (step >= 3) {
904
+ // Highlight nearest neighbor(s)
905
+ const nearest = sortedPoints[0];
906
+ drawPoint(
907
+ ennCtx,
908
+ nearest.x,
909
+ nearest.y,
910
+ "#3498db",
911
+ nearest.label,
912
+ 10,
913
+ "#2ecc71",
914
+ 3
915
+ );
916
+ drawLine(
917
+ ennCtx,
918
+ queryPoint.x,
919
+ queryPoint.y,
920
+ nearest.x,
921
+ nearest.y,
922
+ "#2ecc71",
923
+ [],
924
+ 2
925
+ );
926
+
927
+ // Draw threshold for the nearest distance
928
+ if (distanceMetric === "euclidean") {
929
+ ennCtx.beginPath();
930
+ ennCtx.arc(
931
+ queryPoint.x,
932
+ queryPoint.y,
933
+ nearest.distance,
934
+ 0,
935
+ Math.PI * 2
936
+ );
937
+ ennCtx.strokeStyle = "rgba(231, 76, 60, 0.4)";
938
+ ennCtx.stroke();
939
+ ennCtx.fillStyle = "rgba(231, 76, 60, 0.05)";
940
+ ennCtx.fill();
941
+ } else if (distanceMetric === "manhattan") {
942
+ // Draw diamond shape
943
+ ennCtx.beginPath();
944
+ ennCtx.moveTo(queryPoint.x, queryPoint.y - nearest.distance);
945
+ ennCtx.lineTo(queryPoint.x + nearest.distance, queryPoint.y);
946
+ ennCtx.lineTo(queryPoint.x, queryPoint.y + nearest.distance);
947
+ ennCtx.lineTo(queryPoint.x - nearest.distance, queryPoint.y);
948
+ ennCtx.closePath();
949
+ ennCtx.strokeStyle = "rgba(231, 76, 60, 0.4)";
950
+ ennCtx.stroke();
951
+ ennCtx.fillStyle = "rgba(231, 76, 60, 0.05)";
952
+ ennCtx.fill();
953
+ } else if (distanceMetric === "cosine") {
954
+ // Complicated to visualize in 2D space, show a text note
955
+ ennCtx.fillStyle = "rgba(231, 76, 60, 0.7)";
956
+ ennCtx.fillText(
957
+ "Cosine similarity measures angle between vectors",
958
+ 250,
959
+ 30
960
+ );
961
+ ennCtx.fillText("smaller angle = more similar", 250, 45);
962
+ }
963
+ }
964
+
965
+ // Update step description
966
+ updateENNStepInfo(step, distanceMetric);
967
+ }
968
+
969
+ // ANN Search visualization
970
+ function renderANNSearch() {
971
+ const algorithm = annAlgorithmSelect.value;
972
+ const step = parseInt(annStepSelect.value, 10);
973
+
974
+ // Clear canvas
975
+ annCtx.clearRect(0, 0, annCanvas.width, annCanvas.height);
976
+
977
+ // Draw grid
978
+ drawGrid(annCtx);
979
+
980
+ // Draw data points
981
+ dataPoints.forEach((point) => {
982
+ drawPoint(annCtx, point.x, point.y, "#3498db", point.label);
983
+ });
984
+
985
+ // Draw query point
986
+ drawPoint(
987
+ annCtx,
988
+ queryPoint.x,
989
+ queryPoint.y,
990
+ "#e74c3c",
991
+ queryPoint.label,
992
+ 12
993
+ );
994
+
995
+ // Step-specific rendering based on algorithm
996
+ if (algorithm === "hnsw") {
997
+ renderHNSW(annCtx, step);
998
+ } else if (algorithm === "pq") {
999
+ renderProductQuantization(annCtx, step);
1000
+ } else if (algorithm === "lsh") {
1001
+ renderLSH(annCtx, step);
1002
+ }
1003
+
1004
+ // Update step description
1005
+ updateANNStepInfo(step, algorithm);
1006
+ }
1007
+
1008
+ // Semantic Search visualization
1009
+ function renderSemanticSearch() {
1010
+ const model = semanticModelSelect.value;
1011
+ const step = parseInt(semanticStepSelect.value, 10);
1012
+
1013
+ // Clear canvas
1014
+ semanticCtx.clearRect(
1015
+ 0,
1016
+ 0,
1017
+ semanticCanvas.width,
1018
+ semanticCanvas.height
1019
+ );
1020
+
1021
+ if (step === 0) {
1022
+ // Show text documents
1023
+ drawTextDocuments(semanticCtx, semanticDocs, semanticQuery);
1024
+ } else {
1025
+ // Draw embedding space
1026
+ drawGrid(semanticCtx);
1027
+
1028
+ // Draw document embeddings (2D projection)
1029
+ semanticDocs.forEach((doc) => {
1030
+ // Scale to canvas
1031
+ const x = doc.embedding[0] * 400 + 30;
1032
+ const y = (1 - doc.embedding[1]) * 250 + 20;
1033
+ drawPoint(semanticCtx, x, y, "#3498db", `D${doc.id}`);
1034
+ });
1035
+
1036
+ // Draw query embedding
1037
+ const qx = semanticQuery.embedding[0] * 400 + 30;
1038
+ const qy = (1 - semanticQuery.embedding[1]) * 250 + 20;
1039
+ drawPoint(semanticCtx, qx, qy, "#e74c3c", "Q", 12);
1040
+
1041
+ if (step >= 2) {
1042
+ // Calculate similarities
1043
+ const similarities = semanticDocs
1044
+ .map((doc) => ({
1045
+ ...doc,
1046
+ similarity: cosineSimilarity(
1047
+ doc.embedding,
1048
+ semanticQuery.embedding
1049
+ ),
1050
+ }))
1051
+ .sort((a, b) => b.similarity - a.similarity);
1052
+
1053
+ // Draw lines to most similar docs
1054
+ for (let i = 0; i < Math.min(3, similarities.length); i++) {
1055
+ const doc = similarities[i];
1056
+ const dx = doc.embedding[0] * 400 + 30;
1057
+ const dy = (1 - doc.embedding[1]) * 250 + 20;
1058
+
1059
+ const lineWidth = 3 - i;
1060
+ drawLine(semanticCtx, qx, qy, dx, dy, "#2ecc71", [], lineWidth);
1061
+
1062
+ // Highlight the similar document
1063
+ drawPoint(
1064
+ semanticCtx,
1065
+ dx,
1066
+ dy,
1067
+ "#3498db",
1068
+ `D${doc.id}`,
1069
+ 10,
1070
+ "#2ecc71",
1071
+ 2
1072
+ );
1073
+
1074
+ // Show similarity score
1075
+ const midX = (qx + dx) / 2;
1076
+ const midY = (qy + dy) / 2 - 10;
1077
+ semanticCtx.fillStyle = "#555";
1078
+ semanticCtx.font = "11px Arial";
1079
+ semanticCtx.textAlign = "center";
1080
+ semanticCtx.fillText(doc.similarity.toFixed(2), midX, midY);
1081
+ }
1082
+
1083
+ if (step >= 3) {
1084
+ // Display top results
1085
+ let yPos = 20;
1086
+ semanticCtx.fillStyle = "#333";
1087
+ semanticCtx.font = "12px Arial";
1088
+ semanticCtx.textAlign = "left";
1089
+ semanticCtx.fillText("Top matches:", 10, yPos);
1090
+
1091
+ for (let i = 0; i < Math.min(3, similarities.length); i++) {
1092
+ yPos += 15;
1093
+ semanticCtx.fillText(
1094
+ `${similarities[i].text} (${similarities[
1095
+ i
1096
+ ].similarity.toFixed(2)})`,
1097
+ 15,
1098
+ yPos
1099
+ );
1100
+ }
1101
+ }
1102
+ }
1103
+ }
1104
+
1105
+ // Update step description
1106
+ updateSemanticStepInfo(step, model);
1107
+ }
1108
+
1109
+ // Sparse Vector Search visualization
1110
+ function renderSparseSearch() {
1111
+ const model = sparseModelSelect.value;
1112
+ const step = parseInt(sparseStepSelect.value, 10);
1113
+
1114
+ // Clear canvas
1115
+ sparseCtx.clearRect(0, 0, sparseCanvas.width, sparseCanvas.height);
1116
+
1117
+ if (step === 0) {
1118
+ // Show text documents with highlighted tokens
1119
+ drawTokenizedDocuments(sparseCtx, sparseVectors, sparseQuery);
1120
+ } else {
1121
+ // Draw sparse vectors visualization
1122
+ drawSparseVectors(sparseCtx, sparseVectors, sparseQuery, step, model);
1123
+
1124
+ if (step >= 2) {
1125
+ // Calculate matching scores
1126
+ const matches = sparseVectors
1127
+ .map((doc) => {
1128
+ let score = 0;
1129
+ for (let i = 0; i < doc.vector.length; i++) {
1130
+ score += doc.vector[i] * sparseQuery.vector[i];
1131
+ }
1132
+ return { ...doc, score };
1133
+ })
1134
+ .sort((a, b) => b.score - a.score);
1135
+
1136
+ if (step >= 3) {
1137
+ // Display top results
1138
+ let yPos = 20;
1139
+ sparseCtx.fillStyle = "#333";
1140
+ sparseCtx.font = "12px Arial";
1141
+ sparseCtx.textAlign = "left";
1142
+ sparseCtx.fillText("Top matches:", 300, yPos);
1143
+
1144
+ for (let i = 0; i < Math.min(3, matches.length); i++) {
1145
+ yPos += 15;
1146
+ sparseCtx.fillText(
1147
+ `${matches[i].text} (${matches[i].score.toFixed(2)})`,
1148
+ 300,
1149
+ yPos
1150
+ );
1151
+ }
1152
+ }
1153
+ }
1154
+ }
1155
+
1156
+ // Update step description
1157
+ updateSparseStepInfo(step, model);
1158
+ }
1159
+
1160
+ // Algorithm-specific renderers for ANN
1161
+ function renderHNSW(ctx, step) {
1162
+ if (step >= 1) {
1163
+ // Draw HNSW layers
1164
+ ctx.strokeStyle = "#f39c12";
1165
+ ctx.lineWidth = 1;
1166
+
1167
+ // Top layer (sparse connections)
1168
+ const topLayer = [dataPoints[2], dataPoints[4], dataPoints[7]];
1169
+ topLayer.forEach((p1, i) => {
1170
+ topLayer.forEach((p2, j) => {
1171
+ if (i !== j) {
1172
+ drawLine(ctx, p1.x, p1.y, p2.x, p2.y, "#f39c12", [2, 2], 1);
1173
+ }
1174
+ });
1175
+ });
1176
+
1177
+ // Middle layer (more connections)
1178
+ if (step >= 2) {
1179
+ const midLayer = [
1180
+ dataPoints[1],
1181
+ dataPoints[2],
1182
+ dataPoints[4],
1183
+ dataPoints[6],
1184
+ dataPoints[7],
1185
+ ];
1186
+ midLayer.forEach((p1, i) => {
1187
+ let connections = 0;
1188
+ midLayer.forEach((p2, j) => {
1189
+ if (i !== j && connections < 3) {
1190
+ drawLine(ctx, p1.x, p1.y, p2.x, p2.y, "#f39c12", [], 1);
1191
+ connections++;
1192
+ }
1193
+ });
1194
+ });
1195
+
1196
+ // Entry point search
1197
+ const entryPoint = dataPoints[4]; // Fixed entry point chosen for the illustration
1198
+ drawPoint(
1199
+ ctx,
1200
+ entryPoint.x,
1201
+ entryPoint.y,
1202
+ "#3498db",
1203
+ entryPoint.label,
1204
+ 10,
1205
+ "#f39c12",
1206
+ 2
1207
+ );
1208
+ drawLine(
1209
+ ctx,
1210
+ queryPoint.x,
1211
+ queryPoint.y,
1212
+ entryPoint.x,
1213
+ entryPoint.y,
1214
+ "#f39c12",
1215
+ [],
1216
+ 2
1217
+ );
1218
+ }
1219
+
1220
+ if (step >= 3) {
1221
+ // Show local greedy search path
1222
+ const searchPath = [
1223
+ dataPoints[4],
1224
+ dataPoints[7],
1225
+ dataPoints[6],
1226
+ dataPoints[2],
1227
+ ];
1228
+
1229
+ for (let i = 0; i < searchPath.length - 1; i++) {
1230
+ const p1 = searchPath[i];
1231
+ const p2 = searchPath[i + 1];
1232
+ drawLine(ctx, p1.x, p1.y, p2.x, p2.y, "#e74c3c", [], 2);
1233
+
1234
+ if (i < searchPath.length - 2) {
1235
+ drawPoint(
1236
+ ctx,
1237
+ p1.x,
1238
+ p1.y,
1239
+ "#3498db",
1240
+ p1.label,
1241
+ 10,
1242
+ "#f39c12",
1243
+ 2
1244
+ );
1245
+ }
1246
+ }
1247
+
1248
+ // Final result
1249
+ const nearest = dataPoints[2];
1250
+ drawPoint(
1251
+ ctx,
1252
+ nearest.x,
1253
+ nearest.y,
1254
+ "#3498db",
1255
+ nearest.label,
1256
+ 10,
1257
+ "#2ecc71",
1258
+ 3
1259
+ );
1260
+ drawLine(
1261
+ ctx,
1262
+ queryPoint.x,
1263
+ queryPoint.y,
1264
+ nearest.x,
1265
+ nearest.y,
1266
+ "#2ecc71",
1267
+ [],
1268
+ 2
1269
+ );
1270
+ }
1271
+ }
1272
+ }
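// Sketch, not part of the original commit: the search path in renderHNSW is
// hard-coded, so this hypothetical helper shows how a greedy walk over a
// single graph layer could be computed instead. The name sketchGreedySearch
// and the `neighbors` argument (a Map from point id to an array of points)
// are assumptions. Real HNSW also keeps a candidate list and descends
// through layers, which this simplified version leaves out.
function sketchGreedySearch(entry, query, neighbors) {
  const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);
  const path = [entry];
  let current = entry;
  let improved = true;
  while (improved) {
    improved = false;
    // The neighbor list is read once per hop, from the node we start the hop at.
    for (const candidate of neighbors.get(current.id) || []) {
      // Move only to a neighbor that is strictly closer to the query.
      if (dist(candidate, query) < dist(current, query)) {
        current = candidate;
        improved = true;
      }
    }
    if (improved) path.push(current);
  }
  return path; // The last element is the local (approximate) nearest neighbor.
}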
1273
+
1274
+ function renderProductQuantization(ctx, step) {
1275
+ if (step >= 1) {
1276
+ // Draw PQ centroids and quantized regions
1277
+
1278
+ // Split canvas into 4 regions (simple quantization visualization)
1279
+ ctx.strokeStyle = "#f39c12";
1280
+ ctx.lineWidth = 2;
1281
+ ctx.setLineDash([]);
1282
+
1283
+ // Vertical split
1284
+ ctx.beginPath();
1285
+ ctx.moveTo(ennCanvas.width / 2, 0);
1286
+ ctx.lineTo(ennCanvas.width / 2, ennCanvas.height);
1287
+ ctx.stroke();
1288
+
1289
+ // Horizontal split
1290
+ ctx.beginPath();
1291
+ ctx.moveTo(0, ennCanvas.height / 2);
1292
+ ctx.lineTo(ennCanvas.width, ennCanvas.height / 2);
1293
+ ctx.stroke();
1294
+
1295
+ // Label regions
1296
+ ctx.fillStyle = "#f39c12";
1297
+ ctx.font = "12px Arial";
1298
+ ctx.textAlign = "center";
1299
+ ctx.fillText("Region 1", ennCanvas.width / 4, ennCanvas.height / 4);
1300
+ ctx.fillText(
1301
+ "Region 2",
1302
+ (3 * ennCanvas.width) / 4,
1303
+ ennCanvas.height / 4
1304
+ );
1305
+ ctx.fillText(
1306
+ "Region 3",
1307
+ ennCanvas.width / 4,
1308
+ (3 * ennCanvas.height) / 4
1309
+ );
1310
+ ctx.fillText(
1311
+ "Region 4",
1312
+ (3 * ennCanvas.width) / 4,
1313
+ (3 * ennCanvas.height) / 4
1314
+ );
1315
+
1316
+ if (step >= 2) {
1317
+ // Identify query region
1318
+ let queryRegion;
1319
+ if (queryPoint.x < ennCanvas.width / 2) {
1320
+ if (queryPoint.y < ennCanvas.height / 2) {
1321
+ queryRegion = 1;
1322
+ } else {
1323
+ queryRegion = 3;
1324
+ }
1325
+ } else {
1326
+ if (queryPoint.y < ennCanvas.height / 2) {
1327
+ queryRegion = 2;
1328
+ } else {
1329
+ queryRegion = 4;
1330
+ }
1331
+ }
1332
+
1333
+ // Highlight query region
1334
+ ctx.fillStyle = "rgba(243, 156, 18, 0.1)";
1335
+ if (queryRegion === 1) {
1336
+ ctx.fillRect(0, 0, ennCanvas.width / 2, ennCanvas.height / 2);
1337
+ } else if (queryRegion === 2) {
1338
+ ctx.fillRect(
1339
+ ennCanvas.width / 2,
1340
+ 0,
1341
+ ennCanvas.width / 2,
1342
+ ennCanvas.height / 2
1343
+ );
1344
+ } else if (queryRegion === 3) {
1345
+ ctx.fillRect(
1346
+ 0,
1347
+ ennCanvas.height / 2,
1348
+ ennCanvas.width / 2,
1349
+ ennCanvas.height / 2
1350
+ );
1351
+ } else {
1352
+ ctx.fillRect(
1353
+ ennCanvas.width / 2,
1354
+ ennCanvas.height / 2,
1355
+ ennCanvas.width / 2,
1356
+ ennCanvas.height / 2
1357
+ );
1358
+ }
1359
+
1360
+ // Only search points in that region
1361
+ const pointsInRegion = dataPoints.filter((p) => {
1362
+ const region =
1363
+ p.x < ennCanvas.width / 2
1364
+ ? p.y < ennCanvas.height / 2
1365
+ ? 1
1366
+ : 3
1367
+ : p.y < ennCanvas.height / 2
1368
+ ? 2
1369
+ : 4;
1370
+ return region === queryRegion;
1371
+ });
1372
+
1373
+ // Draw lines to only those points
1374
+ pointsInRegion.forEach((point) => {
1375
+ drawLine(
1376
+ ctx,
1377
+ queryPoint.x,
1378
+ queryPoint.y,
1379
+ point.x,
1380
+ point.y,
1381
+ "#aaa",
1382
+ [3, 3]
1383
+ );
1384
+ });
1385
+ }
1386
+
1387
+ if (step >= 3) {
1388
+ // Find approximated nearest (would be from the shortlisted region)
1389
+ const distances = dataPoints.map((point) => ({
1390
+ ...point,
1391
+ distance: euclideanDistance(point, queryPoint),
1392
+ }));
1393
+
1394
+ // Filter to correct region first
1395
+ let queryRegion;
1396
+ if (queryPoint.x < ennCanvas.width / 2) {
1397
+ if (queryPoint.y < ennCanvas.height / 2) {
1398
+ queryRegion = 1;
1399
+ } else {
1400
+ queryRegion = 3;
1401
+ }
1402
+ } else {
1403
+ if (queryPoint.y < ennCanvas.height / 2) {
1404
+ queryRegion = 2;
1405
+ } else {
1406
+ queryRegion = 4;
1407
+ }
1408
+ }
1409
+
1410
+ const pointsInRegion = distances.filter((p) => {
1411
+ const region =
1412
+ p.x < ennCanvas.width / 2
1413
+ ? p.y < ennCanvas.height / 2
1414
+ ? 1
1415
+ : 3
1416
+ : p.y < ennCanvas.height / 2
1417
+ ? 2
1418
+ : 4;
1419
+ return region === queryRegion;
1420
+ });
1421
+
1422
+ // Sort to find nearest in region
1423
+ const nearest = pointsInRegion.sort(
1424
+ (a, b) => a.distance - b.distance
1425
+ )[0];
+ if (!nearest) return; // No data point shares the query's region
1426
+
1427
+ // Highlight approximate nearest neighbor
1428
+ drawPoint(
1429
+ ctx,
1430
+ nearest.x,
1431
+ nearest.y,
1432
+ "#3498db",
1433
+ nearest.label,
1434
+ 10,
1435
+ "#2ecc71",
1436
+ 3
1437
+ );
1438
+ drawLine(
1439
+ ctx,
1440
+ queryPoint.x,
1441
+ queryPoint.y,
1442
+ nearest.x,
1443
+ nearest.y,
1444
+ "#2ecc71",
1445
+ [],
1446
+ 2
1447
+ );
1448
+
1449
+ // Check if it's actually the true nearest neighbor
1450
+ const trueNearest = distances.sort(
1451
+ (a, b) => a.distance - b.distance
1452
+ )[0];
1453
+ if (nearest.id !== trueNearest.id) {
1454
+ // Show actual nearest as reference
1455
+ drawPoint(
1456
+ ctx,
1457
+ trueNearest.x,
1458
+ trueNearest.y,
1459
+ "#3498db",
1460
+ trueNearest.label,
1461
+ 10,
1462
+ "#e74c3c",
1463
+ 2
1464
+ );
1465
+ drawLine(
1466
+ ctx,
1467
+ queryPoint.x,
1468
+ queryPoint.y,
1469
+ trueNearest.x,
1470
+ trueNearest.y,
1471
+ "#e74c3c",
1472
+ [5, 5],
1473
+ 1
1474
+ );
1475
+
1476
+ ctx.fillStyle = "#e74c3c";
1477
+ ctx.font = "12px Arial";
1478
+ ctx.textAlign = "left";
1479
+ ctx.fillText("Approximation error", 10, 20);
1480
+ ctx.fillText(`True nearest: ${trueNearest.label}`, 10, 35);
1481
+ } else {
1482
+ ctx.fillStyle = "#2ecc71";
1483
+ ctx.font = "12px Arial";
1484
+ ctx.textAlign = "left";
1485
+ ctx.fillText("Correct match", 10, 20);
1486
+ }
1487
+ }
1488
+ }
1489
+ }
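// Sketch, not part of the original commit: renderProductQuantization repeats
// the same quadrant test four times. A hypothetical helper such as
// sketchQuadrant could hold it in one place. Passing width and height as
// parameters (an assumption) would also avoid tying the logic to ennCanvas
// when the function draws on a different canvas.
function sketchQuadrant(point, width, height) {
  const right = point.x >= width / 2;
  const bottom = point.y >= height / 2;
  // 1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right,
  // matching the "Region N" labels drawn on the canvas.
  return 1 + (right ? 1 : 0) + (bottom ? 2 : 0);
}
// Possible usage, with w and h taken from ctx.canvas:
//   dataPoints.filter((p) => sketchQuadrant(p, w, h) === sketchQuadrant(queryPoint, w, h))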
1490
+
1491
+ // Helper functions for visualizations
1492
+ function drawGrid(ctx) {
1493
+ ctx.strokeStyle = "#e0e0e0";
1494
+ ctx.lineWidth = 0.5;
1495
+
1496
+ // Vertical lines
1497
+ for (let x = 0; x < ctx.canvas.width; x += 40) {
1498
+ ctx.beginPath();
1499
+ ctx.moveTo(x, 0);
1500
+ ctx.lineTo(x, ctx.canvas.height);
1501
+ ctx.stroke();
1502
+ }
1503
+
1504
+ // Horizontal lines
1505
+ for (let y = 0; y < ctx.canvas.height; y += 40) {
1506
+ ctx.beginPath();
1507
+ ctx.moveTo(0, y);
1508
+ ctx.lineTo(ctx.canvas.width, y);
1509
+ ctx.stroke();
1510
+ }
1511
+ }
1512
+
1513
+ function drawPoint(
1514
+ ctx,
1515
+ x,
1516
+ y,
1517
+ color,
1518
+ label,
1519
+ radius = 8,
1520
+ strokeColor = "#333",
1521
+ strokeWidth = 1
1522
+ ) {
1523
+ ctx.beginPath();
1524
+ ctx.arc(x, y, radius, 0, Math.PI * 2);
1525
+ ctx.fillStyle = color;
1526
+ ctx.fill();
1527
+ ctx.strokeStyle = strokeColor;
1528
+ ctx.lineWidth = strokeWidth;
1529
+ ctx.stroke();
1530
+
1531
+ // Label
1532
+ ctx.fillStyle = "#333";
1533
+ ctx.font = "12px Arial";
1534
+ ctx.textAlign = "center";
1535
+ ctx.fillText(label, x, y - radius - 5);
1536
+ }
1537
+
1538
+ function drawLine(
1539
+ ctx,
1540
+ x1,
1541
+ y1,
1542
+ x2,
1543
+ y2,
1544
+ color = "#333",
1545
+ dash = [],
1546
+ width = 1
1547
+ ) {
1548
+ ctx.beginPath();
1549
+ ctx.setLineDash(dash);
1550
+ ctx.strokeStyle = color;
1551
+ ctx.lineWidth = width;
1552
+ ctx.moveTo(x1, y1);
1553
+ ctx.lineTo(x2, y2);
1554
+ ctx.stroke();
1555
+ ctx.setLineDash([]);
1556
+ }
1557
+
1558
+ function drawTextDocuments(ctx, docs, query) {
1559
+ ctx.fillStyle = "#333";
1560
+ ctx.font = "14px Arial";
1561
+ ctx.textAlign = "left";
1562
+
1563
+ // Draw title
1564
+ ctx.fillText("Original Text Documents:", 20, 30);
1565
+
1566
+ // Draw documents
1567
+ let y = 60;
1568
+ docs.slice(0, 5).forEach((doc) => {
1569
+ ctx.fillStyle = "#3498db";
1570
+ ctx.fillText(`D${doc.id}: ${doc.text}`, 20, y);
1571
+ y += 25;
1572
+ });
1573
+
1574
+ // Draw query
1575
+ y += 20;
1576
+ ctx.fillStyle = "#e74c3c";
1577
+ ctx.fillText(`Query: "${query.text}"`, 20, y);
1578
+
1579
+ // Instructions
1580
+ y += 40;
1581
+ ctx.fillStyle = "#333";
1582
+ ctx.fillText(
1583
+ "Step 1: These documents will be converted to vector embeddings",
1584
+ 20,
1585
+ y
1586
+ );
1587
+ ctx.fillText("that capture their semantic meaning.", 20, y + 20);
1588
+ }
1589
+
1590
+ function drawTokenizedDocuments(ctx, docs, query) {
1591
+ ctx.fillStyle = "#333";
1592
+ ctx.font = "14px Arial";
1593
+ ctx.textAlign = "left";
1594
+
1595
+ // Draw title
1596
+ ctx.fillText("Tokenized Documents:", 20, 30);
1597
+
1598
+ // Draw vocabulary
1599
+ ctx.fillText(
1600
+ "Vocabulary: dog, cat, train, pet, health, food, guide, home, behavior, puppy",
1601
+ 20,
1602
+ 50
1603
+ );
1604
+
1605
+ // Draw documents with highlighted tokens
1606
+ let y = 80;
1607
+ docs.slice(0, 5).forEach((doc) => {
1608
+ ctx.fillStyle = "#3498db";
1609
+ ctx.fillText(`D${doc.id}: ${doc.text}`, 20, y);
1610
+
1611
+ // Show token highlighting
1612
+ for (let i = 0; i < vocabulary.length; i++) {
1613
+ if (
1614
+ doc.vector[i] > 0 &&
1615
+ doc.text.toLowerCase().includes(vocabulary[i])
1616
+ ) {
1617
+ const startX = 20 + ctx.measureText(`D${doc.id}: `).width;
1618
+ const wordStart = doc.text.toLowerCase().indexOf(vocabulary[i]);
1619
+ const prefix = doc.text.substring(0, wordStart);
1620
+ const prefixWidth = ctx.measureText(prefix).width;
1621
+ const wordWidth = ctx.measureText(vocabulary[i]).width;
1622
+
1623
+ ctx.fillStyle = "rgba(46, 204, 113, 0.3)";
1624
+ ctx.fillRect(startX + prefixWidth, y - 12, wordWidth, 15);
1625
+ }
1626
+ }
1627
+
1628
+ y += 25;
1629
+ });
1630
+
1631
+ // Draw query with highlighted tokens
1632
+ y += 20;
1633
+ ctx.fillStyle = "#e74c3c";
1634
+ ctx.fillText(`Query: "${query.text}"`, 20, y);
1635
+
1636
+ // Highlight query tokens
1637
+ for (let i = 0; i < vocabulary.length; i++) {
1638
+ if (
1639
+ query.vector[i] > 0 &&
1640
+ query.text.toLowerCase().includes(vocabulary[i])
1641
+ ) {
1642
+ const startX = 20 + ctx.measureText(`Query: "`).width;
1643
+ const wordStart = query.text.toLowerCase().indexOf(vocabulary[i]);
1644
+ const prefix = query.text.substring(0, wordStart);
1645
+ const prefixWidth = ctx.measureText(prefix).width;
1646
+ const wordWidth = ctx.measureText(vocabulary[i]).width;
1647
+
1648
+ ctx.fillStyle = "rgba(231, 76, 60, 0.3)";
1649
+ ctx.fillRect(startX + prefixWidth, y - 12, wordWidth, 15);
1650
+ }
1651
+ }
1652
+ }
1653
+
1654
+ function drawSparseVectors(ctx, docs, query, step, model) {
1655
+ const barWidth = 15;
1656
+ const barSpacing = 5;
1657
+ const startX = 40;
1658
+ const startY = 220;
1659
+ const maxBarHeight = 100;
1660
+
1661
+ if (step >= 1) {
1662
+ // Draw vocabulary labels on x-axis
1663
+ ctx.fillStyle = "#333";
1664
+ ctx.font = "10px Arial";
1665
+ ctx.textAlign = "center";
1666
+
1667
+ vocabulary.forEach((word, i) => {
1668
+ const x = startX + i * (barWidth + barSpacing) + barWidth / 2;
1669
+ ctx.fillText(word, x, startY + 15);
1670
+ });
1671
+
1672
+ // Draw axis titles
1673
+ ctx.textAlign = "center";
1674
+ ctx.fillText("Vocabulary Terms", 230, startY + 30);
1675
+
1676
+ ctx.save();
1677
+ ctx.translate(15, 150);
1678
+ ctx.rotate(-Math.PI / 2);
1679
+ ctx.fillText("Term Weight", 0, 0);
1680
+ ctx.restore();
1681
+
1682
+ // Draw query vector
1683
+ ctx.fillStyle = "#333";
1684
+ ctx.font = "12px Arial";
1685
+ ctx.textAlign = "left";
1686
+ ctx.fillText("Query vector:", 20, 40);
1687
+
1688
+ query.vector.forEach((value, i) => {
1689
+ const x = startX + i * (barWidth + barSpacing);
1690
+ const barHeight = value * maxBarHeight;
1691
+
1692
+ ctx.fillStyle = value > 0 ? "#e74c3c" : "#f8f9fa";
1693
+ ctx.fillRect(x, startY - barHeight, barWidth, barHeight);
1694
+
1695
+ if (value > 0) {
1696
+ ctx.fillStyle = "#fff";
1697
+ ctx.textAlign = "center";
1698
+ ctx.font = "9px Arial";
1699
+ ctx.fillText(
1700
+ value.toFixed(1),
1701
+ x + barWidth / 2,
1702
+ startY - barHeight / 2
1703
+ );
1704
+ }
1705
+
1706
+ // Also draw mini version above
1707
+ const miniHeight = value * 20;
1708
+ ctx.fillStyle = value > 0 ? "#e74c3c" : "#f8f9fa";
1709
+ ctx.fillRect(x, 50, barWidth, miniHeight);
1710
+ });
1711
+
1712
+ if (step >= 2) {
1713
+ // Draw a document vector for comparison
1714
+ const matchingDoc = docs.find((d) => d.id === 1); // Dog training guide
1715
+
1716
+ ctx.fillStyle = "#333";
1717
+ ctx.font = "12px Arial";
1718
+ ctx.textAlign = "left";
1719
+ ctx.fillText(`Document: "${matchingDoc.text}"`, 20, 100);
1720
+
1721
+ matchingDoc.vector.forEach((value, i) => {
1722
+ const x = startX + i * (barWidth + barSpacing);
1723
+ const miniHeight = value * 20;
1724
+
1725
+ // Mini version above
1726
+ ctx.fillStyle = value > 0 ? "#3498db" : "#f8f9fa";
1727
+ ctx.fillRect(x, 110, barWidth, miniHeight);
1728
+
1729
+ // Highlight matching terms
1730
+ if (value > 0 && query.vector[i] > 0) {
1731
+ ctx.fillStyle = "#2ecc71";
1732
+ ctx.strokeStyle = "#2ecc71";
1733
+ ctx.lineWidth = 2;
1734
+ ctx.strokeRect(x, 50, barWidth, query.vector[i] * 20);
1735
+ ctx.strokeRect(x, 110, barWidth, miniHeight);
1736
+
1737
+ // Draw connection
1738
+ drawLine(
1739
+ ctx,
1740
+ x + barWidth / 2,
1741
+ 50 + query.vector[i] * 20,
1742
+ x + barWidth / 2,
1743
+ 110,
1744
+ "#2ecc71",
1745
+ [],
1746
+ 1
1747
+ );
1748
+ }
1749
+ });
1750
+
1751
+ // Show dot product calculation
1752
+ let dotProduct = 0;
1753
+ for (let i = 0; i < query.vector.length; i++) {
1754
+ dotProduct += query.vector[i] * matchingDoc.vector[i];
1755
+ }
1756
+
1757
+ ctx.fillStyle = "#333";
1758
+ ctx.font = "12px Arial";
1759
+ ctx.textAlign = "left";
1760
+ ctx.fillText(`Matching score: ${dotProduct.toFixed(2)}`, 320, 100);
1761
+ }
1762
+ }
1763
+ }
1764
+
1765
+ // Update step descriptions
1766
+ function updateENNStepInfo(step, distanceMetric) {
1767
+ let title, description;
1768
+
1769
+ switch (step) {
1770
+ case 0:
1771
+ title = "Step 0: Data points";
1772
+ description =
1773
+ "Initial dataset with vectors in feature space. The query point (red) will be compared against all data points.";
1774
+ break;
1775
+ case 1:
1776
+ title = "Step 1: Calculate all distances";
1777
+ if (distanceMetric === "euclidean") {
1778
+ description =
1779
+ "Calculate Euclidean (L2) distance between query and every data point: d = √((x₂-x₁)² + (y₂-y₁)²).";
1780
+ } else if (distanceMetric === "manhattan") {
1781
+ description =
1782
+ "Calculate Manhattan (L1) distance between query and every data point: d = |x₂-x₁| + |y₂-y₁|.";
1783
+ } else {
1784
+ description =
1785
+ "Calculate cosine similarity between query and every data point: similarity = cos(θ) = (a·b) / (|a||b|).";
1786
+ }
1787
+ break;
1788
+ case 2:
1789
+ title = "Step 2: Sort by distance";
1790
+ description =
1791
+ "Sort all data points by their distance to query point (ascending order for distance, descending for similarity).";
1792
+ break;
1793
+ case 3:
1794
+ title = "Step 3: Return nearest neighbors";
1795
+ description =
1796
+ "Return the k closest data points (here k=1). This approach guarantees finding the exact nearest neighbor.";
1797
+ break;
1798
+ }
1799
+
1800
+ ennStepTitle.textContent = title;
1801
+ ennStepDesc.textContent = description;
1802
+ }
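// Sketch, not part of the original commit: the steps described in
// updateENNStepInfo (compute every distance, sort, keep the k closest)
// written as one hypothetical function. sketchMetrics and sketchExactNearest
// are illustrative names; the page's own distance helpers are defined
// elsewhere in the file and are not replaced by this.
const sketchMetrics = {
  // L2: d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
  euclidean: (a, b) => Math.hypot(a.x - b.x, a.y - b.y),
  // L1: d = |x2 - x1| + |y2 - y1|
  manhattan: (a, b) => Math.abs(a.x - b.x) + Math.abs(a.y - b.y),
};
function sketchExactNearest(points, query, metric = "euclidean", k = 1) {
  const distanceFn = sketchMetrics[metric];
  return points
    .map((p) => ({ ...p, distance: distanceFn(p, query) })) // Step 1: all distances
    .sort((a, b) => a.distance - b.distance) // Step 2: ascending order
    .slice(0, k); // Step 3: the k closest points
}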
1803
+
1804
+ function updateANNStepInfo(step, algorithm) {
1805
+ let title, description;
1806
+
1807
+ switch (step) {
1808
+ case 0:
1809
+ title = "Step 0: Indexed structure";
1810
+ if (algorithm === "hnsw") {
1811
+ description =
1812
+ "HNSW pre-organizes vectors into a navigable small world graph with multiple layers for efficient search.";
1813
+ } else if (algorithm === "pq") {
1814
+ description =
1815
+ "Product Quantization splits each vector into lower-dimensional subvectors and quantizes each one against its own codebook of centroids.";
1816
+ } else {
1817
+ description =
1818
+ "Locality-Sensitive Hashing uses hash functions that map similar vectors to the same buckets.";
1819
+ }
1820
+ break;
1821
+ case 1:
1822
+ title = "Step 1: Navigate to region";
1823
+ if (algorithm === "hnsw") {
1824
+ description =
1825
+ "Search begins at the graph's entry point in the top layer (sparse connections).";
1826
+ } else if (algorithm === "pq") {
1827
+ description =
1828
+ "The query is mapped to specific regions in each subspace based on quantized centroids.";
1829
+ } else {
1830
+ description =
1831
+ "Query vector is hashed to identify which bucket(s) to search.";
1832
+ }
1833
+ break;
1834
+ case 2:
1835
+ title = "Step 2: Local search";
1836
+ if (algorithm === "hnsw") {
1837
+ description =
1838
+ "Navigate through connections to find closer and closer neighbors, descending through layers.";
1839
+ } else if (algorithm === "pq") {
1840
+ description =
1841
+ "Compare only with points in the same or nearby quantized regions to limit search space.";
1842
+ } else {
1843
+ description =
1844
+ "Only compute distances for vectors in the same hash bucket, dramatically reducing comparisons.";
1845
+ }
1846
+ break;
1847
+ case 3:
1848
+ title = "Step 3: Return approximate NN";
1849
+ if (algorithm === "hnsw") {
1850
+ description =
1851
+ "Return the closest point found. May not be the true nearest neighbor, but usually very close.";
1852
+ } else if (algorithm === "pq") {
1853
+ description =
1854
+ "Return the closest point by approximate distance, computed from the quantized codes. Fast, but loses some precision.";
1855
+ } else {
1856
+ description =
1857
+ "If points fall into different buckets, LSH might miss true nearest neighbors (accuracy vs. speed tradeoff).";
1858
+ }
1859
+ break;
1860
+ }
1861
+
1862
+ annStepTitle.textContent = title;
1863
+ annStepDesc.textContent = description;
1864
+ }
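// Sketch, not part of the original commit: one common LSH family for angular
// similarity hashes a vector by which side of each hyperplane it falls on, so
// vectors pointing in similar directions tend to share a bucket key. Whether
// the page's renderLSH follows this scheme is not shown here.
// sketchLshBucket and the `hyperplanes` argument (an array of normal vectors,
// normally drawn at random) are assumptions for illustration.
function sketchLshBucket(vector, hyperplanes) {
  return hyperplanes
    .map((normal) => {
      const dot = normal.reduce((sum, n, i) => sum + n * vector[i], 0);
      return dot >= 0 ? "1" : "0"; // One bit per hyperplane
    })
    .join("");
}
// Only vectors whose bucket key equals the query's key would be compared
// exactly, which is where both the speed-up and the possible misses come from.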
1865
+
1866
+ function updateSemanticStepInfo(step, model) {
1867
+ let title, description;
1868
+
1869
+ switch (step) {
1870
+ case 0:
1871
+ title = "Step 0: Text documents";
1872
+ description = "Raw text data before encoding into vector space.";
1873
+ break;
1874
+ case 1:
1875
+ title = "Step 1: Generate embeddings";
1876
+ if (model === "bert") {
1877
+ description =
1878
+ "BERT (base) creates dense 768-dimensional vector embeddings that capture the semantic meaning of text.";
1879
+ } else if (model === "use") {
1880
+ description =
1881
+ "Universal Sentence Encoder maps sentences to 512-dimensional vectors that capture meaning.";
1882
+ } else {
1883
+ description =
1884
+ "Domain-specific embeddings capture meaning relevant to particular fields or applications.";
1885
+ }
1886
+ break;
1887
+ case 2:
1888
+ title = "Step 2: Vector similarity search";
1889
+ description =
1890
+ "Calculate similarity (usually cosine) between query vector and document vectors.";
1891
+ break;
1892
+ case 3:
1893
+ title = "Step 3: Return relevant results";
1894
+ description =
1895
+ "Rank documents by similarity and return the most relevant. Results include semantic matches, not just exact keyword matches.";
1896
+ break;
1897
+ }
1898
+
1899
+ semanticStepTitle.textContent = title;
1900
+ semanticStepDesc.textContent = description;
1901
+ }
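// Sketch, not part of the original commit: steps 2 and 3 of the semantic
// search walkthrough (score by cosine similarity, then rank) as two
// hypothetical helpers. sketchCosine and sketchTopK are illustrative names
// and do not replace the cosineSimilarity helper the page already calls.
function sketchCosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // Assumption: zero vectors score 0
}
function sketchTopK(docs, queryEmbedding, k = 3) {
  return docs
    .map((doc) => ({ ...doc, similarity: sketchCosine(doc.embedding, queryEmbedding) }))
    .sort((a, b) => b.similarity - a.similarity) // Descending: most similar first
    .slice(0, k); // slice never reads past the end, even with fewer than k docs
}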
1902
+
1903
+ function updateSparseStepInfo(step, model) {
1904
+ let title, description;
1905
+
1906
+ switch (step) {
1907
+ case 0:
1908
+ title = "Step 0: Tokenized content";
1909
+ description =
1910
+ "Documents broken down into tokens (words/terms) before converting to sparse vector representation.";
1911
+ break;
1912
+ case 1:
1913
+ title = "Step 1: Create sparse vectors";
1914
+ if (model === "tfidf") {
1915
+ description =
1916
+ "TF-IDF weights tokens based on term frequency and inverse document frequency to emphasize distinctive terms.";
1917
+ } else if (model === "bm25") {
1918
+ description =
1919
+ "BM25 extends TF-IDF with better term saturation and document length normalization.";
1920
+ } else {
1921
+ description =
1922
+ "Hybrid representations combine sparse (keyword) and dense (semantic) vectors for better retrieval.";
1923
+ }
1924
+ break;
1925
+ case 2:
1926
+ title = "Step 2: Inverted index search";
1927
+ description =
1928
+ "Look up only the specific terms present in the query, accessing posting lists through an inverted index.";
1929
+ break;
1930
+ case 3:
1931
+ title = "Step 3: Return matches";
1932
+ description =
1933
+ "Return documents with matching terms, ranked by relevance score. Very efficient for exact term matches.";
1934
+ break;
1935
+ }
1936
+
1937
+ sparseStepTitle.textContent = title;
1938
+ sparseStepDesc.textContent = description;
1939
+ }
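// Sketch, not part of the original commit: renderSparseSearch scores every
// document with a full dot product, while the step text describes an inverted
// index. These hypothetical helpers show that index-based variant. The names
// sketchBuildInvertedIndex and sketchSparseScores are assumptions; documents
// are assumed to look like { id, vector } as in sparseVectors.
function sketchBuildInvertedIndex(docs) {
  const index = new Map(); // term position -> posting list of { id, weight }
  docs.forEach((doc) => {
    doc.vector.forEach((weight, term) => {
      if (weight > 0) {
        if (!index.has(term)) index.set(term, []);
        index.get(term).push({ id: doc.id, weight });
      }
    });
  });
  return index;
}
function sketchSparseScores(index, queryVector) {
  const scores = new Map(); // doc id -> accumulated dot product
  queryVector.forEach((queryWeight, term) => {
    if (queryWeight === 0) return; // Terms absent from the query are never visited
    for (const { id, weight } of index.get(term) || []) {
      scores.set(id, (scores.get(id) || 0) + queryWeight * weight);
    }
  });
  // Returns [id, score] pairs, highest score first; unmatched docs are omitted.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}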
1940
+ </script>
1941
+ </body>
1942
  </html>