Haleshot committed on
Commit 39b0312 · unverified · 1 Parent(s): 6aff290

Add initial draft

Files changed (1)
  1. probability/08_bayes_theorem.py +531 -0
probability/08_bayes_theorem.py ADDED
@@ -0,0 +1,531 @@
+ # /// script
+ # requires-python = ">=3.10"
+ # dependencies = [
+ #     "marimo",
+ #     "matplotlib==3.10.0",
+ #     "numpy==2.2.3",
+ # ]
+ # ///
+
+ import marimo
+
+ __generated_with = "0.11.8"
+ app = marimo.App(width="medium", app_title="Bayes Theorem")
+
+
+ @app.cell
+ def _():
+     import marimo as mo
+     return (mo,)
+
+
+ @app.cell
+ def _():
+     import matplotlib.pyplot as plt
+     import numpy as np
+     return np, plt
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         # Bayes' Theorem
+
+         _This notebook is a computational companion to the book ["Probability for Computer Scientists"](https://chrispiech.github.io/probabilityForComputerScientists/en/part1/bayes_theorem/), by Stanford professor Chris Piech._
+
+         In the 1740s, an English minister named Thomas Bayes discovered a profound mathematical relationship that would revolutionize how we reason about uncertainty. His theorem provides an elegant framework for calculating the probability of a hypothesis being true given observed evidence.
+
+         At its core, Bayes' Theorem connects two different types of probabilities: the probability of a hypothesis given evidence $P(H|E)$, and its reverse - the probability of evidence given a hypothesis $P(E|H)$. This relationship is particularly powerful because it allows us to compute difficult probabilities using ones that are easier to measure.
+         """
+     )
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ## The Heart of Bayesian Reasoning
+
+         The fundamental insight of Bayes' Theorem lies in its ability to relate what we want to know with what we can measure. When we observe evidence $E$, we often want to know the probability of a hypothesis $H$ being true. However, it's typically much easier to measure how likely we are to observe the evidence when we know the hypothesis is true.
+
+         This reversal of perspective - from $P(H|E)$ to $P(E|H)$ - is powerful because it lets us:
+         1. Start with what we know (prior beliefs)
+         2. Use easily measurable relationships (likelihood)
+         3. Update our beliefs with new evidence
+
+         This approach mirrors both how humans naturally learn and the scientific method: we begin with prior beliefs, gather evidence, and update our understanding based on that evidence. This makes Bayes' Theorem not just a mathematical tool, but a framework for rational thinking.
+         """
+     )
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ## The Formula
+
+         Bayes' Theorem states:
+
+         $P(H|E) = \frac{P(E|H)P(H)}{P(E)}$
+
+         Where:
+
+         - $P(H|E)$ is the **posterior probability** - probability of hypothesis H given evidence E
+         - $P(E|H)$ is the **likelihood** - probability of evidence E given hypothesis H
+         - $P(H)$ is the **prior probability** - initial probability of hypothesis H
+         - $P(E)$ is the **evidence** - total probability of observing evidence E
+
+         The denominator $P(E)$ can be expanded using the [Law of Total Probability](https://marimo.app/gh/marimo-team/learn/main?entrypoint=probability%2F07_law_of_total_probability.py):
+
+         $P(E) = P(E|H)P(H) + P(E|H^c)P(H^c)$
+         """
+     )
+     return
+
+
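The formula in the cell above follows in three steps from the definition of conditional probability. A short derivation using the same symbols, assuming $P(E) > 0$ and $P(H) > 0$:

```latex
\begin{align*}
P(H \mid E) &= \frac{P(H \cap E)}{P(E)}
  && \text{definition of conditional probability} \\
P(E \mid H) &= \frac{P(E \cap H)}{P(H)}
  \;\Longrightarrow\; P(E \cap H) = P(E \mid H)\,P(H)
  && \text{same definition, roles swapped} \\
P(H \mid E) &= \frac{P(E \mid H)\,P(H)}{P(E)}
  && \text{substitute, since } P(H \cap E) = P(E \cap H)
\end{align*}
```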
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ## Understanding Each Component
+
+         ### 1. Prior Probability - $P(H)$
+         - Initial belief about the hypothesis before seeing evidence
+         - Based on previous knowledge or assumptions
+         - Example: Probability of having a disease before any tests
+
+         ### 2. Likelihood - $P(E|H)$
+         - Probability of the evidence given that the hypothesis is true
+         - Often known from data or scientific studies
+         - Example: Probability of a positive test given the disease is present
+
+         ### 3. Evidence - $P(E)$
+         - Total probability of observing the evidence
+         - Acts as a normalizing constant
+         - Can be calculated using the Law of Total Probability
+
+         ### 4. Posterior - $P(H|E)$
+         - Updated probability after considering the evidence
+         - Combines prior knowledge with new evidence
+         - Becomes the new prior for future updates
+         """
+     )
+     return
+
+
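The last point, that a posterior becomes the prior for the next update, can be sketched as a loop. This is a minimal illustration, not part of the notebook: the helper name `update` is hypothetical, the numbers (1% prior, 95% true positive rate, 10% false positive rate) are borrowed from the notebook's medical example, and the repeated tests are assumed to be conditionally independent given the true disease status.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One Bayes update for a binary hypothesis; returns P(H|E)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


belief = 0.01  # prior: 1% prevalence
history = [belief]
for _ in range(3):  # three positive results, assumed conditionally independent
    # The posterior from the previous round is fed back in as the prior.
    belief = update(belief, p_e_given_h=0.95, p_e_given_not_h=0.10)
    history.append(belief)

for step, value in enumerate(history):
    print(f"after {step} positive test(s): {value:.3f}")
```

Under these assumptions, each additional positive result raises the belief further. If the tests shared a common cause of error, the independence assumption would fail and the updates would overstate the evidence.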
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ## Real-World Examples
+
+         ### 1. Medical Testing
+         - **Want to know**: $P(\text{Disease}|\text{Positive})$ - Probability of disease given a positive test
+         - **Easy to know**: $P(\text{Positive}|\text{Disease})$ - How often sick people test positive (the test's sensitivity)
+         - **Causality**: Disease causes test results, not vice versa
+
+         ### 2. Student Ability
+         - **Want to know**: $P(\text{High Ability}|\text{Good Grade})$ - Probability a student is skilled given a good grade
+         - **Easy to know**: $P(\text{Good Grade}|\text{High Ability})$ - Probability that skilled students get good grades
+         - **Causality**: Ability influences grades, not vice versa
+
+         ### 3. Cell Phone Location
+         - **Want to know**: $P(\text{Location}|\text{Signal Strength})$ - Probability of phone location given signal
+         - **Easy to know**: $P(\text{Signal Strength}|\text{Location})$ - Signal strength at known locations
+         - **Causality**: Location determines signal strength, not vice versa
+
+         These examples highlight a common pattern: what we want to know (posterior) is harder to measure directly than its reverse (likelihood).
+         """
+     )
+     return
+
+
+ @app.cell
+ def _():
+     def calculate_posterior(prior, likelihood, false_positive_rate):
+         # Calculate P(E) using Law of Total Probability
+         p_e = likelihood * prior + false_positive_rate * (1 - prior)
+
+         # Calculate posterior using Bayes' Theorem
+         posterior = (likelihood * prior) / p_e
+         return posterior, p_e
+     return (calculate_posterior,)
+
+
+ @app.cell
+ def _(calculate_posterior):
+     # Medical test example
+     p_disease = 0.01  # Prior: 1% have the disease
+     p_positive_given_disease = 0.95  # Likelihood: 95% of sick people test positive
+     p_positive_given_healthy = 0.10  # False positive rate: 10%
+
+     medical_posterior, medical_evidence = calculate_posterior(
+         p_disease,
+         p_positive_given_disease,
+         p_positive_given_healthy
+     )
+     return (
+         medical_evidence,
+         medical_posterior,
+         p_disease,
+         p_positive_given_disease,
+         p_positive_given_healthy,
+     )
+
+
+ @app.cell
+ def _(medical_explanation):
+     medical_explanation
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(medical_posterior, mo):
+     medical_explanation = mo.md(f"""
+     ### Medical Testing Example
+
+     Consider a medical test for a rare disease:
+
+     - Prior: 1% of the population has the disease
+     - Likelihood: 95% of sick people test positive
+     - False positive: 10% of healthy people test positive
+
+     Using Bayes' Theorem:
+     $P(D|+) = \\frac{{0.95 \\times 0.01}}{{0.95 \\times 0.01 + 0.10 \\times 0.99}} = {medical_posterior:.3f}$
+
+     Despite a positive test, there's only a {medical_posterior:.1%} chance of having the disease!
+     This counterintuitive result occurs because the disease is rare (low prior probability).
+     """)
+     return (medical_explanation,)
+
+
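One way to make the low posterior less surprising is to count people instead of multiplying probabilities. The sketch below applies the example's rates to a hypothetical population of 100,000 (the population size is an assumption chosen so every count is a whole number).

```python
population = 100_000  # hypothetical population size

sick = population * 1 // 100            # 1% prevalence
healthy = population - sick

true_positives = sick * 95 // 100       # 95% of sick people test positive
false_positives = healthy * 10 // 100   # 10% of healthy people test positive

# Among everyone who tests positive, what share is actually sick?
share_sick = true_positives / (true_positives + false_positives)

print(f"sick and positive:    {true_positives}")
print(f"healthy and positive: {false_positives}")
print(f"P(disease | positive) = {share_sick:.3f}")
```

The healthy group is so much larger than the sick group that its false positives outnumber the true positives by roughly ten to one, which is why most positive results come from healthy people.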
+ @app.cell
+ def _(calculate_posterior):
+     # Student ability example
+     p_high_ability = 0.30  # Prior: 30% of students have high ability
+     p_good_grade_given_high = 0.90  # Likelihood: 90% of high ability students get good grades
+     p_good_grade_given_low = 0.40  # 40% of lower ability students also get good grades
+
+     student_posterior, student_evidence = calculate_posterior(
+         p_high_ability,
+         p_good_grade_given_high,
+         p_good_grade_given_low
+     )
+     return (
+         p_good_grade_given_high,
+         p_good_grade_given_low,
+         p_high_ability,
+         student_evidence,
+         student_posterior,
+     )
+
+
+ @app.cell
+ def _(student_explanation):
+     student_explanation
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo, student_posterior):
+     student_explanation = mo.md(f"""
+     ### Student Ability Example
+
+     If a student gets a good grade, what's the probability they have high ability?
+
+     Using Bayes' Theorem:
+
+     - Prior: 30% have high ability
+     - Likelihood: 90% of high ability students get good grades
+     - False positive: 40% of lower ability students get good grades
+
+     Result: P(High Ability|Good Grade) = {student_posterior:.2f}
+
+     So a good grade increases our confidence in high ability from 30% to {student_posterior:.1%}.
+     """)
+     return (student_explanation,)
+
+
+ @app.cell
+ def _(calculate_posterior):
+     # Cell phone location example
+     p_location_a = 0.25  # Prior probability of being in location A
+     p_strong_signal_at_a = 0.85  # Likelihood of strong signal at A
+     p_strong_signal_elsewhere = 0.15  # False positive rate
+
+     location_posterior, location_evidence = calculate_posterior(
+         p_location_a,
+         p_strong_signal_at_a,
+         p_strong_signal_elsewhere
+     )
+     return (
+         location_evidence,
+         location_posterior,
+         p_location_a,
+         p_strong_signal_at_a,
+         p_strong_signal_elsewhere,
+     )
+
+
+ @app.cell
+ def _(location_explanation):
+     location_explanation
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(location_posterior, mo):
+     location_explanation = mo.md(f"""
+     ### Cell Phone Location Example
+
+     Given a strong signal, what's the probability the phone is in location A?
+
+     Using Bayes' Theorem:
+
+     - Prior: 25% chance of being in location A
+     - Likelihood: 85% chance of strong signal at A
+     - False positive: 15% chance of strong signal elsewhere
+
+     Result: P(Location A|Strong Signal) = {location_posterior:.2f}
+
+     The strong signal increases our confidence in location A from 25% to {location_posterior:.1%}.
+     """)
+     return (location_explanation,)
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(r"""## Interactive example""")
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+
+         _This interactive example was made with [marimo](https://github.com/marimo-team/marimo/blob/main/examples/misc/bayes_theorem.py), and is [based on an explanation of Bayes' Theorem by Grant Sanderson](https://www.youtube.com/watch?v=HZGCoVF3YvM&list=PLzq7odmtfKQw2KIbQq0rzWrqgifHKkPG1&index=1&t=3s)_.
+
+         Bayes' Theorem provides a convenient way to calculate the probability
+         of a hypothesis event $H$ given evidence $E$:
+
+         \[
+         P(H \mid E) = \frac{P(H) P(E \mid H)}{P(E)}.
+         \]
+
+
+         **The numerator.** The numerator is the probability of events $E$ and $H$ happening
+         together; that is,
+
+         \[
+         P(H) P(E \mid H) = P(E \cap H).
+         \]
+
+         **The denominator.**
+         In most calculations, it is helpful to rewrite the denominator $P(E)$ as
+
+         \[
+         P(E) = P(H)P(E \mid H) + P(\neg H) P (E \mid \neg H),
+         \]
+
+         which in turn can also be written as
+
+
+         \[
+         P(E) = P(E \cap H) + P(E \cap \neg H).
+         \]
+         """
+     ).left()
+     return
+
+
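The reading of the numerator as $P(E \cap H)$ and the denominator as $P(E)$ can be checked by simulation: draw many random trials, count how often $E$ occurs, and count how often $E$ and $H$ occur together. The sketch below uses only the standard library; the function name `simulate_posterior` and the three probabilities are illustrative assumptions, not values from the notebook.

```python
import random


def simulate_posterior(prior_h, like_h, like_not_h, trials=200_000, seed=0):
    """Estimate P(H|E) as count(E and H) / count(E) over random trials."""
    rng = random.Random(seed)
    e_count = 0
    e_and_h_count = 0
    for _ in range(trials):
        h = rng.random() < prior_h                        # does H occur?
        e = rng.random() < (like_h if h else like_not_h)  # does E occur, given H or not H?
        if e:
            e_count += 1
            if h:
                e_and_h_count += 1
    return e_and_h_count / e_count


prior_h, like_h, like_not_h = 0.3, 0.8, 0.2  # assumed example values
exact = (prior_h * like_h) / (prior_h * like_h + (1 - prior_h) * like_not_h)
estimate = simulate_posterior(prior_h, like_h, like_not_h)
print(f"exact={exact:.3f} estimate={estimate:.3f}")
```

With enough trials the counted ratio should land close to the value given by the formula, since both are computing the share of $E$-outcomes in which $H$ also holds.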
+ @app.cell(hide_code=True)
+ def _(
+     bayes_result,
+     construct_probability_plot,
+     mo,
+     p_e,
+     p_e_given_h,
+     p_e_given_not_h,
+     p_h,
+ ):
+     mo.hstack(
+         [
+             mo.md(
+                 rf"""
+                 ### Probability parameters
+
+                 You can configure the probabilities of the events $H$, $E \mid H$, and $E \mid \neg H$.
+
+                 {mo.as_html([p_h, p_e_given_h, p_e_given_not_h])}
+
+                 The plot on the right visualizes the probabilities of these events.
+
+                 1. The yellow rectangle represents the event $H$, and its area is $P(H) = {p_h.value:0.2f}$.
+                 2. The teal rectangle overlapping with the yellow one represents the event $E \cap H$, and
+                 its area is $P(H) \cdot P(E \mid H) = {p_h.value * p_e_given_h.value:0.2f}$.
+                 3. The teal rectangle that doesn't overlap the yellow rectangle represents the event $E \cap \neg H$, and
+                 its area is $P(\neg H) \cdot P(E \mid \neg H) = {(1 - p_h.value) * p_e_given_not_h.value:0.2f}$.
+
+                 Notice that the sum of the areas in $2$ and $3$ is the probability $P(E) = {p_e:0.2f}$.
+
+                 One way to think about Bayes' Theorem is the following: the probability $P(H \mid E)$ is the probability
+                 of $E$ and $H$ happening together (the area of rectangle $2$), divided by the probability of $E$ happening
+                 at all (the sum of the areas of $2$ and $3$).
+                 In this case, Bayes' Theorem says
+
+                 \[
+                 P(H \mid E) = \frac{{P(H) P(E \mid H)}}{{P(E)}} = \frac{{{p_h.value} \cdot {p_e_given_h.value}}}{{{p_e:0.2f}}} = {bayes_result:0.2f}
+                 \]
+                 """
+             ),
+             construct_probability_plot(),
+         ],
+         justify="start",
+         gap=4,
+         align="start",
+         widths=[0.33, 0.5],
+     )
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ## Applications in Computer Science
+
+         Bayes' Theorem is fundamental in many computing applications:
+
+         1. **Spam Filtering**
+
+             - $P(\text{Spam}|\text{Words})$ = Probability email is spam given its words
+             - Updates as new emails are classified
+
+         2. **Machine Learning**
+
+             - Naive Bayes classifiers
+             - Probabilistic graphical models
+             - Bayesian neural networks
+
+         3. **Computer Vision**
+
+             - Object detection confidence
+             - Face recognition systems
+             - Image classification
+         """
+     )
+     return
+
+
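To make the spam filtering item concrete, here is a toy Naive Bayes sketch. Everything in it is hypothetical: the two-word vocabulary, the word probabilities, and the 40% spam prior are invented for illustration, and the "naive" assumption is that words occur independently given the class. Products are accumulated in log space, a common way to avoid underflow when many words are multiplied.

```python
import math

# Hypothetical numbers chosen for illustration only.
P_SPAM = 0.4
WORD_GIVEN_SPAM = {"free": 0.30, "meeting": 0.01}
WORD_GIVEN_HAM = {"free": 0.02, "meeting": 0.20}


def spam_posterior(words):
    """Return P(spam | words), assuming words are independent given the class."""
    log_spam = math.log(P_SPAM)
    log_ham = math.log(1 - P_SPAM)
    for word in words:
        if word in WORD_GIVEN_SPAM:  # ignore words outside the toy vocabulary
            log_spam += math.log(WORD_GIVEN_SPAM[word])
            log_ham += math.log(WORD_GIVEN_HAM[word])
    # Normalize: P(spam | words) = e^log_spam / (e^log_spam + e^log_ham)
    return 1 / (1 + math.exp(log_ham - log_spam))


print(f"{spam_posterior(['free']):.3f}")
print(f"{spam_posterior(['free', 'meeting']):.3f}")
```

With these made-up numbers, "free" alone pushes the posterior well above the prior, while adding "meeting" pulls it back below the prior, since that word is assumed to be far more common in legitimate mail.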
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         """
+         ## 🤔 Test Your Understanding
+
+         Pick which of these statements about Bayes' Theorem you think are correct:
+
+         <details>
+         <summary>The posterior probability will always be larger than the prior probability</summary>
+         ❌ Incorrect! Evidence can either increase or decrease our belief in the hypothesis. For example, a negative medical test decreases the probability of having a disease.
+         </details>
+
+         <details>
+         <summary>If the likelihood is 0.9 and the prior is 0.5, then the posterior must equal 0.9</summary>
+         ❌ Incorrect! We also need the false positive rate to calculate the posterior probability. The likelihood alone doesn't determine the posterior.
+         </details>
+
+         <details>
+         <summary>The denominator acts as a normalizing constant to ensure the posterior is a valid probability</summary>
+         ✅ Correct! The denominator ensures the posterior probability is between 0 and 1 by considering all ways the evidence could occur.
+         </details>
+         """
+     )
+     return
+
+
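The first two quiz answers can be checked numerically. The sketch below is illustrative: the helper `posterior` is a hypothetical name, the negative-test rates are the complements of the medical example's rates (5% of sick people and 90% of healthy people test negative), and the three false positive rates in the second check are arbitrary.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H|E) for a binary hypothesis."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


# Statement 1: evidence can lower belief. Here E = "negative test".
prior = 0.01
after_negative = posterior(prior, p_e_given_h=0.05, p_e_given_not_h=0.90)
print(f"prior={prior:.4f} after negative test={after_negative:.4f}")

# Statement 2: likelihood 0.9 and prior 0.5 do not pin down the posterior;
# it also depends on the false positive rate.
by_false_positive_rate = {
    fpr: posterior(0.5, 0.9, fpr) for fpr in (0.1, 0.5, 0.9)
}
for fpr, value in by_false_positive_rate.items():
    print(f"false positive rate={fpr}: posterior={value:.3f}")
```

The posterior in the second check matches the likelihood of 0.9 only for one particular false positive rate; the other rates give different answers from the same likelihood and prior.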
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         """
+         ## Summary
+
+         You've learned:
+
+         - The components and intuition behind Bayes' Theorem
+         - How to update probabilities when new evidence arrives
+         - Why posterior probabilities can be counterintuitive
+         - Real-world applications in computer science
+
+         In the next lesson, we'll explore Random Variables, which help us work with numerical outcomes in probability.
+         """
+     )
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     mo.md(
+         r"""
+         ### Appendix
+         The (hidden) cells below power the interactive example above.
+         """
+     )
+     return
+
+
+ @app.cell(hide_code=True)
+ def _(p_e_given_h, p_e_given_not_h, p_h):
+     p_e = p_h.value * p_e_given_h.value + (1 - p_h.value) * p_e_given_not_h.value
+     bayes_result = p_h.value * p_e_given_h.value / p_e if p_e > 0 else float("nan")  # undefined when P(E) = 0
+     return bayes_result, p_e
+
+
+ @app.cell(hide_code=True)
+ def _(mo):
+     p_h = mo.ui.slider(0.0, 1, label="$P(H)$", value=0.1, step=0.1)
+     p_e_given_h = mo.ui.slider(0.0, 1, label=r"$P(E \mid H)$", value=0.3, step=0.1)
+     p_e_given_not_h = mo.ui.slider(
+         0.0, 1, label=r"$P(E \mid \neg H)$", value=0.3, step=0.1
+     )
+     return p_e_given_h, p_e_given_not_h, p_h
+
+
+ @app.cell(hide_code=True)
+ def _(p_e_given_h, p_e_given_not_h, p_h):
+     def construct_probability_plot():
+         import matplotlib.pyplot as plt
+
+         # Create one figure and one axes (avoids leaving a stray empty figure behind)
+         _, ax = plt.subplots(figsize=(6, 6))
+
+         # Gray unit square: sample space; yellow strip: H; teal rectangles: E
+         base = plt.Rectangle((0, 0), 1, 1, fc="black", ec="white", alpha=0.25)
+         h = plt.Rectangle((0, 0), p_h.value, 1, fc="yellow", ec="white", label="H")
+         e_given_h = plt.Rectangle(
+             (0, 0),
+             p_h.value,
+             p_e_given_h.value,
+             fc="teal",
+             ec="white",
+             alpha=0.5,
+             label="E",
+         )
+         e_given_not_h = plt.Rectangle(
+             (p_h.value, 0), 1 - p_h.value, p_e_given_not_h.value, fc="teal", ec="white", alpha=0.5
+         )
+         ax.add_patch(base)
+         ax.add_patch(h)
+         ax.add_patch(e_given_not_h)
+         ax.add_patch(e_given_h)
+         ax.legend()
+         return ax
+     return (construct_probability_plot,)
+
+
+ if __name__ == "__main__":
+     app.run()