Akshay Agrawal committed on
Commit
9fcae04
·
unverified ·
2 Parent(s): c8a5dc2 ac93bf0

Merge pull request #81 from marimo-team/haleshot/17_normal

  1. probability/17_normal_distribution.py +1127 -0
probability/17_normal_distribution.py ADDED
@@ -0,0 +1,1127 @@
1
+ # /// script
2
+ # requires-python = ">=3.10"
3
+ # dependencies = [
4
+ # "marimo",
5
+ # "matplotlib==3.10.1",
6
+ # "scipy==1.15.2",
7
+ # "wigglystuff==0.1.10",
8
+ # "numpy==2.2.4",
9
+ # ]
10
+ # ///
11
+
12
+ import marimo
13
+
14
+ __generated_with = "0.11.26"
15
+ app = marimo.App(width="medium", app_title="Normal Distribution")
16
+
17
+
18
+ @app.cell(hide_code=True)
19
+ def _(mo):
20
+ mo.md(
21
+ r"""
22
+ # Normal Distribution
23
+
24
+ _This notebook is a computational companion to ["Probability for Computer Scientists"](https://chrispiech.github.io/probabilityForComputerScientists/en/part2/normal/), by Stanford professor Chris Piech._
25
+
26
+ The Normal (also known as Gaussian) distribution is one of the most important probability distributions in statistics and data science. It's characterized by a symmetric bell-shaped curve and is fully defined by two parameters: mean (μ) and variance (σ²).
27
+ """
28
+ )
29
+ return
30
+
31
+
32
+ @app.cell(hide_code=True)
33
+ def _(mo):
34
+ mo.md(
35
+ r"""
36
+ ## Normal Random Variable Definition
37
+
38
+ The Normal (or Gaussian) random variable is denoted as:
39
+
40
+ $$X \sim \mathcal{N}(\mu, \sigma^2)$$
41
+
42
+ Where:
43
+
44
+ - $X$ is our random variable
45
+ - $\mathcal{N}$ indicates it follows a Normal distribution
46
+ - $\mu$ is the mean parameter
47
+ - $\sigma^2$ is the variance parameter (some texts and libraries take the standard deviation $\sigma$ as the second parameter instead)
48
+
49
+ ```
50
+ X ~ N(μ, σ²)
51
+ ↑ ↑ ↑ ↑
52
+ | | | +-- Variance (spread)
53
+ | | | of the distribution
54
+ | | +-- Mean (center)
55
+ | | of the distribution
56
+ | +-- Indicates Normal
57
+ | distribution
58
+ |
59
+ Our random variable
60
+ ```
61
+
62
+ The Normal distribution is particularly important for many reasons:
63
+
64
+ 1. It arises naturally from the sum of many independent random variables (Central Limit Theorem)
65
+ 2. It appears frequently in natural phenomena
66
+ 3. It is the maximum entropy distribution given a fixed mean and variance
67
+ 4. It simplifies many mathematical calculations in statistics and probability
68
+ """
69
+ )
70
+ return
71
+
72
+
73
+ @app.cell(hide_code=True)
74
+ def _(mo):
75
+ mo.md(
76
+ r"""
77
+ ## Properties of Normal Distribution
78
+
79
+ | Property | Formula |
80
+ |----------|---------|
81
+ | Notation | $X \sim \mathcal{N}(\mu, \sigma^2)$ |
82
+ | Description | A common, naturally occurring distribution |
83
+ | Parameters | $\mu \in \mathbb{R}$, the mean<br>$\sigma^2 \in \mathbb{R}^+$, the variance |
84
+ | Support | $x \in \mathbb{R}$ |
85
+ | PDF equation | $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$ |
86
+ | CDF equation | $F(x) = \Phi(\frac{x-\mu}{\sigma})$ where $\Phi$ is the CDF of the standard normal |
87
+ | Expectation | $E[X] = \mu$ |
88
+ | Variance | $\text{Var}(X) = \sigma^2$ |
89
+
90
+ The PDF (Probability Density Function) reaches its maximum at $x = \mu$, where the exponent is zero and $e^0 = 1$, so the peak height is $\frac{1}{\sigma\sqrt{2\pi}}$.
91
+ """
92
+ )
93
+ return
94
+
95
+
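The PDF row of the table above can be checked numerically. The following is a minimal, standard-library-only sketch; `normal_pdf` is a hypothetical helper written for this illustration (it is not part of the notebook), and the values μ = 3, σ = 4 are arbitrary. It compares the hand-written formula with `statistics.NormalDist.pdf` and confirms that the peak sits at x = μ.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Evaluate the N(mu, sigma^2) density at x directly from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Arbitrary illustrative parameters: X ~ N(3, 16), so sigma = 4.
mu, sigma = 3.0, 4.0

# At x = mu the exponent is zero, so the density equals 1 / (sigma * sqrt(2*pi)).
peak = normal_pdf(mu, mu, sigma)
expected_peak = 1 / (sigma * math.sqrt(2 * math.pi))

# Cross-check one off-center point against the standard library.
by_formula = normal_pdf(0.0, mu, sigma)
by_stdlib = NormalDist(mu, sigma).pdf(0.0)

print(f"peak height: {peak:.5f} (expected {expected_peak:.5f})")
print(f"f(0): formula {by_formula:.5f}, NormalDist {by_stdlib:.5f}")
```

`scipy.stats.norm.pdf(x, mu, sigma)`, which the notebook's helper functions use, should give the same values.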
96
+ @app.cell(hide_code=True)
97
+ def _(mean_slider, mo, std_slider):
98
+ mo.md(
99
+ f"""
100
+ The figure below shows a comparison between:
101
+
102
+ - The **Standard Normal Distribution** (purple curve): N(0, 1)
103
+ - A **Normal Distribution** with the parameters you selected (blue curve)
104
+
105
+ Adjust the mean (μ) {mean_slider} and standard deviation (σ) {std_slider} to see how the blue curve shifts and widens relative to the standard normal.
106
+
107
+ """
108
+ )
109
+ return
110
+
111
+
112
+ @app.cell(hide_code=True)
113
+ def _(
114
+ create_distribution_comparison,
115
+ fig_to_image,
116
+ mean_slider,
117
+ mo,
118
+ std_slider,
119
+ ):
120
+ # values from the sliders
121
+ current_mu = mean_slider.amount
122
+ current_sigma = std_slider.amount
123
+
124
+ # Create plot
125
+ comparison_fig = create_distribution_comparison(current_mu, current_sigma)
126
+
127
+ # Call, convert and display
128
+ comp_image = mo.image(fig_to_image(comparison_fig), width="100%")
129
+ comp_image
130
+ return comp_image, comparison_fig, current_mu, current_sigma
131
+
132
+
133
+ @app.cell(hide_code=True)
134
+ def _(mean_slider, mo, std_slider):
135
+ mo.md(
136
+ f"""
137
+ ## Interactive Normal Distribution Visualization
138
+
139
+ The shape of a normal distribution is determined by two key parameters:
140
+
141
+ - The **mean (μ):** {mean_slider} controls the center of the distribution.
142
+
143
+ - The **standard deviation (σ):** {std_slider} controls the spread (width) of the distribution.
144
+
145
+ Try adjusting these parameters to see how they affect the shape of the distribution below:
146
+
147
+ """
148
+ )
149
+ return
150
+
151
+
152
+ @app.cell(hide_code=True)
153
+ def _(create_normal_pdf_plot, fig_to_image, mean_slider, mo, std_slider):
154
+ # value from widgets
155
+ _current_mu = mean_slider.amount
156
+ _current_sigma = std_slider.amount
157
+
158
+ # Create visualization
159
+ pdf_fig = create_normal_pdf_plot(_current_mu, _current_sigma)
160
+
161
+ # Display plot
162
+ pdf_image = mo.image(fig_to_image(pdf_fig), width="100%")
163
+
164
+ pdf_explanation = mo.md(
165
+ r"""
166
+ **Understanding the Normal Distribution Visualization:**
167
+
168
+ - **PDF (top)**: The probability density function shows the relative likelihood of different values.
169
+ The highest point occurs at the mean (μ).
170
+
171
+ - **Shaded regions**: The green shaded areas represent:
172
+ - μ ± 1σ: Contains approximately 68.3% of the probability
173
+ - μ ± 2σ: Contains approximately 95.5% of the probability
174
+ - μ ± 3σ: Contains approximately 99.7% of the probability (the "68-95-99.7 rule")
175
+
176
+ - **CDF (bottom)**: The cumulative distribution function shows the probability that X is less than or equal to a given value.
177
+ - At x = μ, the CDF equals 0.5 (50% probability)
178
+ - At x = μ + σ, the CDF equals approximately 0.84 (84% probability)
179
+ - At x = μ - σ, the CDF equals approximately 0.16 (16% probability)
180
+ """
181
+ )
182
+
183
+ mo.vstack([pdf_image, pdf_explanation])
184
+ return pdf_explanation, pdf_fig, pdf_image
185
+
186
+
187
+ @app.cell(hide_code=True)
188
+ def _(mo):
189
+ mo.md(
190
+ r"""
191
+ ## Standard Normal Distribution
192
+
193
+ The **Standard Normal Distribution** is a special case of the normal distribution where $\mu = 0$ and $\sigma = 1$. We denote it as:
194
+
195
+ $$Z \sim \mathcal{N}(0, 1)$$
196
+
197
+ This distribution is particularly important because:
198
+
199
+ 1. Any normal distribution can be transformed into the standard normal
200
+ 2. Statistical tables and calculations often use the standard normal as a reference
201
+
202
+ ### Standardizing a Normal Random Variable
203
+
204
+ For any normal random variable $X \sim \mathcal{N}(\mu, \sigma^2)$, we can transform it to the standard normal $Z$ using:
205
+
206
+ $$Z = \frac{X - \mu}{\sigma}$$
207
+
208
+ Let's see the mathematical derivation:
209
+
210
+ \begin{align*}
211
+ W &= \frac{X -\mu}{\sigma} && \text{Subtract $\mu$ and divide by $\sigma$} \\
212
+ &= \frac{1}{\sigma}X - \frac{\mu}{\sigma} && \text{Use algebra to rewrite the equation}\\
213
+ &= aX + b && \text{Linear transform where $a = \frac{1}{\sigma}$, $b = -\frac{\mu}{\sigma}$}\\
214
+ &\sim \mathcal{N}(a\mu + b, a^2\sigma^2) && \text{The linear transform of a Normal is another Normal}\\
215
+ &\sim \mathcal{N}\left(\frac{\mu}{\sigma} - \frac{\mu}{\sigma}, \frac{\sigma^2}{\sigma^2}\right) && \text{Substitute values for $a$ and $b$}\\
216
+ &\sim \mathcal{N}(0, 1) && \text{The standard normal}
217
+ \end{align*}
218
+
219
+ This transformation is the foundation for many statistical tests and probability calculations.
220
+ """
221
+ )
222
+ return
223
+
224
+
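The derivation above can be sanity-checked by simulation. This is a rough sketch under stated assumptions: the parameters μ = 3 and σ = 4, the sample size, and the seed are all arbitrary choices, and it uses only Python's standard library rather than the notebook's NumPy/SciPy stack. It draws samples of X, applies Z = (X − μ)/σ, and checks that the result has mean close to 0 and standard deviation close to 1.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Arbitrary illustrative parameters: X ~ N(3, 16), so sigma = 4.
mu, sigma = 3.0, 4.0
n = 100_000

# Draw samples of X, then standardize each one.
xs = [rng.gauss(mu, sigma) for _ in range(n)]
zs = [(x - mu) / sigma for x in xs]

z_mean = statistics.fmean(zs)
z_std = statistics.pstdev(zs)

# Both should be close to the standard normal's mean 0 and std 1,
# up to sampling noise.
print(f"mean of Z: {z_mean:.4f}")
print(f"std of Z:  {z_std:.4f}")
```

A simulation only shows that the first two moments match; the derivation is what establishes that Z is exactly standard normal.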
225
+ @app.cell(hide_code=True)
226
+ def _(create_standardization_plot, fig_to_image, mo):
227
+ # Create and display visualization
228
+ stand_fig = create_standardization_plot()
229
+
230
+ # Display
231
+ stand_image = mo.image(fig_to_image(stand_fig), width="100%")
232
+
233
+ stand_explanation = mo.md(
234
+ r"""
235
+ **Standardizing a Normal Distribution: A Two-Step Process**
236
+
237
+ The visualization above shows the process of transforming any normal distribution to the standard normal:
238
+
239
+ 1. **Shift the distribution** (left plot): First, we subtract the mean (μ) from X, centering the distribution at 0.
240
+
241
+ 2. **Scale the distribution** (right plot): Next, we divide by the standard deviation (σ), which adjusts the spread to match the standard normal.
242
+
243
+ The resulting standard normal distribution Z ~ N(0,1) has a mean of 0 and a variance of 1.
244
+
245
+ This transformation allows us to use standardized tables and calculations for any normal distribution.
246
+ """
247
+ )
248
+
249
+ mo.vstack([stand_image, stand_explanation])
250
+ return stand_explanation, stand_fig, stand_image
251
+
252
+
253
+ @app.cell(hide_code=True)
254
+ def _(mo):
255
+ mo.md(
256
+ r"""
257
+ ## Linear Transformations of Normal Variables
258
+
259
+ One useful property of the normal distribution is that linear transformations of normal random variables remain normal.
260
+
261
+ If $X \sim \mathcal{N}(\mu, \sigma^2)$ and $Y = aX + b$ (where $a$ and $b$ are constants), then:
262
+
263
+ $$Y \sim \mathcal{N}(a\mu + b, a^2\sigma^2)$$
264
+
265
+ This means:
266
+
267
+ - The mean becomes $a\mu + b$
268
+ - The variance becomes $a^2\sigma^2$ (so the standard deviation becomes $|a|\sigma$)
269
+
270
+ This property is extremely useful in statistics and probability calculations, as it allows us to easily determine the _distribution_ of transformed variables.
271
+ """
272
+ )
273
+ return
274
+
275
+
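One way to check the linear-transformation rule without simulation is to compare CDFs: for a > 0, P(aX + b ≤ y) = P(X ≤ (y − b)/a), and this should equal the CDF of N(aμ + b, a²σ²) at y. The sketch below assumes illustrative values for μ, σ, a, and b and uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

# Illustrative values (assumptions, not from the notebook).
mu, sigma = 3.0, 4.0
a, b = 2.0, -1.0  # a > 0 keeps the inequality direction unchanged

X = NormalDist(mu, sigma)
# Claimed distribution of Y = aX + b: mean a*mu + b, std |a|*sigma.
Y = NormalDist(a * mu + b, abs(a) * sigma)

gaps = []
for y in (-5.0, 0.0, 5.0, 12.0):
    via_x = X.cdf((y - b) / a)  # P(aX + b <= y) rewritten in terms of X
    via_y = Y.cdf(y)            # CDF of the claimed distribution
    gaps.append(abs(via_x - via_y))
    print(f"y = {y:5.1f}: via X = {via_x:.6f}, via Y = {via_y:.6f}")

max_gap = max(gaps)
print(f"largest difference: {max_gap:.2e}")
```

For a < 0 the inequality flips, so the comparison would use 1 − P(X < (y − b)/a) instead; the resulting distribution is still N(aμ + b, a²σ²).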
276
+ @app.cell(hide_code=True)
277
+ def _(mo):
278
+ mo.md(
279
+ r"""
280
+ ## Calculating Probabilities with the Normal CDF
281
+
282
+ Unlike many other distributions, the normal distribution has no closed-form CDF in terms of elementary functions. Instead, we compute probabilities with the standard normal CDF (denoted $\Phi$), which is evaluated numerically.
283
+
284
+ For any normal random variable $X \sim \mathcal{N}(\mu, \sigma^2)$, the CDF is:
285
+
286
+ $$F_X(x) = P(X \leq x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$$
287
+
288
+ Where $\Phi$ is the CDF of the standard normal distribution.
289
+
290
+ ### Derivation
291
+
292
+ \begin{align*}
293
+ F_X(x) &= P(X \leq x) \\
294
+ &= P\left(\frac{X - \mu}{\sigma} \leq \frac{x - \mu}{\sigma}\right) \\
295
+ &= P\left(Z \leq \frac{x - \mu}{\sigma}\right) \\
296
+ &= \Phi\left(\frac{x - \mu}{\sigma}\right)
297
+ \end{align*}
298
+
299
+ Let's look at some examples of calculating probabilities with normal distributions.
300
+ """
301
+ )
302
+ return
303
+
304
+
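Although Φ has no elementary closed form, it can be written with the error function as Φ(z) = ½(1 + erf(z/√2)), which is available in Python's `math` module. The sketch below uses that identity to implement the formula F_X(x) = Φ((x − μ)/σ) from the derivation above; the names `phi` and `normal_cdf` are hypothetical helpers for this illustration, not functions from the notebook.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of N(mu, sigma^2): standardize x, then apply phi."""
    return phi((x - mu) / sigma)


# At the mean, half of the probability mass lies to the left.
at_mean = normal_cdf(3.0, 3.0, 4.0)

# Symmetry of the standard normal: phi(-z) = 1 - phi(z).
symmetry_gap = abs(phi(-1.0) - (1.0 - phi(1.0)))

print(f"F(mu) = {at_mean}")
print(f"phi(1) = {phi(1.0):.4f}, phi(-1) = {phi(-1.0):.4f}")
```

In practice `scipy.stats.norm.cdf(x, mu, sigma)` does the same job and also accepts arrays.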
305
+ @app.cell(hide_code=True)
306
+ def _(mo):
307
+ mo.md("""## Examples of Normal Distributions""")
308
+ return
309
+
310
+
311
+ @app.cell(hide_code=True)
312
+ def _(create_probability_example, fig_to_image, mo):
313
+ # Create visualization
314
+ default_mu = 3
315
+ default_sigma = 4
316
+ default_query = 0
317
+
318
+ prob_fig, prob_value, ex_z_score = create_probability_example(default_mu, default_sigma, default_query)
319
+
320
+ # Display
321
+ prob_image = mo.image(fig_to_image(prob_fig), width="100%")
322
+
323
+ prob_explanation = mo.md(
324
+ f"""
325
+ **Example: Let X ~ N(3, 16), what is P(X > 0)?**
326
+
327
+ To solve this probability question:
328
+
329
+ 1. First, we standardize the query value:
330
+ Z = (x - μ) / σ = (0 - 3) / 4 = -0.75
331
+
332
+ 2. Then we calculate using the standard normal CDF:
333
+ P(X > 0) = P(Z > -0.75) = 1 - P(Z ≤ -0.75) = 1 - Φ(-0.75)
334
+
335
+ 3. Because the standard normal is symmetric:
336
+ 1 - Φ(-0.75) = Φ(0.75) = {prob_value:.3f}
337
+
338
+ The shaded orange area in the graph represents this probability of approximately {prob_value:.3f}.
339
+ """
340
+ )
341
+
342
+ mo.vstack([prob_image, prob_explanation])
343
+ return (
344
+ default_mu,
345
+ default_query,
346
+ default_sigma,
347
+ ex_z_score,
348
+ prob_explanation,
349
+ prob_fig,
350
+ prob_image,
351
+ prob_value,
352
+ )
353
+
354
+
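The three steps of the P(X > 0) example can be reproduced in a few lines. This is a small standard-library sketch (the notebook itself computes the value inside its plotting helper); it evaluates the probability both directly as 1 − F_X(0) and through the symmetry shortcut Φ(0.75).

```python
from statistics import NormalDist

# X ~ N(3, 16): NormalDist takes the standard deviation, so sigma = 4.
X = NormalDist(mu=3, sigma=4)
Z = NormalDist()  # standard normal

# Step 1: standardize the query value.
z = (0 - X.mean) / X.stdev

# Steps 2 and 3: complement of the CDF, then the symmetry shortcut.
p_direct = 1 - X.cdf(0)
p_symmetry = Z.cdf(-z)  # equals Phi(0.75)

print(f"z = {z}")
print(f"P(X > 0) = {p_direct:.3f} (direct), {p_symmetry:.3f} (symmetry)")
```

With SciPy, `stats.norm.sf(0, 3, 4)` returns the same upper-tail probability in one call.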
355
+ @app.cell(hide_code=True)
356
+ def _(create_range_probability_example, fig_to_image, mo, stats):
357
+ # Create visualization
358
+ default_range_mu = 3
359
+ default_range_sigma = 4
360
+ default_range_lower = 2
361
+ default_range_upper = 5
362
+
363
+ range_fig, range_prob, range_z_lower, range_z_upper = create_range_probability_example(
364
+ default_range_mu, default_range_sigma, default_range_lower, default_range_upper)
365
+
366
+ # Display
367
+ range_image = mo.image(fig_to_image(range_fig), width="100%")
368
+
369
+ range_explanation = mo.md(
370
+ f"""
371
+ **Example: Let X ~ N(3, 16), what is P(2 < X < 5)?**
372
+
373
+ To solve this range probability question:
374
+
375
+ 1. First, we standardize both bounds:
376
+ Z_lower = (lower - μ) / σ = (2 - 3) / 4 = -0.25
377
+ Z_upper = (upper - μ) / σ = (5 - 3) / 4 = 0.5
378
+
379
+ 2. Then we calculate using the standard normal CDF:
380
+ P(2 < X < 5) = P(-0.25 < Z < 0.5)
381
+ = Φ(0.5) - Φ(-0.25)
382
+ = Φ(0.5) - (1 - Φ(0.25))
383
+ = Φ(0.5) + Φ(0.25) - 1
384
+
385
+ 3. Computing these values:
386
+ = {stats.norm.cdf(0.5):.3f} + {stats.norm.cdf(0.25):.3f} - 1
387
+ = {range_prob:.3f}
388
+
389
+ The shaded orange area in the graph represents this probability of approximately {range_prob:.3f}.
390
+ """
391
+ )
392
+
393
+ mo.vstack([range_image, range_explanation])
394
+ return (
395
+ default_range_lower,
396
+ default_range_mu,
397
+ default_range_sigma,
398
+ default_range_upper,
399
+ range_explanation,
400
+ range_fig,
401
+ range_image,
402
+ range_prob,
403
+ range_z_lower,
404
+ range_z_upper,
405
+ )
406
+
407
+
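The range probability follows the same pattern: subtract the CDF at the lower bound from the CDF at the upper bound. The sketch below, again using only the standard library as an assumption of convenience, checks that the direct computation agrees with the standardized form Φ(0.5) + Φ(0.25) − 1 derived in the example.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=4)  # X ~ N(3, 16)
Z = NormalDist()               # standard normal

lower, upper = 2, 5

# Direct: P(2 < X < 5) = F_X(5) - F_X(2).
p_range = X.cdf(upper) - X.cdf(lower)

# Standardized: z-scores of the bounds, then the symmetry rewrite.
z_lower = (lower - X.mean) / X.stdev
z_upper = (upper - X.mean) / X.stdev
p_standardized = Z.cdf(0.5) + Z.cdf(0.25) - 1

print(f"z-scores: {z_lower}, {z_upper}")
print(f"P(2 < X < 5) = {p_range:.3f} (direct), {p_standardized:.3f} (standardized)")
```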
408
+ @app.cell(hide_code=True)
409
+ def _(create_voltage_example_visualization, fig_to_image, mo):
410
+ # Create visualization
411
+ voltage_fig, voltage_error_prob = create_voltage_example_visualization()
412
+
413
+ # Display
414
+ voltage_image = mo.image(fig_to_image(voltage_fig), width="100%")
415
+
416
+ voltage_explanation = mo.md(
417
+ r"""
418
+ **Example: Signal Transmission with Noise**
419
+
420
+ In this example, we're sending digital signals over a wire:
421
+
422
+ - We send voltage 2 to represent a binary "1"
423
+ - We send voltage -2 to represent a binary "0"
424
+
425
+ The received signal R is the sum of the transmitted voltage (X) and random noise (Y):
426
+ R = X + Y, where Y ~ N(0, 1)
427
+
428
+ When decoding, we use a threshold of 0.5:
429
+
430
+ - If R ≥ 0.5, we interpret it as "1"
431
+ - If R < 0.5, we interpret it as "0"
432
+
433
+ Let's calculate the probability of error when sending a "1" (voltage = 2):
434
+
435
+ \begin{align*}
436
+ P(\text{Error when sending "1"}) &= P(X + Y < 0.5) \\
437
+ &= P(2 + Y < 0.5) \\
438
+ &= P(Y < -1.5) \\
439
+ &= \Phi(-1.5) \\
440
+ &\approx 0.067
441
+ \end{align*}
442
+
443
+ Therefore, the probability of incorrectly decoding a transmitted "1" as "0" is approximately 6.7%.
444
+
445
+ The orange shaded area in the plot represents this error probability.
446
+ """
447
+ )
448
+
449
+ mo.vstack([voltage_image, voltage_explanation])
450
+ return voltage_error_prob, voltage_explanation, voltage_fig, voltage_image
451
+
452
+
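The error probability in the voltage example can be checked both analytically and by simulating noisy transmissions. The following is a rough sketch: the sample size and seed are arbitrary, and the mirror case (sending a "0" and decoding it as "1") is not worked out in the text above but follows from the same setup.

```python
import random
from statistics import NormalDist

noise = NormalDist(0, 1)  # Y ~ N(0, 1)
threshold = 0.5

# Sending "1" (voltage 2): error if 2 + Y < 0.5, i.e. Y < -1.5.
p_err_one = noise.cdf(threshold - 2)

# Mirror case, sending "0" (voltage -2): error if -2 + Y >= 0.5, i.e. Y >= 2.5.
p_err_zero = 1 - noise.cdf(threshold + 2)

# Monte Carlo check of the first case.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200_000
errors = sum(1 for _ in range(n) if 2 + rng.gauss(0, 1) < threshold)
p_sim = errors / n

print(f"P(error | sent 1) = {p_err_one:.4f} (exact), {p_sim:.4f} (simulated)")
print(f"P(error | sent 0) = {p_err_zero:.4f} (exact)")
```

The two error rates differ because the threshold of 0.5 is not midway between the two voltages.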
453
+ @app.cell(hide_code=True)
454
+ def empirical_rule(mo):
455
+ mo.md(
456
+ r"""
457
+ ## The 68-95-99.7 Rule (Empirical Rule)
458
+
459
+ One of the most useful properties of the normal distribution is the "[68-95-99.7 rule](https://en.wikipedia.org/wiki/68-95-99.7_rule)," which states that:
460
+
461
+ - Approximately 68% of the data falls within 1 standard deviation of the mean
462
+ - Approximately 95% of the data falls within 2 standard deviations of the mean
463
+ - Approximately 99.7% of the data falls within 3 standard deviations of the mean
464
+
465
+ Let's verify this with a calculation for the 68% rule:
466
+
467
+ \begin{align}
468
+ P(\mu - \sigma < X < \mu + \sigma)
469
+ &= P(X < \mu + \sigma) - P(X < \mu - \sigma) \\
470
+ &= \Phi\left(\frac{(\mu + \sigma)-\mu}{\sigma}\right) - \Phi\left(\frac{(\mu - \sigma)-\mu}{\sigma}\right) \\
471
+ &= \Phi\left(\frac{\sigma}{\sigma}\right) - \Phi\left(\frac{-\sigma}{\sigma}\right) \\
472
+ &= \Phi(1) - \Phi(-1) \\
473
+ &\approx 0.8413 - 0.1587 \\
474
+ &\approx 0.6826 \approx 68.3\%
475
+ \end{align}
476
+
477
+ This calculation works for any normal distribution, regardless of the values of $\mu$ and $\sigma$!
478
+ """
479
+ )
480
+ return
481
+
482
+
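The same calculation extends to the 95% and 99.7% parts of the rule. The sketch below computes Φ(k) − Φ(−k) for k = 1, 2, 3 and then repeats the computation for a non-standard normal (μ = 3 and σ = 4 are arbitrary illustrative values) to show that the coverage does not depend on the parameters.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Probability of landing within k standard deviations of the mean.
coverage = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} sigma: {100 * p:.2f}%")

# Same computation for an arbitrary normal: the answer should not change.
mu, sigma = 3.0, 4.0
X = NormalDist(mu, sigma)
coverage_x = {
    k: X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma) for k in (1, 2, 3)
}
max_gap = max(abs(coverage[k] - coverage_x[k]) for k in coverage)
print(f"largest difference from the standard normal: {max_gap:.2e}")
```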
483
+ @app.cell(hide_code=True)
484
+ def _(mo):
485
+ mo.md(r"""The Cumulative Distribution Function (CDF) gives the probability that a random variable is less than or equal to a specific value. Use the interactive calculator below to compute CDF values for a normal distribution.""")
486
+ return
487
+
488
+
489
+ @app.cell(hide_code=True)
490
+ def _(mo, mu_slider, sigma_slider, x_slider):
491
+ mo.md(
492
+ f"""
493
+ ## Interactive Normal CDF Calculator
494
+
495
+ Use the sliders below to explore different probability calculations:
496
+
497
+ **Query value (x):** {x_slider} — The value at which to evaluate F(x) = P(X ≤ x)
498
+
499
+ **Mean (μ):** {mu_slider} — The center of the distribution
500
+
501
+ **Standard deviation (σ):** {sigma_slider} — The spread of the distribution (larger σ means more spread)
502
+ """
503
+ )
504
+ return
505
+
506
+
507
+ @app.cell(hide_code=True)
508
+ def _(
509
+ create_cdf_calculator_plot,
510
+ fig_to_image,
511
+ mo,
512
+ mu_slider,
513
+ sigma_slider,
514
+ x_slider,
515
+ ):
516
+ # Values from widgets
517
+ calc_x = x_slider.amount
518
+ calc_mu = mu_slider.amount
519
+ calc_sigma = sigma_slider.amount
520
+
521
+ # Create visualization
522
+ calc_fig, cdf_value = create_cdf_calculator_plot(calc_x, calc_mu, calc_sigma)
523
+
524
+ # Standardized z-score
525
+ calc_z_score = (calc_x - calc_mu) / calc_sigma
526
+
527
+ # Display
528
+ calc_image = mo.image(fig_to_image(calc_fig), width="100%")
529
+
530
+ calc_result = mo.md(
531
+ f"""
532
+ ### Results:
533
+
534
+ For a Normal distribution with parameters μ = {calc_mu:.1f} and σ = {calc_sigma:.1f}:
535
+
536
+ - The value x = {calc_x:.1f} corresponds to a z-score of z = {calc_z_score:.3f}
537
+ - The CDF value F({calc_x:.1f}) = P(X ≤ {calc_x:.1f}) = {cdf_value:.3f}
538
+ - This means the probability that X is less than or equal to {calc_x:.1f} is {cdf_value*100:.1f}%
539
+
540
+ **Computing this in Python:**
541
+ ```python
542
+ from scipy import stats
543
+
544
+ # Using the one-line method
545
+ p = stats.norm.cdf({calc_x:.1f}, {calc_mu:.1f}, {calc_sigma:.1f})
546
+
547
+ # OR using the two-line method
548
+ X = stats.norm({calc_mu:.1f}, {calc_sigma:.1f})
549
+ p = X.cdf({calc_x:.1f})
550
+ ```
551
+
552
+ **Note:** In SciPy's `stats.norm`, the second parameter is the standard deviation (σ), not the variance (σ²).
553
+ """
554
+ )
555
+
556
+ mo.vstack([calc_image, calc_result])
557
+ return (
558
+ calc_fig,
559
+ calc_image,
560
+ calc_mu,
561
+ calc_result,
562
+ calc_sigma,
563
+ calc_x,
564
+ calc_z_score,
565
+ cdf_value,
566
+ )
567
+
568
+
569
+ @app.cell(hide_code=True)
570
+ def _(mo):
571
+ mo.md(
572
+ r"""
573
+ ## 🤔 Test Your Understanding
574
+
575
+ Test your knowledge with these true/false questions about normal distributions:
576
+
577
+ /// details | For a normal random variable X ~ N(μ, σ²), the probability density f(x) is highest at x = μ.
578
+
579
+ **✅ True**
580
+
581
+ While the PDF is indeed highest at x = μ, making this the most likely value in terms of density, remember that for continuous random variables, the probability of any exact value is zero. The statement refers to the density function being maximized at the mean.
582
+ ///
583
+
584
+ /// details | The probability that a normal random variable X equals any specific exact value (e.g., P(X = 3)) is always zero.
585
+
586
+ **✅ True**
587
+
588
+ For continuous random variables including the normal, the probability of any exact value is zero. Probabilities only make sense for ranges of values, which is why we integrate the PDF over intervals.
589
+ ///
590
+
591
+ /// details | If X ~ N(μ, σ²), then aX + b ~ N(aμ + b, a²σ²) for any constants a and b.
592
+
593
+ **✅ True**
594
+
595
+ Linear transformations of normal random variables remain normal, with the given transformation of the parameters. This is a key property that makes normal distributions particularly useful.
596
+ ///
597
+
598
+ /// details | If X ~ N(5, 9) and Y ~ N(3, 4) are independent, then X + Y ~ N(8, 5).
599
+
600
+ **❌ False**
601
+
602
+ While the mean of the sum is indeed the sum of the means (5 + 3 = 8), the variance of the sum is the sum of the variances (9 + 4 = 13), not 5. The correct distribution would be X + Y ~ N(8, 13).
603
+ ///
604
+ """
605
+ )
606
+ return
607
+
608
+
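The last quiz answer, that variances (not standard deviations) add for independent normals, can be checked by simulation. This sketch assumes an arbitrary seed and sample size; it also uses the fact that adding two `statistics.NormalDist` objects treats them as independent and returns the distribution of their sum.

```python
import random
from statistics import NormalDist, fmean

# X ~ N(5, 9) and Y ~ N(3, 4): standard deviations 3 and 2.
X = NormalDist(5, 3)
Y = NormalDist(3, 2)

# Adding two NormalDist objects assumes independence.
S = X + Y
print(f"X + Y has mean {S.mean} and variance {S.variance:.1f}")

# Monte Carlo check with independent draws.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
sums = [rng.gauss(5, 3) + rng.gauss(3, 2) for _ in range(n)]
sim_mean = fmean(sums)
sim_var = sum((s - sim_mean) ** 2 for s in sums) / n

print(f"simulated mean {sim_mean:.3f}, simulated variance {sim_var:.3f}")
```

The simulated variance should land near 13 rather than 5, matching N(8, 13).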
609
+ @app.cell(hide_code=True)
610
+ def _(mo):
611
+ mo.md(
612
+ r"""
613
+ ## Summary
614
+
615
+ We've taken a tour of the Normal distribution, probably the most famous probability distribution you'll encounter in statistics. It's the bell-shaped curve that shows up everywhere, from heights and weights to measurement errors and stock returns (and the occasional meme).
616
+
617
+ The Normal distribution isn't just pretty; it's incredibly practical. With just two parameters (mean and standard deviation), you can describe complex phenomena and make useful predictions. Plus, thanks to the Central Limit Theorem, sums and averages of many independent random effects are approximately Normal, which is why the distribution is so prevalent.
618
+
619
+ **What we covered:**
620
+
621
+ - The mathematical definition and key properties of Normal random variables
622
+
623
+ - How to transform any Normal distribution to the standard Normal
624
+
625
+ - Calculating probabilities using the CDF (no more looking up values in those tiny tables in the back of textbooks or Clark's table!)
626
+
627
+ Whether you're analyzing data, designing experiments, or building ML models, the concepts we explored provide a solid foundation for working with this fundamental distribution.
628
+ """
629
+ )
630
+ return
631
+
632
+
633
+ @app.cell(hide_code=True)
634
+ def _(mo):
635
+ mo.md(r"""## Appendix (helper code and functions)""")
636
+ return
637
+
638
+
639
+ @app.cell
640
+ def _():
641
+ import marimo as mo
642
+ return (mo,)
643
+
644
+
645
+ @app.cell(hide_code=True)
646
+ def _():
647
+ from wigglystuff import TangleSlider
648
+ return (TangleSlider,)
649
+
650
+
651
+ @app.cell(hide_code=True)
652
+ def _(np, plt, stats):
653
+ def create_normal_pdf_plot(mu, sigma):
654
+
655
+ # Range for x values (show μ ± 4σ)
656
+ x = np.linspace(mu - 4*sigma, mu + 4*sigma, 1000)
657
+ pdf = stats.norm.pdf(x, mu, sigma)
658
+
659
+ # Calculate CDF values
660
+ cdf = stats.norm.cdf(x, mu, sigma)
661
+
662
+ # Create plot with two subplots for (PDF and CDF)
663
+ pdf_fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))
664
+
665
+ # PDF plot
666
+ ax1.plot(x, pdf, color='royalblue', linewidth=2, label='PDF')
667
+ ax1.fill_between(x, pdf, color='royalblue', alpha=0.2)
668
+
669
+ # Vertical line at mean
670
+ ax1.axvline(x=mu, color='red', linestyle='--', linewidth=1.5,
671
+ label=f'Mean: μ = {mu:.1f}')
672
+
673
+ # Stdev regions
674
+ for i in range(1, 4):
675
+ alpha = 0.1 if i > 1 else 0.2
676
+ percentage = 100*stats.norm.cdf(i) - 100*stats.norm.cdf(-i)
677
+ label = f'μ ± {i}σ: {percentage:.1f}%' if i == 1 else None
678
+ ax1.axvspan(mu - i*sigma, mu + i*sigma, alpha=alpha, color='green',
679
+ label=label)
680
+
681
+ # Annotations
682
+ ax1.annotate(f'μ = {mu:.1f}', xy=(mu, max(pdf)*0.15), xytext=(mu+0.5*sigma, max(pdf)*0.4),
683
+ arrowprops=dict(facecolor='black', width=1, shrink=0.05))
684
+
685
+ ax1.annotate(f'σ = {sigma:.1f}',
686
+ xy=(mu+sigma, stats.norm.pdf(mu+sigma, mu, sigma)),
687
+ xytext=(mu+1.5*sigma, stats.norm.pdf(mu+sigma, mu, sigma)*1.5),
688
+ arrowprops=dict(facecolor='black', width=1, shrink=0.05))
689
+
690
+ # some styling
691
+ ax1.set_title(f'Normal Distribution PDF: N({mu:.1f}, {sigma:.1f}²)')
692
+ ax1.set_xlabel('x')
693
+ ax1.set_ylabel('Probability Density: f(x)')
694
+ ax1.legend(loc='upper right')
695
+ ax1.grid(alpha=0.3)
696
+
697
+ # CDF plot
698
+ ax2.plot(x, cdf, color='darkorange', linewidth=2, label='CDF')
699
+
700
+ # key CDF values mark
701
+ key_points = [
702
+ (mu-sigma, stats.norm.cdf(mu-sigma, mu, sigma), "16%"),
703
+ (mu, 0.5, "50%"),
704
+ (mu+sigma, stats.norm.cdf(mu+sigma, mu, sigma), "84%")
705
+ ]
706
+
707
+ for point, value, label in key_points:
708
+ ax2.plot(point, value, 'ro')
709
+ ax2.annotate(f'{label}',
710
+ xy=(point, value),
711
+ xytext=(point+0.2*sigma, value-0.1),
712
+ arrowprops=dict(facecolor='black', width=1, shrink=0.05))
713
+
714
+ # CDF styling
715
+ ax2.set_title(f'Normal Distribution CDF: N({mu:.1f}, {sigma:.1f}²)')
716
+ ax2.set_xlabel('x')
717
+ ax2.set_ylabel('Cumulative Probability: F(x)')
718
+ ax2.grid(alpha=0.3)
719
+
720
+ plt.tight_layout()
721
+ return pdf_fig
722
+ return (create_normal_pdf_plot,)
723
+
724
+
725
+ @app.cell(hide_code=True)
726
+ def _(base64, io):
727
+ from matplotlib.figure import Figure
728
+
729
+ # convert matplotlib figures to images (helper code)
730
+ def fig_to_image(fig):
731
+ buf = io.BytesIO()
732
+ fig.savefig(buf, format='png', bbox_inches='tight')
733
+ buf.seek(0)
734
+ img_str = base64.b64encode(buf.getvalue()).decode('utf-8')
735
+ return f"data:image/png;base64,{img_str}"
736
+ return Figure, fig_to_image
737
+
738
+
739
+ @app.cell(hide_code=True)
740
+ def _():
741
+ # Import libraries
742
+ import numpy as np
743
+ import matplotlib.pyplot as plt
744
+ from scipy import stats
745
+ import io
746
+ import base64
747
+ return base64, io, np, plt, stats
748
+
749
+
750
+ @app.cell(hide_code=True)
751
+ def _(TangleSlider, mo):
752
+ mean_slider = mo.ui.anywidget(TangleSlider(
753
+ amount=0,
754
+ min_value=-5,
755
+ max_value=5,
756
+ step=0.1,
757
+ digits=1
758
+ ))
759
+
760
+ std_slider = mo.ui.anywidget(TangleSlider(
761
+ amount=1,
762
+ min_value=0.1,
763
+ max_value=3,
764
+ step=0.1,
765
+ digits=1
766
+ ))
767
+ return mean_slider, std_slider
768
+
769
+
770
+ @app.cell(hide_code=True)
771
+ def _(TangleSlider, mo):
772
+ x_slider = mo.ui.anywidget(TangleSlider(
773
+ amount=0,
774
+ min_value=-5,
775
+ max_value=5,
776
+ step=0.1,
777
+ digits=1
778
+ ))
779
+
780
+ mu_slider = mo.ui.anywidget(TangleSlider(
781
+ amount=0,
782
+ min_value=-5,
783
+ max_value=5,
784
+ step=0.1,
785
+ digits=1
786
+ ))
787
+
788
+ sigma_slider = mo.ui.anywidget(TangleSlider(
789
+ amount=1,
790
+ min_value=0.1,
791
+ max_value=3,
792
+ step=0.1,
793
+ digits=1
794
+ ))
795
+ return mu_slider, sigma_slider, x_slider
796
+
797
+
798
+ @app.cell(hide_code=True)
799
+ def _(np, plt, stats):
800
+ def create_distribution_comparison(mu=5, sigma=6):
801
+
802
+ # Create figure and axis
803
+ comparison_fig, ax = plt.subplots(figsize=(10, 6))
804
+
805
+ # X range for plotting
806
+ x = np.linspace(-10, 20, 1000)
807
+
808
+ # Standard normal
809
+ std_normal = stats.norm.pdf(x, 0, 1)
810
+
811
+ # Our example normal
812
+ example_normal = stats.norm.pdf(x, mu, sigma)
813
+
814
+ # Plot both distributions
815
+ ax.plot(x, std_normal, 'darkviolet', linewidth=2, label='Standard Normal')
816
+ ax.plot(x, example_normal, 'blue', linewidth=2, label=f'X ~ N({mu}, {sigma}²)')
817
+
818
+ # format the plot
819
+ ax.set_xlim(-10, 20)
820
+ ax.set_ylim(0, 0.45)
821
+ ax.set_xlabel('x')
822
+ ax.set_ylabel('Probability Density')
823
+ ax.grid(True, alpha=0.3)
824
+ ax.legend()
825
+
826
+ # Decorative text box for parameters
827
+ props = dict(boxstyle='round', facecolor='white', alpha=0.9)
828
+ textstr = '\n'.join((
829
+ r'Normal (aka Gaussian) Random Variable',
830
+ r'',
831
+ rf'Parameter $\mu$: {mu}',
832
+ rf'Parameter $\sigma$: {sigma}'
833
+ ))
834
+ ax.text(0.05, 0.95, textstr, transform=ax.transAxes, fontsize=10,
835
+ verticalalignment='top', bbox=props)
836
+
837
+ return comparison_fig
838
+ return (create_distribution_comparison,)
839
+
840
+
+ @app.cell(hide_code=True)
+ def _(np, plt, stats):
+     def create_voltage_example_visualization():
+
+         # Create data for plotting
+         x = np.linspace(-4, 4, 1000)
+
+         # Signal without noise (X = 2)
+         signal_value = 2
+
+         # Noise distribution (Y ~ N(0, 1))
+         noise_pdf = stats.norm.pdf(x, 0, 1)
+
+         # Signal + Noise distribution (R = X + Y ~ N(2, 1))
+         received_pdf = stats.norm.pdf(x, signal_value, 1)
+
+         # Create figure
+         voltage_fig, ax = plt.subplots(figsize=(10, 6))
+
+         # Plot the noise distribution
+         ax.plot(x, noise_pdf, 'blue', linewidth=1.5, alpha=0.6,
+                 label='Noise: Y ~ N(0, 1)')
+
+         # received signal distribution
+         ax.plot(x, received_pdf, 'red', linewidth=2,
+                 label=f'Received: R ~ N({signal_value}, 1)')
+
+         # vertical line at the decision boundary (0.5)
+         threshold = 0.5
+         ax.axvline(x=threshold, color='green', linestyle='--', linewidth=2,
+                    label=f'Decision threshold: {threshold}')
+
+         # Shade the error region
+         mask = x < threshold
+         error_prob = stats.norm.cdf(threshold, signal_value, 1)
+         ax.fill_between(x[mask], received_pdf[mask], color='darkorange', alpha=0.5,
+                         label=f'Error probability: {error_prob:.3f}')
+
+         # Styling
+         ax.set_title('Voltage Transmission Example: Probability of Error')
+         ax.set_xlabel('Voltage')
+         ax.set_ylabel('Probability Density')
+         ax.legend(loc='upper left')
+         ax.grid(alpha=0.3)
+
+         # Add explanatory annotations
+         ax.text(1.5, 0.1, 'When sending "1" (voltage=2),\nthis area represents\nthe error probability',
+                 bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1))
+
+         plt.tight_layout()
+         plt.gca()
+         return voltage_fig, error_prob
+     return (create_voltage_example_visualization,)
+
+
+ @app.cell(hide_code=True)
+ def _(np, plt, stats):
+     def create_cdf_calculator_plot(calc_x, calc_mu, calc_sigma):
+
+         # Data range for plotting
+         x_range = np.linspace(calc_mu - 4*calc_sigma, calc_mu + 4*calc_sigma, 1000)
+         pdf = stats.norm.pdf(x_range, calc_mu, calc_sigma)
+         cdf = stats.norm.cdf(x_range, calc_mu, calc_sigma)
+
+         # Calculate the CDF at x
+         cdf_at_x = stats.norm.cdf(calc_x, calc_mu, calc_sigma)
+
+         # Create figure with two subplots
+         calc_fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))
+
+         # Plot PDF on top subplot
+         ax1.plot(x_range, pdf, color='royalblue', linewidth=2, label='PDF')
+
+         # area shade for P(X ≤ x)
+         mask = x_range <= calc_x
+         ax1.fill_between(x_range[mask], pdf[mask], color='darkorange', alpha=0.6)
+
+         # Vertical line at x
+         ax1.axvline(x=calc_x, color='red', linestyle='--', linewidth=1.5)
+
+         # PDF labels and styling
+         ax1.set_title(f'Normal PDF with Area P(X ≤ {calc_x:.1f}) Highlighted')
+         ax1.set_xlabel('x')
+         ax1.set_ylabel('Probability Density')
+         ax1.annotate(f'x = {calc_x:.1f}', xy=(calc_x, 0), xytext=(calc_x, -0.01),
+                      horizontalalignment='center', color='red')
+         ax1.grid(alpha=0.3)
+
+         # CDF on bottom subplot
+         ax2.plot(x_range, cdf, color='green', linewidth=2, label='CDF')
+
+         # Mark the point (x, CDF(x))
+         ax2.plot(calc_x, cdf_at_x, 'ro', markersize=8)
+
+         # CDF labels and styling
+         ax2.set_title(f'Normal CDF: F({calc_x:.1f}) = {cdf_at_x:.3f}')
+         ax2.set_xlabel('x')
+         ax2.set_ylabel('Cumulative Probability')
+         ax2.annotate(f'F({calc_x:.1f}) = {cdf_at_x:.3f}',
+                      xy=(calc_x, cdf_at_x),
+                      xytext=(calc_x + 0.5*calc_sigma, cdf_at_x - 0.1),
+                      arrowprops=dict(facecolor='black', width=1, shrink=0.05),
+                      bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1))
+         ax2.grid(alpha=0.3)
+
+         plt.tight_layout()
+         plt.gca()
+         return calc_fig, cdf_at_x
+     return (create_cdf_calculator_plot,)
+
+
+ @app.cell(hide_code=True)
+ def _(np, plt, stats):
+     def create_standardization_plot():
+
+         x = np.linspace(-6, 6, 1000)
+
+         # Original distribution N(2, 1.5²)
+         mu_original, sigma_original = 2, 1.5
+         pdf_original = stats.norm.pdf(x, mu_original, sigma_original)
+
+         # shifted distribution N(0, 1.5²)
+         mu_shifted, sigma_shifted = 0, 1.5
+         pdf_shifted = stats.norm.pdf(x, mu_shifted, sigma_shifted)
+
+         # Standard normal N(0, 1)
+         mu_standard, sigma_standard = 0, 1
+         pdf_standard = stats.norm.pdf(x, mu_standard, sigma_standard)
+
+         # Create visualization
+         stand_fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
+
+         # Plot on left: Original and shifted distributions
+         ax1.plot(x, pdf_original, 'royalblue', linewidth=2,
+                  label=f'Original: N({mu_original}, {sigma_original}²)')
+         ax1.plot(x, pdf_shifted, 'darkorange', linewidth=2,
+                  label=f'Shifted: N({mu_shifted}, {sigma_shifted}²)')
+
+         # Add arrow to show the shift
+         shift_x1, shift_y1 = mu_original, stats.norm.pdf(mu_original, mu_original, sigma_original)*0.6
+         shift_x2, shift_y2 = mu_shifted, stats.norm.pdf(mu_shifted, mu_shifted, sigma_shifted)*0.6
+         ax1.annotate('', xy=(shift_x2, shift_y2), xytext=(shift_x1, shift_y1),
+                      arrowprops=dict(facecolor='black', width=1.5, shrink=0.05))
+         ax1.text(0.8, 0.28, 'Subtract μ', transform=ax1.transAxes)
+
+         # Plot on right: Shifted and standard normal
+         ax2.plot(x, pdf_shifted, 'darkorange', linewidth=2,
+                  label=f'Shifted: N({mu_shifted}, {sigma_shifted}²)')
+         ax2.plot(x, pdf_standard, 'green', linewidth=2,
+                  label=f'Standard: N({mu_standard}, {sigma_standard}²)')
+
+         # Add arrow to show the scaling
+         scale_x1, scale_y1 = 2*sigma_shifted, stats.norm.pdf(2*sigma_shifted, mu_shifted, sigma_shifted)*0.8
+         scale_x2, scale_y2 = 2*sigma_standard, stats.norm.pdf(2*sigma_standard, mu_standard, sigma_standard)*0.8
+         ax2.annotate('', xy=(scale_x2, scale_y2), xytext=(scale_x1, scale_y1),
+                      arrowprops=dict(facecolor='black', width=1.5, shrink=0.05))
+         ax2.text(0.75, 0.5, 'Divide by σ', transform=ax2.transAxes)
+
+         # some styling
+         for ax in (ax1, ax2):
+             ax.set_xlabel('x')
+             ax.set_ylabel('Probability Density')
+             ax.grid(alpha=0.3)
+             ax.legend()
+
+         ax1.set_title('Step 1: Shift the Distribution')
+         ax2.set_title('Step 2: Scale the Distribution')
+
+         plt.tight_layout()
+         plt.gca()
+         return stand_fig
+     return (create_standardization_plot,)
+
+
+ @app.cell(hide_code=True)
+ def _(np, plt, stats):
+     def create_probability_example(example_mu=3, example_sigma=4, example_query=0):
+
+         # Create data range
+         x = np.linspace(example_mu - 4*example_sigma, example_mu + 4*example_sigma, 1000)
+         pdf = stats.norm.pdf(x, example_mu, example_sigma)
+
+         # probability calc
+         prob_value = 1 - stats.norm.cdf(example_query, example_mu, example_sigma)
+         ex_z_score = (example_query - example_mu) / example_sigma
+
+         # Create visualization
+         prob_fig, ax = plt.subplots(figsize=(10, 6))
+
+         # Plot PDF
+         ax.plot(x, pdf, 'royalblue', linewidth=2)
+
+         # area shading representing the probability
+         mask = x >= example_query
+         ax.fill_between(x[mask], pdf[mask], color='darkorange', alpha=0.6)
+
+         # Add vertical line at query point
+         ax.axvline(x=example_query, color='red', linestyle='--', linewidth=1.5)
+
+         # Annotations
+         ax.annotate(f'x = {example_query}', xy=(example_query, 0), xytext=(example_query, -0.005),
+                     horizontalalignment='center')
+
+         ax.annotate(f'P(X > {example_query}) = {prob_value:.3f}',
+                     xy=(example_query + example_sigma, 0.015),
+                     xytext=(example_query + 1.5*example_sigma, 0.02),
+                     arrowprops=dict(facecolor='black', width=1, shrink=0.05),
+                     bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1))
+
+         # Standard normal calculation annotation
+         ax.annotate(f'= P(Z > {ex_z_score:.3f}) = {prob_value:.3f}',
+                     xy=(example_query - example_sigma, 0.01),
+                     xytext=(example_query - 2*example_sigma, 0.015),
+                     arrowprops=dict(facecolor='black', width=1, shrink=0.05),
+                     bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1))
+
+         # some styling
+         ax.set_title(f'Example: P(X > {example_query}) where X ~ N({example_mu}, {example_sigma}²)')
+         ax.set_xlabel('x')
+         ax.set_ylabel('Probability Density')
+         ax.grid(alpha=0.3)
+
+         plt.tight_layout()
+         plt.gca()
+         return prob_fig, prob_value, ex_z_score
+     return (create_probability_example,)
+
+
+ @app.cell(hide_code=True)
+ def _(np, plt, stats):
+     def create_range_probability_example(range_mu=3, range_sigma=4, range_lower=2, range_upper=5):
+
+         x = np.linspace(range_mu - 4*range_sigma, range_mu + 4*range_sigma, 1000)
+         pdf = stats.norm.pdf(x, range_mu, range_sigma)
+
+         # probability
+         range_prob = stats.norm.cdf(range_upper, range_mu, range_sigma) - stats.norm.cdf(range_lower, range_mu, range_sigma)
+         range_z_lower = (range_lower - range_mu) / range_sigma
+         range_z_upper = (range_upper - range_mu) / range_sigma
+
+         # Create visualization
+         range_fig, ax = plt.subplots(figsize=(10, 6))
+
+         # Plot PDF
+         ax.plot(x, pdf, 'royalblue', linewidth=2)
+
+         # Shade the area representing the probability
+         mask = (x >= range_lower) & (x <= range_upper)
+         ax.fill_between(x[mask], pdf[mask], color='darkorange', alpha=0.6)
+
+         # Add vertical lines at query points
+         ax.axvline(x=range_lower, color='red', linestyle='--', linewidth=1.5)
+         ax.axvline(x=range_upper, color='red', linestyle='--', linewidth=1.5)
+
+         # Annotations
+         ax.annotate(f'x = {range_lower}', xy=(range_lower, 0), xytext=(range_lower, -0.005),
+                     horizontalalignment='center')
+         ax.annotate(f'x = {range_upper}', xy=(range_upper, 0), xytext=(range_upper, -0.005),
+                     horizontalalignment='center')
+
+         ax.annotate(f'P({range_lower} < X < {range_upper}) = {range_prob:.3f}',
+                     xy=((range_lower + range_upper)/2, max(pdf[mask])/2),
+                     xytext=((range_lower + range_upper)/2, max(pdf[mask])*1.5),
+                     arrowprops=dict(facecolor='black', width=1, shrink=0.05),
+                     bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1),
+                     horizontalalignment='center')
+
+         # Standard normal calculation annotation
+         ax.annotate(f'= P({range_z_lower:.3f} < Z < {range_z_upper:.3f}) = {range_prob:.3f}',
+                     xy=((range_lower + range_upper)/2, max(pdf[mask])/3),
+                     xytext=(range_mu - 2*range_sigma, max(pdf[mask])/1.5),
+                     arrowprops=dict(facecolor='black', width=1, shrink=0.05),
+                     bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", lw=1))
+
+         ax.set_title(f'Example: P({range_lower} < X < {range_upper}) where X ~ N({range_mu}, {range_sigma}²)')
+         ax.set_xlabel('x')
+         ax.set_ylabel('Probability Density')
+         ax.grid(alpha=0.3)
+
+         plt.tight_layout()
+         plt.gca()
+         return range_fig, range_prob, range_z_lower, range_z_upper
+     return (create_range_probability_example,)
+
+
+ if __name__ == "__main__":
+     app.run()
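
The probabilities that the voltage, tail, and range helpers annotate on their plots can be cross-checked numerically. The sketch below is not part of the commit: it swaps `scipy.stats.norm` for the standard library's `statistics.NormalDist` so that it runs without third-party packages, and the function name `tail_and_range_checks` is hypothetical. It assumes the default parameters that appear in the diff (signal 2 with unit-variance noise and threshold 0.5; X ~ N(3, 4²) with query point 0 and range (2, 5)).

```python
from statistics import NormalDist


def tail_and_range_checks():
    """Recompute the three probabilities annotated by the plotting helpers.

    Hypothetical helper for illustration; uses statistics.NormalDist in
    place of scipy.stats.norm.
    """
    std_normal = NormalDist()  # Z ~ N(0, 1)

    # Voltage example: R ~ N(2, 1); an error occurs when R < 0.5.
    error_prob = NormalDist(mu=2, sigma=1).cdf(0.5)

    # Tail example: X ~ N(3, 4^2); P(X > 0) directly and via the z-score.
    x_dist = NormalDist(mu=3, sigma=4)
    tail_direct = 1 - x_dist.cdf(0)
    z = (0 - 3) / 4
    tail_via_z = 1 - std_normal.cdf(z)

    # Range example: P(2 < X < 5) = Phi(z_upper) - Phi(z_lower).
    range_direct = x_dist.cdf(5) - x_dist.cdf(2)
    z_lower, z_upper = (2 - 3) / 4, (5 - 3) / 4
    range_via_z = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

    return error_prob, tail_direct, tail_via_z, range_direct, range_via_z


results = tail_and_range_checks()
print([round(p, 4) for p in results])
```

The direct and standardized computations should agree up to floating-point error, which is the point of the "Subtract μ, divide by σ" steps in the standardization plot: the error probability comes out near 0.067, the tail probability near 0.773, and the range probability near 0.290.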