# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "cvxpy==1.6.0",
#     "marimo",
#     "numpy==2.2.2",
# ]
# ///

import marimo

__generated_with = "0.11.0"
app = marimo.App()


@app.cell
def _():
    import marimo as mo
    return (mo,)


@app.cell(hide_code=True)
def _(mo):
    mo.md(
        r"""
        # Least squares

        In a least-squares problem, we have measurements $A \in \mathcal{R}^{m \times
        n}$ (i.e., $m$ rows and $n$ columns) and $b \in \mathcal{R}^m$. We seek a vector
        $x \in \mathcal{R}^{n}$ such that $Ax$ is close to $b$. The matrix $A$ and the vector $b$ are problem data (constants), and $x$ is the variable we are solving for.

        Closeness is defined as the sum of the squared differences:

        \[ \sum_{i=1}^m (a_i^Tx - b_i)^2, \]

        also known as the $\ell_2$-norm squared, $\|Ax - b\|_2^2$.

        For example, we might have a dataset of $m$ users, each represented by $n$ features. Each row $a_i^T$ of $A$ is the feature vector for user $i$, while the corresponding entry $b_i$ of $b$ is the measurement we want to predict from $a_i^T$, such as ad spending. The prediction for user $i$ is given by $a_i^Tx$.

        We find the optimal value of $x$ by solving the optimization problem

        \[
            \begin{array}{ll}
            \text{minimize}   & \|Ax - b\|_2^2.
            \end{array}
        \]

        Let $x^\star$ denote the optimal $x$. The quantity $r = Ax^\star - b$ is known as the residual. If $\|r\|_2 = 0$, we have a perfect fit.
        """
    )
    return
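

@app.cell(hide_code=True)
def _(mo):
    mo.md(
        r"""
        As a point of reference: when $A$ has full column rank, this problem also has a
        closed-form solution given by the normal equations,

        \[ A^TAx^\star = A^Tb, \qquad x^\star = (A^TA)^{-1}A^Tb. \]

        In practice we usually let a numerical routine or solver compute $x^\star$,
        which is what the example below does.
        """
    )
    return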


@app.cell(hide_code=True)
def _(mo):
    mo.md(
        r"""
        ## Example

        In this example, we use the Python library [CVXPY](https://github.com/cvxpy/cvxpy) to construct and solve a least-squares problem.
        """
    )
    return


@app.cell
def _():
    import cvxpy as cp
    import numpy as np
    return cp, np


@app.cell
def _():
    m = 20  # number of measurements (rows of A)
    n = 15  # number of unknowns (columns of A)
    return m, n


@app.cell
def _(m, n, np):
    # Generate synthetic problem data; the fixed seed makes the example reproducible.
    np.random.seed(0)
    A = np.random.randn(m, n)
    b = np.random.randn(m)
    return A, b


@app.cell
def _(A, b, cp, n):
    # Declare the variable, form the least-squares objective, and solve.
    # problem.solve() returns the optimal objective value and stores the
    # minimizer in x.value.
    x = cp.Variable(n)
    objective = cp.sum_squares(A @ x - b)
    problem = cp.Problem(cp.Minimize(objective))
    optimal_value = problem.solve()
    return objective, optimal_value, problem, x
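

@app.cell
def _(A, b, np):
    # A quick cross-check (sketch): solve the same least-squares problem with
    # NumPy's lstsq routine. Its objective value should agree with the CVXPY
    # optimal value up to numerical tolerance.
    _x_ls, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    float(np.sum((A @ _x_ls - b) ** 2))
    return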


@app.cell
def _(A, b, cp, mo, optimal_value, x):
    mo.md(
        f"""
        - The optimal objective value is **{optimal_value:.4f}**.
        - The optimal value of $x$ is {mo.as_html(list(x.value))}
        - The norm of the residual is **{cp.norm(A @ x - b, p=2).value:.4f}**
        """
    )
    return


@app.cell(hide_code=True)
def _(mo):
    mo.md(
        r"""
        ## Further reading

        For a primer on least squares, with many real-world examples, check out the free book
        [Vectors, Matrices, and Least Squares](https://web.stanford.edu/~boyd/vmls/), which is used for undergraduate linear algebra education at Stanford.
        """
    )
    return


if __name__ == "__main__":
    app.run()