|
import streamlit as st |
|
|
|
|
|
st.set_page_config(page_title="Linear Regression", page_icon="🤖", layout="wide") |
|
|
|
|
|
st.markdown(""" |
|
<style> |
|
.stApp { |
|
background-color: #ECECEC; |
|
} |
|
h1, h2, h3 { |
|
color: #002244; |
|
} |
|
.custom-font, p { |
|
font-family: 'Arial', sans-serif; |
|
font-size: 18px; |
|
color: black; |
|
line-height: 1.6; |
|
} |
|
</style> |
|
""", unsafe_allow_html=True) |
|
|
|
|
|
st.markdown("<h1 style='color: #002244;'>Complete Overview of Linear Regression</h1>", unsafe_allow_html=True) |
|
|
|
st.write(""" |
|
Linear Regression is a foundational **Supervised Learning** algorithm for **regression problems**.
|
It predicts continuous outputs by identifying the best-fit line (or hyperplane) that minimizes the gap between actual and predicted values. |
|
""") |
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>What Defines the Best Fit Line?</h2>", unsafe_allow_html=True) |
|
st.write(""" |
|
The ideal regression line is one that: |
|
- Gets as close as possible to all data points. |
|
- **Minimizes the error** between actual and predicted values. |
|
- Is calculated using optimization techniques like **Ordinary Least Squares (OLS)** or **Gradient Descent**. |
|
""") |
|
st.image("linearregression.png", width=900) |
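The OLS idea above can be illustrated with a minimal NumPy sketch that recovers the best-fit line from data generated by a known line (the data and variable names here are assumed for illustration):

```python
import numpy as np

# Synthetic data generated from a known line y = 2x + 1 (assumed example).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Ordinary Least Squares for simple linear regression:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
w1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w0 = y.mean() - w1 * x.mean()

print(w1, w0)  # recovers the slope and intercept of the generating line
```

Because the data are noiseless, OLS recovers the generating slope and intercept exactly.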
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Training Process: Simple Linear Regression</h2>", unsafe_allow_html=True) |
|
st.write(r"""

Simple Linear Regression is used when there is a single input feature.

It models the relationship between $x$ (the independent variable) and $y$ (the dependent variable) as:

$$ y = w_1 x + w_0 $$

Where:

- $w_1$ is the slope (weight)

- $w_0$ is the intercept (bias)



**Steps:**

- Initialize the weights $w_1$ and $w_0$ randomly.

- Predict the output $\hat{y}$ for each input $x$.

- Compute the **Mean Squared Error (MSE)** to assess prediction error.

- Update the weights using **Gradient Descent**.

- Repeat until the error is minimized.

""")
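The training steps above can be sketched as a short NumPy loop; this is a minimal illustration (the synthetic data, learning rate, and iteration count are assumed values):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 3.0 * x + 0.5  # assumed ground-truth line

w1, w0 = rng.normal(), rng.normal()  # step 1: random initialization
alpha = 0.5                          # learning rate (assumed)

for _ in range(2000):
    y_hat = w1 * x + w0               # step 2: predict for each input
    error = y_hat - y
    mse = np.mean(error ** 2)         # step 3: MSE loss
    grad_w1 = 2 * np.mean(error * x)  # dL/dw1
    grad_w0 = 2 * np.mean(error)      # dL/dw0
    w1 -= alpha * grad_w1             # step 4: gradient descent update
    w0 -= alpha * grad_w0             # step 5: repeat until minimized

print(round(w1, 2), round(w0, 2))  # approaches the true slope and intercept
```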
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Testing Phase</h2>", unsafe_allow_html=True) |
|
st.write(r"""

Once trained:

1. Feed the model a new input $x$.

2. Use the model to predict $\hat{y}$.

3. Compare the predicted value with the actual value to evaluate accuracy.

""")
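The testing phase amounts to one forward pass with the learned weights; a tiny sketch (the weights and values below are assumed for illustration):

```python
w1, w0 = 3.0, 0.5        # assumed weights learned during training
x_new = 2.0              # new, unseen input
y_actual = 6.4           # assumed true value for comparison

y_pred = w1 * x_new + w0           # model prediction
abs_error = abs(y_pred - y_actual) # simple accuracy check
print(y_pred, abs_error)
```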
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Multiple Linear Regression</h2>", unsafe_allow_html=True) |
|
st.write(r"""

Multiple Linear Regression (MLR) extends simple linear regression by using multiple features to predict the output.

- Initialize all weights $w_1, w_2, \dots, w_n$ and the bias $w_0$.

- Predict $\hat{y}$ for each data point.

- Calculate the loss using **MSE**.

- Update the weights via **Gradient Descent** to improve model accuracy.

""")
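The MLR loop above vectorizes naturally with NumPy; a minimal sketch (synthetic data and hyperparameters are assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))    # 200 samples, 3 features
true_w = np.array([1.5, -2.0, 0.7])
y = X @ true_w + 0.25            # assumed ground truth with bias 0.25

w = np.zeros(3)  # weights w_1 .. w_n
b = 0.0          # bias w_0
alpha = 0.1      # learning rate (assumed)

for _ in range(1000):
    y_hat = X @ w + b
    error = y_hat - y
    w -= alpha * (2 / len(y)) * (X.T @ error)  # gradient for each weight
    b -= alpha * 2 * error.mean()              # gradient for the bias

print(np.round(w, 2), round(b, 2))
```

The same update rule applies to every weight; stacking the features into a matrix just lets NumPy compute all the gradients in one expression.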
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Gradient Descent: Optimization Technique</h2>", unsafe_allow_html=True) |
|
st.write(r"""

Gradient Descent iteratively minimizes the loss function:

- Begin with random weights and bias.

- Compute the gradient (derivative) of the loss function.

- Update the weights using the gradient:

$$ w = w - \alpha \frac{dL}{dw} $$

- Repeat until convergence (minimum loss is achieved).

- Common starting learning rates are **0.1** and **0.01**; very large values can overshoot the minimum and diverge.

""")
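To see the update rule and the learning-rate warning in isolation, here is a sketch minimizing the one-dimensional loss L(w) = (w - 4)^2 (the target value 4 and the two learning rates are assumed for illustration):

```python
def gradient_descent(alpha, steps=100):
    """Minimize L(w) = (w - 4)**2 starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 4)    # dL/dw
        w = w - alpha * grad  # the update rule from the text
    return w

w_small = gradient_descent(alpha=0.1)  # converges to the minimum at w = 4
w_large = gradient_descent(alpha=1.1)  # too large: each step overshoots further
print(round(w_small, 3), abs(w_large - 4.0) > 1e6)
```

With a small step size the distance to the minimum shrinks every iteration; with a step size past the stability threshold it grows, which is why very large learning rates diverge.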
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Core Assumptions in Linear Regression</h2>", unsafe_allow_html=True) |
|
st.write(""" |
|
1. **Linearity**: Relationship between input and output is linear. |
|
2. **No Multicollinearity**: Features should not be highly correlated. |
|
3. **Homoscedasticity**: Errors should have constant variance. |
|
4. **Normality of Errors**: Residuals are normally distributed. |
|
5. **No Autocorrelation**: Residuals are independent of each other. |
|
""") |
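As one quick diagnostic for the assumptions above, the multicollinearity check can be sketched with a correlation matrix (the synthetic features here are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=500)
x2 = 0.98 * x1 + rng.normal(scale=0.1, size=500)  # nearly a copy of x1
x3 = rng.normal(size=500)                          # independent feature

corr = np.corrcoef(np.stack([x1, x2, x3]))
print(np.round(corr, 2))
# Off-diagonal entries with |corr| near 1 flag multicollinearity (x1 vs x2 here)
```

In practice a variance inflation factor (VIF) gives a more complete picture, but a correlation matrix catches the common pairwise case.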
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Model Evaluation Metrics</h2>", unsafe_allow_html=True) |
|
st.write(""" |
|
- **Mean Squared Error (MSE)**: Average squared error between predicted and actual values. |
|
- **Mean Absolute Error (MAE)**: Average absolute difference. |
|
- **R-squared (R²)**: Proportion of variance explained by the model. |
|
""") |
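All three metrics can be computed directly with NumPy; a minimal sketch (the sample values below are made up for illustration):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])

mse = np.mean((y_true - y_pred) ** 2)              # Mean Squared Error
mae = np.mean(np.abs(y_true - y_pred))             # Mean Absolute Error
ss_res = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
r2 = 1 - ss_res / ss_tot                           # proportion of variance explained

print(mse, mae, r2)  # 0.125 0.25 0.975
```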
|
|
|
|
|
st.markdown("<h2 style='color: #002244;'>Practice Notebook: Linear Regression Implementation</h2>", unsafe_allow_html=True) |
|
st.markdown("<a href='https://colab.research.google.com/drive/11-Rv7BC2PhOqk5hnpdXo6QjqLLYLDvTD?usp=sharing' target='_blank' style='font-size: 16px; color: #002244;'>Click here to open the Jupyter Notebook</a>", unsafe_allow_html=True) |
|
|
|
|
|
st.write("Linear Regression remains a simple yet powerful tool. Understanding how it works under the hood (optimization, assumptions, and evaluation) helps in building better models.")
|
|