Sathwikchowdary committed on
Commit ea76204 · verified · 1 Parent(s): 115cca1

Delete pages/15KNN Alogrithm.py

Files changed (1)
  1. pages/15KNN Alogrithm.py +0 -137
pages/15KNN Alogrithm.py DELETED
@@ -1,137 +0,0 @@
- import streamlit as st
-
- # Page configuration
- st.set_page_config(page_title="KNN Overview", page_icon="📊", layout="wide")
-
- # Custom CSS styling for a cleaner, light-colored interface
- st.markdown("""
- <style>
- .stApp {
- background-color: #f2f6fa;
- }
- h1, h2, h3 {
- color: #1a237e;
- }
- .custom-font, p {
- font-family: 'Arial', sans-serif;
- font-size: 18px;
- color: #212121;
- line-height: 1.6;
- }
- </style>
- """, unsafe_allow_html=True)
-
- # Title
- st.markdown("<h1 style='color: #1a237e;'>Understanding K-Nearest Neighbors (KNN)</h1>", unsafe_allow_html=True)
-
- # Introduction to KNN
- st.write("""
- K-Nearest Neighbors (KNN) is a fundamental machine learning method suitable for both **classification** and **regression** problems. It makes predictions by analyzing the `k` closest data points in the training set.
-
- Key features:
- - KNN is a non-parametric model.
- - It memorizes the training data instead of learning model parameters (lazy learning).
- - Distance metrics such as **Euclidean distance** quantify the similarity between data points.
- """)
-
- # How KNN Works
- st.markdown("<h2 style='color: #1a237e;'>How KNN Functions</h2>", unsafe_allow_html=True)
-
- st.subheader("Training Phase")
- st.write("""
- - KNN doesn't train a model in the traditional sense.
- - It stores the dataset and uses it during prediction.
- """)
-
- st.subheader("Prediction - Classification")
- st.write("""
- 1. Set the value of `k`.
- 2. Calculate the distance between the input and each point in the training data.
- 3. Identify the `k` nearest neighbors.
- 4. Use majority voting to assign the class label.
- """)
-
- st.subheader("Prediction - Regression")
- st.write("""
- 1. Choose `k`.
- 2. Find the distances to all training points.
- 3. Pick the closest `k` neighbors.
- 4. Predict using the **average** or **weighted average** of their values.
- """)
-
- # Overfitting and Underfitting
- st.subheader("Model Behavior")
- st.write("""
- - **Overfitting**: Occurs when the model fits noise in the training data, typically with very low values of `k` (e.g. `k = 1`).
- - **Underfitting**: Happens when the model oversimplifies, often with very high `k` values.
- - **Optimal Fit**: Found by balancing the two, often using cross-validation.
- """)
-
- # Training vs CV Error
- st.subheader("Error Analysis")
- st.write("""
- - **Training Error**: Error on the dataset used for fitting.
- - **Cross-Validation Error**: Error on held-out validation data.
- - A well-fitted model shows low error on both; a large gap between them signals overfitting.
- """)
-
- # Hyperparameter Tuning
- st.subheader("Hyperparameter Choices")
- st.write("""
- Important tuning options for KNN include:
- - `k`: Number of neighbors
- - `weights`: `uniform` or `distance`
- - `metric`: Distance formula like Euclidean or Manhattan
- - `n_jobs`: Parallel processing support
- """)
-
- # Scaling
- st.subheader("Why Scaling is Crucial")
- st.write("""
- KNN relies heavily on distances, so features must be on comparable scales. Use:
- - **Min-Max Normalization** to compress values between 0 and 1.
- - **Z-score Standardization** to center data at zero mean and unit variance.
-
- Fit the scaler on the training data only, then apply that same fitted transformation to the test data, so both splits use identical statistics.
- """)
-
- # Weighted KNN
- st.subheader("Weighted KNN")
- st.write("""
- In Weighted KNN, closer neighbors have more influence on the result, typically via inverse-distance weighting. This often improves accuracy, especially on noisy or unevenly distributed data.
- """)
-
- # Decision Regions
- st.subheader("Decision Boundaries")
- st.write("""
- KNN creates boundaries based on training data:
- - Small `k` = complex, sensitive regions (risk of overfitting).
- - Large `k` = smoother regions (risk of underfitting).
- """)
-
- # Cross Validation
- st.subheader("Cross-Validation")
- st.write("""
- Cross-validation estimates how well a model generalizes. For example:
- - **K-Fold CV** splits the data into `k` folds, trains on `k - 1` of them, and validates on the remaining fold, rotating until every fold has been used for validation.
- - The fold scores are averaged to give a more reliable performance estimate.
- """)
-
- # Hyperparameter Optimization Techniques
- st.subheader("Tuning Methods")
- st.write("""
- - **Grid Search**: Exhaustively tests all combinations of the specified parameter values.
- - **Random Search**: Samples random combinations, which is faster when the search space is large.
- - **Bayesian Optimization**: Uses results from previous trials to choose more promising parameter settings.
- """)
-
- # Notebook Link
- st.markdown("<h2 style='color: #1a237e;'>KNN Implementation Notebook</h2>", unsafe_allow_html=True)
- st.markdown(
- "<a href='https://colab.research.google.com/drive/11wk6wt7sZImXhTqzYrre3ic4oj3KFC4M?usp=sharing' target='_blank' style='font-size: 16px; color: #1a237e;'>Click here to open the Colab notebook</a>",
- unsafe_allow_html=True
- )
-
- st.write("""
- KNN is intuitive and effective when combined with proper preprocessing and hyperparameter tuning. Use cross-validation to find the sweet spot and avoid overfitting or underfitting.
- """)