# Clustering Algorithms for Customer Segmentation
This repository implements four clustering algorithms for customer segmentation on a synthetic dataset. The project applies K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models (GMM) to identify distinct customer groups based on age and income.
## Project Structure
- `implementation.ipynb`: The main Jupyter notebook containing the entire analysis, from data generation to model evaluation and visualization.
- `data/`: Contains the synthetic `customer_data.csv` file.
- `models/`: Stores the trained clustering models and the data scaler.
- `results/`: Includes the algorithm comparison, detailed analysis, and experiment summary.
- `visualizations/`: Contains the output plots, such as the elbow method analysis and cluster comparisons.
## Features
- **Data Generation**: A synthetic customer dataset is generated with clear cluster structures for effective model training and evaluation.
- **Multiple Algorithms**: Implements and compares four popular clustering algorithms:
  - K-Means
  - Hierarchical Clustering
  - DBSCAN
  - Gaussian Mixture Models (GMM)
- **Model Evaluation**: Uses the elbow method and silhouette scores to determine the optimal number of clusters and evaluate performance.
- **Comprehensive Visualization**: Generates plots to visualize the clusters, compare algorithm performance, and analyze the optimal 'k'.
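
The workflow described in the features above can be sketched as follows. This is a minimal illustration rather than the notebook's actual code: it assumes scikit-learn and NumPy are available, and the cluster centers, sample sizes, `eps`, `min_samples`, and the helper `safe_silhouette` are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: three customer groups in (age, income) space.
rng = np.random.default_rng(42)
centers = [(25, 30_000), (45, 70_000), (65, 45_000)]
X = np.vstack(
    [rng.normal(loc=c, scale=(3, 4_000), size=(100, 2)) for c in centers]
)

# Age and income live on very different scales, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record K-Means inertia for a range of k and look for the bend.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled).inertia_
    for k in range(1, 7)
}

# Fit the four algorithms (k=3 and the DBSCAN settings are assumed values).
labels = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_scaled),
    "Hierarchical": AgglomerativeClustering(n_clusters=3).fit_predict(X_scaled),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5).fit_predict(X_scaled),
    "GMM": GaussianMixture(n_components=3, random_state=42).fit_predict(X_scaled),
}


def safe_silhouette(data, y):
    """Silhouette score that ignores DBSCAN noise points (label -1)."""
    mask = y != -1
    if len(set(y[mask])) < 2:
        return float("nan")  # silhouette is undefined for fewer than 2 clusters
    return silhouette_score(data[mask], y[mask])


scores = {name: safe_silhouette(X_scaled, y) for name, y in labels.items()}

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
for name, score in scores.items():
    print(f"{name}: silhouette={score:.3f}")
```

Silhouette scores range from -1 to 1, with higher values indicating tighter, better-separated clusters. DBSCAN marks outliers with the label `-1`, which is why the helper filters them out before scoring.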
## How to Use
1. **Clone the repository and enter its directory:**
   ```bash
   git clone https://github.com/GruheshKurra/ClusteringAlgorithms.git
   cd ClusteringAlgorithms
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Run the notebook:**
   Open `implementation.ipynb` in a Jupyter environment and run all cells to reproduce the full analysis.
## License
This project is licensed under the MIT License.