Clustering Algorithms for Customer Segmentation

This repository implements four clustering algorithms for customer segmentation on a synthetic dataset: K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models (GMM). Each algorithm is used to identify distinct customer groups based on age and income.

Project Structure

implementation.ipynb: The main Jupyter notebook containing the entire analysis, from data generation to model evaluation and visualization.
data/: Contains the synthetic customer_data.csv file.
models/: Stores the trained clustering models and the data scaler.
results/: Includes the algorithm comparison, detailed analysis, and experiment summary.
visualizations/: Contains the output plots, such as the elbow method analysis and cluster comparisons.
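
As an illustration of how a saved scaler and model might be reused, here is a minimal sketch. It assumes the artifacts are serialized with joblib; the file names `scaler.joblib` and `kmeans.joblib`, the placeholder data, and the temporary directory are hypothetical and not taken from this repository.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Placeholder two-feature data standing in for (age, income).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Fit the scaler first, then the model on the scaled features.
scaler = StandardScaler().fit(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaler.transform(X))

# Hypothetical output location; a temporary directory keeps the sketch self-contained.
models_dir = Path(tempfile.mkdtemp()) / "models"
models_dir.mkdir(parents=True, exist_ok=True)
joblib.dump(scaler, models_dir / "scaler.joblib")
joblib.dump(kmeans, models_dir / "kmeans.joblib")

# Reload both artifacts and assign new points to clusters.
loaded_scaler = joblib.load(models_dir / "scaler.joblib")
loaded_kmeans = joblib.load(models_dir / "kmeans.joblib")

new_customers = np.array([[0.5, -1.0], [2.0, 2.0]])
segments = loaded_kmeans.predict(loaded_scaler.transform(new_customers))
print(segments)
```

The scaler has to be saved alongside the model because new data must be transformed with the same statistics used during fitting. Note that scikit-learn's `DBSCAN` and `AgglomerativeClustering` have no `predict` method, so a reloaded instance of either cannot label unseen points this way.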

Features

Data Generation: Generates a synthetic customer dataset with clearly separated clusters, which makes it easier to judge how well each algorithm recovers the groups.
Multiple Algorithms: Implements and compares four popular clustering algorithms:
- K-Means
- Hierarchical Clustering
- DBSCAN
- Gaussian Mixture Models (GMM)
Model Evaluation: Uses the elbow method and silhouette scores to determine the optimal number of clusters and evaluate performance.
Comprehensive Visualization: Generates plots to visualize the clusters, compare algorithm performance, and analyze the optimal 'k'.
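
The workflow described by the features above can be sketched with scikit-learn as follows. This is a rough outline under stated assumptions, not the notebook's actual code: the segment centres, spreads, sample counts, `k = 4`, and the DBSCAN settings (`eps`, `min_samples`) are all hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical segment centres as (age, income); not the repository's values.
centres = np.array([[25, 30_000], [40, 60_000], [55, 90_000], [35, 120_000]])
spread = np.array([3, 4_000])  # assumed per-feature standard deviation
X = np.vstack([rng.normal(c, spread, size=(75, 2)) for c in centres])

# Age and income live on very different scales, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record K-Means inertia for a range of k and look for the bend.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled).inertia_
    for k in range(1, 9)
}

k = 4  # assumed number of segments for this sketch
labels = {
    "K-Means": KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled),
    "Hierarchical": AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X_scaled),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5).fit_predict(X_scaled),
    "GMM": GaussianMixture(n_components=k, random_state=42).fit_predict(X_scaled),
}


def score(lbls):
    """Silhouette score over non-noise points, or None if fewer than 2 clusters."""
    mask = lbls != -1  # DBSCAN marks noise points with the label -1
    if len(set(lbls[mask])) < 2:
        return None
    return silhouette_score(X_scaled[mask], lbls[mask])


for name, lbls in labels.items():
    s = score(lbls)
    print(f"{name}: silhouette = {'n/a' if s is None else round(s, 3)}")
```

Excluding DBSCAN's noise points before scoring is one possible convention; it can make DBSCAN look better than a score computed over all points, so state which convention is in use when comparing algorithms.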

How to Use

Clone the repository:

```
git clone https://github.com/GruheshKurra/ClusteringAlgorithms.git
```

Install dependencies:
```
pip install -r requirements.txt
```
Run the notebook: Open and run the implementation.ipynb notebook in a Jupyter environment to see the full analysis.

License

This project is licensed under the MIT License.