---
title: Cyberattack Detection ML Approach
emoji: 🛡️
colorFrom: red
colorTo: pink
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
license: mit
short_description: UNSW Dataset in an ML-based analysis
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Cyberattack Detection ML Approach
This Streamlit app provides an interactive analysis of the UNSW-NB15 dataset, a popular benchmark for evaluating network intrusion detection systems. The app leverages machine learning techniques to classify network traffic as either normal or indicative of various attack types.
## About the UNSW-NB15 Dataset
The UNSW-NB15 dataset was created in the Cyber Range Lab of UNSW Canberra (University of New South Wales). Raw network traffic was captured with tcpdump and then processed into labeled flow records. The dataset includes a variety of modern attack types, making it a valuable resource for training and testing intrusion detection systems. Key features of the dataset include:
- **Diverse Attack Types:** Covers nine attack categories: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms.
- **Realistic Network Traffic:** Combines real, modern normal activity with synthetic contemporary attack behavior generated in a controlled testbed.
- **Labeled Data:** Each network flow is labeled with its corresponding attack type or as normal traffic, enabling supervised learning.
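To make the labeling point concrete, here is a minimal sketch of how labeled flows can be turned into supervised-learning targets. The rows are invented for illustration. `attack_cat` is the category column name in the published training and testing CSVs, but the copy this app loads may use a different name, so treat it as an assumption.

```python
import pandas as pd

# Toy rows standing in for UNSW-NB15 flows; the values are invented.
flows = pd.DataFrame(
    {
        "dur": [0.12, 0.00, 1.53, 0.45],
        "sbytes": [496, 200, 1762, 534],
        "dbytes": [0, 0, 2164, 354],
        # Assumed label column name; adjust to match the loaded dataset.
        "attack_cat": ["Normal", "Generic", "Exploits", "Normal"],
    }
)

# Multi-class target: the attack category itself.
y_multiclass = flows["attack_cat"]

# Binary target: 1 for any attack, 0 for normal traffic.
y_binary = (flows["attack_cat"] != "Normal").astype(int)

# Features: every column except the label.
X = flows.drop(columns=["attack_cat"])

print(y_binary.tolist())
```

The same frame supports both framings: binary detection (attack vs. normal) and multi-class classification by attack category.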
## App Purpose
This app aims to:
1. **Visualize and Explore the Data:** Provide an interface to view the dataset's structure, data types, and descriptive statistics. This allows users to understand the characteristics of the UNSW-NB15 dataset.
2. **Train and Evaluate Machine Learning Models:** Implement and compare the performance of several machine learning classifiers, specifically:
   - Naive Bayes
   - Decision Tree
   - K-Nearest Neighbors
3. **Analyze Model Performance:** Present confusion matrices and classification reports to evaluate the effectiveness of each model in detecting different attack types. This helps users understand the strengths and weaknesses of each algorithm.
4. **Facilitate Learning:** Serve as an educational tool for learning about network intrusion detection, machine learning classification, and dataset analysis.
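The train-and-evaluate workflow described above can be sketched with scikit-learn as follows. This is not the code in `app.py`: it uses synthetic data from `make_classification` instead of UNSW-NB15, and the Gaussian variant of Naive Bayes, the split ratio, and `n_neighbors=5` are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real flows: 3 classes, 10 numeric features.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three classifier families named in this README (hyperparameters assumed).
models = {
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred),
    }
    print(f"== {name} ==")
    print(results[name]["confusion_matrix"])
    print(results[name]["report"])
```

Using one shared train/test split for all three models keeps the confusion matrices directly comparable.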
## Installation
To run this app, you need Python and the following libraries:
- streamlit
- datasets
- pandas
- huggingface_hub
- scikit-learn
- seaborn
- matplotlib
- numpy
- Pillow
You can install the required libraries using pip:
```bash
pip install streamlit datasets pandas huggingface_hub scikit-learn seaborn matplotlib numpy Pillow
```
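After installing, a quick check like the following can confirm that every library is importable. It is only a sketch: the helper name `missing_packages` is hypothetical. Note that two pip names differ from their import names (`scikit-learn` is imported as `sklearn`, and `Pillow` as `PIL`).

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements.
REQUIRED = {
    "streamlit": "streamlit",
    "datasets": "datasets",
    "pandas": "pandas",
    "huggingface_hub": "huggingface_hub",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
print("Missing:", ", ".join(missing) if missing else "none")
```

Any names it prints can be passed straight to `pip install`.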
## Usage
1. Set the `HF_TOKEN` environment variable to your Hugging Face access token.
2. Run the Streamlit app:
```bash
streamlit run app.py
```
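This README does not show how `app.py` reads the token, so the following is only a sketch of one defensive way to do it. The helper name `get_hf_token` and the error message are hypothetical. The idea is to fail early with a clear message instead of hitting an authentication error later.

```python
import os


def get_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it before running `streamlit run app.py`."
        )
    return token


# Demonstration with a stand-in mapping so the sketch runs without a real token.
demo_env = {"HF_TOKEN": "hf_placeholder_not_a_real_token"}
print(get_hf_token(demo_env).startswith("hf_"))
```

Called with no argument, the helper reads the real process environment.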
## Features
- **Dataset Information:** View the dataset's structure, data types, and descriptive statistics.
- **Naive Bayes Classifier:** Train and evaluate a Naive Bayes model.
- **Decision Tree Classifier:** Train and evaluate a Decision Tree model.
- **K-Nearest Neighbors Classifier:** Train and evaluate a K-Nearest Neighbors model.
- **Confusion Matrix and Classification Report:** Visualize the performance of each model.
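As an illustration of the confusion matrix feature, the sketch below draws a labeled heatmap with seaborn. The true and predicted labels are invented, only three of the dataset's categories are used, and the styling is an assumption rather than the app's actual plot. The `Agg` backend is selected so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Invented labels and predictions for a handful of flows.
labels = ["Normal", "DoS", "Exploits"]
y_true = ["Normal", "Normal", "DoS", "Exploits", "Exploits", "DoS", "Normal"]
y_pred = ["Normal", "DoS", "DoS", "Exploits", "Normal", "DoS", "Normal"]

# Rows are actual classes, columns are predicted classes, in `labels` order.
cm = confusion_matrix(y_true, y_pred, labels=labels)

fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(
    cm, annot=True, fmt="d", cmap="Blues",
    xticklabels=labels, yticklabels=labels, ax=ax,
)
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
fig.tight_layout()
```

In a Streamlit app, a figure like this would typically be rendered with `st.pyplot(fig)`.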
## Screenshots
[Screenshot: Cybersecurity]
## License
This project is licensed under the MIT License.
## Acknowledgements
- The UNSW-NB15 dataset creators at the University of New South Wales, Canberra.
- The Hugging Face team for providing the datasets and tools.