---
title: Cyberattack Detection ML Approach
emoji: 🛡️
colorFrom: red
colorTo: pink
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
license: mit
short_description: UNSW Dataset in an ML-based analysis
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Cyber Attack Detection ML Approach

This Streamlit app provides an interactive analysis of the UNSW-NB15 dataset, a widely used benchmark for evaluating network intrusion detection systems. It trains machine learning classifiers to label network traffic as normal or as one of several attack types.

## About the UNSW-NB15 Dataset

The UNSW-NB15 dataset was created by the Cyber Range Lab at the University of New South Wales, Canberra. The raw network traffic was captured with tcpdump, and flow-level features were then extracted from those captures. The dataset includes a variety of modern attack types, making it a valuable resource for training and testing intrusion detection systems. Key features of the dataset include:

- **Diverse Attack Types:** Covers nine attack categories: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms.
- **Realistic Network Traffic:** Generated in a lab testbed that combines real normal activity with synthetic attack behaviour.
- **Labeled Data:** Each network flow is labeled with its corresponding attack type or as normal traffic, enabling supervised learning.
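
To make the labeling concrete, here is a minimal sketch on a few made-up rows. The column names `attack_cat` and `label` are assumptions about how the labels are stored and may differ in the copy of the dataset this app loads. The rows are toy values, not real UNSW-NB15 records.

```python
import pandas as pd

# Toy rows standing in for network flows (values are invented).
# `attack_cat` is an assumed column name for the attack category.
flows = pd.DataFrame(
    {
        "dur": [0.12, 0.00, 1.53, 0.04, 0.31],
        "proto": ["tcp", "udp", "tcp", "udp", "tcp"],
        "attack_cat": ["Normal", "Generic", "Exploits", "Generic", "Normal"],
    }
)

# Derive a binary target: 0 for normal traffic, 1 for any attack category.
flows["label"] = (flows["attack_cat"] != "Normal").astype(int)

# Count rows per binary label and per attack category.
binary_counts = flows["label"].value_counts().to_dict()
category_counts = flows["attack_cat"].value_counts().to_dict()
print(binary_counts)
print(category_counts)
```

The same rows can therefore support either a binary task (normal vs. attack) or a multi-class task (one class per category).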

## App Purpose

This app aims to:

1. **Visualize and Explore the Data:** Provide an interface to view the dataset's structure, data types, and descriptive statistics. This allows users to understand the characteristics of the UNSW-NB15 dataset.
2. **Train and Evaluate Machine Learning Models:** Implement and compare the performance of several machine learning classifiers, specifically:
   - Naive Bayes
   - Decision Tree
   - K-Nearest Neighbors
3. **Analyze Model Performance:** Present confusion matrices and classification reports to evaluate the effectiveness of each model in detecting different attack types. This helps users understand the strengths and weaknesses of each algorithm.
4. **Facilitate Learning:** Serve as an educational tool for learning about network intrusion detection, machine learning classification, and dataset analysis.
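
Steps 2 and 3 above can be sketched roughly as follows. This is not the code in `app.py`: it uses synthetic data from `make_classification` instead of UNSW-NB15, and `GaussianNB` is an assumption about which Naive Bayes variant is meant. The split ratio and `n_neighbors=5` are also illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for preprocessed flow features.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three classifier families named above (GaussianNB is an assumption).
models = {
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred, zero_division=0),
    }
    print(name)
    print(results[name]["confusion_matrix"])
    print(results[name]["report"])
```

Evaluating every model on the same held-out split keeps the confusion matrices directly comparable.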

## Installation

To run this app, you need Python and the following libraries:

- streamlit
- datasets
- pandas
- huggingface_hub
- scikit-learn
- seaborn
- matplotlib
- numpy
- Pillow

You can install the required libraries using pip:

```bash
pip install streamlit datasets pandas huggingface_hub scikit-learn seaborn matplotlib numpy Pillow
```
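
After installing, a quick check like the sketch below can confirm that each library is importable. Note that two packages are imported under a different name than the one used with pip: `scikit-learn` is imported as `sklearn` and `Pillow` as `PIL`. The helper name `missing_packages` is hypothetical and not part of this app.

```python
from importlib.util import find_spec

# Map each pip package name to the top-level module it installs.
REQUIRED = {
    "streamlit": "streamlit",
    "datasets": "datasets",
    "pandas": "pandas",
    "huggingface_hub": "huggingface_hub",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found (hypothetical helper)."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


# An empty list means every required library was found.
print(missing_packages())
```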

## Usage

1. Set the `HF_TOKEN` environment variable to your Hugging Face access token.
2. Run the Streamlit app:

```bash
streamlit run app.py
```
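
How `app.py` reads the token is not shown here. As a rough sketch, a start-up check could read `HF_TOKEN` from the environment and fail early with a clear message when it is missing. The function name `get_hf_token` is hypothetical, and the token value in the demo call is a placeholder.

```python
import os


def get_hf_token(env=os.environ):
    """Return the HF_TOKEN value, or raise if it is unset or blank.

    Hypothetical helper; `env` is any mapping, defaulting to the process
    environment, so the check can be exercised without a real token.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export it before running `streamlit run app.py`."
        )
    return token


# Demo with a placeholder mapping instead of the real environment.
print(get_hf_token({"HF_TOKEN": "hf_placeholder"}))
```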

## Features

- **Dataset Information:** View the dataset's structure, data types, and descriptive statistics.
- **Naive Bayes Classifier:** Train and evaluate a Naive Bayes model.
- **Decision Tree Classifier:** Train and evaluate a Decision Tree model.
- **K-Nearest Neighbor Classifier:** Train and evaluate a K-Nearest Neighbor model.
- **Confusion Matrix and Classification Report:** Visualize the performance of each model.
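
As an illustration of the Dataset Information feature, the sketch below builds the kind of structure and statistics tables such a view might show. It uses pandas on a toy frame. The column names and values are made up for the example, and the app's actual implementation may differ.

```python
import pandas as pd

# Toy frame with one missing value (invented data, illustrative columns).
df = pd.DataFrame(
    {
        "dur": [0.5, 1.5, None],
        "proto": ["tcp", "udp", "tcp"],
        "sbytes": [100, 200, 300],
    }
)

# Structure: one row per column with its dtype and non-null count.
structure = pd.DataFrame(
    {"dtype": df.dtypes.astype(str), "non_null": df.notna().sum()}
)

# Descriptive statistics; for mixed frames, describe() covers numeric
# columns only by default.
stats = df.describe()

print(structure)
print(stats)
```

In a Streamlit app, tables like these could be rendered with `st.dataframe`.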

## Screenshots

*Screenshot: Cybersecurity*

## License

This project is licensed under the MIT License.

## Acknowledgements

- The UNSW-NB15 dataset creators at the University of New South Wales, Canberra.
- The Hugging Face team for providing the datasets and tools.