---
title: Cyberattack Detection ML Approach
emoji: 🛡️
colorFrom: red
colorTo: pink
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
license: mit
short_description: UNSW Dataset in an ML-based analysis
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Cyber Attack Detection ML Approach

This Streamlit app provides an interactive analysis of the UNSW-NB15 dataset, a popular benchmark for evaluating network intrusion detection systems. The app uses machine learning classifiers to label network traffic as either normal or as one of several attack types.

## About the UNSW-NB15 Dataset

The UNSW-NB15 dataset was created at the University of New South Wales, Canberra. It is built from raw network traffic captured with tcpdump, along with flow-level features derived from those captures. The dataset includes a variety of modern attack types, making it a valuable resource for training and testing intrusion detection systems.

Key features of the dataset include:

- **Diverse Attack Types:** Covers a wide range of attacks such as Fuzzers, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms.
- **Realistic Network Traffic:** Generated in a network environment designed to simulate real-world scenarios.
- **Labeled Data:** Each network flow is labeled with its corresponding attack type or as normal traffic, enabling supervised learning.

## App Purpose

This app aims to:

1. **Visualize and Explore the Data:** Provide an interface to view the dataset's structure, data types, and descriptive statistics, so users can understand the characteristics of the UNSW-NB15 dataset.
2. **Train and Evaluate Machine Learning Models:** Implement and compare the performance of several machine learning classifiers, specifically:
   - Naive Bayes
   - Decision Tree
   - K-Nearest Neighbors
3. **Analyze Model Performance:** Present confusion matrices and classification reports to evaluate how well each model detects the different attack types. This helps users understand the strengths and weaknesses of each algorithm.
4. **Facilitate Learning:** Serve as an educational tool for learning about network intrusion detection, machine learning classification, and dataset analysis.

## Installation

To run this app, you need Python installed along with the following libraries:

- streamlit
- datasets
- pandas
- huggingface_hub
- scikit-learn
- seaborn
- matplotlib
- numpy
- Pillow

You can install the required libraries using pip:

```bash
pip install streamlit datasets pandas huggingface_hub scikit-learn seaborn matplotlib numpy Pillow
```

## Usage

1. Ensure you have set the `HF_TOKEN` environment variable to your Hugging Face token.
2. Run the Streamlit app:

```bash
streamlit run app.py
```

## Features

- **Dataset Information:** View the dataset's structure, data types, and descriptive statistics.
- **Naive Bayes Classifier:** Train and evaluate a Naive Bayes model.
- **Decision Tree Classifier:** Train and evaluate a Decision Tree model.
- **K-Nearest Neighbor Classifier:** Train and evaluate a K-Nearest Neighbor model.
- **Confusion Matrix and Classification Report:** Visualize the performance of each model.

## Screenshots

[Image: Cybersecurity]

## License

This project is licensed under the MIT License.

## Acknowledgements

- The UNSW-NB15 dataset creators at the University of New South Wales, Canberra.
- The Hugging Face team for providing the datasets and tools.
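
As a rough illustration of the train-and-evaluate workflow described under App Purpose and Features, the sketch below fits the three classifiers and prints a confusion matrix and classification report for each. It is not the code in `app.py`: the data comes from scikit-learn's `make_classification` as a synthetic stand-in for UNSW-NB15, and the split ratio, `n_neighbors=5`, and the `results` dictionary are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the UNSW-NB15 features and labels (assumption:
# the real app loads the dataset from the Hugging Face Hub instead).
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=8,
    n_classes=3,
    random_state=42,
)

# Hold out 30% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# The three classifier families named in this README; hyperparameters are
# illustrative defaults, not the app's settings.
models = {
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred, zero_division=0),
    }
    print(name)
    print(results[name]["confusion_matrix"])
    print(results[name]["report"])
```

With the real dataset, categorical columns would need encoding before being passed to these estimators, and K-Nearest Neighbors generally benefits from feature scaling.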