arkodeep/spam-classfication-model

spam classification

text classification

arkodeep committed on Dec 30, 2024

Commit 94ce757 (verified) · 1 Parent(s): 3333ce1

Update README.md

Files changed (1)

README.md +69 -3

@@ -1,3 +1,69 @@
----
-license: wtfpl
----

+---
+license: wtfpl
+datasets:
+- arkodeep/spam-data
+language:
+- en
+tags:
+- spam
+- spam classification
+- text
+- spam detection
+- text classification
+---
+# Spam Detection System
+## Lite Model
+### Introduction
+The Lite model is a streamlined pipeline with optimized parameters and enhanced feature extraction, designed for fast, efficient spam detection.
+### Features
+- **Text Preprocessing**: Lemmatization, removal of stop words and punctuation.
+- **Feature Extraction**: Text length, word count, unique word count, uppercase count, special character count.
+- **Model Creation**: Ensemble model using SVC, MultinomialNB, and ExtraTreesClassifier.
+- **Visualization**: Generates graphs for dataset insights, word clouds, and performance metrics.
+- **Metrics Saving**: Accuracy, precision, and F1 score.
+### How to Run
+1. **Train the Model**:
+    ```bash
+    python training/train_model_lite.py
+    ```
+2. **Use the Model**:
+    ```python
+    import joblib
+    # Load the saved model and vectorizer
+    model = joblib.load('models/model.pkl')
+    vectorizer = joblib.load('models/vectorizer.pkl')
+    ```
+## Legacy Model
+### Introduction
+The Legacy model keeps the original, unoptimized model logic for spam detection, with an updated code structure and added visualizations.
+### Features
+- **Text Preprocessing**: Porter Stemming, removal of stop words and punctuation.
+- **Model Creation**: Ensemble model using SVC, MultinomialNB, and ExtraTreesClassifier with original parameters.
+- **Visualization**: Generates graphs for dataset insights, word clouds, and performance metrics.
+- **Metrics Saving**: Accuracy and precision.
+### How to Run
+1. **Train the Model**:
+    ```bash
+    python training/train_model_legacy.py
+    ```
+2. **Use the Model**:
+    ```python
+    import joblib
+    # Load the saved model and vectorizer (same paths as the Lite model)
+    model = joblib.load('models/model.pkl')
+    vectorizer = joblib.load('models/vectorizer.pkl')
+    ```
+## Additional Information
+- **Dependencies**: Python 3.6 or higher, pip, and required packages listed in `requirements.txt`.
+- **Dataset**: The dataset used for training is `spam.csv`.
+- **Contact and Support**: For questions or support, please contact the project maintainers.
+
+For more details, see the [README.md](https://github.com/arkodeepsen/spam-filter-mbo/blob/4894a939099e5523f22bf3c2e5b3d763c92a73c6/README.md) and [models.md](https://github.com/arkodeepsen/spam-filter-mbo/blob/4894a939099e5523f22bf3c2e5b3d763c92a73c6/models.md).
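
Note on the Lite model's **Feature Extraction** bullet: the README lists five features (text length, word count, unique word count, uppercase count, special character count) but does not show how they are computed. The following is a minimal sketch of one plausible reading, not the repository's actual code. The function name `extract_features`, the dictionary keys, whitespace tokenization, counting uppercase *characters* (rather than uppercase words), and treating `string.punctuation` as the set of special characters are all assumptions.

```python
import string


def extract_features(text: str) -> dict:
    """Compute the five hand-crafted features named in the README.

    Hypothetical helper: the tokenization and counting rules below are
    assumptions, not taken from train_model_lite.py.
    """
    words = text.split()  # assumption: simple whitespace tokenization
    return {
        "text_length": len(text),
        "word_count": len(words),
        # assumption: uniqueness is case-insensitive
        "unique_word_count": len({w.lower() for w in words}),
        # assumption: counts uppercase characters, not all-caps words
        "uppercase_count": sum(1 for ch in text if ch.isupper()),
        # assumption: "special characters" means ASCII punctuation
        "special_char_count": sum(1 for ch in text if ch in string.punctuation),
    }


if __name__ == "__main__":
    print(extract_features("WIN a FREE prize!!! Win now"))
```

Features like these are typically appended to the vectorizer output before fitting the ensemble; whether the training script does so is not stated in the README.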