---
license: apache-2.0
datasets:
- shawhin/phishing-site-classification
metrics:
- accuracy
- recall
- precision
- f1
base_model: distilbert/distilbert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# bert-phishing-classifier_student

This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased), trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) on the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset.

It achieves the following results on the testing set:
- Loss (training): 0.0563
- Accuracy: 0.9022
- Precision: 0.9426
- Recall: 0.8603
- F1 Score: 0.8995
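
As a quick sanity check on the numbers above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. This is only a sketch using the rounded figures from the list; the variable names are illustrative.

```python
# Reported test-set metrics, copied from the list above (already rounded).
precision = 0.9426
recall = 0.8603

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 recomputed from precision and recall: {f1:.4f}")
```

The recomputed value comes out at about 0.8996, which matches the reported 0.8995 to within the rounding of the inputs.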

## Model description

This is the student model for a knowledge distillation example.

[Video](https://youtu.be/FLkUOkeMd5M) | [Blog](https://towardsdatascience.com/compressing-large-language-models-llms-9f406eea5b5e) | [Example code](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression)

## Intended uses & limitations

This model was created for educational purposes.
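
For experimentation, the model can probably be loaded with the `transformers` text-classification pipeline. The sketch below rests on several assumptions that this card does not confirm: that the Hub repository id is `shawhin/bert-phishing-classifier_student` (inferred from the title and the teacher's namespace), that the tokenizer of the base model `distilbert/distilbert-base-uncased` is the right one to pair with it, and that inputs are URL strings like those in the dataset. The helper name `classify` is hypothetical.

```python
# Assumed Hub repo id (inferred from the card title, not confirmed by the card).
MODEL_ID = "shawhin/bert-phishing-classifier_student"
# Assumption: the base model's tokenizer is compatible with the student model.
TOKENIZER_ID = "distilbert/distilbert-base-uncased"


def classify(texts, model_id=MODEL_ID, tokenizer_id=TOKENIZER_ID):
    """Return one {'label': ..., 'score': ...} dict per input string."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed or trigger any download.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification", model=model_id, tokenizer=tokenizer_id
    )
    return classifier(list(texts))
```

Calling `classify(["example.com/login"])` downloads the weights on first use. The label names in the output come from the model's configuration, so check them before interpreting a prediction.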

## Training and evaluation data

The training, testing, and validation data are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- temperature: 2.0
- adam optimizer alpha: 0.5
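
To illustrate how `temperature` and `alpha` typically enter a knowledge distillation objective, here is a minimal PyTorch sketch. It is a common formulation, not necessarily the exact loss used for this model (see the linked example code for that). It assumes `alpha` weights the soft-target term against the hard-label term, which is an interpretation of the last hyperparameter above. The function name `distillation_loss` and the toy tensors are illustrative.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target loss (match the teacher) with a hard-label loss."""
    # Soften both distributions with the same temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between the softened distributions. Scaling by T^2 keeps
    # the gradient magnitude comparable across temperatures.
    soft_loss = F.kl_div(
        student_log_probs, soft_targets, reduction="batchmean"
    ) * temperature ** 2

    # Ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Assumption: alpha is the weight on the soft-target term.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss


# Toy batch: 4 examples, 2 classes (random logits, for illustration only).
torch.manual_seed(0)
student_logits = torch.randn(4, 2)
teacher_logits = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])

loss = distillation_loss(student_logits, teacher_logits, labels)
print(f"distillation loss: {loss.item():.4f}")
```

With `alpha = 0.5` the two terms contribute equally. When the student's logits equal the teacher's, the soft term vanishes and only the weighted cross-entropy remains.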