---
license: apache-2.0
datasets:
- shawhin/phishing-site-classification
metrics:
- accuracy
- recall
- precision
- f1
base_model: distilbert/distilbert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---
# bert-phishing-classifier_student
This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) using the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset.
It achieves the following results on the test set:
- Loss (training): 0.0563
- Accuracy: 0.9022
- Precision: 0.9426
- Recall: 0.8603
- F1 Score: 0.8995
## Model description
Student model for a knowledge distillation example.
[Video](https://youtu.be/FLkUOkeMd5M) | [Blog](https://towardsdatascience.com/compressing-large-language-models-llms-9f406eea5b5e) | [Example code](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression)
## Intended uses & limitations
This model was created for educational purposes.
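A minimal usage sketch with the `transformers` text-classification pipeline (the repo id `shawhin/bert-phishing-classifier_student` is inferred from the model name above):

```python
from transformers import pipeline

# Load the distilled student model as a text-classification pipeline
classifier = pipeline(
    "text-classification",
    model="shawhin/bert-phishing-classifier_student",
)

# Score a candidate URL; label names depend on the model's config
print(classifier("http://secure-login.example-account-verify.com"))
```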
## Training and evaluation data
The training, validation, and testing splits are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).
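For reference, a minimal sketch of loading these splits with the `datasets` library (the exact split names are an assumption; check the dataset card to confirm):

```python
from datasets import load_dataset

# Load the phishing-site dataset from the Hugging Face Hub
data = load_dataset("shawhin/phishing-site-classification")

# Inspect the available splits (names assumed to be train/validation/test)
print(data)
```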
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- temperature: 2.0
- distillation loss weight (alpha): 0.5
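
For context, this is how `temperature` and `alpha` typically enter a distillation loss: the student is trained against a weighted mix of the teacher's temperature-softened outputs and the true labels. The sketch below is an illustration under that standard formulation, not the verbatim training code (see the linked example code for that):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft term: KL divergence between temperature-scaled distributions,
    # multiplied by T^2 to keep gradient magnitudes comparable
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2

    # Hard term: ordinary cross-entropy against the ground-truth labels
    hard_loss = F.cross_entropy(student_logits, labels)

    # alpha = 0.5 weights the two terms equally
    return alpha * soft_loss + (1 - alpha) * hard_loss
```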