---
license: apache-2.0
datasets:
- shawhin/phishing-site-classification
metrics:
- accuracy
- recall
- precision
- f1
base_model: distilbert/distilbert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# bert-phishing-classifier_student

This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased), trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) using the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset.
It achieves the following results on the testing set:
- Loss (training): 0.0563
- Accuracy: 0.9022
- Precision: 0.9426
- Recall: 0.8603
- F1 Score: 0.8995
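
For illustration, here is a minimal inference sketch using the standard `transformers` pipeline API. The repository id `shawhin/bert-phishing-classifier_student` and the example URL are assumptions based on the model name above:

```python
from transformers import pipeline

# Load the distilled student as a text-classification pipeline.
# Repository id assumed from the model name above.
classifier = pipeline(
    "text-classification",
    model="shawhin/bert-phishing-classifier_student",
)

# Input is a URL string; the returned label names depend on the model config
# (they may be generic LABEL_0 / LABEL_1 rather than "safe" / "phishing").
print(classifier("http://secure-login.example-bank.verify-account.com"))
```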

## Model description

This is the student model from a knowledge distillation example.

[Video](https://youtu.be/FLkUOkeMd5M) | [Blog](https://towardsdatascience.com/compressing-large-language-models-llms-9f406eea5b5e) | [Example code](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression)

## Intended uses & limitations

This model was created for educational purposes.

## Training and evaluation data

The training, validation, and test splits are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).
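
As a sketch, the dataset can be pulled with the `datasets` library. The split and column names shown below are assumptions; check the dataset card for the exact configuration:

```python
from datasets import load_dataset

# Download the phishing-site dataset from the Hugging Face Hub.
dataset = load_dataset("shawhin/phishing-site-classification")

# Print the DatasetDict to see which splits are actually available.
print(dataset)

# Inspect one example (assumed "train" split name).
print(dataset["train"][0])
```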

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- temperature: 2.0
- adam optimizer alpha: 0.5
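
The `temperature` and `alpha` values above are the usual knowledge-distillation hyperparameters. As a hedged sketch (not necessarily the exact objective used for this checkpoint; see the linked example code), a common way to combine them is a temperature-scaled KL term on the teacher's soft targets plus a cross-entropy term on the hard labels, weighted by `alpha`:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    distill = F.kl_div(soft_student, soft_targets,
                       reduction="batchmean") * (temperature ** 2)

    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)

    # alpha weights the soft vs. hard terms (assumed interpretation of the
    # "alpha: 0.5" value listed above).
    return alpha * distill + (1 - alpha) * hard
```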