shawhin/bert-phishing-classifier_student

	---
	license: apache-2.0
	datasets:
	- shawhin/phishing-site-classification
	metrics:
	- accuracy
	- recall
	- precision
	- f1
	base_model: distilbert/distilbert-base-uncased
	pipeline_tag: text-classification
	library_name: transformers
	---

	# bert-phishing-classifier_student

	This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) using the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset.
	It achieves the following results on the testing set (the loss is the training loss):
	- Loss (training): 0.0563
	- Accuracy: 0.9022
	- Precision: 0.9426
	- Recall: 0.8603
	- F1 Score: 0.8995
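
As a quick consistency check on the numbers above, F1 can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only; the helper name `f1_score` is hypothetical and not part of this repository. Because the reported precision and recall are rounded to four decimals, the recomputed value is expected to match the reported F1 only approximately.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the list above (already rounded in the card).
precision, recall, reported_f1 = 0.9426, 0.8603, 0.8995

f1 = f1_score(precision, recall)
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The two values agree to about three decimal places, which is consistent with rounding of the inputs.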

	## Model description

	Student model for a knowledge distillation example.

	[Video](https://youtu.be/FLkUOkeMd5M) \| [Blog](https://towardsdatascience.com/compressing-large-language-models-llms-9f406eea5b5e) \| [Example code](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression)

	## Intended uses & limitations

	This model was created for educational purposes.
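
For readers who want to try the model, here is a minimal sketch using the `transformers` text-classification pipeline, in line with the `pipeline_tag` and `library_name` in the metadata. The helper names (`load_classifier`, `summarize`, `demo`) and the example URL are hypothetical, and treating the input as a raw URL string is an assumption based on the dataset name. The label strings returned depend on the model's configuration and are not assumed here.

```python
MODEL_ID = "shawhin/bert-phishing-classifier_student"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the formatting helper works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def summarize(prediction: dict) -> str:
    """Format one pipeline result, a dict with 'label' and 'score' keys."""
    return f"{prediction['label']} ({prediction['score']:.1%})"


def demo(texts: list[str]) -> None:
    """Classify each text and print the predicted label with its score."""
    classifier = load_classifier()
    for text, prediction in zip(texts, classifier(texts)):
        print(f"{text} -> {summarize(prediction)}")
```

Calling `demo(["http://example.com/login-update"])` would download the model and print one prediction per input. Given the educational purpose stated above, the output should not be relied on for real phishing detection.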

	## Training and evaluation data

	The training, testing, and validation data are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 32
	- eval_batch_size: 32
	- num_epochs: 5
	- temperature: 2.0
	- adam optimizer alpha: 0.5
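
The card does not spell out the loss function, so the following is only a sketch of a common knowledge distillation loss, showing how a temperature of 2.0 and an alpha of 0.5 could enter training. Two assumptions are made here: that alpha weights the soft (teacher-matching) loss against the hard (true-label) loss, and that the soft loss is a KL divergence scaled by the squared temperature. Neither is confirmed by this card, and the function names are hypothetical. It is written in plain Python for a single example rather than as a batched training loop.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft loss (match the teacher) and a hard loss (match the label)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on softened distributions, scaled by T^2 (assumed form).
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft_loss = kl * temperature ** 2

    # Standard cross-entropy of the student against the true label (T = 1).
    hard_loss = -math.log(softmax(student_logits)[label])

    # Assumption: alpha is the weight on the soft loss.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss


# Hypothetical two-class logits for one example whose true label is class 0.
loss = distillation_loss([1.2, -0.8], [2.5, -2.0], label=0)
print(f"combined loss: {loss:.4f}")
```

Under these assumptions, when the student's logits equal the teacher's the soft term vanishes and only the weighted hard loss remains.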