mastavtsev/SqueezeBERT_PM_CLR

SqueezeBERT Model for Unsupervised Anomaly Detection

Overview

This model was developed as part of a course project during my third year at the HSE Faculty of Computer Science (FCS). It uses the SqueezeBERT architecture, adapted for unsupervised anomaly detection. Through masked language modeling, the model learns representations of trace tokens that are indicative of normal program execution, and it identifies anomalies as traces that deviate from those learned patterns.
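One common way to turn a masked language model into an anomaly detector is to mask each token of a trace in turn and average the negative log-probability the model assigns to the true token. The sketch below illustrates that idea only; it is not taken from the author's notebooks. `masked_nll_score`, `ToyNeighborModel`, and the example trace tokens are hypothetical names introduced here, and the toy count-based model merely stands in for the trained SqueezeBERT so that the example runs with the standard library alone.

```python
import math
from collections import Counter, defaultdict

MASK = "[MASK]"  # placeholder mask symbol (assumption, not the model's real token)


def masked_nll_score(tokens, predict_proba):
    """Average negative log-probability of each true token when it is masked.

    predict_proba(masked_tokens, position, candidate) must return the
    probability of `candidate` at the masked `position`.
    """
    if not tokens:
        raise ValueError("trace must contain at least one token")
    total = 0.0
    for i, true_token in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        p = predict_proba(masked, i, true_token)
        total += -math.log(max(p, 1e-12))  # clamp to avoid log(0)
    return total / len(tokens)


class ToyNeighborModel:
    """Hypothetical stand-in for the trained masked LM.

    Estimates P(token | left neighbor) from counts over normal traces,
    with add-one smoothing. A real model would use the full context.
    """

    def __init__(self, normal_traces):
        self.counts = defaultdict(Counter)
        self.vocab = set()
        for trace in normal_traces:
            self.vocab.update(trace)
            for left, tok in zip(["<s>"] + trace, trace):
                self.counts[left][tok] += 1

    def __call__(self, masked_tokens, position, candidate):
        left = masked_tokens[position - 1] if position > 0 else "<s>"
        seen = self.counts[left]
        return (seen[candidate] + 1) / (sum(seen.values()) + len(self.vocab) + 1)


# Fit the toy model on "normal" traces only, then score two traces.
model = ToyNeighborModel([["open", "read", "close"]] * 20)
normal_score = masked_nll_score(["open", "read", "close"], model)
odd_score = masked_nll_score(["open", "close", "read"], model)
print(normal_score < odd_score)
```

A trace that follows the patterns seen during training receives a low score, while a reordered trace receives a higher one. With the actual model, `predict_proba` would wrap a forward pass over the masked input instead of a count table.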

Research Notebooks

Detailed Python notebooks documenting the research and methodology are available on GitHub: Visit GitHub Repository.

Model Configuration

Architecture: SqueezeBERT, adapted for masked language modeling.
Tokenizer: LOA 13 with a dictionary size of 20,000 and a maximum token length of 300.
Context Window Size: 512 tokens.
Learning Rate: 2.5e-3.
Optimizer: LAMB.
Training Duration: 300 epochs.
Parameters: 43.6 million.
Training Environment: Google Colab, utilizing an A100 GPU, with a training time of approximately 1.5 hours.

Model Performance

The model's effectiveness in anomaly detection is evidenced by its performance on test data.

[Figure: visual representation of the model's capability to separate normal from anomalous execution traces.]
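Separating normal from anomalous traces in practice usually means choosing a cut-off on a per-trace anomaly score. The following is a minimal sketch of one such rule, assuming a score where higher means more anomalous and a threshold set at a quantile of scores from known-normal traces. The functions `percentile_threshold` and `flag_anomalies` are hypothetical, and the numbers are made-up placeholders, not results from this model.

```python
def percentile_threshold(normal_scores, q=0.95):
    """Return the score at quantile q of the known-normal scores."""
    if not normal_scores:
        raise ValueError("need at least one normal score")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(normal_scores)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def flag_anomalies(scores, threshold):
    """Mark every score strictly above the threshold as anomalous."""
    return [score > threshold for score in scores]


# Placeholder scores for illustration only (not measured values).
normal_scores = [0.10, 0.12, 0.11, 0.13, 0.15, 0.14, 0.12, 0.11, 0.16, 0.13]
threshold = percentile_threshold(normal_scores, q=0.9)
flags = flag_anomalies([0.12, 0.90], threshold)
print(flags)
```

The quantile `q` trades false alarms against missed anomalies: a higher value flags fewer normal traces but may let mild anomalies through.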

The configuration and performance details above are provided to facilitate replication and further experimentation by the community. The model is released under the Apache-2.0 license, which permits both academic and commercial use and is intended to encourage wider adoption and contributions to the model's development.