
SqueezeBERT Model for Unsupervised Anomaly Detection

Overview

This model was developed as part of a course project during my third year at the HSE Faculty of Computer Science (FCS). It uses the SqueezeBERT architecture adapted for unsupervised anomaly detection: through masked language modeling, the model learns representations of trace tokens characteristic of normal program execution, and traces that deviate from these learned patterns are flagged as anomalies.
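The scoring idea behind MLM-based anomaly detection can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: `anomaly_score` and the toy probabilities are hypothetical stand-ins for the per-token probabilities a trained masked language model would produce.

```python
import math

def anomaly_score(token_probs):
    """Mean negative log-likelihood of a trace's tokens under a masked LM.

    token_probs: probability the (hypothetical) MLM assigns to each true
    token when that position is masked. Traces the model has learned well
    (normal executions) score low; unfamiliar traces score high.
    """
    return sum(-math.log(p) for p in token_probs) / len(token_probs)

# Toy probabilities standing in for real MLM outputs:
normal_trace = [0.9, 0.8, 0.95, 0.85]     # tokens predicted confidently
anomalous_trace = [0.9, 0.05, 0.1, 0.85]  # surprising tokens mid-trace

assert anomaly_score(anomalous_trace) > anomaly_score(normal_trace)
```

A threshold on this score (chosen on held-out normal traces) then separates normal from anomalous executions.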

Research Notebooks

Detailed Python notebooks documenting the research and methodology are available in the accompanying GitHub repository.

Model Configuration

  • Architecture: SqueezeBERT, adapted for masked language modeling.
  • Tokenizer: LOA 13 with a dictionary size of 20,000 and a maximum token length of 300.
  • Context Window Size: 512 tokens.
  • Learning Rate: 2.5e-3.
  • Optimizer: LAMB.
  • Training Duration: 300 epochs.
  • Parameters: 43.6 million (F32, stored as Safetensors).
  • Training Environment: Google Colab with an A100 GPU; training took approximately 1.5 hours.
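The configuration above might be wired together with the Hugging Face `transformers` library roughly as sketched below. Only the values listed in this card (vocabulary size, context window, learning rate) come from the source; the remaining config defaults and the LAMB implementation are assumptions, and the resulting parameter count may differ from the card's 43.6M without matching layer sizes.

```python
# Sketch only: layer sizes beyond those listed in the card are assumptions.
from transformers import SqueezeBertConfig, SqueezeBertForMaskedLM

config = SqueezeBertConfig(
    vocab_size=20_000,            # tokenizer dictionary size from the card
    max_position_embeddings=512,  # context window size from the card
)
model = SqueezeBertForMaskedLM(config)

# LAMB is not part of core PyTorch; the torch_optimizer package provides
# one implementation (which optimizer package the project used is unknown):
# import torch_optimizer
# optimizer = torch_optimizer.Lamb(model.parameters(), lr=2.5e-3)
```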

Model Performance

The model's effectiveness in anomaly detection is evidenced by its performance on held-out test data. The figure below illustrates how well the model separates normal from anomalous execution traces.

[Figure: separation of normal vs. anomalous execution traces]

This detailed configuration and performance data is provided to facilitate replication and further experimentation by the community. The use of the Apache-2.0 license allows for both academic and commercial use, promoting wider adoption and potential contributions to the model's development.
