---
license: mit
base_model: google/vivit-b-16x2
tags:
- cctv-surveillance
- video-classification
metrics:
- accuracy
- f1
- recall
- precision
---

## Model Performance

The model achieved the following scores on the evaluation dataset:

- **Accuracy**: 94.6%
- **F1 Score**: 94.3%
- **Recall**: 94.6%
- **Precision**: 94.5%
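
The card does not say how the multi-class scores were averaged. Recall equal to accuracy (both 94.6%) is what support-weighted averaging produces, since weighted recall always equals accuracy, so the sketch below assumes weighted averaging. The function name and the `normal`/`anomaly` labels are hypothetical and serve only as illustration.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    total = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = true_pos[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total       # class share of the evaluation set
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(true_pos.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, not taken from the actual evaluation set.
scores = weighted_metrics(
    ["normal", "normal", "normal", "anomaly"],
    ["normal", "normal", "anomaly", "anomaly"],
)
```

With weighted averaging, `scores["recall"]` and `scores["accuracy"]` are always equal, which matches the pattern in the numbers reported above.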

## Intended Use & Limitations

- **Best for:** CCTV footage analysis and anomaly detection
- **Not suitable for:** non-surveillance video, or real-time processing on resource-constrained hardware
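
ViViT classifies fixed-length clips rather than single frames, which is part of why real-time use on small hardware is difficult: a full clip must be buffered and processed for each prediction. The sketch below shows one way to pick evenly spaced frames from a longer recording. The default of 32 frames matches the default `num_frames` in the Transformers ViViT configuration; whether this fine-tuned model was trained with that clip length or this sampling scheme is an assumption, and `sample_frame_indices` is a hypothetical helper.

```python
def sample_frame_indices(total_frames, num_frames=32):
    """Pick `num_frames` evenly spaced frame indices from a video.

    If the video has fewer frames than requested, indices repeat so the
    clip still has the requested length.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Integer arithmetic keeps every index inside [0, total_frames - 1].
    return [(i * total_frames) // num_frames for i in range(num_frames)]


# A 64-frame recording yields every second frame.
indices = sample_frame_indices(64)
```

The selected frames would then be decoded and passed to the model's image processor as a single clip.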

## Training Details

- **Learning Rate:** 5e-6  
- **Batch Size:** 2  
- **Optimizer:** Adam  
- **Training Steps:** 4176  
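
As a rough sanity check on training scale, the number of clips seen during training follows from the step count and batch size. This sketch assumes that "Training Steps" counts optimizer steps, and that training used a single device with no gradient accumulation; none of this is stated in the card, and `clips_processed` is a hypothetical helper.

```python
def clips_processed(training_steps, batch_size, grad_accum_steps=1, num_devices=1):
    """Estimate how many training clips were consumed (repeats included)."""
    return training_steps * batch_size * grad_accum_steps * num_devices


# Values from the list above; the accumulation and device defaults are assumptions.
total_clips = clips_processed(training_steps=4176, batch_size=2)
print(total_clips)  # 8352
```

Dividing this figure by the training set size would give the approximate number of epochs.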

## Framework Versions

- Transformers: 4.39.3  
- PyTorch: 2.1.2  
- Datasets: 2.18.0