metadata
license: mit
base_model: google/vivit-b-16x2
tags:
- cctv-surveillance
- video-classification
metrics:
- accuracy
- f1
- recall
- precision
Model Performance
The model achieved the following scores on the evaluation dataset:
- Accuracy: 94.6%
- F1 Score: 94.3%
- Recall: 94.6%
- Precision: 94.5%
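The card does not say how the per-class scores were averaged. The reported recall equals the accuracy (both 94.6%), which is what support-weighted averaging produces, since weighted recall is always identical to accuracy. The sketch below assumes that averaging mode; the function name `weighted_metrics` and the `normal`/`anomaly` labels are hypothetical and not taken from the model's label set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall and F1.

    Assumption: "weighted" averaging, i.e. each class's score is weighted
    by the number of true samples of that class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels, for illustration only.
scores = weighted_metrics(
    ["normal", "normal", "normal", "anomaly"],
    ["normal", "normal", "anomaly", "anomaly"],
)
print(scores)
```

In this toy example the weighted recall and the accuracy come out identical, mirroring the pattern in the scores above.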
Intended Use & Limitations
- Best for: CCTV footage analysis, anomaly detection
- Not suitable for: non-surveillance video, or real-time processing on limited hardware
Training Details
- Learning Rate: 5e-6
- Batch Size: 2
- Optimizer: Adam
- Training Steps: 4176
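As a rough illustration of what these hyperparameters imply, the sketch below records them in a small config object and derives the number of training samples processed. The class `TrainingSetup` is hypothetical, and two values are assumptions the card does not state: no gradient accumulation and a single device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    # Values listed in the card.
    learning_rate: float = 5e-6
    batch_size: int = 2
    optimizer: str = "Adam"
    training_steps: int = 4176
    # Assumptions (not stated in the card).
    gradient_accumulation_steps: int = 1
    num_devices: int = 1

    @property
    def effective_batch_size(self) -> int:
        """Samples consumed per optimizer step."""
        return self.batch_size * self.gradient_accumulation_steps * self.num_devices

    @property
    def samples_seen(self) -> int:
        """Total samples processed over the whole run (with repetition)."""
        return self.training_steps * self.effective_batch_size


setup = TrainingSetup()
print(setup.effective_batch_size, setup.samples_seen)
```

Under those assumptions the run processes 4176 × 2 = 8352 samples in total, counting repeats across epochs. A different accumulation or device count would scale this number accordingly.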
Framework Versions
- Transformers: 4.39.3
- PyTorch: 2.1.2
- Datasets: 2.18.0