---
license: mit
base_model: google/vivit-b-16x2
tags:
- cctv-surveillance
- video-classification
metrics:
- accuracy
- f1
- recall
- precision
---

## Model Performance

The model achieved the following scores on the evaluation dataset:

- **Accuracy**: 94.6%
- **F1 Score**: 94.3%
- **Recall**: 94.6%
- **Precision**: 94.5%
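
The card does not say how the per-class scores were averaged. Recall equal to accuracy (both 94.6%) is consistent with support-weighted averaging, where the two are identical by construction, so the sketch below assumes that scheme. The helper `weighted_scores` and the `normal`/`anomaly` labels are illustrative only and do not come from the evaluation code or dataset.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # class share of the evaluation set
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels (hypothetical); these are not the model's evaluation data.
scores = weighted_scores(
    ["normal", "normal", "normal", "anomaly"],
    ["normal", "normal", "anomaly", "anomaly"],
)
```

With weighted averaging, `scores["recall"]` always matches `scores["accuracy"]`, while precision and F1 can differ from it, as they do in the figures above.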

## Intended Use & Limitations

- **Best for:** CCTV footage analysis, anomaly detection
- **Not suitable for:** Non-surveillance video types, real-time processing with limited hardware
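
Before a CCTV clip can be classified, it has to be reduced to a fixed number of frames. The following is a minimal sketch of uniform frame sampling, assuming the 32-frame input of the default ViViT configuration; this card does not state the frame count used for fine-tuning. `sample_frame_indices` is a hypothetical helper, not part of any library.

```python
def sample_frame_indices(total_frames, num_frames=32):
    """Pick num_frames indices spread evenly across a clip.

    num_frames=32 is an assumption based on the default ViViT config.
    Clips shorter than num_frames will repeat some indices.
    """
    if total_frames <= 0:
        raise ValueError("clip has no frames")
    if num_frames == 1:
        return [0]
    # Spacing between consecutive samples, measured in source frames
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]

# Example: a 300-frame clip (about 10 seconds at 30 fps)
indices = sample_frame_indices(total_frames=300)
```

Sampling a fixed number of frames keeps the per-clip cost constant regardless of clip length, which matters given the hardware limitation noted above.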

## Training Details

- **Learning Rate:** 5e-6
- **Batch Size:** 2
- **Optimizer:** Adam
- **Training Steps:** 4176
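
These hyperparameters imply how many training samples the run processed. The sketch below works that out, assuming no gradient accumulation and a single device, neither of which is stated on this card. `TrainingSetup` is an illustrative container, not a class from any library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingSetup:
    learning_rate: float = 5e-6
    batch_size: int = 2
    optimizer: str = "Adam"
    training_steps: int = 4176

    def samples_seen(self, grad_accum_steps=1, num_devices=1):
        # Assumes each counted training step is one optimizer update
        return (self.training_steps * self.batch_size
                * grad_accum_steps * num_devices)

setup = TrainingSetup()
print(setup.samples_seen())  # 8352
```

If the run did use gradient accumulation or several devices, pass those values to `samples_seen` to scale the count accordingly.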

## Framework Versions

- Transformers: 4.39.3
- PyTorch: 2.1.2
- Datasets: 2.18.0
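
To check whether a local environment matches these versions, one option is the standard library's `importlib.metadata`. This is a sketch: `version_mismatches` is a hypothetical helper, and the distribution names assume the usual PyPI packages (`transformers`, `torch`, `datasets`).

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = {
    "transformers": "4.39.3",
    "torch": "2.1.2",       # PyTorch is published on PyPI as "torch"
    "datasets": "2.18.0",
}

def version_mismatches(expected=EXPECTED, lookup=version):
    """Map each package that differs to an (installed, expected) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed
        # Ignore local build suffixes such as "+cu121" on PyTorch wheels
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches
```

Calling `version_mismatches()` returns an empty dict when all three packages match the versions listed above.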