EAR-WACV25-DAKiet-TSM
The model was presented in the paper .
This is a video classification model based on the Temporal Shift Module (TSM), with a resnext50_32x4d backbone.
GitHub Repository: https://github.com/fdfyaytkt/EAR-WACV25-DAKiet-TSM
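To give a feel for what a temporal shift does, the following is a minimal pure-Python sketch, not the repository's implementation. It assumes the usual TSM formulation: one fold of channels (`C // shift_div`) takes its values from the next frame, a second fold takes them from the previous frame, and the rest stay in place, with zero padding at the clip boundaries. The function name `temporal_shift` and the toy input are hypothetical; the actual model applies the shift to PyTorch tensors inside the backbone's residual blocks.

```python
def temporal_shift(frames, shift_div=8):
    """Shift a fraction of channels along the time axis (zero-padded).

    frames: list of T per-frame feature vectors, each a list of C floats.
    Hypothetical helper for illustration only.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // shift_div  # channels moved in each direction
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1  # first fold: value comes from the next frame
            elif c < 2 * fold:
                src = t - 1  # second fold: value comes from the previous frame
            else:
                src = t      # remaining channels are left unshifted
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy clip: 3 frames, 8 channels, every value equal to its frame index.
frames = [[float(t)] * 8 for t in range(3)]
shifted = temporal_shift(frames, shift_div=8)
for row in shifted:
    print(row)
```

With `shift_div=8` and 8 channels, exactly one channel moves in each direction, so each frame mixes in a small amount of information from its neighbours at no extra parameter cost.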
Data
The model was trained on a combination of datasets:
- Toyota Smarthome dataset: Used for activity recognition.
- ETRI-Activity3D: RGB videos (specific subsets or full dataset used depending on configuration).
- ETRI-Activity3D-LivingLab: RGB videos (specific subsets or full dataset used depending on configuration).
Two configurations are detailed below, with their respective public leaderboard scores:
Config 1 (Public Leaderboard: 0.84402)
- Toyota Smarthome dataset
- ETRI-Activity3D - RGB videos (RGB_P091-P100)
- ETRI-Activity3D-LivingLab - RGB videos (RGB(P201-P230))
Config 2 (Public Leaderboard: 0.78856)
- Toyota Smarthome dataset
- ETRI-Activity3D - RGB videos (full)
- ETRI-Activity3D-LivingLab - RGB videos (full)
Running
Example training and evaluation commands are provided below. Refer to the repository for complete details and options.
Train
python main.py elderly RGB --arch resnext50_32x4d --num_segments 8 --gd 20 --lr 0.001 --wd 1e-4 --lr_steps 20 40 --epochs 100 --batch-size 4 -j 32 --dropout 0.5 --consensus_type=avg --eval-freq=1 --shift --shift_div=8 --shift_place=blockres --npb
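The `--lr 0.001 --lr_steps 20 40` flags suggest a step learning-rate schedule. The sketch below shows how such a schedule would evolve over the 100 epochs, assuming the rate is multiplied by 0.1 at each listed epoch. That decay factor is an assumption not stated in this card, and `step_lr` is a hypothetical helper, so check the repository for the actual behaviour.

```python
def step_lr(epoch, base_lr=0.001, lr_steps=(20, 40), gamma=0.1):
    """Learning rate at a given epoch under a step schedule.

    gamma=0.1 is an assumed decay factor, not confirmed by the model card.
    """
    num_decays = sum(epoch >= step for step in lr_steps)
    return base_lr * gamma ** num_decays


for epoch in (0, 19, 20, 39, 40, 99):
    print(epoch, step_lr(epoch))
```

Under this assumption the rate stays at 0.001 for epochs 0 to 19, drops to roughly 0.0001 for epochs 20 to 39, and to roughly 0.00001 from epoch 40 onward.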
Eval
python generate_submission.py elderly --arch=resnext50_32x4d --csv_file=submission.csv --weights=checkpoint/TSM_elderly_RGB_resnext50_32x4d_shift8_blockres_avg_segment8_e100/ckpt.best.pth.tar --test_segments=8 --batch_size=1 --test_crops=1
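The `--weights` path in the evaluation command appears to encode the training flags (dataset, modality, architecture, shift settings, consensus type, segment count, and epochs). The sketch below rebuilds that path from those values, which can help when locating the checkpoint for a run with different settings. The naming pattern is inferred from the two commands in this card rather than confirmed against the repository, and `checkpoint_path` is a hypothetical helper.

```python
def checkpoint_path(dataset, modality, arch, shift_div, shift_place,
                    consensus, num_segments, epochs, root="checkpoint"):
    """Rebuild the checkpoint path from training flags.

    The pattern is inferred from the example commands, not taken from the repo.
    """
    run_name = "_".join([
        "TSM",
        dataset,
        modality,
        f"{arch}_shift{shift_div}_{shift_place}",
        consensus,
        f"segment{num_segments}",
        f"e{epochs}",
    ])
    return f"{root}/{run_name}/ckpt.best.pth.tar"


path = checkpoint_path(
    dataset="elderly", modality="RGB", arch="resnext50_32x4d",
    shift_div=8, shift_place="blockres", consensus="avg",
    num_segments=8, epochs=100,
)
print(path)
```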