Spaces: Running on Zero
Getting Started with Mask-Adapter
This document provides a brief intro of the usage of Mask-Adapter.
Please see Getting Started with Detectron2 for full usage.
Inference Demo with Pre-trained Models
We provide demo.py
that is able to demo builtin configs. Run it with:
cd demo/
python demo.py \
--input input1.jpg input2.jpg \
[--other-options]
--opts MODEL.WEIGHTS /path/to/checkpoint_file
The configs are made for training, therefore we need to specify MODEL.WEIGHTS
to a model from model zoo for evaluation.
This command will run the inference and show visualizations in an OpenCV window.
For details of the command line arguments, see demo.py -h
or look at its source code
to understand its behavior. Some common arguments are:
- To run on your webcam, replace
--input files
with--webcam
. - To run on a video, replace
--input files
with--video-input video.mp4
. - To run on cpu, add
MODEL.DEVICE cpu
after--opts
. - To save outputs to a directory (for images) or a file (for webcam or video), use
--output
.
Ground-truth Warmup Training
We provide the script train_net_maskadapter.py
to train the mask-adapter using ground-truth masks.To train a model with train_net_maskadapter.py
, first set up the corresponding datasets as described in datasets/README.md , and then run the following command:
python train_net_maskadapter.py --num-gpus 4 \
--config-file configs/ground-truth-warmup/mask-adapter/mask_adapter_convnext_large_cocopan_eval_ade20k.yaml
For the MAFTP model, run:
python train_net_maskadapter.py --num-gpus 4 \
--config-file configs/ground-truth-warmup/mask-adapter/mask_adapter_maft_convnext_large_cocostuff_eval_ade20k.yaml \
MODEL.WEIGHTS /path/to/maftp_l.pth
The configurations are set for 4-GPU training. Since we use the ADAMW optimizer, it is unclear how to scale the learning rate with batch size. If training with a single GPU, you will need to manually adjust the learning rate and batch size:
python train_net_maskadapter.py \
--config-file configs/ground-truth-warmup/mask-adapter/mask_adapter_convnext_large_cocopan_eval_ade20k.yaml \
--num-gpus 1 SOLVER.IMS_PER_BATCH SET_TO_SOME_REASONABLE_VALUE SOLVER.BASE_LR SET_TO_SOME_REASONABLE_VALUE
Combining Mask-Adapter Weights with Mask2Former
Since the ground-truth warmup phase for training the mask-adapter does not involve training Mask2Former, the weights obtained in the first phase will not include Mask2Former weights. To combine the weights, run the following command:
python tools/weight_fuse.py \
--model_first_phase_path /path/to/first_phase.pth \
--model_sem_seg_path /path/to/maftp_l.pth \
--output_path /path/to/maftp_l_withadapter.pth
Mixed-Masks Training
For the mixed-masks training phase, we provide two scripts: train_net_fcclip.py
and train_net_maftp.py
, which train the mask-adapter for FC-CLIP and MAFTP models, respectively. These two models use different backbones (CLIP) and training source data.
For FC-CLIP, run:
python train_net_fcclip.py --num-gpus 4 \
--config-file configs/mixed-mask-training/fc-clip/fcclip/fcclip_convnext_large_eval_ade20k.yaml MODEL.WEIGHTS /path/to/checkpoint_file
For MAFTP, run:
python train_net_maftp.py --num-gpus 4 \
--config-file configs/mixed-mask-training/maftp/semantic/train_semantic_large_eval_a150.yaml MODEL.WEIGHTS /path/to/checkpoint_file
To evaluate a model’s performance, for FC-CLIP, use:
python train_net_fcclip.py \
--config-file configs/mixed-mask-training/fc-clip/fcclip/fcclip_convnext_large_eval_ade20k.yaml \
--eval-only MODEL.WEIGHTS /path/to/checkpoint_file
For MAFTP, use:
python train_net_maftp.py \
--config-file configs/mixed-mask-training/maftp/semantic/train_semantic_large_eval_a150.yaml \
--eval-only MODEL.WEIGHTS /path/to/checkpoint_file