segment anything

Commit 1f351d9 (parent: 55253c9), committed by An-619

Update README.md

Files changed (1):
  1. README.md +205 -0

README.md:

---
license: apache-2.0
language:
- en
---

![](assets/logo.png)

# Fast Segment Anything

[[`Paper`](https://arxiv.org/pdf/2306.12156.pdf)] [[`Web Demo`](https://huggingface.co/spaces/An-619/FastSAM)] [[`Colab demo`](https://colab.research.google.com/drive/1oX14f6IneGGw612WgVlAiy91UHwFAvr9?usp=sharing)] [[`Model Zoo`](#model-checkpoints)] [[`BibTeX`](#citing-fastsam)]

![FastSAM Speed](assets/head_fig.png)

The **Fast Segment Anything Model (FastSAM)** is a CNN-based Segment Anything Model trained on only 2% of the SA-1B dataset published by the SAM authors. FastSAM achieves performance comparable to SAM at **50× higher run-time speed**.

![FastSAM design](assets/Overview.png)

## Installation

Clone the repository locally:

```
git clone https://github.com/CASIA-IVA-Lab/FastSAM.git
```

Create the conda environment. The code requires `python>=3.7`, as well as `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended.

```
conda create -n FastSAM python=3.9
conda activate FastSAM
```

Install the packages:

```
cd FastSAM
pip install -r requirements.txt
```

Install CLIP:

```
pip install git+https://github.com/openai/CLIP.git
```
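
To confirm that the environment meets the version requirements listed above before running inference, a quick check like the following can help (a minimal Python sketch; the version thresholds are the ones stated in this section):

```
# Quick environment sanity check -- run inside the FastSAM conda env.
import sys

import torch
import torchvision

print(f"python      : {sys.version.split()[0]}")   # needs >= 3.7
print(f"torch       : {torch.__version__}")        # needs >= 1.7
print(f"torchvision : {torchvision.__version__}")  # needs >= 0.8

# CUDA support is strongly recommended for reasonable run-time speed.
if torch.cuda.is_available():
    print(f"CUDA device : {torch.cuda.get_device_name(0)}")
else:
    print("CUDA not available -- inference will fall back to the CPU.")
```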

## <a name="GettingStarted"></a> Getting Started

First download a [model checkpoint](#model-checkpoints).

Then you can run the scripts below to try Everything mode and the three prompt modes.

```
# Everything mode
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg
```

```
# Text prompt
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --text_prompt "the yellow dog"
```

```
# Box prompt
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --box_prompt [570,200,230,400]
```

```
# Points prompt
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --point_prompt "[[520,360],[620,300]]" --point_label "[1,0]"
```
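
The same prompt modes can also be scripted. The sketch below simply wraps the documented `Inference.py` command line with Python's `subprocess`, reusing the checkpoint and image paths from the examples above; it assumes you run it from the FastSAM repository root with the checkpoint already placed at `./weights/FastSAM.pt`.

```
# Minimal batch driver around the documented Inference.py CLI (illustrative sketch).
import subprocess

MODEL = "./weights/FastSAM.pt"
IMAGE = "./images/dogs.jpg"

runs = {
    "everything": [],
    "text": ["--text_prompt", "the yellow dog"],
    "box": ["--box_prompt", "[570,200,230,400]"],
    "points": ["--point_prompt", "[[520,360],[620,300]]", "--point_label", "[1,0]"],
}

for mode, extra_args in runs.items():
    cmd = ["python", "Inference.py", "--model_path", MODEL, "--img_path", IMAGE, *extra_args]
    print(f"[{mode}] {' '.join(cmd)}")
    subprocess.run(cmd, check=True)  # raises CalledProcessError if a run fails
```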

You are also welcome to try our Colab demo: [FastSAM_example.ipynb](https://colab.research.google.com/drive/1oX14f6IneGGw612WgVlAiy91UHwFAvr9?usp=sharing).

## Different Inference Options

We provide various options for different purposes; details are in [MORE_USAGES.md](MORE_USAGES.md).

## Web demo

In the [web demo](https://huggingface.co/spaces/An-619/FastSAM), you can upload your own image, select an input size from 512 to 1024, and choose whether to visualize in high quality. High-quality visualization additionally renders segmentation edges that are easier to observe. The web demo currently supports only Everything mode; we will try to support the other modes in the future.

![Web Demo](assets/web_demo.png)

## <a name="Models"></a>Model Checkpoints

Two versions of the model are available, with different sizes. Click the links below to download the checkpoint for the corresponding model type.

- **`default` or `FastSAM`: [YOLOv8x-based Segment Anything Model.](https://drive.google.com/file/d/1m1sjY4ihXBU1fZXdQ-Xdj-mDltW-2Rqv/view?usp=sharing)**
- `FastSAM-s`: [YOLOv8s-based Segment Anything Model.](https://drive.google.com/file/d/10XmSj6mmpmRb8NhXbtiuO9cTTBwR_9SV/view?usp=sharing)
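
If you prefer to fetch a checkpoint from a script rather than through the browser, the sketch below uses the third-party `gdown` package (an extra `pip install gdown`, not part of the FastSAM requirements) to pull the default weights from the Google Drive link above into the `./weights/` folder expected by the Getting Started commands:

```
# Illustrative download helper; assumes `pip install gdown` has been run.
from pathlib import Path

import gdown

weights_dir = Path("weights")
weights_dir.mkdir(exist_ok=True)

# File ID taken from the `default`/`FastSAM` Google Drive link above.
gdown.download(
    id="1m1sjY4ihXBU1fZXdQ-Xdj-mDltW-2Rqv",
    output=str(weights_dir / "FastSAM.pt"),
    quiet=False,
)
```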

## Results

All results were tested on a single NVIDIA GeForce RTX 3090.

### 1. Inference time

Running speed (ms) under different numbers of point prompts:

| Method  | Params | 1   | 10  | 100 | E(16x16) | E(32x32*) | E(64x64) |
|:-------:|:------:|:---:|:---:|:---:|:--------:|:---------:|:--------:|
| SAM-H   | 0.6G   | 446 | 464 | 627 | 852      | 2099      | 6972     |
| SAM-B   | 136M   | 110 | 125 | 230 | 432      | 1383      | 5417     |
| FastSAM | 68M    | 40  | 40  | 40  | 40       | 40        | 40       |

### 2. Memory usage

| Dataset   | Method  | GPU Memory (MB) |
|:---------:|:-------:|:---------------:|
| COCO 2017 | FastSAM | 2608            |
| COCO 2017 | SAM-H   | 7060            |
| COCO 2017 | SAM-B   | 4670            |

### 3. Zero-shot Transfer Experiments

#### Edge Detection

Tested on the BSDS500 dataset.

| Method  | Year | ODS  | OIS  | AP   | R50  |
|:-------:|:----:|:----:|:----:|:----:|:----:|
| HED     | 2015 | .788 | .808 | .840 | .923 |
| SAM     | 2023 | .768 | .786 | .794 | .928 |
| FastSAM | 2023 | .750 | .790 | .793 | .903 |

#### Object Proposals

##### COCO

| Method    | AR10 | AR100 | AR1000 | AUC  |
|:---------:|:----:|:-----:|:------:|:----:|
| SAM-H E64 | 15.5 | 45.6  | 67.7   | 32.1 |
| SAM-H E32 | 18.5 | 49.5  | 62.5   | 33.7 |
| SAM-B E32 | 11.4 | 39.6  | 59.1   | 27.3 |
| FastSAM   | 15.7 | 47.3  | 63.7   | 32.2 |

##### LVIS

bbox AR@1000. The rows below ViTDet-H are zero-shot transfer methods.

| Method    | all  | small | med. | large |
|:---------:|:----:|:-----:|:----:|:-----:|
| ViTDet-H  | 65.0 | 53.2  | 83.3 | 91.2  |
| SAM-H E64 | 52.1 | 36.6  | 75.1 | 88.2  |
| SAM-H E32 | 50.3 | 33.1  | 76.2 | 89.8  |
| SAM-B E32 | 45.0 | 29.3  | 68.7 | 80.6  |
| FastSAM   | 57.1 | 44.3  | 77.1 | 85.3  |

#### Instance Segmentation on COCO 2017

| Method   | AP   | APS  | APM  | APL  |
|:--------:|:----:|:----:|:----:|:----:|
| ViTDet-H | .510 | .320 | .543 | .689 |
| SAM      | .465 | .308 | .510 | .617 |
| FastSAM  | .379 | .239 | .434 | .500 |

### 4. Performance Visualization

Several segmentation results:

#### Natural Images

![Natural Images](assets/eightpic.png)

#### Text to Mask

![Text to Mask](assets/dog_clip.png)

### 5. Downstream Tasks

The results on several downstream tasks show the effectiveness of FastSAM.

#### Anomaly Detection

![Anomaly Detection](assets/anomaly.png)

#### Salient Object Detection

![Salient Object Detection](assets/salient.png)

#### Building Extraction

![Building Detection](assets/building.png)

## License

The model is licensed under the [Apache 2.0 license](LICENSE).

## Acknowledgement

- [Segment Anything](https://segment-anything.com/) provides the SA-1B dataset and the base codes.
- [YOLOv8](https://github.com/ultralytics/ultralytics) provides codes and pre-trained models.
- [YOLACT](https://arxiv.org/abs/2112.10003) provides a powerful instance segmentation method.
- [Grounded-Segment-Anything](https://huggingface.co/spaces/yizhangliu/Grounded-Segment-Anything) provides a useful web demo template.

## Citing FastSAM

If you find this project useful for your research, please consider citing the following BibTeX entry.

```
@misc{zhao2023fast,
  title={Fast Segment Anything},
  author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
  year={2023},
  eprint={2306.12156},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```