nielsr HF staff committed on
Commit
bc7d330
1 Parent(s): f7798f9

Create README.md

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: apache-2.0
+ tags:
+ - vision
+ inference: false
+ ---
+
+ # SegGPT model
+
+ The SegGPT model was proposed in [SegGPT: Segmenting Everything In Context](https://arxiv.org/abs/2304.03284) by Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang.
+
+ ## Model description
+
+ SegGPT employs a decoder-only (GPT-like) Transformer that generates a segmentation mask for an input image, given a prompt image and its corresponding prompt mask.
+ The model achieves strong one-shot results: 56.1 mIoU on COCO-20i and 85.6 mIoU on FSS-1000.
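For readers unfamiliar with the metric quoted above: mIoU (mean intersection over union) averages, over classes, the overlap between predicted and reference masks divided by their union. The sketch below is not taken from the SegGPT codebase. `mean_iou` is a hypothetical helper that assumes both masks are integer label maps of the same shape and skips classes absent from both. Benchmark protocols typically accumulate intersections and unions over a whole dataset, so this per-image version is illustrative only.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Average per-class IoU over the classes present in pred or target."""
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    ious = []
    for cls in range(num_classes):
        pred_c = pred == cls
        target_c = target == cls
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both masks, so it is skipped
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


# Toy 2x3 label maps with three classes (0, 1, 2).
pred = np.array([[0, 0, 1], [1, 1, 2]])
target = np.array([[0, 1, 1], [1, 1, 2]])
score = mean_iou(pred, target, num_classes=3)  # per-class IoUs: 0.5, 0.75, 1.0
```

Papers usually report the value as a percentage, so a score of 0.75 here would be written as 75.0 mIoU.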
+
+ ## Intended uses & limitations
+
+ You can use the raw model for one-shot image segmentation.
+
+ ### How to use
+
+ Here's how to use the model for one-shot semantic segmentation:
+
+ ```python
+ import torch
+ from datasets import load_dataset
+ from transformers import SegGptImageProcessor, SegGptForImageSegmentation
+
+ checkpoint = "EduardoPacheco/seggpt-vit-large"
+ image_processor = SegGptImageProcessor.from_pretrained(checkpoint)
+ model = SegGptForImageSegmentation.from_pretrained(checkpoint)
+
+ dataset_id = "EduardoPacheco/FoodSeg103"
+ ds = load_dataset(dataset_id, split="train")
+ # Number of labels in FoodSeg103 (not including background)
+ num_labels = 103
+
+ image_input = ds[4]["image"]
+ ground_truth = ds[4]["label"]  # reference mask for the input image; not passed to the model
+ image_prompt = ds[29]["image"]
+ mask_prompt = ds[29]["label"]
+
+ inputs = image_processor(
+     images=image_input,
+     prompt_images=image_prompt,
+     prompt_masks=mask_prompt,
+     num_labels=num_labels,
+     return_tensors="pt",
+ )
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+
+ target_sizes = [image_input.size[::-1]]  # PIL size is (width, height); reverse to (height, width)
+ mask = image_processor.post_process_semantic_segmentation(outputs, target_sizes, num_labels=num_labels)[0]
+ ```
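To inspect the result visually, each class id in the predicted map can be turned into a color. This is a minimal sketch, assuming `mask` is a 2-D map of integer class ids with 0 as background; `colorize_mask` and its random palette are hypothetical helpers, not part of `transformers`. It runs on a small dummy array so that no model download is needed.

```python
import numpy as np


def colorize_mask(mask: np.ndarray, num_labels: int, seed: int = 0) -> np.ndarray:
    """Map an (H, W) array of class ids to an (H, W, 3) uint8 RGB image."""
    rng = np.random.default_rng(seed)
    # One random color per label; index 0 (assumed background) stays black.
    palette = rng.integers(0, 256, size=(num_labels + 1, 3), dtype=np.uint8)
    palette[0] = 0
    # Fancy indexing looks up the palette row for every pixel.
    return palette[mask]


# Dummy 2x2 label map standing in for the model output.
dummy_mask = np.array([[0, 1], [2, 2]])
rgb = colorize_mask(dummy_mask, num_labels=2)
```

If `mask` from the snippet above is a `torch.Tensor`, something like `colorize_mask(mask.cpu().numpy(), num_labels)` should give an array that `PIL.Image.fromarray` can display, though that has not been checked against the model output here.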
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @misc{wang2023seggpt,
+   title={SegGPT: Segmenting Everything In Context},
+   author={Xinlong Wang and Xiaosong Zhang and Yue Cao and Wen Wang and Chunhua Shen and Tiejun Huang},
+   year={2023},
+   eprint={2304.03284},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```