|
--- |
|
pipeline_tag: image-segmentation |
|
--- |
|
|
|
<!--- |
|
Copyright 2024 The HuggingFace Team. All rights reserved. |
|
|
|
Licensed under the Apache License, Version 2.0 (the "License"); |
|
you may not use this file except in compliance with the License. |
|
You may obtain a copy of the License at |
|
|
|
http://www.apache.org/licenses/LICENSE-2.0 |
|
|
|
Unless required by applicable law or agreed to in writing, software |
|
distributed under the License is distributed on an "AS IS" BASIS, |
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
See the License for the specific language governing permissions and |
|
limitations under the License. |
|
--> |
|
|
|
# Instance Segmentation Example |
|
|
|
Contents:
|
- [PyTorch Version with Trainer](#pytorch-version-with-trainer) |
|
- [Reload and Perform Inference](#reload-and-perform-inference) |
|
- [Note on Custom Data](#note-on-custom-data) |
|
|
|
## PyTorch Version with Trainer |
|
|
|
This model was trained with the script [`run_instance_segmentation.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/instance-segmentation/run_instance_segmentation.py).

The script uses the [🤗 Trainer API](https://huggingface.co/docs/transformers/main_classes/trainer) to handle training automatically, including support for distributed environments.
|
Here, we fine-tune a [Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former) model on a subsample of the [ADE20K](https://huggingface.co/datasets/zhoubolei/scene_parse_150) dataset. We created a [small dataset](https://huggingface.co/datasets/qubvel-hf/ade20k-mini) with approximately 2,000 images containing only "person" and "car" annotations; all other pixels are marked as "background." |
|
|
|
Here is the `label2id` mapping for this model: |
|
|
|
```python
label2id = {
    "person": 0,
    "car": 1,
}
```
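
The inverse `id2label` mapping, used later to turn predicted `label_id`s back into class names, can be derived directly from `label2id`. A minimal sketch (the training script is assumed to set this on the model config for you):

```python
# Inverse mapping from class id to class name; stored in the model config
# so that predictions can be reported by name.
id2label = {class_id: class_name for class_name, class_id in label2id.items()}
print(id2label)  # {0: 'person', 1: 'car'}
```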
|
|
|
The model was trained with the following command; note that with `--per_device_train_batch_size 8` and `--gradient_accumulation_steps 2`, the effective batch size is 16 per device:
|
|
|
```bash
python run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
```
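
Before launching a run, it can help to sanity-check the dataset the command points at. A minimal sketch using 🤗 Datasets; the column names here (`image`, `annotation`) are an assumption, so check the dataset card if they differ:

```python
from datasets import load_dataset

# Load the training split of the mini ADE20K subset used above
dataset = load_dataset("qubvel-hf/ade20k-mini", split="train")

print(dataset)        # number of rows and column names
sample = dataset[0]
print(sample.keys())  # e.g. dict_keys(['image', 'annotation']) -- assumed names
```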
|
|
|
## Reload and Perform Inference |
|
|
|
You can load the trained model and run inference as follows:
|
|
|
```python
import torch
import requests

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load image
url = "http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Load model and image processor
device = "cuda" if torch.cuda.is_available() else "cpu"
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post-process outputs; target_sizes expects (height, width), hence image.size[::-1]
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)
```
|
|
|
```
Mask shape: torch.Size([427, 640])
Mask values: tensor([-1., 0., 1., 2., 3., 4., 5., 6.])
Segment: {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment: {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment: {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment: {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment: {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment: {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment: {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
```
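
In the post-processed mask, `-1` marks pixels that were not assigned to any instance, and every non-negative value is an instance `id` from `segments_info`. A minimal sketch for extracting a boolean mask for a single instance (this assumes the training script populated `model.config.id2label`, e.g. `{0: "person", 1: "car"}`):

```python
# Take the first detected instance and build a per-pixel boolean mask for it
segmentation = outputs[0]["segmentation"]
segment = outputs[0]["segments_info"][0]

instance_mask = segmentation == segment["id"]  # bool tensor with the same H x W as the image
label_name = model.config.id2label[segment["label_id"]]  # assumed mapping, e.g. "person"

print(f"{label_name}: {int(instance_mask.sum())} pixels, score {segment['score']:.3f}")
```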
|
|
|
Use the following code to visualize the results: |
|
|
|
```python
import numpy as np
import matplotlib.pyplot as plt

# Move the mask to CPU before converting to NumPy (a no-op if it is already on CPU)
segmentation = outputs[0]["segmentation"].cpu().numpy()

# Show the original image and the predicted instance mask side by side
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
```
|
|
|
![Result](https://i.imgur.com/rZmaRjD.png) |
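
To relate masks to the original image more directly, you can also overlay the instance map with some transparency. A small sketch reusing the variables above, masking out the `-1` background so only detected instances are colored:

```python
# Hide unassigned pixels (-1) and overlay the remaining instance ids on the photo
masked = np.ma.masked_where(segmentation == -1, segmentation)

plt.figure(figsize=(8, 8))
plt.imshow(np.array(image))
plt.imshow(masked, alpha=0.5, cmap="jet")
plt.axis("off")
plt.show()
```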
|
|