arXiv:2401.02931

SPFormer: Enhancing Vision Transformer with Superpixel Representation

Published on Jan 5, 2024

Abstract

In this work, we introduce SPFormer, a novel Vision Transformer enhanced by superpixel representation. Addressing the fixed-size, non-adaptive patch partitioning of traditional Vision Transformers, SPFormer employs superpixels that adapt to the image's content, dividing it into irregular, semantically coherent regions. This approach effectively captures intricate details and is applicable at both initial and intermediate feature levels. SPFormer is trainable end-to-end and exhibits superior performance across various benchmarks. Notably, it achieves significant improvements on the challenging ImageNet benchmark: a 1.4% gain over DeiT-T and a 1.1% gain over DeiT-S. A standout feature of SPFormer is its inherent explainability: the superpixel structure offers a window into the model's internal processes, providing valuable insights that enhance interpretability. This clarity also improves SPFormer's robustness in challenging scenarios such as image rotations and occlusions, demonstrating its adaptability and resilience.
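
The abstract describes the idea but not the implementation, so the following is a minimal, hypothetical PyTorch sketch of one way superpixel-style tokenization can replace a fixed patch grid: pixel features are soft-assigned to a set of adaptive token centers, attention runs over those tokens, and the result is scattered back to pixels. All names here (SuperpixelTokenizer, SuperpixelBlock) and design choices (learnable centers, a single soft-assignment step, residual unpooling) are illustrative assumptions, not the paper's actual architecture; real superpixel methods typically iterate the assignment and constrain it spatially.

```python
import torch
import torch.nn as nn


class SuperpixelTokenizer(nn.Module):
    """Soft-assigns pixel features to K superpixel tokens (illustrative sketch).

    Each pixel gets a similarity score to every superpixel center; tokens are
    the similarity-weighted averages of pixel features, so token regions adapt
    to image content instead of following a fixed patch grid.
    """

    def __init__(self, dim: int, num_superpixels: int):
        super().__init__()
        # Learnable initial superpixel centers (hypothetical design choice).
        self.centers = nn.Parameter(torch.randn(num_superpixels, dim))
        self.scale = dim ** -0.5

    def forward(self, feats: torch.Tensor):
        # feats: (B, N, C) flattened pixel/patch features.
        logits = feats @ self.centers.t() * self.scale   # (B, N, K) similarities
        assign = logits.softmax(dim=-1)                  # soft pixel-to-token assignment
        # Normalize so each token's pixel weights sum to 1, then pool features.
        weights = assign / assign.sum(dim=1, keepdim=True).clamp_min(1e-6)
        tokens = weights.transpose(1, 2) @ feats         # (B, K, C) superpixel tokens
        return tokens, assign


class SuperpixelBlock(nn.Module):
    """Attention over superpixel tokens, then soft unpooling back to pixels."""

    def __init__(self, dim: int, num_superpixels: int, num_heads: int = 4):
        super().__init__()
        self.tokenizer = SuperpixelTokenizer(dim, num_superpixels)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor):
        tokens, assign = self.tokenizer(feats)
        t = self.norm(tokens)
        tokens = tokens + self.attn(t, t, t, need_weights=False)[0]
        # Unpool: each pixel becomes a soft mixture of the updated tokens.
        return feats + assign @ tokens


# Toy usage: an 8x8 feature map with 48 channels, pooled into 16 tokens.
x = torch.randn(2, 8 * 8, 48)
block = SuperpixelBlock(dim=48, num_superpixels=16)
print(block(x).shape)  # torch.Size([2, 64, 48])
```

Because the assignment is a differentiable softmax rather than a hard clustering, a block like this trains end-to-end, which is one plausible reading of how the abstract's "trainable end-to-end" superpixels could work at both initial and intermediate feature levels.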
