---
pipeline_tag: image-classification
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- image-classification
license: mit
language:
- en
base_model:
- timm/vit_base_patch14_reg4_dinov2.lvd142m
---

# PdiscoFormer NABirds Model (K=11)

PdiscoFormer (Vit-base-dinov2-reg4) trained on NABirds with K, the number of unsupervised parts to discover, set to 11.

PdiscoFormer is a novel method for unsupervised part discovery using self-supervised Vision Transformers. It achieves state-of-the-art results on this task, both qualitatively and quantitatively. The code is available in the following repository: https://github.com/ananthu-aniraj/pdiscoformer

# BibTeX entry and citation info

```
@misc{aniraj2024pdiscoformerrelaxingdiscoveryconstraints,
      title={PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers},
      author={Ananthu Aniraj and Cassio F. Dantas and Dino Ienco and Diego Marcos},
      year={2024},
      eprint={2407.04538},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.04538},
}
```