metadata

tags:
  - image-classification
  - timm
library_name: timm
license: apache-2.0
datasets:
  - imagenet-1k
metrics:
  - accuracy

Model card for hpx_former_b36

The model hpx_former_b36 is part of the HyenaPixel model family proposed in the paper "HyenaPixel: Global Image Context with Convolutions". HyenaPixel uses large convolutions as a replacement for attention by extending Hyena (Paper and GitHub) to support bidirectional, two-dimensional input. The operator is integrated into the MetaFormer (Paper and GitHub) framework.

The official PyTorch implementation of HyenaPixel can be found on GitHub.

Models

Model	Resolution (px)	Params	Top-1 Acc. (%)	Download
hpx_former_s18	224	29M	83.2	HuggingFace
hpx_former_s18_384	384	29M	84.7	HuggingFace
hb_former_s18	224	28M	83.5	HuggingFace
c_hpx_former_s18	224	28M	83.0	HuggingFace
hpx_a_former_s18	224	28M	83.6	HuggingFace
hb_a_former_s18	224	27M	83.2	HuggingFace
hpx_former_b36	224	111M	84.9	HuggingFace
hb_former_b36	224	102M	85.2	HuggingFace

Usage

pip install git+https://github.com/spravil/HyenaPixel.git

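import timm
import hyenapixel.models  # imported so that timm can find the HyenaPixel models

# Build the model and load its pretrained weights
model = timm.create_model("hpx_former_b36", pretrained=True)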
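To go from the loaded model to a prediction, something like the following sketch may be useful. It assumes the model exposes a timm-style data config (input size, mean, std) that `resolve_data_config` can read, and that the output is one row of ImageNet-1k logits per image. The helper names `top_k` and `classify` are hypothetical and not part of the HyenaPixel package.

```python
import math


def top_k(logits, k=5):
    """Return the k highest-scoring (class_index, probability) pairs.

    Uses a plain-Python softmax, so it works on any sequence of floats.
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


def classify(image_path, model_name="hpx_former_b36", k=5):
    """Classify one image file and return (class_index, probability) pairs."""
    # Heavy dependencies are imported here so top_k stays usable without them.
    import torch
    import timm
    from PIL import Image
    from timm.data import create_transform, resolve_data_config
    import hyenapixel.models  # noqa: F401  (needed so timm can find the model)

    model = timm.create_model(model_name, pretrained=True).eval()

    # Assumption: the model carries a timm-style config with its
    # preprocessing settings (input size, interpolation, mean, std).
    config = resolve_data_config({}, model=model)
    transform = create_transform(**config)

    image = Image.open(image_path).convert("RGB")
    batch = transform(image).unsqueeze(0)  # add a batch dimension

    with torch.no_grad():  # inference only, no gradients needed
        logits = model(batch)[0]

    return top_k(logits.tolist(), k)
```

Calling `classify("cat.jpg")` (with any local image path in place of the placeholder) should return the five most likely ImageNet-1k class indices with their softmax probabilities. Mapping the indices to human-readable labels requires a separate label file, which is not covered here.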

BibTeX

@article{spravil2024hyenapixel,
  title={HyenaPixel: Global Image Context with Convolutions},
  author={Julian Spravil and Sebastian Houben and Sven Behnke},
  journal={arXiv preprint arXiv:2402.19305},
  year={2024},
}