license: other
library_name: timm
tags:
  - image-feature-extraction
  - timm

Model card for regnety_640.seer

A RegNetY-64GF feature / backbone model. Pretrained according to SEER: self-supervised learning with SwAV on "2B random internet images".

SEER is licensed under the SEER license, Copyright (c) Meta Platforms, Inc. All Rights Reserved. It is a non-commercial license with usage and distribution restrictions.

The timm RegNet implementation includes a number of enhancements not present in other implementations, including:

  • stochastic depth
  • gradient checkpointing
  • layer-wise LR decay
  • configurable output stride (dilation)
  • configurable activation and norm layers
  • option for a pre-activation bottleneck block, used in the RegNetV variant
  • only known RegNetZ model definitions with pretrained weights

Model Details

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # required for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_640.seer', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 32, 112, 112])
    #  torch.Size([1, 328, 56, 56])
    #  torch.Size([1, 984, 28, 28])
    #  torch.Size([1, 1968, 14, 14])
    #  torch.Size([1, 4920, 7, 7])

    print(o.shape)

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_640.seer',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 4920, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor

Model Comparison

Explore the dataset and runtime metrics of this model in timm model results.

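For the comparison summary below, the ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* tagged weights are trained in timm. top1 and top5 are ImageNet-1k validation accuracies in percent, param_count and macts are in millions, and gmacs is in billions of multiply-accumulate operations.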

model img_size top1 top5 param_count gmacs macts
regnety_1280.swag_ft_in1k 384 88.228 98.684 644.81 374.99 210.2
regnety_320.swag_ft_in1k 384 86.84 98.364 145.05 95.0 88.87
regnety_160.swag_ft_in1k 384 86.024 98.05 83.59 46.87 67.67
regnety_160.sw_in12k_ft_in1k 288 86.004 97.83 83.59 26.37 38.07
regnety_1280.swag_lc_in1k 224 85.996 97.848 644.81 127.66 71.58
regnety_160.lion_in12k_ft_in1k 288 85.982 97.844 83.59 26.37 38.07
regnety_160.sw_in12k_ft_in1k 224 85.574 97.666 83.59 15.96 23.04
regnety_160.lion_in12k_ft_in1k 224 85.564 97.674 83.59 15.96 23.04
regnety_120.sw_in12k_ft_in1k 288 85.398 97.584 51.82 20.06 35.34
regnety_2560.seer_ft_in1k 384 85.15 97.436 1282.6 747.83 296.49
regnetz_e8.ra3_in1k 320 85.036 97.268 57.7 15.46 63.94
regnety_120.sw_in12k_ft_in1k 224 84.976 97.416 51.82 12.14 21.38
regnety_320.swag_lc_in1k 224 84.56 97.446 145.05 32.34 30.26
regnetz_040_h.ra3_in1k 320 84.496 97.004 28.94 6.43 37.94
regnetz_e8.ra3_in1k 256 84.436 97.02 57.7 9.91 40.94
regnety_1280.seer_ft_in1k 384 84.432 97.092 644.81 374.99 210.2
regnetz_040.ra3_in1k 320 84.246 96.93 27.12 6.35 37.78
regnetz_d8.ra3_in1k 320 84.054 96.992 23.37 6.19 37.08
regnetz_d8_evos.ch_in1k 320 84.038 96.992 23.46 7.03 38.92
regnetz_d32.ra3_in1k 320 84.022 96.866 27.58 9.33 37.08
regnety_080.ra3_in1k 288 83.932 96.888 39.18 13.22 29.69
regnety_640.seer_ft_in1k 384 83.912 96.924 281.38 188.47 124.83
regnety_160.swag_lc_in1k 224 83.778 97.286 83.59 15.96 23.04
regnetz_040_h.ra3_in1k 256 83.776 96.704 28.94 4.12 24.29
regnetv_064.ra3_in1k 288 83.72 96.75 30.58 10.55 27.11
regnety_064.ra3_in1k 288 83.718 96.724 30.58 10.56 27.11
regnety_160.deit_in1k 288 83.69 96.778 83.59 26.37 38.07
regnetz_040.ra3_in1k 256 83.62 96.704 27.12 4.06 24.19
regnetz_d8.ra3_in1k 256 83.438 96.776 23.37 3.97 23.74
regnetz_d32.ra3_in1k 256 83.424 96.632 27.58 5.98 23.74
regnetz_d8_evos.ch_in1k 256 83.36 96.636 23.46 4.5 24.92
regnety_320.seer_ft_in1k 384 83.35 96.71 145.05 95.0 88.87
regnetv_040.ra3_in1k 288 83.204 96.66 20.64 6.6 20.3
regnety_320.tv2_in1k 224 83.162 96.42 145.05 32.34 30.26
regnety_080.ra3_in1k 224 83.16 96.486 39.18 8.0 17.97
regnetv_064.ra3_in1k 224 83.108 96.458 30.58 6.39 16.41
regnety_040.ra3_in1k 288 83.044 96.5 20.65 6.61 20.3
regnety_064.ra3_in1k 224 83.02 96.292 30.58 6.39 16.41
regnety_160.deit_in1k 224 82.974 96.502 83.59 15.96 23.04
regnetx_320.tv2_in1k 224 82.816 96.208 107.81 31.81 36.3
regnety_032.ra_in1k 288 82.742 96.418 19.44 5.29 18.61
regnety_160.tv2_in1k 224 82.634 96.22 83.59 15.96 23.04
regnetz_c16_evos.ch_in1k 320 82.634 96.472 13.49 3.86 25.88
regnety_080_tv.tv2_in1k 224 82.592 96.246 39.38 8.51 19.73
regnetx_160.tv2_in1k 224 82.564 96.052 54.28 15.99 25.52
regnetz_c16.ra3_in1k 320 82.51 96.358 13.46 3.92 25.88
regnetv_040.ra3_in1k 224 82.44 96.198 20.64 4.0 12.29
regnety_040.ra3_in1k 224 82.304 96.078 20.65 4.0 12.29
regnetz_c16.ra3_in1k 256 82.16 96.048 13.46 2.51 16.57
regnetz_c16_evos.ch_in1k 256 81.936 96.15 13.49 2.48 16.57
regnety_032.ra_in1k 224 81.924 95.988 19.44 3.2 11.26
regnety_032.tv2_in1k 224 81.77 95.842 19.44 3.2 11.26
regnetx_080.tv2_in1k 224 81.552 95.544 39.57 8.02 14.06
regnetx_032.tv2_in1k 224 80.924 95.27 15.3 3.2 11.37
regnety_320.pycls_in1k 224 80.804 95.246 145.05 32.34 30.26
regnetz_b16.ra3_in1k 288 80.712 95.47 9.72 2.39 16.43
regnety_016.tv2_in1k 224 80.66 95.334 11.2 1.63 8.04
regnety_120.pycls_in1k 224 80.37 95.12 51.82 12.14 21.38
regnety_160.pycls_in1k 224 80.288 94.964 83.59 15.96 23.04
regnetx_320.pycls_in1k 224 80.246 95.01 107.81 31.81 36.3
regnety_080.pycls_in1k 224 79.882 94.834 39.18 8.0 17.97
regnetz_b16.ra3_in1k 224 79.872 94.974 9.72 1.45 9.95
regnetx_160.pycls_in1k 224 79.862 94.828 54.28 15.99 25.52
regnety_064.pycls_in1k 224 79.716 94.772 30.58 6.39 16.41
regnetx_120.pycls_in1k 224 79.592 94.738 46.11 12.13 21.37
regnetx_016.tv2_in1k 224 79.44 94.772 9.19 1.62 7.93
regnety_040.pycls_in1k 224 79.23 94.654 20.65 4.0 12.29
regnetx_080.pycls_in1k 224 79.198 94.55 39.57 8.02 14.06
regnetx_064.pycls_in1k 224 79.064 94.454 26.21 6.49 16.37
regnety_032.pycls_in1k 224 78.884 94.412 19.44 3.2 11.26
regnety_008_tv.tv2_in1k 224 78.654 94.388 6.43 0.84 5.42
regnetx_040.pycls_in1k 224 78.482 94.24 22.12 3.99 12.2
regnetx_032.pycls_in1k 224 78.178 94.08 15.3 3.2 11.37
regnety_016.pycls_in1k 224 77.862 93.73 11.2 1.63 8.04
regnetx_008.tv2_in1k 224 77.302 93.672 7.26 0.81 5.15
regnetx_016.pycls_in1k 224 76.908 93.418 9.19 1.62 7.93
regnety_008.pycls_in1k 224 76.296 93.05 6.26 0.81 5.25
regnety_004.tv2_in1k 224 75.592 92.712 4.34 0.41 3.89
regnety_006.pycls_in1k 224 75.244 92.518 6.06 0.61 4.33
regnetx_008.pycls_in1k 224 75.042 92.342 7.26 0.81 5.15
regnetx_004_tv.tv2_in1k 224 74.57 92.184 5.5 0.42 3.17
regnety_004.pycls_in1k 224 74.018 91.764 4.34 0.41 3.89
regnetx_006.pycls_in1k 224 73.862 91.67 6.2 0.61 3.98
regnetx_004.pycls_in1k 224 72.38 90.832 5.16 0.4 3.14
regnety_002.pycls_in1k 224 70.282 89.534 3.16 0.2 2.17
regnetx_002.pycls_in1k 224 68.752 88.556 2.68 0.2 2.16
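
The rows above are whitespace-separated, so they are easy to parse for ad hoc comparisons. The sketch below copies the four SEER fine-tuned rows from the table and ranks them by top-1 accuracy per GMAC; the `Row` class, the `parse_row` helper, and the ranking metric are illustrative choices, not part of timm.

```python
from dataclasses import dataclass


@dataclass
class Row:
    model: str
    img_size: int
    top1: float
    top5: float
    param_count: float  # millions
    gmacs: float
    macts: float        # millions


# Rows copied verbatim from the comparison table above.
RAW = """\
regnety_2560.seer_ft_in1k 384 85.15 97.436 1282.6 747.83 296.49
regnety_1280.seer_ft_in1k 384 84.432 97.092 644.81 374.99 210.2
regnety_640.seer_ft_in1k 384 83.912 96.924 281.38 188.47 124.83
regnety_320.seer_ft_in1k 384 83.35 96.71 145.05 95.0 88.87
"""


def parse_row(line: str) -> Row:
    """Hypothetical helper: split one table line into typed fields."""
    name, img_size, *nums = line.split()
    return Row(name, int(img_size), *map(float, nums))


rows = [parse_row(line) for line in RAW.splitlines() if line.strip()]

best_top1 = max(rows, key=lambda r: r.top1)
by_efficiency = sorted(rows, key=lambda r: r.top1 / r.gmacs, reverse=True)

print(best_top1.model)
for r in by_efficiency:
    print(f"{r.model}: {r.top1 / r.gmacs:.3f} top1 per GMAC")
```

Accuracy per GMAC is only one rough way to weigh cost against quality; parameter count or activation count may matter more depending on the deployment target.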

Citation

@article{goyal2022vision,
  title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision}, 
  author={Priya Goyal and Quentin Duval and Isaac Seessel and Mathilde Caron and Ishan Misra and Levent Sagun and Armand Joulin and Piotr Bojanowski},
  year={2022},
  eprint={2202.08360},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}