metadata
tags:
  - image-classification
library_name: coreml
license: other
license_name: apple-ascl
license_link: LICENSE
datasets:
  - imagenet-1k

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

Image Classification.

Please observe the original license.

Model Details

Evaluation - Variants

| Variant | Parameters | Size (MB) | Weight precision | Activation precision | Δ PyTorch accuracy |
| --- | --- | --- | --- | --- | --- |
| T8 | 3.6M | 7.8 | Float16 | Float16 | -0.9% |
| MA36 | 42.7M | 84 | Float16 | Float16 | -0.06% |
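
As a rough sanity check on the Size column, a Float16 weight occupies 2 bytes, so the parameter count alone gives an approximate size for the weights. The sketch below is illustrative only. It assumes 1 MB = 10**6 bytes and ignores everything in the package other than the weights, and the parameter counts in the table are rounded, so the estimates are not expected to match the reported sizes exactly. The names `VARIANTS` and `estimated_size_mb` are hypothetical.

```python
# Rough size estimate from the table above: Float16 stores each weight in 2 bytes.
# Assumptions (not from the model card): 1 MB = 10**6 bytes, and metadata and
# other package contents are ignored.
BYTES_PER_FLOAT16 = 2

# (parameter count, reported size in MB), copied from the table.
VARIANTS = {
    "T8": (3.6e6, 7.8),
    "MA36": (42.7e6, 84.0),
}


def estimated_size_mb(num_parameters: float) -> float:
    """Return the size in MB of the weights alone at Float16 precision."""
    return num_parameters * BYTES_PER_FLOAT16 / 1e6


for name, (params, reported_mb) in VARIANTS.items():
    estimate = estimated_size_mb(params)
    print(f"{name}: estimated {estimate:.1f} MB, reported {reported_mb} MB")
```

Under these assumptions the estimates land within about 10% of the reported sizes for both variants.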

Evaluation - Inference time

| Variant | Device | OS version | Inference time (ms) | Dominant compute unit |
| --- | --- | --- | --- | --- |
| T8 | iPhone 12 Pro Max | 17.5 | 0.79 | Neural Engine |
| T8 | M3 Max | 14.4 | 0.62 | Neural Engine |
| MA36 | iPhone 12 Pro Max | 18.0 | 4.50 | Neural Engine |
| MA36 | M3 Max | 15.0 | 2.99 | Neural Engine |
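
To put the latencies in perspective, the sketch below converts each inference time into an approximate throughput and compares the two variants on the same device. It assumes the reported time is per single image and that inferences run back to back with no preprocessing or scheduling overhead, neither of which the table states, so treat the numbers as upper bounds. The names `LATENCY_MS` and `images_per_second` are hypothetical.

```python
# Inference times in milliseconds, copied from the table above.
LATENCY_MS = {
    ("T8", "iPhone 12 Pro Max"): 0.79,
    ("T8", "M3 Max"): 0.62,
    ("MA36", "iPhone 12 Pro Max"): 4.50,
    ("MA36", "M3 Max"): 2.99,
}


def images_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in ms to an upper bound on throughput.

    Assumes back-to-back single-image inference with no other overhead.
    """
    return 1000.0 / latency_ms


for (variant, device), ms in LATENCY_MS.items():
    print(f"{variant} on {device}: about {images_per_second(ms):.0f} images/s")

# Relative cost of the larger variant on each device.
for device in ("iPhone 12 Pro Max", "M3 Max"):
    ratio = LATENCY_MS[("MA36", device)] / LATENCY_MS[("T8", device)]
    print(f"{device}: MA36 takes about {ratio:.1f}x as long as T8")
```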

Download

Install huggingface-cli with Homebrew:

brew install huggingface-cli

To download one of the .mlpackage folders to the models directory:

huggingface-cli download \
  --local-dir models --local-dir-use-symlinks False \
  apple/coreml-FastViT-T8 
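
The same download can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed (for example with `pip install huggingface_hub`) and that the `.mlpackage` folders sit at the top level of the repository. The helper names `download_model` and `find_mlpackages` are hypothetical and not part of any library.

```python
from pathlib import Path

# Repository and target directory taken from the command above.
REPO_ID = "apple/coreml-FastViT-T8"
LOCAL_DIR = "models"


def download_model(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> Path:
    """Download the whole repository into local_dir and return its path."""
    # Imported here so the rest of the module works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


def find_mlpackages(local_dir: str = LOCAL_DIR) -> list[Path]:
    """List the .mlpackage folders directly inside local_dir.

    Assumes the packages are at the top level of the downloaded repository.
    """
    return sorted(p for p in Path(local_dir).glob("*.mlpackage") if p.is_dir())
```

Calling `download_model()` followed by `find_mlpackages()` should list the downloaded packages, which can then be added to an Xcode project or opened with other Core ML tooling.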

Citation

@inproceedings{vasufastvit2023,
  author = {Pavan Kumar Anasosalu Vasu and James Gabriel and Jeff Zhu and Oncel Tuzel and Anurag Ranjan},
  title = {FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year = {2023}
}