krasserm committed on
Commit
42333a1
·
1 Parent(s): 864295b

Create README.md

Files changed (1)
  1. README.md +116 -0
README.md ADDED
@@ -0,0 +1,116 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - imagenet
+ inference: false
+ language:
+ - en
+ pipeline_tag: image-classification
+ ---
+
+ # Perceiver IO image classifier
+
+ This model is a Perceiver IO model pretrained on ImageNet (14 million images, 1,000 classes). It is weight-equivalent
+ to the [deepmind/vision-perceiver-fourier](https://huggingface.co/deepmind/vision-perceiver-fourier) model but is based on
+ implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It can be created from
+ the `deepmind/vision-perceiver-fourier` model with a library-specific [conversion utility](#model-conversion). Both
+ models produce the same output for the same input.
+
+ The content of the `deepmind/vision-perceiver-fourier` [model card](https://huggingface.co/deepmind/vision-perceiver-fourier)
+ also applies to this model, except for the [usage examples](#usage-examples). Refer to the linked card for further model and
+ training details.
+
+ <img src="http://images.cocodataset.org/val2017/000000507223.jpg" alt="sample image" width=200>
+
+ ## Model description
+
+ The model is specified in Appendix A of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795) (2D Fourier features).
+
+ ## Intended use and limitations
+
+ The model can be used for image classification.
+
+ ## Usage examples
+
+ To use this model you first need to [install](https://github.com/krasserm/perceiver-io/blob/main/README.md#installation)
+ the `perceiver-io` library with extension `vision`.
+
+ ```shell
+ pip install perceiver-io[vision]
+ ```
+
+ Then the model can be used with PyTorch. Either use the model and image processor directly
+
+ ```python
+ import requests
+ from PIL import Image
+ from transformers import AutoModelForImageClassification, AutoImageProcessor
+ from perceiver.model.vision import image_classifier # auto-class registration
+
+ repo_id = "krasserm/perceiver-io-img-clf"
+
+ # An image of a baseball player from MS-COCO validation set
+ url = "http://images.cocodataset.org/val2017/000000507223.jpg"
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ model = AutoModelForImageClassification.from_pretrained(repo_id)
+ processor = AutoImageProcessor.from_pretrained(repo_id)
+
+ processed = processor(image, return_tensors="pt")
+ prediction = model(**processed).logits.argmax(dim=-1)
+
+ print(f"Predicted class = {model.config.id2label[prediction.item()]}")
+ ```
+ ```
+ Predicted class = ballplayer, baseball player
+ ```
+
+ or use an `image-classification` pipeline:
+
+ ```python
+ import requests
+ from PIL import Image
+ from transformers import pipeline
+ from perceiver.model.vision import image_classifier # auto-class registration
+
+ repo_id = "krasserm/perceiver-io-img-clf"
+
+ # An image of a baseball player from MS-COCO validation set
+ url = "http://images.cocodataset.org/val2017/000000507223.jpg"
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ classifier = pipeline("image-classification", model=repo_id)
+ prediction = classifier(image)
+
+ print(f"Predicted class = {prediction[0]['label']}")
+ ```
+ ```
+ Predicted class = ballplayer, baseball player
+ ```
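Both examples print only the most likely class. When a ranked list with probabilities is more useful, the logits can be passed through a softmax and sorted. The following is a minimal sketch in plain Python; `top_k`, `toy_logits` and `toy_id2label` are hypothetical names introduced for illustration and are not part of `perceiver-io` or `transformers`.

```python
import math


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one logits vector."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


# Toy values for illustration only, not real model output.
toy_logits = [2.0, 0.5, -1.0, 4.0]
toy_id2label = {0: "class_a", 1: "class_b", 2: "class_c", 3: "class_d"}

for label, prob in top_k(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With the direct-usage example, the inputs would be `model(**processed).logits[0].tolist()` and `model.config.id2label`. The pipeline variant already returns a list of dictionaries with `label` and `score` keys, so no extra step should be needed there.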
+
+ ## Model conversion
+
+ The `krasserm/perceiver-io-img-clf` model has been created from the source `deepmind/vision-perceiver-fourier` model
+ with:
+
+ ```python
+ from perceiver.model.vision.image_classifier import convert_model
+
+ convert_model(
+     save_dir="krasserm/perceiver-io-img-clf",
+     source_repo_id="deepmind/vision-perceiver-fourier",
+     push_to_hub=True,
+ )
+ ```
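One way to check that the converted model and the source model agree is to run both on the same input and compare their logits element-wise within a tolerance. The sketch below is plain Python; `logits_match` is a hypothetical helper and the `1e-4` tolerance is an assumption, neither comes from the `perceiver-io` library.

```python
import math


def logits_match(a, b, abs_tol=1e-4):
    """Return True if two logits vectors have equal length and agree element-wise."""
    if len(a) != len(b):
        return False
    # Compare with an absolute tolerance only, since logits can be close to zero.
    return all(math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b))


# Toy values for illustration only, not real model output.
source_logits = [0.1234, -2.5, 7.0]
converted_logits = [0.12341, -2.50002, 6.99999]

print(logits_match(source_logits, converted_logits))
```

In practice each list would come from `model(**processed).logits[0].tolist()`, evaluated once with the source model and once with the converted model.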
+
+ ## Citation
+
+ ```bibtex
+ @article{jaegle2021perceiver,
+ title={Perceiver IO: A General Architecture for Structured Inputs \& Outputs},
+ author={Jaegle, Andrew and Borgeaud, Sebastian and Alayrac, Jean-Baptiste and Doersch, Carl and Ionescu, Catalin and Ding, David and Koppula, Skanda and Zoran, Daniel and Brock, Andrew and Shelhamer, Evan and others},
+ journal={arXiv preprint arXiv:2107.14795},
+ year={2021}
+ }
+ ```