krasserm committed on
Commit
ef60b7d
·
1 Parent(s): 42333a1

Update README.md

Files changed (1)
  1. README.md +45 -47
README.md CHANGED
@@ -1,35 +1,39 @@
1
  ---
2
  license: apache-2.0
 
3
  datasets:
4
  - c4
5
  - wikipedia
6
- inference: false
7
  language:
8
  - en
9
  pipeline_tag: fill-mask
10
  ---
11
 
12
- # Perceiver IO image classifier
13
 
14
- This model is a Perceiver IO model pretrained on ImageNet (14 million images, 1,000 classes). It is weight-equivalent
15
- to the [deepmind/vision-perceiver-fourier](https://huggingface.co/deepmind/vision-perceiver-fourier) model but based on
16
- implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It can be created from
17
- the `deepmind/vision-perceiver-fourier` model with a library-specific [conversion utility](#model-conversion). Both
18
- models generate equal output for the same input.
 
19
 
20
- Content of the `deepmind/vision-perceiver-fourier` [model card](https://huggingface.co/deepmind/vision-perceiver-fourier)
21
  also applies to this model except [usage examples](#usage-examples). Refer to the linked card for further model and
22
  training details.
23
 
24
- <img src="http://images.cocodataset.org/val2017/000000507223.jpg" alt="sample image" width=200>
25
-
26
  ## Model description
27
 
28
- The model is specified in Appendix A of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795) (2D Fourier features).
 
29
 
30
- ## Intended use and limitations
31
 
32
- The model can be used for image classification.
 
 
 
 
33
 
34
  ## Usage examples
35
 
@@ -40,66 +44,60 @@ the `perceiver-io` library with extension `text`.
40
  pip install perceiver-io[text]
41
  ```
42
 
43
- Then the model can be used with PyTorch. Either use the model and image processor directly
44
 
45
  ```python
46
- import requests
47
- from PIL import Image
48
- from transformers import AutoModelForImageClassification, AutoImageProcessor
49
- from perceiver.model.vision import image_classifier # auto-class registration
50
 
51
- repo_id = "krasserm/perceiver-io-img-clf"
52
 
53
- # An image of a baseball player from MS-COCO validation set
54
- url = "http://images.cocodataset.org/val2017/000000507223.jpg"
55
- image = Image.open(requests.get(url, stream=True).raw)
56
 
57
- model = AutoModelForImageClassification.from_pretrained(repo_id)
58
- processor = AutoImageProcessor.from_pretrained(repo_id)
59
 
60
- processed = processor(image, return_tensors="pt")
61
- prediction = model(**processed).logits.argmax(dim=-1)
62
 
63
- print(f"Predicted class = {model.config.id2label[prediction.item()]}")
 
 
64
  ```
65
  ```
66
- Predicted class = ballplayer, baseball player
67
  ```
68
 
69
- or use an `image-classification` pipeline:
70
 
71
  ```python
72
- import requests
73
- from PIL import Image
74
- from transformers import pipeline
75
- from perceiver.model.vision import image_classifier # auto-class registration
76
-
77
- repo_id = "krasserm/perceiver-io-img-clf"
78
 
79
- # An image of a baseball player from MS-COCO validation set
80
- url = "http://images.cocodataset.org/val2017/000000507223.jpg"
81
- image = Image.open(requests.get(url, stream=True).raw)
82
 
83
- classifier = pipeline("image-classification", model=repo_id)
84
- prediction = classifier(image)
85
 
86
- print(f"Predicted class = {prediction[0]['label']}")
 
 
87
  ```
88
  ```
89
- Predicted class = ballplayer, baseball player
90
  ```
91
 
92
  ## Model conversion
93
 
94
- The `krasserm/perceiver-io-img-clf` model has been created from the source `deepmind/vision-perceiver-fourier` model
95
- with:
96
 
97
  ```python
98
- from perceiver.model.vision.image_classifier import convert_model
99
 
100
  convert_model(
101
- save_dir="krasserm/perceiver-io-img-clf",
102
- source_repo_id="deepmind/vision-perceiver-fourier",
103
  push_to_hub=True,
104
  )
105
  ```
 
1
  ---
2
  license: apache-2.0
3
+ inference: false
4
  datasets:
5
  - c4
6
  - wikipedia
 
7
  language:
8
  - en
9
  pipeline_tag: fill-mask
10
  ---
11
 
12
+ # Perceiver IO masked language model
13
 
14
+ This model is a Perceiver IO model pretrained on the masked language modeling (MLM) task using a text corpus created
15
+ from [C4](https://huggingface.co/datasets/c4) and [English Wikipedia](https://huggingface.co/datasets/wikipedia). It
16
+ is weight-equivalent to the [deepmind/language-perceiver](https://huggingface.co/deepmind/language-perceiver) model
17
+ but based on implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It can
18
+ be created from the `deepmind/language-perceiver` model with a library-specific [conversion utility](#model-conversion).
19
+ Both models generate equal output for the same input.
20
 
21
+ Content of the `deepmind/language-perceiver` [model card](https://huggingface.co/deepmind/language-perceiver)
22
  also applies to this model except [usage examples](#usage-examples). Refer to the linked card for further model and
23
  training details.
24
 
 
 
25
  ## Model description
26
 
27
+ The model is specified in Section 4 (Table 1) and Appendix F (Table 11) of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795)
28
+ (UTF-8 bytes tokenization, vocabulary size of 262, 201M parameters).
29
 
30
+ ## Intended use
31
 
32
+ Although the raw model can be [used directly](#usage-examples) for masked language modeling, the main use case is
33
+ fine-tuning. This can be fine-tuning with masked language modeling and whole word masking on an unlabeled dataset
34
+ ([example](https://huggingface.co/krasserm/perceiver-io-mlm-imdb)) or fine-tuning on a labeled dataset using the
35
+ pretrained encoder of this model ([example](https://huggingface.co/krasserm/perceiver-io-txt-clf-imdb)) for weight
36
+ initialization.
37
 
38
  ## Usage examples
39
 
 
44
  pip install perceiver-io[text]
45
  ```
46
 
47
+ Then the model can be used with PyTorch. Either use the model and tokenizer directly
48
 
49
  ```python
50
+ from transformers import AutoModelForMaskedLM, AutoTokenizer
51
+ from perceiver.model.text import mlm # auto-class registration
 
 
52
 
53
+ repo_id = "krasserm/perceiver-io-mlm"
54
 
55
+ model = AutoModelForMaskedLM.from_pretrained(repo_id)
56
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
 
57
 
58
+ masked_text = "This is an incomplete sentence where some words are" \
59
+ "[MASK][MASK][MASK][MASK][MASK][MASK][MASK][MASK][MASK]"
60
 
61
+ encoding = tokenizer(masked_text, return_tensors="pt")
62
+ outputs = model(**encoding)
63
 
64
+ # get predictions for 9 [MASK] tokens (exclude [SEP] token at the end)
65
+ masked_token_predictions = outputs.logits[0, -10:-1].argmax(dim=-1)
66
+ print(tokenizer.decode(masked_token_predictions))
67
  ```
68
  ```
69
+ missing.
70
  ```
71
 
72
+ or use a `fill-mask` pipeline:
73
 
74
  ```python
75
+ from transformers import pipeline
76
+ from perceiver.model.text import mlm # auto-class registration
 
 
 
 
77
 
78
+ repo_id = "krasserm/perceiver-io-mlm"
 
 
79
 
80
+ masked_text = "This is an incomplete sentence where some words are" \
81
+ "[MASK][MASK][MASK][MASK][MASK][MASK][MASK][MASK][MASK]"
82
 
83
+ filler_pipeline = pipeline("fill-mask", model=repo_id)
84
+ masked_token_predictions = filler_pipeline(masked_text)
85
+ print("".join([pred[0]["token_str"] for pred in masked_token_predictions]))
86
  ```
87
  ```
88
+ missing.
89
  ```
90
 
91
  ## Model conversion
92
 
93
+ The `krasserm/perceiver-io-mlm` model has been created from the source `deepmind/language-perceiver` model with:
 
94
 
95
  ```python
96
+ from perceiver.model.text.mlm import convert_model
97
 
98
  convert_model(
99
+ save_dir="krasserm/perceiver-io-mlm",
100
+ source_repo_id="deepmind/language-perceiver",
101
  push_to_hub=True,
102
  )
103
  ```
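
A note on the usage example in the new README: the model works on UTF-8 bytes, so each `[MASK]` stands for one byte rather than one word, and the slice `outputs.logits[0, -10:-1]` selects the nine masked positions just before the final `[SEP]`. The following is a minimal sketch of that arithmetic in plain Python. The `encode` and `decode` helpers are hypothetical stand-ins, and the special-token order (six special ids placed before the 256 byte ids) is an assumption chosen to be consistent with the stated vocabulary size of 262; it is not the library's tokenizer.

```python
# Illustrative stand-in for a UTF-8 byte tokenizer; NOT the perceiver-io or
# transformers implementation. The special-token order below is an assumption.
SPECIAL_TOKENS = ["[PAD]", "[BOS]", "[EOS]", "[MASK]", "[CLS]", "[SEP]"]
OFFSET = len(SPECIAL_TOKENS)   # byte value b is mapped to id b + OFFSET
VOCAB_SIZE = 256 + OFFSET      # 262, the vocabulary size given in the model description
MASK_ID = SPECIAL_TOKENS.index("[MASK]")
CLS_ID = SPECIAL_TOKENS.index("[CLS]")
SEP_ID = SPECIAL_TOKENS.index("[SEP]")


def encode(text):
    """Return [CLS] + one id per UTF-8 byte (each [MASK] is a single id) + [SEP]."""
    ids = [CLS_ID]
    for i, piece in enumerate(text.split("[MASK]")):
        if i > 0:
            ids.append(MASK_ID)  # one id for every [MASK] marker between pieces
        ids.extend(b + OFFSET for b in piece.encode("utf-8"))
    ids.append(SEP_ID)
    return ids


def decode(ids):
    """Drop special-token ids and decode the remaining bytes as UTF-8."""
    return bytes(i - OFFSET for i in ids if i >= OFFSET).decode("utf-8", errors="replace")


masked_text = "This is an incomplete sentence where some words are" + "[MASK]" * 9
ids = encode(masked_text)

# The last position is [SEP]; the nine positions before it are the [MASK] ids,
# which is why the README slices the logits with [-10:-1].
print(ids[-10:-1] == [MASK_ID] * 9)

# " missing." (leading space included) occupies exactly nine bytes,
# one per [MASK] position.
print(len(" missing.".encode("utf-8")))
print(repr(decode(encode(" missing."))))
```

Under these assumptions the nine masked positions line up one-to-one with the nine bytes of the completion, including the leading space, which is why the masks follow `are` in the README without a separating space.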