alih committed
Commit db8168a
Parent(s): 650219d

Update model name

- README.md +18 -7
- config.json +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ widget:
   example_title: Palace
 ---
 
-# NAT (mini variant)
+# NAT (mini variant)
 
 NAT-Mini trained on ImageNet-1K at 224x224 resolution.
 It was introduced in the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Hassani et al. and first released in [this repository](https://github.com/SHI-Labs/Neighborhood-Attention-Transformer).
@@ -38,20 +38,20 @@ NA is implemented in PyTorch implementations through its extension, [NATTEN](htt
 You can use the raw model for image classification. See the [model hub](https://huggingface.co/models?search=nat) to look for
 fine-tuned versions on a task that interests you.
 
-###
+### Example
 
-Here is how to use this model to classify an image
+Here is how to use this model to classify an image from the COCO 2017 dataset into one of the 1,000 ImageNet classes:
 
 ```python
-from transformers import
+from transformers import AutoImageProcessor, NatForImageClassification
 from PIL import Image
 import requests
 
 url = "http://images.cocodataset.org/val2017/000000039769.jpg"
 image = Image.open(requests.get(url, stream=True).raw)
 
-feature_extractor =
-model =
+feature_extractor = AutoImageProcessor.from_pretrained("shi-labs/nat-mini-in1k-224")
+model = NatForImageClassification.from_pretrained("shi-labs/nat-mini-in1k-224")
 
 inputs = feature_extractor(images=image, return_tensors="pt")
 outputs = model(**inputs)
@@ -61,7 +61,18 @@ predicted_class_idx = logits.argmax(-1).item()
 print("Predicted class:", model.config.id2label[predicted_class_idx])
 ```
 
-For more
+For more examples, please refer to the [documentation](https://huggingface.co/transformers/model_doc/nat.html#).
+
+### Requirements
+Other than transformers, this model requires the [NATTEN](https://shi-labs.com/natten) package.
+
+If you're on Linux, you can refer to [shi-labs.com/natten](https://shi-labs.com/natten) for instructions on installing with pre-compiled binaries (just select your torch build to get the correct wheel URL).
+
+You can alternatively use `pip install natten` to compile on your device, which may take up to a few minutes.
+Mac users only have the latter option (no pre-compiled binaries).
+
+Refer to [NATTEN's GitHub](https://github.com/SHI-Labs/NATTEN/) for more information.
+
 
 ### BibTeX entry and citation info
 
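The updated example ends by decoding `logits.argmax(-1)` into a single label. A minimal sketch of a natural follow-up, top-5 class probabilities, reusing the `model` and `outputs` variables from the snippet above (the `torch` import is the only addition):

```python
import torch

# Continues the README example: convert the classifier logits into
# probabilities and print the five most likely ImageNet classes.
probs = outputs.logits.softmax(dim=-1)
top5 = torch.topk(probs, k=5)
for score, idx in zip(top5.values[0], top5.indices[0]):
    print(f"{model.config.id2label[idx.item()]}: {score.item():.3f}")
```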
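Since the new Requirements section makes NATTEN a hard dependency, it can help to fail fast when the package is missing. A minimal sketch; the `natten` package name comes from the `pip install natten` instruction above, and the error message is illustrative:

```python
# Verify the NATTEN extension is importable before loading NAT models.
try:
    import natten  # noqa: F401
except ImportError as err:
    raise ImportError(
        "NAT models require the NATTEN extension; see https://shi-labs.com/natten "
        "for pre-compiled Linux wheels or run `pip install natten` to build locally."
    ) from err
```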
config.json CHANGED
@@ -1,6 +1,6 @@
 {
   "architectures": [
-    "
+    "NatForImageClassification"
   ],
   "attention_probs_dropout_prob": 0.0,
   "depths": [
|