shehan97 committed on
Commit
0825bac
1 Parent(s): d77b990

Update README.md

  1. README.md +51 -1
README.md CHANGED
@@ -3,4 +3,54 @@ datasets:
  - imagenet-1k
  library_name: transformers
  pipeline_tag: image-classification
- ---
+ ---
+
+ # SwiftFormer
+
+ ## Model description
+
+ The SwiftFormer model was proposed in [SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications](https://arxiv.org/abs/2303.15446) by Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, and Fahad Shahbaz Khan.
+
+ The SwiftFormer paper introduces an efficient additive attention mechanism that replaces the quadratic matrix multiplication operations in the self-attention computation with linear element-wise multiplications. A series of models called 'SwiftFormer' is built on this mechanism and achieves state-of-the-art performance in terms of both accuracy and mobile inference speed. Even the small variant achieves 78.5% top-1 ImageNet-1K accuracy with only 0.8 ms latency on iPhone 14, which is more accurate and 2× faster than MobileViT-v2.
+
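To make the cost argument above concrete, here is a minimal, dependency-free sketch of additive attention. It is not the authors' implementation: the function name `additive_attention`, the softmax normalisation of the per-token scores, and the omission of the learned linear projections are all assumptions made for illustration. The point it shows is that each token interacts with one pooled global query through element-wise products, so the work grows linearly with the number of tokens rather than quadratically.

```python
import math


def additive_attention(queries, keys, w_a):
    """Simplified additive attention over n tokens of width d (illustrative only).

    queries, keys: lists of n vectors, each a list of d floats.
    w_a: list of d floats standing in for the learnable attention vector.
    """
    d = len(w_a)
    # One scalar score per token: dot(Q_i, w_a) / sqrt(d). Cost is O(n * d).
    scores = [
        sum(q_j * w_j for q_j, w_j in zip(q, w_a)) / math.sqrt(d)
        for q in queries
    ]
    # Assumption: normalise the scores with a softmax over the tokens.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Pool all query vectors into a single global query of width d.
    global_query = [
        sum(a * q[j] for a, q in zip(alphas, queries)) for j in range(d)
    ]
    # Element-wise product of each key with the global query, plus the
    # token's own query as a residual. No n-by-n attention matrix is built.
    return [
        [k[j] * global_query[j] + q[j] for j in range(d)]
        for q, k in zip(queries, keys)
    ]
```

For example, `additive_attention([[1.0, 0.0], [1.0, 0.0]], [[2.0, 3.0], [4.0, 5.0]], [1.0, 1.0])` pools the two identical queries into the global query `[1.0, 0.0]` and then scales each key by it.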
+ ## Intended uses & limitations
+
+
+
+
+ ## How to use
+
+
+ import requests
+ import torch
+ from PIL import Image
+ from transformers import SwiftFormerForImageClassification, ViTImageProcessor
+
+ # Download a sample image from the COCO 2017 validation set.
+ url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ # Preprocess the image into a batch of pixel values.
+ processor = ViTImageProcessor.from_pretrained('shehan97/swiftformer-xs')
+ inputs = processor(images=image, return_tensors="pt")
+
+ # Load the pretrained classifier and run inference without tracking gradients.
+ model = SwiftFormerForImageClassification.from_pretrained('shehan97/swiftformer-xs')
+ with torch.no_grad():
+     output = model(**inputs)
+
+ # Map the highest-scoring logit to its ImageNet-1K label.
+ logits = output.logits
+ predicted_class_idx = logits.argmax(-1).item()
+ print("Predicted class:", model.config.id2label[predicted_class_idx])
+
+
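The snippet above prints only the single best class. As a possible follow-up, the sketch below ranks the top few classes with their softmax probabilities. The helper `top_k_labels` is hypothetical (it is not part of `transformers`) and works on a plain list of floats, so it assumes the logits have first been converted, for example with `logits[0].tolist()`.

```python
import math


def top_k_labels(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs, best first.

    Hypothetical helper for illustration; not a transformers API.
    """
    # Numerically stable softmax over the raw logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]
```

With the usage code above, a call would look like `top_k_labels(logits[0].tolist(), id2label)`, where `id2label` is the loaded model's `config.id2label` mapping.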
+ ## Limitations and bias
+
+ ## Training data
+
+ The classification model is trained on the ImageNet-1K dataset.
+
+
+ ## Training procedure
+
+ ## Evaluation results
+
+
+