Commit c9bac54 (verified) by shepnerd · 1 parent: eb19c00

Update README.md

  1. README.md +36 -6
README.md CHANGED
@@ -1,11 +1,41 @@
  ---
  license: mit
  pipeline_tag: video-classification
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: [More Information Needed]
- - Docs: [More Information Needed]

  ---
  license: mit
  pipeline_tag: video-classification
  ---

+ ## Introduction
+
+ This repository contains the stage-2 6B model from the paper [InternVideo2](https://arxiv.org/pdf/2403.15377).
+
+ Code: https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo2/multi_modality
+
+ ## 🚀 Installation
+
+ Please refer to https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/INSTALL.md
+
+ ## Usage
+
+ ```python
+ import cv2
+ from transformers import AutoModel
+ from modeling_internvideo2 import retrieve_text, vid2tensor, _frame_from_video
+
+
+ if __name__ == '__main__':
+     # trust_remote_code=True lets transformers run the model code shipped with the repository.
+     model = AutoModel.from_pretrained("OpenGVLab/InternVideo2-Stage2_6B", trust_remote_code=True).eval()
+
+     # Read the frames of the example video.
+     video = cv2.VideoCapture('example1.mp4')
+     frames = list(_frame_from_video(video))
+     text_candidates = [
+         "A playful dog and its owner wrestle in the snowy yard, chasing each other with joyous abandon.",
+         "A man in a gray coat walks through the snowy landscape, pulling a sleigh loaded with toys.",
+         "A person dressed in a blue jacket shovels the snow-covered pavement outside their house.",
+         "A cat excitedly runs through the yard, chasing a rabbit.",
+         "A person bundled up in a blanket walks through the snowy landscape, enjoying the serene winter scenery.",
+     ]
+
+     # Match the candidate captions against the video and print each returned text with its probability.
+     texts, probs = retrieve_text(frames, text_candidates, model=model, topk=5)
+     for t, p in zip(texts, probs):
+         print(f'text: {t} ~ prob: {p:.4f}')
+
+     # Compute a video feature directly from a file.
+     vidtensor = vid2tensor('example1.mp4', fnum=4)
+     feat = model.get_vid_feat(vidtensor)
+ ```
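
The usage snippet returns ranked texts with probabilities, but the diff does not show how `retrieve_text` computes them. As a rough illustration only, here is a minimal sketch of one common way to score video-text retrieval: cosine similarity between embeddings followed by a softmax. The function `rank_texts`, its `temperature` parameter, and the toy vectors are hypothetical and are not taken from the InternVideo2 code, which may work differently.

```python
import math


def rank_texts(video_feat, text_feats, texts, topk=5, temperature=0.01):
    """Rank texts against one video embedding (illustrative, not InternVideo2's code)."""

    def normalize(vec):
        # Scale to unit length so the dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    v = normalize(video_feat)
    sims = [sum(a * b for a, b in zip(v, normalize(t))) for t in text_feats]

    # Softmax over temperature-scaled similarities; subtract the max for stability.
    scaled = [s / temperature for s in sims]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the topk most probable candidates, best first.
    order = sorted(range(len(texts)), key=lambda i: probs[i], reverse=True)[:topk]
    return [texts[i] for i in order], [probs[i] for i in order]


if __name__ == '__main__':
    # Toy 2-D embeddings stand in for real model features (assumption).
    ranked, scores = rank_texts(
        [1.0, 0.0],
        [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]],
        ["caption a", "caption b", "caption c"],
        topk=2,
    )
    for t, p in zip(ranked, scores):
        print(f'text: {t} ~ prob: {p:.4f}')
```

With real features, the video embedding would come from a call such as `model.get_vid_feat(...)` in the snippet above; how the text embeddings are produced is not shown in this diff.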