AskYoutube/AskVideos-VideoCLIP-v0.1


AskYoutube committed on Dec 20, 2023

Commit c5d07e3 · 1 Parent(s): a226df2

Create README.md

Files changed (1)

README.md +33 -0

README.md ADDED

	@@ -0,0 +1,33 @@

+---
+license: mit
+---
+# AskVideos-VideoCLIP-7B-v0.1
+Like its image-only counterpart, CLIP, VideoCLIP lets you compute similarity scores, but between text and videos rather than between text and images.
+VideoCLIP uses a Video Q-Former to aggregate frame-level embeddings over time into a single embedding that stays relevant to the underlying content.
+The resulting embedding is then trained with contrastive learning to match its corresponding text, enabling similarity search across videos and text.
+# Usage
+```python
+import video_clip
+
+# Load the model and its visual preprocessor from the evaluation config
+eval_config = 'eval_configs/video_clip.yaml'
+model, vis_processor = video_clip.load_model(eval_config)
+
+# `videos` and `texts` are inputs you supply; see demo.ipynb for a full example
+# Compute video embeddings
+video_embs = video_clip.get_all_video_embeddings(videos, model, vis_processor)
+# Compute video-to-text similarity
+v2t_sim = video_clip.compute_sim(model, texts, video_embs)
+# Text-to-video similarity is the transpose of video-to-text similarity
+t2v_sim = v2t_sim.T
+# Compute distances between the first video and all videos
+v2v_dists = video_clip.compute_dist_videoq(model, video_embs[0], video_embs)
+```
+For a more detailed demo of how to use the model, see demo.ipynb.
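Once a similarity matrix such as `v2t_sim` is available, retrieval amounts to ranking the entries in each row. The sketch below is not part of the `video_clip` module: the helper name `top_k_matches` is hypothetical, and it assumes the matrix is a 2-D NumPy array of shape (num_videos, num_texts) in which higher values mean closer matches. If `compute_sim` returns a PyTorch tensor instead, it would first need converting, for example with `.cpu().numpy()`. The toy values are made up for illustration.

```python
import numpy as np


def top_k_matches(sim, k=3):
    """For each row of `sim`, return the column indices of the k highest scores, best first."""
    sim = np.asarray(sim, dtype=float)
    k = min(k, sim.shape[1])
    # Sorting the negated scores in ascending order yields descending similarity.
    order = np.argsort(-sim, axis=1, kind="stable")
    return order[:, :k]


# Toy stand-in for v2t_sim: 2 videos x 3 texts (illustrative values only).
toy_v2t_sim = np.array([[0.1, 0.7, 0.2],
                        [0.9, 0.3, 0.5]])

# Best-matching texts for each video (video-to-text retrieval).
best_texts_per_video = top_k_matches(toy_v2t_sim, k=2)

# Best-matching video for each text (text-to-video retrieval uses the transpose).
best_videos_per_text = top_k_matches(toy_v2t_sim.T, k=1)
```

The same helper serves both directions because text-to-video similarity is the transpose of video-to-text similarity, as in the usage snippet above.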