Can this model be used for video captioning?

#16
by HugTibers - opened

I want to use this model to identify actions, such as falls, which cannot be judged by a single image.

I have the same question. Any new ideas?

Hi, refer to V-BLIP for video captioning: https://huggingface.co/models?other=video-captioning
