RTX 4090 can't process videos longer than 4 minutes
I'm using this model on my RTX 4090, but it can't process videos longer than 4 minutes. I've used the min and max resolution params as given in the model card. I want to use longer videos (10 minutes). Do you guys have suggestions?
Yes. Feeding a video directly to the model as input is still quite basic and not a reliable method.
A better approach would be to split the video into frames using ffmpeg and feed those to the model. This gives you much more granular control over what is processed, and you can do clever stuff like detecting when frames change and only calling the vision model when there is new information to extract.
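Here is a rough sketch of that idea in Python, not a tested pipeline. It assumes ffmpeg is on your PATH. The function names, the 1 fps sampling rate and the threshold of 10 are all placeholders I made up for illustration. The change detector expects each frame as a flat sequence of grayscale pixel values (0-255); decoding the JPEGs into that form (for example with an image library) is left out.

```python
import subprocess
from pathlib import Path


def ffmpeg_extract_cmd(video_path, out_dir, fps=1.0):
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    return [
        "ffmpeg", "-i", str(video_path),
        "-vf", f"fps={fps}",                     # sample rate is an assumption
        str(Path(out_dir) / "frame_%06d.jpg"),   # numbered output files
    ]


def extract_frames(video_path, out_dir, fps=1.0):
    """Run ffmpeg and return the extracted frame paths in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_extract_cmd(video_path, out_dir, fps), check=True)
    return sorted(Path(out_dir).glob("frame_*.jpg"))


def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_changed_frames(frames, threshold=10.0):
    """Return indices of frames that differ enough from the last kept frame."""
    kept = []
    last = None
    for i, frame in enumerate(frames):
        # Always keep the first frame; afterwards keep only real changes.
        if last is None or mean_abs_diff(frame, last) > threshold:
            kept.append(i)
            last = frame
    return kept
```

You would call extract_frames("talk.mp4", "frames") once, decode the results, and send only the frames returned by select_changed_frames to the vision model. Comparing against the last kept frame (rather than the previous one) means slow gradual changes still get picked up eventually.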
This way you'd be able to process a video of any length, provided you also implement a moving context window, so that only the most recent frame outputs are passed along as context instead of the whole history.
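A minimal sketch of what a moving context window could look like, again in Python. Here describe_frame is a hypothetical callable standing in for your vision model call (it is not part of any library), and window_size=8 is an arbitrary number you would tune to your VRAM and context length.

```python
from collections import deque


def summarize_with_window(frames, describe_frame, window_size=8):
    """Describe frames in order, passing only the most recent notes as context.

    `describe_frame(frame, context)` is a placeholder for the vision model call;
    `context` is the list of the last `window_size` notes produced so far.
    """
    window = deque(maxlen=window_size)  # oldest notes drop off automatically
    notes = []
    for index, frame in enumerate(frames):
        note = describe_frame(frame, list(window))
        window.append(note)
        notes.append((index, note))  # full log is kept outside the model context
    return notes
```

Because the context passed to the model never grows beyond window_size notes, memory use per call stays roughly constant no matter how long the video is. The complete notes list is still available afterwards as plain text.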
An added bonus is that if you do this, you can then use a bigger LLM to query the video data!