Official baseline implementation for the ViCaS dataset. This is the pretrained (stage 2) model, which has been optimized for video captioning on a subset of WebVid10M and Panda70M.
For detailed information, refer to the [Video-LLaVA-Seg GitHub repository](https://github.com/Ali2500/Video-LLaVA-Seg/tree/main).