---
license: mit
datasets:
- liuhaotian/LLaVA-Instruct-150K
- LanguageBind/Video-LLaVA
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: transformers
---

# LSTP-Chat: Language-guided Spatial-Temporal Prompt Learning for Video Chat

Available Models:
- LSTP-Chat-7B (Vicuna-7b)

For more details, please refer to our [official repository](https://github.com/bigai-nlco/CDBert).