VideoChat2-TPO
This model is based on the paper "Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment".
Installation
pip install -r requirements.txt
python app.py
Usage
from transformers import AutoModel, AutoTokenizer
from tokenizer import MultimodalLlamaTokenizer
model_path = "OpenGVLab/VideoChat-TPO"
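# trust_remote_code=True lets transformers run the custom code in the model repository.
tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True,
    use_fast=False,
)
# Pass the tokenizer loaded above to the model, then switch to inference mode.
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, _tokenizer=tokenizer).eval()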
Model tree for OpenGVLab/VideoChat-TPO
Base model
mistralai/Mistral-7B-Instruct-v0.2