VideoChat2-TPO

This model is based on the paper "Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment".

πŸƒ Installation

pip install -r requirements.txt
python app.py

🔧 Usage

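from transformers import AutoModel, AutoTokenizer
from tokenizer import MultimodalLlamaTokenizer

model_path = "OpenGVLab/VideoChat-TPO"

# trust_remote_code=True lets transformers run the custom code shipped with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(
    model_path, trust_remote_code=True, use_fast=False
)

# Hand the tokenizer created above to the model, then switch to inference mode.
model = AutoModel.from_pretrained(
    model_path, trust_remote_code=True, _tokenizer=tokenizer
).eval()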

Model size: 8.1B params (Safetensors; tensor types I64 · BF16)