---
base_model:
- Qwen/Qwen2-0.5B-Instruct
- google/siglip-so400m-patch14-384
datasets:
- liuhaotian/LLaVA-Pretrain
- lmms-lab/LLaVA-ReCap-558K
- lmms-lab/LLaVA-ReCap-118K
- lmms-lab/LLaVA-ReCap-CC3M
- lmms-lab/LLaVA-OneVision-Mid-Data
- lmms-lab/LLaVA-OneVision-Data
- Zhiqiang007/MathV360K
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- LLaVA-OneVision-Manager
- LLaVA-OV-Manager
- Manager
---
Model weights for our submission to TCSVT, titled "Manager: Aggregating Insights from Unimodal Experts in Two-Tower VLMs and MLLMs".
Related materials can be found at [Paper](https://huggingface.co/papers/2506.11515), [Code](https://github.com/LooperXX/LLaVA-OV-Manager), and <https://looperxx.github.io/>.