tags:
- vision-language model
- llama
- generation
datasets:
- YanweiLi/Mini-Gemini-Instruction
Mini-Gemini-7B-HD Model Card
Model details
Mini-Gemini supports a series of dense and MoE Large Language Models (LLMs) from 2B to 34B, with simultaneous HD image understanding, reasoning, and generation.
Normal resolution setting: Mini-Gemini-2B, Mini-Gemini-7B, Mini-Gemini-13B, Mini-Gemini-8x7B, Mini-Gemini-34B
High resolution setting: Mini-Gemini-7B-HD, Mini-Gemini-13B-HD, Mini-Gemini-8x7B-HD, Mini-Gemini-34B-HD
Model type: Mini-Gemini is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Mini-Gemini empowers existing frameworks to support HD image understanding, reasoning, and generation simultaneously.
Model version: Mini-Gemini HD Version with LLM Vicuna-7B-v1.5
Model date: Mini-Gemini-7B-HD was trained in March 2024.
License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
Where to send questions or comments about the model: https://github.com/dvlab-research/MiniGemini/issues
Intended use
Primary intended uses: The primary use of Mini-Gemini is research on large multimodal models and chatbots.
Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
Training data
This model is trained on the Mini-Gemini-Instruction dataset. Please refer to the GitHub repository for more details.