tabtoyou/KoLLaVA-KoVicuna-7b


KoLLaVA: Korean Large Language and Vision Assistant (feat. LLaVA)

This model is a large multimodal model (LMM) that combines the KoVicuna LLM with the CLIP visual encoder (ViT-14). It was trained on a Korean visual-instruction dataset.

The detailed code is available in the KoLLaVA GitHub repository.
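To illustrate how a visual encoder and an LLM can be combined, here is a minimal sketch of the LLaVA-style idea: patch features from the image encoder are projected into the LLM's embedding space and prepended to the text token embeddings. The sizes, the random weights, the single linear projection, and the function names `linear_project` and `build_multimodal_sequence` are all assumptions made for illustration; this is not the KoLLaVA implementation, which lives in the repository mentioned above.

```python
import random


def linear_project(features, weight, bias):
    """Map each patch feature (length d_vision) to a vector of length d_model."""
    return [
        [sum(f * w for f, w in zip(feat, row)) + b for row, b in zip(weight, bias)]
        for feat in features
    ]


def build_multimodal_sequence(image_features, text_embeddings, weight, bias):
    """Project image features and prepend them to the text embeddings."""
    visual_tokens = linear_project(image_features, weight, bias)
    return visual_tokens + text_embeddings


random.seed(0)

# Toy sizes (assumptions): real encoders and LLMs use far larger dimensions.
D_VISION, D_MODEL = 4, 6
NUM_PATCHES, NUM_TEXT_TOKENS = 3, 5

# Stand-ins for the visual encoder output and the LLM's text token embeddings.
image_features = [
    [random.gauss(0, 1) for _ in range(D_VISION)] for _ in range(NUM_PATCHES)
]
text_embeddings = [
    [random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(NUM_TEXT_TOKENS)
]

# Projection parameters: D_MODEL rows, each of length D_VISION.
weight = [[random.gauss(0, 0.02) for _ in range(D_VISION)] for _ in range(D_MODEL)]
bias = [0.0] * D_MODEL

sequence = build_multimodal_sequence(image_features, text_embeddings, weight, bias)
print(len(sequence), len(sequence[0]))  # 8 6
```

The LLM then processes the combined sequence like any other sequence of token embeddings, which is why the projection has to output vectors of the LLM's hidden size.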

Training hyperparameters

learning_rate: 2e-5
train_batch_size: 16
distributed_type: multi-GPU (A100 80G)
num_devices: 4
gradient_accumulation_steps: 1
total_train_batch_size: 64
total_eval_batch_size: 16
lr_scheduler_type: cosine
num_epochs: 1
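
The relationship between these values can be checked with a short sketch. It assumes the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and it models the cosine schedule as a plain half-cosine decay from the peak learning rate to zero. The source does not state any warmup or the number of training steps, so warmup is omitted and `TOTAL_STEPS` is a hypothetical value.

```python
import math


def total_train_batch_size(per_device, num_devices, grad_accum_steps):
    """Effective number of samples per optimizer step."""
    return per_device * num_devices * grad_accum_steps


def cosine_lr(step, total_steps, peak_lr):
    """Half-cosine decay from peak_lr to 0 (no warmup; an assumption)."""
    progress = step / max(1, total_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the hyperparameter list above.
effective = total_train_batch_size(per_device=16, num_devices=4, grad_accum_steps=1)
print(effective)  # 64

PEAK_LR = 2e-5
TOTAL_STEPS = 1000  # hypothetical; the real step count depends on dataset size

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, cosine_lr(step, TOTAL_STEPS, PEAK_LR))
```

With these settings the learning rate starts at 2e-5, passes through roughly half that value at the midpoint, and reaches zero at the end of the single epoch.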

Model License: Apache License 2.0

