	---
	library_name: transformers
	tags: []
	---

	# HumanF-MarkrAI/Gukbap-Gemma2-9B-VL🍚

	## Model Details🍚

	### Model Description
	- Developed by: HumanF-MarkrAI
	- Model type: Korean-VL-Gemma2-9B
	- Language(s): Korean + English
	- Context Length: 2048
	- License: cc-by-nc-4.0
	- Finetuned from model: [AIDC-AI/Ovis1.6-Gemma2-9B](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B).

	### Model Sources
	We trained the model on four `H100 80GB` GPUs.


	### Implications🍚
	For more details about our model, please see the [🔥Gukbap-LMM Blog🔥](coming_soon).
	We also provide Korean-LMM training code based on Ovis: [🔥Github🔥](coming_soon). Please star⭐⭐!!


	### Training Method (SFT)🧐
	The following papers describe the foundational methods behind our dataset and training approach.
	- [LIMA](https://arxiv.org/abs/2305.11206).
	- [Ovis](https://arxiv.org/abs/2405.20797).


	### SFT Text-Datasets (Private)
	To build the `Open-Source based dataset`, we used `microsoft/WizardLM-2-8x22B` through [DeepInfra](https://deepinfra.com/).
	The data was generated with the `Evolving system` proposed by [WizardLM](https://wizardlm.github.io/WizardLM2/).
	We trained on 1,849 examples and validated on 200 examples.

	- Wizard-Korea-Datasets: [MarkrAI/Markr_WizardLM_train_ver4](https://huggingface.co/datasets/MarkrAI/Markr_WizardLM_train_ver4).
	> Learning rate: 1e-5; Epoch: 2
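The settings stated in this card can be gathered into one place for reference. The snippet below is only a sketch: the class name `GukbapSFTSettings` and its field names are hypothetical, it is not the authors' training configuration file, and anything the card does not state (batch size, optimizer, scheduler) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GukbapSFTSettings:
    """Hypothetical container for the values stated in this model card."""

    base_model: str = "AIDC-AI/Ovis1.6-Gemma2-9B"  # "Finetuned from model"
    learning_rate: float = 1e-5                    # stated learning rate
    num_epochs: int = 2                            # stated epoch count
    num_train_examples: int = 1849                 # stated training set size
    num_val_examples: int = 200                    # stated validation set size
    context_length: int = 2048                     # "Context Length" in Model Description
    num_gpus: int = 4                              # four H100 80GB GPUs

    @property
    def val_fraction(self) -> float:
        """Share of all examples held out for validation."""
        total = self.num_train_examples + self.num_val_examples
        return self.num_val_examples / total


settings = GukbapSFTSettings()
print(f"validation share: {settings.val_fraction:.3f}")
```

With the stated sizes, roughly one example in ten is held out for validation.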


	## Benchmarks🤗

	### Global MM Benchmark Score (Zero-shot)

	We ran the evaluation internally with [VLMEvalKit](https://github.com/open-compass/VLMEvalKit?tab=readme-ov-file).
	We used chatgpt-0125, gpt-4o-mini, and gpt-4-turbo for the `MMBench`, `MathVista`, and `MMVet` evaluations, respectively.

	| Model | MMStar | MathVista | HallusionBench | AI2D | OCRBench | MMVet | MMBench_V11 | AVG |
	|:---------:|:-----:|:------:|:-----:|:-----:|:----:|:-----:|:-----:|:-----:|
	| Step-1o (closed model) | 69.3 | 74.7 | 55.8 | 89.1 | 92.6 | 82.8 | 87.3 | 78.8 |
	| InternVL2.5-78B-MPO (Open) | 72.1 | 76.6 | 58.1 | 89.2 | 90.9 | 73.5 | 87.8 | 78.3 |
	| InternVL2.5-38B-MPO (Open) | 70.1 | 73.6 | 59.7 | 87.9 | 89.4 | 72.6 | 85.4 | 77.0 |
	| Ovis1.6-Gemma2-27B (Open) | 63.5 | 70.1 | 54.1 | 86.6 | 85.6 | 68.0 | 82.2 | 72.9 |
	| Gemini-2.0-Flash | 69.4 | 70.4 | 58.0 | 83.1 | 82.5 | 73.6 | 71.0 | 72.6 |
	| GPT-4o-20241120 | 65.1 | 59.9 | 56.2 | 84.9 | 80.6 | 74.5 | 84.3 | 72.2 |
	| Ovis1.6-Gemma2-9B (Open) | 62.00 | 67.10 | 51.96 | 84.42 | 82.60 | 64.68 | 82.20 | 70.71 |
	| Gukbap-Gemma2-9B-VL🍚 | 62.13 | 66.00 | 53.01 | 84.49 | 82.80 | 63.90 | 82.20 | 70.65 |
	| LLaVA-OneVision-72B | 65.8 | 68.4 | 47.9 | 86.2 | 74.1 | 60.6 | 84.5 | 69.6 |
	| VARCO-VISION-14B (NCSoft) | 64.1 | 67.6 | 46.8 | 83.9 | 81.5 | 53.0 | 81.2 | 68.3 |
	| GPT-4o-mini-20240718 | 54.8 | 52.4 | 46.1 | 77.8 | 78.5 | 66.9 | 76.0 | 64.6 |
	> HallusionBench score: (aAcc + fAcc + qAcc) / 3
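The arithmetic behind the table can be sketched in a few lines. The AVG column appears to be the unweighted mean of the seven benchmark scores, since that reproduces the 70.65 reported for Gukbap-Gemma2-9B-VL. The function names are hypothetical, and the three sub-accuracies passed to `hallusion_score` are made-up placeholders, not reported results.

```python
from statistics import mean


def hallusion_score(a_acc: float, f_acc: float, q_acc: float) -> float:
    """HallusionBench score as defined in the note: mean of aAcc, fAcc, qAcc."""
    return (a_acc + f_acc + q_acc) / 3


def table_average(scores: list[float]) -> float:
    """Unweighted mean of the per-benchmark scores, rounded to two decimals."""
    return round(mean(scores), 2)


# The seven per-benchmark scores from the Gukbap-Gemma2-9B-VL row.
gukbap_scores = [62.13, 66.00, 84.49, 53.01, 82.80, 63.90, 82.20]
gukbap_avg = table_average(gukbap_scores)

# Placeholder sub-accuracies, used only to exercise the formula.
example_hallusion = hallusion_score(60.0, 45.0, 51.0)

print(gukbap_avg, example_hallusion)
```

Because the mean is unweighted, the order of the scores does not affect the result.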

	### Korean MM Benchmark Score (Zero-shot)

	We ran the evaluation internally with [our code](coming_soon).
	We used gpt-4o-2024-08-06 for the `K-LLAVA-W` evaluation.

	| Model | K-MMBench | K-MMStar | K-DTCBench | K-LLAVA-W | AVG |
	|:---------:|:-----:|:------:|:-----:|:-----:|:----:|
	| GPT-4o-20241120 | NaN | NaN | NaN | 85.50 | NaN |
	| Gukbap-Gemma2-9B-VL🍚 | 80.16 | 54.20 | 52.92 | 63.83 | 62.78 |
	| Ovis1.6-Gemma2-9B | 52.46 | 50.40 | 47.08 | 55.67 | 51.40 |
	| VARCO-VISION-14B | 87.16 | 58.13 | 85.42 | 51.17 | 70.47 |
	| llama-3.2-Korean-Bllossom-AICA-5B | 26.01 | 21.60 | 17.08 | 45.33 | 27.51 |

	### MM Benchmarks
	- Global MM Bench dataset: [OpenCompass MM leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal)
	- Korean MM Bench dataset: [NCSOFT](https://huggingface.co/NCSOFT).


	## Chat Prompt😶‍🌫️
	```
	<start_of_turn>user<image>
	Hello! My favorite food is Gukbap🍚!<end_of_turn>
	<start_of_turn>model
	(model answer)
	```
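The template above can be assembled programmatically. The following is a minimal sketch that mirrors the layout shown in this card literally. The helper name `build_prompt` and its `with_image` flag are hypothetical, and the trailing newline after `model` is an assumption. In practice the model's own preprocessing code may build this string for you, so treat this as an illustration of the format rather than the required API.

```python
IMAGE_TOKEN = "<image>"  # image placeholder as it appears in the card's template


def build_prompt(user_message: str, with_image: bool = True) -> str:
    """Format a single-turn prompt following the chat template in this card."""
    image_part = IMAGE_TOKEN if with_image else ""
    return (
        f"<start_of_turn>user{image_part}\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"  # the model's answer is generated after this line
    )


prompt = build_prompt("Hello! My favorite food is Gukbap🍚!")
print(prompt)
```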


	## Gukbap-VL Series models🍚🍚
	- [HumanF-MarkrAI/Gukbap-Qwen2.5-34B-VL](https://huggingface.co/HumanF-MarkrAI/Gukbap-Qwen2.5-34B-VL)


	## BibTeX
	```
	@article{HumanF-MarkrAI,
	title={Gukbap-Gemma2-9B-VL},
	author={MarkrAI},
	year={2025},
	url={https://huggingface.co/HumanF-MarkrAI}
	}
	```