yanyoyo/InternVL


---
language:
- zh
- en
tags:
- internvl
- multimodal
- vision-language
- food
- finetuned
license: apache-2.0
datasets:
- food-recognition
model-index:
- name: InternVL2-2B-Food-Finetuned
  results:
  - task:
      type: vision-language-understanding
      name: food-recognition
    dataset:
      name: food-dataset
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 85.5
    - name: F1-Score
      type: f1
      value: 84.3
---

# InternVL2-2B Food Recognition Finetuned Model

## Model Description

This is a multimodal model fine-tuned from InternVL2-2B with LoRA on a food recognition dataset. It is specifically optimized for understanding and describing food images.

### Key Features

- Base model: InternVL2-2B
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Training iterations: 640
- Domain: food recognition and description
- Multimodal capabilities: image understanding and text generation
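To illustrate why LoRA is a lightweight way to fine-tune, the sketch below compares the trainable parameters of one full weight matrix with those of its low-rank adapter (two factors of shapes `d_out x rank` and `rank x d_in`). The dimensions and the rank are hypothetical values chosen for illustration; this card does not state the LoRA rank or the layer sizes that were actually used.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates every entry of the d_out x d_in matrix.
    LoRA freezes that matrix and trains two low-rank factors instead:
    B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer shape and rank, NOT taken from this model's config.
full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=8)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA adapter:     {lora:,} trainable parameters ({lora / full:.2%} of full)")
```

The saving grows with the layer size, because the adapter scales with `d_out + d_in` while the full matrix scales with `d_out * d_in`.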

## Training Details

### Base Model
- Architecture: InternVL2
- Parameters: 2B
- Type: vision-language multimodal model

### Fine-tuning
- Method: LoRA
- Config file: internvl_v2_internlm2_2b_lora_finetune_food.py
- Training steps: 640
- Learning rate: 3.5e-5
- Training epochs: 10
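As a quick sanity check on the numbers above, the following sketch derives the number of steps per epoch. It assumes that the 640 training steps are the total across all 10 epochs; the card does not state the batch size or dataset size, so nothing further is inferred here.

```python
def steps_per_epoch(total_steps: int, epochs: int) -> tuple[int, int]:
    """Split a total step count evenly across epochs.

    Returns (steps in each epoch, leftover steps that do not divide evenly).
    """
    if epochs <= 0:
        raise ValueError("epochs must be a positive integer")
    return divmod(total_steps, epochs)


# Values from the Fine-tuning list; treating 640 as the total is an assumption.
per_epoch, remainder = steps_per_epoch(total_steps=640, epochs=10)
print(f"{per_epoch} steps per epoch, {remainder} left over")
```

Under that assumption the schedule divides evenly, which is consistent with the step count and the epoch count describing the same run.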

## Usage