---
language:
- zh
- en
tags:
- internvl
- multimodal
- vision-language
- food
- finetuned
license: apache-2.0
datasets:
- food-recognition
model-index:
- name: InternVL2-2B-Food-Finetuned
  results:
  - task:
      type: vision-language-understanding
      name: food-recognition
    dataset:
      name: food-dataset
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 85.5
    - name: F1-Score
      type: f1
      value: 84.3
---

# InternVL2-2B Food Recognition Finetuned Model

## Model Description

This is a multimodal model fine-tuned from InternVL2-2B with LoRA on a food recognition dataset. It is specifically optimized for understanding and describing food images.

### Key Features

- **Base model**: InternVL2-2B
- **Fine-tuning method**: LoRA (Low-Rank Adaptation)
- **Training iterations**: 640
- **Domain**: food recognition and description
- **Multimodal capabilities**: image understanding and text generation

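To make the LoRA idea concrete, here is a minimal pure-Python sketch of the low-rank update, `W + (alpha / rank) * B @ A`, where the base weight `W` stays frozen and only the small matrices `A` and `B` are trained. The matrix shapes, the values of `A` and `B`, and the `alpha` and `rank` settings are all illustrative assumptions; this card does not state the actual LoRA hyperparameters or target layers.

```python
# Minimal LoRA sketch in pure Python. Shapes and values are made up for
# illustration; they are not the fine-tuned model's real configuration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, the weight obtained after merging LoRA."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_trainable_params(d_in, d_out, rank):
    """Count the trainable values in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical 2x3 frozen weight with a rank-1 adapter.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.5]]        # rank x d_in
B_zero = [[0.0], [0.0]]      # d_out x rank, zero-initialized as LoRA does
B_trained = [[2.0], [-2.0]]  # made-up values standing in for a trained B

# With B at zero, the adapter leaves the base weight unchanged.
merged_at_init = lora_effective_weight(W, A, B_zero, alpha=2, rank=1)
# After training, the low-rank product shifts every entry of W.
merged_trained = lora_effective_weight(W, A, B_trained, alpha=2, rank=1)
```

For a hypothetical 2048 x 2048 projection with rank 16, `lora_trainable_params(2048, 2048, 16)` gives 65,536 trainable values, compared with 4,194,304 for the full matrix. That gap is why LoRA fine-tuning is much cheaper than updating every weight.
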
## Training Details

### Base Model
- **Architecture**: InternVL2
- **Parameters**: 2B
- **Type**: vision-language multimodal model

### Fine-tuning
- **Method**: LoRA
- **Config file**: internvl_v2_internlm2_2b_lora_finetune_food.py
- **Training steps**: 640
- **Learning rate**: 3.5e-5
- **Epochs**: 10

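The step and epoch counts above can be related with a little arithmetic. The sketch below assumes the 640 training steps are spread evenly over the 10 epochs. The effective batch size is not stated in this card, so `ASSUMED_BATCH_SIZE` is a purely hypothetical value used only to show how a dataset-size bound would follow.

```python
def iterations_per_epoch(total_iterations: int, epochs: int) -> int:
    """Split a total iteration count evenly across epochs."""
    if epochs <= 0 or total_iterations % epochs != 0:
        raise ValueError("iterations must divide evenly across a positive number of epochs")
    return total_iterations // epochs


def max_samples_per_epoch(iters_per_epoch: int, effective_batch_size: int) -> int:
    """Upper bound on samples seen per epoch (the last batch may be partial)."""
    return iters_per_epoch * effective_batch_size


per_epoch = iterations_per_epoch(640, 10)

# Hypothetical value: the real batch size and gradient accumulation are not documented here.
ASSUMED_BATCH_SIZE = 8
upper_bound = max_samples_per_epoch(per_epoch, ASSUMED_BATCH_SIZE)

print(per_epoch, upper_bound)
```

Under these assumptions each epoch takes 64 steps. The sample bound scales linearly with whatever batch size was actually used.
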
## Usage
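
The following is a minimal inference sketch, not a confirmed recipe for this checkpoint. It assumes the LoRA weights have been merged and exported in the Hugging Face InternVL2 format, that a CUDA GPU is available, and that `torch`, `torchvision`, `transformers`, and `Pillow` are installed. The `model_path` and `image_path` arguments are placeholders, the helper names are hypothetical, and the single 448x448 tile is a simplification of the dynamic tiling used in the upstream InternVL2 examples.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_question(prompt):
    """Prefix the prompt with the <image> placeholder used by InternVL2 chat."""
    return "<image>\n" + prompt


def describe_food(model_path, image_path, prompt="Describe the food in this image."):
    """Load a merged InternVL2-style checkpoint and answer a question about one image."""
    # Heavy dependencies are imported lazily so build_question stays usable without them.
    import torch
    import torchvision.transforms as T
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code is required because InternVL2 ships its own modeling code.
    model = AutoModel.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(
        model_path, trust_remote_code=True, use_fast=False
    )

    # Simplified preprocessing: one 448x448 tile with ImageNet normalization.
    transform = T.Compose([
        T.Resize((448, 448), interpolation=T.InterpolationMode.BICUBIC),
        T.ToTensor(),
        T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])
    image = Image.open(image_path).convert("RGB")
    pixel_values = transform(image).unsqueeze(0).to(torch.bfloat16).cuda()

    generation_config = dict(max_new_tokens=256, do_sample=False)
    return model.chat(tokenizer, pixel_values, build_question(prompt), generation_config)
```

Calling `describe_food("path/to/merged-model", "dish.jpg")` with real paths substituted for the placeholders should return the model's text description of the image. If the checkpoint is still in the training framework's format, it would need to be converted before this sketch applies.
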