|
--- |
|
language: |
|
- ko |
|
- en |
|
library_name: transformers |
|
base_model: Bllossom/llama-3.2-Korean-Bllossom-AICA-5B |
|
tags: |
|
- vision-language |
|
- korean |
|
- image-to-text |
|
- multilingual |
|
- fashion |
|
- e-commerce |
|
- text-classification |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- mllama |
|
- lora |
|
datasets: |
|
- hateslopacademy/otpensource_data |
|
inference: true |
|
license: cc-by-4.0 |
|
model_name: otpensource-vision-lora |
|
size_categories: 1K<n<10K |
|
task_categories: |
|
- image-to-text |
|
- text-classification |
|
task_ids: |
|
- image-captioning |
|
- sentiment-analysis |
|
--- |
|
|
|
# otpensource-vision LoRA |
|
|
|
## ๋ชจ๋ธ ์ค๋ช
|
|
|
|
**otpensource-vision LoRA**๋ *otpensource-vision* ๋ชจ๋ธ์ ๊ธฐ๋ฐ์ผ๋ก **LoRA (Low-Rank Adaptation)** ๊ธฐ๋ฒ์ ํ์ฉํ์ฌ ํ์ต๋ ๊ฒฝ๋ Vision-Language ๋ชจ๋ธ์
๋๋ค. ๊ธฐ์กด ๋ชจ๋ธ ๋๋น ์ ์ ์ฐ์ฐ๋์ผ๋ก ํน์ ๋๋ฉ์ธ์ ์ต์ ํ๋ ๊ฒฐ๊ณผ๋ฅผ ์ ๊ณตํ๋ฉฐ, ํ๊ตญ์ด์ ์์ด๋ฅผ ์ง์ํฉ๋๋ค. |
|
|
|
### ์ฃผ์ ํน์ง |
|
- **LoRA ๊ธฐ๋ฐ ๊ฒฝ๋ ์ด๋ํฐ**: ๊ธฐ์กด ๋ชจ๋ธ์ ์ฑ๋ฅ์ ์ ์งํ๋ฉด์๋ ์ ์ ์์์ผ๋ก ์ถ๊ฐ ํ์ต์ด ๊ฐ๋ฅ |
|
- **Vision-Language ํ์คํฌ ์ง์**: ์ด๋ฏธ์ง๋ฅผ ์
๋ ฅ๋ฐ์ ํ
์คํธ ์ ๋ณด๋ฅผ ์์ฑํ๊ณ , ํ
์คํธ ์
๋ ฅ๋ง์ผ๋ก ์์ฐ์ด ์ฒ๋ฆฌ ์ํ |
|
- **ํจ์
๋ฐ์ดํฐ๋ฅผ ํ์ฉํ ํ์ต**: *otpensource_data*๋ฅผ ํ์ฉํด ํจ์
์นดํ
๊ณ ๋ฆฌ, ์์, ๊ณ์ ๋ฑ์ ์ ๋ณด๋ฅผ ๋ถ์ํ๋ ๋ฐ ์ต์ ํ |
|
- **๋น ๋ฅธ ์ ์ฉ ๋ฐ ํ์ฅ์ฑ**: ๊ธฐ์กด ๋ชจ๋ธ์ ๋ฏธ์ธ ์กฐ์ (Fine-tuning)ํ ๋ LoRA ์ด๋ํฐ๋ฅผ ํ์ฉํ์ฌ ๋น ๋ฅด๊ฒ ์ ์ฉ ๊ฐ๋ฅ |
|
|
|
--- |
|
|
|
## ๋ชจ๋ธ ์ธ๋ถ์ฌํญ |
|
|
|
### ํ์ต ๋ฐ์ดํฐ |
|
๋ชจ๋ธ ํ์ต์ ์ฌ์ฉ๋ ๋ฐ์ดํฐ์
: |
|
- **[otpensource_dataset](https://huggingface.co/datasets/hateslopacademy/otpensource_dataset)**: |
|
- ์ฝ 9000๊ฐ์ ํจ์
๋ฐ์ดํฐ๋ก ๊ตฌ์ฑ |
|
- ์ท์ ์นดํ
๊ณ ๋ฆฌ, ์์, ๊ณ์ , ํน์ง, ์ด๋ฏธ์ง URL ๋ฑ์ ํฌํจํ์ฌ Vision-Language ํ์ต์ ์ต์ ํ |
|
|
|
### ํ์ต ๋ฐฉ์ |
|
- **๊ธฐ๋ฐ ๋ชจ๋ธ**: Bllossom/llama-3.2-Korean-Bllossom-AICA-5B |
|
- **์ต์ ํ ๊ธฐ๋ฒ**: LoRA ์ ์ฉ |
|
- **GPU ์๊ตฌ์ฌํญ**: A100 40GB ์ด์ ๊ถ์ฅ |
|
- **ํ๋ จ ํจ์จ์ฑ**: LoRA๋ฅผ ํ์ฉํ์ฌ ๊ธฐ์กด ๋ชจ๋ธ ๋๋น 2๋ฐฐ ๋น ๋ฅธ ํ์ต ์ํ |
|
|
|
--- |
|
|
|
## ์ฃผ์ ์ฌ์ฉ ์ฌ๋ก |
|
|
|
### Vision-Language ํ์คํฌ |
|
1. **์ด๋ฏธ์ง ๋ถ์ ๋ฐ ์ค๋ช
** |
|
- ์
๋ ฅ๋ ์ด๋ฏธ์ง์์ ์ท์ ์นดํ
๊ณ ๋ฆฌ, ์์, ๊ณ์ , ํน์ง์ ์ถ์ถํ์ฌ JSON ํ์์ผ๋ก ๋ฐํ. |
|
- ์์: |
|
```json |
|
{ |
|
"category": "ํธ๋ ์น์ฝํธ", |
|
"gender": "์ฌ", |
|
"season": "SS", |
|
"color": "๋ค์ด๋น", |
|
"material": "", |
|
"feature": "ํธ๋ ์น์ฝํธ" |
|
} |
|
``` |
|
|
|
2. **ํ
์คํธ ๋ถ์ ๋ฐ ๋ถ๋ฅ** |
|
- ํ
์คํธ ์
๋ ฅ๋ง์ผ๋ก ๊ฐ์ ๋ถ์, ์ง๋ฌธ ์๋ต, ํ
์คํธ ์์ฝ ๋ฑ์ ์์ฐ์ด ์ฒ๋ฆฌ ํ์คํฌ ์ํ ๊ฐ๋ฅ. |
|
|
|
--- |
|
|
|
## ์ฝ๋ ์์ |
|
|
|
### Vision-Language ํ์คํฌ |
|
|
|
```python |
|
from transformers import MllamaForConditionalGeneration, MllamaProcessor |
|
import torch |
|
from PIL import Image |
|
import requests |
|
|
|
model = MllamaForConditionalGeneration.from_pretrained( |
|
'otpensource-vision-lora', |
|
torch_dtype=torch.bfloat16, |
|
device_map='auto' |
|
) |
|
processor = MllamaProcessor.from_pretrained('otpensource-vision-lora') |
|
|
|
url = "https://image.msscdn.net/thumbnails/images/prd_img/20240710/4242307/detail_4242307_17205916382801_big.jpg?w=1200" |
|
image = Image.open(requests.get(url, stream=True).raw) |
|
|
|
messages = [ |
|
{'role': 'user', 'content': [ |
|
{'type': 'image', 'image': image}, |
|
{'type': 'text', 'text': '์ด ์ท์ ์ ๋ณด๋ฅผ JSON์ผ๋ก ์๋ ค์ค.'} |
|
]} |
|
] |
|
|
|
input_text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
|
|
inputs = processor( |
|
image=image, |
|
text=input_text, |
|
add_special_tokens=False, |
|
return_tensors="pt", |
|
).to(model.device) |
|
|
|
output = model.generate(**inputs, max_new_tokens=256, temperature=0.1) |
|
print(processor.decode(output[0])) |
|
``` |
|
|
|
--- |
|
|
|
## ์
๋ก๋๋ ๋ชจ๋ธ ์ ๋ณด |
|
|
|
- **๊ฐ๋ฐ์**: hateslopacademy |
|
- **๋ผ์ด์ ์ค**: CC-BY-4.0 |
|
- **LoRA ํ์ต ๋ชจ๋ธ**: otpensource-vision ๊ธฐ๋ฐ |
|
|
|
์ด ๋ชจ๋ธ์ [Unsloth](https://github.com/unslothai/unsloth) ๋ฐ Hugging Face TRL ๋ผ์ด๋ธ๋ฌ๋ฆฌ๋ฅผ ํ์ฉํด ๊ธฐ์กด ๋ชจ๋ธ ๋๋น 2๋ฐฐ ๋น ๋ฅด๊ฒ ํ์ต๋์์ต๋๋ค. |
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|