---
language:
- en
- ko
pipeline_tag: text-generation
inference: false
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- llama-2-chat
library_name: peft
---

# komt : korean multi task instruction tuning model

![multi task instruction tuning.jpg](https://github.com/davidkim205/komt/assets/16680469/c7f6ade7-247e-4b62-a94f-47e19abea68e)

Following the success of ChatGPT, numerous large language models have emerged in an attempt to match its capabilities. For Korean, however, many of these models still struggle to provide accurate answers or to generate fluent Korean text. This study addresses these challenges by introducing a multi-task instruction technique that leverages supervised datasets from a variety of tasks to create training data for Large Language Models (LLMs).

## Model Details

* **Model Developers** : davidkim (changyeon kim)
* **Repository** : https://github.com/davidkim205/komt
* **Model Architecture** : komt-mistral-7b-v1-dpo is a fine-tuned version of komt-mistral-7b-v1 (base model: Mistral-7B-Instruct-v0.1), released as a PEFT adapter.

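Because the repository ships a PEFT adapter rather than full model weights, the base model it expects can be checked from the adapter config. A minimal sketch using the peft API (the printed adapter type, e.g. LORA, is illustrative):

```
from peft import PeftConfig

config = PeftConfig.from_pretrained("davidkim205/komt-mistral-7b-v1-dpo")
print(config.peft_type)                # adapter type recorded in adapter_config.json (e.g. LORA)
print(config.base_model_name_or_path)  # base model the adapter was trained on top of
```
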
## Dataset

* maywell/ko_Ultrafeedback_binarized
  - https://huggingface.co/datasets/maywell/ko_Ultrafeedback_binarized

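This is a Korean preference dataset in the UltraFeedback-binarized style. A minimal sketch for inspecting it with the datasets library; the split name and the prompt / chosen / rejected column layout are assumptions based on the upstream UltraFeedback-binarized format:

```
from datasets import load_dataset

# Assumed layout: prompt / chosen / rejected columns, as in UltraFeedback-binarized datasets.
ds = load_dataset("maywell/ko_Ultrafeedback_binarized", split="train")
print(ds.column_names)
print(ds[0])
```
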
## Hardware and Software

- NVIDIA driver: 535.54.03
- CUDA version: 12.2

## Training

Refer to https://github.com/davidkim205/komt for the training code and configuration.

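As a rough illustration of the DPO stage only, a preference-tuning sketch with trl's DPOTrainer might look like the following. The hyperparameters are placeholders, argument names vary across trl versions, and this is not the authors' exact recipe (see the repository above for that):

```
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "davidkim205/komt-mistral-7b-v1"   # SFT model used as the starting point
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference pairs: prompt / chosen / rejected (assumed column names)
train_dataset = load_dataset("maywell/ko_Ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,            # a frozen reference copy is created internally
    beta=0.1,                  # strength of the implicit KL penalty toward the reference model
    args=TrainingArguments(
        output_dir="komt-mistral-7b-v1-dpo",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=5e-6,
        num_train_epochs=1,
    ),
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```
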
## Prompt template: Mistral

```
<s>[INST] {prompt} [/INST]</s>
```

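For example, the Korean question used in the Usage section below is wrapped like this before generation (the tokenizer adds the leading `<s>` token automatically, so the code only inserts the `[INST]` tags):

```
[INST]제주도를 1박2일로 혼자 여행하려고 하는데 여행 코스를 만들어줘 [/INST]
```
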
## Usage

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from transformers import TextStreamer, GenerationConfig
from peft import PeftModel, PeftConfig

base_model_name = 'davidkim205/komt-mistral-7b-v1'
peft_model_name = 'davidkim205/komt-mistral-7b-v1-dpo'

# Load the adapter config and point it at the base model.
config = PeftConfig.from_pretrained(peft_model_name)
config.base_model_name_or_path = base_model_name

# 4-bit NF4 quantization so the model fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the DPO adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path,
                                             quantization_config=bnb_config,
                                             device_map="auto")
model = PeftModel.from_pretrained(model, peft_model_name)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
streamer = TextStreamer(tokenizer)  # prints tokens as they are generated


def gen(x):
    generation_config = GenerationConfig(
        temperature=0.8,
        top_p=0.8,
        top_k=100,
        max_new_tokens=1024,
        early_stopping=True,
        do_sample=True,
    )
    # Wrap the question in the Mistral [INST] ... [/INST] prompt template.
    q = f"[INST]{x} [/INST]"
    gened = model.generate(
        **tokenizer(
            q,
            return_tensors='pt',
            return_token_type_ids=False
        ).to('cuda'),
        generation_config=generation_config,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        streamer=streamer,
    )
    result_str = tokenizer.decode(gened[0])

    # Keep only the text generated after the closing [/INST] tag.
    start_tag = "[/INST]"
    start_index = result_str.find(start_tag)
    if start_index != -1:
        result_str = result_str[start_index + len(start_tag):].strip()
    return result_str


result = gen('제주도를 1박2일로 혼자 여행하려고 하는데 여행 코스를 만들어줘')

print('##########')
print(result)
```

output

```
์ ์ฃผ๋ 1๋ฐ2์ผ 1์ธ ์ฌํ ์ฝ์ค
์ ์ฃผ๋๋ ํ๊ตญ์์ ๊ฐ์ฅ ๋จผ ์ฌ์ธ ๋๋จ์์์ ์ต๋ ์ฌ์ผ๋ก, ๋ฉ์ง ํด๋ณ, ์๋ฆ๋ค์ด ์์ฐ๊ฒฝ๊ด, ์ ๊ฒฝ ๋ฉ๋ ์ ๋ฒฝ, ํ๊ตญ ์ต๋ ๊ท๋ชจ์ ๋ณตํฉ๋ฆฌ์กฐํธ ๋ฑ ๋ค์ํ ๊ด๊ด ๋ช์๊ฐ ํ๋ถํ๊ฒ ์์ด 1๋ฐ2์ผ๋ก ํผ์ ์ฌํํ์๋ ์ฌ๋ฌ๋ถ๋ค์ ์ํด ์๋์ ๊ฐ์ ์ฝ์ค๋ฅผ ์ ์ํด ๋๋ฆฌ๊ฒ ์ต๋๋ค.

โท ์ฝ์ค 1 : ์ฑ์ฐ์ผ์ถ๋ด, ์ฉ๋์ด์ ๋ฒฝ, ์ฑ์ฐ์ผ์ถ๋ด ์ผ๊ฐ ๊ฒฝ๊ด ๊ด๋
- ์ฝ์ค ์ค๋ช: ์ ์ฃผ ๋๋จ์ชฝ ํด์์ ๋ช์์ธ ์ฑ์ฐ์ผ์ถ๋ด, ์ฉ๋์ด์ ๋ฒฝ, ์ฑ์ฐ์ผ์ถ๋ด ์ผ๊ฐ ๊ฒฝ๊ด ๊ด๋ ์์ผ๋ก ๊ตฌ์ฑ๋ ์ฝ์ค์๋๋ค. ์์นจ์ ์ผ์ฐ ์ผ์ด๋ ์ผ์ถ๋ด์ ๋์ฐฉํ์ฌ ์ผ์ถ์ ๊ฐ์ํ๊ณ , ์์นจ ์์ฌ๋ฅผ ํ๊ณ ์ ๋ฒฝ ๋ฑ๋ฐ์ ์ฆ๊ธฐ๋ฉฐ ํด์์ ์ทจํฉ๋๋ค. ์คํ์๋ ์ผ์ถ๋ด ์ผ๊ฐ ๊ฒฝ๊ด ๊ด๋์ ์ฆ๊ธฐ๋ฉฐ ํด์๊ณผ ํด์์ ์ทจํฉ๋๋ค.

โท ์ฝ์ค 2 : ํ๋ผ์ฐ, ํ๋ผ์ฐ ์ผ์ด๋ธ์นด, ์ค๋ฏธ์ ๋ฐ์, ์ ๋ผ ์ด์
- ์ฝ์ค ์ค๋ช: ์ ์ฃผ ๋จ๋ถ์ ๋ช์์ธ ํ๋ผ์ฐ, ํ๋ผ์ฐ ์ผ์ด๋ธ์นด, ์ค๋ฏธ์ ๋ฐ์, ์ ๋ผ ์ด์ ์์ผ๋ก ๊ตฌ์ฑ๋ ์ฝ์ค์๋๋ค. ์์นจ์ ์ผ์ฐ ์ผ์ด๋ ํ๋ผ์ฐ ์ผ์ด๋ธ์นด๋ฅผ ํ๊ณ ๋์ ๊ณ ์ง์ ์์นํ ํ๋ผ์ฐ ์ ์์ผ๋ก ์ฌ๋ผ๊ฐ์ ํํ์ ์ฆ๊ธฐ๋ฉฐ ์์นจ ์์ฌ๋ฅผ ํฉ๋๋ค. ์คํ์๋ ์ค๋ฏธ์ ๋ฐ์๋ฅผ ์ฐพ์ ํด์๊ณผ ํด์์ ์ทจํ๊ณ , ์ผ์ถ๋ด ์ผ๊ฐ ๊ฒฝ๊ด ๊ด๋์ ์ฆ๊ธฐ๋ฉฐ ํด์์ ์ทจํฉ๋๋ค.

โท ์ฝ์ค 3 : ๋ํ๋๊ธธ, ์ผ๊ฑฐ๋ฆฌ, ๊ณฐ๋๋ผ๋น, ์น ๋๊ตด, ๊ด์์ , ์น ๊ธ์ , ํด๋์ด๊ธธ, ๋ฐ๋ค์ง์ ๊ธธ
- ์ฝ์ค ์ค๋ช: ์ ์ฃผ ์๋ถ์ ๋ช์์ธ ๋ํ๋๊ธธ, ์ผ๊ฑฐ๋ฆฌ, ๊ณฐ๋๋ผ๋น, ์น ๋๊ตด, ๊ด์์ , ์น ๊ธ์ , ํด๋์ด๊ธธ, ๋ฐ๋ค์ง์ ๊ธธ ์์ผ๋ก ๊ตฌ์ฑ๋ ์ฝ์ค์๋๋ค. ์์นจ์ ์ผ์ฐ ์ผ์ด๋ ๋ํ๋๊ธธ์์ ํํ์ ์ฆ๊ธฐ๋ฉฐ ์์นจ ์์ฌ๋ฅผ ํฉ๋๋ค. ์คํ์๋ ์ผ๊ฑฐ๋ฆฌ๋ฅผ ์ฐพ์ ํด์๊ณผ ํด์์ ์ทจํ๊ณ , ์ผ์ถ๋ด ์ผ๊ฐ ๊ฒฝ๊ด ๊ด๋์ ์ฆ๊ธฐ๋ฉฐ ํด์์ ์ทจํฉ๋๋ค.
```

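If you would rather serve the model without the peft dependency at inference time, the adapter can be merged into the base weights first. A minimal sketch, assuming a LoRA-style adapter (which `merge_and_unload()` supports); note that merging requires loading the base model in full or half precision rather than the 4-bit quantization used above:

```
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("davidkim205/komt-mistral-7b-v1",
                                            torch_dtype=torch.float16, device_map="auto")
merged = PeftModel.from_pretrained(base, "davidkim205/komt-mistral-7b-v1-dpo").merge_and_unload()
merged.save_pretrained("komt-mistral-7b-v1-dpo-merged")  # standalone weights, no adapter needed afterwards
```
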
## Evaluation

For objective model evaluation, we initially used EleutherAI's lm-evaluation-harness but obtained unsatisfactory results. Consequently, we evaluated the models with ChatGPT as a judge, a widely used approach, as described in [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06259.pdf) and [Three Ways of Using Large Language Models to Evaluate Chat](https://arxiv.org/pdf/2308.06502.pdf).

| model                                     | score   | average(0~5) | percentage |
|-------------------------------------------|---------|--------------|------------|
| gpt-3.5-turbo(close)                      | 147     | 3.97         | 79.45%     |
| naver Cue(close)                          | 140     | 3.78         | 75.67%     |
| clova X(close)                            | 136     | 3.67         | 73.51%     |
| WizardLM-13B-V1.2(open)                   | 96      | 2.59         | 51.89%     |
| Llama-2-7b-chat-hf(open)                  | 67      | 1.81         | 36.21%     |
| Llama-2-13b-chat-hf(open)                 | 73      | 1.91         | 38.37%     |
| nlpai-lab/kullm-polyglot-12.8b-v2(open)   | 70      | 1.89         | 37.83%     |
| kfkas/Llama-2-ko-7b-Chat(open)            | 96      | 2.59         | 51.89%     |
| beomi/KoAlpaca-Polyglot-12.8B(open)       | 100     | 2.70         | 54.05%     |
| **komt-llama2-7b-v1 (open)(ours)**        | **117** | **3.16**     | **63.24%** |
| **komt-llama2-13b-v1 (open)(ours)**       | **129** | **3.48**     | **69.72%** |
| **komt-llama-30b-v1 (open)(ours)**        | **129** | **3.16**     | **63.24%** |
| **komt-mistral-7b-v1 (open)(ours)**       | **131** | **3.54**     | **70.81%** |
| **komt-mistral-7b-v1-dpo (open)(ours)**   | **142** | **3.83**     | **76.75%** |

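For most rows, the average and percentage columns correspond to a total of 37 judged prompts rated 0-5 (e.g. 147 / 37 ≈ 3.97). A small sketch of that relationship; the prompt count is inferred from the numbers rather than stated explicitly:

```
# Inferred relationship between the table columns (37 prompts rated 0-5 is an assumption
# recovered from the numbers themselves).
NUM_PROMPTS, MAX_RATING = 37, 5

def summarize(total_score):
    average = total_score / NUM_PROMPTS                           # "average(0~5)" column
    percentage = 100 * total_score / (NUM_PROMPTS * MAX_RATING)   # "percentage" column
    return average, percentage

print(summarize(142))  # -> (3.8378..., 76.7567...), matching 3.83 / 76.75% for komt-mistral-7b-v1-dpo
```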