|
Book (์ฌํ๊ณผํ, ๊ธฐ์ ๊ณผํ, ์ฒ ํ, ๋ฒํ, ์์ ๋ฑ) - 5000๊ฐ |
|
|
|
|
|
QLoRA fine-tuning hyperparameters (a sketch of how they fit together follows the list):
|
|
|
- `max_seq_length=1024`
- `num_train_epochs=3`
- `per_device_train_batch_size=8`
- `gradient_accumulation_steps=32`
- `evaluation_strategy="steps"`
- `eval_steps=2000`
- `logging_steps=25`
- `optim="paged_adamw_8bit"`
- `learning_rate=2e-4`
- `lr_scheduler_type="cosine"`
- `warmup_steps=10`
- `warmup_ratio=0.05`
- `report_to="tensorboard"`
- `weight_decay=0.01`
- `max_steps=-1`
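These values map one-to-one onto Hugging Face `TrainingArguments`, with `max_seq_length` consumed by TRL's `SFTTrainer`. Below is a minimal sketch of how they could be wired into a QLoRA run; the base checkpoint, LoRA settings (r, alpha, dropout), dataset files, and output path are assumptions not stated in this section, and the sketch targets the older TRL signature where `max_seq_length` and `dataset_text_field` are constructor arguments (newer TRL moves them into `SFTConfig`, and newer transformers renames `evaluation_strategy` to `eval_strategy`).

```python
# Hedged sketch: wiring the listed hyperparameters into a QLoRA run with
# transformers + peft + trl (bitsandbytes 4-bit quantization).
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

base_model_id = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"  # assumed base checkpoint

# 4-bit NF4 quantization, the standard QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)

# LoRA settings are illustrative only; they are not given in this section.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
)

# The values listed above, verbatim. Note that when both warmup_steps and
# warmup_ratio are set, transformers uses warmup_steps and ignores the ratio.
training_args = TrainingArguments(
    output_dir="./eeve-book-qlora",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32,
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,
)

# Hypothetical dataset files with a "text" column holding formatted prompts.
dataset = load_dataset("json", data_files={"train": "book_train.jsonl",
                                           "eval": "book_eval.jsonl"})

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["eval"],
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=1024,  # max_seq_length from the list above
)
trainer.train()
```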
|
|
|
|
|
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|-------|---------|---------|---------|
| **Book** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| **Paper** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
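For reference, ROUGE scores in the hyphenated form used above (`rouge-1`, `rouge-2`, `rouge-l`) match the output keys of the `rouge` PyPI package. The sketch below shows how such F-scores can be computed; the actual evaluation script, references, and tokenization behind the numbers in the table are not stated here, so treat it purely as an illustration with placeholder inputs.

```python
# Hedged sketch: computing ROUGE-1/2/L F-scores with the `rouge` package
# (pip install rouge). Inputs are hypothetical placeholders.
from rouge import Rouge

predictions = ["model-generated summary goes here"]
references = ["reference summary goes here"]

rouge = Rouge()
scores = rouge.get_scores(predictions, references, avg=True)

# avg=True returns a dict keyed by "rouge-1", "rouge-2", "rouge-l",
# each holding recall ("r"), precision ("p"), and F-score ("f").
for metric in ("rouge-1", "rouge-2", "rouge-l"):
    print(f'{metric}: {scores[metric]["f"]:.4f}')
```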
|
|
|
|
|
|