Book (social sciences, technology and applied sciences, philosophy, law, arts, etc.) - 5,000 items
QLoRA fine-tuning hyperparameters:
max_seq_length=1024,
num_train_epochs=3,
per_device_train_batch_size=8,
gradient_accumulation_steps=32,
evaluation_strategy="steps",
eval_steps=2000,
logging_steps=25,
optim="paged_adamw_8bit",
learning_rate=2e-4,
lr_scheduler_type="cosine",
warmup_steps=10,
warmup_ratio=0.05,
report_to="tensorboard",
weight_decay=0.01,
max_steps=-1,
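
The settings above imply an effective batch size and a rough optimizer-step count, which the sketch below works out. It assumes all 5,000 Book items are used for training on a single device; neither is stated in the source, and the trainer's own rounding may differ by a step or so. The variable names `num_examples` and `num_devices` are illustrative only.

```python
import math

# Values taken from the hyperparameter list above.
per_device_train_batch_size = 8
gradient_accumulation_steps = 32
num_train_epochs = 3
eval_steps = 2000
warmup_steps = 10

# Assumptions (not stated in the source): dataset size used for training
# and the number of devices.
num_examples = 5000
num_devices = 1

# Examples consumed per optimizer step.
effective_batch = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Approximate optimizer steps per epoch and in total.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs

# How many step-based evaluations would fire at this eval_steps interval.
eval_runs = total_steps // eval_steps

print(f"effective batch size: {effective_batch}")
print(f"approx. total optimizer steps: {total_steps}")
print(f"step-based evaluations triggered: {eval_runs}")
print(f"warmup share of training: {warmup_steps / total_steps:.1%}")
```

Under these assumptions the run is only a few dozen optimizer steps long, so `eval_steps=2000` would never trigger a mid-training evaluation. If the values are passed to Hugging Face `TrainingArguments` (an assumption; the source does not name the trainer), a positive `warmup_steps` takes precedence over `warmup_ratio`, so only one of the two warmup settings is in effect.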
| Model | rouge-1 | rouge-2 | rouge-l |
|-------|---------|---------|---------|
| **Book** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| **Paper** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
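
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how ROUGE-1, ROUGE-2 and ROUGE-L F1 scores can be computed. It is not the evaluation script behind the numbers above: it assumes whitespace tokenization and F1 (the source does not say which tokenizer or which of precision, recall, or F1 was reported), and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, ref_total, hyp_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or ref_total == 0 or hyp_total == 0:
        return 0.0
    precision = overlap / hyp_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, hypothesis, n):
    """ROUGE-N F1 over whitespace tokens (clipped n-gram overlap)."""
    ref = ngrams(reference.split(), n)
    hyp = ngrams(hypothesis.split(), n)
    overlap = sum((ref & hyp).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(ref.values()), sum(hyp.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(reference, hypothesis):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    return f1(lcs_length(ref, hyp), len(ref), len(hyp))


reference = "the cat sat on the mat"
hypothesis = "the cat lay on the mat"
print(round(rouge_n(reference, hypothesis, 1), 4))
print(round(rouge_n(reference, hypothesis, 2), 4))
print(round(rouge_l(reference, hypothesis), 4))
```

Korean text is usually scored after morpheme-level tokenization rather than whitespace splitting, so absolute values from this sketch would not be comparable to the table.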