Book (social sciences, technology, philosophy, law, the arts, etc.) - 5,000 samples


QLoRA fine-tuning hyperparameters:

- max_seq_length=1024
- num_train_epochs=3
- per_device_train_batch_size=8
- gradient_accumulation_steps=32
- evaluation_strategy="steps"
- eval_steps=2000
- logging_steps=25
- optim="paged_adamw_8bit"
- learning_rate=2e-4
- lr_scheduler_type="cosine"
- warmup_steps=10
- warmup_ratio=0.05
- report_to="tensorboard"
- weight_decay=0.01
- max_steps=-1
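
For reference, a minimal sketch of how these settings map onto a Hugging Face `TrainingArguments` object is shown below. The output directory is a placeholder (the original path is not given here), and `max_seq_length` is assumed to be passed to the trainer (e.g. TRL's `SFTTrainer`) rather than to `TrainingArguments`.

```python
from transformers import TrainingArguments

# Sketch of the configuration listed above.
# "./eeve-book-qlora" is a placeholder output directory, not the original path.
training_args = TrainingArguments(
    output_dir="./eeve-book-qlora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32,   # effective batch size = 8 * 32 = 256 per device
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",         # paged 8-bit AdamW, commonly paired with QLoRA
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,                  # when warmup_steps > 0 it takes precedence over warmup_ratio
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,                     # -1: train for num_train_epochs rather than a fixed step count
)
# max_seq_length=1024 would typically be given to the trainer itself,
# e.g. trl.SFTTrainer(..., max_seq_length=1024), not to TrainingArguments.
```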


| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|-------|---------|---------|---------|
| **Book** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| **Paper** | | | |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
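
As a rough illustration, scores like these can be computed with a standard ROUGE implementation; the sketch below uses the `rouge` Python package, whose output keys match the metric names in the table. The original evaluation pipeline (tokenization, test set, truncation) is not documented in this card, and the summaries below are placeholders.

```python
# Sketch only: assumes `pip install rouge`; the original evaluation script is not specified here.
from rouge import Rouge

predictions = ["모델이 생성한 요약문입니다."]          # placeholder model-generated summaries
references = ["사람이 작성한 정답 요약문입니다."]      # placeholder gold summaries

scorer = Rouge()
scores = scorer.get_scores(predictions, references, avg=True)
# Keys are "rouge-1", "rouge-2", "rouge-l"; "f" is the F1 component.
print(scores["rouge-1"]["f"], scores["rouge-2"]["f"], scores["rouge-l"]["f"])
```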