# Training Configuration

| Parameter | Value |
|-----------|-------|
| Task      | Book (social science, technology, philosophy, law, the arts, etc.) |
| Data size | 5,000 samples |
| Method    | QLoRA |
| max_seq_length | 1024 |
| num_train_epochs | 3 |
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 32 |
| evaluation_strategy | "steps" |
| eval_steps | 2000 |
| logging_steps | 25 |
| optim | "paged_adamw_8bit" |
| learning_rate | 2e-4 |
| lr_scheduler_type | "cosine" |
| warmup_steps | 10 |
| warmup_ratio | 0.05 |
| report_to | "tensorboard" |
| weight_decay | 0.01 |
| max_steps | -1 |
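The table above maps directly onto Hugging Face `TrainingArguments` keyword names. A minimal self-contained sketch of the configuration as a plain dict (the dict form is illustrative; in an actual QLoRA run these values would be passed to `transformers.TrainingArguments`):

```python
# Hyperparameters from the table above, as a plain dict (sketch only).
training_config = {
    "max_seq_length": 1024,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 32,
    "evaluation_strategy": "steps",
    "eval_steps": 2000,
    "logging_steps": 25,
    "optim": "paged_adamw_8bit",
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 10,
    "warmup_ratio": 0.05,
    "report_to": "tensorboard",
    "weight_decay": 0.01,
    "max_steps": -1,  # -1: total steps are derived from num_train_epochs
}

# Effective batch size per optimizer step on a single device:
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 8 * 32 = 256
```

With gradient accumulation, each optimizer update therefore sees 256 samples even though only 8 fit on the device at once.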


# Summary
**Book** 
| Model name                                   | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *ryanu/EEVE-10.8-BOOK-v0.1                   | 0.2454  | 0.1158  | 0.2404  |
| meta-llama/llama-3-70b-instruct              | 0.2269  | 0.0925  | 0.2186  |
| meta-llama/llama-3-8b-instruct               | 0.2137  | 0.0883  | 0.2020  |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0       | 0.2095  | 0.0866  | 0.1985  |
| mistralai/mixtral-8x7b-instruct-v0-1         | 0.1735  | 0.0516  | 0.1668  |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q    | 0.1724  | 0.0534  | 0.1630  |

**Paper**
| Model name                                   | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-8b-instruct               | 0.2044  | 0.0868  | 0.1895  |
| ryanu/EEVE-10.8-BOOK-v0.1                    | 0.2004  | 0.0860  | 0.1938  |
| meta-llama/llama-3-70b-instruct              | 0.1935  | 0.0783  | 0.1836  |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0       | 0.1934  | 0.0829  | 0.1832  |
| mistralai/mixtral-8x7b-instruct-v0-1         | 0.1774  | 0.0601  | 0.1684  |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q    | 0.1702  | 0.0561  | 0.1605  |

# RAG Q&A
| Model name                                   | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-70b-instruct             | 0.4418  | 0.2986  | 0.4297  |
| *meta-llama/llama-3-8b-instruct              | 0.4391  | 0.3100  | 0.4273  |
| mistralai/mixtral-8x7b-instruct-v0-1         | 0.4022  | 0.2653  | 0.3916  |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0      | 0.3191  | 0.2069  | 0.3136  |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q    | 0.3105  | 0.1763  | 0.2960  |
| ryanu/EEVE-10.8-BOOK-v0.1                    | 0.2185  | 0.1347  | 0.2139  |
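The ROUGE scores above measure n-gram overlap between generated and reference summaries. Real evaluations typically use a library such as `rouge-score` with a proper Korean tokenizer; the whitespace-tokenized pure-Python sketch below only illustrates what ROUGE-1 F1 computes:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    cand, ref = Counter(cand_tokens), Counter(ref_tokens)
    # Each reference unigram is matched at most as many times
    # as it appears in the candidate (clipping).
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat",
                "the cat lay on the mat"))  # 5 of 6 unigrams match: ~0.833
```

ROUGE-2 applies the same overlap to bigrams, and ROUGE-L uses the longest common subsequence instead of fixed n-grams.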


# Prompt Template

Briefly summarize the main arguments presented in the following text in 3-5 sentences, without repetitive phrasing.

Text: {context}

Summary: {summary}
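Filling the template is a plain string substitution on the `{context}` and `{summary}` slots (sketch; the template was used in Korean and is shown here in English translation):

```python
# The summarization prompt template (English translation of the original).
TEMPLATE = (
    "Briefly summarize the main arguments presented in the following "
    "text in 3-5 sentences, without repetitive phrasing.\n\n"
    "Text: {context}\n\n"
    "Summary: {summary}"
)

# At training time the summary slot holds the reference summary;
# at inference time it is left empty so the model completes it.
prompt = TEMPLATE.format(context="(document text here)", summary="")
print(prompt.endswith("Summary: "))  # the model generates from here
```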