| Parameter | Value |
|----------|-----|
| Task | Book (social science, technology, philosophy, law, arts, etc.) |
| Dataset size | 5,000 samples |
| Fine-tuning method | QLoRA |
| max_seq_length | 1024 |
| num_train_epochs | 3 |
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 32 |
| evaluation_strategy | "steps" |
| eval_steps | 2000 |
| logging_steps | 25 |
| optim | "paged_adamw_8bit" |
| learning_rate | 2e-4 |
| lr_scheduler_type | "cosine" |
| warmup_steps | 10 |
| warmup_ratio | 0.05 |
| report_to | "tensorboard" |
| weight_decay | 0.01 |
| max_steps | -1 |
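The hyperparameters above map directly onto Hugging Face `transformers` `TrainingArguments`. A minimal sketch, assuming a `transformers`/`trl`-style QLoRA setup; the `output_dir` name is a placeholder, and `max_seq_length=1024` is a trainer-level setting (e.g. `trl`'s `SFTTrainer`), not a `TrainingArguments` field:

```python
from transformers import TrainingArguments

# Sketch of the training configuration from the table above.
args = TrainingArguments(
    output_dir="eeve-book-qlora",   # hypothetical path, not from the source
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32, # effective batch size 8 * 32 = 256 per device
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",       # 8-bit paged AdamW; requires bitsandbytes
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,                # warmup_steps > 0 takes precedence over warmup_ratio
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,                   # -1: total steps derived from num_train_epochs
)
```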
# Summary
**Book**
| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |
**Paper**
| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |
# RAG Q&A
| Model name | Rouge-1 | Rouge-2 | Rouge-L |
|----------------------------------------------|---------|---------|---------|
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0      | 0.3191  | 0.2069  | 0.3136  |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q    | 0.3105  | 0.1763  | 0.2960  |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |
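The Rouge-1/2/L columns measure n-gram and longest-common-subsequence overlap between generated and reference summaries. A minimal, self-contained sketch of ROUGE-N F1 over whitespace tokens; real evaluations typically use a library such as `rouge-score`, with tokenization appropriate for Korean, so treat this only as an illustration of the metric:

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 over whitespace tokens (n=1 -> Rouge-1, n=2 -> Rouge-2)."""
    ref, cand = ngrams(reference.split(), n), ngrams(candidate.split(), n)
    if not ref or not cand:
        return 0.0
    overlap = sum((ref & cand).values())   # clipped n-gram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(rouge_n("the cat sat on the mat", "the cat on the mat", 1), 4))  # → 0.9091
```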
**prompt template**
-------------------
```
다음 문장을 3~5문장으로 반복되는 구문없이 텍스트에 제시된 주요 논거를 간략하게 요약해줘.
문장: {context}
요약: {summary}
```
(English: "Summarize the main arguments presented in the text concisely, in 3–5 sentences and without repeated phrasing. Sentence: {context} / Summary: {summary}")
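A hypothetical sketch of filling this template with Python's `str.format` (the variable names and sample text are assumptions, not from the source); at inference time the `{summary}` slot is left empty so the model completes it:

```python
# The template string reproduces the prompt template above verbatim.
TEMPLATE = (
    "다음 문장을 3~5문장으로 반복되는 구문없이 "
    "텍스트에 제시된 주요 논거를 간략하게 요약해줘.\n"
    "문장: {context}\n"
    "요약: {summary}"
)

# summary="" leaves the completion slot open for generation.
prompt = TEMPLATE.format(context="원문 텍스트 ...", summary="")
```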