### Overview
- A model for the Korean text summarization task.

### Base Model
- [beomi/open-llama-2-ko-7b](https://huggingface.co/beomi/open-llama-2-ko-7b)

### Dataset
30,000 examples were sampled from the following summarization datasets on AI Hub (a rough preprocessing sketch follows the list).
- [좔상 μš”μ•½ 사싀성 검증 데이터](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71620) (abstractive summarization factuality verification data)
- [μš”μ•½λ¬Έ 및 레포트 생성 데이터](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=582) (summary and report generation data)
- [λ¬Έμ„œμš”μ•½ ν…μŠ€νŠΈ](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=97) (document summarization text)
- [λ„μ„œμžλ£Œ μš”μ•½](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=9) (book material summaries)

### Library Installation
```bash
pip3 install transformers gradio vllm
```

### Example Code

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
import gradio as gr

model_path = "gangyeolkim/open-llama-2-ko-7b-summarization"

# Low temperature keeps the summary close to the source text.
sampling_params = SamplingParams(max_tokens=1024, temperature=0.1)
tokenizer = AutoTokenizer.from_pretrained(model_path)

llm = LLM(model=model_path, tokenizer=model_path, tensor_parallel_size=1)

def gen(text, history):
    # Wrap the user input in the prompt format used during fine-tuning.
    prompt = "\n".join([
        "### 원문:",
        f"{text}\n",
        "### μš”μ•½:\n",
    ])

    outputs = llm.generate(prompt, sampling_params)

    # A single prompt produces a single output; return its generated text.
    return outputs[0].outputs[0].text

demo = gr.ChatInterface(gen)
demo.launch(share=True)
```
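
If vLLM is unavailable, the same prompt format can also be used with the plain `transformers` generation API. This is a minimal sketch, not part of the published example: the generation arguments and the `device_map="auto"` placement (which requires `accelerate`) are assumptions to adjust for your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "gangyeolkim/open-llama-2-ko-7b-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

def summarize(text: str) -> str:
    # Same "### 원문 / ### μš”μ•½" prompt format as the vLLM example above.
    prompt = f"### 원문:\n{text}\n\n### μš”μ•½:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=1024, do_sample=True, temperature=0.1
    )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

print(summarize("μš”μ•½ν•  원문을 여기에 μž…λ ₯ν•˜μ„Έμš”."))
```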