dennispark committed on
Commit 51f763f · 1 Parent(s): ed7398f

Create README.md

  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ language: ko
+ license: mit
+ library_name: transformers
+ pipeline_tag: text2text-generation
+ ---
+
+ # FLAN T5
+
+ FLAN T5 is a model built on [paust/pko-t5-large](https://huggingface.co/paust/pko-t5-large) by instruction finetuning it on a variety of tasks.
+
+ Instruction finetuning is still in progress, and intermediate results are published as updates to the model.
+
+ ### Trained tasks
+
+ | Task name                  | Task type      |
+ |----------------------------|----------------|
+ | NSMC                       | Classification |
+ | Klue Ynat                  | Classification |
+ | KorNLI                     | Classification |
+ | KorSTS                     | Classification |
+ | QuestionPair               | Classification |
+ | Klue STS                   | Classification |
+ | AIHub news Summary         | Summarization  |
+ | AIHub document Summary     | Summarization  |
+ | AIHub book Summary         | Summarization  |
+ | AIHub conversation Summary | Summarization  |
+ | AIHub ko-to-en             | Translation    |
+ | AIHub ko-to-en Expert      | Translation    |
+ | AIHub ko-to-en Tech        | Translation    |
+ | AIHub ko-to-en social      | Translation    |
+ | AIHub ko-to-jp             | Translation    |
+ | AIHub ko-to-cn Tech        | Translation    |
+ | AIHub Translation Corpus   | Translation    |
+ | KorQuAD                    | QA             |
+ | Klue MRC                   | QA             |
+ | AIHub mindslab's MRC       | QA             |
+
+
+ ### Model
+ - [Hugging Face link](https://huggingface.co/paust/pko-flan-t5-large)
+
+
+ ### Usage example
+ ```python
+ from transformers import T5ForConditionalGeneration, T5TokenizerFast
+
+ tokenizer = T5TokenizerFast.from_pretrained('paust/pko-flan-t5-large')
+ # device_map requires the accelerate package to be installed
+ model = T5ForConditionalGeneration.from_pretrained('paust/pko-flan-t5-large', device_map='cuda')
+
+ # Korean context passage about Seoul, followed on a new line by the question "What is the capital of Korea?"
+ prompt = """서울특별시(서울特別市, 영어: Seoul Metropolitan Government)는 대한민국 수도이자 최대 도시이다. 선사시대부터 사람이 거주하였으나 본 역사는 백제 첫 수도 위례성을 시초로 한다. 삼국시대에는 전략적 요충지로서 고구려, 백제, 신라가 번갈아 차지하였으며, 고려 시대에는 왕실의 별궁이 세워진 남경(南京)으로 이름하였다.
+ 한국의 수도는 어디입니까?"""
+ input_ids = tokenizer(prompt, add_special_tokens=True, return_tensors='pt').input_ids
+ # Beam search with 12 beams, generating at most 32 new tokens
+ output_ids = model.generate(input_ids=input_ids.cuda(), max_new_tokens=32, num_beams=12)
+ text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
+ print(text) # 서울특별시
+ ```
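
The usage example joins a context passage and a question with a newline and decodes the top beam-search result. The sketch below wraps that pattern in two small helpers so it can be reused for other questions. `build_prompt` and `answer` are hypothetical names that do not appear in the README, and the context-newline-question layout is an assumption inferred from the single example, not a documented prompt format for the model. The input is moved to `model.device` so the same helper should also work when the model is loaded on CPU.

```python
def build_prompt(context: str, question: str) -> str:
    """Join a context passage and a question with one newline (layout assumed from the README example)."""
    return f"{context.strip()}\n{question.strip()}"


def answer(model, tokenizer, context: str, question: str,
           max_new_tokens: int = 32, num_beams: int = 12) -> str:
    """Generate one answer string; the caller loads model and tokenizer as in the README."""
    prompt = build_prompt(context, question)
    input_ids = tokenizer(prompt, add_special_tokens=True, return_tensors="pt").input_ids
    output_ids = model.generate(
        input_ids=input_ids.to(model.device),  # follow whichever device the model is on
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    # batch_decode returns one string per sequence; a single prompt yields a single entry
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

With `tokenizer` and `model` loaded as in the usage example, a call would look like `answer(model, tokenizer, context, "한국의 수도는 어디입니까?")`, where `context` holds the passage about Seoul.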