---
pipeline_tag: summarization
language:
- ko
tags:
- T5
---

# t5-base-korean-summarization

This is a [T5](https://huggingface.co/docs/transformers/model_doc/t5) model for Korean summarization: an encoder-decoder model that converts every NLP problem into a text-to-text format.
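
The text-to-text framing means the task is selected purely by a natural-language prefix prepended to the input string. A minimal sketch of the convention (the helper function and the second prefix below are hypothetical illustrations, not part of this model's training setup):

```python
# Sketch of T5's text-to-text convention: every task is expressed as
# "task prefix + input text" -> "output text". The helper below is a
# hypothetical illustration, not an API from this repository.
def to_text2text(task_prefix: str, text: str) -> str:
    """Prepend a task prefix so one seq2seq model can serve many tasks."""
    return task_prefix + text

summarization_input = to_text2text("summarize: ", "오늘의 수업 내용은 ...")
translation_input = to_text2text("translate English to German: ", "Hello")

print(summarization_input)  # -> summarize: 오늘의 수업 내용은 ...
```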

# Usage (HuggingFace Transformers)

```python
import nltk
nltk.download('punkt')
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model for this checkpoint
model_name = 'eenzeenee/t5-base-korean-summarization'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prefix = "summarize: "   # T5 task prefix for summarization
max_input_length = 512   # truncate inputs longer than this many tokens

sample = """
안녕하세요? 우리 (2학년)/(이 학년) 친구들 우리 친구들 학교에 가서 진짜 (2학년)/(이 학년) 이 되고 싶었는데 학교에 못 가고 있어서 답답하죠?
그래도 우리 친구들의 안전과 건강이 최우선이니까요 오늘부터 선생님이랑 매일 매일 국어 여행을 떠나보도록 해요.
어/ 시간이 벌써 이렇게 끝나요? 늦었어요. 늦었어요. 빨리 국어 여행을 떠나야 돼요.
그런데 어/ 국어여행을 떠나기 전에 우리가 준비물을 챙겨야 되겠죠? 국어 여행을 떠날 준비물, 교안을 어떻게 받을 수 있는지 선생님이 설명을 해줄게요.
(EBS)/(이비에스) 초등을 검색해서 들어가면요 첫화면이 이렇게 나와요.
자/ 그러면요 여기 (X)/(엑스) 눌러주(고요)/(구요). 저기 (동그라미)/(똥그라미) (EBS)/(이비에스) (2주)/(이 주) 라이브특강이라고 되어있죠?
거기를 바로 가기를 누릅니다. 자/ (누르면요)/(눌르면요). 어떻게 되냐? b/ 밑으로 내려요 내려요 내려요 쭉 내려요.
우리 몇 학년이죠? 아/ (2학년)/(이 학년) 이죠 (2학년)/(이 학년)의 무슨 과목? 국어.
이번주는 (1주)/(일 주) 차니까요 여기 교안. 다음주는 여기서 다운을 받으면 돼요.
이 교안을 클릭을 하면, 짜잔/. 이렇게 교재가 나옵니다 .이 교안을 (다운)/(따운)받아서 우리 국어여행을 떠날 수가 있어요.
그럼 우리 진짜로 국어 여행을 한번 떠나보도록 해요? 국어여행 출발. 자/ (1단원)/(일 단원) 제목이 뭔가요? 한번 찾아봐요.
시를 즐겨요 에요. 그냥 시를 읽어요 가 아니에요. 시를 즐겨야 돼요 즐겨야 돼. 어떻게 즐길까? 일단은 내내 시를 즐기는 방법에 대해서 공부를 할 건데요.
그럼 오늘은요 어떻게 즐길까요? 오늘 공부할 내용은요 시를 여러 가지 방법으로 읽기를 공부할겁니다.
어떻게 여러가지 방법으로 읽을까 우리 공부해 보도록 해요. 오늘의 시 나와라 짜잔/! 시가 나왔습니다 시의 제목이 뭔가요? 다툰 날이에요 다툰 날.
누구랑 다퉜나 동생이랑 다퉜나 언니랑 친구랑? 누구랑 다퉜는지 선생님이 시를 읽어 줄 테니까 한번 생각을 해보도록 해요."""

inputs = [prefix + sample]

inputs = tokenizer(inputs, max_length=max_input_length, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=3, do_sample=True, min_length=10, max_length=64)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
result = nltk.sent_tokenize(decoded_output.strip())[0]

print('RESULT >>', result)
```

# Evaluation Result

# Training

# Model Architecture

```
T5ForConditionalGeneration(
  (shared): Embedding(50358, 768)
  (encoder): T5Stack(
    (embed_tokens): Embedding(50358, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1~11): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(50358, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1~11): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (lm_head): Linear(in_features=768, out_features=50358, bias=False)
)
```
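
As a rough sanity check, the dimensions printed in the module tree can be tallied into an approximate parameter count. This is a back-of-the-envelope sketch only: it ignores layer norms and the relative-attention-bias tables, and assumes the untied `lm_head` shown in the dump.

```python
# Rough parameter tally from the dimensions in the module tree above.
# Layer norms and relative_attention_bias embeddings are ignored, and
# the lm_head is assumed untied from the shared embedding (as printed).
d_model, d_ff, vocab_size, n_layers = 768, 2048, 50358, 12

attn = 4 * d_model * d_model    # q, k, v, o projections
ff = 3 * d_model * d_ff         # wi_0, wi_1, wo of the gated-GELU FFN

encoder_params = n_layers * (attn + ff)        # self-attention + FFN
decoder_params = n_layers * (2 * attn + ff)    # self- + cross-attention + FFN
embedding_params = 2 * vocab_size * d_model    # shared embedding + lm_head

total = encoder_params + decoder_params + embedding_params
print(f"~{total / 1e6:.0f}M parameters")  # ~276M
```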

## Citation

- Raffel, Colin, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." Journal of Machine Learning Research 21.140 (2020): 1-67.