Hello! This is a T5 model fine-tuned on the science and technology summarization dataset provided by AI Hub (link below). Usage is as follows.

AI Hub dataset: https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=71532

# Install dependencies (run in a notebook; sentence_transformers is not used in the snippet below).
!pip install transformers
!pip install sentence_transformers

from transformers import T5ForConditionalGeneration, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
path = "kimdwan/t5-base-korean-summarize-LOGAN"
model = T5ForConditionalGeneration.from_pretrained(path)
tokenizer = AutoTokenizer.from_pretrained(path)
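
If a GPU is available, you can move the model onto it in the usual PyTorch way; this is an optional sketch, and the snippet below runs on CPU as-is.

import torch

# Pick a GPU when available; inference also works (more slowly) on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# Inputs must then be moved to the same device before calling generate(), e.g.:
#   tokens = tokenizer(prefix, return_tensors="pt").to(device)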

# Enter the text you want to summarize here.
# The model expects Korean input; the example below is a Korean news article.
text= """ (์„œ์šธ=๋‰ด์Šค1) ์ด๋น„์Šฌ ๊ธฐ์ž = ์œค์ƒํ˜„ ๊ตญ๋ฏผ์˜ํž˜ ์˜์›์€ 18์ผ ์ด์ฒ ๊ทœ ์‚ฌ๋ฌด์ด์žฅ์˜ '์Šน์„ ๋ถˆ๊ฐ€' ๋ฐœ์–ธ๊ณผ ๊ด€๋ จํ•ด "๋ฌด์—‡์ด ์œ„๊ธฐ์ธ์ง€ ๋ณธ์งˆ์„ ์ž˜ ๋ชจ๋ฅด๊ณ  ์žˆ๋‹ค๋Š” ๊ฒŒ ์ง„์งœ ์œ„๊ธฐ"๋ผ๊ณ  ๋งํ–ˆ๋‹ค.

์œค ์˜์›์€ ์ด๋‚  SBS ๋ผ๋””์˜ค '๊น€ํƒœํ˜„์˜ ์ •์น˜์‡ผ' ์ธํ„ฐ๋ทฐ์—์„œ "๊ตญ๋ฏผ์˜ํž˜์ด ๋”๋ถˆ์–ด๋ฏผ์ฃผ๋‹น์„ ๋นผ๋†“๊ณ  ์ œ3์ •๋‹น์ด ๋‚˜์˜ค๋ฉด ์ง€์ง€์œจ์ด ๋น„์Šทํ•˜๋‹ค. ์ด๊ฒƒ์ด ์œ„๊ธฐ ์•„๋‹ˆ๋ƒ. ๊ทธ๋Ÿฐ๋ฐ ์ด๋Ÿฐ ๊ฒƒ์— ๊ด€ํ•ด์„œ (์ด์•ผ๊ธฐ)ํ•˜๋ฉด ์ด๊ฒƒ์„ ์ด์ƒํ•˜๊ฒŒ ๋ฐ›์•„๋“ค์ด๋Š”๋ฐ, ๊ทธ๋ž˜์„œ ์œ„๊ธฐ๊ฐ€ ์œ„๊ธฐ๋ผ๋Š” ๊ฒƒ"์ด๋ผ๊ณ  ๋งํ–ˆ๋‹ค.

์œค ์˜์›์€ "์ˆ˜๋„๊ถŒ ์‹ธ์›€์€ ์˜๋‚จ๊ถŒ ์‹ธ์›€๊ณผ ๋‹ค๋ฅด๋‹ค. ์ˆ˜๋„๊ถŒ ๊ฑฐ์˜ ๋ชจ๋“  ์ง€์—ญ์ด 1000ํ‘œ, 1500ํ‘œ ์‹ธ์›€ ์•„๋‹ˆ๋ƒ"๋ฉฐ "์ œ3์ •๋‹น์ด ๋‚˜์™”์„ ๋•Œ ๊ตญ๋ฏผ์˜ํž˜ ํ‘œ๋ฅผ ๋บ์–ด๊ฐ„๋‹ค. ์Šน๋ถ€๋ฅผ ๊ฐ€๋ฅด๋Š” ๊ฒฐ์ •์ ์ธ ์š”์ธ์ด ๋  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— 3์ง€๋Œ€์— ์žˆ๋Š” ์‚ฌ๋žŒ๋“ค๋„ ํฌ์šฉํ•˜๊ณ  ์ „๋žต์„ ๊ฐ–์ถ”๋Š” ๊ฒƒ์— ๋Œ€ํ•ด ๋ง์”€๋“œ๋ฆฐ ๊ฒƒ"์ด๋ผ๊ณ  ๋งํ–ˆ๋‹ค.

์•ž์„œ ์ด ์‚ฌ๋ฌด์ด์žฅ์€ ์ง€๋‚œ 16์ผ ๋น„๊ณต๊ฐœ ์˜์›์ดํšŒ์—์„œ "๋ฐฐ๋ฅผ ์นจ๋ชฐ์‹œํ‚ค๋ ค๋Š” ์Šน๊ฐ์€ ํ•จ๊ป˜ํ•˜์ง€ ๋ชปํ•œ๋‹ค"๊ณ  ๋ฐœ์–ธํ–ˆ๋‹ค. ์ด๋ฅผ ๋‘๊ณ  ๋‚ด๋…„ ์ด์„  ์ˆ˜๋„๊ถŒ ์œ„๊ธฐ๋ก ๊ณผ ๋‹น ์ง€๋„๋ถ€ ์ฑ…์ž„์„ ์–ธ๊ธ‰ํ•˜๋ฉฐ ๊ณต๊ฐœ์ ์œผ๋กœ ๋น„ํŒ ์ž…์žฅ์„ ๋ฐํ˜€์˜จ ์œค ์˜์›์ด ๋ฐœ์–ธ์˜ ํ‘œ์ ์ด์—ˆ๋‹ค๋Š” ๊ด€์ธก์ด ๋‚˜์™”๋‹ค.

์œค ์˜์›์€ ์ด ์‚ฌ๋ฌด์ด์žฅ์˜ ๋ฐœ์–ธ์ด ์ž์‹ ์„ ๊ฒจ๋ƒฅํ–ˆ๋‹ค๋Š” ๊ด€์ธก๊ณผ ๊ด€๋ จํ•ด "๋‹น์— ๋Œ€ํ•œ ์ถฉ์ •์œผ๋กœ ๋ง์”€๋“œ๋ฆฐ ๊ฒƒ"์ด๋ผ๋ฉฐ "๋‹น์ด๋ผ๋Š” ๋ฐฐ๋ฅผ ์ขŒ์ดˆ์‹œํ‚ค๋ ค๋Š” ์˜๋„๋Š” ์—†์—ˆ๋‹ค"๊ณ  ๋งํ–ˆ๋‹ค.

์œค ์˜์›์€ "๋‹น์ด๋ผ๋Š” ๋ฐฐ๊ฐ€ ์ขŒ์ดˆ๋˜๊ฑฐ๋‚˜ ์–ด๋ ค์›Œ์ง€๋ฉด ๋‹น ์ง€๋„๋ถ€ ์˜์›์ด ์•„๋‹ˆ๋ผ ์ˆ˜๋„๊ถŒ์— ์žˆ๋Š” ์˜์›์ด ๊ฐ€์žฅ ๋จผ์ € ์ฃฝ๋Š”๋‹ค"๋ฉฐ "๋ˆ„๊ตฌ๋ฅผ ๊ธฐ๋ถ„ ๋‚˜์˜๊ฒŒ ํ•  ๋งˆ์Œ์œผ๋กœ (์ด์•ผ๊ธฐ)ํ•œ ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ๋‹น์— ๋Œ€ํ•œ ์ง„์ •์„ฑ์œผ๋กœ ์ด์•ผ๊ธฐํ•œ ๊ฒƒ"์ด๋ผ๊ณ  ํ–ˆ๋‹ค."""

prefix = "summarize: " + text

token = tokenizer(prefix ,return_tensors="pt")
output = model.generate(input_ids=token["input_ids"],attention_mask = token["attention_mask"])
text = tokenizer.decode(output[0])[5:-4]
text
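
By default, model.generate falls back to the generation settings stored in the model's config, which can truncate longer summaries. If you want explicit control over summary length and search, you can pass decoding parameters directly; this is a minimal sketch, and the values below (max_length=128, num_beams=4) are illustrative assumptions rather than settings published with the model.

# Illustrative decoding parameters; tune them for your own inputs.
output = model.generate(
    input_ids=tokens["input_ids"],
    attention_mask=tokens["attention_mask"],
    max_length=128,       # cap on summary length in tokens (assumed value)
    num_beams=4,          # beam search often improves summary quality
    early_stopping=True,  # stop once all beams have finished
)
print(tokenizer.decode(output[0], skip_special_tokens=True))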