metadata
language:
  - ko
tags:
  - generated_from_keras_callback
model-index:
  - name: t5-large-korean-P2G
    results: []

t5-large-korean-P2G

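This model is lcw99/t5-large-korean-text-summary fine-tuned on 500,000 sentences from the National Institute of Korean Language newspaper corpus (2021) that were converted to their pronunciation with g2pK. Given G2P-converted (phonetically spelled) Korean text, it restores the original spelling.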
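The description above implies training pairs in which the g2pK output is the model input and the original sentence is the target. A minimal sketch of how such pairs could be built follows; it is an illustration under that assumption, not the author's actual preprocessing script. The function name `build_p2g_pairs` and the `to_phonetic` parameter are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def build_p2g_pairs(
    sentences: Iterable[str],
    to_phonetic: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (phonetic_input, original_target) pairs for P2G training.

    `to_phonetic` is any callable that maps a sentence to its
    pronunciation-based spelling (hypothetical parameter name).
    """
    pairs = []
    for sentence in sentences:
        original = sentence.strip()
        if not original:
            # Skip empty lines so they do not become empty training examples.
            continue
        phonetic = to_phonetic(original)
        # Model input is the phonetic form; the target is the original text.
        pairs.append((phonetic, original))
    return pairs


# With g2pK installed (pip install g2pk), a G2p instance is callable and
# could serve as the converter:
#   from g2pk import G2p
#   pairs = build_p2g_pairs(corpus_sentences, G2p())
```

Passing the converter as an argument keeps the pairing logic independent of g2pK, so it can be checked with a stand-in function before running over a full corpus.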

Usage

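from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

# Sentence tokenizer data used below to keep only the first output sentence.
nltk.download('punkt')

# Local directory or Hub repository id that holds the model files.
model_dir = "t5-large-korean-P2G"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)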

text = "회새긴간 작까 김동시 걍심꼬백 뜽 새 소설집 뚜권 출간"
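# Tokenize the phonetically spelled input, truncated to 256 tokens.
inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
# Beam search with sampling; outputs can differ between runs.
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
# Keep only the first sentence of the restored text.
restored_text = nltk.sent_tokenize(decoded_output.strip())[0]
print(restored_text)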
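For restoring several sentences at once, the same calls can be wrapped in a small helper. The sketch below is one possible variant, not the author's setup: the name `restore_spelling` and its defaults are hypothetical, and it switches to plain beam search (`do_sample=False`) so that repeated runs give the same result, whereas the example above samples. It expects an already loaded tokenizer and model.

```python
from typing import List


def restore_spelling(
    texts: List[str],
    tokenizer,
    model,
    num_beams: int = 4,
    max_length: int = 100,
) -> List[str]:
    """Restore original spelling for a batch of phonetically spelled texts.

    `tokenizer` and `model` are the objects returned by
    AutoTokenizer.from_pretrained and AutoModelForSeq2SeqLM.from_pretrained.
    """
    # Pad to the longest text in the batch so all rows share one tensor shape.
    inputs = tokenizer(
        texts,
        max_length=256,
        truncation=True,
        padding=True,
        return_tensors="pt",
    )
    # Deterministic beam search (an assumption of this sketch, not the
    # setting used in the usage example above).
    output_ids = model.generate(
        **inputs,
        num_beams=num_beams,
        do_sample=False,
        max_length=max_length,
    )
    decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [sentence.strip() for sentence in decoded]


# Example call, assuming `tokenizer` and `model` are loaded as shown above:
#   restored = restore_spelling([text], tokenizer, model)
#   print(restored[0])
```

Deterministic decoding is usually easier to debug for a restoration task, since the same input always maps to the same output; sampling can be re-enabled by changing the `generate` arguments.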

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: None
  • training_precision: float16

Training results

Framework versions

  • Transformers 4.22.1
  • TensorFlow 2.10.0
  • Datasets 2.5.1
  • Tokenizers 0.12.1