metadata
license: mit
language:
  - ko

Kconvo-roberta: Korean conversation RoBERTa (github)

  • Many pretrained language models (PLMs) exist for Korean, but most of them are trained on written language.
  • Here, we introduce a PLM retrained on spoken-language data, intended for prediction tasks on Korean conversation data.

Usage

# Kconvo-roberta
from transformers import RobertaTokenizerFast, RobertaModel

tokenizer_roberta = RobertaTokenizerFast.from_pretrained("yeongjoon/Kconvo-roberta")
model_roberta = RobertaModel.from_pretrained("yeongjoon/Kconvo-roberta")
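
As a minimal sketch of what can be done with the loaded model, the snippet below turns Korean utterances into fixed-size sentence vectors. Mean pooling over non-padded tokens is an assumption made for illustration, not a method stated in this README; the helper names `mean_pool` and `embed_sentences` and the two example sentences are likewise hypothetical.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padded positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


def embed_sentences(sentences, model_name="yeongjoon/Kconvo-roberta"):
    """Return one mean-pooled vector per input sentence (illustrative helper)."""
    # Imported here so mean_pool stays usable without transformers installed.
    from transformers import RobertaModel, RobertaTokenizerFast

    tokenizer = RobertaTokenizerFast.from_pretrained(model_name)
    model = RobertaModel.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
        )
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])


if __name__ == "__main__":
    # Downloads the checkpoint from the Hugging Face Hub on first run.
    # Example utterances are made up for illustration.
    vectors = embed_sentences(["오늘 뭐 먹을까?", "나 지금 집에 가는 중이야."])
    print(vectors.shape)  # (number of sentences, hidden size)
```

Other pooling choices, such as taking the first token's vector, would work with the same model outputs; which one suits a given downstream task is something to verify empirically.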

Domain Robust Retraining of Pretrained Language Model

- National Institute of the Korean Language
   * Online Conversation Corpus 2021 (온라인 대화 말뭉치 2021)
   * Daily Conversation Corpus 2020 (일상 대화 말뭉치 2020)
   * Spoken Language Corpus (구어 말뭉치)
   * Messenger Corpus (메신저 말뭉치)

- AI-Hub
   * Online Colloquial Corpus Data (온라인 구어체 말뭉치 데이터)
   * Counseling Speech (상담 음성)
   * Korean Speech (한국어 음성)
   * Free Conversation Speech, General Men and Women (자유대화 음성(일반남여))
   * Daily Life and Colloquial Korean-English Translation Parallel Corpus Data (일상생활 및 구어체 한-영 번역 병렬 말뭉치 데이터)
   * Korean Conversational Speech (한국인 대화음성)
   * Emotional Conversation Corpus (감성 대화 말뭉치)
   * Topic-Specific Text Daily Conversation Data (주제별 텍스트 일상 대화 데이터)
   * Purpose-Specific Goal-Oriented Dialogue Data (용도별 목적대화 데이터)
   * Korean SNS (한국어 SNS)