---
license: other
language:
- ko
- en
- ja
- zh
pipeline_tag: fill-mask
---
# Model Card for KEByT5-base (580M parameters)
<!-- Provide a quick summary of what the model is/does. -->
KEByT5: Korean-Enhanced/Enriched Byte-level Text-to-Text Transfer Transformer(T5)
A cross-modal, multilingual-friendly, token-free encoder-decoder pretrained language model centered on Korean
* This pretrained language model aims to be a token-free model that facilitates knowledge exchange across languages and across modalities other than text, such as vision and audio.
* No separate tokenizer is required, but for convenience you can load one with AutoTokenizer.from_pretrained() and handle the model exactly like any other tokenizer-based encoder-decoder model. To skip the tokenizer entirely, split the UTF-8 input into bytes and add +3 to each byte to obtain the token IDs (i.e., byte value 0 == Token ID 3, byte value 255 == Token ID 258); a minimal sketch follows this list.
* The model is currently at the preview stage and requires fine-tuning before practical use.
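The following is a minimal sketch of this tokenizer-free encoding, assuming only the +3 byte offset described above (the helper names are illustrative and not part of the released code):
```
# Minimal sketch of tokenizer-free encoding/decoding, assuming the +3 byte offset
# described above (ids 0/1/2 are reserved for <pad>/</s>/<unk>, as in ByT5).
def encode_utf8_bytes(text: str) -> list[int]:
    # Shift every UTF-8 byte by +3 to make room for the reserved ids, then append </s>.
    return [b + 3 for b in text.encode("utf-8")] + [1]

def decode_utf8_bytes(token_ids: list[int]) -> str:
    # Drop reserved ids, undo the +3 shift, and decode the remaining bytes.
    return bytes(t - 3 for t in token_ids if 3 <= t <= 258).decode("utf-8", errors="ignore")

print(encode_utf8_bytes("Hi"))  # [75, 108, 1]
```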
## Acknowledgements
* This pretrained language model was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022-00187238, Development of Large Korean Language Model Technology for Efficient Pre-training).
# Model Details
This pretrained language model is released in the following sizes:
* kebyt5-small : 330M [link](https://huggingface.co/etri-lirs/kebyt5-small-preview)
* kebyt5-base : 580M (this model)
* kebyt5-large : 1.23B [link](https://huggingface.co/etri-lirs/kebyt5-large-preview)
์ด๋“ค ๋ชจ๋ธ์€ [google/byt5-small](https://huggingface.co/google/byt5-small), [google/byt5-base](https://huggingface.co/google/byt5-base), [google/byt5-large](https://huggingface.co/google/byt5-large) ๋ชจ๋ธ๊ณผ ๋™์ผํ•œ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ์™€ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง€๋ฉฐ, ํ† ํฌ๋‚˜์ด์ €(ByT5Tokenizer)์™€ ๊ตฌํ˜„ ์ƒ ๋‘ ๋ชจ๋ธ์€ ๋ณ„๋„์˜ ์ˆ˜์ •์—†์ด ๋ฐ”๋กœ ๊ตํ™˜ํ•˜์—ฌ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
huggingface transformers์—์„œ์˜ ์‚ฌ์šฉ๋ฒ• ์—ญ์‹œ, T5ForConditionalGeneration์„ ๋™์ผํ•˜๊ฒŒ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
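For example, a minimal sketch of loading the checkpoint with the plain T5 classes (the repo id below assumes this model card's checkpoint):
```
# Minimal sketch: the ByT5 tokenizer and T5 seq2seq classes load the checkpoint directly.
from transformers import ByT5Tokenizer, T5ForConditionalGeneration

tokenizer = ByT5Tokenizer.from_pretrained("etri-lirs/kebyt5-base-preview")
model = T5ForConditionalGeneration.from_pretrained("etri-lirs/kebyt5-base-preview")
```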
## Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Language Intelligence Research Section, Electronics and Telecommunications Research Institute (ETRI)
- **Model type:** Encoder-Decoder Transformer, specifically ByT5.
- **Language(s) (NLP):** Korean; English, Chinese, and Japanese (partially, for translation tasks).
- **License:** Apache 2.0 License
- **Finetuned from model:** kebyt5-small/-base/-large model weights were initialized from google/byt5-* weights for warm-start pretraining.
## Model Sources
- **Repository:** https://github.com/etri-crossmodal/llm-downstream-s2s (for downstream task training)
- **Paper:** Shin et al., "Towards Korean-Centric Token-free Pretrained Language Model", in Proc. of the 35th Annual Conference on Human and Cognitive Language Technology, pp. 711-715, 2023. (in Korean)
# Uses
Use of this pretrained language model is restricted to research and educational purposes.
## Direct Use
The currently released model has been trained only with the corrupted-span denoising objective used for T5 pretraining, so a fine-tuning step is required before applying it to real downstream tasks.
Masked token prediction can be performed using the sentinel tokens (token ids 258, 257, 256, ...), but the predicted content may be inappropriate; an illustrative sketch is shown below.
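The following sketch of sentinel-based masked token prediction assumes, per the note above, that the final byte ids (258, 257, ...) serve as sentinels; the pretrained-only checkpoint gives no quality guarantees.
```
# Illustrative only: mask a span with sentinel id 258 and let the model infill it.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("etri-lirs/kebyt5-base-preview")
model = AutoModelForSeq2SeqLM.from_pretrained("etri-lirs/kebyt5-base-preview")

text = "The dog chases a ball in the park."
ids = [b + 3 for b in text.encode("utf-8")]      # +3 byte-offset encoding
masked = ids[:4] + [258] + ids[7:] + [1]         # replace the bytes of "dog" with a sentinel; 1 == </s>

with torch.no_grad():
    output = model.generate(torch.tensor([masked]), max_new_tokens=20)
# The decoded output may still contain sentinel/denoising artifacts.
print(tokenizer.decode(output[0], skip_special_tokens=True))
```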
## Downstream Use
Owing to its token-free design, the model is robust to complex or noisy input and is well suited to generating short sequences (e.g., language understanding, dialogue response generation).
Because pretraining used sequences of up to 1024 bytes, the model may not be suitable for problems that require handling longer sequences.
For problems involving longer sequences, we recommend the [GBST-based token-free language model](https://huggingface.co/etri-lirs/gbst-kebyt5-base-preview).
# Bias, Risks, Limitations, and Recommendations
Information obtained through masked token prediction carries the same risks as other generative language models. The training data received no special filtering for profanity, obscenity, political content, or other offensive language. The model may therefore generate socially unacceptable tokens or text, and it is difficult to predict what it will produce in response to offensive input depending on the surrounding context.
The model was trained primarily on Korean text, and it is best suited to downstream tasks that can transfer these characteristics, in particular classification, summarization, and short-sentence generation. Although out-of-vocabulary tokens cannot occur at the input/output level, text sequences not seen during pretraining still require additional domain-adaptation training and downstream-task fine-tuning.
## How to Get Started with the Model
With Transformers version 4.27.0 or later, the model and tokenizer can be loaded with the following Python code:
```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("etri-lirs/kebyt5-base-preview")
model = AutoModelForSeq2SeqLM.from_pretrained("etri-lirs/kebyt5-base-preview")
```
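A brief usage sketch (the input text and generation settings are illustrative only; as noted above, the preview checkpoint needs fine-tuning to produce useful output):
```
# Illustrative only: encode a string, run generation, and decode the result.
inputs = tokenizer("์•ˆ๋…•ํ•˜์„ธ์š”.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```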
# Training Details
## Training Data
The following public datasets were used for pretraining:
* National Institute of Korean Language (NIKL), Modu Corpus: Newspaper v2.0
* National Institute of Korean Language (NIKL), Modu Corpus: Spoken Corpus v1.2
* National Institute of Korean Language (NIKL), Modu Corpus: Written Corpus v1.0
* National Institute of Korean Language (NIKL), Modu Corpus: Newspaper 2020 v1.0
* National Institute of Korean Language (NIKL), Modu Corpus: Newspaper 2021 v1.0
* Korean Wikipedia dump, [v2020.09.20](https://github.com/lovit/kowikitext)
* [Namuwiki dump](https://github.com/lovit/namuwikitext)
* National Information Society Agency (NIA), AIHub: specialized-domain corpora; law/patent knowledge bases; paper/book/dialogue/script summarization; Korean-English/Korean-Japanese/Korean-Chinese translation corpora; call-center/ordering/news-article/visual-information QA; broadcast/meeting/counseling speech recognition data.
* National Information Society Agency (NIA), AIHub: large-scale web-based Korean corpus data
* National Information Society Agency (NIA), AIHub: online colloquial corpus data
* [KcBERT corpus, v2022.3Q](https://github.com/Beomi/KcBERT)

In addition, a small amount of internally constructed data and some synthetic data were used, for a total of roughly ~220GB of training data.
# Evaluation
## Testing Data, Factors, Metrics & Results
Evaluation used the dev set of the [KLUE dataset, v1.1](https://klue-benchmark.com/), a benchmark for Korean language understanding tasks.
All predictions were produced by directly generating the output labels via seq2seq decoding; a sketch of this input/output format is given after the table below.
| models | KLUE-TC(YNAT) (F1) | KLUE-NER (Entity, Char F1) | KLUE-DP (UAS, LAS) | KLUE-MRC (EM, ROUGE-W) |
|-------------|---------------|--------------|-------------------|------------------|
| google/byt5-large (1.23B) | 78.52 | 48.81, 63.95 | 44.26, 7.805 | _NOT TESTED_ |
| **KEByT5-Base (580M)** | **84.99** | **86.75, 91.05** | **88.70, 85.90** | **62.28, 68.38** |
| KEByT5-Large (1.23B) | 85.68 | 88.09, 92.40 | 87.18, 85.52 | 70.07, 75.81 |
| GBST-KEByT5-Base (584M) | 85.29 | 87.35, 92.09 | 88.33, 85.00 | 59.69, 66.44 |
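A purely hypothetical sketch of how a classification example is cast as a text-to-text pair under this protocol (the actual prompt and label strings used for evaluation are not specified in this card):
```
# Hypothetical seq2seq formatting: the model is trained/evaluated to emit the label string directly.
source_text = "ynat classification: <input sentence>"   # hypothetical task prefix + input
target_text = "<label name>"                            # gold label, generated as plain text
```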
๋Œ€ํ™” ์ƒํƒœ ์ถ”์ (DST; Dialogue State Tracking) ํƒœ์Šคํฌ์ธ KLUE-WOS-v1.1 ๊ฒฐ๊ณผ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค. ํ‰๊ฐ€๋Š” ๋ชจ๋‘ seq2seq์„ ์ด์šฉํ•œ ๋‹ค์ด์–ผ๋กœ๊ทธ ์ƒํƒœ ์ง์ ‘ ์ƒ์„ฑ์„ ์‚ฌ์šฉํ–ˆ์Šต๋‹ˆ๋‹ค:
| models | WOS (JGA, %) | WOS (F1, %) |
| ------- | ---------- | ----------- |
| klue/klue-roberta-large | 50.22 | 92.23 |
| **KEByT5-Base (580M)** | **77.15** | **96.92** |
| KEByT5-Large (1.23B) | 78.54 | 97.28 |
Results on KLUE-RE-v1.1, a relation extraction (RE) task, are as follows. Micro F1 is reported over the 29 relation classes, excluding no_relation:
| models | KLUE-RE (F1, %) |
| ------- | ---------- |
| klue/klue-roberta-base | 65.90 |
| **KEByT5-Base (580M)** | **65.48** |
| KEByT5-Large (1.23B) | 68.95 |
## Compute Infrastructure
* Trained on 4x NVIDIA A100 80GB GPUs
# Citation
* Heo et al., "Relation Extraction Using a Generative Language Model", in Proc. of the 35th Annual Conference on Human and Cognitive Language Technology, pp. 708-710, 2023. (in Korean)
* Lee et al., "Korean Generation-based Dialogue State Tracking Using the Korean Token-free Pre-trained Language Model KeByT5", in Proc. of the 35th Annual Conference on Human and Cognitive Language Technology, pp. 644-647, 2023. (in Korean)
# Model Card Authors/Contacts
Jong-hun Shin (ETRI), e-mail: jhshin82 _AT_ etri _DOT_ re _DOT_ kr