KoBigBird

Pretrained BigBird Model for Korean (kobigbird-bert-base)

About

BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences.

BigBird relies on block sparse attention instead of full attention (i.e. BERT's attention) and can handle sequences of up to 4096 tokens at a much lower compute cost than BERT.
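To give a rough sense of the savings, the sketch below counts query-key pairs for full attention versus block sparse attention at a sequence length of 4096. It assumes each query block attends to 3 sliding-window blocks, 2 global blocks, and `num_random_blocks` random blocks, which is how the Hugging Face implementation is commonly described; it ignores edge blocks, and the helper name `attended_keys_per_query` is hypothetical. Treat the numbers as an estimate, not a benchmark.

```python
def attended_keys_per_query(seq_len, block_size=64, num_random_blocks=3,
                            num_window_blocks=3, num_global_blocks=2):
    """Approximate number of keys one query token attends to.

    Assumption: window + global + random blocks per query block,
    ignoring the special handling of the first and last blocks.
    """
    blocks = num_window_blocks + num_global_blocks + num_random_blocks
    # A query can never attend to more keys than the sequence contains.
    return min(seq_len, blocks * block_size)


seq_len = 4096

# Full attention: every token attends to every token.
full_pairs = seq_len * seq_len

# Block sparse attention with the default block_size=64, num_random_blocks=3.
sparse_pairs = seq_len * attended_keys_per_query(seq_len)

print(f"full attention pairs:   {full_pairs}")
print(f"block sparse pairs:     {sparse_pairs}")
print(f"approximate reduction:  {full_pairs / sparse_pairs:.1f}x")
```

Under these assumptions the sparse cost grows linearly with sequence length, while the full-attention cost grows quadratically, so the gap widens as inputs get longer.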

The model is warm-started from a Korean BERT checkpoint.

How to use

NOTE: Use BertTokenizer instead of BigBirdTokenizer. (AutoTokenizer will load BertTokenizer.)

from transformers import AutoModel, AutoTokenizer

# by default the model is in `block_sparse` mode with num_random_blocks=3, block_size=64
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")

# you can change `attention_type` to full attention like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", attention_type="original_full")

# you can change `block_size` & `num_random_blocks` like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", block_size=16, num_random_blocks=2)

tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
text = "한국어 BigBird 모델을 공개합니다!"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
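One thing worth keeping in mind with the settings above: a short input such as the example sentence is unlikely to exercise block sparse attention at all. As far as we know, the Hugging Face implementation falls back to `original_full` when the sequence length is at most `(5 + 2 * num_random_blocks) * block_size`, and pads longer inputs up to a multiple of `block_size`. The sketch below only reproduces that arithmetic; the helper names are hypothetical and the formula is an assumption based on the library's warning message, so check it against your installed version.

```python
def fallback_threshold(block_size=64, num_random_blocks=3):
    """Sequence lengths at or below this value are assumed to fall back
    to full attention (2 global + 3 sliding blocks, plus random blocks
    and an equally sized buffer)."""
    return (5 + 2 * num_random_blocks) * block_size


def uses_block_sparse(seq_len, block_size=64, num_random_blocks=3):
    """True if an input of seq_len tokens is long enough for block sparse mode."""
    return seq_len > fallback_threshold(block_size, num_random_blocks)


def padded_length(seq_len, block_size=64):
    """Round seq_len up to the next multiple of block_size."""
    return -(-seq_len // block_size) * block_size


# The two configurations shown in the usage example above.
for block_size, num_random_blocks in [(64, 3), (16, 2)]:
    threshold = fallback_threshold(block_size, num_random_blocks)
    print(f"block_size={block_size}, num_random_blocks={num_random_blocks}: "
          f"block sparse needs more than {threshold} tokens")

print(uses_block_sparse(10))    # a short sentence
print(uses_block_sparse(4096))  # a long document
print(padded_length(1000))      # length after padding to a block multiple
```

If this holds for your version, smaller `block_size` and `num_random_blocks` values lower the length at which sparse attention becomes active, at the cost of a sparser attention pattern.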

Model size: 114M params (Safetensors; tensor types: I64, F32)