distilbert-tokenizer_256k-MLM_best / tokenizer_config.json
nreimers's picture
upload
b034ae1
raw
history blame
412 Bytes
{"model_max_length": 512, "unk_token": "[UNK]", "cls_token": "[CLS]", "sep_token": "[SEP]", "pad_token": "[PAD]", "mask_token": "[MASK]", "model_input_names": ["input_ids", "attention_mask"], "special_tokens_map_file": "c4_msmarco_news_s2orc_wiki/tokenizer-256k/special_tokens_map.json", "name_or_path": "train-w2v-model/c4_msmarco_news_s2orc_wiki/distilbert-256k/", "tokenizer_class": "PreTrainedTokenizerFast"}