File size: 1,271 Bytes
73c89dc
 
 
 
 
 
 
 
 
 
 
 
 
 
d490d52
 
 
 
 
 
 
 
 
73c89dc
 
 
 
d490d52
 
 
 
 
 
 
 
 
 
4928c7a
d490d52
 
 
4928c7a
d490d52
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
---
language:
- zh
thumbnail: https://ckip.iis.sinica.edu.tw/files/ckip_logo.png
tags:
- pytorch
- token-classification
- bert
- zh
license: gpl-3.0
datasets:
metrics:
---

# CKIP BERT Base Chinese

This project provides traditional Chinese transformers models (including ALBERT, BERT, GPT2) and NLP tools (including word segmentation, part-of-speech tagging, named entity recognition).

這個專案提供了繁體中文的 transformers 模型(包含 ALBERT、BERT、GPT2)及自然語言處理工具(包含斷詞、詞性標記、實體辨識)。

## Homepage

* https://github.com/ckiplab/ckip-transformers

## Contributers

* [Mu Yang](https://muyang.pro) at [CKIP](https://ckip.iis.sinica.edu.tw) (Author & Maintainer)

## Usage

Please use BertTokenizerFast as tokenizer instead of AutoTokenizer.

請使用 BertTokenizerFast 而非 AutoTokenizer。

```
from transformers import (
  BertTokenizerFast,
  AutoModel,
)

tokenizer = BertTokenizerFast.from_pretrained('bert-base-chinese')
model = AutoModel.from_pretrained('ckiplab/bert-base-chinese-ws')
```

For full usage and more information, please refer to https://github.com/ckiplab/ckip-transformers.

有關完整使用方法及其他資訊,請參見 https://github.com/ckiplab/ckip-transformers 。