Commit 9b6e41a by iioSnail (parent: f76a1c9)

Update README.md


# ChineseBERT-base

This project repackages ChineseBERT so that it can be loaded directly through the HuggingFace API, with no extra code or configuration required.

Original paper:
**[ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information](https://arxiv.org/abs/2106.16038)**
*Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu and Jiwei Li*

Original repository:
[ChineseBERT github link](https://github.com/ShannonAI/ChineseBert)

Original model:
[ShannonAI/ChineseBERT-base](https://huggingface.co/ShannonAI/ChineseBERT-base) (that model cannot be loaded directly through the HuggingFace API)

# How to use this project

1. Install pypinyin (a brief sketch of what the library does follows the install command)

```
pip install pypinyin
```
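pypinyin is required because ChineseBERT enriches the character representation with pinyin (and glyph) information, and the tokenizer's remote code presumably uses it to romanize each character. The following is only an illustrative sketch of what the library does, not part of the loading steps:

```python
from pypinyin import Style, pinyin

# Convert each character to pinyin with tone numbers; this is the kind of
# phonetic information ChineseBERT fuses with the character embeddings.
print(pinyin("我喜欢猫", style=Style.TONE3))
# roughly: [['wo3'], ['xi3'], ['huan1'], ['mao1']]
```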

2. Load the tokenizer and model with AutoClass (a quick check of the tokenizer output follows the snippet)

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("iioSnail/ChineseBERT-base", trust_remote_code=True)
model = AutoModel.from_pretrained("iioSnail/ChineseBERT-base", trust_remote_code=True)
```
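Continuing from the snippet above, you can sanity-check what the remote-code tokenizer returns. The exact field names are defined by the remote code and not documented here; a plain BERT tokenizer would return `input_ids`, `token_type_ids` and `attention_mask`, and the ChineseBERT tokenizer may add pinyin-related tensors on top:

```python
# Tokenize a sample sentence the same way as in step 3 and list the returned fields.
inputs = tokenizer(["我 喜 欢 猫"], return_tensors="pt")
print(list(inputs.keys()))
print(inputs["input_ids"])
```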

3. From here on, use it just like a regular BERT model

```python
inputs = tokenizer(["我 喜 [MASK] 猫"], return_tensors='pt')
logits = model(**inputs).logits

print(tokenizer.decode(logits.argmax(-1)[0, 1:-1]))
```

Output:

```
我 喜 欢 猫
```
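If you want more than the single argmax prediction, you can inspect the top candidates the model assigns to the `[MASK]` position. The sketch below uses only standard transformers/PyTorch calls and assumes, as in the example above, that the remote-code model returns `.logits` and that the tokenizer exposes the usual BERT `mask_token_id`:

```python
import torch

inputs = tokenizer(["我 喜 [MASK] 猫"], return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] token and look at the 5 highest-scoring candidates there.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top5 = torch.topk(logits[0, mask_pos[0]], k=5)
print(tokenizer.convert_ids_to_tokens(top5.indices.tolist()))
```

For the sentence above, you would expect 欢 to appear near the top of that list.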

Files changed (1)

README.md CHANGED (+7, -1)

@@ -1,3 +1,9 @@
 ---
 license: afl-3.0
----
+language:
+- zh
+tags:
+- bert
+- chinesebert
+- MLM
+---