---
language:
- "lzh"
tags:
- "classical chinese"
- "literary chinese"
- "ancient chinese"
license: "apache-2.0"
pipeline_tag: "fill-mask"
widget:
- text: "孟子[MASK]梁惠王"
---

# roberta-classical-chinese-base-char

## Model Description

This is a RoBERTa model pre-trained on Classical Chinese texts, derived from [GuwenBERT-base](https://huggingface.co/ethanyt/guwenbert-base). Its character embeddings are enhanced to cover both traditional and simplified characters. You can fine-tune `roberta-classical-chinese-base-char` for downstream tasks such as sentence segmentation, POS-tagging, and dependency parsing.

## How to Use

```py
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-classical-chinese-base-char")
model = AutoModel.from_pretrained("KoichiYasuoka/roberta-classical-chinese-base-char")
```
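Since the model card declares the `fill-mask` pipeline tag, the checkpoint can also be queried directly for masked-character prediction. The sketch below (an illustration, not part of the original card) reuses the widget sentence `孟子[MASK]梁惠王`, where `[MASK]` stands for a single character:

```python
from transformers import pipeline

# Fill-mask pipeline on the same checkpoint; the tokenizer treats each
# Classical Chinese character as one token, so [MASK] covers one character.
fill = pipeline("fill-mask", model="KoichiYasuoka/roberta-classical-chinese-base-char")

# Print the top candidate characters with their scores.
for result in fill("孟子[MASK]梁惠王"):
    print(result["token_str"], result["score"])
```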