Update README.md
README.md
@@ -7,6 +7,8 @@ tags:
 - Machine Reading
 - Text Generation
 - Pretrained Chinese T5-Large model

 metrics:
 - RougeL
@@ -14,6 +16,10 @@ metrics:
 - F1
 - EM
 - Contain Answer Rate

 licence: apache-2.0
 ---
@@ -40,15 +46,15 @@ This T5-Large model, is the first pretrained generative question answering model

 Results on the CMRC 2018 test set (the original task predicts answer start and end positions; here it is treated as a generative answering task)

-| model |
 |-------|----|----|--------------------|--------|--------|
-| Ours |
-|MacBERT-Large(SOTA)

-Our model enjoys a high level of generation quality and accuracy, with 76% of generated answers containing the ground truth
 P.S. The SOTA model only predicts the start and end positions, as an extractive MRC task.

-我们的模型有着极高的生成质量和准确率,76%的回答包含了正确答案(Contain Answer Rate)
 P.S. SOTA模型只需预测起始和结束位置,这种抽取式阅读理解任务比生成式的简单很多。

 ## 样例 Cases
@@ -86,6 +92,19 @@ tokenizer.batch_decode(pred_ids, skip_special_tokens=True, clean_up_tokenization

 # 引用 Citation

 You can also cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):
@@ -7,6 +7,8 @@ tags:
 - Machine Reading
 - Text Generation
 - Pretrained Chinese T5-Large model
+- Squad
+- CMRC

 metrics:
 - RougeL
@@ -14,6 +16,10 @@ metrics:
 - F1
 - EM
 - Contain Answer Rate
+
+widget:
+- text: "question:美国建筑师是怎样创造维多利亚哥特式建筑的?knowledge:底特律圣保罗座堂(Cathedral Church of St. Paul)是美国圣公会密歇根教区的主教座堂,位于底特律伍德沃德大道4800号,毗邻韦恩州立大学校园。圣保罗堂区成立于1824年,是密歇根第一个新教堂会。现存建筑由著名教堂设计师拉尔夫·克拉姆(Ralph Adams Cram),始建于1907年,至今钟楼尚未完成。教堂完全用石灰岩和中世纪建筑技术建造,没有支持的钢铁上层建筑。建设拥有交错骨,大片花窗玻璃,雕饰窗格,哥特式建筑的楷模,包括Pewabic 陶瓷中心。在1912年成为教区的主教座堂。圣保罗座堂是20世纪初后期哥特复兴建筑的最佳实例之一。19世纪中叶的美国建筑师输入并重新阐释了英国哥特复兴风格,基于中世纪主教座堂的视觉丰富的细节。美国建筑师将哥特元素与简单的建筑规划相结合,创造了美国建筑风格“维多利亚哥特式”(Victorian Gothic)。兴建于1876年的堡垒街长老会教堂就是早期维多利亚哥特式建筑的杰出例证。answer:<extra_id_0>"
+- example_title: "将哥特元素与简单的建筑规划相结合"

 licence: apache-2.0
 ---
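The widget entry in this change suggests the input layout the model expects: the question, the supporting passage, and an `answer:` slot ending in the T5 sentinel `<extra_id_0>`, concatenated without separators. A minimal Python sketch of building such an input follows. `build_qa_prompt` is a hypothetical helper, and the layout is inferred only from this single widget example, so the preprocessing in the README's own usage section may differ.

```python
def build_qa_prompt(question: str, knowledge: str) -> str:
    """Assemble a model input in the layout seen in the widget example.

    Assumption: the three fields are joined with no separator and the
    answer slot is marked with the T5 sentinel token <extra_id_0>.
    """
    return f"question:{question}knowledge:{knowledge}answer:<extra_id_0>"


# Example with placeholder strings (hypothetical, not taken from the dataset).
prompt = build_qa_prompt("Who designed the cathedral?", "The cathedral was designed by Ralph Adams Cram.")
print(prompt)
```

The resulting string would then be passed to the tokenizer in the usual way before generation.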
@@ -40,15 +46,15 @@ This T5-Large model, is the first pretrained generative question answering model

 Results on the CMRC 2018 test set (the original task predicts answer start and end positions; here it is treated as a generative answering task)

+| model | Contain Answer Rate | RougeL | BLEU-4 | F1 | EM |
 |-------|----|----|--------------------|--------|--------|
+| Ours | 76.0 | 82.7 | 61.1 | 77.9 | 57.1 |
+| MacBERT-Large (SOTA) | - | - | - | 88.9 | 70.0 |

+Our model achieves high generation quality and accuracy: 76% of generated answers contain the ground truth (Contain Answer Rate). The high RougeL and BLEU-4 scores reflect the overlap between generated answers and the ground truth. Its EM is lower because it mostly generates complete sentences, whereas the gold answers are usually sentence fragments.
 P.S. The SOTA model only predicts the start and end positions, as an extractive MRC task.

+我们的模型有着极高的生成质量和准确率,76%的回答包含了正确答案(Contain Answer Rate)。RougeL和BLEU-4反映了模型预测结果和标准答案重合的程度。我们的模型EM值较低,因为生成的大部分为完整的句子,而标准答案通常是句子片段。
 P.S. SOTA模型只需预测起始和结束位置,这种抽取式阅读理解任务比生成式的简单很多。

 ## 样例 Cases
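As a rough illustration of the two string-level metrics in the results table, the sketch below computes Contain Answer Rate as the percentage of predictions that contain the reference answer as a substring, and EM as the percentage that match it exactly. This is an assumption based on the README's wording, not the authors' evaluation script. The function names and sample strings are hypothetical, and the sketch uses a single reference per question, whereas a benchmark may provide several.

```python
from typing import List


def contain_answer_rate(predictions: List[str], references: List[str]) -> float:
    """Percentage of predictions that contain their reference as a substring."""
    if not predictions:
        return 0.0
    hits = sum(ref.strip() in pred.strip() for pred, ref in zip(predictions, references))
    return 100.0 * hits / len(predictions)


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Percentage of predictions identical to their reference (after stripping whitespace)."""
    if not predictions:
        return 0.0
    hits = sum(pred.strip() == ref.strip() for pred, ref in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical sample data: one exact match, one full-sentence answer that
# contains the reference, and one miss.
preds = ["Ralph Adams Cram", "Construction began in 1907.", "Michigan"]
refs = ["Ralph Adams Cram", "1907", "Detroit"]

print(contain_answer_rate(preds, refs))
print(exact_match(preds, refs))
```

The second sample shows why a generative model can score well on Contain Answer Rate while losing EM: a complete sentence contains the gold fragment but does not equal it.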
@@ -86,6 +92,19 @@ tokenizer.batch_decode(pred_ids, skip_special_tokens=True, clean_up_tokenization

 # 引用 Citation
+如果您在您的工作中使用了我们的模型,可以引用我们的[论文](https://arxiv.org/abs/2210.08590):
+
+If you are using the resource for your work, please cite our [paper](https://arxiv.org/abs/2210.08590):
+
+```text
+@article{fengshenbang,
+  author = {Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen and Ruyi Gan and Jiaxing Zhang},
+  title = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
+  journal = {CoRR},
+  volume = {abs/2209.02970},
+  year = {2022}
+}
+```

 You can also cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):