Create README.md
README.md
ADDED
@@ -0,0 +1,37 @@
---
tags:
- MRC
- TyDiQA
- xlm-roberta-large
language:
- multilingual
---

# Model description

A reading comprehension model based on XLM-RoBERTa for the [TyDiQA primary tasks](https://arxiv.org/abs/2003.05002):

- **Passage selection task (SelectP):** Given a list of the passages in the article, return either (a) the index of the passage that answers the question or (b) NULL if no such passage exists.

- **Minimal answer span task (MinSpan):** Given the full text of an article, return one of (a) the start and end byte indices of the minimal span that completely answers the question; (b) YES or NO if the question requires a yes/no answer and a conclusion can be drawn from the passage; or (c) NULL if no minimal answer can be produced for the question.

The model is initialized with [xlm-roberta-large](https://huggingface.co/xlm-roberta-large/) and fine-tuned on the [TyDiQA training data](https://huggingface.co/datasets/tydiqa).
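
The answer formats of the two primary tasks can be illustrated with a small sketch. MinSpan answers are byte indices, so for non-ASCII text they have to be applied to the encoded article rather than to its characters. The `PrimaryTaskPrediction` class and the `render_minimal_answer` function are hypothetical names used only for illustration; they are not part of PrimeQA or the TyDiQA tooling. The sketch also assumes UTF-8 text and an exclusive end index.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PrimaryTaskPrediction:
    """Hypothetical container for one prediction (illustration only)."""
    passage_index: Optional[int] = None           # SelectP: None stands for NULL
    span_bytes: Optional[Tuple[int, int]] = None  # MinSpan (a): (start, end) byte indices
    yes_no: Optional[str] = None                  # MinSpan (b): "YES" or "NO"


def render_minimal_answer(article: str, pred: PrimaryTaskPrediction) -> Optional[str]:
    """Return the answer text, "YES"/"NO", or None when the answer is NULL."""
    if pred.yes_no is not None:
        return pred.yes_no
    if pred.span_bytes is None:
        return None
    start, end = pred.span_bytes
    # Assumption: indices refer to the UTF-8 bytes and the end is exclusive.
    return article.encode("utf-8")[start:end].decode("utf-8")


# Each "ä" takes two bytes in UTF-8, so byte and character offsets diverge.
article = "Suomen pääkaupunki on Helsinki."
prediction = PrimaryTaskPrediction(passage_index=0, span_bytes=(24, 32))
answer = render_minimal_answer(article, prediction)
```

Here `answer` is `"Helsinki"`, whereas slicing the string directly with `article[24:32]` would return a shifted span, because Python string indices count characters and not bytes.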

## Intended uses & limitations

You can use the model as is, without further fine-tuning, for the reading comprehension task.

## Usage

You can use this model directly with the [PrimeQA](https://github.com/primeqa/primeqa) pipeline for reading comprehension; see the [tydiqa.ipynb](https://github.com/primeqa/primeqa/blob/main/notebooks/mrc/tydiqa.ipynb) notebook for an example.

### BibTeX entry and citation info

```bibtex
@article{tydiqa,
  title   = {TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages},
  author  = {Jonathan H. Clark and Eunsol Choi and Michael Collins and Dan Garrette and Tom Kwiatkowski and Vitaly Nikolaev and Jennimaria Palomaki},
  year    = {2020},
  journal = {Transactions of the Association for Computational Linguistics}
}
```