---
license: wtfpl
datasets:
- ms_marco
- squad
language:
- en
---

# t5-base-msmarco-squad-query-generation-longp-v2

- Task: query generation
- Architecture: LongT5
- Base model: google/long-t5-tglobal-base

Note: this model is intended as a baseline.

## Prompt

`Generate Query: {document}. Query:`

## Sequence length

1536 tokens
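
A minimal usage sketch with Hugging Face Transformers; the checkpoint identifier, example document, and generation settings below are illustrative assumptions, not values from this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Replace with the actual repo id or a local path to this checkpoint.
checkpoint = "t5-base-msmarco-squad-query-generation-longp-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

document = "LongT5 extends T5 with efficient attention to handle long inputs."
prompt = f"Generate Query: {document}. Query:"

# Truncate to the 1536-token sequence length used during training.
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1536)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
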
## Training details

### Hyperparameters

- Batch size: 8
- Gradient accumulation: 8 steps (effective batch size: 64)
- Learning rate: 3e-4 with a linear scheduler and 400 warmup steps
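
As a sketch, the same settings expressed as Hugging Face `Seq2SeqTrainingArguments`; the output directory and any argument not listed above are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="longt5-query-generation",  # hypothetical path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,
    learning_rate=3e-4,
    lr_scheduler_type="linear",
    warmup_steps=400,
)
```
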
### Data

Total: 252,059 (document, query) pairs

- From MS MARCO v2: 165,238
- From SQuAD: 86,821

The remaining queries from the MS MARCO v2 train split were not used.
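
For illustration, the SQuAD portion of the pairs could be cast into the prompt format like this; a sketch only, since the card does not describe the exact preprocessing, and the MS MARCO v2 pairs would be built analogously:

```python
from datasets import load_dataset

squad = load_dataset("squad", split="train")

def to_pair(example):
    # context becomes the prompted document, question becomes the target query
    return {
        "source": f"Generate Query: {example['context']}. Query:",
        "target": example["question"],
    }

pairs = squad.map(to_pair, remove_columns=squad.column_names)
print(pairs[0]["source"][:60], "->", pairs[0]["target"])
```
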
## Evaluation

This model is intended for data augmentation, so the most meaningful evaluation will come from downstream tasks. For reference, scores on the MS MARCO v2 dev splits:

| Split | BLEU | ROUGE |
|-------|------|-------|
| Dev1 | 0.102 | 0.447 |
| Dev2 | 0.1691 | 0.5013 |
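
Scores of this kind can be computed along these lines with the Hugging Face `evaluate` library; a sketch only, since which BLEU/ROUGE variants the numbers above correspond to is not specified here:

```python
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["what is longt5"]                    # generated queries (illustrative)
references = [["what is the longt5 architecture"]]  # gold queries

print(bleu.compute(predictions=predictions, references=references)["bleu"])
print(rouge.compute(predictions=predictions, references=references))
```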