xfact/FiD-NQ · Hugging Face

Dataset

desc: Natural Questions with contexts retrieved by DPR as 100-token passages. Inputs were truncated to 200 tokens. The label is the first answer in the answer list after answers longer than 5 tokens were discarded.

input: "question: total number of death row inmates in the us title: Death row context: on death row in the United States on January 1, 2013. Since 1977, the states of Texas (464), Virginia (108) and Oklahoma (94) have executed the most death row inmates. , California (683), Florida (390), Texas (330) and Pennsylvania (218) housed more than half of all inmates pending on death row. , the longest-serving prisoner on death row in the US who has been executed was Jack Alderman who served over 33 years. He was executed in Georgia in 2008. However, Alderman only holds the distinction of being the longest-serving "executed" inmate so far. A Florida inmate, Gary Alvord, arrived"

label: "2,718"
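
The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the helper names `build_input` and `pick_label` are hypothetical, and whitespace-separated words stand in for tokenizer tokens, so exact lengths will differ from those produced by the T5 tokenizer.

```python
from typing import List, Optional


def build_input(question: str, title: str, context: str, max_tokens: int = 200) -> str:
    """Join the three fields with the card's prefixes, then truncate."""
    text = f"question: {question} title: {title} context: {context}"
    # Assumption: whitespace-separated words approximate tokenizer tokens.
    tokens = text.split()
    return " ".join(tokens[:max_tokens])


def pick_label(answers: List[str], max_answer_tokens: int = 5) -> Optional[str]:
    """Return the first answer with at most max_answer_tokens tokens."""
    for answer in answers:
        if len(answer.split()) <= max_answer_tokens:
            return answer
    return None  # every candidate answer was too long


example = build_input(
    "total number of death row inmates in the us",
    "Death row",
    "on death row in the United States on January 1, 2013.",
)
label = pick_label(["2,718"])
```

Returning `None` when every answer is too long leaves it to the caller to decide whether to skip such an example. The card does not say how that case was handled.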

FiD Architecture

This pre-trained model achieves a 1-2% higher exact-match (EM) score than the FiD paper's results on NQ with 5, 10, and 25 passages.
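
For readers unfamiliar with the EM metric, here is a minimal sketch of how exact match is commonly computed for open-domain QA. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and articles). Whether the numbers above were computed with exactly this normalization is an assumption, and the function names are illustrative.

```python
import re
import string
from typing import Iterable


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Iterable[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list, references: list) -> float:
    """Percentage of predictions that exactly match one of their gold answers."""
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

With this normalization, a prediction of "2,718" counts as a match for a gold answer written as "2718", since punctuation is removed before comparison.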

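from transformers import AutoConfig, T5Tokenizer
from fid import FiDT5  # FiDT5 is not part of transformers; it comes from the `fid` module

# Start from the base T5 tokenizer.
tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Register the field prefixes used in the dataset inputs as special tokens.
question = "question:"
title = "title:"
context = "context:"
tokenizer.add_tokens([question, title, context], special_tokens=True)

# Load the pre-trained FiD weights using the t5-base configuration.
config = AutoConfig.from_pretrained("t5-base")
model = FiDT5.from_pretrained(
    "xfact/FiD-NQ",
    config=config
)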