import streamlit as st
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

st.title('Question-Answering NLU')
st.sidebar.title('Navigation')
menu = st.sidebar.radio(
    "",
    options=["Introduction", "Parsing NLU data into SQuAD 2.0", "Training", "Evaluation"],
    index=0,
)

if menu == "Introduction":
    st.markdown('''
Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering,
leveraging pre-trained question-answering models to perform well in few-shot settings. Instead of
training an intent classifier or a slot tagger, for example, we can ask the model intent- and
slot-related questions in natural language:

```
Context : I'm looking for a cheap flight to Boston.
Question: Is the user looking to book a flight?
Answer : Yes
Question: Is the user asking about departure time?
Answer : No
Question: What price is the user looking for?
Answer : cheap
Question: Where is the user flying from?
Answer : (empty)
```

Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
please read the paper:
[Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).

In this Space, we will see how to transform an example
NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.
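
To make the idea of constructing an NLU hypothesis more concrete, here is a minimal sketch that collects the answers from the example above into intents and slots. The function `build_hypothesis`, the intent names `book_flight` and `departure_time`, and the slot names `price` and `origin` are hypothetical and chosen only for illustration; the answers are copied from the example rather than produced by a model.

```python
def build_hypothesis(answers, intent_questions, slot_questions):
    """Assemble an NLU hypothesis from question-answer pairs.

    answers maps each question to the answer string; an empty string
    means no answer was found in the context.
    """
    # An intent is part of the hypothesis when its question is answered "yes".
    intents = [
        intent
        for intent, question in intent_questions.items()
        if answers.get(question, "").strip().lower() == "yes"
    ]
    # A slot is filled only when its question has a non-empty answer.
    slots = {}
    for slot, question in slot_questions.items():
        value = answers.get(question, "").strip()
        if value:
            slots[slot] = value
    return {"intents": intents, "slots": slots}


# Answers taken from the example above (hypothetical intent and slot names).
answers = {
    "Is the user looking to book a flight?": "Yes",
    "Is the user asking about departure time?": "No",
    "What price is the user looking for?": "cheap",
    "Where is the user flying from?": "",
}
intent_questions = {
    "book_flight": "Is the user looking to book a flight?",
    "departure_time": "Is the user asking about departure time?",
}
slot_questions = {
    "price": "What price is the user looking for?",
    "origin": "Where is the user flying from?",
}
hypothesis = build_hypothesis(answers, intent_questions, slot_questions)
```

With these inputs, the hypothesis contains the `book_flight` intent and a single filled slot, `price`, while the unanswered `origin` slot is left out.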
''')
elif menu == "Parsing NLU data into SQuAD 2.0":
    st.markdown('''
Here, we show a small example of how NLU data can be transformed into QANLU data.
The same method can be used to transform [MultiATIS++](https://github.com/amazon-research/multiatis)
NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.

Here is an example dataset with three intents and two examples per intent:

````
restaurant, I am looking for some Vietnamese food
restaurant, What is there to eat around here?
music, Play my workout playlist
music, Can you find Bob Dylan songs?
flight, Show me flights from Oakland to Dallas
flight, I want two economy tickets from Miami to Chicago
````

Next, we define a few questions for each intent. These can be free-form questions or generated from templates.

````
{
    'restaurant': [
        'Did they ask for a restaurant?',
        'Did they mention a restaurant?'
    ],
    'music': [
        'Did they ask for music?',
        'Do they want to play music?'
    ],
    'flight': [
        'Did they ask for a flight?',
        'Do they want to book a flight?'
    ]
}
````
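
As an illustration of the template option, the sketch below fills a small set of question templates for each intent. The templates, the `INTENT_PHRASES` mapping, and the function `questions_from_templates` are assumptions made for this example and are not part of the QA-NLU repository.

```python
# Hypothetical mapping from intent name to the phrase used inside a question.
INTENT_PHRASES = {
    "restaurant": "a restaurant",
    "music": "music",
    "flight": "a flight",
}

# Hypothetical templates; "{phrase}" is replaced by the intent phrase.
QUESTION_TEMPLATES = [
    "Did they ask for {phrase}?",
    "Is the user interested in {phrase}?",
]


def questions_from_templates(intent_phrases, templates):
    """Return a dict mapping each intent to its templated questions."""
    return {
        intent: [template.format(phrase=phrase) for template in templates]
        for intent, phrase in intent_phrases.items()
    }


questions = questions_from_templates(INTENT_PHRASES, QUESTION_TEMPLATES)
```

Templates keep the wording consistent across intents, while free-form questions can be tailored to each intent individually.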

The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
That script will produce a JSON file that looks like this:

````
{
  "version": 1.0,
  "data": [
    {
      "title": "MultiATIS++",
      "paragraphs": [
        {
          "context": "yes. no. i am looking for some vietnamese food",
          "qas": [
            {
              "question": "did they ask for a restaurant?",
              "id": "49f1180cb9ce4178a8a90f76c21f69b4",
              "is_impossible": false,
              "answers": [
                {
                  "text": "yes",
                  "answer_start": 0
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            {
              "question": "did they ask for music?",
              "id": "a7ffe039fb3e4843ae16d5a68194f45e",
              "is_impossible": false,
              "answers": [
                {
                  "text": "no",
                  "answer_start": 5
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            ... <More questions>
````

There are many tunable parameters when generating the above file, such as how many negative examples to include per question. Follow the same process for training a slot-tagging model.
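
As a rough illustration of the conversion, the sketch below builds one paragraph in the shape of the JSON above from an utterance, its intent, and a question dictionary. It is not the `atis.py` script itself: the function name `make_intent_paragraph`, the choice of the first question per intent, and the random `uuid` ids are assumptions made for this example.

```python
import uuid

# Prefix prepended to every utterance so that "yes" and "no" exist as
# extractable answer spans, as in the example JSON above.
YES_NO_PREFIX = "yes. no. "


def make_intent_paragraph(utterance, gold_intent, questions_by_intent):
    """Build one SQuAD-style paragraph with a yes/no question per intent."""
    context = YES_NO_PREFIX + utterance.lower()
    qas = []
    for intent, questions in questions_by_intent.items():
        # The gold intent is a positive example; every other intent is negative.
        answer = "yes" if intent == gold_intent else "no"
        qas.append({
            "question": questions[0].lower(),
            "id": uuid.uuid4().hex,
            "is_impossible": False,
            "answers": [
                # The answer span always lies inside the prefix.
                {"text": answer, "answer_start": YES_NO_PREFIX.index(answer)}
            ],
            "slot": "",
            "intent": gold_intent,
        })
    return {"context": context, "qas": qas}


questions_by_intent = {
    "restaurant": ["Did they ask for a restaurant?", "Did they mention a restaurant?"],
    "music": ["Did they ask for music?", "Do they want to play music?"],
    "flight": ["Did they ask for a flight?", "Do they want to book a flight?"],
}
paragraph = make_intent_paragraph(
    "I am looking for some Vietnamese food", "restaurant", questions_by_intent
)
```

Here every non-matching intent becomes a negative example; sampling only some of them is one way the number of negatives per question could be tuned.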
''')
elif menu == "Evaluation":
    st.header('QANLU Evaluation')
    # Assumption: AmazonScience/qanlu is a public checkpoint, so no auth token is passed.
    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu")
    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu")
    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)
    context = st.text_input(
        'Please enter the context:',
        value="I want a cheap flight to Boston."
    )
    question = st.text_input(
        'Please enter the question:',
        value="What is the destination?"
    )
    # Prepend "Yes. No. " so that yes/no answers exist as spans in the context.
    qa_input = {
        'context': 'Yes. No. ' + context,
        'question': question
    }
    if st.button('Ask QANLU'):
        answer = qa_pipeline(qa_input)
        st.write(answer)