{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "# Table Question Answering using TAPAS and OpenVINO™\n", "\n", "Table Question Answering (Table QA) is the answering a question about an information on a given table. You can use the Table Question Answering models to simulate SQL execution by inputting a table.\n", "\n", "In this tutorial we demonstrate how to perform table question answering using OpenVINO. This example based on [TAPAS base model fine-tuned on WikiTable Questions (WTQ)](https://huggingface.co/google/tapas-base-finetuned-wtq) that is based on the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349#:~:text=Answering%20natural%20language%20questions%20over,denotations%20instead%20of%20logical%20forms).\n", "\n", "Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In [this paper](https://arxiv.org/pdf/2004.02349.pdf), it is presented TAPAS, an approach to question answering over tables without generating logical forms. TAPAS trains from weak supervision, and predicts the denotation by selecting table cells and optionally applying a corresponding aggregation operator to such selection. TAPAS extends BERT's architecture to encode tables as input, initializes from an effective joint pre-training of text segments and tables crawled from Wikipedia, and is trained end-to-end.\n", "\n", "\n", "\n", "#### Table of contents:\n", "\n", "- [Prerequisites](#Prerequisites)\n", "- [Use the original model to run an inference](#Use-the-original-model-to-run-an-inference)\n", "- [Convert the original model to OpenVINO Intermediate Representation (IR) format](#Convert-the-original-model-to-OpenVINO-Intermediate-Representation-(IR)-format)\n", "- [Run the OpenVINO model](#Run-the-OpenVINO-model)\n", "- [Interactive inference](#Interactive-inference)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Prerequisites\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "\n", "%pip install -q torch \"transformers>=4.31.0\" \"torch>=2.1\" --extra-index-url https://download.pytorch.org/whl/cpu\n", "%pip install -q \"openvino>=2023.2.0\" \"gradio>=4.0.2\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import torch\n", "from transformers import TapasForQuestionAnswering\n", "from transformers import TapasTokenizer\n", "from transformers import pipeline\n", "import pandas as pd" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Use `TapasForQuestionAnswering.from_pretrained` to download a pretrained model and `TapasTokenizer.from_pretrained` to get a tokenizer." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ActorsNumber of movies
0Brad Pitt87
1Leonardo Di Caprio53
2George Clooney69
\n", "
" ], "text/plain": [ " Actors Number of movies\n", "0 Brad Pitt 87\n", "1 Leonardo Di Caprio 53\n", "2 George Clooney 69" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = TapasForQuestionAnswering.from_pretrained(\"google/tapas-large-finetuned-wtq\")\n", "tokenizer = TapasTokenizer.from_pretrained(\"google/tapas-large-finetuned-wtq\")\n", "\n", "data = {\n", " \"Actors\": [\"Brad Pitt\", \"Leonardo Di Caprio\", \"George Clooney\"],\n", " \"Number of movies\": [\"87\", \"53\", \"69\"],\n", "}\n", "table = pd.DataFrame.from_dict(data)\n", "question = \"how many movies does Leonardo Di Caprio have?\"\n", "table" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Use the original model to run an inference\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "\n", "We use [this example](https://huggingface.co/tasks/table-question-answering) to demonstrate how to make an inference. You can use `pipeline` from `transformer` library for this purpose." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The answer is 53\n" ] } ], "source": [ "tqa = pipeline(task=\"table-question-answering\", model=model, tokenizer=tokenizer)\n", "result = tqa(table=table, query=question)\n", "print(f\"The answer is {result['cells'][0]}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "You can read more about the inference output structure in [this documentation](https://huggingface.co/docs/transformers/model_doc/tapas)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Convert the original model to OpenVINO Intermediate Representation (IR) format\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The original model is a PyTorch module, that can be converted with `ov.convert_model` function directly. We also use `ov.save_model` function to serialize the result of conversion." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import openvino as ov\n", "from pathlib import Path\n", "\n", "\n", "# Define the input shape\n", "batch_size = 1\n", "sequence_length = 29\n", "\n", "# Modify the input shape of the dummy_input dictionary\n", "dummy_input = {\n", " \"input_ids\": torch.zeros((batch_size, sequence_length), dtype=torch.long),\n", " \"attention_mask\": torch.zeros((batch_size, sequence_length), dtype=torch.long),\n", " \"token_type_ids\": torch.zeros((batch_size, sequence_length, 7), dtype=torch.long),\n", "}\n", "\n", "\n", "ov_model_xml_path = Path(\"models/ov_model.xml\")\n", "\n", "if not ov_model_xml_path.exists():\n", " ov_model = ov.convert_model(model, example_input=dummy_input)\n", " ov.save_model(ov_model, ov_model_xml_path)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Run the OpenVINO model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "\n", "Select a device from dropdown list for running inference using OpenVINO." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "465c49c179c64e2881376a078945f605", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='Device:', index=2, options=('CPU', 'GPU', 'AUTO'), value='AUTO')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import ipywidgets as widgets\n", "\n", "core = ov.Core()\n", "\n", "device = widgets.Dropdown(\n", " options=core.available_devices + [\"AUTO\"],\n", " value=\"AUTO\",\n", " description=\"Device:\",\n", " disabled=False,\n", ")\n", "\n", "device" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "We use `ov.compile_model` to make it ready to use for loading on a device. To prepare inputs use the original `tokenizer`." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "inputs = tokenizer(table=table, queries=question, padding=\"max_length\", return_tensors=\"pt\")\n", "\n", "compiled_model = core.compile_model(ov_model_xml_path, device.value)\n", "result = compiled_model((inputs[\"input_ids\"], inputs[\"attention_mask\"], inputs[\"token_type_ids\"]))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Now we should postprocess results. For this, we can use the appropriate part of the code from [`postprocess`](https://github.com/huggingface/transformers/blob/fe2877ce21eb75d34d30664757e2727d7eab817e/src/transformers/pipelines/table_question_answering.py#L393) method of `TableQuestionAnsweringPipeline`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "53\n" ] } ], "source": [ "logits = result[0]\n", "logits_aggregation = result[1]\n", "\n", "\n", "predictions = tokenizer.convert_logits_to_predictions(inputs, torch.from_numpy(result[0]))\n", "answer_coordinates_batch = predictions[0]\n", "aggregators = {}\n", "aggregators_prefix = {}\n", "answers = []\n", "for index, coordinates in enumerate(answer_coordinates_batch):\n", " cells = [table.iat[coordinate] for coordinate in coordinates]\n", " aggregator = aggregators.get(index, \"\")\n", " aggregator_prefix = aggregators_prefix.get(index, \"\")\n", " answer = {\n", " \"answer\": aggregator_prefix + \", \".join(cells),\n", " \"coordinates\": coordinates,\n", " \"cells\": [table.iat[coordinate] for coordinate in coordinates],\n", " }\n", " if aggregator:\n", " answer[\"aggregator\"] = aggregator\n", "\n", " answers.append(answer)\n", "\n", "print(answers[0][\"cells\"][0])" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Also, we can use the original pipeline. For this, we should create a wrapper for `TapasForQuestionAnswering` class replacing `forward` method to use the OpenVINO model for inference and methods and attributes of original model class to be integrated into the pipeline." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "53\n" ] } ], "source": [ "from transformers import TapasConfig\n", "\n", "\n", "# get config for pretrained model\n", "config = TapasConfig.from_pretrained(\"google/tapas-large-finetuned-wtq\")\n", "\n", "\n", "class TapasForQuestionAnswering(TapasForQuestionAnswering): # it is better to keep the class name to avoid warnings\n", " def __init__(self, ov_model_path):\n", " super().__init__(config) # pass config from the pretrained model\n", " self.tqa_model = core.compile_model(ov_model_path, device.value)\n", "\n", " def forward(self, input_ids, *, attention_mask, token_type_ids):\n", " results = self.tqa_model((input_ids, attention_mask, token_type_ids))\n", "\n", " return torch.from_numpy(results[0]), torch.from_numpy(results[1])\n", "\n", "\n", "compiled_model = TapasForQuestionAnswering(ov_model_xml_path)\n", "tqa = pipeline(task=\"table-question-answering\", model=compiled_model, tokenizer=tokenizer)\n", "print(tqa(table=table, query=question)[\"cells\"][0])" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "### Interactive inference\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import requests\n", "\n", "import gradio as gr\n", "import pandas as pd\n", "\n", "r = requests.get(\"https://github.com/openvinotoolkit/openvino_notebooks/files/13215688/eu_city_population_top10.csv\")\n", "\n", "with open(\"eu_city_population_top10.csv\", \"w\") as f:\n", " f.write(r.text)\n", "\n", "\n", "def display_table(csv_file_name):\n", " table = pd.read_csv(csv_file_name.name, delimiter=\",\")\n", " table = table.astype(str)\n", "\n", " return table\n", "\n", "\n", "def highlight_answers(x, coordinates):\n", " highlighted_table = pd.DataFrame(\"\", index=x.index, columns=x.columns)\n", " for coordinates_i in coordinates:\n", " highlighted_table.iloc[coordinates_i[0], coordinates_i[1]] = \"background-color: lightgreen\"\n", "\n", " return highlighted_table\n", "\n", "\n", "def infer(query, csv_file_name):\n", " table = pd.read_csv(csv_file_name.name, delimiter=\",\")\n", " table = table.astype(str)\n", "\n", " result = tqa(table=table, query=query)\n", " table = table.style.apply(highlight_answers, axis=None, coordinates=result[\"coordinates\"])\n", "\n", " return result[\"answer\"], table\n", "\n", "\n", "with gr.Blocks(title=\"TAPAS Table Question Answering\") as demo:\n", " with gr.Row():\n", " with gr.Column():\n", " search_query = gr.Textbox(label=\"Search query\")\n", " csv_file = gr.File(label=\"CSV file\")\n", " infer_button = gr.Button(\"Submit\", variant=\"primary\")\n", " with gr.Column():\n", " answer = gr.Textbox(label=\"Result\")\n", " result_csv_file = gr.Dataframe(label=\"All data\")\n", "\n", " examples = [\n", " [\n", " \"What is the city with the highest population that is not a capital?\",\n", " \"eu_city_population_top10.csv\",\n", " ],\n", " [\"In which country is Madrid?\", \"eu_city_population_top10.csv\"],\n", " [\n", " \"In which cities is the population greater than 2,000,000?\",\n", " \"eu_city_population_top10.csv\",\n", " ],\n", " ]\n", " gr.Examples(examples, inputs=[search_query, csv_file])\n", "\n", " # Callbacks\n", " csv_file.upload(display_table, inputs=csv_file, outputs=result_csv_file)\n", " csv_file.select(display_table, inputs=csv_file, outputs=result_csv_file)\n", " csv_file.change(display_table, inputs=csv_file, outputs=result_csv_file)\n", " infer_button.click(infer, inputs=[search_query, csv_file], outputs=[answer, result_csv_file])\n", "\n", "try:\n", " demo.queue().launch(debug=True)\n", "except Exception:\n", " demo.queue().launch(share=True, debug=True)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" }, "openvino_notebooks": { "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/cda2a126-ad66-40eb-b85f-3e9ceff05190", "tags": { "categories": [ "Model Demos" ], "libraries": [], "other": [], "tasks": [ "Table Question Answering" ] } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }