---
language:
- en
license: llama2
---

# TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

Paper: https://arxiv.org/abs/2401.13223

Code: https://github.com/fengbinzhu/TAT-LLM


## Introduction

We present TAT-LLM, a language model specialized for question answering (QA) over hybrid tabular and textual data. TAT-LLM follows a Step-wise Pipeline with three steps: Extraction, Reasoning, and Execution. We obtain it by fine-tuning LLaMA 2 on training data generated automatically from existing expert-annotated datasets. In our experiments, TAT-LLM outperforms the baselines we compared against, including earlier fine-tuned models and very large language models such as GPT-4, on three financial QA benchmarks: FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller models, when fine-tuned for a specialized task, can compete with much larger general-purpose models.

| Model | Size | FinQA | TAT-QA | TAT-DQA |
| ---   | ---  | ---   | ---   | ---    |
| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
| GPT-4 | - | 63.91 | 71.92 | 64.46 |
| [TAT-LLM-7B-LORA](https://huggingface.co/next-tat/tat-llm-7b-lora) | 7B | 65.13 | 76.49 | 71.38 |
| [TAT-LLM-7B-FFT](https://huggingface.co/next-tat/tat-llm-7b-fft) | 7B | 69.75 | 76.91 | 72.64 |
| [TAT-LLM-13B-LORA](https://huggingface.co/next-tat/tat-llm-13b-lora) | 13B | 71.93 | 77.51 | 72.22 |
| [TAT-LLM-13B-FFT](https://huggingface.co/next-tat/tat-llm-13b-fft) | 13B | 72.97 | 78.41 | 73.18 |
| [TAT-LLM-70B-LORA](https://huggingface.co/next-tat/tat-llm-70b-lora) | 70B | **76.81** | 81.42 | 76.55 |
| [TAT-LLM-70B-FFT](https://huggingface.co/next-tat/tat-llm-70b-fft) | 70B | 76.11 | **82.20** | **76.97** |

## Training

We train TAT-LLM in three sizes (7B, 13B, and 70B) by fine-tuning LLaMA 2 with either parameter-efficient fine-tuning (LoRA) or full-parameter fine-tuning (FFT). The training data combines the FinQA, TAT-QA, and TAT-DQA training sets ([🤗HuggingFace Repo](https://huggingface.co/datasets/next-tat/tat-llm-instructions)). To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answer. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
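
To illustrate the idea of an external executor, here is a minimal sketch, not the implementation from our repository. It assumes that the model's intermediate output is a plain arithmetic expression string; the actual output format is defined in the paper and code. The function name `execute_expression` is hypothetical. The sketch evaluates the expression outside the model, so the final number comes from exact arithmetic rather than from generated digits.

```python
import ast
import operator

# Hypothetical helper for illustration only; not part of the TAT-LLM code base.
# Only these arithmetic operators are allowed in an expression.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def execute_expression(expression: str) -> float:
    """Evaluate an arithmetic expression string without using eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is a subclass of int, so reject it).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Anything else (names, calls, attributes) is rejected.
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Assumed intermediate output: a relative change between two extracted values.
    print(execute_expression("(120.5 - 100) / 100"))
```

Walking the syntax tree instead of calling `eval()` keeps the executor restricted to arithmetic, which matters when the expression comes from generated text.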

## Inference & Evaluation

Please refer to the [code repository](https://github.com/fengbinzhu/TAT-LLM) for inference and evaluation instructions.

## Citation

If you find this model helpful, please consider citing our paper:

```
@misc{zhu2024tatllm,
      title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
      author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
      year={2024},
      eprint={2401.13223},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```