	---
	language:
	- en
	license: llama2
	---

	# TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

	Paper: https://arxiv.org/abs/2401.13223

	Code: https://github.com/fengbinzhu/TAT-LLM


	## Introduction

	We present TAT-LLM, a language model specialized for question answering (QA) over hybrid tabular and textual data. It is obtained by fine-tuning LLaMA 2 on training data generated automatically from existing expert-annotated datasets, and it follows a Step-wise Pipeline with three steps: Extraction, Reasoning, and Execution. In our experiments, TAT-LLM outperforms prior methods and large language models such as GPT-4 on the financial QA benchmarks FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller, task-specific models can be optimized to perform well on highly specialized tasks.

	| Model | Size | FinQA | TAT-QA | TAT-DQA |
	| --- | --- | --- | --- | --- |
	| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
	| GPT-4 | - | 63.91 | 71.92 | 64.46 |
	| [TAT-LLM-7B-LORA](https://huggingface.co/next-tat/tat-llm-7b-lora) | 7B | 65.13 | 76.49 | 71.38 |
	| [TAT-LLM-7B-FFT](https://huggingface.co/next-tat/tat-llm-7b-fft) | 7B | 69.75 | 76.91 | 72.64 |
	| [TAT-LLM-13B-LORA](https://huggingface.co/next-tat/tat-llm-13b-lora) | 13B | 71.93 | 77.51 | 72.22 |
	| [TAT-LLM-13B-FFT](https://huggingface.co/next-tat/tat-llm-13b-fft) | 13B | 72.97 | 78.41 | 73.18 |
	| [TAT-LLM-70B-LORA](https://huggingface.co/next-tat/tat-llm-70b-lora) | 70B | 76.81 | 81.42 | 76.55 |
	| [TAT-LLM-70B-FFT](https://huggingface.co/next-tat/tat-llm-70b-fft) | 70B | 76.11 | 82.20 | 76.97 |

	## Training

	We train TAT-LLM in three sizes (7B, 13B, and 70B) by fine-tuning LLaMA 2 with either parameter-efficient fine-tuning (LoRA) or full-parameter fine-tuning (FFT) on the combined FinQA, TAT-QA, and TAT-DQA training sets ([🤗HuggingFace Repo](https://huggingface.co/datasets/next-tat/tat-llm-instructions)). To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answer. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
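
The External Executor idea can be illustrated with a minimal sketch. It assumes, hypothetically, that the model's intermediate output contains an arithmetic expression as a plain string; the function name `execute_expression` and this output format are illustrative assumptions, not the executor from the paper or the code repository.

```python
import ast
import operator

# Arithmetic operators this sketch allows; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def execute_expression(expression: str) -> float:
    """Evaluate a model-produced arithmetic expression without calling eval().

    Hypothetical helper: the real External Executor may use a different
    input format and support more operations.
    """

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain numbers only (bool is excluded on purpose).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))
```

Delegating the arithmetic to deterministic code means the language model only has to produce the right expression, not compute its value. For example, `execute_expression("(120 - 100) / 100")` evaluates a percentage-change style expression outside the model.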

	## Inference & Evaluation

	Please refer to the code in the [GitHub repository](https://github.com/fengbinzhu/TAT-LLM).
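
As a rough starting point, the sketch below loads this checkpoint with the Hugging Face `transformers` library. It assumes the weights load through `AutoModelForCausalLM`, that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available for a 13B model in half precision. The helper name `generate_answer` is hypothetical, and the prompt must follow the template defined in the repository above, which is not reproduced here.

```python
MODEL_ID = "next-tat/tat-llm-13b-fft"


def generate_answer(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for a prompt built with the repository's template.

    Hypothetical helper for illustration; see the repository for the
    authors' inference and evaluation scripts.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # requires the accelerate package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Greedy decoding (`do_sample=False`) is chosen here only to make outputs repeatable; the decoding settings used for the reported results may differ.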

	## Citation

	If you find this model helpful, please consider citing our paper:

	```
	@misc{zhu2024tatllm,
	title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
	author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
	year={2024},
	eprint={2401.13223},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```