---
license: cc-by-nc-4.0
pipeline_tag: text-generation
library_name: transformers
tags:
- text-to-sql
- sql-generation
- reinforcement-learning
- qwen
---
# CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning
This repository contains the model presented in the paper [CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning](https://huggingface.co/papers/2505.13271).
**Abstract:** Large language models (LLMs) have demonstrated strong capabilities in translating natural language questions about relational databases into SQL queries. In particular, test-time scaling techniques such as Self-Consistency and Self-Correction can enhance SQL generation accuracy by increasing computational effort during inference. However, these methods have notable limitations: Self-Consistency may select suboptimal outputs despite majority votes, while Self-Correction typically addresses only syntactic errors. To leverage the strengths of both approaches, we propose CSC-SQL, a novel method that integrates Self-Consistency and Self-Correction. CSC-SQL selects the two most frequently occurring outputs from parallel sampling and feeds them into a merge revision model for correction. Additionally, we employ the Group Relative Policy Optimization (GRPO) algorithm to fine-tune both the SQL generation and revision models via reinforcement learning, significantly enhancing output quality. Experimental results confirm the effectiveness and generalizability of CSC-SQL. On the BIRD private test set, our 7B model achieves 71.72% execution accuracy, while the 32B model achieves 73.67%. The code has been open sourced at [https://github.com/CycloneBoy/csc_sql](https://github.com/CycloneBoy/csc_sql).
**Code:** The code for CSC-SQL is open-sourced at [https://github.com/CycloneBoy/csc_sql](https://github.com/CycloneBoy/csc_sql).
## Introduction
CSC-SQL is a novel method that integrates Self-Consistency and Self-Correction for improved Text-to-SQL generation. It addresses limitations of prior methods by selecting optimal outputs and handling both syntactic and semantic errors. The approach employs Group Relative Policy Optimization (GRPO) to fine-tune SQL generation and revision models, leading to significant enhancements in output quality.
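The selection step at the heart of CSC-SQL can be illustrated with a short sketch. This is not the authors' implementation (see the GitHub repository for that); it assumes candidates are grouped by their execution results on the target SQLite database, and the helper names (`execute_sql`, `select_top_two`) are ours:

```python
import sqlite3
from collections import defaultdict

def execute_sql(db_path, sql):
    """Run a candidate query; return a hashable result, or None on error."""
    conn = sqlite3.connect(db_path)
    try:
        return tuple(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None
    finally:
        conn.close()

def select_top_two(candidates, db_path):
    """Group sampled SQL strings by execution result and return one
    representative from each of the two largest groups."""
    groups = defaultdict(list)
    for sql in candidates:
        result = execute_sql(db_path, sql)
        if result is not None:
            groups[result].append(sql)
    ranked = sorted(groups.values(), key=len, reverse=True)
    return [group[0] for group in ranked[:2]]
```

The two representatives returned here are what CSC-SQL places into the merge revision model's prompt for correction.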

## Main Results
Performance comparison of different Text-to-SQL methods on the BIRD dev and test sets. On the BIRD private test set, the 7B model achieves 71.72% execution accuracy and the 32B model achieves 73.67%.

## Models
A collection of CSC-SQL models can be found on Hugging Face: [CSC-SQL Hugging Face Collection](https://huggingface.co/collections/cycloneboy/csc-sql-6835c4a52da10c54bbe14f8e).
| **Model**                              | HuggingFace                                                                                 |
|----------------------------------------|---------------------------------------------------------------------------------------------|
| CscSQL-Merge-Qwen2.5-Coder-3B-Instruct | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-3B-Instruct) |
| CscSQL-Merge-Qwen2.5-Coder-7B-Instruct | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct) |
| CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct  | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct)  |
| CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502 | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502) |
| CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct  | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct)  |
| CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502 | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502) |
## Dataset
The BIRD training and development datasets used can be found here: [BIRD Train Dataset](https://huggingface.co/datasets/cycloneboy/bird_train).
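The dataset can be loaded with the 🤗 `datasets` library. A minimal sketch; the split name below is an assumption, so check the dataset card for the exact configs and splits:

```python
from datasets import load_dataset

# "train" is an assumed split name; see the dataset card for the actual layout.
ds = load_dataset("cycloneboy/bird_train", split="train")
print(ds)
print(ds[0])
```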
## Quickstart
This section provides instructions on how to use the pre-trained CSC-SQL models.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
model_dir = "cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct"  # or another released CSC-SQL model

def load_model_tokenizer(model_path):
    # Qwen2.5-Coder chat models end assistant turns with <|im_end|>.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    tokenizer.eos_token = "<|im_end|>"
    tokenizer.pad_token = "<|endoftext|>"
    tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
    tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
    tokenizer.padding_side = "left"  # left-pad for batched decoder-only generation
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        device_map="auto",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )
    return model, tokenizer
# Example usage for a natural language question (Text-to-SQL).
# Make sure your input string ends with "<|im_start|>assistant\n" for generation.
text_list = ["""
<|im_start|>user
Your task is to write a SQLite query given a natural language question and a database schema.
You need to generate the SQL query that answers the question correctly.
For example, to find out the names of all the songs, given:
CREATE TABLE songs (
    song_id INTEGER PRIMARY KEY,
    song_name TEXT
);
Question: What are the names of all the songs?
SQL: SELECT song_name FROM songs
To find the artist of the song 'Yesterday', given:
CREATE TABLE songs (
    song_id INTEGER PRIMARY KEY,
    song_name TEXT,
    artist_id INTEGER
);
CREATE TABLE artists (
    artist_id INTEGER PRIMARY KEY,
    artist_name TEXT
);
Question: Who is the artist of the song 'Yesterday'?
SQL: SELECT T2.artist_name FROM songs AS T1 JOIN artists AS T2 ON T1.artist_id = T2.artist_id WHERE T1.song_name = 'Yesterday'
Now, answer the following question.
Question: How many records are there in the table 'songs'?
SQL:
<|im_end|>
<|im_start|>assistant
"""]
model, tokenizer = load_model_tokenizer(model_dir)
inputs = tokenizer(text_list, return_tensors="pt", padding=True, add_special_tokens=False).to(model.device)
input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]
# Greedy decoding gives one deterministic query; switch to sampling
# (do_sample=True with temperature/top_p) to draw multiple candidates.
generation_config = GenerationConfig(
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    max_new_tokens=512,
    num_return_sequences=1,
    num_beams=1,
    do_sample=False,
)
outputs = model.generate(
    inputs=input_ids,
    attention_mask=attention_mask,
    **generation_config.to_dict(),
)
gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
print(gen_text[0])
# Expected output: SELECT count(*) FROM songs
```
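For the self-consistency stage, you need several candidate queries per question rather than a single greedy output. Below is a minimal variation of the generation call above, reusing the same `model`, `tokenizer`, and `input_ids`; the sample count and temperature are illustrative values, not necessarily the paper's settings:

```python
# Draw multiple diverse candidates for execution-based voting.
sampling_config = GenerationConfig(
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    do_sample=True,          # enable sampling for candidate diversity
    temperature=0.8,         # illustrative value
    top_p=0.95,
    max_new_tokens=512,
    num_return_sequences=8,  # illustrative number of parallel samples
)
outputs = model.generate(
    inputs=input_ids,
    attention_mask=attention_mask,
    **sampling_config.to_dict(),
)
candidates = tokenizer.batch_decode(
    outputs[:, input_ids.shape[1]:], skip_special_tokens=True
)
# Group `candidates` by execution result, keep the two most frequent
# groups, and pass them to the merge revision model for correction.
```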
## Citation
If you find our work useful or helpful for your research, please feel free to cite our paper:
```bibtex
@misc{sheng2025cscsqlcorrectiveselfconsistencytexttosql,
      title={CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning},
      author={Lei Sheng and Shuai-Shuai Xu},
      year={2025},
      eprint={2505.13271},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.13271},
}
```