---
license: cc-by-nc-4.0
pipeline_tag: text-generation
library_name: transformers
tags:
- text-to-sql
- sql
- qwen2
datasets:
- cycloneboy/bird_train
base_model: Qwen/Qwen2.5-7B-Instruct
---
# CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning
This repository contains the `CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct` model, presented in the paper [CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning](https://huggingface.co/papers/2505.13271).
## Abstract
Large language models (LLMs) have demonstrated strong capabilities in translating natural language questions about relational databases into SQL queries. In particular, test-time scaling techniques such as Self-Consistency and Self-Correction can enhance SQL generation accuracy by increasing computational effort during inference. However, these methods have notable limitations: Self-Consistency may select suboptimal outputs despite majority votes, while Self-Correction typically addresses only syntactic errors. To leverage the strengths of both approaches, we propose CSC-SQL, a novel method that integrates Self-Consistency and Self-Correction. CSC-SQL selects the two most frequently occurring outputs from parallel sampling and feeds them into a merge revision model for correction. Additionally, we employ the Group Relative Policy Optimization (GRPO) algorithm to fine-tune both the SQL generation and revision models via reinforcement learning, significantly enhancing output quality. Experimental results confirm the effectiveness and generalizability of CSC-SQL. On the BIRD private test set, our 7B model achieves 71.72% execution accuracy, while the 32B model achieves 73.67%.
## Code
The official implementation, including training and evaluation scripts, can be found on GitHub: [https://github.com/CycloneBoy/csc_sql](https://github.com/CycloneBoy/csc_sql)
## Introduction
CSC-SQL is a novel method that integrates Self-Consistency and Self-Correction to enhance SQL generation accuracy. It addresses the limitations of existing test-time scaling techniques by combining their strengths. The method involves selecting the two most frequently occurring outputs from parallel sampling and feeding them into a merge revision model for correction. Furthermore, the Group Relative Policy Optimization (GRPO) algorithm is employed to fine-tune both the SQL generation and revision models via reinforcement learning, leading to significantly enhanced output quality.
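The official training scripts live in the GitHub repository linked above. Purely as a rough sketch of what GRPO fine-tuning with an execution-based reward can look like, the snippet below uses the `trl` library's `GRPOTrainer`; the reward function, the dataset column names (`prompt`, `gold_sql`, `db_path`), and the hyperparameters are illustrative assumptions, not the paper's exact setup:

```python
import sqlite3
from trl import GRPOConfig, GRPOTrainer

def execution_reward(completions, gold_sql, db_path, **kwargs):
    """Reward 1.0 when the generated SQL returns the same rows as the gold SQL.

    `gold_sql` and `db_path` are assumed dataset columns that trl forwards to
    the reward function; adapt the names to your own data.
    """
    rewards = []
    for sql, gold, db in zip(completions, gold_sql, db_path):
        try:
            with sqlite3.connect(db) as conn:
                # frozenset makes the comparison order-insensitive; real
                # evaluations compare result sets more carefully
                pred_rows = frozenset(conn.execute(sql).fetchall())
                gold_rows = frozenset(conn.execute(gold).fetchall())
            rewards.append(1.0 if pred_rows == gold_rows else 0.0)
        except Exception:
            rewards.append(0.0)  # invalid SQL gets zero reward
    return rewards

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    reward_funcs=execution_reward,
    args=GRPOConfig(output_dir="csc_sql_grpo", num_generations=8),
    train_dataset=train_dataset,  # assumed: columns "prompt", "gold_sql", "db_path"
)
trainer.train()
```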
The framework overview is illustrated below:

## Main Results
The CSC-SQL model achieves state-of-the-art results in Text-to-SQL generation. On the BIRD private test set, the 7B model achieves 71.72% execution accuracy, while the 32B model achieves 73.67%.
Performance comparison of different Text-to-SQL methods on the BIRD dev and test sets:
<img src="https://github.com/CycloneBoy/csc_sql/raw/main/data/image/csc_sql_result_main.png" height="500" alt="Performance Comparison">
## Models and Datasets
The project provides various models and datasets, which can be found on Hugging Face and ModelScope:
| **Model and Dataset** | Modelscope | HuggingFace |
|---------------------------------------|-------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|
| bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |
| CscSQL-Merge-Qwen2.5-Coder-3B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-3B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-3B-Instruct) |
| CscSQL-Merge-Qwen2.5-Coder-7B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct) |
| CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct) |
| CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502 | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502) |
| CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct) |
| CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502 | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502) |
## Usage
You can use this model with the Hugging Face `transformers` library. Here's a quick example for Text-to-SQL generation following the Qwen chat template:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).eval()

# Example natural language question and a simplified database schema
question = "List the names of all employees who work in the 'Sales' department."
schema = """
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(255),
    department_id INT
);

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(255)
);
"""

# Construct the prompt from the natural language question and the schema
user_prompt = f"Question: {question}\nSchema: {schema}\nSQL:"

messages = [
    {"role": "user", "content": user_prompt}
]

# Apply the chat template to format the input for the model
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Define the generation configuration
generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repetition_penalty=1.05,
    max_new_tokens=512,  # adjust as needed for SQL query length
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

# Generate the SQL query
generated_ids = model.generate(
    **model_inputs,  # pass input_ids and attention_mask together
    generation_config=generation_config,
)

# Decode the generated SQL, skipping the input prompt tokens
generated_sql = tokenizer.batch_decode(
    generated_ids[:, model_inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)[0]

print("Generated SQL Query:")
print(generated_sql)
```
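Because CSC-SQL votes over parallel samples, you would typically draw several candidates rather than a single query. One simple way to do this with `transformers` is `num_return_sequences` (the value 8 below is an illustrative choice, not the paper's setting):

```python
# Sample several candidate queries in parallel for the self-consistency stage
generated_ids = model.generate(
    **model_inputs,
    generation_config=generation_config,
    num_return_sequences=8,
)
candidates = tokenizer.batch_decode(
    generated_ids[:, model_inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)
# `candidates` can then be grouped by execution result, and the two most
# frequent queries handed to the merge revision model (see the sketch above)
```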
## Citation
If you find our work helpful or inspiring, please feel free to cite it:
```bibtex
@misc{sheng2025cscsqlcorrectiveselfconsistencytexttosql,
title={CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning},
author={Lei Sheng and Shuai-Shuai Xu},
year={2025},
eprint={2505.13271},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.13271},
}
``` |