---
license: cc-by-nc-4.0
pipeline_tag: text-generation
library_name: transformers
tags:
  - text-to-sql
  - sql
  - qwen2
datasets:
  - cycloneboy/bird_train
base_model: Qwen/Qwen2.5-7B-Instruct
---

# CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning

This repository contains the `CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct` model, presented in the paper [CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning](https://huggingface.co/papers/2505.13271).

## Abstract

Large language models (LLMs) have demonstrated strong capabilities in translating natural language questions about relational databases into SQL queries. In particular, test-time scaling techniques such as Self-Consistency and Self-Correction can enhance SQL generation accuracy by increasing computational effort during inference. However, these methods have notable limitations: Self-Consistency may select suboptimal outputs despite majority votes, while Self-Correction typically addresses only syntactic errors. To leverage the strengths of both approaches, we propose CSC-SQL, a novel method that integrates Self-Consistency and Self-Correction. CSC-SQL selects the two most frequently occurring outputs from parallel sampling and feeds them into a merge revision model for correction. Additionally, we employ the Group Relative Policy Optimization (GRPO) algorithm to fine-tune both the SQL generation and revision models via reinforcement learning, significantly enhancing output quality. Experimental results confirm the effectiveness and generalizability of CSC-SQL. On the BIRD private test set, our 7B model achieves 71.72% execution accuracy, while the 32B model achieves 73.67%.

## Code

The official implementation, including training and evaluation scripts, can be found on GitHub: [https://github.com/CycloneBoy/csc_sql](https://github.com/CycloneBoy/csc_sql)

## Introduction

CSC-SQL is a novel method that integrates Self-Consistency and Self-Correction to enhance SQL generation accuracy. It addresses the limitations of existing test-time scaling techniques by combining their strengths. The method involves selecting the two most frequently occurring outputs from parallel sampling and feeding them into a merge revision model for correction. Furthermore, the Group Relative Policy Optimization (GRPO) algorithm is employed to fine-tune both the SQL generation and revision models via reinforcement learning, leading to significantly enhanced output quality.
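As a rough illustration of the "group relative" idea behind GRPO, the sketch below normalizes the rewards of several SQL samples drawn for the same question against that group's own mean and standard deviation. The reward values, the function name `group_relative_advantages`, and the use of the population standard deviation are assumptions made for illustration; the actual reward design and training code are in the official repository.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean/std of its own sampling group.

    `rewards` holds one scalar per sampled SQL query for a single question
    (hypothetical values, e.g. 1.0 if the query executes to the right answer).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    # eps avoids division by zero when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four samples for one question: two judged correct, two incorrect
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Samples that score above the group average receive a positive advantage and those below it a negative one; when all samples score the same, every advantage is zero and the group contributes no learning signal.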

The framework overview is illustrated below:

![csc_sql_framework](https://github.com/CycloneBoy/csc_sql/raw/main/data/image/csc_sql_framework.png)
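The selection step described above (keeping the two most frequent outputs from parallel sampling) can be sketched in a few lines. This is a minimal sketch, not the repository's implementation: it assumes candidates are compared as whitespace-normalized SQL strings, whereas a real pipeline might instead group queries by their execution results. The names `normalize_sql` and `top_two_candidates` are hypothetical.

```python
from collections import Counter


def normalize_sql(sql):
    """Collapse whitespace and drop a trailing semicolon so trivial variants match."""
    return " ".join(sql.strip().rstrip(";").split())


def top_two_candidates(samples):
    """Return up to two (sql, votes) pairs, most frequent first."""
    counts = Counter(normalize_sql(s) for s in samples)
    return counts.most_common(2)


# Six hypothetical samples drawn in parallel for the same question
samples = [
    "SELECT a FROM t;",
    "SELECT a  FROM t",
    "SELECT b FROM t",
    "SELECT a FROM t",
    "SELECT c FROM t",
    "SELECT b FROM t",
]
top_two = top_two_candidates(samples)
print(top_two)
```

The two surviving candidates are what would then be handed to the merge revision model; how they are formatted into that model's prompt is not shown here.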

## Main Results

The CSC-SQL model achieves state-of-the-art results in Text-to-SQL generation. On the BIRD private test set, the 7B model achieves 71.72% execution accuracy, while the 32B model achieves 73.67%.

Performance comparison of different Text-to-SQL methods on the BIRD dev and test sets:
<img src="https://github.com/CycloneBoy/csc_sql/raw/main/data/image/csc_sql_result_main.png"  height="500" alt="Performance Comparison">

## Models and Datasets

The project provides various models and datasets, which can be found on Hugging Face and ModelScope:

| **Model and Dataset** | Modelscope | HuggingFace |
|---------------------------------------|-------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|
| bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |
| CscSQL-Merge-Qwen2.5-Coder-3B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-3B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-3B-Instruct) |
| CscSQL-Merge-Qwen2.5-Coder-7B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct) |
| CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-3B-Instruct) |
| CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502 | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-3B-2502) |
| CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct) |
| CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502 | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502) |

## Usage

You can use this model with the Hugging Face `transformers` library. Here's a quick example for Text-to-SQL generation following the Qwen chat template:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "cycloneboy/CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).eval()

# Example natural language question and a simplified database schema
question = "List the names of all employees who work in the 'Sales' department."
schema = """
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(255),
    department_id INT
);

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(255)
);
"""

# Build a simplified prompt that combines the question and the schema.
# The exact prompt format used in training is defined in the official repository.
user_prompt = f"Question: {question}\nSchema: {schema}\nSQL:"

messages = [
    {"role": "user", "content": user_prompt}
]

# Apply the chat template to format the input for the model
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Define generation configuration
generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repetition_penalty=1.05,
    max_new_tokens=512, # Adjust as needed for SQL query length
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

# Generate the SQL query (pass the attention mask along with the input IDs)
generated_ids = model.generate(
    **model_inputs,
    generation_config=generation_config
)

# Decode the generated SQL, skipping the input prompt
generated_sql = tokenizer.batch_decode(generated_ids[:, model_inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]

print("Generated SQL Query:")
print(generated_sql)
```
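
Because generation is sampled, it can help to sanity-check the output by running it against a throwaway database. The sketch below loads the example schema into an in-memory SQLite database and executes a candidate query. The helper `run_sql`, the sample rows, and the hand-written `candidate` query are illustrative assumptions; in practice you would pass in `generated_sql` after extracting the SQL statement from the model's response.

```python
import sqlite3

# Same simplified schema as in the example above
SCHEMA = """
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(255),
    department_id INT
);
CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(255)
);
"""


def run_sql(sql, schema=SCHEMA, rows=()):
    """Execute `sql` on a fresh in-memory database; return (result, error)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        for stmt, params in rows:
            conn.execute(stmt, params)
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        # Syntax errors, unknown columns, etc. are reported rather than raised
        return None, str(exc)
    finally:
        conn.close()


# Hypothetical sample data used only for this check
sample_rows = [
    ("INSERT INTO departments VALUES (?, ?)", (1, "Sales")),
    ("INSERT INTO employees VALUES (?, ?, ?)", (10, "Ada", 1)),
    ("INSERT INTO employees VALUES (?, ?, ?)", (11, "Bob", 2)),
]

# Stand-in for a model-generated query
candidate = (
    "SELECT e.name FROM employees e "
    "JOIN departments d ON e.department_id = d.department_id "
    "WHERE d.department_name = 'Sales'"
)
result, error = run_sql(candidate, rows=sample_rows)
print(result, error)
```

A query that fails to execute returns `None` together with the SQLite error message, which makes it easy to filter out invalid samples before any further comparison.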

## Citation

If you find our work helpful or inspiring, please feel free to cite it:

```bibtex
@misc{sheng2025cscsqlcorrectiveselfconsistencytexttosql,
      title={CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning}, 
      author={Lei Sheng and Shuai-Shuai Xu},
      year={2025},
      eprint={2505.13271},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.13271}, 
}
```