
Paper Link: https://doi.org/10.6084/m9.figshare.28330301.v1

Abstract

State-of-the-art advances in LLMs have pushed NLP to its limits, to the point where even complex tasks, such as code generation, can be automated. This paper describes deepSQL-R1-distill-8B, a fine-tuned and quantized variant of the DeepSeek-R1 model architecture, optimized specifically for text-to-SQL conversion. Fine-tuning was performed with Unsloth, one of the most efficient frameworks for fine-tuning LLMs, in combination with Parameter-Efficient Fine-Tuning (PEFT) and the SFTTrainer framework, which allows domain-specific adaptation with minimal resource consumption. The model is fine-tuned on curated datasets using LoRA, making it more parameter-efficient and lowering memory consumption. In addition, we investigate reinforcement learning techniques to further enhance the model's ability to generate accurate and contextually appropriate SQL queries. The combination of 8-bit quantization, LoRA, Unsloth, and reinforcement learning positions deepSQL-R1-distill-8B as a cutting-edge solution for automatic SQL code generation in real-world applications. By addressing major challenges in computational efficiency, domain-specific adaptation, and reinforcement-based refinement, this model leads the way toward a more intuitive and resource-effective way of interacting with relational databases.
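For concreteness, the sketch below shows the shape of such a fine-tuning run: Unsloth loads the base model with 4-bit weights (as in the inference example further down), LoRA adapters are attached through PEFT, and training runs through the SFTTrainer. The base checkpoint name, dataset file, LoRA rank, and training arguments here are illustrative assumptions, not the exact configuration behind deepSQL-R1-distill-8B.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model with 4-bit weights to keep memory low
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",  # assumed base checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters: only these low-rank matrices are trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank (illustrative)
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Hypothetical curated text-to-SQL dataset with one formatted prompt per row
dataset = load_dataset("json", data_files="text2sql_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()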

Model Download

| Model | #Total Params | #Active Params | Context Length | Download |
| :---- | :-----------: | :------------: | :------------: | :------: |
| deepSQL-R1-distill-8B | 8B | 6B | 128k | 🤗 HuggingFace |

Benchmarking

📊 SQL Model Benchmarking - Comprehensive Evaluation

| Rank | LLM Name | SqlEval-Classic (%) | Execution Accuracy (%) | Query Optimization (%) | Latency (ms) |
| :--: | :------- | :-----------------: | :--------------------: | :--------------------: | :----------: |
| 1️⃣ | GPT-4o | 86 | 91 | 88 | 120 |
| 2️⃣ | deepSQL-R1-distill-8B | 82 | 89 | 85 | 110 |
| 3️⃣ | deepseek-R1 | 78 | 84 | 86 | 150 |
| 4️⃣ | Claude-3-Sonnet | 72 | 80 | 80 | 130 |
| 5️⃣ | llama3.2 | 68 | 72 | 76 | 170 |
| 6️⃣ | Mistral-7B | 62 | 76 | 69 | 190 |

🚀 Key Insights:

  • GPT-4o leads in overall performance, achieving 91% execution accuracy with low latency (120ms).
  • deepSQL-R1-distill-8B excels in query execution & optimization, making it a strong competitor.
  • Mistral-7B has the lowest scores but may improve with fine-tuning.

🔹 New Metrics Explained:

  • Execution Accuracy (%) → Measures correctness of SQL execution (a small illustration follows this list).
  • Query Optimization (%) → Evaluates efficiency in structuring optimized queries.
  • Latency (ms) → Measures response time (lower is better).
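As a concrete illustration of the execution-accuracy metric, here is a minimal sketch that runs each predicted query and its gold query against the same database and compares result sets, order-insensitively. The sqlite3 harness and the pairs argument are assumptions for illustration, not the benchmark's actual evaluation code.

import sqlite3

def execution_accuracy(pairs, db_path):
    """Return the percentage of (predicted_sql, gold_sql) pairs whose result sets match."""
    conn = sqlite3.connect(db_path)
    correct = 0
    for predicted_sql, gold_sql in pairs:
        try:
            pred_rows = set(conn.execute(predicted_sql).fetchall())
            gold_rows = set(conn.execute(gold_sql).fetchall())
            correct += int(pred_rows == gold_rows)
        except sqlite3.Error:
            pass  # queries that fail to execute count as incorrect
    conn.close()
    return 100.0 * correct / len(pairs)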


LLM Performance Comparison on SQL Tasks

| Rank | LLM Name | SQL Syntax Correctness (%) | Join Handling (%) | Aggregation Accuracy (%) | Nested Query Performance (%) | SELECT Queries (%) | INSERT Queries (%) | UPDATE Queries (%) | DELETE Queries (%) | JOIN Performance (%) | Transaction Handling (%) |
| :--: | :------- | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| 1️⃣ | GPT-4o | 90 | 90 | 92 | 88 | 95 | 90 | 88 | 87 | 91 | 89 |
| 2️⃣ | deepSQL-R1-distill-8B | 87 | 87 | 89 | 84 | 92 | 87 | 85 | 83 | 88 | 86 |
| 3️⃣ | deepseek-R1 | 83 | 83 | 85 | 80 | 89 | 84 | 81 | 79 | 85 | 83 |
| 4️⃣ | Claude-3-Sonnet | 79 | 79 | 81 | 76 | 86 | 80 | 78 | 75 | 81 | 78 |
| 5️⃣ | llama3.2 | 75 | 75 | 77 | 72 | 82 | 76 | 74 | 71 | 77 | 74 |
| 6️⃣ | Mistral-7B | 70 | 70 | 72 | 68 | 78 | 72 | 70 | 68 | 72 | 70 |

Inference

The code snippet below shows how to load the tokenizer and model and how to generate content using a prompt template; an alternative using apply_chat_template follows the snippet.

# Import necessary libraries
from unsloth import FastLanguageModel
import torch

# Define the model name and other parameters
model_name = "imsanjoykb/deepSQL-R1-distill-8B"
max_seq_length = 2048  # maximum sequence length for inference
dtype = None           # None auto-detects (float16 or bfloat16 depending on GPU)
load_in_4bit = True    # load weights in 4-bit to reduce memory use

# Load the model and tokenizer from Hugging Face
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Define the prompt template (named placeholders are filled in before generation)
odoo_text2sql_prompt = """Below is an instruction describing a task related to generating a SQL query specifically for Odoo's database structure. The input provides relevant context about Odoo models or data fields from the schema below. Write a SQL query that fulfills the given task using Odoo's database schema.

### DB Schema:
{db_schema}

### Instruction:
Generate a SQL query in the context of Odoo to {instruction}

### Input:
{input_text}

### Response:
{output_text}
"""
# Optionally, use a TextStreamer for continuous inference
from transformers import TextStreamer

db_schema = """
CREATE TABLE product_product (
    id SERIAL NOT NULL,
    message_main_attachment_id INTEGER,
    product_tmpl_id INTEGER NOT NULL,
    create_uid INTEGER,
    write_uid INTEGER,
    default_code VARCHAR,
    barcode VARCHAR,
    combination_indices VARCHAR,
    volume NUMERIC,
    weight NUMERIC,
    active BOOLEAN,
    can_image_variant_1024_be_zoomed BOOLEAN,
    create_date TIMESTAMP WITHOUT TIME ZONE,
    write_date TIMESTAMP WITHOUT TIME ZONE,
    store_qty_available DOUBLE PRECISION,
    store_standard_price DOUBLE PRECISION,
    store_sales_count DOUBLE PRECISION,
    CONSTRAINT product_product_pkey PRIMARY KEY (id),
    CONSTRAINT product_product_create_uid_fkey FOREIGN KEY(create_uid) REFERENCES res_users (id) ON DELETE SET NULL,
    CONSTRAINT product_product_message_main_attachment_id_fkey FOREIGN KEY(message_main_attachment_id) REFERENCES ir_attachment (id) ON DELETE SET NULL,
    CONSTRAINT product_product_product_tmpl_id_fkey FOREIGN KEY(product_tmpl_id) REFERENCES product_template (id) ON DELETE CASCADE,
    CONSTRAINT product_product_write_uid_fkey FOREIGN KEY(write_uid) REFERENCES res_users (id) ON DELETE SET NULL
)
"""
# Prepare the inputs; the response slot is left empty so the model completes it
instruction = ""
input_text = "What are the top sales products?"
output_text = ""

# Tokenize the input text
inputs = tokenizer(
    [
        odoo_text2sql_prompt.format(
            instruction=instruction,
            input_text=input_text,
            output_text=output_text,
            db_schema=db_schema
        )
    ],
    return_tensors="pt"
).to("cuda")

# Initialize the TextStreamer
text_streamer = TextStreamer(tokenizer)

# Generate the output using the model with TextStreamer
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=350)
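
The snippet above builds the prompt by hand. If the tokenizer ships with a chat template (worth checking via tokenizer.chat_template; this is an assumption, not guaranteed for this checkpoint), the same generation can instead go through apply_chat_template:

# Wrap the formatted prompt as a user message and let the tokenizer's
# chat template insert the model's special tokens
messages = [{
    "role": "user",
    "content": odoo_text2sql_prompt.format(
        instruction=instruction,
        input_text=input_text,
        output_text=output_text,
        db_schema=db_schema,
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
_ = model.generate(input_ids, streamer=text_streamer, max_new_tokens=350)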

Citing

@misc{kumar2025deepsql,
  author = {Sanjoy Kumar},
  title = {DeepSQL-R1: A Quantized LLM for High-Performance and Reinforcement Driven NL2SQL Generation},
  year = {2025},
  doi = {10.6084/m9.figshare.28330301.v1},
  url = {https://huggingface.co/imsanjoykb/deepSQL-R1-distill-8B},
}

Author

Sanjoy Kumar
