metadata
base_model: unsloth/qwen2.5-coder-1.5b-instruct-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2
  - trl
  - sft
license: apache-2.0
language:
  - en
datasets:
  - gretelai/synthetic_text_to_sql

Text2SQL-1.5B Model

Overview

Text2SQL-1.5B is a natural language to SQL model that converts user questions into structured SQL statements. It is fine-tuned from unsloth/qwen2.5-coder-1.5b-instruct-bnb-4bit on the gretelai/synthetic_text_to_sql dataset and is intended to handle complex multi-table queries.

System Instruction

To ensure consistency in model outputs, use the following system instruction:

Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (```sql for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response.
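
Because the instruction asks for the SQL in a fenced block tagged sql, followed by a plain-text explanation, a reply that follows it can be split with a small parser. The sketch below is one possible way to do that. The helper name `split_sql_and_explanation` is hypothetical, and the sample reply is hand-written to exercise the helper; it is not actual model output.

```python
import re
from typing import Optional, Tuple

FENCE = "`" * 3  # three backticks, built here so this example stays easy to embed

# Matches the first fenced block tagged "sql" (case-insensitive).
SQL_BLOCK = re.compile(FENCE + r"sql\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def split_sql_and_explanation(reply: str) -> Tuple[Optional[str], str]:
    """Return (sql, explanation) from a reply that follows the system instruction.

    If no sql-tagged block is found, sql is None and the whole reply is
    returned as the explanation.
    """
    match = SQL_BLOCK.search(reply)
    if match is None:
        return None, reply.strip()
    sql = match.group(1).strip()
    # Everything after the code block is treated as the explanation.
    explanation = reply[match.end():].strip()
    return sql, explanation


# Hand-written sample reply (an assumption for illustration, not model output).
sample_reply = (
    f"{FENCE}sql\n"
    "SELECT customer_id, SUM(total_amount) AS total_sales\n"
    "FROM sales\n"
    "GROUP BY customer_id\n"
    "HAVING SUM(total_amount) > 50000;\n"
    f"{FENCE}\n"
    "\n"
    "This query sums total_amount per customer and keeps totals above 50,000."
)

sql, explanation = split_sql_and_explanation(sample_reply)
print(sql)
print(explanation)
```

Returning `None` when no block is found lets the caller decide whether to retry or fall back, since a small model may not always follow the formatting instruction.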

Prompt Format

The prompt should include both the user query and the table structure, given as CREATE TABLE statements. The expected message format is:

messages = [
    {"role": "system", "content": "Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (```sql for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response. The query should always include the table structure using a CREATE TABLE statement before executing the main SQL query."},
    {"role": "user", "content": "Show the total sales for each customer who has spent more than $50,000."},
    {"role": "user", "content": """
CREATE TABLE sales (
    id INT PRIMARY KEY,
    customer_id INT,
    total_amount DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(255)
);
"""}
] 
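
The message list above can also be assembled programmatically. The following is a minimal sketch, assuming the question and the schema are available as plain strings; `build_messages` is a hypothetical helper name, and the system instruction text is copied from the format shown above. Note that the helper strips leading and trailing whitespace from both inputs.

```python
from typing import Dict, List

FENCE = "`" * 3  # three backticks, built here so this example stays easy to embed

SYSTEM_INSTRUCTION = (
    "Always separate code and explanation. Return SQL code in a separate block, "
    "followed by the explanation in a separate paragraph. "
    f"Use markdown triple backticks ({FENCE}sql for SQL) to format the code properly. "
    "Write the SQL query first in a separate code block. "
    "Then, explain the query in plain text. Do not merge them into one response. "
    "The query should always include the table structure using a CREATE TABLE "
    "statement before executing the main SQL query."
)


def build_messages(question: str, schema: str) -> List[Dict[str, str]]:
    """Build the chat messages: system instruction, question, then schema."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTION},
        {"role": "user", "content": question.strip()},
        {"role": "user", "content": schema.strip()},
    ]


schema = """
CREATE TABLE sales (
    id INT PRIMARY KEY,
    customer_id INT,
    total_amount DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(255)
);
"""

messages = build_messages(
    "Show the total sales for each customer who has spent more than $50,000.",
    schema,
)
for message in messages:
    # Print the role and the start of each message to check the structure.
    print(message["role"], "->", message["content"][:60])
```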

Model Usage

Using the Model for Text-to-SQL Conversion

The following code demonstrates how to use the model to convert natural language queries into SQL statements:

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("yasserrmd/Text2SQL-1.5B")
model = AutoModelForCausalLM.from_pretrained("yasserrmd/Text2SQL-1.5B")

# Define the pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Define system instruction
system_instruction = "Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (```sql for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response. The query should always include the table structure using a CREATE TABLE statement before executing the main SQL query."

# Define user query
user_query = """Show the total sales for each customer who has spent more than $50,000.
CREATE TABLE sales (
    id INT PRIMARY KEY,
    customer_id INT,
    total_amount DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(255)
);
"""

# Define messages for input
messages = [
    {"role": "system", "content": system_instruction},
    {"role": "user", "content": user_query},
]

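# Generate the response (max_new_tokens caps the length of the generated reply)
response = pipe(messages, max_new_tokens=512)

# With chat-style input, generated_text holds the whole conversation;
# the last message is the assistant's reply with the SQL query and explanation
print(response[0]['generated_text'][-1]['content'])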
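
Generated SQL is worth checking before it runs against real data. One possible approach, sketched below, is to load the CREATE TABLE statements into an in-memory SQLite database and run the query there. This is an illustration under stated assumptions: `check_query` is a hypothetical helper, the candidate query and the seed rows are hand-written stand-ins rather than actual model output, and SQLite's dialect may differ from the target database.

```python
import sqlite3
from typing import List, Tuple

SCHEMA = """
CREATE TABLE sales (
    id INT PRIMARY KEY,
    customer_id INT,
    total_amount DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(255)
);
"""

# Hand-written stand-in for a generated query (not actual model output).
CANDIDATE_SQL = """
SELECT c.name, SUM(s.total_amount) AS total_sales
FROM sales AS s
JOIN customers AS c ON s.customer_id = c.id
GROUP BY c.id, c.name
HAVING SUM(s.total_amount) > 50000;
"""

# Made-up rows used only to exercise the query.
SEED = """
INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO sales (id, customer_id, total_amount) VALUES
    (1, 1, 40000.00), (2, 1, 20000.00), (3, 2, 10000.00);
"""


def check_query(schema: str, query: str, seed_sql: str = "") -> List[Tuple]:
    """Run query against an in-memory SQLite database built from schema.

    Raises sqlite3.Error if the schema or the query is not valid in SQLite.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create the tables
        if seed_sql:
            conn.executescript(seed_sql)  # optional test rows
        return conn.execute(query.strip()).fetchall()
    finally:
        conn.close()


rows = check_query(SCHEMA, CANDIDATE_SQL, SEED)
print(rows)
```

A query that references a missing table or column raises `sqlite3.Error`, so this kind of check catches syntax and schema mismatches early. It does not confirm that the query answers the original question.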

Uploaded model

  • Developed by: yasserrmd
  • License: apache-2.0
  • Fine-tuned from model: unsloth/qwen2.5-coder-1.5b-instruct-bnb-4bit

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.