Commit cfac131 by yasserrmd · verified · 1 Parent(s): c1c937b

Update README.md

  1. README.md +84 -0
README.md CHANGED
@@ -14,6 +14,90 @@ datasets:
  - gretelai/synthetic_text_to_sql
  ---
 
+ # Text2SQL-1.5B Model
+
+ ## Overview
+ **Text2SQL-1.5B** is a **natural language to SQL** model that converts user questions into SQL statements. It is designed to handle complex multi-table queries.
+
+ ## System Instruction
+ To keep model outputs consistent, use the following system instruction:
+
+ > **Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (` ```sql ` for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response.**
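
Because the system instruction asks for the SQL in a fenced `sql` block followed by a plain-text explanation, a reply in that shape can be split mechanically. The following is a minimal sketch, not part of the model card: the helper name `split_sql_and_explanation` is hypothetical, and it assumes the model actually follows the instruction (it returns `None` for the SQL when no fenced block is found).

```python
import re
from typing import Optional, Tuple

# Three backticks, built indirectly so this example can itself sit in a fenced block.
FENCE = "`" * 3

# Matches the first fenced block tagged "sql" and captures its body.
_SQL_BLOCK = re.compile(FENCE + r"sql\s*\n(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def split_sql_and_explanation(reply: str) -> Tuple[Optional[str], str]:
    """Split a model reply into (sql, explanation).

    Hypothetical helper: assumes the reply contains one fenced sql block
    followed by the explanation, as the system instruction requests.
    """
    match = _SQL_BLOCK.search(reply)
    if match is None:
        # No fenced SQL found; treat the whole reply as explanation.
        return None, reply.strip()
    sql = match.group(1).strip()
    explanation = reply[match.end():].strip()
    return sql, explanation


# Usage with a hand-written reply (not real model output).
sample_reply = FENCE + "sql\nSELECT 1;\n" + FENCE + "\n\nThis query returns a single row."
sql, explanation = split_sql_and_explanation(sample_reply)
print(sql)
print(explanation)
```

Returning `None` rather than raising keeps the caller in control of what to do when the model ignores the requested format.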
+
+ ## Prompt Format
+ The prompt should contain both the user's question and the table structure, given as `CREATE TABLE` statements. The expected message format is:
+
+ ```python
+ messages = [
+     {"role": "system", "content": "Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (```sql for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response. The query should always include the table structure using a CREATE TABLE statement before executing the main SQL query."},
+     {"role": "user", "content": "Show the total sales for each customer who has spent more than $50,000."},
+     {"role": "user", "content": """
+ CREATE TABLE sales (
+     id INT PRIMARY KEY,
+     customer_id INT,
+     total_amount DECIMAL(10,2),
+     FOREIGN KEY (customer_id) REFERENCES customers(id)
+ );
+
+ CREATE TABLE customers (
+     id INT PRIMARY KEY,
+     name VARCHAR(255)
+ );
+ """},
+ ]
+ ```
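
The message list can also be assembled programmatically from a question and a schema. The sketch below is illustrative only: `build_messages` is a hypothetical helper, and it assumes the single-user-message layout (question followed by the `CREATE TABLE` statements) rather than two separate user messages.

```python
from typing import Dict, List


def build_messages(system_instruction: str, question: str, schema: str) -> List[Dict[str, str]]:
    """Build a chat message list from a question and its table schema.

    Hypothetical helper: the question and schema are joined into one
    user message, separated by a blank line.
    """
    user_content = f"{question.strip()}\n\n{schema.strip()}"
    return [
        {"role": "system", "content": system_instruction},
        {"role": "user", "content": user_content},
    ]


# Usage with a shortened placeholder instruction and a one-table schema.
example = build_messages(
    "Always separate code and explanation.",
    "How many customers are there?",
    "CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(255));",
)
print(example[1]["content"])
```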
+
+ ## Model Usage
+
+ ### **Using the Model for Text-to-SQL Conversion**
+ The following code shows how to use the model to convert a natural language query into a SQL statement:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
+
+ # Load tokenizer and model
+ tokenizer = AutoTokenizer.from_pretrained("yasserrmd/Text2SQL-1.5B")
+ model = AutoModelForCausalLM.from_pretrained("yasserrmd/Text2SQL-1.5B")
+
+ # Define the pipeline
+ pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
+
+ # Define system instruction
+ system_instruction = "Always separate code and explanation. Return SQL code in a separate block, followed by the explanation in a separate paragraph. Use markdown triple backticks (```sql for SQL) to format the code properly. Write the SQL query first in a separate code block. Then, explain the query in plain text. Do not merge them into one response. The query should always include the table structure using a CREATE TABLE statement before executing the main SQL query."
+
+ # Define user query: the question followed by the table structure.
+ # Triple quotes are needed because the string spans several lines.
+ user_query = """Show the total sales for each customer who has spent more than $50,000.
+ CREATE TABLE sales (
+     id INT PRIMARY KEY,
+     customer_id INT,
+     total_amount DECIMAL(10,2),
+     FOREIGN KEY (customer_id) REFERENCES customers(id)
+ );
+
+ CREATE TABLE customers (
+     id INT PRIMARY KEY,
+     name VARCHAR(255)
+ );
+ """
+
+ # Define messages for input
+ messages = [
+     {"role": "system", "content": system_instruction},
+     {"role": "user", "content": user_query},
+ ]
+
+ # Generate SQL output
+ response = pipe(messages)
+
+ # With chat-style input, generated_text holds the whole conversation;
+ # the last message is the model's reply (SQL block plus explanation).
+ print(response[0]["generated_text"][-1]["content"])
+ ```
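
Before using a generated query, it can be worth checking that it at least compiles against the schema from the prompt. The sketch below does this with Python's built-in `sqlite3` module and an in-memory database. It is an assumption-laden illustration: `check_query` is a hypothetical helper, the candidate query is hand-written as a stand-in (it is not output from the model), and SQLite's dialect may accept or reject statements differently from the target database.

```python
import sqlite3

# Schema taken from the example prompt.
SCHEMA = """
CREATE TABLE sales (
    id INT PRIMARY KEY,
    customer_id INT,
    total_amount DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(255)
);
"""

# Hand-written stand-in for a generated query (assumption, not model output).
CANDIDATE_QUERY = """
SELECT c.name, SUM(s.total_amount) AS total_sales
FROM sales AS s
JOIN customers AS c ON c.id = s.customer_id
GROUP BY c.id, c.name
HAVING SUM(s.total_amount) > 50000
"""


def check_query(schema: str, query: str) -> bool:
    """Return True if `query` compiles against `schema` in in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without running the query itself.
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


print(check_query(SCHEMA, CANDIDATE_QUERY))
print(check_query(SCHEMA, "SELECT missing_column FROM sales"))
```

A check like this catches references to tables or columns that do not exist, but it says nothing about whether the query answers the question correctly.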
+
+
+
+
+
  # Uploaded model
 
  - **Developed by:** yasserrmd