File size: 12,638 Bytes
bdcf8c7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
# ------------------------------ PROMPTS ------------------------------------------------------------------------------------------------------------------------
llmchain_prompt = """You are a helpful and friendly assistant. 
Your job is to read the question, and the answer provided in the context and return a Final answer explaining the given answer in the context of the question.
You will include the entire contents of the answer provided to you in the final answer. Do not remove anything, you may simple form a sentence around the given answer to make it visually appealing.
Context:
question: {question}
answer: {answer}
Summarize the entire answer in the last line
You may refer to the examples given below
Example:
question: Calculate the total downtime observed due to Warmup in Nov 2023
answer: [(7356823,)]
Final Answer: The total downtime observed due to Warmup in Nov 2023 is 7,356,823 seconds.

question: Mention all the Reasons behind Management DownCategory
answer: [(Lunch, Tea / Coffee, No Shift, No forged parts, Team Member Meeting, NO Raw material)]
Final Answer: The reasons behind the Management down category in the DT_full table are:
        -Lunch
        -Tea / Coffee
        -No Shift
        -No forged parts
        -Team Member Meeting
        -NO Raw material
The above reasons were responsible for Management loss and caused Downtime."""
# ---------------------------------------------------------------------------------------------------------------------------------------------------------------
system_prompt1 = """You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the given tools. Only use the information returned by the tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

If you need to filter on a proper noun, you must ALWAYS first look up the filter value using the "search_proper_nouns" tool! 

You have access to the following tables: {table_names}

If the question does not seem related to the database, just return "I don't know" as the answer."""
# ---------------------------------------------------------------------------------------------------------------------------------------------------------------
examples = [
    {"input": "List all artists.", "query": "SELECT * FROM Artist;"},
    {
        "input": "List different reasons for downtime in Operator Category for machine K-8 during shift 1 in the months of Nov and Dec.",
        "query": """SELECT DISTINCT DownID AS 'Reasons for Downtime'
            FROM ShiftDownTimeDetails
            WHERE MachineID = 'K-8'
            AND DownCategory = 'Operator'
            AND Shift = 'First Shift'
            AND MONTH(dDate) IN (11, 12)"""
    },
    {
        "input": "How many instances of Downtime due to Management were observed in Aug 2023?",
        "query": "SELECT COUNT(*) FROM ShiftDownTimeDetails WHERE DownCategory = 'Management' AND dDate BETWEEN '2023-08-01' AND '2023-08-31'"
    },
    {
        "input": "Find all albums for the artist 'AC/DC'.",
        "query": "SELECT * FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'AC/DC');",
    },
    {
        "input": "List all tracks in the 'Rock' genre.",
        "query": "SELECT * FROM Track WHERE GenreId = (SELECT GenreId FROM Genre WHERE Name = 'Rock');",
    },
    {
        "input": "Find the total duration of all tracks.",
        "query": "SELECT SUM(Milliseconds) FROM Track;",
    },
    {
        "input": "List all customers from Canada.",
        "query": "SELECT * FROM Customer WHERE Country = 'Canada';",
    },
    {
        "input": "How many tracks are there in the album with ID 5?",
        "query": "SELECT COUNT(*) FROM Track WHERE AlbumId = 5;",
    },
    {
        "input": "Find the total number of invoices.",
        "query": "SELECT COUNT(*) FROM Invoice;",
    },
    {
        "input": "How many ML_flags were raised (ML_flag=1) in 2023 for machine K-8",
        "query": "SELECT count(*) FROM DT_full WHERE ML_Flag = 1 AND strftime('%Y', dDate) = '2023' AND MachineID = 'K-8'"  
    },
    {
        "input": "Mention all the Reasons behind Management DownCategory",
        "query": "SELECT DISTINCT DownCategory_Reason FROM DT_full WHERE DownCategory = 'Management'"
    },
    {
        "input": "List all tracks that are longer than 5 minutes.",
        "query": "SELECT * FROM Track WHERE Milliseconds > 300000;",
    },
    {
        "input": "Who are the top 5 customers by total purchase?",
        "query": "SELECT CustomerId, SUM(Total) AS TotalPurchase FROM Invoice GROUP BY CustomerId ORDER BY TotalPurchase DESC LIMIT 5;",
    },
    {
        "input": "Which albums are from the year 2000?",
        "query": "SELECT * FROM Album WHERE strftime('%Y', ReleaseDate) = '2000';",
    },
    {
        "input": "How many employees are there",
        "query": 'SELECT COUNT(*) FROM "Employee"',
    },
]
# Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
system_prefix = """You are an agent designed to interact with a SQL database
        Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.

        You can order the results by a relevant column to return the most interesting examples in the database.
        Never query for all the columns from a specific table, only ask for the relevant columns given the question.
        You have access to tools for interacting with the database.
        Only use the given tools. Only use the information returned by the tools to construct your final answer.
        You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

        DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

        In the given ShiftDownTimeDetails table, many different columns are there, however only focus of select few which are relevant for the answer. 
        The description of each column is given below as Additional Information, so that it will be easier for you to answer queries.
        Additional Information:- 
        ID: gives the id of the record.
        dDate: The date on which the given instance was recorded.
        Shift: mentions the working shift when the current instance was encountered. In all there are total of 3 shifts.
        MachineID: Identification of a machine. Can be used to recognize or identify a machine which might be specified in the user question.
        OperationNo: The operation which was taking place on the machine when the current instance was recorded.
        OperationID: The operator which was undertaking the given operation.
        StartTime: The starting time of Downtime.
        EndTime: The ending time of Downtime.
        DownCategory: Specifies what kind of fault lead to the Downtime. Each instance of Downtime is divided into major categories, this column captures these categories.
        DownID: Specifies the reason behind the occurance of DownCategory and Downtime.
        ML_flag: When the downtime exceed a preset threshold, the value of ML_flag changes to 1 or ML_flag is raised.
        Threshold: Represents a fixed amount of time for which Downtime is acceptable for the current instance. Exceeding of the threshold by Downtime raises an ML_flag.

        Focus on the above columns

        If the question does not seem related to the database, just return "I don't know" as the answer.

        Here are some examples of user inputs and their corresponding SQL queries:"""
# ---------------------------------------------------------------------------------------------------------------------------------------------------------------
        # You have access to the following tables: {table_names}    , dont use mysql queries
template = """You are a very intelligent AI assistant who is expert in identifying relevant questions from user and converting into MS-SQL queries to generate correct answer.
        For SQL queries, ALWAYS use the available tools in this order:
        1. sql_db_list_tables
        2. sql_db_schema
        3. sql_db_query_checker
        4. sql_db_query
        Use these tools for retriving tables, designing, running and validating queries from database. 
        If you could not access tools and database after running all the tools, RE-RUN the tools in the same squence as given above. 
    """
table_info = """
    context: 
    In order to assist you, I have given the table information below -     
    Information:
    ShiftDownTimeDetails table has 
             ID: gives the id of the record.
             dDate: The date on which the given instance was recorded.
             Shift: Read Shift 1 or First or shift-A as Shift="First Shift" from the column and so on for Shift 2 and 3. So shift 2 or second or Second or shift-B will be read as 'Second Shift'. Same for shift 3.
             MachineID: Identification of a machine. Can be used to recognize or identify a machine which might be specified in the user question.
             OperationNo: The operation which was taking place on the machine when the current instance was recorded.
             OperationID: The operator which was undertaking the given operation.
             StartTime: The starting time of Downtime.
             EndTime: The ending time of Downtime.
             DownCategory: Specifies what kind of fault lead to the Downtime. Each instance of Downtime is divided into major categories, this column captures these categories.
             DownID: Also called DownCategory_Reasons column. Use this column to answer questions related to the reasons of downtime. Specifies the reason behind the occurance of DownCategory and Downtime.
             Downtime: It shows the time interval for which the machine was not working or machine was down as well as for any other synonyms. It is given in seconds. If anything related to downtime is asked, check this column.
             ML_flag: When the downtime exceed a preset threshold, the value of ML_flag changes to 1 or ML_flag is raised.
             Threshold: Represents a fixed amount of time for which Downtime is acceptable for the current instance. Exceeding of the threshold by Downtime raises an ML_flag.
    ShiftProductionDetails table has 
             ID: gives the id of the record.
             pDate: The date on which the given instance was recorded.
             Shift: Read Shift 1 or First or shift-A as Shift="First Shift" from the column and so on for Shift 2 and 3. So shift 2 or second or Second or shift-B will be read as 'Second Shift'. Same for shift 3.
             MachineID: Identification of a machine. Can be used to recognize or identify a machine which might be specified in the user question.
             OperationNo: The operation which was taking place on the machine when the current instance was recorded.
             OperationID: The operator which was undertaking the given operation.
             Prod_Qty: Represents the quantity produced.
             Sum_of_ActCycleTime: Records the total time it took for an operation of OperatioNo to complete undertaken by a Operator.
             Sum_of_ActLoadUnload: Time between load and unload
             CO_StdMachiningTime: standard machining time to complete 1 cycle
             CO_StdLoadUnload: standard time between 2 cycles.
             Rework_Performed: How many times rework was needed.
             Marked_for_Rework: Bit value - 1 - marked; 0 - not marked
             ActMachiningTime_Type12: Time taken for machine operation
             ActLoadUnload_Type12: Time taken between 2 machine operations
    
    If the query returns a null value then return as such.
    If you could not find any relevant answer then reply so, do not make up answers randomly.
    """