from langchain_core.prompts import ChatPromptTemplate
# NODE_TASK_BRIEF_DEVELOPER = "task_brief_developer"
NODE_ACCEPTANCE_CRITERIA_DEVELOPER = "acceptance_criteria_developer"
NODE_PROMPT_INITIAL_DEVELOPER = "prompt_initial_developer"
NODE_PROMPT_DEVELOPER = "prompt_developer"
NODE_PROMPT_EXECUTOR = "prompt_executor"
NODE_OUTPUT_HISTORY_ANALYZER = "output_history_analyzer"
NODE_PROMPT_ANALYZER = "prompt_analyzer"
NODE_PROMPT_SUGGESTER = "prompt_suggester"
META_PROMPT_NODES = [
# NODE_TASK_BRIEF_DEVELOPER,
NODE_ACCEPTANCE_CRITERIA_DEVELOPER,
NODE_PROMPT_INITIAL_DEVELOPER,
NODE_PROMPT_DEVELOPER,
NODE_PROMPT_EXECUTOR,
NODE_OUTPUT_HISTORY_ANALYZER,
NODE_PROMPT_ANALYZER,
NODE_PROMPT_SUGGESTER
]
DEFAULT_PROMPT_TEMPLATES = {
# NODE_TASK_BRIEF_DEVELOPER: ChatPromptTemplate.from_messages([
# ("system", """# Task Brief Developer
# You are a task brief developer. You will receive a specific example to create a task brief. You will respond directly with the brief for the task type.
# ## Instructions
# The user will provide you a specific example with User Message (input) and Expected Output (output) of a task type. You will respond with a brief for the task type in the following format:
# ```
# # Task Description
# [Task description]
# ```
# """),
# ("human", """# User Message
# {user_message}
# # Expected Output
# {expected_output}
# # Task Brief
# """)
# ]),
NODE_ACCEPTANCE_CRITERIA_DEVELOPER: ChatPromptTemplate.from_messages([
("system", """# Acceptance Criteria Developer
You are an acceptance criteria developer. You will receive a specific example of a task type to create acceptance criteria. You will respond directly with the acceptance criteria.
## Instructions
The user will provide you a specific example of a task type, consisting of a User Message (input) and an Expected Output (output). You will respond with acceptance criteria for the task type, derived by comparison with the Expected Output (which may be referenced as EO), including the following:
* What the output should include
* What the output should not include
* Language requirements
* Formatting requirements
* Structure requirements
* Style requirements
* Any specific requirements
## Output
Create acceptance criteria in the following format:
```
# Acceptance Criteria
* [Criteria 1]
* [Criteria 2]
* ...
* Unacceptable differences (compared with EO):
* ...
* Acceptable differences (compared with EO):
* ...
```
"""),
("human", """# Task Brief
{system_message}
# User Message
{user_message}
# Expected Output
{expected_output}
# Acceptance Criteria
""")
]),
NODE_PROMPT_INITIAL_DEVELOPER: ChatPromptTemplate.from_messages([
("system", """# Expert Prompt Engineer
You are an expert at creating and modifying GPTs, which are like chatbots that can have additional capabilities.
## Instructions
The user will provide you a specific example to create the GPT. You will respond directly with the description of the GPT. The description should be around 200 tokens.
## Output
Create a [name]. Here's the description: [description]. Start with "GPT Description:"
"""),
("human", """# User Message
{user_message}
# Expected Output
{expected_output}
# System Message
""")
]),
NODE_PROMPT_DEVELOPER: ChatPromptTemplate.from_messages([
("system", """# Expert Prompt Engineer
You are an expert at creating and modifying GPTs, which are like chatbots that can have additional capabilities.
## Instructions
The user will provide you a specific example (`User Message` and `Expected Output`), current GPT (`Current System Message`) and suggestions to update the GPT. You will respond directly with the description of the GPT.
* Modify only the content mentioned in the Suggestion. Do not change the parts that are not related to the Suggestion.
* Avoiding the behavior should be explicitly requested (e.g. `Don't ...`) in the System Message, if the behavior is: asked to be avoided by the Suggestions; but not mentioned in the Current System Message.
## Output
Create a [name]. Here's the description: [description]. Start with "GPT Description:"
"""),
("human", """# Current System Message
{system_message}
# User Message
{user_message}
# Expected Output
{expected_output}
# Suggestions
{suggestions}
# Updated System Message
""")
]),
NODE_PROMPT_EXECUTOR: ChatPromptTemplate.from_messages([
("system", "{system_message}"),
("human", "{user_message}")
]),
NODE_OUTPUT_HISTORY_ANALYZER: ChatPromptTemplate.from_messages([
("system", """{{
"task_description": "You are a text comparing program. Your task is to read the Acceptance Criteria, compare the Expected Output with two different outputs (Output 1 and Output 2), and decide which one is closer to the Expected Output, ignoring the differences that are acceptable or ignorable according to the Acceptance Criteria. Provide an analysis of your comparison and clearly indicate the output ID that is closer to the Expected Output. Note that if the Acceptance Criteria mention language and format requirements, these always have the highest priority. Outputs with significant differences in language or format compared to the Expected Output should always be evaluated as having greater differences.",
"requirements": [
"Read and understand the provided Acceptance Criteria carefully.",
"Compare the Expected Output with two different outputs (Output 1 and Output 2).",
"Ignore the differences that are specified as acceptable or ignorable in the Acceptance Criteria.",
"Determine which output (Output 1 or Output 2) is closer to the Expected Output based on the Acceptance Criteria.",
"Provide a detailed analysis of your comparison and decision-making process.",
"Clearly indicate the output ID (either 1 or 2) that is closer to the Expected Output."
],
"output_format": {{
"type": "object",
"properties": {{
"analysis": {{
"type": "string",
"description": "A detailed analysis explaining the comparison and decision-making process based on the Acceptance Criteria."
}},
"closerOutputID": {{
"type": "integer",
"description": "The output ID (1 or 2) that is closer to the Expected Output, or 0 if both outputs are equally close."
}}
}},
"required": [
"analysis",
"closerOutputID"
]
}},
"output_example": {{
"analysis": "The Acceptance Criteria specified that the output should be in English and follow a specific JSON format. Output 1 matches these high-priority requirements, while Output 2 is in Spanish and uses XML format. Although both outputs contain similar information, the language and format differences in Output 2 are considered significant. Therefore, Output 1 is closer to the Expected Output despite some minor content differences.",
"closerOutputID": 1
}},
"evaluation_criteria": [
"The analysis should demonstrate a clear understanding of the Acceptance Criteria, with the highest priority given to language and format requirements if specified.",
"The comparison should accurately identify and ignore acceptable or ignorable differences, while emphasizing significant language or format discrepancies.",
"The decision should be based on a thorough analysis of the outputs in relation to the Expected Output, prioritizing language and format matching when required.",
"The output ID indicated as closer to the Expected Output should align with the analysis, reflecting the importance of language and format requirements."
],
"error_handling": [
"If the Acceptance Criteria are unclear or contradictory, provide an analysis explaining the ambiguity and suggest possible interpretations.",
"If neither output is closer to the Expected Output, provide an analysis explaining why and use \"closerOutputID\": 0."
],
"ethical_considerations": [
"Ensure that the comparison process is unbiased and solely based on the Acceptance Criteria.",
"Do not introduce personal opinions or preferences into the analysis."
],
"conclusion": "Confirm that your output adheres to the specified language and format, includes a detailed analysis, and clearly indicates the closer output ID based on the Acceptance Criteria."
}}
"""),
("human", """<|Start_Output_ID_1|>{best_output}<|End_Output_ID_1|>
<|Start_Output_ID_2|>{output}<|End_Output_ID_2|>
<|Start_Acceptance_Criteria|>{acceptance_criteria}<|End_Acceptance_Criteria|>
<|Start_Expected_Output|>{expected_output}<|End_Expected_Output|>
""")
]),
NODE_PROMPT_ANALYZER: ChatPromptTemplate.from_messages([
("system", """{{
"task_description": "Compare the Expected Output with the Actual Output according to the Acceptance Criteria and provide a JSON output with the analysis.",
"requirements": [
"Strictly follow the Acceptance Criteria to compare Expected and Actual Outputs",
"Set 'Accept' to 'Yes' only if all criteria are met, otherwise set it to 'No'",
"List acceptable and unacceptable differences based on the criteria"
],
"output_format": {{
"type": "object",
"properties": {{
"Accept": {{
"type": "string",
"enum": ["Yes", "No"]
}},
"Acceptable Differences": {{
"type": "array",
"items": {{
"type": "string"
}}
}},
"Unacceptable Differences": {{
"type": "array",
"items": {{
"type": "string"
}}
}}
}},
"required": ["Accept", "Acceptable Differences", "Unacceptable Differences"]
}},
"output_example": {{
"Accept": "No",
"Acceptable Differences": [
"Spelling variations: 'colour' vs 'color'"
],
"Unacceptable Differences": [
"Missing section: 'Conclusion'",
"Incorrect date format: '2023/10/12' vs '12-10-2023'"
]
}}
}}
"""),
("human", """<|Start_Expected_Output|>
{expected_output}
<|End_Expected_Output|>
<|Start_Actual_Output|>
{output}
<|End_Actual_Output|>
<|Start_Acceptance_Criteria|>
{acceptance_criteria}
<|End_Acceptance_Criteria|>
""")
]),
NODE_PROMPT_SUGGESTER: ChatPromptTemplate.from_messages([
("system", """Read the following inputs and outputs of an LLM prompt, along with the analysis of them. Then suggest how to improve the System Message.
* The goal is to improve the System Message to match the Expected Output better.
* Ignore all Acceptable Differences and focus on Unacceptable Differences.
* Suggest formal changes first, then semantic changes.
* Provide your suggestions in a Markdown list, nothing else. Output only the suggestions related to Unacceptable Differences.
* Start every suggestion with `The System Message should ...`.
* Figure out the contexts of the System Message that conflict with the suggestions, and suggest modification or deletion.
* Because the Expected Output won't be shown to the prompt developer who will read your suggestions, do not simply describe the output as being the same/similar/different from the Expected Output, such as `the output should not use a different format and style compared to the Expected Output` or `the output should match the expected output exactly`; instead, describe the expected characteristics specifically and suggest a detailed example.
* Avoiding the behavior should be explicitly requested (e.g. `The System Message should explicitly state that the output should not ...`) in the System Message, if the behavior is: asked to be removed by the Suggestions; appeared in the Actual Output; but not mentioned in the Current System Message.
* Expected Output text should not appear in System Message as an example. But it's OK to use some similar but distinct text as an example instead.
* Ask to remove the Expected Output text or text highly similar to Expected Output from System Message, if it's present.
* Provide format examples (but don't use Expected Output text as the example) or detected format name, if System Message does not.
* Specify the detected format name (e.g. XML, JSON, etc.) of Expected Output, if System Message does not mention it.
"""),
("human", """
<|Start_System_Message|>
{system_message}
<|End_System_Message|>
<|Start_User_Message|>
{user_message}
<|End_User_Message|>
<|Start_Expected_Output|>
{expected_output}
<|End_Expected_Output|>
<|Start_Actual_Output|>
{output}
<|End_Actual_Output|>
<|Start_Acceptance_Criteria|>
{acceptance_criteria}
<|End_Acceptance_Criteria|>
<|Start_Analysis|>
{analysis}
<|End_Analysis|>
""")
])
}
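
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module.
#
# The JSON-style system prompts above write literal braces as `{{` and `}}`,
# while single braces such as {output} mark placeholders. This sketch assumes
# ChatPromptTemplate's default f-string-style formatting, which follows the
# same escaping rule as Python's str.format. The helper name and the tiny
# template below are hypothetical; they only mimic that rule with plain
# str.format so the escaping can be checked without calling a model.
# ---------------------------------------------------------------------------
import json


def render_escaped_json_demo(output_id: int) -> dict:
    """Render a small brace-escaped template and parse the result as JSON."""
    # `{{` / `}}` become literal braces; `{output_id}` is substituted.
    demo_template = '{{"analysis": "demo", "closerOutputID": {output_id}}}'
    rendered = demo_template.format(output_id=output_id)
    return json.loads(rendered)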