yaleh committed on
Commit ce61883 · 1 Parent(s): d535943

Optimized history analyzing prompts.

Files changed (3):
  1. config.yml +96 -29
  2. meta_prompt/consts.py +2 -2
  3. meta_prompt/meta_prompt.py +12 -4
config.yml CHANGED
@@ -86,6 +86,57 @@ allow_flagging: false
 prompt_templates:
 
   gpt:
+    acceptance_criteria_developer:
+      - role: system
+        message: |
+          # Acceptance Criteria Developer
+
+          You are an acceptance criteria developer. You will receive a specific example of a task type and create acceptance criteria for it. You will respond directly with the acceptance criteria.
+
+          ## Instructions
+
+          The user will provide you with a specific example of a task type, consisting of a User Message (input) and an Expected Output (output). You will respond with acceptance criteria for the task type, derived by comparing against the Expected Output (which may be referenced as EO), including the following:
+
+          * What the output should include
+          * What the output should not include
+          * Language requirements
+          * Formatting requirements
+          * Structure requirements
+          * Style requirements
+          * Any specific requirements
+
+          ## Output
+
+          Create acceptance criteria in the following format:
+
+          ```
+          # Acceptance Criteria
+
+          * [Overall Criteria]
+            * ...
+          * Unacceptable differences (compared with EO):
+            * ...
+          * Acceptable differences (compared with EO):
+            * ...
+          ```
+
+          Focus on `Unacceptable differences` and `Acceptable differences`. Keep the Overall Criteria brief (no more than 50 words).
+      - role: human
+        message: |
+          # Task Brief
+
+          {system_message}
+
+          # User Message
+
+          {user_message}
+
+          # Expected Output
+
+          {expected_output}
+
+          # Acceptance Criteria
+
     prompt_initial_developer:
       - role: system
         message: |
@@ -197,51 +248,67 @@ prompt_templates:
 
           You output the following analysis according to the Acceptance Criteria:
 
-          * Your analysis in a Markdown list.
-          * Indicates an output ID that is closer to the Expected Output, in the following format:
-
-          ```
-          # Analysis
-
-          ...
-
-          # Output ID closer to Expected Output: [ID]
-          ```
-
-          You must choose one of the two outputs. If both outputs are exactly the same, output the following:
-
-          ```
-          # Analysis
-
-          ...
-
-          # Draw
-          ```
+          * Your analysis.
+          * The output ID that is closer to the Expected Output.
+
+          Requirements:
+          1. Read and understand the provided Acceptance Criteria carefully.
+          2. Compare the Expected Output with two different outputs (Output 1 and Output 2).
+          3. Ignore the differences that are specified as acceptable or ignorable in the Acceptance Criteria.
+          4. Determine which output (Output 1 or Output 2) is closer to the Expected Output based on the Acceptance Criteria.
+          5. Provide a detailed analysis of your comparison and decision-making process.
+          6. Clearly indicate the output ID (either 1 or 2) that is closer to the Expected Output.
+
+          Output Format:
+          Your output should be in the following JSON format:
+          {{
+            "analysis": "[Your detailed analysis here. Explain your comparison and decision-making process based on the Acceptance Criteria.]",
+            "closerOutputID": [1 or 2 or 0]
+          }}
+
+          Note:
+          - Use "closerOutputID": 1 if Output 1 is closer to the Expected Output.
+          - Use "closerOutputID": 2 if Output 2 is closer to the Expected Output.
+          - Use "closerOutputID": 0 if both outputs are exactly the same or equally close to the Expected Output.
+
+          Examples:
+          Example 1:
+          {{
+            "analysis": "Based on the Acceptance Criteria, the differences in formatting and whitespace are ignorable. Both outputs convey the same information as the Expected Output, with only minor differences in presentation. Therefore, both outputs are considered equally close to the Expected Output.",
+            "closerOutputID": 0
+          }}
+
+          Example 2:
+          {{
+            "analysis": "According to the Acceptance Criteria, the presence of additional information in Output 2 that is not present in the Expected Output is acceptable. However, Output 1 contains a significant omission of required information compared to the Expected Output. Therefore, Output 2 is closer to the Expected Output.",
+            "closerOutputID": 2
+          }}
+
+          Remember to adhere to the Acceptance Criteria when comparing the outputs, and provide a clear and detailed analysis to support your decision. Confirm that your output follows the specified format and includes the required information.
+      - role: human
+        message: |
+          # Acceptance Criteria
+
+          {acceptance_criteria}
+
+          # Expected Output
+
+          ```
+          {expected_output}
+          ```
+
+          # Output ID: 1
 
           ```
           {best_output}
           ```
 
-          # Output ID: B
+          # Output ID: 2
 
           ```
           {output}
           ```
 
-          # Acceptance Criteria
-
-          Compared with Expected Output [EO]:
-          {acceptance_criteria}
-
-          # Expected Output
-
-          ```
-          {expected_output}
-          ```
-
     prompt_analyzer:
       - role: system
         message: |
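The doubled braces (`{{` and `}}`) in the new analyzer prompt are deliberate: the `{acceptance_criteria}`-style placeholders are presumably filled in with f-string/`str.format`-style substitution (as in LangChain prompt templates), so literal JSON braces must be escaped by doubling or they would be treated as placeholders. A minimal sketch with an abbreviated, illustrative template (not the full prompt from this diff):

```python
# Abbreviated template: {{ and }} render as literal braces after
# format-style substitution; {acceptance_criteria} is a placeholder.
template = (
    "Your output should be in the following JSON format:\n"
    "{{\n"
    '  "analysis": "[Your detailed analysis here.]",\n'
    '  "closerOutputID": [1 or 2 or 0]\n'
    "}}\n\n"
    "# Acceptance Criteria\n\n"
    "{acceptance_criteria}\n"
)

rendered = template.format(acceptance_criteria="* No extra sections.")
print(rendered)  # doubled braces come out single; placeholder is filled
```

An unescaped `{` in such a template would instead raise a `KeyError` or `IndexError` at substitution time, which is why every literal brace in the JSON examples above is doubled.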
meta_prompt/consts.py CHANGED
@@ -77,9 +77,9 @@ Create acceptance criteria in the following format:
 * [Criteria 1]
 * [Criteria 2]
 * ...
-* Unacceptable differences (comapire with EO):
+* Unacceptable differences (compared with EO):
   * ...
-* Acceptable differences (comapire with EO):
+* Acceptable differences (compared with EO):
   * ...
 ```
 
meta_prompt/meta_prompt.py CHANGED
@@ -1,3 +1,4 @@
+import json
 import logging
 import pprint
 from langchain_core.language_models import BaseLanguageModel
@@ -471,12 +472,19 @@ class MetaPromptGraph:
             'message': response.content
         })
 
-        analysis = response.content
+        response_content = response.content.strip()
+        if response_content.startswith('```json') and response_content.endswith('```'):
+            response_content = response_content[7:-3].strip()
+        elif response_content.startswith('```') and response_content.endswith('```'):
+            response_content = response_content[3:-3].strip()
+        analysis_dict = json.loads(response_content)
+
+        analysis = analysis_dict["analysis"]
+        closer_output_id = analysis_dict["closerOutputID"]
 
         if (state["best_output"] is None or
-                "# Output ID closer to Expected Output: B" in analysis or
-                (self.aggressive_exploration and
-                 "# Output ID closer to Expected Output: A" not in analysis)):
+                closer_output_id == 2 or
+                (self.aggressive_exploration and closer_output_id != 1)):
             result_dict = {
                 "best_output": state["output"],
                 "best_system_message": state["system_message"],
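The fence-stripping logic added above handles the common case where a model wraps its JSON in a Markdown code block before the payload reaches `json.loads`. A standalone sketch of the same idea (the function name `parse_json_response` is illustrative, not from the repo); note that `json.loads` still raises `json.JSONDecodeError` on malformed output, so callers may want to catch it and fall back gracefully:

```python
import json

def parse_json_response(text: str) -> dict:
    """Strip an optional Markdown code fence, then parse JSON.

    Mirrors the approach in the diff: drop a leading ```json or ```
    fence and the trailing ``` before handing the payload to json.loads.
    """
    content = text.strip()
    if content.startswith("```json") and content.endswith("```"):
        content = content[7:-3].strip()
    elif content.startswith("```") and content.endswith("```"):
        content = content[3:-3].strip()
    return json.loads(content)

raw = '```json\n{"analysis": "Output 2 matches EO.", "closerOutputID": 2}\n```'
result = parse_json_response(raw)
print(result["closerOutputID"])  # → 2
```

Parsing a `closerOutputID` integer is also sturdier than the previous substring matching (`"# Output ID closer to Expected Output: B" in analysis`), which could false-positive if the model quoted the marker inside its analysis text.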