Optimized history analyzing prompts.

Files changed:
- config.yml (+96, −29)
- meta_prompt/consts.py (+2, −2)
- meta_prompt/meta_prompt.py (+12, −4)
config.yml

@@ -86,6 +86,57 @@ allow_flagging: false
 prompt_templates:
 
   gpt:
+    acceptance_criteria_developer:
+      - role: system
+        message: |
+          # Acceptance Criteria Developer
+
+          You are an acceptance criteria developer. You will receive a specific example of a task type to create acceptance criteria. You will respond directly with the acceptance criteria.
+
+          ## Instructions
+
+          The user will provide you a specific example with User Message (input) and Expected Output (output) of a task type. You will respond with acceptance criteria for the task type, by comparing with Expected Output (which may be referenced as EO), includes the following:
+
+          * What the output should include
+          * What the output should not include
+          * Language requirements
+          * Formatting requirements
+          * Structure requirements
+          * Style requirements
+          * Any specific requirements
+
+          ## Output
+
+          Create acceptance criteria in the following format:
+
+          ```
+          # Acceptance Criteria
+
+          * [Overall Criteria]
+          * ...
+          * Unacceptable differences (compared with EO):
+            * ...
+          * Acceptable differences (compared with EO):
+            * ...
+          ```
+
+          Focus on `Unacceptable differences` and `Acceptable differences`. Keep Overall Criteria brief (no more than 50 words).
+      - role: human
+        message: |
+          # Task Brief
+
+          {system_message}
+
+          # User Message
+
+          {user_message}
+
+          # Expected Output
+
+          {expected_output}
+
+          # Acceptance Criteria
+
     prompt_initial_developer:
       - role: system
         message: |

@@ -197,51 +248,67 @@ prompt_templates:
 
           You output the following analysis according to the Acceptance Criteria:
 
-          * Your analysis
-          * Indicates an output ID that is closer to the Expected Output
+          * Your analysis.
+          * Indicates an output ID that is closer to the Expected Output.
+
+          Requirements:
+          1. Read and understand the provided Acceptance Criteria carefully.
+          2. Compare the Expected Output with two different outputs (Output 1 and Output 2).
+          3. Ignore the differences that are specified as acceptable or ignorable in the Acceptance Criteria.
+          4. Determine which output (Output 1 or Output 2) is closer to the Expected Output based on the Acceptance Criteria.
+          5. Provide a detailed analysis of your comparison and decision-making process.
+          6. Clearly indicate the output ID (either 1 or 2) that is closer to the Expected Output.
+
+          Output Format:
+          Your output should be in the following JSON format:
+          {{
+            "analysis": "[Your detailed analysis here. Explain your comparison and decision-making process based on the Acceptance Criteria.]",
+            "closerOutputID": [1 or 2 or 0]
+          }}
+
+          Note:
+          - Use "closerOutputID": 1 if Output 1 is closer to the Expected Output.
+          - Use "closerOutputID": 2 if Output 2 is closer to the Expected Output.
+          - Use "closerOutputID": 0 if both outputs are exactly the same or equally close to the Expected Output.
+
+          Examples:
+          Example 1:
+          {{
+            "analysis": "Based on the Acceptance Criteria, the differences in formatting and whitespace are ignorable. Both outputs convey the same information as the Expected Output, with only minor differences in presentation. Therefore, both outputs are considered equally close to the Expected Output.",
+            "closerOutputID": 0
+          }}
+
+          Example 2:
+          {{
+            "analysis": "According to the Acceptance Criteria, the presence of additional information in Output 2 that is not present in the Expected Output is acceptable. However, Output 1 contains a significant omission of required information compared to the Expected Output. Therefore, Output 2 is closer to the Expected Output.",
+            "closerOutputID": 2
+          }}
+
+          Remember to adhere to the Acceptance Criteria when comparing the outputs and provide a clear and detailed analysis to support your decision. Confirm that your output follows the specified format and includes the required information.
+      - role: human
+        message: |
+          # Acceptance Criteria
 
-
-          # Analysis
+          {acceptance_criteria}
 
-
+          # Expected Output
 
-          # Output ID closer to Expected Output: [ID]
           ```
-
-          You must choose one of the two outputs. If both outputs are exactly the same, output the following:
-
+          {expected_output}
           ```
-          # Analysis
 
-
-
-          # Draw
-          ```
-      - role: human
-        message: |
-          # Output ID: A
+          # Output ID: 1
 
           ```
           {best_output}
           ```
 
-          # Output ID:
+          # Output ID: 2
 
           ```
           {output}
           ```
 
-          # Acceptance Criteria
-
-          Compared with Expected Output [EO]:
-          {acceptance_criteria}
-
-          # Expected Output
-
-          ```
-          {expected_output}
-          ```
-
     prompt_analyzer:
       - role: system
         message: |
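An aside on the doubled braces in the new analyzer prompt: the `{acceptance_criteria}`-style fields in these templates look like Python `str.format()` placeholders, which would explain why the literal JSON examples are written with `{{` and `}}` (format() emits those as single literal braces). The snippet below is an illustrative sketch of that rendering behavior, not the project's actual rendering code; the template text and variable name are abbreviated stand-ins.

```python
# Sketch: str.format()-style substitution over a prompt template.
# {acceptance_criteria} is a placeholder; {{ ... }} survives as literal braces,
# which is why the JSON examples in the YAML template double their braces.
template = (
    "# Acceptance Criteria\n"
    "\n"
    "{acceptance_criteria}\n"
    "\n"
    "Your output should be in the following JSON format:\n"
    '{{\n  "analysis": "...",\n  "closerOutputID": [1 or 2 or 0]\n}}'
)

rendered = template.format(acceptance_criteria="* Ignore whitespace-only differences.")
print(rendered)
```

After rendering, the placeholder is gone and the JSON skeleton keeps single literal braces, so the model sees valid example JSON.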
meta_prompt/consts.py

@@ -77,9 +77,9 @@ Create acceptance criteria in the following format:
 * [Criteria 1]
 * [Criteria 2]
 * ...
-* Unacceptable differences (
+* Unacceptable differences (compared with EO):
   * ...
-* Acceptable differences (
+* Acceptable differences (compared with EO):
   * ...
 ```
 
meta_prompt/meta_prompt.py

@@ -1,3 +1,4 @@
+import json
 import logging
 import pprint
 from langchain_core.language_models import BaseLanguageModel

@@ -471,12 +472,19 @@ class MetaPromptGraph:
             'message': response.content
         })
 
-
+        response_content = response.content.strip()
+        if response_content.startswith('```json') and response_content.endswith('```'):
+            response_content = response_content[7:-3].strip()
+        elif response_content.startswith('```') and response_content.endswith('```'):
+            response_content = response_content[3:-3].strip()
+        analysis_dict = json.loads(response_content)
+
+        analysis = analysis_dict["analysis"]
+        closer_output_id = analysis_dict["closerOutputID"]
 
         if (state["best_output"] is None or
-
-                (self.aggressive_exploration and
-                 "# Output ID closer to Expected Output: A" not in analysis)):
+                closer_output_id == 2 or
+                (self.aggressive_exploration and closer_output_id != 1)):
             result_dict = {
                 "best_output": state["output"],
                 "best_system_message": state["system_message"],