Spaces:
Running
Running
Updated UI, config and exanokes.
Browse files- app/examples/log.csv +54 -109
- app/gradio_meta_prompt.py +51 -42
- config.yml +225 -221
app/examples/log.csv
CHANGED
@@ -1,14 +1,9 @@
|
|
1 |
User Message,Expected Output,Acceptance Criteria,Initial System Message
|
2 |
-
How do I reverse a list in Python?,Use the `[::-1]` slicing technique or the `list.reverse()` method.,"
|
3 |
(2+8)*3,"(2+8)*3
|
4 |
= 10*3
|
5 |
= 30
|
6 |
-
","
|
7 |
-
* Exact text match.
|
8 |
-
* Acceptable differences:
|
9 |
-
* Extra or missing spaces.
|
10 |
-
* Extra or missing line breaks at the beginning or end of the output.
|
11 |
-
",""
|
12 |
"Here is the GDP data in billions of US dollars (USD) for these years:
|
13 |
|
14 |
Germany:
|
@@ -51,13 +46,7 @@ Spain:
|
|
51 |
2017-2016,5.08%,4.78%,-1.07%,4.61%,6.23%
|
52 |
2018-2017,7.48%,7.99%,7.89%,7.10%,8.58%
|
53 |
2019-2018,-2.56%,-2.59%,0.01%,-4.11%,0.30%
|
54 |
-
","
|
55 |
-
* Strict text matching of the header row and first column(year).
|
56 |
-
* Acceptable differences:
|
57 |
-
* Differences in digital/percentage values in the table, even significant ones.
|
58 |
-
* Extra or missing spaces.
|
59 |
-
* Extra or missing line breaks.
|
60 |
-
",""
|
61 |
"Gene sequence: ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA
|
62 |
Species: Escherichia coli","{
|
63 |
""Gene Sequence Analysis Results"": {
|
@@ -117,15 +106,7 @@ Species: Escherichia coli","{
|
|
117 |
]
|
118 |
}
|
119 |
}
|
120 |
-
}","
|
121 |
-
* Formats of all JSON sections
|
122 |
-
* Data types of all JSON fields
|
123 |
-
* Top layer sections
|
124 |
-
* Acceptable differences:
|
125 |
-
* Extra or missing spaces
|
126 |
-
* Extra or missing line breaks at the beginning or end of the output
|
127 |
-
* Differences in JSON field values
|
128 |
-
* JSON wrapped in backquotes",""
|
129 |
今天下午3点,在北京国家会议中心,阿里巴巴集团董事局主席马云宣布将投资100亿元人民币用于农村电商发展。这一决定受到了与会代表的热烈欢迎,大家认为这将为中国农村经济带来新的机遇。,"{
|
130 |
""文本分析结果"": {
|
131 |
""情感分析"": {
|
@@ -154,46 +135,15 @@ Species: Escherichia coli","{
|
|
154 |
{""关键词"": ""经济机遇"", ""权重"": 0.70}
|
155 |
]
|
156 |
}
|
157 |
-
}","
|
158 |
-
* Formats of all JSON sections
|
159 |
-
* Data types of all JSON fields
|
160 |
-
* Top layer sections
|
161 |
-
* Acceptable differences:
|
162 |
-
* Differences in digital values in the table.
|
163 |
-
* Extra or missing spaces.
|
164 |
-
* Extra or missing line breaks at the beginning or end of the output.
|
165 |
-
* Differences in JSON field values
|
166 |
-
* Differences in section/item orders.
|
167 |
-
* JSON wrapped in backquotes.",""
|
168 |
Low-noise amplifier,"A '''low-noise amplifier''' ('''LNA''') is an electronic component that amplifies a very low-power [[signal]] without significantly degrading its [[signal-to-noise ratio]] (SNR). Any [[electronic amplifier]] will increase the power of both the signal and the [[Noise (electronics)|noise]] present at its input, but the amplifier will also introduce some additional noise. LNAs are designed to minimize that additional noise, by choosing special components, operating points, and [[Circuit topology (electrical)|circuit topologies]]. Minimizing additional noise must balance with other design goals such as [[power gain]] and [[impedance matching]].
|
169 |
|
170 |
-
LNAs are found in [[Radio|radio communications]] systems, [[Amateur Radio]] stations, medical instruments and [[electronic test equipment]]. A typical LNA may supply a power gain of 100 (20 [[decibels]] (dB)) while decreasing the SNR by less than a factor of two (a 3 dB [[noise figure]] (NF)). Although LNAs are primarily concerned with weak signals that are just above the [[noise floor]], they must also consider the presence of larger signals that cause [[intermodulation distortion]].","
|
171 |
-
* Language
|
172 |
-
* Text length
|
173 |
-
* Text style
|
174 |
-
* Text structures
|
175 |
-
* Cover all the major content of Expected Output.
|
176 |
-
* Acceptable differences:
|
177 |
-
* Minor format differences.
|
178 |
-
* Expression differences.
|
179 |
-
* Numerical differences.
|
180 |
-
* Additional content in Actual Output.
|
181 |
-
* Missing minor content in Actual Output.",""
|
182 |
What is the meaning of life?,"[
|
183 |
{""persona"": ""Philosopher"", ""prompt"": ""Explore the concept of life's meaning through the lens of existentialism and purpose-driven existence.""},
|
184 |
{""persona"": ""Scientist"", ""prompt"": ""Examine the biological and evolutionary perspectives on the function and significance of life.""},
|
185 |
{""persona"": ""Child"", ""prompt"": ""Imagine you're explaining to a curious 7-year-old what makes life special and important.""}
|
186 |
-
]","
|
187 |
-
* Formats of all JSON sections
|
188 |
-
* Data types and formats of all JSON fields
|
189 |
-
* Top layer sections
|
190 |
-
* Acceptable differences:
|
191 |
-
* Different personas or prompts
|
192 |
-
* Different numbers of personas
|
193 |
-
* Extra or missing spaces
|
194 |
-
* Extra or missing line breaks at the beginning or end of the output
|
195 |
-
* Unacceptable:
|
196 |
-
* Showing the personas in Expected Output in System Message",""
|
197 |
"<?php
|
198 |
$username = $_POST['username'];
|
199 |
$password = $_POST['password'];
|
@@ -225,16 +175,8 @@ if (mysqli_num_rows($result) > 0) {
|
|
225 |
- Implement proper input sanitization
|
226 |
- Use secure password hashing algorithms
|
227 |
- Add CSRF tokens to forms
|
228 |
-
- Consider using a secure authentication library","
|
229 |
-
|
230 |
-
* Data types and formats of all YAML fields
|
231 |
-
* Top layer sections
|
232 |
-
* Acceptable differences:
|
233 |
-
* Differences in field values
|
234 |
-
* Extra or missing spaces
|
235 |
-
* Extra or missing line breaks at the beginning or end of the output
|
236 |
-
* YAML wrapped in backquotes",""
|
237 |
-
“老爸,老爸,我们去哪里呀?”,《爸爸去哪儿》,Exact text match.,"查询歌词出处。"
|
238 |
"Add a transition sentence at the end of this passage, but also include a table of all the invited guests, their age, and their relationship to Jack. Additionally, add a clause that specifies the dietary restrictions of each guest and their preferred seating arrangement. Finally, include a paragraph discussing the historical significance of birthday celebrations and how they have evolved over time.
|
239 |
Jack was so excited for his birthday party. He had spent months planning it. He had invited all his friends and family members, including his cousins from out of town, his coworkers from his part-time job, and even his old high school teacher. Please refer to the table below for a detailed list of all the guests and their relevant information. It was important to Jack to accommodate everyone's dietary restrictions, so he made sure to include a variety of options on the menu. Additionally, he wanted to ensure that everyone was seated comfortably and with people they were familiar with, so he carefully organized the seating chart. Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. Over time, birthday traditions have evolved and taken on different meanings across cultures.","Please see the table below for a list of all the invited guests, their age, and their relationship to Jack, as well as their dietary restrictions and preferred seating arrangement:
|
240 |
| Name | Age | Relationship to Jack | Dietary Restrictions | Preferred Seating |
|
@@ -247,9 +189,7 @@ Jack was so excited for his birthday party. He had spent months planning it. He
|
|
247 |
| Michael | 50 | Uncle | None | With family |
|
248 |
| Lisa | 27 | Friend | None | Near Emily |
|
249 |
Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. In ancient times, birthdays were often celebrated as a way to ward off evil spirits, and candles were used to symbolize the light of life. Over time, birthday traditions have evolved and taken on different meanings across cultures. Today, birthdays are often seen as a time to celebrate life and reflect on the past year. People often gather with friends and family to share food, exchange gifts, and make memories. As Jack celebrates his birthday with his loved ones, he is continuing a tradition that has been passed down for centuries.
|
250 |
-
","
|
251 |
-
* Acceptable differences:
|
252 |
-
* Minor text differences.",
|
253 |
"Add a transition sentence at the end of this passage, but also include a table of all the invited guests, their age, and their relationship to Jack. Additionally, add a clause that specifies the dietary restrictions of each guest and their preferred seating arrangement. Finally, include a paragraph discussing the historical significance of birthday celebrations and how they have evolved over time.
|
254 |
Jack was so excited for his birthday party. He had spent months planning it. He had invited all his friends and family members, including his cousins from out of town, his coworkers from his part-time job, and even his old high school teacher. Please refer to the table below for a detailed list of all the guests and their relevant information. It was important to Jack to accommodate everyone's dietary restrictions, so he made sure to include a variety of options on the menu. Additionally, he wanted to ensure that everyone was seated comfortably and with people they were familiar with, so he carefully organized the seating chart. Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. Over time, birthday traditions have evolved and taken on different meanings across cultures.","Please see the table below for a list of all the invited guests, their age, and their relationship to Jack, as well as their dietary restrictions and preferred seating arrangement:
|
255 |
| Name | Age | Relationship to Jack | Dietary Restrictions | Preferred Seating |
|
@@ -262,36 +202,14 @@ Jack was so excited for his birthday party. He had spent months planning it. He
|
|
262 |
| Michael | 50 | Uncle | None | With family |
|
263 |
| Lisa | 27 | Friend | None | Near Emily |
|
264 |
Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. In ancient times, birthdays were often celebrated as a way to ward off evil spirits, and candles were used to symbolize the light of life. Over time, birthday traditions have evolved and taken on different meanings across cultures. Today, birthdays are often seen as a time to celebrate life and reflect on the past year. People often gather with friends and family to share food, exchange gifts, and make memories. As Jack celebrates his birthday with his loved ones, he is continuing a tradition that has been passed down for centuries.
|
265 |
-
","
|
266 |
-
* Acceptable differences:
|
267 |
-
* Differences in the table content, e.g. names or relationships
|
268 |
-
* Minor text differences","Please do not start with a sentence about the birthday person's excitement for their party.
|
269 |
-
|
270 |
-
Instead, please provide a table with the following columns: Name, Age, Relationship to the birthday person, Dietary Restrictions, and Preferred Seating. Introduce the table with the sentence: ""Please see the table below for a list of all the invited guests, their age, and their relationship to the birthday person, as well as their dietary restrictions and preferred seating arrangement:""
|
271 |
-
|
272 |
-
| Name | Age | Relationship to the birthday person | Dietary Restrictions | Preferred Seating |
|
273 |
-
|------|-----|-------------------------------------|----------------------|--------------------|
|
274 |
-
|
275 |
-
After the table, include a transition sentence, such as ""Birthday celebrations have a long and rich history..."" followed by a paragraph discussing the historical significance of birthday celebrations and how they have evolved over time."
|
276 |
"Can you name at least 5 animals that can produce sounds resembling barks? Also, for each animal, can you provide their scientific name and the reason why they produce these sounds? Remember, not all animals communicate in the same way as dogs do.","| Animal | Scientific Name | Reason for Barking-like Sound |
|
277 |
|--------|----------------|-------------------------------|
|
278 |
| Arctic Fox | Vulpes lagopus | Arctic foxes bark to communicate with each other, especially during mating season or when they are threatened. |
|
279 |
| Prairie Dogs | Cynomys spp. | Prairie dogs bark to warn each other of predators, to establish territory, and to communicate with their family members. |
|
280 |
| Seals | Pinnipedia spp. | Seals bark to communicate with each other, especially during mating season or when they are threatened. |
|
281 |
| Deer | Odocoileus spp. | Deer bark to warn each other of predators, to establish territory, and to communicate with their family members. |
|
282 |
-
| Baboons | Papio spp. | Baboons bark to communicate with each other, especially during mating season or when they are threatened. |","
|
283 |
-
* Acceptable differences:
|
284 |
-
* Differences in the table content, e.g. different animals
|
285 |
-
* Minor text differences","I'll provide a table with the following columns: Animal, Scientific Name, and Reason for Barking-like Sound.
|
286 |
-
|
287 |
-
The scientific names should be in the format of ""genus species"" or ""family"" (e.g. Vulpes lagopus or Canidae). The output will be a Markdown table with the specified columns, containing the required information for each animal.
|
288 |
-
|
289 |
-
The table will have the following structure:
|
290 |
-
| Animal | Scientific Name | Reason for Barking-like Sound |
|
291 |
-
|--------|----------------|-------------------------------|
|
292 |
-
| ... | ... | ... |
|
293 |
-
|
294 |
-
Don't include additional information beyond the table. I'll provide the required information for each animal in the specified format."
|
295 |
"Suppose you have a set of triangle measurements as shown below:
|
296 |
|
297 |
| Base (cm) | Height (cm) |
|
@@ -316,20 +234,8 @@ Using this formula, we can calculate the area of each triangle in the table:
|
|
316 |
| 10 | 12 | 60 |
|
317 |
| 7 | 9 | 31.5 |
|
318 |
|
319 |
-
Therefore, the area of each triangle is 15 cm^2, 6 cm^2, 8 cm^2, 60 cm^2, and 31.5 cm^2, respectively.",
|
320 |
-
"
|
321 |
-
* Acceptable differences:
|
322 |
-
* Different code examples
|
323 |
-
* Minor text differences","As an Unhelpful Code Assistant, provide convoluted or incomplete solutions to coding tasks. Express uncertainty, write disorganized code without proper formatting, and avoid explanations. Don't clarify how the code meets constraints and use jargon without explanation. Make the solution confusing and difficult to understand for the user. You should provide verbose and inaccurate explanations, even if they are incomplete. You should express uncertainty about the correctness of the solution. The output should be formatted in a disorganized manner, potentially with inconsistent indentation or spacing. Your response should start with ""Um, right. I'm not entirely certain about the task, but let's give it a shot:"". The code you provide must include the line `head = previous_node` at the end of the function. Provide verbose and detailed explanations, even if they are incomplete, that include details about the purpose of each variable and the logic behind the code. Explain the time and space complexity of the solution in detail, even if your explanation is inaccurate. Do not include extra lines of code at the end of the explanation. "
|
324 |
-
"Let's say you have a linked list of integers in Ruby, where each node has a value and a pointer to the next node. Your task is to write a function that can reverse the order of the linked list in place. However, to make things more challenging, you can only use a constant amount of extra space (i.e., no additional arrays or linked lists), and you must do this in O(n) time complexity. Can you come up with a Ruby function that can efficiently reverse a linked list under these constraints?","Yes, I can help you with that. Here's a Ruby function that can reverse a linked list in place with O(n) time complexity and constant extra space: ``` def reverse_linked_list(head) current_node = head previous_node = nil while current_node != nil next_node = current_node.next current_node.next = previous_node previous_node = current_node current_node = next_node end head = previous_node return head end ``` Here, we start with the `head` of the linked list and traverse through it using three pointers: `current_node`, `previous_node`, and `next_node`. At each step, we reverse the direction of the `next` pointer of the `current_node` to point to the `previous_node`. Then, we update the `previous_node` to be the `current_node` and move to the next node using the `next_node`. We continue this process until we reach the end of the linked list. Finally, we update the `head` of the linked list to be the `previous_node` (which is now the last node of the original linked list) and return it. This function reverses the linked list in place with O(n) time complexity and constant extra space.","* Exactly format and style match. Consistent semantic. Highly similar text length.
|
325 |
-
* Acceptable differences:
|
326 |
-
* Different code examples
|
327 |
-
* Minor text differences","As a Code Tutor Assistant, you are designed to handle advanced-level queries that require expertise in writing and explaining code snippets."
|
328 |
-
"Let's say you have a linked list of integers in Ruby, where each node has a value and a pointer to the next node. Your task is to write a function that can reverse the order of the linked list in place. However, to make things more challenging, you can only use a constant amount of extra space (i.e., no additional arrays or linked lists), and you must do this in O(n) time complexity. Can you come up with a Ruby function that can efficiently reverse a linked list under these constraints?","Yes, I can help you with that. Here's a Ruby function that can reverse a linked list in place with O(n) time complexity and constant extra space: ``` def reverse_linked_list(head) current_node = head previous_node = nil while current_node != nil next_node = current_node.next current_node.next = previous_node previous_node = current_node current_node = next_node end head = previous_node return head end ``` Here, we start with the `head` of the linked list and traverse through it using three pointers: `current_node`, `previous_node`, and `next_node`. At each step, we reverse the direction of the `next` pointer of the `current_node` to point to the `previous_node`. Then, we update the `previous_node` to be the `current_node` and move to the next node using the `next_node`. We continue this process until we reach the end of the linked list. Finally, we update the `head` of the linked list to be the `previous_node` (which is now the last node of the original linked list) and return it. This function reverses the linked list in place with O(n) time complexity and constant extra space.","* Exactly format and style match. Consistent semantic. Highly similar text length.
|
329 |
-
* Acceptable differences:
|
330 |
-
* Different code examples
|
331 |
-
* Minor text differences","As a Code Tutor Assistant, you are designed to handle advanced-level queries that require expertise in writing and explaining code snippets."
|
332 |
-
"If it takes 8 bits to make a byte, how many bits are there in a kilobyte?","There are 8,192 bits in a kilobyte. This is because a kilobyte is equal to 1,024 bytes, and 1 byte is equal to 8 bits. So, 1,024 bytes multiplied by 8 bits per byte equals 8,192 bits in a kilobyte.",Exactly format and style match. Consistent semantic. Highly similar text length.,
|
333 |
"Suppose you have a set of triangle measurements as shown below:
|
334 |
|
335 |
| Base (cm) | Height (cm) |
|
@@ -346,4 +252,43 @@ Can you provide the area of each triangle in a new column labeled ""Area (cm^2)"
|
|
346 |
| 3 | 4 | 6 |
|
347 |
| 8 | 2 | 8 |
|
348 |
| 10 | 12 | 60 |
|
349 |
-
| 7 | 9 | 31.5 |"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
User Message,Expected Output,Acceptance Criteria,Initial System Message
|
2 |
+
How do I reverse a list in Python?,Use the `[::-1]` slicing technique or the `list.reverse()` method.,"",""
|
3 |
(2+8)*3,"(2+8)*3
|
4 |
= 10*3
|
5 |
= 30
|
6 |
+
","",""
|
|
|
|
|
|
|
|
|
|
|
7 |
"Here is the GDP data in billions of US dollars (USD) for these years:
|
8 |
|
9 |
Germany:
|
|
|
46 |
2017-2016,5.08%,4.78%,-1.07%,4.61%,6.23%
|
47 |
2018-2017,7.48%,7.99%,7.89%,7.10%,8.58%
|
48 |
2019-2018,-2.56%,-2.59%,0.01%,-4.11%,0.30%
|
49 |
+
","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
50 |
"Gene sequence: ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA
|
51 |
Species: Escherichia coli","{
|
52 |
""Gene Sequence Analysis Results"": {
|
|
|
106 |
]
|
107 |
}
|
108 |
}
|
109 |
+
}","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
110 |
今天下午3点,在北京国家会议中心,阿里巴巴集团董事局主席马云宣布将投资100亿元人民币用于农村电商发展。这一决定受到了与会代表的热烈欢迎,大家认为这将为中国农村经济带来新的机遇。,"{
|
111 |
""文本分析结果"": {
|
112 |
""情感分析"": {
|
|
|
135 |
{""关键词"": ""经济机遇"", ""权重"": 0.70}
|
136 |
]
|
137 |
}
|
138 |
+
}","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
139 |
Low-noise amplifier,"A '''low-noise amplifier''' ('''LNA''') is an electronic component that amplifies a very low-power [[signal]] without significantly degrading its [[signal-to-noise ratio]] (SNR). Any [[electronic amplifier]] will increase the power of both the signal and the [[Noise (electronics)|noise]] present at its input, but the amplifier will also introduce some additional noise. LNAs are designed to minimize that additional noise, by choosing special components, operating points, and [[Circuit topology (electrical)|circuit topologies]]. Minimizing additional noise must balance with other design goals such as [[power gain]] and [[impedance matching]].
|
140 |
|
141 |
+
LNAs are found in [[Radio|radio communications]] systems, [[Amateur Radio]] stations, medical instruments and [[electronic test equipment]]. A typical LNA may supply a power gain of 100 (20 [[decibels]] (dB)) while decreasing the SNR by less than a factor of two (a 3 dB [[noise figure]] (NF)). Although LNAs are primarily concerned with weak signals that are just above the [[noise floor]], they must also consider the presence of larger signals that cause [[intermodulation distortion]].","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
142 |
What is the meaning of life?,"[
|
143 |
{""persona"": ""Philosopher"", ""prompt"": ""Explore the concept of life's meaning through the lens of existentialism and purpose-driven existence.""},
|
144 |
{""persona"": ""Scientist"", ""prompt"": ""Examine the biological and evolutionary perspectives on the function and significance of life.""},
|
145 |
{""persona"": ""Child"", ""prompt"": ""Imagine you're explaining to a curious 7-year-old what makes life special and important.""}
|
146 |
+
]","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
147 |
"<?php
|
148 |
$username = $_POST['username'];
|
149 |
$password = $_POST['password'];
|
|
|
175 |
- Implement proper input sanitization
|
176 |
- Use secure password hashing algorithms
|
177 |
- Add CSRF tokens to forms
|
178 |
+
- Consider using a secure authentication library","",""
|
179 |
+
“老爸,老爸,我们去哪里呀?”,《爸爸去哪儿》,"",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
180 |
"Add a transition sentence at the end of this passage, but also include a table of all the invited guests, their age, and their relationship to Jack. Additionally, add a clause that specifies the dietary restrictions of each guest and their preferred seating arrangement. Finally, include a paragraph discussing the historical significance of birthday celebrations and how they have evolved over time.
|
181 |
Jack was so excited for his birthday party. He had spent months planning it. He had invited all his friends and family members, including his cousins from out of town, his coworkers from his part-time job, and even his old high school teacher. Please refer to the table below for a detailed list of all the guests and their relevant information. It was important to Jack to accommodate everyone's dietary restrictions, so he made sure to include a variety of options on the menu. Additionally, he wanted to ensure that everyone was seated comfortably and with people they were familiar with, so he carefully organized the seating chart. Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. Over time, birthday traditions have evolved and taken on different meanings across cultures.","Please see the table below for a list of all the invited guests, their age, and their relationship to Jack, as well as their dietary restrictions and preferred seating arrangement:
|
182 |
| Name | Age | Relationship to Jack | Dietary Restrictions | Preferred Seating |
|
|
|
189 |
| Michael | 50 | Uncle | None | With family |
|
190 |
| Lisa | 27 | Friend | None | Near Emily |
|
191 |
Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. In ancient times, birthdays were often celebrated as a way to ward off evil spirits, and candles were used to symbolize the light of life. Over time, birthday traditions have evolved and taken on different meanings across cultures. Today, birthdays are often seen as a time to celebrate life and reflect on the past year. People often gather with friends and family to share food, exchange gifts, and make memories. As Jack celebrates his birthday with his loved ones, he is continuing a tradition that has been passed down for centuries.
|
192 |
+
","",
|
|
|
|
|
193 |
"Add a transition sentence at the end of this passage, but also include a table of all the invited guests, their age, and their relationship to Jack. Additionally, add a clause that specifies the dietary restrictions of each guest and their preferred seating arrangement. Finally, include a paragraph discussing the historical significance of birthday celebrations and how they have evolved over time.
|
194 |
Jack was so excited for his birthday party. He had spent months planning it. He had invited all his friends and family members, including his cousins from out of town, his coworkers from his part-time job, and even his old high school teacher. Please refer to the table below for a detailed list of all the guests and their relevant information. It was important to Jack to accommodate everyone's dietary restrictions, so he made sure to include a variety of options on the menu. Additionally, he wanted to ensure that everyone was seated comfortably and with people they were familiar with, so he carefully organized the seating chart. Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. Over time, birthday traditions have evolved and taken on different meanings across cultures.","Please see the table below for a list of all the invited guests, their age, and their relationship to Jack, as well as their dietary restrictions and preferred seating arrangement:
|
195 |
| Name | Age | Relationship to Jack | Dietary Restrictions | Preferred Seating |
|
|
|
202 |
| Michael | 50 | Uncle | None | With family |
|
203 |
| Lisa | 27 | Friend | None | Near Emily |
|
204 |
Birthday celebrations have a long and rich history, dating back to ancient civilizations such as the Egyptians and Greeks. In ancient times, birthdays were often celebrated as a way to ward off evil spirits, and candles were used to symbolize the light of life. Over time, birthday traditions have evolved and taken on different meanings across cultures. Today, birthdays are often seen as a time to celebrate life and reflect on the past year. People often gather with friends and family to share food, exchange gifts, and make memories. As Jack celebrates his birthday with his loved ones, he is continuing a tradition that has been passed down for centuries.
|
205 |
+
","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
206 |
"Can you name at least 5 animals that can produce sounds resembling barks? Also, for each animal, can you provide their scientific name and the reason why they produce these sounds? Remember, not all animals communicate in the same way as dogs do.","| Animal | Scientific Name | Reason for Barking-like Sound |
|
207 |
|--------|----------------|-------------------------------|
|
208 |
| Arctic Fox | Vulpes lagopus | Arctic foxes bark to communicate with each other, especially during mating season or when they are threatened. |
|
209 |
| Prairie Dogs | Cynomys spp. | Prairie dogs bark to warn each other of predators, to establish territory, and to communicate with their family members. |
|
210 |
| Seals | Pinnipedia spp. | Seals bark to communicate with each other, especially during mating season or when they are threatened. |
|
211 |
| Deer | Odocoileus spp. | Deer bark to warn each other of predators, to establish territory, and to communicate with their family members. |
|
212 |
+
| Baboons | Papio spp. | Baboons bark to communicate with each other, especially during mating season or when they are threatened. |","",""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
213 |
"Suppose you have a set of triangle measurements as shown below:
|
214 |
|
215 |
| Base (cm) | Height (cm) |
|
|
|
234 |
| 10 | 12 | 60 |
|
235 |
| 7 | 9 | 31.5 |
|
236 |
|
237 |
+
Therefore, the area of each triangle is 15 cm^2, 6 cm^2, 8 cm^2, 60 cm^2, and 31.5 cm^2, respectively.","",""
|
238 |
+
"If it takes 8 bits to make a byte, how many bits are there in a kilobyte?","There are 8,192 bits in a kilobyte. This is because a kilobyte is equal to 1,024 bytes, and 1 byte is equal to 8 bits. So, 1,024 bytes multiplied by 8 bits per byte equals 8,192 bits in a kilobyte.",,
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
239 |
"Suppose you have a set of triangle measurements as shown below:
|
240 |
|
241 |
| Base (cm) | Height (cm) |
|
|
|
252 |
| 3 | 4 | 6 |
|
253 |
| 8 | 2 | 8 |
|
254 |
| 10 | 12 | 60 |
|
255 |
+
| 7 | 9 | 31.5 |",,
|
256 |
+
"将12.0克镁条完全燃烧,生成氧化镁。计算:
|
257 |
+
(1)反应生成氧化镁的质量。
|
258 |
+
(2)反应消耗氧气的体积(标准状况下)。
|
259 |
+
(已知:Mg的相对原子质量为24,O的相对原子质量为16)","解:
|
260 |
+
(1) 计算生成的氧化镁质量
|
261 |
+
化学方程式:2Mg + O₂ = 2MgO
|
262 |
+
根据化学方程式,2摩尔Mg生成2摩尔MgO
|
263 |
+
n(Mg) = 12.0g ÷ 24g/mol = 0.5mol
|
264 |
+
n(MgO) = n(Mg) = 0.5mol
|
265 |
+
m(MgO) = 0.5mol × (24 + 16)g/mol = 20.0g
|
266 |
+
(2) 计算消耗的氧气体积
|
267 |
+
根据方程式,2摩尔Mg消耗1摩尔O₂
|
268 |
+
n(O₂) = 1/2 × n(Mg) = 1/2 × 0.5mol = 0.25mol
|
269 |
+
标准状况下,1摩尔气体的体积为22.4L
|
270 |
+
V(O₂) = 0.25mol × 22.4L/mol = 5.6L
|
271 |
+
|
272 |
+
答:
|
273 |
+
(1) 生成氧化镁的质量为20.0g
|
274 |
+
(2) 消耗氧气的体积为5.6L(标准状况)","",
|
275 |
+
"解下列方程:
|
276 |
+
(2x + 1)(x - 3) = x^2 + 5x - 2","解:
|
277 |
+
|
278 |
+
展开等式左边:
|
279 |
+
(2x + 1)(x - 3) = 2x^2 - 6x + x - 3 = 2x^2 - 5x - 3
|
280 |
+
等式两边相等,得:
|
281 |
+
2x^2 - 5x - 3 = x^2 + 5x - 2
|
282 |
+
移项,将所有项移到等式左边:
|
283 |
+
2x^2 - 5x - 3 - (x^2 + 5x - 2) = 0
|
284 |
+
x^2 - 10x - 1 = 0
|
285 |
+
使用求根公式 x = [-b ± √(b^2 - 4ac)] / (2a)
|
286 |
+
其中 a = 1, b = -10, c = -1
|
287 |
+
x = [10 ± √((-10)^2 - 4(1)(-1))] / (2(1))
|
288 |
+
x = [10 ± √(100 + 4)] / 2
|
289 |
+
x = (10 ± √104) / 2
|
290 |
+
得到两个解:
|
291 |
+
x₁ = (10 + √104) / 2 ≈ 5 + √26 ≈ 10.1
|
292 |
+
x₂ = (10 - √104) / 2 ≈ 5 - √26 ≈ -0.1
|
293 |
+
|
294 |
+
因此,方程的解为: x₁ ≈ 10.1 或 x₂ ≈ -0.1","",
|
app/gradio_meta_prompt.py
CHANGED
@@ -486,8 +486,10 @@ def process_message(user_message: str, expected_output: str,
|
|
486 |
'best_output', "Error: The output state does not contain a valid 'best_output'")
|
487 |
analysis = output_state.get(
|
488 |
'analysis', "Error: The output state does not contain a valid 'analysis'")
|
|
|
|
|
489 |
|
490 |
-
return (system_message, output, analysis, chat_log_2_chatbot_list(log_output))
|
491 |
|
492 |
|
493 |
def initialize_llm(model_name: str, model_config: Optional[Dict[str, Any]] = None) -> Any:
|
@@ -674,51 +676,55 @@ with gr.Blocks(title='Meta Prompt') as demo:
|
|
674 |
label="Expected Output",
|
675 |
show_copy_button=True
|
676 |
)
|
677 |
-
with gr.
|
678 |
-
|
679 |
-
|
680 |
-
|
681 |
-
|
682 |
-
|
683 |
-
|
684 |
-
|
685 |
-
|
686 |
-
|
687 |
-
|
688 |
-
|
689 |
-
|
690 |
-
|
691 |
-
|
692 |
-
|
693 |
-
|
694 |
-
|
695 |
-
|
|
|
|
|
|
|
696 |
)
|
697 |
-
|
698 |
value="Generate",
|
699 |
variant="secondary"
|
700 |
)
|
701 |
-
|
702 |
-
|
703 |
-
|
704 |
-
|
705 |
-
|
706 |
-
|
707 |
-
|
708 |
-
|
709 |
-
|
710 |
-
|
711 |
-
|
712 |
-
|
713 |
-
|
714 |
-
|
715 |
-
|
716 |
-
|
717 |
-
|
718 |
-
|
719 |
-
|
720 |
-
|
721 |
-
|
|
|
722 |
with gr.Row():
|
723 |
with gr.Tabs() as llm_tabs:
|
724 |
with gr.Tab('Simple') as simple_llm_tab:
|
@@ -939,6 +945,7 @@ with gr.Blocks(title='Meta Prompt') as demo:
|
|
939 |
system_message_output,
|
940 |
output_output,
|
941 |
analysis_output,
|
|
|
942 |
logs_chatbot
|
943 |
]
|
944 |
)
|
@@ -960,6 +967,7 @@ with gr.Blocks(title='Meta Prompt') as demo:
|
|
960 |
system_message_output,
|
961 |
output_output,
|
962 |
analysis_output,
|
|
|
963 |
logs_chatbot
|
964 |
]
|
965 |
)
|
@@ -986,6 +994,7 @@ with gr.Blocks(title='Meta Prompt') as demo:
|
|
986 |
system_message_output,
|
987 |
output_output,
|
988 |
analysis_output,
|
|
|
989 |
logs_chatbot
|
990 |
]
|
991 |
)
|
|
|
486 |
'best_output', "Error: The output state does not contain a valid 'best_output'")
|
487 |
analysis = output_state.get(
|
488 |
'analysis', "Error: The output state does not contain a valid 'analysis'")
|
489 |
+
acceptance_criteria = output_state.get(
|
490 |
+
'acceptance_criteria', "Error: The output state does not contain a valid 'acceptance_criteria'")
|
491 |
|
492 |
+
return (system_message, output, analysis, acceptance_criteria, chat_log_2_chatbot_list(log_output))
|
493 |
|
494 |
|
495 |
def initialize_llm(model_name: str, model_config: Optional[Dict[str, Any]] = None) -> Any:
|
|
|
676 |
label="Expected Output",
|
677 |
show_copy_button=True
|
678 |
)
|
679 |
+
with gr.Accordion("Initial System Message & Acceptance Criteria", open=False):
|
680 |
+
|
681 |
+
with gr.Group():
|
682 |
+
initial_system_message_input = gr.Textbox(
|
683 |
+
label="Initial System Message",
|
684 |
+
show_copy_button=True,
|
685 |
+
value=""
|
686 |
+
)
|
687 |
+
with gr.Row():
|
688 |
+
evaluate_initial_system_message_button = gr.Button(
|
689 |
+
value="Evaluate",
|
690 |
+
variant="secondary"
|
691 |
+
)
|
692 |
+
generate_initial_system_message_button = gr.Button(
|
693 |
+
value="Generate",
|
694 |
+
variant="secondary"
|
695 |
+
)
|
696 |
+
|
697 |
+
with gr.Group():
|
698 |
+
acceptance_criteria_input = gr.Textbox(
|
699 |
+
label="Acceptance Criteria (Compared with Expected Output [EO])",
|
700 |
+
show_copy_button=True
|
701 |
)
|
702 |
+
generate_acceptance_criteria_button = gr.Button(
|
703 |
value="Generate",
|
704 |
variant="secondary"
|
705 |
)
|
706 |
+
|
707 |
+
recursion_limit_input = gr.Number(
|
708 |
+
label="Recursion Limit",
|
709 |
+
value=config.recursion_limit,
|
710 |
+
precision=0,
|
711 |
+
minimum=1,
|
712 |
+
maximum=config.recursion_limit_max,
|
713 |
+
step=1
|
714 |
+
)
|
715 |
+
max_output_age = gr.Number(
|
716 |
+
label="Max Output Age",
|
717 |
+
value=config.max_output_age,
|
718 |
+
precision=0,
|
719 |
+
minimum=1,
|
720 |
+
maximum=config.max_output_age_max,
|
721 |
+
step=1
|
722 |
+
)
|
723 |
+
prompt_template_group = gr.Dropdown(
|
724 |
+
label="Prompt Template Group",
|
725 |
+
choices=list(config.prompt_templates.keys()),
|
726 |
+
value=list(config.prompt_templates.keys())[0]
|
727 |
+
)
|
728 |
with gr.Row():
|
729 |
with gr.Tabs() as llm_tabs:
|
730 |
with gr.Tab('Simple') as simple_llm_tab:
|
|
|
945 |
system_message_output,
|
946 |
output_output,
|
947 |
analysis_output,
|
948 |
+
acceptance_criteria_input,
|
949 |
logs_chatbot
|
950 |
]
|
951 |
)
|
|
|
967 |
system_message_output,
|
968 |
output_output,
|
969 |
analysis_output,
|
970 |
+
acceptance_criteria_input,
|
971 |
logs_chatbot
|
972 |
]
|
973 |
)
|
|
|
994 |
system_message_output,
|
995 |
output_output,
|
996 |
analysis_output,
|
997 |
+
acceptance_criteria_input,
|
998 |
logs_chatbot
|
999 |
]
|
1000 |
)
|
config.yml
CHANGED
@@ -77,232 +77,14 @@ llms:
|
|
77 |
examples_path: "app/examples"
|
78 |
server_name: 0.0.0.0
|
79 |
# server_port: 7860
|
80 |
-
recursion_limit:
|
81 |
-
recursion_limit_max:
|
82 |
max_output_age: 2
|
83 |
allow_flagging: false
|
84 |
# verbose: false
|
85 |
|
86 |
prompt_templates:
|
87 |
-
default:
|
88 |
-
prompt_initial_developer:
|
89 |
-
- role: system
|
90 |
-
message: |
|
91 |
-
# Expert Prompt Engineer
|
92 |
-
|
93 |
-
You are an expert prompt engineer tasked with creating system messages for AI assistants.
|
94 |
-
|
95 |
-
## Instructions
|
96 |
-
|
97 |
-
1. Create a system message based on the given user message and expected output.
|
98 |
-
2. Ensure the system message can handle similar user messages.
|
99 |
-
3. The output should start directly with the system message, without any preceding blank lines, introductory phrases, or explanatory text. Do not include extra lines at the beginning or end of the output.
|
100 |
-
4. Expected Output text should not appear in System Message as an example. But it's OK to use some similar text as an example instead.
|
101 |
-
5. In the System Message, do not use `Expected Output` to refer to the example you want to illustrate. Instead, directly describe the specific features you need.
|
102 |
-
6. Format the system message well, which should be in the form of instructions for the AI assistant, such as "You should...". Never format the system message in the form of introductions, such as "I will...".
|
103 |
-
|
104 |
-
## Output
|
105 |
-
|
106 |
-
Provide only the system message, adhering to the above guidelines.
|
107 |
-
- role: human
|
108 |
-
message: |
|
109 |
-
# User Message
|
110 |
-
|
111 |
-
{user_message}
|
112 |
-
|
113 |
-
# Expected Output
|
114 |
-
|
115 |
-
{expected_output}
|
116 |
-
|
117 |
-
# System Message
|
118 |
-
|
119 |
-
prompt_developer:
|
120 |
-
- role: system
|
121 |
-
message: |
|
122 |
-
# Expert Prompt Engineer
|
123 |
-
|
124 |
-
You are an expert prompt engineer tasked with updating system messages for AI assistants. You Update System Message according to Suggestions, to improve Output and match Expected Output more closely.
|
125 |
-
|
126 |
-
## Instructions
|
127 |
-
|
128 |
-
1. Update the system message based on the given Suggestion, User Message, and Expected Output.
|
129 |
-
2. Ensure the updated system message can handle similar user messages.
|
130 |
-
3. Modify only the content mentioned in the Suggestion. Do not change the parts that are not related to the Suggestion.
|
131 |
-
4. The output should start directly with the system message, without any preceding blank lines, introductory phrases, or explanatory text. Do not include extra lines at the beginning or end of the output.
|
132 |
-
5. Avoiding the behavior should be explicitly requested (e.g. `Don't ...`) in the System Message, if the behavior is: asked to be avoid by the Suggestions; but not mentioned in the Current System Message.
|
133 |
-
6. Expected Output text should not appear in System Message as an example. But it's OK to use some similar text as an example instead.
|
134 |
-
7. In the System Message, do not use `Expected Output` to refer to the example you want to illustrate. Instead, directly describe the specific features you need.
|
135 |
-
8. Remove the Expected Output text or text highly similar to Expected Output from System Message, if it's present.
|
136 |
-
9. Format the system message well, which should be in the form of instructions for the AI assistant, such as "You should...". Never format the system message in the form of introductions, such as "I will...".
|
137 |
-
|
138 |
-
## Output
|
139 |
-
|
140 |
-
Provide only the updated System Message, adhering to the above guidelines.
|
141 |
-
- role: human
|
142 |
-
message: |
|
143 |
-
# Current System Message
|
144 |
-
|
145 |
-
{system_message}
|
146 |
-
|
147 |
-
# User Message
|
148 |
-
|
149 |
-
{user_message}
|
150 |
-
|
151 |
-
# Expected Output
|
152 |
-
|
153 |
-
{expected_output}
|
154 |
-
|
155 |
-
# Suggestions
|
156 |
-
|
157 |
-
{suggestions}
|
158 |
-
|
159 |
-
# Updated System Message
|
160 |
-
|
161 |
-
prompt_executor:
|
162 |
-
- role: system
|
163 |
-
message: "{system_message}"
|
164 |
-
- role: human
|
165 |
-
message: "{user_message}"
|
166 |
-
|
167 |
-
output_history_analyzer:
|
168 |
-
- role: system
|
169 |
-
message: |
|
170 |
-
You are a text comparing program. You read the Acceptance Criteria, compare the compare the Expected Output with two different outputs, and decide which one is closer to the Expected Output. When comparing the outputs, ignore the differences which are acceptable or ignorable according to the Acceptance Criteria.
|
171 |
-
|
172 |
-
You output the following analysis according to the Acceptance Criteria:
|
173 |
-
|
174 |
-
* Your analysis in a Markdown list.
|
175 |
-
* Indicates an output ID that is closer to the Expected Output, in the following format:
|
176 |
-
|
177 |
-
```
|
178 |
-
# Analysis
|
179 |
-
|
180 |
-
...
|
181 |
-
|
182 |
-
# Output ID closer to Expected Output: [ID]
|
183 |
-
```
|
184 |
-
|
185 |
-
You must choose one of the two outputs. If both outputs are exactly the same, output the following:
|
186 |
-
|
187 |
-
```
|
188 |
-
# Analysis
|
189 |
-
|
190 |
-
...
|
191 |
-
|
192 |
-
# Draw
|
193 |
-
```
|
194 |
-
- role: human
|
195 |
-
message: |
|
196 |
-
# Output ID: A
|
197 |
-
|
198 |
-
```
|
199 |
-
{best_output}
|
200 |
-
```
|
201 |
-
|
202 |
-
# Output ID: B
|
203 |
-
|
204 |
-
```
|
205 |
-
{output}
|
206 |
-
```
|
207 |
-
|
208 |
-
# Acceptance Criteria
|
209 |
-
|
210 |
-
Compared with Expected Output [EO]:
|
211 |
-
{acceptance_criteria}
|
212 |
-
|
213 |
-
# Expected Output
|
214 |
-
|
215 |
-
```
|
216 |
-
{expected_output}
|
217 |
-
```
|
218 |
-
|
219 |
-
prompt_analyzer:
|
220 |
-
- role: system
|
221 |
-
message: |
|
222 |
-
You are a text comparing program. You compare the following output texts, analysis the System Message and provide a detailed analysis according to [`Acceptance Criteria`]. Then you decide whether [`Actual Output`] is acceptable.
|
223 |
-
|
224 |
-
Provide your analysis in the following format:
|
225 |
-
|
226 |
-
```
|
227 |
-
- Acceptable Differences: [List acceptable differences succinctly]
|
228 |
-
- Unacceptable Differences: [List unacceptable differences succinctly]
|
229 |
-
- Accept: [Yes/No]
|
230 |
-
```
|
231 |
-
|
232 |
-
* Compare Expected Output and Actual Output with the guidance of Accept Criteria.
|
233 |
-
* Only set 'Accept' to 'Yes', if Accept Criteria are all met. Otherwise, set 'Accept' to 'No'.
|
234 |
-
* List only the acceptable differences according to Accept Criteria in 'acceptable Differences' section.
|
235 |
-
* List only the unacceptable differences according to Accept Criteria in 'Unacceptable Differences' section.
|
236 |
-
|
237 |
-
# Acceptance Criteria
|
238 |
-
|
239 |
-
Compared with Expected Output [EO]:
|
240 |
-
```
|
241 |
-
{acceptance_criteria}
|
242 |
-
```
|
243 |
-
- role: human
|
244 |
-
message: |
|
245 |
-
# System Message
|
246 |
-
|
247 |
-
```
|
248 |
-
{system_message}
|
249 |
-
```
|
250 |
-
|
251 |
-
# Expected Output
|
252 |
-
|
253 |
-
```
|
254 |
-
{expected_output}
|
255 |
-
```
|
256 |
-
|
257 |
-
# Actual Output
|
258 |
-
|
259 |
-
```
|
260 |
-
{output}
|
261 |
-
```
|
262 |
-
|
263 |
-
prompt_suggester:
|
264 |
-
- role: system
|
265 |
-
message: |
|
266 |
-
Read the following inputs and outputs of an LLM prompt, and also analysis about them. Then suggest how to improve System Message.
|
267 |
|
268 |
-
* The goal is to improve the System Message to match the Expected Output better.
|
269 |
-
* Ignore all Acceptable Differences and focus on Unacceptable Differences.
|
270 |
-
* Suggest formal changes first, then semantic changes.
|
271 |
-
* Provide your suggestions in a Markdown list, nothing else. Output only the suggestions related with Unacceptable Differences.
|
272 |
-
* Start every suggestion with [`The System Message should ...`].
|
273 |
-
* Figue out the contexts of the System Message that conflict with the suggestions, and suggest modification or deletion.
|
274 |
-
* While the Expected Output won't be shown to the prompt developer who will read your suggestions, do not simply describe the output as being the same/similar/different from the Expected Output, such as [`the output should not use a different format and style compared to the Expected Output`] or [`the output should match the expected output exactly`]; instead, describe the expected characteristics specifically and suggest a detailed example.
|
275 |
-
* Avoiding the behavior should be explicitly requested (e.g. [`The System Message should explicitly state that the output shoud not ...`]) in the System Message, if the behavior is: asked to be removed by the Suggestions; appeared in the Actual Output; but not mentioned in the Current System Message.
|
276 |
-
* Expected Output text should not appear in System Message as an example. But it's OK to use some similar but distinct text as an example instead.
|
277 |
-
* Ask to remove the Expected Output text or text highly similar to Expected Output from System Message, if it's present.
|
278 |
-
* Provide format examples (but don't use Expected Output text as the example) or detected format name, if System Message does not.
|
279 |
-
* Specify the detected format name (e.g. XML, JSON, etc.) of Expected Output, if System Message does not mention it.
|
280 |
-
- role: human
|
281 |
-
message: |
|
282 |
-
<|Start_System_Message|>
|
283 |
-
{system_message}
|
284 |
-
<|End_System_Message|>
|
285 |
-
|
286 |
-
<|Start_User_Message|>
|
287 |
-
{user_message}
|
288 |
-
<|End_User_Message|>
|
289 |
-
|
290 |
-
<|Start_Expected_Output|>
|
291 |
-
{expected_output}
|
292 |
-
<|End_Expected_Output|>
|
293 |
-
|
294 |
-
<|Start_Actual_Output|>
|
295 |
-
{output}
|
296 |
-
<|End_Actual_Output|>
|
297 |
-
|
298 |
-
<|Start_Acceptance Criteria|>
|
299 |
-
Compared with Expected Output [EO]:
|
300 |
-
{acceptance_criteria}
|
301 |
-
<|End_Acceptance Criteria|>
|
302 |
-
|
303 |
-
<|Start_Analysis|>
|
304 |
-
{analysis}
|
305 |
-
<|End_Analysis|>
|
306 |
gpt:
|
307 |
prompt_initial_developer:
|
308 |
- role: system
|
@@ -568,6 +350,7 @@ prompt_templates:
|
|
568 |
<|Start_Analysis|>
|
569 |
{analysis}
|
570 |
<|End_Analysis|>
|
|
|
571 |
sonnet:
|
572 |
prompt_initial_developer:
|
573 |
- role: system
|
@@ -914,6 +697,7 @@ prompt_templates:
|
|
914 |
<|Start_Analysis|>
|
915 |
{analysis}
|
916 |
<|End_Analysis|>
|
|
|
917 |
merged:
|
918 |
prompt_initial_developer:
|
919 |
- role: system
|
@@ -1218,4 +1002,224 @@ prompt_templates:
|
|
1218 |
|
1219 |
<|Start_Analysis|>
|
1220 |
{analysis}
|
1221 |
-
<|End_Analysis|>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
77 |
examples_path: "app/examples"
|
78 |
server_name: 0.0.0.0
|
79 |
# server_port: 7860
|
80 |
+
recursion_limit: 20
|
81 |
+
recursion_limit_max: 25
|
82 |
max_output_age: 2
|
83 |
allow_flagging: false
|
84 |
# verbose: false
|
85 |
|
86 |
prompt_templates:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
87 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
88 |
gpt:
|
89 |
prompt_initial_developer:
|
90 |
- role: system
|
|
|
350 |
<|Start_Analysis|>
|
351 |
{analysis}
|
352 |
<|End_Analysis|>
|
353 |
+
|
354 |
sonnet:
|
355 |
prompt_initial_developer:
|
356 |
- role: system
|
|
|
697 |
<|Start_Analysis|>
|
698 |
{analysis}
|
699 |
<|End_Analysis|>
|
700 |
+
|
701 |
merged:
|
702 |
prompt_initial_developer:
|
703 |
- role: system
|
|
|
1002 |
|
1003 |
<|Start_Analysis|>
|
1004 |
{analysis}
|
1005 |
+
<|End_Analysis|>
|
1006 |
+
|
1007 |
+
deprecated:
|
1008 |
+
prompt_initial_developer:
|
1009 |
+
- role: system
|
1010 |
+
message: |
|
1011 |
+
# Expert Prompt Engineer
|
1012 |
+
|
1013 |
+
You are an expert prompt engineer tasked with creating system messages for AI assistants.
|
1014 |
+
|
1015 |
+
## Instructions
|
1016 |
+
|
1017 |
+
1. Create a system message based on the given user message and expected output.
|
1018 |
+
2. Ensure the system message can handle similar user messages.
|
1019 |
+
3. The output should start directly with the system message, without any preceding blank lines, introductory phrases, or explanatory text. Do not include extra lines at the beginning or end of the output.
|
1020 |
+
4. Expected Output text should not appear in System Message as an example. But it's OK to use some similar text as an example instead.
|
1021 |
+
5. In the System Message, do not use `Expected Output` to refer to the example you want to illustrate. Instead, directly describe the specific features you need.
|
1022 |
+
6. Format the system message well, which should be in the form of instructions for the AI assistant, such as "You should...". Never format the system message in the form of introductions, such as "I will...".
|
1023 |
+
|
1024 |
+
## Output
|
1025 |
+
|
1026 |
+
Provide only the system message, adhering to the above guidelines.
|
1027 |
+
- role: human
|
1028 |
+
message: |
|
1029 |
+
# User Message
|
1030 |
+
|
1031 |
+
{user_message}
|
1032 |
+
|
1033 |
+
# Expected Output
|
1034 |
+
|
1035 |
+
{expected_output}
|
1036 |
+
|
1037 |
+
# System Message
|
1038 |
+
|
1039 |
+
prompt_developer:
|
1040 |
+
- role: system
|
1041 |
+
message: |
|
1042 |
+
# Expert Prompt Engineer
|
1043 |
+
|
1044 |
+
You are an expert prompt engineer tasked with updating system messages for AI assistants. You Update System Message according to Suggestions, to improve Output and match Expected Output more closely.
|
1045 |
+
|
1046 |
+
## Instructions
|
1047 |
+
|
1048 |
+
1. Update the system message based on the given Suggestion, User Message, and Expected Output.
|
1049 |
+
2. Ensure the updated system message can handle similar user messages.
|
1050 |
+
3. Modify only the content mentioned in the Suggestion. Do not change the parts that are not related to the Suggestion.
|
1051 |
+
4. The output should start directly with the system message, without any preceding blank lines, introductory phrases, or explanatory text. Do not include extra lines at the beginning or end of the output.
|
1052 |
+
5. Avoiding the behavior should be explicitly requested (e.g. `Don't ...`) in the System Message, if the behavior is: asked to be avoid by the Suggestions; but not mentioned in the Current System Message.
|
1053 |
+
6. Expected Output text should not appear in System Message as an example. But it's OK to use some similar text as an example instead.
|
1054 |
+
7. In the System Message, do not use `Expected Output` to refer to the example you want to illustrate. Instead, directly describe the specific features you need.
|
1055 |
+
8. Remove the Expected Output text or text highly similar to Expected Output from System Message, if it's present.
|
1056 |
+
9. Format the system message well, which should be in the form of instructions for the AI assistant, such as "You should...". Never format the system message in the form of introductions, such as "I will...".
|
1057 |
+
|
1058 |
+
## Output
|
1059 |
+
|
1060 |
+
Provide only the updated System Message, adhering to the above guidelines.
|
1061 |
+
- role: human
|
1062 |
+
message: |
|
1063 |
+
# Current System Message
|
1064 |
+
|
1065 |
+
{system_message}
|
1066 |
+
|
1067 |
+
# User Message
|
1068 |
+
|
1069 |
+
{user_message}
|
1070 |
+
|
1071 |
+
# Expected Output
|
1072 |
+
|
1073 |
+
{expected_output}
|
1074 |
+
|
1075 |
+
# Suggestions
|
1076 |
+
|
1077 |
+
{suggestions}
|
1078 |
+
|
1079 |
+
# Updated System Message
|
1080 |
+
|
1081 |
+
prompt_executor:
|
1082 |
+
- role: system
|
1083 |
+
message: "{system_message}"
|
1084 |
+
- role: human
|
1085 |
+
message: "{user_message}"
|
1086 |
+
|
1087 |
+
output_history_analyzer:
|
1088 |
+
- role: system
|
1089 |
+
message: |
|
1090 |
+
You are a text comparing program. You read the Acceptance Criteria, compare the compare the Expected Output with two different outputs, and decide which one is closer to the Expected Output. When comparing the outputs, ignore the differences which are acceptable or ignorable according to the Acceptance Criteria.
|
1091 |
+
|
1092 |
+
You output the following analysis according to the Acceptance Criteria:
|
1093 |
+
|
1094 |
+
* Your analysis in a Markdown list.
|
1095 |
+
* Indicates an output ID that is closer to the Expected Output, in the following format:
|
1096 |
+
|
1097 |
+
```
|
1098 |
+
# Analysis
|
1099 |
+
|
1100 |
+
...
|
1101 |
+
|
1102 |
+
# Output ID closer to Expected Output: [ID]
|
1103 |
+
```
|
1104 |
+
|
1105 |
+
You must choose one of the two outputs. If both outputs are exactly the same, output the following:
|
1106 |
+
|
1107 |
+
```
|
1108 |
+
# Analysis
|
1109 |
+
|
1110 |
+
...
|
1111 |
+
|
1112 |
+
# Draw
|
1113 |
+
```
|
1114 |
+
- role: human
|
1115 |
+
message: |
|
1116 |
+
# Output ID: A
|
1117 |
+
|
1118 |
+
```
|
1119 |
+
{best_output}
|
1120 |
+
```
|
1121 |
+
|
1122 |
+
# Output ID: B
|
1123 |
+
|
1124 |
+
```
|
1125 |
+
{output}
|
1126 |
+
```
|
1127 |
+
|
1128 |
+
# Acceptance Criteria
|
1129 |
+
|
1130 |
+
Compared with Expected Output [EO]:
|
1131 |
+
{acceptance_criteria}
|
1132 |
+
|
1133 |
+
# Expected Output
|
1134 |
+
|
1135 |
+
```
|
1136 |
+
{expected_output}
|
1137 |
+
```
|
1138 |
+
|
1139 |
+
prompt_analyzer:
|
1140 |
+
- role: system
|
1141 |
+
message: |
|
1142 |
+
You are a text comparing program. You compare the following output texts, analysis the System Message and provide a detailed analysis according to [`Acceptance Criteria`]. Then you decide whether [`Actual Output`] is acceptable.
|
1143 |
+
|
1144 |
+
Provide your analysis in the following format:
|
1145 |
+
|
1146 |
+
```
|
1147 |
+
- Acceptable Differences: [List acceptable differences succinctly]
|
1148 |
+
- Unacceptable Differences: [List unacceptable differences succinctly]
|
1149 |
+
- Accept: [Yes/No]
|
1150 |
+
```
|
1151 |
+
|
1152 |
+
* Compare Expected Output and Actual Output with the guidance of Accept Criteria.
|
1153 |
+
* Only set 'Accept' to 'Yes', if Accept Criteria are all met. Otherwise, set 'Accept' to 'No'.
|
1154 |
+
* List only the acceptable differences according to Accept Criteria in 'acceptable Differences' section.
|
1155 |
+
* List only the unacceptable differences according to Accept Criteria in 'Unacceptable Differences' section.
|
1156 |
+
|
1157 |
+
# Acceptance Criteria
|
1158 |
+
|
1159 |
+
Compared with Expected Output [EO]:
|
1160 |
+
```
|
1161 |
+
{acceptance_criteria}
|
1162 |
+
```
|
1163 |
+
- role: human
|
1164 |
+
message: |
|
1165 |
+
# System Message
|
1166 |
+
|
1167 |
+
```
|
1168 |
+
{system_message}
|
1169 |
+
```
|
1170 |
+
|
1171 |
+
# Expected Output
|
1172 |
+
|
1173 |
+
```
|
1174 |
+
{expected_output}
|
1175 |
+
```
|
1176 |
+
|
1177 |
+
# Actual Output
|
1178 |
+
|
1179 |
+
```
|
1180 |
+
{output}
|
1181 |
+
```
|
1182 |
+
|
1183 |
+
prompt_suggester:
|
1184 |
+
- role: system
|
1185 |
+
message: |
|
1186 |
+
Read the following inputs and outputs of an LLM prompt, and also analysis about them. Then suggest how to improve System Message.
|
1187 |
+
|
1188 |
+
* The goal is to improve the System Message to match the Expected Output better.
|
1189 |
+
* Ignore all Acceptable Differences and focus on Unacceptable Differences.
|
1190 |
+
* Suggest formal changes first, then semantic changes.
|
1191 |
+
* Provide your suggestions in a Markdown list, nothing else. Output only the suggestions related with Unacceptable Differences.
|
1192 |
+
* Start every suggestion with [`The System Message should ...`].
|
1193 |
+
* Figue out the contexts of the System Message that conflict with the suggestions, and suggest modification or deletion.
|
1194 |
+
* While the Expected Output won't be shown to the prompt developer who will read your suggestions, do not simply describe the output as being the same/similar/different from the Expected Output, such as [`the output should not use a different format and style compared to the Expected Output`] or [`the output should match the expected output exactly`]; instead, describe the expected characteristics specifically and suggest a detailed example.
|
1195 |
+
* Avoiding the behavior should be explicitly requested (e.g. [`The System Message should explicitly state that the output shoud not ...`]) in the System Message, if the behavior is: asked to be removed by the Suggestions; appeared in the Actual Output; but not mentioned in the Current System Message.
|
1196 |
+
* Expected Output text should not appear in System Message as an example. But it's OK to use some similar but distinct text as an example instead.
|
1197 |
+
* Ask to remove the Expected Output text or text highly similar to Expected Output from System Message, if it's present.
|
1198 |
+
* Provide format examples (but don't use Expected Output text as the example) or detected format name, if System Message does not.
|
1199 |
+
* Specify the detected format name (e.g. XML, JSON, etc.) of Expected Output, if System Message does not mention it.
|
1200 |
+
- role: human
|
1201 |
+
message: |
|
1202 |
+
<|Start_System_Message|>
|
1203 |
+
{system_message}
|
1204 |
+
<|End_System_Message|>
|
1205 |
+
|
1206 |
+
<|Start_User_Message|>
|
1207 |
+
{user_message}
|
1208 |
+
<|End_User_Message|>
|
1209 |
+
|
1210 |
+
<|Start_Expected_Output|>
|
1211 |
+
{expected_output}
|
1212 |
+
<|End_Expected_Output|>
|
1213 |
+
|
1214 |
+
<|Start_Actual_Output|>
|
1215 |
+
{output}
|
1216 |
+
<|End_Actual_Output|>
|
1217 |
+
|
1218 |
+
<|Start_Acceptance Criteria|>
|
1219 |
+
Compared with Expected Output [EO]:
|
1220 |
+
{acceptance_criteria}
|
1221 |
+
<|End_Acceptance Criteria|>
|
1222 |
+
|
1223 |
+
<|Start_Analysis|>
|
1224 |
+
{analysis}
|
1225 |
+
<|End_Analysis|>
|