Tonic commited on
Commit
f9bd6e8
·
1 Parent(s): 74cb588

Upload 15 files

Browse files
agentchat_RetrieveChat.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
agentchat_auto_feedback_from_code_execution.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
agentchat_chess.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
agentchat_function_call.ipynb ADDED
@@ -0,0 +1,445 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "id": "ae1f50ec",
7
+ "metadata": {},
8
+ "source": [
9
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_function_call.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
10
+ ]
11
+ },
12
+ {
13
+ "attachments": {},
14
+ "cell_type": "markdown",
15
+ "id": "9a71fa36",
16
+ "metadata": {},
17
+ "source": [
18
+ "# Auto Generated Agent Chat: Task Solving with Provided Tools as Functions\n",
19
+ "\n",
20
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
21
+ "\n",
22
+ "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to make function calls with the new feature of OpenAI models (in model version 0613). A specified prompt and function configs need to be passed to `AssistantAgent` to initialize the agent. The corresponding functions need to be passed to `UserProxyAgent`, which will be responsible for executing any function calls made by `AssistantAgent`. Besides this requirement of matching descriptions with functions, we recommend checking the system message in the `AssistantAgent` to make sure the instructions align with the function call descriptions.\n",
23
+ "\n",
24
+ "## Requirements\n",
25
+ "\n",
26
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install the [mathchat] option since we will import functions from `MathUserProxyAgent`:\n",
27
+ "```bash\n",
28
+ "pip install \"pyautogen[mathchat]\"\n",
29
+ "```"
30
+ ]
31
+ },
32
+ {
33
+ "cell_type": "code",
34
+ "execution_count": 1,
35
+ "id": "2b803c17",
36
+ "metadata": {},
37
+ "outputs": [],
38
+ "source": [
39
+ "# %pip install \"pyautogen[mathchat]~=0.1.0\""
40
+ ]
41
+ },
42
+ {
43
+ "attachments": {},
44
+ "cell_type": "markdown",
45
+ "id": "5ebd2397",
46
+ "metadata": {},
47
+ "source": [
48
+ "## Set your API Endpoint\n",
49
+ "\n",
50
+ "The [`config_list_from_models`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_models) function tries to create a list of configurations using Azure OpenAI endpoints and OpenAI endpoints for the provided list of models. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files:\n",
51
+ "\n",
52
+ "- OpenAI API key: os.environ[\"OPENAI_API_KEY\"] or `openai_api_key_file=\"key_openai.txt\"`.\n",
53
+ "- Azure OpenAI API key: os.environ[\"AZURE_OPENAI_API_KEY\"] or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
54
+ "- Azure OpenAI API base: os.environ[\"AZURE_OPENAI_API_BASE\"] or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
55
+ "\n",
56
+ "It's OK to have only the OpenAI API key, or only the Azure OpenAI API key + base.\n",
57
+ "If you open this notebook in google colab, you can upload your files by click the file icon on the left panel and then choose \"upload file\" icon.\n",
58
+ "\n",
59
+ "The following code excludes Azure OpenAI endpoints from the config list because some endpoints don't support functions yet. Remove the `exclude` argument if they do."
60
+ ]
61
+ },
62
+ {
63
+ "cell_type": "code",
64
+ "execution_count": 2,
65
+ "id": "dca301a4",
66
+ "metadata": {},
67
+ "outputs": [],
68
+ "source": [
69
+ "import autogen\n",
70
+ "\n",
71
+ "config_list = autogen.config_list_from_models(model_list=[\"gpt-4\", \"gpt-3.5-turbo\", \"gpt-3.5-turbo-16k\"], exclude=\"aoai\")"
72
+ ]
73
+ },
74
+ {
75
+ "attachments": {},
76
+ "cell_type": "markdown",
77
+ "id": "92fde41f",
78
+ "metadata": {},
79
+ "source": [
80
+ "The config list looks like the following:\n",
81
+ "```python\n",
82
+ "config_list = [\n",
83
+ " {\n",
84
+ " 'model': 'gpt-4',\n",
85
+ " 'api_key': '<your OpenAI API key here>',\n",
86
+ " }, # OpenAI API endpoint for gpt-4\n",
87
+ " {\n",
88
+ " 'model': 'gpt-3.5-turbo',\n",
89
+ " 'api_key': '<your OpenAI API key here>',\n",
90
+ " }, # OpenAI API endpoint for gpt-3.5-turbo\n",
91
+ " {\n",
92
+ " 'model': 'gpt-3.5-turbo-16k',\n",
93
+ " 'api_key': '<your OpenAI API key here>',\n",
94
+ " }, # OpenAI API endpoint for gpt-3.5-turbo-16k\n",
95
+ "]\n",
96
+ "```\n"
97
+ ]
98
+ },
99
+ {
100
+ "attachments": {},
101
+ "cell_type": "markdown",
102
+ "id": "2b9526e7",
103
+ "metadata": {},
104
+ "source": [
105
+ "## Making Function Calls\n",
106
+ "\n",
107
+ "In this example, we demonstrate function call execution with `AssistantAgent` and `UserProxyAgent`. With the default system prompt of `AssistantAgent`, we allow the LLM assistant to perform tasks with code, and the `UserProxyAgent` would extract code blocks from the LLM response and execute them. With the new \"function_call\" feature, we define functions and specify the description of the function in the OpenAI config for the `AssistantAgent`. Then we register the functions in `UserProxyAgent`.\n"
108
+ ]
109
+ },
110
+ {
111
+ "cell_type": "code",
112
+ "execution_count": 3,
113
+ "id": "9fb85afb",
114
+ "metadata": {},
115
+ "outputs": [
116
+ {
117
+ "name": "stdout",
118
+ "output_type": "stream",
119
+ "text": [
120
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
121
+ "\n",
122
+ "Draw two agents chatting with each other with an example dialog. Don't add plt.show().\n",
123
+ "\n",
124
+ "--------------------------------------------------------------------------------\n",
125
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
126
+ "\n",
127
+ "\u001b[32m***** Suggested function Call: python *****\u001b[0m\n",
128
+ "Arguments: \n",
129
+ "{\n",
130
+ " \"cell\": \"\n",
131
+ "import matplotlib.pyplot as plt\n",
132
+ "import matplotlib.patches as mpatches\n",
133
+ "\n",
134
+ "# Define basic parameters\n",
135
+ "face_color = '#FFDDC1'\n",
136
+ "plt.figure(figsize=(10, 2))\n",
137
+ "\n",
138
+ "# Agent 1\n",
139
+ "agent1 = mpatches.FancyBboxPatch((0.02, 0.4), 0.2, 0.6, boxstyle=mpatches.BoxStyle(\\\"Round\\\", pad=0.02))\n",
140
+ "plt.gca().add_artist(agent1)\n",
141
+ "plt.gca().text(0.12, 0.7, 'Agent 1', ha='center', va='center', fontsize=12, color='blue')\n",
142
+ "\n",
143
+ "# Agent 2\n",
144
+ "agent2 = mpatches.FancyBboxPatch((0.45, 0.4), 0.2, 0.6, boxstyle=mpatches.BoxStyle(\\\"Round\\\", pad=0.02))\n",
145
+ "plt.gca().add_artist(agent2)\n",
146
+ "plt.gca().text(0.55, 0.7, 'Agent 2', ha='center', va='center', fontsize=12, color='red')\n",
147
+ "\n",
148
+ "# Dialog\n",
149
+ "plt.gca().text(0.12, 0.35, '\\\"Hello, how are you?\\\"', ha='center', va='center', fontsize=10)\n",
150
+ "plt.gca().text(0.55, 0.15, '\\\"I\\'m fine, thank you!\\\"', ha='center', va='center', fontsize=10)\n",
151
+ "\n",
152
+ "# Descriptions\n",
153
+ "plt.gca().text(0.12, 0.15, 'Greeting', ha='center', va='center', fontsize=10)\n",
154
+ "plt.gca().text(0.55, 0.35, 'Response', ha='center', va='center', fontsize=10)\n",
155
+ "\n",
156
+ "plt.axis('off')\n",
157
+ "\"\n",
158
+ "}\n",
159
+ "\u001b[32m*******************************************\u001b[0m\n",
160
+ "\n",
161
+ "--------------------------------------------------------------------------------\n",
162
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
163
+ "\n",
164
+ "\u001b[32m***** Response from calling function \"python\" *****\u001b[0m\n",
165
+ "Error: Invalid \\escape: line 1 column 785 (char 784)\n",
166
+ " You argument should follow json format.\n",
167
+ "\u001b[32m***************************************************\u001b[0m\n",
168
+ "\n",
169
+ "--------------------------------------------------------------------------------\n",
170
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
171
+ "\n",
172
+ "\u001b[32m***** Suggested function Call: python *****\u001b[0m\n",
173
+ "Arguments: \n",
174
+ "{\n",
175
+ " \"cell\": \"import matplotlib.pyplot as plt\\nimport matplotlib.patches as mpatches\\n\\n# Define basic parameters\\nface_color = '#FFDDC1'\\nplt.figure(figsize=(10, 2))\\n\\n# Agent 1\\nagent1 = mpatches.FancyBboxPatch((0.02, 0.4), 0.2, 0.6, boxstyle=mpatches.BoxStyle('Round', pad=0.02))\\nplt.gca().add_artist(agent1)\\nplt.gca().text(0.12, 0.7, 'Agent 1', ha='center', va='center', fontsize=12, color='blue')\\n\\n# Agent 2\\nagent2 = mpatches.FancyBboxPatch((0.45, 0.4), 0.2, 0.6, boxstyle=mpatches.BoxStyle('Round', pad=0.02))\\nplt.gca().add_artist(agent2)\\nplt.gca().text(0.55, 0.7, 'Agent 2', ha='center', va='center', fontsize=12, color='red')\\n\\n# Dialog\\nplt.gca().text(0.12, 0.35, '\\\"Hello, how are you?\\\"', ha='center', va='center', fontsize=10)\\nplt.gca().text(0.55, 0.15, '\\\"I\\\\'m fine, thank you!\\\"', ha='center', va='center', fontsize=10)\\n\\n# Descriptions\\nplt.gca().text(0.12, 0.15, 'Greeting', ha='center', va='center', fontsize=10)\\nplt.gca().text(0.55, 0.35, 'Response', ha='center', va='center', fontsize=10)\\n\\nplt.axis('off')\"\n",
176
+ "}\n",
177
+ "\u001b[32m*******************************************\u001b[0m\n",
178
+ "\n",
179
+ "--------------------------------------------------------------------------------\n",
180
+ "\u001b[35m\n",
181
+ ">>>>>>>> EXECUTING FUNCTION python...\u001b[0m\n"
182
+ ]
183
+ },
184
+ {
185
+ "data": {
186
+ "text/plain": [
187
+ "(0.0, 1.0, 0.0, 1.0)"
188
+ ]
189
+ },
190
+ "execution_count": 3,
191
+ "metadata": {},
192
+ "output_type": "execute_result"
193
+ },
194
+ {
195
+ "data": {
196
+ "image/png": "",
197
+ "text/plain": [
198
+ "<Figure size 1000x200 with 1 Axes>"
199
+ ]
200
+ },
201
+ "metadata": {},
202
+ "output_type": "display_data"
203
+ },
204
+ {
205
+ "name": "stdout",
206
+ "output_type": "stream",
207
+ "text": [
208
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
209
+ "\n",
210
+ "\u001b[32m***** Response from calling function \"python\" *****\u001b[0m\n",
211
+ "(0.0, 1.0, 0.0, 1.0)\n",
212
+ "\u001b[32m***************************************************\u001b[0m\n",
213
+ "\n",
214
+ "--------------------------------------------------------------------------------\n",
215
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
216
+ "\n",
217
+ "TERMINATE\n",
218
+ "\n",
219
+ "--------------------------------------------------------------------------------\n"
220
+ ]
221
+ }
222
+ ],
223
+ "source": [
224
+ "llm_config = {\n",
225
+ " \"functions\": [\n",
226
+ " {\n",
227
+ " \"name\": \"python\",\n",
228
+ " \"description\": \"run cell in ipython and return the execution result.\",\n",
229
+ " \"parameters\": {\n",
230
+ " \"type\": \"object\",\n",
231
+ " \"properties\": {\n",
232
+ " \"cell\": {\n",
233
+ " \"type\": \"string\",\n",
234
+ " \"description\": \"Valid Python cell to execute.\",\n",
235
+ " }\n",
236
+ " },\n",
237
+ " \"required\": [\"cell\"],\n",
238
+ " },\n",
239
+ " },\n",
240
+ " {\n",
241
+ " \"name\": \"sh\",\n",
242
+ " \"description\": \"run a shell script and return the execution result.\",\n",
243
+ " \"parameters\": {\n",
244
+ " \"type\": \"object\",\n",
245
+ " \"properties\": {\n",
246
+ " \"script\": {\n",
247
+ " \"type\": \"string\",\n",
248
+ " \"description\": \"Valid shell script to execute.\",\n",
249
+ " }\n",
250
+ " },\n",
251
+ " \"required\": [\"script\"],\n",
252
+ " },\n",
253
+ " },\n",
254
+ " ],\n",
255
+ " \"config_list\": config_list,\n",
256
+ " \"request_timeout\": 120,\n",
257
+ "}\n",
258
+ "chatbot = autogen.AssistantAgent(\n",
259
+ " name=\"chatbot\",\n",
260
+ " system_message=\"For coding tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.\",\n",
261
+ " llm_config=llm_config,\n",
262
+ ")\n",
263
+ "\n",
264
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
265
+ "user_proxy = autogen.UserProxyAgent(\n",
266
+ " name=\"user_proxy\",\n",
267
+ " is_termination_msg=lambda x: x.get(\"content\", \"\") and x.get(\"content\", \"\").rstrip().endswith(\"TERMINATE\"),\n",
268
+ " human_input_mode=\"NEVER\",\n",
269
+ " max_consecutive_auto_reply=10,\n",
270
+ " code_execution_config={\"work_dir\": \"coding\"},\n",
271
+ ")\n",
272
+ "\n",
273
+ "# define functions according to the function desription\n",
274
+ "from IPython import get_ipython\n",
275
+ "\n",
276
+ "def exec_python(cell):\n",
277
+ " ipython = get_ipython()\n",
278
+ " result = ipython.run_cell(cell)\n",
279
+ " log = str(result.result)\n",
280
+ " if result.error_before_exec is not None:\n",
281
+ " log += f\"\\n{result.error_before_exec}\"\n",
282
+ " if result.error_in_exec is not None:\n",
283
+ " log += f\"\\n{result.error_in_exec}\"\n",
284
+ " return log\n",
285
+ "\n",
286
+ "def exec_sh(script):\n",
287
+ " return user_proxy.execute_code_blocks([(\"sh\", script)])\n",
288
+ "\n",
289
+ "# register the functions\n",
290
+ "user_proxy.register_function(\n",
291
+ " function_map={\n",
292
+ " \"python\": exec_python,\n",
293
+ " \"sh\": exec_sh,\n",
294
+ " }\n",
295
+ ")\n",
296
+ "\n",
297
+ "# start the conversation\n",
298
+ "user_proxy.initiate_chat(\n",
299
+ " chatbot,\n",
300
+ " message=\"Draw two agents chatting with each other with an example dialog. Don't add plt.show().\",\n",
301
+ ")\n"
302
+ ]
303
+ },
304
+ {
305
+ "attachments": {},
306
+ "cell_type": "markdown",
307
+ "id": "e9531d55",
308
+ "metadata": {},
309
+ "source": [
310
+ "## Another example with Wolfram Alpha API\n",
311
+ "\n",
312
+ "We give another example of querying Wolfram Alpha API to solve math problem. We use the predefined function `MathUserProxyAgent().execute_one_wolfram_query` as the function to be called."
313
+ ]
314
+ },
315
+ {
316
+ "cell_type": "code",
317
+ "execution_count": 4,
318
+ "id": "4a917492",
319
+ "metadata": {},
320
+ "outputs": [
321
+ {
322
+ "name": "stdout",
323
+ "output_type": "stream",
324
+ "text": [
325
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
326
+ "\n",
327
+ "Problem: Find all $x$ that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$. Express your answer in interval notation.\n",
328
+ "\n",
329
+ "--------------------------------------------------------------------------------\n",
330
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
331
+ "\n",
332
+ "\u001b[32m***** Suggested function Call: query_wolfram *****\u001b[0m\n",
333
+ "Arguments: \n",
334
+ "{\n",
335
+ " \"query\": \"solve (2x+10)(x+3)<(3x+9)(x+8) for x\"\n",
336
+ "}\n",
337
+ "\u001b[32m**************************************************\u001b[0m\n",
338
+ "\n",
339
+ "--------------------------------------------------------------------------------\n",
340
+ "\u001b[35m\n",
341
+ ">>>>>>>> EXECUTING FUNCTION query_wolfram...\u001b[0m\n"
342
+ ]
343
+ },
344
+ {
345
+ "name": "stdout",
346
+ "output_type": "stream",
347
+ "text": [
348
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
349
+ "\n",
350
+ "\u001b[32m***** Response from calling function \"query_wolfram\" *****\u001b[0m\n",
351
+ "('Assumption: solve (2 x + 10) (x + 3)<(3 x + 9) (x + 8) for x \\nAnswer: ans 0: x<-14\\nans 1: x>-3\\n', True)\n",
352
+ "\u001b[32m**********************************************************\u001b[0m\n",
353
+ "\n",
354
+ "--------------------------------------------------------------------------------\n",
355
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
356
+ "\n",
357
+ "The solution to the inequality $(2x+10)(x+3)<(3x+9)(x+8)$ is $x \\in (-\\infty, -14) \\cup (-3, +\\infty)$. TERMINATE\n",
358
+ "\n",
359
+ "--------------------------------------------------------------------------------\n",
360
+ "\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
361
+ "\n",
362
+ "\n",
363
+ "\n",
364
+ "--------------------------------------------------------------------------------\n",
365
+ "\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
366
+ "\n",
367
+ "TERMINATE\n",
368
+ "\n",
369
+ "--------------------------------------------------------------------------------\n"
370
+ ]
371
+ }
372
+ ],
373
+ "source": [
374
+ "import os\n",
375
+ "from autogen.agentchat.contrib.math_user_proxy_agent import MathUserProxyAgent\n",
376
+ "\n",
377
+ "# you need to provide a wolfram alpha appid to run this example\n",
378
+ "if not os.environ.get(\"WOLFRAM_ALPHA_APPID\"):\n",
379
+ " os.environ[\"WOLFRAM_ALPHA_APPID\"] = open(\"wolfram.txt\").read().strip()\n",
380
+ "\n",
381
+ "llm_config = {\n",
382
+ " \"model\": \"gpt-4-0613\",\n",
383
+ " \"functions\": [\n",
384
+ " {\n",
385
+ " \"name\": \"query_wolfram\",\n",
386
+ " \"description\": \"Return the API query result from the Wolfram Alpha. the ruturn is a tuple of (result, is_success).\",\n",
387
+ " \"parameters\": {\n",
388
+ " \"type\": \"object\",\n",
389
+ " \"properties\": {\n",
390
+ " \"query\": {\n",
391
+ " \"type\": \"string\",\n",
392
+ " \"description\": \"The Wolfram Alpha code to be executed.\",\n",
393
+ " }\n",
394
+ " },\n",
395
+ " \"required\": [\"query\"],\n",
396
+ " },\n",
397
+ " }\n",
398
+ " ],\n",
399
+ " \"config_list\": config_list,\n",
400
+ "}\n",
401
+ "chatbot = autogen.AssistantAgent(\n",
402
+ " name=\"chatbot\",\n",
403
+ " system_message=\"Only use the functions you have been provided with. Do not ask user to perform other actions than executing the functions. Reply TERMINATE when the task is done.\",\n",
404
+ " llm_config=llm_config,\n",
405
+ ")\n",
406
+ "\n",
407
+ "# the key in `function_map` should match the function name in \"functions\" above\n",
408
+ "# we register a class instance method directly\n",
409
+ "user_proxy = autogen.UserProxyAgent(\n",
410
+ " \"user_proxy\",\n",
411
+ " max_consecutive_auto_reply=2,\n",
412
+ " human_input_mode=\"NEVER\",\n",
413
+ " function_map={\"query_wolfram\": MathUserProxyAgent().execute_one_wolfram_query},\n",
414
+ ")\n",
415
+ "\n",
416
+ "# start the conversation\n",
417
+ "user_proxy.initiate_chat(\n",
418
+ " chatbot,\n",
419
+ " message=\"Problem: Find all $x$ that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$. Express your answer in interval notation.\",\n",
420
+ ")\n"
421
+ ]
422
+ }
423
+ ],
424
+ "metadata": {
425
+ "kernelspec": {
426
+ "display_name": "flaml_dev",
427
+ "language": "python",
428
+ "name": "python3"
429
+ },
430
+ "language_info": {
431
+ "codemirror_mode": {
432
+ "name": "ipython",
433
+ "version": 3
434
+ },
435
+ "file_extension": ".py",
436
+ "mimetype": "text/x-python",
437
+ "name": "python",
438
+ "nbconvert_exporter": "python",
439
+ "pygments_lexer": "ipython3",
440
+ "version": "3.9.17"
441
+ }
442
+ },
443
+ "nbformat": 4,
444
+ "nbformat_minor": 5
445
+ }
agentchat_groupchat.ipynb ADDED
@@ -0,0 +1,291 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_groupchat.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {},
15
+ "source": [
16
+ "# Auto Generated Agent Chat: Group Chat\n",
17
+ "\n",
18
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
19
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
20
+ "\n",
21
+ "This notebook is modified based on https://github.com/microsoft/FLAML/blob/4ea686af5c3e8ff24d9076a7a626c8b28ab5b1d7/notebook/autogen_multiagent_roleplay_chat.ipynb\n",
22
+ "\n",
23
+ "## Requirements\n",
24
+ "\n",
25
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
26
+ "```bash\n",
27
+ "pip install pyautogen\n",
28
+ "```"
29
+ ]
30
+ },
31
+ {
32
+ "cell_type": "code",
33
+ "execution_count": 105,
34
+ "metadata": {},
35
+ "outputs": [],
36
+ "source": [
37
+ "%%capture --no-stderr\n",
38
+ "# %pip install pyautogen~=0.1.0"
39
+ ]
40
+ },
41
+ {
42
+ "attachments": {},
43
+ "cell_type": "markdown",
44
+ "metadata": {},
45
+ "source": [
46
+ "## Set your API Endpoint\n",
47
+ "\n",
48
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file."
49
+ ]
50
+ },
51
+ {
52
+ "cell_type": "code",
53
+ "execution_count": 106,
54
+ "metadata": {},
55
+ "outputs": [],
56
+ "source": [
57
+ "import autogen\n",
58
+ "\n",
59
+ "config_list_gpt4 = autogen.config_list_from_json(\n",
60
+ " \"OAI_CONFIG_LIST\",\n",
61
+ " filter_dict={\n",
62
+ " \"model\": [\"gpt-4\", \"gpt-4-0314\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
63
+ " },\n",
64
+ ")\n",
65
+ "# config_list_gpt35 = autogen.config_list_from_json(\n",
66
+ "# \"OAI_CONFIG_LIST\",\n",
67
+ "# filter_dict={\n",
68
+ "# \"model\": {\n",
69
+ "# \"gpt-3.5-turbo\",\n",
70
+ "# \"gpt-3.5-turbo-16k\",\n",
71
+ "# \"gpt-3.5-turbo-0301\",\n",
72
+ "# \"chatgpt-35-turbo-0301\",\n",
73
+ "# \"gpt-35-turbo-v0301\",\n",
74
+ "# },\n",
75
+ "# },\n",
76
+ "# )"
77
+ ]
78
+ },
79
+ {
80
+ "attachments": {},
81
+ "cell_type": "markdown",
82
+ "metadata": {},
83
+ "source": [
84
+ "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well). Only the gpt-4 models are kept in the list based on the filter condition.\n",
85
+ "\n",
86
+ "The config list looks like the following:\n",
87
+ "```python\n",
88
+ "config_list = [\n",
89
+ " {\n",
90
+ " 'model': 'gpt-4',\n",
91
+ " 'api_key': '<your OpenAI API key here>',\n",
92
+ " },\n",
93
+ " {\n",
94
+ " 'model': 'gpt-4',\n",
95
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
96
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
97
+ " 'api_type': 'azure',\n",
98
+ " 'api_version': '2023-06-01-preview',\n",
99
+ " },\n",
100
+ " {\n",
101
+ " 'model': 'gpt-4-32k',\n",
102
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
103
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
104
+ " 'api_type': 'azure',\n",
105
+ " 'api_version': '2023-06-01-preview',\n",
106
+ " },\n",
107
+ "]\n",
108
+ "```\n",
109
+ "\n",
110
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
111
+ "\n",
112
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
113
+ ]
114
+ },
115
+ {
116
+ "attachments": {},
117
+ "cell_type": "markdown",
118
+ "metadata": {},
119
+ "source": [
120
+ "## Construct Agents"
121
+ ]
122
+ },
123
+ {
124
+ "cell_type": "code",
125
+ "execution_count": 107,
126
+ "metadata": {},
127
+ "outputs": [],
128
+ "source": [
129
+ "llm_config = {\"config_list\": config_list_gpt4, \"seed\": 42}\n",
130
+ "user_proxy = autogen.UserProxyAgent(\n",
131
+ " name=\"User_proxy\",\n",
132
+ " system_message=\"A human admin.\",\n",
133
+ " code_execution_config={\"last_n_messages\": 2, \"work_dir\": \"groupchat\"},\n",
134
+ " human_input_mode=\"TERMINATE\"\n",
135
+ ")\n",
136
+ "coder = autogen.AssistantAgent(\n",
137
+ " name=\"Coder\",\n",
138
+ " llm_config=llm_config,\n",
139
+ ")\n",
140
+ "pm = autogen.AssistantAgent(\n",
141
+ " name=\"Product_manager\",\n",
142
+ " system_message=\"Creative in software product ideas.\",\n",
143
+ " llm_config=llm_config,\n",
144
+ ")\n",
145
+ "groupchat = autogen.GroupChat(agents=[user_proxy, coder, pm], messages=[], max_round=12)\n",
146
+ "manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)"
147
+ ]
148
+ },
149
+ {
150
+ "attachments": {},
151
+ "cell_type": "markdown",
152
+ "metadata": {},
153
+ "source": [
154
+ "## Start Chat"
155
+ ]
156
+ },
157
+ {
158
+ "cell_type": "code",
159
+ "execution_count": 108,
160
+ "metadata": {},
161
+ "outputs": [
162
+ {
163
+ "name": "stdout",
164
+ "output_type": "stream",
165
+ "text": [
166
+ "\u001b[33mUser_proxy\u001b[0m (to chat_manager):\n",
167
+ "\n",
168
+ "Find a latest paper about gpt-4 on arxiv and find its potential applications in software.\n",
169
+ "\n",
170
+ "--------------------------------------------------------------------------------\n",
171
+ "\u001b[33mCoder\u001b[0m (to chat_manager):\n",
172
+ "\n",
173
+ "To find the latest paper about GPT-4 on arxiv, I'll provide you with a Python code that fetches the most recent papers from the arxiv API and filters the results to get the most relevant paper related to GPT-4. After fetching the paper, I'll extract the information for potential applications in software. Please execute the following Python code:\n",
174
+ "\n",
175
+ "```python\n",
176
+ "import requests\n",
177
+ "from bs4 import BeautifulSoup\n",
178
+ "import re\n",
179
+ "\n",
180
+ "def fetch_arxiv_papers(query):\n",
181
+ " base_url = \"http://export.arxiv.org/api/query?\"\n",
182
+ " search_query = \"all:\" + query\n",
183
+ " response = requests.get(base_url, params={\"search_query\": search_query, \"sortBy\": \"submittedDate\", \"sortOrder\": \"descending\"})\n",
184
+ " return BeautifulSoup(response.content, \"xml\")\n",
185
+ "\n",
186
+ "def find_gpt4_paper():\n",
187
+ " papers = fetch_arxiv_papers(\"gpt-4\")\n",
188
+ " for entry in papers.find_all(\"entry\"):\n",
189
+ " title = entry.title.text.strip()\n",
190
+ " summary = entry.summary.text.strip()\n",
191
+ " if \"gpt-4\" in title.lower() or \"gpt-4\" in summary.lower():\n",
192
+ " return {\"title\": title, \"summary\": summary}\n",
193
+ "\n",
194
+ "gpt4_paper = find_gpt4_paper()\n",
195
+ "if gpt4_paper:\n",
196
+ " print(\"Title:\", gpt4_paper[\"title\"])\n",
197
+ " print(\"Summary:\", gpt4_paper[\"summary\"])\n",
198
+ "else:\n",
199
+ " print(\"No recent GPT-4 papers found.\")\n",
200
+ "```\n",
201
+ "\n",
202
+ "Once we have the paper details, I'll analyze the summary to identify potential applications in software development.\n",
203
+ "\n",
204
+ "--------------------------------------------------------------------------------\n",
205
+ "\u001b[31m\n",
206
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
207
+ "\u001b[31m\n",
208
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
209
+ "\u001b[33mUser_proxy\u001b[0m (to chat_manager):\n",
210
+ "\n",
211
+ "exitcode: 0 (execution succeeded)\n",
212
+ "Code output: \n",
213
+ "Title: FIMO: A Challenge Formal Dataset for Automated Theorem Proving\n",
214
+ "Summary: We present FIMO, an innovative dataset comprising formal mathematical problem\n",
215
+ "statements sourced from the International Mathematical Olympiad (IMO)\n",
216
+ "Shortlisted Problems. Designed to facilitate advanced automated theorem proving\n",
217
+ "at the IMO level, FIMO is currently tailored for the Lean formal language. It\n",
218
+ "comprises 149 formal problem statements, accompanied by both informal problem\n",
219
+ "descriptions and their corresponding LaTeX-based informal proofs. Through\n",
220
+ "initial experiments involving GPT-4, our findings underscore the existing\n",
221
+ "limitations in current methodologies, indicating a substantial journey ahead\n",
222
+ "before achieving satisfactory IMO-level automated theorem proving outcomes.\n",
223
+ "\n",
224
+ "\n",
225
+ "--------------------------------------------------------------------------------\n",
226
+ "\u001b[33mProduct_manager\u001b[0m (to chat_manager):\n",
227
+ "\n",
228
+ "Based on the paper titled \"FIMO: A Challenge Formal Dataset for Automated Theorem Proving\" and its summary, the potential applications of GPT-4 in software development can be related to the field of automated theorem proving.\n",
229
+ "\n",
230
+ "1. **Automated theorem proving**: GPT-4 can be utilized in the development of automated theorem proving software that attempts to prove complex mathematical problems taken from International Mathematical Olympiad (IMO) or other challenging sources. By fine-tuning GPT-4 with a dataset like FIMO consisting of formal mathematical problems, the model can potentially better understand the problem statements and generate appropriate proofs.\n",
231
+ "\n",
232
+ "2. **Mathematical problem-solving assistants**: Software tools can be developed using GPT-4 to guide users in solving complex mathematical problems. The AI model can be integrated into educational platforms, online math tutoring services, or even standalone tools to help make solving problems easier and faster for students and professionals alike.\n",
233
+ "\n",
234
+ "3. **Formal language translation**: GPT-4 can potentially be integrated into software for translating between formal languages, assisting in the understanding and comparison of various formal systems. This would be especially useful in research communities employing different formal languages and wanting to share ideas and results.\n",
235
+ "\n",
236
+ "4. **Mathematical proof checking**: GPT-4 can be employed in proof-checking software to identify and correct inconsistencies. By improving the correctness of proofs, this application would ultimately help users save time and contribute to the overall quality of mathematical research.\n",
237
+ "\n",
238
+ "Please note that this paper highlights the current limitations of GPT-4 in the context of IMO-level theorem proving. Nevertheless, these potential applications suggest directions for further research and software development as the model and related techniques continue to improve.\n",
239
+ "\n",
240
+ "--------------------------------------------------------------------------------\n",
241
+ "\u001b[31m\n",
242
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
243
+ "\u001b[33mUser_proxy\u001b[0m (to chat_manager):\n",
244
+ "\n",
245
+ "\n",
246
+ "\n",
247
+ "--------------------------------------------------------------------------------\n",
248
+ "\u001b[31m\n",
249
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
250
+ "\u001b[33mUser_proxy\u001b[0m (to chat_manager):\n",
251
+ "\n",
252
+ "\n",
253
+ "\n",
254
+ "--------------------------------------------------------------------------------\n",
255
+ "\u001b[33mCoder\u001b[0m (to chat_manager):\n",
256
+ "\n",
257
+ "TERMINATE\n",
258
+ "\n",
259
+ "--------------------------------------------------------------------------------\n"
260
+ ]
261
+ }
262
+ ],
263
+ "source": [
264
+ "user_proxy.initiate_chat(manager, message=\"Find a latest paper about gpt-4 on arxiv and find its potential applications in software.\")\n",
265
+ "# type exit to terminate the chat"
266
+ ]
267
+ }
268
+ ],
269
+ "metadata": {
270
+ "kernelspec": {
271
+ "display_name": "flaml",
272
+ "language": "python",
273
+ "name": "python3"
274
+ },
275
+ "language_info": {
276
+ "codemirror_mode": {
277
+ "name": "ipython",
278
+ "version": 3
279
+ },
280
+ "file_extension": ".py",
281
+ "mimetype": "text/x-python",
282
+ "name": "python",
283
+ "nbconvert_exporter": "python",
284
+ "pygments_lexer": "ipython3",
285
+ "version": "3.9.17"
286
+ },
287
+ "orig_nbformat": 4
288
+ },
289
+ "nbformat": 4,
290
+ "nbformat_minor": 2
291
+ }
agentchat_groupchat_research.ipynb ADDED
@@ -0,0 +1,566 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/groupchat_research.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {},
15
+ "source": [
16
+ "# Auto Generated Agent Chat: Performs Research with Multi-Agent Group Chat\n",
17
+ "\n",
18
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
19
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
20
+ "\n",
21
+ "## Requirements\n",
22
+ "\n",
23
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
24
+ "```bash\n",
25
+ "pip install pyautogen\n",
26
+ "```"
27
+ ]
28
+ },
29
+ {
30
+ "cell_type": "code",
31
+ "execution_count": 1,
32
+ "metadata": {},
33
+ "outputs": [],
34
+ "source": [
35
+ "%%capture --no-stderr\n",
36
+ "# %pip install pyautogen~=0.1.0"
37
+ ]
38
+ },
39
+ {
40
+ "attachments": {},
41
+ "cell_type": "markdown",
42
+ "metadata": {},
43
+ "source": [
44
+ "## Set your API Endpoint\n",
45
+ "\n",
46
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file."
47
+ ]
48
+ },
49
+ {
50
+ "cell_type": "code",
51
+ "execution_count": 2,
52
+ "metadata": {},
53
+ "outputs": [],
54
+ "source": [
55
+ "import autogen\n",
56
+ "\n",
57
+ "config_list_gpt4 = autogen.config_list_from_json(\n",
58
+ " \"OAI_CONFIG_LIST\",\n",
59
+ " filter_dict={\n",
60
+ " \"model\": [\"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
61
+ " },\n",
62
+ ")"
63
+ ]
64
+ },
65
+ {
66
+ "attachments": {},
67
+ "cell_type": "markdown",
68
+ "metadata": {},
69
+ "source": [
70
+ "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well).\n",
71
+ "\n",
72
+ "The config list looks like the following:\n",
73
+ "```python\n",
74
+ "config_list = [\n",
75
+ " {\n",
76
+ " 'model': 'gpt-4-32k',\n",
77
+ " 'api_key': '<your OpenAI API key here>',\n",
78
+ " },\n",
79
+ " {\n",
80
+ " 'model': 'gpt-4-32k',\n",
81
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
82
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
83
+ " 'api_type': 'azure',\n",
84
+ " 'api_version': '2023-06-01-preview',\n",
85
+ " },\n",
86
+ " {\n",
87
+ " 'model': 'gpt-4-32k-0314',\n",
88
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
89
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
90
+ " 'api_type': 'azure',\n",
91
+ " 'api_version': '2023-06-01-preview',\n",
92
+ " },\n",
93
+ "]\n",
94
+ "```\n",
95
+ "\n",
96
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
97
+ "\n",
98
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
99
+ ]
100
+ },
101
+ {
102
+ "attachments": {},
103
+ "cell_type": "markdown",
104
+ "metadata": {},
105
+ "source": [
106
+ "## Construct Agents"
107
+ ]
108
+ },
109
+ {
110
+ "cell_type": "code",
111
+ "execution_count": 3,
112
+ "metadata": {},
113
+ "outputs": [],
114
+ "source": [
115
+ "gpt4_config = {\n",
116
+ " \"seed\": 42, # change the seed for different trials\n",
117
+ " \"temperature\": 0,\n",
118
+ " \"config_list\": config_list_gpt4,\n",
119
+ " \"request_timeout\": 120,\n",
120
+ "}\n",
121
+ "user_proxy = autogen.UserProxyAgent(\n",
122
+ " name=\"Admin\",\n",
123
+ " system_message=\"A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.\",\n",
124
+ " code_execution_config=False,\n",
125
+ ")\n",
126
+ "engineer = autogen.AssistantAgent(\n",
127
+ " name=\"Engineer\",\n",
128
+ " llm_config=gpt4_config,\n",
129
+ " system_message='''Engineer. You follow an approved plan. You write python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code. So do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.\n",
130
+ "Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.\n",
131
+ "If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n",
132
+ "''',\n",
133
+ ")\n",
134
+ "scientist = autogen.AssistantAgent(\n",
135
+ " name=\"Scientist\",\n",
136
+ " llm_config=gpt4_config,\n",
137
+ " system_message=\"\"\"Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code.\"\"\"\n",
138
+ ")\n",
139
+ "planner = autogen.AssistantAgent(\n",
140
+ " name=\"Planner\",\n",
141
+ " system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.\n",
142
+ "The plan may involve an engineer who can write code and a scientist who doesn't write code.\n",
143
+ "Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.\n",
144
+ "''',\n",
145
+ " llm_config=gpt4_config,\n",
146
+ ")\n",
147
+ "executor = autogen.UserProxyAgent(\n",
148
+ " name=\"Executor\",\n",
149
+ " system_message=\"Executor. Execute the code written by the engineer and report the result.\",\n",
150
+ " human_input_mode=\"NEVER\",\n",
151
+ " code_execution_config={\"last_n_messages\": 3, \"work_dir\": \"paper\"},\n",
152
+ ")\n",
153
+ "critic = autogen.AssistantAgent(\n",
154
+ " name=\"Critic\",\n",
155
+ " system_message=\"Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.\",\n",
156
+ " llm_config=gpt4_config,\n",
157
+ ")\n",
158
+ "groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=50)\n",
159
+ "manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_config)"
160
+ ]
161
+ },
162
+ {
163
+ "attachments": {},
164
+ "cell_type": "markdown",
165
+ "metadata": {},
166
+ "source": [
167
+ "## Start Chat"
168
+ ]
169
+ },
170
+ {
171
+ "cell_type": "code",
172
+ "execution_count": 4,
173
+ "metadata": {},
174
+ "outputs": [
175
+ {
176
+ "name": "stdout",
177
+ "output_type": "stream",
178
+ "text": [
179
+ "\u001b[33mAdmin\u001b[0m (to chat_manager):\n",
180
+ "\n",
181
+ "\n",
182
+ "find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.\n",
183
+ "\n",
184
+ "\n",
185
+ "--------------------------------------------------------------------------------\n",
186
+ "\u001b[33mPlanner\u001b[0m (to chat_manager):\n",
187
+ "\n",
188
+ "Plan:\n",
189
+ "\n",
190
+ "1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link to the paper.\n",
191
+ "\n",
192
+ "2. Scientist: Review the scraped data to identify the different domains in which LLM is applied. This could be based on keywords in the title or abstract, or the scientist's knowledge of the field.\n",
193
+ "\n",
194
+ "3. Engineer: Modify the script to categorize the papers based on the domains identified by the scientist. The script should output a markdown table with columns for the domain, title, authors, abstract, and link.\n",
195
+ "\n",
196
+ "4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate.\n",
197
+ "\n",
198
+ "5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.\n",
199
+ "\n",
200
+ "6. Scientist: Give final approval of the markdown table.\n",
201
+ "\n",
202
+ "7. Engineer: Submit the final markdown table.\n",
203
+ "\n",
204
+ "--------------------------------------------------------------------------------\n",
205
+ "\u001b[33mCritic\u001b[0m (to chat_manager):\n",
206
+ "\n",
207
+ "The plan seems solid and well-structured. However, it lacks the inclusion of verifiable information such as source URLs. Here's a revised version:\n",
208
+ "\n",
209
+ "1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link to the paper.\n",
210
+ "\n",
211
+ "2. Scientist: Review the scraped data to identify the different domains in which LLM is applied. This could be based on keywords in the title or abstract, or the scientist's knowledge of the field.\n",
212
+ "\n",
213
+ "3. Engineer: Modify the script to categorize the papers based on the domains identified by the scientist. The script should output a markdown table with columns for the domain, title, authors, abstract, and link (source URL).\n",
214
+ "\n",
215
+ "4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate. Check the source URLs to verify the information.\n",
216
+ "\n",
217
+ "5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.\n",
218
+ "\n",
219
+ "6. Scientist: Give final approval of the markdown table, ensuring all entries have a valid source URL.\n",
220
+ "\n",
221
+ "7. Engineer: Submit the final markdown table.\n",
222
+ "\n",
223
+ "--------------------------------------------------------------------------------\n",
224
+ "\u001b[31m\n",
225
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
226
+ "\u001b[31m\n",
227
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
228
+ "\u001b[33mAdmin\u001b[0m (to chat_manager):\n",
229
+ "\n",
230
+ "\n",
231
+ "\n",
232
+ "--------------------------------------------------------------------------------\n",
233
+ "\u001b[33mEngineer\u001b[0m (to chat_manager):\n",
234
+ "\n",
235
+ "I'm sorry for the confusion, but as an AI model developed by OpenAI, I don't have the ability to directly scrape websites or access real-time data from external databases or websites, including arXiv. However, I can help you write a Python script that uses the arXiv API to fetch the data you need.\n",
236
+ "\n",
237
+ "Here's a Python script that uses the `arxiv` package to search for papers related to \"LLM applications\" from the last week. This script will print out the title, authors, summary, and URL of each paper.\n",
238
+ "\n",
239
+ "```python\n",
240
+ "import arxiv\n",
241
+ "import datetime\n",
242
+ "\n",
243
+ "# Calculate the date one week ago\n",
244
+ "one_week_ago = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime('%Y%m%d%H%M%S')\n",
245
+ "\n",
246
+ "# Search for papers on LLM applications\n",
247
+ "search = arxiv.Search(\n",
248
+ " query=\"LLM applications\",\n",
249
+ " max_results=100,\n",
250
+ " sort_by=arxiv.SortCriterion.SubmittedDate,\n",
251
+ " sort_order=arxiv.SortOrder.Descending\n",
252
+ ")\n",
253
+ "\n",
254
+ "for result in search.get():\n",
255
+ " # Check if the paper was submitted in the last week\n",
256
+ " if result.submitted.strftime('%Y%m%d%H%M%S') > one_week_ago:\n",
257
+ " print(f\"Title: {result.title}\")\n",
258
+ " print(f\"Authors: {', '.join(author.name for author in result.authors)}\")\n",
259
+ " print(f\"Summary: {result.summary}\")\n",
260
+ " print(f\"URL: {result.entry_id}\")\n",
261
+ " print(\"\\n\")\n",
262
+ "```\n",
263
+ "\n",
264
+ "You can install the `arxiv` package using pip:\n",
265
+ "\n",
266
+ "```shell\n",
267
+ "pip install arxiv\n",
268
+ "```\n",
269
+ "\n",
270
+ "Please note that you'll need to replace `\"LLM applications\"` with the actual search term you're interested in. The arXiv API doesn't support searching by the submission date, so the script fetches the 100 most recent results and filters them manually. You may need to increase `max_results` if there are more than 100 papers submitted per week on your topic.\n",
271
+ "\n",
272
+ "To create a markdown table, you'll need to format the output like this:\n",
273
+ "\n",
274
+ "```markdown\n",
275
+ "| Title | Authors | Summary | URL |\n",
276
+ "|-------|---------|---------|-----|\n",
277
+ "| ... | ... | ... | ... |\n",
278
+ "```\n",
279
+ "\n",
280
+ "You can do this in Python using f-strings:\n",
281
+ "\n",
282
+ "```python\n",
283
+ "print(\"| Title | Authors | Summary | URL |\")\n",
284
+ "print(\"|-------|---------|---------|-----|\")\n",
285
+ "for result in search.get():\n",
286
+ " # Check if the paper was submitted in the last week\n",
287
+ " if result.submitted.strftime('%Y%m%d%H%M%S') > one_week_ago:\n",
288
+ " print(f\"| {result.title} | {', '.join(author.name for author in result.authors)} | {result.summary} | {result.entry_id} |\")\n",
289
+ "```\n",
290
+ "\n",
291
+ "This will print a markdown table to the console. If you want to write the table to a file, you can open a file and use the `write` method instead of `print`.\n",
292
+ "\n",
293
+ "--------------------------------------------------------------------------------\n",
294
+ "\u001b[31m\n",
295
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
296
+ "\u001b[33mExecutor\u001b[0m (to chat_manager):\n",
297
+ "\n",
298
+ "exitcode: 1 (execution failed)\n",
299
+ "Code output: \n",
300
+ ":15: DeprecationWarning: The 'get' method is deprecated, use 'results' instead\n",
301
+ " for result in search.get():\n",
302
+ "Traceback (most recent call last):\n",
303
+ " File \"\", line 17, in <module>\n",
304
+ " if result.submitted.strftime('%Y%m%d%H%M%S') > one_week_ago:\n",
305
+ "AttributeError: 'Result' object has no attribute 'submitted'\n",
306
+ "\n",
307
+ "\n",
308
+ "--------------------------------------------------------------------------------\n",
309
+ "\u001b[33mEngineer\u001b[0m (to chat_manager):\n",
310
+ "\n",
311
+ "I apologize for the confusion. It seems there was a mistake in the code. The `arxiv` package's `Result` object doesn't have a `submitted` attribute. Instead, we should use the `published` attribute to get the date when the paper was published.\n",
312
+ "\n",
313
+ "Here's the corrected code:\n",
314
+ "\n",
315
+ "```python\n",
316
+ "import arxiv\n",
317
+ "import datetime\n",
318
+ "\n",
319
+ "# Calculate the date one week ago\n",
320
+ "one_week_ago = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime('%Y%m%d%H%M%S')\n",
321
+ "\n",
322
+ "# Search for papers on LLM applications\n",
323
+ "search = arxiv.Search(\n",
324
+ " query=\"LLM applications\",\n",
325
+ " max_results=100,\n",
326
+ " sort_by=arxiv.SortCriterion.Relevance,\n",
327
+ " sort_order=arxiv.SortOrder.Descending\n",
328
+ ")\n",
329
+ "\n",
330
+ "print(\"| Title | Authors | Summary | URL |\")\n",
331
+ "print(\"|-------|---------|---------|-----|\")\n",
332
+ "for result in search.results():\n",
333
+ " # Check if the paper was published in the last week\n",
334
+ " if result.published.strftime('%Y%m%d%H%M%S') > one_week_ago:\n",
335
+ " print(f\"| {result.title} | {', '.join(author.name for author in result.authors)} | {result.summary} | {result.entry_id} |\")\n",
336
+ "```\n",
337
+ "\n",
338
+ "This script will print a markdown table of papers related to \"LLM applications\" that were published in the last week. The table includes the title, authors, summary, and URL of each paper.\n",
339
+ "\n",
340
+ "--------------------------------------------------------------------------------\n",
341
+ "\u001b[31m\n",
342
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
343
+ "\u001b[33mExecutor\u001b[0m (to chat_manager):\n",
344
+ "\n",
345
+ "exitcode: 0 (execution succeeded)\n",
346
+ "Code output: \n",
347
+ "| Title | Authors | Summary | URL |\n",
348
+ "|-------|---------|---------|-----|\n",
349
+ "| Large Language Models as Data Preprocessors | Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada | Large Language Models (LLMs), typified by OpenAI's GPT series and Meta's\n",
350
+ "LLaMA variants, have marked a significant advancement in artificial\n",
351
+ "intelligence. Trained on vast amounts of text data, LLMs are capable of\n",
352
+ "understanding and generating human-like text across a diverse range of topics.\n",
353
+ "This study expands on the applications of LLMs, exploring their potential in\n",
354
+ "data preprocessing, a critical stage in data mining and analytics applications.\n",
355
+ "We delve into the applicability of state-of-the-art LLMs such as GPT-3.5,\n",
356
+ "GPT-4, and Vicuna-13B for error detection, data imputation, schema matching,\n",
357
+ "and entity matching tasks. Alongside showcasing the inherent capabilities of\n",
358
+ "LLMs, we highlight their limitations, particularly in terms of computational\n",
359
+ "expense and inefficiency. We propose an LLM-based framework for data\n",
360
+ "preprocessing, which integrates cutting-edge prompt engineering techniques,\n",
361
+ "coupled with traditional methods like contextualization and feature selection,\n",
362
+ "to improve the performance and efficiency of these models. The effectiveness of\n",
363
+ "LLMs in data preprocessing is evaluated through an experimental study spanning\n",
364
+ "12 datasets. GPT-4 emerged as a standout, achieving 100\\% accuracy or F1 score\n",
365
+ "on 4 datasets, suggesting LLMs' immense potential in these tasks. Despite\n",
366
+ "certain limitations, our study underscores the promise of LLMs in this domain\n",
367
+ "and anticipates future developments to overcome current hurdles. | http://arxiv.org/abs/2308.16361v1 |\n",
368
+ "| Large language models in medicine: the potentials and pitfalls | Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou | Large language models (LLMs) have been applied to tasks in healthcare,\n",
369
+ "ranging from medical exam questions to responding to patient questions. With\n",
370
+ "increasing institutional partnerships between companies producing LLMs and\n",
371
+ "healthcare systems, real world clinical application is coming closer to\n",
372
+ "reality. As these models gain traction, it is essential for healthcare\n",
373
+ "practitioners to understand what LLMs are, their development, their current and\n",
374
+ "potential applications, and the associated pitfalls when utilized in medicine.\n",
375
+ "This review and accompanying tutorial aim to give an overview of these topics\n",
376
+ "to aid healthcare practitioners in understanding the rapidly changing landscape\n",
377
+ "of LLMs as applied to medicine. | http://arxiv.org/abs/2309.00087v1 |\n",
378
+ "| Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng | We introduce Point-Bind, a 3D multi-modality model aligning point clouds with\n",
379
+ "2D image, language, audio, and video. Guided by ImageBind, we construct a joint\n",
380
+ "embedding space between 3D and multi-modalities, enabling many promising\n",
381
+ "applications, e.g., any-to-3D generation, 3D embedding arithmetic, and 3D\n",
382
+ "open-world understanding. On top of this, we further present Point-LLM, the\n",
383
+ "first 3D large language model (LLM) following 3D multi-modal instructions. By\n",
384
+ "parameter-efficient fine-tuning techniques, Point-LLM injects the semantics of\n",
385
+ "Point-Bind into pre-trained LLMs, e.g., LLaMA, which requires no 3D instruction\n",
386
+ "data, but exhibits superior 3D and multi-modal question-answering capacity. We\n",
387
+ "hope our work may cast a light on the community for extending 3D point clouds\n",
388
+ "to multi-modality applications. Code is available at\n",
389
+ "https://github.com/ZiyuGuo99/Point-Bind_Point-LLM. | http://arxiv.org/abs/2309.00615v1 |\n",
390
+ "| Where Would I Go Next? Large Language Models as Human Mobility Predictors | Xinglei Wang, Meng Fang, Zichao Zeng, Tao Cheng | Accurate human mobility prediction underpins many important applications\n",
391
+ "across a variety of domains, including epidemic modelling, transport planning,\n",
392
+ "and emergency responses. Due to the sparsity of mobility data and the\n",
393
+ "stochastic nature of people's daily activities, achieving precise predictions\n",
394
+ "of people's locations remains a challenge. While recently developed large\n",
395
+ "language models (LLMs) have demonstrated superior performance across numerous\n",
396
+ "language-related tasks, their applicability to human mobility studies remains\n",
397
+ "unexplored. Addressing this gap, this article delves into the potential of LLMs\n",
398
+ "for human mobility prediction tasks. We introduce a novel method, LLM-Mob,\n",
399
+ "which leverages the language understanding and reasoning capabilities of LLMs\n",
400
+ "for analysing human mobility data. We present concepts of historical stays and\n",
401
+ "context stays to capture both long-term and short-term dependencies in human\n",
402
+ "movement and enable time-aware prediction by using time information of the\n",
403
+ "prediction target. Additionally, we design context-inclusive prompts that\n",
404
+ "enable LLMs to generate more accurate predictions. Comprehensive evaluations of\n",
405
+ "our method reveal that LLM-Mob excels in providing accurate and interpretable\n",
406
+ "predictions, highlighting the untapped potential of LLMs in advancing human\n",
407
+ "mobility prediction techniques. We posit that our research marks a significant\n",
408
+ "paradigm shift in human mobility modelling, transitioning from building complex\n",
409
+ "domain-specific models to harnessing general-purpose LLMs that yield accurate\n",
410
+ "predictions through language instructions. The code for this work is available\n",
411
+ "at https://github.com/xlwang233/LLM-Mob. | http://arxiv.org/abs/2308.15197v1 |\n",
412
+ "| Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model | Kazuki Hori, Kanata Suzuki, Tetsuya Ogata | The application of the Large Language Model (LLM) to robot action planning\n",
413
+ "has been actively studied. The instructions given to the LLM by natural\n",
414
+ "language may include ambiguity and lack of information depending on the task\n",
415
+ "context. It is possible to adjust the output of LLM by making the instruction\n",
416
+ "input more detailed; however, the design cost is high. In this paper, we\n",
417
+ "propose the interactive robot action planning method that allows the LLM to\n",
418
+ "analyze and gather missing information by asking questions to humans. The\n",
419
+ "method can minimize the design cost of generating precise robot instructions.\n",
420
+ "We demonstrated the effectiveness of our method through concrete examples in\n",
421
+ "cooking tasks. However, our experiments also revealed challenges in robot\n",
422
+ "action planning with LLM, such as asking unimportant questions and assuming\n",
423
+ "crucial information without asking. Shedding light on these issues provides\n",
424
+ "valuable insights for future research on utilizing LLM for robotics. | http://arxiv.org/abs/2308.15684v1 |\n",
425
+ "| AskIt: Unified Programming Interface for Programming with Large Language Models | Katsumi Okuda, Saman Amarasinghe | In the evolving landscape of software development, Large Language Models\n",
426
+ "(LLMs) exhibit a unique phenomenon known as emergent abilities, demonstrating\n",
427
+ "adeptness across numerous tasks, from text summarization to code generation.\n",
428
+ "While these abilities open up novel avenues in software design and crafting,\n",
429
+ "their incorporation presents substantial challenges. Developers grapple with\n",
430
+ "decisions surrounding the direct embedding of LLMs within applications versus\n",
431
+ "employing them for code generation. Moreover, effective prompt design becomes a\n",
432
+ "critical concern, given the necessity of data extraction from natural language\n",
433
+ "outputs. To address these intricacies, this paper introduces AskIt, a\n",
434
+ "domain-specific language (DSL) specifically designed for LLMs. AskIt simplifies\n",
435
+ "LLM integration, offering type-guided output control, template-based function\n",
436
+ "definitions, and a unified interface that diminishes the distinction between\n",
437
+ "LLM-based code generation and application integration. Furthermore, through\n",
438
+ "Programming by Example (PBE), AskIt harnesses the power of few-shot learning at\n",
439
+ "the programming language level. Our evaluations underscore AskIt's potency.\n",
440
+ "Across 50 tasks, AskIt generated concise prompts for the given tasks, achieving\n",
441
+ "a 16.14% reduction in prompt length relative to benchmarks. Additionally, by\n",
442
+ "enabling the transition from direct LLM application usage to function\n",
443
+ "generation, AskIt achieved significant speedups, as observed in our GSM8K\n",
444
+ "benchmark experiments. Through these advancements, AskIt streamlines the\n",
445
+ "integration of LLMs in software development, offering a more efficient,\n",
446
+ "versatile approach for leveraging emergent abilities. The implementations of\n",
447
+ "AskIt in TypeScript and Python are available at\n",
448
+ "https://github.com/katsumiok/ts-askit and https://github.com/katsumiok/pyaskit,\n",
449
+ "respectively. | http://arxiv.org/abs/2308.15645v1 |\n",
450
+ "\n",
451
+ "\n",
452
+ "--------------------------------------------------------------------------------\n",
453
+ "\u001b[33mScientist\u001b[0m (to chat_manager):\n",
454
+ "\n",
455
+ "Here are the papers related to LLM applications published in the last week:\n",
456
+ "\n",
457
+ "| Domain | Title | Authors | Summary | URL |\n",
458
+ "|-------|---------|---------|-----|-----|\n",
459
+ "| Data Preprocessing | [Large Language Models as Data Preprocessors](http://arxiv.org/abs/2308.16361v1) | Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada | This study expands on the applications of LLMs, exploring their potential in data preprocessing, a critical stage in data mining and analytics applications. | [Link](http://arxiv.org/abs/2308.16361v1) |\n",
460
+ "| Medicine | [Large language models in medicine: the potentials and pitfalls](http://arxiv.org/abs/2309.00087v1) | Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou | This review and accompanying tutorial aim to give an overview of these topics to aid healthcare practitioners in understanding the rapidly changing landscape of LLMs as applied to medicine. | [Link](http://arxiv.org/abs/2309.00087v1) |\n",
461
+ "| 3D Understanding, Generation, and Instruction Following | [Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following](http://arxiv.org/abs/2309.00615v1) | Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng | We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video. | [Link](http://arxiv.org/abs/2309.00615v1) |\n",
462
+ "| Human Mobility Prediction | [Where Would I Go Next? Large Language Models as Human Mobility Predictors](http://arxiv.org/abs/2308.15197v1) | Xinglei Wang, Meng Fang, Zichao Zeng, Tao Cheng | This article delves into the potential of LLMs for human mobility prediction tasks. | [Link](http://arxiv.org/abs/2308.15197v1) |\n",
463
+ "| Robotics | [Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model](http://arxiv.org/abs/2308.15684v1) | Kazuki Hori, Kanata Suzuki, Tetsuya Ogata | In this paper, we propose the interactive robot action planning method that allows the LLM to analyze and gather missing information by asking questions to humans. | [Link](http://arxiv.org/abs/2308.15684v1) |\n",
464
+ "| Software Development | [AskIt: Unified Programming Interface for Programming with Large Language Models](http://arxiv.org/abs/2308.15645v1) | Katsumi Okuda, Saman Amarasinghe | This paper introduces AskIt, a domain-specific language (DSL) specifically designed for LLMs. | [Link](http://arxiv.org/abs/2308.15645v1) |\n",
465
+ "\n",
466
+ "--------------------------------------------------------------------------------\n",
467
+ "\u001b[33mCritic\u001b[0m (to chat_manager):\n",
468
+ "\n",
469
+ "The scientist has done a good job categorizing the papers into different domains and providing a summary for each. The markdown table is correctly formatted and includes the source URL for each paper, which allows for verification of the information. The domains identified are diverse, indicating a broad range of applications for Large Language Models (LLMs). This review and categorization should provide a useful resource for anyone interested in the recent applications of LLMs.\n",
470
+ "\n",
471
+ "--------------------------------------------------------------------------------\n"
472
+ ]
473
+ }
474
+ ],
475
+ "source": [
476
+ "user_proxy.initiate_chat(\n",
477
+ " manager,\n",
478
+ " message=\"\"\"\n",
479
+ "find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.\n",
480
+ "\"\"\",\n",
481
+ ")"
482
+ ]
483
+ },
484
+ {
485
+ "attachments": {},
486
+ "cell_type": "markdown",
487
+ "metadata": {},
488
+ "source": [
489
+ "## Create Group Chat without Critic for Comparison"
490
+ ]
491
+ },
492
+ {
493
+ "cell_type": "code",
494
+ "execution_count": 5,
495
+ "metadata": {},
496
+ "outputs": [
497
+ {
498
+ "name": "stdout",
499
+ "output_type": "stream",
500
+ "text": [
501
+ "\u001b[33mAdmin\u001b[0m (to chat_manager):\n",
502
+ "\n",
503
+ "\n",
504
+ "find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.\n",
505
+ "\n",
506
+ "\n",
507
+ "--------------------------------------------------------------------------------\n",
508
+ "\u001b[33mPlanner\u001b[0m (to chat_manager):\n",
509
+ "\n",
510
+ "Plan:\n",
511
+ "\n",
512
+ "1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link to the paper.\n",
513
+ "\n",
514
+ "2. Scientist: Review the scraped data to identify the different domains in which LLM is applied. This could be based on keywords in the title or abstract, or the scientist's knowledge of the field.\n",
515
+ "\n",
516
+ "3. Engineer: Modify the script to categorize the papers based on the domains identified by the scientist. The script should output a markdown table with columns for the domain, title, authors, abstract, and link.\n",
517
+ "\n",
518
+ "4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate.\n",
519
+ "\n",
520
+ "5. Engineer: Make any necessary revisions to the script based on the scientist's feedback.\n",
521
+ "\n",
522
+ "6. Scientist: Give final approval of the markdown table.\n",
523
+ "\n",
524
+ "7. Engineer: Submit the final markdown table.\n",
525
+ "\n",
526
+ "--------------------------------------------------------------------------------\n"
527
+ ]
528
+ }
529
+ ],
530
+ "source": [
531
+ "groupchat_nocritic = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor], messages=[], max_round=50)\n",
532
+ "for agent in groupchat.agents:\n",
533
+ " agent.reset()\n",
534
+ "manager_nocritic = autogen.GroupChatManager(groupchat=groupchat_nocritic, llm_config=gpt4_config)\n",
535
+ "user_proxy.initiate_chat(\n",
536
+ " manager_nocritic,\n",
537
+ " message=\"\"\"\n",
538
+ "find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.\n",
539
+ "\"\"\",\n",
540
+ ")"
541
+ ]
542
+ }
543
+ ],
544
+ "metadata": {
545
+ "kernelspec": {
546
+ "display_name": "flaml",
547
+ "language": "python",
548
+ "name": "python3"
549
+ },
550
+ "language_info": {
551
+ "codemirror_mode": {
552
+ "name": "ipython",
553
+ "version": 3
554
+ },
555
+ "file_extension": ".py",
556
+ "mimetype": "text/x-python",
557
+ "name": "python",
558
+ "nbconvert_exporter": "python",
559
+ "pygments_lexer": "ipython3",
560
+ "version": "3.9.17"
561
+ },
562
+ "orig_nbformat": 4
563
+ },
564
+ "nbformat": 4,
565
+ "nbformat_minor": 2
566
+ }
agentchat_groupchat_vis.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
agentchat_human_feedback.ipynb ADDED
@@ -0,0 +1,801 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_human_feedback.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "# Auto Generated Agent Chat: Task Solving with Code Generation, Execution, Debugging & Human Feedback\n",
21
+ "\n",
22
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
23
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
24
+ "\n",
25
+ "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to solve a challenging math problem with human feedback. Here `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. By setting `human_input_mode` properly, the `UserProxyAgent` can also prompt the user for feedback to `AssistantAgent`. For example, when `human_input_mode` is set to \"ALWAYS\", the `UserProxyAgent` will always prompt the user for feedback. When user feedback is provided, the `UserProxyAgent` will directly pass the feedback to `AssistantAgent`. When no user feedback is provided, the `UserProxyAgent` will execute the code written by `AssistantAgent` and return the execution results (success or failure and corresponding outputs) to `AssistantAgent`.\n",
26
+ "\n",
27
+ "## Requirements\n",
28
+ "\n",
29
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
30
+ "```bash\n",
31
+ "pip install pyautogen\n",
32
+ "```"
33
+ ]
34
+ },
35
+ {
36
+ "cell_type": "code",
37
+ "execution_count": 1,
38
+ "metadata": {
39
+ "execution": {
40
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
41
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
42
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
43
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
44
+ }
45
+ },
46
+ "outputs": [],
47
+ "source": [
48
+ "# %pip install pyautogen~=0.1.0"
49
+ ]
50
+ },
51
+ {
52
+ "attachments": {},
53
+ "cell_type": "markdown",
54
+ "metadata": {},
55
+ "source": [
56
+ "## Set your API Endpoint\n",
57
+ "\n",
58
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file."
59
+ ]
60
+ },
61
+ {
62
+ "cell_type": "code",
63
+ "execution_count": 2,
64
+ "metadata": {},
65
+ "outputs": [],
66
+ "source": [
67
+ "import autogen\n",
68
+ "\n",
69
+ "config_list = autogen.config_list_from_json(\"OAI_CONFIG_LIST\")"
70
+ ]
71
+ },
72
+ {
73
+ "attachments": {},
74
+ "cell_type": "markdown",
75
+ "metadata": {},
76
+ "source": [
77
+ "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\".\n",
78
+ "\n",
79
+ "The config list looks like the following:\n",
80
+ "```python\n",
81
+ "config_list = [\n",
82
+ " {\n",
83
+ " 'model': 'gpt-4',\n",
84
+ " 'api_key': '<your OpenAI API key here>',\n",
85
+ " }, # OpenAI API endpoint for gpt-4\n",
86
+ " {\n",
87
+ " 'model': 'gpt-4',\n",
88
+ " 'api_key': '<your first Azure OpenAI API key here>',\n",
89
+ " 'api_base': '<your first Azure OpenAI API base here>',\n",
90
+ " 'api_type': 'azure',\n",
91
+ " 'api_version': '2023-06-01-preview',\n",
92
+ " }, # Azure OpenAI API endpoint for gpt-4\n",
93
+ " {\n",
94
+ " 'model': 'gpt-4',\n",
95
+ " 'api_key': '<your second Azure OpenAI API key here>',\n",
96
+ " 'api_base': '<your second Azure OpenAI API base here>',\n",
97
+ " 'api_type': 'azure',\n",
98
+ " 'api_version': '2023-06-01-preview',\n",
99
+ " }, # another Azure OpenAI API endpoint for gpt-4\n",
100
+ " {\n",
101
+ " 'model': 'gpt-3.5-turbo',\n",
102
+ " 'api_key': '<your OpenAI API key here>',\n",
103
+ " }, # OpenAI API endpoint for gpt-3.5-turbo\n",
104
+ " {\n",
105
+ " 'model': 'gpt-3.5-turbo',\n",
106
+ " 'api_key': '<your first Azure OpenAI API key here>',\n",
107
+ " 'api_base': '<your first Azure OpenAI API base here>',\n",
108
+ " 'api_type': 'azure',\n",
109
+ " 'api_version': '2023-06-01-preview',\n",
110
+ " }, # Azure OpenAI API endpoint for gpt-3.5-turbo\n",
111
+ " {\n",
112
+ " 'model': 'gpt-3.5-turbo',\n",
113
+ " 'api_key': '<your second Azure OpenAI API key here>',\n",
114
+ " 'api_base': '<your second Azure OpenAI API base here>',\n",
115
+ " 'api_type': 'azure',\n",
116
+ " 'api_version': '2023-06-01-preview',\n",
117
+ " }, # another Azure OpenAI API endpoint for gpt-3.5-turbo\n",
118
+ "]\n",
119
+ "```\n",
120
+ "\n",
121
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
122
+ "\n",
123
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
124
+ ]
125
+ },
126
+ {
127
+ "attachments": {},
128
+ "cell_type": "markdown",
129
+ "metadata": {},
130
+ "source": [
131
+ "## Construct Agents\n",
132
+ "\n",
133
+ "We construct the assistant agent and the user proxy agent."
134
+ ]
135
+ },
136
+ {
137
+ "cell_type": "code",
138
+ "execution_count": 3,
139
+ "metadata": {},
140
+ "outputs": [],
141
+ "source": [
142
+ "# create an AssistantAgent instance named \"assistant\"\n",
143
+ "assistant = autogen.AssistantAgent(\n",
144
+ " name=\"assistant\",\n",
145
+ " llm_config={\n",
146
+ " \"seed\": 41,\n",
147
+ " \"config_list\": config_list,\n",
148
+ " }\n",
149
+ ")\n",
150
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
151
+ "user_proxy = autogen.UserProxyAgent(\n",
152
+ " name=\"user_proxy\",\n",
153
+ " human_input_mode=\"ALWAYS\",\n",
154
+ " is_termination_msg=lambda x: x.get(\"content\", \"\").rstrip().endswith(\"TERMINATE\"),\n",
155
+ ")\n",
156
+ "\n",
157
+ "# the purpose of the following line is to log the conversation history\n",
158
+ "autogen.ChatCompletion.start_logging()\n"
159
+ ]
160
+ },
161
+ {
162
+ "attachments": {},
163
+ "cell_type": "markdown",
164
+ "metadata": {},
165
+ "source": [
166
+ "## Perform a task\n",
167
+ "\n",
168
+ "We invoke the `initiate_chat()` method of the user proxy agent to start the conversation. When you run the cell below, you will be prompted to provide feedback after receving a message from the assistant agent. If you don't provide any feedback (by pressing Enter directly), the user proxy agent will try to execute the code suggested by the assistant agent on behalf of you, or terminate if the assistant agent sends a \"TERMINATE\" signal in the end of the message."
169
+ ]
170
+ },
171
+ {
172
+ "cell_type": "code",
173
+ "execution_count": 4,
174
+ "metadata": {},
175
+ "outputs": [
176
+ {
177
+ "name": "stdout",
178
+ "output_type": "stream",
179
+ "text": [
180
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
181
+ "\n",
182
+ "\n",
183
+ "Find $a + b + c$, given that $x+y \\neq -1$ and \n",
184
+ "\\begin{align}\n",
185
+ "\tax + by + c & = x + 7,\\\n",
186
+ "\ta + bx + cy & = 2x + 6y,\\\n",
187
+ "\tay + b + cx & = 4x + y.\n",
188
+ "\\end{align}.\n",
189
+ "\n",
190
+ "\n",
191
+ "--------------------------------------------------------------------------------\n",
192
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
193
+ "\n",
194
+ "To solve the system of linear equations, I will use NumPy library in Python to calculate the coefficients and constants and then use the linalg.solve() function to find the values of a, b, and c. Finally, I will add them up to find $a + b + c$.\n",
195
+ "\n",
196
+ "Step 1: Define the coefficients and constants\n",
197
+ "\n",
198
+ "Step 2: Solve the system of equations using NumPy\n",
199
+ "\n",
200
+ "Step 3: Calculate the sum of a, b, and c\n",
201
+ "\n",
202
+ "--------------------------------------------------------------------------------\n",
203
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
204
+ "\n",
205
+ "use sympy\n",
206
+ "\n",
207
+ "--------------------------------------------------------------------------------\n",
208
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
209
+ "\n",
210
+ "Sure, I will use Sympy to solve the system of linear equations.\n",
211
+ "\n",
212
+ "Step 1: Import Sympy library and define the symbols\n",
213
+ "``` python\n",
214
+ "from sympy import symbols, Eq, solve\n",
215
+ "\n",
216
+ "a, b, c, x, y = symbols('a b c x y')\n",
217
+ "```\n",
218
+ "\n",
219
+ "Step 2: Define the equations\n",
220
+ "``` python\n",
221
+ "eq1 = Eq(a * x + b * y + c, x + 7)\n",
222
+ "eq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\n",
223
+ "eq3 = Eq(a * y + b + c * x, 4 * x + y)\n",
224
+ "```\n",
225
+ "\n",
226
+ "Step 3: Solve the system of equations\n",
227
+ "``` python\n",
228
+ "solutions = solve((eq1, eq2, eq3), (a, b, c))\n",
229
+ "```\n",
230
+ "\n",
231
+ "Step 4: Calculate the sum of a, b, and c\n",
232
+ "``` python\n",
233
+ "a_value, b_value, c_value = solutions[a], solutions[b], solutions[c]\n",
234
+ "sum_abc = a_value + b_value + c_value\n",
235
+ "```\n",
236
+ "\n",
237
+ "\n",
238
+ "\n",
239
+ "--------------------------------------------------------------------------------\n",
240
+ "\u001b[31m\n",
241
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
242
+ "\u001b[31m\n",
243
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
244
+ "\u001b[31m\n",
245
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
246
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
247
+ "\n",
248
+ "exitcode: 1 (execution failed)\n",
249
+ "Code output: \n",
250
+ " File \"\", line 2\n",
251
+ " Step 2: Define the equations\n",
252
+ " ^\n",
253
+ "SyntaxError: invalid syntax\n",
254
+ "\n",
255
+ "\n",
256
+ "--------------------------------------------------------------------------------\n",
257
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
258
+ "\n",
259
+ "Apologies for the confusion. I should have provided the code in a single block without the step-by-step comments. Here is the code to solve the system of linear equations using Sympy:\n",
260
+ "\n",
261
+ "```python\n",
262
+ "from sympy import symbols, Eq, solve\n",
263
+ "\n",
264
+ "a, b, c, x, y = symbols('a b c x y')\n",
265
+ "\n",
266
+ "eq1 = Eq(a * x + b * y + c, x + 7)\n",
267
+ "eq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\n",
268
+ "eq3 = Eq(a * y + b + c * x, 4 * x + y)\n",
269
+ "\n",
270
+ "solutions = solve((eq1, eq2, eq3), (a, b, c))\n",
271
+ "\n",
272
+ "a_value, b_value, c_value = solutions[a], solutions[b], solutions[c]\n",
273
+ "sum_abc = a_value + b_value + c_value\n",
274
+ "print(sum_abc)\n",
275
+ "```\n",
276
+ "\n",
277
+ "Please run the provided code to find the sum of a, b, and c.\n",
278
+ "\n",
279
+ "--------------------------------------------------------------------------------\n",
280
+ "\u001b[31m\n",
281
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
282
+ "\u001b[31m\n",
283
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
284
+ "\u001b[31m\n",
285
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
286
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
287
+ "\n",
288
+ "exitcode: 0 (execution succeeded)\n",
289
+ "Code output: \n",
290
+ "(2*x**2 - 3*x + y)/(x**2 - x*y - x + y**2 - y + 1) + (x**2 - 3*x*y + 2*x + y**2 - y)/(x**2 - x*y - x + y**2 - y + 1) + (4*x**2 - 4*x*y - 6*x + 6*y**2 - 7*y + 7)/(x**2 - x*y - x + y**2 - y + 1)\n",
291
+ "\n",
292
+ "\n",
293
+ "--------------------------------------------------------------------------------\n",
294
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
295
+ "\n",
296
+ "I apologize for the confusion. The code provided the sum of a, b, and c in a symbolic representation. To get an answer, we need to put some constraints on x and y for which x + y is not equal to -1.\n",
297
+ "\n",
298
+ "We can choose any values for x and y that don't satisfy x + y = -1, and then substitute them into the equation to find the corresponding sum of a, b, and c. Let's use x = 1 and y = 0, which satisfy the given condition.\n",
299
+ "\n",
300
+ "Here is the updated code with the chosen values for x and y:\n",
301
+ "\n",
302
+ "```python\n",
303
+ "from sympy import symbols, Eq, solve\n",
304
+ "\n",
305
+ "a, b, c, x, y = symbols('a b c x y')\n",
306
+ "\n",
307
+ "eq1 = Eq(a * x + b * y + c, x + 7)\n",
308
+ "eq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\n",
309
+ "eq3 = Eq(a * y + b + c * x, 4 * x + y)\n",
310
+ "\n",
311
+ "solutions = solve((eq1, eq2, eq3), (a, b, c))\n",
312
+ "\n",
313
+ "chosen_x = 1\n",
314
+ "chosen_y = 0\n",
315
+ "\n",
316
+ "a_value, b_value, c_value = solutions[a].subs(x, chosen_x).subs(y, chosen_y), solutions[b].subs(x, chosen_x).subs(y, chosen_y), solutions[c].subs(x, chosen_x).subs(y, chosen_y)\n",
317
+ "sum_abc = a_value + b_value + c_value\n",
318
+ "print(sum_abc)\n",
319
+ "```\n",
320
+ "\n",
321
+ "Please run the provided code again to find the sum of a, b, and c.\n",
322
+ "\n",
323
+ "--------------------------------------------------------------------------------\n",
324
+ "\u001b[31m\n",
325
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
326
+ "\u001b[31m\n",
327
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
328
+ "\u001b[31m\n",
329
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
330
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
331
+ "\n",
332
+ "exitcode: 0 (execution succeeded)\n",
333
+ "Code output: \n",
334
+ "7\n",
335
+ "\n",
336
+ "\n",
337
+ "--------------------------------------------------------------------------------\n",
338
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
339
+ "\n",
340
+ "Great! Now we have the sum of a, b, and c. The answer is:\n",
341
+ "\n",
342
+ "$$a + b + c = 7$$\n",
343
+ "\n",
344
+ "TERMINATE\n",
345
+ "\n",
346
+ "--------------------------------------------------------------------------------\n",
347
+ "\u001b[31m\n",
348
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
349
+ ]
350
+ }
351
+ ],
352
+ "source": [
353
+ "math_problem_to_solve = \"\"\"\n",
354
+ "Find $a + b + c$, given that $x+y \\\\neq -1$ and \n",
355
+ "\\\\begin{align}\n",
356
+ "\tax + by + c & = x + 7,\\\\\n",
357
+ "\ta + bx + cy & = 2x + 6y,\\\\\n",
358
+ "\tay + b + cx & = 4x + y.\n",
359
+ "\\\\end{align}.\n",
360
+ "\"\"\"\n",
361
+ "\n",
362
+ "# the assistant receives a message from the user, which contains the task description\n",
363
+ "user_proxy.initiate_chat(assistant, message=math_problem_to_solve)"
364
+ ]
365
+ },
366
+ {
367
+ "attachments": {},
368
+ "cell_type": "markdown",
369
+ "metadata": {},
370
+ "source": [
371
+ "## Analyze the conversation\n",
372
+ "\n",
373
+ "The human user can provide feedback at each step. When the human user didn't provide feedback, the code was executed. The executed results and error messages are returned to the assistant, and the assistant was able to modify the code based on the feedback. In the end, the task is complete and a \"TERMINATE\" signal is sent from the assistant. The user skipped feedback in the end and the conversation is finished.\n",
374
+ "\n",
375
+ "After the conversation is finished, we can save the log of the conversation between the two agents. The log can be accessed from `autogen.ChatCompletion.logged_history`."
376
+ ]
377
+ },
378
+ {
379
+ "cell_type": "code",
380
+ "execution_count": 5,
381
+ "metadata": {},
382
+ "outputs": [
383
+ {
384
+ "name": "stdout",
385
+ "output_type": "stream",
386
+ "text": [
387
+ "{'[{\"content\": \"You are a helpful AI assistant.\\\\nSolve tasks using your coding and language skills.\\\\nIn the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.\\\\n 1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.\\\\n 2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.\\\\nSolve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.\\\\nWhen using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can\\'t modify your code. So do not suggest incomplete code which requires users to modify. Don\\'t use a code block if it\\'s not intended to be executed by the user.\\\\nIf you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don\\'t include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use \\'print\\' function for the output when relevant. Check the execution result returned by the user.\\\\nIf the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can\\'t be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\\\\nWhen you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.\\\\nReply \\\\\"TERMINATE\\\\\" in the end when everything is done.\\\\n \", \"role\": \"system\"}, {\"content\": \"\\\\nFind $a + b + c$, given that $x+y \\\\\\\\neq -1$ and \\\\n\\\\\\\\begin{align}\\\\n\\\\tax + by + c & = x + 7,\\\\\\\\\\\\n\\\\ta + bx + cy & = 2x + 6y,\\\\\\\\\\\\n\\\\tay + b + cx & = 4x + y.\\\\n\\\\\\\\end{align}.\\\\n\", \"role\": \"user\"}, {\"content\": \"To solve the system of linear equations, I will use NumPy library in Python to calculate the coefficients and constants and then use the linalg.solve() function to find the values of a, b, and c. Finally, I will add them up to find $a + b + c$.\\\\n\\\\nStep 1: Define the coefficients and constants\\\\n\\\\nStep 2: Solve the system of equations using NumPy\\\\n\\\\nStep 3: Calculate the sum of a, b, and c\", \"role\": \"assistant\"}, {\"content\": \"use sympy\", \"role\": \"user\"}, {\"content\": \"Sure, I will use Sympy to solve the system of linear equations.\\\\n\\\\nStep 1: Import Sympy library and define the symbols\\\\n``` python\\\\nfrom sympy import symbols, Eq, solve\\\\n\\\\na, b, c, x, y = symbols(\\'a b c x y\\')\\\\n```\\\\n\\\\nStep 2: Define the equations\\\\n``` python\\\\neq1 = Eq(a * x + b * y + c, x + 7)\\\\neq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\\\\neq3 = Eq(a * y + b + c * x, 4 * x + y)\\\\n```\\\\n\\\\nStep 3: Solve the system of equations\\\\n``` python\\\\nsolutions = solve((eq1, eq2, eq3), (a, b, c))\\\\n```\\\\n\\\\nStep 4: Calculate the sum of a, b, and c\\\\n``` python\\\\na_value, b_value, c_value = solutions[a], solutions[b], solutions[c]\\\\nsum_abc = a_value + b_value + c_value\\\\n```\\\\n\\\\n\", \"role\": \"assistant\"}, {\"content\": \"exitcode: 1 (execution failed)\\\\nCode output: \\\\n File \\\\\"\\\\\", line 2\\\\n Step 2: Define the equations\\\\n ^\\\\nSyntaxError: invalid syntax\\\\n\", \"role\": \"user\"}, {\"content\": \"Apologies for the confusion. I should have provided the code in a single block without the step-by-step comments. Here is the code to solve the system of linear equations using Sympy:\\\\n\\\\n```python\\\\nfrom sympy import symbols, Eq, solve\\\\n\\\\na, b, c, x, y = symbols(\\'a b c x y\\')\\\\n\\\\neq1 = Eq(a * x + b * y + c, x + 7)\\\\neq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\\\\neq3 = Eq(a * y + b + c * x, 4 * x + y)\\\\n\\\\nsolutions = solve((eq1, eq2, eq3), (a, b, c))\\\\n\\\\na_value, b_value, c_value = solutions[a], solutions[b], solutions[c]\\\\nsum_abc = a_value + b_value + c_value\\\\nprint(sum_abc)\\\\n```\\\\n\\\\nPlease run the provided code to find the sum of a, b, and c.\", \"role\": \"assistant\"}, {\"content\": \"exitcode: 0 (execution succeeded)\\\\nCode output: \\\\n(2*x**2 - 3*x + y)/(x**2 - x*y - x + y**2 - y + 1) + (x**2 - 3*x*y + 2*x + y**2 - y)/(x**2 - x*y - x + y**2 - y + 1) + (4*x**2 - 4*x*y - 6*x + 6*y**2 - 7*y + 7)/(x**2 - x*y - x + y**2 - y + 1)\\\\n\", \"role\": \"user\"}, {\"content\": \"I apologize for the confusion. The code provided the sum of a, b, and c in a symbolic representation. To get an answer, we need to put some constraints on x and y for which x + y is not equal to -1.\\\\n\\\\nWe can choose any values for x and y that don\\'t satisfy x + y = -1, and then substitute them into the equation to find the corresponding sum of a, b, and c. Let\\'s use x = 1 and y = 0, which satisfy the given condition.\\\\n\\\\nHere is the updated code with the chosen values for x and y:\\\\n\\\\n```python\\\\nfrom sympy import symbols, Eq, solve\\\\n\\\\na, b, c, x, y = symbols(\\'a b c x y\\')\\\\n\\\\neq1 = Eq(a * x + b * y + c, x + 7)\\\\neq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\\\\neq3 = Eq(a * y + b + c * x, 4 * x + y)\\\\n\\\\nsolutions = solve((eq1, eq2, eq3), (a, b, c))\\\\n\\\\nchosen_x = 1\\\\nchosen_y = 0\\\\n\\\\na_value, b_value, c_value = solutions[a].subs(x, chosen_x).subs(y, chosen_y), solutions[b].subs(x, chosen_x).subs(y, chosen_y), solutions[c].subs(x, chosen_x).subs(y, chosen_y)\\\\nsum_abc = a_value + b_value + c_value\\\\nprint(sum_abc)\\\\n```\\\\n\\\\nPlease run the provided code again to find the sum of a, b, and c.\", \"role\": \"assistant\"}, {\"content\": \"exitcode: 0 (execution succeeded)\\\\nCode output: \\\\n7\\\\n\", \"role\": \"user\"}, {\"content\": \"Great! Now we have the sum of a, b, and c. The answer is:\\\\n\\\\n$$a + b + c = 7$$\\\\n\\\\nTERMINATE\", \"role\": \"assistant\"}]': {'created_at': [0, 1, 2, 3, 4], 'cost': [0.022019999999999998, 0.03305999999999999, 0.04019999999999999, 0.058589999999999996, 0.050969999999999994]}}\n"
388
+ ]
389
+ }
390
+ ],
391
+ "source": [
392
+ "print(autogen.ChatCompletion.logged_history)"
393
+ ]
394
+ },
395
+ {
396
+ "cell_type": "code",
397
+ "execution_count": 6,
398
+ "metadata": {},
399
+ "outputs": [],
400
+ "source": [
401
+ "import json\n",
402
+ "\n",
403
+ "json.dump(autogen.ChatCompletion.logged_history, open(\"conversations.json\", \"w\"), indent=2)"
404
+ ]
405
+ }
406
+ ],
407
+ "metadata": {
408
+ "kernelspec": {
409
+ "display_name": "Python 3",
410
+ "language": "python",
411
+ "name": "python3"
412
+ },
413
+ "language_info": {
414
+ "codemirror_mode": {
415
+ "name": "ipython",
416
+ "version": 3
417
+ },
418
+ "file_extension": ".py",
419
+ "mimetype": "text/x-python",
420
+ "name": "python",
421
+ "nbconvert_exporter": "python",
422
+ "pygments_lexer": "ipython3",
423
+ "version": "3.9.17"
424
+ },
425
+ "vscode": {
426
+ "interpreter": {
427
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
428
+ }
429
+ },
430
+ "widgets": {
431
+ "application/vnd.jupyter.widget-state+json": {
432
+ "state": {
433
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
434
+ "model_module": "@jupyter-widgets/base",
435
+ "model_module_version": "2.0.0",
436
+ "model_name": "LayoutModel",
437
+ "state": {
438
+ "_model_module": "@jupyter-widgets/base",
439
+ "_model_module_version": "2.0.0",
440
+ "_model_name": "LayoutModel",
441
+ "_view_count": null,
442
+ "_view_module": "@jupyter-widgets/base",
443
+ "_view_module_version": "2.0.0",
444
+ "_view_name": "LayoutView",
445
+ "align_content": null,
446
+ "align_items": null,
447
+ "align_self": null,
448
+ "border_bottom": null,
449
+ "border_left": null,
450
+ "border_right": null,
451
+ "border_top": null,
452
+ "bottom": null,
453
+ "display": null,
454
+ "flex": null,
455
+ "flex_flow": null,
456
+ "grid_area": null,
457
+ "grid_auto_columns": null,
458
+ "grid_auto_flow": null,
459
+ "grid_auto_rows": null,
460
+ "grid_column": null,
461
+ "grid_gap": null,
462
+ "grid_row": null,
463
+ "grid_template_areas": null,
464
+ "grid_template_columns": null,
465
+ "grid_template_rows": null,
466
+ "height": null,
467
+ "justify_content": null,
468
+ "justify_items": null,
469
+ "left": null,
470
+ "margin": null,
471
+ "max_height": null,
472
+ "max_width": null,
473
+ "min_height": null,
474
+ "min_width": null,
475
+ "object_fit": null,
476
+ "object_position": null,
477
+ "order": null,
478
+ "overflow": null,
479
+ "padding": null,
480
+ "right": null,
481
+ "top": null,
482
+ "visibility": null,
483
+ "width": null
484
+ }
485
+ },
486
+ "454146d0f7224f038689031002906e6f": {
487
+ "model_module": "@jupyter-widgets/controls",
488
+ "model_module_version": "2.0.0",
489
+ "model_name": "HBoxModel",
490
+ "state": {
491
+ "_dom_classes": [],
492
+ "_model_module": "@jupyter-widgets/controls",
493
+ "_model_module_version": "2.0.0",
494
+ "_model_name": "HBoxModel",
495
+ "_view_count": null,
496
+ "_view_module": "@jupyter-widgets/controls",
497
+ "_view_module_version": "2.0.0",
498
+ "_view_name": "HBoxView",
499
+ "box_style": "",
500
+ "children": [
501
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
502
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
503
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
504
+ ],
505
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
506
+ "tabbable": null,
507
+ "tooltip": null
508
+ }
509
+ },
510
+ "577e1e3cc4db4942b0883577b3b52755": {
511
+ "model_module": "@jupyter-widgets/controls",
512
+ "model_module_version": "2.0.0",
513
+ "model_name": "FloatProgressModel",
514
+ "state": {
515
+ "_dom_classes": [],
516
+ "_model_module": "@jupyter-widgets/controls",
517
+ "_model_module_version": "2.0.0",
518
+ "_model_name": "FloatProgressModel",
519
+ "_view_count": null,
520
+ "_view_module": "@jupyter-widgets/controls",
521
+ "_view_module_version": "2.0.0",
522
+ "_view_name": "ProgressView",
523
+ "bar_style": "success",
524
+ "description": "",
525
+ "description_allow_html": false,
526
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
527
+ "max": 1,
528
+ "min": 0,
529
+ "orientation": "horizontal",
530
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
531
+ "tabbable": null,
532
+ "tooltip": null,
533
+ "value": 1
534
+ }
535
+ },
536
+ "6086462a12d54bafa59d3c4566f06cb2": {
537
+ "model_module": "@jupyter-widgets/base",
538
+ "model_module_version": "2.0.0",
539
+ "model_name": "LayoutModel",
540
+ "state": {
541
+ "_model_module": "@jupyter-widgets/base",
542
+ "_model_module_version": "2.0.0",
543
+ "_model_name": "LayoutModel",
544
+ "_view_count": null,
545
+ "_view_module": "@jupyter-widgets/base",
546
+ "_view_module_version": "2.0.0",
547
+ "_view_name": "LayoutView",
548
+ "align_content": null,
549
+ "align_items": null,
550
+ "align_self": null,
551
+ "border_bottom": null,
552
+ "border_left": null,
553
+ "border_right": null,
554
+ "border_top": null,
555
+ "bottom": null,
556
+ "display": null,
557
+ "flex": null,
558
+ "flex_flow": null,
559
+ "grid_area": null,
560
+ "grid_auto_columns": null,
561
+ "grid_auto_flow": null,
562
+ "grid_auto_rows": null,
563
+ "grid_column": null,
564
+ "grid_gap": null,
565
+ "grid_row": null,
566
+ "grid_template_areas": null,
567
+ "grid_template_columns": null,
568
+ "grid_template_rows": null,
569
+ "height": null,
570
+ "justify_content": null,
571
+ "justify_items": null,
572
+ "left": null,
573
+ "margin": null,
574
+ "max_height": null,
575
+ "max_width": null,
576
+ "min_height": null,
577
+ "min_width": null,
578
+ "object_fit": null,
579
+ "object_position": null,
580
+ "order": null,
581
+ "overflow": null,
582
+ "padding": null,
583
+ "right": null,
584
+ "top": null,
585
+ "visibility": null,
586
+ "width": null
587
+ }
588
+ },
589
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
590
+ "model_module": "@jupyter-widgets/controls",
591
+ "model_module_version": "2.0.0",
592
+ "model_name": "ProgressStyleModel",
593
+ "state": {
594
+ "_model_module": "@jupyter-widgets/controls",
595
+ "_model_module_version": "2.0.0",
596
+ "_model_name": "ProgressStyleModel",
597
+ "_view_count": null,
598
+ "_view_module": "@jupyter-widgets/base",
599
+ "_view_module_version": "2.0.0",
600
+ "_view_name": "StyleView",
601
+ "bar_color": null,
602
+ "description_width": ""
603
+ }
604
+ },
605
+ "7d3f3d9e15894d05a4d188ff4f466554": {
606
+ "model_module": "@jupyter-widgets/controls",
607
+ "model_module_version": "2.0.0",
608
+ "model_name": "HTMLStyleModel",
609
+ "state": {
610
+ "_model_module": "@jupyter-widgets/controls",
611
+ "_model_module_version": "2.0.0",
612
+ "_model_name": "HTMLStyleModel",
613
+ "_view_count": null,
614
+ "_view_module": "@jupyter-widgets/base",
615
+ "_view_module_version": "2.0.0",
616
+ "_view_name": "StyleView",
617
+ "background": null,
618
+ "description_width": "",
619
+ "font_size": null,
620
+ "text_color": null
621
+ }
622
+ },
623
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
624
+ "model_module": "@jupyter-widgets/controls",
625
+ "model_module_version": "2.0.0",
626
+ "model_name": "HTMLModel",
627
+ "state": {
628
+ "_dom_classes": [],
629
+ "_model_module": "@jupyter-widgets/controls",
630
+ "_model_module_version": "2.0.0",
631
+ "_model_name": "HTMLModel",
632
+ "_view_count": null,
633
+ "_view_module": "@jupyter-widgets/controls",
634
+ "_view_module_version": "2.0.0",
635
+ "_view_name": "HTMLView",
636
+ "description": "",
637
+ "description_allow_html": false,
638
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
639
+ "placeholder": "​",
640
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
641
+ "tabbable": null,
642
+ "tooltip": null,
643
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
644
+ }
645
+ },
646
+ "ca245376fd9f4354af6b2befe4af4466": {
647
+ "model_module": "@jupyter-widgets/controls",
648
+ "model_module_version": "2.0.0",
649
+ "model_name": "HTMLStyleModel",
650
+ "state": {
651
+ "_model_module": "@jupyter-widgets/controls",
652
+ "_model_module_version": "2.0.0",
653
+ "_model_name": "HTMLStyleModel",
654
+ "_view_count": null,
655
+ "_view_module": "@jupyter-widgets/base",
656
+ "_view_module_version": "2.0.0",
657
+ "_view_name": "StyleView",
658
+ "background": null,
659
+ "description_width": "",
660
+ "font_size": null,
661
+ "text_color": null
662
+ }
663
+ },
664
+ "dc83c7bff2f241309537a8119dfc7555": {
665
+ "model_module": "@jupyter-widgets/base",
666
+ "model_module_version": "2.0.0",
667
+ "model_name": "LayoutModel",
668
+ "state": {
669
+ "_model_module": "@jupyter-widgets/base",
670
+ "_model_module_version": "2.0.0",
671
+ "_model_name": "LayoutModel",
672
+ "_view_count": null,
673
+ "_view_module": "@jupyter-widgets/base",
674
+ "_view_module_version": "2.0.0",
675
+ "_view_name": "LayoutView",
676
+ "align_content": null,
677
+ "align_items": null,
678
+ "align_self": null,
679
+ "border_bottom": null,
680
+ "border_left": null,
681
+ "border_right": null,
682
+ "border_top": null,
683
+ "bottom": null,
684
+ "display": null,
685
+ "flex": null,
686
+ "flex_flow": null,
687
+ "grid_area": null,
688
+ "grid_auto_columns": null,
689
+ "grid_auto_flow": null,
690
+ "grid_auto_rows": null,
691
+ "grid_column": null,
692
+ "grid_gap": null,
693
+ "grid_row": null,
694
+ "grid_template_areas": null,
695
+ "grid_template_columns": null,
696
+ "grid_template_rows": null,
697
+ "height": null,
698
+ "justify_content": null,
699
+ "justify_items": null,
700
+ "left": null,
701
+ "margin": null,
702
+ "max_height": null,
703
+ "max_width": null,
704
+ "min_height": null,
705
+ "min_width": null,
706
+ "object_fit": null,
707
+ "object_position": null,
708
+ "order": null,
709
+ "overflow": null,
710
+ "padding": null,
711
+ "right": null,
712
+ "top": null,
713
+ "visibility": null,
714
+ "width": null
715
+ }
716
+ },
717
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
718
+ "model_module": "@jupyter-widgets/controls",
719
+ "model_module_version": "2.0.0",
720
+ "model_name": "HTMLModel",
721
+ "state": {
722
+ "_dom_classes": [],
723
+ "_model_module": "@jupyter-widgets/controls",
724
+ "_model_module_version": "2.0.0",
725
+ "_model_name": "HTMLModel",
726
+ "_view_count": null,
727
+ "_view_module": "@jupyter-widgets/controls",
728
+ "_view_module_version": "2.0.0",
729
+ "_view_name": "HTMLView",
730
+ "description": "",
731
+ "description_allow_html": false,
732
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
733
+ "placeholder": "​",
734
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
735
+ "tabbable": null,
736
+ "tooltip": null,
737
+ "value": "100%"
738
+ }
739
+ },
740
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
741
+ "model_module": "@jupyter-widgets/base",
742
+ "model_module_version": "2.0.0",
743
+ "model_name": "LayoutModel",
744
+ "state": {
745
+ "_model_module": "@jupyter-widgets/base",
746
+ "_model_module_version": "2.0.0",
747
+ "_model_name": "LayoutModel",
748
+ "_view_count": null,
749
+ "_view_module": "@jupyter-widgets/base",
750
+ "_view_module_version": "2.0.0",
751
+ "_view_name": "LayoutView",
752
+ "align_content": null,
753
+ "align_items": null,
754
+ "align_self": null,
755
+ "border_bottom": null,
756
+ "border_left": null,
757
+ "border_right": null,
758
+ "border_top": null,
759
+ "bottom": null,
760
+ "display": null,
761
+ "flex": null,
762
+ "flex_flow": null,
763
+ "grid_area": null,
764
+ "grid_auto_columns": null,
765
+ "grid_auto_flow": null,
766
+ "grid_auto_rows": null,
767
+ "grid_column": null,
768
+ "grid_gap": null,
769
+ "grid_row": null,
770
+ "grid_template_areas": null,
771
+ "grid_template_columns": null,
772
+ "grid_template_rows": null,
773
+ "height": null,
774
+ "justify_content": null,
775
+ "justify_items": null,
776
+ "left": null,
777
+ "margin": null,
778
+ "max_height": null,
779
+ "max_width": null,
780
+ "min_height": null,
781
+ "min_width": null,
782
+ "object_fit": null,
783
+ "object_position": null,
784
+ "order": null,
785
+ "overflow": null,
786
+ "padding": null,
787
+ "right": null,
788
+ "top": null,
789
+ "visibility": null,
790
+ "width": null
791
+ }
792
+ }
793
+ },
794
+ "version_major": 2,
795
+ "version_minor": 0
796
+ }
797
+ }
798
+ },
799
+ "nbformat": 4,
800
+ "nbformat_minor": 2
801
+ }
agentchat_planning.ipynb ADDED
@@ -0,0 +1,818 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_planning.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "# Auto Generated Agent Chat: Collaborative Task Solving with Coding and Planning Agent\n",
21
+ "\n",
22
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
23
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
24
+ "\n",
25
+ "In this notebook, we demonstrate how to use multiple agents to work together and accomplish a task which requires finding info from the web and coding. `AssistantAgent` is an LLM-based agent that can write and debug Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. We further create a planning agent for the assistant agent to consult. The planning agent is a variation of the LLM-based `AssistantAgent` with a different system message.\n",
26
+ "\n",
27
+ "## Requirements\n",
28
+ "\n",
29
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install pyautogen and docker:\n",
30
+ "```bash\n",
31
+ "pip install pyautogen docker\n",
32
+ "```"
33
+ ]
34
+ },
35
+ {
36
+ "cell_type": "code",
37
+ "execution_count": 1,
38
+ "metadata": {
39
+ "execution": {
40
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
41
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
42
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
43
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
44
+ }
45
+ },
46
+ "outputs": [],
47
+ "source": [
48
+ "# %pip install pyautogen~=0.1.0 docker"
49
+ ]
50
+ },
51
+ {
52
+ "attachments": {},
53
+ "cell_type": "markdown",
54
+ "metadata": {},
55
+ "source": [
56
+ "## Set your API Endpoint\n",
57
+ "\n",
58
+ "* The [`config_list_openai_aoai`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_openai_aoai) function tries to create a list of configurations using Azure OpenAI endpoints and OpenAI endpoints. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files:\n",
59
+ " - OpenAI API key: os.environ[\"OPENAI_API_KEY\"] or `openai_api_key_file=\"key_openai.txt\"`.\n",
60
+ " - Azure OpenAI API key: os.environ[\"AZURE_OPENAI_API_KEY\"] or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
61
+ " - Azure OpenAI API base: os.environ[\"AZURE_OPENAI_API_BASE\"] or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
62
+ "* The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file. It first looks for an environment variable with a specified name. The value of the environment variable needs to be a valid json string. If that variable is not found, it then looks for a json file with the same name. It filters the configs by filter_dict.\n",
63
+ "\n",
64
+ "It's OK to have only the OpenAI API key, or only the Azure OpenAI API key + base. If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n"
65
+ ]
66
+ },
67
+ {
68
+ "cell_type": "code",
69
+ "execution_count": 2,
70
+ "metadata": {},
71
+ "outputs": [],
72
+ "source": [
73
+ "import autogen\n",
74
+ "\n",
75
+ "config_list = autogen.config_list_from_json(\n",
76
+ " \"OAI_CONFIG_LIST\",\n",
77
+ " filter_dict={\n",
78
+ " \"model\": [\"gpt-4\", \"gpt-4-0314\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
79
+ " },\n",
80
+ ")"
81
+ ]
82
+ },
83
+ {
84
+ "attachments": {},
85
+ "cell_type": "markdown",
86
+ "metadata": {},
87
+ "source": [
88
+ "The config list looks like the following:\n",
89
+ "```python\n",
90
+ "config_list = [\n",
91
+ " {\n",
92
+ " 'model': 'gpt-4',\n",
93
+ " 'api_key': '<your OpenAI API key here>',\n",
94
+ " }, # OpenAI API endpoint for gpt-4\n",
95
+ " {\n",
96
+ " 'model': 'gpt-4',\n",
97
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
98
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
99
+ " 'api_type': 'azure',\n",
100
+ " 'api_version': '2023-07-01-preview',\n",
101
+ " }, # Azure OpenAI API endpoint for gpt-4\n",
102
+ " {\n",
103
+ " 'model': 'gpt-4-32k',\n",
104
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
105
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
106
+ " 'api_type': 'azure',\n",
107
+ " 'api_version': '2023-07-01-preview',\n",
108
+ " }, # Azure OpenAI API endpoint for gpt-4-32k\n",
109
+ "]\n",
110
+ "```\n",
111
+ "\n",
112
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
113
+ "\n",
114
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file.\n",
115
+ "\n",
116
+ "## Construct Agents\n",
117
+ "\n",
118
+ "We construct the planning agent named \"planner\" and a user proxy agent for the planner named \"planner_user\". We specify `human_input_mode` as \"NEVER\" in the user proxy agent, which will never ask for human feedback. We define `ask_planner` function to send a message to planner and return the suggestion from the planner."
119
+ ]
120
+ },
121
+ {
122
+ "cell_type": "code",
123
+ "execution_count": 3,
124
+ "metadata": {},
125
+ "outputs": [],
126
+ "source": [
127
+ "planner = autogen.AssistantAgent(\n",
128
+ " name=\"planner\",\n",
129
+ " llm_config={\"config_list\": config_list},\n",
130
+ " # the default system message of the AssistantAgent is overwritten here\n",
131
+ " system_message=\"You are a helpful AI assistant. You suggest coding and reasoning steps for another AI assistant to accomplish a task. Do not suggest concrete code. For any action beyond writing code or reasoning, convert it to a step which can be implemented by writing code. For example, the action of browsing the web can be implemented by writing code which reads and prints the content of a web page. Finally, inspect the execution result. If the plan is not good, suggest a better plan. If the execution is wrong, analyze the error and suggest a fix.\"\n",
132
+ ")\n",
133
+ "planner_user = autogen.UserProxyAgent(\n",
134
+ " name=\"planner_user\",\n",
135
+ " max_consecutive_auto_reply=0, # terminate without auto-reply\n",
136
+ " human_input_mode=\"NEVER\",\n",
137
+ ")\n",
138
+ "\n",
139
+ "def ask_planner(message):\n",
140
+ " planner_user.initiate_chat(planner, message=message)\n",
141
+ " # return the last message received from the planner\n",
142
+ " return planner_user.last_message()[\"content\"]\n"
143
+ ]
144
+ },
145
+ {
146
+ "attachments": {},
147
+ "cell_type": "markdown",
148
+ "metadata": {},
149
+ "source": [
150
+ "We construct the assistant agent and the user proxy agent. We specify `human_input_mode` as \"TERMINATE\" in the user proxy agent, which will ask for feedback when it receives a \"TERMINATE\" signal from the assistant agent. We set the `functions` in `AssistantAgent` and `function_map` in `UserProxyAgent` to use the created `ask_planner` function."
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": 4,
156
+ "metadata": {},
157
+ "outputs": [],
158
+ "source": [
159
+ "# create an AssistantAgent instance named \"assistant\"\n",
160
+ "assistant = autogen.AssistantAgent(\n",
161
+ " name=\"assistant\",\n",
162
+ " llm_config={\n",
163
+ " \"temperature\": 0,\n",
164
+ " \"request_timeout\": 600,\n",
165
+ " \"seed\": 42,\n",
166
+ " \"model\": \"gpt-4-0613\",\n",
167
+ " \"config_list\": autogen.config_list_openai_aoai(exclude=\"aoai\"),\n",
168
+ " \"functions\": [\n",
169
+ " {\n",
170
+ " \"name\": \"ask_planner\",\n",
171
+ " \"description\": \"ask planner to: 1. get a plan for finishing a task, 2. verify the execution result of the plan and potentially suggest new plan.\",\n",
172
+ " \"parameters\": {\n",
173
+ " \"type\": \"object\",\n",
174
+ " \"properties\": {\n",
175
+ " \"message\": {\n",
176
+ " \"type\": \"string\",\n",
177
+ " \"description\": \"question to ask planner. Make sure the question include enough context, such as the code and the execution result. The planner does not know the conversation between you and the user, unless you share the conversation with the planner.\",\n",
178
+ " },\n",
179
+ " },\n",
180
+ " \"required\": [\"message\"],\n",
181
+ " },\n",
182
+ " },\n",
183
+ " ],\n",
184
+ " }\n",
185
+ ")\n",
186
+ "\n",
187
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
188
+ "user_proxy = autogen.UserProxyAgent(\n",
189
+ " name=\"user_proxy\",\n",
190
+ " human_input_mode=\"TERMINATE\",\n",
191
+ " max_consecutive_auto_reply=10,\n",
192
+ " # is_termination_msg=lambda x: \"content\" in x and x[\"content\"] is not None and x[\"content\"].rstrip().endswith(\"TERMINATE\"),\n",
193
+ " code_execution_config={\"work_dir\": \"planning\"},\n",
194
+ " function_map={\"ask_planner\": ask_planner},\n",
195
+ ")"
196
+ ]
197
+ },
198
+ {
199
+ "attachments": {},
200
+ "cell_type": "markdown",
201
+ "metadata": {},
202
+ "source": [
203
+ "## Perform a task\n",
204
+ "\n",
205
+ "We invoke the `initiate_chat()` method of the user proxy agent to start the conversation. When you run the cell below, you will be prompted to provide feedback after the assistant agent sends a \"TERMINATE\" signal in the end of the message. If you don't provide any feedback (by pressing Enter directly), the conversation will finish. Before the \"TERMINATE\" signal, the user proxy agent will try to execute the code suggested by the assistant agent on behalf of the user."
206
+ ]
207
+ },
208
+ {
209
+ "cell_type": "code",
210
+ "execution_count": 5,
211
+ "metadata": {},
212
+ "outputs": [
213
+ {
214
+ "name": "stdout",
215
+ "output_type": "stream",
216
+ "text": [
217
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
218
+ "\n",
219
+ "Suggest a fix to an open good first issue of flaml\n",
220
+ "\n",
221
+ "--------------------------------------------------------------------------------\n",
222
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
223
+ "\n",
224
+ "To suggest a fix to an open good first issue of FLAML, we first need to fetch the list of open issues labeled as \"good first issue\" from the FLAML GitHub repository. We can do this using the GitHub API.\n",
225
+ "\n",
226
+ "Here is a Python script that uses the requests library to fetch the list of open issues labeled as \"good first issue\" from the FLAML GitHub repository.\n",
227
+ "\n",
228
+ "```python\n",
229
+ "# filename: fetch_issues.py\n",
230
+ "\n",
231
+ "import requests\n",
232
+ "import json\n",
233
+ "\n",
234
+ "def fetch_issues():\n",
235
+ " url = \"https://api.github.com/repos/microsoft/FLAML/issues\"\n",
236
+ " params = {\n",
237
+ " \"state\": \"open\",\n",
238
+ " \"labels\": \"good first issue\"\n",
239
+ " }\n",
240
+ " response = requests.get(url, params=params)\n",
241
+ " issues = response.json()\n",
242
+ " for issue in issues:\n",
243
+ " print(f\"Issue ID: {issue['id']}, Title: {issue['title']}, URL: {issue['html_url']}\")\n",
244
+ "\n",
245
+ "fetch_issues()\n",
246
+ "```\n",
247
+ "\n",
248
+ "Please run this script to fetch the list of open issues. After that, we can select one issue and suggest a fix for it.\n",
249
+ "\n",
250
+ "--------------------------------------------------------------------------------\n",
251
+ "\u001b[31m\n",
252
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
253
+ "\u001b[31m\n",
254
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
255
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
256
+ "\n",
257
+ "exitcode: 0 (execution succeeded)\n",
258
+ "Code output: \n",
259
+ "Issue ID: 1809297895, Title: Moving function execution out of UserProxyAgent to be an openai util, URL: https://github.com/microsoft/FLAML/issues/1135\n",
260
+ "Issue ID: 1799114476, Title: use_label_encoder warning with xgboost, URL: https://github.com/microsoft/FLAML/issues/1120\n",
261
+ "Issue ID: 1705274482, Title: Use appropriate wait time for retry based on the error message. , URL: https://github.com/microsoft/FLAML/issues/1034\n",
262
+ "Issue ID: 1702580697, Title: Issues with Adding Custom APIs in Auto Generation, URL: https://github.com/microsoft/FLAML/issues/1029\n",
263
+ "Issue ID: 1658981020, Title: Running flaml[tune] using \"-O\" flag for python interpreter (optimization - disables assertions) crashes, URL: https://github.com/microsoft/FLAML/issues/981\n",
264
+ "Issue ID: 1560969891, Title: Conditional parameter flow2 crash, URL: https://github.com/microsoft/FLAML/issues/903\n",
265
+ "Issue ID: 1538549388, Title: indentation space, URL: https://github.com/microsoft/FLAML/issues/884\n",
266
+ "Issue ID: 1531028010, Title: Check if openml version is required, URL: https://github.com/microsoft/FLAML/issues/882\n",
267
+ "Issue ID: 1470354491, Title: Adjust the indent, URL: https://github.com/microsoft/FLAML/issues/834\n",
268
+ "Issue ID: 1456950742, Title: pip install flaml FAIL, URL: https://github.com/microsoft/FLAML/issues/821\n",
269
+ "Issue ID: 1441047067, Title: Isolate the ensemble part and expose it to users, URL: https://github.com/microsoft/FLAML/issues/807\n",
270
+ "Issue ID: 1440171793, Title: how to pass categorical features names or indices to learner, URL: https://github.com/microsoft/FLAML/issues/805\n",
271
+ "Issue ID: 1429945686, Title: Flaml/LightGBM - Shouldn't I found better/faster or equal results from FLAML than direct LightGBM?, URL: https://github.com/microsoft/FLAML/issues/785\n",
272
+ "Issue ID: 1408240042, Title: Add an announcement of the discord channel, URL: https://github.com/microsoft/FLAML/issues/764\n",
273
+ "Issue ID: 1396515109, Title: Documentation about small budget, URL: https://github.com/microsoft/FLAML/issues/748\n",
274
+ "Issue ID: 1378268096, Title: Make zero-shot automl more discoverable, URL: https://github.com/microsoft/FLAML/issues/737\n",
275
+ "Issue ID: 1189515901, Title: New HCrystalBall release, URL: https://github.com/microsoft/FLAML/issues/509\n",
276
+ "Issue ID: 1114253143, Title: samples about conversion to ONNX, URL: https://github.com/microsoft/FLAML/issues/429\n",
277
+ "Issue ID: 1107488969, Title: support anomaly detection, URL: https://github.com/microsoft/FLAML/issues/413\n",
278
+ "Issue ID: 1061332179, Title: CatBoost Fails with Keyword 'groups', URL: https://github.com/microsoft/FLAML/issues/304\n",
279
+ "\n",
280
+ "\n",
281
+ "--------------------------------------------------------------------------------\n",
282
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
283
+ "\n",
284
+ "\u001b[32m***** Suggested function Call: ask_planner *****\u001b[0m\n",
285
+ "Arguments: \n",
286
+ "{\n",
287
+ "\"message\": \"We have fetched a list of open issues labeled as 'good first issue' from the FLAML GitHub repository. Now, we need to select one issue and suggest a fix for it. Could you please provide a plan for this?\"\n",
288
+ "}\n",
289
+ "\u001b[32m************************************************\u001b[0m\n",
290
+ "\n",
291
+ "--------------------------------------------------------------------------------\n",
292
+ "\u001b[31m\n",
293
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
294
+ "\u001b[35m\n",
295
+ ">>>>>>>> EXECUTING FUNCTION ask_planner...\u001b[0m\n",
296
+ "\u001b[33mplanner_user\u001b[0m (to planner):\n",
297
+ "\n",
298
+ "We have fetched a list of open issues labeled as 'good first issue' from the FLAML GitHub repository. Now, we need to select one issue and suggest a fix for it. Could you please provide a plan for this?\n",
299
+ "\n",
300
+ "--------------------------------------------------------------------------------\n",
301
+ "\u001b[33mplanner\u001b[0m (to planner_user):\n",
302
+ "\n",
303
+ "Sure, here's a plan for selecting one issue from the list and suggesting a fix for it:\n",
304
+ "\n",
305
+ "1. Import the fetched list of open issues labeled as 'good first issue' from the FLAML GitHub repository into your AI assistant. \n",
306
+ "2. Examine the list for common issue attributes like 'title', 'description', 'labels', 'issue number', 'created at', and 'updated at'. \n",
307
+ "3. To select a suitable issue for fixing, apply a selection criteria based on your preferences, such as prioritizing by the 'created at' attribute in descending order to pick the most recent issue, or filtering by a specific label in addition to 'good first issue'. Write code to filter and sort the issues accordingly.\n",
308
+ "4. Inspect the execution result. If the selection criteria are not applied correctly, modify the code to fix any errors.\n",
309
+ "5. Once the issue is selected, read the issue's title, description, and any linked resources or documents to understand the problem to be solved.\n",
310
+ "6. Break down the issue into smaller tasks that can be addressed by writing code, and create a step-by-step plan.\n",
311
+ "\n",
312
+ "For instance, the following could be smaller tasks to address the selected issue:\n",
313
+ " a. Understand the issue's background and requirements.\n",
314
+ " b. Write clear and concise instructions to reproduce the issue.\n",
315
+ " c. Analyze existing code or tests related to the issue.\n",
316
+ " d. Devise a solution to fix the issue.\n",
317
+ " e. Implement the solution in separate code pieces.\n",
318
+ " f. Verify that the solution addresses the issue.\n",
319
+ " g. Write unit tests to ensure the solution is robust and handles edge cases.\n",
320
+ "\n",
321
+ "7. Inspect the execution result. If the issue is misunderstood or the tasks' breakdown is incorrect, revise the understanding of the issue and modify the tasks accordingly.\n",
322
+ "8. With the defined tasks and step-by-step plan, work on each task, and test the implemented code to ensure the issue is solved.\n",
323
+ "9. If any issues arise during the task execution, analyze the errors and adjust the plan or code accordingly.\n",
324
+ "10. Once the issue is fixed, prepare a pull request on GitHub, mentioning the issue number and giving a brief description of the solution in the merge request.\n",
325
+ "\n",
326
+ "Remember that this is meant to be a general plan, and the specific tasks may vary depending on the selected issue. Adjust the plan as needed, based on the selected issue's requirements and your problem-solving approach.\n",
327
+ "\n",
328
+ "--------------------------------------------------------------------------------\n",
329
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
330
+ "\n",
331
+ "\u001b[32m***** Response from calling function \"ask_planner\" *****\u001b[0m\n",
332
+ "Sure, here's a plan for selecting one issue from the list and suggesting a fix for it:\n",
333
+ "\n",
334
+ "1. Import the fetched list of open issues labeled as 'good first issue' from the FLAML GitHub repository into your AI assistant. \n",
335
+ "2. Examine the list for common issue attributes like 'title', 'description', 'labels', 'issue number', 'created at', and 'updated at'. \n",
336
+ "3. To select a suitable issue for fixing, apply a selection criteria based on your preferences, such as prioritizing by the 'created at' attribute in descending order to pick the most recent issue, or filtering by a specific label in addition to 'good first issue'. Write code to filter and sort the issues accordingly.\n",
337
+ "4. Inspect the execution result. If the selection criteria are not applied correctly, modify the code to fix any errors.\n",
338
+ "5. Once the issue is selected, read the issue's title, description, and any linked resources or documents to understand the problem to be solved.\n",
339
+ "6. Break down the issue into smaller tasks that can be addressed by writing code, and create a step-by-step plan.\n",
340
+ "\n",
341
+ "For instance, the following could be smaller tasks to address the selected issue:\n",
342
+ " a. Understand the issue's background and requirements.\n",
343
+ " b. Write clear and concise instructions to reproduce the issue.\n",
344
+ " c. Analyze existing code or tests related to the issue.\n",
345
+ " d. Devise a solution to fix the issue.\n",
346
+ " e. Implement the solution in separate code pieces.\n",
347
+ " f. Verify that the solution addresses the issue.\n",
348
+ " g. Write unit tests to ensure the solution is robust and handles edge cases.\n",
349
+ "\n",
350
+ "7. Inspect the execution result. If the issue is misunderstood or the tasks' breakdown is incorrect, revise the understanding of the issue and modify the tasks accordingly.\n",
351
+ "8. With the defined tasks and step-by-step plan, work on each task, and test the implemented code to ensure the issue is solved.\n",
352
+ "9. If any issues arise during the task execution, analyze the errors and adjust the plan or code accordingly.\n",
353
+ "10. Once the issue is fixed, prepare a pull request on GitHub, mentioning the issue number and giving a brief description of the solution in the merge request.\n",
354
+ "\n",
355
+ "Remember that this is meant to be a general plan, and the specific tasks may vary depending on the selected issue. Adjust the plan as needed, based on the selected issue's requirements and your problem-solving approach.\n",
356
+ "\u001b[32m********************************************************\u001b[0m\n",
357
+ "\n",
358
+ "--------------------------------------------------------------------------------\n",
359
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
360
+ "\n",
361
+ "Based on the plan, let's select the most recent issue from the list. The most recent issue is \"Moving function execution out of UserProxyAgent to be an openai util\" with the URL: https://github.com/microsoft/FLAML/issues/1135\n",
362
+ "\n",
363
+ "To suggest a fix for this issue, we need to understand the problem first. Please visit the issue URL and read the issue's description and any comments to understand the problem.\n",
364
+ "\n",
365
+ "Once you understand the problem, here are the steps you can follow to suggest a fix:\n",
366
+ "\n",
367
+ "1. Understand the issue's background and requirements: Read the issue description and any linked resources or documents to understand what needs to be done.\n",
368
+ "\n",
369
+ "2. Write clear and concise instructions to reproduce the issue: Based on the issue description, write down the steps to reproduce the issue. This will help you understand the problem better and will be useful when you test your fix.\n",
370
+ "\n",
371
+ "3. Analyze existing code or tests related to the issue: Look at the code in the FLAML repository that is related to the issue. Try to understand how it works and how it might be causing the issue.\n",
372
+ "\n",
373
+ "4. Devise a solution to fix the issue: Based on your understanding of the problem and the code, come up with a solution to fix the issue. This might involve writing new code or modifying existing code.\n",
374
+ "\n",
375
+ "5. Implement the solution in separate code pieces: Write the code for your solution. Make sure to break it down into small, manageable pieces. This will make it easier to understand and test.\n",
376
+ "\n",
377
+ "6. Verify that the solution addresses the issue: Test your code to make sure it solves the issue. You can do this by following the instructions you wrote to reproduce the issue.\n",
378
+ "\n",
379
+ "7. Write unit tests to ensure the solution is robust and handles edge cases: In addition to testing your solution manually, write unit tests to automatically test your code. This will help ensure that your solution works in all cases and that future changes to the code won't break your solution.\n",
380
+ "\n",
381
+ "8. Prepare a pull request on GitHub: Once you're confident that your solution works, prepare a pull request on GitHub. In the pull request description, mention the issue number and give a brief description of your solution.\n",
382
+ "\n",
383
+ "Please note that this is a general plan and the specific steps may vary depending on the issue. Adjust the plan as needed based on the issue's requirements and your problem-solving approach. \n",
384
+ "\n",
385
+ "TERMINATE\n",
386
+ "\n",
387
+ "--------------------------------------------------------------------------------\n",
388
+ "\u001b[31m\n",
389
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
390
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
391
+ "\n",
392
+ "\n",
393
+ "\n",
394
+ "--------------------------------------------------------------------------------\n",
395
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
396
+ "\n",
397
+ "TERMINATE\n",
398
+ "\n",
399
+ "--------------------------------------------------------------------------------\n",
400
+ "\u001b[31m\n",
401
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
402
+ ]
403
+ }
404
+ ],
405
+ "source": [
406
+ "# the assistant receives a message from the user, which contains the task description\n",
407
+ "user_proxy.initiate_chat(\n",
408
+ " assistant,\n",
409
+ " message=\"\"\"Suggest a fix to an open good first issue of flaml\"\"\",\n",
410
+ ")"
411
+ ]
412
+ },
413
+ {
414
+ "attachments": {},
415
+ "cell_type": "markdown",
416
+ "metadata": {},
417
+ "source": [
418
+ "When the assistant needs to consult the planner, it suggests a function call to `ask_planner`. When this happens, a line like the following will be displayed:\n",
419
+ "\n",
420
+ "***** Suggested function Call: ask_planner *****\n"
421
+ ]
422
+ }
423
+ ],
424
+ "metadata": {
425
+ "kernelspec": {
426
+ "display_name": "Python 3",
427
+ "language": "python",
428
+ "name": "python3"
429
+ },
430
+ "language_info": {
431
+ "codemirror_mode": {
432
+ "name": "ipython",
433
+ "version": 3
434
+ },
435
+ "file_extension": ".py",
436
+ "mimetype": "text/x-python",
437
+ "name": "python",
438
+ "nbconvert_exporter": "python",
439
+ "pygments_lexer": "ipython3",
440
+ "version": "3.9.17"
441
+ },
442
+ "vscode": {
443
+ "interpreter": {
444
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
445
+ }
446
+ },
447
+ "widgets": {
448
+ "application/vnd.jupyter.widget-state+json": {
449
+ "state": {
450
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
451
+ "model_module": "@jupyter-widgets/base",
452
+ "model_module_version": "2.0.0",
453
+ "model_name": "LayoutModel",
454
+ "state": {
455
+ "_model_module": "@jupyter-widgets/base",
456
+ "_model_module_version": "2.0.0",
457
+ "_model_name": "LayoutModel",
458
+ "_view_count": null,
459
+ "_view_module": "@jupyter-widgets/base",
460
+ "_view_module_version": "2.0.0",
461
+ "_view_name": "LayoutView",
462
+ "align_content": null,
463
+ "align_items": null,
464
+ "align_self": null,
465
+ "border_bottom": null,
466
+ "border_left": null,
467
+ "border_right": null,
468
+ "border_top": null,
469
+ "bottom": null,
470
+ "display": null,
471
+ "flex": null,
472
+ "flex_flow": null,
473
+ "grid_area": null,
474
+ "grid_auto_columns": null,
475
+ "grid_auto_flow": null,
476
+ "grid_auto_rows": null,
477
+ "grid_column": null,
478
+ "grid_gap": null,
479
+ "grid_row": null,
480
+ "grid_template_areas": null,
481
+ "grid_template_columns": null,
482
+ "grid_template_rows": null,
483
+ "height": null,
484
+ "justify_content": null,
485
+ "justify_items": null,
486
+ "left": null,
487
+ "margin": null,
488
+ "max_height": null,
489
+ "max_width": null,
490
+ "min_height": null,
491
+ "min_width": null,
492
+ "object_fit": null,
493
+ "object_position": null,
494
+ "order": null,
495
+ "overflow": null,
496
+ "padding": null,
497
+ "right": null,
498
+ "top": null,
499
+ "visibility": null,
500
+ "width": null
501
+ }
502
+ },
503
+ "454146d0f7224f038689031002906e6f": {
504
+ "model_module": "@jupyter-widgets/controls",
505
+ "model_module_version": "2.0.0",
506
+ "model_name": "HBoxModel",
507
+ "state": {
508
+ "_dom_classes": [],
509
+ "_model_module": "@jupyter-widgets/controls",
510
+ "_model_module_version": "2.0.0",
511
+ "_model_name": "HBoxModel",
512
+ "_view_count": null,
513
+ "_view_module": "@jupyter-widgets/controls",
514
+ "_view_module_version": "2.0.0",
515
+ "_view_name": "HBoxView",
516
+ "box_style": "",
517
+ "children": [
518
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
519
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
520
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
521
+ ],
522
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
523
+ "tabbable": null,
524
+ "tooltip": null
525
+ }
526
+ },
527
+ "577e1e3cc4db4942b0883577b3b52755": {
528
+ "model_module": "@jupyter-widgets/controls",
529
+ "model_module_version": "2.0.0",
530
+ "model_name": "FloatProgressModel",
531
+ "state": {
532
+ "_dom_classes": [],
533
+ "_model_module": "@jupyter-widgets/controls",
534
+ "_model_module_version": "2.0.0",
535
+ "_model_name": "FloatProgressModel",
536
+ "_view_count": null,
537
+ "_view_module": "@jupyter-widgets/controls",
538
+ "_view_module_version": "2.0.0",
539
+ "_view_name": "ProgressView",
540
+ "bar_style": "success",
541
+ "description": "",
542
+ "description_allow_html": false,
543
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
544
+ "max": 1,
545
+ "min": 0,
546
+ "orientation": "horizontal",
547
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
548
+ "tabbable": null,
549
+ "tooltip": null,
550
+ "value": 1
551
+ }
552
+ },
553
+ "6086462a12d54bafa59d3c4566f06cb2": {
554
+ "model_module": "@jupyter-widgets/base",
555
+ "model_module_version": "2.0.0",
556
+ "model_name": "LayoutModel",
557
+ "state": {
558
+ "_model_module": "@jupyter-widgets/base",
559
+ "_model_module_version": "2.0.0",
560
+ "_model_name": "LayoutModel",
561
+ "_view_count": null,
562
+ "_view_module": "@jupyter-widgets/base",
563
+ "_view_module_version": "2.0.0",
564
+ "_view_name": "LayoutView",
565
+ "align_content": null,
566
+ "align_items": null,
567
+ "align_self": null,
568
+ "border_bottom": null,
569
+ "border_left": null,
570
+ "border_right": null,
571
+ "border_top": null,
572
+ "bottom": null,
573
+ "display": null,
574
+ "flex": null,
575
+ "flex_flow": null,
576
+ "grid_area": null,
577
+ "grid_auto_columns": null,
578
+ "grid_auto_flow": null,
579
+ "grid_auto_rows": null,
580
+ "grid_column": null,
581
+ "grid_gap": null,
582
+ "grid_row": null,
583
+ "grid_template_areas": null,
584
+ "grid_template_columns": null,
585
+ "grid_template_rows": null,
586
+ "height": null,
587
+ "justify_content": null,
588
+ "justify_items": null,
589
+ "left": null,
590
+ "margin": null,
591
+ "max_height": null,
592
+ "max_width": null,
593
+ "min_height": null,
594
+ "min_width": null,
595
+ "object_fit": null,
596
+ "object_position": null,
597
+ "order": null,
598
+ "overflow": null,
599
+ "padding": null,
600
+ "right": null,
601
+ "top": null,
602
+ "visibility": null,
603
+ "width": null
604
+ }
605
+ },
606
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
607
+ "model_module": "@jupyter-widgets/controls",
608
+ "model_module_version": "2.0.0",
609
+ "model_name": "ProgressStyleModel",
610
+ "state": {
611
+ "_model_module": "@jupyter-widgets/controls",
612
+ "_model_module_version": "2.0.0",
613
+ "_model_name": "ProgressStyleModel",
614
+ "_view_count": null,
615
+ "_view_module": "@jupyter-widgets/base",
616
+ "_view_module_version": "2.0.0",
617
+ "_view_name": "StyleView",
618
+ "bar_color": null,
619
+ "description_width": ""
620
+ }
621
+ },
622
+ "7d3f3d9e15894d05a4d188ff4f466554": {
623
+ "model_module": "@jupyter-widgets/controls",
624
+ "model_module_version": "2.0.0",
625
+ "model_name": "HTMLStyleModel",
626
+ "state": {
627
+ "_model_module": "@jupyter-widgets/controls",
628
+ "_model_module_version": "2.0.0",
629
+ "_model_name": "HTMLStyleModel",
630
+ "_view_count": null,
631
+ "_view_module": "@jupyter-widgets/base",
632
+ "_view_module_version": "2.0.0",
633
+ "_view_name": "StyleView",
634
+ "background": null,
635
+ "description_width": "",
636
+ "font_size": null,
637
+ "text_color": null
638
+ }
639
+ },
640
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
641
+ "model_module": "@jupyter-widgets/controls",
642
+ "model_module_version": "2.0.0",
643
+ "model_name": "HTMLModel",
644
+ "state": {
645
+ "_dom_classes": [],
646
+ "_model_module": "@jupyter-widgets/controls",
647
+ "_model_module_version": "2.0.0",
648
+ "_model_name": "HTMLModel",
649
+ "_view_count": null,
650
+ "_view_module": "@jupyter-widgets/controls",
651
+ "_view_module_version": "2.0.0",
652
+ "_view_name": "HTMLView",
653
+ "description": "",
654
+ "description_allow_html": false,
655
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
656
+ "placeholder": "​",
657
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
658
+ "tabbable": null,
659
+ "tooltip": null,
660
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
661
+ }
662
+ },
663
+ "ca245376fd9f4354af6b2befe4af4466": {
664
+ "model_module": "@jupyter-widgets/controls",
665
+ "model_module_version": "2.0.0",
666
+ "model_name": "HTMLStyleModel",
667
+ "state": {
668
+ "_model_module": "@jupyter-widgets/controls",
669
+ "_model_module_version": "2.0.0",
670
+ "_model_name": "HTMLStyleModel",
671
+ "_view_count": null,
672
+ "_view_module": "@jupyter-widgets/base",
673
+ "_view_module_version": "2.0.0",
674
+ "_view_name": "StyleView",
675
+ "background": null,
676
+ "description_width": "",
677
+ "font_size": null,
678
+ "text_color": null
679
+ }
680
+ },
681
+ "dc83c7bff2f241309537a8119dfc7555": {
682
+ "model_module": "@jupyter-widgets/base",
683
+ "model_module_version": "2.0.0",
684
+ "model_name": "LayoutModel",
685
+ "state": {
686
+ "_model_module": "@jupyter-widgets/base",
687
+ "_model_module_version": "2.0.0",
688
+ "_model_name": "LayoutModel",
689
+ "_view_count": null,
690
+ "_view_module": "@jupyter-widgets/base",
691
+ "_view_module_version": "2.0.0",
692
+ "_view_name": "LayoutView",
693
+ "align_content": null,
694
+ "align_items": null,
695
+ "align_self": null,
696
+ "border_bottom": null,
697
+ "border_left": null,
698
+ "border_right": null,
699
+ "border_top": null,
700
+ "bottom": null,
701
+ "display": null,
702
+ "flex": null,
703
+ "flex_flow": null,
704
+ "grid_area": null,
705
+ "grid_auto_columns": null,
706
+ "grid_auto_flow": null,
707
+ "grid_auto_rows": null,
708
+ "grid_column": null,
709
+ "grid_gap": null,
710
+ "grid_row": null,
711
+ "grid_template_areas": null,
712
+ "grid_template_columns": null,
713
+ "grid_template_rows": null,
714
+ "height": null,
715
+ "justify_content": null,
716
+ "justify_items": null,
717
+ "left": null,
718
+ "margin": null,
719
+ "max_height": null,
720
+ "max_width": null,
721
+ "min_height": null,
722
+ "min_width": null,
723
+ "object_fit": null,
724
+ "object_position": null,
725
+ "order": null,
726
+ "overflow": null,
727
+ "padding": null,
728
+ "right": null,
729
+ "top": null,
730
+ "visibility": null,
731
+ "width": null
732
+ }
733
+ },
734
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
735
+ "model_module": "@jupyter-widgets/controls",
736
+ "model_module_version": "2.0.0",
737
+ "model_name": "HTMLModel",
738
+ "state": {
739
+ "_dom_classes": [],
740
+ "_model_module": "@jupyter-widgets/controls",
741
+ "_model_module_version": "2.0.0",
742
+ "_model_name": "HTMLModel",
743
+ "_view_count": null,
744
+ "_view_module": "@jupyter-widgets/controls",
745
+ "_view_module_version": "2.0.0",
746
+ "_view_name": "HTMLView",
747
+ "description": "",
748
+ "description_allow_html": false,
749
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
750
+ "placeholder": "​",
751
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
752
+ "tabbable": null,
753
+ "tooltip": null,
754
+ "value": "100%"
755
+ }
756
+ },
757
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
758
+ "model_module": "@jupyter-widgets/base",
759
+ "model_module_version": "2.0.0",
760
+ "model_name": "LayoutModel",
761
+ "state": {
762
+ "_model_module": "@jupyter-widgets/base",
763
+ "_model_module_version": "2.0.0",
764
+ "_model_name": "LayoutModel",
765
+ "_view_count": null,
766
+ "_view_module": "@jupyter-widgets/base",
767
+ "_view_module_version": "2.0.0",
768
+ "_view_name": "LayoutView",
769
+ "align_content": null,
770
+ "align_items": null,
771
+ "align_self": null,
772
+ "border_bottom": null,
773
+ "border_left": null,
774
+ "border_right": null,
775
+ "border_top": null,
776
+ "bottom": null,
777
+ "display": null,
778
+ "flex": null,
779
+ "flex_flow": null,
780
+ "grid_area": null,
781
+ "grid_auto_columns": null,
782
+ "grid_auto_flow": null,
783
+ "grid_auto_rows": null,
784
+ "grid_column": null,
785
+ "grid_gap": null,
786
+ "grid_row": null,
787
+ "grid_template_areas": null,
788
+ "grid_template_columns": null,
789
+ "grid_template_rows": null,
790
+ "height": null,
791
+ "justify_content": null,
792
+ "justify_items": null,
793
+ "left": null,
794
+ "margin": null,
795
+ "max_height": null,
796
+ "max_width": null,
797
+ "min_height": null,
798
+ "min_width": null,
799
+ "object_fit": null,
800
+ "object_position": null,
801
+ "order": null,
802
+ "overflow": null,
803
+ "padding": null,
804
+ "right": null,
805
+ "top": null,
806
+ "visibility": null,
807
+ "width": null
808
+ }
809
+ }
810
+ },
811
+ "version_major": 2,
812
+ "version_minor": 0
813
+ }
814
+ }
815
+ },
816
+ "nbformat": 4,
817
+ "nbformat_minor": 2
818
+ }
agentchat_stream.ipynb ADDED
@@ -0,0 +1,781 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_stream.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "# Interactive LLM Agent Dealing with Data Stream\n",
21
+ "\n",
22
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
23
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
24
+ "\n",
25
+ "In this notebook, we demonstrate how to use customized agents to continuously acquires news from the web and ask for investment suggestions.\n",
26
+ "\n",
27
+ "## Requirements\n",
28
+ "\n",
29
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
30
+ "```bash\n",
31
+ "pip install pyautogen\n",
32
+ "```"
33
+ ]
34
+ },
35
+ {
36
+ "cell_type": "code",
37
+ "execution_count": 1,
38
+ "metadata": {
39
+ "execution": {
40
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
41
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
42
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
43
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
44
+ }
45
+ },
46
+ "outputs": [],
47
+ "source": [
48
+ "# %pip install pyautogen~=0.1.0"
49
+ ]
50
+ },
51
+ {
52
+ "attachments": {},
53
+ "cell_type": "markdown",
54
+ "metadata": {},
55
+ "source": [
56
+ "## Set your API Endpoint\n",
57
+ "\n",
58
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n"
59
+ ]
60
+ },
61
+ {
62
+ "cell_type": "code",
63
+ "execution_count": 2,
64
+ "metadata": {},
65
+ "outputs": [],
66
+ "source": [
67
+ "import autogen\n",
68
+ "\n",
69
+ "config_list = autogen.config_list_from_json(\"OAI_CONFIG_LIST\")"
70
+ ]
71
+ },
72
+ {
73
+ "attachments": {},
74
+ "cell_type": "markdown",
75
+ "metadata": {},
76
+ "source": [
77
+ "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well). Only the models with matching names are kept in the list based on the filter condition.\n",
78
+ "\n",
79
+ "The config list looks like the following:\n",
80
+ "```python\n",
81
+ "config_list = [\n",
82
+ " {\n",
83
+ " 'model': 'gpt-4',\n",
84
+ " 'api_key': '<your OpenAI API key here>',\n",
85
+ " }, # OpenAI API endpoint for gpt-4\n",
86
+ " {\n",
87
+ " 'model': 'gpt-4',\n",
88
+ " 'api_key': '<your first Azure OpenAI API key here>',\n",
89
+ " 'api_base': '<your first Azure OpenAI API base here>',\n",
90
+ " 'api_type': 'azure',\n",
91
+ " 'api_version': '2023-06-01-preview',\n",
92
+ " }, # Azure OpenAI API endpoint for gpt-4\n",
93
+ " {\n",
94
+ " 'model': 'gpt-4',\n",
95
+ " 'api_key': '<your second Azure OpenAI API key here>',\n",
96
+ " 'api_base': '<your second Azure OpenAI API base here>',\n",
97
+ " 'api_type': 'azure',\n",
98
+ " 'api_version': '2023-06-01-preview',\n",
99
+ " }, # another Azure OpenAI API endpoint for gpt-4\n",
100
+ " {\n",
101
+ " 'model': 'gpt-3.5-turbo',\n",
102
+ " 'api_key': '<your OpenAI API key here>',\n",
103
+ " }, # OpenAI API endpoint for gpt-3.5-turbo\n",
104
+ " {\n",
105
+ " 'model': 'gpt-3.5-turbo',\n",
106
+ " 'api_key': '<your first Azure OpenAI API key here>',\n",
107
+ " 'api_base': '<your first Azure OpenAI API base here>',\n",
108
+ " 'api_type': 'azure',\n",
109
+ " 'api_version': '2023-06-01-preview',\n",
110
+ " }, # Azure OpenAI API endpoint for gpt-3.5-turbo\n",
111
+ " {\n",
112
+ " 'model': 'gpt-3.5-turbo',\n",
113
+ " 'api_key': '<your second Azure OpenAI API key here>',\n",
114
+ " 'api_base': '<your second Azure OpenAI API base here>',\n",
115
+ " 'api_type': 'azure',\n",
116
+ " 'api_version': '2023-06-01-preview',\n",
117
+ " }, # another Azure OpenAI API endpoint for gpt-3.5-turbo\n",
118
+ "]\n",
119
+ "```\n",
120
+ "\n",
121
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
122
+ "\n",
123
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
124
+ ]
125
+ },
126
+ {
127
+ "attachments": {},
128
+ "cell_type": "markdown",
129
+ "metadata": {},
130
+ "source": [
131
+ "## Example Task: Investment suggestion with realtime data\n",
132
+ "\n",
133
+ "We consider a scenario where news data are streamed from a source, and we use an assistant agent to continually provide investment suggestions based on the data.\n",
134
+ "\n",
135
+ "First, we use the following code to simulate the data stream process."
136
+ ]
137
+ },
138
+ {
139
+ "cell_type": "code",
140
+ "execution_count": 3,
141
+ "metadata": {},
142
+ "outputs": [],
143
+ "source": [
144
+ "import asyncio\n",
145
+ "\n",
146
+ "def get_market_news(ind, ind_upper):\n",
147
+ " import requests\n",
148
+ " import json\n",
149
+ " # replace the \"demo\" apikey below with your own key from https://www.alphavantage.co/support/#api-key\n",
150
+ " # url = 'https://www.alphavantage.co/query?function=NEWS_SENTIMENT&tickers=AAPL&sort=LATEST&limit=5&apikey=demo'\n",
151
+ " # r = requests.get(url)\n",
152
+ " # data = r.json()\n",
153
+ " # with open('market_news_local.json', 'r') as file:\n",
154
+ " # # Load JSON data from file\n",
155
+ " # data = json.load(file)\n",
156
+ " data = {\n",
157
+ " \"feed\": [\n",
158
+ " {\n",
159
+ " \"title\": \"Palantir CEO Says Our Generation's Atomic Bomb Could Be AI Weapon - And Arrive Sooner Than You Think - Palantir Technologies ( NYSE:PLTR ) \",\n",
160
+ " \"summary\": \"Christopher Nolan's blockbuster movie \\\"Oppenheimer\\\" has reignited the public discourse surrounding the United States' use of an atomic bomb on Japan at the end of World War II.\",\n",
161
+ " \"overall_sentiment_score\": 0.009687,\n",
162
+ " },\n",
163
+ " {\n",
164
+ " \"title\": '3 \"Hedge Fund Hotels\" Pulling into Support',\n",
165
+ " \"summary\": \"Institutional quality stocks have several benefits including high-liquidity, low beta, and a long runway. Strategist Andrew Rocco breaks down what investors should look for and pitches 3 ideas.\",\n",
166
+ " \"banner_image\": \"https://staticx-tuner.zacks.com/images/articles/main/92/87.jpg\",\n",
167
+ " \"overall_sentiment_score\": 0.219747,\n",
168
+ " },\n",
169
+ " {\n",
170
+ " \"title\": \"PDFgear, Bringing a Completely-Free PDF Text Editing Feature\",\n",
171
+ " \"summary\": \"LOS ANGELES, July 26, 2023 /PRNewswire/ -- PDFgear, a leading provider of PDF solutions, announced a piece of exciting news for everyone who works extensively with PDF documents.\",\n",
172
+ " \"overall_sentiment_score\": 0.360071,\n",
173
+ " },\n",
174
+ " {\n",
175
+ " \"title\": \"Researchers Pitch 'Immunizing' Images Against Deepfake Manipulation\",\n",
176
+ " \"summary\": \"A team at MIT says injecting tiny disruptive bits of code can cause distorted deepfake images.\",\n",
177
+ " \"overall_sentiment_score\": -0.026894,\n",
178
+ " },\n",
179
+ " {\n",
180
+ " \"title\": \"Nvidia wins again - plus two more takeaways from this week's mega-cap earnings\",\n",
181
+ " \"summary\": \"We made some key conclusions combing through quarterly results for Microsoft and Alphabet and listening to their conference calls with investors.\",\n",
182
+ " \"overall_sentiment_score\": 0.235177,\n",
183
+ " },\n",
184
+ " ]\n",
185
+ " }\n",
186
+ " feeds = data[\"feed\"][ind:ind_upper]\n",
187
+ " feeds_summary = \"\\n\".join(\n",
188
+ " [\n",
189
+ " f\"News summary: {f['title']}. {f['summary']} overall_sentiment_score: {f['overall_sentiment_score']}\"\n",
190
+ " for f in feeds\n",
191
+ " ]\n",
192
+ " )\n",
193
+ " return feeds_summary\n",
194
+ "\n",
195
+ "data = asyncio.Future()\n",
196
+ "\n",
197
+ "async def add_stock_price_data():\n",
198
+ " # simulating the data stream\n",
199
+ " for i in range(0, 5, 1):\n",
200
+ " latest_news = get_market_news(i, i + 1)\n",
201
+ " if data.done():\n",
202
+ " data.result().append(latest_news)\n",
203
+ " else:\n",
204
+ " data.set_result([latest_news])\n",
205
+ " # print(data.result())\n",
206
+ " await asyncio.sleep(5)\n",
207
+ "\n",
208
+ "data_task = asyncio.create_task(add_stock_price_data())\n"
209
+ ]
210
+ },
211
+ {
212
+ "attachments": {},
213
+ "cell_type": "markdown",
214
+ "metadata": {},
215
+ "source": [
216
+ "Then, we construct agents. An assistant agent is created to answer the question using LLM. A UserProxyAgent is created to ask questions, and add the new data in the conversation when they are available."
217
+ ]
218
+ },
219
+ {
220
+ "cell_type": "code",
221
+ "execution_count": 4,
222
+ "metadata": {},
223
+ "outputs": [],
224
+ "source": [
225
+ "import autogen\n",
226
+ "\n",
227
+ "# create an AssistantAgent instance named \"assistant\"\n",
228
+ "assistant = autogen.AssistantAgent(\n",
229
+ " name=\"assistant\",\n",
230
+ " llm_config={\n",
231
+ " \"request_timeout\": 600,\n",
232
+ " \"seed\": 41,\n",
233
+ " \"config_list\": config_list,\n",
234
+ " \"temperature\": 0,\n",
235
+ " },\n",
236
+ " system_message=\"You are a financial expert.\",\n",
237
+ ")\n",
238
+ "# create a UserProxyAgent instance named \"user\"\n",
239
+ "user_proxy = autogen.UserProxyAgent(\n",
240
+ " name=\"user\",\n",
241
+ " human_input_mode=\"NEVER\",\n",
242
+ " max_consecutive_auto_reply=5,\n",
243
+ " code_execution_config=False,\n",
244
+ " default_auto_reply=None,\n",
245
+ ")\n",
246
+ "\n",
247
+ "async def add_data_reply(recipient, messages, sender, config):\n",
248
+ " await asyncio.sleep(0.1)\n",
249
+ " data = config[\"news_stream\"]\n",
250
+ " if data.done():\n",
251
+ " result = data.result()\n",
252
+ " if result:\n",
253
+ " news_str = \"\\n\".join(result)\n",
254
+ " result.clear()\n",
255
+ " return (\n",
256
+ " True,\n",
257
+ " f\"Just got some latest market news. Merge your new suggestion with previous ones.\\n{news_str}\",\n",
258
+ " )\n",
259
+ " return False, None\n",
260
+ "\n",
261
+ "user_proxy.register_reply(autogen.AssistantAgent, add_data_reply, 1, config={\"news_stream\": data})"
262
+ ]
263
+ },
264
+ {
265
+ "attachments": {},
266
+ "cell_type": "markdown",
267
+ "metadata": {},
268
+ "source": [
269
+ "We invoke the `a_initiate_chat()` method of the user proxy agent to start the conversation."
270
+ ]
271
+ },
272
+ {
273
+ "cell_type": "code",
274
+ "execution_count": 5,
275
+ "metadata": {},
276
+ "outputs": [
277
+ {
278
+ "name": "stdout",
279
+ "output_type": "stream",
280
+ "text": [
281
+ "\u001b[33muser\u001b[0m (to assistant):\n",
282
+ "\n",
283
+ "Give me investment suggestion in 3 bullet points.\n",
284
+ "\n",
285
+ "--------------------------------------------------------------------------------\n",
286
+ "\u001b[33massistant\u001b[0m (to user):\n",
287
+ "\n",
288
+ "1. Diversify Your Portfolio: Don't put all your eggs in one basket. Spread your investments across a variety of asset classes such as stocks, bonds, real estate, and commodities. This can help to mitigate risk and potentially increase returns.\n",
289
+ "\n",
290
+ "2. Invest for the Long Term: Investing is not about making a quick buck, but about growing your wealth over time. Stick to a long-term investment strategy and avoid the temptation to engage in frequent buying and selling.\n",
291
+ "\n",
292
+ "3. Regularly Review Your Investments: The market is dynamic and constantly changing. Regularly review your investment portfolio to ensure it aligns with your financial goals and risk tolerance. Adjust your investments as necessary based on changes in your personal circumstances and market conditions.\n",
293
+ "\n",
294
+ "--------------------------------------------------------------------------------\n",
295
+ "\u001b[33muser\u001b[0m (to assistant):\n",
296
+ "\n",
297
+ "Just got some latest market news. Merge your new suggestion with previous ones.\n",
298
+ "News summary: Palantir CEO Says Our Generation's Atomic Bomb Could Be AI Weapon - And Arrive Sooner Than You Think - Palantir Technologies ( NYSE:PLTR ) . Christopher Nolan's blockbuster movie \"Oppenheimer\" has reignited the public discourse surrounding the United States' use of an atomic bomb on Japan at the end of World War II. overall_sentiment_score: 0.009687\n",
299
+ "News summary: 3 \"Hedge Fund Hotels\" Pulling into Support. Institutional quality stocks have several benefits including high-liquidity, low beta, and a long runway. Strategist Andrew Rocco breaks down what investors should look for and pitches 3 ideas. overall_sentiment_score: 0.219747\n",
300
+ "\n",
301
+ "--------------------------------------------------------------------------------\n",
302
+ "\u001b[33massistant\u001b[0m (to user):\n",
303
+ "\n",
304
+ "1. Diversify Your Portfolio: Given the recent news about AI technology and its potential impact, consider investing in tech companies like Palantir Technologies that are at the forefront of AI development. However, remember to maintain a diversified portfolio across various sectors and asset classes to mitigate risk.\n",
305
+ "\n",
306
+ "2. Long-Term Investment Strategy: Despite the potential for rapid advancements in AI, it's important to maintain a long-term investment perspective. While these developments may bring short-term volatility, they could also present long-term growth opportunities.\n",
307
+ "\n",
308
+ "3. Regularly Review and Adjust Your Investments: With the news about \"Hedge Fund Hotels\" and their potential benefits, consider reviewing your portfolio to see if these high-liquidity, low beta stocks fit into your investment strategy. However, always adjust your investments based on your personal circumstances, risk tolerance, and market conditions. \n",
309
+ "\n",
310
+ "4. Invest in AI and Tech Stocks: Given the potential of AI as highlighted by Palantir's CEO, consider investing in companies that are leading in AI and other technological advancements. This could provide significant growth opportunities.\n",
311
+ "\n",
312
+ "5. Consider \"Hedge Fund Hotels\": The news suggests that these stocks have several benefits including high-liquidity, low beta, and a long runway. If these align with your investment goals, they could be a good addition to your portfolio.\n",
313
+ "\n",
314
+ "--------------------------------------------------------------------------------\n",
315
+ "\u001b[33muser\u001b[0m (to assistant):\n",
316
+ "\n",
317
+ "Just got some latest market news. Merge your new suggestion with previous ones.\n",
318
+ "News summary: PDFgear, Bringing a Completely-Free PDF Text Editing Feature. LOS ANGELES, July 26, 2023 /PRNewswire/ -- PDFgear, a leading provider of PDF solutions, announced a piece of exciting news for everyone who works extensively with PDF documents. overall_sentiment_score: 0.360071\n",
319
+ "\n",
320
+ "--------------------------------------------------------------------------------\n",
321
+ "\u001b[33massistant\u001b[0m (to user):\n",
322
+ "\n",
323
+ "1. Diversify Your Portfolio: With the latest news about PDFgear, consider investing in companies that provide digital solutions, as they are becoming increasingly important in our digital age. However, continue to maintain a diversified portfolio across various sectors and asset classes to mitigate risk.\n",
324
+ "\n",
325
+ "2. Long-Term Investment Strategy: The announcement from PDFgear could potentially lead to short-term gains, but remember to maintain a long-term investment perspective. The digital solutions sector is expected to grow in the long run, providing potential growth opportunities.\n",
326
+ "\n",
327
+ "3. Regularly Review and Adjust Your Investments: Given the news about PDFgear and its potential impact on the digital solutions sector, it's important to review your portfolio to see if it aligns with these new market trends. Adjust your investments based on your personal circumstances, risk tolerance, and market conditions.\n",
328
+ "\n",
329
+ "4. Invest in Digital Solutions: The news about PDFgear highlights the potential growth in the digital solutions sector. Consider investing in companies that are leading in this field, as they could provide significant growth opportunities.\n",
330
+ "\n",
331
+ "5. Consider Tech Stocks: With the continuous advancements in technology, tech stocks, including those in AI, digital solutions, and other tech sub-sectors, could be a good addition to your portfolio, given their potential for high growth.\n",
332
+ "\n",
333
+ "--------------------------------------------------------------------------------\n",
334
+ "\u001b[33muser\u001b[0m (to assistant):\n",
335
+ "\n",
336
+ "Just got some latest market news. Merge your new suggestion with previous ones.\n",
337
+ "News summary: Researchers Pitch 'Immunizing' Images Against Deepfake Manipulation. A team at MIT says injecting tiny disruptive bits of code can cause distorted deepfake images. overall_sentiment_score: -0.026894\n",
338
+ "\n",
339
+ "--------------------------------------------------------------------------------\n",
340
+ "\u001b[33massistant\u001b[0m (to user):\n",
341
+ "\n",
342
+ "1. Diversify Your Portfolio: The latest news about deepfake manipulation suggests potential growth in cybersecurity and AI sectors. Consider investing in companies that are working on these technologies, but continue to maintain a diversified portfolio across various sectors and asset classes to mitigate risk.\n",
343
+ "\n",
344
+ "2. Long-Term Investment Strategy: While the news about deepfake manipulation might cause short-term volatility in the tech sector, it's important to maintain a long-term investment perspective. The cybersecurity and AI sectors are expected to grow in the long run, providing potential growth opportunities.\n",
345
+ "\n",
346
+ "3. Regularly Review and Adjust Your Investments: Given the news about deepfake manipulation and its potential impact on the tech sector, it's important to review your portfolio to see if it aligns with these new market trends. Adjust your investments based on your personal circumstances, risk tolerance, and market conditions.\n",
347
+ "\n",
348
+ "4. Invest in Cybersecurity and AI: The news about deepfake manipulation highlights the potential growth in the cybersecurity and AI sectors. Consider investing in companies that are leading in these fields, as they could provide significant growth opportunities.\n",
349
+ "\n",
350
+ "5. Consider Tech Stocks: With the continuous advancements in technology, tech stocks, including those in AI, digital solutions, cybersecurity, and other tech sub-sectors, could be a good addition to your portfolio, given their potential for high growth.\n",
351
+ "\n",
352
+ "--------------------------------------------------------------------------------\n",
353
+ "\u001b[33muser\u001b[0m (to assistant):\n",
354
+ "\n",
355
+ "Just got some latest market news. Merge your new suggestion with previous ones.\n",
356
+ "News summary: Nvidia wins again - plus two more takeaways from this week's mega-cap earnings. We made some key conclusions combing through quarterly results for Microsoft and Alphabet and listening to their conference calls with investors. overall_sentiment_score: 0.235177\n",
357
+ "\n",
358
+ "--------------------------------------------------------------------------------\n",
359
+ "\u001b[33massistant\u001b[0m (to user):\n",
360
+ "\n",
361
+ "1. Diversify Your Portfolio: The latest news about Nvidia, Microsoft, and Alphabet suggests potential growth in the tech sector. Consider investing in these companies, but continue to maintain a diversified portfolio across various sectors and asset classes to mitigate risk.\n",
362
+ "\n",
363
+ "2. Long-Term Investment Strategy: While the news about these tech giants might cause short-term volatility, it's important to maintain a long-term investment perspective. The tech sector, particularly companies like Nvidia, Microsoft, and Alphabet, are expected to grow in the long run, providing potential growth opportunities.\n",
364
+ "\n",
365
+ "3. Regularly Review and Adjust Your Investments: Given the news about Nvidia, Microsoft, and Alphabet and their potential impact on the tech sector, it's important to review your portfolio to see if it aligns with these new market trends. Adjust your investments based on your personal circumstances, risk tolerance, and market conditions.\n",
366
+ "\n",
367
+ "4. Invest in Tech Giants: The news about Nvidia, Microsoft, and Alphabet highlights the potential growth in the tech sector. Consider investing in these tech giants, as they could provide significant growth opportunities.\n",
368
+ "\n",
369
+ "5. Consider Tech Stocks: With the continuous advancements in technology, tech stocks, including those in AI, digital solutions, cybersecurity, and other tech sub-sectors, could be a good addition to your portfolio, given their potential for high growth.\n",
370
+ "\n",
371
+ "--------------------------------------------------------------------------------\n"
372
+ ]
373
+ }
374
+ ],
375
+ "source": [
376
+ "await user_proxy.a_initiate_chat(\n",
377
+ " assistant,\n",
378
+ " message=\"\"\"Give me investment suggestion in 3 bullet points.\"\"\",\n",
379
+ ")\n",
380
+ "while not data_task.done() and not data_task.cancelled():\n",
381
+ " reply = await user_proxy.a_generate_reply(sender=assistant)\n",
382
+ " if reply is not None:\n",
383
+ " await user_proxy.a_send(reply, assistant)"
384
+ ]
385
+ }
386
+ ],
387
+ "metadata": {
388
+ "kernelspec": {
389
+ "display_name": "Python 3",
390
+ "language": "python",
391
+ "name": "python3"
392
+ },
393
+ "language_info": {
394
+ "codemirror_mode": {
395
+ "name": "ipython",
396
+ "version": 3
397
+ },
398
+ "file_extension": ".py",
399
+ "mimetype": "text/x-python",
400
+ "name": "python",
401
+ "nbconvert_exporter": "python",
402
+ "pygments_lexer": "ipython3",
403
+ "version": "3.9.17"
404
+ },
405
+ "vscode": {
406
+ "interpreter": {
407
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
408
+ }
409
+ },
410
+ "widgets": {
411
+ "application/vnd.jupyter.widget-state+json": {
412
+ "state": {
413
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
414
+ "model_module": "@jupyter-widgets/base",
415
+ "model_module_version": "2.0.0",
416
+ "model_name": "LayoutModel",
417
+ "state": {
418
+ "_model_module": "@jupyter-widgets/base",
419
+ "_model_module_version": "2.0.0",
420
+ "_model_name": "LayoutModel",
421
+ "_view_count": null,
422
+ "_view_module": "@jupyter-widgets/base",
423
+ "_view_module_version": "2.0.0",
424
+ "_view_name": "LayoutView",
425
+ "align_content": null,
426
+ "align_items": null,
427
+ "align_self": null,
428
+ "border_bottom": null,
429
+ "border_left": null,
430
+ "border_right": null,
431
+ "border_top": null,
432
+ "bottom": null,
433
+ "display": null,
434
+ "flex": null,
435
+ "flex_flow": null,
436
+ "grid_area": null,
437
+ "grid_auto_columns": null,
438
+ "grid_auto_flow": null,
439
+ "grid_auto_rows": null,
440
+ "grid_column": null,
441
+ "grid_gap": null,
442
+ "grid_row": null,
443
+ "grid_template_areas": null,
444
+ "grid_template_columns": null,
445
+ "grid_template_rows": null,
446
+ "height": null,
447
+ "justify_content": null,
448
+ "justify_items": null,
449
+ "left": null,
450
+ "margin": null,
451
+ "max_height": null,
452
+ "max_width": null,
453
+ "min_height": null,
454
+ "min_width": null,
455
+ "object_fit": null,
456
+ "object_position": null,
457
+ "order": null,
458
+ "overflow": null,
459
+ "padding": null,
460
+ "right": null,
461
+ "top": null,
462
+ "visibility": null,
463
+ "width": null
464
+ }
465
+ },
466
+ "454146d0f7224f038689031002906e6f": {
467
+ "model_module": "@jupyter-widgets/controls",
468
+ "model_module_version": "2.0.0",
469
+ "model_name": "HBoxModel",
470
+ "state": {
471
+ "_dom_classes": [],
472
+ "_model_module": "@jupyter-widgets/controls",
473
+ "_model_module_version": "2.0.0",
474
+ "_model_name": "HBoxModel",
475
+ "_view_count": null,
476
+ "_view_module": "@jupyter-widgets/controls",
477
+ "_view_module_version": "2.0.0",
478
+ "_view_name": "HBoxView",
479
+ "box_style": "",
480
+ "children": [
481
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
482
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
483
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
484
+ ],
485
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
486
+ "tabbable": null,
487
+ "tooltip": null
488
+ }
489
+ },
490
+ "577e1e3cc4db4942b0883577b3b52755": {
491
+ "model_module": "@jupyter-widgets/controls",
492
+ "model_module_version": "2.0.0",
493
+ "model_name": "FloatProgressModel",
494
+ "state": {
495
+ "_dom_classes": [],
496
+ "_model_module": "@jupyter-widgets/controls",
497
+ "_model_module_version": "2.0.0",
498
+ "_model_name": "FloatProgressModel",
499
+ "_view_count": null,
500
+ "_view_module": "@jupyter-widgets/controls",
501
+ "_view_module_version": "2.0.0",
502
+ "_view_name": "ProgressView",
503
+ "bar_style": "success",
504
+ "description": "",
505
+ "description_allow_html": false,
506
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
507
+ "max": 1,
508
+ "min": 0,
509
+ "orientation": "horizontal",
510
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
511
+ "tabbable": null,
512
+ "tooltip": null,
513
+ "value": 1
514
+ }
515
+ },
516
+ "6086462a12d54bafa59d3c4566f06cb2": {
517
+ "model_module": "@jupyter-widgets/base",
518
+ "model_module_version": "2.0.0",
519
+ "model_name": "LayoutModel",
520
+ "state": {
521
+ "_model_module": "@jupyter-widgets/base",
522
+ "_model_module_version": "2.0.0",
523
+ "_model_name": "LayoutModel",
524
+ "_view_count": null,
525
+ "_view_module": "@jupyter-widgets/base",
526
+ "_view_module_version": "2.0.0",
527
+ "_view_name": "LayoutView",
528
+ "align_content": null,
529
+ "align_items": null,
530
+ "align_self": null,
531
+ "border_bottom": null,
532
+ "border_left": null,
533
+ "border_right": null,
534
+ "border_top": null,
535
+ "bottom": null,
536
+ "display": null,
537
+ "flex": null,
538
+ "flex_flow": null,
539
+ "grid_area": null,
540
+ "grid_auto_columns": null,
541
+ "grid_auto_flow": null,
542
+ "grid_auto_rows": null,
543
+ "grid_column": null,
544
+ "grid_gap": null,
545
+ "grid_row": null,
546
+ "grid_template_areas": null,
547
+ "grid_template_columns": null,
548
+ "grid_template_rows": null,
549
+ "height": null,
550
+ "justify_content": null,
551
+ "justify_items": null,
552
+ "left": null,
553
+ "margin": null,
554
+ "max_height": null,
555
+ "max_width": null,
556
+ "min_height": null,
557
+ "min_width": null,
558
+ "object_fit": null,
559
+ "object_position": null,
560
+ "order": null,
561
+ "overflow": null,
562
+ "padding": null,
563
+ "right": null,
564
+ "top": null,
565
+ "visibility": null,
566
+ "width": null
567
+ }
568
+ },
569
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
570
+ "model_module": "@jupyter-widgets/controls",
571
+ "model_module_version": "2.0.0",
572
+ "model_name": "ProgressStyleModel",
573
+ "state": {
574
+ "_model_module": "@jupyter-widgets/controls",
575
+ "_model_module_version": "2.0.0",
576
+ "_model_name": "ProgressStyleModel",
577
+ "_view_count": null,
578
+ "_view_module": "@jupyter-widgets/base",
579
+ "_view_module_version": "2.0.0",
580
+ "_view_name": "StyleView",
581
+ "bar_color": null,
582
+ "description_width": ""
583
+ }
584
+ },
585
+ "7d3f3d9e15894d05a4d188ff4f466554": {
586
+ "model_module": "@jupyter-widgets/controls",
587
+ "model_module_version": "2.0.0",
588
+ "model_name": "HTMLStyleModel",
589
+ "state": {
590
+ "_model_module": "@jupyter-widgets/controls",
591
+ "_model_module_version": "2.0.0",
592
+ "_model_name": "HTMLStyleModel",
593
+ "_view_count": null,
594
+ "_view_module": "@jupyter-widgets/base",
595
+ "_view_module_version": "2.0.0",
596
+ "_view_name": "StyleView",
597
+ "background": null,
598
+ "description_width": "",
599
+ "font_size": null,
600
+ "text_color": null
601
+ }
602
+ },
603
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
604
+ "model_module": "@jupyter-widgets/controls",
605
+ "model_module_version": "2.0.0",
606
+ "model_name": "HTMLModel",
607
+ "state": {
608
+ "_dom_classes": [],
609
+ "_model_module": "@jupyter-widgets/controls",
610
+ "_model_module_version": "2.0.0",
611
+ "_model_name": "HTMLModel",
612
+ "_view_count": null,
613
+ "_view_module": "@jupyter-widgets/controls",
614
+ "_view_module_version": "2.0.0",
615
+ "_view_name": "HTMLView",
616
+ "description": "",
617
+ "description_allow_html": false,
618
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
619
+ "placeholder": "​",
620
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
621
+ "tabbable": null,
622
+ "tooltip": null,
623
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
624
+ }
625
+ },
626
+ "ca245376fd9f4354af6b2befe4af4466": {
627
+ "model_module": "@jupyter-widgets/controls",
628
+ "model_module_version": "2.0.0",
629
+ "model_name": "HTMLStyleModel",
630
+ "state": {
631
+ "_model_module": "@jupyter-widgets/controls",
632
+ "_model_module_version": "2.0.0",
633
+ "_model_name": "HTMLStyleModel",
634
+ "_view_count": null,
635
+ "_view_module": "@jupyter-widgets/base",
636
+ "_view_module_version": "2.0.0",
637
+ "_view_name": "StyleView",
638
+ "background": null,
639
+ "description_width": "",
640
+ "font_size": null,
641
+ "text_color": null
642
+ }
643
+ },
644
+ "dc83c7bff2f241309537a8119dfc7555": {
645
+ "model_module": "@jupyter-widgets/base",
646
+ "model_module_version": "2.0.0",
647
+ "model_name": "LayoutModel",
648
+ "state": {
649
+ "_model_module": "@jupyter-widgets/base",
650
+ "_model_module_version": "2.0.0",
651
+ "_model_name": "LayoutModel",
652
+ "_view_count": null,
653
+ "_view_module": "@jupyter-widgets/base",
654
+ "_view_module_version": "2.0.0",
655
+ "_view_name": "LayoutView",
656
+ "align_content": null,
657
+ "align_items": null,
658
+ "align_self": null,
659
+ "border_bottom": null,
660
+ "border_left": null,
661
+ "border_right": null,
662
+ "border_top": null,
663
+ "bottom": null,
664
+ "display": null,
665
+ "flex": null,
666
+ "flex_flow": null,
667
+ "grid_area": null,
668
+ "grid_auto_columns": null,
669
+ "grid_auto_flow": null,
670
+ "grid_auto_rows": null,
671
+ "grid_column": null,
672
+ "grid_gap": null,
673
+ "grid_row": null,
674
+ "grid_template_areas": null,
675
+ "grid_template_columns": null,
676
+ "grid_template_rows": null,
677
+ "height": null,
678
+ "justify_content": null,
679
+ "justify_items": null,
680
+ "left": null,
681
+ "margin": null,
682
+ "max_height": null,
683
+ "max_width": null,
684
+ "min_height": null,
685
+ "min_width": null,
686
+ "object_fit": null,
687
+ "object_position": null,
688
+ "order": null,
689
+ "overflow": null,
690
+ "padding": null,
691
+ "right": null,
692
+ "top": null,
693
+ "visibility": null,
694
+ "width": null
695
+ }
696
+ },
697
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
698
+ "model_module": "@jupyter-widgets/controls",
699
+ "model_module_version": "2.0.0",
700
+ "model_name": "HTMLModel",
701
+ "state": {
702
+ "_dom_classes": [],
703
+ "_model_module": "@jupyter-widgets/controls",
704
+ "_model_module_version": "2.0.0",
705
+ "_model_name": "HTMLModel",
706
+ "_view_count": null,
707
+ "_view_module": "@jupyter-widgets/controls",
708
+ "_view_module_version": "2.0.0",
709
+ "_view_name": "HTMLView",
710
+ "description": "",
711
+ "description_allow_html": false,
712
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
713
+ "placeholder": "​",
714
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
715
+ "tabbable": null,
716
+ "tooltip": null,
717
+ "value": "100%"
718
+ }
719
+ },
720
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
721
+ "model_module": "@jupyter-widgets/base",
722
+ "model_module_version": "2.0.0",
723
+ "model_name": "LayoutModel",
724
+ "state": {
725
+ "_model_module": "@jupyter-widgets/base",
726
+ "_model_module_version": "2.0.0",
727
+ "_model_name": "LayoutModel",
728
+ "_view_count": null,
729
+ "_view_module": "@jupyter-widgets/base",
730
+ "_view_module_version": "2.0.0",
731
+ "_view_name": "LayoutView",
732
+ "align_content": null,
733
+ "align_items": null,
734
+ "align_self": null,
735
+ "border_bottom": null,
736
+ "border_left": null,
737
+ "border_right": null,
738
+ "border_top": null,
739
+ "bottom": null,
740
+ "display": null,
741
+ "flex": null,
742
+ "flex_flow": null,
743
+ "grid_area": null,
744
+ "grid_auto_columns": null,
745
+ "grid_auto_flow": null,
746
+ "grid_auto_rows": null,
747
+ "grid_column": null,
748
+ "grid_gap": null,
749
+ "grid_row": null,
750
+ "grid_template_areas": null,
751
+ "grid_template_columns": null,
752
+ "grid_template_rows": null,
753
+ "height": null,
754
+ "justify_content": null,
755
+ "justify_items": null,
756
+ "left": null,
757
+ "margin": null,
758
+ "max_height": null,
759
+ "max_width": null,
760
+ "min_height": null,
761
+ "min_width": null,
762
+ "object_fit": null,
763
+ "object_position": null,
764
+ "order": null,
765
+ "overflow": null,
766
+ "padding": null,
767
+ "right": null,
768
+ "top": null,
769
+ "visibility": null,
770
+ "width": null
771
+ }
772
+ }
773
+ },
774
+ "version_major": 2,
775
+ "version_minor": 0
776
+ }
777
+ }
778
+ },
779
+ "nbformat": 4,
780
+ "nbformat_minor": 2
781
+ }
agentchat_teaching.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
agentchat_two_users.ipynb ADDED
@@ -0,0 +1,886 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "# Auto Generated Agent Chat: Collaborative Task Solving with Multiple Agents and Human Users\n",
21
+ "\n",
22
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
23
+ "\n",
24
+ "In this notebook, we demonstrate an application involving multiple agents and human users to work together and accomplish a task. `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. We create multiple `UserProxyAgent` instances which can represent different human users.\n",
25
+ "\n",
26
+ "## Requirements\n",
27
+ "\n",
28
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
29
+ "```bash\n",
30
+ "pip install pyautogen\n",
31
+ "```"
32
+ ]
33
+ },
34
+ {
35
+ "cell_type": "code",
36
+ "execution_count": 1,
37
+ "metadata": {
38
+ "execution": {
39
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
40
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
41
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
42
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
43
+ }
44
+ },
45
+ "outputs": [],
46
+ "source": [
47
+ "# %pip install pyautogen~=0.1.1"
48
+ ]
49
+ },
50
+ {
51
+ "attachments": {},
52
+ "cell_type": "markdown",
53
+ "metadata": {},
54
+ "source": [
55
+ "## Set your API Endpoint\n",
56
+ "\n",
57
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n",
58
+ "\n",
59
+ "It first looks for an environment variable of a specified name (\"OAI_CONFIG_LIST\" in this example) which needs to be a valid json string. If that variable is not found, it then looks for a json file with the same name. It filters the configs by models (you can filter by other keys as well).\n",
60
+ "\n",
61
+ "The json looks like the following:\n",
62
+ "```json\n",
63
+ "[\n",
64
+ " {\n",
65
+ " \"model\": \"gpt-4\",\n",
66
+ " \"api_key\": \"<your OpenAI API key here>\"\n",
67
+ " },\n",
68
+ " {\n",
69
+ " \"model\": \"gpt-4\",\n",
70
+ " \"api_key\": \"<your Azure OpenAI API key here>\",\n",
71
+ " \"api_base\": \"<your Azure OpenAI API base here>\",\n",
72
+ " \"api_type\": \"azure\",\n",
73
+ " \"api_version\": \"2023-07-01-preview\"\n",
74
+ " },\n",
75
+ " {\n",
76
+ " \"model\": \"gpt-4-32k\",\n",
77
+ " \"api_key\": \"<your Azure OpenAI API key here>\",\n",
78
+ " \"api_base\": \"<your Azure OpenAI API base here>\",\n",
79
+ " \"api_type\": \"azure\",\n",
80
+ " \"api_version\": \"2023-07-01-preview\"\n",
81
+ " }\n",
82
+ "]\n",
83
+ "```\n",
84
+ "\n",
85
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n"
86
+ ]
87
+ },
88
+ {
89
+ "cell_type": "code",
90
+ "execution_count": 2,
91
+ "metadata": {},
92
+ "outputs": [],
93
+ "source": [
94
+ "import autogen\n",
95
+ "\n",
96
+ "config_list = autogen.config_list_from_json(\n",
97
+ " \"OAI_CONFIG_LIST\",\n",
98
+ " filter_dict={\n",
99
+ " \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
100
+ " },\n",
101
+ ")"
102
+ ]
103
+ },
104
+ {
105
+ "attachments": {},
106
+ "cell_type": "markdown",
107
+ "metadata": {},
108
+ "source": [
109
+ "## Construct Agents\n",
110
+ "\n",
111
+ "We define `ask_expert` function to start a conversation between two agents and return a summary of the result. We construct an assistant agent named \"assistant_for_expert\" and a user proxy agent named \"expert\". We specify `human_input_mode` as \"ALWAYS\" in the user proxy agent, which will always ask for feedback from the expert user."
112
+ ]
113
+ },
114
+ {
115
+ "cell_type": "code",
116
+ "execution_count": 3,
117
+ "metadata": {},
118
+ "outputs": [],
119
+ "source": [
120
+ "def ask_expert(message):\n",
121
+ " assistant_for_expert = autogen.AssistantAgent(\n",
122
+ " name=\"assistant_for_expert\",\n",
123
+ " llm_config={\n",
124
+ " \"temperature\": 0,\n",
125
+ " \"config_list\": config_list,\n",
126
+ " },\n",
127
+ " )\n",
128
+ " expert = autogen.UserProxyAgent(\n",
129
+ " name=\"expert\",\n",
130
+ " human_input_mode=\"ALWAYS\",\n",
131
+ " code_execution_config={\"work_dir\": \"expert\"},\n",
132
+ " )\n",
133
+ "\n",
134
+ " expert.initiate_chat(assistant_for_expert, message=message)\n",
135
+ " expert.stop_reply_at_receive(assistant_for_expert)\n",
136
+ " # expert.human_input_mode, expert.max_consecutive_auto_reply = \"NEVER\", 0\n",
137
+ " # final message sent from the expert\n",
138
+ " expert.send(\"summarize the solution and explain the answer in an easy-to-understand way\", assistant_for_expert)\n",
139
+ " # return the last message the expert received\n",
140
+ " return expert.last_message()[\"content\"]\n"
141
+ ]
142
+ },
143
+ {
144
+ "attachments": {},
145
+ "cell_type": "markdown",
146
+ "metadata": {},
147
+ "source": [
148
+ "We construct another assistant agent named \"assistant_for_student\" and a user proxy agent named \"student\". We specify `human_input_mode` as \"TERMINATE\" in the user proxy agent, which will ask for feedback when it receives a \"TERMINATE\" signal from the assistant agent. We set the `functions` in `AssistantAgent` and `function_map` in `UserProxyAgent` to use the created `ask_expert` function.\n",
149
+ "\n",
150
+ "For simplicity, the `ask_expert` function is defined to run locally. For real applications, the function should run remotely to interact with an expert user."
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": 4,
156
+ "metadata": {},
157
+ "outputs": [],
158
+ "source": [
159
+ "assistant_for_student = autogen.AssistantAgent(\n",
160
+ " name=\"assistant_for_student\",\n",
161
+ " system_message=\"You are a helpful assistant. Reply TERMINATE when the task is done.\",\n",
162
+ " llm_config={\n",
163
+ " \"request_timeout\": 600,\n",
164
+ " \"seed\": 42,\n",
165
+ " # Excluding azure openai endpoints from the config list.\n",
166
+ " # Change to `exclude=\"openai\"` to exclude openai endpoints, or remove the `exclude` argument to include both.\n",
167
+ " \"config_list\": autogen.config_list_openai_aoai(exclude=\"aoai\"),\n",
168
+ " \"model\": \"gpt-4-0613\", # make sure the endpoint you use supports the model\n",
169
+ " \"temperature\": 0,\n",
170
+ " \"functions\": [\n",
171
+ " {\n",
172
+ " \"name\": \"ask_expert\",\n",
173
+ " \"description\": \"ask expert when you can't solve the problem satisfactorily.\",\n",
174
+ " \"parameters\": {\n",
175
+ " \"type\": \"object\",\n",
176
+ " \"properties\": {\n",
177
+ " \"message\": {\n",
178
+ " \"type\": \"string\",\n",
179
+ " \"description\": \"question to ask expert. Make sure the question include enough context, such as the code and the execution result. The expert does not know the conversation between you and the user, unless you share the conversation with the expert.\",\n",
180
+ " },\n",
181
+ " },\n",
182
+ " \"required\": [\"message\"],\n",
183
+ " },\n",
184
+ " }\n",
185
+ " ],\n",
186
+ " }\n",
187
+ ")\n",
188
+ "\n",
189
+ "student = autogen.UserProxyAgent(\n",
190
+ " name=\"student\",\n",
191
+ " human_input_mode=\"TERMINATE\",\n",
192
+ " max_consecutive_auto_reply=10,\n",
193
+ " code_execution_config={\"work_dir\": \"student\"},\n",
194
+ " function_map={\"ask_expert\": ask_expert},\n",
195
+ ")"
196
+ ]
197
+ },
198
+ {
199
+ "attachments": {},
200
+ "cell_type": "markdown",
201
+ "metadata": {},
202
+ "source": [
203
+ "## Perform a task\n",
204
+ "\n",
205
+ "We invoke the `initiate_chat()` method of the student proxy agent to start the conversation. When you run the cell below, you will be prompted to provide feedback after the assistant agent sends a \"TERMINATE\" signal in the end of the message. If you don't provide any feedback (by pressing Enter directly), the conversation will finish. Before the \"TERMINATE\" signal, the student proxy agent will try to execute the code suggested by the assistant agent on behalf of the user."
206
+ ]
207
+ },
208
+ {
209
+ "cell_type": "code",
210
+ "execution_count": 5,
211
+ "metadata": {},
212
+ "outputs": [
213
+ {
214
+ "name": "stdout",
215
+ "output_type": "stream",
216
+ "text": [
217
+ "\u001b[33mstudent\u001b[0m (to assistant_for_student):\n",
218
+ "\n",
219
+ "Find $a + b + c$, given that $x+y \\neq -1$ and \n",
220
+ "\\begin{align}\n",
221
+ "\tax + by + c & = x + 7,\\\n",
222
+ "\ta + bx + cy & = 2x + 6y,\\\n",
223
+ "\tay + b + cx & = 4x + y.\n",
224
+ "\\end{align}.\n",
225
+ "\n",
226
+ "\n",
227
+ "--------------------------------------------------------------------------------\n",
228
+ "\u001b[33massistant_for_student\u001b[0m (to student):\n",
229
+ "\n",
230
+ "To find the values of $a$, $b$, and $c$, we need to solve the system of equations. However, the system of equations you provided seems to be incomplete or incorrect. The equations are not properly formatted, and it's unclear how $x$ and $y$ are related to $a$, $b$, and $c$. \n",
231
+ "\n",
232
+ "Could you please provide the correct system of equations?\n",
233
+ "\n",
234
+ "--------------------------------------------------------------------------------\n",
235
+ "\u001b[31m\n",
236
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
237
+ "\u001b[33mstudent\u001b[0m (to assistant_for_student):\n",
238
+ "\n",
239
+ "\n",
240
+ "\n",
241
+ "--------------------------------------------------------------------------------\n",
242
+ "\u001b[33massistant_for_student\u001b[0m (to student):\n",
243
+ "\n",
244
+ "\u001b[32m***** Suggested function Call: ask_expert *****\u001b[0m\n",
245
+ "Arguments: \n",
246
+ "{\n",
247
+ " \"message\": \"The user provided a system of equations and asked to find $a + b + c$, given that $x+y \\\\neq -1$ and \\n\\\\begin{align}\\n\\tax + by + c & = x + 7,\\\\\\n\\ta + bx + cy & = 2x + 6y,\\\\\\n\\tay + b + cx & = 4x + y.\\n\\\\end{align}. However, the system of equations seems to be incomplete or incorrectly formatted. How should I proceed?\"\n",
248
+ "}\n",
249
+ "\u001b[32m***********************************************\u001b[0m\n",
250
+ "\n",
251
+ "--------------------------------------------------------------------------------\n",
252
+ "\u001b[31m\n",
253
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
254
+ "\u001b[35m\n",
255
+ ">>>>>>>> EXECUTING FUNCTION ask_expert...\u001b[0m\n",
256
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
257
+ "\n",
258
+ "The user provided a system of equations and asked to find $a + b + c$, given that $x+y \\neq -1$ and \n",
259
+ "\\begin{align}\n",
260
+ "\tax + by + c & = x + 7,\\\n",
261
+ "\ta + bx + cy & = 2x + 6y,\\\n",
262
+ "\tay + b + cx & = 4x + y.\n",
263
+ "\\end{align}. However, the system of equations seems to be incomplete or incorrectly formatted. How should I proceed?\n",
264
+ "\n",
265
+ "--------------------------------------------------------------------------------\n",
266
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
267
+ "\n",
268
+ "To proceed, we can first try to solve the given system of equations and see if we can find a unique solution for $a$, $b$, and $c$. If we can't find a unique solution, we can then determine if there's any additional information needed or if the system is indeed incomplete or incorrectly formatted.\n",
269
+ "\n",
270
+ "Let's solve the given system of equations using Python's SymPy library.\n",
271
+ "\n",
272
+ "--------------------------------------------------------------------------------\n",
273
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
274
+ "\n",
275
+ "make sure you simplify the answer\n",
276
+ "\n",
277
+ "--------------------------------------------------------------------------------\n",
278
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
279
+ "\n",
280
+ "Sure, I will simplify the answer. Let's solve the given system of equations using Python's SymPy library and simplify the result.\n",
281
+ "\n",
282
+ "--------------------------------------------------------------------------------\n",
283
+ "\u001b[31m\n",
284
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
285
+ "\u001b[31m\n",
286
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
287
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
288
+ "\n",
289
+ "\n",
290
+ "\n",
291
+ "--------------------------------------------------------------------------------\n",
292
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
293
+ "\n",
294
+ "First, let's install the SymPy library if you haven't already. Execute the following command to install it:\n",
295
+ "\n",
296
+ "```sh\n",
297
+ "pip install sympy\n",
298
+ "```\n",
299
+ "\n",
300
+ "Now, let's solve the given system of equations using Python's SymPy library and simplify the result.\n",
301
+ "\n",
302
+ "```python\n",
303
+ "from sympy import symbols, Eq, solve, simplify\n",
304
+ "\n",
305
+ "a, b, c, x, y = symbols('a b c x y')\n",
306
+ "\n",
307
+ "eq1 = Eq(a * x + b * y + c, x + 7)\n",
308
+ "eq2 = Eq(a + b * x + c * y, 2 * x + 6 * y)\n",
309
+ "eq3 = Eq(a * y + b + c * x, 4 * x + y)\n",
310
+ "\n",
311
+ "solutions = solve((eq1, eq2, eq3), (a, b, c))\n",
312
+ "\n",
313
+ "simplified_solutions = {key: simplify(value) for key, value in solutions.items()}\n",
314
+ "a_val, b_val, c_val = simplified_solutions[a], simplified_solutions[b], simplified_solutions[c]\n",
315
+ "sum_abc = simplify(a_val + b_val + c_val)\n",
316
+ "\n",
317
+ "print(f\"a: {a_val}\")\n",
318
+ "print(f\"b: {b_val}\")\n",
319
+ "print(f\"c: {c_val}\")\n",
320
+ "print(f\"a + b + c: {sum_abc}\")\n",
321
+ "```\n",
322
+ "\n",
323
+ "Please execute the above Python code to solve the system of equations and simplify the result.\n",
324
+ "\n",
325
+ "--------------------------------------------------------------------------------\n",
326
+ "\u001b[31m\n",
327
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
328
+ "\u001b[31m\n",
329
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
330
+ "\u001b[31m\n",
331
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...\u001b[0m\n",
332
+ "\u001b[31m\n",
333
+ ">>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is python)...\u001b[0m\n",
334
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
335
+ "\n",
336
+ "exitcode: 0 (execution succeeded)\n",
337
+ "Code output: \n",
338
+ "Defaulting to user installation because normal site-packages is not writeable\n",
339
+ "Requirement already satisfied: sympy in /home/vscode/.local/lib/python3.9/site-packages (1.12)\n",
340
+ "Requirement already satisfied: mpmath>=0.19 in /home/vscode/.local/lib/python3.9/site-packages (from sympy) (1.3.0)\n",
341
+ "\n",
342
+ "a: (x**2 - 3*x*y + 2*x + y**2 - y)/(x**2 - x*y - x + y**2 - y + 1)\n",
343
+ "b: (2*x**2 - 3*x + y)/(x**2 - x*y - x + y**2 - y + 1)\n",
344
+ "c: (4*x**2 - 4*x*y - 6*x + 6*y**2 - 7*y + 7)/(x**2 - x*y - x + y**2 - y + 1)\n",
345
+ "a + b + c: 7\n",
346
+ "\n",
347
+ "\n",
348
+ "--------------------------------------------------------------------------------\n",
349
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
350
+ "\n",
351
+ "The given system of equations has been solved and simplified. The result is:\n",
352
+ "\n",
353
+ "$$\n",
354
+ "a = \\frac{x^2 - 3xy + 2x + y^2 - y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
355
+ "b = \\frac{2x^2 - 3x + y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
356
+ "c = \\frac{4x^2 - 4xy - 6x + 6y^2 - 7y + 7}{x^2 - xy - x + y^2 - y + 1}.\n",
357
+ "$$\n",
358
+ "\n",
359
+ "The sum $a + b + c$ is equal to $7$. Since we were able to find a unique solution for $a$, $b$, and $c$, the system of equations is complete and correctly formatted.\n",
360
+ "\n",
361
+ "TERMINATE\n",
362
+ "\n",
363
+ "--------------------------------------------------------------------------------\n",
364
+ "\u001b[31m\n",
365
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
366
+ "\u001b[31m\n",
367
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
368
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
369
+ "\n",
370
+ "\n",
371
+ "\n",
372
+ "--------------------------------------------------------------------------------\n",
373
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
374
+ "\n",
375
+ "TERMINATE\n",
376
+ "\n",
377
+ "--------------------------------------------------------------------------------\n",
378
+ "\u001b[31m\n",
379
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
380
+ "\u001b[33mexpert\u001b[0m (to assistant_for_expert):\n",
381
+ "\n",
382
+ "summarize the solution and explain the answer in an easy-to-understand way\n",
383
+ "\n",
384
+ "--------------------------------------------------------------------------------\n",
385
+ "\u001b[33massistant_for_expert\u001b[0m (to expert):\n",
386
+ "\n",
387
+ "Sure, let me summarize the solution and explain the answer in an easy-to-understand way.\n",
388
+ "\n",
389
+ "We were given a system of three equations with three unknowns, $a$, $b$, and $c$. The equations were:\n",
390
+ "\n",
391
+ "1. $ax + by + c = x + 7$\n",
392
+ "2. $a + bx + cy = 2x + 6y$\n",
393
+ "3. $ay + b + cx = 4x + y$\n",
394
+ "\n",
395
+ "We were asked to find the sum $a + b + c$ given that $x + y \\neq -1$.\n",
396
+ "\n",
397
+ "We used Python's SymPy library to solve the system of equations and found the following solutions for $a$, $b$, and $c$:\n",
398
+ "\n",
399
+ "$$\n",
400
+ "a = \\frac{x^2 - 3xy + 2x + y^2 - y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
401
+ "b = \\frac{2x^2 - 3x + y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
402
+ "c = \\frac{4x^2 - 4xy - 6x + 6y^2 - 7y + 7}{x^2 - xy - x + y^2 - y + 1}.\n",
403
+ "$$\n",
404
+ "\n",
405
+ "These solutions are expressed in terms of $x$ and $y$. However, we were able to find the sum $a + b + c$ without knowing the values of $x$ and $y$. The sum $a + b + c$ simplifies to $7$.\n",
406
+ "\n",
407
+ "In conclusion, the sum $a + b + c$ for the given system of equations is equal to $7$.\n",
408
+ "\n",
409
+ "--------------------------------------------------------------------------------\n",
410
+ "\u001b[33mstudent\u001b[0m (to assistant_for_student):\n",
411
+ "\n",
412
+ "\u001b[32m***** Response from calling function \"ask_expert\" *****\u001b[0m\n",
413
+ "Sure, let me summarize the solution and explain the answer in an easy-to-understand way.\n",
414
+ "\n",
415
+ "We were given a system of three equations with three unknowns, $a$, $b$, and $c$. The equations were:\n",
416
+ "\n",
417
+ "1. $ax + by + c = x + 7$\n",
418
+ "2. $a + bx + cy = 2x + 6y$\n",
419
+ "3. $ay + b + cx = 4x + y$\n",
420
+ "\n",
421
+ "We were asked to find the sum $a + b + c$ given that $x + y \\neq -1$.\n",
422
+ "\n",
423
+ "We used Python's SymPy library to solve the system of equations and found the following solutions for $a$, $b$, and $c$:\n",
424
+ "\n",
425
+ "$$\n",
426
+ "a = \\frac{x^2 - 3xy + 2x + y^2 - y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
427
+ "b = \\frac{2x^2 - 3x + y}{x^2 - xy - x + y^2 - y + 1}, \\\\\n",
428
+ "c = \\frac{4x^2 - 4xy - 6x + 6y^2 - 7y + 7}{x^2 - xy - x + y^2 - y + 1}.\n",
429
+ "$$\n",
430
+ "\n",
431
+ "These solutions are expressed in terms of $x$ and $y$. However, we were able to find the sum $a + b + c$ without knowing the values of $x$ and $y$. The sum $a + b + c$ simplifies to $7$.\n",
432
+ "\n",
433
+ "In conclusion, the sum $a + b + c$ for the given system of equations is equal to $7$.\n",
434
+ "\u001b[32m*******************************************************\u001b[0m\n",
435
+ "\n",
436
+ "--------------------------------------------------------------------------------\n",
437
+ "\u001b[33massistant_for_student\u001b[0m (to student):\n",
438
+ "\n",
439
+ "The solution to the system of equations you provided is:\n",
440
+ "\n",
441
+ "1. $a = \\frac{x^2 - 3xy + 2x + y^2 - y}{x^2 - xy - x + y^2 - y + 1}$\n",
442
+ "2. $b = \\frac{2x^2 - 3x + y}{x^2 - xy - x + y^2 - y + 1}$\n",
443
+ "3. $c = \\frac{4x^2 - 4xy - 6x + 6y^2 - 7y + 7}{x^2 - xy - x + y^2 - y + 1}$\n",
444
+ "\n",
445
+ "These solutions are expressed in terms of $x$ and $y$. However, we can find the sum $a + b + c$ without knowing the values of $x$ and $y$. The sum $a + b + c$ simplifies to $7$.\n",
446
+ "\n",
447
+ "So, the sum $a + b + c$ for the given system of equations is equal to $7$.\n",
448
+ "\n",
449
+ "--------------------------------------------------------------------------------\n",
450
+ "\u001b[31m\n",
451
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
452
+ "\u001b[33mstudent\u001b[0m (to assistant_for_student):\n",
453
+ "\n",
454
+ "\n",
455
+ "\n",
456
+ "--------------------------------------------------------------------------------\n",
457
+ "\u001b[33massistant_for_student\u001b[0m (to student):\n",
458
+ "\n",
459
+ "TERMINATE\n",
460
+ "\n",
461
+ "--------------------------------------------------------------------------------\n",
462
+ "\u001b[31m\n",
463
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
464
+ ]
465
+ }
466
+ ],
467
+ "source": [
468
+ "# the assistant receives a message from the student, which contains the task description\n",
469
+ "student.initiate_chat(\n",
470
+ " assistant_for_student,\n",
471
+ " message=\"\"\"Find $a + b + c$, given that $x+y \\\\neq -1$ and \n",
472
+ "\\\\begin{align}\n",
473
+ "\tax + by + c & = x + 7,\\\\\n",
474
+ "\ta + bx + cy & = 2x + 6y,\\\\\n",
475
+ "\tay + b + cx & = 4x + y.\n",
476
+ "\\\\end{align}.\n",
477
+ "\"\"\",\n",
478
+ ")"
479
+ ]
480
+ },
481
+ {
482
+ "attachments": {},
483
+ "cell_type": "markdown",
484
+ "metadata": {},
485
+ "source": [
486
+ "When the assistant needs to consult the expert, it suggests a function call to `ask_expert`. When this happens, a line like the following will be displayed:\n",
487
+ "\n",
488
+ "***** Suggested function Call: ask_expert *****\n"
489
+ ]
490
+ }
491
+ ],
492
+ "metadata": {
493
+ "kernelspec": {
494
+ "display_name": "Python 3",
495
+ "language": "python",
496
+ "name": "python3"
497
+ },
498
+ "language_info": {
499
+ "codemirror_mode": {
500
+ "name": "ipython",
501
+ "version": 3
502
+ },
503
+ "file_extension": ".py",
504
+ "mimetype": "text/x-python",
505
+ "name": "python",
506
+ "nbconvert_exporter": "python",
507
+ "pygments_lexer": "ipython3",
508
+ "version": "3.9.17"
509
+ },
510
+ "vscode": {
511
+ "interpreter": {
512
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
513
+ }
514
+ },
515
+ "widgets": {
516
+ "application/vnd.jupyter.widget-state+json": {
517
+ "state": {
518
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
519
+ "model_module": "@jupyter-widgets/base",
520
+ "model_module_version": "2.0.0",
521
+ "model_name": "LayoutModel",
522
+ "state": {
523
+ "_model_module": "@jupyter-widgets/base",
524
+ "_model_module_version": "2.0.0",
525
+ "_model_name": "LayoutModel",
526
+ "_view_count": null,
527
+ "_view_module": "@jupyter-widgets/base",
528
+ "_view_module_version": "2.0.0",
529
+ "_view_name": "LayoutView",
530
+ "align_content": null,
531
+ "align_items": null,
532
+ "align_self": null,
533
+ "border_bottom": null,
534
+ "border_left": null,
535
+ "border_right": null,
536
+ "border_top": null,
537
+ "bottom": null,
538
+ "display": null,
539
+ "flex": null,
540
+ "flex_flow": null,
541
+ "grid_area": null,
542
+ "grid_auto_columns": null,
543
+ "grid_auto_flow": null,
544
+ "grid_auto_rows": null,
545
+ "grid_column": null,
546
+ "grid_gap": null,
547
+ "grid_row": null,
548
+ "grid_template_areas": null,
549
+ "grid_template_columns": null,
550
+ "grid_template_rows": null,
551
+ "height": null,
552
+ "justify_content": null,
553
+ "justify_items": null,
554
+ "left": null,
555
+ "margin": null,
556
+ "max_height": null,
557
+ "max_width": null,
558
+ "min_height": null,
559
+ "min_width": null,
560
+ "object_fit": null,
561
+ "object_position": null,
562
+ "order": null,
563
+ "overflow": null,
564
+ "padding": null,
565
+ "right": null,
566
+ "top": null,
567
+ "visibility": null,
568
+ "width": null
569
+ }
570
+ },
571
+ "454146d0f7224f038689031002906e6f": {
572
+ "model_module": "@jupyter-widgets/controls",
573
+ "model_module_version": "2.0.0",
574
+ "model_name": "HBoxModel",
575
+ "state": {
576
+ "_dom_classes": [],
577
+ "_model_module": "@jupyter-widgets/controls",
578
+ "_model_module_version": "2.0.0",
579
+ "_model_name": "HBoxModel",
580
+ "_view_count": null,
581
+ "_view_module": "@jupyter-widgets/controls",
582
+ "_view_module_version": "2.0.0",
583
+ "_view_name": "HBoxView",
584
+ "box_style": "",
585
+ "children": [
586
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
587
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
588
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
589
+ ],
590
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
591
+ "tabbable": null,
592
+ "tooltip": null
593
+ }
594
+ },
595
+ "577e1e3cc4db4942b0883577b3b52755": {
596
+ "model_module": "@jupyter-widgets/controls",
597
+ "model_module_version": "2.0.0",
598
+ "model_name": "FloatProgressModel",
599
+ "state": {
600
+ "_dom_classes": [],
601
+ "_model_module": "@jupyter-widgets/controls",
602
+ "_model_module_version": "2.0.0",
603
+ "_model_name": "FloatProgressModel",
604
+ "_view_count": null,
605
+ "_view_module": "@jupyter-widgets/controls",
606
+ "_view_module_version": "2.0.0",
607
+ "_view_name": "ProgressView",
608
+ "bar_style": "success",
609
+ "description": "",
610
+ "description_allow_html": false,
611
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
612
+ "max": 1,
613
+ "min": 0,
614
+ "orientation": "horizontal",
615
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
616
+ "tabbable": null,
617
+ "tooltip": null,
618
+ "value": 1
619
+ }
620
+ },
621
+ "6086462a12d54bafa59d3c4566f06cb2": {
622
+ "model_module": "@jupyter-widgets/base",
623
+ "model_module_version": "2.0.0",
624
+ "model_name": "LayoutModel",
625
+ "state": {
626
+ "_model_module": "@jupyter-widgets/base",
627
+ "_model_module_version": "2.0.0",
628
+ "_model_name": "LayoutModel",
629
+ "_view_count": null,
630
+ "_view_module": "@jupyter-widgets/base",
631
+ "_view_module_version": "2.0.0",
632
+ "_view_name": "LayoutView",
633
+ "align_content": null,
634
+ "align_items": null,
635
+ "align_self": null,
636
+ "border_bottom": null,
637
+ "border_left": null,
638
+ "border_right": null,
639
+ "border_top": null,
640
+ "bottom": null,
641
+ "display": null,
642
+ "flex": null,
643
+ "flex_flow": null,
644
+ "grid_area": null,
645
+ "grid_auto_columns": null,
646
+ "grid_auto_flow": null,
647
+ "grid_auto_rows": null,
648
+ "grid_column": null,
649
+ "grid_gap": null,
650
+ "grid_row": null,
651
+ "grid_template_areas": null,
652
+ "grid_template_columns": null,
653
+ "grid_template_rows": null,
654
+ "height": null,
655
+ "justify_content": null,
656
+ "justify_items": null,
657
+ "left": null,
658
+ "margin": null,
659
+ "max_height": null,
660
+ "max_width": null,
661
+ "min_height": null,
662
+ "min_width": null,
663
+ "object_fit": null,
664
+ "object_position": null,
665
+ "order": null,
666
+ "overflow": null,
667
+ "padding": null,
668
+ "right": null,
669
+ "top": null,
670
+ "visibility": null,
671
+ "width": null
672
+ }
673
+ },
674
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
675
+ "model_module": "@jupyter-widgets/controls",
676
+ "model_module_version": "2.0.0",
677
+ "model_name": "ProgressStyleModel",
678
+ "state": {
679
+ "_model_module": "@jupyter-widgets/controls",
680
+ "_model_module_version": "2.0.0",
681
+ "_model_name": "ProgressStyleModel",
682
+ "_view_count": null,
683
+ "_view_module": "@jupyter-widgets/base",
684
+ "_view_module_version": "2.0.0",
685
+ "_view_name": "StyleView",
686
+ "bar_color": null,
687
+ "description_width": ""
688
+ }
689
+ },
690
+ "7d3f3d9e15894d05a4d188ff4f466554": {
691
+ "model_module": "@jupyter-widgets/controls",
692
+ "model_module_version": "2.0.0",
693
+ "model_name": "HTMLStyleModel",
694
+ "state": {
695
+ "_model_module": "@jupyter-widgets/controls",
696
+ "_model_module_version": "2.0.0",
697
+ "_model_name": "HTMLStyleModel",
698
+ "_view_count": null,
699
+ "_view_module": "@jupyter-widgets/base",
700
+ "_view_module_version": "2.0.0",
701
+ "_view_name": "StyleView",
702
+ "background": null,
703
+ "description_width": "",
704
+ "font_size": null,
705
+ "text_color": null
706
+ }
707
+ },
708
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
709
+ "model_module": "@jupyter-widgets/controls",
710
+ "model_module_version": "2.0.0",
711
+ "model_name": "HTMLModel",
712
+ "state": {
713
+ "_dom_classes": [],
714
+ "_model_module": "@jupyter-widgets/controls",
715
+ "_model_module_version": "2.0.0",
716
+ "_model_name": "HTMLModel",
717
+ "_view_count": null,
718
+ "_view_module": "@jupyter-widgets/controls",
719
+ "_view_module_version": "2.0.0",
720
+ "_view_name": "HTMLView",
721
+ "description": "",
722
+ "description_allow_html": false,
723
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
724
+ "placeholder": "​",
725
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
726
+ "tabbable": null,
727
+ "tooltip": null,
728
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
729
+ }
730
+ },
731
+ "ca245376fd9f4354af6b2befe4af4466": {
732
+ "model_module": "@jupyter-widgets/controls",
733
+ "model_module_version": "2.0.0",
734
+ "model_name": "HTMLStyleModel",
735
+ "state": {
736
+ "_model_module": "@jupyter-widgets/controls",
737
+ "_model_module_version": "2.0.0",
738
+ "_model_name": "HTMLStyleModel",
739
+ "_view_count": null,
740
+ "_view_module": "@jupyter-widgets/base",
741
+ "_view_module_version": "2.0.0",
742
+ "_view_name": "StyleView",
743
+ "background": null,
744
+ "description_width": "",
745
+ "font_size": null,
746
+ "text_color": null
747
+ }
748
+ },
749
+ "dc83c7bff2f241309537a8119dfc7555": {
750
+ "model_module": "@jupyter-widgets/base",
751
+ "model_module_version": "2.0.0",
752
+ "model_name": "LayoutModel",
753
+ "state": {
754
+ "_model_module": "@jupyter-widgets/base",
755
+ "_model_module_version": "2.0.0",
756
+ "_model_name": "LayoutModel",
757
+ "_view_count": null,
758
+ "_view_module": "@jupyter-widgets/base",
759
+ "_view_module_version": "2.0.0",
760
+ "_view_name": "LayoutView",
761
+ "align_content": null,
762
+ "align_items": null,
763
+ "align_self": null,
764
+ "border_bottom": null,
765
+ "border_left": null,
766
+ "border_right": null,
767
+ "border_top": null,
768
+ "bottom": null,
769
+ "display": null,
770
+ "flex": null,
771
+ "flex_flow": null,
772
+ "grid_area": null,
773
+ "grid_auto_columns": null,
774
+ "grid_auto_flow": null,
775
+ "grid_auto_rows": null,
776
+ "grid_column": null,
777
+ "grid_gap": null,
778
+ "grid_row": null,
779
+ "grid_template_areas": null,
780
+ "grid_template_columns": null,
781
+ "grid_template_rows": null,
782
+ "height": null,
783
+ "justify_content": null,
784
+ "justify_items": null,
785
+ "left": null,
786
+ "margin": null,
787
+ "max_height": null,
788
+ "max_width": null,
789
+ "min_height": null,
790
+ "min_width": null,
791
+ "object_fit": null,
792
+ "object_position": null,
793
+ "order": null,
794
+ "overflow": null,
795
+ "padding": null,
796
+ "right": null,
797
+ "top": null,
798
+ "visibility": null,
799
+ "width": null
800
+ }
801
+ },
802
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
803
+ "model_module": "@jupyter-widgets/controls",
804
+ "model_module_version": "2.0.0",
805
+ "model_name": "HTMLModel",
806
+ "state": {
807
+ "_dom_classes": [],
808
+ "_model_module": "@jupyter-widgets/controls",
809
+ "_model_module_version": "2.0.0",
810
+ "_model_name": "HTMLModel",
811
+ "_view_count": null,
812
+ "_view_module": "@jupyter-widgets/controls",
813
+ "_view_module_version": "2.0.0",
814
+ "_view_name": "HTMLView",
815
+ "description": "",
816
+ "description_allow_html": false,
817
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
818
+ "placeholder": "​",
819
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
820
+ "tabbable": null,
821
+ "tooltip": null,
822
+ "value": "100%"
823
+ }
824
+ },
825
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
826
+ "model_module": "@jupyter-widgets/base",
827
+ "model_module_version": "2.0.0",
828
+ "model_name": "LayoutModel",
829
+ "state": {
830
+ "_model_module": "@jupyter-widgets/base",
831
+ "_model_module_version": "2.0.0",
832
+ "_model_name": "LayoutModel",
833
+ "_view_count": null,
834
+ "_view_module": "@jupyter-widgets/base",
835
+ "_view_module_version": "2.0.0",
836
+ "_view_name": "LayoutView",
837
+ "align_content": null,
838
+ "align_items": null,
839
+ "align_self": null,
840
+ "border_bottom": null,
841
+ "border_left": null,
842
+ "border_right": null,
843
+ "border_top": null,
844
+ "bottom": null,
845
+ "display": null,
846
+ "flex": null,
847
+ "flex_flow": null,
848
+ "grid_area": null,
849
+ "grid_auto_columns": null,
850
+ "grid_auto_flow": null,
851
+ "grid_auto_rows": null,
852
+ "grid_column": null,
853
+ "grid_gap": null,
854
+ "grid_row": null,
855
+ "grid_template_areas": null,
856
+ "grid_template_columns": null,
857
+ "grid_template_rows": null,
858
+ "height": null,
859
+ "justify_content": null,
860
+ "justify_items": null,
861
+ "left": null,
862
+ "margin": null,
863
+ "max_height": null,
864
+ "max_width": null,
865
+ "min_height": null,
866
+ "min_width": null,
867
+ "object_fit": null,
868
+ "object_position": null,
869
+ "order": null,
870
+ "overflow": null,
871
+ "padding": null,
872
+ "right": null,
873
+ "top": null,
874
+ "visibility": null,
875
+ "width": null
876
+ }
877
+ }
878
+ },
879
+ "version_major": 2,
880
+ "version_minor": 0
881
+ }
882
+ }
883
+ },
884
+ "nbformat": 4,
885
+ "nbformat_minor": 2
886
+ }
agentchat_web_info.ipynb ADDED
@@ -0,0 +1,987 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_web_info.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "# Auto Generated Agent Chat: Solving Tasks Requiring Web Info\n",
21
+ "\n",
22
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
23
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
24
+ "\n",
25
+ "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to perform tasks which require acquiring info from the web:\n",
26
+ "* discuss a paper based on its URL.\n",
27
+ "* discuss about stock market.\n",
28
+ "\n",
29
+ "Here `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. By setting `human_input_mode` properly, the `UserProxyAgent` can also prompt the user for feedback to `AssistantAgent`. For example, when `human_input_mode` is set to \"TERMINATE\", the `UserProxyAgent` will execute the code written by `AssistantAgent` directly and return the execution results (success or failure and corresponding outputs) to `AssistantAgent`, and prompt the user for feedback when the task is finished. When user feedback is provided, the `UserProxyAgent` will directly pass the feedback to `AssistantAgent`.\n",
30
+ "\n",
31
+ "## Requirements\n",
32
+ "\n",
33
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install pyautogen and docker:\n",
34
+ "```bash\n",
35
+ "pip install pyautogen docker\n",
36
+ "```"
37
+ ]
38
+ },
39
+ {
40
+ "cell_type": "code",
41
+ "execution_count": 1,
42
+ "metadata": {
43
+ "execution": {
44
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
45
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
46
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
47
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
48
+ }
49
+ },
50
+ "outputs": [],
51
+ "source": [
52
+ "# %pip install pyautogen~=0.1.0 docker"
53
+ ]
54
+ },
55
+ {
56
+ "attachments": {},
57
+ "cell_type": "markdown",
58
+ "metadata": {},
59
+ "source": [
60
+ "## Set your API Endpoint\n",
61
+ "\n",
62
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n"
63
+ ]
64
+ },
65
+ {
66
+ "cell_type": "code",
67
+ "execution_count": 2,
68
+ "metadata": {},
69
+ "outputs": [],
70
+ "source": [
71
+ "import autogen\n",
72
+ "\n",
73
+ "config_list = autogen.config_list_from_json(\n",
74
+ " \"OAI_CONFIG_LIST\",\n",
75
+ " filter_dict={\n",
76
+ " \"model\": [\"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
77
+ " },\n",
78
+ ")\n",
79
+ "\n",
80
+ "llm_config={\n",
81
+ " \"request_timeout\": 600,\n",
82
+ " \"seed\": 42,\n",
83
+ " \"config_list\": config_list,\n",
84
+ " \"temperature\": 0,\n",
85
+ "}"
86
+ ]
87
+ },
88
+ {
89
+ "attachments": {},
90
+ "cell_type": "markdown",
91
+ "metadata": {},
92
+ "source": [
93
+ "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well). Only the models with matching names are kept in the list based on the filter condition.\n",
94
+ "\n",
95
+ "The config list looks like the following:\n",
96
+ "```python\n",
97
+ "config_list = [\n",
98
+ " {\n",
99
+ " 'model': 'gpt-4-32k',\n",
100
+ " 'api_key': '<your OpenAI API key here>',\n",
101
+ " },\n",
102
+ " {\n",
103
+ " 'model': 'gpt4',\n",
104
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
105
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
106
+ " 'api_type': 'azure',\n",
107
+ " 'api_version': '2023-06-01-preview',\n",
108
+ " },\n",
109
+ " {\n",
110
+ " 'model': 'gpt-4-32k-0314',\n",
111
+ " 'api_key': '<your Azure OpenAI API key here>',\n",
112
+ " 'api_base': '<your Azure OpenAI API base here>',\n",
113
+ " 'api_type': 'azure',\n",
114
+ " 'api_version': '2023-06-01-preview',\n",
115
+ " },\n",
116
+ "]\n",
117
+ "```\n",
118
+ "\n",
119
+ "If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n",
120
+ "\n",
121
+ "You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
122
+ ]
123
+ },
124
+ {
125
+ "attachments": {},
126
+ "cell_type": "markdown",
127
+ "metadata": {},
128
+ "source": [
129
+ "## Construct Agents\n",
130
+ "\n",
131
+ "We construct the assistant agent and the user proxy agent. We specify `human_input_mode` as \"TERMINATE\" in the user proxy agent, which will ask for human feedback when it receives a \"TERMINATE\" signal from the assistant agent."
132
+ ]
133
+ },
134
+ {
135
+ "cell_type": "code",
136
+ "execution_count": 3,
137
+ "metadata": {},
138
+ "outputs": [],
139
+ "source": [
140
+ "# create an AssistantAgent instance named \"assistant\"\n",
141
+ "assistant = autogen.AssistantAgent(\n",
142
+ " name=\"assistant\",\n",
143
+ " llm_config=llm_config,\n",
144
+ ")\n",
145
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
146
+ "user_proxy = autogen.UserProxyAgent(\n",
147
+ " name=\"user_proxy\",\n",
148
+ " human_input_mode=\"TERMINATE\",\n",
149
+ " max_consecutive_auto_reply=10,\n",
150
+ " is_termination_msg=lambda x: x.get(\"content\", \"\").rstrip().endswith(\"TERMINATE\"),\n",
151
+ " code_execution_config={\"work_dir\": \"web\"},\n",
152
+ " llm_config=llm_config,\n",
153
+ " system_message=\"\"\"Reply TERMINATE if the task has been solved at full satisfaction.\n",
154
+ "Otherwise, reply CONTINUE, or the reason why the task is not solved yet.\"\"\"\n",
155
+ ")"
156
+ ]
157
+ },
158
+ {
159
+ "attachments": {},
160
+ "cell_type": "markdown",
161
+ "metadata": {},
162
+ "source": [
163
+ "## Example Task: Paper Talk from URL\n",
164
+ "\n",
165
+ "We invoke the `initiate_chat()` method of the user proxy agent to start the conversation. When you run the cell below, you will be prompted to provide feedback after the assistant agent sends a \"TERMINATE\" signal in the end of the message. If you don't provide any feedback (by pressing Enter directly), the conversation will finish. Before the \"TERMINATE\" signal, the user proxy agent will try to execute the code suggested by the assistant agent on behalf of the user."
166
+ ]
167
+ },
168
+ {
169
+ "cell_type": "code",
170
+ "execution_count": 4,
171
+ "metadata": {},
172
+ "outputs": [
173
+ {
174
+ "name": "stdout",
175
+ "output_type": "stream",
176
+ "text": [
177
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
178
+ "\n",
179
+ "\n",
180
+ "Who should read this paper: https://arxiv.org/abs/2306.01337\n",
181
+ "\n",
182
+ "\n",
183
+ "--------------------------------------------------------------------------------\n",
184
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
185
+ "\n",
186
+ "To determine who should read the paper, we need to first understand the content and context of the paper. We can do this by fetching the abstract of the paper from the provided URL and analyzing it. \n",
187
+ "\n",
188
+ "Here is a Python script that uses the BeautifulSoup library to scrape the abstract of the paper from the webpage. \n",
189
+ "\n",
190
+ "```python\n",
191
+ "# Python script to scrape the abstract of the paper\n",
192
+ "\n",
193
+ "import requests\n",
194
+ "from bs4 import BeautifulSoup\n",
195
+ "\n",
196
+ "def get_abstract(url):\n",
197
+ " response = requests.get(url)\n",
198
+ " soup = BeautifulSoup(response.text, 'html.parser')\n",
199
+ " abstract = soup.find('blockquote', attrs={'class': 'abstract mathjax'}).text.strip()\n",
200
+ " return abstract\n",
201
+ "\n",
202
+ "url = \"https://arxiv.org/abs/2306.01337\"\n",
203
+ "abstract = get_abstract(url)\n",
204
+ "print(abstract)\n",
205
+ "```\n",
206
+ "\n",
207
+ "Please run this script and provide the output. Based on the abstract, I can suggest who might be interested in reading this paper.\n",
208
+ "\n",
209
+ "--------------------------------------------------------------------------------\n",
210
+ "\u001b[31m\n",
211
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
212
+ "\u001b[31m\n",
213
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
214
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
215
+ "\n",
216
+ "exitcode: 0 (execution succeeded)\n",
217
+ "Code output: \n",
218
+ "Abstract: Employing Large Language Models (LLMs) to address mathematical problems is an\n",
219
+ "intriguing research endeavor, considering the abundance of math problems\n",
220
+ "expressed in natural language across numerous science and engineering fields.\n",
221
+ "While several prior works have investigated solving elementary mathematics\n",
222
+ "using LLMs, this work explores the frontier of using GPT-4 for solving more\n",
223
+ "complex and challenging math problems. We evaluate various ways of using GPT-4.\n",
224
+ "Some of them are adapted from existing work, and one is MathChat, a\n",
225
+ "conversational problem-solving framework newly proposed in this work. We\n",
226
+ "perform the evaluation on difficult high school competition problems from the\n",
227
+ "MATH dataset, which shows the advantage of the proposed conversational\n",
228
+ "approach.\n",
229
+ "\n",
230
+ "\n",
231
+ "--------------------------------------------------------------------------------\n",
232
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
233
+ "\n",
234
+ "Based on the abstract, the paper is about using Large Language Models (LLMs), specifically GPT-4, to solve complex mathematical problems. The paper introduces a new conversational problem-solving framework called MathChat and evaluates its performance on difficult high school competition problems from the MATH dataset.\n",
235
+ "\n",
236
+ "Given this, the paper would be of interest to the following groups:\n",
237
+ "\n",
238
+ "1. **Researchers in Artificial Intelligence and Natural Language Processing**: The paper discusses the use of a large language model (GPT-4) for problem-solving, which is a key research area in AI and NLP.\n",
239
+ "\n",
240
+ "2. **Mathematicians and Math Educators**: The paper focuses on solving complex mathematical problems, so those with a background in mathematics might find the techniques and results interesting.\n",
241
+ "\n",
242
+ "3. **Data Scientists and Machine Learning Engineers**: These professionals often use models like GPT-4 in their work and might be interested in new applications and techniques.\n",
243
+ "\n",
244
+ "4. **Students studying AI, NLP, or Mathematics**: The paper could provide valuable insights for these students into how AI can be used in problem-solving.\n",
245
+ "\n",
246
+ "5. **Developers working on AI-based chatbots or conversational agents**: The paper introduces a new conversational problem-solving framework, which could be of interest to these developers.\n",
247
+ "\n",
248
+ "Please note that while the paper is likely to be of interest to these groups, the specific relevance will depend on the individual's specific interests and research needs.\n",
249
+ "\n",
250
+ "TERMINATE\n",
251
+ "\n",
252
+ "--------------------------------------------------------------------------------\n",
253
+ "\u001b[31m\n",
254
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
255
+ ]
256
+ }
257
+ ],
258
+ "source": [
259
+ "# the assistant receives a message from the user, which contains the task description\n",
260
+ "user_proxy.initiate_chat(\n",
261
+ " assistant,\n",
262
+ " message=\"\"\"\n",
263
+ "Who should read this paper: https://arxiv.org/abs/2308.08155\n",
264
+ "\"\"\",\n",
265
+ ")"
266
+ ]
267
+ },
268
+ {
269
+ "attachments": {},
270
+ "cell_type": "markdown",
271
+ "metadata": {},
272
+ "source": [
273
+ "## Example Task: Chat about Stock Market"
274
+ ]
275
+ },
276
+ {
277
+ "cell_type": "code",
278
+ "execution_count": 5,
279
+ "metadata": {},
280
+ "outputs": [
281
+ {
282
+ "name": "stdout",
283
+ "output_type": "stream",
284
+ "text": [
285
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
286
+ "\n",
287
+ "Show me the YTD gain of 10 largest technology companies as of today.\n",
288
+ "\n",
289
+ "--------------------------------------------------------------------------------\n",
290
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
291
+ "\n",
292
+ "To get the YTD gain of the 10 largest technology companies, we need to do the following:\n",
293
+ "\n",
294
+ "1. Identify the 10 largest technology companies. We can use the list of the largest technology companies by market capitalization. This list can change frequently, so we need to get the latest data. We can use web scraping to get this data from a reliable source.\n",
295
+ "\n",
296
+ "2. Get the YTD gain for each of these companies. We can use a financial data API to get this data. Yahoo Finance is a popular source for this kind of data.\n",
297
+ "\n",
298
+ "Here is a Python script that uses the BeautifulSoup library for web scraping and the yfinance library to get data from Yahoo Finance. This script will print the 10 largest technology companies and their YTD gains.\n",
299
+ "\n",
300
+ "```python\n",
301
+ "# filename: ytd_gain.py\n",
302
+ "\n",
303
+ "import requests\n",
304
+ "from bs4 import BeautifulSoup\n",
305
+ "import yfinance as yf\n",
306
+ "import pandas as pd\n",
307
+ "\n",
308
+ "# Get the list of largest tech companies by market cap\n",
309
+ "url = \"https://companiesmarketcap.com/technology/largest-technology-companies-by-market-cap/\"\n",
310
+ "response = requests.get(url)\n",
311
+ "soup = BeautifulSoup(response.text, 'html.parser')\n",
312
+ "\n",
313
+ "# Find the table with the data\n",
314
+ "table = soup.find('table')\n",
315
+ "\n",
316
+ "# Get the ticker symbols of the 10 largest companies\n",
317
+ "tickers = []\n",
318
+ "for row in table.find_all('tr')[1:11]: # Skip the header row\n",
319
+ " ticker = row.find_all('td')[1].text\n",
320
+ " tickers.append(ticker)\n",
321
+ "\n",
322
+ "# Get the YTD gain for each company\n",
323
+ "ytd_gains = {}\n",
324
+ "for ticker in tickers:\n",
325
+ " ticker_data = yf.Ticker(ticker)\n",
326
+ " hist = ticker_data.history(period=\"ytd\")\n",
327
+ " start_price = hist.iloc[0]['Close']\n",
328
+ " end_price = hist.iloc[-1]['Close']\n",
329
+ " ytd_gain = (end_price - start_price) / start_price * 100\n",
330
+ " ytd_gains[ticker] = ytd_gain\n",
331
+ "\n",
332
+ "# Print the results\n",
333
+ "for ticker, ytd_gain in ytd_gains.items():\n",
334
+ " print(f\"{ticker}: {ytd_gain:.2f}%\")\n",
335
+ "```\n",
336
+ "\n",
337
+ "To run this script, save it to a file named `ytd_gain.py`, then run it with the command `python ytd_gain.py`.\n",
338
+ "\n",
339
+ "Please note that this script requires the BeautifulSoup, requests, yfinance, and pandas libraries. If you don't have these libraries installed, you can install them with the command `pip install beautifulsoup4 requests yfinance pandas`.\n",
340
+ "\n",
341
+ "--------------------------------------------------------------------------------\n",
342
+ "\u001b[31m\n",
343
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
344
+ "\u001b[31m\n",
345
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
346
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
347
+ "\n",
348
+ "exitcode: 1 (execution failed)\n",
349
+ "Code output: \n",
350
+ "Traceback (most recent call last):\n",
351
+ " File \"ytd_gain.py\", line 18, in <module>\n",
352
+ " for row in table.find_all('tr')[1:11]: # Skip the header row\n",
353
+ "AttributeError: 'NoneType' object has no attribute 'find_all'\n",
354
+ "\n",
355
+ "\n",
356
+ "--------------------------------------------------------------------------------\n",
357
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
358
+ "\n",
359
+ "I apologize for the error. It seems that the script failed to find the table in the webpage. This could be due to changes in the webpage structure or because the page uses JavaScript to load the table, which BeautifulSoup can't handle.\n",
360
+ "\n",
361
+ "Let's try a different approach. We'll use the yfinance library to get the data for the 10 largest technology companies. These are currently: Apple (AAPL), Microsoft (MSFT), Alphabet (GOOGL), Amazon (AMZN), Facebook (FB), Tesla (TSLA), Nvidia (NVDA), Paypal (PYPL), Adobe (ADBE), and Netflix (NFLX).\n",
362
+ "\n",
363
+ "Here's the updated Python script:\n",
364
+ "\n",
365
+ "```python\n",
366
+ "# filename: ytd_gain.py\n",
367
+ "\n",
368
+ "import yfinance as yf\n",
369
+ "\n",
370
+ "# List of the 10 largest tech companies by market cap\n",
371
+ "tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'FB', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'NFLX']\n",
372
+ "\n",
373
+ "# Get the YTD gain for each company\n",
374
+ "ytd_gains = {}\n",
375
+ "for ticker in tickers:\n",
376
+ " ticker_data = yf.Ticker(ticker)\n",
377
+ " hist = ticker_data.history(period=\"ytd\")\n",
378
+ " start_price = hist.iloc[0]['Close']\n",
379
+ " end_price = hist.iloc[-1]['Close']\n",
380
+ " ytd_gain = (end_price - start_price) / start_price * 100\n",
381
+ " ytd_gains[ticker] = ytd_gain\n",
382
+ "\n",
383
+ "# Print the results\n",
384
+ "for ticker, ytd_gain in ytd_gains.items():\n",
385
+ " print(f\"{ticker}: {ytd_gain:.2f}%\")\n",
386
+ "```\n",
387
+ "\n",
388
+ "To run this script, save it to a file named `ytd_gain.py`, then run it with the command `python ytd_gain.py`.\n",
389
+ "\n",
390
+ "Please note that this script requires the yfinance library. If you don't have this library installed, you can install it with the command `pip install yfinance`.\n",
391
+ "\n",
392
+ "--------------------------------------------------------------------------------\n",
393
+ "\u001b[31m\n",
394
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
395
+ "\u001b[31m\n",
396
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
397
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
398
+ "\n",
399
+ "exitcode: 1 (execution failed)\n",
400
+ "Code output: \n",
401
+ "FB: No data found, symbol may be delisted\n",
402
+ "Traceback (most recent call last):\n",
403
+ " File \"ytd_gain.py\", line 13, in <module>\n",
404
+ " start_price = hist.iloc[0]['Close']\n",
405
+ " File \"/home/vscode/.local/lib/python3.9/site-packages/pandas/core/indexing.py\", line 931, in __getitem__\n",
406
+ " return self._getitem_axis(maybe_callable, axis=axis)\n",
407
+ " File \"/home/vscode/.local/lib/python3.9/site-packages/pandas/core/indexing.py\", line 1566, in _getitem_axis\n",
408
+ " self._validate_integer(key, axis)\n",
409
+ " File \"/home/vscode/.local/lib/python3.9/site-packages/pandas/core/indexing.py\", line 1500, in _validate_integer\n",
410
+ " raise IndexError(\"single positional indexer is out-of-bounds\")\n",
411
+ "IndexError: single positional indexer is out-of-bounds\n",
412
+ "\n",
413
+ "\n",
414
+ "--------------------------------------------------------------------------------\n",
415
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
416
+ "\n",
417
+ "I apologize for the error. It seems that the script failed to get the data for Facebook (FB). This could be due to a temporary issue with the Yahoo Finance API.\n",
418
+ "\n",
419
+ "Let's modify the script to handle this situation. We'll add a try-except block to skip any companies for which we can't get data.\n",
420
+ "\n",
421
+ "Here's the updated Python script:\n",
422
+ "\n",
423
+ "```python\n",
424
+ "# filename: ytd_gain.py\n",
425
+ "\n",
426
+ "import yfinance as yf\n",
427
+ "\n",
428
+ "# List of the 10 largest tech companies by market cap\n",
429
+ "tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'FB', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'NFLX']\n",
430
+ "\n",
431
+ "# Get the YTD gain for each company\n",
432
+ "ytd_gains = {}\n",
433
+ "for ticker in tickers:\n",
434
+ " try:\n",
435
+ " ticker_data = yf.Ticker(ticker)\n",
436
+ " hist = ticker_data.history(period=\"ytd\")\n",
437
+ " start_price = hist.iloc[0]['Close']\n",
438
+ " end_price = hist.iloc[-1]['Close']\n",
439
+ " ytd_gain = (end_price - start_price) / start_price * 100\n",
440
+ " ytd_gains[ticker] = ytd_gain\n",
441
+ " except IndexError:\n",
442
+ " print(f\"Could not get data for {ticker}\")\n",
443
+ "\n",
444
+ "# Print the results\n",
445
+ "for ticker, ytd_gain in ytd_gains.items():\n",
446
+ " print(f\"{ticker}: {ytd_gain:.2f}%\")\n",
447
+ "```\n",
448
+ "\n",
449
+ "To run this script, save it to a file named `ytd_gain.py`, then run it with the command `python ytd_gain.py`.\n",
450
+ "\n",
451
+ "Please note that this script requires the yfinance library. If you don't have this library installed, you can install it with the command `pip install yfinance`.\n",
452
+ "\n",
453
+ "--------------------------------------------------------------------------------\n",
454
+ "\u001b[31m\n",
455
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
456
+ "\u001b[31m\n",
457
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
458
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
459
+ "\n",
460
+ "exitcode: 0 (execution succeeded)\n",
461
+ "Code output: \n",
462
+ "Could not get data for FB\n",
463
+ "AAPL: 46.41%\n",
464
+ "MSFT: 37.63%\n",
465
+ "GOOGL: 51.13%\n",
466
+ "AMZN: 56.22%\n",
467
+ "TSLA: 131.46%\n",
468
+ "NVDA: 237.68%\n",
469
+ "PYPL: -16.49%\n",
470
+ "ADBE: 59.79%\n",
471
+ "NFLX: 43.41%\n",
472
+ "\n",
473
+ "\n",
474
+ "--------------------------------------------------------------------------------\n",
475
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
476
+ "\n",
477
+ "The script has successfully fetched the Year-to-Date (YTD) gain for the largest technology companies, except for Facebook (FB). The YTD gains are as follows:\n",
478
+ "\n",
479
+ "- Apple (AAPL): 46.41%\n",
480
+ "- Microsoft (MSFT): 37.63%\n",
481
+ "- Alphabet (GOOGL): 51.13%\n",
482
+ "- Amazon (AMZN): 56.22%\n",
483
+ "- Tesla (TSLA): 131.46%\n",
484
+ "- Nvidia (NVDA): 237.68%\n",
485
+ "- Paypal (PYPL): -16.49%\n",
486
+ "- Adobe (ADBE): 59.79%\n",
487
+ "- Netflix (NFLX): 43.41%\n",
488
+ "\n",
489
+ "Unfortunately, the script could not fetch the data for Facebook (FB). This could be due to a temporary issue with the Yahoo Finance API or the ticker symbol might have changed.\n",
490
+ "\n",
491
+ "Please note that these values are subject to change as the stock market fluctuates. \n",
492
+ "\n",
493
+ "TERMINATE\n",
494
+ "\n",
495
+ "--------------------------------------------------------------------------------\n",
496
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
497
+ "\n",
498
+ "FB -> META\n",
499
+ "\n",
500
+ "--------------------------------------------------------------------------------\n",
501
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
502
+ "\n",
503
+ "Thank you for the update. Facebook Inc. has indeed rebranded to Meta Platforms Inc. and its ticker symbol has changed from FB to META on the NASDAQ stock exchange.\n",
504
+ "\n",
505
+ "Let's update the script to reflect this change:\n",
506
+ "\n",
507
+ "```python\n",
508
+ "# filename: ytd_gain.py\n",
509
+ "\n",
510
+ "import yfinance as yf\n",
511
+ "\n",
512
+ "# List of the 10 largest tech companies by market cap\n",
513
+ "# Updated Facebook (FB) to Meta Platforms (META)\n",
514
+ "tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'NFLX']\n",
515
+ "\n",
516
+ "# Get the YTD gain for each company\n",
517
+ "ytd_gains = {}\n",
518
+ "for ticker in tickers:\n",
519
+ " try:\n",
520
+ " ticker_data = yf.Ticker(ticker)\n",
521
+ " hist = ticker_data.history(period=\"ytd\")\n",
522
+ " start_price = hist.iloc[0]['Close']\n",
523
+ " end_price = hist.iloc[-1]['Close']\n",
524
+ " ytd_gain = (end_price - start_price) / start_price * 100\n",
525
+ " ytd_gains[ticker] = ytd_gain\n",
526
+ " except IndexError:\n",
527
+ " print(f\"Could not get data for {ticker}\")\n",
528
+ "\n",
529
+ "# Print the results\n",
530
+ "for ticker, ytd_gain in ytd_gains.items():\n",
531
+ " print(f\"{ticker}: {ytd_gain:.2f}%\")\n",
532
+ "```\n",
533
+ "\n",
534
+ "To run this script, save it to a file named `ytd_gain.py`, then run it with the command `python ytd_gain.py`.\n",
535
+ "\n",
536
+ "Please note that this script requires the yfinance library. If you don't have this library installed, you can install it with the command `pip install yfinance`.\n",
537
+ "\n",
538
+ "--------------------------------------------------------------------------------\n",
539
+ "\u001b[31m\n",
540
+ ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
541
+ "\u001b[31m\n",
542
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
543
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
544
+ "\n",
545
+ "exitcode: 0 (execution succeeded)\n",
546
+ "Code output: \n",
547
+ "AAPL: 46.34%\n",
548
+ "MSFT: 37.55%\n",
549
+ "GOOGL: 51.08%\n",
550
+ "AMZN: 56.17%\n",
551
+ "META: 133.67%\n",
552
+ "TSLA: 131.02%\n",
553
+ "NVDA: 237.38%\n",
554
+ "PYPL: -16.55%\n",
555
+ "ADBE: 59.71%\n",
556
+ "NFLX: 43.44%\n",
557
+ "\n",
558
+ "\n",
559
+ "--------------------------------------------------------------------------------\n",
560
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
561
+ "\n",
562
+ "The script has successfully fetched the Year-to-Date (YTD) gain for the largest technology companies, including Meta Platforms (formerly Facebook). The YTD gains are as follows:\n",
563
+ "\n",
564
+ "- Apple (AAPL): 46.34%\n",
565
+ "- Microsoft (MSFT): 37.55%\n",
566
+ "- Alphabet (GOOGL): 51.08%\n",
567
+ "- Amazon (AMZN): 56.17%\n",
568
+ "- Meta Platforms (META): 133.67%\n",
569
+ "- Tesla (TSLA): 131.02%\n",
570
+ "- Nvidia (NVDA): 237.38%\n",
571
+ "- Paypal (PYPL): -16.55%\n",
572
+ "- Adobe (ADBE): 59.71%\n",
573
+ "- Netflix (NFLX): 43.44%\n",
574
+ "\n",
575
+ "Please note that these values are subject to change as the stock market fluctuates. \n",
576
+ "\n",
577
+ "TERMINATE\n",
578
+ "\n",
579
+ "--------------------------------------------------------------------------------\n",
580
+ "\u001b[31m\n",
581
+ ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n"
582
+ ]
583
+ }
584
+ ],
585
+ "source": [
586
+ "user_proxy.initiate_chat(\n",
587
+ " assistant,\n",
588
+ " message=\"\"\"Show me the YTD gain of 10 largest technology companies as of today.\"\"\",\n",
589
+ ")"
590
+ ]
591
+ }
592
+ ],
593
+ "metadata": {
594
+ "kernelspec": {
595
+ "display_name": "Python 3",
596
+ "language": "python",
597
+ "name": "python3"
598
+ },
599
+ "language_info": {
600
+ "codemirror_mode": {
601
+ "name": "ipython",
602
+ "version": 3
603
+ },
604
+ "file_extension": ".py",
605
+ "mimetype": "text/x-python",
606
+ "name": "python",
607
+ "nbconvert_exporter": "python",
608
+ "pygments_lexer": "ipython3",
609
+ "version": "3.9.17"
610
+ },
611
+ "vscode": {
612
+ "interpreter": {
613
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
614
+ }
615
+ },
616
+ "widgets": {
617
+ "application/vnd.jupyter.widget-state+json": {
618
+ "state": {
619
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
620
+ "model_module": "@jupyter-widgets/base",
621
+ "model_module_version": "2.0.0",
622
+ "model_name": "LayoutModel",
623
+ "state": {
624
+ "_model_module": "@jupyter-widgets/base",
625
+ "_model_module_version": "2.0.0",
626
+ "_model_name": "LayoutModel",
627
+ "_view_count": null,
628
+ "_view_module": "@jupyter-widgets/base",
629
+ "_view_module_version": "2.0.0",
630
+ "_view_name": "LayoutView",
631
+ "align_content": null,
632
+ "align_items": null,
633
+ "align_self": null,
634
+ "border_bottom": null,
635
+ "border_left": null,
636
+ "border_right": null,
637
+ "border_top": null,
638
+ "bottom": null,
639
+ "display": null,
640
+ "flex": null,
641
+ "flex_flow": null,
642
+ "grid_area": null,
643
+ "grid_auto_columns": null,
644
+ "grid_auto_flow": null,
645
+ "grid_auto_rows": null,
646
+ "grid_column": null,
647
+ "grid_gap": null,
648
+ "grid_row": null,
649
+ "grid_template_areas": null,
650
+ "grid_template_columns": null,
651
+ "grid_template_rows": null,
652
+ "height": null,
653
+ "justify_content": null,
654
+ "justify_items": null,
655
+ "left": null,
656
+ "margin": null,
657
+ "max_height": null,
658
+ "max_width": null,
659
+ "min_height": null,
660
+ "min_width": null,
661
+ "object_fit": null,
662
+ "object_position": null,
663
+ "order": null,
664
+ "overflow": null,
665
+ "padding": null,
666
+ "right": null,
667
+ "top": null,
668
+ "visibility": null,
669
+ "width": null
670
+ }
671
+ },
672
+ "454146d0f7224f038689031002906e6f": {
673
+ "model_module": "@jupyter-widgets/controls",
674
+ "model_module_version": "2.0.0",
675
+ "model_name": "HBoxModel",
676
+ "state": {
677
+ "_dom_classes": [],
678
+ "_model_module": "@jupyter-widgets/controls",
679
+ "_model_module_version": "2.0.0",
680
+ "_model_name": "HBoxModel",
681
+ "_view_count": null,
682
+ "_view_module": "@jupyter-widgets/controls",
683
+ "_view_module_version": "2.0.0",
684
+ "_view_name": "HBoxView",
685
+ "box_style": "",
686
+ "children": [
687
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
688
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
689
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
690
+ ],
691
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
692
+ "tabbable": null,
693
+ "tooltip": null
694
+ }
695
+ },
696
+ "577e1e3cc4db4942b0883577b3b52755": {
697
+ "model_module": "@jupyter-widgets/controls",
698
+ "model_module_version": "2.0.0",
699
+ "model_name": "FloatProgressModel",
700
+ "state": {
701
+ "_dom_classes": [],
702
+ "_model_module": "@jupyter-widgets/controls",
703
+ "_model_module_version": "2.0.0",
704
+ "_model_name": "FloatProgressModel",
705
+ "_view_count": null,
706
+ "_view_module": "@jupyter-widgets/controls",
707
+ "_view_module_version": "2.0.0",
708
+ "_view_name": "ProgressView",
709
+ "bar_style": "success",
710
+ "description": "",
711
+ "description_allow_html": false,
712
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
713
+ "max": 1,
714
+ "min": 0,
715
+ "orientation": "horizontal",
716
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
717
+ "tabbable": null,
718
+ "tooltip": null,
719
+ "value": 1
720
+ }
721
+ },
722
+ "6086462a12d54bafa59d3c4566f06cb2": {
723
+ "model_module": "@jupyter-widgets/base",
724
+ "model_module_version": "2.0.0",
725
+ "model_name": "LayoutModel",
726
+ "state": {
727
+ "_model_module": "@jupyter-widgets/base",
728
+ "_model_module_version": "2.0.0",
729
+ "_model_name": "LayoutModel",
730
+ "_view_count": null,
731
+ "_view_module": "@jupyter-widgets/base",
732
+ "_view_module_version": "2.0.0",
733
+ "_view_name": "LayoutView",
734
+ "align_content": null,
735
+ "align_items": null,
736
+ "align_self": null,
737
+ "border_bottom": null,
738
+ "border_left": null,
739
+ "border_right": null,
740
+ "border_top": null,
741
+ "bottom": null,
742
+ "display": null,
743
+ "flex": null,
744
+ "flex_flow": null,
745
+ "grid_area": null,
746
+ "grid_auto_columns": null,
747
+ "grid_auto_flow": null,
748
+ "grid_auto_rows": null,
749
+ "grid_column": null,
750
+ "grid_gap": null,
751
+ "grid_row": null,
752
+ "grid_template_areas": null,
753
+ "grid_template_columns": null,
754
+ "grid_template_rows": null,
755
+ "height": null,
756
+ "justify_content": null,
757
+ "justify_items": null,
758
+ "left": null,
759
+ "margin": null,
760
+ "max_height": null,
761
+ "max_width": null,
762
+ "min_height": null,
763
+ "min_width": null,
764
+ "object_fit": null,
765
+ "object_position": null,
766
+ "order": null,
767
+ "overflow": null,
768
+ "padding": null,
769
+ "right": null,
770
+ "top": null,
771
+ "visibility": null,
772
+ "width": null
773
+ }
774
+ },
775
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
776
+ "model_module": "@jupyter-widgets/controls",
777
+ "model_module_version": "2.0.0",
778
+ "model_name": "ProgressStyleModel",
779
+ "state": {
780
+ "_model_module": "@jupyter-widgets/controls",
781
+ "_model_module_version": "2.0.0",
782
+ "_model_name": "ProgressStyleModel",
783
+ "_view_count": null,
784
+ "_view_module": "@jupyter-widgets/base",
785
+ "_view_module_version": "2.0.0",
786
+ "_view_name": "StyleView",
787
+ "bar_color": null,
788
+ "description_width": ""
789
+ }
790
+ },
791
+ "7d3f3d9e15894d05a4d188ff4f466554": {
792
+ "model_module": "@jupyter-widgets/controls",
793
+ "model_module_version": "2.0.0",
794
+ "model_name": "HTMLStyleModel",
795
+ "state": {
796
+ "_model_module": "@jupyter-widgets/controls",
797
+ "_model_module_version": "2.0.0",
798
+ "_model_name": "HTMLStyleModel",
799
+ "_view_count": null,
800
+ "_view_module": "@jupyter-widgets/base",
801
+ "_view_module_version": "2.0.0",
802
+ "_view_name": "StyleView",
803
+ "background": null,
804
+ "description_width": "",
805
+ "font_size": null,
806
+ "text_color": null
807
+ }
808
+ },
809
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
810
+ "model_module": "@jupyter-widgets/controls",
811
+ "model_module_version": "2.0.0",
812
+ "model_name": "HTMLModel",
813
+ "state": {
814
+ "_dom_classes": [],
815
+ "_model_module": "@jupyter-widgets/controls",
816
+ "_model_module_version": "2.0.0",
817
+ "_model_name": "HTMLModel",
818
+ "_view_count": null,
819
+ "_view_module": "@jupyter-widgets/controls",
820
+ "_view_module_version": "2.0.0",
821
+ "_view_name": "HTMLView",
822
+ "description": "",
823
+ "description_allow_html": false,
824
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
825
+ "placeholder": "​",
826
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
827
+ "tabbable": null,
828
+ "tooltip": null,
829
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
830
+ }
831
+ },
832
+ "ca245376fd9f4354af6b2befe4af4466": {
833
+ "model_module": "@jupyter-widgets/controls",
834
+ "model_module_version": "2.0.0",
835
+ "model_name": "HTMLStyleModel",
836
+ "state": {
837
+ "_model_module": "@jupyter-widgets/controls",
838
+ "_model_module_version": "2.0.0",
839
+ "_model_name": "HTMLStyleModel",
840
+ "_view_count": null,
841
+ "_view_module": "@jupyter-widgets/base",
842
+ "_view_module_version": "2.0.0",
843
+ "_view_name": "StyleView",
844
+ "background": null,
845
+ "description_width": "",
846
+ "font_size": null,
847
+ "text_color": null
848
+ }
849
+ },
850
+ "dc83c7bff2f241309537a8119dfc7555": {
851
+ "model_module": "@jupyter-widgets/base",
852
+ "model_module_version": "2.0.0",
853
+ "model_name": "LayoutModel",
854
+ "state": {
855
+ "_model_module": "@jupyter-widgets/base",
856
+ "_model_module_version": "2.0.0",
857
+ "_model_name": "LayoutModel",
858
+ "_view_count": null,
859
+ "_view_module": "@jupyter-widgets/base",
860
+ "_view_module_version": "2.0.0",
861
+ "_view_name": "LayoutView",
862
+ "align_content": null,
863
+ "align_items": null,
864
+ "align_self": null,
865
+ "border_bottom": null,
866
+ "border_left": null,
867
+ "border_right": null,
868
+ "border_top": null,
869
+ "bottom": null,
870
+ "display": null,
871
+ "flex": null,
872
+ "flex_flow": null,
873
+ "grid_area": null,
874
+ "grid_auto_columns": null,
875
+ "grid_auto_flow": null,
876
+ "grid_auto_rows": null,
877
+ "grid_column": null,
878
+ "grid_gap": null,
879
+ "grid_row": null,
880
+ "grid_template_areas": null,
881
+ "grid_template_columns": null,
882
+ "grid_template_rows": null,
883
+ "height": null,
884
+ "justify_content": null,
885
+ "justify_items": null,
886
+ "left": null,
887
+ "margin": null,
888
+ "max_height": null,
889
+ "max_width": null,
890
+ "min_height": null,
891
+ "min_width": null,
892
+ "object_fit": null,
893
+ "object_position": null,
894
+ "order": null,
895
+ "overflow": null,
896
+ "padding": null,
897
+ "right": null,
898
+ "top": null,
899
+ "visibility": null,
900
+ "width": null
901
+ }
902
+ },
903
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
904
+ "model_module": "@jupyter-widgets/controls",
905
+ "model_module_version": "2.0.0",
906
+ "model_name": "HTMLModel",
907
+ "state": {
908
+ "_dom_classes": [],
909
+ "_model_module": "@jupyter-widgets/controls",
910
+ "_model_module_version": "2.0.0",
911
+ "_model_name": "HTMLModel",
912
+ "_view_count": null,
913
+ "_view_module": "@jupyter-widgets/controls",
914
+ "_view_module_version": "2.0.0",
915
+ "_view_name": "HTMLView",
916
+ "description": "",
917
+ "description_allow_html": false,
918
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
919
+ "placeholder": "​",
920
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
921
+ "tabbable": null,
922
+ "tooltip": null,
923
+ "value": "100%"
924
+ }
925
+ },
926
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
927
+ "model_module": "@jupyter-widgets/base",
928
+ "model_module_version": "2.0.0",
929
+ "model_name": "LayoutModel",
930
+ "state": {
931
+ "_model_module": "@jupyter-widgets/base",
932
+ "_model_module_version": "2.0.0",
933
+ "_model_name": "LayoutModel",
934
+ "_view_count": null,
935
+ "_view_module": "@jupyter-widgets/base",
936
+ "_view_module_version": "2.0.0",
937
+ "_view_name": "LayoutView",
938
+ "align_content": null,
939
+ "align_items": null,
940
+ "align_self": null,
941
+ "border_bottom": null,
942
+ "border_left": null,
943
+ "border_right": null,
944
+ "border_top": null,
945
+ "bottom": null,
946
+ "display": null,
947
+ "flex": null,
948
+ "flex_flow": null,
949
+ "grid_area": null,
950
+ "grid_auto_columns": null,
951
+ "grid_auto_flow": null,
952
+ "grid_auto_rows": null,
953
+ "grid_column": null,
954
+ "grid_gap": null,
955
+ "grid_row": null,
956
+ "grid_template_areas": null,
957
+ "grid_template_columns": null,
958
+ "grid_template_rows": null,
959
+ "height": null,
960
+ "justify_content": null,
961
+ "justify_items": null,
962
+ "left": null,
963
+ "margin": null,
964
+ "max_height": null,
965
+ "max_width": null,
966
+ "min_height": null,
967
+ "min_width": null,
968
+ "object_fit": null,
969
+ "object_position": null,
970
+ "order": null,
971
+ "overflow": null,
972
+ "padding": null,
973
+ "right": null,
974
+ "top": null,
975
+ "visibility": null,
976
+ "width": null
977
+ }
978
+ }
979
+ },
980
+ "version_major": 2,
981
+ "version_minor": 0
982
+ }
983
+ }
984
+ },
985
+ "nbformat": 4,
986
+ "nbformat_minor": 2
987
+ }
oai_chatgpt_gpt4.ipynb ADDED
@@ -0,0 +1,2199 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_chatgpt_gpt4.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "Copyright (c) Microsoft Corporation. All rights reserved. \n",
21
+ "\n",
22
+ "Licensed under the MIT License.\n",
23
+ "\n",
24
+ "# Use AutoGen to Tune ChatGPT\n",
25
+ "\n",
26
+ "AutoGen offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. The study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
27
+ "Please find documentation about this feature [here](/docs/Use-Cases/AutoGen#enhanced-inference).\n",
28
+ "\n",
29
+ "In this notebook, we tune OpenAI ChatGPT (both GPT-3.5 and GPT-4) models for math problem solving. We use [the MATH benchmark](https://crfm.stanford.edu/helm/latest/?group=math_chain_of_thought) for measuring mathematical problem solving on competition math problems with chain-of-thoughts style reasoning.\n",
30
+ "\n",
31
+ "Related link: [Blogpost](https://microsoft.github.io/autogen/blog/2023/04/21/LLM-tuning-math) based on this experiment.\n",
32
+ "\n",
33
+ "## Requirements\n",
34
+ "\n",
35
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install with the [blendsearch] option:\n",
36
+ "```bash\n",
37
+ "pip install \"pyautogen[blendsearch]\"\n",
38
+ "```"
39
+ ]
40
+ },
41
+ {
42
+ "cell_type": "code",
43
+ "execution_count": 1,
44
+ "metadata": {
45
+ "execution": {
46
+ "iopub.execute_input": "2023-02-13T23:40:52.317406Z",
47
+ "iopub.status.busy": "2023-02-13T23:40:52.316561Z",
48
+ "iopub.status.idle": "2023-02-13T23:40:52.321193Z",
49
+ "shell.execute_reply": "2023-02-13T23:40:52.320628Z"
50
+ }
51
+ },
52
+ "outputs": [],
53
+ "source": [
54
+ "# %pip install \"pyautogen[blendsearch]\" datasets"
55
+ ]
56
+ },
57
+ {
58
+ "attachments": {},
59
+ "cell_type": "markdown",
60
+ "metadata": {},
61
+ "source": [
62
+ "AutoGen has provided an API for hyperparameter optimization of OpenAI ChatGPT models: `autogen.ChatCompletion.tune` and to make a request with the tuned config: `autogen.ChatCompletion.create`. First, we import autogen:"
63
+ ]
64
+ },
65
+ {
66
+ "cell_type": "code",
67
+ "execution_count": 2,
68
+ "metadata": {
69
+ "execution": {
70
+ "iopub.execute_input": "2023-02-13T23:40:54.634335Z",
71
+ "iopub.status.busy": "2023-02-13T23:40:54.633929Z",
72
+ "iopub.status.idle": "2023-02-13T23:40:56.105700Z",
73
+ "shell.execute_reply": "2023-02-13T23:40:56.105085Z"
74
+ },
75
+ "slideshow": {
76
+ "slide_type": "slide"
77
+ }
78
+ },
79
+ "outputs": [],
80
+ "source": [
81
+ "import autogen"
82
+ ]
83
+ },
84
+ {
85
+ "attachments": {},
86
+ "cell_type": "markdown",
87
+ "metadata": {},
88
+ "source": [
89
+ "### Set your API Endpoint\n",
90
+ "\n",
91
+ "The [`config_list_openai_aoai`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_openai_aoai) function tries to create a list of Azure OpenAI endpoints and OpenAI endpoints. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files:\n",
92
+ "\n",
93
+ "- OpenAI API key: os.environ[\"OPENAI_API_KEY\"] or `openai_api_key_file=\"key_openai.txt\"`.\n",
94
+ "- Azure OpenAI API key: os.environ[\"AZURE_OPENAI_API_KEY\"] or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
95
+ "- Azure OpenAI API base: os.environ[\"AZURE_OPENAI_API_BASE\"] or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
96
+ "\n",
97
+ "It's OK to have only the OpenAI API key, or only the Azure OpenAI API key + base.\n"
98
+ ]
99
+ },
100
+ {
101
+ "cell_type": "code",
102
+ "execution_count": 3,
103
+ "metadata": {
104
+ "execution": {
105
+ "iopub.execute_input": "2023-02-13T23:40:52.324240Z",
106
+ "iopub.status.busy": "2023-02-13T23:40:52.323783Z",
107
+ "iopub.status.idle": "2023-02-13T23:40:52.330570Z",
108
+ "shell.execute_reply": "2023-02-13T23:40:52.329750Z"
109
+ }
110
+ },
111
+ "outputs": [],
112
+ "source": [
113
+ "config_list = autogen.config_list_openai_aoai()"
114
+ ]
115
+ },
116
+ {
117
+ "attachments": {},
118
+ "cell_type": "markdown",
119
+ "metadata": {},
120
+ "source": [
121
+ "The config list looks like the following:\n",
122
+ "```python\n",
123
+ "config_list = [\n",
124
+ " {'api_key': '<your OpenAI API key here>'}, # only if OpenAI API key is found\n",
125
+ " {\n",
126
+ " 'api_key': '<your first Azure OpenAI API key here>',\n",
127
+ " 'api_base': '<your first Azure OpenAI API base here>',\n",
128
+ " 'api_type': 'azure',\n",
129
+ " 'api_version': '2023-06-01-preview',\n",
130
+ " }, # only if the at least one Azure OpenAI API key is found\n",
131
+ " {\n",
132
+ " 'api_key': '<your second Azure OpenAI API key here>',\n",
133
+ " 'api_base': '<your second Azure OpenAI API base here>',\n",
134
+ " 'api_type': 'azure',\n",
135
+ " 'api_version': '2023-06-01-preview',\n",
136
+ " }, # only if the second Azure OpenAI API key is found\n",
137
+ "]\n",
138
+ "```\n",
139
+ "\n",
140
+ "You can directly override it if the above function returns an empty list, i.e., it doesn't find the keys in the specified locations."
141
+ ]
142
+ },
143
+ {
144
+ "attachments": {},
145
+ "cell_type": "markdown",
146
+ "metadata": {},
147
+ "source": [
148
+ "## Load dataset\n",
149
+ "\n",
150
+ "We load the competition_math dataset. The dataset contains 201 \"Level 2\" Algebra examples. We use a random sample of 20 examples for tuning the generation hyperparameters and the remaining for evaluation."
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": 4,
156
+ "metadata": {
157
+ "execution": {
158
+ "iopub.execute_input": "2023-02-13T23:40:52.339977Z",
159
+ "iopub.status.busy": "2023-02-13T23:40:52.339556Z",
160
+ "iopub.status.idle": "2023-02-13T23:40:54.603349Z",
161
+ "shell.execute_reply": "2023-02-13T23:40:54.602630Z"
162
+ }
163
+ },
164
+ "outputs": [
165
+ {
166
+ "name": "stdout",
167
+ "output_type": "stream",
168
+ "text": [
169
+ "20 201\n"
170
+ ]
171
+ }
172
+ ],
173
+ "source": [
174
+ "import datasets\n",
175
+ "\n",
176
+ "seed = 41\n",
177
+ "data = datasets.load_dataset(\"competition_math\")\n",
178
+ "train_data = data[\"train\"].shuffle(seed=seed)\n",
179
+ "test_data = data[\"test\"].shuffle(seed=seed)\n",
180
+ "n_tune_data = 20\n",
181
+ "tune_data = [\n",
182
+ " {\n",
183
+ " \"problem\": train_data[x][\"problem\"],\n",
184
+ " \"solution\": train_data[x][\"solution\"],\n",
185
+ " }\n",
186
+ " for x in range(len(train_data)) if train_data[x][\"level\"] == \"Level 2\" and train_data[x][\"type\"] == \"Algebra\"\n",
187
+ "][:n_tune_data]\n",
188
+ "test_data = [\n",
189
+ " {\n",
190
+ " \"problem\": test_data[x][\"problem\"],\n",
191
+ " \"solution\": test_data[x][\"solution\"],\n",
192
+ " }\n",
193
+ " for x in range(len(test_data)) if test_data[x][\"level\"] == \"Level 2\" and test_data[x][\"type\"] == \"Algebra\"\n",
194
+ "]\n",
195
+ "print(len(tune_data), len(test_data))\n"
196
+ ]
197
+ },
198
+ {
199
+ "attachments": {},
200
+ "cell_type": "markdown",
201
+ "metadata": {
202
+ "slideshow": {
203
+ "slide_type": "slide"
204
+ }
205
+ },
206
+ "source": [
207
+ "Check a tuning example:"
208
+ ]
209
+ },
210
+ {
211
+ "cell_type": "code",
212
+ "execution_count": 5,
213
+ "metadata": {
214
+ "execution": {
215
+ "iopub.execute_input": "2023-02-13T23:40:54.607152Z",
216
+ "iopub.status.busy": "2023-02-13T23:40:54.606441Z",
217
+ "iopub.status.idle": "2023-02-13T23:40:54.610504Z",
218
+ "shell.execute_reply": "2023-02-13T23:40:54.609759Z"
219
+ },
220
+ "slideshow": {
221
+ "slide_type": "subslide"
222
+ },
223
+ "tags": []
224
+ },
225
+ "outputs": [
226
+ {
227
+ "name": "stdout",
228
+ "output_type": "stream",
229
+ "text": [
230
+ "If $3+a=4-b$ and $4+b=7+a$, what is $3-a$?\n"
231
+ ]
232
+ }
233
+ ],
234
+ "source": [
235
+ "print(tune_data[1][\"problem\"])"
236
+ ]
237
+ },
238
+ {
239
+ "attachments": {},
240
+ "cell_type": "markdown",
241
+ "metadata": {},
242
+ "source": [
243
+ "Here is one example of the canonical solution:"
244
+ ]
245
+ },
246
+ {
247
+ "cell_type": "code",
248
+ "execution_count": 6,
249
+ "metadata": {
250
+ "execution": {
251
+ "iopub.execute_input": "2023-02-13T23:40:54.613590Z",
252
+ "iopub.status.busy": "2023-02-13T23:40:54.613168Z",
253
+ "iopub.status.idle": "2023-02-13T23:40:54.616873Z",
254
+ "shell.execute_reply": "2023-02-13T23:40:54.616193Z"
255
+ }
256
+ },
257
+ "outputs": [
258
+ {
259
+ "name": "stdout",
260
+ "output_type": "stream",
261
+ "text": [
262
+ "First we begin by solving the system of equations \\begin{align*}\n",
263
+ "3+a&=4-b, \\\\\n",
264
+ "4+b&=7+a.\n",
265
+ "\\end{align*}Adding the two equations, we get $3+a+4+b=4-b+7+a$, which simplifies to $7+a+b=11+a-b$. Cancelling $a$ from both sides, we get $7+b=11-b$. Solving for $b$, we find that $b=2$. Plugging this into the first equation above, we obtain $3+a=4-2$. Hence $a=-1$ and $3-a=\\boxed{4}$.\n"
266
+ ]
267
+ }
268
+ ],
269
+ "source": [
270
+ "print(tune_data[1][\"solution\"])"
271
+ ]
272
+ },
273
+ {
274
+ "attachments": {},
275
+ "cell_type": "markdown",
276
+ "metadata": {},
277
+ "source": [
278
+ "## Define Success Metric\n",
279
+ "\n",
280
+ "Before we start tuning, we need to define the success metric we want to optimize. For each math task, we use voting to select a response with the most common answers out of all the generated responses. If it has an equivalent answer to the canonical solution, we consider the task as successfully solved. Then we can optimize the mean success rate of a collection of tasks."
281
+ ]
282
+ },
283
+ {
284
+ "cell_type": "code",
285
+ "execution_count": 7,
286
+ "metadata": {
287
+ "execution": {
288
+ "iopub.execute_input": "2023-02-13T23:40:54.626998Z",
289
+ "iopub.status.busy": "2023-02-13T23:40:54.626593Z",
290
+ "iopub.status.idle": "2023-02-13T23:40:54.631383Z",
291
+ "shell.execute_reply": "2023-02-13T23:40:54.630770Z"
292
+ }
293
+ },
294
+ "outputs": [],
295
+ "source": [
296
+ "from autogen.math_utils import eval_math_responses"
297
+ ]
298
+ },
299
+ {
300
+ "attachments": {},
301
+ "cell_type": "markdown",
302
+ "metadata": {
303
+ "slideshow": {
304
+ "slide_type": "slide"
305
+ }
306
+ },
307
+ "source": [
308
+ "## Use the tuning data to find a good configuration\n"
309
+ ]
310
+ },
311
+ {
312
+ "attachments": {},
313
+ "cell_type": "markdown",
314
+ "metadata": {},
315
+ "source": [
316
+ "For (local) reproducibility and cost efficiency, we cache responses from OpenAI with a controllable seed."
317
+ ]
318
+ },
319
+ {
320
+ "cell_type": "code",
321
+ "execution_count": 8,
322
+ "metadata": {
323
+ "execution": {
324
+ "iopub.execute_input": "2023-02-13T23:40:56.109177Z",
325
+ "iopub.status.busy": "2023-02-13T23:40:56.108624Z",
326
+ "iopub.status.idle": "2023-02-13T23:40:56.112651Z",
327
+ "shell.execute_reply": "2023-02-13T23:40:56.112076Z"
328
+ },
329
+ "slideshow": {
330
+ "slide_type": "slide"
331
+ }
332
+ },
333
+ "outputs": [],
334
+ "source": [
335
+ "autogen.ChatCompletion.set_cache(seed)"
336
+ ]
337
+ },
338
+ {
339
+ "attachments": {},
340
+ "cell_type": "markdown",
341
+ "metadata": {},
342
+ "source": [
343
+ "This will create a disk cache in \".cache/{seed}\". You can change `cache_path_root` from \".cache\" to a different path in `set_cache()`. The cache for different seeds are stored separately.\n",
344
+ "\n",
345
+ "### Perform tuning\n",
346
+ "\n",
347
+ "The tuning will take a while to finish, depending on the optimization budget. The tuning will be performed under the specified optimization budgets.\n",
348
+ "\n",
349
+ "* `inference_budget` is the target average inference budget per instance in the benchmark. For example, 0.004 means the target inference budget is 0.004 dollars, which translates to 2000 tokens (input + output combined) if the gpt-3.5-turbo model is used.\n",
350
+ "* `optimization_budget` is the total budget allowed to perform the tuning. For example, 1 means 1 dollars are allowed in total, which translates to 500K tokens for the gpt-3.5-turbo model.\n",
351
+ "* `num_sumples` is the number of different hyperparameter configurations which is allowed to try. The tuning will stop after either num_samples trials or after optimization_budget dollars spent, whichever happens first. -1 means no hard restriction in the number of trials and the actual number is decided by `optimization_budget`.\n",
352
+ "\n",
353
+ "Users can specify tuning data, optimization metric, optimization mode, evaluation function, search spaces etc.. The default search space is:\n",
354
+ "\n",
355
+ "```python\n",
356
+ "default_search_space = {\n",
357
+ " \"model\": tune.choice([\n",
358
+ " \"gpt-3.5-turbo\",\n",
359
+ " \"gpt-4\",\n",
360
+ " ]),\n",
361
+ " \"temperature_or_top_p\": tune.choice(\n",
362
+ " [\n",
363
+ " {\"temperature\": tune.uniform(0, 2)},\n",
364
+ " {\"top_p\": tune.uniform(0, 1)},\n",
365
+ " ]\n",
366
+ " ),\n",
367
+ " \"max_tokens\": tune.lograndint(50, 1000),\n",
368
+ " \"n\": tune.randint(1, 100),\n",
369
+ " \"prompt\": \"{prompt}\",\n",
370
+ "}\n",
371
+ "```\n",
372
+ "\n",
373
+ "The default search space can be overridden by users' input.\n",
374
+ "For example, the following code specifies a fixed prompt template. For hyperparameters which don't appear in users' input, the default search space will be used."
375
+ ]
376
+ },
377
+ {
378
+ "cell_type": "code",
379
+ "execution_count": null,
380
+ "metadata": {
381
+ "execution": {
382
+ "iopub.execute_input": "2023-02-13T23:40:56.115383Z",
383
+ "iopub.status.busy": "2023-02-13T23:40:56.114975Z",
384
+ "iopub.status.idle": "2023-02-13T23:41:55.045654Z",
385
+ "shell.execute_reply": "2023-02-13T23:41:55.044973Z"
386
+ }
387
+ },
388
+ "outputs": [],
389
+ "source": [
390
+ "import logging\n",
391
+ "\n",
392
+ "prompts = [\"{problem} Solve the problem carefully. Simplify your answer as much as possible. Put the final answer in \\\\boxed{{}}.\"]\n",
393
+ "config, analysis = autogen.ChatCompletion.tune(\n",
394
+ " data=tune_data, # the data for tuning\n",
395
+ " metric=\"success_vote\", # the metric to optimize\n",
396
+ " mode=\"max\", # the optimization mode\n",
397
+ " eval_func=eval_math_responses, # the evaluation function to return the success metrics\n",
398
+ " # log_file_name=\"logs/math.log\", # the log file name\n",
399
+ " inference_budget=0.02, # the inference budget (dollar per instance)\n",
400
+ " optimization_budget=1, # the optimization budget (dollar in total)\n",
401
+ " # num_samples can further limit the number of trials for different hyperparameter configurations;\n",
402
+ " # -1 means decided by the optimization budget only\n",
403
+ " num_samples=20,\n",
404
+ " model=\"gpt-3.5-turbo\", # comment to tune both gpt-3.5-turbo and gpt-4\n",
405
+ " prompt=prompts, # the prompt templates to choose from\n",
406
+ " # stop=\"###\", # the stop sequence\n",
407
+ " config_list=config_list, # the endpoint list\n",
408
+ " allow_format_str_template=True, # whether to allow format string template\n",
409
+ " # logging_level=logging.INFO, # the logging level\n",
410
+ ")\n"
411
+ ]
412
+ },
413
+ {
414
+ "attachments": {},
415
+ "cell_type": "markdown",
416
+ "metadata": {},
417
+ "source": [
418
+ "### Output tuning results\n",
419
+ "\n",
420
+ "After the tuning, we can print out the config and the result found by AutoGen, which uses flaml for tuning."
421
+ ]
422
+ },
423
+ {
424
+ "cell_type": "code",
425
+ "execution_count": 10,
426
+ "metadata": {
427
+ "execution": {
428
+ "iopub.execute_input": "2023-02-13T23:41:55.049204Z",
429
+ "iopub.status.busy": "2023-02-13T23:41:55.048871Z",
430
+ "iopub.status.idle": "2023-02-13T23:41:55.053284Z",
431
+ "shell.execute_reply": "2023-02-13T23:41:55.052574Z"
432
+ }
433
+ },
434
+ "outputs": [
435
+ {
436
+ "name": "stdout",
437
+ "output_type": "stream",
438
+ "text": [
439
+ "optimized config {'max_tokens': 375, 'n': 44, 'prompt': '{problem} Solve the problem carefully. Simplify your answer as much as possible. Put the final answer in \\\\boxed{{}}.', 'model': 'gpt-3.5-turbo', 'allow_format_str_template': True, 'temperature': 0.7466815201029384}\n",
440
+ "best result on tuning data {'expected_success': 0.9818164607828072, 'success': 1.0, 'success_vote': 0.95, 'voted_answer': 'To find the number of integers in the sequence, we need to find when each term becomes less than 1. \\n\\nStarting with 6075, we divide by 3 to get $\\\\frac{6075}{3} = 2025$. Since 2025 is an integer, it is included in the sequence.\\n\\nDividing 2025 by 3, we get $\\\\frac{2025}{3} = 675$. Again, 675 is an integer, so it is included in the sequence.\\n\\nIf we divide 675 by 3, we get $\\\\frac{675}{3} = 225$. 225 is an integer, so it is included in the sequence.\\n\\nDividing 225 by 3, we get $\\\\frac{225}{3} = 75$. 75 is an integer, so it is included in the sequence.\\n\\nDividing 75 by 3, we get $\\\\frac{75}{3} = 25$. 25 is an integer, so it is included in the sequence.\\n\\nIf we divide 25 by 3, we get $\\\\frac{25}{3} \\\\approx 8.3333$, which is not an integer. Thus, 25 is the last integer in the sequence.\\n\\nThere are a total of $\\\\boxed{6}$ integers in the sequence.', 'votes': 34.85, 'total_cost': 0.463802, 'cost': 0.27552199999999993, 'inference_cost': 0.01310685, 'training_iteration': 0, 'config': {'temperature_or_top_p': {'temperature': 0.7466815201029384}, 'max_tokens': 375, 'n': 44, 'prompt': 0, 'model': 'gpt-3.5-turbo', 'allow_format_str_template': True}, 'config/temperature_or_top_p': {'temperature': 0.7466815201029384}, 'config/max_tokens': 375, 'config/n': 44, 'config/prompt': 0, 'config/model': 'gpt-3.5-turbo', 'config/allow_format_str_template': True, 'experiment_tag': 'exp', 'time_total_s': 201.2768588066101}\n"
441
+ ]
442
+ }
443
+ ],
444
+ "source": [
445
+ "print(\"optimized config\", config)\n",
446
+ "print(\"best result on tuning data\", analysis.best_result)"
447
+ ]
448
+ },
449
+ {
450
+ "attachments": {},
451
+ "cell_type": "markdown",
452
+ "metadata": {
453
+ "slideshow": {
454
+ "slide_type": "slide"
455
+ }
456
+ },
457
+ "source": [
458
+ "### Make a request with the tuned config\n",
459
+ "\n",
460
+ "We can apply the tuned config on the request for an example task:"
461
+ ]
462
+ },
463
+ {
464
+ "cell_type": "code",
465
+ "execution_count": 11,
466
+ "metadata": {
467
+ "execution": {
468
+ "iopub.execute_input": "2023-02-13T23:41:55.056205Z",
469
+ "iopub.status.busy": "2023-02-13T23:41:55.055631Z",
470
+ "iopub.status.idle": "2023-02-13T23:41:56.039259Z",
471
+ "shell.execute_reply": "2023-02-13T23:41:56.038427Z"
472
+ },
473
+ "slideshow": {
474
+ "slide_type": "subslide"
475
+ },
476
+ "tags": []
477
+ },
478
+ "outputs": [
479
+ {
480
+ "name": "stdout",
481
+ "output_type": "stream",
482
+ "text": [
483
+ "response on an example data instance: {\n",
484
+ " \"id\": \"chatcmpl-7isNR6uGRH8VfNvrTX9YHj7cKdp49\",\n",
485
+ " \"object\": \"chat.completion\",\n",
486
+ " \"created\": 1690929813,\n",
487
+ " \"model\": \"gpt-35-turbo\",\n",
488
+ " \"prompt_annotations\": [\n",
489
+ " {\n",
490
+ " \"prompt_index\": 0,\n",
491
+ " \"content_filter_results\": {\n",
492
+ " \"hate\": {\n",
493
+ " \"filtered\": false,\n",
494
+ " \"severity\": \"safe\"\n",
495
+ " },\n",
496
+ " \"self_harm\": {\n",
497
+ " \"filtered\": false,\n",
498
+ " \"severity\": \"safe\"\n",
499
+ " },\n",
500
+ " \"sexual\": {\n",
501
+ " \"filtered\": false,\n",
502
+ " \"severity\": \"safe\"\n",
503
+ " },\n",
504
+ " \"violence\": {\n",
505
+ " \"filtered\": false,\n",
506
+ " \"severity\": \"safe\"\n",
507
+ " }\n",
508
+ " }\n",
509
+ " }\n",
510
+ " ],\n",
511
+ " \"choices\": [\n",
512
+ " {\n",
513
+ " \"index\": 0,\n",
514
+ " \"finish_reason\": \"stop\",\n",
515
+ " \"message\": {\n",
516
+ " \"role\": \"assistant\",\n",
517
+ " \"content\": \"We start by solving the first equation for $a$: $$3+a=4-b.$$Adding $-3$ to both sides gives $a=1-b$. Substituting this expression for $a$ into the second equation gives $$4+b=7+(1-b).$$Simplifying this expression, we find that $b=2$. Substituting $b=2$ into the first equation to solve for $a$, we find that $a=1-2=-1$. Finally, we have $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
518
+ " },\n",
519
+ " \"content_filter_results\": {\n",
520
+ " \"hate\": {\n",
521
+ " \"filtered\": false,\n",
522
+ " \"severity\": \"safe\"\n",
523
+ " },\n",
524
+ " \"self_harm\": {\n",
525
+ " \"filtered\": false,\n",
526
+ " \"severity\": \"safe\"\n",
527
+ " },\n",
528
+ " \"sexual\": {\n",
529
+ " \"filtered\": false,\n",
530
+ " \"severity\": \"safe\"\n",
531
+ " },\n",
532
+ " \"violence\": {\n",
533
+ " \"filtered\": false,\n",
534
+ " \"severity\": \"safe\"\n",
535
+ " }\n",
536
+ " }\n",
537
+ " },\n",
538
+ " {\n",
539
+ " \"index\": 1,\n",
540
+ " \"finish_reason\": \"stop\",\n",
541
+ " \"message\": {\n",
542
+ " \"role\": \"assistant\",\n",
543
+ " \"content\": \"Adding $a$ to both sides of the first equation gives $3+a+a=4-b+a$, which simplifies to $3+2a=4+a-b$. Adding $b$ to both sides of the second equation gives $4+b+b=7+a+b$, which simplifies to $4+2b=7+a+b$. Rearranging the equations gives $2a-b=1$ and $2b-a=3$. Adding these two equations gives $3a=4$, so $a=\\\\frac{4}{3}$. Substituting this into the first equation gives $2\\\\left(\\\\frac{4}{3}\\\\right)-b=1$, so $b=\\\\frac{5}{3}$. Thus, $3-a=3-\\\\frac{4}{3}=\\\\frac{9}{3}-\\\\frac{4}{3}=\\\\boxed{\\\\frac{5}{3}}$.\"\n",
544
+ " },\n",
545
+ " \"content_filter_results\": {\n",
546
+ " \"hate\": {\n",
547
+ " \"filtered\": false,\n",
548
+ " \"severity\": \"safe\"\n",
549
+ " },\n",
550
+ " \"self_harm\": {\n",
551
+ " \"filtered\": false,\n",
552
+ " \"severity\": \"safe\"\n",
553
+ " },\n",
554
+ " \"sexual\": {\n",
555
+ " \"filtered\": false,\n",
556
+ " \"severity\": \"safe\"\n",
557
+ " },\n",
558
+ " \"violence\": {\n",
559
+ " \"filtered\": false,\n",
560
+ " \"severity\": \"safe\"\n",
561
+ " }\n",
562
+ " }\n",
563
+ " },\n",
564
+ " {\n",
565
+ " \"index\": 2,\n",
566
+ " \"finish_reason\": \"stop\",\n",
567
+ " \"message\": {\n",
568
+ " \"role\": \"assistant\",\n",
569
+ " \"content\": \"We can first rewrite the given equations as $a+b=1$ and $a-b=3$. Adding these equations, we get $(a+b)+(a-b)=1+3$, which simplifies to $2a=4$. Dividing both sides by 2, we find $a=2$. Substituting this value of $a$ into the first equation, we have $2+b=1$, so $b=-1$. Finally, we can compute $3-a=3-2=\\\\boxed{1}$.\"\n",
570
+ " },\n",
571
+ " \"content_filter_results\": {\n",
572
+ " \"hate\": {\n",
573
+ " \"filtered\": false,\n",
574
+ " \"severity\": \"safe\"\n",
575
+ " },\n",
576
+ " \"self_harm\": {\n",
577
+ " \"filtered\": false,\n",
578
+ " \"severity\": \"safe\"\n",
579
+ " },\n",
580
+ " \"sexual\": {\n",
581
+ " \"filtered\": false,\n",
582
+ " \"severity\": \"safe\"\n",
583
+ " },\n",
584
+ " \"violence\": {\n",
585
+ " \"filtered\": false,\n",
586
+ " \"severity\": \"safe\"\n",
587
+ " }\n",
588
+ " }\n",
589
+ " },\n",
590
+ " {\n",
591
+ " \"index\": 3,\n",
592
+ " \"finish_reason\": \"stop\",\n",
593
+ " \"message\": {\n",
594
+ " \"role\": \"assistant\",\n",
595
+ " \"content\": \"From the first equation, we have $a=4-b-3$. Substituting this into the second equation, we get $4+b=7+4-b-3$. Simplifying the right side gives us $4+b=8-b$. Adding $b$ to both sides gives $4+2b=8$. Subtracting 4 from both sides gives $2b=4$ and dividing both sides by 2 gives $b=2$. Substituting this into the first equation gives $3+a=4-2$, so $a=-1$. Finally, $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
596
+ " },\n",
597
+ " \"content_filter_results\": {\n",
598
+ " \"hate\": {\n",
599
+ " \"filtered\": false,\n",
600
+ " \"severity\": \"safe\"\n",
601
+ " },\n",
602
+ " \"self_harm\": {\n",
603
+ " \"filtered\": false,\n",
604
+ " \"severity\": \"safe\"\n",
605
+ " },\n",
606
+ " \"sexual\": {\n",
607
+ " \"filtered\": false,\n",
608
+ " \"severity\": \"safe\"\n",
609
+ " },\n",
610
+ " \"violence\": {\n",
611
+ " \"filtered\": false,\n",
612
+ " \"severity\": \"safe\"\n",
613
+ " }\n",
614
+ " }\n",
615
+ " },\n",
616
+ " {\n",
617
+ " \"index\": 4,\n",
618
+ " \"finish_reason\": \"stop\",\n",
619
+ " \"message\": {\n",
620
+ " \"role\": \"assistant\",\n",
621
+ " \"content\": \"Adding the two equations, we have $(3+a)+(4+b)=(4-b)+(7+a)$. Expanding both sides gives $7+a+b=11+a-b$. Subtracting $a$ and $b$ from both sides gives $7=11$, which is impossible. Thus, there is no solution to the given equations.\"\n",
622
+ " },\n",
623
+ " \"content_filter_results\": {\n",
624
+ " \"hate\": {\n",
625
+ " \"filtered\": false,\n",
626
+ " \"severity\": \"safe\"\n",
627
+ " },\n",
628
+ " \"self_harm\": {\n",
629
+ " \"filtered\": false,\n",
630
+ " \"severity\": \"safe\"\n",
631
+ " },\n",
632
+ " \"sexual\": {\n",
633
+ " \"filtered\": false,\n",
634
+ " \"severity\": \"safe\"\n",
635
+ " },\n",
636
+ " \"violence\": {\n",
637
+ " \"filtered\": false,\n",
638
+ " \"severity\": \"safe\"\n",
639
+ " }\n",
640
+ " }\n",
641
+ " },\n",
642
+ " {\n",
643
+ " \"index\": 5,\n",
644
+ " \"finish_reason\": \"stop\",\n",
645
+ " \"message\": {\n",
646
+ " \"role\": \"assistant\",\n",
647
+ " \"content\": \"We are given the equations $3+a=4-b$ and $4+b=7+a$. We can solve for $a$ and $b$ by using elimination.\\n\\nWe can eliminate $a$ by adding the two equations together. This gives us $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides gives us $7+a+b=11+a-b$. We can then eliminate $a$ by subtracting $a$ from both sides, which gives us $7+b=11-b$. Adding $b$ to both sides gives us $7+2b=11$. Subtracting $7$ from both sides gives us $2b=4$. Dividing both sides by $2$, we get $b=2$.\\n\\nNow that we know the value of $b$, we can substitute it back into one of the original equations to solve for $a$. Let's use the equation $4+b=7+a$. Since we found that $b=2$, we have $4+2=7+a$. Simplifying gives us $6=7+a$. Subtracting $7$ from both sides gives us $-1=a$.\\n\\nFinally, we can find $3-a$ by substituting the value of $a$ that we found back into the expression. We have $3-a=3-(-1)$. Simplifying gives us $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
648
+ " },\n",
649
+ " \"content_filter_results\": {\n",
650
+ " \"hate\": {\n",
651
+ " \"filtered\": false,\n",
652
+ " \"severity\": \"safe\"\n",
653
+ " },\n",
654
+ " \"self_harm\": {\n",
655
+ " \"filtered\": false,\n",
656
+ " \"severity\": \"safe\"\n",
657
+ " },\n",
658
+ " \"sexual\": {\n",
659
+ " \"filtered\": false,\n",
660
+ " \"severity\": \"safe\"\n",
661
+ " },\n",
662
+ " \"violence\": {\n",
663
+ " \"filtered\": false,\n",
664
+ " \"severity\": \"safe\"\n",
665
+ " }\n",
666
+ " }\n",
667
+ " },\n",
668
+ " {\n",
669
+ " \"index\": 6,\n",
670
+ " \"finish_reason\": \"stop\",\n",
671
+ " \"message\": {\n",
672
+ " \"role\": \"assistant\",\n",
673
+ " \"content\": \"We can start by solving the given equations separately. From $3+a=4-b$, we can add $b$ to both sides to find $3+a+b=4$. Similarly, from $4+b=7+a$, we can subtract $a$ from both sides to find $4+b-a=7$. Adding these two equations together, we have $3+a+b+4+b-a=4+7$, which simplifies to $7+b=11$. Subtracting $7$ from both sides gives us $b=4$. Substituting this value of $b$ into the equation $4+b-a=7$, we find $4+4-a=7$. Simplifying this equation further, we have $8-a=7$. Subtracting $8$ from both sides gives us $-a=-1$, or $a=1$. Finally, we can substitute the values of $a$ and $b$ into the expression $3-a$, which gives us $3-1=\\\\boxed{2}$.\"\n",
674
+ " },\n",
675
+ " \"content_filter_results\": {\n",
676
+ " \"hate\": {\n",
677
+ " \"filtered\": false,\n",
678
+ " \"severity\": \"safe\"\n",
679
+ " },\n",
680
+ " \"self_harm\": {\n",
681
+ " \"filtered\": false,\n",
682
+ " \"severity\": \"safe\"\n",
683
+ " },\n",
684
+ " \"sexual\": {\n",
685
+ " \"filtered\": false,\n",
686
+ " \"severity\": \"safe\"\n",
687
+ " },\n",
688
+ " \"violence\": {\n",
689
+ " \"filtered\": false,\n",
690
+ " \"severity\": \"safe\"\n",
691
+ " }\n",
692
+ " }\n",
693
+ " },\n",
694
+ " {\n",
695
+ " \"index\": 7,\n",
696
+ " \"finish_reason\": \"stop\",\n",
697
+ " \"message\": {\n",
698
+ " \"role\": \"assistant\",\n",
699
+ " \"content\": \"Starting with the first equation, we have $3+a=4-b$. Rearranging, we get $a=-1-b$. Substituting this into the second equation, we have $4+b=7+(-1-b)$. Simplifying, we get $b=-2$. Substituting this value of $b$ back into the first equation, we have $3+a=4-(-2)$. Simplifying further, we have $3+a=6$. Subtracting $3$ from both sides, we get $a=3$. Finally, we can find $3-a=3-3=\\\\boxed{0}$.\"\n",
700
+ " },\n",
701
+ " \"content_filter_results\": {\n",
702
+ " \"hate\": {\n",
703
+ " \"filtered\": false,\n",
704
+ " \"severity\": \"safe\"\n",
705
+ " },\n",
706
+ " \"self_harm\": {\n",
707
+ " \"filtered\": false,\n",
708
+ " \"severity\": \"safe\"\n",
709
+ " },\n",
710
+ " \"sexual\": {\n",
711
+ " \"filtered\": false,\n",
712
+ " \"severity\": \"safe\"\n",
713
+ " },\n",
714
+ " \"violence\": {\n",
715
+ " \"filtered\": false,\n",
716
+ " \"severity\": \"safe\"\n",
717
+ " }\n",
718
+ " }\n",
719
+ " },\n",
720
+ " {\n",
721
+ " \"index\": 8,\n",
722
+ " \"finish_reason\": \"stop\",\n",
723
+ " \"message\": {\n",
724
+ " \"role\": \"assistant\",\n",
725
+ " \"content\": \"Adding the two equations gives $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides gives $7+a+b=11+a-b$. Subtracting $a$ from both sides gives $7+b=11-b$. Adding $b$ to both sides gives $b+b=11-7$, so $2b=4$ and $b=2$. Substituting this value back into either equation gives $4+2=7+a$, so $a=4$. Therefore, $3-a=3-4=\\\\boxed{-1}$.\"\n",
726
+ " },\n",
727
+ " \"content_filter_results\": {\n",
728
+ " \"hate\": {\n",
729
+ " \"filtered\": false,\n",
730
+ " \"severity\": \"safe\"\n",
731
+ " },\n",
732
+ " \"self_harm\": {\n",
733
+ " \"filtered\": false,\n",
734
+ " \"severity\": \"safe\"\n",
735
+ " },\n",
736
+ " \"sexual\": {\n",
737
+ " \"filtered\": false,\n",
738
+ " \"severity\": \"safe\"\n",
739
+ " },\n",
740
+ " \"violence\": {\n",
741
+ " \"filtered\": false,\n",
742
+ " \"severity\": \"safe\"\n",
743
+ " }\n",
744
+ " }\n",
745
+ " },\n",
746
+ " {\n",
747
+ " \"index\": 9,\n",
748
+ " \"finish_reason\": \"stop\",\n",
749
+ " \"message\": {\n",
750
+ " \"role\": \"assistant\",\n",
751
+ " \"content\": \"Rearranging the first equation, we have $a=4-b-3$ and rearranging the second equation, we have $b=7+a-4$. Substituting these expressions for $a$ and $b$ into the equation $3-a$, we have $3-(4-b-3)$. Simplifying the expression inside the parentheses, we have $3-(4-b-3)=3-4+b+3=9+b$. Substituting the expression for $b$ into $9+b$, we have $9+(7+a-4)=9+7+a-4=12+a$. Finally, substituting the expression for $a$ into $12+a$, we have $12+(4-b-3)=12+4-b-3=\\\\boxed{10-b}$.\"\n",
752
+ " },\n",
753
+ " \"content_filter_results\": {\n",
754
+ " \"hate\": {\n",
755
+ " \"filtered\": false,\n",
756
+ " \"severity\": \"safe\"\n",
757
+ " },\n",
758
+ " \"self_harm\": {\n",
759
+ " \"filtered\": false,\n",
760
+ " \"severity\": \"safe\"\n",
761
+ " },\n",
762
+ " \"sexual\": {\n",
763
+ " \"filtered\": false,\n",
764
+ " \"severity\": \"safe\"\n",
765
+ " },\n",
766
+ " \"violence\": {\n",
767
+ " \"filtered\": false,\n",
768
+ " \"severity\": \"safe\"\n",
769
+ " }\n",
770
+ " }\n",
771
+ " },\n",
772
+ " {\n",
773
+ " \"index\": 10,\n",
774
+ " \"finish_reason\": \"stop\",\n",
775
+ " \"message\": {\n",
776
+ " \"role\": \"assistant\",\n",
777
+ " \"content\": \"We have the system of equations \\\\begin{align*}\\n3+a&=4-b\\\\\\\\\\n4+b&=7+a\\n\\\\end{align*} Rearranging the first equation, we have $a+b=1$. Substituting this into the second equation, we get $4+1=7+a$, so $a=-4$. Thus, $3-a=\\\\boxed{7}$.\"\n",
778
+ " },\n",
779
+ " \"content_filter_results\": {\n",
780
+ " \"hate\": {\n",
781
+ " \"filtered\": false,\n",
782
+ " \"severity\": \"safe\"\n",
783
+ " },\n",
784
+ " \"self_harm\": {\n",
785
+ " \"filtered\": false,\n",
786
+ " \"severity\": \"safe\"\n",
787
+ " },\n",
788
+ " \"sexual\": {\n",
789
+ " \"filtered\": false,\n",
790
+ " \"severity\": \"safe\"\n",
791
+ " },\n",
792
+ " \"violence\": {\n",
793
+ " \"filtered\": false,\n",
794
+ " \"severity\": \"safe\"\n",
795
+ " }\n",
796
+ " }\n",
797
+ " },\n",
798
+ " {\n",
799
+ " \"index\": 11,\n",
800
+ " \"finish_reason\": \"stop\",\n",
801
+ " \"message\": {\n",
802
+ " \"role\": \"assistant\",\n",
803
+ " \"content\": \"Simplifying the first equation, we have $a=1-b$. Substituting this into the second equation, we have $4+b=7+(1-b)$. Expanding the right side gives $4+b=7+1-b$. Combining like terms gives $2b=4$, so $b=2$. Substituting this back into $a=1-b$, we find that $a=-1$. Thus, $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
804
+ " },\n",
805
+ " \"content_filter_results\": {\n",
806
+ " \"hate\": {\n",
807
+ " \"filtered\": false,\n",
808
+ " \"severity\": \"safe\"\n",
809
+ " },\n",
810
+ " \"self_harm\": {\n",
811
+ " \"filtered\": false,\n",
812
+ " \"severity\": \"safe\"\n",
813
+ " },\n",
814
+ " \"sexual\": {\n",
815
+ " \"filtered\": false,\n",
816
+ " \"severity\": \"safe\"\n",
817
+ " },\n",
818
+ " \"violence\": {\n",
819
+ " \"filtered\": false,\n",
820
+ " \"severity\": \"safe\"\n",
821
+ " }\n",
822
+ " }\n",
823
+ " },\n",
824
+ " {\n",
825
+ " \"index\": 12,\n",
826
+ " \"finish_reason\": \"stop\",\n",
827
+ " \"message\": {\n",
828
+ " \"role\": \"assistant\",\n",
829
+ " \"content\": \"From the first equation, we have $a=4-b-3$. Substituting this into the second equation, we have $4+b=7+(4-b-3)$. Simplifying the right side of the equation gives $4+b=8-b$. Adding $b$ to both sides gives $4+2b=8$. Subtracting 4 from both sides gives $2b=4$. Dividing both sides by 2 gives $b=2$. Substituting this value back into the first equation gives $3+a=4-2$. Simplifying the right side gives $3+a=2$. Subtracting 3 from both sides gives $a=-1$. Finally, we have $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
830
+ " },\n",
831
+ " \"content_filter_results\": {\n",
832
+ " \"hate\": {\n",
833
+ " \"filtered\": false,\n",
834
+ " \"severity\": \"safe\"\n",
835
+ " },\n",
836
+ " \"self_harm\": {\n",
837
+ " \"filtered\": false,\n",
838
+ " \"severity\": \"safe\"\n",
839
+ " },\n",
840
+ " \"sexual\": {\n",
841
+ " \"filtered\": false,\n",
842
+ " \"severity\": \"safe\"\n",
843
+ " },\n",
844
+ " \"violence\": {\n",
845
+ " \"filtered\": false,\n",
846
+ " \"severity\": \"safe\"\n",
847
+ " }\n",
848
+ " }\n",
849
+ " },\n",
850
+ " {\n",
851
+ " \"index\": 13,\n",
852
+ " \"finish_reason\": \"stop\",\n",
853
+ " \"message\": {\n",
854
+ " \"role\": \"assistant\",\n",
855
+ " \"content\": \"From the first equation, subtracting $a$ and adding $4$ to both sides gives $7=b-a$. Substituting this into the second equation gives $4+(b-a)=7+a$, so $4+7=b+a$. Combining these equations gives $3+b+a=11+a$, so $b=\\\\boxed{8}$. Substituting into the first equation gives $3+a=4-8$ which gives $a=\\\\boxed{-9}$. Finally, $3-a=3-(-9)=3+9=\\\\boxed{12}$.\"\n",
856
+ " },\n",
857
+ " \"content_filter_results\": {\n",
858
+ " \"hate\": {\n",
859
+ " \"filtered\": false,\n",
860
+ " \"severity\": \"safe\"\n",
861
+ " },\n",
862
+ " \"self_harm\": {\n",
863
+ " \"filtered\": false,\n",
864
+ " \"severity\": \"safe\"\n",
865
+ " },\n",
866
+ " \"sexual\": {\n",
867
+ " \"filtered\": false,\n",
868
+ " \"severity\": \"safe\"\n",
869
+ " },\n",
870
+ " \"violence\": {\n",
871
+ " \"filtered\": false,\n",
872
+ " \"severity\": \"safe\"\n",
873
+ " }\n",
874
+ " }\n",
875
+ " },\n",
876
+ " {\n",
877
+ " \"index\": 14,\n",
878
+ " \"finish_reason\": \"stop\",\n",
879
+ " \"message\": {\n",
880
+ " \"role\": \"assistant\",\n",
881
+ " \"content\": \"We can start by solving the first equation for $a$ in terms of $b$. Subtracting $3$ from both sides of the equation $3+a=4-b$ gives $a=1-b$. We can substitute this expression for $a$ in the second equation to solve for $b$: \\\\begin{align*}\\n4+b&=7+a\\\\\\\\\\n4+b&=7+(1-b)\\\\\\\\\\n4+b&=8-b\\\\\\\\\\n2b&=4\\\\\\\\\\nb&=2.\\n\\\\end{align*}Substituting this value of $b$ back into the first equation to solve for $a$, we have $3+a=4-2$, so $a=-1$. Finally, we can find $3-a=3-(-1)=\\\\boxed{4}$.\"\n",
882
+ " },\n",
883
+ " \"content_filter_results\": {\n",
884
+ " \"hate\": {\n",
885
+ " \"filtered\": false,\n",
886
+ " \"severity\": \"safe\"\n",
887
+ " },\n",
888
+ " \"self_harm\": {\n",
889
+ " \"filtered\": false,\n",
890
+ " \"severity\": \"safe\"\n",
891
+ " },\n",
892
+ " \"sexual\": {\n",
893
+ " \"filtered\": false,\n",
894
+ " \"severity\": \"safe\"\n",
895
+ " },\n",
896
+ " \"violence\": {\n",
897
+ " \"filtered\": false,\n",
898
+ " \"severity\": \"safe\"\n",
899
+ " }\n",
900
+ " }\n",
901
+ " },\n",
902
+ " {\n",
903
+ " \"index\": 15,\n",
904
+ " \"finish_reason\": \"stop\",\n",
905
+ " \"message\": {\n",
906
+ " \"role\": \"assistant\",\n",
907
+ " \"content\": \"From the first equation, we have $a=1-b$. Substituting this into the second equation gives $4+b=7+1-b$. Simplifying gives $2b=4$, so $b=2$. Substituting this back into the first equation gives $3+a=4-2$, so $a=-1$. Therefore, $3-a=3-(-1)=4$. Simplifying gives $\\\\boxed{4}$.\"\n",
908
+ " },\n",
909
+ " \"content_filter_results\": {\n",
910
+ " \"hate\": {\n",
911
+ " \"filtered\": false,\n",
912
+ " \"severity\": \"safe\"\n",
913
+ " },\n",
914
+ " \"self_harm\": {\n",
915
+ " \"filtered\": false,\n",
916
+ " \"severity\": \"safe\"\n",
917
+ " },\n",
918
+ " \"sexual\": {\n",
919
+ " \"filtered\": false,\n",
920
+ " \"severity\": \"safe\"\n",
921
+ " },\n",
922
+ " \"violence\": {\n",
923
+ " \"filtered\": false,\n",
924
+ " \"severity\": \"safe\"\n",
925
+ " }\n",
926
+ " }\n",
927
+ " },\n",
928
+ " {\n",
929
+ " \"index\": 16,\n",
930
+ " \"finish_reason\": \"stop\",\n",
931
+ " \"message\": {\n",
932
+ " \"role\": \"assistant\",\n",
933
+ " \"content\": \"Adding the two given equations, we have $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides gives $7+a+b=11+a-b$. We can subtract $a$ from both sides to get $7+b=11-b$. Adding $b$ to both sides gives $7+2b=11$. Subtracting 7 from both sides gives $2b=4$. Dividing by 2 gives $b=2$.\\n\\nWe can substitute $b=2$ into the first equation $3+a=4-b$ to solve for $a$. We have $3+a=4-2$ which simplifies to $3+a=2$. Subtracting 3 from both sides gives $a=-1$.\\n\\nFinally, we can substitute $a=-1$ into $3-a$ to find $3-a=3-(-1)$. Simplifying gives $3-a=3+1=\\\\boxed{4}$.\"\n",
934
+ " },\n",
935
+ " \"content_filter_results\": {\n",
936
+ " \"hate\": {\n",
937
+ " \"filtered\": false,\n",
938
+ " \"severity\": \"safe\"\n",
939
+ " },\n",
940
+ " \"self_harm\": {\n",
941
+ " \"filtered\": false,\n",
942
+ " \"severity\": \"safe\"\n",
943
+ " },\n",
944
+ " \"sexual\": {\n",
945
+ " \"filtered\": false,\n",
946
+ " \"severity\": \"safe\"\n",
947
+ " },\n",
948
+ " \"violence\": {\n",
949
+ " \"filtered\": false,\n",
950
+ " \"severity\": \"safe\"\n",
951
+ " }\n",
952
+ " }\n",
953
+ " },\n",
954
+ " {\n",
955
+ " \"index\": 17,\n",
956
+ " \"finish_reason\": \"stop\",\n",
957
+ " \"message\": {\n",
958
+ " \"role\": \"assistant\",\n",
959
+ " \"content\": \"Adding the two given equations, we have $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides gives $7+a+b=11+a-b$. Subtracting $a$ from both sides gives $7+b=11-b$. Adding $b$ to both sides gives $7+2b=11$. Subtracting $7$ from both sides gives $2b=4$. Finally, dividing both sides by $2$ gives $b=2$. Substituting this value for $b$ into the second given equation, we have $4+2=7+a$. Simplifying gives $a=-1$. Therefore, $3-a=3-(-1)=4$. Thus, the final answer is $\\\\boxed{4}$.\"\n",
960
+ " },\n",
961
+ " \"content_filter_results\": {\n",
962
+ " \"hate\": {\n",
963
+ " \"filtered\": false,\n",
964
+ " \"severity\": \"safe\"\n",
965
+ " },\n",
966
+ " \"self_harm\": {\n",
967
+ " \"filtered\": false,\n",
968
+ " \"severity\": \"safe\"\n",
969
+ " },\n",
970
+ " \"sexual\": {\n",
971
+ " \"filtered\": false,\n",
972
+ " \"severity\": \"safe\"\n",
973
+ " },\n",
974
+ " \"violence\": {\n",
975
+ " \"filtered\": false,\n",
976
+ " \"severity\": \"safe\"\n",
977
+ " }\n",
978
+ " }\n",
979
+ " },\n",
980
+ " {\n",
981
+ " \"index\": 18,\n",
982
+ " \"finish_reason\": \"stop\",\n",
983
+ " \"message\": {\n",
984
+ " \"role\": \"assistant\",\n",
985
+ " \"content\": \"Let's start by simplifying the given equations. We have $3+a=4-b$, which we can rearrange to get $a=-b+1$. Similarly, we have $4+b=7+a$, which rearranges to $b=a+3$. \\n\\nWe can substitute the value of $b$ from the second equation into the first equation to get $a=(-a-3)+1$. Simplifying this equation gives $2a=-2$, so $a=-1$. \\n\\nSubstituting this value of $a$ into the second equation gives $b=(-1)+3$, so $b=2$. \\n\\nFinally, we can find $3-a$ by substituting $a=-1$ into $3-a$. This gives $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
986
+ " },\n",
987
+ " \"content_filter_results\": {\n",
988
+ " \"hate\": {\n",
989
+ " \"filtered\": false,\n",
990
+ " \"severity\": \"safe\"\n",
991
+ " },\n",
992
+ " \"self_harm\": {\n",
993
+ " \"filtered\": false,\n",
994
+ " \"severity\": \"safe\"\n",
995
+ " },\n",
996
+ " \"sexual\": {\n",
997
+ " \"filtered\": false,\n",
998
+ " \"severity\": \"safe\"\n",
999
+ " },\n",
1000
+ " \"violence\": {\n",
1001
+ " \"filtered\": false,\n",
1002
+ " \"severity\": \"safe\"\n",
1003
+ " }\n",
1004
+ " }\n",
1005
+ " },\n",
1006
+ " {\n",
1007
+ " \"index\": 19,\n",
1008
+ " \"finish_reason\": \"stop\",\n",
1009
+ " \"message\": {\n",
1010
+ " \"role\": \"assistant\",\n",
1011
+ " \"content\": \"We have the equations $3+a=4-b$ and $4+b=7+a$. We can solve these equations using substitution or elimination. Let's solve it using elimination.\\n\\nTo eliminate $a$, we can add the first equation to the second equation. This gives us $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides, we have $7 + a + b = 11 + a - b$.\\n\\nNow, let's isolate $b$ by subtracting $a$ from both sides: $7 + b = 11 - b$.\\n\\nTo isolate $b$ on one side, we can add $b$ to both sides: $7 + 2b = 11$.\\n\\nSubtracting $7$ from both sides gives $2b= 4$. Dividing both sides by $2$, we find $b=2$.\\n\\nNow, we can substitute $b=2$ into the second equation $4+b=7+a$. This gives $4+2=7+a$, or $6=7+a$. Subtracting $7$ from both sides gives $-1=a$.\\n\\nFinally, we can find $3-a$ by substituting $a=-1$ into $3-a$. This gives $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1012
+ " },\n",
1013
+ " \"content_filter_results\": {\n",
1014
+ " \"hate\": {\n",
1015
+ " \"filtered\": false,\n",
1016
+ " \"severity\": \"safe\"\n",
1017
+ " },\n",
1018
+ " \"self_harm\": {\n",
1019
+ " \"filtered\": false,\n",
1020
+ " \"severity\": \"safe\"\n",
1021
+ " },\n",
1022
+ " \"sexual\": {\n",
1023
+ " \"filtered\": false,\n",
1024
+ " \"severity\": \"safe\"\n",
1025
+ " },\n",
1026
+ " \"violence\": {\n",
1027
+ " \"filtered\": false,\n",
1028
+ " \"severity\": \"safe\"\n",
1029
+ " }\n",
1030
+ " }\n",
1031
+ " },\n",
1032
+ " {\n",
1033
+ " \"index\": 20,\n",
1034
+ " \"finish_reason\": \"stop\",\n",
1035
+ " \"message\": {\n",
1036
+ " \"role\": \"assistant\",\n",
1037
+ " \"content\": \"We can start by combining like terms in both equations. From the first equation, we have $a+3=b-4$, and from the second equation, we have $b+4=a-7$. \\n\\nNow, we can substitute $b-4$ for $a+3$ in the second equation, since they are equal. This gives us $b+4=(b-4)-7$. \\n\\nSimplifying, we have $b+4=b-11$. \\n\\nSubtracting $b$ from both sides, we get $4=-11$. \\n\\nThis is a contradiction, since $4$ does not equal $-11$. \\n\\nTherefore, the solution to this system of equations does not exist, and we cannot find the value of $3-a$. Thus, the answer is $\\\\boxed{\\\\text{DNE}}$.\"\n",
1038
+ " },\n",
1039
+ " \"content_filter_results\": {\n",
1040
+ " \"hate\": {\n",
1041
+ " \"filtered\": false,\n",
1042
+ " \"severity\": \"safe\"\n",
1043
+ " },\n",
1044
+ " \"self_harm\": {\n",
1045
+ " \"filtered\": false,\n",
1046
+ " \"severity\": \"safe\"\n",
1047
+ " },\n",
1048
+ " \"sexual\": {\n",
1049
+ " \"filtered\": false,\n",
1050
+ " \"severity\": \"safe\"\n",
1051
+ " },\n",
1052
+ " \"violence\": {\n",
1053
+ " \"filtered\": false,\n",
1054
+ " \"severity\": \"safe\"\n",
1055
+ " }\n",
1056
+ " }\n",
1057
+ " },\n",
1058
+ " {\n",
1059
+ " \"index\": 21,\n",
1060
+ " \"finish_reason\": \"stop\",\n",
1061
+ " \"message\": {\n",
1062
+ " \"role\": \"assistant\",\n",
1063
+ " \"content\": \"We can start by solving the first equation, $3+a=4-b$, for $a$ in terms of $b$ by subtracting $3$ from both sides and then adding $b$ to both sides. This gives us $a = 1-b$.\\n\\nWe can substitute this expression for $a$ into the second equation, $4+b=7+a$, to solve for $b$ in terms of $a$. After simplifying, we have $b=4-a$.\\n\\nTo find $3-a$, we substitute $b=4-a$ into the first equation $3+a=4-b$. This gives us $3+a=4-(4-a)$.\\n\\nSimplifying this equation gives $3+a=4-4+a$, so $3+a=a$.\\n\\nTherefore, $3-a = \\\\boxed{3}$.\"\n",
1064
+ " },\n",
1065
+ " \"content_filter_results\": {\n",
1066
+ " \"hate\": {\n",
1067
+ " \"filtered\": false,\n",
1068
+ " \"severity\": \"safe\"\n",
1069
+ " },\n",
1070
+ " \"self_harm\": {\n",
1071
+ " \"filtered\": false,\n",
1072
+ " \"severity\": \"safe\"\n",
1073
+ " },\n",
1074
+ " \"sexual\": {\n",
1075
+ " \"filtered\": false,\n",
1076
+ " \"severity\": \"safe\"\n",
1077
+ " },\n",
1078
+ " \"violence\": {\n",
1079
+ " \"filtered\": false,\n",
1080
+ " \"severity\": \"safe\"\n",
1081
+ " }\n",
1082
+ " }\n",
1083
+ " },\n",
1084
+ " {\n",
1085
+ " \"index\": 22,\n",
1086
+ " \"finish_reason\": \"stop\",\n",
1087
+ " \"message\": {\n",
1088
+ " \"role\": \"assistant\",\n",
1089
+ " \"content\": \"To solve this problem, we can start by solving the first equation $3+a=4-b$ for $b$. Subtracting $3$ from both sides gives $a=1-b$. We can substitute this into the second equation $4+b=7+a$ to get $4+b=7+(1-b)$. Expanding the right side gives $4+b=7+1-b$. Combining like terms gives $b+b=7+1-4$. Simplifying the right side gives $2b=4$. Dividing both sides by $2$ gives $b=2$. Now we can substitute this back into the first equation to solve for $a$. We have $3+a=4-2$, so $3+a=2$. Subtracting $3$ from both sides gives $a=-1$. Finally, we can find $3-a$ by subtracting $a$ from $3$. We have $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1090
+ " },\n",
1091
+ " \"content_filter_results\": {\n",
1092
+ " \"hate\": {\n",
1093
+ " \"filtered\": false,\n",
1094
+ " \"severity\": \"safe\"\n",
1095
+ " },\n",
1096
+ " \"self_harm\": {\n",
1097
+ " \"filtered\": false,\n",
1098
+ " \"severity\": \"safe\"\n",
1099
+ " },\n",
1100
+ " \"sexual\": {\n",
1101
+ " \"filtered\": false,\n",
1102
+ " \"severity\": \"safe\"\n",
1103
+ " },\n",
1104
+ " \"violence\": {\n",
1105
+ " \"filtered\": false,\n",
1106
+ " \"severity\": \"safe\"\n",
1107
+ " }\n",
1108
+ " }\n",
1109
+ " },\n",
1110
+ " {\n",
1111
+ " \"index\": 23,\n",
1112
+ " \"finish_reason\": \"stop\",\n",
1113
+ " \"message\": {\n",
1114
+ " \"role\": \"assistant\",\n",
1115
+ " \"content\": \"Starting with the first equation, we have $3+a=4-b$. Rearranging this equation, we get $a=-b+1$. \\n\\nSubstituting this expression for $a$ into the second equation, we have $4+b=7+(-b+1)$. Simplifying this equation gives $b=-2$. \\n\\nSubstituting this value of $b$ into the equation $a=-b+1$, we find $a=3$. \\n\\nFinally, we can evaluate $3-a$ using the value we found for $a$. We have $3-a=3-3=\\\\boxed{0}$.\"\n",
1116
+ " },\n",
1117
+ " \"content_filter_results\": {\n",
1118
+ " \"hate\": {\n",
1119
+ " \"filtered\": false,\n",
1120
+ " \"severity\": \"safe\"\n",
1121
+ " },\n",
1122
+ " \"self_harm\": {\n",
1123
+ " \"filtered\": false,\n",
1124
+ " \"severity\": \"safe\"\n",
1125
+ " },\n",
1126
+ " \"sexual\": {\n",
1127
+ " \"filtered\": false,\n",
1128
+ " \"severity\": \"safe\"\n",
1129
+ " },\n",
1130
+ " \"violence\": {\n",
1131
+ " \"filtered\": false,\n",
1132
+ " \"severity\": \"safe\"\n",
1133
+ " }\n",
1134
+ " }\n",
1135
+ " },\n",
1136
+ " {\n",
1137
+ " \"index\": 24,\n",
1138
+ " \"finish_reason\": \"stop\",\n",
1139
+ " \"message\": {\n",
1140
+ " \"role\": \"assistant\",\n",
1141
+ " \"content\": \"Adding the two given equations, we have $(3+a)+(4+b)=(4-b)+(7+a)$. Simplifying both sides gives $7+a+b=11+a-b$. Subtracting $a$ from both sides gives $7+b=11-b$. Adding $b$ to both sides gives $7+2b=11$. Subtracting $7$ from both sides gives $2b=4$. Dividing both sides by $2$ gives $b=2$. Substituting this value of $b$ into the first given equation, we have $3+a=4-2$, or $a=-1$. Finally, substituting these values into $3-a$, we have $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1142
+ " },\n",
1143
+ " \"content_filter_results\": {\n",
1144
+ " \"hate\": {\n",
1145
+ " \"filtered\": false,\n",
1146
+ " \"severity\": \"safe\"\n",
1147
+ " },\n",
1148
+ " \"self_harm\": {\n",
1149
+ " \"filtered\": false,\n",
1150
+ " \"severity\": \"safe\"\n",
1151
+ " },\n",
1152
+ " \"sexual\": {\n",
1153
+ " \"filtered\": false,\n",
1154
+ " \"severity\": \"safe\"\n",
1155
+ " },\n",
1156
+ " \"violence\": {\n",
1157
+ " \"filtered\": false,\n",
1158
+ " \"severity\": \"safe\"\n",
1159
+ " }\n",
1160
+ " }\n",
1161
+ " },\n",
1162
+ " {\n",
1163
+ " \"index\": 25,\n",
1164
+ " \"finish_reason\": \"stop\",\n",
1165
+ " \"message\": {\n",
1166
+ " \"role\": \"assistant\",\n",
1167
+ " \"content\": \"From the first equation, $3+a=4-b$, we can subtract $3$ from both sides to find that $a=1-b$. Substituting this into the second equation, we have $4+b=7+(1-b)$. Expanding the right side gives $4+b=7+1-b$. We can then simplify this to $b+4=8-b$ by combining like terms. Adding $b$ to both sides gives $2b+4=8$. Subtracting $4$ from both sides gives $2b=4$. Finally, dividing both sides by $2$ gives $b=2$. \\n\\nNow that we have found the value of $b$, we can substitute it back into the first equation to find $3+a=4-2$. Simplifying gives $3+a=2$. Subtracting $3$ from both sides gives $a=-1$. \\n\\nFinally, we can find $3-a$ by subtracting $-1$ from $3$. This gives $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1168
+ " },\n",
1169
+ " \"content_filter_results\": {\n",
1170
+ " \"hate\": {\n",
1171
+ " \"filtered\": false,\n",
1172
+ " \"severity\": \"safe\"\n",
1173
+ " },\n",
1174
+ " \"self_harm\": {\n",
1175
+ " \"filtered\": false,\n",
1176
+ " \"severity\": \"safe\"\n",
1177
+ " },\n",
1178
+ " \"sexual\": {\n",
1179
+ " \"filtered\": false,\n",
1180
+ " \"severity\": \"safe\"\n",
1181
+ " },\n",
1182
+ " \"violence\": {\n",
1183
+ " \"filtered\": false,\n",
1184
+ " \"severity\": \"safe\"\n",
1185
+ " }\n",
1186
+ " }\n",
1187
+ " },\n",
1188
+ " {\n",
1189
+ " \"index\": 26,\n",
1190
+ " \"finish_reason\": \"stop\",\n",
1191
+ " \"message\": {\n",
1192
+ " \"role\": \"assistant\",\n",
1193
+ " \"content\": \"We can start by solving the first equation, $3+a=4-b$, for $a$. Subtracting 3 from both sides gives $a=1-b$. \\n\\nWe can substitute this value for $a$ into the second equation $4+b=7+a$. Substituting gives $4+b=7+(1-b)$. Expanding the right side gives $4+b=7+1-b$. Combining like terms gives $4+b=8-b$. \\n\\nAdding $b$ to both sides gives $4=8-2b$. Subtracting 8 from both sides gives $-4=-2b$. Dividing both sides by $-2$ gives $2=b$.\\n\\nNow that we know $b=2$, we can substitute this value back into the first equation to find $a$. Substituting gives $3+a=4-2$. Simplifying gives $3+a=2$. Subtracting 3 from both sides gives $a=-1$.\\n\\nFinally, we can find $3-a$ by substituting the value of $a$ we found into $3-a$. Substituting gives $3-(-1)$. Simplifying gives $3+1=4$.\\n\\nThus, $3-a=\\\\boxed{4}$.\"\n",
1194
+ " },\n",
1195
+ " \"content_filter_results\": {\n",
1196
+ " \"hate\": {\n",
1197
+ " \"filtered\": false,\n",
1198
+ " \"severity\": \"safe\"\n",
1199
+ " },\n",
1200
+ " \"self_harm\": {\n",
1201
+ " \"filtered\": false,\n",
1202
+ " \"severity\": \"safe\"\n",
1203
+ " },\n",
1204
+ " \"sexual\": {\n",
1205
+ " \"filtered\": false,\n",
1206
+ " \"severity\": \"safe\"\n",
1207
+ " },\n",
1208
+ " \"violence\": {\n",
1209
+ " \"filtered\": false,\n",
1210
+ " \"severity\": \"safe\"\n",
1211
+ " }\n",
1212
+ " }\n",
1213
+ " },\n",
1214
+ " {\n",
1215
+ " \"index\": 27,\n",
1216
+ " \"finish_reason\": \"stop\",\n",
1217
+ " \"message\": {\n",
1218
+ " \"role\": \"assistant\",\n",
1219
+ " \"content\": \"Starting with the first equation, we can subtract $a$ from both sides to isolate $3$:\\n\\n\\\\[3+a-a=4-b-a\\\\qquad\\\\Rightarrow\\\\qquad 3=4-b-a.\\\\]\\n\\nRearranging the terms, we have $b+a=4-3=1.$ Similarly, starting with the second equation, we can subtract $b$ from both sides to obtain $a+b=7-4=3.$ Adding these two equations, we have $2a+2b=4,$ so $a+b=2.$ Subtracting this equation from $b+a=1,$ we get $b-a=\\\\boxed{-1}.$\"\n",
1220
+ " },\n",
1221
+ " \"content_filter_results\": {\n",
1222
+ " \"hate\": {\n",
1223
+ " \"filtered\": false,\n",
1224
+ " \"severity\": \"safe\"\n",
1225
+ " },\n",
1226
+ " \"self_harm\": {\n",
1227
+ " \"filtered\": false,\n",
1228
+ " \"severity\": \"safe\"\n",
1229
+ " },\n",
1230
+ " \"sexual\": {\n",
1231
+ " \"filtered\": false,\n",
1232
+ " \"severity\": \"safe\"\n",
1233
+ " },\n",
1234
+ " \"violence\": {\n",
1235
+ " \"filtered\": false,\n",
1236
+ " \"severity\": \"safe\"\n",
1237
+ " }\n",
1238
+ " }\n",
1239
+ " },\n",
1240
+ " {\n",
1241
+ " \"index\": 28,\n",
1242
+ " \"finish_reason\": \"stop\",\n",
1243
+ " \"message\": {\n",
1244
+ " \"role\": \"assistant\",\n",
1245
+ " \"content\": \"From the first equation, we have $a=4-b-3=-b+1$. Substituting this into the second equation, we get $4+b=7+(-b+1)$, which simplifies to $4+b=8-b$. Solving for $b$, we find $2b=4$, so $b=2$. Substituting this back into $a=-b+1$, we find $a=-2+1=-1$. Finally, $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1246
+ " },\n",
1247
+ " \"content_filter_results\": {\n",
1248
+ " \"hate\": {\n",
1249
+ " \"filtered\": false,\n",
1250
+ " \"severity\": \"safe\"\n",
1251
+ " },\n",
1252
+ " \"self_harm\": {\n",
1253
+ " \"filtered\": false,\n",
1254
+ " \"severity\": \"safe\"\n",
1255
+ " },\n",
1256
+ " \"sexual\": {\n",
1257
+ " \"filtered\": false,\n",
1258
+ " \"severity\": \"safe\"\n",
1259
+ " },\n",
1260
+ " \"violence\": {\n",
1261
+ " \"filtered\": false,\n",
1262
+ " \"severity\": \"safe\"\n",
1263
+ " }\n",
1264
+ " }\n",
1265
+ " },\n",
1266
+ " {\n",
1267
+ " \"index\": 29,\n",
1268
+ " \"finish_reason\": \"stop\",\n",
1269
+ " \"message\": {\n",
1270
+ " \"role\": \"assistant\",\n",
1271
+ " \"content\": \"We start by solving the first equation for $a$ by subtracting 3 from both sides to find $a=1-b$. Substituting this into the second equation gives $4+b=7+(1-b)$. Expanding the brackets gives $4+b=7+1-b$. Simplifying gives $2b=4$, so $b=2$. Substituting this into $a=1-b$ gives $a=1-2=-1$. Finally, substituting this into $3-a$ gives $3-(-1)=4$, so our final answer is $\\\\boxed{4}$.\"\n",
1272
+ " },\n",
1273
+ " \"content_filter_results\": {\n",
1274
+ " \"hate\": {\n",
1275
+ " \"filtered\": false,\n",
1276
+ " \"severity\": \"safe\"\n",
1277
+ " },\n",
1278
+ " \"self_harm\": {\n",
1279
+ " \"filtered\": false,\n",
1280
+ " \"severity\": \"safe\"\n",
1281
+ " },\n",
1282
+ " \"sexual\": {\n",
1283
+ " \"filtered\": false,\n",
1284
+ " \"severity\": \"safe\"\n",
1285
+ " },\n",
1286
+ " \"violence\": {\n",
1287
+ " \"filtered\": false,\n",
1288
+ " \"severity\": \"safe\"\n",
1289
+ " }\n",
1290
+ " }\n",
1291
+ " },\n",
1292
+ " {\n",
1293
+ " \"index\": 30,\n",
1294
+ " \"finish_reason\": \"stop\",\n",
1295
+ " \"message\": {\n",
1296
+ " \"role\": \"assistant\",\n",
1297
+ " \"content\": \"Adding the two given equations, we have $(3+a)+(4+b)=(4-b)+(7+a)$. Applying the commutative property of addition, we can rearrange the terms to get $(3+4)+(a+b)=(4+7)+(-b+a)$. Simplifying both sides gives $7+(a+b)=11+(a-b)$. We can rewrite this equation as $a+b+7=a-b+11$. Subtracting $a+b+4$ from both sides yields $7-4=a-b+11-(a+b+4)$. Simplifying gives $3=-4-b$. Adding $b$ to both sides gives $3+b=-4$. Subtracting 4 from both sides yields $b-1=-4$. Then, adding 1 to both sides gives $b=-3$. Substituting this into the first equation $3+a=4-b$, we can substitute $-3$ for $b$ to get $3+a=4-(-3)$. Simplifying gives $3+a=4+3$. Subtracting 3 from both sides yields $a=4$. Finally, substituting this into $3-a$, we can substitute $4$ for $a$ to get $3-4=\\\\boxed{-1}$.\"\n",
1298
+ " },\n",
1299
+ " \"content_filter_results\": {\n",
1300
+ " \"hate\": {\n",
1301
+ " \"filtered\": false,\n",
1302
+ " \"severity\": \"safe\"\n",
1303
+ " },\n",
1304
+ " \"self_harm\": {\n",
1305
+ " \"filtered\": false,\n",
1306
+ " \"severity\": \"safe\"\n",
1307
+ " },\n",
1308
+ " \"sexual\": {\n",
1309
+ " \"filtered\": false,\n",
1310
+ " \"severity\": \"safe\"\n",
1311
+ " },\n",
1312
+ " \"violence\": {\n",
1313
+ " \"filtered\": false,\n",
1314
+ " \"severity\": \"safe\"\n",
1315
+ " }\n",
1316
+ " }\n",
1317
+ " },\n",
1318
+ " {\n",
1319
+ " \"index\": 31,\n",
1320
+ " \"finish_reason\": \"stop\",\n",
1321
+ " \"message\": {\n",
1322
+ " \"role\": \"assistant\",\n",
1323
+ " \"content\": \"From the first equation, we have $a=4-b-3=1-b$. Substituting this into the second equation gives $4+b=7+(1-b)$. Expanding the right side gives $4+b=7+1-b$. Simplifying the right side gives $4+b=8-b$. Adding $b$ to both sides gives $4+2b=8$. Subtracting 4 from both sides gives $2b=4$. Dividing both sides by 2 gives $b=2$. Substituting this into the first equation gives $3+a=4-2$. Simplifying the right side gives $3+a=2$. Subtracting 3 from both sides gives $a=-1$. Finally, we have $3-a=3-(-1)=\\\\boxed{4}$.\"\n",
1324
+ " },\n",
1325
+ " \"content_filter_results\": {\n",
1326
+ " \"hate\": {\n",
1327
+ " \"filtered\": false,\n",
1328
+ " \"severity\": \"safe\"\n",
1329
+ " },\n",
1330
+ " \"self_harm\": {\n",
1331
+ " \"filtered\": false,\n",
1332
+ " \"severity\": \"safe\"\n",
1333
+ " },\n",
1334
+ " \"sexual\": {\n",
1335
+ " \"filtered\": false,\n",
1336
+ " \"severity\": \"safe\"\n",
1337
+ " },\n",
1338
+ " \"violence\": {\n",
1339
+ " \"filtered\": false,\n",
1340
+ " \"severity\": \"safe\"\n",
1341
+ " }\n",
1342
+ " }\n",
1343
+ " },\n",
1344
+ " {\n",
1345
+ " \"index\": 32,\n",
1346
+ " \"finish_reason\": \"stop\",\n",
1347
+ " \"message\": {\n",
1348
+ " \"role\": \"assistant\",\n",
1349
+ " \"content\": \"We start by solving the first equation for $a$: \\\\begin{align*}\\n3+a&=4-b \\\\\\\\\\na&=1-b.\\n\\\\end{align*}We substitute this expression for $a$ into the second equation: \\\\begin{align*}\\n4+b&=7+a \\\\\\\\\\n4+b&=7+(1-b) \\\\\\\\\\n4+b&=8-b.\\n\\\\end{align*}Adding $b$ to both sides gives $4+2b=8$, so $2b=4$ and $b=2$. Substituting this value into $a=1-b$ gives $a=1-2=-1$. Therefore, $3-a=3-(-1)=3+1=4$. Thus, $3-a=\\\\boxed{4}$.\"\n",
1350
+ " },\n",
1351
+ " \"content_filter_results\": {\n",
1352
+ " \"hate\": {\n",
1353
+ " \"filtered\": false,\n",
1354
+ " \"severity\": \"safe\"\n",
1355
+ " },\n",
1356
+ " \"self_harm\": {\n",
1357
+ " \"filtered\": false,\n",
1358
+ " \"severity\": \"safe\"\n",
1359
+ " },\n",
1360
+ " \"sexual\": {\n",
1361
+ " \"filtered\": false,\n",
1362
+ " \"severity\": \"safe\"\n",
1363
+ " },\n",
1364
+ " \"violence\": {\n",
1365
+ " \"filtered\": false,\n",
1366
+ " \"severity\": \"safe\"\n",
1367
+ " }\n",
1368
+ " }\n",
1369
+ " },\n",
1370
+ " {\n",
1371
+ " \"index\": 33,\n",
1372
+ " \"finish_reason\": \"stop\",\n",
1373
+ " \"message\": {\n",
1374
+ " \"role\": \"assistant\",\n",
1375
+ " \"content\": \"We start by solving the first equation, $3+a=4-b$, for $a$. Subtracting $3$ from both sides gives $a=1-b$.\\n\\nWe can substitute this expression for $a$ into the second equation, $4+b=7+a$. Substituting $1-b$ for $a$ gives $4+b=7+(1-b)$.\\n\\nExpanding the parentheses gives $4+b=7+1-b$. Simplifying the right side gives $4+b=8-b$.\\n\\nAdding $b$ to both sides gives $4+2b=8$. Subtracting $4$ from both sides gives $2b=4$.\\n\\nFinally, dividing both sides by $2$ gives $b=2$. We can substitute this value back into the equation $a=1-b$ to find $a=1-2=-1$.\\n\\nNow, we can find $3-a$ by subtracting $(-1)$ from $3$. This gives $3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1376
+ " },\n",
1377
+ " \"content_filter_results\": {\n",
1378
+ " \"hate\": {\n",
1379
+ " \"filtered\": false,\n",
1380
+ " \"severity\": \"safe\"\n",
1381
+ " },\n",
1382
+ " \"self_harm\": {\n",
1383
+ " \"filtered\": false,\n",
1384
+ " \"severity\": \"safe\"\n",
1385
+ " },\n",
1386
+ " \"sexual\": {\n",
1387
+ " \"filtered\": false,\n",
1388
+ " \"severity\": \"safe\"\n",
1389
+ " },\n",
1390
+ " \"violence\": {\n",
1391
+ " \"filtered\": false,\n",
1392
+ " \"severity\": \"safe\"\n",
1393
+ " }\n",
1394
+ " }\n",
1395
+ " },\n",
1396
+ " {\n",
1397
+ " \"index\": 34,\n",
1398
+ " \"finish_reason\": \"stop\",\n",
1399
+ " \"message\": {\n",
1400
+ " \"role\": \"assistant\",\n",
1401
+ " \"content\": \"Combining the two equations, we have $3+a+4+b=4-b+7+a$. Simplifying both sides, we have $7+a+b=11+a-b$. Subtracting $a$ from both sides, we have $7+b=11-b$. Adding $b$ to both sides, we have $7+2b=11$. Subtracting 7 from both sides, we have $2b=4$. Dividing both sides by 2, we have $b=2$. Substituting this value back into the first equation, we have $3+a=4-2$, so $3+a=2$. Subtracting 3 from both sides, we have $a=-1$. Therefore, $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1402
+ " },\n",
1403
+ " \"content_filter_results\": {\n",
1404
+ " \"hate\": {\n",
1405
+ " \"filtered\": false,\n",
1406
+ " \"severity\": \"safe\"\n",
1407
+ " },\n",
1408
+ " \"self_harm\": {\n",
1409
+ " \"filtered\": false,\n",
1410
+ " \"severity\": \"safe\"\n",
1411
+ " },\n",
1412
+ " \"sexual\": {\n",
1413
+ " \"filtered\": false,\n",
1414
+ " \"severity\": \"safe\"\n",
1415
+ " },\n",
1416
+ " \"violence\": {\n",
1417
+ " \"filtered\": false,\n",
1418
+ " \"severity\": \"safe\"\n",
1419
+ " }\n",
1420
+ " }\n",
1421
+ " },\n",
1422
+ " {\n",
1423
+ " \"index\": 35,\n",
1424
+ " \"finish_reason\": \"stop\",\n",
1425
+ " \"message\": {\n",
1426
+ " \"role\": \"assistant\",\n",
1427
+ " \"content\": \"We can start by subtracting $a$ from both sides of the first equation to find that $3 = 4 - b - a$. Similarly, we can subtract $a$ from both sides of the second equation to find that $4 = 7 + a - b$. Adding these two equations gives $3 + 4 = 4 - b - a + 7 + a - b$, which simplifies to $7 = 11 - 2b$. Solving for $b$ gives $b = 2$. Substituting this value into the first equation gives $3 + a = 4 - 2$, so $a = 3$. Then $3 - a = 3 - 3 = \\\\boxed{0}$.\"\n",
1428
+ " },\n",
1429
+ " \"content_filter_results\": {\n",
1430
+ " \"hate\": {\n",
1431
+ " \"filtered\": false,\n",
1432
+ " \"severity\": \"safe\"\n",
1433
+ " },\n",
1434
+ " \"self_harm\": {\n",
1435
+ " \"filtered\": false,\n",
1436
+ " \"severity\": \"safe\"\n",
1437
+ " },\n",
1438
+ " \"sexual\": {\n",
1439
+ " \"filtered\": false,\n",
1440
+ " \"severity\": \"safe\"\n",
1441
+ " },\n",
1442
+ " \"violence\": {\n",
1443
+ " \"filtered\": false,\n",
1444
+ " \"severity\": \"safe\"\n",
1445
+ " }\n",
1446
+ " }\n",
1447
+ " },\n",
1448
+ " {\n",
1449
+ " \"index\": 36,\n",
1450
+ " \"finish_reason\": \"stop\",\n",
1451
+ " \"message\": {\n",
1452
+ " \"role\": \"assistant\",\n",
1453
+ " \"content\": \"Adding the two given equations, we have $$(3+a)+(4+b)=(4-b)+(7+a).$$Expanding both sides gives $$7+a+b=11+a-b.$$We can then cancel out the $a$ term on both sides to get $$7+b=11-b.$$Adding $b$ to both sides gives $$7+2b=11.$$Subtracting $7$ from both sides gives $$2b=4.$$Dividing both sides by $2$ gives $$b=2.$$Plugging this value of $b$ into either of the original equations, we can solve for $a$. Using the first equation, we have $$3+a=4-2 \\\\Rightarrow a=-1.$$Finally, we can find $3-a$ as $$3-a=3-(-1)=3+1=\\\\boxed{4}.$$\"\n",
1454
+ " },\n",
1455
+ " \"content_filter_results\": {\n",
1456
+ " \"hate\": {\n",
1457
+ " \"filtered\": false,\n",
1458
+ " \"severity\": \"safe\"\n",
1459
+ " },\n",
1460
+ " \"self_harm\": {\n",
1461
+ " \"filtered\": false,\n",
1462
+ " \"severity\": \"safe\"\n",
1463
+ " },\n",
1464
+ " \"sexual\": {\n",
1465
+ " \"filtered\": false,\n",
1466
+ " \"severity\": \"safe\"\n",
1467
+ " },\n",
1468
+ " \"violence\": {\n",
1469
+ " \"filtered\": false,\n",
1470
+ " \"severity\": \"safe\"\n",
1471
+ " }\n",
1472
+ " }\n",
1473
+ " },\n",
1474
+ " {\n",
1475
+ " \"index\": 37,\n",
1476
+ " \"finish_reason\": \"stop\",\n",
1477
+ " \"message\": {\n",
1478
+ " \"role\": \"assistant\",\n",
1479
+ " \"content\": \"We can start by adding $a$ to both sides of the first equation and subtracting $b$ from both sides of the second equation to obtain \\\\begin{align*}\\na+b&=1, \\\\\\\\\\na-b&=-3.\\n\\\\end{align*} We can then add these equations to eliminate $b$: $$2a=1+(-3)=-2.$$Dividing both sides by $2$ gives $a=-1$. Substituting into the second equation gives $-1-b=-3$, so $b=2$. Finally, we find that $3-a=3-(-1)=\\\\boxed{4}$.\"\n",
1480
+ " },\n",
1481
+ " \"content_filter_results\": {\n",
1482
+ " \"hate\": {\n",
1483
+ " \"filtered\": false,\n",
1484
+ " \"severity\": \"safe\"\n",
1485
+ " },\n",
1486
+ " \"self_harm\": {\n",
1487
+ " \"filtered\": false,\n",
1488
+ " \"severity\": \"safe\"\n",
1489
+ " },\n",
1490
+ " \"sexual\": {\n",
1491
+ " \"filtered\": false,\n",
1492
+ " \"severity\": \"safe\"\n",
1493
+ " },\n",
1494
+ " \"violence\": {\n",
1495
+ " \"filtered\": false,\n",
1496
+ " \"severity\": \"safe\"\n",
1497
+ " }\n",
1498
+ " }\n",
1499
+ " },\n",
1500
+ " {\n",
1501
+ " \"index\": 38,\n",
1502
+ " \"finish_reason\": \"stop\",\n",
1503
+ " \"message\": {\n",
1504
+ " \"role\": \"assistant\",\n",
1505
+ " \"content\": \"We can start by subtracting $a$ from both sides of the first equation and subtracting $b$ from both sides of the second equation to obtain \\\\begin{align*}\\n3&=4-b-a,\\\\\\\\\\n4&=7+a-b.\\n\\\\end{align*}We can rearrange the first equation to get $b+a=4-3=1$. Similarly, we can rearrange the second equation to get $a-b=4-7=-3$. Adding these equations, we find that $(b+a)+(a-b)=1+(-3)$, which implies $2a= -2$. Hence, $a=-1$. We can substitute this value of $a$ into $a-b=-3$ to find that $-1-b=-3$, so $b=-1-(-3)=2$. Finally, we have \\\\begin{align*}\\n3-a&=3-(-1)=3+1=\\\\boxed{4}.\\n\\\\end{align*}\"\n",
1506
+ " },\n",
1507
+ " \"content_filter_results\": {\n",
1508
+ " \"hate\": {\n",
1509
+ " \"filtered\": false,\n",
1510
+ " \"severity\": \"safe\"\n",
1511
+ " },\n",
1512
+ " \"self_harm\": {\n",
1513
+ " \"filtered\": false,\n",
1514
+ " \"severity\": \"safe\"\n",
1515
+ " },\n",
1516
+ " \"sexual\": {\n",
1517
+ " \"filtered\": false,\n",
1518
+ " \"severity\": \"safe\"\n",
1519
+ " },\n",
1520
+ " \"violence\": {\n",
1521
+ " \"filtered\": false,\n",
1522
+ " \"severity\": \"safe\"\n",
1523
+ " }\n",
1524
+ " }\n",
1525
+ " },\n",
1526
+ " {\n",
1527
+ " \"index\": 39,\n",
1528
+ " \"finish_reason\": \"stop\",\n",
1529
+ " \"message\": {\n",
1530
+ " \"role\": \"assistant\",\n",
1531
+ " \"content\": \"Adding the two given equations, we have $$(3+a)+(4+b)=(4-b)+(7+a).$$Simplifying both sides gives $7+a+b=11+a-b$. Subtracting $a$ and $b$ from both sides gives $7=11$, which is a contradiction. Therefore, there are no solutions to the given equations, and the value of $3-a$ is undefined. So we have $3-a=\\\\boxed{ \\\\text{undefined}}$.\"\n",
1532
+ " },\n",
1533
+ " \"content_filter_results\": {\n",
1534
+ " \"hate\": {\n",
1535
+ " \"filtered\": false,\n",
1536
+ " \"severity\": \"safe\"\n",
1537
+ " },\n",
1538
+ " \"self_harm\": {\n",
1539
+ " \"filtered\": false,\n",
1540
+ " \"severity\": \"safe\"\n",
1541
+ " },\n",
1542
+ " \"sexual\": {\n",
1543
+ " \"filtered\": false,\n",
1544
+ " \"severity\": \"safe\"\n",
1545
+ " },\n",
1546
+ " \"violence\": {\n",
1547
+ " \"filtered\": false,\n",
1548
+ " \"severity\": \"safe\"\n",
1549
+ " }\n",
1550
+ " }\n",
1551
+ " },\n",
1552
+ " {\n",
1553
+ " \"index\": 40,\n",
1554
+ " \"finish_reason\": \"stop\",\n",
1555
+ " \"message\": {\n",
1556
+ " \"role\": \"assistant\",\n",
1557
+ " \"content\": \"To solve this problem, we can start by isolating $a$ in both equations. \\n\\nFrom the first equation, $3+a=4-b$, we can subtract 3 from both sides to get $a=1-b$. \\n\\nFrom the second equation, $4+b=7+a$, we can subtract 4 from both sides to get $b=3+a$. \\n\\nNow, we can substitute $1-b$ for $a$ in the second equation to get $b=3+1-b$. Simplifying this equation gives $2b=4$, so $b=2$. \\n\\nSubstituting $b=2$ into the equation $a=1-b$, we find that $a=1-2=-1$. \\n\\nFinally, we can find $3-a$ by subtracting $(-1)$ from 3, which gives us $3-(-1)=\\\\boxed{4}$.\"\n",
1558
+ " },\n",
1559
+ " \"content_filter_results\": {\n",
1560
+ " \"hate\": {\n",
1561
+ " \"filtered\": false,\n",
1562
+ " \"severity\": \"safe\"\n",
1563
+ " },\n",
1564
+ " \"self_harm\": {\n",
1565
+ " \"filtered\": false,\n",
1566
+ " \"severity\": \"safe\"\n",
1567
+ " },\n",
1568
+ " \"sexual\": {\n",
1569
+ " \"filtered\": false,\n",
1570
+ " \"severity\": \"safe\"\n",
1571
+ " },\n",
1572
+ " \"violence\": {\n",
1573
+ " \"filtered\": false,\n",
1574
+ " \"severity\": \"safe\"\n",
1575
+ " }\n",
1576
+ " }\n",
1577
+ " },\n",
1578
+ " {\n",
1579
+ " \"index\": 41,\n",
1580
+ " \"finish_reason\": \"stop\",\n",
1581
+ " \"message\": {\n",
1582
+ " \"role\": \"assistant\",\n",
1583
+ " \"content\": \"We start by solving the first equation for $a$ in terms of $b$. Subtracting $3$ from both sides, we have $a=1-b$. Substituting this into the second equation, we get $4+b=7+(1-b)$. Simplifying, we have $4+b=8-b$. Adding $b$ to both sides, we have $4+2b=8$. Subtracting $4$ from both sides, we have $2b=4$. Dividing both sides by $2$, we have $b=2$. Substituting this into the equation $a=1-b$, we have $a=1-2=-1$. Thus, $3-a=3-(-1)=3+1=\\\\boxed{4}$.\"\n",
1584
+ " },\n",
1585
+ " \"content_filter_results\": {\n",
1586
+ " \"hate\": {\n",
1587
+ " \"filtered\": false,\n",
1588
+ " \"severity\": \"safe\"\n",
1589
+ " },\n",
1590
+ " \"self_harm\": {\n",
1591
+ " \"filtered\": false,\n",
1592
+ " \"severity\": \"safe\"\n",
1593
+ " },\n",
1594
+ " \"sexual\": {\n",
1595
+ " \"filtered\": false,\n",
1596
+ " \"severity\": \"safe\"\n",
1597
+ " },\n",
1598
+ " \"violence\": {\n",
1599
+ " \"filtered\": false,\n",
1600
+ " \"severity\": \"safe\"\n",
1601
+ " }\n",
1602
+ " }\n",
1603
+ " },\n",
1604
+ " {\n",
1605
+ " \"index\": 42,\n",
1606
+ " \"finish_reason\": \"stop\",\n",
1607
+ " \"message\": {\n",
1608
+ " \"role\": \"assistant\",\n",
1609
+ " \"content\": \"From the first equation, we have $3+a=4-b$. Rearranging this equation, we get $a=-1-b$. Substituting this value of $a$ into the second equation, we have $4+b=7+(-1-b)$. Simplifying this equation, we get $b+1=-b+6$. Adding $b$ to both sides and subtracting $1$ from both sides, we have $2b=5$. Therefore, $b=\\\\frac{5}{2}$. Substituting this value of $b$ into the first equation, we have $3+a=4-\\\\frac{5}{2}$. Simplifying this equation, we get $a=\\\\frac{3}{2}$. Finally, we have $3-a=3-\\\\frac{3}{2}=\\\\boxed{\\\\frac{3}{2}}$.\"\n",
1610
+ " },\n",
1611
+ " \"content_filter_results\": {\n",
1612
+ " \"hate\": {\n",
1613
+ " \"filtered\": false,\n",
1614
+ " \"severity\": \"safe\"\n",
1615
+ " },\n",
1616
+ " \"self_harm\": {\n",
1617
+ " \"filtered\": false,\n",
1618
+ " \"severity\": \"safe\"\n",
1619
+ " },\n",
1620
+ " \"sexual\": {\n",
1621
+ " \"filtered\": false,\n",
1622
+ " \"severity\": \"safe\"\n",
1623
+ " },\n",
1624
+ " \"violence\": {\n",
1625
+ " \"filtered\": false,\n",
1626
+ " \"severity\": \"safe\"\n",
1627
+ " }\n",
1628
+ " }\n",
1629
+ " },\n",
1630
+ " {\n",
1631
+ " \"index\": 43,\n",
1632
+ " \"finish_reason\": \"stop\",\n",
1633
+ " \"message\": {\n",
1634
+ " \"role\": \"assistant\",\n",
1635
+ " \"content\": \"Manipulating the given equations, we have $a=1-b$ and $b=3+a$. Substituting $a=1-b$ into the second equation, we get $b=3+1-b$, which implies $2b=4$, or $b=2$. Substituting $b=2$ into the first equation, we get $a=1-2=-1$. Finally, evaluating $3-a$, we find that $3-a=3-(-1)=\\\\boxed{4}$.\"\n",
1636
+ " },\n",
1637
+ " \"content_filter_results\": {\n",
1638
+ " \"hate\": {\n",
1639
+ " \"filtered\": false,\n",
1640
+ " \"severity\": \"safe\"\n",
1641
+ " },\n",
1642
+ " \"self_harm\": {\n",
1643
+ " \"filtered\": false,\n",
1644
+ " \"severity\": \"safe\"\n",
1645
+ " },\n",
1646
+ " \"sexual\": {\n",
1647
+ " \"filtered\": false,\n",
1648
+ " \"severity\": \"safe\"\n",
1649
+ " },\n",
1650
+ " \"violence\": {\n",
1651
+ " \"filtered\": false,\n",
1652
+ " \"severity\": \"safe\"\n",
1653
+ " }\n",
1654
+ " }\n",
1655
+ " }\n",
1656
+ " ],\n",
1657
+ " \"usage\": {\n",
1658
+ " \"completion_tokens\": 7178,\n",
1659
+ " \"prompt_tokens\": 52,\n",
1660
+ " \"total_tokens\": 7230\n",
1661
+ " },\n",
1662
+ " \"cost\": 0.01446,\n",
1663
+ " \"config_id\": 0,\n",
1664
+ " \"pass_filter\": true\n",
1665
+ "}\n",
1666
+ "metric_results on the example data instance: {'expected_success': 1.0, 'success': True, 'success_vote': 1.0, 'voted_answer': 'We start by solving the first equation for $a$: $$3+a=4-b.$$Adding $-3$ to both sides gives $a=1-b$. Substituting this expression for $a$ into the second equation gives $$4+b=7+(1-b).$$Simplifying this expression, we find that $b=2$. Substituting $b=2$ into the first equation to solve for $a$, we find that $a=1-2=-1$. Finally, we have $3-a=3-(-1)=3+1=\\\\boxed{4}$.', 'votes': 27}\n"
1667
+ ]
1668
+ }
1669
+ ],
1670
+ "source": [
1671
+ "response = autogen.ChatCompletion.create(context=tune_data[1], config_list=config_list, **config)\n",
1672
+ "metric_results = eval_math_responses(autogen.ChatCompletion.extract_text(response), **tune_data[1])\n",
1673
+ "print(\"response on an example data instance:\", response)\n",
1674
+ "print(\"metric_results on the example data instance:\", metric_results)\n"
1675
+ ]
1676
+ },
1677
+ {
1678
+ "attachments": {},
1679
+ "cell_type": "markdown",
1680
+ "metadata": {},
1681
+ "source": [
1682
+ "### Evaluate the success rate on the test data\n",
1683
+ "\n",
1684
+ "You can use `autogen.ChatCompletion.test` to evaluate the performance of an entire dataset with the tuned config. The following code will take a while (30 mins to 1 hour) to evaluate all the test data instances if uncommented and run. It will cost roughly $3. "
1685
+ ]
1686
+ },
1687
+ {
1688
+ "cell_type": "code",
1689
+ "execution_count": null,
1690
+ "metadata": {
1691
+ "execution": {
1692
+ "iopub.execute_input": "2023-02-13T23:41:56.042764Z",
1693
+ "iopub.status.busy": "2023-02-13T23:41:56.042086Z",
1694
+ "iopub.status.idle": "2023-02-13T23:53:05.597643Z",
1695
+ "shell.execute_reply": "2023-02-13T23:53:05.596603Z"
1696
+ }
1697
+ },
1698
+ "outputs": [],
1699
+ "source": [
1700
+ "# result = autogen.ChatCompletion.test(test_data, logging_level=logging.INFO, config_list=config_list, **config)\n",
1701
+ "# print(\"performance on test data with the tuned config:\", result)"
1702
+ ]
1703
+ },
1704
+ {
1705
+ "attachments": {},
1706
+ "cell_type": "markdown",
1707
+ "metadata": {},
1708
+ "source": [
1709
+ "What about the default, untuned gpt-4 config (with the same prompt as the tuned config)? We can evaluate it and compare:"
1710
+ ]
1711
+ },
1712
+ {
1713
+ "cell_type": "code",
1714
+ "execution_count": 16,
1715
+ "metadata": {},
1716
+ "outputs": [
1717
+ {
1718
+ "name": "stdout",
1719
+ "output_type": "stream",
1720
+ "text": [
1721
+ "performance on test data from gpt-4 with a default config: {'expected_success': 0.6965174129353234, 'success': 0.6965174129353234, 'success_vote': 0.6965174129353234, 'votes': 1.0, 'cost': 1.9264799999999993, 'inference_cost': 0.009584477611940295}\n"
1722
+ ]
1723
+ }
1724
+ ],
1725
+ "source": [
1726
+ "# the following code will cost roughly $2 if uncommented and run.\n",
1727
+ "\n",
1728
+ "# default_config = {\"model\": 'gpt-4', \"prompt\": prompts[0], \"allow_format_str_template\": True}\n",
1729
+ "# default_result = autogen.ChatCompletion.test(test_data, config_list=config_list, **default_config)\n",
1730
+ "# print(\"performance on test data from gpt-4 with a default config:\", default_result)"
1731
+ ]
1732
+ },
1733
+ {
1734
+ "cell_type": "code",
1735
+ "execution_count": 17,
1736
+ "metadata": {},
1737
+ "outputs": [
1738
+ {
1739
+ "name": "stdout",
1740
+ "output_type": "stream",
1741
+ "text": [
1742
+ "tuned config succeeds in 90.5% test cases\n",
1743
+ "untuned config succeeds in 69.7% test cases\n"
1744
+ ]
1745
+ }
1746
+ ],
1747
+ "source": [
1748
+ "# print(\"tuned config succeeds in {:.1f}% test cases\".format(result[\"success_vote\"] * 100))\n",
1749
+ "# print(\"untuned config succeeds in {:.1f}% test cases\".format(default_result[\"success_vote\"] * 100))"
1750
+ ]
1751
+ },
1752
+ {
1753
+ "attachments": {},
1754
+ "cell_type": "markdown",
1755
+ "metadata": {},
1756
+ "source": [
1757
+ "The default use of GPT-4 has a much lower accuracy. Note that the default config has a lower inference cost. What if we heuristically increase the number of responses n?"
1758
+ ]
1759
+ },
1760
+ {
1761
+ "cell_type": "code",
1762
+ "execution_count": 18,
1763
+ "metadata": {},
1764
+ "outputs": [],
1765
+ "source": [
1766
+ "# The following evaluation costs $3 and longer than one hour if you uncomment it and run it.\n",
1767
+ "\n",
1768
+ "# config_n2 = {\"model\": 'gpt-4', \"prompt\": prompts[0], \"n\": 2, \"allow_format_str_template\": True}\n",
1769
+ "# result_n2 = autogen.ChatCompletion.test(test_data, config_list=config_list, **config_n2)\n",
1770
+ "# print(\"performance on test data from gpt-4 with a default config and n=2:\", result_n2)\n"
1771
+ ]
1772
+ },
1773
+ {
1774
+ "attachments": {},
1775
+ "cell_type": "markdown",
1776
+ "metadata": {},
1777
+ "source": [
1778
+ "The inference cost is doubled and matches the tuned config. But the success rate doesn't improve much. What if we further increase the number of responses n to 5?"
1779
+ ]
1780
+ },
1781
+ {
1782
+ "cell_type": "code",
1783
+ "execution_count": 19,
1784
+ "metadata": {},
1785
+ "outputs": [],
1786
+ "source": [
1787
+ "# The following evaluation costs $8 and longer than one hour if you uncomment it and run it.\n",
1788
+ "\n",
1789
+ "# config_n5 = {\"model\": 'gpt-4', \"prompt\": prompts[0], \"n\": 5, \"allow_format_str_template\": True}\n",
1790
+ "# result_n5 = autogen.ChatCompletion.test(test_data, config_list=config_list, **config_n5)\n",
1791
+ "# print(\"performance on test data from gpt-4 with a default config and n=5:\", result_n5)"
1792
+ ]
1793
+ },
1794
+ {
1795
+ "attachments": {},
1796
+ "cell_type": "markdown",
1797
+ "metadata": {},
1798
+ "source": [
1799
+ "We find that the 'success_vote' metric is increased at the cost of exceeding the inference budget. But the tuned configuration has both higher 'success_vote' (91% vs. 87%) and lower average inference cost ($0.015 vs. $0.037 per instance).\n",
1800
+ "\n",
1801
+ "A developer could use AutoGen to tune the configuration to satisfy the target inference budget while maximizing the value out of it."
1802
+ ]
1803
+ }
1804
+ ],
1805
+ "metadata": {
1806
+ "kernelspec": {
1807
+ "display_name": "Python 3",
1808
+ "language": "python",
1809
+ "name": "python3"
1810
+ },
1811
+ "language_info": {
1812
+ "codemirror_mode": {
1813
+ "name": "ipython",
1814
+ "version": 3
1815
+ },
1816
+ "file_extension": ".py",
1817
+ "mimetype": "text/x-python",
1818
+ "name": "python",
1819
+ "nbconvert_exporter": "python",
1820
+ "pygments_lexer": "ipython3",
1821
+ "version": "3.9.17"
1822
+ },
1823
+ "vscode": {
1824
+ "interpreter": {
1825
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
1826
+ }
1827
+ },
1828
+ "widgets": {
1829
+ "application/vnd.jupyter.widget-state+json": {
1830
+ "state": {
1831
+ "2d910cfd2d2a4fc49fc30fbbdc5576a7": {
1832
+ "model_module": "@jupyter-widgets/base",
1833
+ "model_module_version": "2.0.0",
1834
+ "model_name": "LayoutModel",
1835
+ "state": {
1836
+ "_model_module": "@jupyter-widgets/base",
1837
+ "_model_module_version": "2.0.0",
1838
+ "_model_name": "LayoutModel",
1839
+ "_view_count": null,
1840
+ "_view_module": "@jupyter-widgets/base",
1841
+ "_view_module_version": "2.0.0",
1842
+ "_view_name": "LayoutView",
1843
+ "align_content": null,
1844
+ "align_items": null,
1845
+ "align_self": null,
1846
+ "border_bottom": null,
1847
+ "border_left": null,
1848
+ "border_right": null,
1849
+ "border_top": null,
1850
+ "bottom": null,
1851
+ "display": null,
1852
+ "flex": null,
1853
+ "flex_flow": null,
1854
+ "grid_area": null,
1855
+ "grid_auto_columns": null,
1856
+ "grid_auto_flow": null,
1857
+ "grid_auto_rows": null,
1858
+ "grid_column": null,
1859
+ "grid_gap": null,
1860
+ "grid_row": null,
1861
+ "grid_template_areas": null,
1862
+ "grid_template_columns": null,
1863
+ "grid_template_rows": null,
1864
+ "height": null,
1865
+ "justify_content": null,
1866
+ "justify_items": null,
1867
+ "left": null,
1868
+ "margin": null,
1869
+ "max_height": null,
1870
+ "max_width": null,
1871
+ "min_height": null,
1872
+ "min_width": null,
1873
+ "object_fit": null,
1874
+ "object_position": null,
1875
+ "order": null,
1876
+ "overflow": null,
1877
+ "padding": null,
1878
+ "right": null,
1879
+ "top": null,
1880
+ "visibility": null,
1881
+ "width": null
1882
+ }
1883
+ },
1884
+ "454146d0f7224f038689031002906e6f": {
1885
+ "model_module": "@jupyter-widgets/controls",
1886
+ "model_module_version": "2.0.0",
1887
+ "model_name": "HBoxModel",
1888
+ "state": {
1889
+ "_dom_classes": [],
1890
+ "_model_module": "@jupyter-widgets/controls",
1891
+ "_model_module_version": "2.0.0",
1892
+ "_model_name": "HBoxModel",
1893
+ "_view_count": null,
1894
+ "_view_module": "@jupyter-widgets/controls",
1895
+ "_view_module_version": "2.0.0",
1896
+ "_view_name": "HBoxView",
1897
+ "box_style": "",
1898
+ "children": [
1899
+ "IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
1900
+ "IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
1901
+ "IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
1902
+ ],
1903
+ "layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
1904
+ "tabbable": null,
1905
+ "tooltip": null
1906
+ }
1907
+ },
1908
+ "577e1e3cc4db4942b0883577b3b52755": {
1909
+ "model_module": "@jupyter-widgets/controls",
1910
+ "model_module_version": "2.0.0",
1911
+ "model_name": "FloatProgressModel",
1912
+ "state": {
1913
+ "_dom_classes": [],
1914
+ "_model_module": "@jupyter-widgets/controls",
1915
+ "_model_module_version": "2.0.0",
1916
+ "_model_name": "FloatProgressModel",
1917
+ "_view_count": null,
1918
+ "_view_module": "@jupyter-widgets/controls",
1919
+ "_view_module_version": "2.0.0",
1920
+ "_view_name": "ProgressView",
1921
+ "bar_style": "success",
1922
+ "description": "",
1923
+ "description_allow_html": false,
1924
+ "layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
1925
+ "max": 1,
1926
+ "min": 0,
1927
+ "orientation": "horizontal",
1928
+ "style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
1929
+ "tabbable": null,
1930
+ "tooltip": null,
1931
+ "value": 1
1932
+ }
1933
+ },
1934
+ "6086462a12d54bafa59d3c4566f06cb2": {
1935
+ "model_module": "@jupyter-widgets/base",
1936
+ "model_module_version": "2.0.0",
1937
+ "model_name": "LayoutModel",
1938
+ "state": {
1939
+ "_model_module": "@jupyter-widgets/base",
1940
+ "_model_module_version": "2.0.0",
1941
+ "_model_name": "LayoutModel",
1942
+ "_view_count": null,
1943
+ "_view_module": "@jupyter-widgets/base",
1944
+ "_view_module_version": "2.0.0",
1945
+ "_view_name": "LayoutView",
1946
+ "align_content": null,
1947
+ "align_items": null,
1948
+ "align_self": null,
1949
+ "border_bottom": null,
1950
+ "border_left": null,
1951
+ "border_right": null,
1952
+ "border_top": null,
1953
+ "bottom": null,
1954
+ "display": null,
1955
+ "flex": null,
1956
+ "flex_flow": null,
1957
+ "grid_area": null,
1958
+ "grid_auto_columns": null,
1959
+ "grid_auto_flow": null,
1960
+ "grid_auto_rows": null,
1961
+ "grid_column": null,
1962
+ "grid_gap": null,
1963
+ "grid_row": null,
1964
+ "grid_template_areas": null,
1965
+ "grid_template_columns": null,
1966
+ "grid_template_rows": null,
1967
+ "height": null,
1968
+ "justify_content": null,
1969
+ "justify_items": null,
1970
+ "left": null,
1971
+ "margin": null,
1972
+ "max_height": null,
1973
+ "max_width": null,
1974
+ "min_height": null,
1975
+ "min_width": null,
1976
+ "object_fit": null,
1977
+ "object_position": null,
1978
+ "order": null,
1979
+ "overflow": null,
1980
+ "padding": null,
1981
+ "right": null,
1982
+ "top": null,
1983
+ "visibility": null,
1984
+ "width": null
1985
+ }
1986
+ },
1987
+ "74a6ba0c3cbc4051be0a83e152fe1e62": {
1988
+ "model_module": "@jupyter-widgets/controls",
1989
+ "model_module_version": "2.0.0",
1990
+ "model_name": "ProgressStyleModel",
1991
+ "state": {
1992
+ "_model_module": "@jupyter-widgets/controls",
1993
+ "_model_module_version": "2.0.0",
1994
+ "_model_name": "ProgressStyleModel",
1995
+ "_view_count": null,
1996
+ "_view_module": "@jupyter-widgets/base",
1997
+ "_view_module_version": "2.0.0",
1998
+ "_view_name": "StyleView",
1999
+ "bar_color": null,
2000
+ "description_width": ""
2001
+ }
2002
+ },
2003
+ "7d3f3d9e15894d05a4d188ff4f466554": {
2004
+ "model_module": "@jupyter-widgets/controls",
2005
+ "model_module_version": "2.0.0",
2006
+ "model_name": "HTMLStyleModel",
2007
+ "state": {
2008
+ "_model_module": "@jupyter-widgets/controls",
2009
+ "_model_module_version": "2.0.0",
2010
+ "_model_name": "HTMLStyleModel",
2011
+ "_view_count": null,
2012
+ "_view_module": "@jupyter-widgets/base",
2013
+ "_view_module_version": "2.0.0",
2014
+ "_view_name": "StyleView",
2015
+ "background": null,
2016
+ "description_width": "",
2017
+ "font_size": null,
2018
+ "text_color": null
2019
+ }
2020
+ },
2021
+ "b40bdfb1ac1d4cffb7cefcb870c64d45": {
2022
+ "model_module": "@jupyter-widgets/controls",
2023
+ "model_module_version": "2.0.0",
2024
+ "model_name": "HTMLModel",
2025
+ "state": {
2026
+ "_dom_classes": [],
2027
+ "_model_module": "@jupyter-widgets/controls",
2028
+ "_model_module_version": "2.0.0",
2029
+ "_model_name": "HTMLModel",
2030
+ "_view_count": null,
2031
+ "_view_module": "@jupyter-widgets/controls",
2032
+ "_view_module_version": "2.0.0",
2033
+ "_view_name": "HTMLView",
2034
+ "description": "",
2035
+ "description_allow_html": false,
2036
+ "layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
2037
+ "placeholder": "​",
2038
+ "style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
2039
+ "tabbable": null,
2040
+ "tooltip": null,
2041
+ "value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
2042
+ }
2043
+ },
2044
+ "ca245376fd9f4354af6b2befe4af4466": {
2045
+ "model_module": "@jupyter-widgets/controls",
2046
+ "model_module_version": "2.0.0",
2047
+ "model_name": "HTMLStyleModel",
2048
+ "state": {
2049
+ "_model_module": "@jupyter-widgets/controls",
2050
+ "_model_module_version": "2.0.0",
2051
+ "_model_name": "HTMLStyleModel",
2052
+ "_view_count": null,
2053
+ "_view_module": "@jupyter-widgets/base",
2054
+ "_view_module_version": "2.0.0",
2055
+ "_view_name": "StyleView",
2056
+ "background": null,
2057
+ "description_width": "",
2058
+ "font_size": null,
2059
+ "text_color": null
2060
+ }
2061
+ },
2062
+ "dc83c7bff2f241309537a8119dfc7555": {
2063
+ "model_module": "@jupyter-widgets/base",
2064
+ "model_module_version": "2.0.0",
2065
+ "model_name": "LayoutModel",
2066
+ "state": {
2067
+ "_model_module": "@jupyter-widgets/base",
2068
+ "_model_module_version": "2.0.0",
2069
+ "_model_name": "LayoutModel",
2070
+ "_view_count": null,
2071
+ "_view_module": "@jupyter-widgets/base",
2072
+ "_view_module_version": "2.0.0",
2073
+ "_view_name": "LayoutView",
2074
+ "align_content": null,
2075
+ "align_items": null,
2076
+ "align_self": null,
2077
+ "border_bottom": null,
2078
+ "border_left": null,
2079
+ "border_right": null,
2080
+ "border_top": null,
2081
+ "bottom": null,
2082
+ "display": null,
2083
+ "flex": null,
2084
+ "flex_flow": null,
2085
+ "grid_area": null,
2086
+ "grid_auto_columns": null,
2087
+ "grid_auto_flow": null,
2088
+ "grid_auto_rows": null,
2089
+ "grid_column": null,
2090
+ "grid_gap": null,
2091
+ "grid_row": null,
2092
+ "grid_template_areas": null,
2093
+ "grid_template_columns": null,
2094
+ "grid_template_rows": null,
2095
+ "height": null,
2096
+ "justify_content": null,
2097
+ "justify_items": null,
2098
+ "left": null,
2099
+ "margin": null,
2100
+ "max_height": null,
2101
+ "max_width": null,
2102
+ "min_height": null,
2103
+ "min_width": null,
2104
+ "object_fit": null,
2105
+ "object_position": null,
2106
+ "order": null,
2107
+ "overflow": null,
2108
+ "padding": null,
2109
+ "right": null,
2110
+ "top": null,
2111
+ "visibility": null,
2112
+ "width": null
2113
+ }
2114
+ },
2115
+ "e4ae2b6f5a974fd4bafb6abb9d12ff26": {
2116
+ "model_module": "@jupyter-widgets/controls",
2117
+ "model_module_version": "2.0.0",
2118
+ "model_name": "HTMLModel",
2119
+ "state": {
2120
+ "_dom_classes": [],
2121
+ "_model_module": "@jupyter-widgets/controls",
2122
+ "_model_module_version": "2.0.0",
2123
+ "_model_name": "HTMLModel",
2124
+ "_view_count": null,
2125
+ "_view_module": "@jupyter-widgets/controls",
2126
+ "_view_module_version": "2.0.0",
2127
+ "_view_name": "HTMLView",
2128
+ "description": "",
2129
+ "description_allow_html": false,
2130
+ "layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
2131
+ "placeholder": "​",
2132
+ "style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
2133
+ "tabbable": null,
2134
+ "tooltip": null,
2135
+ "value": "100%"
2136
+ }
2137
+ },
2138
+ "f1355871cc6f4dd4b50d9df5af20e5c8": {
2139
+ "model_module": "@jupyter-widgets/base",
2140
+ "model_module_version": "2.0.0",
2141
+ "model_name": "LayoutModel",
2142
+ "state": {
2143
+ "_model_module": "@jupyter-widgets/base",
2144
+ "_model_module_version": "2.0.0",
2145
+ "_model_name": "LayoutModel",
2146
+ "_view_count": null,
2147
+ "_view_module": "@jupyter-widgets/base",
2148
+ "_view_module_version": "2.0.0",
2149
+ "_view_name": "LayoutView",
2150
+ "align_content": null,
2151
+ "align_items": null,
2152
+ "align_self": null,
2153
+ "border_bottom": null,
2154
+ "border_left": null,
2155
+ "border_right": null,
2156
+ "border_top": null,
2157
+ "bottom": null,
2158
+ "display": null,
2159
+ "flex": null,
2160
+ "flex_flow": null,
2161
+ "grid_area": null,
2162
+ "grid_auto_columns": null,
2163
+ "grid_auto_flow": null,
2164
+ "grid_auto_rows": null,
2165
+ "grid_column": null,
2166
+ "grid_gap": null,
2167
+ "grid_row": null,
2168
+ "grid_template_areas": null,
2169
+ "grid_template_columns": null,
2170
+ "grid_template_rows": null,
2171
+ "height": null,
2172
+ "justify_content": null,
2173
+ "justify_items": null,
2174
+ "left": null,
2175
+ "margin": null,
2176
+ "max_height": null,
2177
+ "max_width": null,
2178
+ "min_height": null,
2179
+ "min_width": null,
2180
+ "object_fit": null,
2181
+ "object_position": null,
2182
+ "order": null,
2183
+ "overflow": null,
2184
+ "padding": null,
2185
+ "right": null,
2186
+ "top": null,
2187
+ "visibility": null,
2188
+ "width": null
2189
+ }
2190
+ }
2191
+ },
2192
+ "version_major": 2,
2193
+ "version_minor": 0
2194
+ }
2195
+ }
2196
+ },
2197
+ "nbformat": 4,
2198
+ "nbformat_minor": 2
2199
+ }
oai_completion.ipynb ADDED
@@ -0,0 +1,1189 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "attachments": {},
5
+ "cell_type": "markdown",
6
+ "metadata": {},
7
+ "source": [
8
+ "<a href=\"https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_completion.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
9
+ ]
10
+ },
11
+ {
12
+ "attachments": {},
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "slideshow": {
16
+ "slide_type": "slide"
17
+ }
18
+ },
19
+ "source": [
20
+ "Copyright (c) Microsoft Corporation. All rights reserved. \n",
21
+ "\n",
22
+ "Licensed under the MIT License.\n",
23
+ "\n",
24
+ "# Use AutoGen to Tune OpenAI Models\n",
25
+ "\n",
26
+ "AutoGen offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. The research study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
27
+ "Please find documentation about this feature [here](/docs/Use-Cases/AutoGen#enhanced-inference).\n",
28
+ "\n",
29
+ "In this notebook, we tune OpenAI models for code generation. We use [the HumanEval benchmark](https://huggingface.co/datasets/openai_humaneval) released by OpenAI for synthesizing programs from docstrings.\n",
30
+ "\n",
31
+ "## Requirements\n",
32
+ "\n",
33
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install with the [blendsearch] option:\n",
34
+ "```bash\n",
35
+ "pip install pyautogen[blendsearch]\n",
36
+ "```"
37
+ ]
38
+ },
39
+ {
40
+ "cell_type": "code",
41
+ "execution_count": 1,
42
+ "metadata": {
43
+ "execution": {
44
+ "iopub.execute_input": "2023-02-24T23:25:36.910966Z",
45
+ "iopub.status.busy": "2023-02-24T23:25:36.910473Z",
46
+ "iopub.status.idle": "2023-02-24T23:25:36.914554Z",
47
+ "shell.execute_reply": "2023-02-24T23:25:36.914030Z"
48
+ }
49
+ },
50
+ "outputs": [],
51
+ "source": [
52
+ "# %pip install \"pyautogen[blendsearch]~=0.1.0\" datasets"
53
+ ]
54
+ },
55
+ {
56
+ "attachments": {},
57
+ "cell_type": "markdown",
58
+ "metadata": {},
59
+ "source": [
60
+ "## Set your API Endpoint\n",
61
+ "\n",
62
+ "* The [`config_list_openai_aoai`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_openai_aoai) function tries to create a list of configurations using Azure OpenAI endpoints and OpenAI endpoints. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files:\n",
63
+ " - OpenAI API key: os.environ[\"OPENAI_API_KEY\"] or `openai_api_key_file=\"key_openai.txt\"`.\n",
64
+ " - Azure OpenAI API key: os.environ[\"AZURE_OPENAI_API_KEY\"] or `aoai_api_key_file=\"key_aoai.txt\"`. Multiple keys can be stored, one per line.\n",
65
+ " - Azure OpenAI API base: os.environ[\"AZURE_OPENAI_API_BASE\"] or `aoai_api_base_file=\"base_aoai.txt\"`. Multiple bases can be stored, one per line.\n",
66
+ "* The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file. It first looks for environment variable `env_or_file` which needs to be a valid json string. If that variable is not found, it then looks for a json file with the same name. It filters the configs by filter_dict.\n",
67
+ "\n",
68
+ "It's OK to have only the OpenAI API key, or only the Azure OpenAI API key + base. If you open this notebook in colab, you can upload your files by clicking the file icon on the left panel and then choose \"upload file\" icon.\n"
69
+ ]
70
+ },
71
+ {
72
+ "cell_type": "code",
73
+ "execution_count": 2,
74
+ "metadata": {
75
+ "execution": {
76
+ "iopub.execute_input": "2023-02-24T23:25:36.917301Z",
77
+ "iopub.status.busy": "2023-02-24T23:25:36.917011Z",
78
+ "iopub.status.idle": "2023-02-24T23:25:36.923156Z",
79
+ "shell.execute_reply": "2023-02-24T23:25:36.922619Z"
80
+ }
81
+ },
82
+ "outputs": [],
83
+ "source": [
84
+ "import autogen\n",
85
+ "\n",
86
+ "endpoint_list = autogen.config_list_openai_aoai()\n",
87
+ "# the endpoint_list looks like this:\n",
88
+ "# endpoint_list = [\n",
89
+ "# {\n",
90
+ "# 'api_key': '<your OpenAI API key here>',\n",
91
+ "# }, # OpenAI API endpoint for gpt-4\n",
92
+ "# {\n",
93
+ "# 'api_key': '<your first Azure OpenAI API key here>',\n",
94
+ "# 'api_base': '<your first Azure OpenAI API base here>',\n",
95
+ "# 'api_type': 'azure',\n",
96
+ "# 'api_version': '2023-03-15-preview',\n",
97
+ "# }, # Azure OpenAI API endpoint for gpt-4\n",
98
+ "# {\n",
99
+ "# 'api_key': '<your second Azure OpenAI API key here>',\n",
100
+ "# 'api_base': '<your second Azure OpenAI API base here>',\n",
101
+ "# 'api_type': 'azure',\n",
102
+ "# 'api_version': '2023-03-15-preview',\n",
103
+ "# }, # another Azure OpenAI API endpoint for gpt-4\n",
104
+ "# ]\n",
105
+ "\n",
106
+ "config_list = autogen.config_list_from_json(\n",
107
+ " env_or_file=\"OAI_CONFIG_LIST\",\n",
108
+ " filter_dict={\n",
109
+ " \"model\": {\n",
110
+ " \"gpt-3.5-turbo\",\n",
111
+ " \"gpt-3.5-turbo-16k\",\n",
112
+ " \"gpt-3.5-turbo-0301\",\n",
113
+ " \"chatgpt-35-turbo-0301\",\n",
114
+ " \"gpt-35-turbo-v0301\",\n",
115
+ " \"gpt\",\n",
116
+ " },\n",
117
+ " },\n",
118
+ ")\n",
119
+ "# the config_list looks like this:\n",
120
+ "# config_list = [\n",
121
+ "# {\n",
122
+ "# 'model': 'gpt-3.5-turbo',\n",
123
+ "# 'api_key': '<your OpenAI API key here>',\n",
124
+ "# }, # OpenAI API endpoint for gpt-3.5-turbo\n",
125
+ "# {\n",
126
+ "# 'model': 'gpt-3.5-turbo',\n",
127
+ "# 'api_key': '<your first Azure OpenAI API key here>',\n",
128
+ "# 'api_base': '<your first Azure OpenAI API base here>',\n",
129
+ "# 'api_type': 'azure',\n",
130
+ "# 'api_version': '2023-06-01-preview',\n",
131
+ "# }, # Azure OpenAI API endpoint for gpt-3.5-turbo\n",
132
+ "# {\n",
133
+ "# 'model': 'gpt-35-turbo-v0301',\n",
134
+ "# 'api_key': '<your second Azure OpenAI API key here>',\n",
135
+ "# 'api_base': '<your second Azure OpenAI API base here>',\n",
136
+ "# 'api_type': 'azure',\n",
137
+ "# 'api_version': '2023-06-01-preview',\n",
138
+ "# }, # another Azure OpenAI API endpoint for gpt-3.5-turbo with deployment name gpt-35-turbo-v0301\n",
139
+ "# ]\n"
140
+ ]
141
+ },
142
+ {
143
+ "attachments": {},
144
+ "cell_type": "markdown",
145
+ "metadata": {},
146
+ "source": [
147
+ "If you don't use the two provided utility functions above, you can define the lists in other ways you prefer.\n",
148
+ "\n",
149
+ "## Load dataset\n",
150
+ "\n",
151
+ "First, we load the humaneval dataset. The dataset contains 164 examples. We use the first 20 for tuning the generation hyperparameters and the remaining for evaluation. In each example, the \"prompt\" is the prompt string for eliciting the code generation (renamed into \"definition\"), \"test\" is the Python code for unit test for the example, and \"entry_point\" is the function name to be tested."
152
+ ]
153
+ },
154
+ {
155
+ "cell_type": "code",
156
+ "execution_count": 3,
157
+ "metadata": {
158
+ "execution": {
159
+ "iopub.execute_input": "2023-02-24T23:25:36.931255Z",
160
+ "iopub.status.busy": "2023-02-24T23:25:36.930838Z",
161
+ "iopub.status.idle": "2023-02-24T23:25:39.148799Z",
162
+ "shell.execute_reply": "2023-02-24T23:25:39.148113Z"
163
+ }
164
+ },
165
+ "outputs": [
166
+ {
167
+ "name": "stderr",
168
+ "output_type": "stream",
169
+ "text": [
170
+ "Found cached dataset openai_humaneval (/home/vscode/.cache/huggingface/datasets/openai_humaneval/openai_humaneval/1.0.0/2955cebd73602e828fa8c0a424c594e5fab4ec863b316ca98f3d8fdb6a626e75)\n"
171
+ ]
172
+ },
173
+ {
174
+ "data": {
175
+ "application/vnd.jupyter.widget-view+json": {
176
+ "model_id": "8e08cc907707418a86a3da668e45326b",
177
+ "version_major": 2,
178
+ "version_minor": 0
179
+ },
180
+ "text/plain": [
181
+ " 0%| | 0/1 [00:00<?, ?it/s]"
182
+ ]
183
+ },
184
+ "metadata": {},
185
+ "output_type": "display_data"
186
+ },
187
+ {
188
+ "name": "stderr",
189
+ "output_type": "stream",
190
+ "text": [
191
+ "Loading cached shuffled indices for dataset at /home/vscode/.cache/huggingface/datasets/openai_humaneval/openai_humaneval/1.0.0/2955cebd73602e828fa8c0a424c594e5fab4ec863b316ca98f3d8fdb6a626e75/cache-1e8448101c1b32e8.arrow\n"
192
+ ]
193
+ }
194
+ ],
195
+ "source": [
196
+ "import datasets\n",
197
+ "\n",
198
+ "seed = 41\n",
199
+ "data = datasets.load_dataset(\"openai_humaneval\")[\"test\"].shuffle(seed=seed)\n",
200
+ "n_tune_data = 20\n",
201
+ "tune_data = [\n",
202
+ " {\n",
203
+ " \"definition\": data[x][\"prompt\"],\n",
204
+ " \"test\": data[x][\"test\"],\n",
205
+ " \"entry_point\": data[x][\"entry_point\"],\n",
206
+ " }\n",
207
+ " for x in range(n_tune_data)\n",
208
+ "]\n",
209
+ "test_data = [\n",
210
+ " {\n",
211
+ " \"definition\": data[x][\"prompt\"],\n",
212
+ " \"test\": data[x][\"test\"],\n",
213
+ " \"entry_point\": data[x][\"entry_point\"],\n",
214
+ " }\n",
215
+ " for x in range(n_tune_data, len(data))\n",
216
+ "]\n"
217
+ ]
218
+ },
219
+ {
220
+ "attachments": {},
221
+ "cell_type": "markdown",
222
+ "metadata": {
223
+ "slideshow": {
224
+ "slide_type": "slide"
225
+ }
226
+ },
227
+ "source": [
228
+ "Check a tuning example:"
229
+ ]
230
+ },
231
+ {
232
+ "cell_type": "code",
233
+ "execution_count": 4,
234
+ "metadata": {
235
+ "execution": {
236
+ "iopub.execute_input": "2023-02-24T23:25:39.152156Z",
237
+ "iopub.status.busy": "2023-02-24T23:25:39.151531Z",
238
+ "iopub.status.idle": "2023-02-24T23:25:39.155313Z",
239
+ "shell.execute_reply": "2023-02-24T23:25:39.154731Z"
240
+ },
241
+ "slideshow": {
242
+ "slide_type": "subslide"
243
+ },
244
+ "tags": []
245
+ },
246
+ "outputs": [
247
+ {
248
+ "name": "stdout",
249
+ "output_type": "stream",
250
+ "text": [
251
+ "\n",
252
+ "def compare(game,guess):\n",
253
+ " \"\"\"I think we all remember that feeling when the result of some long-awaited\n",
254
+ " event is finally known. The feelings and thoughts you have at that moment are\n",
255
+ " definitely worth noting down and comparing.\n",
256
+ " Your task is to determine if a person correctly guessed the results of a number of matches.\n",
257
+ " You are given two arrays of scores and guesses of equal length, where each index shows a match. \n",
258
+ " Return an array of the same length denoting how far off each guess was. If they have guessed correctly,\n",
259
+ " the value is 0, and if not, the value is the absolute difference between the guess and the score.\n",
260
+ " \n",
261
+ " \n",
262
+ " example:\n",
263
+ "\n",
264
+ " compare([1,2,3,4,5,1],[1,2,3,4,2,-2]) -> [0,0,0,0,3,3]\n",
265
+ " compare([0,5,0,0,0,4],[4,1,1,0,0,-2]) -> [4,4,1,0,0,6]\n",
266
+ " \"\"\"\n",
267
+ "\n"
268
+ ]
269
+ }
270
+ ],
271
+ "source": [
272
+ "print(tune_data[1][\"definition\"])"
273
+ ]
274
+ },
275
+ {
276
+ "attachments": {},
277
+ "cell_type": "markdown",
278
+ "metadata": {},
279
+ "source": [
280
+ "Here is one example of the unit test code for verifying the correctness of the generated code:"
281
+ ]
282
+ },
283
+ {
284
+ "cell_type": "code",
285
+ "execution_count": 5,
286
+ "metadata": {
287
+ "execution": {
288
+ "iopub.execute_input": "2023-02-24T23:25:39.158398Z",
289
+ "iopub.status.busy": "2023-02-24T23:25:39.157766Z",
290
+ "iopub.status.idle": "2023-02-24T23:25:39.161396Z",
291
+ "shell.execute_reply": "2023-02-24T23:25:39.160797Z"
292
+ }
293
+ },
294
+ "outputs": [
295
+ {
296
+ "name": "stdout",
297
+ "output_type": "stream",
298
+ "text": [
299
+ "def check(candidate):\n",
300
+ "\n",
301
+ " # Check some simple cases\n",
302
+ " assert candidate([1,2,3,4,5,1],[1,2,3,4,2,-2])==[0,0,0,0,3,3], \"This prints if this assert fails 1 (good for debugging!)\"\n",
303
+ " assert candidate([0,0,0,0,0,0],[0,0,0,0,0,0])==[0,0,0,0,0,0], \"This prints if this assert fails 1 (good for debugging!)\"\n",
304
+ " assert candidate([1,2,3],[-1,-2,-3])==[2,4,6], \"This prints if this assert fails 1 (good for debugging!)\"\n",
305
+ " assert candidate([1,2,3,5],[-1,2,3,4])==[2,0,0,1], \"This prints if this assert fails 1 (good for debugging!)\"\n",
306
+ "\n",
307
+ " # Check some edge cases that are easy to work out by hand.\n",
308
+ " assert True, \"This prints if this assert fails 2 (also good for debugging!)\"\n",
309
+ "\n",
310
+ "\n"
311
+ ]
312
+ }
313
+ ],
314
+ "source": [
315
+ "print(tune_data[1][\"test\"])"
316
+ ]
317
+ },
318
+ {
319
+ "attachments": {},
320
+ "cell_type": "markdown",
321
+ "metadata": {},
322
+ "source": [
323
+ "## Define Success Metric\n",
324
+ "\n",
325
+ "Before we start tuning, we need to define the success metric we want to optimize. For each code generation task, we can use the model to generate multiple candidates, and then select one from them. If the final selected response can pass a unit test, we consider the task as successfully solved. Then we can define the mean success rate of a collection of tasks."
326
+ ]
327
+ },
328
+ {
329
+ "cell_type": "code",
330
+ "execution_count": 6,
331
+ "metadata": {
332
+ "execution": {
333
+ "iopub.execute_input": "2023-02-24T23:25:39.164187Z",
334
+ "iopub.status.busy": "2023-02-24T23:25:39.163867Z",
335
+ "iopub.status.idle": "2023-02-24T23:25:39.169009Z",
336
+ "shell.execute_reply": "2023-02-24T23:25:39.168427Z"
337
+ }
338
+ },
339
+ "outputs": [],
340
+ "source": [
341
+ "from functools import partial\n",
342
+ "\n",
343
+ "eval_with_generated_assertions = partial(\n",
344
+ " autogen.code_utils.eval_function_completions,\n",
345
+ " assertions=partial(autogen.code_utils.generate_assertions, config_list=config_list),\n",
346
+ " use_docker=False,\n",
347
+ " # Please set use_docker=True if you have docker available to run the generated code.\n",
348
+ " # Using docker is safer than running the generated code directly.\n",
349
+ ")\n"
350
+ ]
351
+ },
352
+ {
353
+ "attachments": {},
354
+ "cell_type": "markdown",
355
+ "metadata": {
356
+ "slideshow": {
357
+ "slide_type": "slide"
358
+ }
359
+ },
360
+ "source": [
361
+ "This function will first generate assertion statements for each problem. Then, it uses the assertions to select the generated responses.\n",
362
+ "\n",
363
+ "## Use the tuning data to find a good configuration\n",
364
+ "\n",
365
+ "AutoGen has provided an API for hyperparameter optimization of OpenAI models: `autogen.Completion.tune` and to make a request with the tuned config: `autogen.Completion.create`.\n",
366
+ "\n",
367
+ "For (local) reproducibility and cost efficiency, we cache responses from OpenAI with a controllable seed."
368
+ ]
369
+ },
370
+ {
371
+ "cell_type": "code",
372
+ "execution_count": 7,
373
+ "metadata": {
374
+ "execution": {
375
+ "iopub.execute_input": "2023-02-24T23:25:40.587815Z",
376
+ "iopub.status.busy": "2023-02-24T23:25:40.587283Z",
377
+ "iopub.status.idle": "2023-02-24T23:25:40.590826Z",
378
+ "shell.execute_reply": "2023-02-24T23:25:40.590158Z"
379
+ },
380
+ "slideshow": {
381
+ "slide_type": "slide"
382
+ }
383
+ },
384
+ "outputs": [],
385
+ "source": [
386
+ "autogen.Completion.set_cache(seed)"
387
+ ]
388
+ },
389
+ {
390
+ "attachments": {},
391
+ "cell_type": "markdown",
392
+ "metadata": {},
393
+ "source": [
394
+ "This will create a disk cache in \".cache/{seed}\". You can change `cache_path_root` from \".cache\" to a different path in `set_cache()`. The cache for different seeds are stored separately.\n",
395
+ "\n",
396
+ "### Perform tuning\n",
397
+ "\n",
398
+ "The tuning will take a while to finish, depending on the optimization budget. The tuning will be performed under the specified optimization budgets.\n",
399
+ "\n",
400
+ "* `inference_budget` is the target average inference budget per instance in the benchmark. For example, 0.02 means the target inference budget is 0.02 dollars, which translates to 1000 tokens (input + output combined) if the text Davinci model is used.\n",
401
+ "* `optimization_budget` is the total budget allowed to perform the tuning. For example, 5 means 5 dollars are allowed in total, which translates to 250K tokens for the text Davinci model.\n",
402
+ "* `num_sumples` is the number of different hyperparameter configurations which is allowed to try. The tuning will stop after either num_samples trials or after optimization_budget dollars spent, whichever happens first. -1 means no hard restriction in the number of trials and the actual number is decided by `optimization_budget`.\n",
403
+ "\n",
404
+ "Users can specify tuning data, optimization metric, optimization mode, evaluation function, search spaces etc.. The default search space is:\n",
405
+ "\n",
406
+ "```python\n",
407
+ "default_search_space = {\n",
408
+ " \"model\": tune.choice([\n",
409
+ " \"text-ada-001\",\n",
410
+ " \"text-babbage-001\",\n",
411
+ " \"text-davinci-003\",\n",
412
+ " \"gpt-3.5-turbo\",\n",
413
+ " \"gpt-4\",\n",
414
+ " ]),\n",
415
+ " \"temperature_or_top_p\": tune.choice(\n",
416
+ " [\n",
417
+ " {\"temperature\": tune.uniform(0, 1)},\n",
418
+ " {\"top_p\": tune.uniform(0, 1)},\n",
419
+ " ]\n",
420
+ " ),\n",
421
+ " \"max_tokens\": tune.lograndint(50, 1000),\n",
422
+ " \"n\": tune.randint(1, 100),\n",
423
+ " \"prompt\": \"{prompt}\",\n",
424
+ "}\n",
425
+ "```\n",
426
+ "\n",
427
+ "The default search space can be overridden by users' input.\n",
428
+ "For example, the following code specifies three choices for the prompt and two choices of stop sequences. For hyperparameters which don't appear in users' input, the default search space will be used. If you don't have access to gpt-4 or would like to modify the choice of models, you can provide a different search space for model."
429
+ ]
430
+ },
431
+ {
432
+ "cell_type": "code",
433
+ "execution_count": 8,
434
+ "metadata": {
435
+ "execution": {
436
+ "iopub.execute_input": "2023-02-24T23:25:40.593603Z",
437
+ "iopub.status.busy": "2023-02-24T23:25:40.593269Z",
438
+ "iopub.status.idle": "2023-02-24T23:26:38.349191Z",
439
+ "shell.execute_reply": "2023-02-24T23:26:38.348392Z"
440
+ }
441
+ },
442
+ "outputs": [
443
+ {
444
+ "name": "stderr",
445
+ "output_type": "stream",
446
+ "text": [
447
+ "\u001b[32m[I 2023-07-30 04:19:08,150]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n",
448
+ "\u001b[32m[I 2023-07-30 04:19:08,153]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
449
+ ]
450
+ },
451
+ {
452
+ "name": "stdout",
453
+ "output_type": "stream",
454
+ "text": [
455
+ "[flaml.tune.tune: 07-30 04:19:08] {805} INFO - trial 1 config: {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-ada-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}\n",
456
+ "[flaml.tune.tune: 07-30 04:22:35] {197} INFO - result: {'index_selected': 26.0, 'succeed_assertions': 0.0, 'success': 0.0, 'gen_cost': 0.000460625, 'assertions': 'assert vowels_count(\"abcde\") == 2\\nassert vowels_count(\"ACEDY\") == 3', 'total_cost': 0.010514800000000003, 'cost': 0.010514800000000003, 'inference_cost': 0.00023534000000000003, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-ada-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'text-ada-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 207.29033374786377}\n",
457
+ "[flaml.tune.tune: 07-30 04:22:35] {805} INFO - trial 2 config: {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-babbage-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}\n",
458
+ "[flaml.tune.tune: 07-30 04:23:18] {197} INFO - result: {'index_selected': 26.0, 'succeed_assertions': 0.0, 'success': 0.0, 'gen_cost': 0.000460625, 'assertions': 'assert vowels_count(\"abcde\") == 2\\nassert vowels_count(\"ACEDY\") == 3', 'total_cost': 0.0300243, 'cost': 0.019509500000000003, 'inference_cost': 0.0009754750000000001, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-babbage-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'text-babbage-001', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 42.417603969573975}\n",
459
+ "[flaml.tune.tune: 07-30 04:23:18] {805} INFO - trial 3 config: {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-davinci-003', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}\n",
460
+ "[flaml.tune.tune: 07-30 04:24:20] {197} INFO - result: {'index_selected': 2.35, 'succeed_assertions': 0.95, 'success': 0.65, 'gen_cost': 0.000460625, 'assertions': 'assert vowels_count(\"abcde\") == 2\\nassert vowels_count(\"ACEDY\") == 3', 'total_cost': 0.8658043000000002, 'cost': 0.8357800000000002, 'inference_cost': 0.04093000000000001, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-davinci-003', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'text-davinci-003', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 62.81497287750244}\n",
461
+ "[flaml.tune.tune: 07-30 04:24:20] {805} INFO - trial 4 config: {'prompt': 1, 'stop': 0, 'subspace': {'model': 'gpt-3.5-turbo', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}\n",
462
+ "[flaml.tune.tune: 07-30 04:25:39] {197} INFO - result: {'index_selected': 13.95, 'succeed_assertions': 0.55, 'success': 0.5, 'gen_cost': 0.000460625, 'assertions': 'assert vowels_count(\"abcde\") == 2\\nassert vowels_count(\"ACEDY\") == 3', 'total_cost': 0.9462703000000001, 'cost': 0.08046600000000001, 'inference_cost': 0.00399515, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'gpt-3.5-turbo', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'gpt-3.5-turbo', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 79.03474521636963}\n",
463
+ "[flaml.tune.tune: 07-30 04:25:39] {805} INFO - trial 5 config: {'prompt': 1, 'stop': 0, 'subspace': {'model': 'gpt-4', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}\n",
464
+ "[flaml.tune.tune: 07-30 04:25:50] {197} INFO - result: {'success': 0, 'total_cost': 1.0053703, 'cost': 0.0591, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'gpt-4', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'gpt-4', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 10.245523691177368}\n",
465
+ "[flaml.tune.tune: 07-30 04:25:50] {828} WARNING - fail to sample a trial for 100 times in a row, stopping.\n"
466
+ ]
467
+ }
468
+ ],
469
+ "source": [
470
+ "config, analysis = autogen.Completion.tune(\n",
471
+ " data=tune_data, # the data for tuning\n",
472
+ " metric=\"success\", # the metric to optimize\n",
473
+ " mode=\"max\", # the optimization mode\n",
474
+ " eval_func=eval_with_generated_assertions, # the evaluation function to return the success metrics\n",
475
+ " # log_file_name=\"logs/humaneval.log\", # the log file name\n",
476
+ " inference_budget=0.05, # the inference budget (dollar per instance)\n",
477
+ " optimization_budget=1, # the optimization budget (dollar in total)\n",
478
+ " # num_samples can further limit the number of trials for different hyperparameter configurations;\n",
479
+ " # -1 means decided by the optimization budget only\n",
480
+ " num_samples=-1,\n",
481
+ " prompt=[\n",
482
+ " \"{definition}\",\n",
483
+ " \"# Python 3{definition}\",\n",
484
+ " \"Complete the following Python function:{definition}\",\n",
485
+ " ], # the prompt templates to choose from\n",
486
+ " stop=[[\"\\nclass\", \"\\ndef\", \"\\nif\", \"\\nprint\"], None], # the stop sequences\n",
487
+ " config_list=endpoint_list, # optional: a list of endpoints to use\n",
488
+ " allow_format_str_template=True, # whether to allow format string template\n",
489
+ ")\n"
490
+ ]
491
+ },
492
+ {
493
+ "attachments": {},
494
+ "cell_type": "markdown",
495
+ "metadata": {},
496
+ "source": [
497
+ "### Output tuning results\n",
498
+ "\n",
499
+ "After the tuning, we can print out the config and the result found by autogen:"
500
+ ]
501
+ },
502
+ {
503
+ "cell_type": "code",
504
+ "execution_count": 9,
505
+ "metadata": {
506
+ "execution": {
507
+ "iopub.execute_input": "2023-02-24T23:26:38.352710Z",
508
+ "iopub.status.busy": "2023-02-24T23:26:38.352378Z",
509
+ "iopub.status.idle": "2023-02-24T23:26:38.356939Z",
510
+ "shell.execute_reply": "2023-02-24T23:26:38.356217Z"
511
+ }
512
+ },
513
+ "outputs": [
514
+ {
515
+ "name": "stdout",
516
+ "output_type": "stream",
517
+ "text": [
518
+ "optimized config {'prompt': '# Python 3{definition}', 'stop': ['\\nclass', '\\ndef', '\\nif', '\\nprint'], 'model': 'text-davinci-003', 'max_tokens': 148, 'n': 27, 'top_p': 0.755486898036596}\n",
519
+ "best result on tuning data {'index_selected': 2.35, 'succeed_assertions': 0.95, 'success': 0.65, 'gen_cost': 0.000460625, 'assertions': 'assert vowels_count(\"abcde\") == 2\\nassert vowels_count(\"ACEDY\") == 3', 'total_cost': 0.8658043000000002, 'cost': 0.8357800000000002, 'inference_cost': 0.04093000000000001, 'training_iteration': 0, 'config': {'prompt': 1, 'stop': 0, 'subspace': {'model': 'text-davinci-003', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}}, 'config/prompt': 1, 'config/stop': 0, 'config/subspace': {'model': 'text-davinci-003', 'max_tokens': 148, 'temperature_or_top_p': {'top_p': 0.755486898036596}, 'n': 27}, 'experiment_tag': 'exp', 'time_total_s': 62.81497287750244}\n"
520
+ ]
521
+ }
522
+ ],
523
+ "source": [
524
+ "print(\"optimized config\", config)\n",
525
+ "print(\"best result on tuning data\", analysis.best_result)"
526
+ ]
527
+ },
528
+ {
529
+ "attachments": {},
530
+ "cell_type": "markdown",
531
+ "metadata": {
532
+ "slideshow": {
533
+ "slide_type": "slide"
534
+ }
535
+ },
536
+ "source": [
537
+ "### Make a request with the tuned config\n",
538
+ "\n",
539
+ "We can apply the tuned config on the request for an example task:"
540
+ ]
541
+ },
542
+ {
543
+ "cell_type": "code",
544
+ "execution_count": 10,
545
+ "metadata": {
546
+ "execution": {
547
+ "iopub.execute_input": "2023-02-24T23:26:38.359902Z",
548
+ "iopub.status.busy": "2023-02-24T23:26:38.359506Z",
549
+ "iopub.status.idle": "2023-02-24T23:26:39.343921Z",
550
+ "shell.execute_reply": "2023-02-24T23:26:39.343051Z"
551
+ },
552
+ "slideshow": {
553
+ "slide_type": "subslide"
554
+ },
555
+ "tags": []
556
+ },
557
+ "outputs": [
558
+ {
559
+ "name": "stdout",
560
+ "output_type": "stream",
561
+ "text": [
562
+ "{\n",
563
+ " \"id\": \"cmpl-7hsFhPX6faeWYaT4y3C7IkQAgNbZR\",\n",
564
+ " \"warning\": \"This model version is deprecated. Migrate before January 4, 2024 to avoid disruption of service. Learn more https://platform.openai.com/docs/deprecations\",\n",
565
+ " \"object\": \"text_completion\",\n",
566
+ " \"created\": 1690691005,\n",
567
+ " \"model\": \"text-davinci-003\",\n",
568
+ " \"choices\": [\n",
569
+ " {\n",
570
+ " \"text\": \" results = []\\n for i in range(len(game)):\\n if game[i] == guess[i]:\\n results.append(0)\\n else:\\n results.append(abs(game[i]-guess[i]))\\n return results\",\n",
571
+ " \"index\": 0,\n",
572
+ " \"logprobs\": null,\n",
573
+ " \"finish_reason\": \"stop\"\n",
574
+ " },\n",
575
+ " {\n",
576
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
577
+ " \"index\": 1,\n",
578
+ " \"logprobs\": null,\n",
579
+ " \"finish_reason\": \"stop\"\n",
580
+ " },\n",
581
+ " {\n",
582
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i]-guess[i]))\\n return result\",\n",
583
+ " \"index\": 2,\n",
584
+ " \"logprobs\": null,\n",
585
+ " \"finish_reason\": \"stop\"\n",
586
+ " },\n",
587
+ " {\n",
588
+ " \"text\": \" res = []\\n for i in range(len(game)):\\n res.append(abs(game[i]-guess[i]))\\n return res\",\n",
589
+ " \"index\": 3,\n",
590
+ " \"logprobs\": null,\n",
591
+ " \"finish_reason\": \"stop\"\n",
592
+ " },\n",
593
+ " {\n",
594
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
595
+ " \"index\": 4,\n",
596
+ " \"logprobs\": null,\n",
597
+ " \"finish_reason\": \"stop\"\n",
598
+ " },\n",
599
+ " {\n",
600
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
601
+ " \"index\": 5,\n",
602
+ " \"logprobs\": null,\n",
603
+ " \"finish_reason\": \"stop\"\n",
604
+ " },\n",
605
+ " {\n",
606
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
607
+ " \"index\": 6,\n",
608
+ " \"logprobs\": null,\n",
609
+ " \"finish_reason\": \"stop\"\n",
610
+ " },\n",
611
+ " {\n",
612
+ " \"text\": \" results = []\\n for i in range(len(game)):\\n results.append(abs(game[i] - guess[i]))\\n return results\",\n",
613
+ " \"index\": 7,\n",
614
+ " \"logprobs\": null,\n",
615
+ " \"finish_reason\": \"stop\"\n",
616
+ " },\n",
617
+ " {\n",
618
+ " \"text\": \" res = []\\n for i in range(len(game)):\\n res.append(abs(game[i]-guess[i]))\\n return res\",\n",
619
+ " \"index\": 8,\n",
620
+ " \"logprobs\": null,\n",
621
+ " \"finish_reason\": \"stop\"\n",
622
+ " },\n",
623
+ " {\n",
624
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i]-guess[i]))\\n return result\",\n",
625
+ " \"index\": 9,\n",
626
+ " \"logprobs\": null,\n",
627
+ " \"finish_reason\": \"stop\"\n",
628
+ " },\n",
629
+ " {\n",
630
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
631
+ " \"index\": 10,\n",
632
+ " \"logprobs\": null,\n",
633
+ " \"finish_reason\": \"stop\"\n",
634
+ " },\n",
635
+ " {\n",
636
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
637
+ " \"index\": 11,\n",
638
+ " \"logprobs\": null,\n",
639
+ " \"finish_reason\": \"stop\"\n",
640
+ " },\n",
641
+ " {\n",
642
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n if game[i] == guess[i]:\\n result.append(0)\\n else:\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
643
+ " \"index\": 12,\n",
644
+ " \"logprobs\": null,\n",
645
+ " \"finish_reason\": \"stop\"\n",
646
+ " },\n",
647
+ " {\n",
648
+ " \"text\": \" # set up empty list to store differences\\n diff = []\\n # iterate through the game list and guess list\\n for i in range(len(game)):\\n # check if the guess is equal to the game\\n if game[i] == guess[i]:\\n # if so, append 0 to the diff list\\n diff.append(0)\\n # otherwise, calculate the difference between the guess and the game\\n else:\\n diff.append(abs(game[i]-guess[i]))\\n # return the diff list\\n return diff\",\n",
649
+ " \"index\": 13,\n",
650
+ " \"logprobs\": null,\n",
651
+ " \"finish_reason\": \"stop\"\n",
652
+ " },\n",
653
+ " {\n",
654
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i]-guess[i]))\\n return result\",\n",
655
+ " \"index\": 14,\n",
656
+ " \"logprobs\": null,\n",
657
+ " \"finish_reason\": \"stop\"\n",
658
+ " },\n",
659
+ " {\n",
660
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n if game[i] == guess[i]:\\n result.append(0)\\n else:\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
661
+ " \"index\": 15,\n",
662
+ " \"logprobs\": null,\n",
663
+ " \"finish_reason\": \"stop\"\n",
664
+ " },\n",
665
+ " {\n",
666
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
667
+ " \"index\": 16,\n",
668
+ " \"logprobs\": null,\n",
669
+ " \"finish_reason\": \"stop\"\n",
670
+ " },\n",
671
+ " {\n",
672
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
673
+ " \"index\": 17,\n",
674
+ " \"logprobs\": null,\n",
675
+ " \"finish_reason\": \"stop\"\n",
676
+ " },\n",
677
+ " {\n",
678
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
679
+ " \"index\": 18,\n",
680
+ " \"logprobs\": null,\n",
681
+ " \"finish_reason\": \"stop\"\n",
682
+ " },\n",
683
+ " {\n",
684
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
685
+ " \"index\": 19,\n",
686
+ " \"logprobs\": null,\n",
687
+ " \"finish_reason\": \"stop\"\n",
688
+ " },\n",
689
+ " {\n",
690
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
691
+ " \"index\": 20,\n",
692
+ " \"logprobs\": null,\n",
693
+ " \"finish_reason\": \"stop\"\n",
694
+ " },\n",
695
+ " {\n",
696
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
697
+ " \"index\": 21,\n",
698
+ " \"logprobs\": null,\n",
699
+ " \"finish_reason\": \"stop\"\n",
700
+ " },\n",
701
+ " {\n",
702
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
703
+ " \"index\": 22,\n",
704
+ " \"logprobs\": null,\n",
705
+ " \"finish_reason\": \"stop\"\n",
706
+ " },\n",
707
+ " {\n",
708
+ " \"text\": \" # your code here\\n result = []\\n for i in range(len(game)):\\n if game[i] == guess[i]:\\n result.append(0)\\n else:\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
709
+ " \"index\": 23,\n",
710
+ " \"logprobs\": null,\n",
711
+ " \"finish_reason\": \"stop\"\n",
712
+ " },\n",
713
+ " {\n",
714
+ " \"text\": \" results = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n results.append(diff)\\n return results\",\n",
715
+ " \"index\": 24,\n",
716
+ " \"logprobs\": null,\n",
717
+ " \"finish_reason\": \"stop\"\n",
718
+ " },\n",
719
+ " {\n",
720
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n diff = abs(game[i] - guess[i])\\n result.append(diff)\\n return result\",\n",
721
+ " \"index\": 25,\n",
722
+ " \"logprobs\": null,\n",
723
+ " \"finish_reason\": \"stop\"\n",
724
+ " },\n",
725
+ " {\n",
726
+ " \"text\": \" result = []\\n for i in range(len(game)):\\n result.append(abs(game[i] - guess[i]))\\n return result\",\n",
727
+ " \"index\": 26,\n",
728
+ " \"logprobs\": null,\n",
729
+ " \"finish_reason\": \"stop\"\n",
730
+ " }\n",
731
+ " ],\n",
732
+ " \"usage\": {\n",
733
+ " \"prompt_tokens\": 243,\n",
734
+ " \"completion_tokens\": 1264,\n",
735
+ " \"total_tokens\": 1507\n",
736
+ " },\n",
737
+ " \"cost\": 0.03014,\n",
738
+ " \"config_id\": 0,\n",
739
+ " \"pass_filter\": true\n",
740
+ "}\n",
741
+ "{'index_selected': 0, 'succeed_assertions': True, 'success': True, 'gen_cost': 0.000702, 'assertions': 'assert compare([1,2,3,4,5,1],[1,2,3,4,2,-2]) == [0,0,0,0,3,3]\\nassert compare([0,5,0,0,0,4],[4,1,1,0,0,-2]) == [4,4,1,0,0,6]'}\n"
742
+ ]
743
+ }
744
+ ],
745
+ "source": [
746
+ "response = autogen.Completion.create(context=tune_data[1], config_list=endpoint_list, **config)\n",
747
+ "print(response)\n",
748
+ "print(eval_with_generated_assertions(autogen.Completion.extract_text(response), **tune_data[1]))\n"
749
+ ]
750
+ },
751
+ {
752
+ "attachments": {},
753
+ "cell_type": "markdown",
754
+ "metadata": {},
755
+ "source": [
756
+ "### Evaluate the success rate on the test data\n",
757
+ "\n",
758
+ "You can use `autogen.Completion.test` to evaluate the performance of an entire dataset with the tuned config. The following code will take a while to evaluate all the 144 test data instances. The cost is about $6 if you uncomment it and run it."
759
+ ]
760
+ },
761
+ {
762
+ "cell_type": "code",
763
+ "execution_count": 12,
764
+ "metadata": {
765
+ "execution": {
766
+ "iopub.execute_input": "2023-02-24T23:26:39.347295Z",
767
+ "iopub.status.busy": "2023-02-24T23:26:39.346994Z",
768
+ "iopub.status.idle": "2023-02-24T23:29:27.160335Z",
769
+ "shell.execute_reply": "2023-02-24T23:29:27.159519Z"
770
+ }
771
+ },
772
+ "outputs": [
773
+ {
774
+ "name": "stdout",
775
+ "output_type": "stream",
776
+ "text": [
777
+ "performance on test data with the tuned config: {'index_selected': 5.222222222222222, 'succeed_assertions': 0.8402777777777778, 'success': 0.7569444444444444, 'gen_cost': 0.00044632638888888885, 'cost': 5.704979999999999, 'inference_cost': 0.03961791666666666}\n"
778
+ ]
779
+ }
780
+ ],
781
+ "source": [
782
+ "# result = autogen.Completion.test(test_data, config_list=endpoint_list, **config)\n",
783
+ "# print(\"performance on test data with the tuned config:\", result)"
784
+ ]
785
+ },
786
+ {
787
+ "attachments": {},
788
+ "cell_type": "markdown",
789
+ "metadata": {},
790
+ "source": [
791
+ "The result will vary with the inference budget and optimization budget.\n"
792
+ ]
793
+ }
794
+ ],
795
+ "metadata": {
796
+ "kernelspec": {
797
+ "display_name": "Python 3",
798
+ "language": "python",
799
+ "name": "python3"
800
+ },
801
+ "language_info": {
802
+ "codemirror_mode": {
803
+ "name": "ipython",
804
+ "version": 3
805
+ },
806
+ "file_extension": ".py",
807
+ "mimetype": "text/x-python",
808
+ "name": "python",
809
+ "nbconvert_exporter": "python",
810
+ "pygments_lexer": "ipython3",
811
+ "version": "3.9.16"
812
+ },
813
+ "vscode": {
814
+ "interpreter": {
815
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
816
+ }
817
+ },
818
+ "widgets": {
819
+ "application/vnd.jupyter.widget-state+json": {
820
+ "state": {
821
+ "24dd93300e0442788ee6cc1310e5bf14": {
822
+ "model_module": "@jupyter-widgets/controls",
823
+ "model_module_version": "2.0.0",
824
+ "model_name": "HTMLStyleModel",
825
+ "state": {
826
+ "_model_module": "@jupyter-widgets/controls",
827
+ "_model_module_version": "2.0.0",
828
+ "_model_name": "HTMLStyleModel",
829
+ "_view_count": null,
830
+ "_view_module": "@jupyter-widgets/base",
831
+ "_view_module_version": "2.0.0",
832
+ "_view_name": "StyleView",
833
+ "background": null,
834
+ "description_width": "",
835
+ "font_size": null,
836
+ "text_color": null
837
+ }
838
+ },
839
+ "35cd066a31b242bb87b2c106ee72e5f2": {
840
+ "model_module": "@jupyter-widgets/controls",
841
+ "model_module_version": "2.0.0",
842
+ "model_name": "HBoxModel",
843
+ "state": {
844
+ "_dom_classes": [],
845
+ "_model_module": "@jupyter-widgets/controls",
846
+ "_model_module_version": "2.0.0",
847
+ "_model_name": "HBoxModel",
848
+ "_view_count": null,
849
+ "_view_module": "@jupyter-widgets/controls",
850
+ "_view_module_version": "2.0.0",
851
+ "_view_name": "HBoxView",
852
+ "box_style": "",
853
+ "children": [
854
+ "IPY_MODEL_8e7ee7687a99410d88a98a74ecfcea99",
855
+ "IPY_MODEL_421e02a11a974b40b3ddb75382b3b640",
856
+ "IPY_MODEL_77db9797e78b49438d21c5c8da34b4cb"
857
+ ],
858
+ "layout": "IPY_MODEL_47d3046236a54b0e8f9ae455a82c7e0b",
859
+ "tabbable": null,
860
+ "tooltip": null
861
+ }
862
+ },
863
+ "3d5d106a38954af2bb3bde5777702f4e": {
864
+ "model_module": "@jupyter-widgets/controls",
865
+ "model_module_version": "2.0.0",
866
+ "model_name": "HTMLStyleModel",
867
+ "state": {
868
+ "_model_module": "@jupyter-widgets/controls",
869
+ "_model_module_version": "2.0.0",
870
+ "_model_name": "HTMLStyleModel",
871
+ "_view_count": null,
872
+ "_view_module": "@jupyter-widgets/base",
873
+ "_view_module_version": "2.0.0",
874
+ "_view_name": "StyleView",
875
+ "background": null,
876
+ "description_width": "",
877
+ "font_size": null,
878
+ "text_color": null
879
+ }
880
+ },
881
+ "3e1ebb31412443b0bca86a301cbdac11": {
882
+ "model_module": "@jupyter-widgets/controls",
883
+ "model_module_version": "2.0.0",
884
+ "model_name": "ProgressStyleModel",
885
+ "state": {
886
+ "_model_module": "@jupyter-widgets/controls",
887
+ "_model_module_version": "2.0.0",
888
+ "_model_name": "ProgressStyleModel",
889
+ "_view_count": null,
890
+ "_view_module": "@jupyter-widgets/base",
891
+ "_view_module_version": "2.0.0",
892
+ "_view_name": "StyleView",
893
+ "bar_color": null,
894
+ "description_width": ""
895
+ }
896
+ },
897
+ "421e02a11a974b40b3ddb75382b3b640": {
898
+ "model_module": "@jupyter-widgets/controls",
899
+ "model_module_version": "2.0.0",
900
+ "model_name": "FloatProgressModel",
901
+ "state": {
902
+ "_dom_classes": [],
903
+ "_model_module": "@jupyter-widgets/controls",
904
+ "_model_module_version": "2.0.0",
905
+ "_model_name": "FloatProgressModel",
906
+ "_view_count": null,
907
+ "_view_module": "@jupyter-widgets/controls",
908
+ "_view_module_version": "2.0.0",
909
+ "_view_name": "ProgressView",
910
+ "bar_style": "success",
911
+ "description": "",
912
+ "description_allow_html": false,
913
+ "layout": "IPY_MODEL_e6398d4027c9459a97965b9d91ae484f",
914
+ "max": 1,
915
+ "min": 0,
916
+ "orientation": "horizontal",
917
+ "style": "IPY_MODEL_3e1ebb31412443b0bca86a301cbdac11",
918
+ "tabbable": null,
919
+ "tooltip": null,
920
+ "value": 1
921
+ }
922
+ },
923
+ "47d3046236a54b0e8f9ae455a82c7e0b": {
924
+ "model_module": "@jupyter-widgets/base",
925
+ "model_module_version": "2.0.0",
926
+ "model_name": "LayoutModel",
927
+ "state": {
928
+ "_model_module": "@jupyter-widgets/base",
929
+ "_model_module_version": "2.0.0",
930
+ "_model_name": "LayoutModel",
931
+ "_view_count": null,
932
+ "_view_module": "@jupyter-widgets/base",
933
+ "_view_module_version": "2.0.0",
934
+ "_view_name": "LayoutView",
935
+ "align_content": null,
936
+ "align_items": null,
937
+ "align_self": null,
938
+ "border_bottom": null,
939
+ "border_left": null,
940
+ "border_right": null,
941
+ "border_top": null,
942
+ "bottom": null,
943
+ "display": null,
944
+ "flex": null,
945
+ "flex_flow": null,
946
+ "grid_area": null,
947
+ "grid_auto_columns": null,
948
+ "grid_auto_flow": null,
949
+ "grid_auto_rows": null,
950
+ "grid_column": null,
951
+ "grid_gap": null,
952
+ "grid_row": null,
953
+ "grid_template_areas": null,
954
+ "grid_template_columns": null,
955
+ "grid_template_rows": null,
956
+ "height": null,
957
+ "justify_content": null,
958
+ "justify_items": null,
959
+ "left": null,
960
+ "margin": null,
961
+ "max_height": null,
962
+ "max_width": null,
963
+ "min_height": null,
964
+ "min_width": null,
965
+ "object_fit": null,
966
+ "object_position": null,
967
+ "order": null,
968
+ "overflow": null,
969
+ "padding": null,
970
+ "right": null,
971
+ "top": null,
972
+ "visibility": null,
973
+ "width": null
974
+ }
975
+ },
976
+ "754800f7feb04acea977696e4787d1ff": {
977
+ "model_module": "@jupyter-widgets/base",
978
+ "model_module_version": "2.0.0",
979
+ "model_name": "LayoutModel",
980
+ "state": {
981
+ "_model_module": "@jupyter-widgets/base",
982
+ "_model_module_version": "2.0.0",
983
+ "_model_name": "LayoutModel",
984
+ "_view_count": null,
985
+ "_view_module": "@jupyter-widgets/base",
986
+ "_view_module_version": "2.0.0",
987
+ "_view_name": "LayoutView",
988
+ "align_content": null,
989
+ "align_items": null,
990
+ "align_self": null,
991
+ "border_bottom": null,
992
+ "border_left": null,
993
+ "border_right": null,
994
+ "border_top": null,
995
+ "bottom": null,
996
+ "display": null,
997
+ "flex": null,
998
+ "flex_flow": null,
999
+ "grid_area": null,
1000
+ "grid_auto_columns": null,
1001
+ "grid_auto_flow": null,
1002
+ "grid_auto_rows": null,
1003
+ "grid_column": null,
1004
+ "grid_gap": null,
1005
+ "grid_row": null,
1006
+ "grid_template_areas": null,
1007
+ "grid_template_columns": null,
1008
+ "grid_template_rows": null,
1009
+ "height": null,
1010
+ "justify_content": null,
1011
+ "justify_items": null,
1012
+ "left": null,
1013
+ "margin": null,
1014
+ "max_height": null,
1015
+ "max_width": null,
1016
+ "min_height": null,
1017
+ "min_width": null,
1018
+ "object_fit": null,
1019
+ "object_position": null,
1020
+ "order": null,
1021
+ "overflow": null,
1022
+ "padding": null,
1023
+ "right": null,
1024
+ "top": null,
1025
+ "visibility": null,
1026
+ "width": null
1027
+ }
1028
+ },
1029
+ "77db9797e78b49438d21c5c8da34b4cb": {
1030
+ "model_module": "@jupyter-widgets/controls",
1031
+ "model_module_version": "2.0.0",
1032
+ "model_name": "HTMLModel",
1033
+ "state": {
1034
+ "_dom_classes": [],
1035
+ "_model_module": "@jupyter-widgets/controls",
1036
+ "_model_module_version": "2.0.0",
1037
+ "_model_name": "HTMLModel",
1038
+ "_view_count": null,
1039
+ "_view_module": "@jupyter-widgets/controls",
1040
+ "_view_module_version": "2.0.0",
1041
+ "_view_name": "HTMLView",
1042
+ "description": "",
1043
+ "description_allow_html": false,
1044
+ "layout": "IPY_MODEL_7b6c4e1c11e249409a1edcd63be450d8",
1045
+ "placeholder": "​",
1046
+ "style": "IPY_MODEL_3d5d106a38954af2bb3bde5777702f4e",
1047
+ "tabbable": null,
1048
+ "tooltip": null,
1049
+ "value": " 1/1 [00:00&lt;00:00, 44.40it/s]"
1050
+ }
1051
+ },
1052
+ "7b6c4e1c11e249409a1edcd63be450d8": {
1053
+ "model_module": "@jupyter-widgets/base",
1054
+ "model_module_version": "2.0.0",
1055
+ "model_name": "LayoutModel",
1056
+ "state": {
1057
+ "_model_module": "@jupyter-widgets/base",
1058
+ "_model_module_version": "2.0.0",
1059
+ "_model_name": "LayoutModel",
1060
+ "_view_count": null,
1061
+ "_view_module": "@jupyter-widgets/base",
1062
+ "_view_module_version": "2.0.0",
1063
+ "_view_name": "LayoutView",
1064
+ "align_content": null,
1065
+ "align_items": null,
1066
+ "align_self": null,
1067
+ "border_bottom": null,
1068
+ "border_left": null,
1069
+ "border_right": null,
1070
+ "border_top": null,
1071
+ "bottom": null,
1072
+ "display": null,
1073
+ "flex": null,
1074
+ "flex_flow": null,
1075
+ "grid_area": null,
1076
+ "grid_auto_columns": null,
1077
+ "grid_auto_flow": null,
1078
+ "grid_auto_rows": null,
1079
+ "grid_column": null,
1080
+ "grid_gap": null,
1081
+ "grid_row": null,
1082
+ "grid_template_areas": null,
1083
+ "grid_template_columns": null,
1084
+ "grid_template_rows": null,
1085
+ "height": null,
1086
+ "justify_content": null,
1087
+ "justify_items": null,
1088
+ "left": null,
1089
+ "margin": null,
1090
+ "max_height": null,
1091
+ "max_width": null,
1092
+ "min_height": null,
1093
+ "min_width": null,
1094
+ "object_fit": null,
1095
+ "object_position": null,
1096
+ "order": null,
1097
+ "overflow": null,
1098
+ "padding": null,
1099
+ "right": null,
1100
+ "top": null,
1101
+ "visibility": null,
1102
+ "width": null
1103
+ }
1104
+ },
1105
+ "8e7ee7687a99410d88a98a74ecfcea99": {
1106
+ "model_module": "@jupyter-widgets/controls",
1107
+ "model_module_version": "2.0.0",
1108
+ "model_name": "HTMLModel",
1109
+ "state": {
1110
+ "_dom_classes": [],
1111
+ "_model_module": "@jupyter-widgets/controls",
1112
+ "_model_module_version": "2.0.0",
1113
+ "_model_name": "HTMLModel",
1114
+ "_view_count": null,
1115
+ "_view_module": "@jupyter-widgets/controls",
1116
+ "_view_module_version": "2.0.0",
1117
+ "_view_name": "HTMLView",
1118
+ "description": "",
1119
+ "description_allow_html": false,
1120
+ "layout": "IPY_MODEL_754800f7feb04acea977696e4787d1ff",
1121
+ "placeholder": "​",
1122
+ "style": "IPY_MODEL_24dd93300e0442788ee6cc1310e5bf14",
1123
+ "tabbable": null,
1124
+ "tooltip": null,
1125
+ "value": "100%"
1126
+ }
1127
+ },
1128
+ "e6398d4027c9459a97965b9d91ae484f": {
1129
+ "model_module": "@jupyter-widgets/base",
1130
+ "model_module_version": "2.0.0",
1131
+ "model_name": "LayoutModel",
1132
+ "state": {
1133
+ "_model_module": "@jupyter-widgets/base",
1134
+ "_model_module_version": "2.0.0",
1135
+ "_model_name": "LayoutModel",
1136
+ "_view_count": null,
1137
+ "_view_module": "@jupyter-widgets/base",
1138
+ "_view_module_version": "2.0.0",
1139
+ "_view_name": "LayoutView",
1140
+ "align_content": null,
1141
+ "align_items": null,
1142
+ "align_self": null,
1143
+ "border_bottom": null,
1144
+ "border_left": null,
1145
+ "border_right": null,
1146
+ "border_top": null,
1147
+ "bottom": null,
1148
+ "display": null,
1149
+ "flex": null,
1150
+ "flex_flow": null,
1151
+ "grid_area": null,
1152
+ "grid_auto_columns": null,
1153
+ "grid_auto_flow": null,
1154
+ "grid_auto_rows": null,
1155
+ "grid_column": null,
1156
+ "grid_gap": null,
1157
+ "grid_row": null,
1158
+ "grid_template_areas": null,
1159
+ "grid_template_columns": null,
1160
+ "grid_template_rows": null,
1161
+ "height": null,
1162
+ "justify_content": null,
1163
+ "justify_items": null,
1164
+ "left": null,
1165
+ "margin": null,
1166
+ "max_height": null,
1167
+ "max_width": null,
1168
+ "min_height": null,
1169
+ "min_width": null,
1170
+ "object_fit": null,
1171
+ "object_position": null,
1172
+ "order": null,
1173
+ "overflow": null,
1174
+ "padding": null,
1175
+ "right": null,
1176
+ "top": null,
1177
+ "visibility": null,
1178
+ "width": null
1179
+ }
1180
+ }
1181
+ },
1182
+ "version_major": 2,
1183
+ "version_minor": 0
1184
+ }
1185
+ }
1186
+ },
1187
+ "nbformat": 4,
1188
+ "nbformat_minor": 2
1189
+ }