QscQ committed
Commit 8d1494b · verified · 1 Parent(s): 034cfed

Update docs/function_call_guide.md

Files changed (1)
  1. docs/function_call_guide.md +218 -64
docs/function_call_guide.md CHANGED
@@ -8,9 +8,122 @@ The MiniMax-M1 model supports function calling capabilities, enabling the model
 
 ## 🚀 Quick Start
 
- ### Using Chat Template
 
- MiniMax-M1 uses a specific chat template format to handle function calls. The chat template is defined in `tokenizer_config.json`, and you can use it in your code through the template.
 
 ```python
 from transformers import AutoTokenizer
@@ -18,21 +131,19 @@ from transformers import AutoTokenizer
 def get_default_tools():
     return [
         {
-         {
-             "name": "get_current_weather",
-             "description": "Get the latest weather for a location",
-             "parameters": {
-                 "type": "object",
-                 "properties": {
-                     "location": {
-                         "type": "string",
-                         "description": "A certain city, such as Beijing, Shanghai"
-                     }
-                 },
-             }
-             "required": ["location"],
-             "type": "object"
         }
     ]
 
@@ -54,6 +165,27 @@ text = tokenizer.apply_chat_template(
     add_generation_prompt=True,
     tools=tools
 )
 ```
 
 ## 🛠️ Function Call Definition
@@ -102,22 +234,21 @@ Function calls need to be defined in the `tools` field of the request body. Each
 When processed internally by the model, function definitions are converted to a special format and concatenated to the input text:
 
 ```
- ]~!b[]~b]system ai_setting=MiniMax AI
- MiniMax AI is an AI assistant independently developed by MiniMax. [e~[
- ]~b]system tool_setting=tools
 You are provided with these tools:
 <tools>
- {"name": "search_web", "description": "Search function.", "parameters": {"properties": {"query_list": {"description": "Keywords for search, with list element count of 1.", "items": {"type": "string"}, "type": "array"}, "query_tag": {"description": "Classification of the query", "items": {"type": "string"}, "type": "array"}}, "required": ["query_list", "query_tag"], "type": "object"}}
 </tools>
-
 If you need to call tools, please respond with <tool_calls></tool_calls> XML tags, and provide tool-name and json-object of arguments, following the format below:
 <tool_calls>
 {"name": <tool-name>, "arguments": <args-json-object>}
 ...
- </tool_calls>[e~[
- ]~b]user name=User
- When were the most recent launch events for OpenAI and Gemini?[e~[
- ]~b]ai name=MiniMax AI
 ```
 
 ### Model Output Format
@@ -134,16 +265,15 @@ Okay, I will search for the OpenAI and Gemini latest release.
 </tool_calls>
 ```
 
- ## 📥 Function Call Result Processing
 
 ### Parsing Function Calls
 
- You can use the following code to parse function calls from the model output:
 
 ```python
 import re
 import json
-
 def parse_function_calls(content: str):
     """
     Parse function calls from model output
@@ -193,23 +323,33 @@ def execute_function_call(function_name: str, arguments: dict):
         # Build function execution result
         return {
             "role": "tool",
-             "name": function_name,
-             "content": json.dumps({
-                 "location": location,
-                 "temperature": "25",
-                 "unit": "celsius",
-                 "weather": "Sunny"
-             }, ensure_ascii=False)
-         }
     elif function_name == "search_web":
         query_list = arguments.get("query_list", [])
         query_tag = arguments.get("query_tag", [])
         # Simulate search results
         return {
             "role": "tool",
-             "name": function_name,
-             "content": f"Search keywords: {query_list}, Categories: {query_tag}\nSearch results: Relevant information found"
-         }
 
     return None
 ```
@@ -220,51 +360,65 @@ After successfully parsing function calls, you should add the function execution
 
 #### Single Result
 
- If the model decides to call `search_web`, we suggest you to return the function result in the following format, with the `name` field set to the specific tool name.
 
 ```json
 {
-   "data": [
-     {
-       "role": "tool",
-       "name": "search_web",
-       "content": "search_result"
-     }
   ]
 }
 ```
 
 Corresponding model input format:
 ```
- ]~b]tool name=search_web
- search_result[e~[
 ```
 
- #### Multiple Result
- If the model decides to call `search_web` and `get_current_weather` at the same time, we suggest you to return the multiple function results in the following format, with the `name` field set to "tools", and use the `content` field to contain multiple results.
-
 
 ```json
 {
-   "data": [
-     {
-       "role": "tool",
-       "name": "tools",
-       "content": "Tool name: search_web\nTool result: test_result1\n\nTool name: get_current_weather\nTool result: test_result2"
-     }
   ]
 }
 ```
 
 Corresponding model input format:
 ```
- ]~b]tool name=tools
- Tool name: search_web
- Tool result: test_result1
-
- Tool name: get_current_weather
- Tool result: test_result2[e~[
 ```
 
- While we suggest following the above formats, as long as the model input is easy to understand, the specific values of `name` and `content` is entirely up to the caller.
 
 
 ## 🚀 Quick Start
 
+ ### Using vLLM for Function Calls (Recommended)
+
+ For production deployment, MiniMax-M1 integrates a dedicated `tool_call_parser=minimax` parser with vLLM, providing native function calling (tool calling) in the style of the OpenAI API without any additional regex parsing of the model output.
+
+ #### Environment Setup and vLLM Recompilation
+
+ Because this feature has not yet been released in the PyPI package, vLLM must be built from source. The following example is based on the official vLLM Docker image `vllm/vllm-openai:v0.8.3`:
+
+ ```bash
+ IMAGE=vllm/vllm-openai:v0.8.3
+ DOCKER_RUN_CMD="--network=host --privileged --ipc=host --ulimit memlock=-1 --shm-size=32gb --rm --gpus all --ulimit stack=67108864"
+
+ # Run the container, mounting the model and code directories
+ sudo docker run -it -v $MODEL_DIR:$MODEL_DIR \
+   -v $CODE_DIR:$CODE_DIR \
+   --name vllm_function_call \
+   $DOCKER_RUN_CMD \
+   --entrypoint /bin/bash \
+   $IMAGE
+ ```
+
+ #### Compiling vLLM Source Code
+
+ After entering the container, run the following commands to fetch the source code and reinstall vLLM:
+
+ ```bash
+ cd $CODE_DIR
+ git clone https://github.com/vllm-project/vllm.git
+ cd vllm
+ pip install -e .
+ ```
+
+ #### Starting the vLLM API Service
+
+ ```bash
+ export SAFETENSORS_FAST_GPU=1
+ export VLLM_USE_V1=0
+
+ python3 -m vllm.entrypoints.openai.api_server \
+   --model MiniMax-M1-80k \
+   --tensor-parallel-size 8 \
+   --trust-remote-code \
+   --quantization experts_int8 \
+   --enable-auto-tool-choice \
+   --tool-call-parser minimax \
+   --chat-template vllm/examples/tool_chat_template_minimax_m1.jinja \
+   --max_model_len 4096 \
+   --dtype bfloat16 \
+   --gpu-memory-utilization 0.85
+ ```
+
+ **⚠️ Note:**
+ - `--tool-call-parser minimax` is the key parameter that enables the MiniMax-M1 custom parser
+ - `--enable-auto-tool-choice` enables automatic tool selection
+ - the `--chat-template` file must be adapted to the tool calling format
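Once the server reports ready, a quick way to confirm that the OpenAI-compatible endpoint is reachable is to list the served models. This is a minimal check, assuming the default port 8000 used throughout this guide:

```python
import requests

# Query the OpenAI-compatible /v1/models endpoint exposed by vLLM;
# the response should list the model passed via --model above.
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])
```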
+
+ #### Function Call Test Script Example
+
+ The following Python script implements a weather-query function call example using the OpenAI SDK:
+
+ ```python
+ from openai import OpenAI
+ import json
+
+ client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
+
+ def get_weather(location: str, unit: str):
+     return f"Getting the weather for {location} in {unit}..."
+
+ tool_functions = {"get_weather": get_weather}
+
+ tools = [{
+     "type": "function",
+     "function": {
+         "name": "get_weather",
+         "description": "Get the current weather in a given location",
+         "parameters": {
+             "type": "object",
+             "properties": {
+                 "location": {"type": "string", "description": "City and state, e.g., 'San Francisco, CA'"},
+                 "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
+             },
+             "required": ["location", "unit"]
+         }
+     }
+ }]
+
+ response = client.chat.completions.create(
+     model=client.models.list().data[0].id,
+     messages=[{"role": "user", "content": "What's the weather like in San Francisco? use celsius."}],
+     tools=tools,
+     tool_choice="auto"
+ )
+
+ print(response)
+
+ tool_call = response.choices[0].message.tool_calls[0].function
+ print(f"Function called: {tool_call.name}")
+ print(f"Arguments: {tool_call.arguments}")
+ print(f"Result: {get_weather(**json.loads(tool_call.arguments))}")
+ ```
+
+ **Output Example:**
+ ```
+ Function called: get_weather
+ Arguments: {"location": "San Francisco, CA", "unit": "celsius"}
+ Result: Getting the weather for San Francisco, CA in celsius...
+ ```
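The script stops after running the tool locally. With the OpenAI-compatible API, the usual next step is to send the tool result back so the model can produce a final answer: append the assistant message containing the tool call plus a `role: "tool"` message with the result, then call the endpoint again. A minimal sketch, reusing `client`, `tools`, `tool_functions`, and `response` from the script above (whether this second round works end-to-end depends on the chat template's handling of tool messages):

```python
# Execute the tool requested by the model and feed the result back
assistant_msg = response.choices[0].message
call = assistant_msg.tool_calls[0]
result = tool_functions[call.function.name](**json.loads(call.function.arguments))

followup = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[
        {"role": "user", "content": "What's the weather like in San Francisco? use celsius."},
        # Echo the assistant's tool call so the model sees its own request
        {"role": "assistant", "tool_calls": [
            {"id": call.id, "type": "function",
             "function": {"name": call.function.name, "arguments": call.function.arguments}}
        ]},
        # Return the tool result, linked by tool_call_id
        {"role": "tool", "tool_call_id": call.id, "content": result},
    ],
    tools=tools,
)
print(followup.choices[0].message.content)
```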
+
+ ### Manual Parsing of Model Output
+
+ If you cannot use vLLM's built-in parser, or you are using another inference framework (such as transformers or TGI), you can manually parse the model's raw output with the method below. This requires you to parse the XML tag format of the model output yourself.
+
+ #### Using Transformers Example
+
+ The following is a complete example using the transformers library:
 
 ```python
 from transformers import AutoTokenizer
 
 def get_default_tools():
     return [
         {
+             "name": "get_current_weather",
+             "description": "Get the latest weather for a location",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "location": {
+                         "type": "string",
+                         "description": "A certain city, such as Beijing, Shanghai"
+                     }
+                 },
+                 "required": ["location"]
             }
         }
     ]
 
     add_generation_prompt=True,
     tools=tools
 )
+
+ # Send request (using any inference service here)
+ import requests
+ payload = {
+     "model": "MiniMaxAI/MiniMax-M1-40k",
+     "prompt": text,
+     "max_tokens": 4000
+ }
+ response = requests.post(
+     "http://localhost:8000/v1/completions",
+     headers={"Content-Type": "application/json"},
+     json=payload,
+     stream=False,
+ )
+
+ # Model output needs manual parsing
+ raw_output = response.json()["choices"][0]["text"]
+ print("Raw output:", raw_output)
+
+ # Use the parsing function below to process the output
+ function_calls = parse_function_calls(raw_output)
 ```
 
 ## 🛠️ Function Call Definition
 
 When processed internally by the model, function definitions are converted to a special format and concatenated to the input text:
 
 ```
+ <begin_of_document><beginning_of_sentence>system ai_setting=MiniMax AI
+ MiniMax AI is an AI assistant independently developed by MiniMax (Shanghai Xiyu Technology Co., Ltd.).<end_of_sentence>
+ <beginning_of_sentence>system tool_setting=tools
 You are provided with these tools:
 <tools>
+ {"name": "search_web", "description": "Search function.", "parameters": {"properties": {"query_list": {"description": "Keywords for search, with a list element count of 1.", "items": {"type": "string"}, "type": "array"}, "query_tag": {"description": "Classification of the query", "items": {"type": "string"}, "type": "array"}}, "required": ["query_list", "query_tag"], "type": "object"}}
 </tools>
 If you need to call tools, please respond with <tool_calls></tool_calls> XML tags, and provide tool-name and json-object of arguments, following the format below:
 <tool_calls>
 {"name": <tool-name>, "arguments": <args-json-object>}
 ...
+ </tool_calls><end_of_sentence>
+ <beginning_of_sentence>user name=User
+ When were the most recent launch events for OpenAI and Gemini?<end_of_sentence>
+ <beginning_of_sentence>ai name=MiniMax AI
 ```
 
 ### Model Output Format
 
 </tool_calls>
 ```
 
+ ## 📥 Manual Parsing of Function Call Results
 
 ### Parsing Function Calls
 
+ When manual parsing is required, you need to parse the XML tag format of the model output yourself (a self-contained parser sketch follows the code excerpts below):
 
 ```python
 import re
 import json
 
 def parse_function_calls(content: str):
     """
     Parse function calls from model output
 
         # Build function execution result
         return {
             "role": "tool",
+             "content": [
+                 {
+                     "name": function_name,
+                     "type": "text",
+                     "text": json.dumps({
+                         "location": location,
+                         "temperature": "25",
+                         "unit": "celsius",
+                         "weather": "Sunny"
+                     }, ensure_ascii=False)
+                 }
+             ]
+         }
     elif function_name == "search_web":
         query_list = arguments.get("query_list", [])
         query_tag = arguments.get("query_tag", [])
         # Simulate search results
         return {
             "role": "tool",
+             "content": [
+                 {
+                     "name": function_name,
+                     "type": "text",
+                     "text": f"Search keywords: {query_list}, Categories: {query_tag}\nSearch results: Relevant information found"
+                 }
+             ]
+         }
 
     return None
 ```
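The hunks above only show fragments of `parse_function_calls` and `execute_function_call`. For reference, a minimal self-contained sketch of the parsing step described in this section, based on the `<tool_calls>` format documented above (an illustration, not the repository's exact implementation), might look like this:

```python
import re
import json

# Illustrative sketch only -- not the repository's exact implementation.
def parse_function_calls(content: str):
    """Extract tool calls from the <tool_calls>...</tool_calls> block of raw model output."""
    calls = []
    block = re.search(r"<tool_calls>(.*?)</tool_calls>", content, re.DOTALL)
    if not block:
        return calls
    # Each non-empty line inside the block is expected to be a JSON object:
    # {"name": <tool-name>, "arguments": <args-json-object>}
    for line in block.group(1).strip().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            call = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"], "arguments": call.get("arguments", {})})
    return calls

# Example: parse output shaped like the "Model Output Format" section
# (the arguments below are illustrative values)
raw_output = """Okay, I will search for the OpenAI and Gemini latest release.
<tool_calls>
{"name": "search_web", "arguments": {"query_list": ["OpenAI latest release"], "query_tag": ["technology"]}}
</tool_calls>"""
print(parse_function_calls(raw_output))
```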
 
 
 #### Single Result
 
+ If the model calls the `search_web` function, you can add the execution result in the following format, with the `name` field set to the specific function name.
 
 ```json
 {
+   "role": "tool",
+   "content": [
+     {
+       "name": "search_web",
+       "type": "text",
+       "text": "test_result"
+     }
   ]
 }
 ```
 
 Corresponding model input format:
 ```
+ <beginning_of_sentence>tool name=tools
+ tool name: search_web
+ tool result: test_result
+ <end_of_sentence>
 ```
 
+ #### Multiple Results
 
+ If the model calls both `search_web` and `get_current_weather` simultaneously, you can add the execution results in the following format, with the `content` field containing multiple results.
 
 ```json
 {
+   "role": "tool",
+   "content": [
+     {
+       "name": "search_web",
+       "type": "text",
+       "text": "test_result1"
+     },
+     {
+       "name": "get_current_weather",
+       "type": "text",
+       "text": "test_result2"
+     }
   ]
 }
 ```
 
 Corresponding model input format:
 ```
+ <beginning_of_sentence>tool name=tools
+ tool name: search_web
+ tool result: test_result1
+ tool name: get_current_weather
+ tool result: test_result2<end_of_sentence>
 ```
 
+ While we recommend following the above formats, as long as the input returned to the model is easy to understand, the specific content of `name` and `text` is entirely up to you.
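To continue the conversation after the tools run, a tool message in the format above is appended to the conversation history and the chat template is re-applied, as in the transformers example earlier. A minimal sketch, assuming a `messages` list like the one passed to `apply_chat_template` above and reusing `tokenizer`, `tools`, and `raw_output` (the exact message schema accepted by the template is defined in `tokenizer_config.json`):

```python
# Append the assistant turn (raw model output) and the tool result, then re-encode
messages.append({"role": "assistant", "content": raw_output})
messages.append({
    "role": "tool",
    "content": [
        {"name": "search_web", "type": "text", "text": "test_result"}
    ],
})

# Re-apply the chat template to build the prompt for the next model call
# (tokenize=False assumed here so the result can be posted as a text prompt,
# consistent with the request example above)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    tools=tools,
)
```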
+
+ ## 📚 References
+
+ - [MiniMax-M1 Model Repository](https://github.com/MiniMaxAI/MiniMax-M1)
+ - [vLLM Project Homepage](https://github.com/vllm-project/vllm)
+ - [vLLM Function Calling PR](https://github.com/vllm-project/vllm/pull/20297)
+ - [OpenAI Python SDK](https://github.com/openai/openai-python)