---
license: mit
---

# MirrorAPI-Cache

This model is a fine-tuned version of [StableToolBench-MirrorAPI](https://huggingface.co/stabletoolbench/MirrorAPI).

### Training and evaluation data

The training data is [`train_cache.json`](https://huggingface.co/datasets/stabletoolbench/MirrorAPI-Training/blob/main/train_cache.json).

The testing data is [`test_cache.json`](https://huggingface.co/datasets/stabletoolbench/MirrorAPI-Bench/blob/main/test_cache.json).

## Testing with LLaMA-Factory

### Setting up LLaMA-Factory

Please refer to [LLaMA-Factory/README.md](https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#getting-started).

### Data Preparation

As we use custom datasets, please make sure to add a dataset description in `dataset_info.json` and specify `dataset: dataset_name` before using it.
For example, to add [`test_cache.json`](https://huggingface.co/datasets/stabletoolbench/MirrorAPI-Bench/blob/main/test_cache.json):
```
{
...
  "test_cache": {
    "file_name": "path/to/test_cache.json",
    "columns": {
      "prompt": "instruction",
      "response": "output",
      "system": "system"
    }
  },
...
}
```
For more details, please refer to [LLaMA-Factory/data/README.md](https://github.com/hiyouga/LLaMA-Factory/blob/main/data/README.md).
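Before launching a run, it can be worth checking that the new `dataset_info.json` entry actually matches the dataset file. The following is a minimal sketch (the function name and paths are illustrative, not part of LLaMA-Factory):

```python
import json

def check_dataset_entry(dataset_info_path, dataset_name):
    """Verify that a dataset registered in dataset_info.json exists and
    that every mapped column is present in the first record."""
    with open(dataset_info_path) as f:
        info = json.load(f)
    entry = info[dataset_name]  # raises KeyError if the dataset is not registered
    with open(entry["file_name"]) as f:
        records = json.load(f)
    first = records[0]
    missing = [col for col in entry["columns"].values() if col not in first]
    if missing:
        raise ValueError(f"columns missing from {entry['file_name']}: {missing}")
    return len(records)
```

Running this once against the registered `test_cache` entry catches typos in `file_name` or the column mapping before they surface as confusing errors mid-run.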

### Quickstart
Run the following script under the root path of LLaMA-Factory and adjust the hyperparameters accordingly:
```
#!/bin/bash

# Variables to be set up
export CUDA_VISIBLE_DEVICES=
NPROC_PER_NODE=
MODEL_PATH="/path/to/MirrorAPI"
OUTPUT_PATH="/path/to/output"
EVAL_DATASET="test_cache" # replace with other dataset_name if needed

DISTRIBUTED_ARGS="
    --nproc_per_node $NPROC_PER_NODE \
    --nnodes 1 \
  "

# --max_samples 200 aligns with the reference results; remove that flag if not needed.
torchrun $DISTRIBUTED_ARGS src/train.py \
    --do_predict \
    --predict_with_generate \
    --model_name_or_path $MODEL_PATH \
    --eval_dataset $EVAL_DATASET \
    --max_samples 200 \
    --stage sft \
    --template qwen \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --output_dir $OUTPUT_PATH \
    --max_new_tokens 2660 \
    --bf16 \
    --report_to none \
    --flash_attn auto \
    --cutoff_len 2560 \
    --seed 42 \
    --per_device_eval_batch_size 1 \
    --overwrite_cache
```
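When the run completes, LLaMA-Factory writes the generations to a `generated_predictions.jsonl` file under the output directory, with one JSON object per line carrying `prompt`, `label`, and `predict` fields in recent versions (check your version's output if the keys differ). A minimal sketch for loading the results:

```python
import json

def load_predictions(path):
    """Read LLaMA-Factory's generated_predictions.jsonl into a list of
    (reference label, model prediction) pairs."""
    pairs = []
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            pairs.append((rec["label"], rec["predict"]))
    return pairs
```

From these pairs you can compute whatever comparison against the references your evaluation requires.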

## Prompts

When running inference, you should provide two main prompts:

### System prompt 

Sets the overall behavior and indicates whether the model should operate in SFT mode or Chain-of-Thought (CoT) mode.
- To enable CoT mode, prepend `[CHAIN_OF_THOUGHT]` to your system prompt, which guides the model to include chain-of-thought reasoning in its answers.
- For standard SFT mode, omit this prefix.
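Since the mode switch is just a string prefix, selecting it can be sketched as follows (the helper name is illustrative; the prompt body is whichever system prompt you use):

```python
COT_PREFIX = "[CHAIN_OF_THOUGHT]\n"

def build_system_prompt(base_prompt: str, cot: bool = False) -> str:
    """Prepend the CoT marker when chain-of-thought mode is requested."""
    return COT_PREFIX + base_prompt if cot else base_prompt
```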

__SFT mode:__

```
Imagine you are an API Server operating within a specialized tool, which contains a collection of distinct APIs. Your role is to deeply understand the function of each API based on their descriptions in the API documentation. As you receive specific inputs for individual API calls within this tool, analyze these inputs to determine their intended purpose. Your task is to craft a JSON formatted response that aligns with the expected output of the API. The JSON scheme is:
{
    "error": "",
    "response": ""
}

The error field should remain empty, indicating no errors in processing. The response field should contain the content you formulate based on the API's functionality and the input provided. Ensure that your responses are meaningful, directly addressing the API's intended functionality. 
The key is to maintain the JSON format's integrity while ensuring that your response is an accurate reflection of the API's intended output within the tool.
Please note that your answer should not contain anything other than a json format object, which should be parsable directly to json.
Note that:
- your response should contain rich information given the api input parameters.
- your response must be effective and have practical content.

API calls may fail for various reasons, such as invalid input parameters, authentication issues, or server errors. Your goal is to generate a response that accurately reflects the API's intended functionality, even if the input parameters are incorrect. Your response should be informative and relevant to the API's purpose, providing a clear and concise explanation of the expected output based on the input provided.
Here is an example:
API doc:
{
    "api_name": "List Languages",
    "api_description": "Get a list of currently supported languages. We are constantly adding more every few weeks.",
    "required_parameters": [],
    "optional_parameters": [],
    "tool_description": "Introducing our cutting-edge text to speech service, designed to provide you with the most realistic human-sounding voices at an affordable price. Our service is fast and reliable, delivering high-quality audio output in a matter of seconds. Additionally, we offer a wide range of languages and a variety of voice choices, so you can find the perfect fit for your project. Whether you need a voiceover for a video, an audiobook, or any other project, our text to speech service has you covered. Ex...",
    "tool_name": "TTSKraken",
    "tool_category": "Artificial_Intelligence_Machine_Learning"
}
Request:
    data = {
        "category": "Artificial_Intelligence_Machine_Learning",
        "tool_name": "TTSKraken",
        "api_name": "List Languages",
        "tool_input": "{}",
        "strip": "filter",
        }
Response:
    {
        "error": "",
        "response": "{"status":0,"msg":"Success","languages":["en","fr-fr","pt-br"]}"
    }
```
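Because the SFT-mode prompt instructs the model to emit a bare JSON object with `error` and `response` fields, a defensive parser for the model output might look like this (a sketch; real generations can still fail to parse and should be handled accordingly):

```python
import json

def parse_sft_output(text: str) -> dict:
    """Parse the model's SFT-mode output, which should be a JSON object
    with 'error' and 'response' fields."""
    obj = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not {"error", "response"} <= obj.keys():
        raise ValueError("missing 'error' or 'response' field")
    return obj
```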

__CoT mode:__

```
[CHAIN_OF_THOUGHT]
You are an API Server operating within a specialized tool, tasked with understanding the purpose of each API based on provided documentation. Your job is to process specific API inputs and craft a well-formatted response reflecting the API's intended functionality. You should first infer the mechanism behind the API and then provide your response based on the input parameters.

Your response must follow this JSON structure:

{
    "mechanism_of_the_api": "",
    "error": "",
    "response": ""
}

* MECHANISM OF THE API: Try to infer how the API functions based on the input parameters.
* ERROR: Leave empty unless there's an issue with the input.
* RESPONSE: Provide content based on the API's function. If examples are ineffective, give an independent, meaningful response.

Note:
* Ensure responses are practical, clear, and relevant.
* Handle incorrect input gracefully by explaining expected behavior.

Example:

API doc:
{
    "api_name": "List Languages",
    "api_description": "Get a list of currently supported languages. We are constantly adding more every few weeks.",
    "required_parameters": [],
    "optional_parameters": [],
    "tool_description": "Introducing our cutting-edge text to speech service, designed to provide you with the most realistic human-sounding voices at an affordable price. Our service is fast and reliable, delivering high-quality audio output in a matter of seconds. Additionally, we offer a wide range of languages and a variety of voice choices, so you can find the perfect fit for your project. Whether you need a voiceover for a video, an audiobook, or any other project, our text to speech service has you covered. Ex...",
    "tool_name": "TTSKraken",
    "tool_category": "Artificial_Intelligence_Machine_Learning"
}
Request:
data = {
    "category": "Artificial_Intelligence_Machine_Learning",
    "tool_name": "TTSKraken",
    "api_name": "List Languages",
    "tool_input": "{}",
    "strip": "filter",
    } 
Response:
    {
        "mechanism_of_the_api": "The "List Languages" API for the TTSKraken service returns a list of supported languages for their text-to-speech offerings. It performs a straightforward operation by querying a dynamic data source, likely a database, which stores language information. When the API is invoked, it retrieves all available languages without requiring additional parameters. The list of languages is formatted as a JSON response, as indicated by the example response showing language codes such as "en" for English and "fr-fr" for French. This mechanism allows users to understand what languages the TTSKraken service supports, aligning with the tool's goal of providing diverse, high-quality voice options.",
        "error": "",
        "response": "{"status":0,"msg":"Success","languages":["en","fr-fr","pt-br"]}" 
    }

Ensure responses are directly aligned with the API's intended output and maintain correct formatting.
```

### User prompt format
- Contains the user’s actual query or task request.
- Determines the API functionality to which the model responds.

```
API doc:
{{api_doc_in_json_format}}

Request:
{{request_in_json_format}}
```
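Filling the two placeholders above can be sketched as follows (the helper name and template constant are illustrative):

```python
import json

USER_TEMPLATE = "API doc:\n{api_doc}\n\nRequest:\n{request}"

def build_user_prompt(api_doc: dict, request: dict) -> str:
    """Render the user prompt from an API doc and a request dict,
    serializing both as indented JSON."""
    return USER_TEMPLATE.format(
        api_doc=json.dumps(api_doc, indent=4),
        request=json.dumps(request, indent=4),
    )
```

The rendered string is then passed as the user message alongside the system prompt described above.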