Inference API Body Structure

#28
by Shivkumar27 - opened

Hello,
I am trying to pass the request body in this format to the Inference API:

{
   "inputs":"[INST]{{Prompt}}.\n{{Information}}\nQuestion:{{Question}}\nAssistant:\n[/INST]",
   "parameters":
   {
    "return_full_text":false,
    "temperature":0.9,
    "max_new_tokens":2048,
    "top_p": 0.9,
    "do_sample": true
   }
}

I am getting the error below:

Failed to deserialize the JSON body into the target type: missing field `messages` at line 13 column 1

The format given for the model's Inference API is this:

{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    "max_tokens": 500,
    "stream": false
}

Is there any way, or a change I can make, so that I can provide my input in the "inputs" key instead of defining the role and content?

Thanks

You could theoretically do this using the endpoint https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct, but DO NOT do this in 99.9% of cases. The chat template in your example is not the chat template used by Qwen2.5. Using an instruction-tuned LLM with the wrong chat template, even if the error is just a single token, can massively degrade its performance. There is no reason not to use the Chat Completions API, which applies the correct chat template for you.
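To illustrate, here is a minimal sketch of how your request could be restructured for the Chat Completions format. The prompt/information/question strings are hypothetical stand-ins for your {{Prompt}}, {{Information}}, and {{Question}} placeholders; rather than baking them into a hand-written [INST] template, you pass the raw text as messages and let the server apply Qwen2.5's own chat template:

```python
import json

# Hypothetical stand-ins for the poster's {{Prompt}}, {{Information}}, {{Question}}
prompt = "You are a helpful assistant."
information = "Paris is the capital of France."
question = "What is the capital of France?"

# No hand-written chat template in "inputs"; the server-side template is
# applied automatically when you send "messages".
payload = {
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
        {"role": "system", "content": prompt},
        {"role": "user", "content": f"{information}\nQuestion: {question}"},
    ],
    "max_tokens": 2048,
    "temperature": 0.9,
    "top_p": 0.9,
    "stream": False,
}

# POST this to the OpenAI-compatible chat route (auth handling omitted), e.g.:
# requests.post(
#     "https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct/v1/chat/completions",
#     headers={"Authorization": f"Bearer {HF_TOKEN}"},
#     json=payload,
# )
print(json.dumps(payload, indent=2))
```

Your sampling parameters (temperature, top_p) carry over unchanged; only the prompt moves from a templated "inputs" string into the messages list.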

Okay, thanks @OptimusePrime.
Do you have any other suggestions for making sure model performance isn't affected?

Thanks
