How to Use Mistral Agents API (Quick Guide)
Published: May 27, 2025
Introduction: The Dawn of Action-Oriented AI
Mistral AI has recently unveiled its new Agents API, a significant advancement designed to elevate artificial intelligence from a passive text generator to an active, problem-solving partner. While traditional large language models (LLMs) have demonstrated remarkable proficiency in generating human-like text, their inherent limitations lie in their inability to perform actions, interact with external systems directly, or maintain persistent context over extended interactions. The Mistral Agents API directly addresses these shortcomings by synergizing Mistral's cutting-edge language models with a robust framework for agentic capabilities.
This new API is engineered with three core pillars:
- Built-in Connectors and MCP Tool Integration: Providing agents with out-of-the-box abilities to execute code, search the web, generate images, access document libraries, and leverage a wide array of external tools through the Model Context Protocol (MCP).
- Persistent Memory Across Conversations: Enabling agents to maintain context and history, leading to more coherent and meaningful long-term interactions.
- Agentic Orchestration Capabilities: Allowing for the coordination of multiple specialized agents to tackle complex, multi-step tasks collaboratively.
The Agents API is not merely an extension but a powerful complement to the existing Chat Completion API. It offers a dedicated, streamlined framework specifically for implementing sophisticated agentic use cases, positioning itself as the foundational technology for enterprise-grade agentic platforms. By empowering AI agents to reliably handle intricate tasks, maintain crucial context, and orchestrate multiple actions, the Agents API unlocks new frontiers for enterprises to deploy AI in more practical, impactful, and transformative ways.
This article will provide a comprehensive technical exploration of the Mistral Agent API, delving into its core concepts, functionalities, advanced features, and deployment strategies. We aim to equip developers and architects with the knowledge required to build powerful AI agents capable of tackling real-world challenges.
Core Concepts: Understanding the Building Blocks
To fully leverage the Mistral Agent API, it's crucial to understand its fundamental components and how they interact.
What are AI Agents?
In the context of the Mistral API, AI agents are autonomous systems powered by LLMs. Given high-level instructions, these agents can:
- Plan: Decompose complex goals into manageable steps.
- Use Tools: Interact with various built-in connectors or external tools (via MCP or function calling) to gather information or perform actions.
- Carry out Processing Steps: Analyze information, make decisions, and adapt their strategy based on new inputs.
- Take Actions: Execute tasks to achieve their specified goals.
These agents utilize advanced natural language processing to understand and execute intricate tasks efficiently. Furthermore, they possess the capability to collaborate, handing off tasks to other agents with specialized skills, thereby achieving more sophisticated outcomes than a single agent could alone.
The Agents API provides developers with the infrastructure to build such agents, supported by features like:
- Access to multiple multimodal models (text and vision).
- Persistent state across conversations.
- The ability to converse with base models, a single agent, or orchestrate multiple agents.
- A suite of built-in connector tools.
- Handoff capabilities for complex workflow creation.
- Support for features from the chat completions endpoint, including Structured Outputs, Document Understanding, Tool Usage, and Citations.
Built-in Connectors: Equipping Agents with Essential Tools
Connectors are pre-built tools, deployed and managed by Mistral, that agents can call upon demand to perform specific tasks. These significantly expand an agent's capabilities beyond text generation.
1. Code Execution (Code Interpreter)
The Code Interpreter connector empowers agents to execute Python code within a secure, sandboxed environment. This is invaluable for tasks requiring computation, data manipulation, or visualization.
Capabilities:
- Mathematical calculations and analysis.
- Data visualization and plotting (e.g., generating graphs from data).
- Scientific computing and simulations.
- Code validation and execution of user-provided snippets.
Creating a Code Interpreter Agent: You can instantiate an agent with code execution capabilities by including `code_interpreter` in its toolset.
Python Example:
```python
from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")

code_agent = client.beta.agents.create(
    model="mistral-medium-latest",  # Or mistral-large-latest
    name="Coding Agent",
    description="Agent used to execute code using the interpreter tool.",
    instructions="Use the code interpreter tool when you have to run code.",
    tools=[{"type": "code_interpreter"}],
    completion_args={
        "temperature": 0.3,
        "top_p": 0.95,
    },
)
# agent_id will be in code_agent.id
```
cURL Example:
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-medium-latest", "name": "Coding Agent", "description": "Agent used to execute code using the interpreter tool.", "instructions": "Use the code interpreter tool when you have to run code.", "tools": [{"type": "code_interpreter"}], "completion_args": { "temperature": 0.3, "top_p": 0.95 } }'
Conversation and Output: When an agent uses the Code Interpreter, the API response will detail the execution.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=code_agent.id,
    inputs="Run a fibonacci function for the first 20 values.",
)
# Process response.outputs
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "message.output", // Initial agent response "content": "Sure, I can help with that...", // ... other metadata }, { "type": "tool.execution", "name": "code_interpreter", "id": "tool_exec_...", "info": { "code": "def fibonacci(n):\n # ... (fibonacci code)\nfibonacci_20 = fibonacci(20)\nfibonacci_20", "code_output": "[0, 1, 1, ... , 4181]\n" } // ... other metadata }, { "type": "message.output", // Final agent response with results "content": "The first 20 values of the Fibonacci sequence are:\n\n[0, 1, ... , 4181]", // ... other metadata } ], "usage": { /* ... token usage details ... */ } }
The `tool.execution` entry shows the `name` of the tool, and its `info` block contains the `code` that was executed and the corresponding `code_output`.
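In practice, you will often want to pull the executed code and its result back out of the conversation response. The following is a minimal sketch that assumes attribute-style access on the SDK's output entries, with field names (`info`, `code`, `code_output`) taken from the simplified structure above.

```python
# Walk the conversation outputs and print what the Code Interpreter ran.
for entry in response.outputs:
    entry_type = getattr(entry, "type", None)
    if entry_type == "tool.execution" and entry.name == "code_interpreter":
        # 'info' may be a dict or an object depending on the SDK version.
        print("Executed code:\n", entry.info["code"])
        print("Execution output:\n", entry.info["code_output"])
    elif entry_type == "message.output":
        print("Agent message:\n", entry.content)
```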
2. Image Generation
Powered by Black Forest Labs' FLUX1.1 [pro] Ultra, the image generation connector allows agents to create diverse images based on textual prompts.
Use Cases:
- Generating visual aids for educational content.
- Creating custom graphics for marketing materials.
- Producing artistic images or illustrations.
Creating an Image Generation Agent: Include `image_generation` in the agent's tool configuration.
Python Example:
```python
image_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Image Generation Agent",
    description="Agent used to generate images.",
    instructions="Use the image generation tool when you have to create images.",
    tools=[{"type": "image_generation"}],
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: When an image is generated, the response includes a reference to the image file.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=image_agent.id,
    inputs="Generate an orange cat in an office.",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "image_generation", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "Here is your image: an orange cat in an office.\n\n" }, { "type": "tool_file", "tool": "image_generation", "file_id": "933c5b5a-1c47-4cdd-84f6-f32526bd161b", // Crucial ID "file_name": "image_generated_0", "file_type": "png" } ], // ... other metadata } ], "usage": { /* ... */ } }
The `message.output` contains a `content` array. A chunk of `type: "tool_file"` provides the `file_id`, `file_name`, and `file_type` for the generated image.
Downloading Images: The `file_id` obtained from the `tool_file` chunk is used to download the image via the files endpoint.
Python Example:
```python
from mistralai.models import ToolFileChunk  # Assuming this or a similar class exists

# Assuming 'response' is the API response object
# and the relevant output is the last one.
image_chunk = None
for chunk in response.outputs[-1].content:
    if hasattr(chunk, "type") and chunk.type == "tool_file":  # Check for a ToolFileChunk-like object
        image_chunk = chunk
        break

if image_chunk:
    file_bytes = client.files.download(file_id=image_chunk.file_id).read()
    with open(f"{image_chunk.file_name}.{image_chunk.file_type}", "wb") as file:
        file.write(file_bytes)
```
3. Document Library (Beta)
This connector provides built-in Retrieval-Augmented Generation (RAG) capabilities, enabling agents to access and leverage information from documents uploaded to Mistral Cloud.
Key Features:
- Enhances agent knowledge with custom, user-provided data.
- Agents can query and retrieve relevant information from specified libraries.
Setup and Usage:
- Create Libraries: Currently, document libraries must be created via Le Chat (Mistral's chat interface).
- Permissions: To enable an Agent to access a library, an Organization admin must share it with the Organization.
- Library ID: The `library_id` is found in the URL of the library on Le Chat (e.g., `https://chat.mistral.ai/libraries/<library_id>`).

Creating a Document Library Agent: Specify the `document_library` tool type and provide the accessible `library_ids`.
Python Example:
```python
library_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Document Library Agent",
    description="Agent used to access documents from the document library.",
    instructions="Use the library tool to access external documents.",
    tools=[{
        "type": "document_library",
        "library_ids": ["<your_library_id_1>", "<your_library_id_2>"]
    }],
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: When the agent queries the document library, the output structure is similar to other tool executions, followed by the agent's synthesized answer based on the retrieved documents.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=library_agent.id,
    inputs="How does the vision encoder for Pixtral 12B work?",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "document_library", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "The vision encoder for Pixtral 12B, known as PixtralViT, is designed to process images..." // Detailed answer } // Potentially tool_reference chunks if citations are enabled and relevant ], // ... other metadata } ], "usage": { /* ... */ } }
The agent's response in `message.output` will contain the information retrieved and processed from the document library. This tool, like web search, can also use `tool_reference` chunks for citations if applicable.
4. Web Search
This connector enables agents to access up-to-date information from the internet, overcoming the knowledge cut-off limitations of LLMs.
Benefits:
- Provides current information on diverse topics.
- Allows agents to access specific websites or news sources.
- Significantly improves performance on knowledge-intensive tasks (e.g., SimpleQA benchmark scores for Mistral Large and Mistral Medium improved dramatically with web search).
Versions:
- `web_search`: A standard web search tool accessing a general search engine.
- `web_search_premium`: A more advanced version providing access to a search engine plus news agencies like AFP (Agence France-Presse) and AP (Associated Press).

Creating a Web Search Agent: Include `"web_search"` or `"web_search_premium"` in the agent's tools.
Python Example:
```python
websearch_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Websearch Agent",
    description="Agent able to search information over the web...",
    instructions="You have the ability to perform web searches with `web_search` to find up-to-date information.",
    tools=[{"type": "web_search"}],  # or "web_search_premium"
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: The API response includes details of the search execution and the agent's answer, often supplemented with source citations.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=websearch_agent.id,
    inputs="Who won the last European Football cup?",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "web_search", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "The last winner of the European Football Cup was Spain..." }, { "type": "tool_reference", // Citation "tool": "web_search", "title": "UEFA Euro Winners List...", "url": "https://www.marca.com/en/football/uefa-euro/winners.html", "source": "brave" // Example search provider }, // ... more tool_reference chunks and text chunks ], // ... other metadata } ], "usage": { /* ... token usage, including connector_tokens ... */ } }
The `message.output` `content` array interleaves `text` chunks (the agent's answer) with `tool_reference` chunks. Each `tool_reference` provides metadata about the source (title, URL, source provider) used for that part of the answer, enhancing transparency and enabling fact-verification.
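To use these citations programmatically, you can separate the answer text from its sources. The sketch below assumes attribute-style access and the chunk fields shown in the simplified structure above.

```python
# Split the final message into answer text and cited sources.
answer_parts, sources = [], []
for chunk in response.outputs[-1].content:
    chunk_type = getattr(chunk, "type", None)
    if chunk_type == "text":
        answer_parts.append(chunk.text)
    elif chunk_type == "tool_reference":
        sources.append(f"{chunk.title} ({chunk.url})")

print("".join(answer_parts))
print("Sources:", "; ".join(sources))
```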
MCP (Model Context Protocol) Tools: Bridging Agents and External Systems
Beyond built-in connectors, the Agents API SDK supports tools built on the Model Context Protocol (MCP). MCP is an open, standardized protocol designed to facilitate seamless and secure integration between AI agents and a vast array of external systems.
Purpose of MCP:
- Enables agents to access real-world context, including APIs, databases, user-specific data, documents, and other dynamic resources.
- Replaces fragmented, custom integrations with a unified protocol.
- Helps AI models produce more relevant and accurate responses by connecting them to live data.
Using MCP with Mistral Agents: The Mistral Python SDK provides utilities for integrating agents with MCP Clients, allowing agents to call functions or services exposed by MCP servers.
Example: Local MCP Server for Weather Information
This involves setting up a local Python script that acts as an MCP server and an agent that queries it.
Initialize Mistral Client and MCP Components:
```python
# stdio_mcp_weather_example.py
import asyncio
import os
from pathlib import Path

from mistralai import Mistral
from mistralai.extra.run.context import RunContext
from mistralai.extra.mcp.stdio import MCPClientSTDIO
from mistralai.types import BaseModel  # For structured output
from mcp import StdioServerParameters  # From the 'mcp' package

# Set the current working directory and model to use
cwd = Path(__file__).parent
MODEL = "mistral-medium-latest"  # or mistral-large-latest


async def main() -> None:
    api_key = os.environ.get("MISTRAL_API_KEY")
    if not api_key:
        raise ValueError("MISTRAL_API_KEY environment variable not set.")
    client = Mistral(api_key=api_key)

    # Define parameters for the local MCP server.
    # Assumes 'mcp_servers/stdio_server.py' exists relative to this script
    # and implements the MCP logic (e.g., a weather service).
    server_params = StdioServerParameters(
        command="python",  # Or your server executable
        args=[str((cwd / "mcp_servers/stdio_server.py").resolve())],
        env=None,  # Or specific environment variables for the server
    )

    # Create an agent
    weather_agent = client.beta.agents.create(
        model=MODEL,
        name="Weather Teller Agent (MCP)",
        instructions="You can tell the weather using available tools.",
        description="An agent that uses an MCP server to fetch weather.",
    )

    # Define the expected structured output format
    class WeatherResult(BaseModel):
        user: str
        location: str
        temperature: float

    # Create a run context for the agent
    async with RunContext(
        agent_id=weather_agent.id,
        output_format=WeatherResult,  # For structured output
        continue_on_fn_error=True,
    ) as run_ctx:
        # Create and register an MCP client with the run context.
        # This client will communicate with stdio_server.py.
        mcp_client = MCPClientSTDIO(stdio_params=server_params)
        await run_ctx.register_mcp_client(mcp_client=mcp_client)

        # (Optional) Register local Python functions as tools too
        import random

        @run_ctx.register_func
        def get_location(name: str) -> str:
            """Function to get a mock location of a user."""
            return random.choice(["New York", "London", "Paris"])

        # Run the agent with a query.
        # The agent will decide whether to use get_location or the MCP tool.
        run_result = await client.beta.conversations.run_async(
            run_ctx=run_ctx,
            inputs="Tell me the weather in John's location currently.",
        )

        print("All run entries:")
        for entry in run_result.output_entries:
            print(f"{entry}")
            print()

        if run_result.output_as_model:
            print(f"Final model output (structured): {run_result.output_as_model}")
        else:
            print(f"Final model output (raw): {run_result.output_entries[-1] if run_result.output_entries else 'N/A'}")


if __name__ == "__main__":
    # Ensure mcp_servers/stdio_server.py exists and is executable.
    # Example stub for mcp_servers/stdio_server.py:
    #     #!/usr/bin/env python
    #     import sys, json
    #     def get_weather(location: str): return {"temperature": 25.5, "condition": "Sunny"}
    #     if __name__ == "__main__":
    #         for line in sys.stdin:
    #             req = json.loads(line)
    #             # Simplified MCP interaction: assumes a 'tool_call' with 'get_weather_for_location'
    #             if req.get("tool_calls") and req["tool_calls"][0].get("function", {}).get("name") == "get_weather_for_location":
    #                 loc = json.loads(req["tool_calls"][0]["function"]["arguments"]).get("location")
    #                 res = {"tool_results": [{"id": req["tool_calls"][0]["id"], "result": get_weather(loc)}]}
    #                 sys.stdout.write(json.dumps(res) + "\n")
    #                 sys.stdout.flush()
    #             else:  # Echo back if not a tool call, for basic MCP handshake
    #                 sys.stdout.write(json.dumps({"message": "MCP Server Ready"}) + "\n")
    #                 sys.stdout.flush()
    asyncio.run(main())
```
Note: The `stdio_server.py` in this example is highly simplified. A real MCP server would adhere to the full MCP specification for tool registration, invocation, and response.

Streaming Conversations with MCP: Similar to non-streaming, but results are processed asynchronously as they arrive.
```python
# Inside the RunContext block
events = await client.beta.conversations.run_stream_async(
    run_ctx=run_ctx,
    inputs="Tell me the weather in John's location currently.",
)

run_result_stream = None
async for event in events:
    if isinstance(event, client.beta.conversations.RunResult):  # Adjust class name as per SDK
        run_result_stream = event
    else:
        print(f"Stream event: {event}")  # Print intermediate events/chunks

if not run_result_stream:
    raise RuntimeError("No run result found from stream")
# Process run_result_stream similar to run_result
```
Memory and Context: Enabling Stateful Conversations
A key differentiator of the Agents API is its robust conversation management system, which is inherently stateful.
- Persistent Context: Each conversation retains its history, allowing for coherent and contextually aware interactions over time. Developers are relieved from manually tracking conversation history.
- Starting Conversations:
- With an Agent: Initiate a conversation using a specific `agent_id` to leverage its pre-defined tools, instructions, and model.

```python
response = client.beta.conversations.start(
    agent_id="ag_xxxxxxxx",
    inputs="Initial user query.",
)
conversation_id = response.conversation_id
```
- Direct Access: Start a conversation by directly specifying the `model` and `completion_args`, providing quick access to built-in connectors without pre-defining an agent.
- Conversation History: Interactions are stored as structured `conversation entries`, ensuring context is preserved.
- Developers can view past conversation entries.
- Any conversation can be continued from its last point.
- New conversation paths (branches) can be initiated from any previous point in an existing conversation.
- Streaming Output: The API supports streaming for both starting new conversations and continuing existing ones, enabling real-time updates and interactive experiences.
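As an illustration of these stateful operations, the sketch below continues and then branches a stored conversation. It assumes the SDK exposes `append` and `restart` methods on `client.beta.conversations` (and a `from_entry_id` parameter for branching); the exact names may differ in your SDK version.

```python
# Continue an existing conversation from its last point
# (conversation_id comes from the start() call shown above).
followup = client.beta.conversations.append(
    conversation_id=conversation_id,
    inputs="Thanks - now summarize your previous answer in two sentences.",
)

# Branch: restart the conversation from an earlier entry to explore an alternative path.
# 'from_entry_id' is assumed to reference a previous conversation entry's id.
branch = client.beta.conversations.restart(
    conversation_id=conversation_id,
    from_entry_id=followup.outputs[0].id,
    inputs="Instead, answer assuming the reader is a beginner.",
)
```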
Agent Orchestration: Collaborative Problem Solving
The true power of the Agents API emerges in its ability to orchestrate multiple agents to solve complex, multi-faceted problems.
Dynamic Orchestration
Agents can be dynamically added to or removed from a conversation as needed. Each agent contributes its unique capabilities (specialized tools, different models, or specific instructions) to tackle different parts of a larger problem.
Creating an Agentic Workflow with Handoffs
Create Necessary Agents: Begin by creating all the individual agents that will participate in the workflow. Each agent can be tailored with specific tools, models, and instructions. For example, one might create:
- A
finance-agent
(e.g., usingmistral-large-latest
for complex reasoning). - A
web-search-agent
(equipped with theweb_search
tool). - A
calculator-agent
(potentially withcode_interpreter
or a custom function calling tool). - An
ecb-interest-rate-agent
(using function calling for a specific API). - A
graph-plotting-agent
(usingcode_interpreter
for visualizations).
cURL Example (Creating a Finance Agent):
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-large-latest", "name": "finance-agent", "description": "Agent used to answer financial related requests" }' # Save the returned agent_id as <finance_agent_id>
Repeat for other agents like
web_search_agent
(tool:web_search
),graph_agent_id
(tool:code_interpreter
),calculator_agent_id
(tool:code_interpreter
).- A
Define Handoff Responsibilities: Once agents are created, update them to specify which other agents they can hand off tasks to. This is done via the
handoffs
parameter, which takes a list ofagent_id
s.cURL Example (Updating Web Search Agent with Handoffs):
curl --location "https://api.mistral.ai/v1/agents/<web_search_agent_id>" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "handoffs": ["<graph_agent_id>", "<calculator_agent_id>"] }'
This configuration allows the `web-search-agent` to delegate tasks to the `graph-plotting-agent` or the `calculator-agent`
if the conversation requires their specific capabilities.
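The same handoff wiring can be sketched in Python. This is a minimal example assuming the SDK exposes `client.beta.agents.update` with a `handoffs` parameter mirroring the cURL call above; the agent ID variables are placeholders for the IDs returned when the agents were created.

```python
# Allow the web search agent to delegate to the graph and calculator agents.
client.beta.agents.update(
    agent_id=web_search_agent_id,
    handoffs=[graph_agent_id, calculator_agent_id],
)

# Allow the finance agent (the usual entry point) to delegate to the others.
client.beta.agents.update(
    agent_id=finance_agent_id,
    handoffs=[web_search_agent_id, ecb_agent_id, calculator_agent_id],
)
```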
How Handoffs Work
A single user request to an initial agent (e.g., `finance-agent`) can trigger a chain of actions across multiple agents. Each agent handles the part of the request relevant to its specialization.
`handoff_execution` Parameter: This parameter in the conversation API call controls how handoffs are managed:
- `server` (default): Handoffs are executed internally on Mistral's cloud servers. The process is seamless to the end-user until a final response or further input is required.
- `client`: When a handoff is triggered, the API returns a response to the client, indicating the pending handoff. This allows the client application to manage or even override the handoff logic, providing more control.
Example Workflow A: US Interest Rate and Compounding
User Query: "Fetch the current US bank interest rate and calculate the compounded effect if investing for the next 10 years."
Expected Flow (Server-side Handoff):
- Request goes to
finance-agent
. finance-agent
determines it needs current interest rates and calculation.- It might hand off to
web-search-agent
to find the "current US bank interest rate." web-search-agent
performs the search and returns the rate.- The context (including the rate) might then be handed off (or back to
finance-agent
which then hands off) tocalculator-agent
. calculator-agent
uses itscode_interpreter
or a function to calculate the compounded interest over 10 years.- The final result is synthesized and returned to the user.
- Request goes to
Example Workflow B: ECB Interest Rate and Graph Plotting
User Query: "Given the interest rate of the European Central Bank as of Jan 2025, plot a graph of the compounded interest rate over the next 10 years."
Expected Flow (Potentially involving client-side handoff for function call fulfillment if local):
- Request to
finance-agent
. finance-agent
identifies need for ECB rate and graph.- Handoff to
ecb-interest-rate-agent
(which uses function calling for a specific, possibly client-managed, API to get the rate for "Jan 2025").- If
handoff_execution: client
is used, and the function is client-side, the API would signal the client to execute the function and provide the result back.
- If
- Once the rate is obtained, context is handed to
graph-plotting-agent
. graph-plotting-agent
usescode_interpreter
to generate the plot data and potentially an image of the graph.- The graph (or its
file_id
) and analysis are returned. The image can be downloaded using itsfile_id
(e.g.,curl ... /files/<file_id>/content
).
- Request to
Advanced Features and Usage
Function Calling
Complementing built-in connectors and MCP, agents can use standard function calling by defining a JSON schema for custom functions. This allows agents to interact with any external API or proprietary system.
Defining Functions: When creating or updating an agent, provide a tool of `type: "function"` with a `function` object detailing its `name`, `description`, and `parameters` (as a JSON schema).
cURL Example (Creating an Agent with a Function):
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-medium-latest", "name": "ecb-interest-rate-agent", "description": "Can find the current interest rate of the European central bank", "instructions": "You can provide interest rate and information regarding the European central bank.", "tools": [ { "type": "function", "function": { "name": "get_european_central_bank_interest_rate", "description": "Retrieve the real interest rate of European central bank.", "parameters": { "type": "object", "properties": { "date": { "type": "string", "description": "The date for which to fetch the rate, e.g., YYYY-MM-DD" } }, "required": ["date"] } } } ] }' # Save the agent_id
Using an Agent with Function Calling:
Start Conversation: Send the user query. The agent might respond with a `tool_calls` entry if it decides to use the function.
cURL (Start Conversation):

```bash
curl --location "https://api.mistral.ai/v1/conversations" \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --data '{
    "inputs": [
      {
        "role": "user",
        "content": "Whats the current 2025 real interest rate for ECB?",
        "object": "entry",
        "type": "message.input"
      }
    ],
    "stream": false,
    "agent_id": "<agent_id_with_function>"
  }'
```
Expected Partial Response (if function is called): The response's `outputs` would contain an entry where `type` is `tool.calls` (or similar, based on the exact API spec for agent function calls), including a `tool_call_id` and the function name and arguments.

Provide Function Result: After your client executes the function, send the result back to continue the conversation, referencing the `tool_call_id`.
cURL (Continue Conversation with Function Result):

```bash
CONV_ID="<conversation_id_from_previous_step>"
TOOL_CALL_ID="<tool_call_id_from_previous_step>"   # e.g., "6TI17yZkV"

curl --location "https://api.mistral.ai/v1/conversations/$CONV_ID" \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --data-raw '{
    "inputs": [
      {
        "tool_call_id": "'"$TOOL_CALL_ID"'",
        "result": "{\"date\": \"2025-01-15\", \"interest_rate\": \"2.75%\"}",
        "object": "entry",
        "type": "function.result"
      }
    ],
    "stream": false,
    "store": true,
    "handoff_execution": "server"
  }'
```
The agent will then use this result to formulate its final answer.
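For reference, a rough Python equivalent of this round trip might look like the sketch below. It assumes the SDK exposes a `FunctionResultEntry` type and an `append` method mirroring the `function.result` entry in the cURL call above; the hard-coded rate stands in for whatever your client-side function actually returns, and the exact entry type name for a pending function call may differ from the `tool.calls` label used above.

```python
import json

from mistralai import Mistral, FunctionResultEntry  # FunctionResultEntry assumed to exist in the SDK

client = Mistral(api_key="YOUR_API_KEY")

# 1. Start the conversation; the agent may answer with a function-call entry.
response = client.beta.conversations.start(
    agent_id="<agent_id_with_function>",
    inputs="Whats the current 2025 real interest rate for ECB?",
)
last_entry = response.outputs[-1]

if "call" in getattr(last_entry, "type", ""):  # a function/tool call entry
    # 2. Execute the function on your side (the result is hard-coded here for illustration).
    result = {"date": "2025-01-15", "interest_rate": "2.75%"}

    # 3. Send the result back, referencing the tool_call_id, so the agent can finish its answer.
    result_entry = FunctionResultEntry(
        tool_call_id=last_entry.tool_call_id,
        result=json.dumps(result),
    )
    response = client.beta.conversations.append(
        conversation_id=response.conversation_id,
        inputs=[result_entry],
    )

print(response.outputs[-1].content)
```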
- MCP Orchestration can also be used for local function calling, as shown in the MCP example with `run_ctx.register_func`.
Supported Models
Initially, the Agents API supports `mistral-medium-latest` and `mistral-large-latest`. Mistral AI plans to enable more models in the future.
Inherited Features: Structured Outputs, Document Understanding, Citations
The Agents API also benefits from features available in the Chat Completions API:
- Structured Outputs: Agents can be guided to produce outputs in a specific JSON schema, useful for programmatic consumption (demonstrated with `output_format=WeatherResult` in the MCP example).
- Document Understanding: Models can process and understand content from provided documents (related to the Document Library connector).
- Citations: For RAG-related tool usages (like Web Search and Document Library), agents can provide citations (`tool_reference` chunks) indicating the source of the information used in their responses. This is crucial for transparency and fact-checking.
Deployment and Integration Strategies
Mistral models, including those powering agents, can be deployed in various environments.
Self-Deployment with vLLM
vLLM is an open-source LLM inference and serving engine, well-suited for on-premise deployment of Mistral models.
Prerequisites:
- Hardware meeting vLLM requirements.
- Hugging Face authentication (HF_TOKEN with READ permission) if sourcing weights from Hugging Face. Ensure terms are accepted on model cards.
- Alternatively, use local model artifacts by pointing vLLM to their path.
Installation & Setup:
```bash
pip install vllm   # Ensure version >= 0.6.1.post1 for compatibility
huggingface-cli login --token $HF_TOKEN
```
Offline Mode Inference (Batch Processing): Example (Mistral NeMo / Mistral Small - Text Input):
```python
from vllm import LLM, SamplingParams

# For Mistral NeMo / Small:
model_name = "mistralai/Mistral-Nemo-Instruct-2407"  # Or mistralai/Mistral-Small-latest
# For Pixtral-12B (multimodal):
# model_name = "mistralai/Pixtral-12B-Instruct-v0.1"

sampling_params = SamplingParams(max_tokens=1024, temperature=0.7)  # Adjust as needed

llm = LLM(
    model=model_name,
    tokenizer_mode="mistral",  # Important for Mistral models
    # For newer models, these might be auto-detected or use 'auto':
    # load_format="mistral",
    # config_format="mistral",
    # For Pixtral, you might need additional multimodal configurations
    # enable_lora=True,  # if using LoRA adapters
)

messages = [{"role": "user", "content": "Who is the best French painter? Explain briefly."}]

# For Pixtral, messages would be a list of dictionaries with 'role' and 'content',
# where content can be a list of text and image items:
# messages_pixtral = [{
#     "role": "user",
#     "content": [
#         {"type": "text", "text": "Describe this image."},
#         {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}},  # Base64 encoded image
#     ]
# }]

# For text models:
outputs = llm.generate(
    prompt_token_ids=[llm.get_tokenizer().encode(messages[0]["content"])],
    sampling_params=sampling_params,
)
# Or use llm.chat() for chat-formatted prompts
# (newer vLLM versions support llm.chat directly):
# outputs = llm.chat(messages=messages, sampling_params=sampling_params)
# print(outputs[0].outputs[0].text)  # Accessing generated text

# For Pixtral (example using generate for multimodal, check vLLM docs for latest API):
# from vllm.multimodal.utils import process_image_pil
# from PIL import Image
# image = Image.open("path_to_your_image.jpg")
# processed_image_input = {"image": image}  # This API might change
# outputs = llm.generate(
#     prompts=["Describe this image <image>"],  # Placeholder for image
#     multi_modal_data=processed_image_input,
#     sampling_params=sampling_params,
# )
# print(outputs[0].outputs[0].text)
```
Note: Pixtral usage with vLLM requires specific multimodal setup; consult vLLM documentation for the most current methods.
Server Mode Inference (OpenAI Compatible API): Start Server (e.g., Mistral NeMo):
```bash
vllm serve mistralai/Mistral-Nemo-Instruct-2407 \
  --tokenizer_mode mistral \
  --load_format mistral \
  --config_format mistral
  # --port 8000 (default)
  # --host 0.0.0.0
  # --tensor-parallel-size N (for multi-GPU)
```
Query Server (cURL):
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "mistralai/Mistral-Nemo-Instruct-2407",
    "messages": [
      {"role": "user", "content": "Who is the best French painter? One short sentence."}
    ],
    "max_tokens": 50
  }'
# The Authorization token can often be arbitrary for a local vLLM server.
```
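Because the server exposes an OpenAI-compatible API, you can also query it with the standard `openai` Python client. A minimal sketch (the API key is arbitrary for a default local vLLM server):

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-local-anything")

completion = client.chat.completions.create(
    model="mistralai/Mistral-Nemo-Instruct-2407",
    messages=[{"role": "user", "content": "Who is the best French painter? One short sentence."}],
    max_tokens=50,
)
print(completion.choices[0].message.content)
```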
Deploying with Docker: Use vLLM's official Docker image.
```bash
export HF_TOKEN=your-hugging-face-access-token

docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=${HF_TOKEN}" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-Nemo-Instruct-2407 \
  --tokenizer_mode mistral \
  --load_format mistral \
  --config_format mistral
# Add other vLLM CLI args as needed
```
Deploy with Cloudflare Workers AI
Cloudflare Workers AI allows running LLMs on serverless GPUs across Cloudflare’s global network.
Setup:
- Create a Cloudflare account.
- Obtain your Account ID.
- Generate an API Token with Workers AI permissions.
Send Completion Request (cURL Example for Mistral-7B):
ACCOUNT_ID="your_account_id" API_TOKEN="your_api_token" curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai/run/@cf/mistral/mistral-7b-instruct-v0.1" \ -X POST \ -H "Authorization: Bearer $API_TOKEN" \ -d '{ "messages": [{ "role": "user", "content": "[INST] What is 2 + 2 ? [/INST]" }]}'
Expected Output:
{"result":{"response":" 2 + 2 = 4."},"success":true,"errors":[],"messages":[]}
Python and TypeScript SDKs/examples are also typically available from Cloudflare.
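For example, the same request can be made from Python with the `requests` library. A minimal sketch (the environment variable names here are arbitrary placeholders):

```python
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]   # your Cloudflare account ID
API_TOKEN = os.environ["CF_API_TOKEN"]     # a token with Workers AI permissions

url = (
    "https://api.cloudflare.com/client/v4/accounts/"
    f"{ACCOUNT_ID}/ai/run/@cf/mistral/mistral-7b-instruct-v0.1"
)
payload = {"messages": [{"role": "user", "content": "[INST] What is 2 + 2 ? [/INST]"}]}

resp = requests.post(url, headers={"Authorization": f"Bearer {API_TOKEN}"}, json=payload)
resp.raise_for_status()
print(resp.json()["result"]["response"])
```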
Getting Started and Resources
Embarking on your journey with the Mistral Agent API is straightforward:
- Consult the Official Documentation: The primary source for detailed API references, guides, and updates.
- Create Your First Agent: Experiment by defining a simple agent with a built-in connector like Web Search or Code Interpreter.
- Explore the Cookbooks: Mistral provides practical examples and tutorials (cookbooks) showcasing agentic workflows for various applications:
- GitHub Agent: Automating software development tasks.
- Linear Tickets Assistant: Managing project deliverables.
- Financial Analyst: Sourcing financial metrics and compiling insights.
- Travel Assistant: Planning trips and booking accommodations.
- Food Diet Companion: Tracking nutrition and meal planning.
Frequently Asked Questions (FAQ)
- Which models are supported?
Currently, `mistral-medium-latest` and `mistral-large-latest` are supported, with plans to enable more models soon.
Conclusion: The Future is Agentic
The Mistral Agent API represents a pivotal step towards more capable and autonomous AI systems. By providing a robust framework for built-in tools, external system integration via MCP and function calling, persistent memory, and sophisticated agent orchestration, Mistral AI empowers developers to build AI agents that can actively engage with complex problems and digital environments.
The ability to chain specialized agents, each contributing unique skills, unlocks possibilities for automating intricate workflows, performing deep research, and creating highly interactive and intelligent applications. From coding assistants that manage repositories to financial analysts that synthesize market data, the potential applications are vast and transformative.
As enterprises increasingly seek to integrate AI into core operations, the Mistral Agent API offers a powerful, flexible, and scalable solution. The journey into agentic AI is just beginning, and with tools like the Mistral Agent API, developers are well-equipped to architect the next generation of intelligent systems. The combination of advanced language models with actionable capabilities heralds a new era where AI transitions from a passive information provider to an active collaborator and problem-solver.