How to Use Mistral Agents API (Quick Guide)
Published: May 27, 2025
Introduction: The Dawn of Action-Oriented AI
Mistral AI has recently unveiled its new Agents API, a significant advancement designed to elevate artificial intelligence from a passive text generator to an active, problem-solving partner. While traditional large language models (LLMs) have demonstrated remarkable proficiency in generating human-like text, their inherent limitations lie in their inability to perform actions, interact with external systems directly, or maintain persistent context over extended interactions. The Mistral Agents API directly addresses these shortcomings by synergizing Mistral's cutting-edge language models with a robust framework for agentic capabilities.
This new API is engineered with three core pillars:
- Built-in Connectors and MCP Tool Integration: Providing agents with out-of-the-box abilities to execute code, search the web, generate images, access document libraries, and leverage a wide array of external tools through the Model Context Protocol (MCP).
- Persistent Memory Across Conversations: Enabling agents to maintain context and history, leading to more coherent and meaningful long-term interactions.
- Agentic Orchestration Capabilities: Allowing for the coordination of multiple specialized agents to tackle complex, multi-step tasks collaboratively.
The Agents API is not merely an extension but a powerful complement to the existing Chat Completion API. It offers a dedicated, streamlined framework specifically for implementing sophisticated agentic use cases, positioning itself as the foundational technology for enterprise-grade agentic platforms. By empowering AI agents to reliably handle intricate tasks, maintain crucial context, and orchestrate multiple actions, the Agents API unlocks new frontiers for enterprises to deploy AI in more practical, impactful, and transformative ways.
This article will provide a comprehensive technical exploration of the Mistral Agent API, delving into its core concepts, functionalities, advanced features, and deployment strategies. We aim to equip developers and architects with the knowledge required to build powerful AI agents capable of tackling real-world challenges.
Core Concepts: Understanding the Building Blocks
To fully leverage the Mistral Agent API, it's crucial to understand its fundamental components and how they interact.
What are AI Agents?
In the context of the Mistral API, AI agents are autonomous systems powered by LLMs. Given high-level instructions, these agents can:
- Plan: Decompose complex goals into manageable steps.
- Use Tools: Interact with various built-in connectors or external tools (via MCP or function calling) to gather information or perform actions.
- Carry out Processing Steps: Analyze information, make decisions, and adapt their strategy based on new inputs.
- Take Actions: Execute tasks to achieve their specified goals.
These agents utilize advanced natural language processing to understand and execute intricate tasks efficiently. Furthermore, they possess the capability to collaborate, handing off tasks to other agents with specialized skills, thereby achieving more sophisticated outcomes than a single agent could alone.
The Agents API provides developers with the infrastructure to build such agents, supported by features like:
- Access to multiple multimodal models (text and vision).
- Persistent state across conversations.
- The ability to converse with base models, a single agent, or orchestrate multiple agents.
- A suite of built-in connector tools.
- Handoff capabilities for complex workflow creation.
- Support for features from the chat completions endpoint, including Structured Outputs, Document Understanding, Tool Usage, and Citations.
Built-in Connectors: Equipping Agents with Essential Tools
Connectors are pre-built tools, deployed and managed by Mistral, that agents can call upon demand to perform specific tasks. These significantly expand an agent's capabilities beyond text generation.
1. Code Execution (Code Interpreter)
The Code Interpreter connector empowers agents to execute Python code within a secure, sandboxed environment. This is invaluable for tasks requiring computation, data manipulation, or visualization.
Capabilities:
- Mathematical calculations and analysis.
- Data visualization and plotting (e.g., generating graphs from data).
- Scientific computing and simulations.
- Code validation and execution of user-provided snippets.
Creating a Code Interpreter Agent: You can instantiate an agent with code execution capabilities by including `code_interpreter` in its toolset.
Python Example:
```python
from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")

code_agent = client.beta.agents.create(
    model="mistral-medium-latest",  # Or mistral-large-latest
    name="Coding Agent",
    description="Agent used to execute code using the interpreter tool.",
    instructions="Use the code interpreter tool when you have to run code.",
    tools=[{"type": "code_interpreter"}],
    completion_args={
        "temperature": 0.3,
        "top_p": 0.95,
    },
)
# agent_id will be in code_agent.id
```
cURL Example:
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-medium-latest", "name": "Coding Agent", "description": "Agent used to execute code using the interpreter tool.", "instructions": "Use the code interpreter tool when you have to run code.", "tools": [{"type": "code_interpreter"}], "completion_args": { "temperature": 0.3, "top_p": 0.95 } }'
Conversation and Output: When an agent uses the Code Interpreter, the API response will detail the execution.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=code_agent.id,
    inputs="Run a fibonacci function for the first 20 values.",
)
# Process response.outputs
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "message.output", // Initial agent response "content": "Sure, I can help with that...", // ... other metadata }, { "type": "tool.execution", "name": "code_interpreter", "id": "tool_exec_...", "info": { "code": "def fibonacci(n):\n # ... (fibonacci code)\nfibonacci_20 = fibonacci(20)\nfibonacci_20", "code_output": "[0, 1, 1, ... , 4181]\n" } // ... other metadata }, { "type": "message.output", // Final agent response with results "content": "The first 20 values of the Fibonacci sequence are:\n\n[0, 1, ... , 4181]", // ... other metadata } ], "usage": { /* ... token usage details ... */ } }
The `tool.execution` entry shows the `name` of the tool, and its `info` block contains the `code` that was executed and the corresponding `code_output`.
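In practice, you will often want to pull the executed code and its result back out of the conversation response. The following is a minimal sketch that assumes attribute-style access on the SDK's output entries, with field names (`info`, `code`, `code_output`) taken from the simplified structure above.

```python
# Walk the conversation outputs and print what the Code Interpreter ran.
for entry in response.outputs:
    entry_type = getattr(entry, "type", None)
    if entry_type == "tool.execution" and entry.name == "code_interpreter":
        # 'info' may be a dict or an object depending on the SDK version.
        print("Executed code:\n", entry.info["code"])
        print("Execution output:\n", entry.info["code_output"])
    elif entry_type == "message.output":
        print("Agent message:\n", entry.content)
```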
2. Image Generation
Powered by Black Forest Labs' FLUX1.1 [pro] Ultra, the image generation connector allows agents to create diverse images based on textual prompts.
Use Cases:
- Generating visual aids for educational content.
- Creating custom graphics for marketing materials.
- Producing artistic images or illustrations.
Creating an Image Generation Agent: Include `image_generation` in the agent's tool configuration.
Python Example:
```python
image_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Image Generation Agent",
    description="Agent used to generate images.",
    instructions="Use the image generation tool when you have to create images.",
    tools=[{"type": "image_generation"}],
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: When an image is generated, the response includes a reference to the image file.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=image_agent.id,
    inputs="Generate an orange cat in an office.",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "image_generation", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "Here is your image: an orange cat in an office.\n\n" }, { "type": "tool_file", "tool": "image_generation", "file_id": "933c5b5a-1c47-4cdd-84f6-f32526bd161b", // Crucial ID "file_name": "image_generated_0", "file_type": "png" } ], // ... other metadata } ], "usage": { /* ... */ } }
The `message.output` contains a `content` array. A chunk of `type: "tool_file"` provides the `file_id`, `file_name`, and `file_type` for the generated image.
Downloading Images: The `file_id` obtained from the `tool_file` chunk is used to download the image via the files endpoint.
Python Example:
```python
from mistralai.models import ToolFileChunk  # Assuming this or a similar class exists

# Assuming 'response' is the API response object
# and the relevant output is the last one.
image_chunk = None
for chunk in response.outputs[-1].content:
    if hasattr(chunk, "type") and chunk.type == "tool_file":  # Check for a ToolFileChunk-like object
        image_chunk = chunk
        break

if image_chunk:
    file_bytes = client.files.download(file_id=image_chunk.file_id).read()
    with open(f"{image_chunk.file_name}.{image_chunk.file_type}", "wb") as file:
        file.write(file_bytes)
```
3. Document Library (Beta)
This connector provides built-in Retrieval-Augmented Generation (RAG) capabilities, enabling agents to access and leverage information from documents uploaded to Mistral Cloud.
Key Features:
- Enhances agent knowledge with custom, user-provided data.
- Agents can query and retrieve relevant information from specified libraries.
Setup and Usage:
- Create Libraries: Currently, document libraries must be created via Le Chat (Mistral's chat interface).
- Permissions: To enable an Agent to access a library, an Organization admin must share it with the Organization.
- Library ID: The `library_id` is found in the URL of the library on Le Chat (e.g., `https://chat.mistral.ai/libraries/<library_id>`).

Creating a Document Library Agent: Specify the `document_library` tool type and provide the accessible `library_ids`.
Python Example:
```python
library_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Document Library Agent",
    description="Agent used to access documents from the document library.",
    instructions="Use the library tool to access external documents.",
    tools=[{
        "type": "document_library",
        "library_ids": ["<your_library_id_1>", "<your_library_id_2>"]
    }],
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: When the agent queries the document library, the output structure is similar to other tool executions, followed by the agent's synthesized answer based on the retrieved documents.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=library_agent.id,
    inputs="How does the vision encoder for Pixtral 12B work?",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "document_library", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "The vision encoder for Pixtral 12B, known as PixtralViT, is designed to process images..." // Detailed answer } // Potentially tool_reference chunks if citations are enabled and relevant ], // ... other metadata } ], "usage": { /* ... */ } }
The agent's response in `message.output` will contain the information retrieved and processed from the document library. This tool, like web search, can also use `tool_reference` chunks for citations if applicable.
4. Web Search
This connector enables agents to access up-to-date information from the internet, overcoming the knowledge cut-off limitations of LLMs.
Benefits:
- Provides current information on diverse topics.
- Allows agents to access specific websites or news sources.
- Significantly improves performance on knowledge-intensive tasks (e.g., SimpleQA benchmark scores for Mistral Large and Mistral Medium improved dramatically with web search).
Versions:
- `web_search`: A standard web search tool accessing a general search engine.
- `web_search_premium`: A more advanced version providing access to a search engine plus news agencies like AFP (Agence France-Presse) and AP (Associated Press).

Creating a Web Search Agent: Include `"web_search"` or `"web_search_premium"` in the agent's tools.
Python Example:
```python
websearch_agent = client.beta.agents.create(
    model="mistral-medium-latest",
    name="Websearch Agent",
    description="Agent able to search information over the web...",
    instructions="You have the ability to perform web searches with `web_search` to find up-to-date information.",
    tools=[{"type": "web_search"}],  # or "web_search_premium"
    completion_args={},  # e.g. temperature, top_p
)
```
Conversation and Output: The API response includes details of the search execution and the agent's answer, often supplemented with source citations.
Example Request (Python):
```python
response = client.beta.conversations.start(
    agent_id=websearch_agent.id,
    inputs="Who won the last European Football cup?",
)
```
Simplified JSON Output Structure:
{ "conversation_id": "conv_...", "outputs": [ { "type": "tool.execution", "name": "web_search", // ... metadata }, { "type": "message.output", "content": [ { "type": "text", "text": "The last winner of the European Football Cup was Spain..." }, { "type": "tool_reference", // Citation "tool": "web_search", "title": "UEFA Euro Winners List...", "url": "https://www.marca.com/en/football/uefa-euro/winners.html", "source": "brave" // Example search provider }, // ... more tool_reference chunks and text chunks ], // ... other metadata } ], "usage": { /* ... token usage, including connector_tokens ... */ } }
The `message.output` `content` array interleaves `text` chunks (the agent's answer) with `tool_reference` chunks. Each `tool_reference` provides metadata about the source (title, URL, source provider) used for that part of the answer, enhancing transparency and enabling fact-verification.
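To use these citations programmatically, you can separate the answer text from its sources. The sketch below assumes attribute-style access and the chunk fields shown in the simplified structure above.

```python
# Split the final message into answer text and cited sources.
answer_parts, sources = [], []
for chunk in response.outputs[-1].content:
    chunk_type = getattr(chunk, "type", None)
    if chunk_type == "text":
        answer_parts.append(chunk.text)
    elif chunk_type == "tool_reference":
        sources.append(f"{chunk.title} ({chunk.url})")

print("".join(answer_parts))
print("Sources:", "; ".join(sources))
```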
MCP (Model Context Protocol) Tools: Bridging Agents and External Systems
Beyond built-in connectors, the Agents API SDK supports tools built on the Model Context Protocol (MCP). MCP is an open, standardized protocol designed to facilitate seamless and secure integration between AI agents and a vast array of external systems.
Purpose of MCP:
- Enables agents to access real-world context, including APIs, databases, user-specific data, documents, and other dynamic resources.
- Replaces fragmented, custom integrations with a unified protocol.
- Helps AI models produce more relevant and accurate responses by connecting them to live data.
Using MCP with Mistral Agents: The Mistral Python SDK provides utilities for integrating agents with MCP Clients, allowing agents to call functions or services exposed by MCP servers.
Example: Local MCP Server for Weather Information
This involves setting up a local Python script that acts as an MCP server and an agent that queries it.
Initialize Mistral Client and MCP Components:
```python
# stdio_mcp_weather_example.py
import asyncio
import os
from pathlib import Path

from mistralai import Mistral
from mistralai.extra.run.context import RunContext
from mistralai.extra.mcp.stdio import MCPClientSTDIO
from mistralai.types import BaseModel  # For structured output
from mcp import StdioServerParameters  # From the 'mcp' package

# Set the current working directory and model to use
cwd = Path(__file__).parent
MODEL = "mistral-medium-latest"  # or mistral-large-latest


async def main() -> None:
    api_key = os.environ.get("MISTRAL_API_KEY")
    if not api_key:
        raise ValueError("MISTRAL_API_KEY environment variable not set.")
    client = Mistral(api_key=api_key)

    # Define parameters for the local MCP server.
    # Assumes 'mcp_servers/stdio_server.py' exists relative to this script
    # and implements the MCP logic (e.g., a weather service).
    server_params = StdioServerParameters(
        command="python",  # Or your server executable
        args=[str((cwd / "mcp_servers/stdio_server.py").resolve())],
        env=None,  # Or specific environment variables for the server
    )

    # Create an agent
    weather_agent = client.beta.agents.create(
        model=MODEL,
        name="Weather Teller Agent (MCP)",
        instructions="You can tell the weather using available tools.",
        description="An agent that uses an MCP server to fetch weather.",
    )

    # Define the expected structured output format
    class WeatherResult(BaseModel):
        user: str
        location: str
        temperature: float

    # Create a run context for the agent
    async with RunContext(
        agent_id=weather_agent.id,
        output_format=WeatherResult,  # For structured output
        continue_on_fn_error=True,
    ) as run_ctx:
        # Create and register an MCP client with the run context.
        # This client will communicate with stdio_server.py.
        mcp_client = MCPClientSTDIO(stdio_params=server_params)
        await run_ctx.register_mcp_client(mcp_client=mcp_client)

        # (Optional) Register local Python functions as tools too
        import random

        @run_ctx.register_func
        def get_location(name: str) -> str:
            """Function to get a mock location of a user."""
            return random.choice(["New York", "London", "Paris"])

        # Run the agent with a query.
        # The agent will decide whether to use get_location or the MCP tool.
        run_result = await client.beta.conversations.run_async(
            run_ctx=run_ctx,
            inputs="Tell me the weather in John's location currently.",
        )

        print("All run entries:")
        for entry in run_result.output_entries:
            print(f"{entry}")
            print()

        if run_result.output_as_model:
            print(f"Final model output (structured): {run_result.output_as_model}")
        else:
            print(f"Final model output (raw): {run_result.output_entries[-1] if run_result.output_entries else 'N/A'}")


if __name__ == "__main__":
    # Ensure mcp_servers/stdio_server.py exists and is executable.
    # Example stub for mcp_servers/stdio_server.py:
    #     #!/usr/bin/env python
    #     import sys, json
    #     def get_weather(location: str): return {"temperature": 25.5, "condition": "Sunny"}
    #     if __name__ == "__main__":
    #         for line in sys.stdin:
    #             req = json.loads(line)
    #             # Simplified MCP interaction: assumes a 'tool_call' with 'get_weather_for_location'
    #             if req.get("tool_calls") and req["tool_calls"][0].get("function", {}).get("name") == "get_weather_for_location":
    #                 loc = json.loads(req["tool_calls"][0]["function"]["arguments"]).get("location")
    #                 res = {"tool_results": [{"id": req["tool_calls"][0]["id"], "result": get_weather(loc)}]}
    #                 sys.stdout.write(json.dumps(res) + "\n")
    #                 sys.stdout.flush()
    #             else:  # Echo back if not a tool call, for basic MCP handshake
    #                 sys.stdout.write(json.dumps({"message": "MCP Server Ready"}) + "\n")
    #                 sys.stdout.flush()
    asyncio.run(main())
```
Note: The `stdio_server.py` in this example is highly simplified. A real MCP server would adhere to the full MCP specification for tool registration, invocation, and response.

Streaming Conversations with MCP: Similar to non-streaming, but results are processed asynchronously as they arrive.
```python
# Inside the RunContext block
events = await client.beta.conversations.run_stream_async(
    run_ctx=run_ctx,
    inputs="Tell me the weather in John's location currently.",
)

run_result_stream = None
async for event in events:
    if isinstance(event, client.beta.conversations.RunResult):  # Adjust class name as per SDK
        run_result_stream = event
    else:
        print(f"Stream event: {event}")  # Print intermediate events/chunks

if not run_result_stream:
    raise RuntimeError("No run result found from stream")
# Process run_result_stream similar to run_result
```
Memory and Context: Enabling Stateful Conversations
A key differentiator of the Agents API is its robust conversation management system, which is inherently stateful.
- Persistent Context: Each conversation retains its history, allowing for coherent and contextually aware interactions over time. Developers are relieved from manually tracking conversation history.
- Starting Conversations:
- With an Agent: Initiate a conversation using a specific `agent_id` to leverage its pre-defined tools, instructions, and model.

```python
response = client.beta.conversations.start(
    agent_id="ag_xxxxxxxx",
    inputs="Initial user query.",
)
conversation_id = response.conversation_id
```
- Direct Access: Start a conversation by directly specifying the `model` and `completion_args`, providing quick access to built-in connectors without pre-defining an agent.
- Conversation History: Interactions are stored as structured `conversation entries`, ensuring context is preserved.
- Developers can view past conversation entries.
- Any conversation can be continued from its last point.
- New conversation paths (branches) can be initiated from any previous point in an existing conversation.
- Streaming Output: The API supports streaming for both starting new conversations and continuing existing ones, enabling real-time updates and interactive experiences.
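As an illustration of these stateful operations, the sketch below continues and then branches a stored conversation. It assumes the SDK exposes `append` and `restart` methods on `client.beta.conversations` (and a `from_entry_id` parameter for branching); the exact names may differ in your SDK version.

```python
# Continue an existing conversation from its last point
# (conversation_id comes from the start() call shown above).
followup = client.beta.conversations.append(
    conversation_id=conversation_id,
    inputs="Thanks - now summarize your previous answer in two sentences.",
)

# Branch: restart the conversation from an earlier entry to explore an alternative path.
# 'from_entry_id' is assumed to reference a previous conversation entry's id.
branch = client.beta.conversations.restart(
    conversation_id=conversation_id,
    from_entry_id=followup.outputs[0].id,
    inputs="Instead, answer assuming the reader is a beginner.",
)
```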
Agent Orchestration: Collaborative Problem Solving
The true power of the Agents API emerges in its ability to orchestrate multiple agents to solve complex, multi-faceted problems.
Dynamic Orchestration
Agents can be dynamically added to or removed from a conversation as needed. Each agent contributes its unique capabilities (specialized tools, different models, or specific instructions) to tackle different parts of a larger problem.
Creating an Agentic Workflow with Handoffs
Create Necessary Agents: Begin by creating all the individual agents that will participate in the workflow. Each agent can be tailored with specific tools, models, and instructions. For example, one might create:
- A
finance-agent
(e.g., usingmistral-large-latest
for complex reasoning). - A
web-search-agent
(equipped with theweb_search
tool). - A
calculator-agent
(potentially withcode_interpreter
or a custom function calling tool). - An
ecb-interest-rate-agent
(using function calling for a specific API). - A
graph-plotting-agent
(usingcode_interpreter
for visualizations).
cURL Example (Creating a Finance Agent):
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-large-latest", "name": "finance-agent", "description": "Agent used to answer financial related requests" }' # Save the returned agent_id as <finance_agent_id>
Repeat for other agents like
web_search_agent
(tool:web_search
),graph_agent_id
(tool:code_interpreter
),calculator_agent_id
(tool:code_interpreter
).- A
Define Handoff Responsibilities: Once agents are created, update them to specify which other agents they can hand off tasks to. This is done via the
handoffs
parameter, which takes a list ofagent_id
s.cURL Example (Updating Web Search Agent with Handoffs):
curl --location "https://api.mistral.ai/v1/agents/<web_search_agent_id>" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "handoffs": ["<graph_agent_id>", "<calculator_agent_id>"] }'
This configuration allows the `web-search-agent` to delegate tasks to the `graph-plotting-agent` or the `calculator-agent`
if the conversation requires their specific capabilities.
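The same handoff wiring can be sketched in Python. This is a minimal example assuming the SDK exposes `client.beta.agents.update` with a `handoffs` parameter mirroring the cURL call above; the agent ID variables are placeholders for the IDs returned when the agents were created.

```python
# Allow the web search agent to delegate to the graph and calculator agents.
client.beta.agents.update(
    agent_id=web_search_agent_id,
    handoffs=[graph_agent_id, calculator_agent_id],
)

# Allow the finance agent (the usual entry point) to delegate to the others.
client.beta.agents.update(
    agent_id=finance_agent_id,
    handoffs=[web_search_agent_id, ecb_agent_id, calculator_agent_id],
)
```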
How Handoffs Work
A single user request to an initial agent (e.g., `finance-agent`) can trigger a chain of actions across multiple agents. Each agent handles the part of the request relevant to its specialization.
`handoff_execution` Parameter: This parameter in the conversation API call controls how handoffs are managed:
- `server` (default): Handoffs are executed internally on Mistral's cloud servers. The process is seamless to the end-user until a final response or further input is required.
- `client`: When a handoff is triggered, the API returns a response to the client, indicating the pending handoff. This allows the client application to manage or even override the handoff logic, providing more control.
Example Workflow A: US Interest Rate and Compounding
User Query: "Fetch the current US bank interest rate and calculate the compounded effect if investing for the next 10 years."
Expected Flow (Server-side Handoff):
- Request goes to
finance-agent
. finance-agent
determines it needs current interest rates and calculation.- It might hand off to
web-search-agent
to find the "current US bank interest rate." web-search-agent
performs the search and returns the rate.- The context (including the rate) might then be handed off (or back to
finance-agent
which then hands off) tocalculator-agent
. calculator-agent
uses itscode_interpreter
or a function to calculate the compounded interest over 10 years.- The final result is synthesized and returned to the user.
- Request goes to
Example Workflow B: ECB Interest Rate and Graph Plotting
User Query: "Given the interest rate of the European Central Bank as of Jan 2025, plot a graph of the compounded interest rate over the next 10 years."
Expected Flow (Potentially involving client-side handoff for function call fulfillment if local):
- Request to
finance-agent
. finance-agent
identifies need for ECB rate and graph.- Handoff to
ecb-interest-rate-agent
(which uses function calling for a specific, possibly client-managed, API to get the rate for "Jan 2025").- If
handoff_execution: client
is used, and the function is client-side, the API would signal the client to execute the function and provide the result back.
- If
- Once the rate is obtained, context is handed to
graph-plotting-agent
. graph-plotting-agent
usescode_interpreter
to generate the plot data and potentially an image of the graph.- The graph (or its
file_id
) and analysis are returned. The image can be downloaded using itsfile_id
(e.g.,curl ... /files/<file_id>/content
).
- Request to
Advanced Features and Usage
Function Calling
Complementing built-in connectors and MCP, agents can use standard function calling by defining a JSON schema for custom functions. This allows agents to interact with any external API or proprietary system.
Defining Functions: When creating or updating an agent, provide a tool of `type: "function"` with a `function` object detailing its `name`, `description`, and `parameters` (as a JSON schema).
cURL Example (Creating an Agent with a Function):
curl --location "https://api.mistral.ai/v1/agents" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --header "Authorization: Bearer $MISTRAL_API_KEY" \ --data '{ "model": "mistral-medium-latest", "name": "ecb-interest-rate-agent", "description": "Can find the current interest rate of the European central bank", "instructions": "You can provide interest rate and information regarding the European central bank.", "tools": [ { "type": "function", "function": { "name": "get_european_central_bank_interest_rate", "description": "Retrieve the real interest rate of European central bank.", "parameters": { "type": "object", "properties": { "date": { "type": "string", "description": "The date for which to fetch the rate, e.g., YYYY-MM-DD" } }, "required": ["date"] } } } ] }' # Save the agent_id
Using an Agent with Function Calling:
Start Conversation: Send the user query. The agent might respond with a `tool_calls` entry if it decides to use the function.
cURL (Start Conversation):

```bash
curl --location "https://api.mistral.ai/v1/conversations" \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --data '{
    "inputs": [
      {
        "role": "user",
        "content": "Whats the current 2025 real interest rate for ECB?",
        "object": "entry",
        "type": "message.input"
      }
    ],
    "stream": false,
    "agent_id": "<agent_id_with_function>"
  }'
```
Expected Partial Response (if function is called): The response's `outputs` would contain an entry where `type` is `tool.calls` (or similar, based on the exact API spec for agent function calls), including a `tool_call_id` and the function name and arguments.

Provide Function Result: After your client executes the function, send the result back to continue the conversation, referencing the `tool_call_id`.
cURL (Continue Conversation with Function Result):

```bash
CONV_ID="<conversation_id_from_previous_step>"
TOOL_CALL_ID="<tool_call_id_from_previous_step>"   # e.g., "6TI17yZkV"

curl --location "https://api.mistral.ai/v1/conversations/$CONV_ID" \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header "Authorization: Bearer $MISTRAL_API_KEY" \
  --data-raw '{
    "inputs": [
      {
        "tool_call_id": "'"$TOOL_CALL_ID"'",
        "result": "{\"date\": \"2025-01-15\", \"interest_rate\": \"2.75%\"}",
        "object": "entry",
        "type": "function.result"
      }
    ],
    "stream": false,
    "store": true,
    "handoff_execution": "server"
  }'
```
The agent will then use this result to formulate its final answer.
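For reference, a rough Python equivalent of this round trip might look like the sketch below. It assumes the SDK exposes a `FunctionResultEntry` type and an `append` method mirroring the `function.result` entry in the cURL call above; the hard-coded rate stands in for whatever your client-side function actually returns, and the exact entry type name for a pending function call may differ from the `tool.calls` label used above.

```python
import json

from mistralai import Mistral, FunctionResultEntry  # FunctionResultEntry assumed to exist in the SDK

client = Mistral(api_key="YOUR_API_KEY")

# 1. Start the conversation; the agent may answer with a function-call entry.
response = client.beta.conversations.start(
    agent_id="<agent_id_with_function>",
    inputs="Whats the current 2025 real interest rate for ECB?",
)
last_entry = response.outputs[-1]

if "call" in getattr(last_entry, "type", ""):  # a function/tool call entry
    # 2. Execute the function on your side (the result is hard-coded here for illustration).
    result = {"date": "2025-01-15", "interest_rate": "2.75%"}

    # 3. Send the result back, referencing the tool_call_id, so the agent can finish its answer.
    result_entry = FunctionResultEntry(
        tool_call_id=last_entry.tool_call_id,
        result=json.dumps(result),
    )
    response = client.beta.conversations.append(
        conversation_id=response.conversation_id,
        inputs=[result_entry],
    )

print(response.outputs[-1].content)
```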
- MCP Orchestration can also be used for local function calling, as shown in the MCP example with `run_ctx.register_func`.
Supported Models
Initially, the Agents API supports `mistral-medium-latest` and `mistral-large-latest`. Mistral AI plans to enable more models in the future.
Inherited Features: Structured Outputs, Document Understanding, Citations
The Agents API also benefits from features available in the Chat Completions API:
- Structured Outputs: Agents can be guided to produce outputs in a specific JSON schema, useful for programmatic consumption (demonstrated with `output_format=WeatherResult` in the MCP example).
- Document Understanding: Models can process and understand content from provided documents (related to the Document Library connector).
- Citations: For RAG-related tool usages (like Web Search and Document Library), agents can provide citations (`tool_reference` chunks) indicating the source of the information used in their responses. This is crucial for transparency and fact-checking.
Deployment and Integration Strategies
Mistral models, including those powering agents, can be deployed in various environments.
Self-Deployment with vLLM
vLLM is an open-source LLM inference and serving engine, well-suited for on-premise deployment of Mistral models.
Prerequisites:
- Hardware meeting vLLM requirements.
- Hugging Face authentication (HF_TOKEN with READ permission) if sourcing weights from Hugging Face. Ensure terms are accepted on model cards.
- Alternatively, use local model artifacts by pointing vLLM to their path.
Installation & Setup:
```bash
pip install vllm   # Ensure version >= 0.6.1.post1 for compatibility
huggingface-cli login --token $HF_TOKEN
```
Offline Mode Inference (Batch Processing): Example (Mistral NeMo / Mistral Small - Text Input):
```python
from vllm import LLM, SamplingParams

# For Mistral NeMo / Small:
model_name = "mistralai/Mistral-Nemo-Instruct-2407"  # Or mistralai/Mistral-Small-latest
# For Pixtral-12B (multimodal):
# model_name = "mistralai/Pixtral-12B-Instruct-v0.1"

sampling_params = SamplingParams(max_tokens=1024, temperature=0.7)  # Adjust as needed

llm = LLM(
    model=model_name,
    tokenizer_mode="mistral",  # Important for Mistral models
    # For newer models, these might be auto-detected or use 'auto':
    # load_format="mistral",
    # config_format="mistral",
    # For Pixtral, you might need additional multimodal configurations
    # enable_lora=True,  # if using LoRA adapters
)

messages = [{"role": "user", "content": "Who is the best French painter? Explain briefly."}]

# For Pixtral, messages would be a list of dictionaries with 'role' and 'content',
# where content can be a list of text and image items:
# messages_pixtral = [{
#     "role": "user",
#     "content": [
#         {"type": "text", "text": "Describe this image."},
#         {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}},  # Base64 encoded image
#     ]
# }]

# For text models:
outputs = llm.generate(
    prompt_token_ids=[llm.get_tokenizer().encode(messages[0]["content"])],
    sampling_params=sampling_params,
)
# Or use llm.chat() for chat-formatted prompts
# (newer vLLM versions support llm.chat directly):
# outputs = llm.chat(messages=messages, sampling_params=sampling_params)
# print(outputs[0].outputs[0].text)  # Accessing generated text

# For Pixtral (example using generate for multimodal, check vLLM docs for latest API):
# from vllm.multimodal.utils import process_image_pil
# from PIL import Image
# image = Image.open("path_to_your_image.jpg")
# processed_image_input = {"image": image}  # This API might change
# outputs = llm.generate(
#     prompts=["Describe this image <image>"],  # Placeholder for image
#     multi_modal_data=processed_image_input,
#     sampling_params=sampling_params,
# )
# print(outputs[0].outputs[0].text)
```
Note: Pixtral usage with vLLM requires specific multimodal setup; consult vLLM documentation for the most current methods.
Server Mode Inference (OpenAI Compatible API): Start Server (e.g., Mistral NeMo):
```bash
vllm serve mistralai/Mistral-Nemo-Instruct-2407 \
  --tokenizer_mode mistral \
  --load_format mistral \
  --config_format mistral
  # --port 8000 (default)
  # --host 0.0.0.0
  # --tensor-parallel-size N (for multi-GPU)
```
Query Server (cURL):
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "mistralai/Mistral-Nemo-Instruct-2407",
    "messages": [
      {"role": "user", "content": "Who is the best French painter? One short sentence."}
    ],
    "max_tokens": 50
  }'
# The Authorization token can often be arbitrary for a local vLLM server.
```
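Because the server exposes an OpenAI-compatible API, you can also query it with the standard `openai` Python client. A minimal sketch (the API key is arbitrary for a default local vLLM server):

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-local-anything")

completion = client.chat.completions.create(
    model="mistralai/Mistral-Nemo-Instruct-2407",
    messages=[{"role": "user", "content": "Who is the best French painter? One short sentence."}],
    max_tokens=50,
)
print(completion.choices[0].message.content)
```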
Deploying with Docker: Use vLLM's official Docker image.
```bash
export HF_TOKEN=your-hugging-face-access-token

docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=${HF_TOKEN}" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-Nemo-Instruct-2407 \
  --tokenizer_mode mistral \
  --load_format mistral \
  --config_format mistral
# Add other vLLM CLI args as needed
```
Deploy with Cloudflare Workers AI
Cloudflare Workers AI allows running LLMs on serverless GPUs across Cloudflare’s global network.
Setup:
- Create a Cloudflare account.
- Obtain your Account ID.
- Generate an API Token with Workers AI permissions.
Send Completion Request (cURL Example for Mistral-7B):
ACCOUNT_ID="your_account_id" API_TOKEN="your_api_token" curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai/run/@cf/mistral/mistral-7b-instruct-v0.1" \ -X POST \ -H "Authorization: Bearer $API_TOKEN" \ -d '{ "messages": [{ "role": "user", "content": "[INST] What is 2 + 2 ? [/INST]" }]}'
Expected Output:
{"result":{"response":" 2 + 2 = 4."},"success":true,"errors":[],"messages":[]}
Python and TypeScript SDKs/examples are also typically available from Cloudflare.
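For example, the same request can be made from Python with the `requests` library. A minimal sketch (the environment variable names here are arbitrary placeholders):

```python
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]   # your Cloudflare account ID
API_TOKEN = os.environ["CF_API_TOKEN"]     # a token with Workers AI permissions

url = (
    "https://api.cloudflare.com/client/v4/accounts/"
    f"{ACCOUNT_ID}/ai/run/@cf/mistral/mistral-7b-instruct-v0.1"
)
payload = {"messages": [{"role": "user", "content": "[INST] What is 2 + 2 ? [/INST]"}]}

resp = requests.post(url, headers={"Authorization": f"Bearer {API_TOKEN}"}, json=payload)
resp.raise_for_status()
print(resp.json()["result"]["response"])
```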
Getting Started and Resources
Embarking on your journey with the Mistral Agent API is straightforward:
- Consult the Official Documentation: The primary source for detailed API references, guides, and updates.
- Create Your First Agent: Experiment by defining a simple agent with a built-in connector like Web Search or Code Interpreter.
- Explore the Cookbooks: Mistral provides practical examples and tutorials (cookbooks) showcasing agentic workflows for various applications:
- GitHub Agent: Automating software development tasks.
- Linear Tickets Assistant: Managing project deliverables.
- Financial Analyst: Sourcing financial metrics and compiling insights.
- Travel Assistant: Planning trips and booking accommodations.
- Food Diet Companion: Tracking nutrition and meal planning.
Frequently Asked Questions (FAQ)
- Which models are supported?
Currently, `mistral-medium-latest` and `mistral-large-latest` are supported, with plans to enable more models soon.
Conclusion: The Future is Agentic
The Mistral Agent API represents a pivotal step towards more capable and autonomous AI systems. By providing a robust framework for built-in tools, external system integration via MCP and function calling, persistent memory, and sophisticated agent orchestration, Mistral AI empowers developers to build AI agents that can actively engage with complex problems and digital environments.
The ability to chain specialized agents, each contributing unique skills, unlocks possibilities for automating intricate workflows, performing deep research, and creating highly interactive and intelligent applications. From coding assistants that manage repositories to financial analysts that synthesize market data, the potential applications are vast and transformative.
As enterprises increasingly seek to integrate AI into core operations, the Mistral Agent API offers a powerful, flexible, and scalable solution. The journey into agentic AI is just beginning, and with tools like the Mistral Agent API, developers are well-equipped to architect the next generation of intelligent systems. The combination of advanced language models with actionable capabilities heralds a new era where AI transitions from a passive information provider to an active collaborator and problem-solver.