Building a Modular Image Processing Server with Gradio and MCP
Author: Wen Liang
Modern AI workflows often need a modular, composable way to expose tools (such as image or text processing) that can be invoked from anywhere, whether through an API, a user interface, or another AI agent. The Model Context Protocol (MCP) provides a unified framework for exposing functions (tools) via an event-driven protocol. Combined with Gradio, it lets you rapidly stand up web UIs and HTTP endpoints for both interactive and programmatic use.
In this post, I’ll walk through building a simple image editing server using Gradio’s MCP server mode and a Python client that programmatically calls each function, showing how this architecture enables easy tool integration and remote automation.
Architecture Overview
- Server: Exposes image processing tools (grayscale, rotate, contrast) via Gradio Blocks, each function available as an MCP tool.
- Client: Connects to the server’s MCP endpoint, uploads an image, and invokes each tool remotely, saving the results.
Diagram: End-to-End MCP Workflow
Implementing the Image Editing Server
The following section demonstrates a minimal server implementation using Gradio Blocks and MCP. This server exposes several image processing tools as callable MCP functions (server.py):
import gradio as gr
from PIL import Image, ImageEnhance


def to_grayscale(img: Image.Image) -> Image.Image:
    """Convert an image to grayscale, returned in RGB mode."""
    return img.convert("L").convert("RGB")


def rotate_image(img: Image.Image, angle: float) -> Image.Image:
    """Rotate an image counter-clockwise by `angle` degrees, expanding the canvas to fit."""
    return img.rotate(angle, expand=True)


def adjust_contrast(img: Image.Image, factor: float) -> Image.Image:
    """Scale image contrast by `factor` (1.0 leaves the image unchanged)."""
    enhancer = ImageEnhance.Contrast(img)
    return enhancer.enhance(factor)


with gr.Blocks() as demo:
    gr.Markdown("# Image Editing MCP Server")

    with gr.Tab("Grayscale"):
        inp_g = gr.Image(type="pil", label="Input Image")
        out_g = gr.Image(type="pil", label="Grayscale Output")
        gr.Button("Convert").click(to_grayscale, inp_g, out_g)

    with gr.Tab("Rotate"):
        inp_r = gr.Image(type="pil", label="Input Image")
        angle = gr.Slider(0, 360, value=90, label="Angle")
        out_r = gr.Image(type="pil", label="Rotated Output")
        gr.Button("Rotate").click(rotate_image, [inp_r, angle], out_r)

    with gr.Tab("Contrast"):
        inp_c = gr.Image(type="pil", label="Input Image")
        factor = gr.Slider(0.1, 3.0, value=1.5, label="Contrast Factor")
        out_c = gr.Image(type="pil", label="Adjusted Output")
        gr.Button("Adjust").click(adjust_contrast, [inp_c, factor], out_c)

if __name__ == "__main__":
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        mcp_server=True,  # <-- Enables MCP endpoint!
        debug=True,
    )
Key point:
Setting mcp_server=True tells Gradio to expose each tool through an MCP endpoint. Now, any MCP-capable client can list, describe, and invoke these functions!
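To make "list" and "invoke" a bit more concrete: MCP messages are JSON-RPC 2.0, and the two requests a client sends for these steps use the methods tools/list and tools/call. The sketch below only builds and prints those payloads for illustration. The helper name jsonrpc_request and the truncated data URI are my own placeholders, and in practice the mcp client library builds these messages for you after the initialization handshake.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request dict (hypothetical helper, for illustration only)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it exposes.
list_req = jsonrpc_request("tools/list")

# Invoke one tool by name; the data URI is a truncated placeholder.
call_req = jsonrpc_request(
    "tools/call",
    {
        "name": "rotate_image",
        "arguments": {"img": "data:image/png;base64,...", "angle": 90},
    },
)

print(json.dumps(list_req))
print(json.dumps(call_req, indent=2))
```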
Programmatic Image Processing with the MCP Client
The client demonstrates how to connect to the server, encode an image as a data URI, and invoke each tool via MCP (client.py):
import asyncio
import base64

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main():
    server_url = "http://localhost:7860/gradio_api/mcp/sse"

    # Encode image as data URI
    with open("input.png", "rb") as f:
        raw = f.read()
    b64 = base64.b64encode(raw).decode("ascii")
    data_uri = f"data:image/png;base64,{b64}"

    async with sse_client(server_url) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            tools = await session.list_tools()
            # ... print available tools ...

            async def call_and_save(tool_name, args, out_path):
                res = await session.call_tool(tool_name, args)
                # ... decode and write result ...

            await call_and_save("to_grayscale", {"img": data_uri}, "output_grayscale.png")
            await call_and_save("rotate_image", {"img": data_uri, "angle": 90}, "output_rotated.png")
            await call_and_save("adjust_contrast", {"img": data_uri, "factor": 1.5}, "output_contrast.png")


if __name__ == "__main__":
    asyncio.run(main())
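The encode and decode steps around the tool call can be pulled into two small helpers. This is a rough sketch under assumptions, not the code from client.py: to_data_uri and save_first_image are names I made up, and save_first_image assumes the result contains an MCP image content item (type "image" with base64 text in data). Whether a given Gradio version returns the image that way, or as a URL in a text item, is worth checking by printing the result first.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path: str) -> str:
    """Encode a local file as a base64 data URI, guessing the MIME type from its extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "application/octet-stream"  # fallback when the extension is unknown
    b64 = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{b64}"


def save_first_image(content_items, out_path: str) -> bool:
    """Write the first image item to out_path; return False if no image item is present.

    Assumes each item has a `type` attribute and, for images, base64 text in `data`.
    """
    for item in content_items:
        if getattr(item, "type", None) == "image":
            Path(out_path).write_bytes(base64.b64decode(item.data))
            return True
    return False
```

Inside call_and_save, this would be used as save_first_image(res.content, out_path). A False return means no image item came back, so the result needs to be handled some other way.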
Highlights:
- Tool listing: The client can discover available tools at runtime, adapting to whatever the server exposes.
- Streaming output: The client connects over Server-Sent Events (SSE), so the server pushes responses as events on a long-lived connection, which is ideal for AI tools that produce data asynchronously.
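For the tool-listing step, one way to print what the server exposes is a small formatter like the one below. It is a sketch: describe_tools is a hypothetical helper, and it assumes each tool entry carries name, description, and a JSON-Schema-style inputSchema, which is how MCP tool definitions are shaped. The stand-in object lets the snippet run without a server.

```python
from types import SimpleNamespace


def describe_tools(tools) -> list:
    """Return one 'name(params): description' line per tool entry."""
    lines = []
    for tool in tools:
        schema = getattr(tool, "inputSchema", None) or {}
        params = ", ".join(schema.get("properties", {}))
        desc = getattr(tool, "description", None) or "(no description)"
        lines.append(f"{tool.name}({params}): {desc}")
    return lines


# Stand-in shaped like an MCP tool entry (assumption, for local illustration only).
fake_tools = [
    SimpleNamespace(
        name="rotate_image",
        description="Rotate an image.",
        inputSchema={
            "type": "object",
            "properties": {"img": {"type": "string"}, "angle": {"type": "number"}},
        },
    )
]

for line in describe_tools(fake_tools):
    print(line)  # prints: rotate_image(img, angle): Rotate an image.
```

In the client, the call would look like describe_tools(tools.tools), since list_tools() returns a result object whose tools attribute holds the entries.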
Why Use MCP with Gradio?
- Composable: Expose any Python function as a remote tool, so it can be used from scripts, web UIs, or even by other AI agents. This makes it easy to share and reuse logic across projects.
- Flexible: Easily add or remove tools just by editing your Gradio Blocks definition. You don’t have to deal with REST API boilerplate or custom routing; MCP handles the plumbing for you.
- Interactive or automated: The same server supports both live web demos and batch automation, letting you prototype quickly and then use those tools in production pipelines or automated scripts without changes.
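As an illustration of the "Flexible" point, here is roughly what adding a fourth tool could look like. The blur tool, its slider range, and the add_blur_tab helper are all hypothetical additions of mine, not part of the original server. The function follows the same pattern as the existing three, and the tab wiring mirrors the existing tabs.

```python
from PIL import Image, ImageFilter


def blur_image(img: Image.Image, radius: float) -> Image.Image:
    """Apply a Gaussian blur with the given radius in pixels."""
    return img.filter(ImageFilter.GaussianBlur(radius))


def add_blur_tab() -> None:
    """Create the Blur tab; call this inside the server's `with gr.Blocks() as demo:` block."""
    import gradio as gr  # imported here so blur_image can be tested without Gradio

    with gr.Tab("Blur"):
        inp_b = gr.Image(type="pil", label="Input Image")
        radius = gr.Slider(0, 10, value=2, label="Radius")
        out_b = gr.Image(type="pil", label="Blurred Output")
        gr.Button("Blur").click(blur_image, [inp_b, radius], out_b)
```

Once the tab is registered and the server restarted, the new function should show up in the client's tool listing alongside the others, with no client-side changes beyond calling it.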
Next Steps
- Add more image tools, or extend to audio/text processing!
- Integrate with LLMs to compose tool use from natural language instructions.
- Deploy the server for collaborative AI workflows.
MCP + Gradio is a lightweight, Pythonic way to serve modular AI tools for both human and machine users. Try it for your next project!