Building Your Own AI Document Dream Team: A Generic Multi-Agent System
The power of Large Language Models (LLMs) is undeniable, but tackling truly complex document generation tasks often requires a more nuanced approach than relying on a single AI. Enter multi-agent systems, a paradigm that allows us to orchestrate a team of specialized AI agents, each contributing their unique skills to achieve a common goal.
Imagine a scenario where one AI agent acts as a meticulous researcher, another as a creative writer, and yet another as a detail-oriented editor, all working together to produce a high-quality document. This is the promise of multi-agent systems. In this blog post, we'll explore how to build a generic multi-agent document generation platform using the CrewAI multi-agent framework, a Streamlit-based UI, and Hugging Face and Ollama models (all open source) to generate sophisticated text documents.
To take full advantage of the multi-agent design, each agent can use a completely different model, chosen to suit the role and task assigned to that agent.
We will run this experiment on a 4th Gen Intel® Xeon® processor running CentOS Stream release 9.
Why a Generic Multi-Agent Approach?
Unlike a fixed pipeline, a generic system lets you define the roles, goals, and tasks of your AI agents on the fly, tailoring the crew to the specific document generation challenge at hand. Need a legal brief drafted? You can create a Legal Researcher and a Legal Writer. Aiming for creative marketing copy? Assemble a Brainstormer and a Copywriter. Need a document on a topic such as the USA trade deficit and tariffs? Create your own team of researchers. The possibilities are limited only by your imagination and the capabilities of the underlying LLMs.
Setting the Stage: Essential Libraries
Our journey begins with the creation of a virtual environment and installation of the necessary Python libraries:
# Create a virtual environment
conda create -n multiagent_env python=3.11
conda activate multiagent_env
# Install the required libraries
pip install streamlit crewai crewai-tools langchain_community python-dotenv
# Install Ollama and pull/download some of the models of your interest
curl -fsSL https://ollama.com/install.sh | sh
ollama serve &
ollama pull deepseek-r1
ollama pull qwen2.5:7b
ollama pull llama3.2
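Before launching the app, it can help to confirm that the Ollama server actually has the models you plan to select. The following is a minimal sketch, not part of the original application: it assumes the server listens on http://localhost:11434 and that its /api/tags endpoint returns a JSON object with a "models" list. The helper names (parse_model_names, list_local_models, missing_models) are hypothetical.

```python
import json
import urllib.request


def parse_model_names(payload: str) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    data = json.loads(payload)
    return [entry["name"] for entry in data.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models it has pulled.

    Assumption: the server is up and reachable at base_url.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return parse_model_names(response.read().decode("utf-8"))


def with_default_tag(name: str) -> str:
    """Ollama treats a name without a tag as ':latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(wanted: list[str], available: list[str]) -> list[str]:
    """Return the wanted models that do not appear in the available list."""
    available_set = {with_default_tag(name) for name in available}
    return [name for name in wanted if with_default_tag(name) not in available_set]
```

With the server running, `missing_models(["deepseek-r1", "qwen2.5:7b", "llama3.2"], list_local_models())` should return an empty list once the three pulls above have finished.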
That's all the setup required to run the following code. This Python code creates a user-friendly Streamlit web application that allows for the dynamic definition and execution of a CrewAI multi-agent system:
# multi-agent-app.py
# Create a '.env' file in the same folder as this python file and store your API tokens and keys in this file
import streamlit as st
import os
from crewai import Agent, Task, Crew, LLM
from dotenv import load_dotenv
from langchain_community.llms import HuggingFaceHub
load_dotenv(override=True)
HUGGINGFACEHUB_API_TOKEN = os.getenv('HUGGINGFACEHUB_API_TOKEN')
# Initialize session state variables
if 'agents' not in st.session_state:
    st.session_state.agents = []
if 'topic' not in st.session_state:
    st.session_state.topic = ""
if 'tasks' not in st.session_state:
    st.session_state.tasks = []
if 'process_flow' not in st.session_state:
    st.session_state.process_flow = []
if 'content_generated' not in st.session_state:
    st.session_state.content_generated = False
if 'generated_content' not in st.session_state:
    st.session_state.generated_content = ""
st.title("Generic Multi-Agent Document Generation")
# Topic Input
st.session_state.topic = st.text_input("Enter the 'topic':")
# Add Agent Button
if st.button("Add Agent"):
    st.session_state.agents.append({})
# Agent Input Forms
for i, agent in enumerate(st.session_state.agents):
    st.subheader(f"Agent {i + 1}")
    agent['role'] = st.text_input(f"Role for Agent {i + 1}", key=f"role_{i}")
    agent['goal'] = st.text_area(f"Goal for Agent {i + 1}", key=f"goal_{i}")
    agent['backstory'] = st.text_area(f"Backstory for Agent {i + 1}", key=f"backstory_{i}")
    agent['task_description'] = st.text_area(f"Task Description for Agent {i + 1}", key=f"task_desc_{i}")
    agent['task_expected_output'] = st.text_input(f"Task Expected Output for Agent {i + 1}", key=f"task_exp_{i}")
    agent['llm_model'] = st.selectbox(
        f"LLM Model for Agent {i + 1}",
        [
            "ollama/llama3.2",
            "ollama/llama3.2:1b",
            "ollama/gemma:2b",
            "ollama/mistral",
            "ollama/deepseek-r1:1.5b",
            "ollama/deepseek-r1:latest",
            "ollama/qwen:0.5b",
            "ollama/qwen:4b",
            "ollama/qwen2.5:7b",  # Pulled in the setup step above
            "huggingface/Mistral-7B-Instruct-v0.3",
        ],  # Add more models here; each Ollama model must be pulled before use
        key=f"llm_{i}"
    )
    if st.button(f"Remove Agent {i + 1}", key=f"remove_agent_{i}"):
        del st.session_state.agents[i]
        # Restart the script right away so the loop never continues over a modified list
        # (st.rerun() requires a recent Streamlit release)
        st.rerun()
# Process Flow Definition
if st.session_state.agents:
    st.subheader("Define Process Flow")
    st.session_state.process_flow = st.multiselect(
        "Select Agent Order",
        [agent['role'] for agent in st.session_state.agents],
        default=[agent['role'] for agent in st.session_state.agents]
    )
# Crew Kickoff Button
if st.button("Crew Kickoff") and st.session_state.agents and st.session_state.process_flow:
    # Create Agents and Tasks
    crew_agents = []
    crew_tasks = []
    for agent_data in st.session_state.agents:
        llm_model = None
        if agent_data['llm_model'].startswith("ollama/"):
            # Local model served by Ollama
            llm_model = LLM(model=agent_data['llm_model'], base_url="http://localhost:11434")
        elif agent_data['llm_model'].startswith("huggingface/"):
            # Hosted model reached through the Hugging Face Inference API
            llm_model = LLM(
                model="huggingface/Mistral-7B-Instruct-v0.3",
                api_base="https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3",
                temperature=0.7,
                api_key=HUGGINGFACEHUB_API_TOKEN
            )
        else:
            st.error("Please check the model and environment variables.")
            st.stop()
        if llm_model:
            # The {topic} placeholder in the user's text is filled with the topic
            agent = Agent(
                role=agent_data['role'],
                goal=agent_data['goal'].format(topic=st.session_state.topic),
                backstory=agent_data['backstory'].format(topic=st.session_state.topic),
                llm=llm_model,
                allow_delegation=False,
                verbose=True,
            )
            crew_agents.append(agent)
            task = Task(
                description=agent_data['task_description'].format(topic=st.session_state.topic),
                expected_output=agent_data['task_expected_output'],
                agent=agent,
            )
            crew_tasks.append(task)
    # Organize tasks based on process flow
    ordered_tasks = []
    for role in st.session_state.process_flow:
        for task in crew_tasks:
            if task.agent.role == role:
                ordered_tasks.append(task)
                break
    # Create and Kickoff Crew
    crew = Crew(
        agents=crew_agents,
        tasks=ordered_tasks,
        verbose=True,
    )
    result = crew.kickoff()
    # Store the result as plain text so st.write() renders it as a document
    st.session_state.generated_content = str(result)
    st.session_state.content_generated = True
# Display Generated Content
if st.session_state.content_generated:
    st.subheader("Generated Content:")
    st.write(st.session_state.generated_content)
Code Explanation: A Deep Dive
Import necessary libraries:
- streamlit for creating the web UI.
- crewai for the core multi-agent framework (Agent, Task, Crew, LLM).
- dotenv for loading environment variables from a .env file.
- langchain_community.llms.HuggingFaceHub for interacting with models on the Hugging Face Hub. It is imported here, although the listing above builds its Hugging Face model through the crewai LLM class instead.
The load_dotenv(override=True) call loads environment variables from a .env file, which is useful for storing API keys and other sensitive information.
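Since os.getenv() silently returns None when a variable is missing, a missing token only surfaces later as an authentication failure. One way to fail early is sketched below; require_env is a hypothetical helper that is not part of the original code, and the env parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or raise a clear error.

    env defaults to os.environ; pass a plain dict to test the function.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Add it to the .env file next to the application."
        )
    return value
```

In the application this could replace the bare os.getenv('HUGGINGFACEHUB_API_TOKEN') call, ideally only on the code path that actually selects a Hugging Face model, so that Ollama-only crews still run without a token.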
Initialize Session State:
Streamlit's st.session_state is used to store variables that persist across user interactions, such as the list of agents, the main topic, defined tasks, the process flow, and the generated content. This allows the application to remember the user's configurations.
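The repeated "if key not in st.session_state" checks can also be written as one loop over a table of defaults. The sketch below uses a plain dict as a stand-in for st.session_state, which supports the same "in" and item-assignment operations; SESSION_DEFAULTS and init_state are hypothetical names introduced for illustration.

```python
import copy

# Default value for every key the application keeps between reruns
SESSION_DEFAULTS = {
    "agents": [],
    "topic": "",
    "tasks": [],
    "process_flow": [],
    "content_generated": False,
    "generated_content": "",
}


def init_state(state, defaults=SESSION_DEFAULTS):
    """Fill in any missing keys without overwriting values already stored."""
    for key, default in defaults.items():
        if key not in state:
            # Copy mutable defaults so sessions never share the same list
            state[key] = copy.deepcopy(default)
    return state
```

Calling init_state(st.session_state) at the top of the script would have the same effect as the six checks in the listing, and adding a new persistent variable becomes a one-line change to the table.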
Set up Streamlit UI:
st.title() sets the title of the web application, and st.text_input() creates a text input field where the user enters the central topic.
Dynamic Agent Configuration:
The "Add Agent" button appends an empty dictionary representing a new agent to the st.session_state.agents list. A loop iterates through the st.session_state.agents list, rendering a subheader (st.subheader) and a set of input widgets for each agent. For each agent, the user can define:
- Role: The agent's function (e.g., "Researcher," "Writer").
- Goal: The objective the agent aims to achieve. The {topic} placeholder allows for dynamic context.
- Backstory: A narrative to give the agent a persona.
- Task Description: The specific action the agent will perform, often referencing the {topic}.
- Expected Output: A description of the desired result of the agent's task.
- LLM Model: A dropdown (st.selectbox) lets the user choose the LLM that powers this agent. It lists several locally running Ollama models and one Hugging Face model.
A "Remove Agent" button removes the corresponding agent from the st.session_state.agents list; clicking it re-runs the Streamlit script so the UI reflects the change.
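The {topic} placeholder mentioned above is filled with str.format() in the listing, which raises an exception if the user's text contains any other braces. A more forgiving option, sketched here under the assumption that {topic} is the only placeholder the app needs to support, is a plain string replacement. fill_topic is a hypothetical helper, not part of the original code.

```python
def fill_topic(template: str, topic: str) -> str:
    """Replace every literal '{topic}' with the topic and leave other braces alone."""
    return template.replace("{topic}", topic)


# Text a user might plausibly type into the Goal or Task Description field
goal = "Write about {topic} for {audience}"

# str.format() fails because no value is supplied for {audience}
try:
    goal.format(topic="tariffs")
    format_failed = False
except KeyError:
    format_failed = True

# The replacement-based helper fills {topic} and keeps {audience} as typed
filled = fill_topic(goal, "tariffs")
```

The trade-off is that fill_topic offers none of the formatting features of str.format(), which seems acceptable for free-text fields typed into a web form.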
Workflow Orchestration:
This section defines the order in which the agents execute their tasks using an st.multiselect widget, whose options are populated from the roles of the defined agents. This example uses a sequential flow; in CrewAI, the flow can also be changed to a hierarchical one.
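Because the ordering widget identifies agents by role, two agents with the same role (or an agent whose role is still empty) become indistinguishable in the selection. One possible remedy, shown here as a sketch rather than as something the original app does, is to derive a unique display label per agent. unique_role_labels is a hypothetical helper.

```python
def unique_role_labels(roles):
    """Return one distinct label per agent, in the same order as the input.

    Empty roles fall back to 'Agent N'; repeated roles get a numeric suffix.
    """
    seen = {}
    labels = []
    for index, role in enumerate(roles, start=1):
        base = role.strip() or f"Agent {index}"
        seen[base] = seen.get(base, 0) + 1
        if seen[base] == 1:
            labels.append(base)
        else:
            labels.append(f"{base} ({seen[base]})")
    return labels
```

The labels could then be used as the multiselect options, with a dictionary mapping each label back to its agent. The sketch does not guard against a user typing a role that already looks like a suffixed label, such as "Researcher (2)".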
Crew Execution:
The "Crew Kickoff" button triggers the creation and execution of the CrewAI multi-agent system. The handler initializes the empty lists crew_agents and crew_tasks, then iterates through st.session_state.agents. For each agent, it determines the appropriate LLM object based on the selected llm_model:
- If the model starts with "ollama/", it creates a crewai.LLM instance pointing to the local Ollama server.
- If the model starts with "huggingface/", it creates a crewai.LLM instance pointing to the Hugging Face Inference API, using the HUGGINGFACEHUB_API_TOKEN environment variable for authentication.
- If an unsupported model is selected, it displays an error and stops the application.
If an llm_model is successfully created, the handler instantiates a crewai.Agent with the defined role, goal (formatted with the topic), backstory (formatted with the topic), and the chosen llm_model. It then creates a crewai.Task with the defined description (formatted with the topic) and expected output, and assigns it to the created Agent.
The created Agent and Task are appended to their respective lists.
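The prefix-based selection described above can be isolated into a small function that returns only the keyword arguments for the LLM constructor, which makes the routing easy to test without starting any model. This is a sketch under stated assumptions: llm_kwargs, OLLAMA_BASE_URL, and HF_ENDPOINTS are hypothetical names, while the URLs and the temperature value are the ones used in the listing.

```python
OLLAMA_BASE_URL = "http://localhost:11434"

# Endpoint for each supported Hugging Face model (only one in this post)
HF_ENDPOINTS = {
    "huggingface/Mistral-7B-Instruct-v0.3": (
        "https://api-inference.huggingface.co/models/"
        "mistralai/Mistral-7B-Instruct-v0.3"
    ),
}


def llm_kwargs(model_id, hf_token=None):
    """Map a model identifier from the dropdown to LLM constructor arguments."""
    if model_id.startswith("ollama/"):
        return {"model": model_id, "base_url": OLLAMA_BASE_URL}
    if model_id.startswith("huggingface/"):
        if model_id not in HF_ENDPOINTS:
            raise ValueError(f"No endpoint configured for {model_id}")
        if not hf_token:
            raise ValueError("HUGGINGFACEHUB_API_TOKEN is missing")
        return {
            "model": model_id,
            "api_base": HF_ENDPOINTS[model_id],
            "temperature": 0.7,
            "api_key": hf_token,
        }
    raise ValueError(f"Unsupported model: {model_id}")
```

In the app, the result would be passed on as LLM(**llm_kwargs(agent_data['llm_model'], HUGGINGFACEHUB_API_TOKEN)), with the ValueError caught and reported through st.error().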
The ordered_tasks list is then built from the user-defined process_flow, so the tasks run in the specified order. A crewai.Crew object is instantiated with the list of agents and the ordered tasks. crew.kickoff() starts the multi-agent workflow, and the generated result is stored in the session state.
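The ordering step can be illustrated in isolation. The sketch below reproduces the behaviour of the nested loop in the listing (for each role in the flow, take the first task whose agent has that role) using a dictionary lookup. TaskStub is a hypothetical stand-in for crewai.Task so the example runs without the framework.

```python
from dataclasses import dataclass


@dataclass
class TaskStub:
    """Minimal stand-in for a task: the role of its agent and a description."""
    role: str
    description: str


def order_tasks(tasks, process_flow):
    """Return tasks in the order given by process_flow (a list of roles).

    If several tasks share a role, the first one wins; roles without a
    matching task are skipped.
    """
    first_by_role = {}
    for task in tasks:
        first_by_role.setdefault(task.role, task)
    return [first_by_role[role] for role in process_flow if role in first_by_role]


tasks = [
    TaskStub("Researcher", "Collect facts"),
    TaskStub("Writer", "Draft the document"),
    TaskStub("Editor", "Polish the draft"),
]
ordered = order_tasks(tasks, ["Writer", "Editor", "Researcher"])
```

Note that any agent left out of the process flow still belongs to the crew but contributes no task, which matches what the original loop does.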
Output Display:
If st.session_state.content_generated is True (meaning the crew has finished execution), a "Generated Content:" subheader is displayed, and the output of the crew.kickoff() method is written to the Streamlit app using st.write().
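Depending on the CrewAI version, crew.kickoff() may return a plain string or a result object. The helper below is a defensive sketch that works with either; it assumes (without this post confirming it) that a result object exposes the final text through a raw attribute, and falls back to str() otherwise. result_text and FakeCrewOutput are hypothetical names used only for illustration.

```python
def result_text(result):
    """Return the generated document as a plain string.

    Assumption: result objects carry the final text in a 'raw' attribute.
    Anything else is converted with str().
    """
    raw = getattr(result, "raw", None)
    if isinstance(raw, str):
        return raw
    return str(result)


class FakeCrewOutput:
    """Stand-in for a framework result object, used only in this example."""

    def __init__(self, raw):
        self.raw = raw


from_object = result_text(FakeCrewOutput("# Report\nBody text"))
from_string = result_text("Already a string")
```

Storing result_text(result) in the session state keeps the display code independent of whichever return type the installed version produces.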
Conclusion: Unleash Your AI Dream Team for Document Generation
This generic multi-agent system provides a powerful and flexible foundation for tackling a wide range of document generation tasks. By defining the right agents with the right skills (powered by your choice of LLMs) and orchestrating their workflow effectively, you can achieve sophisticated and high-quality results.
Start experimenting with different agent configurations and workflows, and discover the incredible potential of collaborative AI for document generation!