AI Agent Architecture Plan

1. Frameworks and Tools

Python: The primary programming language.
Manim: For creating mathematical animations.
OpenAI API: For generating Manim code from text prompts.
pydantic-ai: For structured AI agent creation, function calling, and workflow management.
Gradio: For creating a user interface to input prompts and display generated videos.
dotenv: For managing environment variables.
Logging: For logging information and errors.

Input Handling: Use Gradio to create a user interface where users can input text prompts.
Agent Structure: Leverage pydantic-ai to define the agent's schema, capabilities, and functions.
System Prompts: Use both static and dynamic system prompts to guide the agent's behavior.
- Static prompts: Define the agent's role and general capabilities
- Dynamic prompts: Adjust behavior based on complexity settings and current context
Keyword Identification: Use pydantic-ai with OpenAI API to identify keywords and generate Manim code.
Scenario Creation: Define structured schemas in pydantic-ai to guide the generation of scenarios.
Function Search: Use pydantic-ai's function calling capabilities to organize and call Manim functions.
Code Generation and Testing: The generated code will be tested by rendering the video using Manim.

User Input: The user inputs a text prompt describing a mathematical or physics concept.
Agent Processing: The pydantic-ai agent processes the input through defined schemas and tools.
- System prompts dynamically adjust based on user requirements
- Tools are applied in sequence using the agent's capabilities
Keyword Identification and Scenario Creation: The agent uses OpenAI API to analyze the prompt and generate a structured scenario.
Code Generation: The agent transforms the structured scenario into Manim code using defined tools.
Video Rendering: The code is executed using Manim to render the video.
Output: The generated video is displayed to the user.

Setup Environment:
- Ensure all required packages are installed (gradio, openai, pydantic-ai, dotenv, manim, etc.).
- Set up environment variables in .env file (e.g., TOGETHER_API_KEY).
- Configure pydantic-ai with appropriate model settings.
Create Agent Structure:
- Define pydantic models for input prompts, scenario descriptions, and animation parameters.
- Create static and dynamic system prompts to guide agent behavior:
  - Static: Define the agent's role and general capabilities
  - Dynamic: Adjust behavior based on request complexity and context
- Create tool functions for scenario extraction, code generation, and rendering.
- Configure the agent with appropriate tools and models.
Create User Interface:
- Use Gradio to create a web interface for inputting prompts and displaying results.
- Add complexity selection controls to customize animation generation.
- Connect the UI to the pydantic-ai agent.
Generate Manim Code:
- Implement functions using pydantic-ai tools to transform user prompts into structured scenarios.
- Convert structured scenarios into Manim code templates.
- Fill templates with specifics from the scenario.
Render Video:
- Implement a function to render the generated Manim code into a video.
- Add error handling and validation using pydantic models.
Display Results:
- Display the generated video and code in the Gradio interface.
- Provide feedback and explanations based on the agent's processing steps.