Spaces: JulsdL/AI-Notebook-Tutor
JulsdL committed on May 16, 2024

Commit c21a510 (1 parent: 85ee7cf)

Code documentation and dockerizing AI Notebook Tutor

- Expand the welcome guide in chainlit.md to introduce new features such as uploading notebooks, asking questions, generating quizzes, and creating flashcards.
- Enhance code with detailed docstrings for agent functions, improving code documentation and readability.
- Update notebook_tutor/chainlit_frontend.py to log successful chat initiation and provide users with guidance on interacting with the system.
- Add a Dockerfile for containerization, enabling easy deployment and scaling of the AI Notebook Tutor application.
- Update README.md with project metadata and acknowledgements, offering users and contributors a comprehensive overview of the project and its dependencies.
- Modify requirements.txt: remove crewai and add langchain-openai, so the pinned dependencies match the project's use of LangChain and OpenAI.

Files changed (11)

Dockerfile +29 -0
README.md +13 -0
chainlit.md +30 -8
flashcards_cca7854c-91c2-47d5-872f-46132739ace0.csv +0 -11
notebook_tutor/agents.py +39 -2
notebook_tutor/chainlit_frontend.py +19 -2
notebook_tutor/graph.py +11 -0
notebook_tutor/prompt_templates.py +5 -1
notebook_tutor/states.py +11 -1
notebook_tutor/tools.py +29 -0
requirements.txt +1 -1

Dockerfile ADDED

	@@ -0,0 +1,29 @@

+# Use the official Python image from the Docker Hub
+FROM python:3.11
+# Create a new user with a specific UID and set it as the default user
+RUN useradd -m -u 1000 user
+# Switch to the new user
+USER user
+# Set environment variables
+ENV HOME=/home/user \
+  PATH=/home/user/.local/bin:$PATH
+# Set the working directory
+WORKDIR $HOME/app
+# Copy the requirements file and install the dependencies
+COPY --chown=user ./requirements.txt $HOME/app/requirements.txt
+RUN pip install --upgrade pip && \
+  pip install -r requirements.txt
+# Copy project files to the working directory
+COPY --chown=user . $HOME/app
+# Set PYTHONPATH to include the project directory
+ENV PYTHONPATH=$HOME/app
+# Set the command to run the application
+CMD ["chainlit", "run", "notebook_tutor/app.py", "--port", "7860"]

README.md CHANGED

@@ -1,3 +1,12 @@
 # AI-Notebook-Tutor
 # RAG Application for QA in Jupyter Notebook
@@ -34,3 +43,7 @@ chainlit run notebook_tutor/app.py
 ## Usage
 Start a chat session and upload a Jupyter notebook file. The application will process the document and you can then ask questions related to the content of the notebook. It might take some time to answer some question (should be less than 1 min), so please be patient.

+---
+title: DeepPDF AI
+emoji: 📖
+colorFrom: pink
+colorTo: yellow
+sdk: docker
+pinned: false
+---
 # AI-Notebook-Tutor
 # RAG Application for QA in Jupyter Notebook
 ## Usage
 Start a chat session and upload a Jupyter notebook file. The application will process the document and you can then ask questions related to the content of the notebook. It might take some time to answer some question (should be less than 1 min), so please be patient.
+## Acknowledgements
+This project uses technologies including LangChain, OpenAI's GPT models, Qdrant for vector storage and ChainLit. Thanks to all open-source contributors and organizations that make these tools available.

chainlit.md CHANGED

@@ -1,14 +1,36 @@
-# Welcome to Chainlit! 🚀🤖
-Hi there, Developer! 👋 We're excited to have you on board. Chainlit is a powerful tool designed to help you prototype, debug and share applications built on top of LLMs.
-## Useful Links 🔗
-- **Documentation:** Get started with our comprehensive [Chainlit Documentation](https://docs.chainlit.io) 📚
-- **Discord Community:** Join our friendly [Chainlit Discord](https://discord.gg/k73SQ3FyUh) to ask questions, share your projects, and connect with other developers! 💬
-We can't wait to see what you create with Chainlit! Happy coding! 💻😊
-## Welcome screen
-To modify the welcome screen, edit the `chainlit.md` file at the root of your project. If you do not want a welcome screen, just leave this file empty.

+# Welcome to AI Notebook Tutor! 🎓📘
+Hello and welcome to the AI Notebook Tutor, your interactive learning assistant designed to enhance your understanding of Jupyter notebooks. Whether you're a student, educator, or professional, our tool will help you navigate and master complex technical concepts more efficiently.
+## Getting Started
+To begin using AI Notebook Tutor:
+- **Upload your Jupyter notebook** (.ipynb, max. 5mb) to start your interactive learning journey.
+- **Ask specific questions** about the notebook content to get detailed explanations.
+- **Generate custom quizzes** to test your understanding of the material.
+- **Create flashcards** for quick revisions and efficient memorization of key concepts.
+### Sample Questions to Try:
+- "Explain the `generate_query` function to me in detail."
+- "Generate a quiz about creating a vector store retriever."
+- "Create 10 flashcards for the key concepts in the notebook."
+## How It Works 🧠
+1. **Upload and Process**: Start by uploading your Jupyter notebook. Our system will process the document and prepare it for interactive learning.
+2. **Interactive Q&A**: Ask specific questions about your notebook content. The AI will analyze the content and provide detailed, easy-to-understand explanations.
+3. **Quiz Generation**: Automatically generate quizzes based on the key concepts in your notebook. This helps reinforce your understanding and retention of the material.
+4. **Flashcard Creation**: Generate flashcards from key segments of your notebook for quick revision and repetitive learning, helping boost long-term retention.
+## The Tech Behind It 💡🤖
+- **Document Management**: Our system uses advanced document processing to load and split your notebook content into manageable chunks.
+- **Language Model**: Powered by GPT-4, the AI provides accurate and relevant explanations, quizzes, and flashcards based on your notebook content.
+- **Retrieval System**: Utilizing state-of-the-art retrieval techniques, our system ensures that the information provided is both precise and relevant.
+- **Interactive Agents**: Different specialized agents work together to answer your questions, generate quizzes, and create flashcards, making your learning experience comprehensive and engaging.
+## Ready to Learn?
+With AI Notebook Tutor, transform your Jupyter notebooks into an interactive, engaging learning experience. Dive deep into your study materials, test your knowledge, and retain information more effectively than ever before. Happy learning!
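Note on the "Document Management" bullet above: the welcome guide says the notebook is loaded and split into manageable chunks, but this commit does not show the loader. The following is only a minimal sketch of that idea using the standard library. The function names `load_notebook_cells` and `split_into_chunks`, the chunk size, and the overlap are assumptions made for illustration, not the project's actual implementation.

```python
import json


def load_notebook_cells(notebook_json: str) -> list[str]:
    """Return the non-empty source text of every cell in an .ipynb document."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores source as a string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            # Prefix each cell with its type so code and prose stay distinguishable.
            cells.append(f"[{cell.get('cell_type', 'unknown')}]\n{source}")
    return cells


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

A real pipeline would likely split on cell or sentence boundaries rather than raw character offsets; the fixed window is used here only because it is easy to reason about.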

flashcards_cca7854c-91c2-47d5-872f-46132739ace0.csv DELETED

@@ -1,11 +0,0 @@
-Front,Back
-What command is used to clone a GitHub repository in a notebook?,!git clone https://github.com/arcee-ai/DALM
-How do you install or upgrade a Python package in a notebook?,!pip install --upgrade -q -e .
-Which command installs the 'langchain' and 'langchain-community' libraries?,!pip install -qU langchain langchain-core langchain-community sentence_transformers
-What is the command to install 'pymupdf' and 'faiss-cpu'?,!pip install -qU pymupdf faiss-cpu
-How do you import the Pandas library in Python?,import pandas as pd
-Which library provides the 'HuggingFaceEmbeddings' class?,from langchain_community.embeddings import HuggingFaceEmbeddings
-How do you import the 'FAISS' vector store from the 'langchain_community' library?,from langchain_community.vectorstores import FAISS
-What is the import statement for reading directories using the 'Llama Index' library?,from llama_index.core import SimpleDirectoryReader
-Which import statement is used for parsing nodes in the 'Llama Index' library?,from llama_index.core.node_parser import SimpleNodeParser
-How do you import the 'MetadataMode' schema from the 'Llama Index' library?,from llama_index.core.schema import MetadataMode

notebook_tutor/agents.py CHANGED

@@ -25,7 +25,18 @@ def create_agent(
     tools: list,
     system_prompt: str,
 ) -> AgentExecutor:
-    """Create a function-calling agent and add it to the graph."""
     system_prompt += "\nWork autonomously according to your specialty, using the tools available to you."
     " Do not ask for clarification."
     " Your other team members (and other teams) will collaborate with you with their own specialties."
@@ -46,6 +57,21 @@ def create_agent(
 # Function to create agent nodes
 def agent_node(state, agent, name):
     result = agent.invoke(state)
     if 'messages' not in result:
         raise ValueError(f"No messages found in agent state: {result}")
@@ -65,7 +91,18 @@ def agent_node(state, agent, name):
 # Function to create the supervisor
 def create_team_supervisor(llm: ChatOpenAI, system_prompt, members) -> AgentExecutor:
-    """An LLM-based router."""
     options = ["WAIT", "FINISH"] + members
     function_def = {
         "name": "route",

     tools: list,
     system_prompt: str,
 ) -> AgentExecutor:
+    """
+    Create a function-calling agent and add it to the graph.
+    Parameters:
+        llm (ChatOpenAI): The ChatOpenAI instance used for the agent.
+        tools (list): A list of tools available to the agent.
+        system_prompt (str): The system prompt for the agent.
+    Returns:
+        AgentExecutor: The AgentExecutor instance containing the agent.
+    """
     system_prompt += "\nWork autonomously according to your specialty, using the tools available to you."
     " Do not ask for clarification."
     " Your other team members (and other teams) will collaborate with you with their own specialties."
 # Function to create agent nodes
 def agent_node(state, agent, name):
+    """
+    Invoke an agent and update the state based on the agent's output.
+    Parameters:
+        state (dict): The current state of the conversation.
+        agent (AgentExecutor): The agent to be invoked.
+        name (str): The name of the agent.
+    Returns:
+        dict: The updated state after invoking the agent.
+    Raises:
+        ValueError: If no messages are found in the agent state.
+    """
     result = agent.invoke(state)
     if 'messages' not in result:
         raise ValueError(f"No messages found in agent state: {result}")
 # Function to create the supervisor
 def create_team_supervisor(llm: ChatOpenAI, system_prompt, members) -> AgentExecutor:
+    """
+    An LLM-based router.
+    Parameters:
+        llm (ChatOpenAI): The ChatOpenAI instance used for the supervisor.
+        system_prompt (str): The system prompt for the supervisor.
+        members (list): A list of team members.
+    Returns:
+        AgentExecutor: The AgentExecutor instance containing the supervisor.
+    """
     options = ["WAIT", "FINISH"] + members
     function_def = {
         "name": "route",
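Note on the hunks above: the diff shows only the opening lines of `agent_node` and of the supervisor's `function_def`. The sketch below illustrates how the two pieces might fit together, with no LangChain dependency. Everything beyond what the diff shows is an assumption: the `description` and `parameters` fields of the schema, the shape of the returned state, the helper name `build_route_function`, and the stand-in class `EchoAgent`.

```python
from typing import Any


def build_route_function(members: list[str]) -> dict:
    """Build a function-calling schema that restricts routing to known options."""
    options = ["WAIT", "FINISH"] + members
    return {
        "name": "route",
        # Assumption: the fields below follow the usual JSON-schema layout
        # for function calling; the commit does not show them.
        "description": "Select the next role.",
        "parameters": {
            "type": "object",
            "properties": {"next": {"type": "string", "enum": options}},
            "required": ["next"],
        },
    }


def agent_node(state: dict, agent: Any, name: str) -> dict:
    """Invoke an agent and return the messages it produced."""
    result = agent.invoke(state)
    if "messages" not in result:
        raise ValueError(f"No messages found in agent state: {result}")
    # Assumption: only the messages are passed on; `name` is unused here,
    # although the real node may attach it to the output.
    return {"messages": result["messages"]}


class EchoAgent:
    """Hypothetical stand-in for an AgentExecutor; it only mimics invoke()."""

    def invoke(self, state: dict) -> dict:
        return {"messages": state["messages"] + ["echo"]}
```

Constraining `next` to an enum means the supervisor model can only name an existing agent or one of the two control values, which keeps routing errors out of the graph.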

notebook_tutor/chainlit_frontend.py CHANGED

@@ -53,13 +53,24 @@ async def start_chat():
         tutor_chain = create_tutor_chain(retrieval_chain)
         cl.user_session.set("tutor_chain", tutor_chain)
-        ready_to_chat_message = "Notebook uploaded and processed successfully. You are now ready to chat!"
         await cl.Message(content=ready_to_chat_message).send()
-        logger.info("Chat started and notebook uploaded successfully.")
 @cl.on_message
 async def main(message: cl.Message):
     # Retrieve the LangGraph chain from the session
     tutor_chain = cl.user_session.get("tutor_chain")
@@ -134,6 +145,12 @@ async def main(message: cl.Message):
 @cl.on_chat_end
 async def end_chat():
     # Clean up the flashcards directory
     flashcard_directory = 'flashcards'
     if os.path.exists(flashcard_directory):

         tutor_chain = create_tutor_chain(retrieval_chain)
         cl.user_session.set("tutor_chain", tutor_chain)
+        logger.info("Chat started and notebook uploaded successfully.")
+        ready_to_chat_message = "Notebook uploaded and processed successfully!"
         await cl.Message(content=ready_to_chat_message).send()
+        invite_message = "You can now ask questions or request quizzes and flashcards based on the notebook content."
+        await cl.Message(content=invite_message).send()
 @cl.on_message
 async def main(message: cl.Message):
+    """
+    This is the main function that processes a message through the LangGraph chain.
+    Parameters:
+    - message (cl.Message): The message to be processed.
+    """
     # Retrieve the LangGraph chain from the session
     tutor_chain = cl.user_session.get("tutor_chain")
 @cl.on_chat_end
 async def end_chat():
+    """
+    Clean up the flashcards directory after the chat ends.
+    This function is executed when the chat session ends.
+    It removes the 'flashcards' directory and all its contents, if it exists.
+    If the directory does not exist, it creates a new empty directory with the same name.
+    """
     # Clean up the flashcards directory
     flashcard_directory = 'flashcards'
     if os.path.exists(flashcard_directory):
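Note on `end_chat`: the new docstring describes removing the `flashcards` directory and leaving an empty one behind, but the body is cut off in this hunk. A minimal sketch of that cleanup is given below. The helper name `reset_directory` is hypothetical, and recreating the directory after removal is an assumption based on the docstring's wording.

```python
import os
import shutil


def reset_directory(path: str) -> None:
    """Remove `path` and all its contents if present, then recreate it empty."""
    if os.path.exists(path):
        shutil.rmtree(path)
    # Assumption: the directory should always exist and be empty afterwards.
    os.makedirs(path)
```

Calling `reset_directory("flashcards")` from the chat-end handler would give each new session an empty export directory.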

notebook_tutor/graph.py CHANGED

@@ -10,6 +10,17 @@ load_dotenv()
 # Create the LangGraph chain
 def create_tutor_chain(retrieval_chain):
     retrieve_information_tool = get_retrieve_information_tool(retrieval_chain)
     # Create QA Agent

 # Create the LangGraph chain
 def create_tutor_chain(retrieval_chain):
+    """
+    Create a tutor chain for the notebook tutor system.
+    This function creates a tutor chain for the notebook tutor system. The tutor chain consists of multiple agents, including a QA Agent, Quiz Agent, Flashcards Agent, and Supervisor Agent. Each agent is created with specific tools and prompts.
+    Parameters:
+        retrieval_chain (object): The retrieval chain used for information retrieval.
+    Returns:
+        StateGraph: The compiled tutor graph representing the tutor chain.
+    """
     retrieve_information_tool = get_retrieve_information_tool(retrieval_chain)
     # Create QA Agent

notebook_tutor/prompt_templates.py CHANGED

@@ -10,6 +10,10 @@ class PromptTemplates:
     Methods:
         __init__(): Initializes all prompt templates as instance variables.
         get_rag_qa_prompt(): Returns the RAG QA prompt.
     Example usage:
         prompt_templates = PromptTemplates()
@@ -41,7 +45,7 @@ class PromptTemplates:
         2. Search Notebook Content: Use the notebook content to gather relevant information and generate accurate and informative flashcards.
         3. Generate Flashcards: Create a series of flashcards content with clear questions on the front and detailed answers on the back. Ensure that the flashcards cover the essential points and concepts requested by the user.
         4. Export Flashcards: YOU MUST USE the flashcard_tool to create and export the flashcards in a format that can be easily imported into a flashcard management system, such as Anki.
-        5. Provide the list of flashcards in a clear and organized manner.
         Remember, your goal is to help the user learn efficiently and effectively by breaking down the notebook content into manageable, repeatable flashcards."""
         self.SupervisorAgent_prompt = "You are a supervisor tasked with managing a conversation between the following agents: QAAgent, QuizAgent, FlashcardsAgent. Given the user request, decide which agent should act next."

     Methods:
         __init__(): Initializes all prompt templates as instance variables.
         get_rag_qa_prompt(): Returns the RAG QA prompt.
+        get_qa_agent_prompt(): Returns the QA Agent prompt.
+        get_quiz_agent_prompt(): Returns the Quiz Agent prompt.
+        get_flashcards_agent_prompt(): Returns the Flashcards Agent prompt.
+        get_supervisor_agent_prompt(): Returns the Supervisor Agent prompt.
     Example usage:
         prompt_templates = PromptTemplates()
         2. Search Notebook Content: Use the notebook content to gather relevant information and generate accurate and informative flashcards.
         3. Generate Flashcards: Create a series of flashcards content with clear questions on the front and detailed answers on the back. Ensure that the flashcards cover the essential points and concepts requested by the user.
         4. Export Flashcards: YOU MUST USE the flashcard_tool to create and export the flashcards in a format that can be easily imported into a flashcard management system, such as Anki.
+        5. Provide the list of flashcards in a clear and organized manner. DO NOT SHARE THE LINK TO THE FLASHCARD FILE.
         Remember, your goal is to help the user learn efficiently and effectively by breaking down the notebook content into manageable, repeatable flashcards."""
         self.SupervisorAgent_prompt = "You are a supervisor tasked with managing a conversation between the following agents: QAAgent, QuizAgent, FlashcardsAgent. Given the user request, decide which agent should act next."

notebook_tutor/states.py CHANGED

@@ -3,10 +3,20 @@ from langchain_core.messages import BaseMessage
 # Define the state for the system
 class TutorState(TypedDict):
     messages: List[BaseMessage]
     next: str
     quiz: List[dict]
     quiz_created: bool
     question_answered: bool
     flashcards_created: bool
-    # flashcard_path: str

 # Define the state for the system
 class TutorState(TypedDict):
+    """
+    A class representing the state of the tutor system.
+    Attributes:
+        messages (List[BaseMessage]): A list of messages in the system.
+        next (str): The next step in the tutor system.
+        quiz (List[dict]): A list of quiz questions and answers.
+        quiz_created (bool): Indicates if a quiz has been created.
+        question_answered (bool): Indicates if a question has been answered.
+        flashcards_created (bool): Indicates if flashcards have been created.
+    """
     messages: List[BaseMessage]
     next: str
     quiz: List[dict]
     quiz_created: bool
     question_answered: bool
     flashcards_created: bool
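Note on `TutorState`: the documented fields suggest how a fresh conversation state could be built. The sketch below repeats the class with `Any` in place of `BaseMessage` so that it runs without LangChain installed. The helper `initial_state` and its default values are assumptions, not part of this commit.

```python
from typing import Any, List, TypedDict


class TutorState(TypedDict):
    messages: List[Any]  # BaseMessage in the project; Any keeps the sketch dependency-free
    next: str
    quiz: List[dict]
    quiz_created: bool
    question_answered: bool
    flashcards_created: bool


def initial_state(user_message: Any) -> TutorState:
    """Hypothetical helper: start a session with one message and all flags unset."""
    return TutorState(
        messages=[user_message],
        next="",
        quiz=[],
        quiz_created=False,
        question_answered=False,
        flashcards_created=False,
    )
```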

notebook_tutor/tools.py CHANGED

@@ -13,6 +13,23 @@ class FlashcardInput(BaseModel):
     flashcards: list = Field(description="A list of flashcards. Each flashcard should be a dictionary with 'question' and 'answer' keys.")
 class FlashcardTool(BaseTool):
     name = "create_flashcards"
     description = "Create flashcards in a .csv format suitable for import into Anki"
     args_schema: Type[BaseModel] = FlashcardInput
@@ -49,6 +66,18 @@ class FlashcardTool(BaseTool):
 create_flashcards_tool = FlashcardTool()
 class RetrievalChainWrapper:
     def __init__(self, retrieval_chain):
         self.retrieval_chain = retrieval_chain

     flashcards: list = Field(description="A list of flashcards. Each flashcard should be a dictionary with 'question' and 'answer' keys.")
 class FlashcardTool(BaseTool):
+    """
+    FlashcardTool class.
+    This class represents a tool for creating flashcards in a .csv format suitable for import into Anki.
+    Attributes:
+        name (str): The name of the tool.
+        description (str): The description of the tool.
+        args_schema (Type[BaseModel]): The schema for the input arguments of the tool.
+    Methods:
+        _run(flashcards: list, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
+            Use the tool to create flashcards.
+        _arun(flashcards: list, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
+            Use the tool asynchronously.
+    """
     name = "create_flashcards"
     description = "Create flashcards in a .csv format suitable for import into Anki"
     args_schema: Type[BaseModel] = FlashcardInput
 create_flashcards_tool = FlashcardTool()
 class RetrievalChainWrapper:
+    """
+    RetrievalChainWrapper class.
+    This class wraps a retrieval chain and provides a method to retrieve information using the wrapped chain.
+    Attributes:
+        retrieval_chain: The retrieval chain to be wrapped.
+    Methods:
+        retrieve_information(query: str) -> str:
+            Use this tool to retrieve information about the provided notebook.
+    """
     def __init__(self, retrieval_chain):
         self.retrieval_chain = retrieval_chain
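Note on `FlashcardTool`: the `_run` body is not part of this hunk. Based on the `'question'` and `'answer'` keys in `FlashcardInput`, the `Front,Back` header of the deleted CSV, and its `flashcards_<uuid>.csv` file name, the export step might look roughly like the sketch below. The function name `write_flashcards_csv` and its signature are assumptions for illustration only.

```python
import csv
import os
import uuid


def write_flashcards_csv(flashcards: list[dict], directory: str = "flashcards") -> str:
    """Write flashcards to a uniquely named CSV file and return its path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"flashcards_{uuid.uuid4()}.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Front", "Back"])
        for card in flashcards:
            # csv.writer quotes fields containing commas or newlines.
            writer.writerow([card["question"], card["answer"]])
    return path
```

Using `csv.writer` rather than joining strings by hand matters here, because answers such as import statements or shell commands can contain commas and quotes.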

requirements.txt CHANGED

@@ -1,6 +1,6 @@
 langchain==0.1.20
 langgraph==0.0.48
-crewai==0.30.0
 qdrant-client==1.9.1
 python-dotenv==1.0.1
 chainlit==1.0.506

 langchain==0.1.20
 langgraph==0.0.48
+langchain-openai==0.0.5
 qdrant-client==1.9.1
 python-dotenv==1.0.1
 chainlit==1.0.506