Commit c21a510 by JulsdL · 1 Parent(s): 85ee7cf

Code documentation and dockerizing AI Notebook Tutor

- Expand the welcome guide in chainlit.md to introduce new features such as uploading notebooks, asking questions, generating quizzes, and creating flashcards.
- Enhance code with detailed docstrings for agent functions, improving code documentation and readability.
- Update notebook_tutor/chainlit_frontend.py to log successful chat initiation and provide users with guidance on interacting with the system.
- Add a Dockerfile for containerization, enabling easy deployment and scaling of the AI Notebook Tutor application.
- Update README.md with project metadata and acknowledgements, offering users and contributors a comprehensive overview of the project and its dependencies.
- Modify requirements.txt: remove crewai and add langchain-openai, aligning the dependencies with the project's use of LangChain and OpenAI.

Dockerfile ADDED
@@ -0,0 +1,29 @@
+ # Use the official Python image from the Docker Hub
+ FROM python:3.11
+
+ # Create a new user with a specific UID and set it as the default user
+ RUN useradd -m -u 1000 user
+
+ # Switch to the new user
+ USER user
+
+ # Set environment variables
+ ENV HOME=/home/user \
+     PATH=/home/user/.local/bin:$PATH
+
+ # Set the working directory
+ WORKDIR $HOME/app
+
+ # Copy the requirements file and install the dependencies
+ COPY --chown=user ./requirements.txt $HOME/app/requirements.txt
+ RUN pip install --upgrade pip && \
+     pip install -r requirements.txt
+
+ # Copy project files to the working directory
+ COPY --chown=user . $HOME/app
+
+ # Set PYTHONPATH to include the project directory
+ ENV PYTHONPATH=$HOME/app
+
+ # Set the command to run the application
+ CMD ["chainlit", "run", "notebook_tutor/app.py", "--port", "7860"]
README.md CHANGED
@@ -1,3 +1,12 @@
+ ---
+ title: DeepPDF AI
+ emoji: 📖
+ colorFrom: pink
+ colorTo: yellow
+ sdk: docker
+ pinned: false
+ ---
+
  # AI-Notebook-Tutor

  # RAG Application for QA in Jupyter Notebook
@@ -34,3 +43,7 @@ chainlit run notebook_tutor/app.py
  ## Usage

  Start a chat session and upload a Jupyter notebook file. The application will process the document and you can then ask questions related to the content of the notebook. It might take some time to answer some questions (should be less than 1 min), so please be patient.
+
+ ## Acknowledgements
+
+ This project uses technologies including LangChain, OpenAI's GPT models, Qdrant for vector storage, and Chainlit. Thanks to all open-source contributors and organizations that make these tools available.
chainlit.md CHANGED
@@ -1,14 +1,36 @@
- # Welcome to Chainlit! 🚀🤖
+ # Welcome to AI Notebook Tutor! 🎓📘

- Hi there, Developer! 👋 We're excited to have you on board. Chainlit is a powerful tool designed to help you prototype, debug and share applications built on top of LLMs.
+ Hello and welcome to the AI Notebook Tutor, your interactive learning assistant designed to enhance your understanding of Jupyter notebooks. Whether you're a student, educator, or professional, our tool will help you navigate and master complex technical concepts more efficiently.

- ## Useful Links 🔗
+ ## Getting Started

- - **Documentation:** Get started with our comprehensive [Chainlit Documentation](https://docs.chainlit.io) 📚
- - **Discord Community:** Join our friendly [Chainlit Discord](https://discord.gg/k73SQ3FyUh) to ask questions, share your projects, and connect with other developers! 💬
+ To begin using AI Notebook Tutor:

- We can't wait to see what you create with Chainlit! Happy coding! 💻😊
+ - **Upload your Jupyter notebook** (.ipynb, max. 5 MB) to start your interactive learning journey.
+ - **Ask specific questions** about the notebook content to get detailed explanations.
+ - **Generate custom quizzes** to test your understanding of the material.
+ - **Create flashcards** for quick revision and efficient memorization of key concepts.

- ## Welcome screen
+ ### Sample Questions to Try:

- To modify the welcome screen, edit the `chainlit.md` file at the root of your project. If you do not want a welcome screen, just leave this file empty.
+ - "Explain the `generate_query` function in detail."
+ - "Generate a quiz about creating a vector store retriever."
+ - "Create 10 flashcards for the key concepts in the notebook."
+
+ ## How It Works 🧠
+
+ 1. **Upload and Process**: Start by uploading your Jupyter notebook. Our system will process the document and prepare it for interactive learning.
+ 2. **Interactive Q&A**: Ask specific questions about your notebook content. The AI will analyze the content and provide detailed, easy-to-understand explanations.
+ 3. **Quiz Generation**: Automatically generate quizzes based on the key concepts in your notebook. This helps reinforce your understanding and retention of the material.
+ 4. **Flashcard Creation**: Generate flashcards from key segments of your notebook for quick revision and repetitive learning, helping boost long-term retention.
+
+ ## The Tech Behind It 💡🤖
+
+ - **Document Management**: Our system uses advanced document processing to load and split your notebook content into manageable chunks.
+ - **Language Model**: Powered by GPT-4, the AI provides accurate and relevant explanations, quizzes, and flashcards based on your notebook content.
+ - **Retrieval System**: Utilizing state-of-the-art retrieval techniques, our system ensures that the information provided is both precise and relevant.
+ - **Interactive Agents**: Different specialized agents work together to answer your questions, generate quizzes, and create flashcards, making your learning experience comprehensive and engaging.
+
+ ## Ready to Learn?
+
+ With AI Notebook Tutor, transform your Jupyter notebooks into an interactive, engaging learning experience. Dive deep into your study materials, test your knowledge, and retain information more effectively than ever before. Happy learning!
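The "Document Management" bullet in the welcome guide says the notebook is loaded and split into chunks, but this commit does not show that code. The following is only a minimal standard-library sketch of the idea. The function names `load_notebook_cells` and `split_into_chunks` and the character-based chunk limit are assumptions for illustration, not the project's implementation.

```python
import json


def load_notebook_cells(notebook_json: str) -> list:
    """Return the source text of every non-empty markdown and code cell."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") not in ("markdown", "code"):
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.strip():
            cells.append(text)
    return cells


def split_into_chunks(cells: list, max_chars: int = 1000) -> list:
    """Greedily pack cells into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for text in cells:
        # Cells longer than the limit are sliced into fixed-size pieces.
        pieces = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


# Hypothetical three-cell notebook; the raw cell is skipped.
demo = json.dumps({"cells": [
    {"cell_type": "markdown", "source": ["# Title\n", "Intro text"]},
    {"cell_type": "code", "source": "print('hi')"},
    {"cell_type": "raw", "source": "ignored"},
]})
cells = load_notebook_cells(demo)
chunks = split_into_chunks(cells, max_chars=20)
```

Packing whole cells keeps a code cell and its output of reasoning together where possible, which tends to matter more for retrieval quality than hitting an exact chunk size.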
flashcards_cca7854c-91c2-47d5-872f-46132739ace0.csv DELETED
@@ -1,11 +0,0 @@
- Front,Back
- What command is used to clone a GitHub repository in a notebook?,!git clone https://github.com/arcee-ai/DALM
- How do you install or upgrade a Python package in a notebook?,!pip install --upgrade -q -e .
- Which command installs the 'langchain' and 'langchain-community' libraries?,!pip install -qU langchain langchain-core langchain-community sentence_transformers
- What is the command to install 'pymupdf' and 'faiss-cpu'?,!pip install -qU pymupdf faiss-cpu
- How do you import the Pandas library in Python?,import pandas as pd
- Which library provides the 'HuggingFaceEmbeddings' class?,from langchain_community.embeddings import HuggingFaceEmbeddings
- How do you import the 'FAISS' vector store from the 'langchain_community' library?,from langchain_community.vectorstores import FAISS
- What is the import statement for reading directories using the 'Llama Index' library?,from llama_index.core import SimpleDirectoryReader
- Which import statement is used for parsing nodes in the 'Llama Index' library?,from llama_index.core.node_parser import SimpleNodeParser
- How do you import the 'MetadataMode' schema from the 'Llama Index' library?,from llama_index.core.schema import MetadataMode
notebook_tutor/agents.py CHANGED
@@ -25,7 +25,18 @@ def create_agent(
      tools: list,
      system_prompt: str,
  ) -> AgentExecutor:
-     """Create a function-calling agent and add it to the graph."""
+     """
+     Create a function-calling agent and add it to the graph.
+
+     Parameters:
+         llm (ChatOpenAI): The ChatOpenAI instance used for the agent.
+         tools (list): A list of tools available to the agent.
+         system_prompt (str): The system prompt for the agent.
+
+     Returns:
+         AgentExecutor: The AgentExecutor instance containing the agent.
+
+     """
      system_prompt += "\nWork autonomously according to your specialty, using the tools available to you."
      " Do not ask for clarification."
      " Your other team members (and other teams) will collaborate with you with their own specialties."
@@ -46,6 +57,21 @@ def create_agent(

  # Function to create agent nodes
  def agent_node(state, agent, name):
+     """
+     Invoke an agent and update the state based on the agent's output.
+
+     Parameters:
+         state (dict): The current state of the conversation.
+         agent (AgentExecutor): The agent to be invoked.
+         name (str): The name of the agent.
+
+     Returns:
+         dict: The updated state after invoking the agent.
+
+     Raises:
+         ValueError: If no messages are found in the agent state.
+
+     """
      result = agent.invoke(state)
      if 'messages' not in result:
          raise ValueError(f"No messages found in agent state: {result}")
@@ -65,7 +91,18 @@ def agent_node(state, agent, name):

  # Function to create the supervisor
  def create_team_supervisor(llm: ChatOpenAI, system_prompt, members) -> AgentExecutor:
-     """An LLM-based router."""
+     """
+     An LLM-based router.
+
+     Parameters:
+         llm (ChatOpenAI): The ChatOpenAI instance used for the supervisor.
+         system_prompt (str): The system prompt for the supervisor.
+         members (list): A list of team members.
+
+     Returns:
+         AgentExecutor: The AgentExecutor instance containing the supervisor.
+
+     """
      options = ["WAIT", "FINISH"] + members
      function_def = {
          "name": "route",
notebook_tutor/chainlit_frontend.py CHANGED
@@ -53,13 +53,24 @@ async def start_chat():
      tutor_chain = create_tutor_chain(retrieval_chain)
      cl.user_session.set("tutor_chain", tutor_chain)

-     ready_to_chat_message = "Notebook uploaded and processed successfully. You are now ready to chat!"
+     logger.info("Chat started and notebook uploaded successfully.")
+
+     ready_to_chat_message = "Notebook uploaded and processed successfully!"
      await cl.Message(content=ready_to_chat_message).send()

-     logger.info("Chat started and notebook uploaded successfully.")
+     invite_message = "You can now ask questions or request quizzes and flashcards based on the notebook content."
+     await cl.Message(content=invite_message).send()
+
+

  @cl.on_message
  async def main(message: cl.Message):
+     """
+     This is the main function that processes a message through the LangGraph chain.
+
+     Parameters:
+     - message (cl.Message): The message to be processed.
+     """

      # Retrieve the LangGraph chain from the session
      tutor_chain = cl.user_session.get("tutor_chain")
@@ -134,6 +145,12 @@ async def main(message: cl.Message):

  @cl.on_chat_end
  async def end_chat():
+     """
+     Clean up the flashcards directory after the chat ends.
+     This function is executed when the chat session ends.
+     It removes the 'flashcards' directory and all its contents, if it exists.
+     If the directory does not exist, it creates a new empty directory with the same name.
+     """
      # Clean up the flashcards directory
      flashcard_directory = 'flashcards'
      if os.path.exists(flashcard_directory):
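The hunk cuts off before the body of `end_chat`, so the cleanup itself is not visible here. Below is a minimal sketch of one reading of the docstring, assuming the directory should always end up present and empty. `reset_directory` is a hypothetical helper, not a function from the project, and the demonstration runs in a temporary location rather than the real `flashcards` folder.

```python
import os
import shutil
import tempfile


def reset_directory(path: str) -> None:
    """Delete path with all its contents if it exists, then recreate it empty."""
    if os.path.exists(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


# Demonstration in a throwaway location.
base = tempfile.mkdtemp()
flashcard_directory = os.path.join(base, "flashcards")
os.makedirs(flashcard_directory)
with open(os.path.join(flashcard_directory, "old.csv"), "w") as handle:
    handle.write("Front,Back\n")

reset_directory(flashcard_directory)
remaining = os.listdir(flashcard_directory)  # contents after the reset

shutil.rmtree(base)  # remove the temporary tree again
```

Note that a single shared directory is cleared for every session that ends, so concurrent users could lose files mid-session; a per-session subdirectory would avoid that, at the cost of a little more bookkeeping.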
notebook_tutor/graph.py CHANGED
@@ -10,6 +10,17 @@ load_dotenv()

  # Create the LangGraph chain
  def create_tutor_chain(retrieval_chain):
+     """
+     Create a tutor chain for the notebook tutor system.
+
+     This function creates a tutor chain for the notebook tutor system. The tutor chain consists of multiple agents, including a QA Agent, Quiz Agent, Flashcards Agent, and Supervisor Agent. Each agent is created with specific tools and prompts.
+
+     Parameters:
+         retrieval_chain (object): The retrieval chain used for information retrieval.
+
+     Returns:
+         StateGraph: The compiled tutor graph representing the tutor chain.
+     """
      retrieve_information_tool = get_retrieve_information_tool(retrieval_chain)

      # Create QA Agent
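The docstring describes a supervisor that hands each request to a QA, quiz, or flashcards agent. The control flow can be sketched without LangGraph or an LLM, as below. Everything in it is a stand-in: `stub_supervisor`, `make_stub_agent`, and `run_tutor` are hypothetical names, and a keyword rule replaces the model's routing decision. Only the `WAIT`/`FINISH` options and the three agent names come from this commit.

```python
from typing import Callable

MEMBERS = ["QAAgent", "QuizAgent", "FlashcardsAgent"]
OPTIONS = ["WAIT", "FINISH"] + MEMBERS


def stub_supervisor(state: dict) -> str:
    """Stand-in for the LLM router: finish after an agent reply, else route by keyword."""
    last = state["messages"][-1]
    if last.startswith(tuple(f"{member}:" for member in MEMBERS)):
        return "FINISH"
    text = last.lower()
    if "quiz" in text:
        return "QuizAgent"
    if "flashcard" in text:
        return "FlashcardsAgent"
    return "QAAgent"


def make_stub_agent(name: str) -> Callable[[dict], dict]:
    """Build a fake agent that appends a reply tagged with its name."""
    def agent(state: dict) -> dict:
        reply = f"{name}: handled '{state['messages'][0]}'"
        return {**state, "messages": state["messages"] + [reply]}
    return agent


def run_tutor(request: str, max_steps: int = 5) -> dict:
    """Alternate between supervisor and agents until the supervisor stops."""
    agents = {name: make_stub_agent(name) for name in MEMBERS}
    state = {"messages": [request], "next": ""}
    for _ in range(max_steps):
        state["next"] = stub_supervisor(state)
        if state["next"] not in OPTIONS:
            raise ValueError(f"Unknown route: {state['next']}")
        if state["next"] in ("FINISH", "WAIT"):
            break
        state = agents[state["next"]](state)
    return state


result = run_tutor("Generate a quiz about vector stores")
```

The `max_steps` cap is a guard for the sketch: a router that never returns `FINISH` would otherwise loop forever.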
notebook_tutor/prompt_templates.py CHANGED
@@ -10,6 +10,10 @@ class PromptTemplates:
      Methods:
          __init__(): Initializes all prompt templates as instance variables.
          get_rag_qa_prompt(): Returns the RAG QA prompt.
+         get_qa_agent_prompt(): Returns the QA Agent prompt.
+         get_quiz_agent_prompt(): Returns the Quiz Agent prompt.
+         get_flashcards_agent_prompt(): Returns the Flashcards Agent prompt.
+         get_supervisor_agent_prompt(): Returns the Supervisor Agent prompt.

      Example usage:
          prompt_templates = PromptTemplates()
@@ -41,7 +45,7 @@ class PromptTemplates:
          2. Search Notebook Content: Use the notebook content to gather relevant information and generate accurate and informative flashcards.
          3. Generate Flashcards: Create a series of flashcards content with clear questions on the front and detailed answers on the back. Ensure that the flashcards cover the essential points and concepts requested by the user.
          4. Export Flashcards: YOU MUST USE the flashcard_tool to create and export the flashcards in a format that can be easily imported into a flashcard management system, such as Anki.
-         5. Provide the list of flashcards in a clear and organized manner.
+         5. Provide the list of flashcards in a clear and organized manner. DO NOT SHARE THE LINK TO THE FLASHCARD FILE.
          Remember, your goal is to help the user learn efficiently and effectively by breaking down the notebook content into manageable, repeatable flashcards."""

          self.SupervisorAgent_prompt = "You are a supervisor tasked with managing a conversation between the following agents: QAAgent, QuizAgent, FlashcardsAgent. Given the user request, decide which agent should act next."
notebook_tutor/states.py CHANGED
@@ -3,10 +3,20 @@ from langchain_core.messages import BaseMessage

  # Define the state for the system
  class TutorState(TypedDict):
+     """
+     A class representing the state of the tutor system.
+
+     Attributes:
+         messages (List[BaseMessage]): A list of messages in the system.
+         next (str): The next step in the tutor system.
+         quiz (List[dict]): A list of quiz questions and answers.
+         quiz_created (bool): Indicates if a quiz has been created.
+         question_answered (bool): Indicates if a question has been answered.
+         flashcards_created (bool): Indicates if flashcards have been created.
+     """
      messages: List[BaseMessage]
      next: str
      quiz: List[dict]
      quiz_created: bool
      question_answered: bool
-     # flashcard_path: str
      flashcards_created: bool
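To show how a state of this shape might be seeded at the start of a conversation, here is a small self-contained sketch. It simplifies `messages` to plain strings so it runs without LangChain (the project uses `BaseMessage`), and `initial_state` is a hypothetical helper that does not appear in the commit.

```python
from typing import List, TypedDict


class TutorState(TypedDict):
    messages: List[str]  # simplified; the project uses List[BaseMessage]
    next: str
    quiz: List[dict]
    quiz_created: bool
    question_answered: bool
    flashcards_created: bool


def initial_state(user_text: str) -> TutorState:
    """Build a fresh state: one user message, nothing produced yet."""
    return TutorState(
        messages=[user_text],
        next="",
        quiz=[],
        quiz_created=False,
        question_answered=False,
        flashcards_created=False,
    )


state = initial_state("Explain the generate_query function")
```

A `TypedDict` is an ordinary `dict` at runtime, so the annotations help type checkers and readers but are not enforced when the graph runs.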
 
notebook_tutor/tools.py CHANGED
@@ -13,6 +13,23 @@ class FlashcardInput(BaseModel):
      flashcards: list = Field(description="A list of flashcards. Each flashcard should be a dictionary with 'question' and 'answer' keys.")

  class FlashcardTool(BaseTool):
+     """
+     FlashcardTool class.
+
+     This class represents a tool for creating flashcards in a .csv format suitable for import into Anki.
+
+     Attributes:
+         name (str): The name of the tool.
+         description (str): The description of the tool.
+         args_schema (Type[BaseModel]): The schema for the input arguments of the tool.
+
+     Methods:
+         _run(flashcards: list, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
+             Use the tool to create flashcards.
+
+         _arun(flashcards: list, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
+             Use the tool asynchronously.
+     """
      name = "create_flashcards"
      description = "Create flashcards in a .csv format suitable for import into Anki"
      args_schema: Type[BaseModel] = FlashcardInput
@@ -49,6 +66,18 @@ class FlashcardTool(BaseTool):
  create_flashcards_tool = FlashcardTool()

  class RetrievalChainWrapper:
+     """
+     RetrievalChainWrapper class.
+
+     This class wraps a retrieval chain and provides a method to retrieve information using the wrapped chain.
+
+     Attributes:
+         retrieval_chain: The retrieval chain to be wrapped.
+
+     Methods:
+         retrieve_information(query: str) -> str:
+             Use this tool to retrieve information about the provided notebook.
+     """
      def __init__(self, retrieval_chain):
          self.retrieval_chain = retrieval_chain
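The body of `FlashcardTool._run` falls outside these hunks, so the export step is not shown. As a rough sketch of what turning the `flashcards` argument into Anki-style CSV could look like, the example below uses the `question` and `answer` keys named in `FlashcardInput` and the `Front,Back` header seen in the deleted sample file. `flashcards_to_csv` is a hypothetical function, and it returns text instead of writing a file.

```python
import csv
import io


def flashcards_to_csv(flashcards: list) -> str:
    """Render flashcards as CSV text with a Front,Back header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Front", "Back"])
    for card in flashcards:
        writer.writerow([card["question"], card["answer"]])
    return buffer.getvalue()


# Illustrative cards; the second one contains commas on purpose.
cards = [
    {"question": "How do you import the Pandas library in Python?",
     "answer": "import pandas as pd"},
    {"question": "Which separator does CSV use, by default?",
     "answer": "A comma (,)"},
]
csv_text = flashcards_to_csv(cards)
parsed_rows = list(csv.reader(io.StringIO(csv_text)))
```

Using the `csv` module rather than joining strings by hand means fields that contain commas or quotes are quoted automatically, so each card still parses as exactly two columns.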
 
requirements.txt CHANGED
@@ -1,6 +1,6 @@
  langchain==0.1.20
  langgraph==0.0.48
- crewai==0.30.0
+ langchain-openai==0.0.5
  qdrant-client==1.9.1
  python-dotenv==1.0.1
  chainlit==1.0.506