JulsdL committed
Commit 9f2166e · 1 Parent(s): 48d9af7

Added logging for debugging

flashcards/flashcards_03fb423e-2087-4598-9ebb-b99a20db0b93.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What is the first step in the retriever creation process?,The first step is loading a fine-tuned embedding model from the hub using the HuggingFaceEmbeddings class. The model used is named 'JulsdL/e2erag-arctic-m' and it is configured to run on a CUDA device.
-Which class is used to load the embedding model in the retriever creation process?,The HuggingFaceEmbeddings class is used to load the embedding model.
-What is the name of the embedding model used in the retriever creation process?,The embedding model used is named 'JulsdL/e2erag-arctic-m'.
-On which device is the embedding model 'JulsdL/e2erag-arctic-m' configured to run?,The embedding model is configured to run on a CUDA device.
-What is the purpose of setting up a VectorStore in the retriever creation process?,The VectorStore is set up to power the dense vector search and is populated with documents that have been split into chunks and embedded using the embedding model.
-Which tool is used to set up the VectorStore in the retriever creation process?,Meta's FAISS (Facebook AI Similarity Search) is used to set up the VectorStore.
-How are documents prepared for the VectorStore in the retriever creation process?,Documents are split into chunks and embedded using the previously loaded embedding model before being added to the VectorStore.
-What is the final step in the retriever creation process?,The final step is converting the VectorStore into a retriever that can fetch relevant documents or chunks based on query embeddings.
-"In the context of retriever creation, what is the purpose of converting the VectorStore to a retriever?",Converting the VectorStore to a retriever allows it to be used for fetching relevant documents or chunks based on the query embeddings.
-How does a retriever enhance the model's ability in a Retrieval-Augmented Generation (RAG) setup?,"A retriever enhances the model's ability by providing contextually relevant responses based on the input queries, improving the quality and relevance of the generated content."
 
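The (now removed) cards above describe the embedding-model step of retriever creation. A minimal sketch of that step, based only on the card answers (the variable name is illustrative, not from the repository):

from langchain_community.embeddings import HuggingFaceEmbeddings

# Load the fine-tuned embedding model from the hub and run it on a CUDA device,
# as the cards describe.
embedding_model = HuggingFaceEmbeddings(
    model_name="JulsdL/e2erag-arctic-m",
    model_kwargs={"device": "cuda"},
)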
flashcards/flashcards_30a9e627-3d47-45fc-81db-a991269edb24.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What command is used to clone a repository from GitHub?,!git clone https://github.com/arcee-ai/DALM
-How do you install a package using pip and upgrade it if necessary?,!pip install --upgrade -q -e .
-"Which command is used to install the latest versions of langchain, langchain-core, langchain-community, and sentence_transformers?",!pip install -qU langchain langchain-core langchain-community sentence_transformers
-How do you install the pymupdf and faiss-cpu libraries using pip?,!pip install -qU pymupdf faiss-cpu
-What is the import statement for pandas?,import pandas as pd
-How do you import HuggingFaceEmbeddings from the langchain_community library?,from langchain_community.embeddings import HuggingFaceEmbeddings
-What is the import statement for FAISS from the langchain_community library?,from langchain_community.vectorstores import FAISS
-Which import statement is used for SimpleDirectoryReader from llama_index.core?,from llama_index.core import SimpleDirectoryReader
-How do you import SimpleNodeParser from the llama_index.core.node_parser module?,from llama_index.core.node_parser import SimpleNodeParser
-What is the import statement for MetadataMode from llama_index.core.schema?,from llama_index.core.schema import MetadataMode
 
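Collected into a single notebook setup cell, the commands and imports listed on these cards would look roughly like this (the cell layout is an assumption; the individual lines come straight from the card answers):

# Environment setup and imports gathered from the flashcards above (notebook cell).
!git clone https://github.com/arcee-ai/DALM
!pip install --upgrade -q -e .
!pip install -qU langchain langchain-core langchain-community sentence_transformers
!pip install -qU pymupdf faiss-cpu

import pandas as pd
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SimpleNodeParser
from llama_index.core.schema import MetadataMode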
flashcards/flashcards_37136597-b97c-4889-9dc1-c3b5e10084c1.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What is the first step in the retriever creation process?,The first step in the retriever creation process is loading the embedding model.
-Which model is used as the embedding model in the retriever creation process?,The embedding model used is 'JulsdL/e2erag-arctic-m'.
-Which class and module are used to load the embedding model?,The embedding model is loaded using the HuggingFaceEmbeddings class from the langchain_community.embeddings module.
-On which device is the embedding model configured to run?,The embedding model is configured to run on a CUDA device.
-What is the purpose of the VectorStore in the retriever creation process?,The VectorStore is created to power dense vector searches.
-Which technology is used to set up the VectorStore?,Meta's FAISS (Facebook AI Similarity Search) is used to set up the VectorStore.
-How are the documents prepared for creating the VectorStore?,The documents are split into chunks before creating the VectorStore.
-Which method is called to create the VectorStore from the documents and embedding model?,The method FAISS.from_documents is called to create the VectorStore.
-How is the VectorStore converted into a retriever?,The VectorStore is converted into a retriever by invoking the as_retriever() method on the vector_store object.
-What are the benefits of the retriever created through this process?,"The retriever leverages the power of dense embeddings and efficient search capabilities provided by FAISS, making it effective for retrieval-augmented tasks."
 
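A minimal sketch of the vector-store and retriever steps these cards cover, assuming the chunked documents (split_documents here) and the embedding model from the previous step are already in memory (variable names are illustrative):

from langchain_community.vectorstores import FAISS

# Build the FAISS vector store from the document chunks and the embedding model,
# then convert it into a retriever for query-time lookups.
vector_store = FAISS.from_documents(split_documents, embedding_model)
retriever = vector_store.as_retriever()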
flashcards/flashcards_4dbaa0d1-6363-4584-9691-824540220571.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What is the first step in creating a retriever for a document?,The first step is loading the documents using a loader like `PyMuPDFLoader` to load documents from a PDF file.
-Which function is used to load documents from a PDF file?,The function `PyMuPDFLoader` is used to load documents from a PDF file.
-How are documents split into chunks for processing?,"Documents are split into chunks using the `RecursiveCharacterTextSplitter`, which divides the documents based on predefined rules like token limits and split characters."
-What is the purpose of chunking documents?,Chunking documents into smaller parts helps in managing and processing them more efficiently for embedding and retrieval.
-Which library is used for loading a pre-trained embedding model?,The `HuggingFaceEmbeddings` library is used for loading a pre-trained embedding model.
-How do you specify the device for running the embedding model?,"You specify the device (e.g., 'cuda') in the `model_kwargs` parameter when loading the embedding model."
-What is Meta's FAISS used for in the retriever creation process?,Meta's FAISS is used to power dense vector search by creating a vector store from the document chunks and embedding model.
-How do you convert a vector store into a retriever?,You convert a vector store into a retriever using the `as_retriever` method.
-What is the role of the retriever in the retrieval-augmented generation (RAG) system?,"The retriever fetches relevant document chunks based on query vectors, providing contextually relevant responses."
-Which embedding model is used in the example provided for retriever creation?,The embedding model 'JulsdL/e2erag-arctic-m' from Hugging Face is used in the example.
 
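The loading and chunking steps referenced on these cards might look like the sketch below; the PDF path and chunk sizes are placeholders, not values taken from the commit:

from langchain_community.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load the source PDF and split it into chunks small enough to embed.
documents = PyMuPDFLoader("example.pdf").load()  # placeholder path
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # placeholder chunk size
    chunk_overlap=100,  # placeholder overlap
)
split_documents = text_splitter.split_documents(documents)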
flashcards/flashcards_52137f3d-8a8d-4891-9216-ab3bbc6ee66a.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What library is used to load the fine-tuned embedding model for retriever creation?,The HuggingFaceEmbeddings class is used to load the fine-tuned embedding model.
-What model is loaded for embedding in the retriever creation process?,The model 'JulsdL/e2erag-arctic-m' is loaded for embedding.
-On which device is the embedding model set to run?,The embedding model is set to run on a CUDA device.
-Which class is used to set up the vector store in retriever creation?,The FAISS class from Meta is used to set up the vector store.
-What is the purpose of the vector store in the retriever creation process?,The vector store powers the dense vector search by storing documents that have been split into chunks and embedded.
-How is the vector store created in the retriever creation process?,The vector store is created from documents that have been split into chunks and embedded using the loaded embedding model.
-How is the vector store converted into a retriever?,The vector store is converted into a retriever using the 'as_retriever()' method.
-What is a retriever used for in the context of retriever creation?,A retriever is used to retrieve context based on a query for a language model.
-What is the benefit of creating a retriever in a RAG setup?,Creating a retriever enhances the capabilities of language models by providing them with relevant context for generating responses.
-What does 'RAG' stand for in the context of retriever creation?,'RAG' stands for Retrieval-Augmented Generation.
 
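To illustrate how such a retriever is then used to fetch context for the language model (the query string and variable names are only examples):

# Retrieve the chunks most relevant to a query; in the RAG setup the cards describe,
# these chunks become the context handed to the language model.
relevant_chunks = retriever.get_relevant_documents("How is the retriever created?")
for chunk in relevant_chunks:
    print(chunk.page_content[:200])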
flashcards/flashcards_545f9f54-f8b0-4549-b27d-66b7303a017e.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What tool is used to load documents from a PDF file in the retriever creation process?,The PyMuPDFLoader is used to load documents from a PDF file.
-What is the purpose of the RecursiveCharacterTextSplitter in the retriever creation process?,The RecursiveCharacterTextSplitter is used to divide the documents into manageable chunks based on specific rules such as token limits and preferred split characters.
-Which library is used to create a dense vector store in the retriever creation process?,Meta's FAISS library is used to create a dense vector store from the document chunks and the embedding model.
-How do you load a pre-trained embedding model from Hugging Face to run on a CUDA device?,Use the HuggingFaceEmbeddings class with the model name and specify the device as CUDA in the model_kwargs parameter.
-What is the role of the embedding model in the retriever creation process?,The embedding model is used to convert text into vectors that can be stored in the vector store for efficient retrieval.
-How do you convert a vector store into a retriever?,Use the as_retriever method on the vector store to convert it into a retriever.
-What is the first step in the retriever creation process?,The first step is loading the documents using PyMuPDFLoader.
-Which model name is used in the provided example for the embedding model?,"The model name used is ""JulsdL/e2erag-arctic-m""."
-What is the purpose of the vector store in the retriever creation process?,"The vector store holds the dense vector representations of document chunks, allowing for efficient retrieval based on query embeddings."
-Which Python import is necessary to use FAISS for creating a vector store?,You need to import FAISS from langchain_community.vectorstores.
 
flashcards/flashcards_af3650ed-34b5-4040-b672-036e1cf3b8e3.csv DELETED
@@ -1,11 +0,0 @@
-Front,Back
-What is the first step in creating a retriever?,The first step in creating a retriever is loading a fine-tuned embedding model from the hub using the HuggingFaceEmbeddings class.
-Which class is used to load the embedding model?,The HuggingFaceEmbeddings class is used to load the embedding model.
-How is the model 'JulsdL/e2erag-arctic-m' loaded onto the GPU?,"The model 'JulsdL/e2erag-arctic-m' is loaded onto the GPU by setting the model_kwargs parameter to {""device"": ""cuda""} in the HuggingFaceEmbeddings class."
-What library is used to set up the vector store?,Meta's FAISS library is used to set up the vector store.
-What does the vector store manage?,The vector store manages the dense vectors generated by the embedding model.
-How is the vector store converted into a retriever?,The vector store is converted into a retriever using the as_retriever() method.
-What is the purpose of the retriever in this context?,The purpose of the retriever is to efficiently fetch relevant document vectors based on the query vectors.
-What is the role of the embedding model in the retriever creation process?,"The embedding model generates dense vectors for the documents, which are then managed by the vector store and used by the retriever to find relevant information."
-Which parameter in the HuggingFaceEmbeddings class specifies the device to be used?,"The model_kwargs parameter specifies the device to be used, such as ""cuda"" for GPU."
-What does the FAISS library stand for?,FAISS stands for Facebook AI Similarity Search.
 
notebook_tutor/chainlit_frontend.py CHANGED
@@ -75,24 +75,25 @@ async def main(message: cl.Message):
         flashcard_filename="",
     )
 
-    print(f"Initial state: {state}")
+    print("\033[93m" + f"Initial state: {state}" + "\033[0m")
 
     # Process the message through the LangGraph chain
    for s in tutor_chain.stream(state, {"recursion_limit": 10}):
-        print(f"State after processing: {s}")
+        print("\033[93m" + f"State after processing: {s}" + "\033[0m")
 
         # Extract messages from the state
         if "__end__" not in s:
             agent_state = next(iter(s.values()))
             if "messages" in agent_state:
                 response = agent_state["messages"][-1].content
-                print(f"Response: {response}")
+                print("\033[93m" + f"Response: {response}" + "\033[0m")
                 await cl.Message(content=response).send()
             else:
                 print("Error: No messages found in agent state.")
         else:
             # Extract the final state
             final_state = next(iter(s.values()))
+            print("\033[93m" + f"Final state: {final_state}" + "\033[0m")
 
             # Check if the quiz was created and send it to the frontend
             if final_state.get("quiz_created"):
@@ -109,20 +110,20 @@ async def main(message: cl.Message):
                 flashcards_message = final_state["messages"][-1].content
                 await cl.Message(content=flashcards_message).send()
 
-                # Create a full path to the file
+                # Create a relative path to the file
                 flashcard_filename = final_state["flashcard_filename"]
-                print(f"Flashcard filename: {flashcard_filename}")
-                flashcard_path = os.path.abspath(flashcard_filename)
-                print(f"Flashcard path: {flashcard_path}")
+                print("\033[93m" + f"Flashcard filename: {flashcard_filename}" + "\033[0m")
+                flashcard_path = os.path.join(".files", flashcard_filename)
+                print("\033[93m" + f"Flashcard path: {flashcard_path}" + "\033[0m")
 
                 # Use the File class to send the file
-                file_element = cl.File(name=os.path.basename(flashcard_filename), path=flashcard_path)
-                print(f"Sending flashcards file: {file_element}")
+                file_element = cl.File(name=os.path.basename(flashcard_path), path=flashcard_path)
+                print("\033[93m" + f"Sending flashcards file: {file_element}" + "\033[0m")
                 await cl.Message(
                     content="Here are your flashcards:",
                     elements=[file_element]
                 ).send()
 
-                print("Reached END state.")
+                print("\033[93m" + "Reached END state." + "\033[0m")
 
                 break
notebook_tutor/tools.py CHANGED
@@ -22,8 +22,8 @@ class FlashcardTool(BaseTool):
     ) -> str:
         """Use the tool to create flashcards."""
         filename = f"flashcards_{uuid.uuid4()}.csv"
-        save_path = os.path.join('flashcards', filename)  # Save in 'flashcards' directory
-        os.makedirs(os.path.dirname(save_path), exist_ok=True)
+        save_path = os.path.join('.files', filename)
+        # os.makedirs(os.path.dirname(save_path), exist_ok=True) # Create directory if it doesn't exist
         with open(save_path, 'w', newline='') as csvfile:
             fieldnames = ['Front', 'Back']
             writer = csv.DictWriter(csvfile, fieldnames=fieldnames)