pragatheeswaran committed · Commit 2c5d424 · verified · 1 Parent(s): 8e725de

Upload 3 files

Files changed (3)
  1. README_app.md +28 -0
  2. app.py +663 -0
  3. requirements.txt +20 -0
README_app.md ADDED
@@ -0,0 +1,28 @@
+ # LangGraph Document Q&A Assistant
+
+ This repository showcases a Document Question & Answering (Q&A) Assistant built using [LangGraph](https://gritholdings.gitbook.io/docs/langgraph) and the [DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) model. Users upload documents, and the assistant answers their questions based on the content of those documents.
+
+ ## Features
+
+ - **Document Upload**: Users can upload PDF, text, and image (PNG, JPG, JPEG) files for analysis.
+ - **Intelligent Q&A**: Uses the DeepSeek-V3 model to answer questions from passages retrieved from the uploaded documents.
+ - **Modular Architecture**: Built with LangGraph as a two-step graph (retrieve, then generate), so steps can be added or replaced independently.
+
+ ## Getting Started
+
+ Follow these instructions to set up and run the project locally.
+
+ ### Prerequisites
+
+ - Python 3.8 or higher
+ - [LangGraph](https://gritholdings.gitbook.io/docs/langgraph)
+ - Access to the [DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) model (`app.py` calls it through `langchain_together`)
+
+ ### Installation
+
+ 1. **Clone the Repository**:
+
+ ```bash
+ git clone https://huggingface.co/pragatheeswaran/langgraph-document-qa-assistant
+ cd langgraph-document-qa-assistant
+
app.py ADDED
@@ -0,0 +1,663 @@
+ import os
+ import tempfile
+ import streamlit as st
+ from PIL import Image
+ import pytesseract
+ from pdf2image import convert_from_path
+ import pypdf
+ from dotenv import load_dotenv
+ import time
+
+ from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
+ from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
+ from langchain_core.output_parsers import StrOutputParser
+ from langchain_together import Together
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
+ from langchain_community.vectorstores import FAISS
+ from langchain_community.embeddings import HuggingFaceEmbeddings
+
+ import langgraph
+ from langgraph.graph import END
+ from typing import List, Dict, Any, TypedDict, Optional
+
+ # Load environment variables
+ load_dotenv()
+
+ # Set page configuration
+ st.set_page_config(
+     page_title="Document Q&A",
+     page_icon="📚",
+     layout="wide",
+     initial_sidebar_state="expanded"
+ )
+
+ # Custom CSS for better UI
+ st.markdown("""
+ <style>
+     /* Base styles */
+     .main {
+         background-color: #f8fafc;
+         color: #333;
+         padding: 1rem;
+     }
+
+     /* Sidebar styling */
+     [data-testid="stSidebar"] {
+         background-color: #1e293b;
+         color: #f8fafc;
+         padding: 1rem;
+     }
+
+     /* Example questions */
+     .example-button {
+         background-color: #7c3aed;
+         color: white;
+         border: none;
+         border-radius: 0.5rem;
+         padding: 0.75rem 1rem;
+         margin-bottom: 0.75rem;
+         cursor: pointer;
+         text-align: left;
+         display: block;
+         width: 100%;
+         font-size: 0.9rem;
+     }
+
+     /* Chat container */
+     .chat-container {
+         min-height: 60vh;
+         overflow-y: auto;
+         padding: 1rem;
+         background-color: white;
+         border-radius: 0.5rem;
+         border: 1px solid #e2e8f0;
+         margin-bottom: 1rem;
+     }
+
+     /* Sidebar title */
+     .sidebar-title {
+         color: #f8fafc;
+         font-size: 1.2rem;
+         font-weight: 600;
+         margin-bottom: 1rem;
+         padding-bottom: 0.5rem;
+         border-bottom: 1px solid #475569;
+     }
+
+     /* File list */
+     .file-item {
+         padding: 0.5rem;
+         background-color: #334155;
+         border-radius: 0.25rem;
+         margin-bottom: 0.5rem;
+         color: #f8fafc;
+     }
+     .file-name {
+         font-weight: 500;
+     }
+     .file-type {
+         font-size: 0.75rem;
+         color: #cbd5e1;
+     }
+
+     /* Instructions */
+     .instructions {
+         color: #cbd5e1;
+     }
+     .instructions ol {
+         margin-left: 1.5rem;
+         padding-left: 0;
+     }
+     .instructions li {
+         margin-bottom: 0.5rem;
+     }
+
+     /* Divider */
+     .divider {
+         height: 1px;
+         background-color: #475569;
+         margin: 1.5rem 0;
+     }
+
+     /* Override Streamlit button styles */
+     .stButton > button {
+         background-color: #7c3aed;
+         color: white;
+     }
+
+     /* Override Streamlit file uploader */
+     .stFileUploader > div > div {
+         background-color: #334155;
+         color: #f8fafc;
+         border: 1px dashed #7c3aed;
+         border-radius: 0.5rem;
+         padding: 1rem;
+     }
+
+     /* Controls section */
+     .controls-section {
+         margin-top: 1rem;
+     }
+
+     /* Control buttons */
+     .control-button {
+         background-color: #7c3aed;
+         color: white;
+         border: none;
+         border-radius: 0.25rem;
+         padding: 0.5rem 1rem;
+         margin-right: 0.5rem;
+         margin-bottom: 0.5rem;
+         cursor: pointer;
+     }
+
+     /* How to use section */
+     .how-to-use {
+         margin-bottom: 1.5rem;
+     }
+     .how-to-use ol {
+         margin-left: 1.5rem;
+         padding-left: 0;
+     }
+     .how-to-use li {
+         margin-bottom: 0.5rem;
+         color: #f8fafc;
+     }
+
+     /* Input field */
+     .stTextInput > div > div > input {
+         border: 1px solid #e2e8f0;
+         border-radius: 0.5rem;
+         padding: 0.75rem;
+         font-size: 1rem;
+     }
+
+     /* Form styling */
+     [data-testid="stForm"] {
+         border: none;
+         padding: 0;
+     }
+
+     /* Hide Streamlit branding */
+     #MainMenu {visibility: hidden;}
+     footer {visibility: hidden;}
+
+     /* Chat messages */
+     .user-message {
+         background-color: #f3f4f6;
+         padding: 0.75rem;
+         border-radius: 0.5rem;
+         margin-bottom: 0.75rem;
+         color: #1e293b;
+     }
+
+     .assistant-message {
+         background-color: #f8fafc;
+         padding: 0.75rem;
+         border-radius: 0.5rem;
+         margin-bottom: 0.75rem;
+         border: 1px solid #e2e8f0;
+         color: #1e293b;
+     }
+
+     /* Chat input container */
+     .chat-input-container {
+         display: flex;
+         align-items: center;
+         background-color: white;
+         border-radius: 0.5rem;
+         padding: 0.5rem;
+         border: 1px solid #e2e8f0;
+     }
+
+     /* Document status */
+     .document-status {
+         padding: 0.5rem;
+         border-radius: 0.5rem;
+         margin-top: 0.5rem;
+         font-size: 0.9rem;
+     }
+
+     .status-success {
+         background-color: #dcfce7;
+         color: #166534;
+     }
+
+     .status-waiting {
+         background-color: #f3f4f6;
+         color: #4b5563;
+     }
+
+     /* Tabs styling */
+     .stTabs [data-baseweb="tab-list"] {
+         gap: 8px;
+     }
+
+     .stTabs [data-baseweb="tab"] {
+         background-color: #f1f5f9;
+         border-radius: 4px 4px 0 0;
+         padding: 8px 16px;
+         height: auto;
+     }
+
+     .stTabs [aria-selected="true"] {
+         background-color: white !important;
+         border-bottom: 2px solid #7c3aed !important;
+     }
+
+     /* Sidebar section headers */
+     .sidebar-section-header {
+         color: #f8fafc;
+         font-size: 1rem;
+         font-weight: 600;
+         margin-top: 1rem;
+         margin-bottom: 0.5rem;
+     }
+
+     /* Sidebar file uploader label */
+     .sidebar-uploader-label {
+         color: #f8fafc;
+         font-size: 0.9rem;
+         margin-bottom: 0.5rem;
+     }
+ </style>
+ """, unsafe_allow_html=True)
+
+ # Example questions
+ EXAMPLE_QUESTIONS = [
+     "How do the different topics in these documents relate to each other?",
+     "What is the structure of this document?",
+     "Can you analyze the writing style of this text?",
+     "Extract all dates and events mentioned in the document",
+     "What are the main arguments presented in this document?"
+ ]
+
+ # Initialize the LLM
+ @st.cache_resource
+ def get_llm():
+     return Together(
+         model="deepseek-ai/DeepSeek-V3",
+         temperature=0.7,
+         max_tokens=1024
+     )
+
+ # Initialize embeddings
+ @st.cache_resource
+ def get_embeddings():
+     return HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
+
+ # Initialize text splitter
+ @st.cache_resource
+ def get_text_splitter():
+     return RecursiveCharacterTextSplitter(
+         chunk_size=1000,
+         chunk_overlap=200
+     )
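
Note on the splitter settings: `chunk_size=1000` with `chunk_overlap=200` means consecutive chunks share up to 200 characters, so a sentence cut at a chunk boundary still appears whole in one of the neighbours. The sketch below pictures that with a plain sliding window. It is a simplification under stated assumptions: `sliding_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself splits on separators rather than at fixed offsets, so its real chunk boundaries will differ.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters.

    Hypothetical helper for illustration only; not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(chr(65 + i % 26) for i in range(2500))
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])
```

With the default settings the window advances 800 characters at a time, so the last 200 characters of one chunk are the first 200 of the next.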
+
+ # Function to extract text from PDF
+ def extract_text_from_pdf(pdf_file):
+     pdf_reader = pypdf.PdfReader(pdf_file)
+     text = ""
+     for page in pdf_reader.pages:
+         text += page.extract_text() or ""
+     return text
+
+ # Function to extract text from image using OCR
+ def extract_text_from_image(image_file):
+     image = Image.open(image_file)
+     text = pytesseract.image_to_string(image)
+     return text
+
+ # Function to process PDF with OCR if needed
+ def process_pdf_with_ocr(pdf_file):
+     # First try normal text extraction
+     text = extract_text_from_pdf(pdf_file)
+
+     # If little or no text was extracted, try OCR
+     if len(text.strip()) < 100:
+         images = convert_from_path(pdf_file)
+         text = ""
+         for image in images:
+             text += pytesseract.image_to_string(image)
+
+     return text
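
Note on the OCR fallback: `process_pdf_with_ocr` treats fewer than 100 non-whitespace-trimmed characters as a sign that the PDF is scanned and switches to OCR. The same decision can be isolated from the PDF and OCR libraries, which makes the threshold easy to test. This is a sketch, not the commit's code: `extract_with_fallback` and `MIN_TEXT_CHARS` are hypothetical names, and the two extractors are passed in as plain callables.

```python
from typing import Callable

# Same threshold as process_pdf_with_ocr; the name is an assumption for this sketch
MIN_TEXT_CHARS = 100


def extract_with_fallback(
    path: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    min_chars: int = MIN_TEXT_CHARS,
) -> str:
    """Return primary(path) unless it yields too little text, then use fallback(path)."""
    text = primary(path)
    if len(text.strip()) >= min_chars:
        return text
    # Too little text extracted: assume a scanned document and fall back
    return fallback(path)


# Stub extractors stand in for pypdf and pytesseract
scanned = extract_with_fallback("scan.pdf", lambda p: "   ", lambda p: "text from OCR")
digital = extract_with_fallback("report.pdf", lambda p: "x" * 500, lambda p: "text from OCR")
print(scanned, len(digital))
```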
+
+ # Function to process uploaded files
+ def process_uploaded_files(uploaded_files):
+     all_texts = []
+     file_info = []
+
+     for file in uploaded_files:
+         # Create a temporary file
+         with tempfile.NamedTemporaryFile(delete=False) as temp_file:
+             temp_file.write(file.getvalue())
+             temp_file_path = temp_file.name
+
+         try:
+             # Process based on file type
+             name = file.name.lower()
+             if name.endswith('.pdf'):
+                 text = process_pdf_with_ocr(temp_file_path)
+                 file_type = "PDF"
+             elif name.endswith(('.png', '.jpg', '.jpeg')):
+                 text = extract_text_from_image(temp_file_path)
+                 file_type = "Image"
+             elif name.endswith(('.txt', '.md')):
+                 # Replace undecodable bytes instead of raising on non-UTF-8 files
+                 text = file.getvalue().decode('utf-8', errors='replace')
+                 file_type = "Text"
+             else:
+                 text = f"Unsupported file format: {file.name}"
+                 file_type = "Unknown"
+
+             all_texts.append(f"--- Content from {file.name} ---\n{text}")
+             file_info.append({"name": file.name, "type": file_type})
+         finally:
+             # Clean up the temporary file even if extraction fails
+             os.unlink(temp_file_path)
+
+     return "\n\n".join(all_texts), file_info
+
+ # Function to create vector store from text
+ def create_vectorstore(text):
+     text_splitter = get_text_splitter()
+     chunks = text_splitter.split_text(text)
+
+     # Use FAISS instead of Chroma to avoid SQLite dependency
+     return FAISS.from_texts(
+         texts=chunks,
+         embedding=get_embeddings()
+     )
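
Note on retrieval: the vector store embeds each chunk once, and a later `similarity_search(query, k=5)` returns the chunks whose vectors are closest to the query vector. The sketch below shows that idea with brute-force cosine similarity over plain lists. It is an illustration only: `cosine` and `top_k` are hypothetical helpers, and the distance metric and index structure that FAISS actually uses may differ.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], vectors: List[Sequence[float]], k: int = 5) -> List[Tuple[int, float]]:
    """Return (index, score) for the k vectors most similar to the query."""
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings"; real ones have hundreds of dimensions
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k([1.0, 0.0], chunk_vectors, k=2)
print(hits)
```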
+
+ # Define the state schema for the graph using TypedDict
+ class GraphState(TypedDict):
+     messages: List
+     documents: List
+     thinking: str
+
+ # Define the RAG agent using LangGraph
+ def create_rag_agent(vectorstore):
+     # Define the retrieval component
+     def retrieve(state: GraphState) -> GraphState:
+         query = state["messages"][-1].content
+         docs = vectorstore.similarity_search(query, k=5)
+         return {"documents": docs, "messages": state["messages"], "thinking": state.get("thinking", "")}
+
+     # Define the generation component with thinking step
+     def generate(state: GraphState) -> GraphState:
+         messages = state["messages"]
+         documents = state["documents"]
+
+         # Extract relevant context from documents
+         context = "\n\n".join([f"Document {i+1}:\n{doc.page_content}" for i, doc in enumerate(documents)])
+
+         # First, have the model think about the query
+         thinking_prompt = ChatPromptTemplate.from_messages([
+             SystemMessage(content="You are an assistant that thinks step by step before answering."),
+             MessagesPlaceholder(variable_name="messages"),
+             SystemMessage(content=f"Here is relevant context from the knowledge base:\n{context}\n\nThink step by step about how to answer the query using this context.")
+         ])
+
+         thinking = thinking_prompt | get_llm() | StrOutputParser()
+         thinking_result = thinking.invoke({"messages": messages})
+
+         # Then generate the final answer
+         answer_prompt = ChatPromptTemplate.from_messages([
+             SystemMessage(content="You are a helpful assistant that provides accurate information based on the given context."),
+             MessagesPlaceholder(variable_name="messages"),
+             SystemMessage(content=f"Here is relevant context from the knowledge base:\n{context}\n\nHere is your thinking process:\n{thinking_result}\n\nNow provide a clear and helpful answer based on this context and thinking.")
+         ])
+
+         answer = answer_prompt | get_llm() | StrOutputParser()
+         response = answer.invoke({"messages": messages})
+
+         return {
+             "messages": messages + [AIMessage(content=response)],
+             "thinking": thinking_result,
+             "documents": documents
+         }
+
+     # Create the graph
+     from langgraph.graph import StateGraph
+     workflow = StateGraph(GraphState)
+
+     workflow.add_node("retrieve", retrieve)
+     workflow.add_node("generate", generate)
+
+     workflow.set_entry_point("retrieve")
+     workflow.add_edge("retrieve", "generate")
+     workflow.add_edge("generate", END)
+
+     # Compile the graph
+     app = workflow.compile()
+
+     return app
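
Note on the graph: the compiled workflow is a straight line, `retrieve` then `generate`, where each node receives the state dictionary and returns an updated one. The data flow can be pictured without LangGraph as a list of functions applied in order. This is a plain-Python stand-in, not the LangGraph API: `run_pipeline`, the keyword-matching `retrieve`, and the string-joining `generate` are hypothetical stubs replacing the vector search and the two LLM calls.

```python
from typing import Callable, List, TypedDict


class State(TypedDict):
    messages: List[str]
    documents: List[str]
    thinking: str


Node = Callable[[State], State]


def run_pipeline(state: State, nodes: List[Node]) -> State:
    """Apply each node in order, feeding one node's output state to the next."""
    for node in nodes:
        state = node(state)
    return state


def retrieve(state: State) -> State:
    # Stub retrieval: keyword match instead of a vector similarity search
    corpus = ["LangGraph builds graphs of nodes.", "FAISS stores vectors."]
    words = state["messages"][-1].lower().split()
    hits = [doc for doc in corpus if any(word in doc.lower() for word in words)]
    return {"messages": state["messages"], "documents": hits, "thinking": state["thinking"]}


def generate(state: State) -> State:
    # Stub generation: a "thinking" note followed by an answer built from the context
    thinking = f"Found {len(state['documents'])} relevant passage(s)."
    answer = " ".join(state["documents"]) or "No relevant context found."
    return {
        "messages": state["messages"] + [answer],
        "documents": state["documents"],
        "thinking": thinking,
    }


result = run_pipeline({"messages": ["faiss"], "documents": [], "thinking": ""}, [retrieve, generate])
print(result["thinking"])
print(result["messages"][-1])
```

Keeping every node's signature as "state in, state out" is what lets a graph library insert, reorder, or branch between steps without changing the nodes themselves.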
+
+ # Function to clear all session state
+ def clear_session_state():
+     for key in list(st.session_state.keys()):
+         del st.session_state[key]
+
+ # Main app layout
+ def main():
+     # Initialize session state for showing examples
+     if "show_examples" not in st.session_state:
+         st.session_state.show_examples = True
+
+     # Initialize messages if not exists
+     if "messages" not in st.session_state:
+         st.session_state.messages = []
+
+     # Initialize thinking history if not exists
+     if "thinking_history" not in st.session_state:
+         st.session_state.thinking_history = []
+
+     # Sidebar for document upload and controls
+     with st.sidebar:
+         st.markdown('<div class="sidebar-title">📚 Document Q&A</div>', unsafe_allow_html=True)
+
+         st.markdown("""
+         <div class="how-to-use">
+             <ol>
+                 <li>Upload your documents using the form below</li>
+                 <li>Process the documents</li>
+                 <li>Ask questions about your documents</li>
+                 <li>View the AI's answers and thinking process</li>
+             </ol>
+         </div>
+         """, unsafe_allow_html=True)
+
+         # Document upload section
+         st.markdown('<div class="sidebar-section-header">📄 Upload Documents</div>', unsafe_allow_html=True)
+         st.markdown('<div class="sidebar-uploader-label">Select files to upload:</div>', unsafe_allow_html=True)
+
+         # File uploader ("md" is accepted because process_uploaded_files handles .md files)
+         uploaded_files = st.file_uploader("Upload documents",
+                                           type=["pdf", "txt", "md", "png", "jpg", "jpeg"],
+                                           accept_multiple_files=True,
+                                           label_visibility="collapsed")
+
+         # Process button
+         if uploaded_files:
+             if st.button("Process Documents"):
+                 with st.spinner("Processing documents..."):
+                     # Process progress bar
+                     progress_bar = st.progress(0)
+                     for i in range(100):
+                         time.sleep(0.01)
+                         progress_bar.progress(i + 1)
+
+                     # Process the files
+                     text, file_info = process_uploaded_files(uploaded_files)
+                     st.session_state.vectorstore = create_vectorstore(text)
+                     st.session_state.documents_processed = True
+                     st.session_state.file_info = file_info
+
+                     # Display success message
+                     st.success(f"✅ Processed {len(uploaded_files)} documents successfully!")
+
+         # Document info section
+         if "file_info" in st.session_state and st.session_state.file_info:
+             st.markdown('<div class="divider"></div>', unsafe_allow_html=True)
+             st.markdown('<div class="sidebar-section-header">📋 Document Information</div>', unsafe_allow_html=True)
+
+             # Display file list
+             for i, file in enumerate(st.session_state.file_info):
+                 st.markdown(f"""
+                 <div class="file-item">
+                     <div class="file-name">{file['name']}</div>
+                     <div class="file-type">{file['type']} file</div>
+                 </div>
+                 """, unsafe_allow_html=True)
+
+             # Remove documents button
+             if st.button("Remove All Documents"):
+                 if "vectorstore" in st.session_state:
+                     del st.session_state.vectorstore
+                 if "file_info" in st.session_state:
+                     del st.session_state.file_info
+                 if "documents_processed" in st.session_state:
+                     del st.session_state.documents_processed
+                 st.success("All documents removed!")
+                 st.rerun()
+
+         # Controls section
+         st.markdown('<div class="divider"></div>', unsafe_allow_html=True)
+         st.markdown('<div class="sidebar-section-header">⚙️ Controls</div>', unsafe_allow_html=True)
+
+         # Clear chat button
+         if st.button("Clear Chat"):
+             if "messages" in st.session_state:
+                 st.session_state.messages = []
+             if "thinking_history" in st.session_state:
+                 st.session_state.thinking_history = []
+             st.rerun()
+
+         # Reset all button
+         if st.button("Reset All"):
+             clear_session_state()
+             st.rerun()
+
+         # Hide/Show examples button
+         if st.button("Hide Examples" if st.session_state.show_examples else "Show Examples"):
+             st.session_state.show_examples = not st.session_state.show_examples
+             st.rerun()
+
+     # Main content area
+     st.title("Document Q&A Assistant")
+
+     # Example questions section - only show if flag is True
+     if st.session_state.show_examples:
+         st.markdown("### Example Questions")
+         cols = st.columns(len(EXAMPLE_QUESTIONS))
+         for i, question in enumerate(EXAMPLE_QUESTIONS):
+             with cols[i]:
+                 # Index-based keys stay the same between processes, unlike hash() of a str
+                 if st.button(question, key=f"example_{i}"):
+                     st.session_state.messages.append(HumanMessage(content=question))
+
+                     # Generate response if vectorstore exists
+                     if "vectorstore" in st.session_state:
+                         with st.spinner("Thinking..."):
+                             # Create RAG agent
+                             rag_agent = create_rag_agent(st.session_state.vectorstore)
+
+                             # Run the agent
+                             result = rag_agent.invoke({
+                                 "messages": [HumanMessage(content=question)],
+                                 "documents": [],
+                                 "thinking": ""
+                             })
+
+                             # Store thinking process
+                             st.session_state.thinking_history.append(result["thinking"])
+
+                             # Add AI message to chat history
+                             st.session_state.messages.append(result["messages"][-1])
+                     else:
+                         # Record an empty entry so thinking_history stays aligned with the replies
+                         st.session_state.thinking_history.append("")
+                         st.session_state.messages.append(AIMessage(content="Please upload and process documents first."))
+                     st.rerun()
+
+     # Chat container
+     st.markdown("### 💬 Chat")
+     chat_container = st.container()
+
+     with chat_container:
+         # Display chat messages
+         if st.session_state.messages:
+             for i, message in enumerate(st.session_state.messages):
+                 if isinstance(message, HumanMessage):
+                     # Single-line HTML avoids Markdown treating indented lines as a code block
+                     st.markdown(
+                         f'<div class="user-message"><strong>User:</strong> {message.content}</div>',
+                         unsafe_allow_html=True)
+                 else:
+                     st.markdown(
+                         f'<div class="assistant-message"><strong>Assistant:</strong> {message.content}</div>',
+                         unsafe_allow_html=True)
+
+                     # Show thinking process if one was recorded for this exchange
+                     history = st.session_state.get("thinking_history", [])
+                     thinking = history[i//2] if i//2 < len(history) else ""
+                     if thinking:
+                         # Create a unique key for this thinking process
+                         thinking_key = f"thinking_{i//2}"
+
+                         # Store the visibility state in session_state if not already there
+                         if thinking_key not in st.session_state:
+                             st.session_state[thinking_key] = False
+
+                         # Toggle button for thinking process
+                         toggle_text = "Show thinking" if not st.session_state[thinking_key] else "Hide thinking"
+
+                         # Create the toggle button
+                         if st.button(toggle_text, key=f"toggle_{thinking_key}"):
+                             st.session_state[thinking_key] = not st.session_state[thinking_key]
+                             st.rerun()
+
+                         # Show thinking process if toggled on
+                         if st.session_state[thinking_key]:
+                             with st.expander("Thinking Process", expanded=True):
+                                 st.write(thinking)
+         else:
+             st.info("Upload documents and start asking questions!")
+
+     # Chat input
+     st.markdown("### Ask a question about your documents")
+     with st.form(key="chat_form", clear_on_submit=True):
+         user_input = st.text_input("Type your question here...", key="user_question", label_visibility="collapsed")
+         cols = st.columns([6, 1])
+         with cols[0]:
+             submit_button = st.form_submit_button("Ask", use_container_width=True)
+
+     if submit_button and user_input:
+         # Add user message to chat history
+         st.session_state.messages.append(HumanMessage(content=user_input))
+
+         # Generate response if vectorstore exists
+         if "vectorstore" in st.session_state:
+             with st.spinner("Thinking..."):
+                 # Create RAG agent
+                 rag_agent = create_rag_agent(st.session_state.vectorstore)
+
+                 # Run the agent
+                 result = rag_agent.invoke({
+                     "messages": [HumanMessage(content=user_input)],
+                     "documents": [],
+                     "thinking": ""
+                 })
+
+                 # Store thinking process
+                 st.session_state.thinking_history.append(result["thinking"])
+
+                 # Add AI message to chat history
+                 st.session_state.messages.append(result["messages"][-1])
+         else:
+             # Record an empty entry so thinking_history stays aligned with the replies
+             st.session_state.thinking_history.append("")
+             st.session_state.messages.append(AIMessage(content="Please upload and process documents first."))
+
+         # Rerun to update the UI
+         st.rerun()
+
+ if __name__ == "__main__":
+     main()
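
Note on the thinking lookup: the chat display finds the thinking for the message at index `i` with `thinking_history[i//2]`, which is only correct if every user/assistant exchange contributes exactly one entry to `thinking_history`. The sketch below reproduces the lookup on plain lists to show what happens when an exchange records no entry. `pair_thinking` is a hypothetical helper written for this illustration, and roles are represented as strings rather than message objects.

```python
from typing import List, Optional, Tuple


def pair_thinking(roles: List[str], thinking_history: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Attach thinking_history[i // 2] to each assistant turn, mirroring the display loop."""
    paired: List[Tuple[str, Optional[str]]] = []
    for i, role in enumerate(roles):
        if role == "assistant" and i // 2 < len(thinking_history):
            # Empty strings are treated as "no thinking recorded"
            paired.append((role, thinking_history[i // 2] or None))
        else:
            paired.append((role, None))
    return paired


roles = ["user", "assistant", "user", "assistant"]

# One entry per exchange (empty for the first): the thinking lands on the second reply
aligned = pair_thinking(roles, ["", "step by step"])

# First exchange recorded nothing: the thinking shifts onto the first reply
shifted = pair_thinking(roles, ["step by step"])

print(aligned)
print(shifted)
```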
requirements.txt ADDED
@@ -0,0 +1,20 @@
+ langchain>=0.1.0
+ langchain-community>=0.0.13
+ langchain-together>=0.0.2
+ langchain-core>=0.1.10
+ langchain-text-splitters>=0.0.1
+ langchain-openai>=0.0.2
+ faiss-cpu
+ langchain-experimental>=0.0.37
+ langchain-groq>=0.1.1
+ langsmith>=0.0.69
+ sentence-transformers
+ pydantic>=2.5.2
+ streamlit>=1.29.0
+ streamlit-chat>=0.1.1
+ python-dotenv>=1.0.0
+ pypdf>=3.17.1
+ pillow>=10.1.0
+ pytesseract>=0.3.10
+ pdf2image>=1.16.3
+ langgraph>=0.0.19