Liss, Alex (NYC-HUG) committed on
Commit b11c09e · 1 Parent(s): 14d3158

updating docs and cleaning up old files

README.md CHANGED
@@ -4,9 +4,9 @@ app_file: gradio_app.py
 sdk: gradio
 sdk_version: 5.24.0
 ---
-# 49ers FanAI Hub - Streamlit Version
+# 49ers FanAI Hub - Gradio Version
 
-This is a Streamlit-based chatbot application that provides information about the San Francisco 49ers, players, games, and fans. The application uses LangChain, Neo4j, and Zep for memory management.
+This is a Gradio-based chatbot application that provides information about the San Francisco 49ers, players, games, and fans. The application uses LangChain, Neo4j, and Zep for memory management.
 
 ## Features
 
@@ -14,6 +14,7 @@ This is a Streamlit-based chatbot application that provides information about th
 - Integration with Neo4j graph database for structured data queries
 - Vector search for finding game summaries
 - Memory management with Zep for conversation history
+- Game Recap component that displays visual information for game-related queries
 
 ## Prerequisites
 
@@ -28,13 +29,12 @@ This is a Streamlit-based chatbot application that provides information about th
 2. Install the required packages:
 
 ```bash
-pip install -r requirements.txt
+pip install -r gradio_requirements.txt
 ```
 
 3. Set up your environment variables:
    - Copy `.env.example` to `.env` in the root directory
-   - Copy `data/.env.example` to `data/.env` in the data directory
-   - Fill in your API keys and credentials in both `.env` files
+   - Fill in your API keys and credentials
 
 Example `.env` file:
 ```
@@ -46,50 +46,48 @@ AURA_PASSWORD=your_neo4j_password
 ZEP_API_KEY=your_zep_api_key
 ```
 
-Alternatively, you can set up your credentials in the `.streamlit/secrets.toml` file:
-
-```toml
-# OpenAI API credentials
-OPENAI_API_KEY = "your_openai_api_key"
-OPENAI_MODEL = "gpt-4o" # Or your preferred model
-
-# Neo4j credentials
-NEO4J_URI = "your_neo4j_uri"
-NEO4J_USERNAME = "your_neo4j_username"
-NEO4J_PASSWORD = "your_neo4j_password"
-
-# Zep API key
-ZEP_API_KEY = "your_zep_api_key"
-```
-
-> **IMPORTANT**: Never commit your actual API keys or credentials to the repository. The `.env` files and `.streamlit/secrets.toml` are included in `.gitignore` to prevent accidental exposure of sensitive information.
+> **IMPORTANT**: Never commit your actual API keys or credentials to the repository. The `.env` files are included in `.gitignore` to prevent accidental exposure of sensitive information.
 
 ## Running the Application
 
-To run the Streamlit application:
+To run the Gradio application:
 
 ```bash
-streamlit run app.py
+python gradio_app.py
 ```
 
-This will start the Streamlit server and open the application in your default web browser.
+This will start the Gradio server and open the application in your default web browser.
 
 ## Project Structure
 
-- `app.py`: Main Streamlit application
-- `agent.py`: Agent implementation using LangChain
-- `graph.py`: Neo4j graph connection
-- `llm.py`: Language model configuration
-- `utils.py`: Utility functions
+- `gradio_app.py`: Main Gradio application
+- `gradio_agent.py`: Agent implementation using LangChain for Gradio
+- `gradio_graph.py`: Neo4j graph connection for Gradio
+- `gradio_llm.py`: Language model configuration for Gradio
+- `gradio_utils.py`: Utility functions for Gradio
 - `prompts.py`: System prompts for the agent
 - `tools/`: Specialized tools for the agent
   - `cypher.py`: Tool for Cypher queries to Neo4j
   - `vector.py`: Tool for vector search of game summaries
+  - `game_recap.py`: Tool for game recaps with visual component
+- `components/`: UI components
+  - `game_recap_component.py`: Game recap visual component
 - `data/`: Data files and scripts
-  - `create_embeddings.py`: Script to create embeddings for game summaries
-  - `upload_embeddings.py`: Script to upload embeddings to Neo4j
-  - `neo4j_ingestion.py`: Script to ingest data into Neo4j
-  - Various CSV files with 49ers data
+  - Various scripts and CSV files with 49ers data
+- `docs/`: Documentation
+  - `requirements.md`: Detailed product and technical requirements
+  - `game_recap_implementation_instructions.md`: Implementation details for the game recap feature
+
+## Game Recap Component
+
+The Game Recap feature provides visual information about games in addition to text-based summaries. When a user asks about a specific game, the application:
+
+1. Identifies the game being referenced
+2. Retrieves game data from the Neo4j database
+3. Displays a visual component with team logos, scores, and other game information
+4. Generates a text summary in the chat
+
+Note: As mentioned in `docs/game_recap_implementation_instructions.md`, this component is still a work in progress. Currently, it displays above the chat window rather than embedded within chat messages.
 
 ## Security Considerations
 
@@ -110,6 +108,6 @@ Before pushing to a public repository:
 - "Who are the current players on the 49ers roster?"
 - "Tell me about the 49ers game against the Chiefs"
 - "Which fan communities have the most members?"
-- "Who is the most popular player among fans?"
+- "Show me the recap of the 49ers vs. Vikings game"
 
 The application will use the appropriate tools to answer your questions based on the data in the Neo4j database.
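The four-step Game Recap flow described in the new README section can be sketched roughly as follows. This is a minimal illustration only: the in-memory `GAMES` list, its field names, and the HTML layout are assumptions standing in for the repository's actual Neo4j queries and `tools/game_recap.py`.

```python
# Sketch of the Game Recap flow: identify the game, fetch its data,
# build a visual component, and produce a text summary.
# The GAMES list and field names are illustrative assumptions.

GAMES = [
    {"home": "49ers", "away": "Vikings", "home_score": 24, "away_score": 17, "date": "2024-09-15"},
    {"home": "49ers", "away": "Chiefs", "home_score": 22, "away_score": 28, "date": "2024-10-20"},
]

def find_game(question: str):
    """Step 1: identify the game referenced in the user's question."""
    q = question.lower()
    for game in GAMES:
        if game["home"].lower() in q and game["away"].lower() in q:
            return game
    return None

def game_recap(question: str):
    """Steps 2-4: fetch game data, build the visual markup, summarize."""
    game = find_game(question)  # the real app would query Neo4j here
    if game is None:
        return {"html": "", "summary": "Sorry, I couldn't find that game."}
    winner = game["home"] if game["home_score"] > game["away_score"] else game["away"]
    html = (f"<div class='recap'><b>{game['home']}</b> {game['home_score']} - "
            f"{game['away_score']} <b>{game['away']}</b></div>")
    summary = (f"The {winner} won {max(game['home_score'], game['away_score'])}-"
               f"{min(game['home_score'], game['away_score'])} on {game['date']}.")
    return {"html": html, "summary": summary}

print(game_recap("Show me the recap of the 49ers vs. Vikings game")["summary"])
```

Returning the HTML and the summary separately mirrors the current behavior the README notes: the visual component renders above the chat window while the text summary lands in the chat itself.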
agent.py DELETED
@@ -1,191 +0,0 @@
-"""
-Agent implementation for 49ers chatbot using LangChain and Neo4j.
-"""
-import os
-from langchain.agents import AgentExecutor, create_react_agent
-from langchain_core.prompts import PromptTemplate
-from langchain.tools import Tool
-from langchain_core.runnables.history import RunnableWithMessageHistory
-from langchain_neo4j import Neo4jChatMessageHistory
-from langchain.callbacks.manager import CallbackManager
-from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
-
-from llm import llm
-from graph import graph
-from prompts import AGENT_SYSTEM_PROMPT, CHAT_SYSTEM_PROMPT
-from utils import get_session_id
-
-# Import tools
-from tools.cypher import cypher_qa_wrapper
-from tools.vector import get_game_summary
-from tools.game_recap import game_recap_qa  # Import the new game recap tool
-
-# Create a basic chat chain for general football discussion
-from langchain_core.prompts import ChatPromptTemplate
-from langchain.schema import StrOutputParser
-
-chat_prompt = ChatPromptTemplate.from_messages(
-    [
-        ("system", CHAT_SYSTEM_PROMPT),
-        ("human", "{input}"),
-    ]
-)
-
-# Create a non-streaming LLM for the agent
-from langchain_openai import ChatOpenAI
-import streamlit as st
-
-# Get API key from environment or Streamlit secrets
-def get_api_key(key_name):
-    """Get API key from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-OPENAI_API_KEY = get_api_key("OPENAI_API_KEY")
-OPENAI_MODEL = get_api_key("OPENAI_MODEL") or "gpt-4-turbo"
-
-agent_llm = ChatOpenAI(
-    openai_api_key=OPENAI_API_KEY,
-    model=OPENAI_MODEL,
-    temperature=0.1,
-    streaming=True  # Disable streaming for agent
-)
-
-movie_chat = chat_prompt | llm | StrOutputParser()
-
-def football_chat_wrapper(input_text):
-    """Wrapper function for football chat with error handling"""
-    try:
-        return {"output": movie_chat.invoke({"input": input_text})}
-    except Exception as e:
-        print(f"Error in football_chat: {str(e)}")
-        return {"output": "I apologize, but I encountered an error while processing your question. Could you please rephrase it?"}
-
-# Define the tools
-tools = [
-    Tool.from_function(
-        name="49ers Graph Search",
-        description="""Use for ANY specific 49ers-related queries about players, games, schedules, fans, or team info.
-Examples: "Who are the 49ers playing next week?", "Which players are defensive linemen?", "How many fan chapters are in California?"
-This is your PRIMARY tool for 49ers-specific information and should be your DEFAULT choice for most queries.""",
-        func=cypher_qa_wrapper
-    ),
-    Tool.from_function(
-        name="Game Recap",
-        description="""Use SPECIFICALLY for detailed game recaps or when users want to see visual information about a particular game.
-Examples: "Show me the recap of the 49ers vs Jets game", "I want to see the highlights from the last 49ers game", "What happened in the game against the Patriots?"
-Returns both a text summary AND visual game data that can be displayed to the user.
-PREFER this tool over Game Summary Search for any game-specific questions.""",
-        func=game_recap_qa
-    ),
-    Tool.from_function(
-        name="Game Summary Search",
-        description="""ONLY use for detailed game summaries or specific match results when Game Recap doesn't return good results.
-Examples: "What happened in the 49ers vs Seahawks game?", "Give me details about the last playoff game"
-Do NOT use for general schedule or player questions.""",
-        func=get_game_summary,
-    ),
-    Tool.from_function(
-        name="General Football Chat",
-        description="""ONLY use for general football discussion NOT specific to 49ers data.
-Examples: "How does the NFL draft work?", "What are the basic rules of football?"
-Do NOT use for any 49ers-specific questions.""",
-        func=football_chat_wrapper,
-    )
-]
-
-# Create the memory manager
-def get_memory(session_id):
-    """Get the chat history from Neo4j for the given session"""
-    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)
-
-# Create the agent prompt
-agent_prompt = PromptTemplate.from_template(AGENT_SYSTEM_PROMPT)
-
-# Create the agent with non-streaming LLM
-agent = create_react_agent(agent_llm, tools, agent_prompt)
-agent_executor = AgentExecutor(
-    agent=agent,
-    tools=tools,
-    verbose=True,
-    handle_parsing_errors=True,
-    max_iterations=5  # Limit the number of iterations to prevent infinite loops
-)
-
-# Create a chat agent with memory
-chat_agent = RunnableWithMessageHistory(
-    agent_executor,
-    get_memory,
-    input_messages_key="input",
-    history_messages_key="chat_history",
-)
-
-def generate_response(user_input, session_id=None):
-    """
-    Generate a response using the agent and tools
-
-    Args:
-        user_input (str): The user's message
-        session_id (str, optional): The session ID for memory
-
-    Returns:
-        dict: The full response object from the agent
-    """
-    print('Starting generate_response function...')
-    print(f'User input: {user_input}')
-    print(f'Session ID: {session_id}')
-
-    if not session_id:
-        session_id = get_session_id()
-        print(f'Generated new session ID: {session_id}')
-
-    # Add retry logic
-    max_retries = 3
-    for attempt in range(max_retries):
-        try:
-            print('Invoking chat_agent...')
-            response = chat_agent.invoke(
-                {"input": user_input},
-                {"configurable": {"session_id": session_id}},
-            )
-            print(f'Raw response from chat_agent: {response}')
-
-            # Extract the output and format it for Streamlit
-            if isinstance(response, dict):
-                print('Response is a dictionary, extracting fields...')
-                output = response.get('output', '')
-                intermediate_steps = response.get('intermediate_steps', [])
-                print(f'Extracted output: {output}')
-                print(f'Extracted intermediate steps: {intermediate_steps}')
-
-                # Create a formatted response
-                formatted_response = {
-                    "output": output,
-                    "intermediate_steps": intermediate_steps,
-                    "metadata": {
-                        "tools_used": [step[0].tool for step in intermediate_steps] if intermediate_steps else ["None"]
-                    }
-                }
-                print(f'Formatted response: {formatted_response}')
-                return formatted_response
-            else:
-                print('Response is not a dictionary, converting to string...')
-                return {
-                    "output": str(response),
-                    "intermediate_steps": [],
-                    "metadata": {"tools_used": ["None"]}
-                }
-
-        except Exception as e:
-            if attempt == max_retries - 1:  # Last attempt
-                print(f"Error in generate_response after {max_retries} attempts: {str(e)}")
-                return {
-                    "output": "I apologize, but I encountered an error while processing your request. Could you please try again?",
-                    "intermediate_steps": [],
-                    "metadata": {"tools_used": ["None"]}
-                }
-            print(f"Attempt {attempt + 1} failed, retrying...")
-            continue
app.py DELETED
@@ -1,151 +0,0 @@
-
-import os
-import uuid
-import streamlit as st
-from zep_cloud.client import AsyncZep
-from zep_cloud.types import Message
-import asyncio
-
-# Import our components
-from agent import generate_response
-from utils import get_session_id, get_user_id, write_message
-from graph import graph
-
-# Page configuration
-st.set_page_config(
-    page_title="49ers FanAI Hub",
-    page_icon="🏈",
-    layout="wide"
-)
-
-# Initialize Zep client
-zep_api_key = os.environ.get("ZEP_API_KEY")
-if not zep_api_key:
-    st.error("ZEP_API_KEY environment variable is not set. Please set it to use memory features.")
-    zep = None
-else:
-    zep = AsyncZep(api_key=zep_api_key)
-
-# Initialize session state for messages if it doesn't exist
-if "messages" not in st.session_state:
-    st.session_state.messages = []
-
-if "initialized" not in st.session_state:
-    st.session_state.initialized = False
-
-# Function to initialize the chat session
-async def initialize_chat():
-    """Set up the chat session when a user connects"""
-    try:
-        # Generate unique identifiers for the user and session
-        user_id = get_user_id()
-        session_id = get_session_id()
-
-        print(f"Starting new chat session. User ID: {user_id}, Session ID: {session_id}")
-
-        # Register user in Zep if available
-        if zep:
-            await zep.user.add(
-                user_id=user_id,
-                email="[email protected]",
-                first_name="User",
-                last_name="MovieFan",
-            )
-
-            # Start a new session in Zep
-            await zep.memory.add_session(
-                session_id=session_id,
-                user_id=user_id,
-            )
-
-        # Add welcome message to session state
-        welcome_message = """
-# 🏈 Welcome to the 49ers FanAI Hub!
-
-I can help you with:
-- Information about the 49ers, players, and fans
-- Finding 49ers games based on plot descriptions or themes
-- Discovering connections between people in the 49ers industry
-
-What would you like to know about today?
-"""
-        st.session_state.messages.append({"role": "assistant", "content": welcome_message})
-        st.session_state.initialized = True
-
-    except Exception as e:
-        import traceback
-        print(f"Error in initialize_chat: {str(e)}")
-        print(f"Traceback: {traceback.format_exc()}")
-        st.session_state.messages.append({
-            "role": "system",
-            "content": "There was an error starting the chat. Please refresh the page and try again."
-        })
-
-# Function to process user messages
-async def process_message(message):
-    """Process user messages and generate responses with the agent"""
-    print("Starting message processing...")
-    session_id = get_session_id()
-    print(f"Session ID: {session_id}")
-
-    try:
-        # Store user message in Zep memory if available
-        if zep:
-            print("Storing user message in Zep...")
-            await zep.memory.add(
-                session_id=session_id,
-                messages=[Message(role_type="user", content=message, role="user")]
-            )
-
-        # Process with the agent
-        print('Calling generate_response function...')
-        agent_response = generate_response(message, session_id)
-        print(f"Agent response received: {agent_response}")
-
-        # Extract the output and metadata
-        output = agent_response.get("output", "")
-        metadata = agent_response.get("metadata", {})
-        print(f"Extracted output: {output}")
-        print(f"Extracted metadata: {metadata}")
-
-        # Add assistant response to session state
-        st.session_state.messages.append({"role": "assistant", "content": output})
-
-        # Store assistant's response in Zep memory if available
-        if zep:
-            print("Storing assistant response in Zep...")
-            await zep.memory.add(
-                session_id=session_id,
-                messages=[Message(role_type="assistant", content=output, role="assistant")]
-            )
-            print("Assistant response stored in Zep")
-
-    except Exception as e:
-        import traceback
-        print(f"Error in process_message: {str(e)}")
-        print(f"Traceback: {traceback.format_exc()}")
-        st.session_state.messages.append({
-            "role": "assistant",
-            "content": "I apologize, but I encountered an error. Could you please try again?"
-        })
-
-# Initialize the chat session if not already initialized
-if not st.session_state.initialized:
-    asyncio.run(initialize_chat())
-
-# Display chat messages
-for message in st.session_state.messages:
-    write_message(message["role"], message["content"], save=False)
-
-# Chat input
-if prompt := st.chat_input("Ask me about the 49ers..."):
-    # Display user message and save to history
-    write_message("user", prompt)
-
-    # Process the message and display response
-    with st.spinner("Thinking..."):
-        # Process the message asynchronously
-        asyncio.run(process_message(prompt))
-
-    # Force a rerun to display the new message
-    st.rerun()
recreate_relationships.py → data/z_old/recreate_relationships.py RENAMED
File without changes
gradio_agent.py CHANGED
@@ -15,7 +15,7 @@ from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
 from gradio_llm import llm
 from gradio_graph import graph
 from prompts import AGENT_SYSTEM_PROMPT, CHAT_SYSTEM_PROMPT
-from utils import get_session_id
+from gradio_utils import get_session_id
 
 # Import tools
 from tools.cypher import cypher_qa_wrapper
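The rewritten import above points at `gradio_utils`, whose contents this commit does not show. If it mirrors the UUID scheme used by the deleted Streamlit `utils.py` below (an assumption), the helper might look roughly like this hypothetical sketch, with a module-level variable standing in for Streamlit's `session_state`:

```python
# Hypothetical sketch of gradio_utils.get_session_id; the real module is
# not shown in this diff. Assumes a uuid4-based ID cached at module level.
import uuid

_session_id = None  # stand-in for Streamlit's st.session_state

def get_session_id():
    """Return a stable per-process session ID, creating one on first use."""
    global _session_id
    if _session_id is None:
        _session_id = str(uuid.uuid4())
    return _session_id
```

Caching the ID means repeated calls within one process reuse the same conversation history key, matching how the agent threads `session_id` into `RunnableWithMessageHistory`.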
graph.py DELETED
@@ -1,51 +0,0 @@
-"""
-This module initializes the Neo4j graph connection using Streamlit secrets.
-"""
-
-import os
-import streamlit as st
-from dotenv import load_dotenv
-from langchain_neo4j import Neo4jGraph
-
-# Load environment variables
-load_dotenv()
-
-# Get Neo4j credentials from environment or Streamlit secrets
-def get_credential(key_name):
-    """Get credential from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-# Get Neo4j credentials
-AURA_CONNECTION_URI = get_credential("AURA_CONNECTION_URI") or get_credential("NEO4J_URI")
-AURA_USERNAME = get_credential("AURA_USERNAME") or get_credential("NEO4J_USERNAME")
-AURA_PASSWORD = get_credential("AURA_PASSWORD") or get_credential("NEO4J_PASSWORD")
-
-# Check if credentials are available
-if not all([AURA_CONNECTION_URI, AURA_USERNAME, AURA_PASSWORD]):
-    missing = []
-    if not AURA_CONNECTION_URI:
-        missing.append("AURA_CONNECTION_URI/NEO4J_URI")
-    if not AURA_USERNAME:
-        missing.append("AURA_USERNAME/NEO4J_USERNAME")
-    if not AURA_PASSWORD:
-        missing.append("AURA_PASSWORD/NEO4J_PASSWORD")
-
-    error_message = f"Missing Neo4j credentials: {', '.join(missing)}"
-    st.error(error_message)
-    raise ValueError(error_message)
-
-# Connect to Neo4j
-try:
-    graph = Neo4jGraph(
-        url=AURA_CONNECTION_URI,
-        username=AURA_USERNAME,
-        password=AURA_PASSWORD,
-    )
-except Exception as e:
-    error_message = f"Failed to connect to Neo4j: {str(e)}"
-    st.error(error_message)
-    raise Exception(error_message)
llm.py DELETED
@@ -1,44 +0,0 @@
-"""
-This module initializes the language model and embedding model using Streamlit secrets.
-"""
-
-import os
-import streamlit as st
-from dotenv import load_dotenv
-from langchain_openai import ChatOpenAI, OpenAIEmbeddings
-
-# Load environment variables
-load_dotenv()
-
-# Get API keys from environment or Streamlit secrets
-def get_api_key(key_name):
-    """Get API key from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-OPENAI_API_KEY = get_api_key("OPENAI_API_KEY")
-OPENAI_MODEL = get_api_key("OPENAI_MODEL") or "gpt-4-turbo"
-
-if not OPENAI_API_KEY:
-    st.error("OPENAI_API_KEY is not set in environment variables or Streamlit secrets.")
-    raise ValueError("OPENAI_API_KEY is not set")
-
-# Create the LLM with better error handling
-try:
-    llm = ChatOpenAI(
-        openai_api_key=OPENAI_API_KEY,
-        model=OPENAI_MODEL,
-        temperature=0.1,
-        streaming=True  # Enable streaming for better response handling
-    )
-
-    # Create the Embedding model
-    embeddings = OpenAIEmbeddings(
-        openai_api_key=OPENAI_API_KEY
-    )
-except Exception as e:
-    st.error(f"Failed to initialize OpenAI models: {str(e)}")
-    raise Exception(f"Failed to initialize OpenAI models: {str(e)}")
utils.py DELETED
@@ -1,81 +0,0 @@
-"""
-Utility functions for the chatbot application.
-"""
-
-import uuid
-import streamlit as st
-
-# Try to import get_script_run_ctx from different possible locations
-# based on Streamlit version
-try:
-    # For newer Streamlit versions
-    from streamlit.runtime.scriptrunner.script_run_context import get_script_run_ctx
-except ImportError:
-    try:
-        # For older Streamlit versions
-        from streamlit.script_run_context import get_script_run_ctx
-    except ImportError:
-        # Fallback if neither import works
-        def get_script_run_ctx():
-            return None
-
-def get_session_id():
-    """
-    Get the current session ID from Streamlit session state.
-    Creates a new ID if one doesn't exist.
-    """
-    if "session_id" not in st.session_state:
-        st.session_state.session_id = str(uuid.uuid4())
-    return st.session_state.session_id
-
-def get_user_id():
-    """
-    Get the current user ID from Streamlit session state.
-    Creates a new ID if one doesn't exist.
-    """
-    if "user_id" not in st.session_state:
-        st.session_state.user_id = str(uuid.uuid4())
-    return st.session_state.user_id
-
-def get_streamlit_session_id():
-    """
-    Get the Streamlit session ID from the script run context.
-    This is different from our application session ID.
-    Falls back to a generated UUID if the context is not available.
-    """
-    ctx = get_script_run_ctx()
-    if ctx is not None:
-        return ctx.session_id
-    return str(uuid.uuid4())  # Fallback to a generated UUID
-
-def format_source_documents(source_documents):
-    """
-    Format source documents for display in Streamlit.
-    """
-    if not source_documents:
-        return None
-
-    formatted_docs = []
-    for i, doc in enumerate(source_documents):
-        if hasattr(doc, 'metadata') and doc.metadata:
-            source = doc.metadata.get('source', 'Unknown')
-            formatted_docs.append(f"Source {i+1}: {source}")
-
-    return "\n".join(formatted_docs) if formatted_docs else None
-
-def write_message(role, content, save=True):
-    """
-    Helper function to write a message to the Streamlit UI and optionally save to session state.
-
-    Args:
-        role (str): The role of the message sender (e.g., "user", "assistant", "system")
-        content (str): The content of the message
-        save (bool): Whether to save the message to session state
-    """
-    # Append to session state if save is True
-    if save and "messages" in st.session_state:
-        st.session_state.messages.append({"role": role, "content": content})
-
-    # Write to UI
-    with st.chat_message(role):
-        st.markdown(content)