Liss, Alex (NYC-HUG) committed on
Commit b11c09e · 1 Parent(s): 14d3158

updating docs and cleaning up old files

README.md CHANGED
@@ -4,9 +4,9 @@ app_file: gradio_app.py
 sdk: gradio
 sdk_version: 5.24.0
 ---
-# 49ers FanAI Hub - Streamlit Version
+# 49ers FanAI Hub - Gradio Version
 
-This is a Streamlit-based chatbot application that provides information about the San Francisco 49ers, players, games, and fans. The application uses LangChain, Neo4j, and Zep for memory management.
+This is a Gradio-based chatbot application that provides information about the San Francisco 49ers, players, games, and fans. The application uses LangChain, Neo4j, and Zep for memory management.
 
 ## Features
 
@@ -14,6 +14,7 @@ This is a Streamlit-based chatbot application that provides information about th
 - Integration with Neo4j graph database for structured data queries
 - Vector search for finding game summaries
 - Memory management with Zep for conversation history
+- Game Recap component that displays visual information for game-related queries
 
 ## Prerequisites
 
@@ -28,13 +29,12 @@ This is a Streamlit-based chatbot application that provides information about th
 2. Install the required packages:
 
 ```bash
-pip install -r requirements.txt
+pip install -r gradio_requirements.txt
 ```
 
 3. Set up your environment variables:
    - Copy `.env.example` to `.env` in the root directory
-   - Copy `data/.env.example` to `data/.env` in the data directory
-   - Fill in your API keys and credentials in both `.env` files
+   - Fill in your API keys and credentials
 
 Example `.env` file:
 ```
@@ -46,50 +46,48 @@ AURA_PASSWORD=your_neo4j_password
 ZEP_API_KEY=your_zep_api_key
 ```
 
-Alternatively, you can set up your credentials in the `.streamlit/secrets.toml` file:
-
-```toml
-# OpenAI API credentials
-OPENAI_API_KEY = "your_openai_api_key"
-OPENAI_MODEL = "gpt-4o" # Or your preferred model
-
-# Neo4j credentials
-NEO4J_URI = "your_neo4j_uri"
-NEO4J_USERNAME = "your_neo4j_username"
-NEO4J_PASSWORD = "your_neo4j_password"
-
-# Zep API key
-ZEP_API_KEY = "your_zep_api_key"
-```
-
-> **IMPORTANT**: Never commit your actual API keys or credentials to the repository. The `.env` files and `.streamlit/secrets.toml` are included in `.gitignore` to prevent accidental exposure of sensitive information.
+> **IMPORTANT**: Never commit your actual API keys or credentials to the repository. The `.env` files are included in `.gitignore` to prevent accidental exposure of sensitive information.
 
 ## Running the Application
 
-To run the Streamlit application:
+To run the Gradio application:
 
 ```bash
-streamlit run app.py
+python gradio_app.py
 ```
 
-This will start the Streamlit server and open the application in your default web browser.
+This will start the Gradio server and open the application in your default web browser.
 
 ## Project Structure
 
-- `app.py`: Main Streamlit application
-- `agent.py`: Agent implementation using LangChain
-- `graph.py`: Neo4j graph connection
-- `llm.py`: Language model configuration
-- `utils.py`: Utility functions
+- `gradio_app.py`: Main Gradio application
+- `gradio_agent.py`: Agent implementation using LangChain for Gradio
+- `gradio_graph.py`: Neo4j graph connection for Gradio
+- `gradio_llm.py`: Language model configuration for Gradio
+- `gradio_utils.py`: Utility functions for Gradio
 - `prompts.py`: System prompts for the agent
 - `tools/`: Specialized tools for the agent
   - `cypher.py`: Tool for Cypher queries to Neo4j
   - `vector.py`: Tool for vector search of game summaries
+  - `game_recap.py`: Tool for game recaps with visual component
+- `components/`: UI components
+  - `game_recap_component.py`: Game recap visual component
 - `data/`: Data files and scripts
-  - `create_embeddings.py`: Script to create embeddings for game summaries
-  - `upload_embeddings.py`: Script to upload embeddings to Neo4j
-  - `neo4j_ingestion.py`: Script to ingest data into Neo4j
-  - Various CSV files with 49ers data
+  - Various scripts and CSV files with 49ers data
+- `docs/`: Documentation
+  - `requirements.md`: Detailed product and technical requirements
+  - `game_recap_implementation_instructions.md`: Implementation details for the game recap feature
+
+## Game Recap Component
+
+The Game Recap feature provides visual information about games in addition to text-based summaries. When a user asks about a specific game, the application:
+
+1. Identifies the game being referenced
+2. Retrieves game data from the Neo4j database
+3. Displays a visual component with team logos, scores, and other game information
+4. Generates a text summary in the chat
+
+Note: As mentioned in `docs/game_recap_implementation_instructions.md`, this component is still a work in progress. Currently, it displays above the chat window rather than embedded within chat messages.
 
 ## Security Considerations
 
@@ -110,6 +108,6 @@ Before pushing to a public repository:
 - "Who are the current players on the 49ers roster?"
 - "Tell me about the 49ers game against the Chiefs"
 - "Which fan communities have the most members?"
-- "Who is the most popular player among fans?"
+- "Show me the recap of the 49ers vs. Vikings game"
 
 The application will use the appropriate tools to answer your questions based on the data in the Neo4j database.
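The four-step Game Recap flow described in the new README section can be sketched roughly as follows. This is a minimal illustration only: the in-memory `GAMES` list, its field names, and the HTML layout are assumptions standing in for the repository's actual Neo4j queries and `tools/game_recap.py`.

```python
# Sketch of the Game Recap flow: identify the game, fetch its data,
# build a visual component, and produce a text summary.
# The GAMES list and field names are illustrative assumptions.

GAMES = [
    {"home": "49ers", "away": "Vikings", "home_score": 24, "away_score": 17, "date": "2024-09-15"},
    {"home": "49ers", "away": "Chiefs", "home_score": 22, "away_score": 28, "date": "2024-10-20"},
]

def find_game(question: str):
    """Step 1: identify the game referenced in the user's question."""
    q = question.lower()
    for game in GAMES:
        if game["home"].lower() in q and game["away"].lower() in q:
            return game
    return None

def game_recap(question: str):
    """Steps 2-4: fetch game data, build the visual markup, summarize."""
    game = find_game(question)  # the real app would query Neo4j here
    if game is None:
        return {"html": "", "summary": "Sorry, I couldn't find that game."}
    winner = game["home"] if game["home_score"] > game["away_score"] else game["away"]
    html = (f"<div class='recap'><b>{game['home']}</b> {game['home_score']} - "
            f"{game['away_score']} <b>{game['away']}</b></div>")
    summary = (f"The {winner} won {max(game['home_score'], game['away_score'])}-"
               f"{min(game['home_score'], game['away_score'])} on {game['date']}.")
    return {"html": html, "summary": summary}

print(game_recap("Show me the recap of the 49ers vs. Vikings game")["summary"])
```

Returning the HTML and the summary separately mirrors the current behavior the README notes: the visual component renders above the chat window while the text summary lands in the chat itself.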
agent.py DELETED
@@ -1,191 +0,0 @@
-"""
-Agent implementation for 49ers chatbot using LangChain and Neo4j.
-"""
-import os
-from langchain.agents import AgentExecutor, create_react_agent
-from langchain_core.prompts import PromptTemplate
-from langchain.tools import Tool
-from langchain_core.runnables.history import RunnableWithMessageHistory
-from langchain_neo4j import Neo4jChatMessageHistory
-from langchain.callbacks.manager import CallbackManager
-from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
-
-from llm import llm
-from graph import graph
-from prompts import AGENT_SYSTEM_PROMPT, CHAT_SYSTEM_PROMPT
-from utils import get_session_id
-
-# Import tools
-from tools.cypher import cypher_qa_wrapper
-from tools.vector import get_game_summary
-from tools.game_recap import game_recap_qa  # Import the new game recap tool
-
-# Create a basic chat chain for general football discussion
-from langchain_core.prompts import ChatPromptTemplate
-from langchain.schema import StrOutputParser
-
-chat_prompt = ChatPromptTemplate.from_messages(
-    [
-        ("system", CHAT_SYSTEM_PROMPT),
-        ("human", "{input}"),
-    ]
-)
-
-# Create a non-streaming LLM for the agent
-from langchain_openai import ChatOpenAI
-import streamlit as st
-
-# Get API key from environment or Streamlit secrets
-def get_api_key(key_name):
-    """Get API key from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-OPENAI_API_KEY = get_api_key("OPENAI_API_KEY")
-OPENAI_MODEL = get_api_key("OPENAI_MODEL") or "gpt-4-turbo"
-
-agent_llm = ChatOpenAI(
-    openai_api_key=OPENAI_API_KEY,
-    model=OPENAI_MODEL,
-    temperature=0.1,
-    streaming=True  # Disable streaming for agent
-)
-
-movie_chat = chat_prompt | llm | StrOutputParser()
-
-def football_chat_wrapper(input_text):
-    """Wrapper function for football chat with error handling"""
-    try:
-        return {"output": movie_chat.invoke({"input": input_text})}
-    except Exception as e:
-        print(f"Error in football_chat: {str(e)}")
-        return {"output": "I apologize, but I encountered an error while processing your question. Could you please rephrase it?"}
-
-# Define the tools
-tools = [
-    Tool.from_function(
-        name="49ers Graph Search",
-        description="""Use for ANY specific 49ers-related queries about players, games, schedules, fans, or team info.
-Examples: "Who are the 49ers playing next week?", "Which players are defensive linemen?", "How many fan chapters are in California?"
-This is your PRIMARY tool for 49ers-specific information and should be your DEFAULT choice for most queries.""",
-        func=cypher_qa_wrapper
-    ),
-    Tool.from_function(
-        name="Game Recap",
-        description="""Use SPECIFICALLY for detailed game recaps or when users want to see visual information about a particular game.
-Examples: "Show me the recap of the 49ers vs Jets game", "I want to see the highlights from the last 49ers game", "What happened in the game against the Patriots?"
-Returns both a text summary AND visual game data that can be displayed to the user.
-PREFER this tool over Game Summary Search for any game-specific questions.""",
-        func=game_recap_qa
-    ),
-    Tool.from_function(
-        name="Game Summary Search",
-        description="""ONLY use for detailed game summaries or specific match results when Game Recap doesn't return good results.
-Examples: "What happened in the 49ers vs Seahawks game?", "Give me details about the last playoff game"
-Do NOT use for general schedule or player questions.""",
-        func=get_game_summary,
-    ),
-    Tool.from_function(
-        name="General Football Chat",
-        description="""ONLY use for general football discussion NOT specific to 49ers data.
-Examples: "How does the NFL draft work?", "What are the basic rules of football?"
-Do NOT use for any 49ers-specific questions.""",
-        func=football_chat_wrapper,
-    )
-]
-
-# Create the memory manager
-def get_memory(session_id):
-    """Get the chat history from Neo4j for the given session"""
-    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)
-
-# Create the agent prompt
-agent_prompt = PromptTemplate.from_template(AGENT_SYSTEM_PROMPT)
-
-# Create the agent with non-streaming LLM
-agent = create_react_agent(agent_llm, tools, agent_prompt)
-agent_executor = AgentExecutor(
-    agent=agent,
-    tools=tools,
-    verbose=True,
-    handle_parsing_errors=True,
-    max_iterations=5  # Limit the number of iterations to prevent infinite loops
-)
-
-# Create a chat agent with memory
-chat_agent = RunnableWithMessageHistory(
-    agent_executor,
-    get_memory,
-    input_messages_key="input",
-    history_messages_key="chat_history",
-)
-
-def generate_response(user_input, session_id=None):
-    """
-    Generate a response using the agent and tools
-
-    Args:
-        user_input (str): The user's message
-        session_id (str, optional): The session ID for memory
-
-    Returns:
-        dict: The full response object from the agent
-    """
-    print('Starting generate_response function...')
-    print(f'User input: {user_input}')
-    print(f'Session ID: {session_id}')
-
-    if not session_id:
-        session_id = get_session_id()
-        print(f'Generated new session ID: {session_id}')
-
-    # Add retry logic
-    max_retries = 3
-    for attempt in range(max_retries):
-        try:
-            print('Invoking chat_agent...')
-            response = chat_agent.invoke(
-                {"input": user_input},
-                {"configurable": {"session_id": session_id}},
-            )
-            print(f'Raw response from chat_agent: {response}')
-
-            # Extract the output and format it for Streamlit
-            if isinstance(response, dict):
-                print('Response is a dictionary, extracting fields...')
-                output = response.get('output', '')
-                intermediate_steps = response.get('intermediate_steps', [])
-                print(f'Extracted output: {output}')
-                print(f'Extracted intermediate steps: {intermediate_steps}')
-
-                # Create a formatted response
-                formatted_response = {
-                    "output": output,
-                    "intermediate_steps": intermediate_steps,
-                    "metadata": {
-                        "tools_used": [step[0].tool for step in intermediate_steps] if intermediate_steps else ["None"]
-                    }
-                }
-                print(f'Formatted response: {formatted_response}')
-                return formatted_response
-            else:
-                print('Response is not a dictionary, converting to string...')
-                return {
-                    "output": str(response),
-                    "intermediate_steps": [],
-                    "metadata": {"tools_used": ["None"]}
-                }
-
-        except Exception as e:
-            if attempt == max_retries - 1:  # Last attempt
-                print(f"Error in generate_response after {max_retries} attempts: {str(e)}")
-                return {
-                    "output": "I apologize, but I encountered an error while processing your request. Could you please try again?",
-                    "intermediate_steps": [],
-                    "metadata": {"tools_used": ["None"]}
-                }
-            print(f"Attempt {attempt + 1} failed, retrying...")
-            continue
app.py DELETED
@@ -1,151 +0,0 @@
-
-import os
-import uuid
-import streamlit as st
-from zep_cloud.client import AsyncZep
-from zep_cloud.types import Message
-import asyncio
-
-# Import our components
-from agent import generate_response
-from utils import get_session_id, get_user_id, write_message
-from graph import graph
-
-# Page configuration
-st.set_page_config(
-    page_title="49ers FanAI Hub",
-    page_icon="🏈",
-    layout="wide"
-)
-
-# Initialize Zep client
-zep_api_key = os.environ.get("ZEP_API_KEY")
-if not zep_api_key:
-    st.error("ZEP_API_KEY environment variable is not set. Please set it to use memory features.")
-    zep = None
-else:
-    zep = AsyncZep(api_key=zep_api_key)
-
-# Initialize session state for messages if it doesn't exist
-if "messages" not in st.session_state:
-    st.session_state.messages = []
-
-if "initialized" not in st.session_state:
-    st.session_state.initialized = False
-
-# Function to initialize the chat session
-async def initialize_chat():
-    """Set up the chat session when a user connects"""
-    try:
-        # Generate unique identifiers for the user and session
-        user_id = get_user_id()
-        session_id = get_session_id()
-
-        print(f"Starting new chat session. User ID: {user_id}, Session ID: {session_id}")
-
-        # Register user in Zep if available
-        if zep:
-            await zep.user.add(
-                user_id=user_id,
-                email="[email protected]",
-                first_name="User",
-                last_name="MovieFan",
-            )
-
-            # Start a new session in Zep
-            await zep.memory.add_session(
-                session_id=session_id,
-                user_id=user_id,
-            )
-
-        # Add welcome message to session state
-        welcome_message = """
-# 🏈 Welcome to the 49ers FanAI Hub!
-
-I can help you with:
-- Information about the 49ers, players, and fans
-- Finding 49ers games based on plot descriptions or themes
-- Discovering connections between people in the 49ers industry
-
-What would you like to know about today?
-"""
-        st.session_state.messages.append({"role": "assistant", "content": welcome_message})
-        st.session_state.initialized = True
-
-    except Exception as e:
-        import traceback
-        print(f"Error in initialize_chat: {str(e)}")
-        print(f"Traceback: {traceback.format_exc()}")
-        st.session_state.messages.append({
-            "role": "system",
-            "content": "There was an error starting the chat. Please refresh the page and try again."
-        })
-
-# Function to process user messages
-async def process_message(message):
-    """Process user messages and generate responses with the agent"""
-    print("Starting message processing...")
-    session_id = get_session_id()
-    print(f"Session ID: {session_id}")
-
-    try:
-        # Store user message in Zep memory if available
-        if zep:
-            print("Storing user message in Zep...")
-            await zep.memory.add(
-                session_id=session_id,
-                messages=[Message(role_type="user", content=message, role="user")]
-            )
-
-        # Process with the agent
-        print('Calling generate_response function...')
-        agent_response = generate_response(message, session_id)
-        print(f"Agent response received: {agent_response}")
-
-        # Extract the output and metadata
-        output = agent_response.get("output", "")
-        metadata = agent_response.get("metadata", {})
-        print(f"Extracted output: {output}")
-        print(f"Extracted metadata: {metadata}")
-
-        # Add assistant response to session state
-        st.session_state.messages.append({"role": "assistant", "content": output})
-
-        # Store assistant's response in Zep memory if available
-        if zep:
-            print("Storing assistant response in Zep...")
-            await zep.memory.add(
-                session_id=session_id,
-                messages=[Message(role_type="assistant", content=output, role="assistant")]
-            )
-            print("Assistant response stored in Zep")
-
-    except Exception as e:
-        import traceback
-        print(f"Error in process_message: {str(e)}")
-        print(f"Traceback: {traceback.format_exc()}")
-        st.session_state.messages.append({
-            "role": "assistant",
-            "content": "I apologize, but I encountered an error. Could you please try again?"
-        })
-
-# Initialize the chat session if not already initialized
-if not st.session_state.initialized:
-    asyncio.run(initialize_chat())
-
-# Display chat messages
-for message in st.session_state.messages:
-    write_message(message["role"], message["content"], save=False)
-
-# Chat input
-if prompt := st.chat_input("Ask me about the 49ers..."):
-    # Display user message and save to history
-    write_message("user", prompt)
-
-    # Process the message and display response
-    with st.spinner("Thinking..."):
-        # Process the message asynchronously
-        asyncio.run(process_message(prompt))
-
-    # Force a rerun to display the new message
-    st.rerun()
recreate_relationships.py → data/z_old/recreate_relationships.py RENAMED
File without changes
gradio_agent.py CHANGED
@@ -15,7 +15,7 @@ from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
 from gradio_llm import llm
 from gradio_graph import graph
 from prompts import AGENT_SYSTEM_PROMPT, CHAT_SYSTEM_PROMPT
-from utils import get_session_id
+from gradio_utils import get_session_id
 
 # Import tools
 from tools.cypher import cypher_qa_wrapper
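The rewritten import above points at `gradio_utils`, whose contents this commit does not show. If it mirrors the UUID scheme used by the deleted Streamlit `utils.py` below (an assumption), the helper might look roughly like this hypothetical sketch, with a module-level variable standing in for Streamlit's `session_state`:

```python
# Hypothetical sketch of gradio_utils.get_session_id; the real module is
# not shown in this diff. Assumes a uuid4-based ID cached at module level.
import uuid

_session_id = None  # stand-in for Streamlit's st.session_state

def get_session_id():
    """Return a stable per-process session ID, creating one on first use."""
    global _session_id
    if _session_id is None:
        _session_id = str(uuid.uuid4())
    return _session_id
```

Caching the ID means repeated calls within one process reuse the same conversation history key, matching how the agent threads `session_id` into `RunnableWithMessageHistory`.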
graph.py DELETED
@@ -1,51 +0,0 @@
-"""
-This module initializes the Neo4j graph connection using Streamlit secrets.
-"""
-
-import os
-import streamlit as st
-from dotenv import load_dotenv
-from langchain_neo4j import Neo4jGraph
-
-# Load environment variables
-load_dotenv()
-
-# Get Neo4j credentials from environment or Streamlit secrets
-def get_credential(key_name):
-    """Get credential from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-# Get Neo4j credentials
-AURA_CONNECTION_URI = get_credential("AURA_CONNECTION_URI") or get_credential("NEO4J_URI")
-AURA_USERNAME = get_credential("AURA_USERNAME") or get_credential("NEO4J_USERNAME")
-AURA_PASSWORD = get_credential("AURA_PASSWORD") or get_credential("NEO4J_PASSWORD")
-
-# Check if credentials are available
-if not all([AURA_CONNECTION_URI, AURA_USERNAME, AURA_PASSWORD]):
-    missing = []
-    if not AURA_CONNECTION_URI:
-        missing.append("AURA_CONNECTION_URI/NEO4J_URI")
-    if not AURA_USERNAME:
-        missing.append("AURA_USERNAME/NEO4J_USERNAME")
-    if not AURA_PASSWORD:
-        missing.append("AURA_PASSWORD/NEO4J_PASSWORD")
-
-    error_message = f"Missing Neo4j credentials: {', '.join(missing)}"
-    st.error(error_message)
-    raise ValueError(error_message)
-
-# Connect to Neo4j
-try:
-    graph = Neo4jGraph(
-        url=AURA_CONNECTION_URI,
-        username=AURA_USERNAME,
-        password=AURA_PASSWORD,
-    )
-except Exception as e:
-    error_message = f"Failed to connect to Neo4j: {str(e)}"
-    st.error(error_message)
-    raise Exception(error_message)
llm.py DELETED
@@ -1,44 +0,0 @@
-"""
-This module initializes the language model and embedding model using Streamlit secrets.
-"""
-
-import os
-import streamlit as st
-from dotenv import load_dotenv
-from langchain_openai import ChatOpenAI, OpenAIEmbeddings
-
-# Load environment variables
-load_dotenv()
-
-# Get API keys from environment or Streamlit secrets
-def get_api_key(key_name):
-    """Get API key from environment or Streamlit secrets"""
-    # First try to get from Streamlit secrets
-    if hasattr(st, 'secrets') and key_name in st.secrets:
-        return st.secrets[key_name]
-    # Then try to get from environment
-    return os.environ.get(key_name)
-
-OPENAI_API_KEY = get_api_key("OPENAI_API_KEY")
-OPENAI_MODEL = get_api_key("OPENAI_MODEL") or "gpt-4-turbo"
-
-if not OPENAI_API_KEY:
-    st.error("OPENAI_API_KEY is not set in environment variables or Streamlit secrets.")
-    raise ValueError("OPENAI_API_KEY is not set")
-
-# Create the LLM with better error handling
-try:
-    llm = ChatOpenAI(
-        openai_api_key=OPENAI_API_KEY,
-        model=OPENAI_MODEL,
-        temperature=0.1,
-        streaming=True  # Enable streaming for better response handling
-    )
-
-    # Create the Embedding model
-    embeddings = OpenAIEmbeddings(
-        openai_api_key=OPENAI_API_KEY
-    )
-except Exception as e:
-    st.error(f"Failed to initialize OpenAI models: {str(e)}")
-    raise Exception(f"Failed to initialize OpenAI models: {str(e)}")
utils.py DELETED
@@ -1,81 +0,0 @@
-"""
-Utility functions for the chatbot application.
-"""
-
-import uuid
-import streamlit as st
-
-# Try to import get_script_run_ctx from different possible locations
-# based on Streamlit version
-try:
-    # For newer Streamlit versions
-    from streamlit.runtime.scriptrunner.script_run_context import get_script_run_ctx
-except ImportError:
-    try:
-        # For older Streamlit versions
-        from streamlit.script_run_context import get_script_run_ctx
-    except ImportError:
-        # Fallback if neither import works
-        def get_script_run_ctx():
-            return None
-
-def get_session_id():
-    """
-    Get the current session ID from Streamlit session state.
-    Creates a new ID if one doesn't exist.
-    """
-    if "session_id" not in st.session_state:
-        st.session_state.session_id = str(uuid.uuid4())
-    return st.session_state.session_id
-
-def get_user_id():
-    """
-    Get the current user ID from Streamlit session state.
-    Creates a new ID if one doesn't exist.
-    """
-    if "user_id" not in st.session_state:
-        st.session_state.user_id = str(uuid.uuid4())
-    return st.session_state.user_id
-
-def get_streamlit_session_id():
-    """
-    Get the Streamlit session ID from the script run context.
-    This is different from our application session ID.
-    Falls back to a generated UUID if the context is not available.
-    """
-    ctx = get_script_run_ctx()
-    if ctx is not None:
-        return ctx.session_id
-    return str(uuid.uuid4())  # Fallback to a generated UUID
-
-def format_source_documents(source_documents):
-    """
-    Format source documents for display in Streamlit.
-    """
-    if not source_documents:
-        return None
-
-    formatted_docs = []
-    for i, doc in enumerate(source_documents):
-        if hasattr(doc, 'metadata') and doc.metadata:
-            source = doc.metadata.get('source', 'Unknown')
-            formatted_docs.append(f"Source {i+1}: {source}")
-
-    return "\n".join(formatted_docs) if formatted_docs else None
-
-def write_message(role, content, save=True):
-    """
-    Helper function to write a message to the Streamlit UI and optionally save to session state.
-
-    Args:
-        role (str): The role of the message sender (e.g., "user", "assistant", "system")
-        content (str): The content of the message
-        save (bool): Whether to save the message to session state
-    """
-    # Append to session state if save is True
-    if save and "messages" in st.session_state:
-        st.session_state.messages.append({"role": role, "content": content})
-
-    # Write to UI
-    with st.chat_message(role):
-        st.markdown(content)