---
title: 🧠Deep🐍Research🌐Evaluator
emoji: 🧠🐍🌐
colorFrom: red
colorTo: purple
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: true
license: mit
short_description: Deep Research Evaluator for Long Horizon Learning Tasks
---

# 🎵 🎶 🎸 🎹 🎺 🎷 🥁 🎻

Deep Research Evaluator is a conceptual AI system designed to analyze and synthesize information from extensive research literature, such as arXiv papers, to learn about specific topics and generate code applicable to long-horizon tasks in AI. This involves understanding complex subjects, identifying relevant methodologies, and implementing solutions that require planning and execution over extended sequences.

# Project Architecture

- 📂 **Root Folder**
  - **app.py** (🤖 *Streamlit App*)
    - Main entry point for your Streamlit application.
  - **requirements.txt** (📋 *Dependencies*)
    - Lists all the Python packages needed to run the app.
  - 📂 **mycomponent** (🔧 *HTML Component*)
    - A subdirectory containing your custom Streamlit component code.
    - **\_\_init\_\_.py** (🐍 *Python Init*)
      - Tells Python this folder is a module/package.
    - **index.html** (🌐 *Custom HTML*)
      - Front-end HTML/JS/CSS for the custom component.

```mermaid
flowchart TB
    A["📂 Root Folder"] --> B["app.py 🤖<br>(Streamlit App)"]
    A --> C["requirements.txt 📋<br>(Dependencies)"]
    A --> D["📂 mycomponent 🔧<br>(HTML Component)"]
    D --> E["__init__.py 🐍<br>(Python Init)"]
    D --> F["index.html 🌐<br>(Custom HTML)"]
```

---

**Usage Flow**:

1. You run `streamlit run app.py`.
2. **app.py** imports **mycomponent** to load the HTML from **index.html**.
3. **requirements.txt** ensures needed dependencies are installed.
4. The **\_\_init\_\_.py** file ensures the custom component folder is recognized as a Python package.

**Notes**:

- **app.py** hosts your Streamlit logic and references the **mycomponent**.
- **index.html** supplies the interface for any front-end custom elements.
- **requirements.txt** keeps the environment consistent.

# Features

- 🎯 **Core Configuration & Setup**
  - Configures the Streamlit page with the title "🚲TalkingAIResearcher🏆", sets layout, sidebar states, and environment variables.
- 🔑 **API Setup & Clients**
  - Loads and initializes OpenAI, Anthropic, and HuggingFace clients from environment variables and secrets.
- 📝 **Session State Management**
  - Manages conversation history, transcripts, file editing states, and model selections.
- 🧠 **get_high_info_terms()**
  - Extracts top words/bigrams from a text by counting frequency and filtering out stop words.
- 🏷️ **clean_text_for_filename()**
  - Sanitizes text for valid filenames by removing special characters, short/unhelpful words, and truncating length.
- 📄 **generate_filename()**
  - Creates an intelligent filename based on timestamps, high-info terms, and a snippet of the content (removing duplicates).
- 💾 **create_file()**
  - Saves prompt + response content to a file, using generate_filename().
- 🔗 **get_download_link()**
  - Generates base64-encoded download links for .md, audio, or zip files for inline downloading.
- 🎤 **clean_for_speech()**
  - Strips out line breaks, URLs, and symbols to create more readable text for TTS.
- 🎙️ **edge_tts_generate_audio()**
  - Asynchronously generates audio files (e.g., .mp3) using Edge TTS.
- 🔊 **speak_with_edge_tts()**
  - A wrapper function for the async TTS call, allowing direct usage in synchronous code.
- 🎵 **play_and_download_audio()**
  - Embeds an audio player in Streamlit and provides a download link for that audio file.
- 💿 **save_qa_with_audio()**
  - Stores Q&A content in a markdown file and generates TTS audio for the question + answer.
- 📰 **parse_arxiv_refs()**
  - Parses the multi-line markdown references returned by the ArXiv RAG pipeline into structured paper objects.
- 🔗 **create_paper_links_md()**
  - Builds a minimal markdown page with numbered links to each paper's ArXiv URL.
- 📑 **create_paper_audio_files()**
  - Processes each parsed paper, generating TTS audio and embedding base64 download links.
- 📚 **display_papers()**
  - Shows papers in the main area with a scrolling marquee (via streamlit_marquee), plus expanders for details and audio.
- 🗂 **display_papers_in_sidebar()**
  - Mirrors the paper listing in the sidebar with expanders, letting users quickly play or download paper audio.
- 📂 **display_file_history_in_sidebar()**
  - Enumerates all local .md, .mp3, .wav files in descending modification time, letting users preview and download them.
- 📦 **create_zip_of_files()**
  - Bundles multiple files (markdown + audio) into a zip with an automatically shortened filename.
- 🔍 **perform_ai_lookup()**
  - The main function to:
    - Query Anthropic (Claude)
    - Call an ArXiv RAG pipeline
    - Generate Q&A audio
    - Parse and render the resulting papers
- 🎧 **process_voice_input()**
  - Receives user text/voice input, then calls perform_ai_lookup() to produce an audio summary and final Q&A file.
- 🎬 **main()**
  - Orchestrates the entire application flow:
    - Renders tabs for Voice Input, Media Gallery, ArXiv search, and Editor
    - Shows file history in the sidebar
    - Manages marquee settings and final UI layout
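
The idea behind `get_high_info_terms()` (count word and bigram frequencies, then drop stop words) can be sketched roughly as below. This is an illustration under assumptions, not the code in app.py: the function name `high_info_terms`, the tiny `STOP_WORDS` set, the `top_n` parameter, and the tokenizing regex are all hypothetical.

```python
import re
from collections import Counter

# Assumed, deliberately small stop-word set; the app may use a different list.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "on", "for", "to", "is", "are", "with"}


def high_info_terms(text: str, top_n: int = 5) -> list[str]:
    """Return the most frequent non-stop-word words and bigrams in `text`."""
    # Lowercase the text and keep runs of letters and digits as tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Drop stop words and very short tokens.
    kept = [w for w in words if w not in STOP_WORDS and len(w) > 2]
    # Build bigrams from adjacent surviving tokens.
    bigrams = [f"{a} {b}" for a, b in zip(kept, kept[1:])]
    # Count words and bigrams together and keep the most frequent ones.
    counts = Counter(kept + bigrams)
    return [term for term, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "Long horizon planning for agents. Long horizon planning is hard."
    print(high_info_terms(sample))
```

Counting words and bigrams in one `Counter` keeps the sketch short; a real implementation might weight bigrams differently or use a fuller stop-word list.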
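
A base64 download link of the kind `get_download_link()` produces can be built by embedding the file bytes in a `data:` URI. The sketch below is an assumption about the general technique, not the app's actual function: the name `make_download_link`, its parameters, and the default MIME type are hypothetical.

```python
import base64
import html


def make_download_link(data: bytes, filename: str, mime: str = "text/markdown") -> str:
    """Return an HTML anchor that downloads `data` as `filename` via a data: URI."""
    # Base64-encode the raw bytes so they can be embedded in the href attribute.
    b64 = base64.b64encode(data).decode("ascii")
    # Escape the filename so quotes or angle brackets cannot break the markup.
    safe_name = html.escape(filename, quote=True)
    return f'<a href="data:{mime};base64,{b64}" download="{safe_name}">Download {safe_name}</a>'


if __name__ == "__main__":
    print(make_download_link(b"# Notes\n", "notes.md"))
```

In a Streamlit app, a string like this would typically be rendered with `st.markdown(link, unsafe_allow_html=True)`. Because the whole file is inlined in the page, this approach suits small files better than large audio or zip archives.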
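
The relationship between `edge_tts_generate_audio()` (async) and `speak_with_edge_tts()` (sync wrapper) follows a common pattern: a blocking function that runs a coroutine to completion. The sketch below shows only that pattern. To stay runnable without network access or the Edge TTS package, it uses a hypothetical stand-in coroutine, `generate_audio_stub`, in place of the real TTS call; `speak_sync` is likewise an illustrative name.

```python
import asyncio


async def generate_audio_stub(text: str) -> bytes:
    """Hypothetical stand-in for the async TTS call; returns fake audio bytes."""
    # Yield control to the event loop once, as a real network call would.
    await asyncio.sleep(0)
    return text.encode("utf-8")


def speak_sync(text: str) -> bytes:
    """Blocking wrapper: start an event loop, run the coroutine, return its result."""
    return asyncio.run(generate_audio_stub(text))


if __name__ == "__main__":
    audio = speak_sync("hello")
    print(len(audio), "bytes")
```

`asyncio.run()` raises `RuntimeError` when called from a thread that already has a running event loop, so a wrapper like this assumes it is called from plain synchronous code.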