Commit d8d8ca1 by awacke1 · verified · 1 Parent(s): 128aefc

Update README.md

Files changed (1)
  1. README.md +43 -154
README.md CHANGED
@@ -55,193 +55,82 @@ flowchart TB
- Key Topics and Related Papers:
-
- Long-Horizon Task Planning in Robotics:
-
- "MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model"
- Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song
- This paper introduces a method that decomposes complex tasks at multiple levels to enhance planning capabilities using open-source large language models.
- ARXIV
-
- "ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning"
- Authors: Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, Lei Ma
- The study presents a framework that improves LLM-based planning through an iterative self-refinement process, enhancing feasibility and correctness in task plans.
- ARXIV
-
- Skill-Based Reinforcement Learning:
-
- "Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks"
- Authors: Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, Zongqing Lu
- This research focuses on building multi-task agents in open-world environments by learning basic skills and planning over them to accomplish long-horizon tasks efficiently.
- ARXIV
-
- "SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks"
- Authors: Yongyan Wen, Siyuan Li, Rongchang Zuo, Lei Yuan, Hangyu Mao, Peng Liu
- The paper proposes a framework that integrates a differentiable decision tree within the high-level policy to generate skill embeddings, enhancing explainability in decision-making for complex tasks.
- ARXIV
-
- Neuro-Symbolic Approaches:
-
- "Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation"
- Authors: Jie-Jing Shao, Hao-Ran Hao, Xiao-Wen Yang, Yu-Feng Li
- This work introduces a framework that combines data-driven learning and symbolic-based reasoning to enable long-horizon planning through abductive imitation learning.
- ARXIV
-
- "CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning"
- Authors: [Authors not specified]
- The study presents a method that utilizes large language models to translate constraints into formal specifications, facilitating long-horizon task and motion planning.
- ARXIV
-
- Evaluation Frameworks for AI Models:
-
- "ASI: Accuracy-Stability Index for Evaluating Deep Learning Models"
- Authors: Wei Dai, Daniel Berleant
- The paper introduces the Accuracy-Stability Index (ASI), a quantitative measure that incorporates both accuracy and stability for assessing deep learning models.
- ARXIV
-
- "Benchmarks for Deep Off-Policy Evaluation"
- Authors: Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
- This research provides a collection of policies that, in conjunction with existing offline datasets, can be used for benchmarking off-policy evaluation in deep learning.
- ARXIV
-
- These topics and papers contribute to the development of AI systems capable of understanding research literature and applying the acquired knowledge to complex, long-horizon tasks, thereby advancing the field of artificial intelligence.
-
- ---
-
-
- Features:
 
  🎯 Core Configuration & Setup
-
- Configures Streamlit page with title "🚲BikeAI🏆 Claude/GPT Research"
-

  🔑 API Setup & Clients
-
- Initializes OpenAI, Anthropic, and HuggingFace API clients with environment variables
-

  📝 Session State Management
-
- Manages conversation history, transcripts, file editing states, and model selections
-

  🧠 get_high_info_terms()
-
- Extracts meaningful keywords from text while filtering common stop words
-

  🏷️ clean_text_for_filename()
-
- Sanitizes text to create valid filenames by removing special characters
-

  📄 generate_filename()
-
- Creates intelligent filenames based on content and timestamps
-

  💾 create_file()
-
- Saves prompt and response content to files with smart naming
-

  🔗 get_download_link()
-
- Generates base64-encoded download links for files
-

  🎤 clean_for_speech()
 
- Prepares text for speech synthesis by removing special characters
-
-
- 🗣️ speech_synthesis_html()
-
-
- Creates HTML for browser-based speech synthesis
-
-
- 🔊 edge_tts_generate_audio()


- Generates MP3 audio files using Edge TTS


- 🎵 speak_with_edge_tts()


- Wrapper for Edge TTS audio generation


- 🎧 play_and_download_audio()


- Creates audio player interface with download option
-
-
- 📸 process_image()
-
-
- Analyzes images using GPT-4V
-
-
- 🎙️ process_audio()
-
-
- Transcribes audio using Whisper
-
-
- 🎥 process_video()
-
-
- Extracts frames from video files
-
-
- 🤖 process_video_with_gpt()
-
-
- Analyzes video frames using GPT-4V
-
-
- 📚 parse_arxiv_refs()
-
-
- Parses research paper references into structured format
-

  🔍 perform_ai_lookup()

-
- Searches and processes arXiv papers with audio summaries
-
-
- 📁 create_zip_of_files()
-
-
- Bundles multiple files into a zip with smart naming
-
-
- 📂 load_files_for_sidebar()
-
-
- Organizes files by timestamp for sidebar display
-
-
- 🏷️ extract_keywords_from_md()
-
-
- Pulls keywords from markdown files for organization
-
-
- 📊 display_file_manager_sidebar()
-
-
- Creates interactive sidebar for file management
-

  🎬 main()


- Orchestrates overall application flow and UI components
 
+ Features
  🎯 Core Configuration & Setup
+ Configures the Streamlit page with title “🚲TalkingAIResearcher🏆”, sets layout, sidebar states, and environment variables.

  🔑 API Setup & Clients
+ Loads and initializes OpenAI, Anthropic, and HuggingFace clients from environment variables and secrets.

  📝 Session State Management
+ Manages conversation history, transcripts, file editing states, and model selections.

  🧠 get_high_info_terms()
+ Extracts top words/bigrams from a text by counting frequency and filtering out stop words.

  🏷️ clean_text_for_filename()
+ Sanitizes text for valid filenames by removing special characters, short/unhelpful words, and truncating length.

  📄 generate_filename()
+ Creates an intelligent filename based on timestamps, high-info terms, and a snippet of the content (removing duplicates).

  💾 create_file()
+ Saves prompt + response content to a file, using generate_filename().

  🔗 get_download_link()
+ Generates base64-encoded download links for .md, audio, or zip files for inline downloading.

  🎤 clean_for_speech()
+ Strips out line breaks, URLs, and symbols to create more readable text for TTS.
 
+ 🎙️ edge_tts_generate_audio()
+ Asynchronously generates audio files (e.g., .mp3) using Edge TTS.

+ 🔊 speak_with_edge_tts()
+ A wrapper function for the async TTS call, allowing direct usage in synchronous code.

+ 🎵 play_and_download_audio()
+ Embeds an audio player in Streamlit and provides a download link for that audio file.

+ 💿 save_qa_with_audio()
+ Stores Q&A content in a markdown file and generates TTS audio for the question + answer.

+ 📰 parse_arxiv_refs()
+ Parses the multi-line markdown references returned by the ArXiv RAG pipeline into structured paper objects.

+ 🔗 create_paper_links_md()
+ Builds a minimal markdown page with numbered links to each paper’s ArXiv URL.

+ 📑 create_paper_audio_files()
+ Processes each parsed paper, generating TTS audio and embedding base64 download links.

+ 📚 display_papers()
+ Shows papers in the main area with a scrolling marquee (via streamlit_marquee), plus expanders for details and audio.

+ 🗂 display_papers_in_sidebar()
+ Mirrors the paper listing in the sidebar with expanders, letting users quickly play or download paper audio.

+ 📂 display_file_history_in_sidebar()
+ Enumerates all local .md, .mp3, .wav files in descending modification time, letting users preview and download them.

+ 📦 create_zip_of_files()
+ Bundles multiple files (markdown + audio) into a zip with an automatically shortened filename.
 
  🔍 perform_ai_lookup()
+ The main function to:

+ Query Anthropic (Claude)
+ Call an ArXiv RAG pipeline
+ Generate Q&A audio
+ Parse and render the resulting papers
+ 🎧 process_voice_input()
+ Receives user text/voice input, then calls perform_ai_lookup() to produce an audio summary and final Q&A file.

  🎬 main()
+ Orchestrates the entire application flow:

+ Renders tabs for Voice Input, Media Gallery, ArXiv search, and Editor
+ Shows file history in the sidebar
+ Manages marquee settings and final UI layout
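
The updated README describes get_high_info_terms() as counting word and bigram frequencies while filtering stop words, and get_download_link() as producing base64-encoded download links. The app's actual code is not part of this commit, so the following is only a minimal sketch of those two ideas using the Python standard library. The stop-word list, the signatures, the `top_n` and `mime` parameters, and the exact HTML are assumptions made for illustration.

```python
from __future__ import annotations

import base64
import re
from collections import Counter

# Hypothetical stop-word list; the app's real list is not shown in the diff.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "on", "is", "with", "by"}


def get_high_info_terms(text: str, top_n: int = 5) -> list[str]:
    """Return the most frequent words and bigrams in `text`, ignoring stop words."""
    # Lowercase, keep alphanumeric runs, then drop stop words.
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    # Bigrams are built from words that are adjacent after stop-word removal.
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    counts = Counter(words + bigrams)
    return [term for term, _ in counts.most_common(top_n)]


def get_download_link(data: bytes, filename: str, mime: str = "text/markdown") -> str:
    """Return an HTML anchor that embeds `data` as a base64 data URI."""
    b64 = base64.b64encode(data).decode("ascii")
    return f'<a href="data:{mime};base64,{b64}" download="{filename}">Download {filename}</a>'


if __name__ == "__main__":
    print(get_high_info_terms("long horizon planning and long horizon control", top_n=3))
    print(get_download_link(b"# Notes", "notes.md"))
```

In a Streamlit app, an anchor like this would typically be rendered with `st.markdown(link, unsafe_allow_html=True)`. Embedding the file in the link avoids a separate download endpoint, at the cost of inflating the page by roughly a third of the file size.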