Krishnavamshithumma committed on
Commit c346a4d · verified · 1 Parent(s): 9126db3

Update app.py

Files changed (1)
  1. app.py +162 -67
app.py CHANGED
@@ -2,30 +2,30 @@ import gradio as gr
 from openai import OpenAI
 import speech_recognition as sr
 import os
-import io
 import tempfile
-import scipy.io.wavfile as wavfile
-import numpy as np
-import datetime

-# Load API key from environment
 OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
-OPENAI_STT_MODEL = "whisper-1"
-OPENAI_CHAT_MODEL = "gpt-3.5-turbo"
-OPENAI_TTS_MODEL = "tts-1"

 system_prompt = """
 You are a sophisticated AI voice bot representing Krishnavamshi Thumma. Your persona should be that of a highly skilled, professional, and engaging Generative AI and Data Engineering enthusiast. When responding to questions, embody the following detailed professional identity:
-
 **Professional Summary:**
 You possess 1.5+ years of hands-on experience in data pipelines, automation, and scalable solutions. Your expertise specifically extends to building cutting-edge Generative AI products, utilizing advanced techniques like Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) pipelines, various vector databases, and deep learning models. You are known for your proven ability to take full ownership, driving end-to-end AI product development from initial concept through to successful deployment. At your core, you are passionate about leveraging the intersection of AI and software engineering to solve real-world problems.
-
 **Current Role & Key Contributions (Wishkarma):**
 Currently, you are serving as a Data Engineer at Wishkarma in Hyderabad, India, a role you've held since May 2024. In this position, you have been instrumental in designing and optimizing scalable ETL pipelines primarily using Python and MongoDB, efficiently processing over 10,000 records daily while maintaining an impressive 99.9% data accuracy. You've developed and automated crucial data workflows utilizing Apache Airflow and AWS Lambda, which has significantly reduced manual intervention by 30% and boosted pipeline efficiency by 40%. A notable achievement includes leading the creation of a data refresh system based on source URLs, which streamlined product updates and saved over 20 hours per month. Furthermore, you implemented an innovative image-based product similarity search engine, leveraging CLIP-ViT-L/14, MongoDB Vector Search, and AWS S3. This initiative remarkably increased product discoverability by 35% and cut manual tagging efforts by 50%.
-
 **Previous Experience (DeepThought Growth Management System):**
 Prior to Wishkarma, you gained valuable experience as a Data Engineer Intern at DeepThought Growth Management System in Hyderabad, from November 2023 to June 2024. Here, you successfully processed more than 700 data records using MongoDB aggregations, ensuring 100% data integrity. Beyond technical tasks, you actively contributed to community and education by conducting over 50 technical workshops focused on data-driven decision-making, increasing engagement by 30%. You also mentored more than 400 students in crucial problem-solving frameworks like Design Thinking and MVP, which led to a 40% improvement in project completion rates.
-
 **Technical Skills:**
 Your robust technical skill set includes:
 * **Languages:** Python, SQL, JavaScript (Node.js)
@@ -35,129 +35,224 @@ system_prompt = """
 * **Cloud & Infrastructure:** AWS S3, GCP, Docker, Terraform
 * **Version Control:** Git, GitHub
 * **Other Relevant Skills:** Data Structures & Algorithms (DSA), Content-Based Retrieval, Prompt Engineering
-
 **Key Projects & Expertise Areas:**
-
 * **Conversational Product Discovery Assistant for Construction Materials:** You developed a sophisticated, multi-turn, agentic AI chatbot using LangGraph and GPT-4. This assistant helps users find construction products through conversational refinement, integrating MongoDB vector search for both direct and problem-based user intents (e.g., "My door fell off"). It features a memory-managed LangGraph flow with dynamic follow-up generation and a real-time Streamlit UI for product querying, refinement, and browsing.
 * **Image-Based Product Similarity Search Engine:** Built using Node.js, Xenova Transformers (CLIP), MongoDB Vector Search, and AWS S3, this GenAI-powered engine utilizes CLIP-ViT-L-14 for image similarity search. It implements MongoDB Atlas vector search with cosine similarity for over 1 lakh images, supports flexible inputs (uploads/URLs), filters results by similarity score (>80%), and handles the full-stack pipeline including image upload, embedding, storage, and retrieval.
 * **Intelligent Manual Assistant - PDF Q&A Chatbot:** This personal project, developed with Python, LangChain, OpenAI, FAISS, and Streamlit, is a Retrieval-Augmented Generation (RAG) chatbot designed to query product manuals using natural language. It leverages LangChain's Conversational Retrieval Chain with OpenAI LLMs for contextual multi-turn Q&A and integrates FAISS for vector similarity search using OpenAI embeddings of PDF chunks. The full pipeline covers PDF parsing with PyPDF2, embedding, retrieval, LLM response, and a Streamlit UI for real-time document upload and persistent chat.
 * **AI-Powered Marketing Report Generator:** A freelance GenAI MVP built with FastAPI, OpenAI GPT-4o, Pandas, and BeautifulSoup. You designed a modular FastAPI backend to generate structured marketing reports using GPT-4o, aggregating CSV datasets (sales, customers, platform) and real-time scraped data. You also built endpoints for session initiation, report generation, and campaign regeneration, crafting structured prompts for accurate, markdown-rich AI responses.
-
 **Education:**
 You are a Computer Science graduate from Neil Gogte Institute of Technology, where you achieved a CGPA of 7.5/10, graduating in June 2023.
-
 Your responses should be professional, articulate, and engaging, maintaining a concise length of 2-3 sentences max for most questions about your background, experience, projects, and skills.
 """

 # Initialize the SpeechRecognition Recognizer
 r = sr.Recognizer()

 def transcribe_audio_and_chat(audio_tuple, history):
     if not OPENAI_API_KEY:
-        raise gr.Error("❌ OpenAI API key not found.")

     if history is None:
         history = []

-    audio_output_path = None  # Default output path to return (for TTS playback)

     if audio_tuple is None:
-        return history, history, None, None

     samplerate, audio_np_array = audio_tuple

     try:
         if audio_np_array.dtype != np.int16:
-            audio_np_array = audio_np_array.astype(np.int16)

-        # Save user audio temporarily for Whisper
-        with tempfile.NamedTemporaryFile(suffix=".wav", delete=True) as temp_audio_file:
-            wavfile.write(temp_audio_file.name, samplerate, audio_np_array)
-            temp_audio_file.flush()

-            # Use OpenAI Whisper STT
-            client = OpenAI(api_key=OPENAI_API_KEY)
-            with open(temp_audio_file.name, "rb") as file:
-                transcript = client.audio.transcriptions.create(
-                    model=OPENAI_STT_MODEL,
-                    file=file
-                )
-            user_input = transcript.text

-        print(f"Transcribed Input: {user_input}")

-        # Chat Completion
         messages_for_openai = [{"role": "system", "content": system_prompt}] + history
         messages_for_openai.append({"role": "user", "content": user_input})

-        chat_response = client.chat.completions.create(
             model=OPENAI_CHAT_MODEL,
             messages=messages_for_openai,
             temperature=0.7
         )

-        bot_reply = chat_response.choices[0].message.content
-
         history.append({"role": "user", "content": user_input})
         history.append({"role": "assistant", "content": bot_reply})
-
-        # Generate TTS audio and save to temp file
         try:
             tts_response = client.audio.speech.create(
                 model=OPENAI_TTS_MODEL,
                 voice="alloy",
                 input=bot_reply,
-                response_format="mp3"
             )
-
-            with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tts_temp_file:
-                for chunk in tts_response.iter_bytes():
-                    tts_temp_file.write(chunk)
-                audio_output_path = tts_temp_file.name
-
         except Exception as tts_e:
-            print(f"Error in TTS: {tts_e}")
-            history.append({"role": "assistant", "content": bot_reply + " (Voice failed to generate.)"})
-            audio_output_path = None

-        return history, history, None, audio_output_path

     except Exception as e:
-        print(f"Unexpected error: {e}")
-        raise gr.Error(f"❌ Unexpected error: {str(e)}")

-# Gradio UI
 with gr.Blocks(title="Voice Bot: Krishnavamshi Thumma") as demo:
     gr.Markdown("## 🎙️ Krishnavamshi Thumma - Voice Assistant")

-    chatbot = gr.Chatbot(type="messages", height=400)
-    state = gr.State([])
-
     audio_input = gr.Audio(
         sources=["microphone"],
-        type="numpy",
         label="Speak your message here",
-        streaming=False
     )

-    # Output as file path (so Gradio can handle autoplay correctly)
     tts_audio_output = gr.Audio(
         label="Bot's Voice Response",
-        type="filepath",
-        autoplay=True
     )

     clear_btn = gr.Button("🗑️ Clear Chat")

     audio_input.change(
         fn=transcribe_audio_and_chat,
-        inputs=[audio_input, state],
-        outputs=[chatbot, state, audio_input, tts_audio_output]
     )

-    clear_btn.click(lambda: ([], [], None), None, [chatbot, state, tts_audio_output])
-
-demo.launch()
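The handler's contract is the same before and after this commit: `gr.Audio(type="numpy")` hands over a `(samplerate, int16 ndarray)` tuple, and `transcribe_audio_and_chat` returns `(chat_history, state, cleared_audio_input, tts_audio)`. A minimal sketch for exercising that contract outside the UI — assuming `OPENAI_API_KEY` is exported and `demo.launch()` has been moved behind an `if __name__ == "__main__":` guard so the module can be imported; a bare sine tone will normally land in the updated handler's "could not understand the audio" branch, which is enough to verify the plumbing:

import numpy as np
from app import transcribe_audio_and_chat  # assumes demo.launch() sits behind a __main__ guard

# One second of a 440 Hz tone, shaped like gr.Audio(type="numpy") output
samplerate = 16000
t = np.linspace(0, 1.0, samplerate, endpoint=False)
tone = (0.3 * 32767 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16)

chat, state, cleared_input, tts_audio = transcribe_audio_and_chat((samplerate, tone), [])
print(chat)                   # the apology reply, since a bare tone carries no speech
assert cleared_input is None  # the microphone component is cleared each turn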
 
 from openai import OpenAI
 import speech_recognition as sr
 import os
+import io  # For in-memory file handling
+import scipy.io.wavfile as wavfile  # For writing/reading WAV data to/from an in-memory file
+import numpy as np  # To handle the audio array
+import datetime  # For logging timestamps (not directly used in this version)
 import tempfile

+# --- Fetch API Key from Environment Variable ---
+# This is the SECURE way to handle API keys in Hugging Face Spaces.
+# You MUST set an environment variable named OPENAI_API_KEY in your Space's settings.
 OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
+
+# --- Define the OpenAI Models to use ---
+OPENAI_STT_MODEL = "whisper-1"  # Whisper for Speech-to-Text
+OPENAI_CHAT_MODEL = "gpt-3.5-turbo"  # GPT-3.5 Turbo for chat responses
+OPENAI_TTS_MODEL = "tts-1"  # TTS-1 for Text-to-Speech

 system_prompt = """
 You are a sophisticated AI voice bot representing Krishnavamshi Thumma. Your persona should be that of a highly skilled, professional, and engaging Generative AI and Data Engineering enthusiast. When responding to questions, embody the following detailed professional identity:
 **Professional Summary:**
 You possess 1.5+ years of hands-on experience in data pipelines, automation, and scalable solutions. Your expertise specifically extends to building cutting-edge Generative AI products, utilizing advanced techniques like Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) pipelines, various vector databases, and deep learning models. You are known for your proven ability to take full ownership, driving end-to-end AI product development from initial concept through to successful deployment. At your core, you are passionate about leveraging the intersection of AI and software engineering to solve real-world problems.
 **Current Role & Key Contributions (Wishkarma):**
 Currently, you are serving as a Data Engineer at Wishkarma in Hyderabad, India, a role you've held since May 2024. In this position, you have been instrumental in designing and optimizing scalable ETL pipelines primarily using Python and MongoDB, efficiently processing over 10,000 records daily while maintaining an impressive 99.9% data accuracy. You've developed and automated crucial data workflows utilizing Apache Airflow and AWS Lambda, which has significantly reduced manual intervention by 30% and boosted pipeline efficiency by 40%. A notable achievement includes leading the creation of a data refresh system based on source URLs, which streamlined product updates and saved over 20 hours per month. Furthermore, you implemented an innovative image-based product similarity search engine, leveraging CLIP-ViT-L/14, MongoDB Vector Search, and AWS S3. This initiative remarkably increased product discoverability by 35% and cut manual tagging efforts by 50%.
 **Previous Experience (DeepThought Growth Management System):**
 Prior to Wishkarma, you gained valuable experience as a Data Engineer Intern at DeepThought Growth Management System in Hyderabad, from November 2023 to June 2024. Here, you successfully processed more than 700 data records using MongoDB aggregations, ensuring 100% data integrity. Beyond technical tasks, you actively contributed to community and education by conducting over 50 technical workshops focused on data-driven decision-making, increasing engagement by 30%. You also mentored more than 400 students in crucial problem-solving frameworks like Design Thinking and MVP, which led to a 40% improvement in project completion rates.
 **Technical Skills:**
 Your robust technical skill set includes:
 * **Languages:** Python, SQL, JavaScript (Node.js)

 * **Cloud & Infrastructure:** AWS S3, GCP, Docker, Terraform
 * **Version Control:** Git, GitHub
 * **Other Relevant Skills:** Data Structures & Algorithms (DSA), Content-Based Retrieval, Prompt Engineering
 **Key Projects & Expertise Areas:**
 * **Conversational Product Discovery Assistant for Construction Materials:** You developed a sophisticated, multi-turn, agentic AI chatbot using LangGraph and GPT-4. This assistant helps users find construction products through conversational refinement, integrating MongoDB vector search for both direct and problem-based user intents (e.g., "My door fell off"). It features a memory-managed LangGraph flow with dynamic follow-up generation and a real-time Streamlit UI for product querying, refinement, and browsing.
 * **Image-Based Product Similarity Search Engine:** Built using Node.js, Xenova Transformers (CLIP), MongoDB Vector Search, and AWS S3, this GenAI-powered engine utilizes CLIP-ViT-L-14 for image similarity search. It implements MongoDB Atlas vector search with cosine similarity for over 1 lakh images, supports flexible inputs (uploads/URLs), filters results by similarity score (>80%), and handles the full-stack pipeline including image upload, embedding, storage, and retrieval.
 * **Intelligent Manual Assistant - PDF Q&A Chatbot:** This personal project, developed with Python, LangChain, OpenAI, FAISS, and Streamlit, is a Retrieval-Augmented Generation (RAG) chatbot designed to query product manuals using natural language. It leverages LangChain's Conversational Retrieval Chain with OpenAI LLMs for contextual multi-turn Q&A and integrates FAISS for vector similarity search using OpenAI embeddings of PDF chunks. The full pipeline covers PDF parsing with PyPDF2, embedding, retrieval, LLM response, and a Streamlit UI for real-time document upload and persistent chat.
 * **AI-Powered Marketing Report Generator:** A freelance GenAI MVP built with FastAPI, OpenAI GPT-4o, Pandas, and BeautifulSoup. You designed a modular FastAPI backend to generate structured marketing reports using GPT-4o, aggregating CSV datasets (sales, customers, platform) and real-time scraped data. You also built endpoints for session initiation, report generation, and campaign regeneration, crafting structured prompts for accurate, markdown-rich AI responses.
 **Education:**
 You are a Computer Science graduate from Neil Gogte Institute of Technology, where you achieved a CGPA of 7.5/10, graduating in June 2023.
 Your responses should be professional, articulate, and engaging, maintaining a concise length of 2-3 sentences max for most questions about your background, experience, projects, and skills.
 """

 # Initialize the SpeechRecognition Recognizer
 r = sr.Recognizer()

+# Modified function to accept audio as a numpy array and samplerate
 def transcribe_audio_and_chat(audio_tuple, history):
+    # Check if the API key is available in the environment
     if not OPENAI_API_KEY:
+        raise gr.Error("❌ OpenAI API key not found. Please set OPENAI_API_KEY as a Space Secret.")

+    # Handle cases where history might be None (defensive programming)
     if history is None:
         history = []

+    # Initialize tts_audio_output to None, so we always return it
+    tts_audio_output = None

     if audio_tuple is None:
+        # If no audio was captured, return the history unchanged and
+        # clear the audio input/output components
+        return history, history, None, None

     samplerate, audio_np_array = audio_tuple

     try:
+        # Convert the NumPy array to a format speech_recognition can handle (in-memory WAV)
         if audio_np_array.dtype != np.int16:
+            audio_np_array = audio_np_array.astype(np.int16)

+        wav_byte_io = io.BytesIO()
+        wavfile.write(wav_byte_io, samplerate, audio_np_array)
+        wav_byte_io.seek(0)  # Rewind to the beginning of the BytesIO object

+        # Create an AudioFile object from the in-memory WAV data
+        with sr.AudioFile(wav_byte_io) as source:
+            audio_data = r.record(source)  # read the entire audio file

+        # --- Speech-to-Text (STT) ---
+        try:
+            client = OpenAI(api_key=OPENAI_API_KEY)
+            # OpenAI's transcription endpoint expects a named file-like object,
+            # so for simplicity this version transcribes with speech_recognition's
+            # built-in recognize_google (Google's STT, usually free tier) instead
+            # of OPENAI_STT_MODEL. See the commented-out Whisper variant below.
+            user_input = r.recognize_google(audio_data)
+
+            # If you wanted to use OpenAI's Whisper ASR here, you'd do:
+            # audio_file_for_whisper = io.BytesIO(wav_byte_io.getvalue())  # fresh stream for Whisper
+            # audio_file_for_whisper.name = "audio.wav"  # the Whisper API needs a filename for BytesIO
+            # transcript = client.audio.transcriptions.create(
+            #     model=OPENAI_STT_MODEL,  # "whisper-1"
+            #     file=audio_file_for_whisper
+            # )
+            # user_input = transcript.text
+
+            print(f"Transcribed User Input: {user_input}")  # For debugging purposes
+
+        except sr.UnknownValueError:
+            history.append({"role": "assistant", "content": "Sorry, I could not understand the audio. Please try again."})
+            return history, history, None, tts_audio_output  # Still clear inputs/outputs
+        except sr.RequestError as e:
+            history.append({"role": "assistant", "content": f"Could not request results from Speech Recognition service; {e}"})
+            return history, history, None, tts_audio_output  # Still clear inputs/outputs
+
+        # --- Chat Completion ---
+        client = OpenAI(api_key=OPENAI_API_KEY)

         messages_for_openai = [{"role": "system", "content": system_prompt}] + history
         messages_for_openai.append({"role": "user", "content": user_input})

+        response = client.chat.completions.create(
             model=OPENAI_CHAT_MODEL,
             messages=messages_for_openai,
             temperature=0.7
         )

+        bot_reply = response.choices[0].message.content
+
         history.append({"role": "user", "content": user_input})
         history.append({"role": "assistant", "content": bot_reply})
+
+        # --- Text-to-Speech (TTS) ---
         try:
             tts_response = client.audio.speech.create(
                 model=OPENAI_TTS_MODEL,
                 voice="alloy",
                 input=bot_reply,
+                response_format="wav"
             )
+
+            with tempfile.NamedTemporaryFile(suffix=".wav", delete=True) as temp_wav:
+                for chunk in tts_response.iter_bytes(chunk_size=4096):
+                    temp_wav.write(chunk)
+                temp_wav.flush()  # Ensure all data is written
+
+                # Read the saved file back into numpy array format
+                tts_samplerate, tts_numpy_array = wavfile.read(temp_wav.name)
+                tts_audio_output = (tts_samplerate, tts_numpy_array)
+
         except Exception as tts_e:
+            print(f"Error generating TTS: {tts_e}")
+            tts_audio_output = None
+            history.append({"role": "assistant", "content": "(Voice generation failed.)"})

+        # Return all required outputs: chatbot history, state history, cleared audio input, TTS audio
+        return history, history, None, tts_audio_output

     except Exception as e:
+        print(f"An unexpected error occurred: {e}")
+        # Surface the error in the UI
+        raise gr.Error(f"❌ An unexpected error occurred: {str(e)}")

+
+# --- Gradio UI setup ---
 with gr.Blocks(title="Voice Bot: Krishnavamshi Thumma") as demo:
     gr.Markdown("## 🎙️ Krishnavamshi Thumma - Voice Assistant")

+    gr.HTML("""
+    <style>
+        #chatBox {
+            height: 60vh;
+            overflow-y: auto;
+            padding: 20px;
+            border-radius: 10px;
+            background: #f9f9f9;
+            margin-bottom: 20px;
+        }
+        .message {
+            margin: 10px 0;
+            padding: 12px;
+            border-radius: 8px;
+        }
+        .user {
+            background: #e3f2fd;
+            text-align: right;
+        }
+        .bot {
+            background: #f5f5f5;
+        }
+        #audioInputComponent {
+            margin-top: 20px;
+        }
+        .key-status {  /* Not strictly needed anymore, kept for style consistency */
+            padding: 5px;
+            margin-top: 5px;
+            border-radius: 4px;
+        }
+        .success {
+            background: #d4edda;
+            color: #155724;
+        }
+        .error {
+            background: #f8d7da;
+            color: #721c24;
+        }
+    </style>
+    """)
+
+    # --- UI Components ---
+    # Chatbot component to display messages
+    chatbot = gr.Chatbot(elem_id="chatBox", type="messages", height=400)
+    # State component to maintain chat history in OpenAI's message format
+    state = gr.State([])
+
+    # Audio input component for microphone recording
     audio_input = gr.Audio(
         sources=["microphone"],
+        type="numpy",  # Receive audio as a (samplerate, numpy_array) tuple
         label="Speak your message here",
+        elem_id="audioInputComponent",
+        streaming=False  # Process audio after the full recording
     )

+    # New: audio output component for TTS playback
     tts_audio_output = gr.Audio(
         label="Bot's Voice Response",
+        type="numpy",  # Expects (samplerate, numpy_array) for playback
+        autoplay=True,  # Automatically play the audio
+        waveform_options={
+            "skip_length": 0,
+            "waveform_color": "#2196F3",
+            "waveform_progress_color": "#4CAF50",
+            # 'cursor_color' and 'unfilled_waveform_color' are not standard options here
+        }
     )

     clear_btn = gr.Button("🗑️ Clear Chat")

+    # Event handler for audio input change
     audio_input.change(
         fn=transcribe_audio_and_chat,
+        inputs=[audio_input, state],  # api_key is now read from the environment
+        # Outputs: 1. chatbot display, 2. state (updated history),
+        # 3. audio_input (cleared), 4. tts_audio_output (bot's voice)
+        outputs=[chatbot, state, audio_input, tts_audio_output]
     )

+    # Placeholder for future client-side JavaScript
+    gr.HTML("""
+    <script>
+        // You can add other useful JS here if needed in the future
+    </script>
+    """)

+    # Clear button: resets chatbot, state, and the TTS audio output
+    clear_btn.click(lambda: ([], [], None), None, [chatbot, state, tts_audio_output])

+demo.launch()
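The commented-out Whisper path in the updated handler hinges on one non-obvious detail: the OpenAI transcription endpoint infers the audio container from a filename, so an in-memory BytesIO buffer must be given a `name` attribute before being passed as `file=`. A self-contained sketch of that variant, assuming the same `(samplerate, int16 array)` input shape and an `OPENAI_API_KEY` in the environment:

import io

import numpy as np
import scipy.io.wavfile as wavfile
from openai import OpenAI

def transcribe_with_whisper(samplerate: int, audio_np_array: np.ndarray) -> str:
    # Write the capture to an in-memory WAV buffer, exactly as the handler does
    buf = io.BytesIO()
    wavfile.write(buf, samplerate, audio_np_array.astype(np.int16))
    buf.seek(0)
    buf.name = "audio.wav"  # lets the API detect the WAV container

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    transcript = client.audio.transcriptions.create(model="whisper-1", file=buf)
    return transcript.text

Swapping this in for `r.recognize_google(audio_data)` would keep transcription on OpenAI's side (one vendor for STT, chat, and TTS) at the cost of metered Whisper usage instead of the free Google tier.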