multimodalart (HF Staff) committed · verified
Commit 14495d1 · Parent: 7272785

feat: Enable MCP

Hello! This is an automated PR adding MCP compatibility to your AI App 🤖.

![image.png](https://cdn-uploads.huggingface.co/production/uploads/624bebf604abc7ebb01789af/HQQK38I_MDXLDMYDYBq8H.png)

This PR introduces two improvements:
1. Adds docstrings to the functions in the app file that are directly connected to the Gradio UI, for the downstream LLM to use.
2. Enables the Model Context Protocol (MCP) server by adding `mcp_server=True` to the `.launch()` call (sketched below).

No other logic has been changed. Please review and merge if it looks good!

Learn more about MCP compatibility in Spaces here: https://huggingface.co/changelog/add-compatible-spaces-to-your-mcp-tools
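For reference, here is a minimal, hypothetical sketch of the pattern this PR applies, assuming Gradio with MCP support installed (`pip install "gradio[mcp]"`). The `letter_count` function is purely illustrative and not part of this Space: with `mcp_server=True`, Gradio exposes each UI-connected function as an MCP tool and uses its docstring as the tool description.

```python
# Minimal sketch of the pattern this PR applies. The function and names
# here are illustrative, not taken from this Space's app.py.
import gradio as gr

def letter_count(word: str, letter: str) -> int:
    """
    Count occurrences of a letter in a word.

    Args:
        word: The input word to search
        letter: The single letter to count

    Returns:
        Number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())

demo = gr.Interface(fn=letter_count, inputs=["text", "text"], outputs="number")

if __name__ == "__main__":
    # mcp_server=True serves the same app as an MCP server, so MCP clients
    # can call letter_count as a tool; the docstring becomes its description.
    demo.launch(mcp_server=True)
```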

Files changed (1):
  app.py (+22 −1)
app.py CHANGED

```diff
@@ -19,6 +19,18 @@ generation_config_multi = model_multi.default_generation_config
 # MULTI-TURN INFERENCE FUNCTION
 # ---------------------------------
 def multi_turn_chat(user_input, audio_file, history, current_audio):
+    """
+    Handle multi-turn chat interactions with audio context.
+
+    Args:
+        user_input: The user's text message/question about the audio
+        audio_file: New audio file path if uploaded, otherwise None
+        history: List of previous conversation turns as (user_msg, bot_response) tuples
+        current_audio: Path to the currently active audio file in the conversation
+
+    Returns:
+        Tuple of (updated_chatbot_display, updated_history, updated_current_audio)
+    """
     try:
         if audio_file is not None:
             current_audio = audio_file  # Update state if a new file is uploaded
@@ -37,6 +49,15 @@ def multi_turn_chat(user_input, audio_file, history, current_audio):
         history.append((user_input, f"❌ Error: {str(e)}"))
     return history, history, current_audio
 def speech_prompt_infer(audio_prompt_file):
+    """
+    Process speech/audio input and generate a text response.
+
+    Args:
+        audio_prompt_file: Path to the audio file containing the user's speech prompt
+
+    Returns:
+        String containing the model's text response or error message
+    """
     try:
         sound = llava.Sound(audio_prompt_file)
         full_prompt = "<sound>"
@@ -197,4 +218,4 @@ To enable these capabilities, we propose several large-scale training datasets c
 # Launch App
 # -----------------------
 if __name__ == "__main__":
-    demo.launch(share=True)
+    demo.launch(share=True, mcp_server=True)
```
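After merging, one way to sanity-check the change is to list the tools the Space exposes. Below is a sketch using the official `mcp` Python SDK (`pip install mcp`), assuming Gradio's default SSE endpoint path `/gradio_api/mcp/sse`; the Space URL is a placeholder, not this Space's actual address.

```python
# Sketch: list the tools a Gradio MCP server exposes, using the official
# `mcp` Python SDK. The endpoint path is Gradio's documented default;
# the Space URL below is a placeholder.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def list_tools(url: str) -> None:
    async with sse_client(url) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                # Tool descriptions come from the docstrings this PR adds.
                print(f"{tool.name}: {tool.description}")

if __name__ == "__main__":
    asyncio.run(list_tools("https://<your-space>.hf.space/gradio_api/mcp/sse"))
```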