Update app.py
Browse files
app.py
CHANGED
@@ -221,7 +221,7 @@ with gr.Blocks(css=css) as demo :
|
|
221 |
<p style="text-align: center;">
|
222 |
An experiment to try to achieve what I call "soft video understanding" with open-source available models. <br />
|
223 |
We use moondream1 to caption extracted frames, salmonn to analyze extracted audio, then send visual and audio details to Zephyr, which is instructed to summarize what it understood.
|
224 |
-
The instructions prompt is available for further discussion with the Community.
|
225 |
</p>
|
226 |
""")
|
227 |
with gr.Row():
|
@@ -233,7 +233,7 @@ with gr.Blocks(css=css) as demo :
|
|
233 |
)
|
234 |
gr.Examples(
|
235 |
examples = ["examples/train.mp4"],
|
236 |
-
inputs = [
|
237 |
)
|
238 |
with gr.Column():
|
239 |
video_cut = gr.Video(label="Video cut to 10 seconds", interactive=False)
|
|
|
221 |
<p style="text-align: center;">
|
222 |
An experiment to try to achieve what I call "soft video understanding" with open-source available models. <br />
|
223 |
We use moondream1 to caption extracted frames, salmonn to analyze extracted audio, then send visual and audio details to Zephyr, which is instructed to summarize what it understood.
|
224 |
+
The instructions prompt is available for further discussion with the Community. Note that audio is crucial for better overall understanding. Videos longer than 10 seconds will be cut.
|
225 |
</p>
|
226 |
""")
|
227 |
with gr.Row():
|
|
|
233 |
)
|
234 |
gr.Examples(
|
235 |
examples = ["examples/train.mp4"],
|
236 |
+
inputs = [video_cut]
|
237 |
)
|
238 |
with gr.Column():
|
239 |
video_cut = gr.Video(label="Video cut to 10 seconds", interactive=False)
|