Visual-language assistant with nanoLLaVA and OpenVINO
nanoLLaVA is a "small but mighty" 1B vision-language model designed to run efficiently on edge devices. It uses SigLIP-400m as the image encoder and Qwen1.5-0.5B as the LLM. In this tutorial, we show how to convert and run the nanoLLaVA model using OpenVINO. Additionally, we optimize the model using NNCF.
Notebook contents
The tutorial consists of the following steps:
- Install requirements
- Download PyTorch model
- Convert model to OpenVINO Intermediate Representation (IR)
- Compress model weights using NNCF
- Prepare Inference Pipeline
- Run OpenVINO model inference
- Launch Interactive demo
In this demonstration, you'll create an interactive chatbot that can answer questions about the content of a provided image.
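To give a rough intuition for the "Compress model weights" step, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. It is not NNCF's implementation, and the helper names (quantize_row, dequantize_row) and the sample weights are hypothetical, chosen only for illustration. The idea is that each row of floating-point weights is stored as small integers plus a single scale, which shrinks the model at the cost of a small rounding error.

```python
# Illustrative sketch only: symmetric per-row 8-bit weight quantization.
# NNCF's real algorithms are more elaborate; names and data here are hypothetical.

def quantize_row(row, bits=8):
    """Map a row of floats to signed integers that share one scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(w) for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale

def dequantize_row(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]

# A tiny made-up weight matrix standing in for one layer of a model.
weights = [
    [0.12, -0.50, 0.33, 0.07],
    [1.50, -0.02, 0.75, -1.10],
]

compressed = [quantize_row(row) for row in weights]
restored = [dequantize_row(q, scale) for q, scale in compressed]

# Largest absolute difference between original and restored weights.
max_error = max(
    abs(a - b)
    for orig_row, rest_row in zip(weights, restored)
    for a, b in zip(orig_row, rest_row)
)
print("max reconstruction error:", max_error)
```

Each weight is off by at most half of its row's scale after the round trip, which is why this kind of compression usually has only a small effect on model outputs.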
Installation instructions
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.