Spaces:

mknolan
/

cursor_slides_internvl2

Paused

App Files Files Community

cursor_slides_internvl2 / README-HF.md

mknolan's picture

Upload InternVL2 implementation

e59dc66 verified about 1 month ago

|

history blame contribute delete

899 Bytes

Image Description with Qwen2-VL-7B

This Hugging Face Space uses the powerful Qwen2-VL-7B vision language model to generate detailed descriptions of images.

About

Upload any image and get:

A basic description
A detailed analysis
A technical assessment

The app uses the Qwen2-VL-7B model with 4-bit quantization to provide efficient and high-quality image analysis.

Usage

Upload an image or use one of the example images
Click "Analyze Image"
View the three types of descriptions generated by the model

Examples

The space includes sample images in the data_temp folder that you can use to test the model.

Technical Details

Model: Qwen2-VL-7B
Framework: Gradio UI + Flask API backend
Quantization: 4-bit for efficient inference
GPU: A10G recommended

Credits

Qwen2-VL-7B model by Qwen team