
Image Description with Qwen2-VL-7B

This Hugging Face Space uses the Qwen2-VL-7B vision-language model to generate descriptions of uploaded images.

About

Upload any image and get:

  • A basic description
  • A detailed analysis
  • A technical assessment

The app loads Qwen2-VL-7B with 4-bit quantization, which reduces GPU memory use during inference.

Usage

  1. Upload an image or use one of the example images
  2. Click "Analyze Image"
  3. View the three types of descriptions generated by the model
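
One way to produce the three description types is to send the same image to the model three times, each with a different prompt. The sketch below only illustrates that call shape. The prompt wording, the `analyze_image` helper, the `fake_describe` stand-in, and the `data_temp/example.jpg` file name are all hypothetical and are not taken from the Space's code.

```python
from typing import Callable, Dict

# Hypothetical prompt wording; the Space's real prompts may differ.
PROMPTS: Dict[str, str] = {
    "basic": "Describe this image in one or two sentences.",
    "detailed": "Describe this image in detail, including objects, layout, and colors.",
    "technical": "Assess this image technically: composition, lighting, and quality.",
}


def analyze_image(image_path: str, describe: Callable[[str, str], str]) -> Dict[str, str]:
    """Run one model call per description type and collect the results.

    `describe(image_path, prompt)` is a stand-in for the actual model call.
    """
    return {name: describe(image_path, prompt) for name, prompt in PROMPTS.items()}


def fake_describe(image_path: str, prompt: str) -> str:
    # Placeholder that only echoes its inputs; it does not run a model.
    return f"[{image_path}] {prompt}"


if __name__ == "__main__":
    # Hypothetical file name inside the data_temp folder.
    results = analyze_image("data_temp/example.jpg", fake_describe)
    for name, text in results.items():
        print(f"{name}: {text}")
```

In a real app, `fake_describe` would be replaced by a function that runs the vision-language model on the image and prompt.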

Examples

The Space includes sample images in the data_temp folder that you can use to test the model.

Technical Details

  • Model: Qwen2-VL-7B
  • Framework: Gradio UI + Flask API backend
  • Quantization: 4-bit for efficient inference
  • GPU: A10G recommended
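
To see why 4-bit quantization helps, here is a rough estimate of weight memory at different precisions. It is a back-of-the-envelope sketch: it assumes about 7 billion parameters (read off the "7B" in the model name) and counts weights only. It ignores activations, the KV cache, and quantization overhead. The `weight_memory_gib` helper is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights only, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


# Assumption: parameter count taken from the "7B" in the model name.
NUM_PARAMS = 7e9

if __name__ == "__main__":
    for bits in (16, 4):
        gib = weight_memory_gib(NUM_PARAMS, bits)
        print(f"{bits}-bit weights: ~{gib:.1f} GiB")
```

Under these assumptions, 4-bit weights take about a quarter of the memory of 16-bit weights, leaving more of the GPU free for activations and image inputs.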

Credits