---
title: Text2svg Demo App
emoji: πŸš€
colorFrom: blue
colorTo: yellow
sdk: docker
pinned: false
app_port: 8501
---

# Drawing with LLM 🎨

A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.

Try the demo app at this [link](https://huggingface.co/spaces/Timxjl/text2svg-demo-app).

## Overview

This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:
1. **ML Model** - Uses Stable Diffusion to generate images and vtracer to convert them to SVG
2. **DL Model** - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
3. **Naive Model** - Uses Phi-4 LLM to directly generate SVG code from text descriptions

## Features

- Text-to-SVG generation with three different model approaches
- Adjustable parameters for each model type
- Real-time SVG preview and code display
- SVG download functionality (both sketched in the snippet below)
- GPU acceleration for faster generation
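
The preview and download features map onto standard Streamlit primitives. Below is a minimal sketch of how they could be wired up; the widget labels and the sample SVG are illustrative assumptions, not necessarily what `app.py` does:

```python
# Sketch of the preview/download wiring; widget labels are assumptions.
import streamlit as st

svg_code = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="purple"/></svg>'
)

st.markdown(svg_code, unsafe_allow_html=True)  # real-time SVG preview
st.code(svg_code, language="xml")              # SVG code display
st.download_button(                            # SVG download
    "Download SVG", svg_code, file_name="drawing.svg", mime="image/svg+xml"
)
```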

## Requirements

- Python 3.11+
- CUDA-compatible GPU (recommended)
- Dependencies listed in `requirements.txt`

## Installation

### Using Miniconda (Recommended)

```bash
# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Create and activate environment
conda create -n svg-app python=3.11 -y
conda activate svg-app

# Install star-vector
cd star-vector 
pip install -e .
cd ..

# Install other dependencies
pip install -r requirements.txt
```

### Using Docker

```bash
# Build and run with Docker Compose
docker-compose up -d
```

## Usage

Start the Streamlit application:

```bash
streamlit run app.py
```

Or pipe `yes` into the command to automatically accept Streamlit's first-run prompts:

```bash
yes | streamlit run app.py
```

The application will be available at http://localhost:8501

## Models

### ML Model (vtracer)
Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG; a sketch follows the parameter list.

Configurable parameters:
- Simplify SVG
- Color Precision
- Filter Speckle
- Path Precision
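
A minimal sketch of this pipeline, assuming the `diffusers` and `vtracer` Python packages. The Stable Diffusion model ID and the mapping of "Simplify SVG" onto vtracer's `mode` parameter are assumptions, not necessarily what `ml.py` uses:

```python
# Sketch: text -> raster (Stable Diffusion) -> SVG (vtracer).
import torch
import vtracer
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # model ID is an assumption
).to("cuda")
pipe("a purple forest at dusk").images[0].save("raster.png")

# Each keyword mirrors a UI parameter listed above.
vtracer.convert_image_to_svg_py(
    "raster.png",
    "drawing.svg",
    colormode="color",
    color_precision=6,   # Color Precision
    filter_speckle=4,    # Filter Speckle
    path_precision=3,    # Path Precision
    mode="spline",       # "Simplify SVG" could toggle spline vs. polygon fitting (assumption)
)
```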

### DL Model (starvector)
Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG.
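
A minimal sketch of the image-to-SVG step, adapted from the upstream StarVector README; the model ID, attribute paths, and entry points are assumptions and may differ from what `dl.py` does:

```python
# Sketch adapted from the StarVector README; entry points are assumptions.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "starvector/starvector-1b-im2svg",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda().eval()

# Preprocess the Stable Diffusion output and decode it into SVG code.
# The processor attribute path follows the upstream README (assumption).
image_pil = Image.open("raster.png")
pixel_values = model.model.processor(image_pil, return_tensors="pt")["pixel_values"].cuda()
svg_code = model.generate_im2svg({"image": pixel_values}, max_length=4000)[0]
```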

### Naive Model (phi-4)
Directly generates SVG code using the Phi-4 language model with specialized prompting; a sketch follows the parameter list.

Configurable parameters:
- Max New Tokens
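
A minimal sketch, assuming the standard `transformers` text-generation pipeline; the prompt template is illustrative, not the one in `naive.py`:

```python
# Sketch of the naive approach; the prompt template is illustrative only.
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-4", device_map="auto")
prompt = (
    "Generate a complete, valid SVG document for the following description. "
    "Return only the SVG markup.\n"
    "Description: a purple forest at dusk\nSVG:"
)
out = generator(prompt, max_new_tokens=512, return_full_text=False)  # Max New Tokens
svg_code = out[0]["generated_text"].strip()
```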

## Evaluation Data and Results

### Data
The `data` directory contains synthetic evaluation data created using custom scripts:
- `descriptions.csv` - Text descriptions for generating SVGs (the first 15 are taken from the Kaggle competition "Drawing with LLM")
- `eval.csv` - Evaluation metrics
- `gen_descriptions.py` - Script for generating synthetic descriptions
- `gen_vqa.py` - Script for generating visual question answering data
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference

### Results
The `results` directory contains evaluation results comparing different models:
- Evaluation results for both Naive (Phi-4) and ML (vtracer) models
- The DL model (StarVector) was not evaluated because it typically fails to vectorize natural images, often returning blank SVGs
- Performance visualizations:
  - `category_radar.png` - Performance comparison across categories
  - `complexity_performance.png` - Performance relative to prompt complexity
  - `quality_vs_time.png` - Quality-time tradeoff analysis
  - `generation_time.png` - Comparison of generation times
  - `model_comparison.png` - Overall model performance comparison
- Generated SVGs and PNGs in respective subdirectories
- Detailed results in JSON and CSV formats

## Project Structure

```
drawing-with-llm/             # Root directory
β”‚
β”œβ”€β”€ app.py                    # Main Streamlit application
β”œβ”€β”€ requirements.txt          # Python dependencies
β”œβ”€β”€ Dockerfile                # Docker container definition
β”œβ”€β”€ docker-compose.yml        # Docker Compose configuration
β”‚
β”œβ”€β”€ ml.py                     # ML model implementation (vtracer approach)
β”œβ”€β”€ dl.py                     # DL model implementation (StarVector approach)
β”œβ”€β”€ naive.py                  # Naive model implementation (Phi-4 approach)
β”œβ”€β”€ gen_image.py              # Common image generation using Stable Diffusion
β”‚
β”œβ”€β”€ eval.py                   # Evaluation script for model comparison
β”œβ”€β”€ eval_analysis.py          # Analysis script for evaluation results
β”œβ”€β”€ metric.py                 # Metrics implementation for evaluation
β”‚
β”œβ”€β”€ data/                     # Evaluation data directory
β”‚   β”œβ”€β”€ descriptions.csv      # Text descriptions for evaluation
β”‚   β”œβ”€β”€ eval.csv              # Evaluation metrics
β”‚   β”œβ”€β”€ gen_descriptions.py   # Script for generating synthetic descriptions
β”‚   β”œβ”€β”€ gen_vqa.py            # Script for generating VQA data
β”‚   β”œβ”€β”€ gray_coat.png         # Sample image by GPT-4o
β”‚   └── purple_forest.png     # Sample image by GPT-4o
β”‚
β”œβ”€β”€ results/                  # Evaluation results directory
β”‚   β”œβ”€β”€ category_radar.png    # Performance comparison across categories
β”‚   β”œβ”€β”€ complexity_performance.png # Performance by prompt complexity
β”‚   β”œβ”€β”€ quality_vs_time.png   # Quality-time tradeoff analysis
β”‚   β”œβ”€β”€ generation_time.png   # Comparison of generation times
β”‚   β”œβ”€β”€ model_comparison.png  # Overall model performance comparison
β”‚   β”œβ”€β”€ summary_*.csv         # Summary metrics in CSV format
β”‚   β”œβ”€β”€ results_*.json        # Detailed results in JSON format
β”‚   β”œβ”€β”€ svg/                  # Generated SVG outputs
β”‚   └── png/                  # Generated PNG outputs
β”‚
β”œβ”€β”€ star-vector/              # StarVector dependency (installed locally)
└── starvector/               # StarVector Python package
```

## Acknowledgments

This project utilizes several key technologies:
- [Stable Diffusion](https://github.com/CompVis/stable-diffusion) for image generation
- [StarVector](https://github.com/joanrod/star-vector) for image-to-SVG conversion
- [vtracer](https://github.com/visioncortex/vtracer) for raster-to-vector conversion
- [Phi-4](https://huggingface.co/microsoft/phi-4) for text-to-SVG generation
- [Streamlit](https://streamlit.io/) for the web interface