---
title: Text2svg Demo App
emoji: πŸš€
colorFrom: blue
colorTo: yellow
sdk: docker
pinned: false
app_port: 8501
---
# Drawing with LLM 🎨
A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.
Try the demo app via this [link](https://huggingface.co/spaces/Timxjl/text2svg-demo-app).
## Overview
This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:
1. **ML Model** - Uses Stable Diffusion to generate images and vtracer to convert them to SVG
2. **DL Model** - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
3. **Naive Model** - Uses Phi-4 LLM to directly generate SVG code from text descriptions
## Features
- Text-to-SVG generation with three different model approaches
- Adjustable parameters for each model type
- Real-time SVG preview and code display
- SVG download functionality
- GPU acceleration for faster generation
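The preview, code display, and download features above map directly onto standard Streamlit calls. Here is a minimal sketch of that wiring; `generate_svg` is a hypothetical stand-in for the real model calls in `ml.py`, `dl.py`, and `naive.py`:
```python
import streamlit as st

def generate_svg(prompt: str) -> str:
    """Hypothetical stand-in for the model calls in ml.py / dl.py / naive.py."""
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="200" height="50">'
            f'<text x="10" y="30">{prompt}</text></svg>')

prompt = st.text_input("Describe your image", "a purple forest at dusk")
svg_code = generate_svg(prompt)

st.markdown(svg_code, unsafe_allow_html=True)    # real-time SVG preview
st.code(svg_code, language="xml")                # SVG code display
st.download_button("Download SVG", data=svg_code,
                   file_name="drawing.svg", mime="image/svg+xml")
```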
## Requirements
- Python 3.11+
- CUDA-compatible GPU (recommended)
- Dependencies listed in `requirements.txt`
## Installation
### Using Miniconda (Recommended)
```bash
# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
# Create and activate environment
conda create -n svg-app python=3.11 -y
conda activate svg-app
# Install the StarVector package bundled in this repo
cd star-vector
pip install -e .
cd ..
# Install other dependencies
pip install -r requirements.txt
```
### Using Docker
```bash
# Build and run with Docker Compose
docker-compose up -d
```
## Usage
Start the Streamlit application:
```bash
streamlit run app.py
```
Or pipe `yes` into the command to accept Streamlit's first-run prompts automatically:
```bash
yes | streamlit run app.py
```
The application will then be available at http://localhost:8501.
## Models
### ML Model (vtracer)
Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG.
Configurable parameters:
- Simplify SVG
- Color Precision
- Filter Speckle
- Path Precision
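For reference, a minimal sketch of this pipeline using the `diffusers` and `vtracer` Python packages; the checkpoint id and parameter values are illustrative, not necessarily what `ml.py` and `gen_image.py` actually use:
```python
import vtracer
from diffusers import StableDiffusionPipeline

# Step 1: text -> raster image (checkpoint id is illustrative; gen_image.py
# may load a different one). GPU strongly recommended.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")
image = pipe("a purple forest at dusk").images[0]
image.save("forest.png")

# Step 2: raster -> SVG. These keyword arguments correspond to the
# controls listed above (the values here are vtracer defaults, not the app's).
vtracer.convert_image_to_svg_py(
    "forest.png",
    "forest.svg",
    colormode="color",
    color_precision=6,   # Color Precision
    filter_speckle=4,    # Filter Speckle
    path_precision=8,    # Path Precision
    mode="spline",       # "polygon" yields simpler paths (cf. Simplify SVG)
)
```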
### DL Model (starvector)
Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG.
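A sketch of the second stage, following the usage pattern documented in the StarVector repository; the checkpoint name and the `process_images`/`generate_im2svg` methods come from that documentation and may differ between versions:
```python
import torch
from PIL import Image
from starvector.model.starvector_arch import StarVectorForCausalLM

# Checkpoint name follows the StarVector docs; the demo may use a different one.
model = StarVectorForCausalLM.from_pretrained(
    "starvector/starvector-1b-im2svg", torch_dtype=torch.float16)
model.cuda().eval()

# The input is the raster produced by Stable Diffusion (see gen_image.py).
image_pil = Image.open("forest.png")
image = model.process_images([image_pil])[0].cuda()

# Autoregressively decode SVG tokens directly from the image.
raw_svg = model.generate_im2svg({"image": image}, max_length=4000)[0]
with open("forest.svg", "w") as f:
    f.write(raw_svg)
```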
### Naive Model (phi-4)
Directly generates SVG code using the Phi-4 language model with specialized prompting.
Configurable parameters:
- Max New Tokens
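A minimal sketch of this approach with the `transformers` library; the system prompt and the SVG extraction step are assumptions about what `naive.py` does, not its actual code:
```python
import re
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": (
        "You are an expert SVG artist. Reply with one complete "
        "<svg>...</svg> document and nothing else.")},
    {"role": "user", "content": "a purple forest at dusk"},
]

out = generator(messages, max_new_tokens=1024)   # the "Max New Tokens" control
reply = out[0]["generated_text"][-1]["content"]

# Pull the SVG markup out of the reply, tolerating any extra prose.
match = re.search(r"<svg.*?</svg>", reply, re.DOTALL)
svg_code = match.group(0) if match else ""
print(svg_code)
```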
## Evaluation Data and Results
### Data
The `data` directory contains synthetic evaluation data created using custom scripts:
- `descriptions.csv` - Text descriptions for generating SVGs; the first 15 entries come from the Kaggle competition "Drawing with LLMs"
- `eval.csv` - Evaluation metrics
- `gen_descriptions.py` - Script for generating synthetic descriptions
- `gen_vqa.py` - Script for generating visual question answering data
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference
### Results
The `results` directory contains evaluation results comparing different models:
- Evaluation results for the Naive (Phi-4) and ML (vtracer) models
- The DL model (StarVector) was not evaluated because it typically fails to vectorize natural images, often returning blank SVGs
- Performance visualizations:
- `category_radar.png` - Performance comparison across categories
- `complexity_performance.png` - Performance relative to prompt complexity
- `quality_vs_time.png` - Quality-time tradeoff analysis
- `generation_time.png` - Comparison of generation times
- `model_comparison.png` - Overall model performance comparison
- Generated SVGs and PNGs in respective subdirectories
- Detailed results in JSON and CSV formats
## Project Structure
```
drawing-with-llm/                  # Root directory
β”‚
β”œβ”€β”€ app.py                         # Main Streamlit application
β”œβ”€β”€ requirements.txt               # Python dependencies
β”œβ”€β”€ Dockerfile                     # Docker container definition
β”œβ”€β”€ docker-compose.yml             # Docker Compose configuration
β”‚
β”œβ”€β”€ ml.py                          # ML model implementation (vtracer approach)
β”œβ”€β”€ dl.py                          # DL model implementation (StarVector approach)
β”œβ”€β”€ naive.py                       # Naive model implementation (Phi-4 approach)
β”œβ”€β”€ gen_image.py                   # Common image generation using Stable Diffusion
β”‚
β”œβ”€β”€ eval.py                        # Evaluation script for model comparison
β”œβ”€β”€ eval_analysis.py               # Analysis script for evaluation results
β”œβ”€β”€ metric.py                      # Metrics implementation for evaluation
β”‚
β”œβ”€β”€ data/                          # Evaluation data directory
β”‚   β”œβ”€β”€ descriptions.csv           # Text descriptions for evaluation
β”‚   β”œβ”€β”€ eval.csv                   # Evaluation metrics
β”‚   β”œβ”€β”€ gen_descriptions.py        # Script for generating synthetic descriptions
β”‚   β”œβ”€β”€ gen_vqa.py                 # Script for generating VQA data
β”‚   β”œβ”€β”€ gray_coat.png              # Sample image by GPT-4o
β”‚   └── purple_forest.png          # Sample image by GPT-4o
β”‚
β”œβ”€β”€ results/                       # Evaluation results directory
β”‚   β”œβ”€β”€ category_radar.png         # Performance comparison across categories
β”‚   β”œβ”€β”€ complexity_performance.png # Performance by prompt complexity
β”‚   β”œβ”€β”€ quality_vs_time.png        # Quality-time tradeoff analysis
β”‚   β”œβ”€β”€ generation_time.png        # Comparison of generation times
β”‚   β”œβ”€β”€ model_comparison.png       # Overall model performance comparison
β”‚   β”œβ”€β”€ summary_*.csv              # Summary metrics in CSV format
β”‚   β”œβ”€β”€ results_*.json             # Detailed results in JSON format
β”‚   β”œβ”€β”€ svg/                       # Generated SVG outputs
β”‚   └── png/                       # Generated PNG outputs
β”‚
β”œβ”€β”€ star-vector/                   # StarVector dependency (installed locally)
└── starvector/                    # StarVector Python package
```
## Acknowledgments
This project utilizes several key technologies:
- [Stable Diffusion](https://github.com/CompVis/stable-diffusion) for image generation
- [StarVector](https://github.com/joanrod/star-vector) for image-to-SVG conversion
- [vtracer](https://github.com/visioncortex/vtracer) for raster-to-vector conversion
- [Phi-4](https://huggingface.co/microsoft/phi-4) for text-to-SVG generation
- [Streamlit](https://streamlit.io/) for the web interface