Spaces:
Running
Running
title: Text2svg Demo App | |
emoji: π | |
colorFrom: blue | |
colorTo: yellow | |
sdk: docker | |
pinned: false | |
app_port: 8501 | |
# Drawing with LLM π¨ | |
A Streamlit application that converts text descriptions into SVG graphics using multiple AI models. | |
Access demo app by this [link](https://huggingface.co/spaces/Timxjl/text2svg-demo-app) | |
## Overview | |
This project allows users to create vector graphics (SVG) from text descriptions using three different approaches: | |
1. **ML Model** - Uses Stable Diffusion to generate images and vtracer to convert them to SVG | |
2. **DL Model** - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion | |
3. **Naive Model** - Uses Phi-4 LLM to directly generate SVG code from text descriptions | |
## Features | |
- Text-to-SVG generation with three different model approaches | |
- Adjustable parameters for each model type | |
- Real-time SVG preview and code display | |
- SVG download functionality | |
- GPU acceleration for faster generation | |
## Requirements | |
- Python 3.11+ | |
- CUDA-compatible GPU (recommended) | |
- Dependencies listed in `requirements.txt` | |
## Installation | |
### Using Miniconda (Recommended) | |
```bash | |
# Install Miniconda | |
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh | |
bash miniconda.sh -b -p $HOME/miniconda | |
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc | |
source ~/.bashrc | |
# Create and activate environment | |
conda create -n svg-app python=3.11 -y | |
conda activate svg-app | |
# Install star-vector | |
cd star-vector | |
pip install -e . | |
cd .. | |
# Install other dependencies | |
pip install -r requirements.txt | |
``` | |
### Using Docker | |
```bash | |
# Build and run with Docker Compose | |
docker-compose up -d | |
``` | |
## Usage | |
Start the Streamlit application: | |
```bash | |
streamlit run app.py | |
``` | |
Or with the yes flag to automatically accept: | |
```bash | |
yes | streamlit run app.py | |
``` | |
The application will be available at http://localhost:8501 | |
## Models | |
### ML Model (vtracer) | |
Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG. | |
Configurable parameters: | |
- Simplify SVG | |
- Color Precision | |
- Filter Speckle | |
- Path Precision | |
### DL Model (starvector) | |
Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG. | |
### Naive Model (phi-4) | |
Directly generates SVG code using the Phi-4 language model with specialized prompting. | |
Configurable parameters: | |
- Max New Tokens | |
## Evaluation Data and Results | |
### Data | |
The `data` directory contains synthetic evaluation data created using custom scripts: | |
- The first 15 examples are from the Kaggle competition "Drawing with LLM" | |
- `descriptions.csv` - Text descriptions for generating SVGs | |
- `eval.csv` - Evaluation metrics | |
- `gen_descriptions.py` - Script for generating synthetic descriptions | |
- `gen_vqa.py` - Script for generating visual question answering data | |
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference | |
### Results | |
The `results` directory contains evaluation results comparing different models: | |
- Evaluation results for both Naive (Phi-4) and ML (vtracer) models | |
- The DL model (StarVector) was not evaluated as it typically fails on transforming natural images, often returning blank SVGs | |
- Performance visualizations: | |
- `category_radar.png` - Performance comparison across categories | |
- `complexity_performance.png` - Performance relative to prompt complexity | |
- `quality_vs_time.png` - Quality-time tradeoff analysis | |
- `generation_time.png` - Comparison of generation times | |
- `model_comparison.png` - Overall model performance comparison | |
- Generated SVGs and PNGs in respective subdirectories | |
- Detailed results in JSON and CSV formats | |
## Project Structure | |
``` | |
drawing-with-llm/ # Root directory | |
β | |
βββ app.py # Main Streamlit application | |
βββ requirements.txt # Python dependencies | |
βββ Dockerfile # Docker container definition | |
βββ docker-compose.yml # Docker Compose configuration | |
β | |
βββ ml.py # ML model implementation (vtracer approach) | |
βββ dl.py # DL model implementation (StarVector approach) | |
βββ naive.py # Naive model implementation (Phi-4 approach) | |
βββ gen_image.py # Common image generation using Stable Diffusion | |
β | |
βββ eval.py # Evaluation script for model comparison | |
βββ eval_analysis.py # Analysis script for evaluation results | |
βββ metric.py # Metrics implementation for evaluation | |
β | |
βββ data/ # Evaluation data directory | |
β βββ descriptions.csv # Text descriptions for evaluation | |
β βββ eval.csv # Evaluation metrics | |
β βββ gen_descriptions.py # Script for generating synthetic descriptions | |
β βββ gen_vqa.py # Script for generating VQA data | |
β βββ gray_coat.png # Sample image by GPT-4o | |
β βββ purple_forest.png # Sample image by GPT-4o | |
β | |
βββ results/ # Evaluation results directory | |
β βββ category_radar.png # Performance comparison across categories | |
β βββ complexity_performance.png # Performance by prompt complexity | |
β βββ quality_vs_time.png # Quality-time tradeoff analysis | |
β βββ generation_time.png # Comparison of generation times | |
β βββ model_comparison.png # Overall model performance comparison | |
β βββ summary_*.csv # Summary metrics in CSV format | |
β βββ results_*.json # Detailed results in JSON format | |
β βββ svg/ # Generated SVG outputs | |
β βββ png/ # Generated PNG outputs | |
β | |
βββ star-vector/ # StarVector dependency (installed locally) | |
βββ starvector/ # StarVector Python package | |
``` | |
## Acknowledgments | |
This project utilizes several key technologies: | |
- [Stable Diffusion](https://github.com/CompVis/stable-diffusion) for image generation | |
- [StarVector](https://github.com/joanrod/star-vector) for image-to-SVG conversion | |
- [vtracer](https://github.com/visioncortex/vtracer) for raster-to-vector conversion | |
- [Phi-4](https://huggingface.co/microsoft/phi-4) for text-to-SVG generation | |
- [Streamlit](https://streamlit.io/) for the web interface | |