---
title: Text2svg Demo App
emoji: 🎨
colorFrom: blue
colorTo: yellow
sdk: docker
pinned: false
app_port: 8501
---
# Drawing with LLM 🎨
A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.
Access the demo app via this link.
## Overview

This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:

- **ML Model**: Uses Stable Diffusion to generate images and vtracer to convert them to SVG
- **DL Model**: Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
- **Naive Model**: Uses the Phi-4 LLM to directly generate SVG code from text descriptions
## Features
- Text-to-SVG generation with three different model approaches
- Adjustable parameters for each model type
- Real-time SVG preview and code display
- SVG download functionality (see the sketch after this list)
- GPU acceleration for faster generation
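
For illustration, here is a minimal Streamlit sketch of the preview and download features, assuming an already-generated `svg_code` string; the widget labels and file names are illustrative, not taken from `app.py`:

```python
import streamlit as st

# Assume svg_code holds the SVG markup produced by one of the models
svg_code = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="teal"/></svg>'
)

# Real-time preview: raw SVG markup renders as inline HTML
st.markdown(svg_code, unsafe_allow_html=True)

# Code display
st.code(svg_code, language="xml")

# Download button serving the SVG with the correct MIME type
st.download_button(
    "Download SVG", data=svg_code, file_name="drawing.svg", mime="image/svg+xml"
)
```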
## Requirements

- Python 3.11+
- CUDA-compatible GPU (recommended)
- Dependencies listed in `requirements.txt`
## Installation

### Using Miniconda (Recommended)

```bash
# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Create and activate environment
conda create -n svg-app python=3.11 -y
conda activate svg-app

# Install star-vector
cd star-vector
pip install -e .
cd ..

# Install other dependencies
pip install -r requirements.txt
```
### Using Docker

```bash
# Build and run with Docker Compose
docker-compose up -d
```
## Usage

Start the Streamlit application:

```bash
streamlit run app.py
```

Or pipe `yes` into it to automatically accept Streamlit's first-run prompts:

```bash
yes | streamlit run app.py
```

The application will be available at http://localhost:8501.
## Models

### ML Model (vtracer)

Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG; a sketch of this pipeline follows the parameter list.

Configurable parameters:

- Simplify SVG
- Color Precision
- Filter Speckle
- Path Precision
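
A minimal sketch of this two-stage pipeline, assuming the `diffusers` and `vtracer` packages; the Stable Diffusion checkpoint and parameter values are illustrative defaults, not the app's actual configuration:

```python
import torch
import vtracer
from diffusers import StableDiffusionPipeline

# Stage 1: text -> raster image (checkpoint name is an assumption)
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe("a purple forest at dusk").images[0].save("generated.png")

# Stage 2: raster image -> SVG; these keyword arguments correspond to the
# Color Precision, Filter Speckle, and Path Precision options in the UI
vtracer.convert_image_to_svg_py(
    "generated.png",
    "generated.svg",
    colormode="color",
    color_precision=6,
    filter_speckle=4,
    path_precision=8,
)
```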
### DL Model (starvector)
Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG.
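
A condensed sketch of invoking StarVector on the Stable Diffusion output, adapted from StarVector's published quickstart; the checkpoint name, the `model.processor` attribute path, and the `generate_im2svg` signature are assumptions that may differ between StarVector releases:

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# trust_remote_code is required because StarVector ships custom modeling code
starvector = AutoModelForCausalLM.from_pretrained(
    "starvector/starvector-1b-im2svg",  # assumed checkpoint name
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda().eval()

# The image processor hangs off the loaded model in StarVector's quickstart;
# this attribute path is an assumption and may vary between releases
processor = starvector.model.processor

image = Image.open("generated.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt")["pixel_values"].cuda()

# generate_im2svg decodes SVG tokens directly from the image embedding
svg_code = starvector.generate_im2svg({"image": pixel_values}, max_length=4000)[0]
```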
### Naive Model (phi-4)

Directly generates SVG code using the Phi-4 language model with specialized prompting; a sketch follows the parameter list.

Configurable parameters:

- Max New Tokens
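
A minimal sketch of the naive approach, assuming the `transformers` library and the `microsoft/phi-4` checkpoint; the prompt wording is illustrative, not the app's actual specialized prompt:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-4", device_map="auto")

prompt = (
    "Generate a complete, valid SVG document for the following description. "
    "Return only the SVG code.\n\n"
    "Description: a gray coat hanging on a hook"
)

# max_new_tokens is the "Max New Tokens" knob exposed in the UI
output = generator(prompt, max_new_tokens=1024, return_full_text=False)
print(output[0]["generated_text"])
```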
## Evaluation Data and Results

### Data

The `data` directory contains synthetic evaluation data created using custom scripts:

- The first 15 examples are from the Kaggle competition "Drawing with LLMs"
- `descriptions.csv`: Text descriptions for generating SVGs
- `eval.csv`: Evaluation metrics
- `gen_descriptions.py`: Script for generating synthetic descriptions
- `gen_vqa.py`: Script for generating visual question answering data
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference
### Results

The `results` directory contains evaluation results comparing the different models:

- Evaluation results for both the Naive (Phi-4) and ML (vtracer) models
- The DL model (StarVector) was not evaluated, as it typically fails to transform natural images, often returning blank SVGs
- Performance visualizations:
  - `category_radar.png`: Performance comparison across categories
  - `complexity_performance.png`: Performance relative to prompt complexity
  - `quality_vs_time.png`: Quality-time tradeoff analysis
  - `generation_time.png`: Comparison of generation times
  - `model_comparison.png`: Overall model performance comparison
- Generated SVGs and PNGs in respective subdirectories
- Detailed results in JSON and CSV formats
## Project Structure

```
drawing-with-llm/                   # Root directory
│
├── app.py                          # Main Streamlit application
├── requirements.txt                # Python dependencies
├── Dockerfile                      # Docker container definition
├── docker-compose.yml              # Docker Compose configuration
│
├── ml.py                           # ML model implementation (vtracer approach)
├── dl.py                           # DL model implementation (StarVector approach)
├── naive.py                        # Naive model implementation (Phi-4 approach)
├── gen_image.py                    # Common image generation using Stable Diffusion
│
├── eval.py                         # Evaluation script for model comparison
├── eval_analysis.py                # Analysis script for evaluation results
├── metric.py                       # Metrics implementation for evaluation
│
├── data/                           # Evaluation data directory
│   ├── descriptions.csv            # Text descriptions for evaluation
│   ├── eval.csv                    # Evaluation metrics
│   ├── gen_descriptions.py         # Script for generating synthetic descriptions
│   ├── gen_vqa.py                  # Script for generating VQA data
│   ├── gray_coat.png               # Sample image by GPT-4o
│   └── purple_forest.png           # Sample image by GPT-4o
│
├── results/                        # Evaluation results directory
│   ├── category_radar.png          # Performance comparison across categories
│   ├── complexity_performance.png  # Performance by prompt complexity
│   ├── quality_vs_time.png         # Quality-time tradeoff analysis
│   ├── generation_time.png         # Comparison of generation times
│   ├── model_comparison.png        # Overall model performance comparison
│   ├── summary_*.csv               # Summary metrics in CSV format
│   ├── results_*.json              # Detailed results in JSON format
│   ├── svg/                        # Generated SVG outputs
│   └── png/                        # Generated PNG outputs
│
└── star-vector/                    # StarVector dependency (installed locally)
    └── starvector/                 # StarVector Python package
```
## Acknowledgments
This project utilizes several key technologies:
- Stable Diffusion for image generation
- StarVector for image-to-SVG conversion
- vtracer for raster-to-vector conversion
- Phi-4 for text-to-SVG generation
- Streamlit for the web interface