---
title: Text2svg Demo App
emoji: 🚀
colorFrom: blue
colorTo: yellow
sdk: docker
pinned: false
app_port: 8501
---

# Drawing with LLM 🎨

A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.

Access the demo app via this link.

## Overview

This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:

1. ML Model - Uses Stable Diffusion to generate images and vtracer to convert them to SVG
2. DL Model - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
3. Naive Model - Uses Phi-4 LLM to directly generate SVG code from text descriptions
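
The three approaches above can be wired behind one entry point. A minimal sketch of such a dispatch, with stubbed generator functions (the real module-level APIs in `ml.py`, `dl.py`, and `naive.py` may differ):

```python
# Hypothetical dispatch between the three generation approaches.
# Each stub stands in for the corresponding module's generator.
from typing import Callable, Dict

def ml_generate(prompt: str) -> str:
    return f"<svg><!-- vtracer output for: {prompt} --></svg>"  # stub

def dl_generate(prompt: str) -> str:
    return f"<svg><!-- StarVector output for: {prompt} --></svg>"  # stub

def naive_generate(prompt: str) -> str:
    return f"<svg><!-- Phi-4 output for: {prompt} --></svg>"  # stub

GENERATORS: Dict[str, Callable[[str], str]] = {
    "ml": ml_generate,
    "dl": dl_generate,
    "naive": naive_generate,
}

def generate_svg(model: str, prompt: str) -> str:
    """Route a text prompt to the selected model's generator."""
    try:
        return GENERATORS[model](prompt)
    except KeyError:
        raise ValueError(f"unknown model: {model!r}")
```

Keeping the three implementations behind a common `str -> str` signature makes it easy to add a fourth approach later without touching the UI code.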

## Features

- Text-to-SVG generation with three different model approaches
- Adjustable parameters for each model type
- Real-time SVG preview and code display
- SVG download functionality
- GPU acceleration for faster generation
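
The preview and download features map onto standard Streamlit widgets. A sketch (the helper and the snippet below are illustrative, not necessarily the app's own code):

```python
# Hypothetical helper for inlining SVG markup in a Streamlit page.
def svg_preview_html(svg_code: str, max_width: str = "400px") -> str:
    """Wrap raw SVG markup so it renders inline at a bounded width."""
    return f'<div style="max-width:{max_width}">{svg_code}</div>'

# Inside the app (requires streamlit):
#   import streamlit as st
#   st.markdown(svg_preview_html(svg_code), unsafe_allow_html=True)
#   st.code(svg_code, language="xml")
#   st.download_button("Download SVG", svg_code,
#                      file_name="output.svg", mime="image/svg+xml")
```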

## Requirements

- Python 3.11+
- CUDA-compatible GPU (recommended)
- Dependencies listed in `requirements.txt`

## Installation

### Using Miniconda (Recommended)

```bash
# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Create and activate environment
conda create -n svg-app python=3.11 -y
conda activate svg-app

# Install star-vector
cd star-vector
pip install -e .
cd ..

# Install other dependencies
pip install -r requirements.txt
```

### Using Docker

```bash
# Build and run with Docker Compose
docker-compose up -d
```

## Usage

Start the Streamlit application:

```bash
streamlit run app.py
```

Or pipe `yes` into it to automatically accept Streamlit's first-run prompt:

```bash
yes | streamlit run app.py
```

The application will be available at http://localhost:8501.

## Models

### ML Model (vtracer)

Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG.

Configurable parameters:

- Simplify SVG
- Color Precision
- Filter Speckle
- Path Precision
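
These sliders correspond roughly to arguments of vtracer's Python binding. A sketch, assuming the `vtracer` package's `convert_image_to_svg_py` entry point (the defaults and the exact slider-to-argument mapping here are illustrative):

```python
def raster_to_svg(png_path: str, svg_path: str,
                  color_precision: int = 6,
                  filter_speckle: int = 4,
                  path_precision: int = 8) -> None:
    """Vectorize a raster image with vtracer.

    vtracer is imported lazily so the helper can be defined
    without the package installed.
    """
    import vtracer
    vtracer.convert_image_to_svg_py(
        png_path, svg_path,
        colormode="color",              # keep full-color output
        color_precision=color_precision,  # bits of color quantization
        filter_speckle=filter_speckle,    # drop tiny noise regions
        path_precision=path_precision,    # decimal places in path data
    )
```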

### DL Model (StarVector)

Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG.

### Naive Model (Phi-4)

Directly generates SVG code using the Phi-4 language model with specialized prompting.

Configurable parameters:

- Max New Tokens
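
Getting clean SVG out of an LLM response usually means extracting the first `<svg>…</svg>` block from the surrounding prose. A sketch of that post-processing step (the real `naive.py` may do this differently):

```python
import re

# Non-greedy, case-insensitive match for the first <svg>...</svg> block,
# spanning newlines.
SVG_RE = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)

def extract_svg(llm_output: str) -> str:
    """Return the first <svg>...</svg> element in an LLM response."""
    match = SVG_RE.search(llm_output)
    if match is None:
        raise ValueError("no <svg> element found in model output")
    return match.group(0)
```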

## Evaluation Data and Results

### Data

The `data` directory contains synthetic evaluation data created using custom scripts:

- The first 15 examples are from the Kaggle competition "Drawing with LLM"
- `descriptions.csv` - Text descriptions for generating SVGs
- `eval.csv` - Evaluation metrics
- `gen_descriptions.py` - Script for generating synthetic descriptions
- `gen_vqa.py` - Script for generating visual question answering data
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference

### Results

The `results` directory contains evaluation results comparing the models:

- Evaluation results for both the Naive (Phi-4) and ML (vtracer) models
- The DL model (StarVector) was not evaluated, as it typically fails when transforming natural images, often returning blank SVGs
- Performance visualizations:
  - `category_radar.png` - Performance comparison across categories
  - `complexity_performance.png` - Performance relative to prompt complexity
  - `quality_vs_time.png` - Quality-time tradeoff analysis
  - `generation_time.png` - Comparison of generation times
  - `model_comparison.png` - Overall model performance comparison
- Generated SVGs and PNGs in respective subdirectories
- Detailed results in JSON and CSV formats
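
The per-model result files can be reduced to a single comparison table. A standard-library sketch (the `model` and `score` column names are assumptions, not the actual CSV schema):

```python
import csv
import io
from collections import defaultdict
from statistics import mean

def summarize(csv_text: str) -> dict:
    """Average a 'score' column per 'model' from results CSV text."""
    scores = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        scores[row["model"]].append(float(row["score"]))
    return {model: mean(vals) for model, vals in scores.items()}
```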

## Project Structure

```
drawing-with-llm/             # Root directory
│
├── app.py                    # Main Streamlit application
├── requirements.txt          # Python dependencies
├── Dockerfile                # Docker container definition
├── docker-compose.yml        # Docker Compose configuration
│
├── ml.py                     # ML model implementation (vtracer approach)
├── dl.py                     # DL model implementation (StarVector approach)
├── naive.py                  # Naive model implementation (Phi-4 approach)
├── gen_image.py              # Common image generation using Stable Diffusion
│
├── eval.py                   # Evaluation script for model comparison
├── eval_analysis.py          # Analysis script for evaluation results
├── metric.py                 # Metrics implementation for evaluation
│
├── data/                     # Evaluation data directory
│   ├── descriptions.csv      # Text descriptions for evaluation
│   ├── eval.csv              # Evaluation metrics
│   ├── gen_descriptions.py   # Script for generating synthetic descriptions
│   ├── gen_vqa.py            # Script for generating VQA data
│   ├── gray_coat.png         # Sample image by GPT-4o
│   └── purple_forest.png     # Sample image by GPT-4o
│
├── results/                  # Evaluation results directory
│   ├── category_radar.png    # Performance comparison across categories
│   ├── complexity_performance.png # Performance by prompt complexity
│   ├── quality_vs_time.png   # Quality-time tradeoff analysis
│   ├── generation_time.png   # Comparison of generation times
│   ├── model_comparison.png  # Overall model performance comparison
│   ├── summary_*.csv         # Summary metrics in CSV format
│   ├── results_*.json        # Detailed results in JSON format
│   ├── svg/                  # Generated SVG outputs
│   └── png/                  # Generated PNG outputs
│
├── star-vector/              # StarVector dependency (installed locally)
└── starvector/               # StarVector Python package
```

## Acknowledgments

This project utilizes several key technologies: