CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs
Abstract
CheXGenBench is a comprehensive evaluation framework for synthetic chest radiographs that assesses fidelity, privacy, and clinical utility of text-to-image generative models using standardized metrics and a unified protocol.
We introduce CheXGenBench, a rigorous and multifaceted evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and clinical utility across state-of-the-art text-to-image generative models. Despite rapid advancements in generative AI for real-world imagery, evaluations in the medical domain have been hindered by methodological inconsistencies, outdated architectural comparisons, and disconnected assessment criteria that rarely address the practical clinical value of synthetic samples. CheXGenBench overcomes these limitations through standardised data partitioning and a unified evaluation protocol comprising over 20 quantitative metrics that systematically analyse generation quality, potential privacy vulnerabilities, and downstream clinical applicability across 11 leading text-to-image architectures. Our results reveal critical inefficiencies in existing evaluation protocols, particularly in assessing generative fidelity, leading to inconsistent and uninformative comparisons. Our framework establishes a standardised benchmark for the medical AI community, enabling objective and reproducible comparisons while facilitating seamless integration of both existing and future generative models. Additionally, we release a high-quality synthetic dataset, SynthCheX-75K, comprising 75K radiographs generated by the top-performing model (Sana 0.6B) in our benchmark to support further research in this critical domain. Through CheXGenBench, we establish a new state-of-the-art and release our framework, models, and the SynthCheX-75K dataset at https://raman1121.github.io/CheXGenBench/.
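To make the fidelity axis concrete, below is a minimal sketch of one metric of the kind the benchmark aggregates, Fréchet Inception Distance (FID), computed with `torchmetrics`. This is an illustration under stated assumptions, not CheXGenBench's own implementation: the tensors are random stand-ins for real and synthetic radiographs, and the small feature dimension is chosen only to keep the toy example well conditioned.

```python
# Illustrative FID computation (requires: pip install "torchmetrics[image]").
# NOTE: random tensors stand in for real/synthetic X-rays; CheXGenBench's
# actual metric suite and settings may differ.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# feature=64 keeps this toy example stable; real evaluations typically use 2048.
fid = FrechetInceptionDistance(feature=64, normalize=True)

# Float images in [0, 1], shape (N, 3, H, W); grayscale radiographs would be
# repeated across the 3 channels before being fed to the Inception backbone.
real_images = torch.rand(16, 3, 224, 224)
synthetic_images = torch.rand(16, 3, 224, 224)

fid.update(real_images, real=True)
fid.update(synthetic_images, real=False)
print(f"FID: {fid.compute().item():.2f}")  # lower is better
```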
Community
We present CheXGenBench, a new benchmark that evaluates leading text-to-image (T2I) models for synthetic chest X-ray generation using 20+ metrics covering image fidelity, privacy and patient re-identification risk, and downstream utility. We also release a new state-of-the-art (SoTA) model for synthetic radiograph generation. Using this benchmark-leading model, we additionally release SynthCheX-75K, a high-quality dataset of synthetic X-rays (see the loading sketch after the links below).
Project Page - https://raman1121.github.io/CheXGenBench/
SynthCheX-75K Dataset - https://huggingface.co/datasets/raman07/SynthCheX-75K-v2
SoTA Model - https://huggingface.co/raman07/CheXGenBench-Models-Sana-e20
Github - https://github.com/Raman1121/CheXGenBench
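A hedged sketch of loading SynthCheX-75K from the Hugging Face Hub with the `datasets` library. The repo id comes from the dataset link above; the split name and column names are assumptions about the dataset schema, so inspect the actual fields before relying on them.

```python
# Minimal loading sketch; "train" split and field names are assumptions.
from datasets import load_dataset

ds = load_dataset("raman07/SynthCheX-75K-v2", split="train")
print(len(ds))          # expected ~75K synthetic radiographs
print(ds[0].keys())     # inspect the actual schema (e.g. image / text columns)
```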
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging (2025)
- Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models (2025)
- Metrics that matter: Evaluating image quality metrics for medical image generation (2025)
- AI-GenBench: A New Ongoing Benchmark for AI-Generated Image Detection (2025)
- Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model (2025)
- Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation (2025)
- Statistical Guarantees in Synthetic Data through Conformal Adversarial Generation (2025)