CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs
Abstract
CheXGenBench is a comprehensive evaluation framework for synthetic chest radiographs that assesses fidelity, privacy, and clinical utility of text-to-image generative models using standardized metrics and a unified protocol.
We introduce CheXGenBench, a rigorous and multifaceted evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and clinical utility across state-of-the-art text-to-image generative models. Despite rapid advancements in generative AI for real-world imagery, evaluations in the medical domain have been hindered by methodological inconsistencies, outdated architectural comparisons, and disconnected assessment criteria that rarely address the practical clinical value of synthetic samples. CheXGenBench overcomes these limitations through standardised data partitioning and a unified evaluation protocol comprising over 20 quantitative metrics that systematically analyse generation quality, potential privacy vulnerabilities, and downstream clinical applicability across 11 leading text-to-image architectures. Our results reveal critical inefficiencies in existing evaluation protocols, particularly in assessing generative fidelity, leading to inconsistent and uninformative comparisons. Our framework establishes a standardised benchmark for the medical AI community, enabling objective and reproducible comparisons while facilitating seamless integration of both existing and future generative models. Additionally, we release a high-quality synthetic dataset, SynthCheX-75K, comprising 75K radiographs generated by the top-performing model (Sana 0.6B) in our benchmark to support further research in this critical domain. Through CheXGenBench, we establish a new state-of-the-art and release our framework, models, and the SynthCheX-75K dataset at https://raman1121.github.io/CheXGenBench/.
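To make the fidelity axis concrete, below is a minimal sketch of one metric of the kind the benchmark aggregates, Fréchet Inception Distance (FID), computed with `torchmetrics`. This is an illustration under stated assumptions, not CheXGenBench's own implementation: the tensors are random stand-ins for real and synthetic radiographs, and the small feature dimension is chosen only to keep the toy example well conditioned.

```python
# Illustrative FID computation (requires: pip install "torchmetrics[image]").
# NOTE: random tensors stand in for real/synthetic X-rays; CheXGenBench's
# actual metric suite and settings may differ.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# feature=64 keeps this toy example stable; real evaluations typically use 2048.
fid = FrechetInceptionDistance(feature=64, normalize=True)

# Float images in [0, 1], shape (N, 3, H, W); grayscale radiographs would be
# repeated across the 3 channels before being fed to the Inception backbone.
real_images = torch.rand(16, 3, 224, 224)
synthetic_images = torch.rand(16, 3, 224, 224)

fid.update(real_images, real=True)
fid.update(synthetic_images, real=False)
print(f"FID: {fid.compute().item():.2f}")  # lower is better
```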
Community
We present CheXGenBench, a new benchmark that evaluates leading text-to-image (T2I) models for synthetic chest X-ray generation using 20+ metrics covering image fidelity, privacy and patient re-identification risk, and downstream utility. We also release a new state-of-the-art (SoTA) model for synthetic radiograph generation. Using this benchmark-leading model, we additionally release SynthCheX-75K, a high-quality dataset of synthetic X-rays (see the loading sketch after the links below).
Project Page - https://raman1121.github.io/CheXGenBench/
SynthCheX-75K Dataset - https://huggingface.co/datasets/raman07/SynthCheX-75K-v2
SoTA Model - https://huggingface.co/raman07/CheXGenBench-Models-Sana-e20
Github - https://github.com/Raman1121/CheXGenBench
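A hedged sketch of loading SynthCheX-75K from the Hugging Face Hub with the `datasets` library. The repo id comes from the dataset link above; the split name and column names are assumptions about the dataset schema, so inspect the actual fields before relying on them.

```python
# Minimal loading sketch; "train" split and field names are assumptions.
from datasets import load_dataset

ds = load_dataset("raman07/SynthCheX-75K-v2", split="train")
print(len(ds))          # expected ~75K synthetic radiographs
print(ds[0].keys())     # inspect the actual schema (e.g. image / text columns)
```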
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging (2025)
- Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models (2025)
- Metrics that matter: Evaluating image quality metrics for medical image generation (2025)
- AI-GenBench: A New Ongoing Benchmark for AI-Generated Image Detection (2025)
- Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model (2025)
- Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation (2025)
- Statistical Guarantees in Synthetic Data through Conformal Adversarial Generation (2025)