Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
Abstract
Uni-Instruct unifies and enhances one-step diffusion distillation methods through a novel diffusion expansion theory, achieving state-of-the-art performance in unconditional and conditional image generation and text-to-3D generation.
In this paper, we unify more than 10 existing one-step diffusion distillation approaches, including Diff-Instruct, DMD, SIM, SiD, and f-distill, within a theory-driven framework that we name Uni-Instruct. Uni-Instruct is motivated by our proposed diffusion expansion theory of the f-divergence family. We then introduce key theorems that overcome the intractability of the original expanded f-divergence, resulting in an equivalent yet tractable loss that effectively trains one-step diffusion models by minimizing the expanded f-divergence family. The unification offered by Uni-Instruct not only contributes new theory that helps understand existing approaches from a high-level perspective, but also leads to state-of-the-art one-step diffusion generation performance. On the CIFAR10 generation benchmark, Uni-Instruct achieves record-breaking Fréchet Inception Distance (FID) values of 1.46 for unconditional generation and 1.38 for conditional generation. On the ImageNet 64×64 generation benchmark, Uni-Instruct achieves a new SoTA one-step generation FID of 1.02, outperforming its 79-step teacher diffusion model by a significant margin of 1.33 (1.02 vs. 2.35). We also apply Uni-Instruct to broader tasks such as text-to-3D generation, where it slightly outperforms previous methods, such as SDS and VSD, in terms of both generation quality and diversity. The solid theoretical and empirical contributions of Uni-Instruct will potentially help future studies on one-step diffusion distillation and knowledge transfer in diffusion models.
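To make the expansion idea concrete, the following is a schematic sketch in our own notation (the paper's exact statement may differ). When the data distribution $p$ and the one-step generator distribution $q$ are both diffused by the same forward SDE with diffusion coefficient $g(t)$, an f-divergence between the clean marginals dissipates along diffusion time, with $r_t = p_t / q_t$ denoting the density ratio:

$$
D_f(p_0 \,\|\, q_0) \;=\; D_f(p_T \,\|\, q_T) \;+\; \int_0^T \frac{g(t)^2}{2}\, \mathbb{E}_{x \sim q_t}\!\left[ f''\!\big(r_t(x)\big)\, r_t(x)^2 \,\big\| \nabla_x \log p_t(x) - \nabla_x \log q_t(x) \big\|^2 \right] dt .
$$

For $f(x) = x \log x$ this reduces to the familiar path-integral identity linking KL to a time-integral of score differences, which is exactly the bridge between DI-style (integral KL) and SIM-style (integral score-based) objectives; other choices of $f$ give other members of the expanded family.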
Community
We propose Uni-Instruct, which unifies more than 10 existing one-step diffusion distillation approaches and sets a new SoTA for one-step generation on the ImageNet 64×64 benchmark with an FID of 1.02. Uni-Instruct draws inspiration from the connection between score matching and maximum likelihood. We prove a novel diffusion expansion theorem that expands f-divergences into a form of integral score-based divergences, unifying previous distillation methods such as DI (integral KL divergence) and SIM (integral score-based divergence). Code and model weights will be available soon.
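As a concrete illustration of the general recipe, below is a minimal PyTorch sketch of the KL instance of this family (a Diff-Instruct-style gradient estimator). The `generator`, `teacher_score`, and `fake_score` callables and the linear noise schedule are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
import torch

def expanded_kl_loss(generator, teacher_score, fake_score, batch_size, device="cpu"):
    """Sketch of one distillation step for the KL instance (Diff-Instruct style).
    `generator`, `teacher_score`, `fake_score` are hypothetical callables:
    the score networks take (x_t, t) and return estimated scores."""
    z = torch.randn(batch_size, 3, 32, 32, device=device)  # latent noise
    x0 = generator(z)                                      # one-step generated sample
    t = torch.rand(batch_size, device=device)              # diffusion times in [0, 1]
    # toy VP-style schedule (assumed): x_t = alpha_t * x_0 + sigma_t * eps
    alpha = (1.0 - t).view(-1, 1, 1, 1)
    sigma = t.view(-1, 1, 1, 1)
    x_t = alpha * x0 + sigma * torch.randn_like(x0)
    with torch.no_grad():
        # score difference at time t: descent direction for KL between marginals
        grad = fake_score(x_t, t) - teacher_score(x_t, t)
    # surrogate loss: its gradient w.r.t. generator params is E[grad * d x_t / d theta]
    loss = (grad * x_t).sum() / batch_size
    return loss
```

A general-f variant would additionally reweight `grad` by an f-dependent function of the density ratio (estimated, e.g., with an auxiliary network), which is where the expanded f-divergence family departs from the plain KL case.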
Librarian Bot found the following similar papers, recommended by the Semantic Scholar API:
- Few-Step Diffusion via Score identity Distillation (2025)
- One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models (2025)
- Unified Continuous Generative Models (2025)
- JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models (2025)
- One-Step Diffusion-Based Image Compression with Semantic Distillation (2025)
- Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model (2025)
- How can Diffusion Models Evolve into Continual Generators? (2025)