Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
Abstract
Unsupervised reconstruction networks based on self-attention transformers have achieved state-of-the-art performance for multi-class (unified) anomaly detection with a single model. However, these self-attention reconstruction models operate primarily on target features, and their high consistency with the surrounding context can cause both normal and anomalous features to be reconstructed perfectly, so anomalies go undetected. Additionally, these models often produce inaccurate anomaly segmentation because reconstruction is performed in a low-spatial-resolution latent space. To let reconstruction models retain high efficiency while improving their generalization for unified anomaly detection, we propose a simple yet effective method that reconstructs normal features and restores anomalous features with just One Normal Image Prompt (OneNIP). In contrast to previous work, OneNIP is the first to reconstruct or restore anomalies with just one normal image prompt, effectively boosting unified anomaly detection performance. Furthermore, we propose a supervised refiner that regresses reconstruction errors using both real normal and synthesized anomalous images, which significantly improves pixel-level anomaly segmentation. OneNIP outperforms previous methods on three industrial anomaly detection benchmarks: MVTec, BTAD, and VisA. The code and pre-trained models are available at https://github.com/gaobb/OneNIP.
Community
OneNIP consists of three components: Unsupervised Reconstruction, Unsupervised Restoration, and a Supervised Refiner. Unsupervised Reconstruction and Unsupervised Restoration share the same encoder-decoder architecture and weights. The Supervised Refiner is implemented with two transposed-convolution blocks, each followed by a 1×1 convolution layer.
- Unsupervised Reconstruction reconstructs normal tokens;
- Unsupervised Restoration restores pseudo anomaly tokens to the corresponding normal tokens;
- Supervised Refiner refines reconstruction/restoration errors to achieve more accurate anomaly segmentation.
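The refiner described above can be sketched in PyTorch as two stride-2 transposed-convolution blocks, each followed by a 1×1 convolution, that upsample a low-resolution reconstruction-error map into a higher-resolution anomaly segmentation map. This is a minimal illustration, not the released implementation: the channel widths, activation choice, and single-channel output are assumptions.

```python
import torch
import torch.nn as nn

class SupervisedRefiner(nn.Module):
    """Hypothetical sketch of the Supervised Refiner: two transposed-conv
    blocks, each followed by a 1x1 convolution (channel sizes assumed)."""

    def __init__(self, in_ch: int = 256, mid_ch: int = 64):
        super().__init__()
        self.block1 = nn.Sequential(
            # stride-2 transposed conv doubles spatial resolution
            nn.ConvTranspose2d(in_ch, mid_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=1),  # 1x1 conv after the block
        )
        self.block2 = nn.Sequential(
            nn.ConvTranspose2d(mid_ch, mid_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, 1, kernel_size=1),  # per-pixel anomaly score
        )

    def forward(self, err: torch.Tensor) -> torch.Tensor:
        # err: (B, C, H, W) reconstruction/restoration-error features
        # output: (B, 1, 4H, 4W) refined anomaly map
        return self.block2(self.block1(err))

# Each stride-2 block doubles resolution, so a 14x14 latent error map
# becomes a 56x56 segmentation map.
refiner = SupervisedRefiner()
out = refiner(torch.randn(2, 256, 14, 14))
```

Because the refiner is trained with supervision from real normal and synthesized anomalous images, its output can be regressed directly against pixel-level pseudo-anomaly masks.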