Prompt-Guided Mamba Network for Multi-Class Anomaly Detection and Localization
DOI: 10.12677/mos.2025.144271
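Author: Lin Tai, School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai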
Keywords: Multi-Class Anomaly Detection, Knowledge Distillation, Mamba
Abstract: Existing unsupervised anomaly detection methods often suffer from inter-class interference when a single model is trained on multiple classes, which significantly degrades detection performance in practical applications. To address this issue, this paper proposes a novel Prompt-Guided Mamba Network (PGM) for multi-class anomaly detection and localization. First, with UniRepLKNet as the pre-trained feature extractor, the model effectively compresses and memorizes a large amount of information from normal images. Second, a Mamba-based decoder combines global and local modeling capabilities to reconstruct multi-scale features. Finally, PGM adopts a hierarchical class-aware prompt module that dynamically encodes class-specific information into a class prior pool to mitigate interference between different anomaly classes. Experiments on the MVTec AD dataset show that PGM outperforms most state-of-the-art methods, reaching 98.7% image-level AUROC and 98.2% pixel-level AUROC. Extensive ablation studies validate the contribution of each component of the model. Overall, PGM shows clear advantages for multi-class anomaly detection.
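The two metrics quoted in the abstract, image-level and pixel-level AUROC, can be illustrated with a minimal sketch. This is not the paper's evaluation code. The function names `auroc` and `image_and_pixel_auroc` are hypothetical, and taking the maximum of each anomaly map as the image score is an assumption (a common convention, not something the abstract states).

```python
import numpy as np


def auroc(scores, labels):
    """AUROC from the Mann-Whitney U statistic; tied scores get average ranks."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both normal and anomalous samples")
    # Group equal scores and assign each group its mean 1-based rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)               # rank of the last item in each tie group
    avg_rank = last_rank - (counts - 1) / 2.0   # mean rank within each tie group
    ranks = avg_rank[inverse]
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def image_and_pixel_auroc(anomaly_maps, gt_masks):
    """anomaly_maps: (N, H, W) float scores; gt_masks: (N, H, W) binary masks."""
    anomaly_maps = np.asarray(anomaly_maps, dtype=float)
    gt_masks = np.asarray(gt_masks).astype(bool)
    n = anomaly_maps.shape[0]
    # Assumption: the image-level score is the maximum pixel score of its map.
    image_scores = anomaly_maps.reshape(n, -1).max(axis=1)
    # An image counts as anomalous if any pixel in its mask is anomalous.
    image_labels = gt_masks.reshape(n, -1).any(axis=1)
    image_auroc = auroc(image_scores, image_labels)
    pixel_auroc = auroc(anomaly_maps, gt_masks)  # every pixel is one sample
    return image_auroc, pixel_auroc
```

Image-level AUROC ranks whole images by a single score, while pixel-level AUROC pools every pixel across the test set, which is why the two figures reported for MVTec AD measure detection and localization separately.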
Citation: Lin, T. (2025) Prompt-Guided Mamba Network for Multi-Class Anomaly Detection and Localization. Modeling and Simulation, 14(4), 129-138. https://doi.org/10.12677/mos.2025.144271
