YOLO-XRAY: Cross-Scale Attention and Depthwise Separable Convolutions for Real-Time X-Ray Contraband Detection
Abstract: X-ray security inspection systems play a crucial role in public safety; their core task is to detect potential contraband in real time to prevent security risks and incidents. However, existing X-ray contraband detection models tend to be structurally complex, parameter-heavy, and computationally expensive. This makes them hard to deploy on small, lightweight inspection devices and limits both detection accuracy and inference speed, so they struggle to meet the efficiency and real-time requirements of practical use at the same time. To address these problems, this paper proposes YOLO-XRAY, a lightweight detection model based on an improved YOLOv8 and designed around the practical demands of contraband detection in complex scenes. First, a cross-scale attention mechanism fuses information from high- and low-resolution feature layers, modeling global context and local detail jointly, which improves the saliency and detection accuracy of small objects against cluttered backgrounds and under occlusion. Second, depthwise separable convolutions replace standard convolutions, cutting redundant computation and parameter count and thereby lowering computational cost and easing deployment. Experiments on the SIXray dataset show that YOLO-XRAY improves mAP@50 by 5.8% and reduces the parameter count by 5.5% while keeping a lightweight structure. These results support the model's effectiveness for contraband recognition and indicate its potential for high-accuracy, low-latency detection in real security inspection settings. The study offers a practical approach to the lightweight design and deployment of intelligent security inspection systems and contributes to the development of intelligent public safety technology.
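To make the parameter saving from depthwise separable convolutions concrete, the following minimal sketch compares the weight count of a single standard convolution with that of its depthwise separable counterpart (a k×k depthwise convolution followed by a 1×1 pointwise convolution). The function names and the layer sizes (64 input channels, 128 output channels, 3×3 kernel) are illustrative assumptions, not values taken from the paper, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print(f"standard:            {std} weights")
    print(f"depthwise separable: {dws} weights")
    # The ratio equals 1/c_out + 1/k^2 for any layer of this form.
    print(f"ratio:               {dws / std:.4f}")
```

The per-layer ratio is 1/c_out + 1/k², so the saving for one replaced layer is large. The model-level reduction reported in the abstract is a separate quantity that depends on which layers are replaced, which the abstract does not specify.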
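Cite this article: Wei Yukang (魏裕康). YOLO-XRAY: Cross-Scale Attention and Depthwise Separable Convolutions for Real-Time X-Ray Contraband Detection [J]. Journal of Image and Signal Processing (图像与信号处理), 2026, 15(1): 144-153. https://doi.org/10.12677/jisp.2026.151012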
