Multiscale X-Ray Contraband Detection Incorporating Bidirectional Routing Attention
DOI: 10.12677/CSA.2024.143060. Supported by the National Natural Science Foundation of China.
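Authors: Ruoxuan Wang (王若璇), Ye Li (李野), Peng Zhao (赵鹏): School of Physics, Changchun University of Science and Technology, Changchun, Jilin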
Keywords: X-Ray Images; Bidirectional Routing Attention; Small Target Detection Layer; YOLOv7
Abstract: To address complex background interference, overlap and occlusion between objects, and multi-scale variation in contraband detection, an X-ray contraband detection algorithm based on an improved YOLOv7 is proposed. First, MBConv is introduced into the backbone to capture global information more effectively. Second, an RFE module is added to the feature fusion network to enlarge the receptive field of the feature maps and thereby improve the accuracy of multi-scale contraband detection. An ELAN-BiF module is also designed to suppress complex background interference and enable the network to extract features of items at different scales. To improve detection accuracy for small objects, a tiny-object detection head is added. Finally, CARAFE upsampling and the Mish activation function are combined to improve the network's ability to recognize overlapping and occluded objects and to strengthen detection when positive and negative samples are imbalanced. Tested on the SIXray_OOD dataset, the improved model achieves an mAP of 95.2%, 4.9% higher than the original model, and outperforms other mainstream detection models on the contraband detection task.
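For readers unfamiliar with the Mish activation named in the abstract, it is commonly defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The following is a minimal pure-Python sketch of that definition only. It is not the authors' implementation, and the function names `softplus` and `mish` are illustrative choices rather than identifiers from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x).

    Uses the identity ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)),
    which avoids overflow for large positive x.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    # Print a few sample values to show the shape of the function:
    # near-zero for large negative inputs, near-identity for large positive ones.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, this function is smooth and lets small negative values pass through. In a detection network it would normally be applied element-wise to tensors through a deep learning framework rather than to scalars as shown here.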
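Citation: Wang Ruoxuan, Li Ye, Zhao Peng. Multiscale X-Ray Contraband Detection Incorporating Bidirectional Routing Attention [J]. Computer Science and Application, 2024, 14(3): 78-95. https://doi.org/10.12677/CSA.2024.143060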
