Supervised Channel Attention Learning in Dual-Branch Networks for Person Re-Identification
DOI: 10.12677/csa.2026.164143
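Authors: Wei Ziyi (魏子仪), Lu Junchi (芦俊池)*: School of Information Science and Engineering, Shenyang University of Technology, Shenyang, Liaoning; Yang Haibo (杨海波): School of Information Science and Engineering, Shenyang University of Technology, Shenyang, Liaoning; Key Laboratory of Advanced Computing and Information Technology Application Innovation (先进计算与信创技术重点实验室), Shenyang, Liaoning
Keywords: Person Re-Identification, Feature Fusion, Dual-Branch, Supervised Attention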
Abstract: Person re-identification (ReID) aims to identify specific individuals across camera views in images or video sequences. To address the core challenge that the proportion of the frame occupied by a person varies from image to image, this paper proposes a Dual-Branch Supervised Channel Attention Model (DSAM). DSAM embeds a parameter-free supervised attention module in the bottleneck blocks of ResNet-50 and uses the classification-layer weight matrix as the supervisory signal; channel weights are adjusted dynamically from the channel mean and variance, which suppresses background interference and focuses the network on key feature regions. A dual-branch network built on ResNet-50 takes the original image and a cropped image as its two inputs to make the subject's features more salient. CBAM (Convolutional Block Attention Module) blocks are introduced at the conv3 and conv4 layers, and a Feature Fusion Module (FFM) fuses their enhanced features across scales with the high-level semantic features of conv5, so that low-level detail and high-level semantics complement each other and the model becomes more robust to scale variation, viewpoint differences, and illumination changes. Experiments on the MSMT17, Market-1501, and DukeMTMC-ReID datasets show that the proposed method outperforms existing methods in both mAP and Rank-1.
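For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Rank-1 and mAP are commonly computed from a query-to-gallery distance matrix. It is not taken from the paper: the function name `rank1_and_map` and the toy data are hypothetical, and the sketch deliberately omits the camera-based filtering that standard ReID protocols apply (removing gallery images that share both identity and camera with the query).

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a distance matrix.

    dist[i][j] is the distance between query i and gallery item j;
    smaller means more similar. Simplified: no same-camera filtering.
    """
    hits = 0
    average_precisions = []
    for i, qid in enumerate(query_ids):
        # Rank gallery items from most to least similar to this query.
        order = sorted(range(len(gallery_ids)), key=lambda j: dist[i][j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # a query with no true match cannot be scored
        # Rank-1: is the top-ranked gallery item the correct identity?
        hits += 1 if matches[0] else 0
        # Average precision: mean of precision at each correct match.
        found = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query has a matching gallery item")
    n = len(average_precisions)
    return hits / n, sum(average_precisions) / n


# Toy example (hypothetical data): two queries, three gallery images.
toy_dist = [[0.4, 0.1, 0.3],
            [0.9, 0.2, 0.4]]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In the toy example the first query misses at rank 1 but finds its two matches at ranks 2 and 3, so Rank-1 penalizes it fully while mAP gives partial credit. That difference is why ReID results are usually reported with both metrics.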
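Citation: Wei Ziyi, Yang Haibo, Lu Junchi. Supervised Channel Attention Learning in Dual-Branch Networks for Person Re-Identification [J]. Computer Science and Application, 2026, 16(4): 439-452. https://doi.org/10.12677/csa.2026.164143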
