Person Re-Identification Method Based on Graph Neural Network
DOI: 10.12677/CSA.2021.114101. Supported by the National Natural Science Foundation of China.
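Authors: Zhifeng Hao (郝志峰): School of Mathematics and Big Data, Foshan University, Foshan, Guangdong; School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong. Weigen Su (苏伟根), Boyan Xu (许柏炎), Wen Wen (温雯), Ruichu Cai (蔡瑞初): School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong.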
Keywords: Person Re-Identification; Graph Metric Learning; Human Semantic Parsing; Part Alignment
Abstract: Deep learning-based person re-identification must cope with pedestrian misalignment caused by pose variation, cluttered backgrounds, large differences in camera viewpoint, and partial occlusion. Extracting fine-grained, highly discriminative features is therefore the key to the problem. This paper proposes a new person re-identification method based on a graph neural network, which: 1) uses human semantic parsing results to extract fine-grained local features, builds a part relation graph from them, and learns a fine-grained graph representation with a graph neural network; 2) jointly optimizes the network with a graph metric learning method to learn highly discriminative feature representations. The proposed method is compared experimentally with state-of-the-art person re-identification methods on mainstream evaluation datasets, and the results demonstrate its effectiveness.
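The two steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the architecture, so everything concrete here is an assumption: the function names, masked average pooling for part features, a single Kipf-and-Welling-style graph convolution, a mean readout, and a plain triplet margin loss standing in for the graph metric learning objective.

```python
import numpy as np

def part_features(feature_map, part_masks, eps=1e-6):
    """Masked average pooling: one feature vector per body part.

    feature_map: (C, H, W) backbone features (hypothetical input).
    part_masks:  (K, H, W) soft masks from a human parsing model.
    Returns a (K, C) matrix of part-level features (graph nodes).
    """
    weighted = np.einsum("chw,khw->kc", feature_map, part_masks)
    area = part_masks.sum(axis=(1, 2))[:, None]
    return weighted / (area + eps)

def normalized_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(nodes, adj_norm, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ nodes @ weight, 0.0)

def graph_embedding(feature_map, part_masks, adj, weight):
    """Part features -> one GCN layer -> mean readout over nodes."""
    nodes = part_features(feature_map, part_masks)
    hidden = gcn_layer(nodes, normalized_adjacency(adj), weight)
    return hidden.mean(axis=0)

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet margin loss on graph embeddings (assumed objective)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_parts, channels, height, width, out_dim = 5, 16, 8, 4, 8
    # Assumption: a fully connected part relation graph.
    adj = np.ones((num_parts, num_parts)) - np.eye(num_parts)
    weight = rng.normal(size=(channels, out_dim))

    def random_person():
        fmap = rng.normal(size=(channels, height, width))
        masks = rng.random(size=(num_parts, height, width))
        return graph_embedding(fmap, masks, adj, weight)

    anchor, positive, negative = (random_person() for _ in range(3))
    print("triplet loss:", triplet_loss(anchor, positive, negative))
```

In this sketch the parsing masks decide which pixels contribute to each node, so two images of the same person are compared part by part rather than pixel by pixel, which is the alignment idea the abstract describes. In practice the weights would be trained end to end in a deep learning framework rather than sampled at random.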
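Cite this article: Hao Zhifeng, Su Weigen, Xu Boyan, Wen Wen, Cai Ruichu. Person Re-Identification Method Based on Graph Neural Network [J]. Computer Science and Application, 2021, 11(4): 983-993. https://doi.org/10.12677/CSA.2021.114101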
