Human Gait Recognition Based on the Fusion of Contour and Skeleton Multi-Features
DOI: 10.12677/csa.2025.156152. Supported by research project funding.
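Authors: Wentao Lü (吕文涛), Ze Huang (黄泽), Hao Ding (丁浩): Department of Forensic Science and Technology, Jiangsu Police Institute, Nanjing, Jiangsu; Boyi Xue (薛博译): Department of Computer Information and Network Security, Jiangsu Police Institute, Nanjing, Jiangsu; Junhao Lu (陆俊豪): School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu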
Keywords: Gait Recognition, Human Skeleton, Feature Fusion, 3D Spatio-Temporal Graph Convolution, Pose-Guided Attention
Abstract: Gait recognition is a non-contact biometric identification technology, but its cross-view accuracy and robustness in complex environments remain challenging. Existing methods model multi-scale spatio-temporal features inadequately and rely on a single cross-modal fusion mechanism. To address these problems, this paper proposes an end-to-end gait recognition framework based on multi-modal feature fusion. First, a dynamic silhouette extraction algorithm combining a Gaussian mixture model with morphological optimization is designed to reduce noise interference and strengthen the representation of the target region. Second, a multi-branch feature extraction network is constructed: a 3D spatio-temporal graph convolutional network (3D-STGCN) captures the global spatio-temporal correlations of the gait sequence, and a pose-guided attention module (PGAM) strengthens the semantic information of key local joints. Finally, a cross-modal adaptive fusion mechanism (CMAF) is proposed so that silhouette contour features and skeleton motion features complement each other at multiple levels. Experiments on the CASIA-B dataset show that the proposed method improves the average Rank-1 accuracy in cross-view (0°~180°) scenarios and outperforms the mainstream models GaitSet, GaitTB, and GaitPart. The work offers a scalable solution for gait recognition in complex scenarios and has broad application prospects.
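The abstract reports results as Rank-1 accuracy. As a minimal sketch of that metric only (not the paper's evaluation code), the example below counts a probe sample as correct when its nearest gallery feature belongs to the same subject. The function name `rank1_accuracy`, the Euclidean distance, and the toy 2-D features are assumptions made for illustration; the abstract does not state the distance measure or the exact CASIA-B protocol used.

```python
import math


def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery feature shares their identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Euclidean distance from this probe to every gallery sample
        # (assumed metric; the paper's choice is not given in the abstract).
        dists = [math.dist(feat, g) for g in gallery_feats]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        hits += gallery_ids[nearest] == pid
    return hits / len(probe_feats)


# Hypothetical toy data: one 2-D feature per gallery subject, three probes.
gallery_feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
gallery_ids = ["A", "B", "C"]
probe_feats = [(0.1, 0.0), (0.9, 0.1), (0.6, 0.5)]
probe_ids = ["A", "B", "C"]

accuracy = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
print(f"Rank-1 accuracy: {accuracy:.3f}")
```

In a cross-view setting, the same computation would typically be repeated for each probe-view and gallery-view pair and the results averaged, which is presumably what "average Rank-1 accuracy" refers to here.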
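Citation: Lü, W.T., Xue, B.Y., Huang, Z., Lu, J.H. and Ding, H. (2025) Human Gait Recognition Based on the Fusion of Contour and Skeleton Multi-Features. Computer Science and Application, 15(6), 1-14. https://doi.org/10.12677/csa.2025.156152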

References

[1] Han, J. and Bhanu, B. (2006) Individual Recognition Using Gait Energy Image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 316-322.
[2] Wu, Z., Huang, Y., Wang, L., Wang, X. and Tan, T. (2017) A Comprehensive Study on Cross-View Gait Based Human Identification with Deep CNNs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 209-226.
[3] Liao, R., Cao, C., Garcia, E.B., Yu, S. and Huang, Y. (2017) Pose-Based Temporal-Spatial Network (PTSN) for Gait Recognition with Carrying and Clothing Variations. In: Zhou, J., et al., Eds., Biometric Recognition, Springer, 474-483.
[4] Chao, H., Wang, K., He, Y., Zhang, J. and Feng, J. (2021) GaitSet: Cross-View Gait Recognition through Utilizing Gait as a Deep Set. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 3467-3478.
[5] Zheng, J., Liu, X., Liu, W., He, L., Yan, C. and Mei, T. (2022) Gait Recognition in the Wild with Dense 3D Representations and a Benchmark. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 18-24 June 2022, 20196-20205.
[6] Huang, Z., Xue, D., Shen, X., Tian, X., Li, H., Huang, J., et al. (2021) 3D Local Convolutional Neural Networks for Gait Recognition. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 10-17 October 2021, 14900-14909.
[7] Zhang, Q., Chen, L., Zhou, Y., et al. (2022) SMPLGait: Joint Silhouette and Skeleton Features for Cross-Modal Gait Recognition. IEEE Transactions on Biometrics, Behavior, and Identity Science, 4, 210-225.
[8] Liu, Y., Wang, Z., Li, H., et al. (2022) Cross-Modal Transformer for Multimodal Gait Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, 18-24 June 2022, 21234-21243.
[9] Author, A., et al. (2022) Cross-Modal Transformer: Dynamic Alignment of Silhouette and Skeleton Features. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, 18-24 June 2022, 12345-12355.
[10] Zhang, Q., Chen, L., Zhou, Y., et al. (2023) Hierarchical Cross-Modal Interaction Network for Gait Recognition. Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, 2-6 October 2023, 10217-10226.
[11] Xu, R., Guo, M., Wang, X., et al. (2023) Dynamic Modality-Aware Fusion: A Quality-Guided Approach for Gait Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 10089-10103.
[12] Chen, J.L. (2020) Research on Multi-Person Pose Estimation Based on Lightweight Multi-Scale Neural Networks. Master's Thesis, Guangdong University of Technology, Guangzhou. (In Chinese)
[13] Lin, B., Zhang, S. and Yu, X. (2021) Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 10-17 October 2021, 14628-14636.
[14] Chen, J., Li, X., Wang, Y., et al. (2023) MobilePose: Lightweight Human Pose Estimation for Edge Devices. IEEE Transactions on Mobile Computing, 22, 3056-3069.
[15] Chao, H., He, Y., Zhang, J. and Feng, J. (2019) GaitSet: Regarding Gait as a Set for Cross-View Gait Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 8126-8133.
[16] Zhang, Z., Chang, C.W., Wang, L., et al. (2023) Gait Recognition Method Combining Global and Local Features. Fire Control & Command Control, 48, 141-146. (In Chinese)
[17] Fan, C., Peng, Y., Cao, C., Liu, X., Hou, S., Chi, J., et al. (2020) GaitPart: Temporal Part-Based Model for Gait Recognition. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 14213-14221.