Infrared Human Pose Estimation for Embedded System
DOI: 10.12677/csa.2024.145121 (supported by research project funding)
Authors: Hou Yuqi*, Bai Yu: School of Electronic Information Engineering, Shenyang Aerospace University, Shenyang, Liaoning
Keywords: Human Pose Estimation, HRNet, Pruning, Transfer Learning
Abstract: Human pose estimation accuracy degrades severely in low-illumination environments, and large model parameter counts make deployment on embedded devices inefficient. To address both problems, this paper designs a lightweight human pose estimation system for the Jetson Xavier NX platform that uses an infrared camera, and proposes a pose estimation method based on HRNet. First, a residual module is introduced and transfer learning is performed between the visible and infrared domains. Second, HRNet is combined with a channel pruning module that removes redundancy within the branches and enables scale-aware feature fusion at low overhead. The output keypoint heatmaps are then used for simple action classification. Finally, the model is optimized with TensorRT to increase inference speed and deployed on the Jetson Xavier NX embedded platform. Experimental results show that the improved model better meets the real-time requirements of embedded devices: it has 46% fewer parameters than the original model, achieves higher detection accuracy than models of similar size, and maintains an mAP above 74.2%. After TensorRT acceleration, the system reaches a detection speed of 33 fps.
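The abstract does not state which criterion the channel pruning module uses to decide which channels are redundant. As a rough illustration of the general idea only, the sketch below assumes a simple L1-norm ranking of output channels, a common baseline for channel pruning. The function names, the `keep_ratio` parameter, and the toy weights are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def channel_l1_norms(filters: Sequence[Sequence[float]]) -> List[float]:
    """L1 norm of each output channel's flattened filter weights."""
    return [sum(abs(w) for w in channel) for channel in filters]


def select_channels(filters: Sequence[Sequence[float]], keep_ratio: float) -> List[int]:
    """Indices of the channels to keep, ranked by L1 norm (assumed criterion)."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    norms = channel_l1_norms(filters)
    n_keep = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    # Return the kept indices in their original order.
    return sorted(ranked[:n_keep])


def prune_channels(filters: Sequence[Sequence[float]], keep_ratio: float) -> List[List[float]]:
    """Drop the output channels with the smallest L1 norm."""
    return [list(filters[i]) for i in select_channels(filters, keep_ratio)]


def param_count(filters: Sequence[Sequence[float]]) -> int:
    """Total number of weights across all channels."""
    return sum(len(channel) for channel in filters)


if __name__ == "__main__":
    # Toy layer: 4 output channels, 3 weights each (hypothetical values).
    toy_filters = [
        [0.9, -0.8, 0.7],
        [0.01, 0.02, -0.01],
        [0.5, 0.5, -0.5],
        [0.0, 0.03, 0.0],
    ]
    pruned = prune_channels(toy_filters, keep_ratio=0.5)
    print("kept channels:", select_channels(toy_filters, 0.5))
    print("parameters:", param_count(toy_filters), "->", param_count(pruned))
```

In a real network, removing an output channel also requires removing the matching input channel from the following layer, and the pruned model is normally fine-tuned afterwards; both steps are omitted here.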
Citation: Hou Yuqi, Bai Yu. Infrared Human Pose Estimation for Embedded System [J]. Computer Science and Application, 2024, 14(5): 126-136. https://doi.org/10.12677/csa.2024.145121
