Research on Lightweight Design and Edge Deployment of U-Shaped Networks
DOI: 10.12677/csa.2026.162058
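Authors: Feng Jiaxiang (冯佳祥), Liu Yang (刘洋), Zhao Yiding (赵一丁), Huang Mengxuan (黄孟轩), Yu Xinxin (于欣鑫): School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun, Jilin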
Keywords: RK3588, MALUNet, NPU, Quantized Deployment
Abstract: Existing medical image segmentation models have large parameter counts and high computational complexity, which makes them difficult to deploy efficiently on resource-constrained edge devices. To address this, this paper proposes a lightweight U-shaped network design and quantized deployment scheme for the domestically produced Rockchip RK3588 development board. Building on the U-Net architecture, the study first introduces Dilated Gated Attention (DGA), Inverted External Attention (IEA), and feature bridging modules to construct a lightweight network, MALUNet, that balances feature extraction capability against computational cost. The model is then further compressed by combining one-shot layer pruning with normalized knowledge distillation, and rknn-toolkit2 is used to complete quantized deployment on the NPU. Experimental results on the ISIC2017 dataset show that the optimized MALUNetGlobalAtt student model reduces single-sample inference time by 96% compared with the original model while maintaining high segmentation accuracy (mIoU of 0.8126). This supports the feasibility and advantages of the scheme for real-time intelligent analysis of medical images on domestic edge computing platforms.
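The abstract reports accuracy as mIoU without giving the exact formula. As a minimal sketch only, the snippet below assumes mIoU is the per-class intersection-over-union averaged over the background and lesion classes of a binary mask; the paper's own evaluation code may define it differently (for example, as lesion-class IoU only). The function name `mean_iou` and the toy masks are illustrative and do not come from the paper.

```python
import numpy as np


def mean_iou(pred, target, num_classes=2):
    """Average per-class IoU between two integer label masks.

    Assumption: classes are labeled 0..num_classes-1, and a class that is
    absent from both masks is skipped rather than counted as IoU 0 or 1.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c appears in neither mask
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)

    # No class present at all (e.g. labels outside the expected range)
    return float(np.mean(ious)) if ious else float("nan")


# Toy 3x3 example: 1 = lesion, 0 = background (hypothetical data)
pred = np.array([[0, 1, 1],
                 [0, 1, 0],
                 [0, 0, 0]])
target = np.array([[0, 1, 1],
                   [0, 1, 1],
                   [0, 0, 0]])

print(f"mIoU = {mean_iou(pred, target):.4f}")
```

In the toy example the lesion class has intersection 3 and union 4, and the background class has intersection 5 and union 6, so the two per-class IoUs are 3/4 and 5/6 before averaging.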
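Cite this article: Feng, J., Liu, Y., Zhao, Y., Huang, M. and Yu, X. (2026) Research on Lightweight Design and Edge Deployment of U-Shaped Networks. Computer Science and Application, 16(2), 277-288. https://doi.org/10.12677/csa.2026.162058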
