Image Classification Method for Tooth-Marked Tongue in Traditional Chinese Medicine Based on High-Resolution Feature Preservation Network
DOI: 10.12677/tcm.2026.155270. Supported by research project funding.
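Authors: Ma Yuming (马玉明): School of Network Security and Information Technology, Yili Normal University, Yining, Xinjiang; Wang Zhenhua (王振华)*: School of Network Security and Information Technology, Yili Normal University, Yining, Xinjiang; Yili Key Laboratory of Intelligent Computing Research and Application, Yining, Xinjiang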
Keywords: Traditional Chinese Medicine (TCM) Tongue Diagnosis, Tooth-Marked Tongue Recognition, Global Max Pooling, Image Classification
Abstract: The tooth-marked tongue is an important objective sign in Traditional Chinese Medicine (TCM) that reflects pathological states such as spleen deficiency and internal retention of water-dampness. Tooth marks are small morphological features confined to the edge of the tongue body. When extracting them, existing deep learning models often suffer from a "spatial smoothing effect" caused by excessive network depth, and Global Average Pooling (GAP) tends to dilute weak abnormal signals, which can lead to missed diagnoses in clinical use. To address these issues, this paper proposes a tooth-marked tongue image classification method based on a High-Resolution Feature Preservation Network (HRFP-Net). The method introduces an Early-Exit mechanism that truncates the lightweight backbone (ConvNeXt-Tiny) at a shallow layer to preserve high-resolution edge morphology, and it uses Global Max Pooling (GMP) as a peak-signal detector that locates and extracts locally abnormal activations. Experiments on the public Tooth-Marked dataset show that HRFP-Net mitigates the masking of edge features, achieving an accuracy of 94.42%, a precision of 97.06%, an F1-score of 93.40%, and an AUC of 97.73%, significantly outperforming mainstream CNN and Transformer models. The method reduces network redundancy while capturing subtle tooth-mark signs with high sensitivity, offering an efficient and robust approach to objective computer-aided diagnosis in TCM.
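The pooling argument in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the feature map, the helper names (`make_map`, `global_avg_pool`, `global_max_pool`, `convnext_tiny_stage_sizes`) and the 14x14 size are all hypothetical, chosen only to show why averaging dilutes a small edge-localized activation while max pooling keeps it. The stage sizes assume the standard ConvNeXt-Tiny layout (overall strides 4, 8, 16, 32); the abstract does not state at which stage HRFP-Net exits.

```python
# Illustrative sketch only: a toy single-channel feature map, not the authors' code.

def global_avg_pool(fmap):
    """Mean over all spatial positions of a 2-D feature map (GAP)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def global_max_pool(fmap):
    """Maximum over all spatial positions of a 2-D feature map (GMP)."""
    return max(v for row in fmap for v in row)


def convnext_tiny_stage_sizes(input_size=224):
    """Spatial side length after each of the four ConvNeXt-Tiny stages.

    Assumes the standard design: the stem downsamples by 4 and each later
    stage by a further 2, giving overall strides of 4, 8, 16 and 32.
    """
    return [input_size // stride for stride in (4, 8, 16, 32)]


def make_map(size, peak_positions, peak_value=1.0):
    """Hypothetical helper: an all-zero size x size map with a few peaks."""
    fmap = [[0.0] * size for _ in range(size)]
    for r, c in peak_positions:
        fmap[r][c] = peak_value
    return fmap


# Hypothetical 14x14 map with two strong activations on the left edge,
# standing in for a small tooth mark; the "normal" map has none.
marked = make_map(14, [(6, 0), (7, 0)])
normal = make_map(14, [])

print("GAP marked:", global_avg_pool(marked))  # 2/196, roughly 0.0102
print("GAP normal:", global_avg_pool(normal))  # 0.0
print("GMP marked:", global_max_pool(marked))  # 1.0
print("GMP normal:", global_max_pool(normal))  # 0.0
print("Stage sizes for 224 input:", convnext_tiny_stage_sizes(224))
```

In this toy setting the GAP outputs for the marked and normal maps differ by only about 0.01, whereas the GMP outputs differ by 1.0. The stage sizes show why exiting earlier leaves a finer spatial grid on which such a small peak can still be localized.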
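Cite this article: Ma Yuming, Wang Zhenhua. Image Classification Method for Tooth-Marked Tongue in Traditional Chinese Medicine Based on High-Resolution Feature Preservation Network [J]. 中医学 (Traditional Chinese Medicine), 2026, 15(5): 207-214. https://doi.org/10.12677/tcm.2026.155270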
