Research on Brain Tumor Medical Image Classification Based on Diffusion Self-Supervised Representation Learning
Abstract: This paper proposes MicDiffRep (Medical Image Classification with Diffusion-based Representation), a medical image classification method based on self-supervised representation learning with diffusion models. Pre-training a diffusion model lets the network learn both the fine texture details and the overall structure of medical images, so that these detailed features can be fully captured during classification. To also exploit the global information in the image, the paper introduces a Multi-Scale Feature Aggregation (MSFA) module that aggregates the features from the layers of MicDiffRep at different scales. Experiments on a brain tumor image classification dataset show that the method improves linear classification accuracy by up to 6 percentage points over the best existing self-supervised methods.
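The idea of aggregating features from layers at different scales and scoring them with a linear classifier can be illustrated with a minimal sketch. The abstract does not say how MSFA combines the features, so this example assumes the simplest option: global average pooling of each layer's feature map followed by concatenation. The function names (`global_average_pool`, `aggregate_multi_scale`, `linear_classify`) and the toy feature maps are hypothetical and are not taken from the paper.

```python
from typing import List

# A feature map is stored as channels x height x width (nested lists).
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Reduce a C x H x W feature map to a length-C vector of channel means."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def aggregate_multi_scale(feature_maps: List[FeatureMap]) -> List[float]:
    """Assumed aggregation: pool every scale separately, then concatenate."""
    aggregated: List[float] = []
    for feature_map in feature_maps:
        aggregated.extend(global_average_pool(feature_map))
    return aggregated


def linear_classify(features: List[float],
                    weights: List[List[float]],
                    biases: List[float]) -> int:
    """Linear probe: return the index of the class with the highest score."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy stand-ins for frozen encoder features (not real model outputs):
# a shallow layer with 2 channels at 4x4 and a deep layer with 3 channels at 2x2.
shallow = [[[1.0] * 4 for _ in range(4)], [[3.0] * 4 for _ in range(4)]]
deep = [[[0.0, 2.0], [4.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]

features = aggregate_multi_scale([shallow, deep])  # one vector of 2 + 3 values
weights = [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0]]  # 2 toy classes
predicted_class = linear_classify(features, weights, [0.0, 0.0])
```

Pooling before concatenation keeps the aggregated vector length equal to the total channel count, regardless of each layer's spatial resolution, which is what allows layers at different scales to feed a single linear classifier in this sketch.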
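Cite this article: Zhu Zeyu, Zhao Shuguang. Research on Brain Tumor Medical Image Classification Based on Diffusion Self-Supervised Representation Learning [J]. Computer Science and Application, 2024, 14(4): 133-140. https://doi.org/10.12677/csa.2024.144084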
