SOTA for Systemic Lesion Detection Based on 3D Supervised Pre-Training
Abstract: Medicine and computer science are becoming increasingly integrated, and a growing number of research teams apply advanced image processing techniques to the classification, segmentation, detection, registration, and reconstruction of medical images. Such teams remain a minority, however. Most medical institutions still rely on physicians to read images for diagnosis and result analysis, which lowers hospital efficiency, lengthens diagnosis time, and burdens patients both mentally and physically. We therefore combine a pre-trained 3D deep learning model, 3D modeling, and convolutional neural networks to build a system that directly performs image classification, segmentation, and lesion detection. For 2D lesion detection on CT slices, this study proposes a new framework that makes effective use of 3D contextual information, together with a new approach to pre-training 3D convolutional neural networks. Experiments on NIH DeepLesion, the largest CT image dataset to date, yield state-of-the-art (SOTA) lesion detection results. Supervised pre-training speeds up the convergence of 3D model training and improves model accuracy on small datasets, which in turn improves the speed and accuracy of disease analysis and overall medical efficiency. The main contributions of this paper are as follows.
1) A preprocessing method for lesion detection with 3D convolutional models. CT images are converted into three-channel pseudo-color images so that pixel values are normalized to a common range, which reduces the morphological differences between lesion types. In addition, the anchor box aspect ratios are optimized using prior knowledge of the bounding box sizes of the different lesion types.
2) A general and efficient network framework that strengthens 3D context modeling. An improved pseudo-3D framework (MP3D) efficiently extracts 3D context features from consecutive input slices. A group convolution transform module then converts the 3D features into 2D features before they enter the detection head, which adapts them to the 2D detection task while ensuring that the model retains its 3D context modeling capability throughout.
3) A supervised pre-training method that improves the training and convergence of MP3D. The method is based on a variable dimension transform: the channel dimension in 2D space is mapped to the depth dimension in 3D space, so that an RGB image with three color channels becomes three consecutive slices in 3D space. This allows 2D natural images to be used effectively for pre-training 3D models.
The proposed methods are evaluated qualitatively and quantitatively on the DeepLesion dataset. The results show that they can serve as an auxiliary detection method for multiple lesion types in CT images and thus promote the clinical application of CT.
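Contributions 1) and 3) can be illustrated with a short NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the three pseudo-color channels are built, so the sketch assumes each channel is the same slice clipped to a different Hounsfield-unit (HU) window. The window values in `ASSUMED_WINDOWS`, the function names `ct_to_pseudo_color` and `channels_to_depth`, and the array layouts are all hypothetical.

```python
import numpy as np

# Hypothetical (low, high) HU windows, one per output channel.
# These values are illustrative only; the abstract does not give them.
ASSUMED_WINDOWS = ((-160.0, 240.0), (-1350.0, 150.0), (-500.0, 1300.0))


def ct_to_pseudo_color(hu_slice, windows=ASSUMED_WINDOWS):
    """Map one CT slice in HU, shape (H, W), to a (3, H, W) image in [0, 1].

    Each channel clips the slice to one window and rescales it linearly,
    so all channels share the same value range.
    """
    hu = np.asarray(hu_slice, dtype=np.float32)
    channels = []
    for low, high in windows:
        clipped = np.clip(hu, low, high)
        channels.append((clipped - low) / (high - low))
    return np.stack(channels, axis=0)


def channels_to_depth(images):
    """Reinterpret the channel axis of 2D images as the depth axis of a 3D volume.

    Input:  (N, C, H, W), e.g. RGB natural images with C = 3.
    Output: (N, 1, C, H, W), i.e. C consecutive single-channel slices.
    """
    images = np.asarray(images)
    if images.ndim != 4:
        raise ValueError("expected a batch shaped (N, C, H, W)")
    return images[:, np.newaxis, :, :, :]


if __name__ == "__main__":
    # A tiny synthetic slice standing in for real CT data.
    slice_hu = np.array([[-2000.0, 0.0], [100.0, 3000.0]])
    pseudo = ct_to_pseudo_color(slice_hu)
    print(pseudo.shape)

    # A batch of two random "RGB" images standing in for natural images.
    rgb_batch = np.random.rand(2, 3, 8, 8)
    volume = channels_to_depth(rgb_batch)
    print(volume.shape)
```

The dimension transform copies no data: it only adds an axis, so each color channel is read as one slice of a three-slice volume. A 3D network pre-trained on such inputs then sees the same tensor layout as three consecutive CT slices.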
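Cite this article: Zhou, J.Y., Jia, C.W. and Yuan, J.Y. (2022) SOTA for Systemic Lesion Detection Based on 3D Supervised Pre-Training. Computer Science and Application, 12(12), 2916-2924. https://doi.org/10.12677/CSA.2022.1212296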
