Optimized Feature Selection Algorithm Based on ReliefF and Ant Colony for Cardiomyopathy Classification
DOI: 10.12677/SEA.2022.112029. Supported by the National Natural Science Foundation of China.
Authors: 孟朋辉, 黄凯雯, 徐磊*: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai
Keywords: MR Images, Image Reconstruction, Textural Features, Cardiomyopathy Classification, Improved ReliefF, Improved ACO
Abstract: This work recognizes cardiomyopathy patterns from MR images to support physicians in diagnosis. First, a myocardial image reconstruction method is proposed: guided by established medical findings on myocardial MR images, linear interpolation is used to reconstruct the original myocardium into a form that better expresses features along the circumferential (ring) direction. A total of 422 texture features are then extracted using the gray-level co-occurrence matrix (GLCM), the gray-level run-length matrix (GLRLM), local binary patterns (LBP), and histogram statistics. Second, to handle the high feature dimensionality, a joint feature selection method combining an improved ReliefF algorithm with an improved ACO algorithm is proposed. A distance coefficient based on Euclidean distance is introduced to improve ReliefF, and the resulting feature weights are combined with the Pearson correlation coefficient, recognition accuracy, and feature subset length to improve the pheromone update and pruning strategy of ACO. On four public high-dimensional gene datasets, the algorithm selects 0.4% of the features on average and reaches an average accuracy of 91.73%. Finally, from the texture features of the reconstructed and original images, the algorithm selects 6 features to identify three myocardial fibrosis patterns (normal, HCM, DCM). With an SVM classifier, it achieves 93.8% accuracy on the test set, so the model can assist physicians in diagnosing cardiomyopathy in clinical practice.
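To make the filter stage more concrete, the following is a minimal sketch of standard ReliefF feature weighting followed by a Pearson-based redundancy check. It is not the authors' improved algorithm: the abstract does not give the form of the Euclidean distance coefficient, so none is applied here, and a simple greedy filter stands in for the ACO search. The function names (`relieff`, `pearson`, `select`), the parameters `k`, `top`, and `max_corr`, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter


def relieff(X, y, k=2):
    """Standard ReliefF weights (higher = more relevant). Illustrative only."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # Euclidean distance over range-normalized features (an assumption).
        return math.sqrt(sum(diff(j, a, b) ** 2 for j in range(d)))

    priors = {c: cnt / n for c, cnt in Counter(y).items()}
    w = [0.0] * d
    for i in range(n):
        # Group all other samples by class label.
        by_class = {}
        for t in range(n):
            if t != i:
                by_class.setdefault(y[t], []).append(X[t])
        for c, members in by_class.items():
            near = sorted(members, key=lambda s: dist(X[i], s))[:k]
            # Nearest hits lower the weight; nearest misses raise it,
            # scaled by the prior of the miss class.
            scale = -1.0 if c == y[i] else priors[c] / (1.0 - priors[y[i]])
            for j in range(d):
                total = sum(diff(j, X[i], s) for s in near)
                w[j] += scale * total / (n * len(near))
    return w


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def select(X, weights, top=2, max_corr=0.95):
    """Greedy stand-in for the search stage: take features in weight order,
    skipping any that correlate strongly with an already chosen one."""
    cols = [[row[j] for row in X] for j in range(len(weights))]
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    chosen = []
    for j in order:
        if all(abs(pearson(cols[j], cols[c])) <= max_corr for c in chosen):
            chosen.append(j)
        if len(chosen) == top:
            break
    return chosen


# Toy data: feature 0 separates the classes, feature 1 is noise,
# feature 2 is a scaled copy of feature 0 (redundant).
X = [[0.1, 0.5, 0.2], [0.2, 0.9, 0.4], [0.0, 0.1, 0.0],
     [0.9, 0.4, 1.8], [1.0, 0.8, 2.0], [0.8, 0.2, 1.6]]
y = [0, 0, 0, 1, 1, 1]

weights = relieff(X, y, k=2)
selected = select(X, weights, top=2, max_corr=0.95)
print(weights, selected)
```

In this toy setting the discriminative feature receives a positive weight, the noise feature a negative one, and the redundant copy is excluded by the correlation check. In the paper the weights and correlations instead guide the pheromone update and pruning of the ACO search.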
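Citation: 孟朋辉, 黄凯雯, 徐磊. Optimized Feature Selection Algorithm Based on ReliefF and Ant Colony for Cardiomyopathy Classification [J]. Software Engineering and Applications, 2022, 11(2): 267-281. https://doi.org/10.12677/SEA.2022.112029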
