Otitis Media Image Classification and Recognition Model Based on Improved MobileNetV2
DOI: 10.12677/mos.2024.132170. Supported by research project funding.
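Authors: Hu Jiangjing (胡江婧), Zhang Xuedian (张学典): Key Laboratory of Medical Optical Technology and Instruments, Ministry of Education, University of Shanghai for Science and Technology, Shanghai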
Keywords: Otitis Media Classification, MobileNetV2, Coordinate Attention, Feature Fusion, HardSwish
Abstract: Accurate diagnosis of otitis media is an important prerequisite for preventing ear disease from worsening. To address the small datasets, large parameter counts, low recognition accuracy, and excessive computation found in existing otitis media research, we propose an otitis media classification method based on an improved MobileNetV2. First, a coordinate attention mechanism is embedded in the inverted residual structure of MobileNetV2 to strengthen the network's ability to refine otitis media image features. Second, an improved attentional feature fusion module (Iterative Attention Feature Fusion) replaces the simple addition of the original features, strengthening the model's ability to extract features at different scales in a cross-channel context. In addition, the HardSwish activation function replaces the original ReLU6 to improve robustness. Finally, the number of channels in the bottleneck layers is reduced to simplify the model structure. Experimental results show that the proposed CIH-MobileNetV2 model achieves a recognition accuracy of 91.05% and an F1 score of 89.06% on the otitis media dataset, improvements of 2.31% and 3.69%, respectively, over the original MobileNetV2, while the parameter count is reduced by 43%. The model also attains higher recognition accuracy and F1 score than classical networks such as AlexNet, GoogleNet, VGG16, ResNet50, MobileNetV3, and ShuffleNetV2. The proposed model can therefore classify otitis media types well and provide effective support for otitis media diagnosis.
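The activation swap mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so the code below assumes the commonly used definitions: ReLU6(x) = min(max(x, 0), 6) and HardSwish(x) = x * ReLU6(x + 3) / 6. The function names `relu6` and `hard_swish` are illustrative and are not taken from the paper's code.

```python
def relu6(x: float) -> float:
    """ReLU6 (assumed standard definition): clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_swish(x: float) -> float:
    """HardSwish (assumed standard definition): x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    # Compare both activations on a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  relu6={relu6(x):.4f}  hard_swish={hard_swish(x):.4f}")
```

Under these definitions HardSwish is zero for inputs at or below -3, equals the identity for inputs at or above 3, and is slightly negative in between for negative inputs, whereas ReLU6 discards all negative inputs and saturates at 6.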
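Cite this article: Hu, J. and Zhang, X. (2024) Otitis Media Image Classification and Recognition Model Based on Improved MobileNetV2. Modeling and Simulation (建模与仿真), 13(2), 1814-1829. https://doi.org/10.12677/mos.2024.132170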
