A Machine Abnormal Sound Detection Based on Deep Learning
DOI: 10.12677/CSA.2023.1311208. Supported by research project funding.
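Authors: Zhu Peng (朱鹏), Li Chunling (黎春玲), Zheng Rongpu (郑荣璞), Liu Lin* (刘琳), Wei Xiqing (魏喜庆), Lü Pin (吕品): School of Electronic Information, Shanghai Dianji University, Shanghai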
Keywords: Abnormal Sound Detection, Deep Learning, Log-Mel Spectrum, MobileNetV2
Abstract: As large-scale industrial production expands, the health management of machinery and equipment is becoming increasingly important. Because machines can develop latent faults, abnormal machine sound detection does not yet safeguard industrial production as well as it should. Each machine emits a regular sound pattern while operating, and this regularity can be used to judge whether the machine is running normally. Based on a study of the sound characteristics of machines in operation, this paper proposes a deep-learning-based method for detecting abnormal machine sounds: sound features are extracted, a model is trained on them, and the model judges whether a machine is in an abnormal state so that faults can be caught before they develop. First, the data set is processed with an equal-height Mel filter bank to extract the log-Mel spectrum as the sound feature. Then, to address problems such as the lack of abnormal sound samples in practice, MobileNetV2 is used to train the sound model, and the anomaly score and anomaly threshold are computed from the logistic regression values output by the model. Comparative analysis shows that a model trained on features extracted from the raw audio achieves better abnormal machine sound detection performance.
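The feature-extraction and scoring steps named in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the frame size, hop size, number of Mel bands, the reading of "equal-height" as triangular filters that all peak at 1 without area normalization, the negative-logit form of the anomaly score, and the percentile rule for the threshold. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters; each triangle peaks at 1 (no area normalization)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, mid, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, 1.0)
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=64, eps=1e-10):
    """Return a (n_frames, n_mels) log-Mel spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel + eps)

def anomaly_score(p_correct, eps=1e-7):
    """Assumed score: mean negative logit of the classifier's probability
    for the correct class; higher means more anomalous."""
    p = np.clip(np.asarray(p_correct, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log((1.0 - p) / p)))

def threshold_from_normal(scores, percentile=90.0):
    """Assumed rule: take a high percentile of scores from normal clips."""
    return float(np.percentile(scores, percentile))

# Example: a 1-second synthetic 440 Hz tone stands in for a machine recording.
sr = 16000
clip = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
features = log_mel_spectrogram(clip, sr)
```

In a full pipeline the log-Mel frames would be fed to a MobileNetV2 classifier, and a clip would be flagged as abnormal when its score exceeds the threshold estimated from normal recordings.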
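Cite this article: Zhu Peng, Li Chunling, Zheng Rongpu, Liu Lin, Wei Xiqing, Lü Pin. A Machine Abnormal Sound Detection Based on Deep Learning [J]. Computer Science and Application, 2023, 13(11): 2089-2096. https://doi.org/10.12677/CSA.2023.1311208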
