Anomaly Sound Detection Algorithm Based on ResNeXt
DOI: 10.12677/mos.2024.136574 (supported by research project funding)
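Authors: Zhang Xuan, Tang Jiashan: School of Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu; Zhou Zhengkang: Nanjing Urban Construction Tunnel and Bridge Intelligent Management Co., Ltd., Nanjing, Jiangsu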
Keywords: Anomaly Sound Detection, Unsupervised, Deep Learning
Abstract: This paper proposes a new approach to anomalous sound detection that combines a ResNeXt neural network, an improved loss function called SCAdaCos, and a Gaussian mixture model (GMM) for the anomaly decision. The method is evaluated on five machine types (fan, pump, slider, valve, and ToyCar) and is trained using only normal sound data. The architecture extracts log-mel features from the audio signal and uses the grouped convolutions of the ResNeXt model to learn features efficiently, which strengthens its ability to represent complex patterns. The SCAdaCos loss function introduces sub-cluster adaptivity: each class can be represented by multiple centers rather than a single one, which overcomes the limitation of single-center representation and improves the precision of the learned representations. The GMM is then fitted to the learned embeddings; the negative log-likelihood serves as the anomaly score, and the 90th percentile is used as the decision threshold. Compared with current state-of-the-art algorithms, the proposed method improves the average AUC by 2.43% and the average pAUC by 6.27%, demonstrating its effectiveness across different machine types.
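The scoring rule described in the abstract (negative log-likelihood under a GMM, thresholded at the 90th percentile) can be illustrated with a minimal sketch. It assumes a diagonal-covariance GMM with hand-picked parameters (`WEIGHTS`, `MEANS`, `VARIANCES`), synthetic 2-D points standing in for ResNeXt embeddings, and a threshold computed over the scores of normal training data. None of these specifics come from the paper, and the helper names `gmm_nll`, `percentile`, and `is_anomalous` are hypothetical.

```python
import math
import random


def gmm_nll(x, weights, means, variances):
    """Negative log-likelihood of one embedding under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


# Hypothetical GMM parameters (illustrative only, not fitted values from the paper).
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[0.25, 0.25], [0.25, 0.25]]

# Synthetic "normal" embeddings drawn around the two component means.
rng = random.Random(0)
normal_embeddings = []
for i in range(200):
    mu, var = MEANS[i % 2], VARIANCES[i % 2]
    normal_embeddings.append(
        [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
    )

# Anomaly score = negative log-likelihood; threshold = 90th percentile of
# the scores of normal data (an assumption about where the percentile is taken).
normal_scores = [gmm_nll(e, WEIGHTS, MEANS, VARIANCES) for e in normal_embeddings]
threshold = percentile(normal_scores, 90)


def is_anomalous(embedding):
    """Flag an embedding whose score exceeds the threshold."""
    return gmm_nll(embedding, WEIGHTS, MEANS, VARIANCES) > threshold


print("threshold:", round(threshold, 3))
print("far point anomalous:", is_anomalous([10.0, 10.0]))
print("center point anomalous:", is_anomalous([0.0, 0.0]))
```

With this choice of threshold, roughly 10% of the normal data is flagged by construction, so the percentile trades missed anomalies against false alarms; in practice the GMM would be fitted to real embeddings rather than specified by hand.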
Cite this article: Zhang Xuan, Zhou Zhengkang, Tang Jiashan. Anomaly Sound Detection Algorithm Based on ResNeXt[J]. Modeling and Simulation, 2024, 13(6): 6274-6282. https://doi.org/10.12677/mos.2024.136574
