A GCC Time Delay Estimation Algorithm Based on Noise-Adaptive Weighting

doi:10.12677/jisp.2026.152014


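Authors: 甘坦 (Gan Tan)^*, 康绍绕 (Kang Shaorao), 熊林 (Xiong Lin); School of Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi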
Keywords: Sound Source Localization; Generalized Cross-Correlation (GCC); Time Delay Estimation; Adaptive Weighting

Abstract: In strong noise, the performance of traditional Generalized Cross-Correlation (GCC) time delay estimation methods degrades sharply: noise distorts the cross-correlation function of the signals, raising the sidelobes and broadening the main lobe, so the delay read from the correlation peak becomes biased. GCC mitigates the impact of noise by introducing a frequency-domain weighting function. PHAT weighting, for instance, applies phase normalization to reduce the influence of signal amplitude on the delay estimate. However, PHAT treats all frequency components equally and cannot distinguish high-SNR bands that carry the signal from low-SNR bands dominated by noise. To address this, this paper proposes a Noise-Robust Adaptive Weighting (NRAW) function that applies a second frequency-domain weighting to the GCC-PHAT output. The weights are adjusted dynamically according to the local signal-to-noise ratio (SNR): high-SNR components retain their PHAT characteristics, while low-SNR components are strongly suppressed. The method requires no preset noise model; it relies only on the assumption that the initial segment of the signal is silent, and uses that segment to suppress noise-dominated bands selectively. Experiments show that at SNR = −5 dB, NRAW reduces the mean absolute error from 7.14° for GCC to 4.57°, and raises the azimuth estimation accuracy within a 5° error tolerance from 50% to 68%. These results support the effectiveness of the adaptive weighting strategy in suppressing noise interference and improving localization robustness.
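The processing chain described in the abstract (PHAT normalization, a noise estimate from a leading silent segment, then a per-bin SNR-dependent weight) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the NRAW weighting formula, so a Wiener-like gain `snr / (1 + snr)` is swapped in as a stand-in, and the function names, the 1024-sample Hann frames with 50% overlap, and the spectral-subtraction SNR estimate are all hypothetical choices.

```python
import numpy as np


def _frame_spectra(x, frame_len, hop, n_fft):
    """Return the zero-padded rFFT of each Hann-windowed frame of x."""
    window = np.hanning(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.stack([x[s:s + frame_len] * window for s in starts])
    return np.fft.rfft(frames, n=n_fft, axis=1)


def snr_weighted_gcc_phat(x1, x2, fs, noise_len, frame_len=1024, max_lag=None,
                          eps=1e-12):
    """Estimate the delay of x2 relative to x1, in seconds.

    The first `noise_len` samples of both channels are ASSUMED to contain
    noise only (the silence assumption). The weight below is an assumed
    Wiener-like gain, not the NRAW function from the paper.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    hop = frame_len // 2
    n_fft = 2 * frame_len  # zero-padding so lags up to frame_len do not wrap

    # Number of frames that lie entirely inside the leading noise-only part.
    n_noise = (noise_len - frame_len) // hop + 1
    if n_noise < 1:
        raise ValueError("noise_len must cover at least one full frame")

    X1 = _frame_spectra(x1, frame_len, hop, n_fft)
    X2 = _frame_spectra(x2, frame_len, hop, n_fft)

    # Frame-averaged cross-spectrum, reduced to its phase (PHAT).
    cross = np.mean(X2 * np.conj(X1), axis=0)
    phat = cross / (np.abs(cross) + eps)

    # Per-bin power, averaged over both channels.
    power = 0.5 * (np.abs(X1) ** 2 + np.abs(X2) ** 2)
    total_psd = power.mean(axis=0)             # signal + noise, all frames
    noise_psd = power[:n_noise].mean(axis=0)   # leading noise-only frames

    # Local SNR by spectral subtraction, then an assumed soft weight in [0, 1):
    # high-SNR bins stay close to plain PHAT, low-SNR bins are attenuated.
    snr = np.maximum(total_psd - noise_psd, 0.0) / (noise_psd + eps)
    weight = snr / (1.0 + snr)

    # Back to the lag domain; reorder so index 0 corresponds to lag -max_lag.
    cc = np.fft.irfft(weight * phat, n=n_fft)
    if max_lag is None:
        max_lag = frame_len // 2
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs
```

A call such as `snr_weighted_gcc_phat(x1, x2, fs=16000, noise_len=4000)` returns a positive value when `x2` lags `x1`. The soft gain is used here only because it reproduces the qualitative behavior the abstract describes; any monotone mapping from local SNR to a weight in [0, 1] would fit the same skeleton.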

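Cite this article: 甘坦 (Gan Tan), 康绍绕 (Kang Shaorao), 熊林 (Xiong Lin). A GCC Time Delay Estimation Algorithm Based on Noise-Adaptive Weighting [J]. Journal of Image and Signal Processing (图像与信号处理), 2026, 15(2): 165-173. https://doi.org/10.12677/jisp.2026.152014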

