[1] Comon, P. (1994) Independent Component Analysis, a New Concept? Signal Processing, 36, 287-314.
[2] Virtanen, T. (2007) Monaural Sound Source Separation by Nonnegative Matrix Factorization with Temporal Continuity and Sparseness Criteria. IEEE Transactions on Audio, Speech and Language Processing, 15, 1066-1074.
[3] Stöter, F., Uhlich, S., Liutkus, A. and Mitsufuji, Y. (2019) Open-Unmix—A Reference Implementation for Music Source Separation. Journal of Open Source Software, 4, Article No. 1667.
[4] Défossez, A., Usunier, N., Bottou, L. and Bach, F. (2021) Hybrid Spectrogram and Waveform Source Separation. Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, 6-14 December 2021.
[5] Luo, Y. and Mesgarani, N. (2019) Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27, 1256-1266.
[6] Subakan, C., Ravanelli, M., Cornell, S., Bronzi, M. and Zhong, J. (2021) Attention Is All You Need in Speech Separation. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, 6-11 June 2021, 21-25.
[7] ITU-T (1996) Recommendation P.800: Methods for Subjective Determination of Transmission Quality. International Telecommunication Union.
[8] Luo, Y., Chen, J., Du, J. and Yoshioka, T. (2023) DiffSep: Leveraging Diffusion Models for Speech Separation. IEEE Proceedings of ICASSP 2023, Rhodes Island, 4-10 June 2023, 1-5.
[9] Luo, Y., Lin, Z.-Q., Zhang, J. and Mesgarani, N. (2022) TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation. Proceedings of Interspeech 2022, Incheon, 18-22 September 2022, 2768-2772.
[10] Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W. and Hu, Q. (2020) ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 14-19 June 2020, 11531-11539.
[11] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 770-778.
[12] Ronneberger, O., Fischer, P. and Brox, T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., et al., Eds., Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Springer International Publishing, 234-241.
[13] Rafii, Z., Liutkus, A., Stoeter, F.-R., Mimilakis, S.I. and Bittner, R. (2017) The MUSDB18 Corpus for Music Separation. Machine Learning for Signal Processing (MLSP). https://sigsep.github.io/datasets/musdb.html
[14] Vincent, E., Gribonval, R. and Fevotte, C. (2006) Performance Measurement in Blind Audio Source Separation. IEEE Transactions on Audio, Speech and Language Processing, 14, 1462-1469.
[15] Le Roux, J., Weiss, R.J. and Kinoshita, K. (2019) SNR-Based Objective Evaluation of Source Separation Methods. IEEE Transactions on Audio, Speech, and Language Processing, 27, 929-941.