[1] Bahmaninezhad, F., Wu, J., Gu, R., Zhang, S.-X., Xu, Y., Yu, M. and Yu, D. (2019) A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation. Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, 15-19 September 2019, 4574-4578.
[2] Abdi, H. and Williams, L.J. (2010) Principal Component Analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 433-459.
[3] Abdali, S. and NaserSharif, B. (2017) Non-Negative Matrix Factorization for Speech/Music Separation Using Source Dependent Decomposition Rank, Temporal Continuity Term and Filtering. Biomedical Signal Processing and Control, 36, 168-175.
[4] Ozerov, A., Vincent, E. and Bimbot, F. (2012) A General Flexible Framework for the Handling of Prior Information in Audio Source Separation. IEEE Transactions on Audio, Speech, and Language Processing, 20, 1118-1133.
[5] Pons, J., Janer, J. and Rode, T. (2016) Remixing Music Using Source Separation Algorithms to Improve the Musical Experience of Cochlear Implant Users. Journal of the Acoustical Society of America, 140, 4338-4349.
[6] Heo, W.H., Kim, H. and Kwon, O.W. (2020) Source Separation Using Dilated Time-Frequency DenseNet for Music Identification in Broadcast Contents. Applied Sciences, 10, Article No. 1727.
[7] Stöter, F.-R., Liutkus, A. and Ito, N. (2018) The 2018 Signal Separation Evaluation Campaign. Springer International Publishing, Cham.
[8] Jao, P.-K., Su, L., Yang, Y.-H. and Wohlberg, B. (2016) Monaural Music Source Separation Using Convolutional Sparse Coding. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24, 2158-2170.
[9] Luo, Y. and Mesgarani, N. (2019) Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27, 1256-1266.
[10] Brandenburg, K. and Sporer, T. (1992) NMR and Masking Flag: Evaluation of Quality Using Perceptual Criteria. Audio Engineering Society Conference: 11th International Conference: Test & Measurement, Portland, Oregon, 29-31 May 1992, Paper No. 11-020. https://www.aes.org/e-lib/online/browse.cfm?elib=6276
[11] Emiya, V., Vincent, E., Harlander, N. and Hohmann, V. (2011) Subjective and Objective Quality Assessment of Audio Source Separation. IEEE Transactions on Audio, Speech, and Language Processing, 19, 2046-2057.
[12] Cano, E., FitzGerald, D. and Brandenburg, K. (2016) Evaluation of Quality of Sound Source Separation Algorithms: Human Perception vs Quantitative Metrics. 2016 24th European Signal Processing Conference (EUSIPCO), Budapest, 29 August-2 September 2016, 1758-1762.
[13] Févotte, C., Gribonval, R. and Vincent, E. (2005) BSS_EVAL Toolbox User Guide—Revision 2.0.
[14] Le Roux, J., Wisdom, S., Erdogan, H. and Hershey, J.R. (2019) SDR—Half-Baked or Well Done? ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, 12-17 May 2019, 626-630.
[15] ITU (2014) Recommendation ITU-R BS.1534-3: Method for the Subjective Assessment of Intermediate Quality Level of Audio Systems.
[16] Zhao, Y. (2015) Research on Spatial Audio Coding and Multichannel Audio Recovery Techniques. Master's Thesis, Beijing Institute of Technology, Beijing.
[17] Rafii, Z., et al. (2017) MUSDB18—A Corpus for Music Separation.
[18] Liu, H., Kong, Q. and Liu, J. (2021) CWS-PResUNet: Music Source Separation with Channel-Wise Subband Phase-Aware ResUNet.
[19] Diakogiannis, F.I., Waldner, F., Caccetta, P. and Wu, C. (2020) ResUNet-a: A Deep Learning Framework for Semantic Segmentation of Remotely Sensed Data. ISPRS Journal of Photogrammetry and Remote Sensing, 162, 94-114.
[20] Takahashi, N., Goswami, N. and Mitsufuji, Y. (2018) MMDenseLSTM: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation. 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC), Tokyo, 17-20 September 2018, 106-110.
[21] Iandola, F., et al. (2014) DenseNet: Implementing Efficient ConvNet Descriptor Pyramids.
[22] Stöter, F.-R., Uhlich, S., Liutkus, A. and Mitsufuji, Y. (2019) Open-Unmix—A Reference Implementation for Music Source Separation. Journal of Open Source Software, 4, Article No. 1667.
[23] Lluís, F., Pons, J. and Serra, X. (2019) End-to-End Music Source Separation: Is It Possible in the Waveform Domain? Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, 15-19 September 2019, 4619-4623.
[24] Stoller, D., Ewert, S. and Dixon, S. (2018) Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation.
[25] Luo, Y. and Mesgarani, N. (2018) TasNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, 15-20 April 2018, 696-700.
[26] Défossez, A., et al. (2019) Demucs: Deep Extractor for Music Sources with Extra Unlabeled Data Remixed.
[27] Luo, Y., Chen, Z. and Yoshioka, T. (2020) Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation. ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, 4-8 May 2020, 46-50.
[28] Nachmani, E., Adi, Y. and Wolf, L. (2020) Voice Separation with an Unknown Number of Multiple Speakers. Proceedings of the 37th International Conference on Machine Learning, Vienna, 12-18 July 2020, 7164-7175.
[29] Rafii, Z., Liutkus, A., Stöter, F.-R., Mimilakis, S.I., FitzGerald, D. and Pardo, B. (2018) An Overview of Lead and Accompaniment Separation in Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26, 1307-1335.
[30] Hsu, C.-L. and Jang, J.-S.R. (2010) On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset. IEEE Transactions on Audio, Speech, and Language Processing, 18, 310-319.
[31] Bittner, R.M., et al. (2014) MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, 27-31 October 2014, 155-160.