Imputation of Missing Data for Global Trade Based on Non-Negative Matrix Factorization
Abstract: In the era of big data, foreign trade enterprises depend heavily on global trade data. However, these data contain many missing values, which hampers analysis. This paper proposes using non-negative matrix factorization (NMF) to impute the missing data, and constructs and implements an imputation algorithm. Experiments comparing NMF with linear interpolation show that NMF is better suited to imputing the missing data; it can also extract topic import-export matrices that help people understand the trade situation.
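To make the idea of NMF-based imputation concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not spell out. It assumes a masked variant of the multiplicative update rules for the squared-error objective, in which only observed entries contribute to the updates. The function name `nmf_impute`, the use of `None` to mark missing entries, the rank, and the iteration count are all illustrative assumptions.

```python
import random


def nmf_impute(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Fill None entries of a non-negative table X using X ~ W * H.

    Hypothetical helper: masked multiplicative updates, where sums run
    over observed entries only. Returns (filled, W, H).
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # obs[i][j] is True where a value was actually recorded.
    obs = [[X[i][j] is not None for j in range(m)] for i in range(n)]
    # Strictly positive random start keeps every factor non-negative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]

    def approx(i, j):
        # Current low-rank estimate of entry (i, j).
        return sum(W[i][k] * H[k][j] for k in range(rank))

    for _ in range(n_iter):
        # Update W with H fixed, using observed entries only.
        R = [[approx(i, j) for j in range(m)] for i in range(n)]
        for i in range(n):
            for k in range(rank):
                num = sum(X[i][j] * H[k][j] for j in range(m) if obs[i][j])
                den = sum(R[i][j] * H[k][j] for j in range(m) if obs[i][j])
                W[i][k] *= num / (den + eps)
        # Update H with W fixed, using observed entries only.
        R = [[approx(i, j) for j in range(m)] for i in range(n)]
        for k in range(rank):
            for j in range(m):
                num = sum(X[i][j] * W[i][k] for i in range(n) if obs[i][j])
                den = sum(R[i][j] * W[i][k] for i in range(n) if obs[i][j])
                H[k][j] *= num / (den + eps)

    # Keep observed values; fill the gaps from the factorization.
    filled = [[X[i][j] if obs[i][j] else approx(i, j) for j in range(m)]
              for i in range(n)]
    return filled, W, H


# Toy table built as an outer product of [1, 2, 3] and [1, 2, 4],
# with one entry hidden (its true value would be 2 * 4 = 8).
trade = [[1.0, 2.0, 4.0],
         [2.0, 4.0, None],
         [3.0, 6.0, 12.0]]
filled, W, H = nmf_impute(trade, rank=1)
print(filled[1][2])
```

In this reading, the rows of `H` would play the role of the "topic" patterns mentioned in the abstract, and `W` would hold each row's non-negative loading on those topics. Real trade data would need a higher rank chosen by validation, which this toy example does not attempt.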
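Cite this article: Song, C. and Zhang, X. (2022) Imputation of Missing Data for Global Trade Based on Non-Negative Matrix Factorization. Computer Science and Application, 12(9), 2094-2105. https://doi.org/10.12677/CSA.2022.129212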
