Robust Multiclass Models for Mislabeled Tensor Data
DOI: 10.12677/orf.2024.143262. Supported by the National Natural Science Foundation of China.
Authors: Jiarui Zhang (张家瑞)*, Yali Fan (樊亚莉)#: College of Science, University of Shanghai for Science and Technology, Shanghai
Keywords: Image Multi-Classification, Mislabel, Low-Rank Tensor, Tensor Tubal Rank, Machine Learning
Abstract: Most traditional machine learning methods perform supervised learning on training data that are assumed to be correctly labeled. In practice, however, the observed training labels are likely to be contaminated, and mislabeled samples lead traditional models to produce biased estimates. Existing models that are robust to mislabeling are mostly designed for vector data; when faced with mislabeled high-order tensor data, they must first convert the data to a lower-order format, which causes overfitting and destroys the tensor structure. To address these problems, a robust tensor multi-classification model (RMLTMLR) is proposed. It uses minimum γ-divergence estimation, the tensor tubal rank, and the corresponding nuclear norm to handle low-rank tensors with wrong labels. The model exploits the structural characteristics of tensors while remaining robust to contaminated labels, which improves multi-classification accuracy. Extensive experiments show that RMLTMLR classifies well on tensor data across different categories and contamination levels, and that its classification accuracy is significantly higher than that of non-robust models.
Cite this article: Zhang, J. and Fan, Y. (2024) Robust Multiclass Models for Mislabeled Tensor Data. Operations Research and Fuzziology, 14(3), 242-255. https://doi.org/10.12677/orf.2024.143262
