Image Multi-Classification Model Based on Tensor Low Tubal Rank
Abstract: Traditional machine learning methods usually convert high-order tensor data into a lower-order format before classification, which tends to cause overfitting and destroys the tensor structure. To address these problems, this paper proposes a multi-classification model based on tensor low tubal rank (LRTMLR). The model classifies images in tensor format directly. It handles low-rank tensors using the tensor tubal rank induced by the tensor-tensor product, together with the corresponding tensor nuclear norm, which makes better use of the tensor structure and improves multi-classification accuracy on tensor-format images. On the three-class simulated dataset, the classification accuracy of LRTMLR is 9.6 percentage points higher than that of both the method without structural information (MLR) and the method with matrix structural information (LRMLR); on the five-class simulated dataset it is 23.2 and 25.2 percentage points higher, respectively. On the three-class, five-class and fourteen-class subsets of the Caltech 101-category color image recognition dataset, the accuracy of LRTMLR exceeds that of MLR by 10.01, 25.61 and 40.78 percentage points, that of LRMLR by 10.68, 25.61 and 40.78 percentage points, that of the CP-decomposition-based method (MCPLR) by 6.47, 13.37 and 27.73 percentage points, and that of the Tucker-decomposition-based method (MTuLR) by 1.79, 12.38 and 13.71 percentage points. Ablation experiments further confirm the effectiveness of the proposed innovations.
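The two quantities the abstract relies on, the tubal rank and the tensor nuclear norm induced by the tensor-tensor product, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes one common convention: take the discrete Fourier transform along the third mode, then define the tubal rank as the largest matrix rank among the Fourier-domain frontal slices and the nuclear norm as the sum of their singular values divided by the tube length. The function name `tubal_rank_and_tnn` and the tolerance `tol` are hypothetical choices made for this illustration.

```python
import numpy as np


def tubal_rank_and_tnn(A, tol=1e-8):
    """Return (tubal rank, tensor nuclear norm) of a third-order array A.

    Assumed convention: FFT along the third mode, rank taken as the maximum
    rank over Fourier-domain frontal slices, norm scaled by 1 / n3.
    """
    A = np.asarray(A, dtype=float)
    n3 = A.shape[2]
    # Transform every tube A[i, j, :] to the Fourier domain.
    A_hat = np.fft.fft(A, axis=2)
    # Stack the frontal slices along the leading axis: shape (n3, n1, n2).
    slices = np.moveaxis(A_hat, 2, 0)
    # Singular values of each slice: shape (n3, min(n1, n2)).
    s = np.linalg.svd(slices, compute_uv=False)
    s_max = s.max()
    if s_max == 0:
        return 0, 0.0
    # Count singular values above a relative tolerance, slice by slice.
    rank = int((s > tol * s_max).sum(axis=1).max())
    tnn = float(s.sum() / n3)
    return rank, tnn


# Example: only the first frontal slice is nonzero.
A = np.zeros((3, 3, 4))
A[:, :, 0] = np.diag([3.0, 2.0, 0.0])
rank, tnn = tubal_rank_and_tnn(A)
print(rank, tnn)
```

In this example every Fourier-domain slice equals diag(3, 2, 0), so under the stated convention the tubal rank is 2 and the nuclear norm is 3 + 2 = 5 up to floating-point error. Working slice by slice in the Fourier domain is what lets a tubal-rank penalty act on the tensor as a whole instead of on a flattened matrix.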
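Cite this article: Zhang Jiarui, Hu Yuyu, Tang Kaiyu, Fan Yali. Image Multi-Classification Model Based on Tensor Low Tubal Rank[J]. Modeling and Simulation (建模与仿真), 2024, 13(3): 3980-3997. https://doi.org/10.12677/mos.2024.133362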
