Application of Bilevel Optimization to Hyperparameter Selection for Support Vector Machines
Abstract: The support vector machine (SVM) is an efficient classification model whose performance depends largely on the choice of hyperparameters. This paper reformulates SVM hyperparameter selection as a bilevel optimization problem and solves it by combining the forward-mode method with gradient descent, yielding an optimized SVM model. To cope with high-dimensional data, principal component analysis (PCA) is applied to the original data for dimensionality reduction, which improves the performance of the SVM model on high-dimensional, small-sample datasets. The approach is compared with three currently popular methods: grid search, Bayesian optimization, and simulated annealing. The SVM model obtained with the bilevel optimization method achieves an accuracy of 98.2%, a recall of 100%, and a training time of 0.768 s, outperforming each of the other three methods on these measures, which indicates that the model obtained with the proposed method has better predictive performance.
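To make the combination of the forward-mode method and gradient descent concrete, the following is a minimal sketch, not the authors' implementation. It rests on several assumptions that are not taken from the paper: a linear SVM with a squared hinge loss (so that the lower-level objective is differentiable), a single regularization hyperparameter written as λ = exp(θ), plain gradient descent for the lower-level training problem, and small synthetic data. The names `make_data`, `forward_hypergradient`, and `tune_theta` are hypothetical. The lower level trains the weights `w` on the training set; alongside each training step, the sketch propagates the tangent `z = dw/dλ`, and the upper level uses the resulting derivative of the validation loss to update θ.

```python
import math
import random

def make_data(n, seed):
    """Synthetic two-class data (an assumption for illustration only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([-1.0, 1.0])
        # Two noisy features plus a constant 1.0 acting as the bias term.
        data.append(([y + rng.gauss(0, 1.0), rng.gauss(0, 1.0), 1.0], y))
    return data

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sq_hinge_loss(w, data):
    return sum(max(0.0, 1.0 - y * dot(w, x)) ** 2 for x, y in data) / len(data)

def sq_hinge_grad(w, data):
    """Gradient of the mean squared hinge loss and the margin-violating inputs."""
    g = [0.0] * len(w)
    active = []
    for x, y in data:
        m = 1.0 - y * dot(w, x)
        if m > 0:
            active.append(x)
            for j in range(len(w)):
                g[j] -= 2.0 * m * y * x[j] / len(data)
    return g, active

def forward_hypergradient(theta, train, val, steps=200, lr=0.1):
    """Return (validation loss, d validation loss / d theta), lam = exp(theta)."""
    lam = math.exp(theta)
    d = len(train[0][0])
    w = [0.0] * d
    z = [0.0] * d  # tangent dw/dlam, propagated forward with the iterates
    for _ in range(steps):
        g, active = sq_hinge_grad(w, train)
        # Hessian-vector product of the regularized training loss with z.
        hz = [lam * zj for zj in z]
        for x in active:
            c = 2.0 * dot(x, z) / len(train)
            for j in range(d):
                hz[j] += c * x[j]
        # Differentiate the update w <- w - lr * (g + lam * w) with respect to lam.
        z = [z[j] - lr * (hz[j] + w[j]) for j in range(d)]
        w = [w[j] - lr * (g[j] + lam * w[j]) for j in range(d)]
    gv, _ = sq_hinge_grad(w, val)
    # Chain rule: dE/dtheta = (dE/dw . dw/dlam) * dlam/dtheta, with dlam/dtheta = lam.
    return sq_hinge_loss(w, val), dot(gv, z) * lam

def tune_theta(train, val, theta=0.0, outer_steps=20, outer_lr=0.5):
    """Upper level: gradient descent on theta using the forward-mode hypergradient."""
    history = []
    for _ in range(outer_steps):
        loss, grad = forward_hypergradient(theta, train, val)
        history.append(loss)
        theta -= outer_lr * grad
    return theta, history
```

Calling `tune_theta(make_data(40, 0), make_data(40, 1))` returns the final θ and the validation loss recorded at each outer step. Forward mode stores only the current iterate and its tangent, so its memory cost does not grow with the number of inner steps; this makes it a natural fit when there are only a few hyperparameters, as is typically the case for an SVM.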
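Citation: Chen Qian (陈骞), Xu Mengwei (徐梦薇). The Application of Bilevel Optimization in Support Vector Machine Hyperparameter Selection Problem [J]. Advances in Applied Mathematics (应用数学进展), 2024, 13(10): 4601-4609. https://doi.org/10.12677/aam.2024.1310441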
