Optimization for C-Vine Copula-Based Supervised Learning Classification
DOI: 10.12677/SA.2021.101007 (supported by research project funding)
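Authors: Wang Lei, Yang Guang: School of Mathematics and Systems Science, Shenyang Normal University, Shenyang, Liaoning; Fu Zhihui: School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, Fujian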
Keywords: Missing Data; C-Vine Copula; Supervised Learning Classification; Bayesian Decision Theory
Abstract: The Naive Bayes classifier assumes that the feature variables are independent and therefore ignores the correlation between them, which leads to poor classification when the features are in fact correlated. To improve classification, this paper imputes the missing values of incomplete datasets using C-Vine Copula theory to obtain complete datasets, uses copula functions to model the dependence among the feature variables, and classifies the completed datasets with a C-Vine Copula classifier. The results show that the supervised learning classifier based on C-Vine Copula theory achieves good classification performance.
Cite this article: Wang, L., Yang, G. and Fu, Z.H. (2021) Optimization for C-Vine Copula-Based Supervised Learning Classification. Statistics and Application, 10(1), 70-76. https://doi.org/10.12677/SA.2021.101007
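
The abstract's central point, that a classifier using only the marginal distributions can fail when features are correlated, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two features, standard normal margins in both classes, and a single Gaussian pair-copula per class (in two dimensions a C-vine reduces to one pair-copula). The correlation values 0.9 and -0.9, the equal priors, and all function names are hypothetical choices made only for illustration.

```python
import math
from statistics import NormalDist

# Assumption: both classes share standard normal margins, so only the
# dependence structure (the copula) distinguishes them.
STD_NORMAL = NormalDist()


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho."""
    a = STD_NORMAL.inv_cdf(u)
    b = STD_NORMAL.inv_cdf(v)
    quad = (rho ** 2 * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho ** 2))
    return math.exp(-quad) / math.sqrt(1.0 - rho ** 2)


def class_density(x, rho, use_copula):
    """Class-conditional density: product of margins, optionally times the copula density."""
    marginals = STD_NORMAL.pdf(x[0]) * STD_NORMAL.pdf(x[1])
    if not use_copula:
        # Naive Bayes: independence assumed, i.e. copula density fixed at 1.
        return marginals
    u, v = STD_NORMAL.cdf(x[0]), STD_NORMAL.cdf(x[1])
    return marginals * gaussian_copula_density(u, v, rho)


def posterior_class0(x, use_copula, rhos=(0.9, -0.9), priors=(0.5, 0.5)):
    """Posterior probability of class 0 under the Bayes decision rule."""
    scores = [p * class_density(x, r, use_copula) for p, r in zip(priors, rhos)]
    return scores[0] / sum(scores)


if __name__ == "__main__":
    for point in [(1.0, 1.0), (1.0, -1.0)]:
        print(point,
              "naive:", posterior_class0(point, use_copula=False),
              "copula:", posterior_class0(point, use_copula=True))
```

Under these assumed parameters the marginal-only score cannot separate the classes at any point, because the margins are identical, while the copula-based score assigns (1, 1) to class 0 and (1, -1) to class 1. A full C-vine classifier would replace the single pair-copula with a product of pair-copulas over more than two features.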

参考文献

[1] 李航. 统计学习方法[M]. 北京: 清华大学出版社, 2012: 48-53.
[2] Sklar, A. (1959) Fonctions de repartition an dimensions et leurs marges. Publications de l’Institut de Statistique de l’Universite de Paris, 33, 229-231.
[3] Nelsen, R.B. (1999) An Introduction to Copulas. Springer-Verlag, New York.
[Google Scholar] [CrossRef
[4] Joe, H. (1997) Multivariate Models and Dependence Concepts. Chapman and Hall, London.
[Google Scholar] [CrossRef
[5] Bedford, T. and Cooke, R. (2001) Probability Density Decomposition for Conditionally Dependent Random Variables Modeled by Vines. Annals of Mathematics and Artificial Intelligence, 32, 245-268.
[Google Scholar] [CrossRef
[6] Aas, K., Czado, C., Frigessi, A., et al. (2009) Pair-Copula Constructions of Multiple Dependence. Insurance Mathematics and Economics, 44, 182-198.
[Google Scholar] [CrossRef
[7] Czado, C., Schepsmeier, U. and Min, A. (2011) Maximum Likelihood Estimation of Mixed C-Vines with Application to Exchange Rates. Statistical Modelling, 12, 229-255.
[Google Scholar] [CrossRef
[8] Brechmann, E.C., Schepsmeier, U., Grün, B., et al. (2013) Modeling Dependence with C- and D-Vine Copulas: The R Package CDVine. J of Statistical Software, 52, 1-27.
[Google Scholar] [CrossRef
[9] Bedford, T. and Cooke, R. (2002) Vines a New Graphical Model for Dependent Random Variables. The Annals of Statistics, 30, 1031-1068.
[Google Scholar] [CrossRef
[10] Kurowicka, D. and Cooke, R. (2006) Uncertainty Analysis with High Dimensional Dependence Modeling. John Wiley & Sons, Manhattan.
[Google Scholar] [CrossRef
[11] 韦艳华, 张世英. Copula理论及其在金融分析上的应用[M]. 北京: 清华大学出版社, 2008: 1-40.
[12] Chen, Y.H. (2014) A Copula-Based Supervised Learning Classification for Continuous and Discrete Data. Journal of Data Science, No. 13, 769-790.