Wine Type Analysis Based on Discriminant Analysis
Abstract:
Based on the data set, three discriminant analysis methods (distance discrimination, Bayesian discrimination, and Fisher discrimination) are used to determine which winery each sample comes from, and the predicted winery is compared with the one recorded in the original data. The misclassification rate is then estimated with three nonparametric methods (resubstitution, sample splitting, and cross-validation), and the strengths and weaknesses of the three estimation methods are compared. The resulting discriminant rules make it easier to determine which of the three wineries a red wine of unknown origin comes from. The misclassification-rate analysis across the three estimation methods indicates that, in general, cross-validation performs better than the other two and is the most recommended.
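The three error-rate estimators named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic two-feature samples from three classes, the classifier is a simplified distance discriminant that uses squared Euclidean distance to each class mean (the textbook rule typically uses Mahalanobis distance), cross-validation is taken to be leave-one-out, and all function names are hypothetical.

```python
import random


def class_means(X, y):
    """Return the mean vector of each class as {label: [mean_1, ..., mean_p]}."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(means, row):
    """Assign row to the class whose mean is nearest (squared Euclidean distance)."""
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(row, means[c])))


def error_rate(train_X, train_y, test_X, test_y):
    """Fit on the training part and return the misclassification rate on the test part."""
    means = class_means(train_X, train_y)
    wrong = sum(predict(means, r) != t for r, t in zip(test_X, test_y))
    return wrong / len(test_y)


def resubstitution(X, y):
    """Resubstitution: classify the same samples that were used to fit the rule."""
    return error_rate(X, y, X, y)


def holdout(X, y, test_fraction=0.3, seed=0):
    """Sample splitting: fit on a random training part, evaluate on the rest."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return error_rate([X[i] for i in train], [y[i] for i in train],
                      [X[i] for i in test], [y[i] for i in test])


def leave_one_out(X, y):
    """Cross-validation (leave-one-out): classify each sample with a rule fit without it."""
    wrong = 0
    for i in range(len(y)):
        means = class_means(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        wrong += predict(means, X[i]) != y[i]
    return wrong / len(y)


# Synthetic stand-in data (assumption): three classes, 40 samples each, two features.
rng = random.Random(42)
centers = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (0.0, 3.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(40):
        X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
        y.append(label)

rates = {
    "resubstitution": resubstitution(X, y),
    "holdout": holdout(X, y),
    "leave_one_out": leave_one_out(X, y),
}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Resubstitution reuses the training samples for evaluation, so its estimate tends to be optimistic; sample splitting avoids that but fits the rule on fewer samples; leave-one-out uses nearly all samples for fitting while still evaluating each sample on a rule that never saw it.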