Risk Classification and Forecast of Listed Companies Based on Decision Tree

doi:10.12677/AAM.2022.111045


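Authors: Zhang Jinmin, Fan Dijun: School of Management, Shanghai University of Engineering Science, Shanghai; Li Xufang: School of Management, Shanghai University of Engineering Science, Shanghai; School of Management, University of Shanghai for Science and Technology, Shanghai
Keywords: Decision Tree; Risk Classification; Unbalanced Data; Index System; Cost-Sensitive Decision Tree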

Abstract: Financial fraud, illegal guarantees, and other violations by listed companies are common, so a sound classification and rating of listed companies matters for maintaining order in the financial market. This study constructs a risk classification and rating index system made up of two types of indicators, basic and trigger. The trigger indicators are used with a decision tree algorithm to classify the risk of the collected listed companies, and the basic indicators are used to build a decision tree model for risk prediction. To improve prediction accuracy, the relationship between the model and the sector in which a listed company sits is also discussed. Because misclassification costs are unbalanced, the samples classified as normal listed companies are classified a second time with a cost-sensitive decision tree. The results show that the classification model reaches 100% accuracy on both the training set and the test set; the prediction model reaches 92.7% accuracy on the training set and 80% on the test set; segmenting listed companies by sector can improve model accuracy; and the secondary classification with the cost-sensitive decision tree raises the classification accuracy for risky listed companies to 100% and the overall accuracy from 91.7% to 96.7%.
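
The secondary classification step can be illustrated with a minimal sketch. The abstract does not give the cost matrix or the tree construction used in the paper, so everything below is an assumption for illustration: the function names `min_cost_label` and `two_stage_classify`, the cost values, and the sample probabilities are all hypothetical. The idea shown is only the general cost-sensitive rule of choosing the label with the lower expected misclassification cost, applied to samples that a first-stage model labeled as normal.

```python
# Minimal sketch of cost-sensitive relabeling. All names and numbers are
# illustrative assumptions, not values or methods taken from the paper.

def min_cost_label(p_risky, cost_fn, cost_fp):
    """Return the label with the lower expected misclassification cost.

    p_risky : estimated probability that the company is risky (for example,
              the share of risky training samples in a decision-tree leaf)
    cost_fn : cost of labeling a risky company as normal (false negative)
    cost_fp : cost of labeling a normal company as risky (false positive)
    """
    expected_cost_if_normal = p_risky * cost_fn
    expected_cost_if_risky = (1.0 - p_risky) * cost_fp
    if expected_cost_if_risky < expected_cost_if_normal:
        return "risky"
    return "normal"


def two_stage_classify(first_stage_label, p_risky, cost_fn=5.0, cost_fp=1.0):
    """Re-examine only the samples that the first stage labeled normal."""
    if first_stage_label == "risky":
        return "risky"
    return min_cost_label(p_risky, cost_fn, cost_fp)


# Hypothetical samples: (first-stage label, leaf probability of being risky)
samples = [("normal", 0.05), ("normal", 0.30), ("risky", 0.80)]
labels = [two_stage_classify(label, p) for label, p in samples]
print(labels)
```

With a false-negative cost five times the false-positive cost, a sample first labeled normal is relabeled as risky once its estimated risk probability exceeds 1/6, which is how a second pass can recover risky companies that an accuracy-driven tree lets through.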

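Cite this article: Zhang Jinmin, Li Xufang, Fan Dijun. Risk Classification and Forecast of Listed Companies Based on Decision Tree [J]. Advances in Applied Mathematics, 2022, 11(1): 370-380. https://doi.org/10.12677/AAM.2022.111045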

