Research on Personal Influencing Factors Based on Random Forest Algorithm
DOI: 10.12677/sa.2025.1412341
Author: Sheng Lin: School of Mathematics and Statistics, Guangxi Normal University, Guilin, Guangxi
Keywords: Personality, Random Forest, Decision Tree, Socializing
Abstract: Personality formation is shaped by parents, school, and society. Once formed, personality is relatively stable, but it is not fixed, and it remains highly malleable in underage students. This paper analyzes data on personality factors to help teachers guide students' character development and to support the all-round development of introverted students. The random forest algorithm is applied to a personality dataset, chosen for its high accuracy and its good fit on large datasets. Grid search is used to find the best parameter combination when building both the CART decision tree and the random forest models. ROC curves are then used to compare the two models experimentally; the results show that the random forest model performs better and can accurately analyze the factors influencing personality. Finally, the feature variables are ranked by importance to give teachers a reference for their teaching.
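The ROC-based comparison mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the author's code or data: the labels and the two score vectors (`tree_scores`, `forest_scores`) are hypothetical toy values invented for this example, and the helper names `roc_curve_points` and `auc` are likewise assumptions. The sketch sweeps the decision threshold to trace each ROC curve and integrates it with the trapezoidal rule.

```python
def roc_curve_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    pos = sum(y_true)              # number of positive samples (label 1)
    neg = len(y_true) - pos        # number of negative samples (label 0)
    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        j = i
        # Consume every sample that shares the current score (tie handling).
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if y_true[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted probabilities from two classifiers.
tree_scores = [0.9, 0.6, 0.4, 0.3, 0.7, 0.5, 0.2, 0.1]
forest_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

tree_auc = auc(roc_curve_points(y_true, tree_scores))
forest_auc = auc(roc_curve_points(y_true, forest_scores))
print(f"tree AUC = {tree_auc:.4f}, forest AUC = {forest_auc:.4f}")
```

A larger area under the curve indicates better ranking of positive over negative samples, which is the sense in which one model "performs better" in an ROC comparison. In practice a library such as scikit-learn offers `GridSearchCV` for the parameter search and `roc_auc_score` for this metric; whether the paper used them is not stated here.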
Cite this article: Sheng, L. (2025) Research on Personal Influencing Factors Based on Random Forest Algorithm. Statistics and Application, 14(12), 19-25. https://doi.org/10.12677/sa.2025.1412341
