Research on Industry Clustering Based on Factor Analysis and K-Means Clustering Algorithm

doi:10.12677/CSA.2020.1012260


DOI: 10.12677/CSA.2020.1012260. Supported by the National Natural Science Foundation of China.
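Authors: Cao Yu (曹钰), He Guohui (何国辉), Tan Juyuan (谭钜源): Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, Guangdong; Jiangmen Engineering Technology Research Center for Intelligent Data Analysis and Application, Jiangmen, Guangdong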
Keywords: Business Scope; Industry Clustering; Factor Analysis; K-Means Clustering

Abstract: The business scope recorded in an enterprise's industrial and commercial registration describes the main production and operating activities the enterprise engages in, and is an important indicator of the industry category it belongs to. Clustering enterprises by industry makes it easier for the state to administer them, and also helps enterprises position themselves and develop in line with national economic trends. This paper combines factor analysis with the K-means clustering algorithm and, taking the nationally issued "Industrial Classification for National Economic Activities" (《国民经济行业分类》) as the standard text, performs industry cluster analysis on a sample of enterprise business-scope fields. Factor analysis is first used to determine the optimal number of clusters for K-means. K-means is then applied to the business-scope data to assign each enterprise an industry category. Finally, the clustering results are assessed by manual evaluation and by the Davies-Bouldin index (DBI) to demonstrate the effectiveness of the method.
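The clustering and evaluation steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The toy 2-D points stand in for whatever numeric features are derived from business-scope text, and the number of clusters `k` is passed in by hand; the feature extraction and the factor-analysis step that selects `k` are not reproduced here because the abstract does not specify them. The farthest-first initialisation and all function names are assumptions made only to keep the example deterministic and self-contained.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def farthest_first_centers(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centres chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(euclidean(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iteration; returns (labels, centers)."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so centres are too
            break
        labels = new_labels
        # Update step: recompute each centre; keep the old one if a cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers


def davies_bouldin(points, labels, centers):
    """Davies-Bouldin index: the mean, over clusters, of the worst-case ratio
    (scatter_i + scatter_j) / distance(centre_i, centre_j). Lower is better.
    Assumes at least two clusters with distinct centres."""
    k = len(centers)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        mean_dist = (
            sum(euclidean(p, centers[c]) for p in members) / len(members)
            if members else 0.0
        )
        scatter.append(mean_dist)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k


if __name__ == "__main__":
    # Hypothetical feature vectors: two well-separated groups.
    data = [[0, 0], [0, 1], [1, 0], [1, 1],
            [10, 10], [10, 11], [11, 10], [11, 11]]
    labels, centers = kmeans(data, k=2)
    print("labels:", labels)
    print("DBI:", davies_bouldin(data, labels, centers))
```

A lower DBI indicates more compact and better separated clusters, which is why it can complement manual inspection of the resulting industry groups.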

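Cite this article: Cao Yu, He Guohui, Tan Juyuan. Research on Industry Clustering Based on Factor Analysis and K-Means Clustering Algorithm [J]. Computer Science and Application, 2020, 10(12): 2447-2456. https://doi.org/10.12677/CSA.2020.1012260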

