Data Quality Evaluation of Financial Data Warehouses Based on the Fuzzy Analytic Hierarchy Process and the Entropy Weight Method: The Case of Company H
Abstract: This paper examines data quality evaluation for financial data warehouses, taking Company H as a case study. It first describes the content and characteristics of the data held in securities industry data warehouses, then constructs and quantifies an evaluation system comprising seven first-level indicators, including completeness and accuracy, together with their associated second-level indicators. It then presents the fuzzy analytic hierarchy process (FAHP) and the entropy weight method: the former determines subjective weights by building a hierarchical model and fuzzy judgment matrices, the latter computes objective weights through steps such as data standardization, and the two sets of weights are combined into comprehensive weights. A worked example on data sets from four subject areas of Company H, including the entity and transaction areas, covers indicator quantification, weight calculation, and quality assessment. The results show that the entity and channel data fall short in accuracy and consistency. The study offers a scientific method and directions for improvement for data quality management in financial data warehouses.
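The weighting scheme summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so the sketch assumes a fuzzy complementary judgment matrix (r_ij + r_ji = 1) with the commonly used row-sum weight formula, min-max standardization of benefit-type indicators for the entropy weights, and a normalized product as the combination rule. The function names and all numbers are hypothetical and are not taken from Company H's data.

```python
import math

def fahp_weights(R):
    """Subjective weights from a fuzzy complementary judgment matrix.

    Assumed formula: w_i = (sum_j r_ij + n/2 - 1) / (n * (n - 1)).
    """
    n = len(R)
    for i in range(n):
        for j in range(n):
            if abs(R[i][j] + R[j][i] - 1.0) > 1e-9:
                raise ValueError("R is not a fuzzy complementary matrix")
    return [(sum(row) + n / 2 - 1) / (n * (n - 1)) for row in R]

def entropy_weights(X):
    """Objective weights; rows of X are data sets, columns are indicators."""
    m, n = len(X), len(X[0])
    divergences = []
    for j in range(n):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        if hi == lo:
            # A constant indicator does not discriminate between data sets.
            divergences.append(0.0)
            continue
        z = [(x - lo) / (hi - lo) for x in col]   # min-max standardization
        total = sum(z)
        p = [v / total for v in z]                # share of each data set
        # Information entropy, using the convention 0 * ln(0) = 0.
        entropy = -sum(v * math.log(v) for v in p if v > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    if s == 0:
        raise ValueError("all indicators are constant; weights are undefined")
    return [d / s for d in divergences]

def combine_weights(subjective, objective):
    """Assumed combination rule: normalized element-wise product."""
    products = [a * b for a, b in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]

# Hypothetical pairwise judgments for three indicators
# (completeness, accuracy, consistency).
R = [[0.5, 0.7, 0.6],
     [0.3, 0.5, 0.4],
     [0.4, 0.6, 0.5]]

# Hypothetical indicator scores for four data sets (rows).
X = [[0.98, 0.91, 0.95],
     [0.99, 0.97, 0.96],
     [0.97, 0.95, 0.99],
     [0.96, 0.88, 0.90]]

subjective = fahp_weights(R)
objective = entropy_weights(X)
combined = combine_weights(subjective, objective)
print(subjective, objective, combined)
```

A data set's overall quality score would then be the weighted sum of its indicator scores under the combined weights. Other combination rules, such as a linear blend of the subjective and objective weights, are equally plausible readings of the abstract.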
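Cite this article: Wang, H. and Meng, F. (2025) Data Quality Evaluation of Financial Data Warehouse Based on Fuzzy Analytic Hierarchy Process and Entropy Weight Method: Taking H Company as an Example. Management Science and Engineering, 14(2), 461-471. https://doi.org/10.12677/mse.2025.142050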
