Component Analysis and Identification of Ancient Glassware Based on PCA-K Mean Clustering Algorithm
DOI: 10.12677/aam.2025.1411474. Supported by research project funding.
Authors: Liu Zhonghui*, Li Jinyou#: School of Mathematics and Computer Science, Guangxi Minzu Normal University, Chongzuo, Guangxi
Keywords: Pearson's Chi-Squared Test; Glass Products; K-Means Clustering; Principal Component Analysis
Abstract: This paper studies how the measured chemical composition of ancient glassware changes with weathering, subclassifies the glass according to the weathering patterns, builds a model that predicts the pre-weathering content of each chemical component, and uses the classification rules to identify glass artifacts of unknown category. First, data visualization and Pearson's chi-squared test are used to analyze the association between surface weathering and glass type, color, and decorative pattern. Second, Python is used to compute the mean and variance of each component's content before and after weathering, which yields the weathering pattern; the ratio of change in each component's mean content before and after weathering is then used to predict the pre-weathering content. Finally, data visualization is used to analyze the rules that distinguish high-potassium glass from lead-barium glass. Principal component analysis (PCA) and K-Means clustering are used to subclassify the two glass types, and ROC curves are used to assess the plausibility and sensitivity of the classification model. Applying the classification model to the chemical composition of the artifacts of unknown category identifies A1, A5, A6, and A7 as high-potassium glass and A2, A3, A4, and A8 as lead-barium glass. The sensitivity analysis indicates that the classification results are reasonably stable and accurate.
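The PCA followed by K-Means step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only NumPy (PCA via SVD and plain Lloyd's iterations), and the data are synthetic. The column meanings (SiO2, K2O, PbO, BaO), the group means, and the helper names `standardize`, `pca`, and `kmeans` are assumptions made for illustration only.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    return (X - mu) / sigma


def pca(X, n_components):
    """Project standardized data onto its leading principal components via SVD."""
    Z = standardize(X)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return Z @ Vt[:n_components].T, explained[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic composition table (hypothetical values, weight percent).
# Assumed columns: SiO2, K2O, PbO, BaO.
rng = np.random.default_rng(42)
group_a = rng.normal([68.0, 10.0, 0.5, 0.5], 1.0, size=(20, 4))  # potassium-rich
group_b = rng.normal([40.0, 0.3, 25.0, 10.0], 1.0, size=(20, 4))  # lead/barium-rich
X = np.vstack([group_a, group_b])

scores, explained = pca(X, n_components=2)  # reduce to two components
labels, _ = kmeans(scores, k=2)             # cluster in the reduced space
print("explained variance ratio:", explained)
print("cluster labels:", labels)
```

Clustering on the principal-component scores rather than the raw percentages keeps strongly correlated oxides from dominating the Euclidean distances. On real measurements, the number of components and the number of clusters would need to be chosen from the data.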
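Citation: Liu Zhonghui, Li Jinyou. Component Analysis and Identification of Ancient Glassware Based on PCA-K Mean Clustering Algorithm [J]. Advances in Applied Mathematics, 2025, 14(11): 178-192. https://doi.org/10.12677/aam.2025.1411474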
