Privacy-Preserving Electricity Consumption Anomaly Detection Based on an XGBoost-LightGBM Hybrid Model
DOI: 10.12677/mos.2025.147523. Supported by research project funding.
Authors: Hui Sun, Luping Zhi: Business School, University of Shanghai for Science and Technology, Shanghai
Keywords: Anomaly Detection, Hybrid Encryption, XGBoost, LightGBM, Diversity
Abstract: According to a Northeast Group report, electricity theft causes economic losses of up to USD 96 billion per year worldwide and seriously affects power security in both developed and developing countries. Reducing these losses requires user-side electricity consumption anomaly detection that is both more secure and more efficient. This study adds a hybrid AES + RSA encryption architecture to the conventional power-data anomaly-detection framework and combines it with SHA-256 cryptographic hashing and digital signatures to provide confidentiality, integrity verification, and identity authentication for data in transit. In the anomaly-detection stage, grid search is first used to tune the hyperparameters of the XGBoost and LightGBM models. The tuned models are then fused through a formula whose dynamic weights are adjusted according to each model's AUC and the diversity of the two models' predictions, so that the fusion adapts to the data and the method generalizes better. Experiments on a public State Grid Corporation of China dataset show that the model reaches an AUC of 82.03%, an accuracy of 91.53%, and a G-mean of 54.99%. Generalization tests on four public KEEL datasets indicate that the method detects anomalous samples well.
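The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual fusion formula, so the weighting rule below (AUC-proportional weights pulled toward an even split as the two models disagree more) is an assumption made for illustration only, as are all function names and the toy scores. The predicted probabilities stand in for the outputs of already tuned XGBoost and LightGBM models. G-mean is computed with its usual definition, the square root of sensitivity times specificity.

```python
from math import sqrt

def auc(y_true, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def disagreement(p_a, p_b, threshold=0.5):
    """Diversity proxy: fraction of samples where the thresholded labels of the two models differ."""
    differs = sum((a >= threshold) != (b >= threshold) for a, b in zip(p_a, p_b))
    return differs / len(p_a)

def fusion_weights(auc_a, auc_b, diversity):
    """Hypothetical rule (not the paper's formula): AUC-proportional weights,
    shrunk toward 0.5 in proportion to the diversity of the two models."""
    base_a = auc_a / (auc_a + auc_b)
    w_a = (1.0 - diversity) * base_a + diversity * 0.5
    return w_a, 1.0 - w_a

def fuse(p_a, p_b, w_a, w_b):
    """Weighted average of the two models' predicted anomaly probabilities."""
    return [w_a * a + w_b * b for a, b in zip(p_a, p_b)]

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall on anomalies) and specificity (recall on normals)."""
    tp = sum(y == 1 and p == 1 for y, p in zip(y_true, y_pred))
    fn = sum(y == 1 and p == 0 for y, p in zip(y_true, y_pred))
    tn = sum(y == 0 and p == 0 for y, p in zip(y_true, y_pred))
    fp = sum(y == 0 and p == 1 for y, p in zip(y_true, y_pred))
    return sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

# Toy labels (1 = anomalous user) and made-up probabilities standing in for the two tuned models.
y_val = [0, 0, 0, 0, 1, 1, 0, 1]
p_xgb = [0.1, 0.2, 0.3, 0.6, 0.7, 0.4, 0.2, 0.9]
p_lgb = [0.2, 0.1, 0.4, 0.3, 0.6, 0.8, 0.3, 0.7]

w_xgb, w_lgb = fusion_weights(auc(y_val, p_xgb), auc(y_val, p_lgb), disagreement(p_xgb, p_lgb))
p_fused = fuse(p_xgb, p_lgb, w_xgb, w_lgb)
y_pred = [int(p >= 0.5) for p in p_fused]
print("weights:", w_xgb, w_lgb)
print("fused AUC:", auc(y_val, p_fused), "G-mean:", g_mean(y_val, y_pred))
```

In a real pipeline the weights would be estimated on a held-out validation split and then applied to test predictions; the sketch reuses one toy split only for brevity.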
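Cite this article: Sun, H. and Zhi, L.P. (2025) Privacy-Preserving Electricity Consumption Anomaly Detection Based on XGBoost-LightGBM Hybrid Model. Modeling and Simulation, 14(7), 140-153. https://doi.org/10.12677/mos.2025.147523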
