Cleaning of Abnormal Data of Wind Turbine Power Based on DBSCAN
DOI: 10.12677/CSA.2021.1110255
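Authors: Kong Weisheng (孔维胜), Shi Mingquan* (石明全): Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing; University of Chinese Academy of Sciences, Beijing; Zhu Haipeng (朱海鹏), Wang Xiaodong (王晓东): Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing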
Keywords: Wind Turbine, Anomaly Detection, Wind Power Curve, Density Clustering
Abstract: The wind power curve is an important indicator of wind turbine performance and is central to the operation and management of a wind farm. In practice, equipment failures and natural factors introduce a large number of abnormal records into the data collected by the supervisory control and data acquisition (SCADA) system, which makes the evaluated power curve inaccurate. Starting from an analysis of how abnormal data arise, this paper divides the data into zero-power accumulation data, constant-power curtailment data, and scattered abnormal data. Based on the characteristics of each type, it proposes an anomaly detection model that combines DBSCAN with interval DBSCAN (DBSCAN-Interval DBSCAN) to clean the operating data. Finally, the method is applied to a full year of turbine data collected at a wind farm. The results show that the method effectively detects and separates abnormal data from the operating data, improves data quality while preserving data integrity, and significantly improves the accuracy of wind turbine performance analysis.
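The abstract names the technique but does not describe how interval DBSCAN is configured. The sketch below shows one plausible reading and is not the authors' implementation: samples are grouped into wind-speed intervals, a plain one-dimensional DBSCAN is run on the power values within each interval, and everything outside the largest cluster is flagged. The function names `dbscan_1d` and `flag_anomalies`, the bin width, `eps`, and `min_pts` are all illustrative assumptions.

```python
from collections import Counter, defaultdict

NOISE = -1


def dbscan_1d(values, eps, min_pts):
    """Plain DBSCAN on scalar values; returns one label per value (-1 = noise)."""
    n = len(values)
    labels = [None] * n
    # O(n^2) neighbour lists; a point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbours[j])
        cluster += 1
    return labels


def flag_anomalies(samples, bin_width=0.5, eps=50.0, min_pts=5):
    """samples: list of (wind_speed_m_s, power_kw) pairs.

    Returns a list of booleans, True where a sample is flagged as abnormal.
    Assumption: within each wind-speed interval, the largest density cluster
    of power values is normal operation and everything else is abnormal.
    """
    bins = defaultdict(list)
    for idx, (speed, _power) in enumerate(samples):
        bins[int(speed // bin_width)].append(idx)

    flags = [True] * len(samples)
    for indices in bins.values():
        labels = dbscan_1d([samples[i][1] for i in indices], eps, min_pts)
        counts = Counter(label for label in labels if label != NOISE)
        if not counts:
            continue  # no cluster found: the whole interval stays flagged
        main_cluster = counts.most_common(1)[0][0]
        for i, label in zip(indices, labels):
            flags[i] = label != main_cluster
    return flags
```

Under this assumption, a dense pile of zero-power records forms its own cluster but is still flagged as long as it is smaller than the normal-operation cluster in the same interval, while isolated scattered points are flagged as noise. The quadratic neighbour search is only suitable for small illustrative inputs; a year of SCADA data would call for an indexed implementation.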
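Citation: Kong Weisheng, Zhu Haipeng, Wang Xiaodong, Shi Mingquan. Cleaning of Abnormal Data of Wind Turbine Power Based on DBSCAN [J]. Computer Science and Application, 2021, 11(10): 2517-2528. https://doi.org/10.12677/CSA.2021.1110255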
