Track Condition Maintenance Decision Based on Markov Decision Process

doi:10.12677/HJCE.2023.123041

Author: Zhao Wenbo, Infrastructure Inspection Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing
Keywords: Track Condition Maintenance; Data Analysis; K-Means Clustering; TQI; Markov Decision Process

Abstract: This paper clusters track geometry feature data recorded under natural deterioration and uses the resulting clusters as decision units to build a Markov decision process (MDP) model under a complete environment, optimizing railway maintenance decisions with the goal of maximizing the long-term expected profit of track operation. First, based on the variation characteristics of track irregularity data, and with reference to the actual operating mileage and terrain characteristics of the high-speed railway, the track time-series data of different sections are preprocessed following the idea of functional clustering, and the K-Means++ method is used to cluster the track geometry variation characteristics into multiple independent decision units, which puts the subsequent decision modeling on a sounder footing. Second, a Markov process is used to describe changes in track quality state and a track state transition probability matrix is established; railway operating income and maintenance costs are simplified to establish a profit model; a decision action model is established from routine track maintenance measures; and the MDP model under a complete environment is constructed from the track quality index (TQI) data of the different decision units. Finally, numerical experiments are carried out on track data from a high-speed railway, and value iteration is used to solve the MDP model so as to maximize the long-term expected profit of high-speed railway operation and determine the optimal maintenance decision for each track state. The aim is to reduce maintenance costs and optimize maintenance decisions, providing practical guidance for current railway maintenance work.
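The value iteration step named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model. The three track-quality states, the two actions, the transition probabilities, the profit values, and the discount factor `gamma` are all hypothetical placeholders chosen for illustration, and the helper names `q_value` and `value_iteration` are likewise assumptions. The sketch repeatedly applies the Bellman optimality update until the state values stop changing, then reads off the action with the highest expected long-term profit in each state.

```python
# Toy MDP for illustration only: every state, action, probability and
# profit value below is a hypothetical placeholder, not data from the paper.

STATES = ["good", "fair", "poor"]            # e.g. bands of track quality (TQI)
ACTIONS = ["no_maintenance", "maintain"]

# P[action][s][s2]: probability of moving from state s to state s2.
P = {
    "no_maintenance": [
        [0.8, 0.2, 0.0],
        [0.0, 0.7, 0.3],
        [0.0, 0.0, 1.0],
    ],
    "maintain": [
        [1.0, 0.0, 0.0],
        [0.9, 0.1, 0.0],
        [0.7, 0.3, 0.0],
    ],
}

# R[action][s]: one-step profit (operating income minus maintenance cost).
R = {
    "no_maintenance": [10.0, 6.0, 1.0],
    "maintain": [7.0, 3.0, -2.0],
}


def q_value(P, R, V, s, a, gamma):
    """Expected discounted profit of taking action a in state index s."""
    return R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Apply the Bellman optimality update until values change by less than tol."""
    n = len(STATES)
    V = [0.0] * n
    for _ in range(max_iter):
        new_V = [max(q_value(P, R, V, s, a, gamma) for a in ACTIONS)
                 for s in range(n)]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy: the best action in each state under the converged values.
    policy = [max(ACTIONS, key=lambda a: q_value(P, R, V, s, a, gamma))
              for s in range(n)]
    return V, policy


values, policy = value_iteration(P, R)
for name, v, a in zip(STATES, values, policy):
    print(f"{name}: value={v:.2f}, action={a}")
```

In a setting like the one the abstract describes, one such model would presumably be solved per decision unit (cluster), with the transition matrix estimated from that unit's TQI history. That estimation step is not shown here.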

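Cite this article: Zhao Wenbo. Track Condition Maintenance Decision Based on Markov Decision Process [J]. Hans Journal of Civil Engineering (土木工程), 2023, 12(3): 367-379. https://doi.org/10.12677/HJCE.2023.123041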

