Linear Quadratic Optimal Control for Discrete-Time Time-Varying Markov Jump Systems
Abstract: This paper studies the optimal control of discrete-time time-varying Markov jump systems over an infinite horizon. It is proved that the optimal control problem of any discrete-time time-varying Markov jump system can be determined by the solution of a generalized algebraic Riccati equation. It is also proved that the attainability of the optimal control problem is a sufficient condition for the existence of solutions of the generalized algebraic Riccati equation. Finally, a numerical example is given to verify the correctness of the results.
Citation: 赵红霞, 何鑫, 贾亚琪, 张春梅. Linear Quadratic Optimal Control for Discrete Time Time-Varying Markov Jump Systems [J]. 应用数学进展 (Advances in Applied Mathematics), 2023, 12(5): 2569-2581. https://doi.org/10.12677/AAM.2023.125258

1. Introduction

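Optimality is one of the goals pursued in everyday life and in production. When the control system under study is linear and the performance index is a quadratic function of the state variable $x(t)$ and the control input $u(t)$, the problem of finding a control that optimizes this quadratic index is called the linear quadratic optimal control problem [1]. This problem was first studied in 1958 by Bellman, Glicksberg and Gross [2]. Building on that work, R.E. Kalman established state feedback for quadratic optimal control [3], introduced the Riccati equation into control theory, and constructed the optimal linear feedback regulator [4] [5] [6] [7]. References [8] - [15] impose assumptions on the coefficients of linear systems and show that the stabilizing solution of any such linear system can be characterized by the solvability of a generalized Riccati equation [16] [17] [18]. In [19], several notions of stability are defined in order to guarantee that the linear quadratic control problem is well-posed and that a stabilizing feedback control exists.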

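Unlike [20], this paper considers linear quadratic optimal control for infinite-horizon discrete-time time-varying Markov jump systems with non-homogeneous Markov switching. It presents the existence of an optimal linear feedback in connection with the KKT theorem, and proves that the solvability of the generalized Riccati equation is equivalent to the well-posedness and attainability of the quadratic optimal problem. Whenever the linear quadratic optimal problem is well-posed, the corresponding generalized Riccati equation is solvable and an optimal control exists. Section 2 introduces the relevant definitions and the lemmas needed later; Section 3 describes the model in detail and explains the notation that appears in it.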

2. Preliminaries

To make the theorems and lemmas used later easier to apply, we state them briefly here.

Mathematical program:

$$\min f(x) \quad \text{s.t.} \quad \begin{cases} g(x) \le 0 \\ h(x) = 0 \end{cases}$$

where:

$$g(x) = (g_1(x), g_2(x), \ldots, g_m(x)), \qquad h(x) = (h_1(x), h_2(x), \ldots, h_n(x))$$

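Definition 1 (regular point): For a feasible point $x^*$ of the mathematical program, let $I = \{ i \mid g_i(x^*) = 0 \}$. If the gradient vectors $\nabla g_i(x^*)$, $i \in I$, and $\nabla h_j(x^*)$, $j = 1, 2, \ldots, n$, are linearly independent, then $x^*$ is called a regular point of the constraints.

Definition 2 (regularity condition): Let $I = \{ i \mid g_i(x^*) = 0 \}$. The requirement that the gradient vectors $\nabla g_i(x^*)$, $i \in I$, and $\nabla h_j(x^*)$, $j = 1, 2, \ldots, n$, are linearly independent is called the regularity condition.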

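Definition 3 (KKT theorem): Let the regular point $x^*$ be a local optimal solution of the mathematical program, and let the objective function $f(x)$ and the constraint functions $g(x) = (g_1(x), g_2(x), \ldots, g_m(x))$, $h(x) = (h_1(x), h_2(x), \ldots, h_n(x))$ be continuously differentiable at $x^*$. Then there exist $\lambda \in \mathbb{R}^m$, $\lambda \ge 0$, and $\mu \in \mathbb{R}^n$, called KKT multipliers, such that:

$$\begin{cases} \nabla_x L(x^*, \lambda, \mu) = 0 \\ \lambda^T g(x^*) = 0 \end{cases}$$

where the Lagrangian is $L(x, \lambda, \mu) = f(x) + \lambda^T g(x) + \mu^T h(x)$.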

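Lemma 1 [12]: Given a matrix $Q \in \mathbb{R}^{m \times n}$, the matrix $Q^{+} \in \mathbb{R}^{n \times m}$ is called the generalized inverse of $Q$ if:

$$\begin{cases} Q Q^{+} Q = Q, \quad Q^{+} Q Q^{+} = Q^{+} \\ (Q Q^{+})^T = Q Q^{+}, \quad (Q^{+} Q)^T = Q^{+} Q \end{cases}$$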

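Lemma 2 [12]: Given a symmetric matrix $Q$, then:

$$\begin{cases} (Q^{+})^T = (Q^T)^{+}, \quad Q Q^{+} = Q^{+} Q \\ Q \ge 0 \ \text{if and only if} \ Q^{+} \ge 0 \end{cases}$$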
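The identities in Lemma 1 and Lemma 2 can be checked numerically. The following is a minimal sketch, assuming that the generalized inverse is the Moore-Penrose inverse as computed by NumPy's `np.linalg.pinv`; the matrices `Q` and `S` and the helper `penrose_residuals` are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def penrose_residuals(Q):
    """Largest absolute residual of each of the four conditions in Lemma 1."""
    Qp = np.linalg.pinv(Q)  # Moore-Penrose generalized inverse Q^+
    return [
        np.abs(Q @ Qp @ Q - Q).max(),
        np.abs(Qp @ Q @ Qp - Qp).max(),
        np.abs((Q @ Qp).T - Q @ Qp).max(),
        np.abs((Qp @ Q).T - Qp @ Q).max(),
    ]

# Hypothetical rank-deficient 3x2 matrix (its second column is zero).
Q = np.array([[1.0, 0.0], [2.0, 0.0], [2.0, 0.0]])
residuals = penrose_residuals(Q)

# Hypothetical symmetric, singular, positive semidefinite matrix for Lemma 2.
S = np.array([[2.0, 0.0], [0.0, 0.0]])
Sp = np.linalg.pinv(S)
commutes = bool(np.allclose(S @ Sp, Sp @ S))                 # S S^+ = S^+ S
pinv_is_psd = bool(np.linalg.eigvalsh(Sp).min() >= -1e-12)   # S >= 0 gives S^+ >= 0
```

All four residuals should be at rounding level even though `Q` has no ordinary inverse, which is the point of working with $Q^{+}$ rather than $Q^{-1}$.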

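Lemma 3 (Schur complement lemma) [13]: Let $Q^T = Q$, $N = N^T$, $R = R^T$ be matrices of appropriate dimensions. Then the following conditions are equivalent:

(I) $Q - N R^{+} N^T \ge 0$, $R \ge 0$, $N (I - R R^{+}) = 0$

(II) $\begin{bmatrix} Q & N \\ N^T & R \end{bmatrix} \ge 0$

(III) $\begin{bmatrix} R & N^T \\ N & Q \end{bmatrix} \ge 0$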
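A small numerical illustration of the equivalence between conditions (I) and (II) may help. This is a sketch under assumptions: the test matrices are hypothetical, positive semidefiniteness is tested through the smallest eigenvalue with a tolerance, and the function names are chosen here for illustration.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def schur_condition_I(Q, N, R, tol=1e-9):
    """Condition (I): Q - N R^+ N^T >= 0, R >= 0 and N (I - R R^+) = 0."""
    Rp = np.linalg.pinv(R)
    range_ok = np.abs(N @ (np.eye(R.shape[0]) - R @ Rp)).max() <= tol
    return bool(is_psd(Q - N @ Rp @ N.T, tol) and is_psd(R, tol) and range_ok)

def schur_condition_II(Q, N, R, tol=1e-9):
    """Condition (II): the block matrix [[Q, N], [N^T, R]] is >= 0."""
    return is_psd(np.block([[Q, N], [N.T, R]]), tol)

# Hypothetical data with a singular R.
Q = np.diag([2.0, 2.0])
R = np.diag([1.0, 0.0])
N_good = np.diag([1.0, 0.0])  # N (I - R R^+) = 0 holds
N_bad = np.diag([0.0, 1.0])   # N (I - R R^+) = 0 fails

good = (schur_condition_I(Q, N_good, R), schur_condition_II(Q, N_good, R))
bad = (schur_condition_I(Q, N_bad, R), schur_condition_II(Q, N_bad, R))
```

In the second case $Q - N R^{+} N^T$ is still positive semidefinite, so the range condition $N(I - R R^{+}) = 0$ is what separates the two cases when $R$ is singular.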

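Lemma 4 [14]: Given matrices $Q$, $K$, $L$, the matrix equation $Q X K = L$ has a solution $X$ if and only if $Q Q^{+} L K^{+} K = L$. Moreover, every solution has the form $X = Q^{+} L K^{+} + Y - Q^{+} Q Y K K^{+}$, where $Y$ is a matrix of appropriate dimension.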
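Lemma 4 translates directly into a solvability test and a solution formula. The sketch below assumes the Moore-Penrose inverse from `np.linalg.pinv`; the function name `solve_qxk`, the tolerance, and the example matrices are hypothetical.

```python
import numpy as np

def solve_qxk(Q, K, L, Y=None, tol=1e-10):
    """Return one solution X of Q X K = L, or None if Lemma 4's test fails."""
    Qp, Kp = np.linalg.pinv(Q), np.linalg.pinv(K)
    # Solvability condition: Q Q^+ L K^+ K = L.
    if np.abs(Q @ Qp @ L @ Kp @ K - L).max() > tol:
        return None
    if Y is None:
        Y = np.zeros((Q.shape[1], K.shape[0]))  # free parameter, default 0
    # General solution: X = Q^+ L K^+ + Y - Q^+ Q Y K K^+.
    return Qp @ L @ Kp + Y - Qp @ Q @ Y @ K @ Kp

# Hypothetical example with a singular Q.
Q = np.array([[1.0, 0.0], [0.0, 0.0]])
K = np.eye(2)
L_ok = np.array([[3.0, 4.0], [0.0, 0.0]])
L_bad = np.array([[3.0, 4.0], [1.0, 0.0]])  # second row is outside the range of Q

X0 = solve_qxk(Q, K, L_ok)                       # minimal-norm choice, Y = 0
X1 = solve_qxk(Q, K, L_ok, Y=np.ones((2, 2)))    # another solution
X_none = solve_qxk(Q, K, L_bad)
```

Different choices of `Y` give different solutions of the same equation, which is why the optimal gain derived from this lemma is in general not unique.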

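3. System Description

This section considers the following stochastic discrete-time system:

$$x(t+1) = A_0(t,\theta_t) x(t) + \sum_{k=1}^{r} A_k(t,\theta_t) x(t) \omega(t) + B_0(t,\theta_t) u(t) + \sum_{k=1}^{r} B_k(t,\theta_t) u(t) \omega(t) + \omega(t) \tag{2-1}$$

$$a_{i1} x_1(T) + a_{i2} x_2(T) + \cdots + a_{in} x_n(T) \le \xi_i$$

where $A_0(t,\theta_t)$, $A_k(t,\theta_t)$, $B_0(t,\theta_t)$ and $B_k(t,\theta_t)$ are matrices of appropriate dimensions, $x \in \mathbb{R}^n$ is the system state, $u \in \mathbb{R}^m$ is the control input, $x_0 \in \mathbb{R}^n$ is the given initial state, and $\{\omega(t)\}_{t \ge 0}$ is a sequence of one-dimensional independent random variables defined on the complete probability space $(\Omega, F, P)$. Here $\omega(t)$ is the noise, $t \in \{t_0, t_0+1, \ldots, T-1\}$, and $N_T = \{0, 1, \ldots, T\}$. Let $N_{r \times n} = (a_{ij})_{r \times n}$ and $\xi = (\xi_1, \xi_2, \ldots, \xi_r)^T$, where the $a_{ij}$ are constants.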

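The quadratic performance index associated with system (2-1) is:

$$J(t_0, x_0; u(t_0), u(t_0+1), \ldots, u(T-1)) = E\left\{ \sum_{t=t_0}^{T-1} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix}^T \begin{pmatrix} Q(t,\theta_t) & L(t,\theta_t) \\ (L(t,\theta_t))^T & R(t,\theta_t) \end{pmatrix} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix} + x^T(T) M x(T) \right\} \tag{2-2}$$

where $\begin{pmatrix} Q(t,\theta_t) & L(t,\theta_t) \\ (L(t,\theta_t))^T & R(t,\theta_t) \end{pmatrix}$ and $M$ are symmetric matrices. The value function is defined as:

$$V(t_0, x_0) = \inf_{u(t_0), u(t_0+1), \ldots, u(T-1)} J(t_0, x_0; u(\cdot)) \tag{2-3}$$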

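Assumption 1: For convenience in what follows, we assume:

1) All of the above matrices are bounded matrix-valued sequences.

2) For any $t \ge 0$, $P_t = (p_t(i,j))_{(i,j) \in Z \times Z}$ is a non-degenerate stochastic matrix which, for any $(i,j) \in Z \times Z$, satisfies:

$$\begin{cases} 0 \le p_t(i,j) \le 1 \\ \sum_{k=1}^{N} p_t(i,k) = 1 \\ \sum_{k=1}^{N} p_t(k,j) > 0 \end{cases} \tag{2-4}$$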
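Condition (2-4) is easy to test for a given transition matrix. The following is a minimal sketch; the function name and the example matrices are hypothetical, and rows are taken to index the current mode $i$ while columns index the next mode $j$.

```python
import numpy as np

def satisfies_2_4(P, tol=1e-12):
    """Check (2-4): entries in [0, 1], unit row sums, positive column sums."""
    P = np.asarray(P, dtype=float)
    entries_ok = np.all((P >= 0.0) & (P <= 1.0))
    rows_ok = np.allclose(P.sum(axis=1), 1.0, rtol=0.0, atol=tol)
    cols_ok = np.all(P.sum(axis=0) > 0.0)  # non-degeneracy: every mode is reachable
    return bool(entries_ok and rows_ok and cols_ok)

ok = satisfies_2_4([[0.9, 0.1], [0.4, 0.6]])
unreachable_mode = satisfies_2_4([[1.0, 0.0], [1.0, 0.0]])  # mode 2 is never entered
bad_row_sum = satisfies_2_4([[0.5, 0.6], [0.5, 0.5]])       # first row sums to 1.1
```

For a time-varying chain the same check would be applied to each $P_t$ separately.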

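Assumption 2: $\{\omega(t)\}_{t \ge 0}$ is a sequence of one-dimensional independent random variables defined on the complete probability space $(\Omega, F, P)$, independent of the initial condition $x(t_0) = x_0$, with the following properties:

$$E[\omega(t)] = 0, \quad E[\omega(s)(\omega(t))^T] = \delta_{st}, \quad E[(\omega(t))^2] = 1 \tag{2-5}$$

where:

$$\delta_{st} = \begin{cases} I_r, & s = t \\ 0, & s \ne t \end{cases} \tag{2-6}$$

Assumption 3: For any $t \ge 0$, the $\sigma$-algebra $F_t$ and the $\sigma$-algebra $G_t$ are independent, where $F_t = \sigma\{\omega(s), 0 \le s \le t\}$ and $G_t = \sigma\{\theta(s), 0 \le s \le t\}$.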

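Definition 4: The infinite-horizon stochastic linear quadratic problem (2-1), (2-2), (2-3) is well-posed if $V(t_0, x_0) > -\infty$ for any $(t_0, x_0) \in N_T \times \mathbb{R}^n$.

Definition 5: The infinite-horizon stochastic linear quadratic problem (2-1), (2-2), (2-3) is attainable if, for any $(t_0, x_0) \in N_T \times \mathbb{R}^n$, there exists a sequence $\{u^*(t), t = t_0, t_0+1, \ldots, T-1\}$ that is an optimal control of the system, such that $V(t_0, x_0) = J(t_0, x_0; u^*(t_0), u^*(t_0+1), \ldots, u^*(T-1))$.

Remark 1: If some linear feedback control is optimal for the infinite-horizon stochastic linear quadratic problem (2-1), (2-2), (2-3), then there exists an optimal linear feedback control of the form:

$$u(t) = K(t,\theta_t) x(t), \quad t \in N_T$$

where $K(t,\theta_t)$ is a matrix-valued function.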

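4. Main Results

We find that the optimal control problem for system (2-1), (2-2), (2-3) can be expressed in terms of the solution of a generalized Riccati difference equation.

4.1. Stochastic Linear Systems and the Riccati Equation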

Theorem 1: If the optimal control problem for system (2-1), (2-2), (2-3) is attainable with $u(t) = K(t,\theta_t) x(t)$, $t \in N_T$, and the regular point $(u^*(t), x^*(t))$ is a local optimal solution of (2-1), (2-2), (2-3), then the following generalized Riccati equation (GDRE) has a solution $(P(t), \mu)$, $0 \le \mu \in \mathbb{R}^1$, $t \in N_T$:

$$P(t) = (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) + Q(t,\theta_t) - (T(t,\theta_t))^T (W(t,\theta_t))^{+} T(t,\theta_t) \tag{2-7}$$

$$T(t,\theta_t) = (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) + (L(t,\theta_t))^T \tag{2-8}$$

$$W(t,\theta_t) = (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) + R(t,\theta_t) \tag{2-9}$$

$$W(t,\theta_t) \ge 0$$

$$\mu = \frac{\mathrm{Tr}\{P(T) - M\}}{\mathrm{Tr}\{N^T N\}} \tag{2-10}$$

Moreover,

$$u^*(t) = \left[ -(W(t,\theta_t))^{+} T(t,\theta_t) + Y(t,\theta_t) - (W(t,\theta_t))^{+} W(t,\theta_t) Y(t,\theta_t) \right] x(t), \quad u^*(t) \in \mathbb{R}^{m}, \ t \in N_T \tag{2-11}$$

$$V(x_0, \theta_t) = \sum_{t=1}^{T-1} \mathrm{Tr}\{V(t,\theta_t) E(t)[P(t+1)](\theta_t)\} + x_0^T P(0,\theta_t) x_0 - \mu \mathrm{Tr}\{M\} \tag{2-12}$$

Proof: For any $t \in N_T$, let $X(t,\theta_t) = E\left([x(t) x(t)^T] \chi_{\{\theta_{t_0} = i\}}\right)$, where $u(t) = K(t,\theta_t) x(t)$, $t \in N_T$. Then the linear quadratic control problem (2-1), (2-2), (2-3) can be transformed into the following optimization problem:

$$\begin{aligned} J(t_0, x_0; u(t_0), u(t_0+1), \ldots, u(\infty)) &= E\left\{ \sum_{t=t_0}^{\infty} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix}^T \begin{pmatrix} Q(t,\theta_t) & L(t,\theta_t) \\ (L(t,\theta_t))^T & R(t,\theta_t) \end{pmatrix} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix} + \lim_{T \to \infty} x^T(T) M x(T) \right\} \\ &= \sum_{t=t_0}^{\infty} E\left\{ (x(t))^T Q(t,\theta_t) x(t) + (x(t))^T (K(t,\theta_t))^T (L(t,\theta_t))^T x(t) + (x(t))^T L(t,\theta_t) K(t,\theta_t) x(t) + (x(t))^T (K(t,\theta_t))^T R(t,\theta_t) K(t,\theta_t) x(t) \right\} + \lim_{T \to \infty} E\left\{ x^T(T) M x(T) \right\} \\ &= \sum_{t=t_0}^{\infty} \mathrm{Tr}\left\{ Q(t,\theta_t) X(t) + (K(t,\theta_t))^T R(t,\theta_t) K(t,\theta_t) X(t) + (K(t,\theta_t))^T (L(t,\theta_t))^T X(t) + L(t,\theta_t) K(t,\theta_t) X(t) \right\} + \lim_{T \to \infty} \mathrm{Tr}\{ M X(T) \} \end{aligned} \tag{2-13}$$

subject to:

$$\begin{aligned} X(t+1) &= A_0(t,\theta_t) X(t) (A_0(t,\theta_t))^T + A_0(t,\theta_t) X(t) (K(t,\theta_t))^T (B_0(t,\theta_t))^T + B_0(t,\theta_t) K(t,\theta_t) X(t) (A_0(t,\theta_t))^T + B_0(t,\theta_t) K(t,\theta_t) X(t) (K(t,\theta_t))^T (B_0(t,\theta_t))^T \\ &\quad + \sum_{k=1}^{r} A_k(t,\theta_t) X(t) (A_k(t,\theta_t))^T + \sum_{k=1}^{r} A_k(t,\theta_t) X(t) (K(t,\theta_t))^T (B_k(t,\theta_t))^T + \sum_{k=1}^{r} B_k(t,\theta_t) K(t,\theta_t) X(t) (A_k(t,\theta_t))^T \\ &\quad + \sum_{k=1}^{r} B_k(t,\theta_t) K(t,\theta_t) X(t) (K(t,\theta_t))^T (B_k(t,\theta_t))^T + V(t) \end{aligned} \tag{2-14}$$

$$X(t_0) = E[x_0 x_0^T] = x_0 x_0^T, \qquad \mathrm{Tr}(X(T) N^T N) \le \mathrm{Tr}\, S, \qquad S = E[\xi \xi^T]$$

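That is:

$$\min_{t \in N_{T-1}} f[X(t), K(t,\theta_t)]$$

$$\text{s.t.} \quad H[X(t), K(t,\theta_t)] = 0, \qquad G(X(T)) \le 0$$

where $K(1,\theta_t), K(2,\theta_t), \ldots, K(T-1,\theta_t)$ are the control gains to be determined, and

$$f[X(t), K(t,\theta_t)] = J(t_0, x_0; u(t_0), u(t_0+1), \ldots, u(T-1)) \tag{2-15}$$

Here $H[X(t), K(t,\theta_t)]$ denotes the right-hand side of (2-14) minus $X(t+1)$, so that the constraint $H[X(t), K(t,\theta_t)] = 0$ is the recursion (2-14), and $G(X(T)) = \mathrm{Tr}(X(T) N^T N) - \mathrm{Tr}\, S$.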

By the KKT theorem, let the Lagrangian be:

$$L[X(t), K(t,\theta_t), P(t+1,\theta_t), \mu] = f[X(t), K(t,\theta_t)] + \sum_{t=t_0}^{\infty} \mathrm{Tr}\{ P(t+1,\theta_t) H[X(t), K(t,\theta_t)] \} + \mu G(X(T)) \tag{2-16}$$

Differentiating $L[X(t), K(t,\theta_t), P(t+1,\theta_t), \mu]$ with respect to $X(t)$, the KKT theorem gives:

$$\frac{\partial L[X(t), K(t,\theta_t), P(t+1,\theta_t), \mu]}{\partial X(t)} = 0$$

A direct computation shows that if $X(0) = 0$, then

$$\mathrm{Tr}(M \Delta X(T)) - \mathrm{Tr}(\Delta X(T) P(T,\theta_t)) + \mu \mathrm{Tr}(\Delta X(T) N^T N) = 0$$

that is:

$$P(T,\theta_t) = M + \mu N^T N \tag{2-17}$$

where $T \to \infty$.

Moreover, since $\Delta X(T)$ and $\Delta X(t)$ are independent, we have:

$$\begin{aligned} P(t,\theta_t) &= Q(t,\theta_t) + (K(t,\theta_t))^T \left[ R(t,\theta_t) + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \right] K(t,\theta_t) \\ &\quad + \left[ L(t,\theta_t) + (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \right] K(t,\theta_t) \\ &\quad + (K(t,\theta_t))^T \left[ (L(t,\theta_t))^T + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) \right] \\ &\quad + (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) \end{aligned} \tag{2-18}$$

Similarly, differentiating with respect to $K(t,\theta_t)$ gives:

$$\left[ R(t,\theta_t) + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \right] K(t,\theta_t) + (L(t,\theta_t))^T + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) = 0 \tag{2-19}$$

Set:

$$W(t,\theta_t) = R(t,\theta_t) + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \tag{2-20}$$

$$T(t,\theta_t) = (L(t,\theta_t))^T + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) \tag{2-21}$$

With (2-20) and (2-21), equations (2-18) and (2-19) can be written as:

$$P(t) = Q(t,\theta_t) + (K(t,\theta_t))^T W(t,\theta_t) K(t,\theta_t) + (T(t,\theta_t))^T K(t,\theta_t) + (K(t,\theta_t))^T T(t,\theta_t) + (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) \tag{2-22}$$

$$W(t,\theta_t) K(t,\theta_t) + T(t,\theta_t) = 0 \tag{2-23}$$

By Lemma 4, the equation $W(t,\theta_t) K(t,\theta_t) + T(t,\theta_t) = 0$ has a solution $K(t,\theta_t)$ if and only if $W(t,\theta_t) (W(t,\theta_t))^{+} T(t,\theta_t) = T(t,\theta_t)$, in which case the solution $K(t,\theta_t)$ is given by:

$$K(t,\theta_t) = -(W(t,\theta_t))^{+} T(t,\theta_t) + Y(t,\theta_t) - (W(t,\theta_t))^{+} W(t,\theta_t) Y(t,\theta_t) \tag{2-24}$$
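The gain formula (2-24) can be sketched numerically as follows. This is an illustration under assumptions rather than the authors' implementation: `W` and `T` are hypothetical matrices with a singular `W`, the sign convention is $K = -W^{+}T + Y - W^{+}WY$, and the function name and tolerance are chosen here.

```python
import numpy as np

def feedback_gain(W, T, Y=None, tol=1e-10):
    """Solve W K + T = 0 via Lemma 4: K = -W^+ T + Y - W^+ W Y."""
    Wp = np.linalg.pinv(W)
    # Solvability test from Lemma 4, specialised to W K = -T.
    if np.abs(W @ Wp @ T - T).max() > tol:
        raise ValueError("W K + T = 0 has no solution")
    if Y is None:
        Y = np.zeros_like(T)  # free parameter of the solution family
    return -Wp @ T + Y - Wp @ W @ Y

# Hypothetical data: m = 2 inputs, n = 2 states, W singular but W >= 0.
W = np.diag([2.0, 0.0])
T = np.array([[4.0, 2.0], [0.0, 0.0]])

K0 = feedback_gain(W, T)
K1 = feedback_gain(W, T, Y=np.array([[1.0, 1.0], [5.0, 7.0]]))
```

Both `K0` and `K1` satisfy $WK + T = 0$; they differ only in the directions that `W` does not penalise, so $W K_0 = W K_1$.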

Substituting $K(t,\theta_t)$ into (2-22) yields the following generalized Riccati equation:

$$P(t) = Q(t,\theta_t) + (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) - (T(t,\theta_t))^T (W(t,\theta_t))^{+} T(t,\theta_t) \tag{2-25}$$

which is (2-7).
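The substitution step can be spelled out. The following short derivation is a sketch that suppresses the arguments $(t,\theta_t)$ and assumes that $W$ is symmetric, that $W W^{+} T = T$, that $T$ is the $m \times n$ matrix $L^T + B_0^T E(t)[P(t+1)] A_0 + \sum_k B_k^T E(t)[P(t+1)] A_k$, and that the gain-dependent part of (2-22) is written as $K^T W K + T^T K + K^T T$:

$$\begin{aligned} W K &= -W W^{+} T + (W - W W^{+} W) Y = -T, \\ K^T W K + T^T K + K^T T &= -K^T T + T^T K + K^T T = T^T K, \\ T^T K &= -T^T W^{+} T + T^T (I - W^{+} W) Y = -T^T W^{+} T. \end{aligned}$$

The last equality uses $W^{+} W = W W^{+}$ (Lemma 2), so that $T^T (I - W^{+} W) = \left( (I - W W^{+}) T \right)^T = 0$. The free parameter $Y$ therefore drops out of the Riccati equation even though it remains in the gain.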

$$W(t,\theta_t) = R(t,\theta_t) + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \tag{2-26}$$

which is (2-9).

$$T(t,\theta_t) = (L(t,\theta_t))^T + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) \tag{2-27}$$

which is (2-8).

Taking traces in $P(T,\theta_t) = M + \mu N^T N$ gives (2-10):

$$\mu = \frac{\mathrm{Tr}\{P(T,\theta_t) - M\}}{\mathrm{Tr}\{N^T N\}}$$

and we obtain:

$$K(t,\theta_t) = -(W(t,\theta_t))^{+} T(t,\theta_t) + Y(t,\theta_t) - (W(t,\theta_t))^{+} W(t,\theta_t) Y(t,\theta_t) \tag{2-28}$$

Since $u(t) = K(t,\theta_t) x(t)$, $t \in N_T$, it follows that:

$$u^*(t) = \left[ -(W(t,\theta_t))^{+} T(t,\theta_t) + Y(t,\theta_t) - (W(t,\theta_t))^{+} W(t,\theta_t) Y(t,\theta_t) \right] x(t), \quad u^*(t) \in \mathbb{R}^{m}, \ t \in N_T \tag{2-29}$$

which is (2-11).

Assume that $P(t+1,\theta_t)$ is a symmetric matrix. If $P(t+1,\theta_t)$ is not symmetric, take:

$$\tilde{P}(t+1,\theta_t) = \frac{P(t+1,\theta_t) + (P(t+1,\theta_t))^T}{2}$$

Clearly, $\tilde{P}(t+1,\theta_t)$ is a symmetric matrix.

$$\sum_{t=0}^{T-1} E\left[ (x(t+1))^T E(t)[P(t+1)](\theta_t) x(t+1) - (x(t))^T E(t)[P(t)](\theta_t) x(t) \right] = E\left[ (x(T))^T E(t)[P(T)](\theta_T) x(T) - (x(0))^T E(t)[P(0)](\theta_0) x(0) \right] \tag{2-30}$$

Rearranging (2-30) gives:

$$\sum_{t=0}^{T-1} E\left[ (x(t+1))^T E(t)[P(t+1)](\theta_t) x(t+1) - (x(t))^T E(t)[P(t)](\theta_t) x(t) \right] - E\left[ (x(T))^T E(t)[P(T)](\theta_T) x(T) \right] + E\left[ (x(0))^T E(t)[P(0)](\theta_0) x(0) \right] = 0 \tag{2-31}$$

Adding (2-31) to (2-2) gives:

$$\begin{aligned} J(t_0, x_0; u(t_0), u(t_0+1), \ldots, u(T-1)) &= E\left\{ \sum_{t=t_0}^{T-1} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix}^T \begin{pmatrix} Q(t,\theta_t) & L(t,\theta_t) \\ (L(t,\theta_t))^T & R(t,\theta_t) \end{pmatrix} \begin{pmatrix} x(t) \\ u(t) \end{pmatrix} + x^T(T) M x(T) \right\} \\ &\quad + \sum_{t=0}^{T-1} E\left[ (x(t+1))^T E(t)[P(t+1)](\theta_t) x(t+1) - (x(t))^T E(t)[P(t)](\theta_t) x(t) \right] - E\left[ (x(T))^T E(t)[P(T)](\theta_T) x(T) - (x(0))^T P(0,\theta_0) x(0) \right] \\ &= \sum_{t=t_0}^{T-1} E\left\{ \left( u(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right)^T W(t,\theta_t) \left( u(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right) \right\} + \sum_{t=t_0}^{T-1} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] \\ &\quad + E\left[ (x(T))^T (M - P(T,\theta_t)) x(T) \right] + (x(0))^T P(0,\theta_0) x(0) \end{aligned} \tag{2-32}$$

It remains to prove that $W(t,\theta_t) \ge 0$, $t \in N_T$. Suppose that $W(t,\theta_t)$ has a negative eigenvalue $\alpha$, and let $v_\alpha$ be a unit eigenvector corresponding to $\alpha$, so that:

$$\begin{cases} (v_\alpha)^T v_\alpha = 1 \\ W(t,\theta_t) v_\alpha = \alpha v_\alpha \end{cases}$$

For any $\delta \ne 0$, define the control sequence:

$$\tilde{u}(t) = \begin{cases} -(W(t,\theta_t))^{+} T(t,\theta_t) x(t), & t \ne k \\ \delta |\alpha|^{-\frac{1}{2}} v_\alpha - (W(t,\theta_t))^{+} T(t,\theta_t) x(t), & t = k \end{cases}$$

The corresponding objective function is:

$$\begin{aligned} J(t_0, x_0; \tilde{u}(t_0), \tilde{u}(t_0+1), \ldots, \tilde{u}(T-1)) &= \sum_{t=t_0}^{T-1} E\left\{ \left( \tilde{u}(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right)^T W(t,\theta_t) \left( \tilde{u}(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right) \right\} + \sum_{t=t_0}^{T-1} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] \\ &\quad + E\left[ (x(T))^T (M - P(T,\theta_t)) x(T) \right] + (x(0))^T P(0,\theta_0) x(0) \\ &= \left( \delta |\alpha|^{-\frac{1}{2}} v_\alpha \right)^T W(k,\theta_k) \, \delta |\alpha|^{-\frac{1}{2}} v_\alpha + \sum_{t=t_0}^{T-1} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] + E\left[ (x(T))^T (M - P(T,\theta_t)) x(T) \right] + (x(0))^T P(0,\theta_0) x(0) \end{aligned} \tag{2-33}$$

As $\delta \to \infty$, $J(t_0, x_0; \tilde{u}(t_0), \tilde{u}(t_0+1), \ldots, \tilde{u}(T-1)) \to -\infty$, which contradicts the well-posedness of the linear quadratic control problem (2-1), (2-2), (2-3). Hence $W(t,\theta_t) \ge 0$ holds.
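The contradiction rests on one line of arithmetic. As a sketch, assuming that the perturbation at time $k$ is scaled by $|\alpha|^{-1/2}$, that $v_\alpha$ is a unit eigenvector, and that $\alpha < 0$:

$$\left( \delta |\alpha|^{-\frac{1}{2}} v_\alpha \right)^T W \left( \delta |\alpha|^{-\frac{1}{2}} v_\alpha \right) = \delta^2 |\alpha|^{-1} \, \alpha \, (v_\alpha)^T v_\alpha = \delta^2 \frac{\alpha}{|\alpha|} = -\delta^2 .$$

The quadratic term in the cost therefore contributes $-\delta^2$, which is unbounded below as $|\delta|$ grows.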

From $P(T,\theta_t) = M + \mu N^T N$ it follows that:

$$V(x_0) = J(t_0, x_0; u^*(t_0), u^*(t_0+1), \ldots, u^*(T-1)) = \sum_{t=t_0}^{T-1} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] - \mu \mathrm{Tr}\, S + (x(0))^T P(0,\theta_0) x(0) \tag{2-34}$$

This completes the proof of Theorem 1.
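One backward step of the recursion (2-7)-(2-9) can be sketched in code. Everything numerical below is hypothetical, and two modelling choices are assumptions rather than statements from the paper: the operator is taken to be $E(t)[P(t+1)](i) = \sum_j p_t(i,j) P(t+1,j)$, and $T$ includes the $(L(t,\theta_t))^T$ term. The sketch uses $r = 1$ and a single current mode.

```python
import numpy as np

def riccati_step(P_next, p_row, A0, A1, B0, B1, Q, L, R):
    """One backward step of (2-7)-(2-9) for one mode i, with r = 1."""
    # Assumed operator: E(t)[P(t+1)](i) = sum_j p_t(i, j) * P(t+1, j).
    EP = sum(p * Pj for p, Pj in zip(p_row, P_next))
    W = B0.T @ EP @ B0 + B1.T @ EP @ B1 + R        # (2-9)
    T = B0.T @ EP @ A0 + B1.T @ EP @ A1 + L.T      # (2-8), including L^T
    Wp = np.linalg.pinv(W)
    P = A0.T @ EP @ A0 + A1.T @ EP @ A1 + Q - T.T @ Wp @ T   # (2-7)
    K = -Wp @ T                                    # gain with Y = 0
    return P, K, W, T, EP

# Hypothetical data: n = 2 states, m = 1 input, two modes at time t+1.
A0 = np.array([[1.0, 0.1], [0.0, 0.9]])
A1 = 0.1 * np.eye(2)
B0 = np.array([[0.0], [1.0]])
B1 = np.array([[0.1], [0.0]])
Q = np.eye(2)
L = np.array([[0.1], [0.0]])
R = np.array([[1.0]])
P_next = [np.eye(2), 2.0 * np.eye(2)]   # P(t+1, j) for j = 1, 2
p_row = [0.7, 0.3]                      # p_t(i, 1), p_t(i, 2)

P, K, W, T, EP = riccati_step(P_next, p_row, A0, A1, B0, B1, Q, L, R)

# Cross-check: the same P written as the cost-to-go of the closed loop u = K x.
Acl0, Acl1 = A0 + B0 @ K, A1 + B1 @ K
P_closed = (Q + K.T @ R @ K + L @ K + K.T @ L.T
            + Acl0.T @ EP @ Acl0 + Acl1.T @ EP @ Acl1)
```

The cross-check compares the Riccati form of `P` with the closed-loop form under $u = Kx$. Iterating `riccati_step` backwards from the terminal condition $P(T) = M + \mu N^T N$ would give the whole sequence $P(t)$.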

4.2. Properties of the Riccati Equation

Theorem 2: If the optimal control problem for system (2-1), (2-2), (2-3) is attainable with $u(t) = K(t,\theta_t) x(t)$, $t \in N_T$, and the regular point $(u^*(t), x^*(t))$ is a local optimal solution of (2-1), (2-2), (2-3), then the following generalized Riccati equation (GDRE) has a solution $(P(t), \mu)$, $0 \le \mu \in \mathbb{R}^1$, $t \in N_T$:

$$P(t) = (A_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (A_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) + Q(t,\theta_t) - (T(t,\theta_t))^T (W(t,\theta_t))^{+} T(t,\theta_t) \tag{2-35}$$

$$T(t,\theta_t) = (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) + (L(t,\theta_t))^T \tag{2-36}$$

$$W(t,\theta_t) = (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) + R(t,\theta_t) \tag{2-37}$$

$$W(t,\theta_t) > 0$$

$$\mu = \frac{\mathrm{Tr}\{P(T,\theta_t) - M\}}{\mathrm{Tr}\{N^T N\}}$$

Moreover,

$$u^*(t) = -\left[ (W(t,\theta_t))^{+} T(t,\theta_t) \right] x(t), \quad u^*(t) \in \mathbb{R}^{m}, \ t \in N_T$$

$$V(x_0, \theta_t) = \sum_{t=t_0}^{\infty} \mathrm{Tr}\{ V(t,\theta_t) E(t)[P(t+1)](\theta_t) \} + x_0^T P(0,\theta_t) x_0 - \mu \mathrm{Tr}\{M\} \tag{2-38}$$

Proof: By Theorem 1, the generalized Riccati equation associated with system (2-1), (2-2), (2-3) satisfies (2-23), where $W(t,\theta_t) > 0$, and:

$$\left[ R(t,\theta_t) + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) B_k(t,\theta_t) \right] K(t,\theta_t) + (L(t,\theta_t))^T + (B_0(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_0(t,\theta_t) + \sum_{k=1}^{r} (B_k(t,\theta_t))^T E(t)[P(t+1)](\theta_t) A_k(t,\theta_t) = 0 \tag{2-39}$$

That is, $W(t,\theta_t) K(t,\theta_t) + T(t,\theta_t) = 0$. Since $W(t,\theta_t) > 0$, the inverse $(W(t,\theta_t))^{-1}$ exists, so that:

$$K(t,\theta_t) = -(W(t,\theta_t))^{-1} T(t,\theta_t) \tag{2-40}$$

Hence:

$$u^*(t) = K(t,\theta_t) x(t) = -(W(t,\theta_t))^{-1} T(t,\theta_t) x(t) \tag{2-41}$$

As in Theorem 1, we also obtain:

$$\begin{aligned} J(t_0, x_0; \tilde{u}(t_0), \tilde{u}(t_0+1), \ldots, \tilde{u}(\infty)) &= \sum_{t=t_0}^{\infty} E\left\{ \left( u(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right)^T W(t,\theta_t) \left( u(t) + (W(t,\theta_t))^{+} T(t,\theta_t) x(t) \right) \right\} + \sum_{t=t_0}^{T-1} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] \\ &\quad + E\left[ (x(T))^T (M - P(T,\theta_t)) x(T) \right] + (x(0))^T P(0,\theta_0) x(0) \end{aligned} \tag{2-42}$$

Substituting (2-41) into (2-42) and letting $T \to \infty$ gives:

$$V(x_0, \theta_t) = \sum_{t=t_0}^{\infty} \mathrm{Tr}\left[ V(t) E(t)[P(t+1)](\theta_t) \right] - \mu \mathrm{Tr}\{M\} + (x(0))^T P(0,\theta_0) x(0) \tag{2-43}$$

This completes the proof of Theorem 2.

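5. Numerical Example

Consider a special case of system (2-1), (2-2), (2-3) with $N = 1$, $Z = \{1\}$, $r = 1$, and the following coefficients:

$$A_0 = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \quad A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad B_0 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad B_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x_0 = ( 1 \;\; 2 \;\; 3 \;\; 4 ), \quad M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

$$L(0) = ( 0 \;\; 0 ), \quad R(0) = 13, \quad V(0) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \quad Q(0) = \begin{pmatrix} 3 & 0 \\ 0 & 0 \end{pmatrix}, \quad a_{11} = a_{22} = 0, \quad a_{12} = a_{21} = 0$$

$$\xi_1 = \xi_2 = 1, \quad Y = ( 0 \;\; 0 )$$

By Theorem 1, direct computation gives:

$$P(0) = \begin{pmatrix} 65 & 64 \\ 64 & 52 \end{pmatrix}, \quad P(1) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$$

$$H(0) = ( 8 \;\; 8 ), \quad G(0) = 1, \quad \mu = 1, \quad u = ( 8 \;\; 8 ) \, x(t)$$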

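6. Conclusion

This paper reduces the linear quadratic optimal control problem for discrete-time, infinite-horizon, time-varying, non-homogeneous Markov jump systems to the solvability of a generalized Riccati equation. This is significant for subsequent research: the solvability of the generalized Riccati equation makes it easier to study the stability of the corresponding system, and it points toward using operator theory and stochastic analysis to study mean-square stability, detectability and related problems for such systems.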

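Funding

Supported by the Action Plan for High-Quality Development of Graduate Education of Chongqing University of Technology, project No. gzlcx20223308, project type: university-level jointly funded project; and by the Action Plan for High-Quality Development of Graduate Education of Chongqing University of Technology, project No. gzlcx20223304, project type: university-level jointly funded project.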

NOTES

*Corresponding author.

References

[1] Li, C.J. and Ma, G.F. (2011) Optimal Control. Science Press, Beijing. (In Chinese)
[2] Bellman, R., Glicksberg, I. and Gross, O. (1958) Some Aspects of the Mathematical Theory of Control Processes. Rand Corporation, Santa Monica.
[3] Kalman, R.E. (1960) Contributions to the Theory of Optimal Control. Boletín de la Sociedad Matemática Mexicana, 5, 102-119.
[4] Kalman, R.E. (1960) On the General Theory of Control Systems. IFAC Proceedings Volumes, 1, 481-493.
https://doi.org/10.1016/S1474-6670(17)70094-8
[5] Kalman, R.E. (1962) Canonical Structure of Linear Dynamical Systems. Proceedings of the National Academy of Sciences of the United States of America, 48, 596-600.
https://doi.org/10.1073/pnas.48.4.596
[6] Kalman, R.E. (1963) Mathematical Description of Linear Dynamical Systems. Journal of the Society for Industrial and Applied Mathematics Series A Control, 1, 152-192.
https://doi.org/10.1137/0301010
[7] Wonham, W.M. (1968) On a Matrix Riccati Equation of Stochastic Control. Journal of the Society for Industrial and Applied Mathematics Series A Control, 6, 681.
[8] Jacobson, D.H., Martin, D.H., Pachter, M. and Geveci, T. (1980) Extensions of Linear-Quadratic Control Theory. Springer, Berlin.
https://doi.org/10.1007/BFb0004370
[9] Bensoussan, A. (1992) Stochastic Control of Partially Observed Systems. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511526503
[10] Davis, M.H.A. (1977) Linear Estimation and Stochastic Control. Chapman and Hall, London.
[11] Yaz, E. (1989) Infinite Horizon Quadratic Optimal Control of a Class of Nonlinear Stochastic Systems. IEEE Transactions on Automatic Control, 34, 1176-1180.
https://doi.org/10.1109/9.40747
[12] Zhang, W.H. and Chen, B.S. (2004) On Stabilizability and Exact Observability of Stochastic Systems with Their Applications. Automatica, 40, 87-94.
https://doi.org/10.1016/j.automatica.2003.07.002
[13] Zhang, W.H., Zhang, H.S. and Chen, B.S. (2008) Generalized Lyapunov Equation Approach to State-Dependent Stochastic Stabilization/Detectability Criterion. IEEE Transactions on Automatic Control, 53, 1630-1642.
https://doi.org/10.1109/TAC.2008.929368
[14] Hu, Y., Jin, H. and Zhou, X. (2012) Time-Inconsistent Stochastic Linear-Quadratic Control. SIAM Journal on Control and Optimization, 50, 1548-1572.
https://doi.org/10.1137/110853960
[15] Yong, J.M. (2013) Linear-Quadratic Optimal Control Problems for Mean-Field Stochastic Differential Equations. SIAM Journal on Control and Optimization, 51, 2809-2838.
https://doi.org/10.1137/120892477
[16] Qian, Z. and Zhou, X. (2013) Existence of Solutions to a Class of Indefinite Stochastic Riccati Equations. SIAM Journal on Control and Optimization, 51, 221-229.
https://doi.org/10.1137/120873777
[17] Meng, Q. (2014) Linear Quadratic Optimal Stochastic Control Problem Driven by a Brownian Motion and a Poisson Random Martingale Measure with Random Coefficients. Stochastic Analysis and Applications, 32, 88-109.
https://doi.org/10.1080/07362994.2013.845106
[18] Burachik, R.S., Kaya, C.Y. and Majeed, S.N. (2014) A Duality Approach for Solving Control-Constrained Linear-Quadratic Optimal Control Problems. SIAM Journal on Control and Optimization, 52, 1423-1456.
https://doi.org/10.1137/130910221
[19] Huang, Y., Zhang, W. and Zhang, H. (2008) Infinite Horizon Linear Quadratic Optimal Control for Discrete-Time Stochastic Systems. Asian Journal of Control, 10, 608-615.
https://doi.org/10.1002/asjc.61
[20] Ku, R.T. and Athans, M. (1977) Further Results on the Uncertainty Threshold Principle. IEEE Transactions on Automatic Control, 22, 866-868.
https://doi.org/10.1109/TAC.1977.1101633