Linear Quadratic Optimal Control - Automatica

Chapter 6 Linear Quadratic Optimal Control

Introduction

In previous lectures, we discussed the design of state feedback controllers using eigenvalue (pole) placement algorithms. For single-input systems, given a set of desired eigenvalues, the feedback gain that achieves them is unique (as long as the system is controllable). For multi-input systems, the feedback gain is not unique, so there is additional design freedom. How does one utilize this freedom? A more fundamental issue is that the choice of eigenvalues is not obvious. For example, there are trade-offs between robustness, performance, and control effort.

Linear Quadratic (LQ) optimal control can be used to resolve some of these issues: instead of specifying directly where the closed-loop eigenvalues should be, one specifies a performance objective function to be optimized.

Other optimal control objectives, besides the LQ type, can also be used to resolve issues of trade-offs and extra design freedom. We first consider the finite time horizon case for general time-varying linear systems, and then proceed to discuss the infinite time horizon case for linear time-invariant systems.

Finite Time Horizon LQ Regulator

Problem Formulation

Consider the $m$-input, $n$-state system with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$:

$$\dot{x} = A(t)x + B(t)u(t); \qquad x(t_0) = x_0.$$

Find the open-loop control $u(\tau)$, $\tau \in [t_0, t_f]$, such that the following objective function is minimized:

$$J(u, x_0, t_0, t_f) = \int_{t_0}^{t_f} \left[ x^T(t)Q(t)x(t) + u^T(t)R(t)u(t) \right] dt + x^T(t_f)\,S\,x(t_f),$$

where $Q(t)$ and $S$ are symmetric positive semi-definite $n \times n$ matrices and $R(t)$ is a symmetric positive definite $m \times m$ matrix. Notice that $x_0$, $t_0$, and $t_f$ are fixed and given.

The control goal generally is to keep $x(t)$ close to 0, especially at the final time $t_f$, using little control effort $u$.
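To make the objective function concrete, the following Python sketch evaluates $J$ numerically for a scalar instance of the problem ($n = m = 1$, constant coefficients) under an arbitrary control policy. The function name `lq_cost`, the fixed-step RK4 integrator, and the scalar parameters `a`, `b`, `q`, `r`, `s` are illustrative assumptions and do not come from the notes.

```python
def lq_cost(a, b, q, r, s, x0, t0, tf, policy, steps=2000):
    """Approximate J = int (q x^2 + r u^2) dt + s x(tf)^2 for x' = a x + b u.

    `policy(t, x)` returns the control u; the state and the running cost
    are integrated together with a fixed-step RK4 scheme (an assumption
    made for simplicity, any ODE solver would do).
    """
    h = (tf - t0) / steps

    def deriv(t, x):
        u = policy(t, x)
        return a * x + b * u, q * x * x + r * u * u

    t, x, cost = t0, x0, 0.0
    for _ in range(steps):
        k1x, k1c = deriv(t, x)
        k2x, k2c = deriv(t + h / 2, x + h / 2 * k1x)
        k3x, k3c = deriv(t + h / 2, x + h / 2 * k2x)
        k4x, k4c = deriv(t + h, x + h * k3x)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        cost += h / 6 * (k1c + 2 * k2c + 2 * k3c + k4c)
        t += h
    return cost + s * x * x  # add the terminal penalty


# Hypothetical example: integrator plant x' = u, unit weights, no terminal cost.
params = dict(a=0.0, b=1.0, q=1.0, r=1.0, s=0.0, x0=1.0, t0=0.0, tf=1.0)
print(lq_cost(policy=lambda t, x: 0.0, **params))   # no control effort
print(lq_cost(policy=lambda t, x: -x, **params))    # simple proportional feedback
```

For this example the zero control leaves the state at $x_0$ and costs about 1.0, while the feedback $u = -x$ costs about $1 - e^{-2} \approx 0.865$: spending some control effort lowers the total cost, which is the trade-off that $Q$, $R$ and $S$ encode.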

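To wit, in the cost function $J$ the term $x^T(t)Q(t)x(t)$ penalizes the transient state deviation, the term $x^T(t_f)Sx(t_f)$ penalizes the final state, and the term $u^T(t)R(t)u(t)$ penalizes control effort.

The formulation can accommodate regulating an output $y(t) = C(t)x(t) \in \mathbb{R}^r$ near 0. In this case, one choice for $S$ and $Q(t)$ is $C^T(t)W(t)C(t)$, where $W(t) \in \mathbb{R}^{r \times r}$ is a symmetric positive definite matrix.

Solution to optimal control problem

General finite, fixed horizon optimal control problem: For the system with fixed initial condition,

$$\dot{x} = f(x, u, t); \qquad x(t_0) = x_0 \text{ given},$$

and a given time horizon $[t_0, t_f]$, find $u(t)$, $t \in [t_0, t_f]$, such that the following cost function is minimized:

$$J(u(\cdot), x_0) = \phi(x(t_f)) + \int_{t_0}^{t_f} L(x(t), u(t), t)\,dt,$$

where the first term is the final cost and the second term is the running cost.

Solution: With the Hamiltonian $H(x, u, t) := L(x, u, t) + \lambda^T(t) f(x, u, t)$, the necessary conditions are

$$\dot{\lambda}^T = -H_x = -\frac{\partial L}{\partial x} - \lambda^T \frac{\partial f}{\partial x}$$

$$\dot{x} = f(x, u, t)$$

$$H_u = \frac{\partial L}{\partial u} + \lambda^T \frac{\partial f}{\partial u} = 0$$

$$\lambda^T(t_f) = \frac{\partial \phi}{\partial x}(x(t_f))$$

$$x(t_0) = x_0.$$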

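This is a set of $2n$ differential equations (in $x$ and $\lambda$) with split boundary conditions at $t_0$ and $t_f$: $x(t_0) = x_0$ and $\lambda^T(t_f) = \phi_x(x(t_f))$, together with an equation that typically specifies $u(t)$ in terms of $x(t)$ and/or $\lambda(t)$. We shall see the specialization to the LQ case later.

Proof: The solution is obtained by converting the constrained optimal control problem into an unconstrained one using the Lagrange multiplier function $\lambda(t) \in \mathbb{R}^n$:

$$\bar{J}(u, x_0) = J(u(\cdot), x_0) + \int_{t_0}^{t_f} \lambda^T(t)\left[f(x, u, t) - \dot{x}\right] dt.$$

Note that $\frac{d}{dt}\left(\lambda^T(t)x(t)\right) = \dot{\lambda}^T(t)x(t) + \lambda^T(t)\dot{x}$. So

$$\int_{t_0}^{t_f} \lambda^T \dot{x}\,dt = \lambda^T(t_f)x(t_f) - \lambda^T(t_0)x(t_0) - \int_{t_0}^{t_f} \dot{\lambda}^T x\,dt.$$

Let us define the so-called Hamiltonian function $H(x, u, t) := L(x, u, t) + \lambda^T(t)f(x, u, t)$. Thus,

$$\bar{J} = \phi(x(t_f)) - \lambda^T(t_f)x(t_f) + \lambda^T(t_0)x(t_0) + \int_{t_0}^{t_f} \left[ H(x(t), u(t), t) + \dot{\lambda}^T(t)x(t) \right] dt.$$

The necessary condition for optimality is that the variation $\delta\bar{J}$ of the modified cost with respect to all feasible variations $\delta x(t)$, $\delta\lambda(t)$, $\delta u(t)$ and $\delta x(t_f)$ should vanish:

$$\delta\bar{J} = \left[\phi_x - \lambda^T\right]\delta x(t_f) + \lambda^T(t_0)\,\delta x(t_0) + \int_{t_0}^{t_f} \left\{ \left[H_x + \dot{\lambda}^T\right]\delta x(t) + \left[H_u\right]\delta u(t) \right\} dt + \int_{t_0}^{t_f} \delta\lambda^T(t)\left[f(x(t), u(t), t) - \dot{x}\right] dt.$$

Since $x(t_0) = x_0$ is fixed, $\delta x(t_0) = 0$.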

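Otherwise, the variations $\delta x(t)$, $\delta u(t)$ and $\delta\lambda(t)$ are all feasible. Setting the terms that multiply these variations to zero yields the necessary conditions stated in the solution.

Open loop solution

Applying the general optimal control solution to the LQ problem (the linear system and quadratic cost of the problem formulation), we have:

Theorem: The optimal control is given by

$$u^o(t) = -R^{-1}(t)B^T(t)\lambda(t),$$

where $\lambda(t)$ and $x(t)$ satisfy the Hamilton-Jacobi equation

$$\begin{pmatrix} \dot{x} \\ \dot{\lambda} \end{pmatrix} = \underbrace{\begin{pmatrix} A(t) & -B(t)R^{-1}(t)B^T(t) \\ -Q(t) & -A^T(t) \end{pmatrix}}_{\text{Hamiltonian matrix } H(t)} \begin{pmatrix} x \\ \lambda \end{pmatrix}$$

with boundary conditions

$$x(t_0) = x_0; \qquad \lambda(t_f) = Sx(t_f).$$

Because the LQ cost carries no factor of $1/2$, applying $H_u = 0$ literally gives $u = -\tfrac{1}{2}R^{-1}B^T\lambda$. The equations above hold with $\lambda$ understood as one half of the Lagrange multiplier of the general solution; this rescaling leaves the optimal control and the state trajectory unchanged.

Remarks: Boundary conditions are specified at both the initial time $t_0$ and the final time $t_f$ (a two-point boundary value problem). In general, such problems are difficult to solve and require iterative methods such as the shooting method. The optimal control in this theorem is open loop: it is computed by first computing $\lambda(t)$ for all $t \in [t_0, t_f]$ and then applying $u^o(t) = -R^{-1}(t)B^T(t)\lambda(t)$.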
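The shooting method mentioned above can be sketched in Python for a scalar instance of the two-point boundary value problem. The function names, the fixed-step RK4 integrator and the scalar parameters are assumptions made for illustration. Because the Hamiltonian system is linear, the terminal mismatch $\lambda(t_f) - Sx(t_f)$ is an affine function of the guessed $\lambda(t_0)$, so a single secant step between two guesses is enough here; a nonlinear problem would need to iterate.

```python
import math


def integrate_hamiltonian(a, b, q, r, x0, lam0, t0, tf, steps=2000):
    """RK4 integration of x' = a x - (b^2 / r) lam, lam' = -q x - a lam."""
    h = (tf - t0) / steps

    def deriv(x, lam):
        return a * x - (b * b / r) * lam, -q * x - a * lam

    x, lam = x0, lam0
    for _ in range(steps):
        k1 = deriv(x, lam)
        k2 = deriv(x + h / 2 * k1[0], lam + h / 2 * k1[1])
        k3 = deriv(x + h / 2 * k2[0], lam + h / 2 * k2[1])
        k4 = deriv(x + h * k3[0], lam + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        lam += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, lam


def shoot_initial_costate(a, b, q, r, s, x0, t0, tf):
    """Find lam(t0) such that the terminal condition lam(tf) = s x(tf) holds."""

    def residual(lam0):
        x_f, lam_f = integrate_hamiltonian(a, b, q, r, x0, lam0, t0, tf)
        return lam_f - s * x_f

    # Two trial shots; the residual is affine in lam0 for a linear system,
    # so the secant line through them hits the root exactly (up to round-off).
    guess0, guess1 = 0.0, 1.0
    res0, res1 = residual(guess0), residual(guess1)
    return guess0 - res0 * (guess1 - guess0) / (res1 - res0)


# Hypothetical example: x' = u, q = r = 1, s = 0, x0 = 1 on [0, 1].
lam0 = shoot_initial_costate(a=0.0, b=1.0, q=1.0, r=1.0, s=0.0,
                             x0=1.0, t0=0.0, tf=1.0)
print(lam0, math.tanh(1.0))  # the two numbers should agree closely
```

For this particular example the Hamiltonian system can be solved by hand, giving $\lambda(t_0) = \tanh(t_f - t_0)\,x_0$, which is what the printed comparison checks. The initial control is then $u^o(t_0) = -\lambda(t_0)$, and the whole open-loop control follows by integrating forward from $(x_0, \lambda(t_0))$.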

Open-loop control is not robust to disturbances or model uncertainty.

Feedback control solution

Consider the matrix differential equation using the Hamiltonian matrix $H(t)$, where $X_1(t), X_2(t) \in \mathbb{R}^{n \times n}$:

$$\begin{pmatrix} \dot{X}_1(t) \\ \dot{X}_2(t) \end{pmatrix} = \underbrace{\begin{pmatrix} A(t) & -B(t)R^{-1}(t)B^T(t) \\ -Q(t) & -A^T(t) \end{pmatrix}}_{\text{Hamiltonian matrix } H(t)} \begin{pmatrix} X_1(t) \\ X_2(t) \end{pmatrix}$$

with boundary conditions $X_1(t_f) \in \mathbb{R}^{n \times n}$ being any invertible matrix, and $X_2(t_f) = SX_1(t_f)$. $X_1(t)$ and $X_2(t)$ can be integrated backwards in time from $t_f$.

Let us assume (and it can be proven) that $X_1(t)$ is invertible. We propose that the solution to the Hamilton-Jacobi equation with its two boundary conditions is given by

$$\begin{pmatrix} x(t) \\ \lambda(t) \end{pmatrix} = \begin{pmatrix} X_1(t) \\ X_2(t) \end{pmatrix} v$$

for a constant vector $v \in \mathbb{R}^n$. The proposed $x(t)$ and $\lambda(t)$ clearly satisfy the differential equation and the boundary condition $\lambda(t_f) = Sx(t_f)$. The initial condition $x(t_0) = x_0$ can be satisfied by choosing $v = X_1^{-1}(t_0)x_0$.

If we define $P(t) = X_2(t)X_1^{-1}(t)$, then $\lambda(t) = P(t)x(t)$, so that the optimal control $u^o(t) = -R^{-1}(t)B^T(t)\lambda(t)$ can be implemented as a feedback, as given in the following theorem.

Theorem: The LQ cost function is minimized using the control

$$u^o(t) = -R^{-1}(t)B^T(t)P(t)x(t),$$

where $P(t) \in \mathbb{R}^{n \times n}$ is the solution to the following so-called continuous time Riccati differential equation (CTRDE):

$$-\dot{P}(t) = A^T(t)P(t) + P(t)A(t) - P(t)B(t)R^{-1}(t)B^T(t)P(t) + Q(t); \qquad P(t_f) = S.$$

Moreover, the minimum cost achieved using the above control is

$$J^o(x_0, t_0, t_f) := \min_{u(\cdot)} J(u, x_0) = x_0^T P(t_0) x_0.$$

Proof: The feedback form of the optimal control has already been shown. To show that the CTRDE is satisfied by $P(t)$, one needs only differentiate $P(t) = X_2(t)X_1^{-1}(t)$, making use of the matrix Hamiltonian differential equation and its boundary conditions. The proof that $P(t)$ determines the minimal cost will be discussed later using the Dynamic Programming (DP) principle.

Remarks:

1. $P(t)$ is solved backwards in time from $t_f$ to $t_0$ and should be stored in memory before use.

2. The optimal control law is in the form of a time-varying linear state feedback $u(t) = -K(t)x(t)$ with feedback gain $K(t) := R^{-1}(t)B^T(t)P(t)$.

The open-loop optimal control can be obtained, if so desired, by integrating the state equation with the feedback control applied. It is, however, much better to utilize feedback than to use open-loop control.

3. The Riccati differential equation can be derived from $P(t) = X_2(t)X_1^{-1}(t)$ and the matrix Hamiltonian differential equation.

4. By direct substitution, it is easy to see that the solution $\lambda(t) = P(t)x(t)$ satisfies the Hamilton-Jacobi equation and its boundary conditions. Since the solution of the CTRDE does not rely on solving for $X_1(t)$ or $X_2(t)$ explicitly, the assumption that $X_1(t)$ is invertible is in fact not needed for the proof of this theorem. It can be thought of as a useful device to guess the solution.

5. The control formulation works for time-varying systems, for example nonlinear systems linearized about a trajectory.

6. $P(t)$ can be shown to be associated with the cost-to-go function (see below). Using this interpretation, it can easily be shown that $P(t)$ must be at least positive semi-definite.

Cost-to-go function

The matrix function $P(t)$ is associated with the so-called cost-to-go function.
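The backward integration of the Riccati differential equation and the storage of $P(t)$ described in the remarks can be sketched in Python for a scalar system with constant coefficients. The function name `solve_riccati`, the uniform time grid and the RK4 scheme are illustrative assumptions; a matrix-valued problem would replace the scalar arithmetic with matrix operations.

```python
import math


def solve_riccati(a, b, q, r, s, t0, tf, steps=1000):
    """Integrate -P' = 2 a P - (b^2 / r) P^2 + q backward from P(tf) = s.

    Returns (times, p_values) on a uniform grid so that the stored P(t)
    can later be looked up by the feedback law.
    """
    h = (tf - t0) / steps

    def pdot(p):
        return -(2 * a * p - (b * b / r) * p * p + q)

    times = [t0 + i * h for i in range(steps + 1)]
    p_values = [s] * (steps + 1)
    for i in range(steps, 0, -1):
        p = p_values[i]
        # RK4 with step -h, i.e. marching from t_i back to t_{i-1}.
        k1 = pdot(p)
        k2 = pdot(p - h / 2 * k1)
        k3 = pdot(p - h / 2 * k2)
        k4 = pdot(p - h * k3)
        p_values[i - 1] = p - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return times, p_values


# Hypothetical example: x' = u, q = r = 1, s = 0 on [0, 1].
a, b, q, r, s = 0.0, 1.0, 1.0, 1.0, 0.0
times, p = solve_riccati(a, b, q, r, s, t0=0.0, tf=1.0)
gain = [b * p_i / r for p_i in p]   # K(t) = R^{-1} B^T P(t) in the scalar case
print(p[0], math.tanh(1.0))         # P(t0) against the closed form tanh(tf - t0)
```

For this example the Riccati equation reduces to $\dot{P} = P^2 - 1$ with $P(t_f) = 0$, whose solution is $P(t) = \tanh(t_f - t)$, so the printed values should agree closely. Note that only the terminal weight and the system data enter the computation; the initial state $x_0$ is not needed until the gain is applied.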

More precisely, the cost-to-go interpretation means that if, at any time $t_1 \in [t_0, t_f]$, the state is $x(t_1)$, then applying the optimal feedback control policy over the remaining time period $[t_1, t_f]$ results in a cost $J(u, x(t_1), t_1, t_f)$, that is, the LQ cost with $t_0$ substituted by $t_1$ and $x_0$ substituted by $x(t_1)$, such that

$$J^o(x(t), t, t_f) := \min_u J(u, x(t), t, t_f) = x^T(t)P(t)x(t).$$

Since the optimal control is $u^o(t) = -K(t)x(t) = -R^{-1}(t)B^T(t)P(t)x(t)$, the closed-loop system satisfies

$$\dot{x} = \left[A(t) - B(t)K(t)\right]x(t),$$

so that $x(t) = \Phi(t, t_0)x_0$, where $\Phi(t, t_0)$ is the transition matrix for $A(t) - B(t)K(t)$. For this reason, the achieved minimal cost function must be of the form (omitting the final time $t_f$ to avoid clutter)

$$J^o(x_0, t_0) = J(u^o, x_0, t_0, t_f) = x_0^T \bar{P}(t_0) x_0$$

for some positive semi-definite matrix $\bar{P}(t_0)$. Our task is to show that $\bar{P}(t_0) = P(t_0)$. To derive this result, we need the Dynamic Programming (DP) principle.

Dynamic Programming Principle

Consider the system

$$\dot{x} = f(x(t), u(t), t), \qquad x(t_0) = x_0,$$

and the cost index over the interval $[t_0, t_f]$:

$$J(u(\cdot), x_0, t_0) = \int_{t_0}^{t_f} L(x(t), u(t), t)\,dt + \phi(x(t_f)).$$

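In the theorem below, $t_f$ is assumed to be fixed.

Theorem (DP principle): Suppose that $u^o(t)$, $t \in [t_0, t_f]$, minimizes the cost index $J(u(\cdot), x_0, t_0)$ subject to $x^o(t_0) = x_0$, and that $x^o(t)$ is the associated state trajectory. Let the (minimum) cost achieved using $u^o(t)$ be

$$J^o(x_0, t_0) = \min_{u(\tau),\ \tau \in [t_0, t_f]} J(u(\cdot), x_0, t_0).$$

Then, for any $t_1$ with $t_0 \le t_1 \le t_f$, the restriction of the control $u^o(\cdot)$ to $[t_1, t_f]$ minimizes

$$J(u(\cdot), x^o(t_1), t_1) = \int_{t_1}^{t_f} L(x(t), u(t), t)\,dt + \phi(x(t_f))$$

subject to the initial condition $x(t_1) = x^o(t_1)$. In other words, $u^o(\cdot)$ is optimal over the sub-interval $[t_1, t_f]$.

Corollary: Let $t_0 \le t_1 \le t_f$. Consider the optimal control problem for the sub-interval $[t_1, t_f]$ with an arbitrary initial state $x(t_1)$. Suppose that $J^o(x(t_1), t_1)$ is the optimal cost and that the optimal control is given by $u(t) = u^o(x(t_1), t)$ for $t \in [t_1, t_f]$. Then, the optimal control for the larger interval $t \in [t_0, t_f]$ with initial condition $x(t_0) = x_0$ is given by

$$u(t) = \begin{cases} \displaystyle \arg\min_{u(\cdot)} \left[ \int_{t_0}^{t_1} L(x, u, t)\,dt + J^o(x(t_1), t_1) \right], & t \in [t_0, t_1) \\ u^o(x(t_1), t), & t \in [t_1, t_f] \end{cases}$$

where $x(t_1)$ is the state attained via the control $u(t)$ applied over $[t_0, t_1)$.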
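For the LQ problem, the splitting of the cost described in the corollary can be checked numerically. The Python sketch below, for a scalar system with constant coefficients, solves the Riccati differential equation backward, applies the feedback $u = -K(t)x$, and compares the cost accumulated on $[t_0, t_1]$ plus $P(t_1)x^2(t_1)$ against $P(t_0)x_0^2$. All function names, the grid layout (the simulation step is twice the Riccati step so that stored values of $P$ serve as RK4 midpoints) and the parameter values are assumptions made for illustration. The check confirms consistency of the cost-to-go formula under the Riccati feedback; it does not by itself prove optimality.

```python
def solve_riccati(a, b, q, r, s, h, n):
    """Backward RK4 for -P' = 2 a P - (b^2 / r) P^2 + q with P(t_n) = s."""
    def pdot(p):
        return -(2 * a * p - (b * b / r) * p * p + q)

    p_values = [s] * (n + 1)
    for i in range(n, 0, -1):
        p = p_values[i]
        k1 = pdot(p)
        k2 = pdot(p - h / 2 * k1)
        k3 = pdot(p - h / 2 * k2)
        k4 = pdot(p - h * k3)
        p_values[i - 1] = p - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return p_values


def run_segment(a, b, q, r, p_values, h, start, stop, x):
    """Apply u = -(b P / r) x from grid index start to stop (both even).

    Uses RK4 with step 2h. Returns (running cost on the segment, final state).
    """
    def deriv(x, p):
        u = -(b * p / r) * x
        return a * x + b * u, q * x * x + r * u * u

    cost = 0.0
    for i in range(start, stop, 2):
        k1 = deriv(x, p_values[i])
        k2 = deriv(x + h * k1[0], p_values[i + 1])
        k3 = deriv(x + h * k2[0], p_values[i + 1])
        k4 = deriv(x + 2 * h * k3[0], p_values[i + 2])
        cost += h / 3 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h / 3 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    return cost, x


# Hypothetical example: x' = u, q = r = 1, s = 0, x0 = 1 on [0, 1], t1 = 0.5.
a, b, q, r, s = 0.0, 1.0, 1.0, 1.0, 0.0
t0, tf, n = 0.0, 1.0, 2000            # n must be even
h = (tf - t0) / n
x0, mid = 1.0, n // 2                 # grid index of t1, must be even
p = solve_riccati(a, b, q, r, s, h, n)

head_cost, x_mid = run_segment(a, b, q, r, p, h, 0, mid, x0)
tail_cost, x_end = run_segment(a, b, q, r, p, h, mid, n, x_mid)
tail_total = tail_cost + s * x_end ** 2

print(p[mid] * x_mid ** 2, tail_total)                # cost-to-go from t1, two ways
print(p[0] * x0 ** 2, head_cost + p[mid] * x_mid ** 2)  # total cost, two ways
```

Each printed pair should agree to within the integration error: the cost incurred from $t_1$ onward equals $x^T(t_1)P(t_1)x(t_1)$, and adding the running cost on $[t_0, t_1]$ recovers $x_0^T P(t_0) x_0$.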
