
The Schur Complement and Symmetric Positive Semidefinite (and Definite) Matrices

Jean Gallier

August 24, 2019


1 Schur Complements

In this note, we provide some details and proofs of some results from Appendix A.5 (especially Section A.5.5) of Convex Optimization by Boyd and Vandenberghe [1].

Let $M$ be an $n \times n$ matrix written as a $2 \times 2$ block matrix
\[
M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},
\]
where $A$ is a $p \times p$ matrix and $D$ is a $q \times q$ matrix, with $n = p + q$ (so, $B$ is a $p \times q$ matrix and $C$ is a $q \times p$ matrix). We can try to solve the linear system
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} c \\ d \end{pmatrix},
\]
that is,
\begin{align*}
Ax + By &= c \\
Cx + Dy &= d,
\end{align*}
by mimicking Gaussian elimination. That is, assuming that $D$ is invertible, we first solve for $y$, getting
\[
y = D^{-1}(d - Cx),
\]
and after substituting this expression for $y$ in the first equation, we get
\[
Ax + B(D^{-1}(d - Cx)) = c,
\]
that is,
\[
(A - BD^{-1}C)x = c - BD^{-1}d.
\]
If the matrix $A - BD^{-1}C$ is invertible, then we obtain the solution to our system:
\begin{align*}
x &= (A - BD^{-1}C)^{-1}(c - BD^{-1}d) \\
y &= D^{-1}(d - C(A - BD^{-1}C)^{-1}(c - BD^{-1}d)).
\end{align*}
The matrix $A - BD^{-1}C$ is called the \emph{Schur complement} of $D$ in $M$. If $A$ is invertible, then by eliminating $x$ first using the first equation we find that the Schur complement of $A$ in $M$ is $D - CA^{-1}B$ (this corresponds to the Schur complement defined in Boyd and Vandenberghe [1] when $C = B^{\top}$).
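This elimination procedure is easy to sanity-check numerically. The following sketch (not part of the original note) uses NumPy with arbitrary random data and sizes $p = q = 2$; it assumes $D$ and $A - BD^{-1}C$ are invertible, which holds almost surely for random blocks.

\begin{verbatim}
# Sketch: solving a block system via the Schur complement of D.
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 2
A, B = rng.standard_normal((p, p)), rng.standard_normal((p, q))
C, D = rng.standard_normal((q, p)), rng.standard_normal((q, q))
c, d = rng.standard_normal(p), rng.standard_normal(q)

S = A - B @ np.linalg.solve(D, C)            # Schur complement of D in M
x = np.linalg.solve(S, c - B @ np.linalg.solve(D, d))
y = np.linalg.solve(D, d - C @ x)

M = np.block([[A, B], [C, D]])
assert np.allclose(M @ np.concatenate([x, y]), np.concatenate([c, d]))
\end{verbatim}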

The solution equations for $x$ and $y$ above, written as
\begin{align*}
x &= (A - BD^{-1}C)^{-1}c - (A - BD^{-1}C)^{-1}BD^{-1}d \\
y &= -D^{-1}C(A - BD^{-1}C)^{-1}c + (D^{-1} + D^{-1}C(A - BD^{-1}C)^{-1}BD^{-1})d,
\end{align*}
yield a formula for the inverse of $M$ in terms of the Schur complement of $D$ in $M$, namely
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix}
(A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1}BD^{-1} \\
-D^{-1}C(A - BD^{-1}C)^{-1} & D^{-1} + D^{-1}C(A - BD^{-1}C)^{-1}BD^{-1}
\end{pmatrix}.
\]
A moment of reflection reveals that
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix}
(A - BD^{-1}C)^{-1} & 0 \\
-D^{-1}C(A - BD^{-1}C)^{-1} & D^{-1}
\end{pmatrix}
\begin{pmatrix} I & -BD^{-1} \\ 0 & I \end{pmatrix},
\]
and then
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix} I & 0 \\ -D^{-1}C & I \end{pmatrix}
\begin{pmatrix} (A - BD^{-1}C)^{-1} & 0 \\ 0 & D^{-1} \end{pmatrix}
\begin{pmatrix} I & -BD^{-1} \\ 0 & I \end{pmatrix}.
\]
It follows immediately that
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
=
\begin{pmatrix} I & BD^{-1} \\ 0 & I \end{pmatrix}
\begin{pmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{pmatrix}
\begin{pmatrix} I & 0 \\ D^{-1}C & I \end{pmatrix}.
\]
The above expression can be checked directly and has the advantage of only requiring the invertibility of $D$.

Remark: If $A$ is invertible, then we can use the Schur complement, $D - CA^{-1}B$, of $A$ to obtain the following factorization of $M$:
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ CA^{-1} & I \end{pmatrix}
\begin{pmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{pmatrix}
\begin{pmatrix} I & A^{-1}B \\ 0 & I \end{pmatrix}.
\]
If $D - CA^{-1}B$ is invertible, we can invert all three matrices above and we get another formula for the inverse of $M$ in terms of $(D - CA^{-1}B)$, namely,
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix}
A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\
-(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1}
\end{pmatrix}.
\]
If $A$, $D$ and both Schur complements $A - BD^{-1}C$ and $D - CA^{-1}B$ are all invertible, by comparing the two expressions for $M^{-1}$, we get the (non-obvious) formula
\[
(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}.
\]
Using this formula, we obtain another expression for the inverse of $M$ involving the Schur complements of $A$ and $D$ (see Horn and Johnson [5]):
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix}
(A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\
-(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1}
\end{pmatrix}.
\]
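The block factorization of $M$ and the resulting inverse formula can likewise be verified numerically; a minimal sketch (not in the original note), again with arbitrary random blocks and assuming $D$ and the Schur complement are invertible:

\begin{verbatim}
# Sketch: check the block LDU factorization of M and the formula for M^{-1}.
import numpy as np

rng = np.random.default_rng(1)
p, q = 2, 3
A, B = rng.standard_normal((p, p)), rng.standard_normal((p, q))
C, D = rng.standard_normal((q, p)), rng.standard_normal((q, q))

Dinv = np.linalg.inv(D)
S = A - B @ Dinv @ C                          # Schur complement of D in M
Ip, Iq = np.eye(p), np.eye(q)
Zpq, Zqp = np.zeros((p, q)), np.zeros((q, p))
M = np.block([[A, B], [C, D]])

# M = [[I, B D^-1], [0, I]] diag(S, D) [[I, 0], [D^-1 C, I]]
L = np.block([[Ip, B @ Dinv], [Zqp, Iq]])
R = np.block([[Ip, Zpq], [Dinv @ C, Iq]])
assert np.allclose(L @ np.block([[S, Zpq], [Zqp, D]]) @ R, M)

# M^-1 = [[I, 0], [-D^-1 C, I]] diag(S^-1, D^-1) [[I, -B D^-1], [0, I]]
Minv = (np.block([[Ip, Zpq], [-Dinv @ C, Iq]])
        @ np.block([[np.linalg.inv(S), Zpq], [Zqp, Dinv]])
        @ np.block([[Ip, -B @ Dinv], [Zqp, Iq]]))
assert np.allclose(Minv, np.linalg.inv(M))
\end{verbatim}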

If, in the formula
\[
(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1},
\]
we set $D = I$ and change $B$ to $-B$, we get
\[
(A + BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1},
\]
a formula known as the matrix inversion lemma (see Boyd and Vandenberghe [1], Appendix C.4, especially Section C.4.3).
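A quick numerical check of the matrix inversion lemma (a sketch, not in the original note; the sizes $n = 4$, $k = 2$ and the diagonal shift making $A$ comfortably invertible are arbitrary choices):

\begin{verbatim}
# Sketch: (A + B C)^-1 = A^-1 - A^-1 B (I + C A^-1 B)^-1 C A^-1
# for a rank-k update, B of size n x k and C of size k x n.
import numpy as np

rng = np.random.default_rng(2)
n, k = 4, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)
B = rng.standard_normal((n, k))
C = rng.standard_normal((k, n))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + B @ C)
rhs = Ainv - Ainv @ B @ np.linalg.inv(np.eye(k) + C @ Ainv @ B) @ C @ Ainv
assert np.allclose(lhs, rhs)
\end{verbatim}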

2 A Characterization of Symmetric Positive Definite Matrices Using Schur Complements

Now, if we assume that $M$ is symmetric, so that $A$, $D$ are symmetric and $C = B^{\top}$, then we see that $M$ is expressed as
\[
M =
\begin{pmatrix} A & B \\ B^{\top} & D \end{pmatrix}
=
\begin{pmatrix} I & BD^{-1} \\ 0 & I \end{pmatrix}
\begin{pmatrix} A - BD^{-1}B^{\top} & 0 \\ 0 & D \end{pmatrix}
\begin{pmatrix} I & BD^{-1} \\ 0 & I \end{pmatrix}^{\top},
\]
which shows that $M$ is similar to a block-diagonal matrix (obviously, the Schur complement, $A - BD^{-1}B^{\top}$, is symmetric). As a consequence, we have the following version of "Schur's trick" to check whether $M \succ 0$ for a symmetric matrix $M$, where we use the usual notation $M \succ 0$ to say that $M$ is positive definite and the notation $M \succeq 0$ to say that $M$ is positive semidefinite.

Proposition 2.1. For any symmetric matrix, $M$, of the form
\[
M = \begin{pmatrix} A & B \\ B^{\top} & C \end{pmatrix},
\]
if $C$ is invertible then the following properties hold:

(1) $M \succ 0$ iff $C \succ 0$ and $A - BC^{-1}B^{\top} \succ 0$.

(2) If $C \succ 0$, then $M \succeq 0$ iff $A - BC^{-1}B^{\top} \succeq 0$.

Proof. (1) Observe that
\[
\begin{pmatrix} I & BC^{-1} \\ 0 & I \end{pmatrix}^{-1}
=
\begin{pmatrix} I & -BC^{-1} \\ 0 & I \end{pmatrix},
\]
and we know that for any symmetric matrix, $T$, and any invertible matrix, $N$, the matrix $T$ is positive definite ($T \succ 0$) iff $NTN^{\top}$ (which is obviously symmetric) is positive definite ($NTN^{\top} \succ 0$). Applying this fact to the factorization of $M$ above, $M \succ 0$ iff the block diagonal matrix $\mathrm{diag}(A - BC^{-1}B^{\top}, C)$ is positive definite. But, a block diagonal matrix is positive definite iff each diagonal block is positive definite, which concludes the proof.

(2) This is because for any symmetric matrix, $T$, and any invertible matrix, $N$, we have $T \succeq 0$ iff $NTN^{\top} \succeq 0$. $\square$

The version of Proposition 2.1 using the Schur complement of $A$ instead of the Schur complement of $C$ also holds. The proof uses the factorization of $M$ using the Schur complement of $A$ (see Section 1).

Proposition 2.2. For any symmetric matrix, $M$, of the form
\[
M = \begin{pmatrix} A & B \\ B^{\top} & C \end{pmatrix},
\]
if $A$ is invertible then the following properties hold:

(1) $M \succ 0$ iff $A \succ 0$ and $C - B^{\top}A^{-1}B \succ 0$.

(2) If $A \succ 0$, then $M \succeq 0$ iff $C - B^{\top}A^{-1}B \succeq 0$.

Here is an illustration of Proposition 2.2(2). Consider the nonlinear quadratic constraint
\[
(Ax + b)^{\top}(Ax + b) \leq c^{\top}x + d,
\]
where $A \in \mathrm{M}_n(\mathbb{R})$, $x, b, c \in \mathbb{R}^n$ and $d \in \mathbb{R}$. Since obviously $I = I_n$ is invertible and $I \succ 0$, we have
\[
\begin{pmatrix} I & Ax + b \\ (Ax + b)^{\top} & c^{\top}x + d \end{pmatrix} \succeq 0
\]
iff $c^{\top}x + d - (Ax + b)^{\top}(Ax + b) \geq 0$ iff $(Ax + b)^{\top}(Ax + b) \leq c^{\top}x + d$, since the matrix (a scalar) $c^{\top}x + d - (Ax + b)^{\top}(Ax + b)$ is the Schur complement of $I$ in the above matrix. The trick of using Schur complements to convert nonlinear inequality constraints into linear constraints on symmetric matrices involving the semidefinite ordering is used extensively to convert nonlinear problems into semidefinite programs; see Boyd and Vandenberghe [1].
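The equivalence in Proposition 2.2(2) can be tested numerically, with eigenvalue computations standing in for the semidefinite ordering; a minimal sketch (not in the original note; random data and the tolerance are arbitrary choices):

\begin{verbatim}
# Sketch: for symmetric M = [[A, B], [B^T, C]] with A > 0,
# M is PSD iff the Schur complement C - B^T A^-1 B is PSD.
import numpy as np

rng = np.random.default_rng(3)
p, q = 3, 2
R = rng.standard_normal((p, p))
A = R @ R.T + np.eye(p)                  # symmetric positive definite
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, q)); C = C @ C.T

M = np.block([[A, B], [B.T, C]])
S = C - B.T @ np.linalg.solve(A, B)      # Schur complement of A in M

psd = lambda T: np.all(np.linalg.eigvalsh(T) >= -1e-10)
assert psd(M) == psd(S)
\end{verbatim}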

When $C$ is singular (or $A$ is singular), it is still possible to characterize when a symmetric matrix, $M$, as above is positive semidefinite, but this requires using a version of the Schur complement involving the pseudo-inverse of $C$, namely $A - BC^{+}B^{\top}$ (or the Schur complement, $C - B^{\top}A^{+}B$, of $A$). But first, we need to figure out when a quadratic function of the form $\frac{1}{2}x^{\top}Px + x^{\top}b$ has a minimum and what this optimum value is, where $P$ is a symmetric matrix. This corresponds to the (generally nonconvex) quadratic optimization problem
\[
\text{minimize} \quad f(x) = \frac{1}{2}x^{\top}Px + x^{\top}b,
\]
which has no solution unless $P$ and $b$ satisfy certain conditions.

3 Pseudo-Inverses

We will need pseudo-inverses, so let's review this notion quickly, as well as the notion of SVD, which provides a convenient way to compute pseudo-inverses. We only consider the case of square matrices since this is all we need. For comprehensive treatments of SVD and pseudo-inverses see Gallier [3] (Chapters 12, 13), Strang [7], Demmel [2], Trefethen and Bau [8], Golub and Van Loan [4], and Horn and Johnson [5, 6].

Recall that every square $n \times n$ matrix, $M$, has a \emph{singular value decomposition}, for short, SVD, namely, we can write
\[
M = U \Sigma V^{\top},
\]
where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix of the form
\[
\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0),
\]
where $\sigma_1 \geq \cdots \geq \sigma_r > 0$ and $r$ is the rank of $M$. The $\sigma_i$'s are called the \emph{singular values} of $M$ and they are the positive square roots of the nonzero eigenvalues of $MM^{\top}$ and $M^{\top}M$. Furthermore, the columns of $V$ are eigenvectors of $M^{\top}M$ and the columns of $U$ are eigenvectors of $MM^{\top}$. Observe that $U$ and $V$ are not unique.

If $M = U \Sigma V^{\top}$ is some SVD of $M$, we define the \emph{pseudo-inverse}, $M^{+}$, of $M$ by
\[
M^{+} = V \Sigma^{+} U^{\top},
\]
where
\[
\Sigma^{+} = \mathrm{diag}(\sigma_1^{-1}, \ldots, \sigma_r^{-1}, 0, \ldots, 0).
\]
Clearly, when $M$ has rank $r = n$, that is, when $M$ is invertible, $M^{+} = M^{-1}$, so $M^{+}$ is a generalized inverse of $M$. Even though the definition of $M^{+}$ seems to depend on $U$ and $V$, actually, $M^{+}$ is uniquely defined in terms of $M$ (the same $M^{+}$ is obtained for all possible SVD decompositions of $M$). It is easy to check that
\begin{align*}
M M^{+} M &= M \\
M^{+} M M^{+} &= M^{+},
\end{align*}
and both $MM^{+}$ and $M^{+}M$ are symmetric matrices. In fact,
\[
M M^{+} = U \Sigma V^{\top} V \Sigma^{+} U^{\top} = U \Sigma \Sigma^{+} U^{\top}
= U \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} U^{\top}
\]
and
\[
M^{+} M = V \Sigma^{+} U^{\top} U \Sigma V^{\top} = V \Sigma^{+} \Sigma V^{\top}
= V \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} V^{\top}.
\]
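These identities are easy to confirm on a rank-deficient example; a minimal sketch (not in the original note; the rank-2 test matrix is an arbitrary choice):

\begin{verbatim}
# Sketch: pseudo-inverse via SVD for a singular square matrix, checking
# M M^+ M = M, M^+ M M^+ = M^+, and symmetry of M M^+ and M^+ M.
import numpy as np

rng = np.random.default_rng(4)
n, r = 4, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank r

U, s, Vt = np.linalg.svd(M)
s_plus = np.array([1.0 / x if x > 1e-10 else 0.0 for x in s])
M_plus = Vt.T @ np.diag(s_plus) @ U.T

assert np.allclose(M @ M_plus @ M, M)
assert np.allclose(M_plus @ M @ M_plus, M_plus)
assert np.allclose(M @ M_plus, (M @ M_plus).T)
assert np.allclose(M_plus @ M, (M_plus @ M).T)
assert np.allclose(M_plus, np.linalg.pinv(M))   # agrees with NumPy's pinv
\end{verbatim}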

We immediately get
\begin{align*}
(MM^{+})^2 &= MM^{+} \\
(M^{+}M)^2 &= M^{+}M,
\end{align*}
so both $MM^{+}$ and $M^{+}M$ are orthogonal projections (since they are both symmetric). We claim that $MM^{+}$ is the orthogonal projection onto the range of $M$ and $M^{+}M$ is the orthogonal projection onto $\mathrm{Ker}(M)^{\perp}$, the orthogonal complement of $\mathrm{Ker}(M)$.

Obviously, $\mathrm{range}(MM^{+}) \subseteq \mathrm{range}(M)$, and for any $y = Mx \in \mathrm{range}(M)$, as $MM^{+}M = M$, we have
\[
MM^{+}y = MM^{+}Mx = Mx = y,
\]
so the image of $MM^{+}$ is indeed the range of $M$. It is also clear that $\mathrm{Ker}(M) \subseteq \mathrm{Ker}(M^{+}M)$, and since $MM^{+}M = M$, we also have $\mathrm{Ker}(M^{+}M) \subseteq \mathrm{Ker}(M)$, and so,
\[
\mathrm{Ker}(M^{+}M) = \mathrm{Ker}(M).
\]
Since $M^{+}M$ is Hermitian, $\mathrm{range}(M^{+}M) = \mathrm{Ker}(M^{+}M)^{\perp} = \mathrm{Ker}(M)^{\perp}$, as claimed.
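Both projection claims can be checked numerically; a small self-contained sketch (not in the original note; the rank-3 test matrix is arbitrary):

\begin{verbatim}
# Sketch: M M^+ projects onto range(M); M^+ M projects onto Ker(M)^perp.
import numpy as np

rng = np.random.default_rng(5)
n, r = 5, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank r
P_range, P_row = M @ np.linalg.pinv(M), np.linalg.pinv(M) @ M

# Idempotent and symmetric, hence orthogonal projections:
assert np.allclose(P_range @ P_range, P_range) and np.allclose(P_range, P_range.T)
assert np.allclose(P_row @ P_row, P_row) and np.allclose(P_row, P_row.T)

# P_range fixes vectors in range(M); I - P_row maps into Ker(M).
x = rng.standard_normal(n)
assert np.allclose(P_range @ (M @ x), M @ x)
k = x - P_row @ x
assert np.allclose(M @ k, 0)
\end{verbatim}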

It will also be useful to see that $\mathrm{range}(M) = \mathrm{range}(MM^{+})$ consists of all vectors $y \in \mathbb{R}^n$ such that
\[
U^{\top}y = \begin{pmatrix} z \\ 0 \end{pmatrix},
\]
with $z \in \mathbb{R}^r$. Indeed, if $y = Mx$, then
\[
U^{\top}y = U^{\top}Mx = U^{\top}U\Sigma V^{\top}x = \Sigma V^{\top}x
= \begin{pmatrix} \Sigma_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} V^{\top}x
= \begin{pmatrix} z \\ 0 \end{pmatrix},
\]
where $\Sigma_r$ is the $r \times r$ diagonal matrix $\mathrm{diag}(\sigma_1, \ldots, \sigma_r)$. Conversely, if $U^{\top}y = \begin{pmatrix} z \\ 0 \end{pmatrix}$, then $y = U\begin{pmatrix} z \\ 0 \end{pmatrix}$ and
\[
MM^{+}y
= U \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} U^{\top} y
= U \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} U^{\top} U \begin{pmatrix} z \\ 0 \end{pmatrix}
= U \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} \begin{pmatrix} z \\ 0 \end{pmatrix}
= U \begin{pmatrix} z \\ 0 \end{pmatrix} = y,
\]
which shows that $y$ belongs to the range of $M$.

Similarly, we claim that $\mathrm{range}(M^{+}M) = \mathrm{Ker}(M)^{\perp}$ consists of all vectors $y \in \mathbb{R}^n$ such that
\[
V^{\top}y = \begin{pmatrix} z \\ 0 \end{pmatrix},
\]
with $z \in \mathbb{R}^r$. If $y = M^{+}Mu$, then
\[
y = M^{+}Mu = V \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} V^{\top} u = V \begin{pmatrix} z \\ 0 \end{pmatrix},
\]
for some $z \in \mathbb{R}^r$. Conversely, if $V^{\top}y = \begin{pmatrix} z \\ 0 \end{pmatrix}$, then $y = V\begin{pmatrix} z \\ 0 \end{pmatrix}$ and so,
\[
M^{+}M V \begin{pmatrix} z \\ 0 \end{pmatrix}
= V \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} V^{\top} V \begin{pmatrix} z \\ 0 \end{pmatrix}
= V \begin{pmatrix} I_r & 0 \\ 0 & 0_{n-r} \end{pmatrix} \begin{pmatrix} z \\ 0 \end{pmatrix}
= V \begin{pmatrix} z \\ 0 \end{pmatrix} = y,
\]
which shows that $y \in \mathrm{range}(M^{+}M)$.

If $M$ is a symmetric matrix, then in general, there is no SVD, $U\Sigma V^{\top}$, of $M$ with $U = V$. However, if $M \succeq 0$, then the eigenvalues of $M$ are nonnegative, and so the nonzero eigenvalues of $M$ are equal to the singular values of $M$ and SVDs of $M$ are of the form
\[
M = U \Sigma U^{\top}.
\]
Analogous results hold for complex matrices, but in this case, $U$ and $V$ are unitary matrices and $MM^{+}$ and $M^{+}M$ are Hermitian orthogonal projections.

If $M$ is a normal matrix, which means that $MM^{\top} = M^{\top}M$, then there is an intimate relationship between SVDs of $M$ and block diagonalizations of $M$. As a consequence, the pseudo-inverse of a normal matrix, $M$, can be obtained directly from a block diagonalization of $M$.

If $A$ is a (real) normal matrix, then it can be block diagonalized with respect to an orthogonal matrix, $U$, as
\[
A = U \Lambda U^{\top},
\]
where $\Lambda$ is the (real) block diagonal matrix,
\[
\Lambda = \mathrm{diag}(B_1, \ldots, B_n),
\]
consisting either of $2 \times 2$ blocks of the form
\[
B_j = \begin{pmatrix} \lambda_j & -\mu_j \\ \mu_j & \lambda_j \end{pmatrix}
\]
with $\mu_j \neq 0$, or of one-dimensional blocks, $B_k = (\lambda_k)$.

Assume that $B_1, \ldots, B_p$ are $2 \times 2$ blocks and that $\lambda_{2p+1}, \ldots, \lambda_n$ are the scalar blocks. We know that the numbers $\lambda_j \pm i\mu_j$, and the $\lambda_{2p+k}$, are the eigenvalues of $A$. Let
\[
\rho_{2j-1} = \rho_{2j} = \sqrt{\lambda_j^2 + \mu_j^2} = \sqrt{\det(B_j)} \quad \text{for } j = 1, \ldots, p,
\]
and $\rho_j = |\lambda_j|$ for $j = 2p + 1, \ldots, r$. Multiplying $U$ by a suitable permutation matrix, we may assume that the blocks of $\Lambda$ are ordered so that $\rho_1 \geq \rho_2 \geq \cdots \geq \rho_r > 0$. Then it is easy to see that
\[
AA^{\top} = A^{\top}A = U \Lambda U^{\top} U \Lambda^{\top} U^{\top} = U \Lambda \Lambda^{\top} U^{\top},
\]
with
\[
\Lambda \Lambda^{\top} = \mathrm{diag}(\rho_1^2, \ldots, \rho_r^2, 0, \ldots, 0),
\]
so $\rho_1 \geq \rho_2 \geq \cdots \geq \rho_r > 0$ are the singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ of $A$. Define the diagonal matrix
\[
\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0),
\]
where $r = \mathrm{rank}(A)$ and $\sigma_1 \geq \cdots \geq \sigma_r > 0$, and the block diagonal matrix $\Theta$ defined such that the block $B_i$ in $\Lambda$ is replaced by the block $\sigma^{-1}B_i$, where $\sigma = \sqrt{\det(B_i)}$, the nonzero scalar $\lambda_j$ is replaced by $\lambda_j/|\lambda_j|$, and a diagonal zero is replaced by $1$. Observe that $\Theta$ is an orthogonal matrix and
\[
\Lambda = \Theta \Sigma.
\]
But then we can write
\[
A = U \Lambda U^{\top} = U \Theta \Sigma U^{\top},
\]
and if we let $V = U\Theta$, since $U$ is orthogonal and $\Theta$ is also orthogonal, $V$ is also orthogonal and $A = V \Sigma U^{\top}$ is an SVD for $A$. Now we get
\[
A^{+} = U \Sigma^{+} V^{\top} = U \Sigma^{+} \Theta^{\top} U^{\top}.
\]
However, since $\Theta$ is an orthogonal matrix, $\Theta^{\top} = \Theta^{-1}$, and a simple calculation shows that
\[
\Sigma^{+} \Theta^{\top} = \Sigma^{+} \Theta^{-1} = \Lambda^{+},
\]
which yields the formula
\[
A^{+} = U \Lambda^{+} U^{\top}.
\]
Also observe that the $r \times r$ top-left block $\Lambda_r$ of $\Lambda$ is invertible and
\[
\Lambda^{+} = \begin{pmatrix} \Lambda_r^{-1} & 0 \\ 0 & 0 \end{pmatrix}.
\]
Therefore, the pseudo-inverse of a normal matrix can be computed directly from any block diagonalization of $A$, as claimed.
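The formula $A^{+} = U\Lambda^{+}U^{\top}$ can be tested on a small normal matrix built from one $2 \times 2$ block and one zero scalar block (an arbitrary illustration, not in the original note; $\Lambda^{+}$ is computed blockwise as described above):

\begin{verbatim}
# Sketch: pseudo-inverse of a (real) normal matrix from a block
# diagonalization A = U Lambda U^T.
import numpy as np

rng = np.random.default_rng(6)
l, m = 1.5, 0.5
Lam = np.array([[l, -m, 0.0],
                [m,  l, 0.0],
                [0.0, 0.0, 0.0]])       # block diagonal, normal, singular
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal U
A = U @ Lam @ U.T

# Lambda^+: invert the invertible 2x2 block, keep the zero block.
Lam_plus = np.zeros((3, 3))
Lam_plus[:2, :2] = np.linalg.inv(Lam[:2, :2])

assert np.allclose(A @ A.T, A.T @ A)              # A is normal
assert np.allclose(U @ Lam_plus @ U.T, np.linalg.pinv(A))
\end{verbatim}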

Next, we will use pseudo-inverses to generalize the result of Section 2 to symmetric matrices
\[
M = \begin{pmatrix} A & B \\ B^{\top} & C \end{pmatrix}
\]
where $C$ (or $A$) is singular.

4 A Characterization of Symmetric Positive Semidefinite Matrices Using Schur Complements

We begin with the following simple fact:

Proposition 4.1. If $P$ is an invertible symmetric matrix, then the function
\[
f(x) = \frac{1}{2}x^{\top}Px + x^{\top}b
\]
has a minimum value iff $P \succ 0$, in which case this optimal value is obtained for a unique value of $x$, namely $x^* = -P^{-1}b$, and with
\[
f(-P^{-1}b) = -\frac{1}{2}b^{\top}P^{-1}b.
\]

Proof. Observe that
\[
\frac{1}{2}(x + P^{-1}b)^{\top}P(x + P^{-1}b) = \frac{1}{2}x^{\top}Px + x^{\top}b + \frac{1}{2}b^{\top}P^{-1}b.
\]
Thus,
\[
f(x) = \frac{1}{2}x^{\top}Px + x^{\top}b = \frac{1}{2}(x + P^{-1}b)^{\top}P(x + P^{-1}b) - \frac{1}{2}b^{\top}P^{-1}b.
\]
If $P$ has some negative eigenvalue, say $-\lambda$ (with $\lambda > 0$), if we pick any eigenvector, $u$, of $P$ associated with $-\lambda$, then for any $\alpha \in \mathbb{R}$ with $\alpha \neq 0$, if we let $x = \alpha u - P^{-1}b$, then as $Pu = -\lambda u$ we get
\begin{align*}
f(x) &= \frac{1}{2}(x + P^{-1}b)^{\top}P(x + P^{-1}b) - \frac{1}{2}b^{\top}P^{-1}b \\
&= \frac{1}{2}\alpha u^{\top}P\alpha u - \frac{1}{2}b^{\top}P^{-1}b \\
&= -\frac{1}{2}\alpha^2\lambda \,\|u\|_2^2 - \frac{1}{2}b^{\top}P^{-1}b,
\end{align*}
and as $\alpha$ can be made as large as we want and $\lambda > 0$, we see that $f$ has no minimum. Therefore, in order for $f$ to have a minimum, we must have $P \succeq 0$.
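Proposition 4.1's formulas for the minimizer and the optimal value can be checked numerically; a minimal sketch (not in the original note), with an arbitrary positive definite $P$:

\begin{verbatim}
# Sketch: for P symmetric positive definite, f(x) = x^T P x / 2 + x^T b
# is minimized at x* = -P^-1 b with value f(x*) = -b^T P^-1 b / 2.
import numpy as np

rng = np.random.default_rng(7)
n = 4
R = rng.standard_normal((n, n))
P = R @ R.T + np.eye(n)                 # symmetric positive definite
b = rng.standard_normal(n)

f = lambda x: 0.5 * x @ P @ x + x @ b
x_star = -np.linalg.solve(P, b)

assert np.isclose(f(x_star), -0.5 * b @ np.linalg.solve(P, b))
# Random perturbations never beat the optimum:
assert all(f(x_star + rng.standard_normal(n)) >= f(x_star) for _ in range(100))
\end{verbatim}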