
Matrix Differentiation - Department of Atmospheric Sciences



Matrix Differentiation (and some other stuff)

Randal J. Barnes
Department of Civil Engineering, University of Minnesota
Minneapolis, Minnesota, USA

1 Introduction

Throughout this presentation I have chosen to use a symbolic matrix notation. This choice was not made lightly. I am a strong advocate of index notation, when appropriate. For example, index notation greatly simplifies the presentation and manipulation of differential geometry. As a rule of thumb, if your work is going to primarily involve differentiation with respect to the spatial coordinates, then index notation is almost surely the appropriate choice. In the present case, however, I will be manipulating large systems of equations in which the matrix calculus is relatively simple while the matrix algebra and matrix arithmetic are messy and more involved.

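Thus, I have chosen to use symbolic notation.

2 Notation and Nomenclature

Definition 1 Let $a_{ij} \in \mathbb{R}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Then the ordered rectangular array

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{1}$$

is said to be a real matrix of dimension $m \times n$.

When writing a matrix I will occasionally write down its typical element as well as its dimension. Thus,

$$\mathbf{A} = [a_{ij}], \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n, \tag{2}$$

denotes a matrix with $m$ rows and $n$ columns, whose typical element is $a_{ij}$. Note, the first subscript locates the row in which the typical element lies while the second subscript locates the column. For example, $a_{jk}$ denotes the element lying in the $j$th row and $k$th column of the matrix $\mathbf{A}$.

Definition 2 A vector is a matrix with only one column. Thus, all vectors are inherently column vectors.

Convention 1 Multi-column matrices are denoted by boldface uppercase letters: for example, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{X}$. Vectors (single-column matrices) are denoted by boldfaced lowercase letters: for example, $\mathbf{a}$, $\mathbf{b}$, $\mathbf{x}$.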

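I will attempt to use letters from the beginning of the alphabet to designate known matrices, and letters from the end of the alphabet for unknown or variable matrices.

CE 8361 Spring 2006

Convention 2 When it is useful to explicitly attach the matrix dimensions to the symbolic notation, I will use an underscript. For example, $\underset{m \times n}{\mathbf{A}}$ indicates a known, multi-column matrix with $m$ rows and $n$ columns.

A superscript $T$ denotes the matrix transpose operation; for example, $\mathbf{A}^T$ denotes the transpose of $\mathbf{A}$. Similarly, if $\mathbf{A}$ has an inverse it will be denoted by $\mathbf{A}^{-1}$. The determinant of $\mathbf{A}$ will be denoted by either $|\mathbf{A}|$ or $\det(\mathbf{A})$. Similarly, the rank of a matrix $\mathbf{A}$ is denoted by $\operatorname{rank}(\mathbf{A})$. An identity matrix will be denoted by $\mathbf{I}$, and $\mathbf{0}$ will denote a null matrix.

3 Matrix Multiplication

Definition 3 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{B}$ be $n \times p$, and let the product $\mathbf{AB}$ be

$$\mathbf{C} = \mathbf{AB} \tag{3}$$

then $\mathbf{C}$ is an $m \times p$ matrix, with element $(i, j)$ given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{4}$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, p$.

Proposition 1 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{x}$ be $n \times 1$, then the typical element of the product

$$\mathbf{z} = \mathbf{Ax} \tag{5}$$

is given by

$$z_i = \sum_{k=1}^{n} a_{ik} x_k \tag{6}$$

for all $i = 1, 2, \ldots, m$. Similarly, let $\mathbf{y}$ be $m \times 1$, then the typical element of the product

$$\mathbf{z}^T = \mathbf{y}^T \mathbf{A} \tag{7}$$

is given by

$$z_i = \sum_{k=1}^{m} a_{ki} y_k \tag{8}$$

for all $i = 1, 2, \ldots, n$. Finally, the scalar resulting from the product

$$\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x} \tag{9}$$

is given by

$$\alpha = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk} y_j x_k \tag{10}$$

Proof: These are merely direct applications of Definition 3.

Proposition 2 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{B}$ be $n \times p$, and let the product $\mathbf{AB}$ be

$$\mathbf{C} = \mathbf{AB} \tag{11}$$

then

$$\mathbf{C}^T = \mathbf{B}^T \mathbf{A}^T \tag{12}$$

Proof: The typical element of $\mathbf{C}$ is given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{13}$$

By definition, the typical element of $\mathbf{C}^T$, say $d_{ij}$, is given by

$$d_{ij} = c_{ji} = \sum_{k=1}^{n} a_{jk} b_{ki} \tag{14}$$

Hence,

$$\mathbf{C}^T = \mathbf{B}^T \mathbf{A}^T \tag{15}$$

Proposition 3 Let $\mathbf{A}$ and $\mathbf{B}$ be $n \times n$ and invertible matrices. Let the product $\mathbf{AB}$ be given by

$$\mathbf{C} = \mathbf{AB} \tag{16}$$

then

$$\mathbf{C}^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1} \tag{17}$$

Proof:

$$\mathbf{C} \mathbf{B}^{-1} \mathbf{A}^{-1} = \mathbf{A} \mathbf{B} \mathbf{B}^{-1} \mathbf{A}^{-1} = \mathbf{I} \tag{18}$$

4 Partitioned Matrices

Frequently, I will find it convenient to deal with partitioned matrices.¹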

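Such a representation, and the manipulation of this representation, are two of the relative advantages of the symbolic matrix notation.

Definition 4 Let $\mathbf{A}$ be $m \times n$ and write

$$\mathbf{A} = \begin{bmatrix} \mathbf{B} & \mathbf{C} \\ \mathbf{D} & \mathbf{E} \end{bmatrix} \tag{19}$$

where $\mathbf{B}$ is $m_1 \times n_1$, $\mathbf{E}$ is $m_2 \times n_2$, $\mathbf{C}$ is $m_1 \times n_2$, $\mathbf{D}$ is $m_2 \times n_1$, $m_1 + m_2 = m$, and $n_1 + n_2 = n$. The above is said to be a partition of the matrix $\mathbf{A}$.

¹ Much of the material in this section is extracted directly from Dhrymes (1978, Section [...]).

Proposition 4 Let $\mathbf{A}$ be a square, nonsingular matrix of order $m$. Partition $\mathbf{A}$ as

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} \tag{20}$$

so that $\mathbf{A}_{11}$ is a nonsingular matrix of order $m_1$, $\mathbf{A}_{22}$ is a nonsingular matrix of order $m_2$, and $m_1 + m_2 = m$. Then

$$\mathbf{A}^{-1} = \begin{bmatrix} \left(\mathbf{A}_{11} - \mathbf{A}_{12} \mathbf{A}_{22}^{-1} \mathbf{A}_{21}\right)^{-1} & -\mathbf{A}_{11}^{-1} \mathbf{A}_{12} \left(\mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12}\right)^{-1} \\ -\mathbf{A}_{22}^{-1} \mathbf{A}_{21} \left(\mathbf{A}_{11} - \mathbf{A}_{12} \mathbf{A}_{22}^{-1} \mathbf{A}_{21}\right)^{-1} & \left(\mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12}\right)^{-1} \end{bmatrix} \tag{21}$$

Proof: Direct multiplication of the proposed $\mathbf{A}^{-1}$ and $\mathbf{A}$ yields

$$\mathbf{A}^{-1} \mathbf{A} = \mathbf{I} \tag{22}$$

5 Matrix Differentiation

In the following discussion I will differentiate matrix quantities with respect to the elements of the referenced matrices.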

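Although no new concept is required to carry out such operations, the element-by-element calculations involve cumbersome manipulations and, thus, it is useful to derive the necessary results and have them readily available.²

Convention 3 Let

$$\mathbf{y} = \psi(\mathbf{x}), \tag{23}$$

where $\mathbf{y}$ is an $m$-element vector, and $\mathbf{x}$ is an $n$-element vector. The symbol

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} & \cdots & \dfrac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \dfrac{\partial y_m}{\partial x_2} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \tag{24}$$

will denote the $m \times n$ matrix of first-order partial derivatives of the transformation from $\mathbf{x}$ to $\mathbf{y}$. Such a matrix is called the Jacobian matrix of the transformation $\psi()$.

Notice that if $\mathbf{x}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is an $m \times 1$ matrix; that is, a single column (a vector). On the other hand, if $\mathbf{y}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is a $1 \times n$ matrix; that is, a single row (the transpose of a vector).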
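The layout fixed by Convention 3 (rows follow the elements of y, columns follow the elements of x) can be checked numerically. The following is only a minimal sketch, assuming NumPy is available; the helper `numerical_jacobian` and the example map `psi` are hypothetical names introduced here for illustration and do not appear in the notes.

```python
import numpy as np


def numerical_jacobian(psi, x, h=1e-6):
    """Central-difference estimate of dy/dx, laid out as in Convention 3:
    row i holds the derivatives of y_i, column j the derivatives w.r.t. x_j."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(psi(x)).size
    J = np.empty((m, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        J[:, j] = (np.asarray(psi(x + step)) - np.asarray(psi(x - step))) / (2 * h)
    return J


def psi(x):
    # Hypothetical transformation from a 2-element x to a 3-element y.
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])


x0 = np.array([1.0, 2.0])
J = numerical_jacobian(psi, x0)

# Hand-computed partial derivatives of psi at x0, in the same m x n layout.
J_exact = np.array([
    [x0[1], x0[0]],
    [np.cos(x0[0]), 0.0],
    [0.0, 2.0 * x0[1]],
])

print(J.shape)  # m = 3 rows, n = 2 columns
print(np.allclose(J, J_exact, atol=1e-6))
```

With a scalar x the same helper returns a single column, and with a scalar y a single row, matching the two special cases noted above.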

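Proposition 5 Let

$$\mathbf{y} = \mathbf{Ax} \tag{25}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A} \tag{26}$$

² Much of the material in this section is extracted directly from Dhrymes (1978, Section [...]). The interested reader is directed to this worthy reference to find additional results.

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$y_i = \sum_{k=1}^{n} a_{ik} x_k \tag{27}$$

it follows that

$$\frac{\partial y_i}{\partial x_j} = a_{ij} \tag{28}$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Hence

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A} \tag{29}$$

Proposition 6 Let

$$\mathbf{y} = \mathbf{Ax} \tag{30}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, as in Proposition 5. Suppose that $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ is independent of $\mathbf{z}$. Then

$$\frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{31}$$

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$y_i = \sum_{k=1}^{n} a_{ik} x_k \tag{32}$$

for all $i = 1, 2, \ldots, m$, it follows that

$$\frac{\partial y_i}{\partial z_j} = \sum_{k=1}^{n} a_{ik} \frac{\partial x_k}{\partial z_j} \tag{33}$$

but the right-hand side of the above is simply element $(i, j)$ of $\mathbf{A} \, \partial \mathbf{x} / \partial \mathbf{z}$.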

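Hence

$$\frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{34}$$

Proposition 7 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x} \tag{35}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ is independent of $\mathbf{x}$ and $\mathbf{y}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{y}^T \mathbf{A} \tag{36}$$

and

$$\frac{\partial \alpha}{\partial \mathbf{y}} = \mathbf{x}^T \mathbf{A}^T \tag{37}$$

Proof: Define

$$\mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{38}$$

and note that

$$\alpha = \mathbf{w}^T \mathbf{x} \tag{39}$$

Hence, by Proposition 5 we have that

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{40}$$

which is the first result. Since $\alpha$ is a scalar, we can write

$$\alpha = \alpha^T = \mathbf{x}^T \mathbf{A}^T \mathbf{y} \tag{41}$$

and applying Proposition 5 as before we obtain

$$\frac{\partial \alpha}{\partial \mathbf{y}} = \mathbf{x}^T \mathbf{A}^T \tag{42}$$

Proposition 8 For the special case in which the scalar $\alpha$ is given by the quadratic form

$$\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x} \tag{43}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{x}^T \left(\mathbf{A} + \mathbf{A}^T\right) \tag{44}$$

Proof: By definition

$$\alpha = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ij} x_i x_j \tag{45}$$

Differentiating with respect to the $k$th element of $\mathbf{x}$ we have

$$\frac{\partial \alpha}{\partial x_k} = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i \tag{46}$$

for all $k = 1, 2, \ldots, n$, and consequently,

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{x}^T \mathbf{A}^T + \mathbf{x}^T \mathbf{A} = \mathbf{x}^T \left(\mathbf{A}^T + \mathbf{A}\right) \tag{47}$$

Proposition 9 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x} \tag{48}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = 2 \mathbf{x}^T \mathbf{A} \tag{49}$$

Proof: This is an obvious application of Proposition 8.

Proposition 10 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{x} \tag{50}$$

where $\mathbf{y}$ is $n \times 1$, $\mathbf{x}$ is $n \times 1$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$.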

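Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{51}$$

Proof: We have

$$\alpha = \sum_{j=1}^{n} x_j y_j \tag{52}$$

Differentiating with respect to the $k$th element of $\mathbf{z}$ we have

$$\frac{\partial \alpha}{\partial z_k} = \sum_{j=1}^{n} \left( x_j \frac{\partial y_j}{\partial z_k} + y_j \frac{\partial x_j}{\partial z_k} \right) \tag{53}$$

for every element $z_k$ of $\mathbf{z}$, and consequently,

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{54}$$

Proposition 11 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{x}^T \mathbf{x} \tag{55}$$

where $\mathbf{x}$ is $n \times 1$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = 2 \mathbf{x}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{56}$$

Proof: This is an obvious application of Proposition 10.

Proposition 12 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x} \tag{57}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{58}$$

Proof: Define

$$\mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{59}$$

and note that

$$\alpha = \mathbf{w}^T \mathbf{x} \tag{60}$$

Applying Proposition 10 we have

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{w}}{\partial \mathbf{z}} + \mathbf{w}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{61}$$

Substituting back in for $\mathbf{w}$ we arrive at

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{62}$$

Proposition 13 Let the scalar $\alpha$ be defined by the quadratic form

$$\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x} \tag{63}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$.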

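Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \left(\mathbf{A} + \mathbf{A}^T\right) \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{64}$$

Proof: This is an obvious application of Proposition 12.

Proposition 14 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x} \tag{65}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = 2 \mathbf{x}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{66}$$

Proof: This is an obvious application of Proposition 13.

Definition 5 Let $\mathbf{A}$ be an $m \times n$ matrix whose elements are functions of the scalar parameter $\alpha$. Then the derivative of the matrix $\mathbf{A}$ with respect to the scalar parameter $\alpha$ is the $m \times n$ matrix of element-by-element derivatives:

$$\frac{\partial \mathbf{A}}{\partial \alpha} = \begin{bmatrix} \dfrac{\partial a_{11}}{\partial \alpha} & \dfrac{\partial a_{12}}{\partial \alpha} & \cdots & \dfrac{\partial a_{1n}}{\partial \alpha} \\ \dfrac{\partial a_{21}}{\partial \alpha} & \dfrac{\partial a_{22}}{\partial \alpha} & \cdots & \dfrac{\partial a_{2n}}{\partial \alpha} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial a_{m1}}{\partial \alpha} & \dfrac{\partial a_{m2}}{\partial \alpha} & \cdots & \dfrac{\partial a_{mn}}{\partial \alpha} \end{bmatrix} \tag{67}$$

Proposition 15 Let $\mathbf{A}$ be a nonsingular, $m \times m$ matrix whose elements are functions of the scalar parameter $\alpha$. Then

$$\frac{\partial \mathbf{A}^{-1}}{\partial \alpha} = -\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} \mathbf{A}^{-1} \tag{68}$$

Proof: Start with the definition of the inverse

$$\mathbf{A}^{-1} \mathbf{A} = \mathbf{I} \tag{69}$$

and differentiate, yielding

$$\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} + \frac{\partial \mathbf{A}^{-1}}{\partial \alpha} \mathbf{A} = \mathbf{0} \tag{70}$$

rearranging the terms yields

$$\frac{\partial \mathbf{A}^{-1}}{\partial \alpha} = -\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} \mathbf{A}^{-1} \tag{71}$$

References

Dhrymes, Phoebus J.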
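As a numerical sanity check on two of the results above, Proposition 8 (the derivative of a quadratic form) and Proposition 15 (the derivative of an inverse), the closed-form expressions can be compared against central finite differences. This is only a sketch under stated assumptions: it assumes NumPy, and the test matrices, the linear parameterization `A_of(alpha) = A0 + alpha * A1`, and all variable names are hypothetical choices made for illustration, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
h = 1e-6  # finite-difference step

# --- Proposition 8: d(x^T A x)/dx = x^T (A + A^T) ---
A = rng.standard_normal((n, n))  # deliberately not symmetric
x = rng.standard_normal(n)


def quad(v):
    return v @ A @ v


# One central difference per coordinate direction e_k.
grad_fd = np.array([(quad(x + h * e) - quad(x - h * e)) / (2 * h)
                    for e in np.eye(n)])
grad_formula = x @ (A + A.T)

# --- Proposition 15: d(A^{-1})/d(alpha) = -A^{-1} (dA/d(alpha)) A^{-1} ---
# Hypothetical parameterization: A(alpha) = A0 + alpha * A1, so dA/d(alpha) = A1.
# The 10 * I term is an assumption made to keep A(alpha) comfortably nonsingular.
A0 = 10.0 * np.eye(n) + rng.standard_normal((n, n))
A1 = rng.standard_normal((n, n))


def A_of(alpha):
    return A0 + alpha * A1


alpha = 0.3
dAinv_fd = (np.linalg.inv(A_of(alpha + h)) - np.linalg.inv(A_of(alpha - h))) / (2 * h)
Ainv = np.linalg.inv(A_of(alpha))
dAinv_formula = -Ainv @ A1 @ Ainv

print(np.allclose(grad_fd, grad_formula, atol=1e-6))
print(np.allclose(dAinv_fd, dAinv_formula, atol=1e-6))
```

Using a non-symmetric `A` in the first check matters: with a symmetric matrix the result would collapse to the special case of Proposition 9 and would not distinguish $\mathbf{x}^T(\mathbf{A} + \mathbf{A}^T)$ from $2\mathbf{x}^T\mathbf{A}$.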

