
Chapter 13  The Multivariate Gaussian

In this chapter we present some basic facts regarding the multivariate Gaussian. We discuss the two major parameterizations of the multivariate Gaussian, the moment parameterization and the canonical parameterization, and we show how the basic operations of marginalization and conditioning are carried out in these two parameterizations. We also discuss maximum likelihood estimation for the multivariate Gaussian.

Parameterizations

The multivariate Gaussian distribution is commonly expressed in terms of the parameters $\mu$ and $\Sigma$, where $\mu$ is an $n \times 1$ vector and $\Sigma$ is an $n \times n$, symmetric matrix. (We will assume for now that $\Sigma$ is also positive definite, but later on we will have occasion to relax that constraint.) We have the following form for the density function:

$$p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(x-\mu)^T \Sigma^{-1}(x-\mu)\right\}, \tag{13.1}$$

where $x$ is a vector in $\mathbb{R}^n$. The density can be integrated over volumes in $\mathbb{R}^n$ to assign probability mass to those volumes.
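As an illustrative aside (not part of the original text), Eq. (13.1) is straightforward to evaluate numerically. The following NumPy sketch is our own; the function name and test values are arbitrary:

```python
import numpy as np

def gaussian_density(x, mu, Sigma):
    """Evaluate the multivariate Gaussian density of Eq. (13.1)."""
    n = mu.shape[0]
    diff = x - mu
    # Quadratic form (1/2)(x - mu)^T Sigma^{-1} (x - mu), computed with a
    # linear solve rather than an explicit matrix inverse.
    quad = 0.5 * diff @ np.linalg.solve(Sigma, diff)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-quad) / norm

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
print(gaussian_density(np.array([0.5, -0.5]), mu, Sigma))
```

In practice one would typically call `scipy.stats.multivariate_normal`, which computes the same density with more attention to numerical edge cases.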

The geometry of the multivariate Gaussian is essentially that associated with the quadratic form $f(x) = \frac{1}{2}(x-\mu)^T \Sigma^{-1}(x-\mu)$ in the exponent of the density. Recall our discussion in Chapter ??, where we showed that a quadratic form $f(x)$ is a paraboloid with level surfaces, i.e., surfaces of the form $f(x) = c$ for fixed $c$, being ellipsoids oriented along the eigenvectors of the matrix $\Sigma$. Now note that the exponential, $\exp(\cdot)$, is a scalar function that leaves the geometrical features of the quadratic form intact. That is, for any $x$ lying on an ellipsoid $f(x) = c$, we obtain the value $\exp\{-c\}$. The maximum value of the exponential is 1, obtained at $x = \mu$ where $f(x) = 0$. The paraboloid $f(x)$ increases to infinity as we move away from $x = \mu$; thus we obtain a bump in $(n+1)$-dimensional space centered at $x = \mu$. The level surfaces of the Gaussian bump are ellipsoids oriented along the eigenvectors of $\Sigma$.
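To make the geometric picture concrete, the sketch below (our own; the 2x2 covariance and level $c$ are arbitrary) constructs points on a level surface $f(x) = c$. Substituting the eigendecomposition of $\Sigma$ into $f$ shows that the level surface is an ellipse whose axes lie along the eigenvectors, with half-axis lengths $\sqrt{2c\lambda_i}$:

```python
import numpy as np

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
c = 1.0

lams, V = np.linalg.eigh(Sigma)                 # eigenvalues/eigenvectors of Sigma
theta = np.linspace(0.0, 2.0 * np.pi, 200)
unit_circle = np.stack([np.cos(theta), np.sin(theta)])   # shape (2, 200)
# Map the unit circle onto the level set f(x) = c.
ellipse = mu[:, None] + V @ (np.sqrt(2.0 * c * lams)[:, None] * unit_circle)

# Check: every point on the curve satisfies f(x) = c (up to rounding).
d = ellipse - mu[:, None]
f = 0.5 * np.einsum('ij,ij->j', d, np.linalg.solve(Sigma, d))
assert np.allclose(f, c)
```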

The factor in front of the exponential in Eq. (13.1) is the normalization factor that ensures that the density integrates to one. To show that this factor is correct, we make use of the diagonalization of $\Sigma^{-1}$. Diagonalization yields a product of $n$ univariate Gaussians whose variances are the eigenvalues of $\Sigma$. When we integrate, each of these univariate Gaussians contributes a factor $\sqrt{2\pi\lambda_i}$ to the normalization, where $\lambda_i$ is the $i$th eigenvalue of $\Sigma$. Recall that the determinant of a matrix is the product of its eigenvalues to obtain the result. (We ask the reader to fill in the details of this derivation in Exercise ??.)

As in the univariate case, the parameters $\mu$ and $\Sigma$ have a probabilistic interpretation as the moments of the Gaussian distribution. In particular, we have the important result:

$$\mu = E(x) \tag{13.2}$$

$$\Sigma = E(x-\mu)(x-\mu)^T. \tag{13.3}$$

We will not bother to derive this standard result, but will provide a hint: diagonalize and appeal to the univariate case.

Although the moment parameterization of the Gaussian will play a principal role in our subsequent development, there is a second parameterization, the canonical parameterization, that will also be important.
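A quick Monte Carlo sanity check of Eqs. (13.2) and (13.3) (a sketch of our own, with arbitrary parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
print(X.mean(axis=0))               # close to mu,    Eq. (13.2)
print(np.cov(X, rowvar=False))      # close to Sigma, Eq. (13.3)
```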

In particular, expanding the quadratic form in Eq. (13.1), and defining the canonical parameters:

$$\Lambda = \Sigma^{-1} \tag{13.4}$$

$$\eta = \Sigma^{-1}\mu, \tag{13.5}$$

we obtain:

$$p(x \mid \eta, \Lambda) = \exp\left\{a + \eta^T x - \frac{1}{2} x^T \Lambda x\right\}, \tag{13.6}$$

where $a = -\frac{1}{2}\left(n \log(2\pi) - \log|\Lambda| + \eta^T \Lambda^{-1} \eta\right)$ is the normalizing constant in this representation. The canonical parameterization is also sometimes referred to as the information parameterization. We can also convert from canonical parameters back to moment parameters:

$$\mu = \Lambda^{-1}\eta \tag{13.7}$$

$$\Sigma = \Lambda^{-1}. \tag{13.8}$$

Moment parameters and canonical parameters are useful in different circumstances. As we will see, different kinds of transformations are more readily carried out in one representation or the other.
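The conversions (13.4), (13.5), (13.7), (13.8) are a pair of inverse maps, and the canonical-form density (13.6) agrees with the moment-form density (13.1); both facts can be confirmed in a few lines of NumPy (a sketch; the function names and test values are our own):

```python
import numpy as np

def to_canonical(mu, Sigma):
    """Moment -> canonical: Lambda = Sigma^{-1}, eta = Sigma^{-1} mu (Eqs. 13.4-13.5)."""
    Lam = np.linalg.inv(Sigma)
    return Lam @ mu, Lam

def to_moment(eta, Lam):
    """Canonical -> moment: mu = Lambda^{-1} eta, Sigma = Lambda^{-1} (Eqs. 13.7-13.8)."""
    Sigma = np.linalg.inv(Lam)
    return Sigma @ eta, Sigma

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
eta, Lam = to_canonical(mu, Sigma)

# Round trip recovers the original moment parameters.
mu2, Sigma2 = to_moment(eta, Lam)
assert np.allclose(mu, mu2) and np.allclose(Sigma, Sigma2)

# The canonical-form density (13.6) matches the moment-form density (13.1).
n = mu.shape[0]
x = np.array([0.3, 0.7])
a = -0.5 * (n * np.log(2 * np.pi) - np.log(np.linalg.det(Lam)) + eta @ Sigma @ eta)
p_canonical = np.exp(a + eta @ x - 0.5 * x @ Lam @ x)
diff = x - mu
p_moment = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma)))
assert np.allclose(p_canonical, p_moment)
```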

Joint distributions

Suppose that we partition the $n \times 1$ vector $x$ into a $p \times 1$ subvector $x_1$ and a $q \times 1$ subvector $x_2$, where $n = p + q$. Form corresponding partitions of the $\mu$ and $\Sigma$ parameters:

$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \qquad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}. \tag{13.9}$$

We can write a joint Gaussian distribution for $x_1$ and $x_2$ using these partitioned parameters:

$$p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{(p+q)/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}\begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}^T \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1} \begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}\right\}. \tag{13.10}$$

This partitioned form of the joint distribution raises a number of questions. In particular, we can equally well form partitioned versions of $\eta$ and $\Lambda$ and express the joint distribution in the canonical parameterization; is there any relationship between the partitioned forms of these two representations? Also, what do the blocks in the partitioned forms have to do with the marginal and conditional probabilities of $x_1$ and $x_2$?

These questions all involve the manipulation of the quadratic forms in the exponents of the Gaussian densities; indeed, the underlying algebraic problem is that of completing the square of quadratic forms. In the next section, we discuss an algebra that provides a general solution to the problem of completing the square.

Partitioned matrices

Our first result in this section is to show how to block diagonalize a partitioned matrix. A number of useful results flow from this operation, including an explicit expression for the inverse of a partitioned matrix. Consider a general partitioned matrix:

$$M = \begin{bmatrix} E & F \\ G & H \end{bmatrix}, \tag{13.11}$$

where we assume that both $E$ and $H$ are invertible. (Our results can be generalized beyond this setting.) To invert this matrix, we follow a procedure similar to Gaussian elimination. In particular, we wish to block diagonalize the matrix: we wish to put a block of zeros in place of $G$ and a block of zeros in place of $F$.

To zero out the upper-right-hand corner of $M$, note that it suffices to premultiply the second block column of $M$ by a block row vector having elements $I$ and $-FH^{-1}$. Similarly, to zero out the lower-left-hand corner of $M$, it suffices to postmultiply the second block row of $M$ by a block column vector having elements $I$ and $-H^{-1}G$. The magical fact is that these two operations do not interfere with each other; thus we can block diagonalize $M$ by doing both operations. In particular, we have:

$$\begin{bmatrix} I & -FH^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} E & F \\ G & H \end{bmatrix} \begin{bmatrix} I & 0 \\ -H^{-1}G & I \end{bmatrix} = \begin{bmatrix} E - FH^{-1}G & 0 \\ 0 & H \end{bmatrix}. \tag{13.12}$$

The correctness of this decomposition can be verified by direct multiplication. We define the Schur complement of the matrix $M$ with respect to $H$, denoted $M/H$, as the term $E - FH^{-1}G$ that appears in the block diagonal matrix.
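Eq. (13.12) is easy to check numerically. The sketch below is our own construction; for a random matrix the blocks $E$ and $H$ are generically invertible:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 2, 3
M = rng.normal(size=(p + q, p + q))    # random matrix; E and H generically invertible
E, F = M[:p, :p], M[:p, p:]
G, H = M[p:, :p], M[p:, p:]

Ip, Iq = np.eye(p), np.eye(q)
left = np.block([[Ip, -F @ np.linalg.inv(H)],
                 [np.zeros((q, p)), Iq]])
right = np.block([[Ip, np.zeros((p, q))],
                  [-np.linalg.inv(H) @ G, Iq]])

D = left @ M @ right
schur = E - F @ np.linalg.inv(H) @ G    # the Schur complement M/H
assert np.allclose(D[:p, :p], schur)    # upper-left block is M/H
assert np.allclose(D[:p, p:], 0.0)      # off-diagonal blocks vanish
assert np.allclose(D[p:, :p], 0.0)
assert np.allclose(D[p:, p:], H)        # lower-right block is H
```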

It is not difficult to show that $M/H$ is invertible. We now take the inverse of both sides of Eq. (13.12). Note that in a matrix expression of the form $XYZ = W$, inverting both sides yields $Y^{-1} = ZW^{-1}X$; this implies that we don't need to explicitly invert the block triangular matrices in Eq. (13.12) (although this is easily done). Note also that the inverse of a block diagonal matrix is the block diagonal matrix of the inverses of its blocks. Thus we have:

$$\begin{bmatrix} E & F \\ G & H \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ -H^{-1}G & I \end{bmatrix} \begin{bmatrix} (M/H)^{-1} & 0 \\ 0 & H^{-1} \end{bmatrix} \begin{bmatrix} I & -FH^{-1} \\ 0 & I \end{bmatrix} \tag{13.13}$$

$$= \begin{bmatrix} (M/H)^{-1} & -(M/H)^{-1}FH^{-1} \\ -H^{-1}G(M/H)^{-1} & H^{-1} + H^{-1}G(M/H)^{-1}FH^{-1} \end{bmatrix}, \tag{13.14}$$

which expresses the inverse of a partitioned matrix in terms of its blocks.

We can also apply the determinant operator to both sides of Eq. (13.12). The block triangular matrices clearly have a determinant of one; thus we obtain another important result:

$$|M| = |M/H|\,|H|. \tag{13.15}$$

(This result makes the choice of notation for the Schur complement seem quite natural!) We are not yet finished.
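Both the partitioned inverse (13.14) and the determinant identity (13.15) can be verified numerically; this sketch (ours, same random-block construction as above) checks all four blocks and the determinant:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 2, 3
M = rng.normal(size=(p + q, p + q))    # generically, E, H, and M/H are invertible
E, F = M[:p, :p], M[:p, p:]
G, H = M[p:, :p], M[p:, p:]

schur = E - F @ np.linalg.inv(H) @ G   # M/H
Si, Hi = np.linalg.inv(schur), np.linalg.inv(H)

Minv = np.linalg.inv(M)
assert np.allclose(Minv[:p, :p], Si)                      # (M/H)^{-1}
assert np.allclose(Minv[:p, p:], -Si @ F @ Hi)            # -(M/H)^{-1} F H^{-1}
assert np.allclose(Minv[p:, :p], -Hi @ G @ Si)            # -H^{-1} G (M/H)^{-1}
assert np.allclose(Minv[p:, p:], Hi + Hi @ G @ Si @ F @ Hi)

# Determinant identity (13.15): |M| = |M/H| |H|.
assert np.isclose(np.linalg.det(M), np.linalg.det(schur) * np.linalg.det(H))
```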

Note that we could alternatively have decomposed the matrix $M$ in terms of $E$ and $M/E$, yielding the following expression for the inverse:

$$\begin{bmatrix} E & F \\ G & H \end{bmatrix}^{-1} = \begin{bmatrix} E^{-1} + E^{-1}F(M/E)^{-1}GE^{-1} & -E^{-1}F(M/E)^{-1} \\ -(M/E)^{-1}GE^{-1} & (M/E)^{-1} \end{bmatrix}. \tag{13.16}$$

These two expressions for the inverse of $M$ (Eq. (13.14) and Eq. (13.16)) must be the same; thus we can set the corresponding blocks equal to each other. This yields:

$$(E - FH^{-1}G)^{-1} = E^{-1} + E^{-1}F(H - GE^{-1}F)^{-1}GE^{-1} \tag{13.17}$$

and

$$(E - FH^{-1}G)^{-1}FH^{-1} = E^{-1}F(H - GE^{-1}F)^{-1}. \tag{13.18}$$

Both Eq. (13.17), which is generally referred to as the matrix inversion lemma, and Eq. (13.18) are quite useful in transformations involving Gaussian distributions. They allow expressions involving the inverse of $E$ to be converted into expressions involving the inverse of $H$, and vice versa.

Marginalization and conditioning

We now make use of our block diagonalization results to develop general formulas for the key operations of marginalization and conditioning in the multivariate Gaussian setting.
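A numerical check of the matrix inversion lemma (13.17) and its companion (13.18) (a sketch of our own; the diagonal shift merely keeps the random blocks comfortably invertible):

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 2, 3
E = rng.normal(size=(p, p)) + 3 * np.eye(p)   # shift keeps E well-conditioned
H = rng.normal(size=(q, q)) + 3 * np.eye(q)   # likewise for H
F = rng.normal(size=(p, q))
G = rng.normal(size=(q, p))

Ei, Hi = np.linalg.inv(E), np.linalg.inv(H)
lhs = np.linalg.inv(E - F @ Hi @ G)

# Matrix inversion lemma, Eq. (13.17).
rhs = Ei + Ei @ F @ np.linalg.inv(H - G @ Ei @ F) @ G @ Ei
assert np.allclose(lhs, rhs)

# Companion identity, Eq. (13.18).
assert np.allclose(lhs @ F @ Hi, Ei @ F @ np.linalg.inv(H - G @ Ei @ F))
```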

We present results for both the moment parameterization and the canonical parameterization. Our goal is to split the joint distribution in Eq. (13.10) into a marginal probability for $x_2$ and a conditional probability for $x_1$, according to the factorization $p(x_1, x_2) = p(x_1 \mid x_2)\,p(x_2)$. Focusing first on the exponential factor, we make use of Eq. (13.12):

$$\exp\left\{-\frac{1}{2}\begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}^T \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1} \begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}\right\}$$

$$= \exp\left\{-\frac{1}{2}\begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}^T \begin{bmatrix} I & 0 \\ -\Sigma_{22}^{-1}\Sigma_{21} & I \end{bmatrix} \begin{bmatrix} (\Sigma/\Sigma_{22})^{-1} & 0 \\ 0 & \Sigma_{22}^{-1} \end{bmatrix} \begin{bmatrix} I & -\Sigma_{12}\Sigma_{22}^{-1} \\ 0 & I \end{bmatrix} \begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix}\right\}$$

$$= \exp\left\{-\frac{1}{2}\big(x_1-\mu_1-\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2)\big)^T (\Sigma/\Sigma_{22})^{-1} \big(x_1-\mu_1-\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2)\big)\right\} \cdot \exp\left\{-\frac{1}{2}(x_2-\mu_2)^T \Sigma_{22}^{-1}(x_2-\mu_2)\right\}. \tag{13.19}$$

We next exploit Eq. (13.15) to split the normalization factor into two factors:

$$\frac{1}{(2\pi)^{(p+q)/2}|\Sigma|^{1/2}} = \frac{1}{(2\pi)^{(p+q)/2}\big(|\Sigma/\Sigma_{22}|\,|\Sigma_{22}|\big)^{1/2}} \tag{13.20}$$

$$= \left(\frac{1}{(2\pi)^{p/2}|\Sigma/\Sigma_{22}|^{1/2}}\right)\left(\frac{1}{(2\pi)^{q/2}|\Sigma_{22}|^{1/2}}\right). \tag{13.21}$$

To see that we have achieved our goal of factorizing the joint distribution into the product of a marginal distribution and a conditional distribution, note that if we group the first factor in Eq. (13.19) with the first factor in Eq. (13.21), we obtain a normalized Gaussian in the variable $x_1$. Integrating with respect to $x_1$, these factors disappear and the remaining factors must therefore represent the marginal distribution of $x_2$:

$$p(x_2) = \frac{1}{(2\pi)^{q/2}|\Sigma_{22}|^{1/2}} \exp\left\{-\frac{1}{2}(x_2-\mu_2)^T \Sigma_{22}^{-1}(x_2-\mu_2)\right\}. \tag{13.22}$$

Given this result, we are now licensed to interpret the factors that were integrated over as the conditional probability $p(x_1 \mid x_2)$:

$$p(x_1 \mid x_2) = \frac{1}{(2\pi)^{p/2}|\Sigma/\Sigma_{22}|^{1/2}} \exp\left\{-\frac{1}{2}\big(x_1-\mu_1-\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2)\big)^T (\Sigma/\Sigma_{22})^{-1} \big(x_1-\mu_1-\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2)\big)\right\}. \tag{13.23}$$

To summarize our results, let $(\mu_2^m, \Sigma_2^m)$ denote the moment parameters of the marginal distribution of $x_2$, and let $(\mu_{1|2}^c, \Sigma_{1|2}^c)$ denote the moment parameters of the conditional distribution of $x_1$ given $x_2$. Eq. (13.22) and Eq. (13.23) yield the following expressions for these parameters:

Marginalization:

$$\mu_2^m = \mu_2 \tag{13.24}$$

$$\Sigma_2^m = \Sigma_{22} \tag{13.25}$$

Conditioning:

$$\mu_{1|2}^c = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2) \tag{13.26}$$

$$\Sigma_{1|2}^c = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \tag{13.27}$$

We can also express the marginalization and conditioning operations in the canonical parameterization.
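As a numerical check of the moment-parameterization formulas (13.24) through (13.27) above (a sketch of our own; the random positive definite covariance and the observed value of $x_2$ are arbitrary), we can also cross-check the conditional covariance against the partitioned inverse (13.14), which identifies $(\Sigma^{-1})_{11}$ with $(\Sigma/\Sigma_{22})^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 2, 2
A = rng.normal(size=(p + q, p + q))
Sigma = A @ A.T + np.eye(p + q)        # random symmetric positive definite covariance
mu = rng.normal(size=p + q)

mu1, mu2 = mu[:p], mu[p:]
S11, S12 = Sigma[:p, :p], Sigma[:p, p:]
S21, S22 = Sigma[p:, :p], Sigma[p:, p:]

x2 = rng.normal(size=q)                # an observed value for x2

mu_c = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)   # Eq. (13.26)
Sigma_c = S11 - S12 @ np.linalg.solve(S22, S21)     # Eq. (13.27)

# Cross-check: the conditional covariance is the Schur complement Sigma/Sigma22,
# i.e. the inverse of the upper-left block of Sigma^{-1}, per Eq. (13.14).
assert np.allclose(Sigma_c, np.linalg.inv(np.linalg.inv(Sigma)[:p, :p]))
```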

