Transcription of STAT 730 Chapter 3: Normal Distribution Theory
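STAT 730 Chapter 3: Normal Distribution Theory
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 730: Multivariate Analysis

Nice properties of multivariate normal random vectors
Multivariate normal easily generalizes univariate normal; it is much harder to generalize Poisson, gamma, exponential, etc.
Defined completely by first and second moments, i.e. the mean vector and covariance matrix.
If $x \sim N_p(\mu, \Sigma)$, then $\sigma_{ij} = 0$ implies $x_i$ independent of $x_j$.
$a'x \sim N(a'\mu, a'\Sigma a)$.
Central Limit Theorem says sample means are approximately multivariate normal.
Geometry makes the properties intuitive.

Definition via Cramér-Wold
$x$ is multivariate normal $\Leftrightarrow$ $a'x$ is normal for all $a \in \mathbb{R}^p$.
def'n: $x \sim N_p(\mu, \Sigma)$ $\Leftrightarrow$ $a'x \sim N(a'\mu, a'\Sigma a)$ for all $a \in \mathbb{R}^p$.
thm: If $x \sim N_p(\mu, \Sigma)$ then its characteristic function (c.f.) is $\phi_x(t) = \exp(i t'\mu - \tfrac{1}{2} t'\Sigma t)$.
Proof: Let $y = t'x$. Then the c.f. of $y$ is
$\phi_y(s) \overset{\text{def}}{=} E\{e^{isy}\} = \exp\{is E(y) - \tfrac{1}{2} s^2 \mathrm{var}(y)\} = \exp\{is\, t'\mu - \tfrac{1}{2} s^2\, t'\Sigma t\}$.
Then the c.f. of $x$ is
$\phi_x(t) \overset{\text{def}}{=} E\{e^{i t'x}\} = \phi_y(1) = \exp(i t'\mu - \tfrac{1}{2} t'\Sigma t)$. $\square$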
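As a rough numerical illustration of the characteristic-function theorem above, the sketch below compares a Monte Carlo estimate of $E\{e^{it'x}\}$ with the closed form $\exp(it'\mu - \tfrac{1}{2}t'\Sigma t)$. The values of `mu`, `Sigma`, and `t` are made up for illustration and do not come from the slides.

```python
import numpy as np

# Illustrative parameters (assumed, not from the slides).
mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
t = np.array([0.3, -0.2])

rng = np.random.default_rng(0)
x = rng.multivariate_normal(mu, Sigma, size=200_000)  # each row is one draw of x

# Empirical c.f.: average of exp(i t'x) over the draws.
cf_empirical = np.mean(np.exp(1j * (x @ t)))

# Closed form from the theorem: exp(i t'mu - t'Sigma t / 2).
cf_formula = np.exp(1j * (t @ mu) - 0.5 * (t @ Sigma @ t))

# The gap should be small, of the order 1/sqrt(number of draws).
print(abs(cf_empirical - cf_formula))
```

Because $|e^{it'x}| = 1$, the Monte Carlo error of the empirical average is at most of order $1/\sqrt{N}$ for $N$ draws, so the two numbers should agree to two or three decimals here.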
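Using the c.f. we see that if $\Sigma = 0$ then $x = \mu$ with probability one, i.e. $N_p(\mu, 0) = \delta_\mu$.

Linear transformations of x are also normal
thm: $x \sim N_p(\mu, \Sigma)$, $A \in \mathbb{R}^{q \times p}$, and $c \in \mathbb{R}^q$ $\Rightarrow$ $Ax + c \sim N_q(A\mu + c, A\Sigma A')$.
Proof: Let $b \in \mathbb{R}^q$; then $b'[Ax + c] = [b'A]x + b'c$. Since $[b'A]x$ is univariate normal by def'n, $[b'A]x + b'c$ is also univariate normal for any $b$. The specific forms for the mean and covariance are standard results for any $Ax + c$ (Chapter 2). $\square$
Corollary: Any subset of $x$ is multivariate normal.
Note: you will show $\phi_y(t) = e^{it\mu - \sigma^2 t^2/2}$ for $y \sim N(\mu, \sigma^2)$ in [...].

Normality and independence
Let $x \sim N_p(\mu, \Sigma)$ and $x' = (x_1', x_2')$ of dimension $k$ and $p - k$. Also partition $\mu' = (\mu_1', \mu_2')$ and
$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$.
thm: $x_1$ indep. of $x_2$ $\Leftrightarrow$ $C(x_1, x_2) = \Sigma_{12} = \Sigma_{21}' = 0$.
Proof: $\phi_x(t) = \phi_{x_1}(t_1)\,\phi_{x_2}(t_2) = \exp(i t_1'\mu_1 + i t_2'\mu_2 - \tfrac{1}{2} t_1'\Sigma_{11} t_1 - \tfrac{1}{2} t_2'\Sigma_{22} t_2)$ $\Leftrightarrow$ $C(x_1, x_2) = 0$. $\square$

Some results based on last two slides
Corollary: $x \sim N_p(\mu, \Sigma)$ $\Rightarrow$ $y = \Sigma^{-1/2}(x - \mu) \sim N_p(0, I_p)$ and $U = (x - \mu)'\Sigma^{-1}(x - \mu) = y'y \sim \chi^2_p$.
Corollary: $x \sim N_p(0, I)$ $\Rightarrow$ $a'x / \|a\| \sim N(0, 1)$ for $a \neq 0$.
thm: Let $A \in \mathbb{R}^{n_1 \times p}$, $B \in \mathbb{R}^{n_2 \times p}$, and $x \sim N_p(\mu, \Sigma)$.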
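Then $Ax$ is independent of $Bx$ $\Leftrightarrow$ $A\Sigma B' = 0$.
Proof: This is immediate from the previous two slides by finding the distribution of $\begin{bmatrix} A \\ B \end{bmatrix} x$. $\square$
Corollary: $x \sim N_p(\mu, \sigma^2 I)$ and $GG' = I$ then $Gx \sim N(G\mu, \sigma^2 I)$. Also $Gx$ indep. of $(I - G'G)x$.

Conditional distribution of x2 | x1
Let $x \sim N_p(\mu, \Sigma)$ and $x' = (x_1', x_2')$ of dimension $k$ and $p - k$. Also partition $\mu' = (\mu_1', \mu_2')$ and
$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$.
Let $x_{2.1} = x_2 - \Sigma_{21}\Sigma_{11}^{-1} x_1$. Then
$\begin{bmatrix} x_1 \\ x_{2.1} \end{bmatrix} = \begin{bmatrix} I_k & 0 \\ -\Sigma_{21}\Sigma_{11}^{-1} & I_{p-k} \end{bmatrix} x \sim N_p\left( \begin{bmatrix} \mu_1 \\ \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}\mu_1 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} \end{bmatrix} \right)$.
Then $x_2 | x_1 = x_{2.1} + \Sigma_{21}\Sigma_{11}^{-1} x_1$, where the second term is constant given $x_1$.
Corollary: $x_2 | x_1 \sim N_{p-k}(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1),\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})$.
Very useful! Mean and variance results hold for [...].

Transformations of normal data matrix
If $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$, then $X = [x_1 \cdots x_n]'$ is an $n \times p$ normal data matrix. General transformations are of the form $AXB$. An important example is $\bar{x}' = [\tfrac{1}{n} 1_n']\, X\, [I_p]$, the sample mean. One can show:
thm: $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$ $\Rightarrow$ $\bar{x} \sim N_p(\mu, \tfrac{1}{n}\Sigma)$.

General transformation theorem
thm: If $X$ ($n \times p$) is a data matrix from $N_p(\mu, \Sigma)$ and $Y$ ($m \times q$) $= AXB$ then $Y$ is a normal data matrix $\Leftrightarrow$
(a) $A 1_n = \alpha 1_m$ for some $\alpha \in \mathbb{R}$, or $B'\mu = 0$, and
(b) $AA' = \beta I_m$ for some $\beta \in \mathbb{R}$, or $B'\Sigma B = 0$.
We will prove this in class.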
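The conditional-distribution corollary for $x_2 | x_1$ above can be sketched in a few lines of NumPy. The helper name `conditional_normal` and the numbers in the example are illustrative assumptions, not part of the slides; the function simply evaluates $\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)$ and $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.

```python
import numpy as np

def conditional_normal(mu, Sigma, k, x1):
    """Mean and covariance of x2 | x1 for x = (x1, x2) ~ N_p(mu, Sigma).

    x1 holds the first k coordinates; x2 holds the remaining p - k.
    (Hypothetical helper written for this illustration.)
    """
    mu1, mu2 = mu[:k], mu[k:]
    S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
    S21, S22 = Sigma[k:, :k], Sigma[k:, k:]
    # Solve linear systems instead of forming the inverse of S11 explicitly.
    cond_mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
    cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return cond_mean, cond_cov

# Small made-up example: p = 2, k = 1, conditioning on x1 = 2.
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
cond_mean, cond_cov = conditional_normal(mu, Sigma, k=1, x1=np.array([2.0]))
print(cond_mean, cond_cov)  # mean 0.5 * 2 = 1.0, variance 2 - 0.25 = 1.75
```

Note that the conditional covariance does not depend on the observed value of $x_1$, only the conditional mean does.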
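Some necessary results
def'n: For any matrix $X \in \mathbb{R}^{n \times p}$ with columns $x_{(1)}, \ldots, x_{(p)}$, let
$X^v = \begin{bmatrix} x_{(1)} \\ \vdots \\ x_{(p)} \end{bmatrix} = (x_{(1)}', \ldots, x_{(p)}')'$.

Kronecker products
def'n: Let $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{p \times q}$. Then
$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1m}B \\ a_{21}B & a_{22}B & \cdots & a_{2m}B \\ \vdots & \vdots & & \vdots \\ a_{n1}B & a_{n2}B & \cdots & a_{nm}B \end{bmatrix} \in \mathbb{R}^{np \times mq}$.
prop: Let $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$. Then $C(x_i, x_j) = \delta_{ij}\Sigma$, so
$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \sim N_{np}\left( \begin{bmatrix} \mu \\ \vdots \\ \mu \end{bmatrix}, \begin{bmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{bmatrix} \right) = N_{np}(1_n \otimes \mu,\ I_n \otimes \Sigma)$.

Kronecker products, dist'n of $X^v$
prop: Let $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$. Then
$X^v = \begin{bmatrix} x_{(1)} \\ x_{(2)} \\ \vdots \\ x_{(p)} \end{bmatrix} \sim N_{np}\left( \begin{bmatrix} \mu_1 1_n \\ \vdots \\ \mu_p 1_n \end{bmatrix}, \begin{bmatrix} \sigma_{11} I_n & \sigma_{12} I_n & \cdots & \sigma_{1p} I_n \\ \sigma_{21} I_n & \sigma_{22} I_n & \cdots & \sigma_{2p} I_n \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} I_n & \sigma_{p2} I_n & \cdots & \sigma_{pp} I_n \end{bmatrix} \right) = N_{np}(\mu \otimes 1_n,\ \Sigma \otimes I_n)$.
This is immediate from $C(x_{(i)}, x_{(j)}) = \sigma_{ij} I_n$ and $E(x_{(j)}) = \mu_j 1_n$ and the fact that $X^v$ is a permutation matrix times the vector on the previous slide (so it's also normal).
Corollary: $X$ ($n \times p$) is from $N_p(\mu, \Sigma)$ $\Leftrightarrow$ $X^v \sim N_{np}(\mu \otimes 1_n,\ \Sigma \otimes I_n)$.

Kronecker products, VIII on p. 460
prop: $(B' \otimes A) X^v = (AXB)^v$.
Proof: First note that
$(B' \otimes A) X^v = \begin{bmatrix} b_{11}A & b_{21}A & \cdots & b_{p1}A \\ b_{12}A & b_{22}A & \cdots & b_{p2}A \\ \vdots & \vdots & & \vdots \\ b_{1q}A & b_{2q}A & \cdots & b_{pq}A \end{bmatrix} \begin{bmatrix} x_{(1)} \\ x_{(2)} \\ \vdots \\ x_{(p)} \end{bmatrix}$.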
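That is,
$(B' \otimes A) X^v = \begin{bmatrix} \sum_{i=1}^p b_{i1} A x_{(i)} \\ \sum_{i=1}^p b_{i2} A x_{(i)} \\ \vdots \\ \sum_{i=1}^p b_{iq} A x_{(i)} \end{bmatrix}$.
Now let's find the $j$th column of $A_{m \times n} X_{n \times p} B_{p \times q}$. For any $A_{a \times b} B_{b \times c}$ the $j$th column of $AB$ is $A b_{(j)}$. First $AXB = [A x_{(1)} \cdots A x_{(p)}] B$. Thus the $j$th column of $AXB$ is $[A x_{(1)} \cdots A x_{(p)}]\, b_{(j)} = \sum_{i=1}^p b_{ij} A x_{(i)}$. $\square$

Proof of theorem
$(B' \otimes A) X^v \sim N_{mq}\big( [B' \otimes A][\mu \otimes 1_n],\ [B' \otimes A][\Sigma \otimes I_n][B' \otimes A]' \big)$,
where $[B' \otimes A][\mu \otimes 1_n] = B'\mu \otimes A 1_n$ and $[B' \otimes A][\Sigma \otimes I_n][B' \otimes A]' = B'\Sigma B \otimes AA'$.
This uses $[A \otimes B][C \otimes D] = AC \otimes BD$ and $[A \otimes B]' = A' \otimes B'$.
Going back to the general transformation theorem, this implies the result. In particular, if $Y = XB$ then $Y$ is from $N_q(B'\mu, B'\Sigma B)$, as $A = I_n$.

Important later on
thm: $X$ is from $N_p(\mu, \Sigma)$, $Y = AXB$, $Z = CXD$, then $Y$ indep. of $Z$ $\Leftrightarrow$ either (a) $B'\Sigma D = 0$ or (b) $AC' = 0$.
You will prove this in your homework, see ([...]).
Corollary: Let $X = [X_1\ X_2]$ of dimensions $n \times k$ and $n \times (p - k)$. Then $X_{2.1} = X_2 - X_1 \Sigma_{11}^{-1}\Sigma_{12}$ is indep. of $X_1$, where $X_1$ is from $N_k(\mu_1, \Sigma_{11})$ and $X_{2.1}$ is from $N_{p-k}(\mu_{2.1}, \Sigma_{22.1})$, with $\mu_{2.1} = \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}\mu_1$ and $\Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.
Proof: $X_1 = XB$ where $B' = [I_k\ 0]$ and $X_{2.1} = XD$ where $D' = [-\Sigma_{21}\Sigma_{11}^{-1}\ \ I_{p-k}]$. Now use the theorem above. $\square$
Corollary: $\bar{x}$ is indep. of $S$.
Proof: Taking $A = \tfrac{1}{n} 1_n'$ and $C = H = I_n - \tfrac{1}{n} 1_n 1_n'$ in the theorem gives the result. $\square$

Wishart distribution
Note that $S = X'[\tfrac{1}{n} H] X$.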
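Two of the identities above are purely algebraic and easy to check numerically: the proposition $(B' \otimes A)X^v = (AXB)^v$ and the centering-matrix form $S = X'[\tfrac{1}{n}H]X$. The sketch below uses random matrices of arbitrary, made-up dimensions; `vec` is a small helper defined here that stacks columns, matching the $X^v$ notation.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p, q = 2, 4, 3, 2              # arbitrary illustrative dimensions
A = rng.normal(size=(m, n))
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, q))

def vec(M):
    """Stack the columns of M into one long vector (the X^v operation)."""
    return M.reshape(-1, order="F")

# (B' kron A) X^v versus (AXB)^v.
lhs = np.kron(B.T, A) @ vec(X)
rhs = vec(A @ X @ B)
print(np.allclose(lhs, rhs))

# Centering matrix H and the divisor-n sample covariance S = X' H X / n.
H = np.eye(n) - np.ones((n, n)) / n
S = X.T @ H @ X / n
print(np.allclose(S, np.cov(X, rowvar=False, bias=True)))
```

The `order="F"` argument is what makes `reshape` stack columns rather than rows; with the default row-major order the Kronecker identity would need $A \otimes B'$ instead.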
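Quadratic functions of the form $X'CX$ are an ingredient in many multivariate test statistics.
def'n: $M$ ($p \times p$) $= X'X$, where $X$ ($m \times p$) is a data matrix from $N_p(0, \Sigma)$, has a Wishart distribution with scale matrix $\Sigma$ and degrees of freedom $m$. Shorthand: $M \sim W_p(\Sigma, m)$.
Note that the $ij$th element of $X'X$ is simply $x_{(i)}'x_{(j)} = \sum_{k=1}^m x_{ki} x_{kj}$. The $ij$th element of $x_k x_k'$ is $x_{ki} x_{kj}$. Therefore $X'X = \sum_{k=1}^m x_k x_k'$. Since $E(x_k) = 0$,
$E(M) = E\left[ \sum_{k=1}^m x_k x_k' \right] = \sum_{k=1}^m \Sigma = m\Sigma$.

Quadratic form involving Wishart
thm: Let $B \in \mathbb{R}^{p \times q}$ and $M \sim W_p(\Sigma, m)$. Then $B'MB \sim W_q(B'\Sigma B, m)$.
Proof: Let $Y = XB$. The result 3 slides back gives us that $Y$ is from $N_q(0, B'\Sigma B)$. Then the def'n of Wishart tells us $Y'Y = B'X'XB = B'MB \sim W_q(B'\Sigma B, m)$. $\square$

Simple results that follow this theorem
Corollary: Diagonal submatrices of $M$ (square matrices that share part of a diagonal with $M$) have a Wishart distribution.
Corollary: $m_{ii} \sim \sigma_{ii}\chi^2_m$.
Corollary: $\Sigma^{-1/2} M \Sigma^{-1/2} \sim W_p(I_p, m)$.
Corollary: $M \sim W_p(I_p, m)$ and $B$ ($p \times q$) with $B'B = I_q$, then $B'MB \sim W_q(I_q, m)$.
Corollary: $M \sim W_p(\Sigma, m)$ and $a \neq 0$ with $a'\Sigma a \neq 0$ $\Rightarrow$ $a'Ma / a'\Sigma a \sim \chi^2_m$.
These all use different $B$ in the theorem on the previous slide plus [...].

Wisharts add!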
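thm: $M_1 \sim W_p(\Sigma, m_1)$ indep. of $M_2 \sim W_p(\Sigma, m_2)$ $\Rightarrow$ $M_1 + M_2 \sim W_p(\Sigma, m_1 + m_2)$.
Proof: Let $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$. Then $M_1 + M_2 = X'X$. Now use the def'n of Wishart. We are just adding $m_2$ more independent rows. $\square$

Cochran's theorem
thm: If $X$ ($n \times p$) is a data matrix from $N_p(0, \Sigma)$ and $C$ ($n \times n$) is symmetric w/ eigenvalues $\lambda_1, \ldots, \lambda_n$ then
(a) $X'CX \overset{D}{=} \sum_{i=1}^n \lambda_i M_i$ where $M_1, \ldots, M_n \overset{iid}{\sim} W_p(\Sigma, 1)$.
(b) $X'CX \sim W_p(\Sigma, r)$ $\Leftrightarrow$ $C$ idempotent, where $r = \mathrm{tr}\, C = \mathrm{rank}\, C$.
(c) $nS \sim W_p(\Sigma, n - 1)$.
Proof: The spectral decomposition of $C$ is $C = [\gamma_1 \cdots \gamma_n]\, \Lambda\, [\gamma_1 \cdots \gamma_n]' = \sum_{i=1}^n \lambda_i \gamma_i \gamma_i'$. Then $X'CX = \sum_{i=1}^n \lambda_i [X'\gamma_i][X'\gamma_i]' = \sum_{i=1}^n \lambda_i [\gamma_i'X]'[\gamma_i'X]$. The general transformation theorem ($A = \gamma_i'$ and $B = I_p$) tells us that $\gamma_i'X$ is from $N_p(0, \Sigma)$, so (a) follows from the def'n of Wishart. Part (b): $C$ idempotent $\Rightarrow$ there are $r$ eigenvalues $\lambda_i = 1$ and $n - r$ eigenvalues $\lambda_i = 0$, hence $\mathrm{tr}\, C = \lambda_1 + \cdots + \lambda_n = r$. Now use part (a). For part (c) note that $H$ is idempotent and of rank $n - 1$. $\square$
This is a biggie. Lots of stuff that will be used [...].

Drum roll...
thm: If $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$ then $\bar{x} \sim N_p(\mu, \tfrac{1}{n}\Sigma)$, $nS \sim W_p(\Sigma, n - 1)$, and $\bar{x}$ is indep. of $S$.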
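This is a generalization of the univariate $p = 1$ case where $\bar{x} \sim N(\mu, \tfrac{\sigma^2}{n})$ indep. of $ns^2/\sigma^2 \sim \chi^2_{n-1}$. This latter result is used to cook up a $t_{n-1}$ distribution:
$\dfrac{\bar{x} - \mu}{\sqrt{s^2/(n-1)}} \sim t_{n-1}$,
by def'n (here $s^2$ has divisor $n$, consistent with $ns^2/\sigma^2 \sim \chi^2_{n-1}$). We'll shortly generalize this to $p$ dimensions, but first one last [...].

Generalization of partitioning sums of squares
Here is Craig's theorem.
thm: $X$ is from $N_p(\mu, \Sigma)$ and $C_1, \ldots, C_k$ are symmetric, then $X'C_1X, \ldots, X'C_kX$ are indep. if $C_r C_s = 0$ for all $r \neq s$.
Proof: Let's do it for two projection matrices. Write $X'C_1X = X'M_1\Lambda_1\Lambda_1M_1'X$ and $X'C_2X = X'M_2\Lambda_2\Lambda_2M_2'X$. Note that $\Lambda_i\Lambda_i = \Lambda_i$ as the e-values are either 1 or 0. Theorem (slide 14) says $\Lambda_1M_1'X$ indep. of $\Lambda_2M_2'X$ $\Leftrightarrow$ $[\Lambda_1M_1'][\Lambda_2M_2']' = \Lambda_1M_1'M_2\Lambda_2 = 0$. But $0 = C_1C_2 = M_1\Lambda_1M_1'M_2\Lambda_2M_2'$ $\Rightarrow$ $\Lambda_1M_1'M_2\Lambda_2 = 0$. $\square$
This will come in handy in finding the sampling distribution of common test statistics [...].

Hotelling's $T^2$
Recall, using obvious notation, $N(0,1) / \sqrt{\chi^2_\nu / \nu} \sim t_\nu$. Used for one and two-sample $t$ tests for univariate outcomes.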
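The partitioning of sums of squares behind Craig's theorem above can be illustrated with the two projection matrices $C_1 = \tfrac{1}{n}1_n1_n'$ and $C_2 = H = I_n - C_1$, which satisfy $C_1C_2 = 0$. The sketch below is a minimal, assumed example with arbitrary simulated data; it checks the algebraic decomposition $X'X = X'C_1X + X'C_2X = n\bar{x}\bar{x}' + nS$, not the independence itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 3                          # arbitrary illustrative dimensions
X = rng.normal(size=(n, p))

ones = np.ones((n, 1))
C1 = ones @ ones.T / n               # projection onto the span of 1_n
C2 = np.eye(n) - C1                  # centering matrix H

xbar = X.mean(axis=0)

total = X.T @ X                      # X'X
between = X.T @ C1 @ X               # equals n * xbar xbar'
within = X.T @ C2 @ X                # equals n * S (S with divisor n)

print(np.allclose(C1 @ C2, 0.0))               # the condition C_1 C_2 = 0
print(np.allclose(total, between + within))    # sums of squares partition
```

Since $C_1 + C_2 = I_n$, the decomposition holds for any data matrix; normality is only needed for the independence claim of the theorem.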
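We'll now generalize this.
def'n: Let $d \sim N_p(0, I_p)$ indep. of $M \sim W_p(I_p, m)$. Then $m\, d'M^{-1}d \sim T^2(p, m)$.
thm: Let $x \sim N_p(\mu, \Sigma)$ indep. of $M \sim W_p(\Sigma, m)$. Then $m (x - \mu)'M^{-1}(x - \mu) \sim T^2(p, m)$.
Proof: Take $d^* = \Sigma^{-1/2}(x - \mu)$ and $M^* = \Sigma^{-1/2} M \Sigma^{-1/2}$ and use the def'n of $T^2$. $\square$
Corollary: $\bar{x}$ and $S$ are the sample mean and covariance from $x_1, \ldots, x_n \overset{iid}{\sim} N_p(\mu, \Sigma)$ $\Rightarrow$ $(n - 1)(\bar{x} - \mu)'S^{-1}(\bar{x} - \mu) \sim T^2(p, n - 1)$.
Proof: Substitute $M = nS$, $m = n - 1$, and $\sqrt{n}(\bar{x} - \mu)$ for $x - \mu$ in the theorem above. $\square$

Hotelling's $T^2$ is a scaled $F$
thm: $T^2(p, m) = \dfrac{mp}{m - p + 1} F_{p,\, m-p+1}$.
To prove this we need some lemmas. Let $M \sim W_p(\Sigma, m)$ and take
$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$
where $M_{11} \in \mathbb{R}^{a \times a}$ and $M_{22} \in \mathbb{R}^{b \times b}$ and $a + b = p$. Further, let $M_{22.1} = M_{22} - M_{21}M_{11}^{-1}M_{12}$.

Proof, Hotelling's $T^2$ is a scaled $F$
thm: Let $M \sim W_p(\Sigma, m)$ where $m > p$. Then $M_{22.1} \sim W_b(\Sigma_{22.1}, m - a)$.
Proof: Let $X = [X_1\ X_2]$, so
$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} = X'X = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}$.
Then $M_{22.1} = X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2 = X_2'PX_2 = X_{2.1}'PX_{2.1}$, where $P = I_m - X_1(X_1'X_1)^{-1}X_1'$ is the orthogonal projection matrix onto $C(X_1)^\perp$ and $X_{2.1} = X_2 - X_1\Sigma_{11}^{-1}\Sigma_{12}$. The theorem on slide 14 tells us $X_{2.1} | X_1$ is from $N_b(0, \Sigma_{22.1})$ (not in the book). So Cochran's theorem tells us $M_{22.1} | X_1 \sim W_b(\Sigma_{22.1}, m - a)$.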
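This doesn't depend on $X_1$ so it's the marginal dist'n as well. $\square$

Proof, Hotelling's $T^2$ is a scaled $F$
lemma: If $M \sim W_p(\Sigma, m)$, $m > p$, then $\dfrac{1}{[M^{-1}]_{pp}} \sim \dfrac{1}{[\Sigma^{-1}]_{pp}} \chi^2_{m-p+1}$.
Proof: In general, for partitioned matrices,
$\begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}^{-1} = \begin{bmatrix} (M_{11} - M_{12}M_{22}^{-1}M_{21})^{-1} & -M_{11}^{-1}M_{12}(M_{22} - M_{21}M_{11}^{-1}M_{12})^{-1} \\ -M_{22}^{-1}M_{21}(M_{11} - M_{12}M_{22}^{-1}M_{21})^{-1} & (M_{22} - M_{21}M_{11}^{-1}M_{12})^{-1} \end{bmatrix}$.
Now let $M_{11}$ be the upper left $(p - 1) \times (p - 1)$ submatrix of $M$ and $m_{22}$ the lower right $1 \times 1$ scalar matrix. Then, where $\sigma_{22.1} = 1/[\Sigma^{-1}]_{pp}$,
$\dfrac{1}{[M^{-1}]_{pp}} = m_{22.1} \sim W_1(\sigma_{22.1},\ m - (p - 1)) = \sigma_{22.1}\chi^2_{m-p+1}$. $\square$
thm: If $M \sim W_p(\Sigma, m)$, $m > p$, then $\dfrac{a'\Sigma^{-1}a}{a'M^{-1}a} \sim \chi^2_{m-p+1}$.
Proof: Let $A = [a_{(1)} \cdots a_{(p-1)}\ a]$ be nonsingular. Then $N = A^{-1}M(A^{-1})' \sim W_p(A^{-1}\Sigma(A^{-1})', m)$. So
$\dfrac{1}{[N^{-1}]_{pp}} = \dfrac{1}{[A'M^{-1}A]_{pp}} = \dfrac{1}{a'M^{-1}a} \sim \dfrac{1}{a'\Sigma^{-1}a}\chi^2_{m-p+1}$,
noting that the $pp$th element of $[A^{-1}\Sigma(A^{-1})']^{-1}$ is $a'\Sigma^{-1}a$. $\square$

Proof, Hotelling's $T^2$ is a scaled $F$
Recall $m\, d'M^{-1}d \sim T^2(p, m)$ where $d \sim N_p(0, I_p)$ indep. of $M \sim W_p(I_p, m)$. Given $d$, $\beta = \dfrac{d'd}{d'M^{-1}d} \sim \chi^2_{m-p+1}$ (last slide). Since this distribution does not depend on $d$, it is the marginal dist'n as well, and $\beta$ is independent of $d$. Then
$m\, d'M^{-1}d = m\, \dfrac{d'd}{d'd / d'M^{-1}d} = m\, \dfrac{\chi^2_p}{\chi^2_{m-p+1}} = \dfrac{mp}{m - p + 1} F_{p,\, m-p+1}$. $\square$
Corollary: $\bar{x}$ and $S$ are the sample mean and covariance from $N_p(\mu, \Sigma)$, then $\dfrac{n - p}{p}(\bar{x} - \mu)'S^{-1}(\bar{x} - \mu) \sim F_{p,\, n-p}$.

Two more distributional results
Corollary: $|M| / |M + dd'| \sim B(\tfrac{1}{2}(m - p + 1), \tfrac{p}{2})$.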
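The scaled-$F$ corollary for the one-sample statistic $(\bar{x} - \mu)'S^{-1}(\bar{x} - \mu)$ above can be sketched as follows. The function name `hotelling_one_sample` and the simulated data are illustrative assumptions; $S$ uses divisor $n$, as in these slides. A p-value would come from the upper tail of the $F_{p,\,n-p}$ distribution, which is left out here to keep the example NumPy-only.

```python
import numpy as np

def hotelling_one_sample(X, mu0):
    """Return (T2, F) for testing that the mean of the rows of X equals mu0.

    T2 = (n - 1) (xbar - mu0)' S^{-1} (xbar - mu0)   ~ T^2(p, n - 1) under H0
    F  = (n - p)/p (xbar - mu0)' S^{-1} (xbar - mu0) ~ F_{p, n-p}    under H0
    S is the divisor-n sample covariance. (Hypothetical helper.)
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)
    diff = xbar - mu0
    quad = diff @ np.linalg.solve(S, diff)   # (xbar - mu0)' S^{-1} (xbar - mu0)
    return (n - 1) * quad, (n - p) / p * quad

# Made-up data: 20 draws from N_3(0, I), tested against mu0 = 0.
rng = np.random.default_rng(3)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=20)
T2, F = hotelling_one_sample(X, mu0=np.zeros(3))
print(T2, F)
```

If the unbiased covariance $S_u = \tfrac{n}{n-1}S$ is used instead, the same statistic reads $T^2 = n(\bar{x} - \mu)'S_u^{-1}(\bar{x} - \mu)$, which is the form found in many textbooks.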