Lecture 8: Properties of Maximum Likelihood Estimation (MLE)


ECE 645: Estimation Theory, Spring 2015
Instructor: Prof. Stanley H. Chan
(LaTeX prepared by Haiguang Wen)
April 27, 2015

This lecture note is based on ECE 645 (Spring 2015) by Prof. Stanley H. Chan in the School of Electrical and Computer Engineering at Purdue University.

1 Efficiency of MLE

Maximum Likelihood Estimation (MLE) is a widely used statistical estimation method. In this lecture, we will study its properties: efficiency, consistency and asymptotic normality.

MLE is a method for estimating parameters of a statistical model. Given the distribution $f(y; \theta)$ of a statistical model with unknown deterministic parameter $\theta$, the MLE estimates the parameter by maximizing the probability $f(y; \theta)$ of the observations $y$:

    $\hat{\theta}(y) = \arg\max_{\theta} f(y; \theta)$.    (1)
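As a quick illustration of Equation (1), here is a minimal numerical sketch (my own addition, not part of the original notes): it computes the MLE by minimizing the negative log-likelihood with a generic optimizer. The model, data, and true parameter value are made up for demonstration, assuming i.i.d. $\mathcal{N}(\theta, 1)$ samples with true $\theta = 2$.

    # Minimal sketch: numerical MLE for i.i.d. N(theta, 1) data.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(0)
    y = rng.normal(loc=2.0, scale=1.0, size=100)   # observations

    def neg_log_likelihood(theta):
        # -log f(y; theta), up to an additive constant
        return 0.5 * np.sum((y - theta) ** 2)

    res = minimize_scalar(neg_log_likelihood, bounds=(-10.0, 10.0), method="bounded")
    print(res.x, y.mean())   # numerical MLE agrees with the sample mean

For this Gaussian model the numerical optimum coincides with the closed-form MLE derived in Example 1 below, namely the sample mean.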

The Cramér-Rao Lower Bound (CRLB) was introduced in Lecture 7. Briefly, the CRLB describes a lower bound on the variance of estimators of the deterministic parameter $\theta$. That is,

    $\mathrm{Var}(\hat{\theta}(Y)) \geq \frac{\left(\frac{\partial}{\partial\theta} E[\hat{\theta}(Y)]\right)^2}{I(\theta)}$,    (2)

where $I(\theta)$ is the Fisher information, which measures the information carried by the observable random variable $Y$ about the unknown parameter $\theta$. For an unbiased estimator $\hat{\theta}(Y)$, Equation (2) simplifies to

    $\mathrm{Var}(\hat{\theta}(Y)) \geq \frac{1}{I(\theta)}$,    (3)

which means the variance of any unbiased estimator is at least the inverse of the Fisher information.
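The Fisher information can also be read as the variance of the score $\frac{\partial}{\partial\theta}\log f(Y;\theta)$. The following Monte Carlo check of that identity is my own illustration (not part of the notes), using the Gaussian model of Example 1 below, for which $I(\theta) = n/\sigma^2$.

    # Sketch: estimate I(theta) as the sample variance of the score,
    # assuming (for illustration) n i.i.d. N(theta, sigma^2) samples.
    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma, n, trials = 1.0, 2.0, 10, 200_000

    Y = rng.normal(theta, sigma, size=(trials, n))
    score = np.sum(Y - theta, axis=1) / sigma**2   # d/dtheta log f(Y; theta)
    print(score.var())      # ~ 2.5
    print(n / sigma**2)     # closed form: n / sigma^2 = 2.5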

1.1 Efficient Estimator

From the discussion above, we know that the variance of an estimator $\hat{\theta}(y)$ cannot be lower than the CRLB. So any estimator whose variance equals the lower bound is considered an efficient estimator.

Definition 1. An estimator $\hat{\theta}(y)$ is efficient if it achieves equality in Equation (3).

Example 1. $Y = \{Y_1, Y_2, \ldots, Y_n\}$ are i.i.d. Gaussian random variables with distribution $\mathcal{N}(\theta, \sigma^2)$. Determine the maximum likelihood estimator of $\theta$. Is the estimator efficient?

Solution: Let $y = \{y_1, y_2, \ldots, y_n\}$ be the observation. Then

    $f(y; \theta) = \prod_{k=1}^{n} f(y_k; \theta) = \prod_{k=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y_k - \theta)^2}{2\sigma^2}\right\} = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left\{-\sum_{k=1}^{n} \frac{(y_k - \theta)^2}{2\sigma^2}\right\}$.

Taking the log of both sides, we have

    $\log f(y; \theta) = -\frac{n}{2}\log(2\pi\sigma^2) - \sum_{k=1}^{n} \frac{(y_k - \theta)^2}{2\sigma^2}$.

Since $\log f(y; \theta)$ is a quadratic concave function of $\theta$, we can obtain the MLE by solving the equation

    $\frac{\partial}{\partial\theta} \log f(y; \theta) = \sum_{k=1}^{n} \frac{y_k - \theta}{\sigma^2} = 0$.

So the MLE is

    $\hat{\theta}_{\mathrm{MLE}}(y) = \frac{1}{n}\sum_{k=1}^{n} y_k$.

Now let us check whether the estimator is efficient or not. It is easy to check that the MLE is an unbiased estimator, $E[\hat{\theta}_{\mathrm{MLE}}(Y)] = \theta$. To determine the CRLB, we need to calculate the Fisher information of $\theta$:

    $I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2}\log f(Y; \theta)\right] = \frac{n}{\sigma^2}$.    (4)

According to Equation (3), we have

    $\mathrm{Var}(\hat{\theta}_{\mathrm{MLE}}(Y)) \geq \frac{1}{I(\theta)} = \frac{\sigma^2}{n}$,    (5)

and the variance of the MLE is

    $\mathrm{Var}(\hat{\theta}_{\mathrm{MLE}}(Y)) = \mathrm{Var}\left(\frac{1}{n}\sum_{k=1}^{n} Y_k\right) = \frac{\sigma^2}{n}$.    (6)

So CRLB equality is achieved, and thus the MLE is efficient.
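A Monte Carlo sanity check of Equations (5) and (6) can be written in a few lines. This sketch is my own illustration (the parameter values are arbitrary), not part of the original notes:

    # Sketch: verify empirically that Var(theta_hat_MLE) = sigma^2 / n.
    import numpy as np

    rng = np.random.default_rng(1)
    theta, sigma, n, trials = 0.5, 3.0, 20, 200_000

    Y = rng.normal(theta, sigma, size=(trials, n))
    mle = Y.mean(axis=1)        # theta_hat_MLE for each trial
    print(mle.var())            # ~ 0.45
    print(sigma**2 / n)         # CRLB = 9 / 20 = 0.45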

1.2 Minimum Variance Unbiased Estimator (MVUE)

Recall that a Minimum Variance Unbiased Estimator (MVUE) is an unbiased estimator whose variance is no larger than that of any other unbiased estimator for all possible values of the parameter. That is,

    $\mathrm{Var}(\hat{\theta}_{\mathrm{MVUE}}(Y)) \leq \mathrm{Var}(\hat{\theta}(Y))$    (7)

for any unbiased estimator $\hat{\theta}(Y)$ and any $\theta$.

Proposition 1 (Unbiased and efficient estimators). If an estimator $\hat{\theta}(y)$ is unbiased and efficient, then it must be MVUE.

Proof. If $\hat{\theta}(y)$ is efficient, then according to the CRLB we have

    $\mathrm{Var}(\hat{\theta}(Y)) \leq \mathrm{Var}(\tilde{\theta}(Y))$    (8)

for any unbiased estimator $\tilde{\theta}(Y)$. Therefore, $\hat{\theta}(Y)$ must have minimum variance (MV). Since $\hat{\theta}(Y)$ is also unbiased, it is an MVUE.

Remark: The converse of the proposition is not true in general. That is, an MVUE does NOT need to be efficient. Here is a counter-example.

Example 2. Suppose that $Y = \{Y_1, Y_2, \ldots, Y_n\}$ are i.i.d. exponential random variables with unknown mean $1/\lambda$. Find the MLE and MVUE of $\lambda$. Are these estimators efficient?

Solution: Let $y = \{y_1, y_2, \ldots, y_n\}$ be the observation. Then

    $f(y; \lambda) = \prod_{k=1}^{n} f(y_k; \lambda) = \prod_{k=1}^{n} \lambda \exp\{-\lambda y_k\} = \lambda^n \exp\left\{-\lambda \sum_{k=1}^{n} y_k\right\}$.    (9)

Taking the log of both sides, we have

    $\log f(y; \lambda) = n\log \lambda - \lambda \sum_{k=1}^{n} y_k$.

Since $\log f(y; \lambda)$ is a concave function of $\lambda$, we can obtain the MLE by solving the following equation.

    $\frac{\partial}{\partial\lambda} \log f(y; \lambda) = \frac{n}{\lambda} - \sum_{k=1}^{n} y_k = 0$.

So the MLE is

    $\hat{\lambda}_{\mathrm{MLE}}(y) = \frac{n}{\sum_{k=1}^{n} y_k}$.    (10)

To calculate the CRLB, we need to calculate $E[\hat{\lambda}_{\mathrm{MLE}}(Y)]$ and $\mathrm{Var}(\hat{\lambda}_{\mathrm{MLE}}(Y))$. Let $T(y) = \sum_{k=1}^{n} y_k$. Then, by the moment generating function, we can show that the distribution of $T(Y)$ is the Erlang distribution:

    $f_T(t) = \frac{\lambda^n t^{n-1}}{(n-1)!}\, e^{-\lambda t}$.    (11)

So we have

    $E[\hat{\lambda}_{\mathrm{MLE}}(T(Y))] = \int_0^\infty \frac{n}{t} \cdot \frac{\lambda^n t^{n-1}}{(n-1)!}\, e^{-\lambda t}\, dt = \frac{n\lambda}{n-1} \int_0^\infty \frac{(\lambda t)^{n-2}}{(n-2)!}\, e^{-\lambda t}\, \lambda\, dt = \frac{n\lambda}{n-1}$.    (12)

Therefore the MLE is a biased estimator of $\lambda$. Similarly, we can calculate the variance of the MLE as

    $\mathrm{Var}(\hat{\lambda}_{\mathrm{MLE}}(T(Y))) = E[\hat{\lambda}_{\mathrm{MLE}}^2(T(Y))] - E[\hat{\lambda}_{\mathrm{MLE}}(T(Y))]^2 = \frac{n^2\lambda^2}{(n-1)^2(n-2)}$.

The Fisher information is

    $I(\lambda) = -E\left[\frac{\partial^2}{\partial\lambda^2}\log f(Y; \lambda)\right] = \frac{n}{\lambda^2}$,

so the CRLB for this biased estimator is

    $\mathrm{Var}(\hat{\lambda}_{\mathrm{MLE}}(T(Y))) \geq \frac{\left(\frac{\partial}{\partial\lambda} E[\hat{\lambda}_{\mathrm{MLE}}(T(Y))]\right)^2}{I(\lambda)} = \frac{\left(n/(n-1)\right)^2}{n/\lambda^2} = \frac{n\lambda^2}{(n-1)^2}$.

CRLB equality does NOT hold, so the MLE is not efficient.
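The bias in Equation (12) and the variance above can be checked by simulation. The sketch below is my own illustration with arbitrary parameter values, not part of the lecture:

    # Sketch: the rate estimator lambda_hat = n / sum(y_k) is biased,
    # with mean n*lambda/(n-1) and variance n^2*lambda^2/((n-1)^2 (n-2)).
    import numpy as np

    rng = np.random.default_rng(2)
    lam, n, trials = 2.0, 5, 500_000

    Y = rng.exponential(scale=1.0 / lam, size=(trials, n))    # mean 1/lambda
    mle = n / Y.sum(axis=1)
    print(mle.mean(), n * lam / (n - 1))                      # ~ 2.5
    print(mle.var(), n**2 * lam**2 / ((n - 1)**2 * (n - 2)))  # ~ 2.083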

The distribution in Equation (9) belongs to the exponential family, and $T(y) = \sum_{k=1}^{n} y_k$ is a complete sufficient statistic. So the MLE can be expressed as $\hat{\lambda}_{\mathrm{MLE}}(T(y)) = n/T(y)$, which is a function of $T(y)$. However, the MLE is a biased estimator (Equation (12)). But we can construct an unbiased estimator based on the MLE, which is

    $\hat{\lambda}(T(y)) = \frac{n-1}{n}\,\hat{\lambda}_{\mathrm{MLE}}(T(y)) = \frac{n-1}{T(y)}$.

It is easy to check that

    $E[\hat{\lambda}(T(Y))] = E\left[\frac{n-1}{n}\,\hat{\lambda}_{\mathrm{MLE}}(T(Y))\right] = \frac{n-1}{n} \cdot \frac{n\lambda}{n-1} = \lambda$.

Since $\hat{\lambda}(T(y))$ is an unbiased estimator and it is a function of a complete sufficient statistic, $\hat{\lambda}(T(y))$ is the MVUE. So

    $\hat{\lambda}_{\mathrm{MVUE}}(T(y)) = \frac{n-1}{T(y)}$.    (13)

The variance of the MVUE is

    $\mathrm{Var}(\hat{\lambda}_{\mathrm{MVUE}}(T(Y))) = \mathrm{Var}\left(\frac{n-1}{n}\,\hat{\lambda}_{\mathrm{MLE}}(T(Y))\right) = \frac{(n-1)^2}{n^2} \cdot \frac{n^2\lambda^2}{(n-1)^2(n-2)} = \frac{\lambda^2}{n-2}$,

whereas the CRLB is

    $\mathrm{Var}(\hat{\lambda}_{\mathrm{MVUE}}(T(Y))) \geq \frac{1}{I(\lambda)} = \frac{\lambda^2}{n}$.

Therefore, the MVUE is NOT an efficient estimator.
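Continuing the simulation sketch above (again my own illustration, with arbitrary parameters), one can check that $(n-1)/T$ is unbiased while its variance $\lambda^2/(n-2)$ still sits strictly above the CRLB $\lambda^2/n$:

    # Sketch: the bias-corrected estimator (n-1)/T is unbiased, but its
    # variance lambda^2/(n-2) remains above the CRLB lambda^2/n.
    import numpy as np

    rng = np.random.default_rng(3)
    lam, n, trials = 2.0, 5, 500_000

    T = rng.exponential(scale=1.0 / lam, size=(trials, n)).sum(axis=1)
    mvue = (n - 1) / T
    print(mvue.mean())           # ~ lam = 2.0 (unbiased)
    print(mvue.var())            # ~ lam^2 / (n - 2) = 1.33
    print(lam**2 / n)            # CRLB = 0.8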

2 Consistency of MLE

Definition 2. Let $\{Y_1, \ldots, Y_n\}$ be a sequence of observations. Let $\hat{\theta}_n$ be the estimator using $\{Y_1, \ldots, Y_n\}$. We say that $\hat{\theta}_n$ is consistent if $\hat{\theta}_n \xrightarrow{p} \theta$, i.e., for every $\epsilon > 0$,

    $P(|\hat{\theta}_n - \theta| > \epsilon) \to 0, \quad \text{as } n \to \infty$.    (14)

Remark: A sufficient condition to have Equation (14) is that

    $E[(\hat{\theta}_n - \theta)^2] \to 0, \quad \text{as } n \to \infty$.    (15)

Proof. According to Chebyshev's inequality, we have

    $P(|\hat{\theta}_n - \theta| \geq \epsilon) \leq \frac{E[(\hat{\theta}_n - \theta)^2]}{\epsilon^2}$.    (16)

Since $E[(\hat{\theta}_n - \theta)^2] \to 0$, we have

    $0 \leq P(|\hat{\theta}_n - \theta| \geq \epsilon) \leq \frac{E[(\hat{\theta}_n - \theta)^2]}{\epsilon^2} \to 0$,

and therefore $P(|\hat{\theta}_n - \theta| > \epsilon) \to 0$ as $n \to \infty$.

Example 3. $\{Y_1, Y_2, \ldots, Y_n\}$ are i.i.d. Gaussian random variables with distribution $\mathcal{N}(\theta, \sigma^2)$. Is the MLE using $\{Y_1, Y_2, \ldots, Y_n\}$ consistent?

Solution: From Example 1, we know that the MLE is

    $\hat{\theta}_n = \frac{1}{n}\sum_{k=1}^{n} Y_k$.

Then $E[(\hat{\theta}_n - \theta)^2] = \mathrm{Var}(\hat{\theta}_n) = \frac{\sigma^2}{n}$ (see Equation (6)), so $E[(\hat{\theta}_n - \theta)^2] \to 0$. Therefore $\hat{\theta}_n \xrightarrow{p} \theta$, i.e., $\hat{\theta}_n$ is consistent.

In fact, the result of the example above holds for any distribution. The following proposition states this result.

Proposition 2 (MLE is consistent). Let $\{Y_1, \ldots, Y_n\}$ be a sequence of observations where $Y_k \overset{\text{i.i.d.}}{\sim} f_\theta(y)$. Then the MLE of $\theta$ is consistent.

Proof. (This proof is partially correct. See Levy, Chapter 4.5, for a complete discussion.) The MLE of $\theta$ is

    $\hat{\theta}_n = \arg\max_{\theta'} \prod_{k=1}^{n} f_{\theta'}(y_k) = \arg\max_{\theta'} \log\left(\prod_{k=1}^{n} f_{\theta'}(y_k)\right) = \arg\max_{\theta'} \frac{1}{n}\sum_{k=1}^{n} \log f_{\theta'}(y_k) = \arg\max_{\theta'} L_n(\theta')$,

where $L_n(\theta') = \frac{1}{n}\sum_{k=1}^{n} \log f_{\theta'}(y_k)$. Let $\phi_{\theta'}(y_k) = \log\frac{f_{\theta'}(y_k)}{f_\theta(y_k)}$. Then we have

    $E_\theta[\phi_{\theta'}(Y_k)] \overset{\mathrm{def}}{=} \int \log\frac{f_{\theta'}(y_k)}{f_\theta(y_k)}\, f_\theta(y_k)\, dy_k = -D(f_\theta \,\|\, f_{\theta'})$,

where $D(\cdot\,\|\,\cdot)$ denotes the Kullback-Leibler divergence. According to the weak law of large numbers (WLLN), we have

    $\frac{1}{n}\sum_{k=1}^{n} \phi_{\theta'}(y_k) \xrightarrow{p} -D(f_\theta \,\|\, f_{\theta'})$.    (17)

Since $\hat{\theta}_n$ is the MLE, it maximizes $L_n(\theta')$, so

    $0 \leq L_n(\hat{\theta}_n) - L_n(\theta) = \frac{1}{n}\sum_{k=1}^{n} \log f_{\hat{\theta}_n}(y_k) - \frac{1}{n}\sum_{k=1}^{n} \log f_\theta(y_k) = \frac{1}{n}\sum_{k=1}^{n} \phi_{\hat{\theta}_n}(y_k) = \left(\frac{1}{n}\sum_{k=1}^{n} \phi_{\hat{\theta}_n}(y_k) + D(f_\theta \,\|\, f_{\hat{\theta}_n})\right) - D(f_\theta \,\|\, f_{\hat{\theta}_n})$.

Therefore,

    $D(f_\theta \,\|\, f_{\hat{\theta}_n}) \leq \frac{1}{n}\sum_{k=1}^{n} \phi_{\hat{\theta}_n}(y_k) + D(f_\theta \,\|\, f_{\hat{\theta}_n})$.

By Equation (17), the right-hand side converges to 0 in probability, so

    $0 \leq D(f_\theta \,\|\, f_{\hat{\theta}_n}) \leq \frac{1}{n}\sum_{k=1}^{n} \phi_{\hat{\theta}_n}(y_k) + D(f_\theta \,\|\, f_{\hat{\theta}_n}) \xrightarrow{p} 0$.

Hence we must have $D(f_\theta \,\|\, f_{\hat{\theta}_n}) \xrightarrow{p} 0$, and then $\hat{\theta}_n \xrightarrow{p} \theta$.
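To make Proposition 2 concrete, here is a small numerical illustration (my own sketch, not from the notes) using the Gaussian model of Example 3: the empirical mean-squared error of $\hat{\theta}_n$ shrinks like $\sigma^2/n$ as $n$ grows.

    # Sketch: consistency of the sample-mean MLE. The empirical MSE
    # E[(theta_hat_n - theta)^2] tracks sigma^2 / n and tends to 0.
    import numpy as np

    rng = np.random.default_rng(4)
    theta, sigma, trials = 1.0, 2.0, 5_000

    for n in (10, 100, 1000):
        Y = rng.normal(theta, sigma, size=(trials, n))
        mse = np.mean((Y.mean(axis=1) - theta) ** 2)
        print(n, mse, sigma**2 / n)   # MSE decreases roughly as 1/n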

3 Asymptotic Normality of MLE

The previous proposition only asserts that the MLE is consistent. However, it provides no information about the distribution of the MLE. The following theorem characterizes the asymptotic distribution.

Theorem (Asymptotic normality). Let $\{Y_1, \ldots, Y_n\}$ be a sequence of observations where $Y_k \overset{\text{i.i.d.}}{\sim} f_\theta(y)$. If $\hat{\theta}_n$ is the MLE of $\theta$, then

    $\sqrt{n}\,(\hat{\theta}_n - \theta) \xrightarrow{d} \mathcal{N}\left(0, \frac{1}{I(\theta)}\right)$,

where $I(\theta)$ is the Fisher information of a single observation. See E. L. Lehmann, Elements of Large-Sample Theory, Springer, 1999, for the proof.

Example 4. $\{Y_1, Y_2, \ldots, Y_n\}$ are i.i.d. Gaussian random variables with distribution $\mathcal{N}(\theta, \sigma^2)$. Find the asymptotic distribution of $\hat{\theta}_{\mathrm{MLE}}$.

Solution: Similar to Example 1, we can calculate the Fisher information of $\theta$ for a single observation:

    $I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2}\log f_\theta(Y)\right] = \frac{1}{\sigma^2}$.

We know that $\hat{\theta}_{\mathrm{MLE}} = \frac{1}{n}\sum_{k=1}^{n} Y_k$. So if $\hat{\theta}_n = \frac{1}{n}\sum_{k=1}^{n} Y_k$, then we have

    $\sqrt{n}\,(\hat{\theta}_n - \theta) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$.
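Finally, the asymptotic normality in Example 4 can be checked numerically. This sketch (my own illustration, with made-up parameters) standardizes the MLE and verifies that $\sqrt{n}\,(\hat{\theta}_n - \theta)$ has mean about 0 and standard deviation about $\sigma$:

    # Sketch: sqrt(n) * (theta_hat_n - theta) should be approximately
    # N(0, sigma^2) for large n (here sigma = 1.5).
    import numpy as np

    rng = np.random.default_rng(5)
    theta, sigma, n, trials = 0.0, 1.5, 500, 20_000

    Y = rng.normal(theta, sigma, size=(trials, n))
    z = np.sqrt(n) * (Y.mean(axis=1) - theta)
    print(z.mean(), z.std())   # ~ 0 and ~ sigma = 1.5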

