Maximum Likelihood Estimation - University of Washington

Maximum Likelihood Estimation

Eric Zivot
May 14, 2001. This version: November 15, 2009.

1 Maximum Likelihood Estimation

1.1 The Likelihood Function

Let $X_1,\ldots,X_n$ be an iid sample with probability density function (pdf) $f(x_i;\theta)$, where $\theta$ is a $(k \times 1)$ vector of parameters that characterize $f(x_i;\theta)$. For example, if $X_i \sim N(\mu,\sigma^2)$ then

$$f(x_i;\theta) = (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{1}{2\sigma^2}(x_i-\mu)^2\right)$$

and $\theta = (\mu,\sigma^2)'$. The joint density of the sample is, by independence, equal to the product of the marginal densities:

$$f(x_1,\ldots,x_n;\theta) = f(x_1;\theta)\cdots f(x_n;\theta) = \prod_{i=1}^{n} f(x_i;\theta).$$

The joint density is an $n$-dimensional function of the data $x_1,\ldots,x_n$ given the parameter vector $\theta$, and it satisfies

$$f(x_1,\ldots,x_n;\theta) \geq 0, \qquad \int\cdots\int f(x_1,\ldots,x_n;\theta)\,dx_1\cdots dx_n = 1.$$

The likelihood function is defined as the joint density treated as a function of the parameters $\theta$:

$$L(\theta|x_1,\ldots,x_n) = f(x_1,\ldots,x_n;\theta) = \prod_{i=1}^{n} f(x_i;\theta).$$

Notice that the likelihood function is a $k$-dimensional function of $\theta$ given the data $x_1,\ldots,x_n$. It is important to keep in mind that the likelihood function, being a function of $\theta$ and not the data, is not a proper pdf. It is always positive, but

$$\int\cdots\int L(\theta|x_1,\ldots,x_n)\,d\theta_1\cdots d\theta_k \neq 1.$$

If $X_1,\ldots,X_n$ are discrete random variables, then $f(x_1,\ldots,x_n;\theta) = \Pr(X_1 = x_1,\ldots,X_n = x_n)$ for a fixed value of $\theta$.

To simplify notation, let the vector $x = (x_1,\ldots,x_n)$ denote the observed sample. Then the joint pdf and likelihood function may be expressed as $f(x;\theta)$ and $L(\theta|x)$.

Example 1 (Bernoulli sampling). Let $X_i \sim \mathrm{Bernoulli}(\theta)$. That is, $X_i = 1$ with probability $\theta$ and $X_i = 0$ with probability $1-\theta$, where $0 \leq \theta \leq 1$. The pdf for $X_i$ is

$$f(x_i;\theta) = \theta^{x_i}(1-\theta)^{1-x_i}, \quad x_i = 0,1.$$

Let $X_1,\ldots,X_n$ be an iid sample with $X_i \sim \mathrm{Bernoulli}(\theta)$. The joint density/likelihood function is given by

$$f(x;\theta) = L(\theta|x) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum_{i=1}^{n} x_i}(1-\theta)^{n-\sum_{i=1}^{n} x_i}.$$

For a given value of $\theta$ and observed sample $x$, $f(x;\theta)$ gives the probability of observing the sample.
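Both claims above are easy to check numerically. The sketch below (a hypothetical sample and plain Python, not part of the original notes) evaluates the Bernoulli likelihood as a product of marginal pdfs, confirms it matches the closed form $\theta^{\sum x_i}(1-\theta)^{n-\sum x_i}$, and integrates $L(\theta|x)$ over $\theta$ to show that the likelihood does not integrate to one:

```python
import math

def likelihood(theta, x):
    """L(theta|x) for an iid Bernoulli(theta) sample, as the product of marginals."""
    return math.prod(theta**xi * (1 - theta)**(1 - xi) for xi in x)

x = [1, 0, 0, 1, 1]            # hypothetical observed sample, n = 5
theta = 0.4
n, s = len(x), sum(x)

# The product of marginal pdfs equals the closed form theta^s (1-theta)^(n-s).
closed_form = theta**s * (1 - theta)**(n - s)

# L(theta|x) is positive but does not integrate to 1 over theta; for this
# sample the exact integral is the Beta function B(s+1, n-s+1) = 1/60.
m = 100_000
integral = sum(likelihood((j + 0.5) / m, x) for j in range(m)) / m
print(closed_form, integral)
```

The midpoint-rule integral comes out near $1/60 \approx 0.0167$, far from one, which is why the likelihood is read as a function to be maximized rather than as a density over $\theta$.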

For example, suppose $n = 5$ and $x = (0,\ldots,0)$. Now some values of $\theta$ are more likely to have generated this sample than others. In particular, it is more likely that $\theta$ is close to zero than to one. To see this, note that the likelihood function for this sample is

$$L(\theta|(0,\ldots,0)) = (1-\theta)^5.$$

This function is illustrated in figure xxx. The likelihood function has a clear maximum at $\theta = 0$. That is, $\theta = 0$ is the value of $\theta$ that makes the observed sample $x = (0,\ldots,0)$ most likely (highest probability). Similarly, suppose $x = (1,\ldots,1)$. Then the likelihood function is

$$L(\theta|(1,\ldots,1)) = \theta^5,$$

which is illustrated in figure xxx. Now the likelihood function has a maximum at $\theta = 1$.

Example 2 (Normal sampling). Let $X_1,\ldots,X_n$ be an iid sample with $X_i \sim N(\mu,\sigma^2)$. The pdf for $X_i$ is

$$f(x_i;\theta) = (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{1}{2\sigma^2}(x_i-\mu)^2\right), \quad -\infty < \mu < \infty,\ \sigma^2 > 0,\ -\infty < x < \infty,$$

so that $\theta = (\mu,\sigma^2)'$. The likelihood function is given by

$$L(\theta|x) = \prod_{i=1}^{n} (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{1}{2\sigma^2}(x_i-\mu)^2\right) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\right).$$

Figure xxx illustrates the normal likelihood for a representative sample of size $n$. Notice that the likelihood has the same bell shape as a bivariate normal density. Suppose $\sigma^2 = 1$. Then

$$L(\theta|x) = L(\mu|x) = (2\pi)^{-n/2}\exp\left(-\frac{1}{2}\sum_{i=1}^{n}(x_i-\mu)^2\right).$$
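The shape of this likelihood is easy to explore numerically. The sketch below (my own simulated $\sigma^2 = 1$ sample, not the one behind figure xxx) evaluates $\ln L(\mu|x)$ over a grid of $\mu$ values; the grid maximizer lands at the sample mean, anticipating the algebraic argument that follows:

```python
import math, random

random.seed(0)
n = 25
x = [random.gauss(2.0, 1.0) for _ in range(n)]   # simulated sample, sigma^2 = 1

def log_lik(mu):
    # ln L(mu|x) for N(mu, 1): -(n/2) ln(2*pi) - (1/2) sum (x_i - mu)^2
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * sum((xi - mu) ** 2 for xi in x)

grid = [j / 1000 for j in range(0, 4001)]        # candidate mu values in [0, 4]
mu_hat = max(grid, key=log_lik)
xbar = sum(x) / n
print(mu_hat, xbar)                              # grid maximizer sits at the sample mean
```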

Now

$$\sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x}+\bar{x}-\mu)^2 = \sum_{i=1}^{n}\left[(x_i-\bar{x})^2 + 2(x_i-\bar{x})(\bar{x}-\mu) + (\bar{x}-\mu)^2\right] = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2,$$

since the cross-product term sums to zero, so that

$$L(\mu|x) = (2\pi)^{-n/2}\exp\left(-\frac{1}{2}\left[\sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2\right]\right).$$

Since both $(x_i-\bar{x})^2$ and $(\bar{x}-\mu)^2$ are positive, it is clear that $L(\mu|x)$ is maximized at $\mu = \bar{x}$. This is illustrated in figure xxx.

Example 3 (Linear regression model with normal errors). Consider the linear regression

$$y_i = \underset{(1\times k)}{x_i'}\,\underset{(k\times 1)}{\beta} + \varepsilon_i, \quad i = 1,\ldots,n, \qquad \varepsilon_i | x_i \sim \text{iid } N(0,\sigma^2).$$

The pdf of $\varepsilon_i | x_i$ is

$$f(\varepsilon_i|x_i;\sigma^2) = (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{1}{2\sigma^2}\varepsilon_i^2\right).$$

The Jacobian of the transformation from $\varepsilon_i$ to $y_i$ is one, so the pdf of $y_i|x_i$ is normal with mean $x_i'\beta$ and variance $\sigma^2$:

$$f(y_i|x_i;\theta) = (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{1}{2\sigma^2}(y_i - x_i'\beta)^2\right),$$

where $\theta = (\beta',\sigma^2)'$. Given an iid sample of $n$ observations, $y$ and $X$, the joint density of the sample is

$$f(y|X;\theta) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - x_i'\beta)^2\right) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)\right).$$

The log-likelihood function is then

$$\ln L(\theta|y,X) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta).$$

Example 4 (AR(1) model with normal errors). To be completed.

1.2 The Maximum Likelihood Estimator

Suppose we have a random sample from the pdf $f(x_i;\theta)$ and we are interested in estimating $\theta$. The previous examples motivate an estimator as the value of $\theta$ that makes the observed sample most likely. Formally, the maximum likelihood estimator, denoted $\hat{\theta}_{mle}$, is the value of $\theta$ that maximizes $L(\theta|x)$. That is, $\hat{\theta}_{mle}$ solves

$$\max_{\theta} L(\theta|x).$$

It is often quite difficult to directly maximize $L(\theta|x)$.
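The regression log-likelihood can be evaluated directly from residuals, since $(y - X\beta)'(y - X\beta)$ is just the sum of squared residuals. A minimal sketch with entirely hypothetical data and arbitrary candidate values for $\beta$ and $\sigma^2$ (not estimates from the notes):

```python
import math

# Hypothetical data: n = 4 observations, k = 2 regressors (intercept and slope).
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 3.5]]
y = [1.1, 2.0, 2.3, 3.9]
beta = [0.6, 0.9]        # arbitrary candidate value for beta
sigma2 = 0.25            # arbitrary candidate value for sigma^2
n = len(y)

# Residuals e_i = y_i - x_i' beta, so (y - X beta)'(y - X beta) = sum of e_i^2.
resid = [yi - sum(b * xij for b, xij in zip(beta, xi)) for xi, yi in zip(X, y)]
ssr = sum(e * e for e in resid)

# ln L = -(n/2) ln(2*pi) - (n/2) ln(sigma^2) - (1/(2*sigma^2)) (y - X beta)'(y - X beta)
loglik = -0.5 * n * math.log(2 * math.pi) - 0.5 * n * math.log(sigma2) - ssr / (2 * sigma2)
print(loglik)
```

The same number comes out of summing the $n$ individual normal log-densities $\ln f(y_i|x_i;\theta)$, which is the random-sampling decomposition used in the next section.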

It is usually much easier to maximize the log-likelihood function $\ln L(\theta|x)$. Since $\ln(\cdot)$ is a monotonic function, the value of $\theta$ that maximizes $\ln L(\theta|x)$ will also maximize $L(\theta|x)$. Therefore, we may also define $\hat{\theta}_{mle}$ as the value of $\theta$ that solves

$$\max_{\theta} \ln L(\theta|x).$$

With random sampling, the log-likelihood has the particularly simple form

$$\ln L(\theta|x) = \ln\left(\prod_{i=1}^{n} f(x_i;\theta)\right) = \sum_{i=1}^{n} \ln f(x_i;\theta).$$

Since the MLE is defined as a maximization problem, we would like to know the conditions under which we may determine the MLE using the techniques of calculus. Regularity of $f(x;\theta)$ provides a sufficient set of such conditions.
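Beyond the monotonicity argument, there is a practical reason to work with the log-likelihood: for even moderately large $n$, the raw likelihood is a product of many numbers in $(0,1)$ and underflows double-precision arithmetic, while the log-likelihood (a sum) stays well behaved. A sketch with a simulated Bernoulli sample (my own data, not from the notes):

```python
import math, random

random.seed(1)
n = 2000
x = [1 if random.random() < 0.3 else 0 for _ in range(n)]   # simulated Bernoulli(0.3) sample
s = sum(x)

# Raw likelihood at theta = 0.3: a product of 2000 numbers in (0,1) underflows to 0.0.
raw = 0.3**s * 0.7**(n - s)

# The log-likelihood is a finite sum, and easy to maximize over a grid of theta values.
def loglik(theta):
    return s * math.log(theta) + (n - s) * math.log(1 - theta)

theta_hat = max((j / 1000 for j in range(1, 1000)), key=loglik)
print(raw, theta_hat)   # raw likelihood is 0.0; theta_hat is near the sample frequency s/n
```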

We say that $f(x;\theta)$ is regular if

1. The support of the random variables $X$, $S_X = \{x : f(x;\theta) > 0\}$, does not depend on $\theta$;
2. $f(x;\theta)$ is at least three times differentiable with respect to $\theta$;
3. The true value of $\theta$ lies in a compact set $\Theta$.

If $f(x;\theta)$ is regular then we may find the MLE by differentiating $\ln L(\theta|x)$ and solving the first order conditions

$$\frac{\partial \ln L(\hat{\theta}_{mle}|x)}{\partial \theta} = 0.$$

Since $\theta$ is $(k\times 1)$, the first order conditions define $k$, potentially nonlinear, equations in $k$ unknown values:

$$\frac{\partial \ln L(\hat{\theta}_{mle}|x)}{\partial \theta} = \begin{pmatrix} \partial \ln L(\hat{\theta}_{mle}|x)/\partial \theta_1 \\ \vdots \\ \partial \ln L(\hat{\theta}_{mle}|x)/\partial \theta_k \end{pmatrix} = 0.$$

The vector of derivatives of the log-likelihood function is called the score vector and is denoted

$$S(\theta|x) = \frac{\partial \ln L(\theta|x)}{\partial \theta}.$$

By definition, the MLE satisfies $S(\hat{\theta}_{mle}|x) = 0$. Under random sampling, the score for the sample becomes the sum of the scores for each observation $x_i$:

$$S(\theta|x) = \sum_{i=1}^{n} \frac{\partial \ln f(x_i;\theta)}{\partial \theta} = \sum_{i=1}^{n} S(\theta|x_i),$$

where $S(\theta|x_i) = \partial \ln f(x_i;\theta)/\partial \theta$ is the score associated with $x_i$.

Example 5 (Bernoulli example continued). The log-likelihood function is

$$\ln L(\theta|x) = \ln\left(\theta^{\sum_{i=1}^{n} x_i}(1-\theta)^{n-\sum_{i=1}^{n} x_i}\right) = \sum_{i=1}^{n} x_i \ln(\theta) + \left(n - \sum_{i=1}^{n} x_i\right)\ln(1-\theta).$$

The score function for the Bernoulli log-likelihood is

$$S(\theta|x) = \frac{\partial \ln L(\theta|x)}{\partial \theta} = \frac{1}{\theta}\sum_{i=1}^{n} x_i - \frac{1}{1-\theta}\left(n - \sum_{i=1}^{n} x_i\right).$$

The MLE satisfies $S(\hat{\theta}_{mle}|x) = 0$, which after a little algebra produces the MLE

$$\hat{\theta}_{mle} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

Hence, the sample average is the MLE for $\theta$ in the Bernoulli model.

Example 6 (Normal example continued). Since the normal pdf is regular, we may determine the MLE for $\theta = (\mu,\sigma^2)$ by maximizing the log-likelihood

$$\ln L(\theta|x) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2.$$

The sample score is a $(2\times 1)$ vector given by

$$S(\theta|x) = \begin{pmatrix} \partial \ln L(\theta|x)/\partial \mu \\ \partial \ln L(\theta|x)/\partial \sigma^2 \end{pmatrix}$$
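The Bernoulli first order condition is simple enough to verify directly: plugging the sample average into the score function should give exactly zero. A minimal check with a hypothetical sample:

```python
x = [1, 0, 1, 1, 0, 0, 1, 1]       # hypothetical Bernoulli sample
n, s = len(x), sum(x)

def score(theta):
    # S(theta|x) = sum(x_i)/theta - (n - sum(x_i))/(1 - theta)
    return s / theta - (n - s) / (1 - theta)

theta_mle = s / n                   # the claimed closed-form MLE: the sample average
print(score(theta_mle))             # 0: the sample average solves S(theta|x) = 0
```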