1 Fisher Information - Florida State University

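Fisher Information
April 6, 2016
Debdeep Pati

1 Fisher Information

Assume $X \sim f(x \mid \theta)$ (pdf or pmf) with $\theta \in \Theta \subset \mathbb{R}$. Define
$$I_X(\theta) = E_\theta\left[\left(\frac{\partial}{\partial\theta}\log f(X \mid \theta)\right)^2\right],$$
where $\frac{\partial}{\partial\theta}\log f(X \mid \theta)$ is the derivative of the log-likelihood function evaluated at the true value $\theta$.

Fisher Information is meaningful for families of distributions which are regular:

1. Fixed support: $\{x : f(x \mid \theta) > 0\}$ is the same for all $\theta$.
2. $\frac{\partial}{\partial\theta}\log f(x \mid \theta)$ must exist and be finite for all $x$ and $\theta$.
3. If $E_\theta|W(X)| < \infty$ for all $\theta$, then
$$\left(\frac{\partial}{\partial\theta}\right)^k E_\theta W(X) = \left(\frac{\partial}{\partial\theta}\right)^k \int W(x) f(x \mid \theta)\,dx = \int W(x) \left(\frac{\partial}{\partial\theta}\right)^k f(x \mid \theta)\,dx.$$

Regular families

One-parameter exponential families. Cauchy location or scale family:
$$f(x \mid \theta) = \frac{1}{\pi\left(1 + (x-\theta)^2\right)}, \qquad f(x \mid \theta) = \frac{1}{\pi\theta\left(1 + (x/\theta)^2\right)},$$
and lots more.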
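As a minimal sketch of how the definition can be evaluated numerically, the Python snippet below approximates $E_\theta[(\frac{\partial}{\partial\theta}\log f(X \mid \theta))^2]$ for the Cauchy scale family listed above. The function names, the finite-difference step `h`, and the grid size `m` are illustrative assumptions and do not come from the notes. The closed form $1/(2\theta^2)$ printed for comparison is the standard value for this family; it is not stated in the notes.

```python
import math

def log_pdf(x, theta):
    """Log-density of the Cauchy scale family f(x|theta) = 1/(pi*theta*(1+(x/theta)^2))."""
    return -math.log(math.pi * theta * (1.0 + (x / theta) ** 2))

def score(x, theta, h=1e-5):
    """Central-difference approximation to d/dtheta log f(x|theta) (h is an assumed step)."""
    return (log_pdf(x, theta + h) - log_pdf(x, theta - h)) / (2.0 * h)

def fisher_information(theta, m=20000):
    """Approximate E_theta[score^2] with a midpoint rule.

    Substituting x = theta*tan(t) turns f(x|theta) dx into dt/pi on
    (-pi/2, pi/2), so the expectation is a plain average over a uniform grid in t.
    """
    total = 0.0
    for j in range(m):
        t = -math.pi / 2.0 + (j + 0.5) * math.pi / m
        total += score(theta * math.tan(t), theta) ** 2
    return total / m

theta = 2.0
info = fisher_information(theta)
print(info, 1.0 / (2.0 * theta ** 2))  # numerical value next to the closed form
```

The tangent substitution is a convenience for heavy-tailed densities; for other families a different quadrature or a Monte Carlo average of the squared score would play the same role.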

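When $X = (X_1, \ldots, X_n)$ has independent components, the joint density factors as
$$f(x \mid \theta) = \prod_{i=1}^n f_i(x_i \mid \theta),$$
where $f_i(\cdot \mid \theta)$ is the pdf (pmf) of $X_i$. Observe that
$$\frac{\partial}{\partial\theta}\log f(X \mid \theta) = \sum_{i=1}^n \frac{\partial}{\partial\theta}\log f_i(X_i \mid \theta)$$
and the random variables in the sum are independent. Thus
$$\mathrm{Var}_\theta\left[\frac{\partial}{\partial\theta}\log f(X \mid \theta)\right] = \sum_{i=1}^n \mathrm{Var}_\theta\left[\frac{\partial}{\partial\theta}\log f_i(X_i \mid \theta)\right],$$
so that $I_X(\theta) = \sum_{i=1}^n I_{X_i}(\theta)$, because the Fisher information equals the variance of the derivative of the log-likelihood.

4. If $X_1, X_2, \ldots, X_n$ are iid and $X = (X_1, X_2, \ldots, X_n)$, then $I_{X_i}(\theta) = I_{X_1}(\theta)$ for all $i$, so that $I_X(\theta) = n I_{X_1}(\theta)$.

5. An alternate formula for Fisher Information is
$$I_X(\theta) = -E_\theta\left(\frac{\partial^2}{\partial\theta^2}\log f(X \mid \theta)\right).$$

Proof: Abbreviate $\int f(x \mid \theta)\,dx$ as $\int f$, etc. Since $1 = \int f$, applying $\frac{\partial}{\partial\theta}$ to both sides,
$$0 = \frac{\partial}{\partial\theta}\int f = \int \frac{\partial f}{\partial\theta} = \int \frac{\partial f/\partial\theta}{f}\, f = \int \left(\frac{\partial}{\partial\theta}\log f\right) f.$$
Applying $\frac{\partial}{\partial\theta}$ again,
$$0 = \frac{\partial}{\partial\theta}\int \left(\frac{\partial}{\partial\theta}\log f\right) f = \int \frac{\partial}{\partial\theta}\left[\left(\frac{\partial}{\partial\theta}\log f\right) f\right] = \int \left(\frac{\partial^2}{\partial\theta^2}\log f\right) f + \int \left(\frac{\partial}{\partial\theta}\log f\right)\frac{\partial f}{\partial\theta}.$$
Noting that
$$\frac{\partial f}{\partial\theta} = \frac{\partial f/\partial\theta}{f}\, f = \left(\frac{\partial}{\partial\theta}\log f\right) f,$$
this becomes
$$0 = \int \left(\frac{\partial^2}{\partial\theta^2}\log f\right) f + \int \left(\frac{\partial}{\partial\theta}\log f\right)^2 f,$$
or
$$0 = E_\theta\left(\frac{\partial^2}{\partial\theta^2}\log f(X \mid \theta)\right) + I_X(\theta).$$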
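The additivity property and the second-derivative formula can be checked exactly on a small discrete model. The sketch below uses $n$ independent Bernoulli($p$) trials, which is an assumed example that does not appear in the notes, and sums over all $2^n$ outcomes. The helper names are hypothetical, and the single-trial information $1/(p(1-p))$ is the standard closed form for that model.

```python
import itertools
import math

def bernoulli_score(x, p):
    """d/dp log f(x|p) for f(x|p) = p^x (1-p)^(1-x)."""
    return x / p - (1 - x) / (1 - p)

def bernoulli_second_derivative(x, p):
    """d^2/dp^2 log f(x|p)."""
    return -x / p ** 2 - (1 - x) / (1 - p) ** 2

def joint_information(n, p):
    """Return (E[score^2], -E[second derivative]) for n independent Bernoulli(p)
    trials, computed exactly by enumerating all 2^n outcomes."""
    e_score_sq = 0.0
    e_second = 0.0
    for xs in itertools.product((0, 1), repeat=n):
        prob = math.prod(p if x == 1 else 1 - p for x in xs)
        joint_score = sum(bernoulli_score(x, p) for x in xs)
        joint_second = sum(bernoulli_second_derivative(x, p) for x in xs)
        e_score_sq += prob * joint_score ** 2
        e_second += prob * joint_second
    return e_score_sq, -e_second

n, p = 4, 0.3
by_definition, by_second_derivative = joint_information(n, p)
single = 1.0 / (p * (1.0 - p))  # information in one Bernoulli(p) trial
print(by_definition, by_second_derivative, n * single)
```

All three printed numbers should agree up to floating-point rounding, which is what $I_X(\theta) = n I_{X_1}(\theta)$ and the alternate formula predict for this model.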

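Example: Fisher Information for a Poisson sample. Observe $\mathbf{X} = (X_1, \ldots, X_n)$ iid Poisson($\lambda$). Find $I_{\mathbf{X}}(\lambda)$. We know $I_{\mathbf{X}}(\lambda) = n I_{X_1}(\lambda)$. We shall calculate $I_{X_1}(\lambda)$ in three ways. Let $X = X_1$.

Preliminaries:
$$f(x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$$
$$\log f(x \mid \lambda) = x\log\lambda - \lambda - \log x!$$
$$\frac{\partial}{\partial\lambda}\log f(x \mid \lambda) = \frac{x}{\lambda} - 1$$
$$\frac{\partial^2}{\partial\lambda^2}\log f(x \mid \lambda) = -\frac{x}{\lambda^2}$$

Method #1: Observe that
$$I_X(\lambda) = E_\lambda\left[\left(\frac{\partial}{\partial\lambda}\log f(X \mid \lambda)\right)^2\right] = E_\lambda\left[\left(\frac{X}{\lambda} - 1\right)^2\right] = \mathrm{Var}_\lambda\left(\frac{X}{\lambda}\right) \quad \left(\text{since } E\left(\frac{X}{\lambda}\right) = \frac{EX}{\lambda} = 1\right)$$
$$= \frac{\mathrm{Var}(X)}{\lambda^2} = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}.$$

Method #2: Observe that
$$I_X(\lambda) = \mathrm{Var}_\lambda\left(\frac{\partial}{\partial\lambda}\log f(X \mid \lambda)\right) = \mathrm{Var}\left(\frac{X}{\lambda} - 1\right) = \mathrm{Var}\left(\frac{X}{\lambda}\right) = \frac{1}{\lambda} \quad \text{(as in Method \#1)}.$$

Method #3: Observe that
$$I_X(\lambda) = -E_\lambda\left(\frac{\partial^2}{\partial\lambda^2}\log f(X \mid \lambda)\right) = -E_\lambda\left(-\frac{X}{\lambda^2}\right) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}.$$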
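The three methods can also be compared numerically. The following sketch evaluates each expectation as a truncated sum over the Poisson pmf; the value $\lambda = 2.5$, the truncation point `x_max`, and the function names are illustrative assumptions.

```python
import math

def poisson_pmf_table(lam, x_max=200):
    """pmf values f(0|lam), ..., f(x_max|lam), built with f(x) = f(x-1) * lam / x."""
    pmf = [math.exp(-lam)]
    for x in range(1, x_max + 1):
        pmf.append(pmf[-1] * lam / x)
    return pmf

def three_methods(lam, x_max=200):
    """Single-observation Fisher information computed three ways (truncated sums)."""
    pmf = poisson_pmf_table(lam, x_max)
    mean_score = sum(f * (x / lam - 1.0) for x, f in enumerate(pmf))
    mean_score_sq = sum(f * (x / lam - 1.0) ** 2 for x, f in enumerate(pmf))
    mean_second = sum(f * (-x / lam ** 2) for x, f in enumerate(pmf))
    method1 = mean_score_sq                    # E[(d/dlam log f)^2]
    method2 = mean_score_sq - mean_score ** 2  # Var(d/dlam log f)
    method3 = -mean_second                     # -E[d^2/dlam^2 log f]
    return method1, method2, method3

lam = 2.5
m1, m2, m3 = three_methods(lam)
print(m1, m2, m3, 1.0 / lam)
```

For moderate $\lambda$ the truncation error at `x_max=200` is far below floating-point precision, so the three values should match $1/\lambda$ to many digits.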

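Thus $I_{\mathbf{X}}(\lambda) = n I_{X_1}(\lambda) = n/\lambda$.

Example: Fisher Information for Cauchy location family. Suppose $X_1, X_2, \ldots, X_n$ iid with pdf
$$f(x \mid \theta) = \frac{1}{\pi\left(1 + (x-\theta)^2\right)}.$$
Let $\mathbf{X} = (X_1, \ldots, X_n)$, $X \sim f(x \mid \theta)$. Find $I_{\mathbf{X}}(\theta)$.

Note that $I_{\mathbf{X}}(\theta) = n I_{X_1}(\theta) = n I_X(\theta)$. Now
$$\frac{\partial}{\partial\theta}\log f(x \mid \theta) = \frac{\partial f/\partial\theta}{f} = \frac{-\dfrac{1}{\pi\left(1+(x-\theta)^2\right)^2}\cdot 2(x-\theta)(-1)}{\dfrac{1}{\pi\left(1+(x-\theta)^2\right)}} = \frac{2(x-\theta)}{1+(x-\theta)^2}.$$
Now
$$I_X(\theta) = E\left[\left(\frac{\partial}{\partial\theta}\log f(X \mid \theta)\right)^2\right] = E\left(\frac{2(X-\theta)}{1+(X-\theta)^2}\right)^2 = \int_{-\infty}^{\infty}\left(\frac{2(x-\theta)}{1+(x-\theta)^2}\right)^2 \frac{1}{\pi\left(1+(x-\theta)^2\right)}\,dx = \frac{4}{\pi}\int_{-\infty}^{\infty}\frac{(x-\theta)^2}{\left(1+(x-\theta)^2\right)^3}\,dx.$$
Letting $u = x - \theta$, $du = dx$,
$$I_X(\theta) = \frac{4}{\pi}\int_{-\infty}^{\infty}\frac{u^2}{(1+u^2)^3}\,du = \frac{8}{\pi}\int_0^{\infty}\frac{u^2}{(1+u^2)^3}\,du.$$
Substituting $x = 1/(1+u^2)$, so that $u = (1/x - 1)^{1/2}$ and $du = \frac{1}{2}(1/x - 1)^{-1/2}(-1/x^2)\,dx$,
$$\begin{aligned}
I_X(\theta) &= \frac{8}{\pi}\int_0^{\infty}\frac{u^2}{(1+u^2)^3}\,du = \frac{8}{\pi}\int_0^{\infty}\frac{u^2}{1+u^2}\left(\frac{1}{1+u^2}\right)^2 du \\
&= \frac{8}{\pi}\int_0^1 (1-x)\,x^2\,\frac{1}{2}\left(\frac{1}{x}-1\right)^{-1/2}\frac{1}{x^2}\,dx \\
&= \frac{4}{\pi}\int_0^1 x^{1/2}(1-x)^{1/2}\,dx = \frac{4}{\pi}\int_0^1 x^{3/2-1}(1-x)^{3/2-1}\,dx \quad \text{(Beta integral)} \\
&= \frac{4}{\pi}\,\frac{\Gamma(3/2)\,\Gamma(3/2)}{\Gamma(3/2+3/2)} = \frac{4}{\pi}\,\frac{(\sqrt{\pi}/2)^2}{2!}
\end{aligned}$$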

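$$= \frac{4}{\pi}\cdot\frac{\pi}{8} = \frac{1}{2}.$$

Uses of Fisher Information

• Asymptotic distribution of MLE's
• Cramér-Rao Inequality (Information inequality)

Asymptotic distribution of MLE's

i.i.d. case: If $f(x \mid \theta)$ is a regular one-parameter family of pdf's (or pmf's) and $\hat\theta_n = \hat\theta_n(\mathbf{X}_n)$ is the MLE based on $\mathbf{X}_n = (X_1, \ldots, X_n)$, where $n$ is large and $X_1, \ldots, X_n$ are iid from $f(x \mid \theta)$, then approximately,
$$\hat\theta_n \sim N\left(\theta, \frac{1}{nI(\theta)}\right),$$
where $I(\theta) \equiv I_{X_1}(\theta)$ and $\theta$ is the true value. Note that $nI(\theta) = I_{\mathbf{X}_n}(\theta)$. More formally,
$$\frac{\hat\theta_n - \theta}{\sqrt{1/(nI(\theta))}} = \sqrt{nI(\theta)}\,(\hat\theta_n - \theta) \xrightarrow{d} N(0,1)$$
as $n \to \infty$.

More general case: (Assuming various regularity conditions) If $f(\mathbf{x} \mid \theta)$ is a one-parameter family of joint pdf's (or joint pmf's) for data $\mathbf{X}_n = (X_1, \ldots, X_n)$, where $n$ is large (think of a large dataset arising from a regression or time series model), and $\hat\theta_n = \hat\theta_n(\mathbf{X}_n)$ is the MLE, then
$$\hat\theta_n \sim N\left(\theta, \frac{1}{I_{\mathbf{X}_n}(\theta)}\right),$$
where $\theta$ is the true value.

Estimation of the Fisher Information

If $\theta$ is unknown, then so is $I_X(\theta)$. Two estimates $\hat I$ of the Fisher information $I_X(\theta)$ are
$$\hat I_1 = I_X(\hat\theta), \qquad \hat I_2 = -\frac{\partial^2}{\partial\theta^2}\log f(X \mid \theta)\Big|_{\theta = \hat\theta},$$
where $\hat\theta$ is the MLE of $\theta$ based on the data $X$. $\hat I_1$ is the obvious plug-in estimator. It can be difficult to compute when $I_X(\theta)$ does not have a known closed form. The estimator $\hat I_2$ is suggested by the formula
$$I_X(\theta) = -E\left(\frac{\partial^2}{\partial\theta^2}\log f(X \mid \theta)\right).$$
It is often easy to compute, and is required in many Newton-Raphson style algorithms for finding the MLE (so that it is already available without extra computation).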
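To make the two estimators concrete, here is a minimal sketch for the Cauchy location family from the example above, where the per-observation information is $1/2$ and so $\hat I_1 = n/2$ whatever the data. The simulated sample, the seed, the sample size, and all function names are assumptions made for illustration. The MLE is found by Fisher scoring started at the sample median, a swapped-in variant of the Newton-Raphson style iteration that divides by the expected information $n/2$ instead of the observed one, chosen here for stability.

```python
import math
import random
import statistics

def cauchy_sample(n, theta, rng):
    """Draw n values from the Cauchy location family by inverting the CDF."""
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def score(theta, xs):
    """d/dtheta of the log-likelihood: sum of 2(x-theta)/(1+(x-theta)^2)."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)

def observed_information(theta, xs):
    """Minus the second derivative of the log-likelihood at theta."""
    return sum(2.0 * (1.0 - (x - theta) ** 2) / (1.0 + (x - theta) ** 2) ** 2
               for x in xs)

def mle_fisher_scoring(xs, max_iter=200, tol=1e-12):
    """Fisher scoring: theta <- theta + score / (n/2), started at the median."""
    n = len(xs)
    theta = statistics.median(xs)
    for _ in range(max_iter):
        step = score(theta, xs) / (n / 2.0)
        theta += step
        if abs(step) < tol:
            break
    return theta

rng = random.Random(0)            # fixed seed (assumed) for reproducibility
n, true_theta = 2000, 1.0         # assumed sample size and true value
xs = cauchy_sample(n, true_theta, rng)
theta_hat = mle_fisher_scoring(xs)
I1 = n / 2.0                                # expected information at theta_hat
I2 = observed_information(theta_hat, xs)    # observed information at theta_hat
print(theta_hat, I1 / n, I2 / n)
```

Here $\hat I_1/n$ is exactly $1/2$, while $\hat I_2/n$ fluctuates around $1/2$ from sample to sample, which illustrates why the two are distinguished as expected and observed information.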

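The two estimates $\hat I_1$ and $\hat I_2$ are often referred to as the expected and observed Fisher information, respectively. Both estimators are consistent (after normalization) for $I_{\mathbf{X}_n}(\theta)$ under various regularity conditions. For example, in the iid case, $\hat I_1/n$, $\hat I_2/n$, and $I_{\mathbf{X}_n}(\theta)/n$ all converge to $I(\theta) \equiv I_{X_1}(\theta)$.

Approximate Confidence Intervals for $\theta$

Choose $0 < \alpha < 1$. Let $z$ be such that $P(-z < Z < z) = 1 - \alpha$, where $Z \sim N(0,1)$. When $n$ is large, we have approximately
$$\sqrt{I_X(\theta)}\,(\hat\theta - \theta) \sim N(0,1),$$
so that
$$P\left\{-z < \sqrt{I_X(\theta)}\,(\hat\theta - \theta) < z\right\} \approx 1 - \alpha,$$
or equivalently,
$$P\left\{\hat\theta - z\sqrt{\frac{1}{I_X(\theta)}} < \theta < \hat\theta + z\sqrt{\frac{1}{I_X(\theta)}}\right\} \approx 1 - \alpha.$$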
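The probability statement above can be checked by simulation. The sketch below reuses the Poisson model, for which the MLE is the sample mean (a standard fact, not derived in the notes) and the sample information is $n/\lambda$. It counts how often $\sqrt{I_X(\lambda)}\,(\hat\lambda - \lambda)$ falls inside $(-z, z)$. The values $\alpha = 0.05$, $\lambda = 3$, $n = 50$, the number of replications, the seed, and the helper names are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Knuth's multiplication method for one Poisson(lam) draw (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def coverage(lam, n, reps, alpha, rng):
    """Fraction of replications with -z < sqrt(I(lam)) * (lam_hat - lam) < z."""
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    info = n / lam  # Fisher information of the whole sample at the true value
    hits = 0
    for _ in range(reps):
        lam_hat = sum(poisson_draw(lam, rng) for _ in range(n)) / n  # MLE
        if abs(math.sqrt(info) * (lam_hat - lam)) < z:
            hits += 1
    return hits / reps

rng = random.Random(1)  # fixed seed (assumed) for reproducibility
cov = coverage(lam=3.0, n=50, reps=2000, alpha=0.05, rng=rng)
print(cov)  # should be close to 1 - alpha
```

Because the Poisson sum is discrete, the observed frequency is only approximately $1 - \alpha$ even with many replications.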

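This approximation continues to hold when $I_X(\theta)$ is replaced by an estimate $\hat I$ (either $\hat I_1$ or $\hat I_2$):
$$P\left\{\hat\theta - z\sqrt{\frac{1}{\hat I}} < \theta < \hat\theta + z\sqrt{\frac{1}{\hat I}}\right\} \approx 1 - \alpha.$$
Thus
$$\left(\hat\theta - z\sqrt{\frac{1}{\hat I}},\ \hat\theta + z\sqrt{\frac{1}{\hat I}}\right)$$
is an approximate $1 - \alpha$ confidence interval for $\theta$. (Here $\hat\theta$ is the MLE and $\hat I$ is an estimate of the Fisher Information.)

3 Cramér-Rao Inequality

Let $\mathbf{X} \sim P_\theta$ with pdf (or pmf) $f(\mathbf{x} \mid \theta)$. If $f(\mathbf{x} \mid \theta)$ is a regular one-parameter family, $E_\theta W(\mathbf{X}) = \tau(\theta)$ for all $\theta$, and $\tau(\theta)$ is differentiable, then
$$\mathrm{Var}_\theta\left(W(\mathbf{X})\right) \ge \frac{\{\tau'(\theta)\}^2}{I_{\mathbf{X}}(\theta)}.$$

Facts:

A. $[\mathrm{Cov}(X, Y)]^2 \le (\mathrm{Var}\,X)(\mathrm{Var}\,Y)$. This is a special case of the Cauchy-Schwarz inequality.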
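As a quick illustration of the inequality, consider the iid Poisson($\lambda$) sample from the earlier example with the sample mean as the statistic. This choice of $W$ is an assumption made here for illustration and is not taken from the notes. With $W(\mathbf{X}) = \bar X$ we have $\tau(\lambda) = E_\lambda \bar X = \lambda$, so $\tau'(\lambda) = 1$, and $I_{\mathbf{X}}(\lambda) = n/\lambda$. Hence
$$\frac{\{\tau'(\lambda)\}^2}{I_{\mathbf{X}}(\lambda)} = \frac{1}{n/\lambda} = \frac{\lambda}{n}, \qquad \mathrm{Var}_\lambda(\bar X) = \frac{\mathrm{Var}_\lambda(X_1)}{n} = \frac{\lambda}{n},$$
so in this case the bound holds with equality.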
