18 The Exponential Family and Statistical Applications

The Exponential family is a practically convenient and widely used unified family of distributions on finite dimensional Euclidean spaces, parametrized by a finite dimensional parameter vector. Specialized to the case of the real line, the Exponential family contains as special cases most of the standard discrete and continuous distributions that we use for practical modelling, such as the normal, Poisson, Binomial, exponential, Gamma, multivariate normal, etc. The reason for the special status of the Exponential family is that a number of important and useful calculations in statistics can be done all at one stroke within the framework of the Exponential family.

This generality contributes to both convenience and larger scale understanding. The Exponential family is the usual testing ground for the large spectrum of results in parametric statistical theory that require notions of regularity or Cramér-Rao regularity. In addition, the unified calculations in the Exponential family have an element of mathematical neatness. Distributions in the Exponential family have been used in classical statistics for decades. However, the family has recently gained additional importance due to its use in, and appeal to, the machine learning community. A fundamental treatment of the general Exponential family is provided in this chapter. Classic expositions are available in Barndorff-Nielsen (1978), Brown (1986), and Lehmann and Casella (1998).

An excellent recent treatment is available in Bickel and Doksum (2006).

One Parameter Exponential Family

Exponential families can have any finite number of parameters. For instance, as we will see, a normal distribution with a known mean is in the one parameter Exponential family, while a normal distribution with both parameters unknown is in the two parameter Exponential family. A bivariate normal distribution with all parameters unknown is in the five parameter Exponential family. As another example, if we take a normal distribution in which the mean and the variance are functionally related, e.g., the $N(\mu, \mu^2)$ distribution, then the distribution will be neither in the one parameter nor in the two parameter Exponential family, but in a family called a curved Exponential family.

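We start with the one parameter regular Exponential family.

Definition and First Examples

We start with an illustrative example that brings out some of the most important properties of distributions in an Exponential family.

Example (Normal Distribution with a Known Mean). Suppose $X \sim N(0, \sigma^2)$. Then the density of $X$ is

$$f(x \mid \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}\, I_{x \in \mathbb{R}}.$$

This density is parametrized by a single parameter $\sigma$. Writing

$$\eta(\sigma) = -\frac{1}{2\sigma^2}, \quad T(x) = x^2, \quad \psi(\sigma) = \log \sigma, \quad h(x) = \frac{1}{\sqrt{2\pi}}\, I_{x \in \mathbb{R}},$$

we can represent the density in the form

$$f(x \mid \sigma) = e^{\eta(\sigma) T(x) - \psi(\sigma)}\, h(x),$$

for any $\sigma \in \mathbb{R}^{+}$.

Next, suppose that we have an iid sample $X_1, X_2, \ldots, X_n \sim N(0, \sigma^2)$. Then the joint density of $X_1, X_2, \ldots, X_n$ is

$$f(x_1, x_2, \ldots, x_n \mid \sigma) = \frac{1}{\sigma^n (2\pi)^{n/2}}\, e^{-\frac{\sum_{i=1}^n x_i^2}{2\sigma^2}}\, I_{x_1, x_2, \ldots, x_n \in \mathbb{R}}.$$

Now writing

$$\eta(\sigma) = -\frac{1}{2\sigma^2}, \quad T(x_1, x_2, \ldots, x_n) = \sum_{i=1}^n x_i^2, \quad \psi(\sigma) = n \log \sigma,$$

and

$$h(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}}\, I_{x_1, x_2, \ldots, x_n \in \mathbb{R}},$$

once again we can represent the joint density in the same general form

$$f(x_1, x_2, \ldots, x_n \mid \sigma) = e^{\eta(\sigma) T(x_1, x_2, \ldots, x_n) - \psi(\sigma)}\, h(x_1, x_2, \ldots, x_n).$$

We notice that in this representation of the joint density $f(x_1, x_2, \ldots, x_n \mid \sigma)$, the statistic $T(X_1, X_2, \ldots, X_n)$ is still a one dimensional statistic, namely, $T(X_1, X_2, \ldots, X_n) = \sum_{i=1}^n X_i^2$.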

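Using the fact that the sum of squares of $n$ independent standard normal variables is a chi square variable with $n$ degrees of freedom, so that $T/\sigma^2$ is chi square with $n$ degrees of freedom, we have that the density of $T(X_1, X_2, \ldots, X_n)$ is

$$f_T(t \mid \sigma) = \frac{e^{-\frac{t}{2\sigma^2}}\, t^{\frac{n}{2} - 1}}{\sigma^n\, 2^{n/2}\, \Gamma(\frac{n}{2})}\, I_{t > 0}.$$

This time, writing

$$\eta(\sigma) = -\frac{1}{2\sigma^2}, \quad S(t) = t, \quad \psi(\sigma) = n \log \sigma, \quad h(t) = \frac{t^{\frac{n}{2} - 1}}{2^{n/2}\, \Gamma(\frac{n}{2})}\, I_{t > 0},$$

once again we are able to write even the density of $T(X_1, X_2, \ldots, X_n) = \sum_{i=1}^n X_i^2$ in that same general form

$$f_T(t \mid \sigma) = e^{\eta(\sigma) S(t) - \psi(\sigma)}\, h(t).$$

Clearly, something very interesting is going on. We started with a basic density in a specific form, namely, $f(x \mid \sigma) = e^{\eta(\sigma) T(x) - \psi(\sigma)} h(x)$, and then we found that the joint density, and the density of the relevant one dimensional statistic $\sum_{i=1}^n X_i^2$ in that joint density, are once again densities of exactly that same general form.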
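The representations in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`expfam_density`, `single_obs_form`, and so on) are illustrative choices and not part of the text. It assumes that $h(t)$ for the sum of squares carries the factor $t^{n/2 - 1}$, which is what makes the two sides agree.

```python
import math


def expfam_density(t_value, eta, psi, h_value):
    """Generic one parameter form: exp(eta * T - psi) * h."""
    return math.exp(eta * t_value - psi) * h_value


def normal_density(x, sigma):
    """N(0, sigma^2) density written directly."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def single_obs_form(x, sigma):
    """Same density via eta = -1/(2 sigma^2), T = x^2, psi = log sigma."""
    eta = -1.0 / (2 * sigma ** 2)
    return expfam_density(x * x, eta, math.log(sigma), 1 / math.sqrt(2 * math.pi))


def sum_of_squares_density(t, sigma, n):
    """Density of T = sum of X_i^2, i.e. sigma^2 times a chi square(n) variable."""
    log_f = (-t / (2 * sigma ** 2) + (n / 2 - 1) * math.log(t)
             - n * math.log(sigma) - (n / 2) * math.log(2) - math.lgamma(n / 2))
    return math.exp(log_f)


def sum_of_squares_form(t, sigma, n):
    """Same density via eta = -1/(2 sigma^2), S(t) = t, psi = n log sigma."""
    eta = -1.0 / (2 * sigma ** 2)
    psi = n * math.log(sigma)
    h = t ** (n / 2 - 1) / (2 ** (n / 2) * math.gamma(n / 2))
    return expfam_density(t, eta, psi, h)


if __name__ == "__main__":
    # Arbitrary illustrative values for sigma, n, x and t.
    print(single_obs_form(0.7, 1.5), normal_density(0.7, 1.5))
    print(sum_of_squares_form(3.2, 1.5, 5), sum_of_squares_density(3.2, 1.5, 5))
```

Each printed pair should agree up to floating point error. The direct density is computed on the log scale with `math.lgamma`, which is a design choice to avoid overflow for larger $n$.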

It turns out that all of these phenomena are true of the entire family of densities which can be written in that general form, which is the one parameter Exponential family. Let us formally define it and we will then extend the definition to distributions with more than one parameter.

Definition. Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random vector with a distribution $P_\theta$, $\theta \in \Theta \subseteq \mathbb{R}$.

Suppose $X_1, \ldots, X_d$ are jointly continuous. The family of distributions $\{P_\theta, \theta \in \Theta\}$ is said to belong to the one parameter Exponential family if the density of $X = (X_1, \ldots, X_d)$ may be represented in the form

$$f(x \mid \theta) = e^{\eta(\theta) T(x) - \psi(\theta)}\, h(x),$$

for some real valued functions $T(x)$, $\psi(\theta)$ and $h(x) \geq 0$.

If $X_1, \ldots, X_d$ are jointly discrete, then $\{P_\theta, \theta \in \Theta\}$ is said to belong to the one parameter Exponential family if the joint pmf $p(x \mid \theta) = P_\theta(X_1 = x_1, \ldots, X_d = x_d)$ may be written in the form

$$p(x \mid \theta) = e^{\eta(\theta) T(x) - \psi(\theta)}\, h(x),$$

for some real valued functions $T(x)$, $\psi(\theta)$ and $h(x) \geq 0$.

Note that the functions $\eta$, $T$ and $h$ are not unique.

For example, in the product $\eta T$, we can multiply $T$ by some constant $c$ and divide $\eta$ by it. Similarly, we can play with constants in the function $h$.

Definition. Suppose $X = (X_1, \ldots, X_d)$ has a distribution $P_\theta$, $\theta \in \Theta$, belonging to the one parameter Exponential family. Then the statistic $T(X)$ is called the natural sufficient statistic for the family $\{P_\theta\}$.

The notion of a sufficient statistic is a fundamental one in statistical theory and its applications. Sufficiency was introduced into the statistical literature by Sir Ronald A. Fisher (Fisher (1922)). Sufficiency attempts to formalize the notion of no loss of information. A sufficient statistic is supposed to contain by itself all of the information about the unknown parameters of the underlying distribution that the entire sample could have provided.

In that sense, there is nothing to lose by restricting attention to just a sufficient statistic in one's inference process. However, the form of a sufficient statistic is very much dependent on the choice of a particular distribution $P_\theta$ for modelling the observable $X$. Still, reduction to sufficiency in widely used models usually makes just simple common sense. We will come back to the issue of sufficiency once again later in this chapter.

We will now see examples of a few more common distributions that belong to the one parameter Exponential family.

Example (Binomial Distribution). Let $X \sim \mathrm{Bin}(n, p)$, with $n \geq 1$ considered as known, and $0 < p < 1$ a parameter. We represent the pmf of $X$ in the one parameter Exponential family form:

$$f(x \mid p) = \binom{n}{x} p^x (1 - p)^{n - x}\, I_{x \in \{0, 1, \ldots, n\}} = \binom{n}{x} \left(\frac{p}{1 - p}\right)^{x} (1 - p)^n\, I_{x \in \{0, 1, \ldots, n\}}$$

$$= \binom{n}{x}\, e^{x \log \frac{p}{1 - p} + n \log(1 - p)}\, I_{x \in \{0, 1, \ldots, n\}}.$$

Writing $\eta(p) = \log \frac{p}{1 - p}$, $T(x) = x$, $\psi(p) = -n \log(1 - p)$, and $h(x) = \binom{n}{x} I_{x \in \{0, 1, \ldots, n\}}$, we have represented the pmf $f(x \mid p)$ in the one parameter Exponential family form, as long as $p \in (0, 1)$.
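The Binomial representation can be verified in the same spirit. This is a small sketch with hypothetical function names (`binomial_pmf`, `binomial_expfam`), assuming Python 3.8 or later for `math.comb`; it compares the usual pmf with $e^{\eta(p) T(x) - \psi(p)} h(x)$ for an interior value of $p$.

```python
import math


def binomial_pmf(x, n, p):
    """Bin(n, p) pmf in its usual form."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_expfam(x, n, p):
    """Same pmf via eta = log(p / (1 - p)), T(x) = x, psi = -n log(1 - p)."""
    eta = math.log(p / (1 - p))    # defined only for 0 < p < 1
    psi = -n * math.log(1 - p)
    h = math.comb(n, x)            # h(x) on the support {0, 1, ..., n}
    return math.exp(eta * x - psi) * h


if __name__ == "__main__":
    n, p = 10, 0.3  # arbitrary illustrative values
    for x in range(n + 1):
        print(x, binomial_pmf(x, n, p), binomial_expfam(x, n, p))
```

The call `math.log(p / (1 - p))` raises an error at $p = 0$ or $p = 1$, which mirrors the restriction to $0 < p < 1$ in the representation.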

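For $p = 0$ or $1$, the distribution becomes a one point distribution. Consequently, the family of distributions $\{f(x \mid p), 0 < p < 1\}$ forms a one parameter Exponential family, but if either of the boundary values $p = 0, 1$ is included, the family is not in the Exponential family.

Example (Normal Distribution with a Known Variance). Suppose $X \sim N(\mu, \sigma^2)$, where $\sigma$ is considered known, and $\mu \in \mathbb{R}$ a parameter. Taking $\sigma = 1$ for notational simplicity,

$$f(x \mid \mu) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2} + \mu x - \frac{\mu^2}{2}}\, I_{x \in \mathbb{R}},$$

which can be written in the one parameter Exponential family form by writing $\eta(\mu) = \mu$, $T(x) = x$, $\psi(\mu) = \frac{\mu^2}{2}$, and $h(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} I_{x \in \mathbb{R}}$. So, the family of distributions $\{f(x \mid \mu), \mu \in \mathbb{R}\}$ forms a one parameter Exponential family.

Example (Errors in Variables). Suppose $U, V, W$ are independent normal variables, with $U$ and $V$ being $N(\mu, 1)$ and $W$ being $N(0, 1)$.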

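Let $X_1 = U + W$ and $X_2 = V + W$. In other words, a common error of measurement $W$ contaminates both $U$ and $V$. Let $X = (X_1, X_2)$. Then $X$ has a bivariate normal distribution with means $\mu, \mu$, variances $2, 2$, and a correlation parameter $\rho = \frac{1}{2}$. Thus, the density of $X$ is

$$f(x \mid \mu) = \frac{1}{2\pi\sqrt{3}}\, e^{-\frac{2}{3}\left[\frac{(x_1 - \mu)^2}{2} + \frac{(x_2 - \mu)^2}{2} - \frac{(x_1 - \mu)(x_2 - \mu)}{2}\right]}\, I_{x_1, x_2 \in \mathbb{R}}$$

$$= \frac{1}{2\pi\sqrt{3}}\, e^{\frac{\mu}{3}(x_1 + x_2) - \frac{\mu^2}{3}}\, e^{-\frac{x_1^2 + x_2^2 - x_1 x_2}{3}}\, I_{x_1, x_2 \in \mathbb{R}}.$$

This is in the form of a one parameter Exponential family with the natural sufficient statistic $T(X) = T(X_1, X_2) = X_1 + X_2$.

Example (Gamma Distribution). Suppose $X$ has the Gamma density

$$\frac{e^{-x/\lambda}\, x^{\alpha - 1}}{\lambda^{\alpha}\, \Gamma(\alpha)}\, I_{x > 0}.$$

As such, it has two parameters $\lambda, \alpha$. If we assume that $\alpha$ is known, then we may write the density in the one parameter Exponential family form:

$$f(x \mid \lambda) = e^{-\frac{x}{\lambda} - \alpha \log \lambda}\, \frac{x^{\alpha - 1}}{\Gamma(\alpha)}\, I_{x > 0},$$

and recognize it as a density in the Exponential family with $\eta(\lambda) = -\frac{1}{\lambda}$, $T(x) = x$, $\psi(\lambda) = \alpha \log \lambda$, $h(x) = \frac{x^{\alpha - 1}}{\Gamma(\alpha)} I_{x > 0}$.

If we assume that $\lambda$ is known, once again, by writing the density as

$$f(x \mid \alpha) = e^{\alpha \log x - \alpha (\log \lambda) - \log \Gamma(\alpha)}\, \frac{e^{-x/\lambda}}{x}\, I_{x > 0},$$

we recognize it as a density in the Exponential family with $\eta(\alpha) = \alpha$, $T(x) = \log x$, $\psi(\alpha) = \alpha (\log \lambda) + \log \Gamma(\alpha)$, $h(x) = \frac{e^{-x/\lambda}}{x} I_{x > 0}$.

Example (An Unusual Gamma Distribution).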
