Chapter 8 The exponential family: Basics

In this chapter we extend the scope of our modeling toolbox to accommodate a variety of additional data types, including counts, time intervals and rates. We introduce the exponential family of distributions, a family that includes the Gaussian, binomial, multinomial, Poisson, gamma, von Mises and beta distributions, as well as many others. In this chapter we focus on unconditional models and in the following chapter we show how these ideas can be carried over to the setting of conditional models.

At first blush this chapter may appear to involve a large dose of mathematical detail, but appearances shouldn't deceive: most of the detail involves working out examples that show how the exponential family formalism relates to more familiar material.

The real message of this chapter is the simplicity and elegance of the exponential family. Once the new ideas are mastered, it is often easier to work within the general exponential family framework than with specific instances.

The exponential family

Given a measure $\nu$, we define an exponential family of probability distributions as those distributions whose density (relative to $\nu$) has the following general form:

$$p(x \mid \eta) = h(x) \exp\{\eta^T T(x) - A(\eta)\}$$

for a parameter vector $\eta$, often referred to as the canonical parameter, and for given functions $T$ and $h$. The statistic $T(X)$ is referred to as a sufficient statistic; the reasons for this nomenclature are discussed below.

The function $A(\eta)$ is known as the cumulant function. Integrating Eq. (??) with respect to the measure $\nu$, we have:

$$A(\eta) = \log \int h(x) \exp\{\eta^T T(x)\} \, \nu(dx)$$

where we see that the cumulant function can be viewed as the logarithm of a normalization factor. This shows that $A(\eta)$ is not a degree of freedom in the specification of an exponential family density; it is determined once $\nu$, $T(x)$ and $h(x)$ are determined.

The set of parameters $\eta$ for which the integral in Eq. (??) is finite is referred to as the natural parameter space:

$$\mathcal{N} = \left\{ \eta : \int h(x) \exp\{\eta^T T(x)\} \, \nu(dx) < \infty \right\}.$$

We will restrict ourselves to exponential families for which the natural parameter space is a nonempty open set.

Such families are referred to as regular.

In many cases we are interested in representations that are in a certain sense non-redundant. In particular, an exponential family is referred to as minimal if there are no linear constraints among the components of the parameter vector nor are there linear constraints among the components of the sufficient statistic (in the latter case, with probability one under the measure $\nu$). Non-minimal families can always be reduced to minimal families via a suitable transformation and reparameterization.

Even if we restrict ourselves to minimal representations, however, the same probability distribution can be represented using many different parameterizations, and indeed much of the power of the exponential family formalism derives from the insights that are obtained from considering different parameterizations for a given family.
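The view of the cumulant function as the logarithm of a normalization factor can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes counting measure on the nonnegative integers, truncated at 100 terms, and borrows the Poisson ingredients $h(x) = 1/x!$ and $T(x) = x$ that are derived later in the chapter. The names `log_normalizer`, `h`, `T` and `support` are illustrative only.

```python
import math

def log_normalizer(eta, h, T, support):
    """A(eta) = log sum_x h(x) exp(eta * T(x)), i.e. the integral under counting measure."""
    return math.log(sum(h(x) * math.exp(eta * T(x)) for x in support))

# Poisson ingredients (an assumption at this point; derived later in the chapter).
h = lambda x: 1.0 / math.factorial(x)
T = lambda x: x
support = range(100)  # truncation of {0, 1, 2, ...}, assumed adequate for small eta

eta = math.log(3.0)
A_numeric = log_normalizer(eta, h, T, support)
A_closed = math.exp(eta)  # closed form A(eta) = e^eta for the Poisson

# With A(eta) subtracted in the exponent, the density sums to one.
total_mass = sum(h(x) * math.exp(eta * T(x) - A_numeric) for x in support)

print(A_numeric, A_closed, total_mass)
```

Because $A(\eta)$ is computed from $h$, $T$ and the measure alone, the sketch also illustrates that it is not a separate degree of freedom.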

In general, given a set $\Theta$ and a mapping $\Phi : \Theta \to \mathcal{N}$, we consider densities obtained from Eq. (??) by replacing $\eta$ with $\Phi(\theta)$:

$$p(x \mid \theta) = h(x) \exp\{\Phi(\theta)^T T(x) - A(\Phi(\theta))\},$$

where $\Phi$ is a one-to-one mapping whose image is all of $\mathcal{N}$.

We are also interested in cases in which the image of $\Phi$ is a strict subset of $\mathcal{N}$. If this subset is a linear subset, then it is possible to transform the representation into an exponential family on that subset. When the representation is not reducible in this way, we refer to the exponential family as a curved exponential family.

The integral defining $A(\eta)$ is a Lebesgue integral, reflecting the fact that in general we wish to deal with arbitrary $\nu$.

Actually, let us take the opportunity to be more precise and note that $\nu$ is required to be a $\sigma$-finite measure. But let us also reassure those readers without a background in measure theory and Lebesgue integration that standard calculus will suffice for an understanding of this chapter. In particular, in all of the examples that we will treat, $\nu$ will either be Lebesgue measure, in which case $\nu(dx)$ reduces to $dx$ and the integral in Eq. (??) can be handled using standard multivariable calculus, or counting measure, in which case the integral reduces to a summation.

It is also worth noting that $\nu$ and $h(x)$ are not really independent degrees of freedom.

We are always free to absorb $h(x)$ in the measure $\nu$. Doing so yields measures that are variations on Lebesgue measure and counting measure, and thus begins to indicate the elegance of the formulation in terms of general measures.

For a formal proof that non-minimal families can be reduced to minimal families, see Chapter 1 of ?.

Examples

The Bernoulli distribution

A Bernoulli random variable $X$ assigns probability measure $\pi$ to the point $x = 1$ and probability measure $1 - \pi$ to $x = 0$. More formally, define $\nu$ to be counting measure on $\{0, 1\}$, and define the following density function with respect to $\nu$:

$$p(x \mid \pi) = \pi^x (1 - \pi)^{1 - x} = \exp\left\{ \log\left(\frac{\pi}{1 - \pi}\right) x + \log(1 - \pi) \right\}.$$

Our trick for revealing the canonical exponential family form, here and throughout the chapter, is to take the exponential of the logarithm of the usual form of the density. Thus we see that the Bernoulli distribution is an exponential family distribution with:

$$\begin{aligned}
\eta &= \log\frac{\pi}{1 - \pi} \\
T(x) &= x \\
A(\eta) &= -\log(1 - \pi) = \log(1 + e^{\eta}) \\
h(x) &= 1.
\end{aligned}$$

Note moreover that the relationship between $\eta$ and $\pi$ is invertible. Solving Eq. (??) for $\pi$, we have:

$$\pi = \frac{1}{1 + e^{-\eta}},$$

which is the logistic function. The reader can verify that the natural parameter space is the real line in this case.

The Poisson distribution

The probability mass function (i.e., the density with respect to counting measure) of a Poisson random variable is given as follows:

$$p(x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}.$$

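Rewriting this expression we obtain:

$$p(x \mid \lambda) = \frac{1}{x!} \exp\{x \log \lambda - \lambda\}.$$

Thus the Poisson distribution is an exponential family distribution, with:

$$\begin{aligned}
\eta &= \log \lambda \\
T(x) &= x \\
A(\eta) &= \lambda = e^{\eta} \\
h(x) &= \frac{1}{x!}.
\end{aligned}$$

Moreover, we can obviously invert the relationship between $\eta$ and $\lambda$:

$$\lambda = e^{\eta}.$$

The Gaussian distribution

The (univariate) Gaussian density can be written as follows (where the underlying measure is Lebesgue measure):

$$\begin{aligned}
p(x \mid \mu, \sigma^2) &= \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\} \\
&= \frac{1}{\sqrt{2\pi}} \exp\left\{ \frac{\mu}{\sigma^2} x - \frac{1}{2\sigma^2} x^2 - \frac{1}{2\sigma^2} \mu^2 - \log \sigma \right\}.
\end{aligned}$$

This is in the exponential family form, with:

$$\begin{aligned}
\eta &= \begin{bmatrix} \mu / \sigma^2 \\ -1 / (2\sigma^2) \end{bmatrix} \\
T(x) &= \begin{bmatrix} x \\ x^2 \end{bmatrix} \\
A(\eta) &= \frac{\mu^2}{2\sigma^2} + \log \sigma = -\frac{\eta_1^2}{4\eta_2} - \frac{1}{2} \log(-2\eta_2) \\
h(x) &= \frac{1}{\sqrt{2\pi}}.
\end{aligned}$$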
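The Gaussian identifications above can be sanity-checked by evaluating the density in both forms. The sketch below is illustrative only; the function names (`to_natural`, `cumulant`, `density_canonical`, `density_usual`) and the test values of $\mu$ and $\sigma^2$ are assumptions introduced here, not part of the original text.

```python
import math

def to_natural(mu, sigma2):
    """Map (mu, sigma^2) to the canonical parameter eta = (mu/sigma^2, -1/(2 sigma^2))."""
    return (mu / sigma2, -1.0 / (2.0 * sigma2))

def cumulant(eta1, eta2):
    """A(eta) = -eta1^2 / (4 eta2) - (1/2) log(-2 eta2); requires eta2 < 0."""
    return -eta1 ** 2 / (4.0 * eta2) - 0.5 * math.log(-2.0 * eta2)

def density_canonical(x, eta1, eta2):
    """h(x) exp{eta^T T(x) - A(eta)} with T(x) = (x, x^2) and h(x) = 1/sqrt(2 pi)."""
    h = 1.0 / math.sqrt(2.0 * math.pi)
    return h * math.exp(eta1 * x + eta2 * x * x - cumulant(eta1, eta2))

def density_usual(x, mu, sigma2):
    """The familiar form of the univariate Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

mu, sigma2 = 1.0, 4.0  # arbitrary illustrative values
eta1, eta2 = to_natural(mu, sigma2)

# Both expressions for A(eta) should agree.
A_from_eta = cumulant(eta1, eta2)
A_from_moments = mu ** 2 / (2.0 * sigma2) + math.log(math.sqrt(sigma2))

for x in (-2.0, 0.0, 1.0, 3.5):
    print(x, density_canonical(x, eta1, eta2), density_usual(x, mu, sigma2))
```

The constraint $\eta_2 < 0$ in `cumulant` reflects the fact that $\sigma^2$ must be positive for the density to be normalizable.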

Note in particular that the univariate Gaussian distribution is a two-parameter distribution and that its sufficient statistic is a vector.

The multivariate Gaussian distribution can also be written in the exponential family form; we leave the details to Exercise ?? and Chapter ??.

The von Mises distribution

Suppose that we wish to place a distribution on an angle $x$, where $x \in (0, 2\pi)$. This is readily accomplished within the exponential family framework:

$$p(x \mid \kappa, \mu) = \frac{1}{2\pi I_0(\kappa)} \exp\{\kappa \cos(x - \mu)\}$$

where $\mu$ is a location parameter, $\kappa$ is a scale parameter and $I_0(\kappa)$ is the modified Bessel function of order 0.
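The role of $2\pi I_0(\kappa)$ as the normalization factor of the von Mises density can be checked numerically. This is a rough sketch under stated assumptions: $I_0$ is evaluated through its standard power series $\sum_k (\kappa/2)^{2k} / (k!)^2$, truncated at 60 terms, and the integral over $(0, 2\pi)$ is approximated with a midpoint rule. The helper names and the values of $\kappa$ and $\mu$ are illustrative, not from the original text.

```python
import math

def bessel_i0(kappa, terms=60):
    """Modified Bessel function of order 0 via its power series (truncated)."""
    return sum((kappa / 2.0) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def von_mises_density(x, kappa, mu):
    """p(x | kappa, mu) = exp{kappa cos(x - mu)} / (2 pi I0(kappa))."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def integrate_over_circle(f, n=2000):
    """Midpoint rule on (0, 2 pi) with n equal subintervals."""
    step = 2.0 * math.pi / n
    return step * sum(f((i + 0.5) * step) for i in range(n))

# The density should integrate to (approximately) one for any location mu.
total = integrate_over_circle(lambda x: von_mises_density(x, kappa=2.0, mu=1.0))
print(total)
```

In practice a library routine for $I_0$ would normally replace the truncated series; the series is used here only to keep the sketch self-contained.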
