Parametric Survival Models - Princeton University

Germán Rodríguez
2001; revised Spring 2005, Summer 2010

We consider briefly the analysis of survival data when one is willing to assume a parametric form for the distribution of survival time.

Survival Distributions

Notation

Let $T$ denote a continuous non-negative random variable representing survival time, with probability density function (pdf) $f(t)$ and cumulative distribution function (cdf) $F(t) = \Pr\{T \le t\}$. We focus on the survival function $S(t) = \Pr\{T > t\}$, the probability of being alive at $t$, and the hazard function $\lambda(t) = f(t)/S(t)$. Let $\Lambda(t) = \int_0^t \lambda(u)\,du$ denote the cumulative (or integrated) hazard and recall that $S(t) = \exp\{-\Lambda(t)\}$.

Any distribution defined for $t \in [0, \infty)$ can serve as a survival distribution. We can also draft into service distributions defined for $y \in (-\infty, \infty)$ by considering $t = \exp\{y\}$, so that $y = \log t$.
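As a small illustration of the relation $S(t) = \exp\{-\Lambda(t)\}$, the following Python sketch integrates a hazard numerically and exponentiates the result. The function names and the constant rate 0.5 are illustrative assumptions, not part of the notes.

```python
import math

def cumulative_hazard(hazard, t, steps=1000):
    """Approximate Lambda(t), the integral of hazard(u) over [0, t], by the trapezoid rule."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h

def survival(hazard, t, steps=1000):
    """Survival function S(t) = exp(-Lambda(t))."""
    return math.exp(-cumulative_hazard(hazard, t, steps))

# Illustrative hazard (an assumption): a constant rate of 0.5 per unit time.
rate = 0.5
s2 = survival(lambda u: rate, 2.0)
```

With a constant hazard the result should agree with $\exp\{-0.5 \times 2\}$, since the trapezoid rule is exact for constant and linear integrands.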

More generally, we can start from a standard distribution in $(-\infty, \infty)$ and generate a family of survival distributions by introducing location and scale changes of the form

$$\log T = Y = \alpha + \sigma W.$$

We now review some of the most important distributions.

Exponential

The exponential distribution has constant hazard $\lambda(t) = \lambda$. Thus, the survivor function is $S(t) = \exp\{-\lambda t\}$ and the density is $f(t) = \lambda \exp\{-\lambda t\}$. It can be shown that $E(T) = 1/\lambda$ and $\mathrm{var}(T) = 1/\lambda^2$. Thus, the coefficient of variation is 1.

The exponential distribution is related to the extreme-value distribution. Specifically, $T$ has an exponential distribution with parameter $\lambda$, denoted $T \sim E(\lambda)$, iff

$$Y = \log T = \alpha + W,$$

where $\alpha = -\log \lambda$ and $W$ has a standard extreme value (min) distribution, with density

$$f_W(w) = e^{w - e^{w}}.$$

This is a unimodal density with $E(W) = -\gamma$, where $\gamma = 0.5772$ is Euler's constant, and $\mathrm{var}(W) = \pi^2/6$.
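The moments quoted for the standard extreme value (min) distribution can be checked numerically. The sketch below integrates $e^{w - e^{w}}$ with a plain trapezoid rule; the integration limits, step count and function names are assumptions chosen for illustration.

```python
import math

def extreme_value_pdf(w):
    """Standard extreme value (min) density f_W(w) = exp(w - e^w)."""
    return math.exp(w - math.exp(w))

def moment(g, lo=-40.0, hi=5.0, steps=45000):
    """Trapezoid-rule integral of g(w) * f_W(w) over [lo, hi].

    The limits are an assumption: the density is negligible outside them.
    """
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) * extreme_value_pdf(lo) + g(hi) * extreme_value_pdf(hi))
    for i in range(1, steps):
        w = lo + i * h
        total += g(w) * extreme_value_pdf(w)
    return total * h

mass = moment(lambda w: 1.0)               # should be close to 1
mean = moment(lambda w: w)                 # should be close to -0.5772
var = moment(lambda w: w * w) - mean ** 2  # should be close to pi^2 / 6
```

The same check could be done by simulation (taking logs of exponential draws), but numerical integration keeps the result deterministic.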

The skewness of $W$ is $-1.14$. The proof follows immediately from a change of variables.

Weibull

$T$ is Weibull with parameters $\lambda$ and $p$, denoted $T \sim W(\lambda, p)$, if $T^p \sim E(\lambda^p)$. The cumulative hazard is $\Lambda(t) = (\lambda t)^p$, the survivor function is $S(t) = \exp\{-(\lambda t)^p\}$, and the hazard is

$$\lambda(t) = p \lambda^p t^{p-1}.$$

The log of the Weibull hazard is a linear function of log time with constant $p \log \lambda + \log p$ and slope $p - 1$. Thus, the hazard is rising if $p > 1$, constant if $p = 1$, and declining if $p < 1$.

The Weibull is also related to the extreme-value distribution: $T \sim W(\lambda, p)$ iff

$$Y = \log T = \alpha + \sigma W,$$

where $W$ has the extreme value distribution, $\alpha = -\log \lambda$ and $p = 1/\sigma$.

The proof follows again from a change of variables; start from $W$ and change variables to $Y = \alpha + \sigma W$, and then change to $T = e^{Y}$.

Gompertz-Makeham

The Gompertz distribution is characterized by the fact that the log of the hazard is linear in $t$, so

$$\lambda(t) = \exp\{\alpha + \beta t\},$$

and is thus closely related to the Weibull distribution, where the log of the hazard is linear in $\log t$.
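The contrast between the two hazards can be seen in a few lines of Python: the Weibull log-hazard is linear in $\log t$ and the Gompertz log-hazard is linear in $t$. The parameter values and helper names below are illustrative assumptions.

```python
import math

def weibull_hazard(t, lam, p):
    """Weibull hazard p * lam^p * t^(p - 1)."""
    return p * lam ** p * t ** (p - 1)

def gompertz_hazard(t, alpha, beta):
    """Gompertz hazard exp(alpha + beta * t)."""
    return math.exp(alpha + beta * t)

def end_to_end_slope(xs, ys):
    """Slope of the line through the first and last points."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])

# Illustrative parameter values (assumptions, not estimates from data).
lam, p = 0.1, 1.5
alpha, beta = -9.0, 0.08
times = [1.0, 2.0, 4.0, 8.0]

# Weibull: log-hazard against log-time has slope p - 1.
weibull_slope = end_to_end_slope(
    [math.log(t) for t in times],
    [math.log(weibull_hazard(t, lam, p)) for t in times],
)
# At t = 1 (log t = 0) the log-hazard equals the constant p log(lam) + log(p).
weibull_constant = math.log(weibull_hazard(1.0, lam, p))

# Gompertz: log-hazard against time has slope beta.
gompertz_slope = end_to_end_slope(
    times, [math.log(gompertz_hazard(t, alpha, beta)) for t in times]
)
```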

In fact, the Gompertz is a log-Weibull distribution.

This distribution provides a remarkably close fit to adult mortality in contemporary developed countries.

Gamma

The gamma distribution with parameters $\lambda$ and $k$, denoted $\Gamma(\lambda, k)$, has density

$$f(t) = \frac{\lambda (\lambda t)^{k-1} e^{-\lambda t}}{\Gamma(k)},$$

and survivor function

$$S(t) = 1 - I_k(\lambda t),$$

where $I_k(x)$ is the incomplete gamma function, defined as

$$I_k(x) = \frac{1}{\Gamma(k)} \int_0^x u^{k-1} e^{-u}\,du.$$

There is no closed-form expression for the survival function, but there are excellent algorithms for its computation. (R has a function called pgamma that computes the cdf and survivor function. This function calls $k$ the shape parameter and $1/\lambda$ the scale parameter.)

There is no explicit formula for the hazard either, but this may be computed easily as the ratio of the density to the survivor function, $\lambda(t) = f(t)/S(t)$.
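A minimal Python sketch of this computation follows. It evaluates the incomplete gamma function by its power series, which is one simple choice among several and is an assumption of this sketch rather than the algorithm used by pgamma; it is adequate for moderate values of $\lambda t$ only. All function names are illustrative.

```python
import math

def incomplete_gamma(k, x, max_terms=1000, tol=1e-15):
    """Regularized lower incomplete gamma I_k(x), summed as a power series.

    I_k(x) = x^k e^(-x) / Gamma(k) * sum_n x^n / (k (k+1) ... (k+n)).
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    denom = k
    for _ in range(max_terms):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * tol:
            break
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_density(t, lam, k):
    """Gamma density lam (lam t)^(k-1) e^(-lam t) / Gamma(k)."""
    return lam * (lam * t) ** (k - 1) * math.exp(-lam * t) / math.gamma(k)

def gamma_survival(t, lam, k):
    """Survivor function S(t) = 1 - I_k(lam t)."""
    return 1.0 - incomplete_gamma(k, lam * t)

def gamma_hazard(t, lam, k):
    """Hazard as the ratio of the density to the survivor function."""
    return gamma_density(t, lam, k) / gamma_survival(t, lam, k)
```

For $k = 1$ the hazard should equal $\lambda$ at every $t$, and for $k = 2$ the survivor function has the known closed form $e^{-\lambda t}(1 + \lambda t)$, which gives a convenient check. In the far right tail the subtraction $1 - I_k$ loses precision, so a production routine would compute the upper tail directly.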

The gamma hazard increases monotonically if $k > 1$, from a value of 0 at the origin to a maximum of $\lambda$; is constant if $k = 1$; and decreases monotonically if $k < 1$, from $\infty$ at the origin to an asymptotic value of $\lambda$.

If $k = 1$ the gamma reduces to the exponential distribution, which can be described as the waiting time to one hit in a Poisson process. If $k$ is an integer $k > 1$ then the gamma distribution is called the Erlang distribution and can be characterized as the waiting time to $k$ hits in a Poisson process. The distribution exists for non-integer $k$ as well.

The gamma distribution can also be characterized in terms of the distribution of log-time. By a simple change of variables one can show that $T \sim \Gamma(\lambda, k)$ iff

$$\log T = Y = \alpha + W,$$

where $W$ has a generalized extreme-value distribution with density

$$f_W(w) = \frac{e^{kw - e^{w}}}{\Gamma(k)},$$

controlled by a parameter $k$.

This density reduces to the ordinary extreme value distribution when $k = 1$.

Generalized Gamma

Stacy has proposed a generalized gamma distribution that fits neatly in the scheme we are developing, as it simply adds a scale parameter in the expression for $\log T$, so that

$$Y = \log T = \alpha + \sigma W,$$

where $W$ has a generalized extreme value distribution with parameter $k$. The density of the generalized gamma distribution can be written as

$$f(t) = \frac{\lambda p (\lambda t)^{pk-1} e^{-(\lambda t)^p}}{\Gamma(k)},$$

where $p = 1/\sigma$.

The generalized gamma includes the following interesting special cases: gamma, when $p = 1$; Weibull, when $k = 1$; and exponential, when $p = 1$ and $k = 1$. It also includes the log-normal as a special limiting case when $k \to \infty$.

Log-Normal

$T$ has a lognormal distribution iff

$$Y = \log T = \alpha + \sigma W,$$

where $W$ has a standard normal distribution.

The hazard function of the log-normal distribution increases from 0 to reach a maximum and then decreases monotonically, approaching 0 as $t \to \infty$.
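The special cases of the generalized gamma listed above are easy to confirm numerically. The following Python sketch compares the generalized gamma density with the gamma, Weibull and exponential densities at a few time points; the parameter values, time points and function names are illustrative assumptions.

```python
import math

def generalized_gamma_density(t, lam, p, k):
    """f(t) = lam p (lam t)^(pk - 1) exp(-(lam t)^p) / Gamma(k)."""
    x = lam * t
    return lam * p * x ** (p * k - 1) * math.exp(-(x ** p)) / math.gamma(k)

def gamma_density(t, lam, k):
    """Gamma density lam (lam t)^(k-1) e^(-lam t) / Gamma(k)."""
    return lam * (lam * t) ** (k - 1) * math.exp(-lam * t) / math.gamma(k)

def weibull_density(t, lam, p):
    """Weibull density: hazard p lam^p t^(p-1) times survivor exp(-(lam t)^p)."""
    return p * lam ** p * t ** (p - 1) * math.exp(-((lam * t) ** p))

def exponential_density(t, lam):
    """Exponential density lam e^(-lam t)."""
    return lam * math.exp(-lam * t)

lam = 0.5                      # illustrative rate
times = [0.5, 1.0, 2.0, 5.0]   # illustrative evaluation points

# p = 1 should give the gamma, k = 1 the Weibull, and p = k = 1 the exponential.
gamma_gap = max(
    abs(generalized_gamma_density(t, lam, 1.0, 2.5) - gamma_density(t, lam, 2.5))
    for t in times
)
weibull_gap = max(
    abs(generalized_gamma_density(t, lam, 1.7, 1.0) - weibull_density(t, lam, 1.7))
    for t in times
)
exponential_gap = max(
    abs(generalized_gamma_density(t, lam, 1.0, 1.0) - exponential_density(t, lam))
    for t in times
)
```

All three gaps should be zero up to floating-point rounding.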

As $k \to \infty$ the generalized extreme value distribution approaches a standard normal, and thus the generalized gamma approaches a log-normal.

Log-Logistic

$T$ has a log-logistic distribution iff

$$Y = \log T = \alpha + \sigma W,$$

where $W$ has a standard logistic distribution, with pdf

$$f_W(w) = \frac{e^{w}}{(1 + e^{w})^2},$$

and cdf

$$F_W(w) = \frac{e^{w}}{1 + e^{w}}.$$

The survivor function is the complement

$$S_W(w) = \frac{1}{1 + e^{w}}.$$

Changing variables to $T$ we find that the log-logistic survivor function is

$$S(t) = \frac{1}{1 + (\lambda t)^p},$$

where we have written, as usual, $\alpha = -\log \lambda$ and $p = 1/\sigma$. Taking logs we obtain the (negative) integrated hazard, and differentiating with respect to $t$ we find the hazard function

$$\lambda(t) = \frac{\lambda p (\lambda t)^{p-1}}{1 + (\lambda t)^p}.$$

Note that the logit of the survival function $S(t)$ is linear in $\log t$. This fact provides a diagnostic plot: if you have a non-parametric estimate of the survivor function you can plot its logit against log-time; if the graph looks like a straight line then the survivor function is log-logistic.

The hazard itself is monotone decreasing from $\infty$ if $p < 1$, monotone decreasing from $\lambda$ if $p = 1$, and similar to the log-normal if $p > 1$.

Generalized F

Kalbfleisch and Prentice (1980) consider the more general case where

$$Y = \log T = \alpha + \sigma W$$

and $W$ is distributed as the log of an F-variate (which adds two more parameters).
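Returning to the log-logistic results above, both the hazard formula and the linearity of the logit in log-time can be checked with a short Python sketch. The parameter values, time points and helper names are illustrative assumptions.

```python
import math

def loglogistic_survival(t, lam, p):
    """Log-logistic survivor function S(t) = 1 / (1 + (lam t)^p)."""
    return 1.0 / (1.0 + (lam * t) ** p)

def loglogistic_hazard(t, lam, p):
    """Log-logistic hazard lam p (lam t)^(p-1) / (1 + (lam t)^p)."""
    return lam * p * (lam * t) ** (p - 1) / (1.0 + (lam * t) ** p)

def logit(s):
    """Log-odds of a probability s."""
    return math.log(s / (1.0 - s))

lam, p = 0.2, 2.0                    # illustrative parameters
times = [0.5, 1.0, 2.0, 4.0, 8.0]    # illustrative time points

# Diagnostic: logit of S(t) against log t should be a straight line of slope -p.
points = [(math.log(t), logit(loglogistic_survival(t, lam, p))) for t in times]
slopes = [
    (y1 - y0) / (x1 - x0)
    for (x0, y0), (x1, y1) in zip(points, points[1:])
]

# The hazard is minus the derivative of log S(t); check by central differences.
t0, eps = 3.0, 1e-6
numeric_hazard = -(
    math.log(loglogistic_survival(t0 + eps, lam, p))
    - math.log(loglogistic_survival(t0 - eps, lam, p))
) / (2 * eps)
```

With real data one would replace the exact survivor function by a non-parametric estimate and inspect the plot for departures from linearity.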

The interesting thing about the generalized F distribution is that it includes all of the above distributions as special or limiting cases, and is therefore useful for testing different parametric forms.

The Coale-McNeil Model

The Coale-McNeil model of first marriage frequencies among women who will eventually marry is closely related to the extreme value and gamma distributions.

The model assumes that the density of first marriages at age $a$ among women who will eventually marry is given by

$$g(a) = \frac{1}{k}\, g_0\!\left(\frac{a - a_0}{k}\right),$$

where $a_0$ and $k$ are location and scale parameters and $g_0(\cdot)$ is a standard schedule based on Swedish data. This standard schedule was first derived empirically, but later Coale and McNeil showed that it could be closely approximated by the following analytic expression:

$$g_0(z) = 0.1946\, e^{-0.174 (z - 6.06) - e^{-0.288 (z - 6.06)}}.$$

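It will be convenient to write a somewhat more general model with three parameters:

$$g(x) = \frac{\lambda}{\Gamma(\alpha/\lambda)}\, e^{-\alpha (x - \theta) - e^{-\lambda (x - \theta)}}.$$

This is a form of extreme value distribution. In fact, if $\alpha = \lambda$ it reduces to the standard extreme value distribution that we discussed before. This more general case is known as a (reversed) generalized extreme value.

The mean of this distribution is

$$\mu = \theta - \frac{1}{\lambda}\, \psi(\alpha/\lambda),$$

where $\psi(x) = \Gamma'(x)/\Gamma(x)$ is the digamma function (or derivative of the log of the gamma function).

The Swedish standard derived by Coale and McNeil corresponds to the case $\alpha = 0.174$, $\lambda = 0.288$, and $\theta = 6.06$, which gives a mean of $\mu = 11.36$.

By a simple change of variables, it can be seen that the more general case with parameters $a_0$ and $k$ corresponds to

$$\alpha^* = 0.174/k, \quad \lambda^* = 0.288/k, \quad \text{and} \quad \theta^* = a_0 + 6.06\,k.$$

Thus, $X$ has the (more general) Coale-McNeil distribution with parameters $\alpha$, $\lambda$ and $\theta$ iff

$$X = \theta - \frac{1}{\lambda} \log Y,$$

where $Y$ has a gamma distribution with shape parameter $p = \alpha/\lambda$.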

In other words, age at marriage is distributed as a linear function of the logarithm of a gamma random variable. In particular, the Swedish standard can be obtained as

$$X = 6.06 - \frac{1}{0.288} \log Y,$$

where $Y$ is gamma with $p = \alpha/\lambda = 0.174/0.288 = 0.604$. The case with parameters $a_0$ and $k$ can be obtained as

$$X = a_0 + 6.06\,k - \frac{k}{0.288} \log Y,$$

where $Y$ is again gamma with $p = 0.604$.

The Coale-McNeil model holds the ratio $p = \alpha/\lambda$ fixed at 0.604, but along the way we have generalized the model and could entertain the notion of estimating $p$ rather than holding it fixed.

The main significance of these results is computational: we can calculate marriage schedules as long as we have a function to compute the incomplete gamma function (or even chi-squared), and we can fit nuptiality models using software for fitting gamma models.

For further details see my 1980 paper with Trussell.
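To make the computational point concrete, here is a hedged Python sketch that obtains the Coale-McNeil cdf from the incomplete gamma function. It relies on the gamma representation: if $X$ equals a location parameter minus $(1/\lambda)\log Y$ with $Y$ gamma with shape $p = \alpha/\lambda$, then $\Pr\{X \le x\} = 1 - I_p(y)$, where $y$ is the exponential of $-\lambda$ times the distance of $x$ from the location. The values 0.174, 0.288 and 6.06 are the commonly quoted Swedish-standard constants and are treated here as assumptions, as are the series algorithm and all names.

```python
import math

# Assumed Swedish-standard constants (alpha, lambda, location).
ALPHA, LAM, THETA = 0.174, 0.288, 6.06

def incomplete_gamma(k, x, max_terms=1000, tol=1e-15):
    """Regularized lower incomplete gamma I_k(x) by power series (moderate x only)."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    denom = k
    for _ in range(max_terms):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * tol:
            break
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def coale_mcneil_density(x, alpha=ALPHA, lam=LAM, theta=THETA):
    """Three-parameter density with normalizing constant lam / Gamma(alpha / lam)."""
    z = x - theta
    return lam / math.gamma(alpha / lam) * math.exp(-alpha * z - math.exp(-lam * z))

def coale_mcneil_cdf(x, alpha=ALPHA, lam=LAM, theta=THETA):
    """Pr{X <= x} = 1 - I_p(exp(-lam (x - theta))), with p = alpha / lam."""
    y = math.exp(-lam * (x - theta))
    return 1.0 - incomplete_gamma(alpha / lam, y)

# Consistency check: the derivative of the cdf should reproduce the density.
x0, eps = 10.0, 1e-5
numeric_density = (coale_mcneil_cdf(x0 + eps) - coale_mcneil_cdf(x0 - eps)) / (2 * eps)
normalizing_constant = LAM / math.gamma(ALPHA / LAM)
```

Under these assumed constants the normalizing constant comes out near 0.1946. For ages far below the location parameter the series argument becomes large, so a production version would call a library incomplete gamma routine instead.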

