
Survival Distributions, Hazard Functions, Cumulative Hazards



BIO 244: Unit 1. Survival Distributions, Hazard Functions

Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution of a "survival time" random variable, apply these to several common parametric families, and discuss how observations of survival times can be […]. Here $T$ is a non-negative random variable representing the time until some event of interest. For example, $T$ might denote:

- the time from diagnosis of a disease until death,
- the time between administration of a vaccine and development of an infection,
- the time from the start of treatment of a symptomatic disease and the suppression of […].

We shall assume that $T$ is continuous unless we specify otherwise. The probability density function (pdf) and cumulative distribution function (cdf) are most commonly used to characterize the distribution of any random variable, and we shall denote these by $f(\cdot)$ and $F(\cdot)$, respectively:

pdf: $f(t)$
cdf: $F(t) = P(T \le t)$, so that $F(0) = P(T = 0)$

Because $T$ is non-negative and usually denotes the elapsed time until an event, it is commonly characterized in other ways as well:

Survivor function:

$$S(t) \overset{\text{def}}{=} 1 - F(t) = P(T > t) \quad \text{for } t > 0.$$

The survivor function simply indicates the probability that the event of interest has not yet occurred by time $t$; thus, if $T$ denotes time until death, $S(t)$ denotes the probability of surviving beyond time $t$.

Note that, for an arbitrary $T$, $F(\cdot)$ and $S(\cdot)$ as defined above are right continuous in $t$.

For continuous survival time $T$, both functions are continuous in $t$. However, even when $F(\cdot)$ and $S(\cdot)$ are continuous, the nonparametric estimators, say $\hat{F}(\cdot)$ and $\hat{S}(\cdot)$, of these that we will consider are discrete distributions. For example, $\hat{F}(\cdot)$ might be the cdf corresponding to the discrete distribution that places mass $m_1, m_2, \ldots, m_k$ at certain times $\tau_1, \tau_2, \ldots, \tau_k$. Thus, even though $F(\cdot)$ is continuous, its estimator $\hat{F}(\cdot)$ is (only) right continuous, and thus its value at a certain time point, say $\tau_2$, will be $m_1 + m_2$ if we define the cdf to be right continuous (but equal to $m_1$ if we had defined the cdf to be left continuous).

Hazard function:

$$h(t) \overset{\text{def}}{=} \lim_{h \downarrow 0} \frac{P[t \le T < t + h \mid T \ge t]}{h} = \frac{f(t)}{S(t-)}, \quad \text{with } S(t-) = \lim_{s \uparrow t} S(s).$$

That is, the hazard function is a conditional density, given that the event in question has not yet occurred prior to time $t$. Note that for continuous $T$,

$$h(t) = -\frac{d}{dt} \ln[1 - F(t)] = -\frac{d}{dt} \ln S(t).$$

Cumulative hazard function:

$$H(t) \overset{\text{def}}{=} \int_0^t h(u)\,du = -\ln[1 - F(t)] = -\ln S(t), \quad t > 0.$$

Note that $S(t) = e^{-H(t)}$ and $f(t) = h(t)\,e^{-H(t)}$.
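The relationships among $h$, $H$, $S$ and $f$ can be checked numerically. The sketch below assumes a hypothetical linear hazard $h(t) = a + bt$ (the values of `a` and `b` are illustrative and not part of the notes), integrates it to get $H(t)$, and then recovers $h(t)$ from $-\frac{d}{dt}\ln S(t)$.

```python
import math

def hazard(t, a=0.1, b=0.05):
    """Hypothetical linear hazard h(t) = a + b*t (an assumption for illustration)."""
    return a + b * t

def cumulative_hazard(t, n_steps=1000):
    """H(t) = integral of h(u) du over [0, t], computed by the trapezoid rule."""
    if t <= 0:
        return 0.0
    du = t / n_steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, n_steps):
        total += hazard(i * du)
    return total * du

def survivor(t):
    """S(t) = exp(-H(t))."""
    return math.exp(-cumulative_hazard(t))

def density(t):
    """f(t) = h(t) * exp(-H(t))."""
    return hazard(t) * survivor(t)

def hazard_from_survivor(t, eps=1e-4):
    """Recover h(t) = -d/dt ln S(t) with a central difference."""
    return -(math.log(survivor(t + eps)) - math.log(survivor(t - eps))) / (2 * eps)

t = 3.0
closed_form_H = 0.1 * t + 0.05 * t ** 2 / 2  # a*t + b*t^2/2 for the assumed hazard
print("H(t): numeric", cumulative_hazard(t), "closed form", closed_form_H)
print("h(t): direct", hazard(t), "from -d/dt ln S", hazard_from_survivor(t))
print("f(t) = h(t) S(t):", density(t))
```

Any other non-negative hazard could be substituted for `hazard`; only the closed-form comparison is specific to the linear choice.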

Note 1: Note that $h(t)\,dt = f(t)\,dt / S(t) \approx \text{pr}[\text{fail in } [t, t + dt) \mid \text{survive until } t]$. Thus, the hazard function might be of more intrinsic interest than the pdf to a patient who had survived a certain time period and wanted to know something about their prognosis.

Note 2: There are several reasons why it is useful to introduce the quantities $h(t)$ and $H(t)$:

- Interpretability: Suppose $T$ denotes time from surgery for breast cancer until recurrence. Then when a patient who had received surgery visits her physician, she would be more interested in conditional probabilities such as "Given that I haven't had a recurrence yet, what are my chances of having one in the next year?" than in unconditional probabilities (as described by the pdf).
- Analytic Simplifications: When the data are subject to right censoring, hazard function representations often lead to easier analyses. For example, imagine assembling a cohort of $N$ patients who have just turned 50 years of age and then following them for 1 year.

Then if $d$ of the patients die during the year of follow-up, the ratio $d/N$ estimates the (discrete) hazard function of $T$ = age at death. We will see that $H(\cdot)$ has nice analytical properties.

- Modeling Simplifications: For many biomedical phenomena, $T$ is such that $h(t)$ varies rather slowly in $t$. Thus, $h(\cdot)$ is well-suited for modeling.

Note 3: It is useful to think about real phenomena and how their hazard functions might be shaped. For example, if $T$ denotes the age of a car when it first has a serious engine problem, then one might expect the corresponding hazard function $h(t)$ to be increasing in $t$; that is, the conditional probability of a serious engine problem in the next month, given no problem so far, will increase with the life of the car. In contrast, if one were studying infant mortality in a region of the world where there was poor nutrition, one might expect $h(t)$ to be decreasing during the first year of life.

This is known to be due to selection during the first year of life. Finally, in some applications (such as when $T$ is the lifetime of a light bulb or the time until you win a BIG lottery), the hazard function will be approximately constant in $t$. This means that the chance of failure in the next short time interval, given that failure hasn't yet occurred, does not change with $t$; e.g., a 1-month old bulb has the same probability of burning out in the next week as does a 5-year old bulb. As we will see below, this "lack of aging" or "memoryless" property uniquely defines the exponential distribution, which plays a central role in survival analysis.

The hazard function may assume a more complex form. For example, if $T$ denotes the age of death, then the hazard function $h(t)$ is expected to be decreasing at first and then gradually increasing in the end, reflecting the higher hazard of infants and the elderly.

1.2 Common Families of Survival Distributions

Exponential Distribution: denoted $T \sim \text{Exp}(\lambda)$.

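For $t > 0$ and $\lambda > 0$ (scale parameter):

- $f(t) = \lambda e^{-\lambda t}$
- $F(t) = 1 - e^{-\lambda t}$
- $S(t) = e^{-\lambda t}$
- $h(t) = \lambda$ (constant hazard function)
- $H(t) = \lambda t$
- characteristic function: $\phi(u) = E[e^{iuT}] = \dfrac{\lambda}{\lambda - iu}$
- $E[T^r] = \left. \dfrac{\phi^{(r)}(u)}{i^r} \right|_{u=0}$
- $E(T) = \dfrac{1}{\lambda}$
- $V(T) = \dfrac{1}{\lambda^2}$
- "Lack of Memory": $P[T > t] = P[T > t + t_0 \mid T > t_0]$ for any $t_0 > 0$ (the probability of surviving another $t$ time units does not depend on how long you've lived so far).
- Also, the exponential family is closed to scale changes; that is: $T \sim \text{Exp}(\lambda),\ c > 0 \Rightarrow c \cdot T \sim \text{Exp}(\lambda / c)$.

2-Parameter Gamma Distribution:

The 2-parameter gamma distribution, which is denoted $G(\alpha, \lambda)$, can be viewed as a generalization of the exponential distribution. It arises naturally (that is, there are real-life phenomena for which an associated survival distribution is approximately gamma) as well as analytically (that is, simple functions of random variables have a gamma distribution).

$$f(t) = \frac{\lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}}{\Gamma(\alpha)} \quad \text{for } t > 0,$$

with parameters $\lambda > 0$ and $\alpha > 0$, where $\Gamma(\alpha)$ is the gamma function.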

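$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t}\,dt$$

- characteristic function: $\phi(u) = \left( \dfrac{\lambda}{\lambda - iu} \right)^{\alpha}$
- $E(T) = \dfrac{\alpha}{\lambda}$
- $V(T) = \dfrac{\alpha}{\lambda^2}$
- $G(1, \lambda) = \text{Exp}(\lambda)$
- $T_1 \sim G(\alpha_1, \lambda)$, $T_2 \sim G(\alpha_2, \lambda)$, $T_1 \perp T_2 \Rightarrow T_1 + T_2 \sim G(\alpha_1 + \alpha_2, \lambda)$
- if $\alpha = k/2$ ($k$ an integer), then $2 \lambda T \sim \chi^2_k$

The following plot shows the shape of the gamma hazard function for different values of the shape parameter $\alpha$. The case $\alpha = 1$ corresponds to the exponential distribution (constant hazard function). When $\alpha$ is greater than 1, the hazard function is concave and increasing. When it is less than one, the hazard function is convex and decreasing.

[Figure: gamma hazard function $h(t)$ for $\alpha > 1$, $\alpha = 1$, and $\alpha < 1$.]

Weibull Distribution:

The Weibull distribution can also be viewed as a generalization of the exponential distribution, and is denoted $W(p, \lambda)$. It is defined as follows, for $t > 0$, $\lambda > 0$ (scale) and $p > 0$ (shape):

- $F(t) = 1 - e^{-(\lambda t)^p}$
- $f(t) = p \lambda^p t^{p - 1} e^{-(\lambda t)^p}$
- $h(t) = p \lambda^p t^{p - 1}$ (a power of $t$)
- $H(t) = (\lambda t)^p$

As shown in the following plot of its hazard function, the Weibull distribution reduces to the exponential distribution when the shape parameter $p$ equals 1. When $p > 1$, the hazard function is increasing; when $p < 1$ it is decreasing.

[Figure: Weibull hazard function $h(t)$.]

The following properties of the Weibull distribution are easily verified.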
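Before turning to those properties, a short numerical sketch may help make the Weibull formulas concrete. It codes $h(t)$, $H(t)$ and $S(t)$ directly from the definitions above and checks the direction of the hazard for several shapes, together with the scale-change and power-transform properties stated for this family. The parameter values are arbitrary illustrations, not values from the notes.

```python
import math

def weibull_hazard(t, p, lam):
    """h(t) = p * lam^p * t^(p-1)."""
    return p * lam ** p * t ** (p - 1)

def weibull_cum_hazard(t, p, lam):
    """H(t) = (lam * t)^p."""
    return (lam * t) ** p

def weibull_survivor(t, p, lam):
    """S(t) = exp(-H(t))."""
    return math.exp(-weibull_cum_hazard(t, p, lam))

lam = 0.5  # illustrative scale parameter
for p in (0.5, 1.0, 2.0):
    h1, h2 = weibull_hazard(1.0, p, lam), weibull_hazard(2.0, p, lam)
    trend = "increasing" if h2 > h1 else "decreasing" if h2 < h1 else "constant"
    print(f"p = {p}: h(1) = {h1:.4f}, h(2) = {h2:.4f} ({trend})")

# Scale change: P(cT > t) = S_T(t / c) should equal the W(p, lam / c) survivor at t.
p, c, t = 2.0, 3.0, 1.7
print(weibull_survivor(t / c, p, lam), weibull_survivor(t, p, lam / c))

# Power transform: P(T^p > s) = S_T(s^(1/p)) should equal exp(-lam^p * s),
# the survivor function of Exp(lam^p).
s = 0.9
print(weibull_survivor(s ** (1 / p), p, lam), math.exp(-(lam ** p) * s))
```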

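- $T \sim W(p, \lambda),\ c > 0 \Rightarrow cT \sim W(p, \lambda / c)$
- $T \sim W(p, \lambda) \Rightarrow T^p \sim \text{Exp}(\lambda^p)$
- $W(1, \lambda) = \text{Exp}(\lambda)$

Note: The Weibull distribution is sometimes parameterized as $H(t) = \lambda t^p$ instead of $H(t) = (\lambda t)^p$, in which case the expressions and properties above take on a somewhat different form.

Log-normal Distribution:

The log-normal distribution is denoted $LN(\mu, \sigma^2) \sim \exp\{N(\mu, \sigma^2)\}$. It is defined as follows:

- $F(t) = \Phi\left( \dfrac{\log(t) - \mu}{\sigma} \right)$
- $f(t) = \dfrac{1}{t \sigma}\, \phi\left( \dfrac{\log(t) - \mu}{\sigma} \right)$
- $h(t) = f(t) / S(t) = f(t) / [1 - F(t)]$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution.

The following property of the log-normal distribution is easily verified. For $k = 1, 2, \ldots$,

$$E(T^k) = e^{k \mu + k^2 \sigma^2 / 2}.$$

Generalized Gamma Distribution:

The generalized gamma distribution can also be viewed as a generalization of the exponential, Weibull and gamma distributions, and is denoted $GG(\alpha, p, \lambda)$. It is defined as follows, for $t > 0$, $\lambda > 0$, $p > 0$ and $\alpha > 0$:

- $F(t) = \dfrac{\gamma\{\alpha / p, (\lambda t)^p\}}{\Gamma(\alpha / p)}$
- $f(t) = \dfrac{p \lambda (\lambda t)^{\alpha - 1} e^{-(\lambda t)^p}}{\Gamma(\alpha / p)}$
- $h(t) = \dfrac{f(t)}{S(t)} = \dfrac{p \lambda (\lambda t)^{\alpha - 1} e^{-(\lambda t)^p}}{\Gamma(\alpha / p) - \gamma\{\alpha / p, (\lambda t)^p\}}$

where $\gamma(s, x) = \int_0^x t^{s - 1} e^{-t}\,dt$ is the incomplete gamma function.

The following properties of the generalized gamma distribution are easily verified.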
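Before listing those properties, here is a minimal numerical sketch of the generalized gamma density as defined above. It checks, at one illustrative point, that the density matches the gamma density when $p = 1$ and the Weibull density when $\alpha = p$, and it compares a numerically integrated moment with the closed form $E(T^k) = \Gamma((\alpha + k)/p) / (\lambda^k \Gamma(\alpha / p))$. The parameter values, truncation point `upper` and step count are assumptions chosen for illustration.

```python
import math

def gg_density(t, alpha, p, lam):
    """f(t) = p*lam*(lam*t)^(alpha-1) * exp(-(lam*t)^p) / Gamma(alpha/p)."""
    return (p * lam * (lam * t) ** (alpha - 1)
            * math.exp(-((lam * t) ** p)) / math.gamma(alpha / p))

def gamma_density(t, alpha, lam):
    """G(alpha, lam) density: lam^alpha * t^(alpha-1) * exp(-lam*t) / Gamma(alpha)."""
    return lam ** alpha * t ** (alpha - 1) * math.exp(-lam * t) / math.gamma(alpha)

def weibull_density(t, p, lam):
    """W(p, lam) density: p * lam^p * t^(p-1) * exp(-(lam*t)^p)."""
    return p * lam ** p * t ** (p - 1) * math.exp(-((lam * t) ** p))

def gg_moment_numeric(k, alpha, p, lam, upper=60.0, n_steps=100000):
    """E(T^k) by the midpoint rule on (0, upper); upper is an assumed truncation point."""
    dt = upper / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt
        total += t ** k * gg_density(t, alpha, p, lam)
    return total * dt

def gg_moment_closed_form(k, alpha, p, lam):
    """E(T^k) = Gamma((alpha + k)/p) / (lam^k * Gamma(alpha/p))."""
    return math.gamma((alpha + k) / p) / (lam ** k * math.gamma(alpha / p))

t, lam = 1.3, 0.5
print(gg_density(t, 2.0, 1.0, lam), gamma_density(t, 2.0, lam))    # p = 1: gamma
print(gg_density(t, 1.5, 1.5, lam), weibull_density(t, 1.5, lam))  # alpha = p: Weibull
print(gg_moment_numeric(1, 2.0, 1.5, lam), gg_moment_closed_form(1, 2.0, 1.5, lam))
```

The midpoint rule is used only because it needs nothing beyond the standard library; the chosen shape $\alpha = 2$ keeps the integrand bounded at the origin.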

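- For $k = 1, 2, \ldots$: $E(T^k) = \dfrac{\Gamma((\alpha + k)/p)}{\lambda^k\, \Gamma(\alpha / p)}$
- If $p = 1$, $GG(\alpha, 1, \lambda) \sim G(\alpha, \lambda)$
- If $\alpha = p$, $GG(p, p, \lambda) \sim W(p, \lambda)$
- If $\alpha = p = 1$, $GG(1, 1, \lambda) \sim \text{Exp}(\lambda)$

Note: The generalized gamma distribution can be used to test the adequacy of the commonly used gamma, Weibull and exponential distributions, since they are all nested within the generalized gamma distribution.

Some Properties of Survival Time Random Variables

- $T_1, T_2, \ldots, T_n$ i.i.d. $\text{Exp}(\lambda) \Rightarrow T_1 + T_2 + \cdots + T_n \sim G(n, \lambda)$ and $2 \lambda (T_1 + \cdots + T_n) \sim \chi^2_{2n}$.
- Suppose $T_1, T_2, \ldots, T_n$ are i.i.d. $\text{Exp}(\lambda)$, and let $T_{(1)}, T_{(2)}, \ldots, T_{(n)}$ denote the corresponding order statistics. For $i = 1, 2, \ldots, n$, define

$$Z_i = (n - i + 1)\,[T_{(i)} - T_{(i-1)}], \quad \text{where } T_{(0)} = 0.$$

That is,

$$Z_1 = n T_{(1)}, \quad Z_2 = (n - 1)[T_{(2)} - T_{(1)}], \quad \ldots, \quad Z_n = T_{(n)} - T_{(n-1)}.$$

$Z_1, Z_2, \ldots, Z_n$ are sometimes called "Normalized Spacings". Imagine the window in time extending from $t = 0$ until $t = T_{(1)}$. The total amount of lifetime observed during this window is $n T_{(1)}$, since all $n$ subjects are alive throughout this time period. This is just $Z_1$.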

Next consider the window extending from $t = T_{(1)}$ to $T_{(2)}$. The total observed time in this window is $(n - 1)[T_{(2)} - T_{(1)}]$, since $n - 1$ subjects survive through this window. This is just $Z_2$. Finally, the total observed time in the window extending from $t = T_{(n-1)}$ to $T_{(n)}$ is just $[T_{(n)} - T_{(n-1)}] = Z_n$, since only 1 subject passes through this window. The normalized spacings have an interpretation in terms of accumulated lifetime observed in specific cross-sectional views of the data. When the original $T_i$ are i.i.d. $\text{Exp}(\lambda)$ random variables, it can be shown that $Z_1, Z_2, \ldots, Z_n$ are also i.i.d. $\text{Exp}(\lambda)$ (Exercise 4). That $Z_1$, which is $n$ times the gap until the first failure, and $Z_n$, which is the gap between the next-to-last and last failure, have the same distribution speaks to the right tail of the exponential distribution.

- Poisson Process with parameter $\lambda$: $N(t)$ = # events occurring in $(0, t) \sim \text{Pois}(\lambda t)$. Define $T_i$ = time between the $(i-1)$st and $i$th events, $i = 1, 2, \ldots$
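The normalized-spacings result discussed above can be explored by simulation. The sketch below computes $Z_i = (n - i + 1)[T_{(i)} - T_{(i-1)}]$ for a sample, confirms on a small hand-checkable example that the spacings add up to the total observed lifetime, and then averages each $Z_i$ over many simulated exponential samples. The sample size, rate, replication count and seed are illustrative assumptions; agreement of the averages with $1/\lambda$ is only a rough consistency check, not a proof that the $Z_i$ are i.i.d. $\text{Exp}(\lambda)$.

```python
import random

def normalized_spacings(times):
    """Return Z_i = (n - i + 1) * (T_(i) - T_(i-1)) for i = 1..n, with T_(0) = 0."""
    ordered = sorted(times)
    n = len(ordered)
    spacings = []
    previous = 0.0  # T_(0) = 0
    for i, t in enumerate(ordered, start=1):
        spacings.append((n - i + 1) * (t - previous))
        previous = t
    return spacings

def simulate_spacing_means(n=5, lam=2.0, reps=20000, seed=0):
    """Average each Z_i over `reps` simulated samples of n i.i.d. Exp(lam) times."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(reps):
        times = [rng.expovariate(lam) for _ in range(n)]
        for j, z in enumerate(normalized_spacings(times)):
            sums[j] += z
    return [s / reps for s in sums]

# Small worked example: the spacings sum to the total observed lifetime.
sample = [0.7, 0.2, 1.5]
z = normalized_spacings(sample)
print(z, sum(z), sum(sample))

# Each simulated mean should be close to 1 / lam = 0.5 if the Z_i share the Exp(lam) mean.
print(simulate_spacing_means())
```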

