
Tutorial Paper Survival Analysis Part I: Basic concepts ...

Tutorial Paper
Survival Analysis Part I: Basic concepts and first analyses
TG Clark*,1, MJ Bradburn1, SB Love1 and DG Altman1
1 Cancer Research UK/NHS Centre for Statistics in Medicine, Institute of Health Sciences, University of Oxford, Old Road, Oxford OX3 7LF, UK
British Journal of Cancer (2003) 89, 232–238. Cancer Research UK
Keywords: survival analysis; statistical methods; Kaplan-Meier

INTRODUCTION
In many cancer studies, the main outcome under assessment is the time to an event of interest. The generic name for the time is survival time, although it may be applied to the time survived from complete remission to relapse or progression just as it may be applied to the time from diagnosis to death. If the event occurred in all individuals, many methods of analysis would be applicable. However, it is usual that at the end of follow-up some of the individuals have not had the event of interest, and thus their true time to event is unknown.




Further, survival data are rarely Normally distributed, but are skewed and typically comprise many early events and relatively few late ones. It is these features of the data that make the special methods called survival analysis necessary.

This paper is the first of a series of four articles that aim to introduce and explain the basic concepts of survival analysis. Most survival analyses in cancer journals use some or all of Kaplan-Meier (KM) plots, logrank tests, and Cox (proportional hazards) regression. We will discuss the background to, and interpretation of, each of these methods but also other approaches to analysis that deserve to be used more often. In this first article, we will present the basic concepts of survival analysis, including how to produce and interpret survival curves, and how to quantify and test survival differences between two or more groups of patients. Future papers in the series cover multivariate analysis and the last paper introduces some more advanced concepts in a brief question and answer format.

More detailed accounts of these methods can be found in books written specifically about survival analysis, for example, Collett (1994), Parmar and Machin (1995) and Kleinbaum (1996). In addition, individual references for the methods are presented throughout the series. Several introductory texts also describe the basis of survival analysis, for example, Altman (2003) and Piantadosi (1997).

TYPES OF EVENT IN CANCER STUDIES
In many medical studies, time to death is the event of interest. However, in cancer, another important measure is the time between response to treatment and recurrence, or relapse-free survival time (also called disease-free survival time). It is important to state what the event is and when the period of observation starts and finishes. For example, we may be interested in relapse in the time period between a confirmed response and the first relapse.

WHAT MAKES SURVIVAL ANALYSIS DIFFERENT
The specific difficulties relating to survival analysis arise largely from the fact that only some individuals have experienced the event and, subsequently, survival times will be unknown for a subset of the study group.

This phenomenon is called censoring and it may arise in the following ways: (a) a patient has not (yet) experienced the relevant outcome, such as relapse or death, by the time of the close of the study; (b) a patient is lost to follow-up during the study period; (c) a patient experiences a different event that makes further follow-up impossible. Such censored survival times underestimate the true (but unknown) time to event. Visualising the survival process of an individual as a time-line, their event (assuming it were to occur) is beyond the end of the follow-up period. This situation is often called right censoring.

Censoring can also occur if we observe the presence of a state or condition but do not know when it began. For example, consider a study investigating the time to recurrence of a cancer following surgical removal of the primary tumour. If the patients were examined 3 months after surgery to determine recurrence, then those who had a recurrence would have a survival time that was left censored because the actual time of recurrence occurred less than 3 months after surgery.

Event time data may also be interval censored, meaning that individuals come in and out of observation. If we consider the previous example and patients are also examined at 6 months, then those who are disease free at 3 months and lost to follow-up between 3 and 6 months are considered interval censored. Most survival data include right censored observations, but methods for interval and left censored data are available (Hosmer and Lemeshow, 1999). In the remainder of this paper, we will consider right censored data only.

In general, the feature of censoring means that special methods of analysis are needed, and standard graphical methods of data exploration and presentation, notably scatter diagrams, cannot be used.

STUDIES
Ovarian cancer data
This data set relates to 825 patients diagnosed with primary epithelial ovarian carcinoma between January 1990 and December 1999 at the Western General Hospital in Edinburgh.

Follow-up data were available up until the end of December 2000, by which time 550 ( ) had died (Clark et al, 2001). Figure 1 shows data from 10 patients diagnosed in the early 1990s and illustrates how patient profiles in calendar time are converted to time to event (death) data. Figure 1 (left) shows that four patients had a nonfatal relapse, one was lost to follow-up, and seven patients died (five from ovarian cancer). In the other plot, the data are presented in the format for a survival analysis where all-cause mortality is the event of interest. Each patient's survival time has been plotted as the time from diagnosis. It is important to note that because overall mortality is the event of interest, nonfatal relapses are ignored, and those who have not died are considered (right) censored.

Received 6 December 2002; accepted 30 April 2003
*Correspondence: Mr TG Clark
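The conversion from calendar dates to time-to-event data described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_time_to_event`, its arguments and the patient dates are hypothetical and do not come from the ovarian cancer data set; only the study closing date (end of December 2000) is taken from the text. The event of interest is all-cause mortality, so anyone without an observed death is right censored.

```python
from datetime import date

# End of follow-up as stated in the text (end of December 2000).
STUDY_END = date(2000, 12, 31)


def to_time_to_event(diagnosis, death=None, last_seen=None, study_end=STUDY_END):
    """Return (time_in_days, event) measured from diagnosis.

    event is 1 if death (from any cause) was observed during follow-up,
    and 0 if the patient is right censored.
    """
    if death is not None and death <= study_end:
        return (death - diagnosis).days, 1
    # No death observed: censor at last contact if the patient was lost
    # to follow-up, otherwise at the close of the study.
    end = min(last_seen, study_end) if last_seen is not None else study_end
    return (end - diagnosis).days, 0


# Hypothetical patients; the dates are invented for illustration only.
died = to_time_to_event(date(1991, 3, 1), death=date(1993, 3, 1))
lost = to_time_to_event(date(1992, 6, 1), last_seen=date(1994, 6, 1))
alive = to_time_to_event(date(1990, 1, 1))

print(died, lost, alive)
```

Nonfatal relapses play no part in this conversion because they are not the event of interest; choosing a different end-point would change which patients are coded as events and which as censored.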

Figure 1 (right) is specific to the outcome or event of interest. Here, death from any cause, often called overall survival, was the outcome of interest. If we were interested solely in ovarian cancer deaths, then patients 5 and 6 (those who died from nonovarian causes) would be censored. In general, it is good practice to choose an end-point that cannot be misclassified. All-cause mortality is a more robust end-point than a specific cause of death. If we were interested in time to relapse, those who did not have a relapse (fatal or nonfatal) would be censored at either the date of death or the date of last follow-up.

Lung cancer clinical trial data
These data originate from a phase III clinical trial of 164 patients with surgically resected (non-small cell) lung cancer, randomised between 1979 and 1985 to receive radiotherapy either with or without adjuvant combination platinum-based chemotherapy (Lung Cancer Study Group, 1988; Piantadosi, 1997).

For the purposes of this series, we will focus on the time to first relapse (including death from lung cancer). Table 1 gives the time of the earliest 15 and latest five relapses for each treatment group, where it can be seen that some patients were alive and relapse-free at the end of the study. The relapse proportions in the radiotherapy and combination arms were 81.4% (70 out of 86) and 69.2% (54 out of 78), respectively. However, these figures are potentially misleading as they ignore the duration spent in remission before these events occurred.

SURVIVAL AND HAZARD
Survival data are generally described and modelled in terms of two related probabilities, namely survival and hazard. The survival probability (which is also called the survivor function) S(t) is the probability that an individual survives from the time origin (e.g. diagnosis of cancer) to a specified future time t. It is fundamental to a survival analysis because survival probabilities for different values of t provide crucial summary information from time to event data.

These values describe directly the survival experience of a study cohort.

The hazard is usually denoted by h(t) or λ(t) and is the probability that an individual who is under observation at a time t has an event at that time. Put another way, it represents the instantaneous event rate for an individual who has already survived to time t. Note that, in contrast to the survivor function, which focuses on not having an event, the hazard function focuses on the event occurring. It is of interest because it provides insight into the conditional failure rates and provides a vehicle for specifying a survival model. In summary, the hazard relates to the incident (current) event rate, while survival reflects the cumulative non-occurrence.

KAPLAN-MEIER SURVIVAL ESTIMATE
The survival probability can be estimated nonparametrically from observed survival times, both censored and uncensored, using the KM (or product-limit) method (Kaplan and Meier, 1958).
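For reference, the survivor and hazard functions described above are commonly written as follows. This is a standard textbook formulation rather than notation taken from this paper: it assumes a continuous event time, and the symbols T (the event time of an individual), \Delta t and u are introduced here only for illustration.

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}, \qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

The last identity makes the link between the two quantities explicit: survival to time t reflects the hazard accumulated over the whole interval from the time origin to t, which is the sense in which the hazard is the current event rate while survival is cumulative.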

Suppose that $k$ patients have events in the period of follow-up at distinct times $t_1 < t_2 < t_3 < t_4 < t_5 < \dots < t_k$. As events are assumed to occur independently of one another, the probabilities of surviving from one interval to the next may be multiplied together to give the cumulative survival probability. More formally, the probability of being alive at time $t_j$, $S(t_j)$, is calculated from $S(t_{j-1})$, the probability of being alive at $t_{j-1}$, $n_j$, the number of patients alive just before $t_j$, and $d_j$, the number of events at $t_j$, by

$$S(t_j) = S(t_{j-1})\left(1 - \frac{d_j}{n_j}\right)$$

where $t_0 = 0$ and $S(0) = 1$. The value of $S(t)$ is constant between times of events, and therefore the estimated probability is a step function that changes value only at the time of each event. This estimator allows each patient to contribute information to the calculations for as long as they are known to be event-free. Were every individual to experience the event (i.e. no censoring), this estimator would simply reduce to the ratio of the number of individuals event-free at time $t$ divided by the number of people who entered the study. Confidence intervals for the survival probability can also be calculated.
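The product-limit recursion above can be sketched directly in Python. This is a minimal illustration, not the authors' code: the function name `kaplan_meier` and the small data sets are hypothetical, and the sketch assumes the usual convention that a patient censored at the same time as an event is still counted as at risk at that time. It does not compute confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(t_j, S(t_j))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if right censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)  # n_j: patients still under observation just before t_j
    surv = 1.0             # S(0) = 1
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0        # d_j: events at time t
        leaving = 0  # patients leaving the risk set at t (events plus censored)
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            # S(t_j) = S(t_{j-1}) * (1 - d_j / n_j)
            surv *= 1 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


# Hypothetical data: five patients, two of them right censored.
censored_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])

# With no censoring the estimate is simply the proportion still event-free.
complete_curve = kaplan_meier([1, 2, 3, 4], [1, 1, 1, 1])

for t, s in censored_curve:
    print(t, round(s, 3))
```

In the censored example the estimate steps down only at times 2, 3 and 5; the patients censored at times 3 and 8 reduce the number at risk for later steps without causing a drop themselves.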

