
236-2009: Fitting Cox Model Using PROC PHREG and ... - SAS



Paper 236-2009
Fitting Cox Model Using PROC PHREG and Beyond in SAS
Lea Liu, Sandy Forman, Bruce Barton
Maryland Medical Research Institute, Baltimore, Maryland, USA

Abstract

The Cox proportional hazards model is widely used in survival analysis to investigate time-to-event data in many areas. PROC PHREG in SAS is a powerful tool for constructing a Cox model. The statements and options within PROC PHREG already provide the most commonly requested output, and a little more effort on data manipulation makes it possible to carry out the calculations and generate output for some newer statistical approaches. This paper presents: 1) using the overall C-index as a measure of discrimination for model validation; 2) calculating the adjusted survival rate using the corrected group prognosis method; 3) presenting the effect of a continuous covariate on estimated survival. The core SAS program code and macro used for implementation are included.

Introduction

The Cox proportional hazards model is widely used in survival analysis to investigate time-to-event data in many disciplines. New statistical methods and approaches on this subject continue to appear. PROC PHREG in SAS is a powerful tool for constructing a Cox model, and the statements and options within PROC PHREG provide the most commonly requested output. However, there is a lag before SAS updates its procedures to support new methods. To meet the needs of daily work, we have created some SAS programs and a macro for the following three statistical approaches: 1) using the overall C-index as a measure of discrimination for model validation; 2) calculating the adjusted survival rate using the corrected group prognosis method; 3) presenting the effect of a continuous covariate on estimated survival. These methods have been used in many medical research articles in recent years.

Model construction usually starts with fitting a full model that includes all candidate predictors. A preferred variable-selection approach is then used to obtain a final model that is more stable and retains the predictive power with a reduced number of variables, each making a significant contribution to the outcome. The three approaches introduced in this paper all apply to the final model. The examples in this paper use the same data set, from a clinical study on heart disease.

Model validation using overall C as a measure of discrimination

Discrimination is one part of model validation: it tests the ability of a predictive model to separate those who develop an event from those who do not. One of the most popular measures of discrimination is the ROC curve. SAS has options for generating a classification table and ROC curve in PROC LOGISTIC. However, measuring predictive accuracy is more complex for survival analysis in the presence of censoring. The C-index, introduced by Harrell (Ref. 1, 1996) as a natural extension of the ROC curve, is an easily interpretable measure of predictive discrimination. Later, Pencina and D'Agostino (Ref. 2, 2004) developed the overall C index as a parameter describing the performance of a given model applied to the population under consideration, and discussed the statistic used as its sample estimate. The C-index for a survival analysis model is defined as the probability of concordance given that the pair considered is usable, that is, at least one of the two subjects had an event.
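As a compact restatement of this definition, here is a sketch in notation introduced for illustration only. It assumes there are no ties among survival times or among predictions. Let $\pi_c$ be the probability that a randomly chosen pair of subjects is usable and concordant, and $\pi_d$ the probability that it is usable and discordant, with concordant and discordant taken in the sense defined further on in the text. Under the no-ties assumption every usable pair is one or the other, so:

```latex
C \;=\; P(\text{concordant} \mid \text{usable})
  \;=\; \frac{\pi_c}{\pi_c + \pi_d}
```

If a sample yields $c$ concordant and $d$ discordant usable pairs, the corresponding plug-in estimate is $\hat{C} = c/(c+d)$. A value near 0.5 indicates ordering no better than chance, and a value of 1 indicates that every usable pair is ordered correctly.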

It can be interpreted as the probability that a subject from the event group has a higher predicted probability of having an event than a subject from the non-event group.

Statistics and Data Analysis, SAS Global Forum 2009

In constructing the C-index, only usable pairs are used, so every comparison is either event vs. event or event vs. non-event.

Example 1: We use the factors MI history (mihx), diabetes (diabhx), and low ejection fraction (lowef) to explain the composite end point of death, re-infarction, or class IV heart failure (combfv).

First, re-run the final model using PROC PHREG with an OUTPUT statement to create a data set that contains the subject ID, the observed survival time, and the survival function estimate for each individual. Then create a data set evtset that includes only the subjects who had an event.

    /* Fit the final model and save each subject's survival estimate */
    proc phreg data=sample;
      id idn;
      model combdays*combfv(0)=mihx diabhx lowef;
      output out=obs survival=surv;
    run;

    /* Keep only subjects with an event; the _j suffix marks the event side of a pair */
    data evtset;
      set obs;
      if combfv=1;
      rename idn=idn_j surv=y_j combdays=x_j;
      keep idn surv combdays;
    run;

Second, construct all usable pairs and create a variable for concordance. PROC SQL creates a data set that includes all usable pairs by a Cartesian join. The data set concord includes a new variable, concord, that identifies whether the pair is concordant.

    /* Pair every event subject (j) with every other subject (i) */
    proc sql;
      create table allset as
      select idn_j, y_j, x_j, idn as idn_i, surv as y_i, combdays as x_i
      from evtset, obs
      where idn_j<>idn;
    quit;

    data concord;
      set allset;
      if (x_i<x_j and y_i>y_j) or (x_i>x_j and y_i<y_j) then concord=1;
      else concord=0;
    run;

Denote the actual survival times of subjects by X1, X2, .., and the predicted probabilities of survival by Y1, Y2, ... A pair is concordant when Xi<Xj and Yi<Yj, or Xi>Xj and Yi>Yj. If the inequalities go in opposite directions, that is, Xi<Xj and Yi>Yj, or Xi>Xj and Yi<Yj, the pair is said to be discordant.

The following code calculates the C-index and its 95% CI using the estimated probabilities of concordance and discordance proposed by Pencina and D'Agostino.

    /* Count concordant (nch) and discordant (ndh) pairs and store them in macro variables */
    data _null_;
      set concord end=eof;
      retain nch ndh;
      if _N_=1 then do; nch=0; ndh=0; end;
      if concord=1 then nch+1;
      if concord=0 then ndh+1;
      if eof=1 then do;
        call symput('ch',trim(left(nch)));
        call symput('dh',trim(left(ndh)));
        call symput('uspairs',trim(left(_n_)));
      end;
    run;

    /* Store the total number of observations */
    data _null_;
      set sample end=eof;
      if eof=1 then call symput('totobs',trim(left(_n_)));
    run;

    %put &ch &dh &uspairs;

    data calculat;
      ch=input("&ch", ).
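The concordance and discordance definitions given in the prose can be checked by hand on a small data set. The Python sketch below is not a translation of the SAS program. It follows the definitions as worded in the text: a pair is usable when at least one subject had an event, and concordant when X and Y are ordered the same way. The function names `classify_pair` and `pair_counts`, the separate "tied" category, the use of unordered pairs, and the toy data are all assumptions made for illustration. The last line computes only the simple ratio of concordant pairs to concordant plus discordant pairs, and the confidence interval is not attempted.

```python
from itertools import combinations


def classify_pair(x_i, y_i, x_j, y_j):
    """Classify one pair from survival times (x) and predicted survival (y)."""
    if (x_i < x_j and y_i < y_j) or (x_i > x_j and y_i > y_j):
        return "concordant"   # X and Y ordered the same way
    if (x_i < x_j and y_i > y_j) or (x_i > x_j and y_i < y_j):
        return "discordant"   # X and Y ordered in opposite ways
    return "tied"             # equal times or equal predictions


def pair_counts(subjects):
    """Count pair types over usable pairs.

    subjects: list of (time, event, predicted_survival) tuples,
    where event is 1 for an event and 0 for a censored observation.
    """
    counts = {"concordant": 0, "discordant": 0, "tied": 0}
    for (x_i, e_i, y_i), (x_j, e_j, y_j) in combinations(subjects, 2):
        if e_i == 0 and e_j == 0:
            continue  # neither subject had an event: pair is not usable
        counts[classify_pair(x_i, y_i, x_j, y_j)] += 1
    return counts


# Hypothetical toy data: (survival time, event indicator, predicted survival)
toy = [
    (5, 1, 0.60),
    (8, 1, 0.70),
    (12, 0, 0.90),
    (3, 1, 0.80),
    (10, 0, 0.75),
]

counts = pair_counts(toy)
c_hat = counts["concordant"] / (counts["concordant"] + counts["discordant"])
print(counts, round(c_hat, 3))
```

On this toy data, nine of the ten possible pairs are usable; the pair of two censored subjects is skipped. Six of the nine are concordant and three are discordant, giving a ratio of about 0.667.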

