
Repeated Measures Analysis with Discrete Data Using the SAS System

Gordon Johnston
Maura Stokes
SAS Institute Inc., Cary, NC

Abstract

The analysis of correlated data arising from repeated measurements when the measurements are assumed to be multivariate normal has been studied extensively. In many practical problems, however, the normality assumption is not reasonable. When the responses are discrete and correlated, for example, different methodology must be used in the analysis of the data.



Generalized Estimating Equations (GEEs) provide a practical method with reasonable statistical efficiency to analyze such data. This paper provides an overview of the use of GEEs in the analysis of correlated data using the SAS System. Emphasis is placed on discrete correlated data, since this is an area of great practical interest.

Introduction

GEEs were introduced by Liang and Zeger (1986) as a method of dealing with correlated data when, except for the correlation among responses, the data can be modeled as a generalized linear model.

For example, correlated binary and count data often can be modeled in this way. With Release … of SAS/STAT software, the GENMOD procedure includes the capability to perform GEE model fitting. In addition, the Alternating Logistic Regression algorithm for fitting log odds ratios with binary data will be implemented in a future release. This paper provides an overview of the GEE methodology that is implemented in the GENMOD procedure. Refer to Diggle, Liang, and Zeger (1994) and the other references at the end of this paper for more details on this method.

Correlated data can arise from situations such as

- longitudinal studies, in which multiple measurements are taken on the same subject at different points in time
- clustering, where measurements are taken on subjects that share a common category or characteristic that leads to correlation. For example, incidence of pulmonary disease among family members may be correlated because of hereditary factors.

The correlation must be accounted for by analysis methods appropriate to the data. Possible consequences of analyzing correlated data as if they were independent are

- incorrect inferences concerning regression parameters due to underestimated standard errors
- inefficient estimators, that is, more mean square error in regression parameter estimators than necessary

Example of Longitudinal Data

The following data, from Thall and Vail (1990), are concerned with the treatment of epileptic seizure episodes.

These data were also analyzed in Diggle, Liang, and Zeger (1994). The data consist of the number of epileptic seizures in an eight-week baseline period, before any treatment, and in each of four two-week treatment periods, in which patients received either a placebo or the drug Progabide in addition to other therapy. A portion of the data is shown in Table 1.

[Table 1. Seizure Data (table not reproduced)]

Within-subject measurements are likely to be correlated, whereas between-subject measurements are likely to be independent.

The raw correlations among the counts between visits are shown in Table 2. They indicate strong correlation in the number of seizures between the visits. Accounting for this correlation is an important aspect of the analysis strategy. The seizure data will be analyzed in later sections as count data with a specified correlation structure.

[Table 2. Correlations among Visit 1 through Visit 4 (values not reproduced)]

Generalized Linear Models for Independent Data

Let $Y_i$, $i = 1, \ldots, n$, be independent measurements. Generalized linear models for independent data are characterized by

- a systematic component:

  $$g(E(Y_i)) = g(\mu_i) = x_i'\beta$$

  where $\mu_i = E(Y_i)$, $g$ is a link function that relates the means of the responses to the linear predictor $x_i'\beta$, $x_i$ is a vector of independent variables for the $i$th observation, and $\beta$ is a vector of regression parameters to be estimated.

- a random component: $Y_i$, $i = 1, \ldots, n$, are independent and have a probability distribution from an exponential family (binomial, Poisson, normal, gamma, or inverse Gaussian).

The exponential family assumption implies that the variance of $Y_i$ is given by $V_i = \phi v(\mu_i)$, where $v$ is a variance function that is determined by the specific probability distribution and $\phi$ is a dispersion parameter that may be known or may be estimated from the data, depending on the specific model. The variance functions for the binomial and Poisson distributions are given by

binomial: $v(\mu) = \mu(1 - \mu)$
Poisson: $v(\mu) = \mu$

The maximum likelihood estimator of the $p \times 1$ parameter vector $\beta$ is obtained by solving the estimating equations

$$\sum_{i=1}^{n} \frac{\partial \mu_i'}{\partial \beta} v_i^{-1} (y_i - \mu_i(\beta)) = 0$$

for $\beta$.
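As a rough illustration of these estimating equations, the sketch below applies Fisher scoring, one standard iterative scheme, to a Poisson model with a log link and a single covariate. It is not the GENMOD implementation. The function name `fit_poisson_fisher` and the toy data are hypothetical, and the dispersion parameter is taken to be 1.

```python
import math

def fit_poisson_fisher(x, y, max_iter=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Fisher scoring (illustrative only)."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score: sum_i (d mu_i / d beta) v_i^{-1} (y_i - mu_i).
        # With a log link and v(mu) = mu this reduces to sum_i x_i (y_i - mu_i).
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Expected information: sum_i mu_i * [1, x_i]' [1, x_i].
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        # Solve the 2x2 system (information) * delta = score by Cramer's rule.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical toy data: a binary covariate and small counts.
x = [0, 0, 0, 1, 1, 1]
y = [2, 3, 4, 5, 7, 9]
b0, b1 = fit_poisson_fisher(x, y)
print(b0, b1)
```

With a binary covariate the fitted means equal the two group means, so the result can be checked by hand: `exp(b0)` should match the mean count in the `x = 0` group and `exp(b0 + b1)` the mean in the `x = 1` group.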

The estimating equations form a nonlinear system in $\beta$, which can be solved iteratively by the Fisher scoring or Newton-Raphson algorithm.

Correlation: Generalized Estimating Equations

Let $Y_{ij}$, $j = 1, \ldots, n_i$, $i = 1, \ldots, K$, represent the $j$th measurement on the $i$th subject. There are $n_i$ measurements on subject $i$ and $\sum_{i=1}^{K} n_i$ total measurements.

Correlated data are modeled using the same link function and linear predictor setup (systematic component) as the independence case. The random component is described by the same variance functions as in the independence case, but the covariance structure of the correlated measurements must also be modeled. Let the vector of measurements on the $i$th subject be $Y_i = [Y_{i1}, \ldots, Y_{in_i}]'$ with corresponding vector of means $\mu_i = [\mu_{i1}, \ldots, \mu_{in_i}]'$, and let $V_i$ be an estimate of the covariance matrix of $Y_i$.

The Generalized Estimating Equation for estimating $\beta$ is an extension of the independence estimating equation to correlated data and is given by

$$\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} (Y_i - \mu_i(\beta)) = 0$$

Working Correlations

Let $R_i(\alpha)$ be an $n_i \times n_i$ "working" correlation matrix that is fully specified by the vector of parameters $\alpha$. The covariance matrix of $Y_i$ is modeled as

$$V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$$

where $A_i$ is an $n_i \times n_i$ diagonal matrix with $v(\mu_{ij})$ as the $j$th diagonal element. If $R_i(\alpha)$ is the true correlation matrix of $Y_i$, then $V_i$ is the true covariance matrix of $Y_i$.

The working correlation matrix is not usually known and must be estimated.
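The covariance model above can be sketched in a few lines. The example assumes the Poisson variance function $v(\mu) = \mu$ by default and takes the working correlation matrix as a given input. The function name `working_covariance` and the numbers are hypothetical.

```python
import math

def working_covariance(mu, corr, phi=1.0, variance=lambda m: m):
    """Return V = phi * A^(1/2) R A^(1/2) for one subject as a nested list.

    mu       -- list of means mu_ij for the subject
    corr     -- working correlation matrix R as a list of rows
    phi      -- dispersion parameter
    variance -- variance function v(mu); the Poisson form is assumed by default
    """
    # A^(1/2) is diagonal, so only the square roots of v(mu_ij) are needed.
    sd = [math.sqrt(variance(m)) for m in mu]
    n = len(mu)
    return [[phi * sd[j] * corr[j][k] * sd[k] for k in range(n)]
            for j in range(n)]

# Hypothetical subject with two measurements and correlation 0.5.
mu = [4.0, 9.0]
R = [[1.0, 0.5],
     [0.5, 1.0]]
V = working_covariance(mu, R, phi=2.0)
print(V)
```

Because $A_i$ is diagonal, each entry of $V_i$ is the product of the dispersion, the two standard-deviation terms, and the corresponding correlation, so no matrix library is needed for this illustration.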

The working correlation matrix is estimated in the iterative fitting process using the current value of the parameter vector $\beta$ to compute appropriate functions of the Pearson residual

$$e_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{v(\mu_{ij})}}$$

Several specific forms of the working correlation matrix $R_i(\alpha)$ are commonly used to model the correlation matrix of $Y_i$; a few are shown below. Refer to Liang and Zeger (1986) for additional choices. The dimension of the vector $\alpha$, which is treated as a nuisance parameter, and the form of the estimator of $\alpha$ are different for each choice.
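To make the Pearson residual concrete, the sketch below computes $e_{ij}$ for one subject, again assuming the Poisson variance function by default. The second function averages the within-subject cross products $e_{ij} e_{ik}$. This is only a simplified stand-in for the "appropriate functions" mentioned above: it omits the dispersion and degrees-of-freedom adjustments and is not the estimator used by the GENMOD procedure. Both function names and the data are hypothetical.

```python
import math

def pearson_residuals(y, mu, variance=lambda m: m):
    """Pearson residuals e_ij = (y_ij - mu_ij) / sqrt(v(mu_ij)) for one subject."""
    return [(yi - mi) / math.sqrt(variance(mi)) for yi, mi in zip(y, mu)]

def average_cross_product(residuals_by_subject):
    """Average of e_ij * e_ik over all within-subject pairs j < k.

    Simplified illustration only: no dispersion or degrees-of-freedom
    adjustment is applied. Assumes at least one subject has two or more
    measurements.
    """
    total, pairs = 0.0, 0
    for e in residuals_by_subject:
        for j in range(len(e)):
            for k in range(j + 1, len(e)):
                total += e[j] * e[k]
                pairs += 1
    return total / pairs

# Hypothetical counts and fitted means for two subjects.
e1 = pearson_residuals([6.0, 6.0], [4.0, 4.0])
e2 = pearson_residuals([1.0, 2.0], [4.0, 4.0])
print(average_cross_product([e1, e2]))
```

Residuals that share a sign within a subject push the average above zero, which is the pattern expected when repeated measurements on the same subject are positively correlated.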

