
Generalized Linear Mixed Models - Fall 2012


Reproduced from the Encyclopedia of Statistics in Behavioral Science. John Wiley & Sons, Ltd. ISBN: 0-470-86080-4.

Introduction

Generalized linear models (GLMs) represent a class of fixed effects regression models for several types of dependent variables (e.g., continuous, dichotomous, counts). McCullagh and Nelder [32] describe these in great detail and indicate that the term 'generalized linear model' is due to Nelder and Wedderburn [35], who described how a collection of seemingly disparate statistical techniques could be unified. Common GLMs include linear regression, logistic regression, and Poisson regression.

There are three specifications in a GLM. First, the linear predictor, denoted as $\eta_i$, of a GLM is of the form

$$\eta_i = \mathbf{x}_i'\boldsymbol{\beta}, \tag{1}$$

where $\mathbf{x}_i$ is the vector of regressors for unit $i$ with fixed effects $\boldsymbol{\beta}$. Then, a link function $g(\cdot)$ is specified which converts the expected value $\mu_i$ of the outcome variable $Y_i$ (i.e., $\mu_i = E[Y_i]$) to the linear predictor $\eta_i$:

$$g(\mu_i) = \eta_i. \tag{2}$$

Finally, a specification for the form of the variance in terms of the mean $\mu_i$ is made. The latter two specifications usually depend on the distribution of the outcome $Y_i$, which is assumed to fall within the exponential family of distributions.

Fixed effects models, which assume that all observations are independent of each other, are not appropriate for analysis of several types of correlated data structures, in particular, for clustered and/or longitudinal data (see Clustered Data). In clustered designs, subjects are observed nested within larger units, for example, schools, hospitals, neighborhoods, workplaces, and so on. In longitudinal designs, repeated observations are nested within subjects (see Longitudinal Data Analysis and Repeated Measures Analysis of Variance). These are often referred to as multilevel [16] or hierarchical [41] data (see Linear Multilevel Models), in which the level-1 observations (subjects or repeated observations) are nested within the higher level-2 observations (clusters or subjects). Higher levels are also possible; for example, a three-level design could have repeated observations (level-1) nested within subjects (level-2) who are nested within clusters (level-3).

For analysis of such multilevel data, random cluster and/or subject effects can be added into the regression model to account for the correlation of the data. The resulting model is a mixed model including the usual fixed effects for the regressors plus the random effects. Mixed models for continuous normal outcomes have been extensively developed since the seminal paper by Laird and Ware [28]. For nonnormal data, there have also been many developments, some of which are described below. Many of these developments fall under the rubric of generalized linear mixed models (GLMMs), which extend GLMs by the inclusion of random effects in the predictor. Agresti et al. [1] describe a variety of social science applications of GLMMs; [12], [33], and [11] are recent texts with a wealth of statistical material on GLMMs.

Let $i$ denote the level-2 units (e.g., subjects) and let $j$ denote the level-1 units (e.g., nested observations). The focus will be on longitudinal designs here, but the methods apply to clustered designs as well. Assume there are $i = 1, \ldots, N$ subjects (level-2 units) and $j = 1, \ldots, n_i$ repeated observations (level-1 units) nested within each subject. A random-intercept model, which is the simplest mixed model, augments the linear predictor with a single random effect for subject $i$,

$$\eta_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \upsilon_i, \tag{3}$$

where $\upsilon_i$ is the random effect (one for each subject). These random effects represent the influence of subject $i$ on his/her repeated observations that is not captured by the observed covariates. These are treated as random effects because the sampled subjects are thought to represent a population of subjects, and they are usually assumed to be distributed as $N(0, \sigma_\upsilon^2)$. The parameter $\sigma_\upsilon^2$ indicates the variance in the population distribution, and therefore the degree of heterogeneity of subjects.

Including the random effects, the expected value of the outcome variable, which is related to the linear predictor via the link function, is given as

$$\mu_{ij} = E[Y_{ij} \mid \upsilon_i, \mathbf{x}_{ij}]. \tag{4}$$

This is the expectation of the conditional distribution of the outcome given the random effects. As a result, GLMMs are often referred to as conditional models, in contrast to the marginal generalized estimating equations (GEE) models (see Generalized Estimating Equations (GEE)) [29], which represent an alternative generalization of GLMs for correlated data (see Marginal Models for Clustered Data).

The model can be easily extended to include multiple random effects. For example, in longitudinal problems, it is common to have a random subject intercept and a random linear time-trend. For this, denote $\mathbf{z}_{ij}$ as the $r \times 1$ vector of variables having random effects (a column of ones is usually included for the random intercept). The vector of random effects $\mathbf{v}_i$ is assumed to follow a multivariate normal distribution with mean vector $\mathbf{0}$ and variance-covariance matrix $\boldsymbol{\Sigma}_v$ (see Catalogue of Probability Density Functions). The model is now written as

$$\eta_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \mathbf{z}_{ij}'\mathbf{v}_i. \tag{5}$$

Note that the conditional mean $\mu_{ij}$ is now specified as $E[Y_{ij} \mid \mathbf{v}_i, \mathbf{x}_{ij}]$, namely, in terms of the vector of random effects.

Dichotomous Outcomes

Development of GLMMs for dichotomous data has been an active area of statistical research. Several approaches, usually adopting a logistic or probit regression model (see Probits) and various methods for incorporating and estimating the influence of the random effects, have been developed. A review article by Pendergast et al. [37] discusses and compares many of these developments.

The mixed-effects logistic regression model is a common choice for analysis of multilevel dichotomous data and is arguably the most popular GLMM. In the GLMM context, this model utilizes the logit link, namely

$$g(\mu_{ij}) = \operatorname{logit}(\mu_{ij}) = \log\left[\frac{\mu_{ij}}{1 - \mu_{ij}}\right] = \eta_{ij}. \tag{6}$$

Here, the conditional expectation $\mu_{ij} = E(Y_{ij} \mid \mathbf{v}_i, \mathbf{x}_{ij})$ equals $P(Y_{ij} = 1 \mid \mathbf{v}_i, \mathbf{x}_{ij})$, namely, the conditional probability of a response given the random effects (and covariate values).

This model can also be written as

$$P(Y_{ij} = 1 \mid \mathbf{v}_i, \mathbf{x}_{ij}, \mathbf{z}_{ij}) = g^{-1}(\eta_{ij}) = \Psi(\eta_{ij}), \tag{7}$$

where the inverse link function $\Psi(\eta_{ij})$ is the logistic cumulative distribution function (cdf), namely $\Psi(\eta_{ij}) = [1 + \exp(-\eta_{ij})]^{-1}$. A nicety of the logistic distribution, one that simplifies parameter estimation, is that the probability density function (pdf) is related to the cdf in a simple way, as $\psi(\eta_{ij}) = \Psi(\eta_{ij})[1 - \Psi(\eta_{ij})]$.

The probit model, which is based on the standard normal distribution, is often proposed as an alternative to the logistic model [13]. For the probit model, the normal cdf and pdf replace their logistic counterparts. A useful feature of the probit model is that it can be used to yield tetrachoric correlations for the clustered binary responses, and polychoric correlations for ordinal outcomes (discussed below). For this reason, in some areas, for example familial studies, the probit formulation is often preferred to its logistic counterpart.

Example

Gruder et al. [20] describe a smoking-cessation study in which 489 subjects were randomized to a control, discussion, or social support condition. Control subjects received a self-help manual and were encouraged to watch twenty segments of a daily TV program on smoking cessation, while subjects in the two experimental conditions additionally participated in group meetings and received training in support and relapse prevention. Here, for simplicity, these two experimental conditions will be combined. Data were collected at four telephone interviews: postintervention, and 6, 12, and 24 months later. Smoking abstinence rates (and sample sizes) at these four timepoints were … (109), … (97), … (92), and … (77) for the placebo condition. Similarly, for the combined experimental condition it was … (380), … (357), … (337), and … (295) for these timepoints.

Two logistic GLMMs were fit to these data: a random intercept model and a random intercept and linear trend of time model (see Growth Curve Modeling). These models were estimated using SAS PROC NLMIXED with adaptive quadrature. For these, it is the probability of smoking abstinence, rather than smoking, that is being modeled. Fixed effects included a condition term (0 = control, 1 = experimental), time (coded 0, 1, 2, and 4 for the four timepoints), and the condition by time interaction. Results for both models are presented in Table 1. Based on a likelihood-ratio test, the model with random intercept and linear time trend is preferred over the simpler random intercept model ($\chi^2_2$ = …).

Table 1 Smoking cessation study: smoking status (0 = smoking, 1 = not smoking) across time (N = 489), GLMM logistic parameter estimates (Est.), standard errors (SE), and P values

                                            Random intercept model     Random int and trend model
Parameter                                   Est.    SE      P value    Est.    SE      P value
Intercept                                           .362    .001               .432    .001
Time                                        .113    .122    .36        .502    .274    .07
Condition (0 = control; 1 = experimental)           .379    .001               .415    .001
Condition by Time                           .322    .136    .02        .331    .249    .184

Intercept variance .600
Intercept Time covariance .048 .371
Time variance .468
-2 log likelihood

Note: P values not given for variance and covariance parameters (see [41]).

This example shows that the significance of model terms can depend on the structure of the random effects. Thus, one must decide upon a reasonable model for the random effects as well as for the fixed effects. A commonly recommended approach for this is to perform a sequential procedure for model selection. First, one includes all possible covariates …
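
The likelihood-ratio comparison of the two models in the example can be sketched in a few lines. This is only an illustration: the function name and the two deviance values are hypothetical placeholders, not the values behind Table 1. The comparison has 2 degrees of freedom because the larger model adds a time variance and an intercept-time covariance, and for 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math


def lr_test_df2(deviance_reduced, deviance_full):
    """Likelihood-ratio statistic and naive p-value for 2 extra parameters.

    Both arguments are -2 log likelihood values. For a chi-square variable
    with 2 degrees of freedom, P(X > x) is exactly exp(-x / 2).
    """
    stat = deviance_reduced - deviance_full
    if stat < 0:
        raise ValueError("the full model should not fit worse than the reduced model")
    return stat, math.exp(-stat / 2.0)


# Hypothetical deviances, for illustration only (not taken from Table 1).
stat, p_value = lr_test_df2(1000.0, 990.0)
print(f"LR chi-square = {stat:.1f}, p = {p_value:.4f}")
```

Because a variance parameter is tested at the boundary of its parameter space, this naive p-value is generally regarded as conservative, so it should be read as a rough guide rather than an exact result.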

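The logit link of equation (6), its inverse in equation (7), and the stated relation between the logistic pdf and cdf can be checked numerically. The sketch below uses function names chosen here for illustration; it compares the pdf obtained from the cdf identity with a finite-difference derivative of the cdf, and confirms that the logit undoes the inverse link.

```python
import math


def logistic_cdf(eta):
    """Inverse logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logistic_pdf(eta):
    """Logistic density computed from the cdf as cdf * (1 - cdf)."""
    p = logistic_cdf(eta)
    return p * (1.0 - p)


def logit(mu):
    """Logit link: log(mu / (1 - mu)) for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))


# Compare the identity-based pdf with a central-difference derivative of the cdf.
h = 1e-6
for eta in (-2.0, 0.0, 1.5):
    numeric = (logistic_cdf(eta + h) - logistic_cdf(eta - h)) / (2.0 * h)
    gap = abs(numeric - logistic_pdf(eta))
    round_trip = logit(logistic_cdf(eta))
    print(f"eta={eta:+.1f}  pdf={logistic_pdf(eta):.6f}  gap={gap:.2e}  logit(cdf)={round_trip:+.6f}")
```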

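The distinction between conditional (GLMM) and marginal probabilities can be made concrete for the random-intercept logistic model. The sketch below, under assumed illustrative values for the fixed part of the linear predictor and the random-intercept standard deviation, averages the conditional probability over a normal random intercept with a simple trapezoid rule. It is not the adaptive quadrature used by SAS PROC NLMIXED, only a minimal numerical illustration.

```python
import math


def marginal_probability(fixed_part, sd_intercept, n_points=2001, width=8.0):
    """Average P(Y = 1 | random intercept) over a N(0, sd_intercept^2) intercept.

    Uses a trapezoid rule on [-width * sd, +width * sd]; sd_intercept must be > 0.
    """
    lo = -width * sd_intercept
    step = 2.0 * width * sd_intercept / (n_points - 1)
    norm_const = sd_intercept * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(n_points):
        intercept = lo + k * step
        density = math.exp(-0.5 * (intercept / sd_intercept) ** 2) / norm_const
        conditional = 1.0 / (1.0 + math.exp(-(fixed_part + intercept)))
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * density * conditional * step
    return total


# Illustrative values only: fixed part 1.0 on the logit scale, intercept SD 2.0.
conditional_at_zero = 1.0 / (1.0 + math.exp(-1.0))
print("conditional probability at a zero random intercept:", round(conditional_at_zero, 4))
print("marginal probability averaged over subjects:", round(marginal_probability(1.0, 2.0), 4))
```

With a nonzero random-intercept variance, the marginal probability is pulled toward 0.5 relative to the conditional probability for a subject whose random intercept is zero, which is one reason conditional and marginal regression coefficients are not directly comparable.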