Transcription of Factor Analysis Example - Harvard University
Factor Analysis Example
Qian-Li Xue
Biostatistics Program, Harvard Catalyst | The Harvard Clinical & Translational Science Center
Short course, October 28, 2016

Example: Frailty
- Frailty is a biologic syndrome of decreased reserve and resistance to stressors, resulting from cumulative declines across multiple physiologic systems, and causing vulnerability to adverse outcomes (Fried et al. 2001).
- Common phenotypes of frailty in geriatrics include weakness, fatigue, weight loss, decreased balance, low levels of physical activity, slowed motor processing and performance, social withdrawal, mild cognitive changes, and increased vulnerability to stressors (Walston et al. 2006).

Example: Frailty
Manifest variables of frailty:
- Body composition: arm circumference, body mass index, triceps skinfold thickness
- Slowed motor processing and performance: speed of fast walk, speed of Pegboard test, speed of usual walk, time to do chair stands
- Muscle strength: grip strength, knee extension, hip extension

Recap of Basic Characteristics of Exploratory Factor Analysis (EFA)
- Most EFA methods extract orthogonal factors, which may not be a reasonable assumption
- Distinction between common and unique variances
- EFA is underidentified (no unique solution)
- Remember rotation?
- Equally good fit with different rotations!
- All measures are related to each factor

Major steps in EFA
1. Data collection and preparation
2. Choose number of factors to extract
3. Extracting initial factors
4. Rotation to a final solution
5. Model diagnosis/refinement
6. Derivation of factor scales to be used in further analysis

Step 1. Data collection and preparation
- Factor analysis is totally dependent on correlations between variables.
- Factor analysis summarizes correlation structure.
[Diagram: Data Matrix (rows O1 .. On, columns v1 .. vk); Correlation Matrix (v1 .. vk by v1 .. vk); Factor Pattern Matrix]

Example: Frailty (N=547)
[Figure: two tables, each with columns bmi, arm, skin, grip, knee, hip, uslwalk, fastwk, chrstand, peg; values not preserved. Panel label: Observed Data]
[Panel label: Correlation Matrix]

Step 2. Choose number of factors
- Intuitively: the number of uncorrelated constructs that are jointly measured by the Ys.
- Only useful if the number of factors is less than the number of Ys (recall "data reduction").
- Estimability: is there enough information in the data to estimate all of the parameters in the factor analysis? The model may be constrained to a certain number of factors.

Step 2. Choosing number of factors
Use Principal Components Analysis (PCA) to help decide
- Similar to factor analysis, but conceptually quite different!
- The number of "factors" (principal components) equals the number of variables
- Each "factor", or principal component, is a weighted combination of the input variables Y1, ..., Yn:
  P1 = a11 Y1 + a12 Y2 + ... + a1n Yn
- Principal components ARE NOT latent variables
- PCA does not differentiate between common and unique variances

Choosing Number of Factors

/* Principal Components Analysis */
proc factor data=frailty method=prin outstat= plots=(scree);
  var bmi arm skin grip knee hip uslwalk fastwk chrstand peg;
%parallel(data=frailty, niter=1000, statistic=Median);
run;

SAS PCA Output
Eigenvalues of the Correlation Matrix: Total = 10, Average = 1
[Table: columns Eigenvalue, Difference, Proportion, Cumulative for components 1 to 10; values not preserved]

Step 2. Choosing number of factors
To select how many factors to use, evaluate the eigenvalues from PCA.
Two interpretations:
- An eigenvalue is the equivalent number of variables that the factor represents
- An eigenvalue is the amount of variance in the data described by the factor
Criteria to go by:
- Number of eigenvalues > 1 (Kaiser-Guttman criterion)
- Scree plot
- Parallel analysis
- % variance explained
- Comprehensibility

Choosing Number of Factors

Parallel Analysis (Hayton, Allen, & Scarpello, 2004)
- Eigenvalues (EV) that would be expected from random data are compared to those produced by the real data
- If EV(random data) > EV(real data), the derived factors are mostly random noise
- How to do this in SAS
- How to do this in Stata: type findit fapara in Stata to locate the program for free download
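The parallel analysis comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the %parallel SAS macro or the Stata fapara program: the data are simulated from two made-up factors (they are not the frailty data), the helper names correlation_matrix, eigenvalues_symmetric and parallel_analysis are hypothetical, and the random-data reference is the median eigenvalue across simulated datasets, mirroring the statistic=Median option in the SAS call. The Kaiser-Guttman count is reported alongside for comparison.

```python
import math
import random
import statistics


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length variable columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([x - mean for x in col])
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def eigenvalues_symmetric(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    k = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(k) for j in range(k) if i != j)
        if off < tol:
            break
        for p in range(k - 1):
            for q in range(p + 1, k):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for i in range(k):  # column update: A <- A J
                    aip, aiq = a[i][p], a[i][q]
                    a[i][p] = c * aip - s * aiq
                    a[i][q] = s * aip + c * aiq
                for j in range(k):  # row update: A <- J^T A
                    apj, aqj = a[p][j], a[q][j]
                    a[p][j] = c * apj - s * aqj
                    a[q][j] = s * apj + c * aqj
    return sorted((a[i][i] for i in range(k)), reverse=True)


def parallel_analysis(columns, n_iter=100, seed=0):
    """Compare observed eigenvalues with median eigenvalues from random normal data."""
    rng = random.Random(seed)
    n, k = len(columns[0]), len(columns)
    observed = eigenvalues_symmetric(correlation_matrix(columns))
    simulated = []
    for _ in range(n_iter):
        noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
        simulated.append(eigenvalues_symmetric(correlation_matrix(noise)))
    reference = [statistics.median(run[i] for run in simulated) for i in range(k)]
    n_retain = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:  # stop at the first eigenvalue not above the random reference
            break
        n_retain += 1
    return observed, reference, n_retain


# Simulated example (assumption): 6 variables, 3 loading on each of 2 factors.
rng = random.Random(1)
n_obs = 300
f1 = [rng.gauss(0, 1) for _ in range(n_obs)]
f2 = [rng.gauss(0, 1) for _ in range(n_obs)]
columns = [[0.8 * f + 0.6 * rng.gauss(0, 1) for f in factor]
           for factor in (f1, f1, f1, f2, f2, f2)]

observed, reference, n_retain = parallel_analysis(columns, n_iter=50, seed=2)
kaiser = sum(ev > 1 for ev in observed)
print("observed eigenvalues:", [round(ev, 2) for ev in observed])
print("random-data medians: ", [round(ev, 2) for ev in reference])
print("retain (parallel analysis):", n_retain, "| retain (EV > 1):", kaiser)
```

The stopping rule used here (retain factors until the first observed eigenvalue falls at or below its random-data counterpart) is one common convention; other implementations compare against a high percentile instead of the median.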
Choosing Number of Factors

Accuracy of Retention Criteria
- EV > 1: tends to overestimate the number of factors; accuracy increases when the number of variables is small and communalities are high
- Scree test: more accurate than EV > 1, but subjective and sometimes ambiguous
- Parallel test: most accurate; becoming the standard

Step 3. Extracting initial factors
Using MLE

proc factor data=frailty method=ml priors=smc msa residual rotate=varimax reorder outstat= plots=(scree initloadings loadings);
  var bmi arm skin grip knee hip uslwalk fastwk chrstand peg;
run;

Step 3. Extracting initial factors
Using MLE
Factor Pattern (unrotated)
[Table: loadings on Factor1, Factor2, Factor3 for arm, bmi, skin, grip, fastwk, uslwalk, peg, chrstand, knee, hip; values not preserved]
Final Communality Estimates and Variable Weights
[Table: Total Communality (weighted and unweighted), then Communality and Weight for bmi, arm, skin, grip, knee, hip, uslwalk, fastwk, chrstand, peg; values not preserved]

Step 4. Factor Rotation
- Steps 2 and 3 determine the minimum number of factors needed to account for the observed correlations
- After obtaining initial orthogonal factors, we want to find more easily interpretable factors via rotations, while keeping the number of factors and the communalities of the Ys fixed!
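A small numerical check can make the fixed-communalities point concrete. The sketch below assumes a made-up 4-variable, 2-factor loading matrix (the numbers are purely illustrative and are not the frailty estimates) and hypothetical helper names. It applies an orthogonal rotation and confirms that both the communalities and the correlations reproduced by the common factors are unchanged.

```python
import math


def rotate(loadings, angle):
    """Post-multiply a 2-factor loading matrix by the rotation T = [[c, s], [-s, c]]."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * a - s * b, s * a + c * b] for a, b in loadings]


def communalities(loadings):
    """Sum of squared loadings per variable."""
    return [sum(x * x for x in row) for row in loadings]


def reproduced_correlations(loadings):
    """Common-factor part of the correlation matrix: loadings times their transpose."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in loadings] for ri in loadings]


# Hypothetical unrotated loadings (assumption, for illustration only).
unrotated = [[0.7, 0.4], [0.6, 0.5], [0.6, -0.5], [0.7, -0.4]]
rotated = rotate(unrotated, math.radians(30))

for before, after in zip(communalities(unrotated), communalities(rotated)):
    print(round(before, 6), round(after, 6))  # each pair matches

max_diff = max(
    abs(x - y)
    for row_u, row_r in zip(reproduced_correlations(unrotated),
                            reproduced_correlations(rotated))
    for x, y in zip(row_u, row_r)
)
print("largest change in reproduced correlations:", max_diff)
```

Because the reproduced correlations are identical before and after rotation, any fit statistic based on them is identical too; only the individual loadings, and hence the interpretation, change.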
Rotation does NOT improve fit!

Step 4. Factor Rotation
- All rotated solutions are essentially equivalent
- Goal is simple structure
- Most construct validation assumes simple (typically rotated) structure.

Step 4. Factor Rotation (Varimax)
Rotated Factor Pattern (Varimax)
[Table: loadings on Factor1, Factor2, Factor3 for arm, bmi, skin, grip, fastwk, uslwalk, peg, chrstand, knee, hip; values not preserved]
Factor Pattern (unrotated)
[Table: loadings on Factor1, Factor2, Factor3 for arm, bmi, skin, grip, fastwk, uslwalk, peg, chrstand, knee, hip; values not preserved]

Step 4.