Data, Covariance, and Correlation Matrix

Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 16-Jan-2017

Copyright © 2017 by Nathaniel E. Helwig

Outline of Notes

1) The Data Matrix
   - Definition
   - Properties
   - R code
2) The Covariance Matrix
   - Definition
   - Properties
   - R code
3) The Correlation Matrix
   - Definition
   - Properties
   - R code
4) Miscellaneous Topics
   - Crossproduct calculations
   - Vec and Kronecker
   - Visualizing data

The Data Matrix

Definition: The Organization of Data

The data matrix refers to the array of numbers

$$
\mathbf{X} =
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
x_{31} & x_{32} & \cdots & x_{3p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
$$

where $x_{ij}$ is the $j$-th variable collected from the $i$-th item (e.g., subject).

- items/subjects are rows
- variables are columns

$\mathbf{X}$ is a data matrix of order $n \times p$ (# items by # variables).
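The rows-as-items, columns-as-variables layout can be sketched in R with a small made-up data set. The values and the labels `item1`, `var1`, and so on are hypothetical and are not taken from the slides.

```r
# Hypothetical toy data: n = 3 items (rows) measured on p = 2 variables (columns).
# matrix() fills column by column, so var1 = (1, 2, 3) and var2 = (4, 5, 6).
X <- matrix(c(1, 2, 3,
              4, 5, 6),
            nrow = 3, ncol = 2,
            dimnames = list(c("item1", "item2", "item3"),
                            c("var1", "var2")))

dim(X)      # order of the data matrix: n rows, p columns
X[2, 1]     # x_21: score of item 2 on variable 1
X[, 2]      # all n scores on variable 2 (one column)
X[3, ]      # all p scores for item 3 (one row)
```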

Definition: Collection of Column Vectors

We can view a data matrix as a collection of column vectors:

$$
\mathbf{X} = \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_p \end{pmatrix}
$$

where $\mathbf{x}_j$ is the $j$-th column of $\mathbf{X}$ for $j \in \{1, \ldots, p\}$.

The $n \times 1$ vector $\mathbf{x}_j$ gives the $j$-th variable's scores for the $n$ items.

Definition: Collection of Row Vectors

We can view a data matrix as a collection of row vectors:

$$
\mathbf{X} = \begin{pmatrix} \mathbf{x}_1' \\ \mathbf{x}_2' \\ \vdots \\ \mathbf{x}_n' \end{pmatrix}
$$

where $\mathbf{x}_i'$ is the $i$-th row of $\mathbf{X}$ for $i \in \{1, \ldots, n\}$.

The $1 \times p$ vector $\mathbf{x}_i'$ gives the $i$-th item's scores for the $p$ variables.

Properties: Calculating Variable (Column) Means

The sample mean of the $j$-th variable is given by

$$
\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij} = n^{-1} \mathbf{1}_n' \mathbf{x}_j
$$

where

- $\mathbf{1}_n$ denotes an $n \times 1$ vector of ones
- $\mathbf{x}_j$ denotes the $j$-th column of $\mathbf{X}$

Properties: Calculating Item (Row) Means

The sample mean of the $i$-th item is given by

$$
\bar{x}_i = \frac{1}{p} \sum_{j=1}^{p} x_{ij} = p^{-1} \mathbf{x}_i' \mathbf{1}_p
$$

where

- $\mathbf{1}_p$ denotes a $p \times 1$ vector of ones
- $\mathbf{x}_i'$ denotes the $i$-th row of $\mathbf{X}$
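The two mean formulas can be checked numerically by multiplying with vectors of ones. The sketch below assumes a small made-up matrix; the names `ones_n`, `ones_p`, `col_means`, and `row_means` are illustrative and do not come from the slides.

```r
# Hypothetical toy data matrix with n = 4 items and p = 3 variables
X <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 40,
              5, 5, 5, 5),
            nrow = 4)
n <- nrow(X)
p <- ncol(X)

ones_n <- rep(1, n)   # plays the role of the vector 1_n
ones_p <- rep(1, p)   # plays the role of the vector 1_p

# column means: n^{-1} 1_n' x_j, computed for all columns at once
col_means <- drop(crossprod(ones_n, X)) / n

# row means: p^{-1} x_i' 1_p, computed for all rows at once
row_means <- drop(X %*% ones_p) / p

# compare against the built-in helpers
all.equal(col_means, colMeans(X))
all.equal(row_means, rowMeans(X))
```

In practice `colMeans()` and `rowMeans()` are the more convenient choice; the matrix products are shown only to mirror the formulas.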

R Code: Data Frame and Matrix Classes in R

    > data(mtcars)
    > class(mtcars)
    [1] "data.frame"
    > dim(mtcars)
    [1] 32 11
    > head(mtcars)
                       mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
    > X <- as.matrix(mtcars)
    > class(X)
    [1] "matrix"

R Code: Row and Column Means

    > # get row means (3 ways)
    > rowMeans(X)[1:3]
        Mazda RX4 Mazda RX4 Wag    Datsun 710 
         29.90727      29.98136      23.59818 
    > c(mean(X[1,]), mean(X[2,]), mean(X[3,]))
    [1] 29.90727 29.98136 23.59818
    > apply(X,1,mean)[1:3]
        Mazda RX4 Mazda RX4 Wag    Datsun 710 
         29.90727      29.98136      23.59818 
    > # get column means (3 ways)
    > colMeans(X)[1:3]
          mpg       cyl      disp 
     20.09062   6.18750 230.72188 
    > c(mean(X[,1]), mean(X[,2]), mean(X[,3]))
    [1]  20.09062   6.18750 230.72188
    > apply(X,2,mean)[1:3]
          mpg       cyl      disp 
     20.09062   6.18750 230.72188 

R Code: Other Row and Column Functions

    > # get column medians
    > apply(X,2,median)[1:3]
      mpg   cyl  disp 
     19.2   6.0 196.3 
    > c(median(X[,1]), median(X[,2]), median(X[,3]))
    [1]  19.2   6.0 196.3
    > # get column ranges
    > apply(X,2,range)[,1:3]
          mpg cyl  disp
    [1,] 10.4   4  71.1
    [2,] 33.9   8 472.0
    > cbind(range(X[,1]), range(X[,2]), range(X[,3]))
         [,1] [,2]  [,3]
    [1,] 10.4    4  71.1
    [2,] 33.9    8 472.0
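The same `apply()` pattern works with any function that takes a numeric vector, including an anonymous one. As a small sketch (the names `col_sd` and `col_span` are illustrative, not from the slides), the following computes column standard deviations and the width of each column's range:

```r
X <- as.matrix(mtcars)   # mtcars ships with base R

# column standard deviations (sd() uses the n - 1 divisor)
col_sd <- apply(X, 2, sd)

# anonymous function: width of each column's range (max minus min)
col_span <- apply(X, 2, function(x) diff(range(x)))

col_sd[1:3]
col_span[1:3]
```

Setting the second argument to `1` instead of `2` would apply the function over rows (items) rather than columns (variables).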

The Covariance Matrix

Definition: The Covariation of Data

The covariance matrix refers to the symmetric array of numbers

$$
\mathbf{S} =
\begin{pmatrix}
s_1^2 & s_{12} & s_{13} & \cdots & s_{1p} \\
s_{21} & s_2^2 & s_{23} & \cdots & s_{2p} \\
s_{31} & s_{32} & s_3^2 & \cdots & s_{3p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & s_{p3} & \cdots & s_p^2
\end{pmatrix}
$$

where

- $s_j^2 = (1/n) \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ is the variance of the $j$-th variable
- $s_{jk} = (1/n) \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)$ is the covariance between the $j$-th and $k$-th variables
- $\bar{x}_j = (1/n) \sum_{i=1}^{n} x_{ij}$ is the mean of the $j$-th variable

Definition: Covariance Matrix from Data Matrix

We can calculate the covariance matrix as

$$
\mathbf{S} = \frac{1}{n} \mathbf{X}_c' \mathbf{X}_c
$$

where $\mathbf{X}_c = \mathbf{X} - \mathbf{1}_n \bar{\mathbf{x}}' = \mathbf{C}\mathbf{X}$ with

- $\bar{\mathbf{x}}' = (\bar{x}_1, \ldots, \bar{x}_p)$ denoting the vector of variable means
- $\mathbf{C} = \mathbf{I}_n - n^{-1} \mathbf{1}_n \mathbf{1}_n'$ denoting a centering matrix

Note that the centered matrix $\mathbf{X}_c$ has the form

$$
\mathbf{X}_c =
\begin{pmatrix}
x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{1p} - \bar{x}_p \\
x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{2p} - \bar{x}_p \\
x_{31} - \bar{x}_1 & x_{32} - \bar{x}_2 & \cdots & x_{3p} - \bar{x}_p \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{np} - \bar{x}_p
\end{pmatrix}
$$

Properties: Variances are Nonnegative

Variances are sums-of-squares, which implies that $s_j^2 \geq 0$ for all $j$.

- $s_j^2 > 0$ as long as there does not exist an $\alpha$ such that $\mathbf{x}_j = \alpha \mathbf{1}_n$

This implies that:

- $\mathrm{tr}(\mathbf{S}) \geq 0$, where $\mathrm{tr}(\cdot)$ denotes the matrix trace function
- $\sum_{j=1}^{p} \lambda_j \geq 0$, where $(\lambda_1, \ldots, \lambda_p)$ are the eigenvalues of $\mathbf{S}$

If $n < p$, then $\lambda_j = 0$ for at least one $j \in \{1, \ldots, p\}$. If $n \geq p$ and the $p$ columns of the centered matrix $\mathbf{X}_c$ are linearly independent, then $\lambda_j > 0$ for all $j \in \{1, \ldots, p\}$.

Properties: The Cauchy-Schwarz Inequality

From the Cauchy-Schwarz inequality we have that

$$
s_{jk}^2 \leq s_j^2 s_k^2
$$

with the equality holding if and only if $\mathbf{x}_j$ and $\mathbf{x}_k$ are linearly dependent (after centering).

We could also write the Cauchy-Schwarz inequality as

$$
|s_{jk}| \leq s_j s_k
$$

where $s_j$ and $s_k$ denote the standard deviations of the variables.

R Code: Covariance Matrix by Hand (hard way)

    > n <- nrow(X)
    > C <- diag(n) - matrix(1/n, n, n)
    > Xc <- C %*% X
    > # note: divides by n-1 (as cov() does), not by n as in the definition
    > S <- t(Xc) %*% Xc / (n-1)
    > S[1:3,1:6]
    > # or #
    > Xc <- scale(X, center=TRUE, scale=FALSE)
    > S <- t(Xc) %*% Xc / (n-1)
    > S[1:3,1...
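As a rough check of the covariance definitions and properties, the sketch below rebuilds the centering matrix for `mtcars`, compares both divisors against R's built-in `cov()` (which divides by n - 1), and tests the trace, eigenvalue, and Cauchy-Schwarz statements numerically. The names `S_n`, `S_nm1`, `lambda`, and `s`, along with the small tolerances, are choices made for this illustration.

```r
X <- as.matrix(mtcars)   # mtcars ships with base R
n <- nrow(X)

# centering matrix C = I_n - (1/n) 1_n 1_n'
C <- diag(n) - matrix(1/n, n, n)
Xc <- C %*% X

# C is symmetric and idempotent; centered columns have (numerically) zero mean
all.equal(C, t(C))
all.equal(C %*% C, C)
max(abs(colMeans(Xc)))

# the definition divides by n, whereas cov() divides by n - 1
S_n   <- crossprod(Xc) / n
S_nm1 <- crossprod(Xc) / (n - 1)
all.equal(S_nm1, cov(X), check.attributes = FALSE)
all.equal(S_n, cov(X) * (n - 1) / n, check.attributes = FALSE)

# eigenvalues are nonnegative and sum to the trace
lambda <- eigen(S_n, symmetric = TRUE, only.values = TRUE)$values
min(lambda) >= -1e-8 * max(lambda)
all.equal(sum(diag(S_n)), sum(lambda))

# Cauchy-Schwarz: |s_jk| <= s_j * s_k for every pair (j, k)
s <- sqrt(diag(S_n))
all(abs(S_n) <= outer(s, s) + 1e-8)
```

The two versions of the covariance matrix differ only by the factor (n - 1)/n, so the properties above hold for either divisor.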
