### Transcription of Dimensionality Reduction - Stanford University

1 Chapter 11 **Dimensionality** ReductionThere are many sources of data that can be viewed as a large matrix. Wesaw in Chapter 5 how the Web can be represented as a transition matrix. InChapter 9, the utility matrix was a point of focus. And in Chapter 10 weexamined matrices that represent social networks. In many of these matrixapplications, the matrix can be summarized by finding narrower matricesthat in some sense are close to the original. These narrow matrices have only asmall number of rows or a small number of columns, and therefore can be usedmuch more efficiently than can the original large matrix. The processof findingthese narrow matrices is calleddimensionality saw a preliminary example of **Dimensionality** **Reduction** in Section , we discussed UV- **decomposition** of a matrix and gave a simple algorithmfor finding this **decomposition** .

2 Recall that a large matrixMwas decomposedinto two matricesUandVwhose productU Vwas approximatelyM. ThematrixUhad a small number of columns whereasVhad a small number of rows,so each was significantly smaller thanM, and yet together they representedmost of the information inMthat was useful in predicting ratings of items this chapter we shall explore the idea of **Dimensionality** **Reduction** inmore detail. We begin with a discussion of eigenvalues and their use in prin-cipal component analysis (PCA). We cover **singular** - **value** **decomposition** , amore powerful version of UV- **decomposition** . Finally, because we are alwaysinterested in the largest data sizes we can handle, we look at another formof **decomposition** , called CUR- **decomposition** , which is a variant of **singular** - **value** **decomposition** that keeps the matrices of the **decomposition** sparse if theoriginal matrix is 11.

3 **Dimensionality** Eigenvalues and Eigenvectors of Symmet-ric MatricesWe shall assume that you are familiar with the basics of matrix algebra: multi-plication, transpose, determinants, and solving linear equations for example. Inthis section, we shall define eigenvalues and eigenvectors of a symmetric matrixand show how to find them. Recall a matrix is symmetric if the element inrowiand columnjequals the element in rowjand DefinitionsLetMbe a square matrix. Let be a constant andea nonzero column vectorwith the same number of rows asM. Then is aneigenvalueofMandeisthe correspondingeigenvectorofMifMe= an eigenvector ofMandcis any constant, then it is also true thatceis an eigenvector ofMwith the same eigenvalue. Multiplying a vector by aconstant changes the length of a vector, but not its direction.

4 Thus, to avoidambiguity regarding the length, we shall require that every eigenvector be aunit vector, meaning that the sum of the squares of the components of thevector is 1. Even that is not quite enough to make the eigenvector unique,since we may still multiply by 1 without changing the sum of squares of thecomponents. Thus, we shall normally require that the first nonzero componentof an eigenvector be :LetMbe the matrix 3 22 6 One of the eigenvectors ofMis 1/ 52/ 5 and its corresponding eigenvalue is 7. The equation 3 22 6 1/ 52/ 5 = 7 1/ 52/ 5 demonstrates the truth of this claim. Note that both sides are equal to 7/ 514/ 5 Also observe that the eigenvector is a unit vector, because (1/ 5)2+(2/ 5)2=1/5 + 4/5 = 1. EIGENVALUES AND EIGENVECTORS OF SYMMETRIC Computing Eigenvalues and EigenvectorsWe have already seen one approach to finding aneigenpair(an eigenvalue andits corresponding eigenvector) for a suitable matrixMin Section.

5 Start withany unit vectorvof the appropriate length and computeMiviteratively until a stochastic matrix, the limiting vector is theprincipaleigenvector (the eigenvector with the largest eigenvalue), and itscorrespondingeigenvalue is method for finding the principal eigenvector, calledpoweriteration, works quite generally, although if the principal eigenvalue (eigenvalueassociated with the principal eigenvector) is not 1, then asigrows, the ratioofMi+1vtoMivapproaches the principal eigenvalue whileMivapproachesa vector (probably not a unit vector) with the same direction as shall take up the generalization of the power-iteration method to find alleigenpairs in Section However, there is anO(n3)-running-time methodfor computing all the eigenpairs of a symmetricn nmatrix exactly, and thismethod will be presented first.

6 There will always beneigenpairs, although insome cases, some of the eigenvalues will be identical. The method starts byrestating the equation that defines eigenpairs,Me= eas (M I)e=0, then nidentity matrixwith 1 s along the main diagonal and 0 a vector of all 0 fact of linear algebra is that in order for (M I)e=0to hold for avectore6=0, the determinant ofM Imust be 0. Notice that (M I)looks almost like the matrixM, but ifMhascin one of its diagonal elements,then (M I) hasc there. While the determinant of ann nmatrix hasn! terms, it can be computed in various ways inO(n3) time; an example is themethod of pivotal condensation. The determinant of (M I) is annth-degree polynomial in , from whichwe can get thenvalues of that are the eigenvalues ofM. For any such **value** ,sayc, we can then solve the equationMe=ce.

7 There arenequations innunknowns (thencomponents ofe), but since there is no constant term in anyequation, we can only solve foreto within a constant factor. However, usingany solution, we can normalize it so the sum of the squares of the componentsis 1, thus obtaining the eigenvector that corresponds to :Let us find the eigenpairs for the 2 2 matrixMfrom Ex-ample RecallM= 3 22 6 1 RecallMidenotes multiplying by the matrixM itimes, as discussed in Section that a stochastic matrix is not generally symmetric. Symmetric matrices andstochastic matrices are two classes of matrices for which eigenpairs exist and can be this chapter, we focus on techniques for symmetric 11. **Dimensionality** REDUCTIONThenM Iis 3 22 6 The determinant of this matrix is (3 )(6 ) 4, which we must set to equation in to solve is thus 2 9 + 14 = 0.

8 The roots of this equationare = 7 and = 2; the first is the principal eigenvalue, since it is the the vector of unknowns xy We must solve 3 22 6 xy = 7 xy When we multiply the matrix and vector we get two equations3x+2y = 7x2x+6y = 7yNotice that both of these equations really say the same thing:y= 2x. Thus, apossible eigenvector is 12 But that vector is not a unit vector, since the sum of the squares of its compo-nents is 5, not 1. Thus to get the unit vector in the same direction, we divideeach component by 5. That is, the principal eigenvector is 1/ 52/ 5 and its eigenvalue is 7. Note that this was the eigenpair we explored in Exam-ple the second eigenpair, we repeat the above with eigenvalue 2 in place of7. The equation involving the components ofeisx= 2y, and the secondeigenvector is 2/ 5 1/ 5 Its corresponding eigenvalue is 2, of course.

9 Finding Eigenpairs by Power IterationWe now examine the generalization of the process we used in Section tofind the principal eigenvector, which in that section was the PageRank vector all we needed from among the various eigenvectors of the stochastic matrix ofthe Web. We start by computing the principal eigenvector by a slightgeneral-ization of the approach used in Section We then modify the matrixto, EIGENVALUES AND EIGENVECTORS OF SYMMETRIC MATRICES409effect, remove the principal eigenvector. The result is a new matrixwhose prin-cipal eigenvector is the second eigenvector (eigenvector with thesecond-largesteigenvalue) of the original matrix. The process proceeds in that manner, re-moving each eigenvector as we find it, and then using power iterationto findthe principal eigenvector of the matrix that the matrix whose eigenpairs we would like to find.

10 Start with anynonzero vectorx0and then iterate:xk+1:=MxkkMxkkwherekNkfor a matrix or vectorNdenotes theFrobenius norm; that is, thesquare root of the sum of the squares of the elements ofN. We multiply thecurrent vectorxkby the matrixMuntil convergence ( ,kxk xk+1kis lessthan some small, chosen constant). Letxbexkfor that **value** ofkat whichconvergence is obtained. Thenxis (approximately) the principal eigenvectorofM. To obtain the corresponding eigenvalue we simply compute 1=xTMx,which is the equationMx= xsolved for , sincexis a unit :Take the matrix from Example :M= 3 22 6 and let us start withx0a vector with 1 for both components. To computex1,we multiplyMx0to get 3 22 6 11 = 58 The Frobenius norm of the result is 52+ 82= 89 = We obtainx1bydividing 5 and 8 by ; that is:x1= For the next iteration, we compute 3 22 6 = The Frobenius norm of the result is , so we divide to obtainx2= We are converging toward a normal vector whose second component is twice thefirst.