
Linear Algebra Review and Reference - CS229: Machine …

Linear Algebra Review and Reference
Zico Kolter (updated by Chuong Do and Tengyu Ma)
June 20, 2020

Contents
1 Basic Concepts and Notation
2 Matrix Multiplication
   Vector-Vector Products
   Matrix-Vector Products
   Matrix-Matrix Products
3 Operations and Properties
   The Identity Matrix and Diagonal Matrices
   The Transpose
   Symmetric Matrices
   The Trace
   Norms
   Linear Independence and Rank
   The Inverse of a Square Matrix
   Orthogonal Matrices
   Range and Nullspace of a Matrix
   The Determinant
   Quadratic Forms and Positive Semidefinite Matrices
   Eigenvalues and Eigenvectors
   Eigenvalues and Eigenvectors of Symmetric Matrices
4 Matrix Calculus
   The Gradient
   The Hessian
   Gradients and Hessians of Quadratic and Linear Functions
   Least Squares
   Gradients of the Determinant
   Eigenvalues as Optimization



Transcription of Linear Algebra Review and Reference - CS229: Machine …


1 Basic Concepts and Notation

Linear algebra provides a way of compactly representing and operating on sets of linear equations. For example, consider the following system of equations:

    4x_1 - 5x_2 = -13
    -2x_1 + 3x_2 = 9

This is two equations in two variables, so as you know from high-school algebra, you can find a unique solution for x_1 and x_2 (unless the equations are somehow degenerate, for example if the second equation is simply a multiple of the first; but in the case above there is in fact a unique solution). In matrix notation, we can write the system more compactly as Ax = b, with

    A = [4 -5; -2 3],   b = [-13; 9].

As we will see shortly, there are many advantages (including the obvious space savings) to analyzing linear equations in this form.

Basic Notation

We use the following notation:

- By A ∈ R^{m×n} we denote a matrix with m rows and n columns, where the entries of A are real numbers.
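As a quick numerical check of this example (a sketch using NumPy, which the original notes do not use), we can solve the 2×2 system directly:

```python
import numpy as np

# The example system: 4x1 - 5x2 = -13, -2x1 + 3x2 = 9
A = np.array([[4.0, -5.0],
              [-2.0, 3.0]])
b = np.array([-13.0, 9.0])

# A is invertible (det = 4*3 - (-5)*(-2) = 2), so the solution is unique
x = np.linalg.solve(A, b)
print(x)  # -> [3. 5.]
```

Substituting back confirms the solution: 4·3 - 5·5 = -13 and -2·3 + 3·5 = 9.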

- By x ∈ R^n, we denote a vector with n entries. By convention, an n-dimensional vector is often thought of as a matrix with n rows and 1 column, known as a column vector. If we want to explicitly represent a row vector (a matrix with 1 row and n columns) we typically write x^T (here x^T denotes the transpose of x, which we will define shortly).

- The ith element of a vector x is denoted x_i:

    x = [x_1; x_2; ...; x_n].

- We use the notation a_{ij} (or A_{ij}, A_{i,j}, etc.) to denote the entry of A in the ith row and jth column:

    A = [a_{11} a_{12} ... a_{1n}; a_{21} a_{22} ... a_{2n}; ...; a_{m1} a_{m2} ... a_{mn}].

- We denote the jth column of A by a_j or A_{:,j}:

    A = [a_1 a_2 ... a_n]   (columns).

- We denote the ith row of A by a_i^T or A_{i,:}:

    A = [a_1^T; a_2^T; ...; a_m^T]   (rows).

- Viewing a matrix as a collection of column or row vectors is very important and convenient in many cases. In general, it is mathematically (and conceptually) cleaner to operate on the level of vectors instead of scalars.

There is no universal convention for denoting the columns or rows of a matrix, and thus you should feel free to change the notation as long as it is stated explicitly.

2 Matrix Multiplication

The product of two matrices A ∈ R^{m×n} and B ∈ R^{n×p} is the matrix C = AB ∈ R^{m×p}, where

    C_{ij} = Σ_{k=1}^{n} A_{ik} B_{kj}.

Note that in order for the matrix product to exist, the number of columns in A must equal the number of rows in B. There are many other ways of looking at matrix multiplication that may be more convenient and insightful than this standard definition, and we'll start by examining a few special cases.

Vector-Vector Products

Given two vectors x, y ∈ R^n, the quantity x^T y, sometimes called the inner product or dot product of the vectors, is a real number given by

    x^T y = [x_1 x_2 ... x_n] [y_1; y_2; ...; y_n] = Σ_{i=1}^{n} x_i y_i.

Note that inner products are really just a special case of matrix multiplication.
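The entrywise definition above translates directly into code. Here is a minimal sketch (NumPy is our addition, not part of the notes) that implements C_ij = Σ_k A_ik B_kj with explicit loops and checks it against NumPy's built-in product:

```python
import numpy as np

def matmul_by_definition(A, B):
    """C_ij = sum_k A_ik * B_kj -- the entrywise definition from the text."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "columns of A must equal rows of B"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
assert np.allclose(matmul_by_definition(A, B), A @ B)

# The inner product x^T y is the special case of a 1xn matrix times an nx1 matrix
x, y = rng.standard_normal(4), rng.standard_normal(4)
assert np.isclose(x @ y, sum(x[i] * y[i] for i in range(4)))
```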

Note that it is always the case that x^T y = y^T x.

Given vectors x ∈ R^m, y ∈ R^n (not necessarily of the same size), xy^T ∈ R^{m×n} is called the outer product of the vectors. It is a matrix whose entries are given by (xy^T)_{ij} = x_i y_j, i.e.,

    xy^T = [x_1 y_1  x_1 y_2  ...  x_1 y_n;  x_2 y_1  x_2 y_2  ...  x_2 y_n;  ...;  x_m y_1  x_m y_2  ...  x_m y_n].

As an example of how the outer product can be useful, let 1 ∈ R^n denote an n-dimensional vector whose entries are all equal to 1. Furthermore, consider the matrix A ∈ R^{m×n} whose columns are all equal to some vector x ∈ R^m. Using outer products, we can represent A compactly as

    A = [x x ... x] = x 1^T.

Matrix-Vector Products

Given a matrix A ∈ R^{m×n} and a vector x ∈ R^n, their product is a vector y = Ax ∈ R^m. There are a couple of ways of looking at matrix-vector multiplication, and we will look at each of them in turn. If we write A by rows, then we can express Ax as

    y = Ax = [a_1^T x; a_2^T x; ...; a_m^T x].
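To make the outer-product definition concrete, here is a short NumPy sketch (our addition) checking (xy^T)_ij = x_i y_j and the compact representation A = x 1^T:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # x in R^m, m = 3
y = np.array([4.0, 5.0])        # y in R^n, n = 2

# Outer product: (x y^T)_ij = x_i * y_j, a 3x2 matrix
outer = np.outer(x, y)
assert outer.shape == (3, 2)
assert np.allclose(outer, [[4.0, 5.0], [8.0, 10.0], [12.0, 15.0]])

# A matrix whose columns are all x can be written compactly as x 1^T
ones = np.ones(2)
A = np.outer(x, ones)
assert np.allclose(A, np.column_stack([x, x]))
```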

In other words, the ith entry of y is equal to the inner product of the ith row of A and x: y_i = a_i^T x.

Alternatively, let's write A in column form. In this case we see that

    y = Ax = [a_1 a_2 ... a_n] x = a_1 x_1 + a_2 x_2 + ... + a_n x_n.   (1)

In other words, y is a linear combination of the columns of A, where the coefficients of the linear combination are given by the entries of x.

So far we have been multiplying on the right by a column vector, but it is also possible to multiply on the left by a row vector. This is written y^T = x^T A for A ∈ R^{m×n}, x ∈ R^m, and y ∈ R^n. As before, we can express y^T in two obvious ways, depending on whether we express A in terms of its rows or its columns. In the first case we express A in terms of its columns, which gives

    y^T = x^T A = x^T [a_1 a_2 ... a_n] = [x^T a_1  x^T a_2  ...  x^T a_n],

which demonstrates that the ith entry of y^T is equal to the inner product of x and the ith column of A. Alternatively, expressing A in terms of its rows, we get the final representation of the vector-matrix product:
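Both viewpoints of the matrix-vector product can be checked numerically; the following sketch (NumPy, our addition) computes y = Ax once via row inner products and once as a linear combination of columns:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

# Row viewpoint: y_i = a_i^T x (inner product of the ith row of A with x)
y_rows = np.array([A[i, :] @ x for i in range(3)])

# Column viewpoint, eq. (1): y = x_1 a_1 + ... + x_n a_n
y_cols = sum(x[j] * A[:, j] for j in range(4))

assert np.allclose(y_rows, A @ x)
assert np.allclose(y_cols, A @ x)
```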

    y^T = x^T A = [x_1 x_2 ... x_m] [a_1^T; a_2^T; ...; a_m^T] = x_1 a_1^T + x_2 a_2^T + ... + x_m a_m^T,

so we see that y^T is a linear combination of the rows of A, where the coefficients for the linear combination are given by the entries of x.

Matrix-Matrix Products

Armed with this knowledge, we can now look at four different (but, of course, equivalent) ways of viewing the matrix-matrix multiplication C = AB as defined at the beginning of this section.

First, we can view matrix-matrix multiplication as a set of vector-vector products. The most obvious viewpoint, which follows immediately from the definition, is that the (i,j)th entry of C is equal to the inner product of the ith row of A and the jth column of B. Symbolically, this looks like the following:

    C = AB = [a_1^T; a_2^T; ...; a_m^T] [b_1 b_2 ... b_p] = [a_1^T b_1  a_1^T b_2  ...  a_1^T b_p;  ...;  a_m^T b_1  a_m^T b_2  ...  a_m^T b_p].

Remember that since A ∈ R^{m×n} and B ∈ R^{n×p}, a_i ∈ R^n and b_j ∈ R^n, so these inner products all make sense.
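The left-multiplication and inner-product viewpoints above can be verified the same way (again a NumPy sketch, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
x = rng.standard_normal(3)
B = rng.standard_normal((4, 2))

# Left multiplication: y^T = x^T A is a linear combination of the rows of A
yT = sum(x[i] * A[i, :] for i in range(3))
assert np.allclose(yT, x @ A)

# First viewpoint of C = AB: C_ij is the inner product of row i of A
# and column j of B
C = np.array([[A[i, :] @ B[:, j] for j in range(2)] for i in range(3)])
assert np.allclose(C, A @ B)
```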

This is the most natural representation when we represent A by rows and B by columns. Alternatively, we can represent A by columns and B by rows. This representation leads to a much trickier interpretation of AB as a sum of outer products. Symbolically,

    C = AB = [a_1 a_2 ... a_n] [b_1^T; b_2^T; ...; b_n^T] = Σ_{i=1}^{n} a_i b_i^T.

Put another way, AB is equal to the sum, over all i, of the outer product of the ith column of A and the ith row of B. Since, in this case, a_i ∈ R^m and b_i ∈ R^p, the dimension of the outer product a_i b_i^T is m×p, which coincides with the dimension of C. Chances are, the last equality above may appear confusing to you. If so, take the time to check it for yourself!

Second, we can also view matrix-matrix multiplication as a set of matrix-vector products. Specifically, if we represent B by columns, we can view the columns of C as matrix-vector products between A and the columns of B.
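The sum-of-outer-products equality is easy to check for yourself numerically, as the text suggests; a minimal NumPy sketch (our addition):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))  # columns a_i in R^3 (m = 3)
B = rng.standard_normal((4, 2))  # rows b_i^T in R^2 (p = 2)

# AB as a sum of n = 4 outer products a_i b_i^T, each of size m x p = 3 x 2
C = sum(np.outer(A[:, i], B[i, :]) for i in range(4))
assert np.allclose(C, A @ B)
```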

Symbolically,

    C = AB = A [b_1 b_2 ... b_p] = [Ab_1 Ab_2 ... Ab_p].   (2)

Here the ith column of C is given by the matrix-vector product with the vector on the right, c_i = Ab_i. These matrix-vector products can in turn be interpreted using both viewpoints given in the previous subsection. Finally, we have the analogous viewpoint, where we represent A by rows and view the rows of C as matrix-vector products between the rows of A and B. Symbolically,

    C = AB = [a_1^T; a_2^T; ...; a_m^T] B = [a_1^T B; a_2^T B; ...; a_m^T B].

Here the ith row of C is given by the matrix-vector product with the vector on the left, c_i^T = a_i^T B.

It may seem like overkill to dissect matrix multiplication to such a large degree, especially when all these viewpoints follow immediately from the initial definition we gave (in about a line of math) at the beginning of this section. The direct advantage of these various viewpoints is that they allow you to operate on the level/unit of vectors instead of scalars. To fully understand linear algebra without getting lost in the complicated manipulation of indices, the key is to operate with concepts at as large a granularity as possible. For example, if you can write all your math derivations with matrices and vectors, that is better than doing them with scalars. Virtually all of linear algebra deals with matrix multiplications of some kind, and it is worthwhile to spend some time trying to develop an intuitive understanding of the viewpoints presented here.

In addition to this, it is useful to know a few basic properties of matrix multiplication at a higher level:

- Matrix multiplication is associative: (AB)C = A(BC).
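Both matrix-vector viewpoints of C = AB can be confirmed column by column and row by row (NumPy sketch, our addition):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Columns of C are matrix-vector products, eq. (2): c_j = A b_j
C_cols = np.column_stack([A @ B[:, j] for j in range(2)])

# Rows of C are vector-matrix products: c_i^T = a_i^T B
C_rows = np.vstack([A[i, :] @ B for i in range(3)])

assert np.allclose(C_cols, A @ B)
assert np.allclose(C_rows, A @ B)
```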

- Matrix multiplication is distributive: A(B + C) = AB + AC.

- Matrix multiplication is, in general, not commutative; that is, it can be the case that AB ≠ BA. (For example, if A ∈ R^{m×n} and B ∈ R^{n×q}, the matrix product BA does not even exist if m and q are not equal!)

If you are not familiar with these properties, take the time to verify them for yourself. For example, to check the associativity of matrix multiplication, suppose that A ∈ R^{m×n}, B ∈ R^{n×p}, and C ∈ R^{p×q}. Note that AB ∈ R^{m×p}, so (AB)C ∈ R^{m×q}. Similarly, BC ∈ R^{n×q}, so A(BC) ∈ R^{m×q}. Thus, the dimensions of the resulting matrices agree. To show that matrix multiplication is associative, it suffices to check that the (i,j)th entry of (AB)C is equal to the (i,j)th entry of A(BC). We can verify this directly using the definition of matrix multiplication:

    ((AB)C)_{ij} = Σ_{k=1}^{p} (AB)_{ik} C_{kj}
                 = Σ_{k=1}^{p} (Σ_{l=1}^{n} A_{il} B_{lk}) C_{kj}
                 = Σ_{k=1}^{p} Σ_{l=1}^{n} A_{il} B_{lk} C_{kj}
                 = Σ_{l=1}^{n} Σ_{k=1}^{p} A_{il} B_{lk} C_{kj}
                 = Σ_{l=1}^{n} A_{il} (Σ_{k=1}^{p} B_{lk} C_{kj})
                 = Σ_{l=1}^{n} A_{il} (BC)_{lj}
                 = (A(BC))_{ij}.

Here, the first and last two equalities simply use the definition of matrix multiplication, the third and fifth equalities use the distributive property of scalar multiplication over addition, and the fourth equality uses the commutativity and associativity of scalar addition.
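These properties can be spot-checked numerically (a NumPy sketch, our addition; random matrices with a fixed seed):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 2))

# Associative: (AB)C = A(BC), both in R^{2x2}
assert np.allclose((A @ B) @ C, A @ (B @ C))

# Distributive: A(B + B') = AB + AB'
B2 = rng.standard_normal((3, 4))
assert np.allclose(A @ (B + B2), A @ B + A @ B2)

# Not commutative: here AB is 2x4 while BA would be 3x3, and even for
# square matrices AB generally differs from BA
S, T = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
assert not np.allclose(S @ T, T @ S)
```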

