4.3 Least Squares Approximations

It often happens that $Ax = b$ has no solution. The usual reason is: too many equations. The matrix has more rows than columns. There are more equations than unknowns ($m$ is greater than $n$). Then the $n$ columns span only a small part of $m$-dimensional space. Unless all measurements are perfect, $b$ is outside that column space. Elimination reaches an impossible equation and stops. But we can't stop just because measurements include noise.

To repeat: We cannot always get the error $e = b - Ax$ down to zero. When $e$ is zero, $x$ is an exact solution. When the length of $e$ is as small as possible, $\hat{x}$ is a least squares solution. Our goal in this section is to compute $\hat{x}$ and use it. These are real problems and they need an answer.

The previous section emphasized $p$ (the projection). This section emphasizes $\hat{x}$ (the least squares solution). They are connected by $p = A\hat{x}$. The fundamental equation is still $A^T A \hat{x} = A^T b$.

Here is a short unofficial way to reach this equation: When $Ax = b$ has no solution, multiply by $A^T$ and solve $A^T A \hat{x} = A^T b$.

Example 1. A crucial application of least squares is fitting a straight line to data. Start with three points: Find the closest line to the points $(0, 6)$, $(1, 0)$, $(2, 0)$.

No straight line $b = C + Dt$ goes through those three points. We are asking for two numbers $C$ and $D$ that satisfy three equations. Here are the equations at $t = 0, 1, 2$ to match the given values $b = 6, 0, 0$:

$t = 0$: The first point is on the line $b = C + Dt$ if $C + D \cdot 0 = 6$
$t = 1$: The second point is on the line $b = C + Dt$ if $C + D \cdot 1 = 0$
$t = 2$: The third point is on the line $b = C + Dt$ if $C + D \cdot 2 = 0$.

This 3 by 2 system has no solution: $b = (6, 0, 0)$ is not a combination of the columns $(1, 1, 1)$ and $(0, 1, 2)$. Read off $A$, $x$, and $b$ from those equations:

$$A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}, \quad x = \begin{bmatrix} C \\ D \end{bmatrix}, \quad b = \begin{bmatrix} 6 \\ 0 \\ 0 \end{bmatrix}. \quad Ax = b \text{ is not solvable.}$$

The same numbers were in Example 3 in the last section. We computed $\hat{x} = (5, -3)$. Those numbers are the best $C$ and $D$, so $5 - 3t$ will be the best line for the three points. We must connect projections to least squares, by explaining why $A^T A \hat{x} = A^T b$.

In practical problems, there could easily be $m = 100$ points instead of $m = 3$.
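
As a concrete check (this code sketch is ours, not the book's), NumPy can form and solve the normal equations for Example 1. With the data above it should reproduce $\hat{x} = (5, -3)$:

    import numpy as np

    # Example 1: fit b = C + D*t to the points (0,6), (1,0), (2,0)
    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])          # columns (1,1,1) and (0,1,2)
    b = np.array([6.0, 0.0, 0.0])

    # Normal equations: A^T A xhat = A^T b
    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    print(xhat)                         # [ 5. -3.] -> best line b = 5 - 3t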

They don't exactly match any straight line $C + Dt$. Our numbers $6, 0, 0$ exaggerate the error so you can see $e_1$, $e_2$, and $e_3$ in Figure 4.6.

Minimizing the Error

How do we make the error $e = b - Ax$ as small as possible? This is an important question with a beautiful answer. The best $x$ (called $\hat{x}$) can be found by geometry or algebra or calculus: use the $90^\circ$ angle, or project using $P$, or set the derivative of the error to zero.

By geometry: Every $Ax$ lies in the plane of the columns $(1, 1, 1)$ and $(0, 1, 2)$. In that plane, we look for the point closest to $b$. That nearest point is the projection $p$. The best choice for $A\hat{x}$ is $p$. The smallest possible error is $e = b - p$. The three points at heights $(p_1, p_2, p_3)$ do lie on a line, because $p$ is in the column space. In fitting a straight line, $\hat{x}$ gives the best choice for $(C, D)$.

By algebra: Every vector $b$ splits into two parts. The part in the column space is $p$. The perpendicular part in the nullspace of $A^T$ is $e$.
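
A quick numerical illustration of that geometry (again our sketch, reusing the Example 1 data): compute $\hat{x}$, project to get $p = A\hat{x}$, and form the perpendicular error $e = b - p$.

    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([6.0, 0.0, 0.0])

    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    p = A @ xhat                 # projection of b onto the column space
    e = b - p                    # the perpendicular part
    print(p)                     # [ 5.  2. -1.]: heights on the line 5 - 3t
    print(e)                     # [ 1. -2.  1.]

The heights $5, 2, -1$ do lie on a line, as promised: they are the values of $5 - 3t$ at $t = 0, 1, 2$.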

There is an equation $Ax = b$ we cannot solve. There is an equation $A\hat{x} = p$ we do solve (by removing $e$):

$$Ax = b = p + e \text{ is impossible;} \quad A\hat{x} = p \text{ is solvable.} \quad (1)$$

The solution to $A\hat{x} = p$ leaves the least possible error (which is $e$). Squared length for any $x$:

$$\|Ax - b\|^2 = \|Ax - p\|^2 + \|e\|^2. \quad (2)$$

This is the law $c^2 = a^2 + b^2$ for a right triangle. The vector $Ax - p$ in the column space is perpendicular to $e$ in the left nullspace. We reduce $Ax - p$ to zero by choosing $x$ to be $\hat{x}$. That leaves the smallest possible error $e = (e_1, e_2, e_3)$.

Notice what "smallest" means. The squared length of $Ax - b$ is minimized: The least squares solution $\hat{x}$ makes $E = \|Ax - b\|^2$ as small as possible.

Figure 4.6: Best line and projection: two pictures, same problem. The line has heights $p = (5, 2, -1)$ with errors $e = (1, -2, 1)$. The equations give $\hat{x} = (5, -3)$. The best line is $b = 5 - 3t$ and the projection is $p = 5a_1 - 3a_2$.

Figure 4.6a shows the closest line. It misses by distances $e_1, e_2, e_3 = 1, -2, 1$. Those are vertical distances. The least squares line minimizes $E = e_1^2 + e_2^2 + e_3^2$.
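
Equation (2) is easy to spot-check numerically. In this sketch (ours; the trial vector $x$ is arbitrary) the two sides of the identity agree to roundoff:

    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([6.0, 0.0, 0.0])
    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    p, e = A @ xhat, b - A @ xhat

    x = np.array([2.0, 7.0])                  # any trial x
    lhs = np.linalg.norm(A @ x - b) ** 2
    rhs = np.linalg.norm(A @ x - p) ** 2 + np.linalg.norm(e) ** 2
    print(np.isclose(lhs, rhs))               # True: right-triangle law holds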

Figure 4.6b shows the same problem in 3-dimensional space ($b$-$p$-$e$ space). The vector $b$ is not in the column space of $A$. That is why we could not solve $Ax = b$. No line goes through the three points. The smallest possible error is the perpendicular vector $e$. This is $e = b - A\hat{x}$, the vector of errors $(1, -2, 1)$ in the three equations. Those are the distances from the best line. Behind both figures is the fundamental equation $A^T A \hat{x} = A^T b$.

Notice that the errors $1, -2, 1$ add to zero. The error $e = (e_1, e_2, e_3)$ is perpendicular to the first column $(1, 1, 1)$ in $A$. The dot product gives $e_1 + e_2 + e_3 = 0$.

By calculus: Most functions are minimized by calculus! The graph bottoms out and the derivative in every direction is zero. Here the error function $E$ to be minimized is a sum of squares $e_1^2 + e_2^2 + e_3^2$ (the square of the error in each equation):

$$E = \|Ax - b\|^2 = (C + D \cdot 0 - 6)^2 + (C + D \cdot 1)^2 + (C + D \cdot 2)^2. \quad (3)$$

The unknowns are $C$ and $D$. With two unknowns there are two derivatives, both zero at the minimum.
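
Those orthogonality statements can be confirmed in one line each. This small check (ours) verifies that $e$ is perpendicular to both columns of $A$, which is exactly the statement $A^T e = 0$:

    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([6.0, 0.0, 0.0])
    e = b - A @ np.linalg.solve(A.T @ A, A.T @ b)   # e = (1, -2, 1)

    print(A.T @ e)   # ~[0. 0.] up to roundoff: e is in the nullspace of A^T
    print(e.sum())   # ~0.0: the errors 1, -2, 1 add to zero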

They are partial derivatives because $\partial E / \partial C$ treats $D$ as constant and $\partial E / \partial D$ treats $C$ as constant:

$$\partial E / \partial C = 2(C + D \cdot 0 - 6) + 2(C + D \cdot 1) + 2(C + D \cdot 2) = 0$$
$$\partial E / \partial D = 2(C + D \cdot 0 - 6)(0) + 2(C + D \cdot 1)(1) + 2(C + D \cdot 2)(2) = 0.$$

$\partial E / \partial D$ contains the extra factors $0, 1, 2$ from the chain rule. (The last derivative was 2 times $C + 2D$ times that extra 2.) In the $C$ derivative the corresponding factors are $1, 1, 1$, because $C$ is always multiplied by 1. It is no accident that $1, 1, 1$ and $0, 1, 2$ are the columns of $A$.

Now cancel 2 from every term and collect all $C$'s and all $D$'s:

The $C$ derivative is zero: $3C + 3D = 6$
The $D$ derivative is zero: $3C + 5D = 0$
This matrix $\begin{bmatrix} 3 & 3 \\ 3 & 5 \end{bmatrix}$ is $A^T A$. (4)

These equations are identical with $A^T A \hat{x} = A^T b$. The best $C$ and $D$ are the components of $\hat{x}$. The equations from calculus are the same as the normal equations from linear algebra. These are the key equations of least squares:

The partial derivatives of $\|Ax - b\|^2$ are zero when $A^T A \hat{x} = A^T b$.

The solution is $C = 5$ and $D = -3$.
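
In matrix form those two partial derivatives are the entries of the gradient $2A^T(Ax - b)$, which vanishes exactly when $A^T A x = A^T b$. A sketch of that fact (the function name grad_E is our own):

    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([6.0, 0.0, 0.0])

    def grad_E(x):
        # gradient of E(x) = ||Ax - b||^2 with respect to (C, D)
        return 2 * A.T @ (A @ x - b)

    xhat = np.linalg.solve(A.T @ A, A.T @ b)
    print(grad_E(xhat))                    # ~[0. 0.] at the minimizer (5, -3)
    print(grad_E(np.array([0.0, 0.0])))    # [-12.  0.]: nonzero away from it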

Therefore $b = 5 - 3t$ is the best line; it comes closest to the three points. At $t = 0, 1, 2$ this line goes through $p = 5, 2, -1$. It could not go through $b = 6, 0, 0$. The errors are $1, -2, 1$. This is the vector $e$!

The Big Picture

The key figure of this book shows the four subspaces and the true action of a matrix. The vector $x$ on the left side of that figure went to $b = Ax$ on the right side. In that figure $x$ was split into $x_r + x_n$. There were many solutions to $Ax = b$.

Figure 4.7: The projection $p = A\hat{x}$ is closest to $b$, so $\hat{x}$ minimizes $E = \|b - Ax\|^2$.

In this section the situation is just the opposite. There are no solutions to $Ax = b$. Instead of splitting up $x$ we are splitting up $b$. Figure 4.7 shows the big picture for least squares. Instead of $Ax = b$ we solve $A\hat{x} = p$. The error $e = b - p$ is unavoidable.

Notice how the nullspace $N(A)$ is very small: just one point. With independent columns, the only solution to $Ax = 0$ is $x = 0$. Then $A^T A$ is invertible.

The equation $A^T A \hat{x} = A^T b$ fully determines the best vector $\hat{x}$. The error is $e = b - A\hat{x}$. Chapter 7 will have the complete picture, all four subspaces included. Every $x$ splits into $x_r + x_n$, and every $b$ splits into $p + e$. The best solution is $\hat{x}_r$ in the row space. We can't help $e$ and we don't want $x_n$.

Fitting a Straight Line

Fitting a line is the clearest application of least squares. It starts with $m > 2$ points, hopefully near a straight line. At times $t_1, \ldots, t_m$ those $m$ points are at heights $b_1, \ldots, b_m$. The best line $C + Dt$ misses the points by vertical distances $e_1, \ldots, e_m$. No line is perfect, and the least squares line minimizes $E = e_1^2 + \cdots + e_m^2$.

The first example in this section had three points in Figure 4.6. Now we allow $m$ points (and $m$ can be large). The two components of $\hat{x}$ are still $C$ and $D$.

A line goes through the $m$ points when we exactly solve $Ax = b$. Generally we can't do it. Two unknowns $C$ and $D$ determine a line, so $A$ has only $n = 2$ columns.
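
The same normal equations fit a line to any $m$ points. Here is a minimal sketch (ours, with made-up data) that also cross-checks against NumPy's built-in least squares solver:

    import numpy as np

    # made-up data: m = 5 times and heights, roughly on a line
    t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    b = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

    A = np.column_stack([np.ones_like(t), t])   # columns (1,...,1), (t1,...,tm)
    xhat = np.linalg.solve(A.T @ A, A.T @ b)    # normal equations
    C, D = xhat
    print(C, D)                                 # intercept and slope

    xhat2, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(xhat, xhat2))             # True: same least squares answer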

To fit the $m$ points, we are trying to solve $m$ equations (and we only want two!):

$$Ax = b: \quad \begin{matrix} C + Dt_1 = b_1 \\ C + Dt_2 = b_2 \\ \vdots \\ C + Dt_m = b_m \end{matrix} \qquad \text{with} \quad A = \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix}. \quad (5)$$

The column space is so thin that almost certainly $b$ is outside of it. When $b$ happens to lie in the column space, the points happen to lie on a line. In that case $b = p$. Then $Ax = b$ is solvable and the errors are $e = (0, \ldots, 0)$.

The closest line $C + Dt$ has heights $p_1, \ldots, p_m$ with errors $e_1, \ldots, e_m$. Solve $A^T A \hat{x} = A^T b$ for $\hat{x} = (C, D)$. The errors are $e_i = b_i - C - Dt_i$.

Fitting points by a straight line is so important that we give the two equations $A^T A \hat{x} = A^T b$, once and for all. The two columns of $A$ are independent (unless all times $t_i$ are the same). So we turn to least squares. The dot-product matrix $A^T A$ is 2 by 2:

$$A^T A = \begin{bmatrix} 1 & \cdots & 1 \\ t_1 & \cdots & t_m \end{bmatrix} \begin{bmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix} = \begin{bmatrix} m & \sum t_i \\ \sum t_i & \sum t_i^2 \end{bmatrix}. \quad (6)$$

On the right side of the normal equation is the 2 by 1 vector $A^T b$:

$$A^T b = \begin{bmatrix} 1 & \cdots & 1 \\ t_1 & \cdots & t_m \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix} = \begin{bmatrix} \sum b_i \\ \sum t_i b_i \end{bmatrix}. \quad (7)$$

In a specific problem, these numbers are given.
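
The entries of (6) and (7) are plain sums over the data. This sketch (ours, reusing the Example 1 data) builds them from the sums and confirms they match the matrix products:

    import numpy as np

    t = np.array([0.0, 1.0, 2.0])
    b = np.array([6.0, 0.0, 0.0])
    m = len(t)
    A = np.column_stack([np.ones(m), t])

    AtA = np.array([[m,       t.sum()],
                    [t.sum(), (t**2).sum()]])   # equation (6)
    Atb = np.array([b.sum(), (t * b).sum()])    # equation (7)

    print(np.allclose(AtA, A.T @ A))            # True
    print(np.allclose(Atb, A.T @ b))            # True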

The best $\hat{x} = (C, D)$ is in equation (9). The line $C + Dt$ minimizes $e_1^2 + \cdots + e_m^2 = \|Ax - b\|^2$ when $A^T A \hat{x} = A^T b$:

$$\begin{bmatrix} m & \sum t_i \\ \sum t_i & \sum t_i^2 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} \sum b_i \\ \sum t_i b_i \end{bmatrix}. \quad (8)$$

The vertical errors at the $m$ points on the line are the components of $e = b - p$. This error vector (the residual) $b - A\hat{x}$ is perpendicular to the columns of $A$ (geometry). The error is in the nullspace of $A^T$ (linear algebra). The best $\hat{x} = (C, D)$ minimizes the total error $E$, the sum of squares:

$$E = (C + Dt_1 - b_1)^2 + \cdots + (C + Dt_m - b_m)^2.$$

When calculus sets the derivatives $\partial E / \partial C$ and $\partial E / \partial D$ to zero, it produces (8).

Other least squares problems have more than two unknowns. Fitting by the best parabola has $n = 3$ coefficients $C, D, E$ (see below). In general we are fitting $m$ data points by $n$ parameters $x_1, \ldots, x_n$. The matrix $A$ has $n$ columns and $n < m$. The derivatives of $\|Ax - b\|^2$ give the $n$ equations $A^T A \hat{x} = A^T b$. The derivative of a square is linear. This is why the method of least squares is so popular.

Example 2. $A$ has orthogonal columns when the measurement times $t_i$ add to zero. Suppose $b = 1, 2, 4$ at times $t = -2, 0, 2$.
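
Because those times add to zero, the off-diagonal entries $\sum t_i$ in (6) vanish, $A^T A$ is diagonal, and the normal equations decouple into $C = \sum b_i / m$ and $D = \sum t_i b_i / \sum t_i^2$. A sketch of that observation (the code is ours; the data are from Example 2):

    import numpy as np

    t = np.array([-2.0, 0.0, 2.0])      # measurement times add to zero
    b = np.array([1.0, 2.0, 4.0])
    A = np.column_stack([np.ones(3), t])

    print(A.T @ A)                      # diagonal: [[3. 0.], [0. 8.]]
    C = b.sum() / len(t)                # decoupled equation for C
    D = (t * b).sum() / (t**2).sum()    # decoupled equation for D
    print(np.allclose(np.linalg.solve(A.T @ A, A.T @ b), [C, D]))  # True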

