
Multiple Linear Regression Analysis: A Matrix Approach with MATLAB

Alabama Journal of Mathematics, Spring/Fall 2009.

Scott H. Brown, Auburn University Montgomery

Linear regression is one of the fundamental models in statistics, used to determine the relationship between dependent and independent variables. An extension of this model, multiple linear regression, is used to represent the relationship between a dependent variable and several independent variables. This article focuses on expressing the multiple linear regression model using matrix notation and analyzing the model using a script approach with MATLAB. This approach is designed to enable high school or university students to better understand matrix operations and the algorithm used to analyze multiple linear regression.



Multiple Linear Regression Model

A simple linear regression illustrates the relation between the dependent variable y and the independent variable x based on the regression equation

    y_i = β_0 + β_1 x_i + e_i,   i = 1, 2, 3, ..., n    (1)

Using the least squares method, the best fitting line can be found by minimizing the sum of the squares of the vertical distances from each data point to the line. For further interesting discussion on this subject see Gordon and Gordon (2004) and Scariano and Calzada (2004).

According to the multiple linear regression model, the dependent variable is related to two or more independent variables. The general model for k variables is of the form

    y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + e_i,   i = 1, 2, ..., n    (2)

The simple linear regression model is used to find the straight line that best fits the data. The multiple linear regression model with, for example, two independent variables is used to find the plane that best fits the data. Models that involve more than two independent variables are more complex in structure but can still be analyzed using multiple linear regression techniques.

In multiple linear regression analysis, the method of least squares is used to estimate the regression coefficients in (2). The regression coefficients illustrate the unrelated contributions of each independent variable toward predicting the dependent variable. Unlike simple linear regression, inferences must be made about the degree of interaction or correlation between the independent variables. The computations used in finding the regression coefficients (β_i, i = 1, ..., k), the residual sum of squares (SSE), the regression sum of squares (SSR), etc. are rather complex. To simplify the computation, the multiple regression model can be written in terms of the observations using matrix notation.

A Matrix Approach to Multiple Linear Regression Analysis

Using matrices allows for a more compact framework in terms of vectors representing the observations, the levels of the regressor variables, the regression coefficients, and the random errors. The model has the form

    Y = Xβ + ε    (3)

and when written out in matrix notation we have

    | y_1 |   | 1  x_11 ... x_1k | | β_0 |   | ε_1 |
    | y_2 | = | 1  x_21 ... x_2k | | β_1 | + | ε_2 |    (4)
    |  :  |   | :    :        :  | |  :  |   |  :  |
    | y_n |   | 1  x_n1 ... x_nk | | β_k |   | ε_n |

Note that Y is an n × 1 random vector consisting of the observations, X is an n × (k+1) matrix determined by the predictors, β is a (k+1) × 1 vector of unknown parameters, and ε is an n × 1 vector of random errors.

The first step in multiple linear regression analysis is to determine the vector of least squares estimators, β̂, which gives the linear combination ŷ that minimizes the length of the error vector. Basically, the estimator provides the least possible value of the sum of squared differences between ŷ and y. Algebraically, β̂ can be expressed using matrix notation. An important stipulation in multiple regression analysis is that the variables x_1, x_2, ..., x_n be linearly independent; this implies that the correlation between each x_i is small. Now, since the objective of multiple regression is to minimize the sum of the squared errors, the regression coefficients that meet this condition are determined by solving the least squares normal equation

    X^T X β̂ = X^T Y.    (5)

If the variables x_1, x_2, ..., x_n are linearly independent, then the inverse of X^T X, namely (X^T X)^(-1), will exist. Multiplying both sides of the normal equation (5) by (X^T X)^(-1), we obtain

    β̂ = (X^T X)^(-1) X^T Y.    (6)

Several mathematical software packages such as Mathematica, Stata, and MATLAB provide matrix commands to determine the solution to the normal equation, as shown in MathWorks (2006), Kohler and Kreuter (2005), and Research (2006). The reader will also find that the more advanced Texas Instruments (TI) graphing calculators allow a student to perform multiple linear regression analysis using the matrix approach; an application of the graphing calculator approach can be found in Wilson et al. (2004). We will focus on using MATLAB and the option to write a program with matrix commands. Creating a program in this manner fosters a good understanding of matrix algebra and multiple linear regression analysis.

A MATLAB Approach

There are several options in MATLAB for performing multiple linear regression analysis. One option is Generalized Linear Models in MATLAB (glmlab), which is available for Windows, Macintosh, or Unix. Variables and data can be loaded through the main glmlab window screen; for further details about the capabilities of glmlab see Dunn (2000). Another option is the Statistics Toolbox, which allows the user to program with functions. MATLAB programs can also be written with m-files. These files are text files created with either functions or script. A function requires an input or output argument. While the function method simplifies writing a program, using script better illustrates the process of obtaining the least squares estimator with matrix commands. In our example we will use script to write our program.

In the following example we are measuring the quantity y (dependent variable) for several values of x_1 and x_2 (independent variables). We will use the following table of values (some entries did not survive in this copy):

    y     x1   x2
    .19   .5   .4
    .28   .8   .6
    .30   .9   .7
    .25
    .29
    .28              (7)

The least squares estimators of β are found by writing the following MATLAB program in script form using matrix notation:

    X=[1 .5 .4;1 .8 .6;1 .9 .7;1  ;1  ;1  ];
    X
    Y=[.19;.28;.30;.25;.29;.28];
    Y
    A=X'*X;
    A
    K=inv(X'*X);
    K
    B=K*X'*Y;
    B
    M=X*B;
    M
    E=Y-M;
    E
    MaxErr=max(abs(Y-M))

The importance of these steps in the program is to illustrate the use of matrix algebra to find the least squares estimators. Recall the least squares estimators β̂ = (X^T X)^(-1) X^T Y. The first step in the program computes the product of X^T and X:

    A = X^T X    (8)

In this next step, the instructor can reinforce the concept that the inverse exists only if the columns of X are linearly independent. In our case the inverse does exist:

    K = (X^T X)^(-1)    (9)

We can now find the least squares estimators,

    B = β̂ = K X^T Y.    (10)

According to these values the corresponding fitted regression model is:

    ŷ = β̂_0 + β̂_1 x_1 + β̂_2 x_2    (11)

One additional step is to validate the regression model for the data by computing the maximum error. In our example the error matrix is

    E = Y - Xβ̂.    (12)

Based on these values one will find the maximum error to be small, which indicates that the model accurately follows the data.

Conclusion

In this paper we introduced an alternative approach of combining MATLAB script and matrix algebra to analyze multiple linear regression. This approach is relatively simple.

References

Gordon, S. and Gordon, F. (2004). Deriving the regression equations without calculus. Mathematics and Computer Education, 38(1):64-68.

Kohler, U. and Kreuter, F. (2005).
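As a companion to the MATLAB script above, here is a minimal Python sketch of the same matrix computation, β̂ = (X^T X)^(-1) X^T Y, solved via the normal equation (5) with Gaussian elimination instead of an explicit inverse. The first three data rows match the table above; the last three rows and all variable names are made-up stand-ins for illustration, not values from the article.

```python
# Sketch of the article's matrix least-squares procedure in pure Python.
# The normal equations X^T X B = X^T Y are solved by Gaussian elimination.
# Last three data rows are invented placeholders, NOT the paper's values.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, b):
    """Solve A x = b (b a column vector) by elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i][0]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return [[v] for v in x]

# Design matrix: intercept column plus two predictors x1, x2.
X = [[1, 0.5, 0.4],
     [1, 0.8, 0.6],
     [1, 0.9, 0.7],
     [1, 0.3, 0.2],   # placeholder row
     [1, 0.6, 0.5],   # placeholder row
     [1, 0.7, 0.6]]   # placeholder row
Y = [[0.19], [0.28], [0.30], [0.25], [0.29], [0.28]]

Xt = transpose(X)
B = solve(matmul(Xt, X), matmul(Xt, Y))   # least squares estimators
M = matmul(X, B)                          # fitted values, M = X*B
max_err = max(abs(y[0] - m[0]) for y, m in zip(Y, M))
print("coefficients:", [round(b[0], 4) for b in B])
print("max abs error:", round(max_err, 4))
```

In production code one would of course use a library solver (e.g. MATLAB's backslash operator or `numpy.linalg.lstsq`) rather than hand-rolled elimination; the point here, as in the article, is to make each matrix step visible.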

