
Nathaniel E. Helwig - Statistics


Multivariate Linear Regression Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 16-Jan-2017


Copyright © 2017 by Nathaniel E. Helwig

Outline of Notes

1) Multiple Linear Regression
   - Model form and assumptions
   - Parameter estimation
   - Inference and prediction
2) Multivariate Linear Regression
   - Model form and assumptions
   - Parameter estimation
   - Inference and prediction

Multiple Linear Regression

Model Form and Assumptions

MLR Model: Scalar Form

The multiple linear regression model has the form

$$y_i = b_0 + \sum_{j=1}^{p} b_j x_{ij} + e_i$$

for $i \in \{1, \ldots, n\}$, where
- $y_i \in \mathbb{R}$ is the real-valued response for the $i$-th observation
- $b_0 \in \mathbb{R}$ is the regression intercept
- $b_j \in \mathbb{R}$ is the $j$-th predictor's regression slope
- $x_{ij} \in \mathbb{R}$ is the $j$-th predictor for the $i$-th observation
- $e_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)$ is a Gaussian error term

MLR Model: Nomenclature

The model is multiple because we have $p > 1$ predictors. If $p = 1$, we have a simple linear regression model.

The model is linear because $y_i$ is a linear function of the parameters ($b_0, b_1, \ldots, b_p$ are the parameters).

The model is a regression model because we are modeling a response variable ($Y$) as a function of predictor variables ($X_1, \ldots, X_p$).

MLR Model: Assumptions

The fundamental assumptions of the MLR model are:
1) The relationship between $X_j$ and $Y$ is linear (given the other predictors)
2) $x_{ij}$ and $y_i$ are observed random variables (known constants)
3) $e_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)$ is an unobserved random variable
4) $b_0, b_1, \ldots, b_p$ are unknown constants
5) $(y_i \mid x_{i1}, \ldots, x_{ip}) \overset{\mathrm{ind}}{\sim} N(b_0 + \sum_{j=1}^{p} b_j x_{ij}, \sigma^2)$; note the homogeneity of variance

Note: $b_j$ is the expected increase in $Y$ for a 1-unit increase in $X_j$ with all other predictor variables held constant.
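The scalar form of the model can be made concrete by simulating from it. The following is only a sketch: the sample size, coefficient values, and error standard deviation are hypothetical choices made for illustration and do not come from the notes.

```r
# Simulate from y_i = b0 + sum_j b_j * x_ij + e_i (all values hypothetical)
set.seed(1)
n <- 200                            # number of observations
p <- 2                              # number of predictors
b <- c(1, 2, -0.5)                  # hypothetical (b0, b1, b2)
sigma <- 0.3                        # hypothetical error standard deviation

x <- matrix(rnorm(n * p), n, p)     # predictors, treated as known constants
e <- rnorm(n, mean = 0, sd = sigma) # iid N(0, sigma^2) error terms
y <- b[1] + drop(x %*% b[-1]) + e   # response generated by the model

# Fitting the model should give estimates close to the chosen b
fit <- lm(y ~ x)
round(coef(fit), 2)
```

Because the errors are small relative to the sample size, the fitted coefficients should land near the values in `b`, although the exact estimates depend on the random seed.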

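MLR Model: Matrix Form

The multiple linear regression model has the form

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}$$

where
- $\mathbf{y} = (y_1, \ldots, y_n)^\top \in \mathbb{R}^n$ is the $n \times 1$ response vector
- $\mathbf{X} = [\mathbf{1}_n, \mathbf{x}_1, \ldots, \mathbf{x}_p] \in \mathbb{R}^{n \times (p+1)}$ is the $n \times (p+1)$ design matrix
  - $\mathbf{1}_n$ is an $n \times 1$ vector of ones
  - $\mathbf{x}_j = (x_{1j}, \ldots, x_{nj})^\top \in \mathbb{R}^n$ is the $j$-th predictor vector ($n \times 1$)
- $\mathbf{b} = (b_0, b_1, \ldots, b_p)^\top \in \mathbb{R}^{p+1}$ is the $(p+1) \times 1$ vector of coefficients
- $\mathbf{e} = (e_1, \ldots, e_n)^\top \in \mathbb{R}^n$ is the $n \times 1$ error vector

MLR Model: Matrix Form (another look)

Matrix form writes the MLR model for all $n$ points simultaneously:

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}
\quad\Longleftrightarrow\quad
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1p} \\
1 & x_{21} & x_{22} & \cdots & x_{2p} \\
1 & x_{31} & x_{32} & \cdots & x_{3p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_p \end{pmatrix}
+
\begin{pmatrix} e_1 \\ e_2 \\ e_3 \\ \vdots \\ e_n \end{pmatrix}$$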
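In R, the design matrix described above can be built with `model.matrix()`. This sketch uses the built-in `mtcars` data; taking `wt` and `hp` as the predictors is an assumption made purely for illustration.

```r
# Build the n x (p + 1) design matrix X = [1_n, x_1, ..., x_p]
data(mtcars)
y <- mtcars$mpg                              # n x 1 response vector
X <- model.matrix(~ wt + hp, data = mtcars)  # columns: (Intercept), wt, hp

dim(X)       # n rows and p + 1 columns
head(X, 3)   # the first column is the vector of ones
```

The intercept column of ones is added automatically by the formula interface, so there is no need to bind it on by hand.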

MLR Model: Assumptions (revisited)

In matrix terms, the error vector is multivariate normal:

$$\mathbf{e} \sim N(\mathbf{0}_n, \sigma^2 \mathbf{I}_n)$$

In matrix terms, the response vector is multivariate normal given $\mathbf{X}$:

$$(\mathbf{y} \mid \mathbf{X}) \sim N(\mathbf{X}\mathbf{b}, \sigma^2 \mathbf{I}_n)$$

Parameter Estimation

Ordinary Least Squares

The ordinary least squares (OLS) problem is

$$\min_{\mathbf{b} \in \mathbb{R}^{p+1}} \|\mathbf{y} - \mathbf{X}\mathbf{b}\|^2
= \min_{\mathbf{b} \in \mathbb{R}^{p+1}} \sum_{i=1}^{n} \Big( y_i - b_0 - \sum_{j=1}^{p} b_j x_{ij} \Big)^2$$

where $\|\cdot\|$ denotes the Frobenius norm.

The OLS solution has the form

$$\hat{\mathbf{b}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}$$

Fitted Values and Residuals

SCALAR FORM: Fitted values are given by $\hat{y}_i = \hat{b}_0 + \sum_{j=1}^{p} \hat{b}_j x_{ij}$ and residuals are given by $\hat{e}_i = y_i - \hat{y}_i$.

MATRIX FORM: Fitted values are given by $\hat{\mathbf{y}} = \mathbf{X}\hat{\mathbf{b}}$ and residuals are given by $\hat{\mathbf{e}} = \mathbf{y} - \hat{\mathbf{y}}$.
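The OLS solution and the matrix-form fitted values can be computed directly and compared with `lm()`. This is a minimal sketch on the built-in `mtcars` data; the predictors `wt` and `hp` are an illustrative assumption. It solves the normal equations rather than forming the inverse explicitly, which gives the same estimate.

```r
# OLS by hand: solve (X'X) b = X'y, then compare with lm()
data(mtcars)
X <- model.matrix(~ wt + hp, data = mtcars)
y <- mtcars$mpg

bhat <- solve(crossprod(X), crossprod(X, y))  # (X'X)^{-1} X'y
yhat <- X %*% bhat                            # fitted values  X bhat
ehat <- y - yhat                              # residuals      y - yhat

fit <- lm(mpg ~ wt + hp, data = mtcars)
all.equal(drop(bhat), coef(fit), check.attributes = FALSE)
all.equal(drop(ehat), residuals(fit), check.attributes = FALSE)
```

Note that `lm()` itself uses a QR decomposition instead of the normal equations, so agreement is up to floating-point tolerance rather than bit for bit.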

Hat Matrix

Note that we can write the fitted values as

$$\hat{\mathbf{y}} = \mathbf{X}\hat{\mathbf{b}} = \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top \mathbf{y} = \mathbf{H}\mathbf{y}$$

where $\mathbf{H} = \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top$ is the hat matrix.

$\mathbf{H}$ is a symmetric and idempotent matrix: $\mathbf{H}\mathbf{H} = \mathbf{H}$.

$\mathbf{H}$ projects $\mathbf{y}$ onto the column space of $\mathbf{X}$.

Multiple Regression Example in R

    > data(mtcars)
    > head(mtcars)
                       mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
    > mtcars$cyl <- factor(mtcars$cyl)
    > mod <- lm(mpg ~ cyl + am + carb, data=mtcars)
    > coef(mod)
    (Intercept)        cyl6        cyl8          am        carb
    [coefficient values missing from this transcription]
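The hat matrix properties stated above can be checked numerically for the example model. This is a sketch that forms the hat matrix explicitly for clarity; the variable names `X` and `H` are chosen here and are not part of the original example.

```r
# Hat matrix for the example model mpg ~ cyl + am + carb
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl + am + carb, data = mtcars)

X <- model.matrix(mod)                   # design matrix used by lm()
H <- X %*% solve(crossprod(X)) %*% t(X)  # H = X (X'X)^{-1} X'

all.equal(H, t(H))                       # symmetric
all.equal(H %*% H, H)                    # idempotent: HH = H
all.equal(drop(H %*% mtcars$mpg), fitted(mod),
          check.attributes = FALSE)      # Hy gives the fitted values
sum(diag(H))                             # trace equals ncol(X) when X has full column rank
```

For large `n`, forming the full `n` by `n` matrix is wasteful; `hatvalues(mod)` returns just the diagonal of the hat matrix.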

Regression Sums-of-Squares: Scalar Form

In MLR models, the relevant sums-of-squares are
- Sum-of-Squares Total: $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
- Sum-of-Squares Regression: $SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
- Sum-of-Squares Error: $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

The corresponding degrees of freedom are
- SST: $df_T = n - 1$
- SSR: $df_R = p$
- SSE: $df_E = n - p - 1$

Regression Sums-of-Squares: Matrix Form

In MLR models, the relevant sums-of-squares are

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \mathbf{y}^\top [\mathbf{I}_n - (1/n)\mathbf{J}] \mathbf{y}$$

$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \mathbf{y}^\top [\mathbf{H} - (1/n)\mathbf{J}] \mathbf{y}$$

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \mathbf{y}^\top [\mathbf{I}_n - \mathbf{H}] \mathbf{y}$$

Note: $\mathbf{J}$ is an $n \times n$ matrix of ones.
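The scalar and matrix forms of the sums-of-squares should give identical numbers. The sketch below checks this for the `mpg ~ cyl + am + carb` model used in these notes; the helper names (`I_n`, `J`, `SST_m`, and so on) are chosen here for illustration.

```r
# Sums-of-squares in scalar and matrix form for the example model
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl + am + carb, data = mtcars)

y <- mtcars$mpg
n <- length(y)
X <- model.matrix(mod)
H <- X %*% solve(crossprod(X)) %*% t(X)  # hat matrix
J <- matrix(1, n, n)                     # n x n matrix of ones
I_n <- diag(n)                           # n x n identity

# Scalar form
SST <- sum((y - mean(y))^2)
SSR <- sum((fitted(mod) - mean(y))^2)
SSE <- sum(residuals(mod)^2)

# Matrix form
SST_m <- drop(t(y) %*% (I_n - J / n) %*% y)
SSR_m <- drop(t(y) %*% (H - J / n) %*% y)
SSE_m <- drop(t(y) %*% (I_n - H) %*% y)

# Degrees of freedom: p counts the non-intercept columns of X
c(dfT = n - 1, dfR = ncol(X) - 1, dfE = df.residual(mod))
```

Because `cyl` is a three-level factor, it contributes two columns to the design matrix, so `p` here is 4 rather than 3.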

Partitioning the Variance

We can partition the total variation in $y_i$ as

$$\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
    &= \sum_{i=1}^{n} (y_i - \hat{y}_i + \hat{y}_i - \bar{y})^2 \\
    &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + 2 \sum_{i=1}^{n} (\hat{y}_i - \bar{y})(y_i - \hat{y}_i) \\
    &= SSR + SSE + 2 \sum_{i=1}^{n} (\hat{y}_i - \bar{y}) \hat{e}_i \\
    &= SSR + SSE
\end{aligned}$$

The cross-product term vanishes because the residual vector is orthogonal to the column space of $\mathbf{X}$, which contains both $\hat{\mathbf{y}}$ and the vector of ones.

Regression Sums-of-Squares in R

    > anova(mod)
    Analysis of Variance Table

    Response: mpg
              Df  Sum Sq  Mean Sq  F value  Pr(>F)
    cyl        2
    am         1
    carb       1
    Residuals 27
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    > library(car)  # Anova() is provided by the car package
    > Anova(mod, type=3)
    Anova Table (Type III tests)

    Response: mpg
                 Sum Sq  Df  F value  Pr(>F)
    (Intercept)           1
    cyl                   2
    am                    1
    carb                  1
    Residuals            27
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

[The Sum Sq, Mean Sq, F value, and Pr(>F) entries are missing from this transcription. The surviving significance stars flag cyl, am, and carb as significant at the 0.05 level or better in both tables.]

Coefficient of Multiple Determination

The coefficient of multiple determination is defined as

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

and gives the amount of variation in $y_i$ that is explained by the linear relationships with $x_{i1}, \ldots, x_{ip}$.

When interpreting $R^2$ values, note that
- $0 \le R^2 \le 1$
- Large $R^2$ values do not necessarily imply a good model
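The variance partition and the two expressions for the coefficient of multiple determination can be verified numerically. This sketch uses the example model from these notes and base R only; the object names are chosen here for illustration.

```r
# Check SST = SSR + SSE and compute R^2 by hand
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl + am + carb, data = mtcars)

y    <- mtcars$mpg
yhat <- fitted(mod)
ehat <- residuals(mod)

SST <- sum((y - mean(y))^2)
SSR <- sum((yhat - mean(y))^2)
SSE <- sum(ehat^2)

cross <- sum((yhat - mean(y)) * ehat)  # cross-product term, zero up to rounding
all.equal(SST, SSR + SSE)              # the partition holds

R2 <- SSR / SST
all.equal(R2, 1 - SSE / SST)           # both definitions agree
all.equal(R2, summary(mod)$r.squared)  # matches the value reported by lm
```

The partition relies on the model containing an intercept; without one, the cross-product term need not vanish.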

Adjusted Coefficient of Multiple Determination ($R^2_a$)

Including more predictors in an MLR model can artificially inflate $R^2$:
- Capitalizing on spurious effects present in noisy data
- Phenomenon of over-fitting the data

The adjusted $R^2$ is a relative measure of fit:

$$R^2_a = 1 - \frac{SSE / df_E}{SST / df_T} = 1 - \frac{\hat{\sigma}^2}{s_Y^2}$$

where $s_Y^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$ is the sample estimate of the variance of $Y$.

Note: $R^2$ and $R^2_a$ have different interpretations!

Regression Sums-of-Squares in R

    > smod <- summary(mod)
    > names(smod)
     [1] "call"          "terms"         "residuals"     "coefficients"
     [5] "aliased"       "sigma"         "df"            "r.squared"
     [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
    > summary(mod)$r.squared
    [1] [value missing from this transcription]
    > summary(mod)$adj.r.squared
    [1] [value missing from this transcription]
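The adjusted coefficient can be reproduced from the sums-of-squares and their degrees of freedom. This is a sketch for the example model; `sigma2hat` and the other object names are chosen here and are not from the notes.

```r
# Adjusted R^2 by hand for the example model
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mod <- lm(mpg ~ cyl + am + carb, data = mtcars)

y   <- mtcars$mpg
n   <- length(y)
SST <- sum((y - mean(y))^2)
SSE <- sum(residuals(mod)^2)
dfT <- n - 1
dfE <- df.residual(mod)            # n - p - 1

sigma2hat <- SSE / dfE             # estimate of the error variance
R2a <- 1 - (SSE / dfE) / (SST / dfT)

all.equal(R2a, 1 - sigma2hat / var(y))      # var(y) is s_Y^2
all.equal(R2a, summary(mod)$adj.r.squared)  # matches summary()
all.equal(sqrt(sigma2hat), summary(mod)$sigma)
```

Unlike $R^2$, the adjusted version can decrease when a predictor is added, since the gain in fit has to outweigh the lost degree of freedom.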

