Transcription of Chapter 1 Simple Linear Regression (Part 2)
1 Software R and Regression analysis

Downloadable from [...]. Some useful commands:

  setwd( ) .. to change the directory for data loading and saving
  [...] .. for reading/loading data
  data$variable .. variable in the data
  plot(X, Y) .. plotting Y against X (starting a new plot)
  lines(X, Y) .. to add lines on an existing plot
  object = lm(y ~ x) .. to call lm to estimate a model and store the calculation results in object
  Exporting the plotted figure (save as .pdf, .ps or other files)

Example. We have 10 observations for (X, Y): [...]. They are stored in the file [...]. We hope to fit a linear regression model

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \dots, n.$$

Code of R (the words after # are comments only):

  mydata = [...]([...])     # read the data from the file
  X = mydata$V1             # select X
  Y = mydata$V2             # select Y
  plot(X, Y)                # plot the observations (data)
  myreg = lm(Y ~ X)         # do the linear regression
  summary(myreg)            # output the estimation

Output of summary(myreg):

  Coefficients:
              Estimate Std. Error t value Pr(>|t|)
  (Intercept) [...] **
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  Residual standard error: [...] on 8 degrees of freedom
  Multiple R-squared: [...], Adjusted R-squared: [...]
  F-statistic: [...] on 1 and 8 DF, p-value: [...]

Code of R, continued:

  [...](X, myreg$fitted)    # plot the fitted line
  title("Scatter of (X,Y) and fitted Linear Regression model")   # add title
  # Please get to know how to make a figure file for later use

Figure 1: Scatter of (X,Y) and fitted linear regression model, with X on the horizontal axis and Y on the vertical axis (R code).

The fitted regression line/model is $\hat{Y} = [\ldots] + [\ldots]\,X$. For any new subject/individual with $X^*$, its prediction of $E(Y)$ is

$$\hat{Y} = b_0 + b_1 X^*.$$

For the above data:

- If $X^* = 3$, then we predict $\hat{Y} = [\ldots]$
- If $X^* = 3$, then we predict $\hat{Y} = [\ldots]$
- If $X^* = [\ldots]$, then we predict $\hat{Y} = [\ldots]$

2 Properties of Least squares estimators

Statistical properties in theory

LSE is unbiased: $E\{b_1\} = \beta_1$, $E\{b_0\} = \beta_0$.

Proof: By the model, we have $\bar{Y} = \beta_0 + \beta_1 \bar{X} + \bar{\varepsilon}$ and

$$b_1 = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^n (X_i - \bar{X})^2}
= \frac{\sum_{i=1}^n (X_i - \bar{X})(\beta_0 + \beta_1 X_i + \varepsilon_i - \beta_0 - \beta_1 \bar{X} - \bar{\varepsilon})}{\sum_{i=1}^n (X_i - \bar{X})^2}$$

$$= \beta_1 + \frac{\sum_{i=1}^n (X_i - \bar{X})(\varepsilon_i - \bar{\varepsilon})}{\sum_{i=1}^n (X_i - \bar{X})^2}
= \beta_1 + \frac{\sum_{i=1}^n (X_i - \bar{X})\,\varepsilon_i}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

The last step uses $\sum_{i=1}^n (X_i - \bar{X}) = 0$. Recall that $E\varepsilon_i = 0$.
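It follows that $E b_1 = \beta_1$, and

$$E(b_0) = E(\bar{Y} - b_1 \bar{X}) = \beta_0 + \beta_1 \bar{X} - E(b_1)\bar{X} = \beta_0 + \beta_1 \bar{X} - \beta_1 \bar{X} = \beta_0.$$

Variance of the estimators

$$\mathrm{Var}(b_1) = \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar{X})^2}, \qquad
\mathrm{Var}(b_0) = \left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\sigma^2.$$

[Proof:

$$\mathrm{Var}(b_1) = \mathrm{Var}\left(\frac{\sum_{i=1}^n (X_i - \bar{X})\,\varepsilon_i}{\sum_{i=1}^n (X_i - \bar{X})^2}\right)
= \Big\{\sum_{i=1}^n (X_i - \bar{X})^2\Big\}^{-2}\,\mathrm{Var}\Big\{\sum_{i=1}^n (X_i - \bar{X})\,\varepsilon_i\Big\}$$

$$= \Big\{\sum_{i=1}^n (X_i - \bar{X})^2\Big\}^{-2} \sum_{i=1}^n (X_i - \bar{X})^2\,\sigma^2
= \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

We shall prove the second equation later.]

Estimated (fitted) regression function

$$\hat{Y}_i = b_0 + b_1 X_i$$

is called the fitted value, and $E\{\hat{Y}_i\} = E Y_i$.

[Proof: $E(\hat{Y}_i) = E(b_0 + b_1 X_i) = E(b_0) + E(b_1) X_i = \beta_0 + \beta_1 X_i = E Y_i$.]

Numerical properties of fitted regression line

Recall the normal equations

$$-2\sum_{i=1}^n (Y_i - b_0 - b_1 X_i) = 0, \qquad -2\sum_{i=1}^n X_i (Y_i - b_0 - b_1 X_i) = 0,$$

and $e_i = Y_i - \hat{Y}_i = Y_i - b_0 - b_1 X_i$. It follows that

$$\sum_{i=1}^n e_i = 0, \qquad \sum_{i=1}^n X_i e_i = 0.$$

The following properties follow:

- $\sum_{i=1}^n e_i = 0$
- $\sum_{i=1}^n Y_i = \sum_{i=1}^n \hat{Y}_i$
- $\sum_{i=1}^n e_i^2 = \min_{b_0, b_1}\{Q\}$
- $\sum_{i=1}^n X_i e_i = 0$
- $\sum_{i=1}^n \hat{Y}_i e_i = 0$
- The regression line always goes through $(\bar{X}, \bar{Y})$
- $Y_i - \bar{Y} = \beta_1 (X_i - \bar{X}) + \tilde{\varepsilon}_i$, where $\tilde{\varepsilon}_i = \varepsilon_i - \bar{\varepsilon}$

The slope coefficient and the correlation coefficient

$$b_1 = r_{X,Y}\,\frac{s_Y}{s_X},$$

where

$$s_X = \sqrt{\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1}}, \qquad
s_Y = \sqrt{\frac{\sum_{i=1}^n (Y_i - \bar{Y})^2}{n-1}}, \qquad
r_{X,Y} = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^n (X_i - \bar{X})^2 \sum_{i=1}^n (Y_i - \bar{Y})^2}}.$$

Estimation of Error Terms Variance $\sigma^2$

Sum of squares of residuals or error sum of squares (SSE):

$$SSE = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^n e_i^2.$$

Estimate $\sigma^2$ by

$$s^2 = \frac{\sum_{i=1}^n (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum_{i=1}^n e_i^2}{n-2},$$

called the mean squared error (MSE), i.e. $MSE = \sum_{i=1}^n e_i^2/(n-2)$, or denoted by $\hat{\sigma}^2$.

Why is it divided by $n-2$?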
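Because there are TWO constraints on $e_i$, $i = 1, \dots, n$, i.e. the normal equations.

$s^2$ is an unbiased estimator of $\sigma^2$, i.e. $E(s^2) = \sigma^2$.

[Proof: For any $\xi_1, \dots, \xi_n$ IID with mean $\mu$ and variance $\sigma^2$, we have

$$E\sum_{i=1}^n (\xi_i - \bar{\xi})^2 = E\sum_{i=1}^n [(\xi_i - \mu) - (\bar{\xi} - \mu)]^2
= E\Big\{\sum_{i=1}^n (\xi_i - \mu)^2 - n(\bar{\xi} - \mu)^2\Big\}$$

$$= \sum_{i=1}^n \mathrm{Var}(\xi_i) - n\,\mathrm{Var}(\bar{\xi}) = n\sigma^2 - \sigma^2 = (n-1)\sigma^2.$$

This is why we estimate $\sigma^2$ by $\hat{\sigma}^2 = \sum_{i=1}^n (\xi_i - \bar{\xi})^2/(n-1)$.

Moreover, writing $\xi_1 - \bar{\xi}$ as $(1 - \frac{1}{n})\xi_1$ minus $n-1$ terms of the form $\frac{1}{n}\xi_j$,

$$\mathrm{Var}(\xi_1 - \bar{\xi}) = \mathrm{Var}\Big\{\big(1 - \tfrac{1}{n}\big)\xi_1 - \tfrac{1}{n}\xi_2 - \dots - \tfrac{1}{n}\xi_n\Big\}
= \big(1 - \tfrac{1}{n}\big)^2\sigma^2 + \tfrac{1}{n^2}\sigma^2 + \dots + \tfrac{1}{n^2}\sigma^2$$

$$= \big(1 - \tfrac{2}{n} + \tfrac{1}{n^2}\big)\sigma^2 + \tfrac{n-1}{n^2}\sigma^2 = \big(1 - \tfrac{1}{n}\big)\sigma^2.$$

Thus, for any $i$, $\mathrm{Var}(\xi_i - \bar{\xi}) = (1 - \frac{1}{n})\sigma^2$.

Now turn to the regression model. Since $E(Y_i - \hat{Y}_i) = 0$ and $Y_i - \hat{Y}_i = Y_i - \bar{Y} - b_1(X_i - \bar{X})$,

$$E\Big\{\sum_{i=1}^n (Y_i - \hat{Y}_i)^2\Big\} = \sum_{i=1}^n E(Y_i - \hat{Y}_i)^2
= \sum_{i=1}^n \big[\mathrm{Var}(Y_i - \hat{Y}_i) + \{E(Y_i - \hat{Y}_i)\}^2\big]
= \sum_{i=1}^n \mathrm{Var}\{Y_i - \bar{Y} - b_1(X_i - \bar{X})\}$$

$$= \sum_{i=1}^n \big\{\mathrm{Var}(Y_i - \bar{Y}) - 2\,\mathrm{Cov}(Y_i - \bar{Y},\, b_1(X_i - \bar{X})) + \mathrm{Var}(b_1)(X_i - \bar{X})^2\big\}$$

$$= \sum_{i=1}^n \big\{\mathrm{Var}(\varepsilon_i - \bar{\varepsilon}) - 2\,\mathrm{Cov}((Y_i - \bar{Y})(X_i - \bar{X}),\, b_1) + \mathrm{Var}(b_1)(X_i - \bar{X})^2\big\}$$

$$= (n-1)\sigma^2 - 2\,\mathrm{Cov}\Big(\sum_{i=1}^n (Y_i - \bar{Y})(X_i - \bar{X}),\, b_1\Big) + \mathrm{Var}(b_1)\sum_{i=1}^n (X_i - \bar{X})^2$$

$$= (n-1)\sigma^2 - 2\,\mathrm{Cov}\Big(b_1\sum_{i=1}^n (X_i - \bar{X})^2,\, b_1\Big) + \mathrm{Var}(b_1)\sum_{i=1}^n (X_i - \bar{X})^2$$

$$= (n-1)\sigma^2 - \mathrm{Var}(b_1)\sum_{i=1}^n (X_i - \bar{X})^2 = (n-2)\sigma^2.$$

Therefore $E(s^2) = \sigma^2$.]

Example. For the above example, the MSE (estimator of $\sigma^2 = \mathrm{Var}(\varepsilon_i)$) is

$$MSE = \sum_{i=1}^n e_i^2/(n-2) = [\ldots] = [\ldots].$$

$\sqrt{MSE} = [\ldots]$ is also called the residual standard error. [How to find the value in the output of R?]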
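The numerical properties of the fitted line, the relation $b_1 = r_{X,Y}\, s_Y/s_X$, and the MSE can be checked numerically. The following is a minimal sketch in R. It assumes simulated data (the seed, the sample `X`, `Y` and the coefficients 1 and 0.8 are hypothetical stand-ins, because the chapter's 10 observations are not reproduced here).

```r
# Hypothetical data: simulated stand-in for the chapter's 10 observations
set.seed(1)
n <- 10
X <- 1:n
Y <- 1 + 0.8 * X + rnorm(n)

fit  <- lm(Y ~ X)
b0   <- unname(coef(fit)[1])     # intercept estimate
b1   <- unname(coef(fit)[2])     # slope estimate
e    <- resid(fit)               # residuals e_i = Y_i - Yhat_i
Yhat <- fitted(fit)              # fitted values

# Numerical properties implied by the normal equations
sum_e      <- sum(e)             # zero up to rounding error
sum_Xe     <- sum(X * e)         # zero up to rounding error
sum_Yhat_e <- sum(Yhat * e)      # zero up to rounding error

# Slope expressed through the correlation coefficient
b1_from_r <- cor(X, Y) * sd(Y) / sd(X)

# Error sum of squares, mean squared error, residual standard error
SSE <- sum(e^2)
MSE <- SSE / (n - 2)
rse <- sqrt(MSE)

# Prediction of E(Y) at a new value X* = 3
Y_pred <- b0 + b1 * 3
```

In this sketch `rse` is the quantity that `summary(fit)` prints as "Residual standard error", and `Y_pred` agrees with `predict(fit, data.frame(X = 3))`.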
3 Regression Without Predictors

At first glance, it doesn't seem that studying regression without predictors would be very useful. Certainly, we are not suggesting that using regression without predictors is a major data analysis tool. We do think that it is worthwhile to look at regression models without predictors to see what they can tell us about the nature of the constant. Understanding the regression constant in these simpler models will help us to understand both the constant and the other regression coefficients in later, more complex models.

The model is

$$Y_i = \beta_0 + \varepsilon_i, \quad i = 1, 2, \dots, n,$$

where, as before, we assume $\varepsilon_i$, $i = 1, 2, \dots, n$, are IID with $E(\varepsilon_i) = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2$. (We shall call this model Regression Without Predictors.)

The least squares estimator $b_0$ is the minimizer of

$$Q = \sum_{i=1}^n \{Y_i - b_0\}^2.$$

Note that

$$\frac{dQ}{db_0} = -2\sum_{i=1}^n \{Y_i - b_0\}.$$

Letting it equal 0, we have the normal equation

$$\sum_{i=1}^n \{Y_i - b_0\} = 0,$$

which leads to the (ordinary) least squares estimator $b_0 = \bar{Y}$.

The fitted model is $\hat{Y}_i = \bar{Y}$.

The fitted residuals are $e_i = Y_i - \hat{Y}_i = Y_i - \bar{Y}$.

Can you prove the estimator is unbiased, i.e. $E b_0 = \beta_0$?
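The result $b_0 = \bar{Y}$ can be seen directly in R by fitting a model that contains only a constant. This is a small sketch under stated assumptions: the sample `Y` is simulated (hypothetical values, not the chapter's data), and `Y ~ 1` is R's formula for an intercept-only model.

```r
# Hypothetical sample: simulated values, not the chapter's data
set.seed(2)
Y <- rnorm(10, mean = 5, sd = 2)

fit0 <- lm(Y ~ 1)                # regression without predictors
b0   <- unname(coef(fit0))       # least squares estimate of the constant
e    <- resid(fit0)              # residuals e_i = Y_i - mean(Y)

# Variance estimate with divisor n - 1 (one constraint: sum(e) = 0)
s2 <- sum(e^2) / (length(Y) - 1)
```

Here `b0` coincides with `mean(Y)` and `s2` with `var(Y)`, so the constant in this model is simply the sample mean.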
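How to estimate $\sigma^2$?

$$\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^n e_i^2$$

Why is it divided by $n-1$?

4 Inference in regression

Next, we consider the simple linear regression model

$$\begin{aligned}
Y_1 &= \beta_0 + \beta_1 X_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 X_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 X_n + \varepsilon_n
\end{aligned} \qquad (1)$$

under assumptions of normal random errors.

- $X_i$ is known, observed, and nonrandom.
- $\varepsilon_1, \dots, \varepsilon_n$ are independent $N(0, \sigma^2)$; thus $Y_i$ is random.
- $\beta_0$, $\beta_1$ and $\sigma^2$ are parameters.

By the assumption, we have $E(Y_i) = \beta_0 + \beta_1 X_i$ and $\mathrm{Var}(Y_i) = \sigma^2$.

Inference of $\beta_1$

We need to check whether $\beta_1 = 0$ (or any other specified value, say [...]). Why?

- To check whether X and Y have a linear relationship.
- To see whether the model can be simplified (if $\beta_1 = 0$, the model becomes $Y_i = \beta_0 + \varepsilon_i$, a regression model without predictors).

For example, the hypotheses are $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$.

Sample distribution of $b_1$. Recall

$$b_1 = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

Theorem. For model (1) with normal assumption of $\varepsilon_i$,

$$b_1 \sim N\left(\beta_1,\ \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right).$$

Proof. Recall the fact that any linear combination of independent normally distributed random variables is still normal.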
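To find its distribution, we only need to find its mean and variance. The $\varepsilon_i$ are normal and independent, thus $b_1$ is normal, and $E b_1 = \beta_1$ and (we have proved that)

$$\mathrm{Var}(b_1) = \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

The theorem follows.

Question: what is the distribution of $b_1/\sqrt{\mathrm{Var}(b_1)}$ under $H_0$? Can we use this theorem to test the hypothesis $H_0$? Why?

Estimated variance of $b_1$ (estimating $\sigma^2$ by MSE):

$$s^2(b_1) = \frac{MSE}{\sum_{i=1}^n (X_i - \bar{X})^2} = \frac{\sum_{i=1}^n e_i^2/(n-2)}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

$s(b_1)$ is the standard error (SE) of $b_1$ (also called its estimated standard deviation).

Sample distribution of $(b_1 - \beta_1)/s(b_1)$:

$$\frac{b_1 - \beta_1}{s(b_1)} \ \text{follows}\ t(n-2) \ \text{for model (1)}.$$

Confidence interval for $\beta_1$. Denote by $t_{1-\alpha/2}(n-2)$ or $t(1-\alpha/2, n-2)$ the $1-\alpha/2$ quantile of $t(n-2)$. Then

$$P\big(t(\alpha/2, n-2) \le (b_1 - \beta_1)/s(b_1) \le t(1-\alpha/2, n-2)\big) = 1 - \alpha.$$

By symmetry of the distribution, we have

$$t(\alpha/2, n-2) = -t(1-\alpha/2, n-2).$$

Thus, with confidence $1-\alpha$, we have

$$-t(1-\alpha/2, n-2) \le (b_1 - \beta_1)/s(b_1) \le t(1-\alpha/2, n-2),$$

i.e. the confidence interval for $\beta_1$ is

$$b_1 - t(1-\alpha/2, n-2)\, s(b_1) \le \beta_1 \le b_1 + t(1-\alpha/2, n-2)\, s(b_1).$$

Example. For the example above, find the 95% confidence interval for $\beta_1$.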
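Solution: since $n = 10$, we have $t(1 - [\ldots], 8) = [\ldots]$; the SE for $b_1$ is $s(b_1) = [\ldots]$. Thus the confidence interval is

$$b_1 \pm t(1 - [\ldots], 8)\, s(b_1) = [\ldots] \pm [\ldots] = [\,[\ldots],\ [\ldots]\,].$$

Test of $\beta_1$

Two-sided test: to check whether $\beta_1$ is 0,

$$H_0: \beta_1 = 0, \qquad H_a: \beta_1 \neq 0.$$

Under $H_0$, we have the random variable

$$t = \frac{b_1}{s(b_1)} \sim t(n-2).$$

Suppose the significance level is $\alpha$ (usually [...]). Calculate $t$, say $t^*$.

- If $|t^*| \le t(1-\alpha/2; n-2)$, then accept $H_0$.
- If $|t^*| > t(1-\alpha/2; n-2)$, then reject $H_0$.

The test can also be done based on the p-value, defined as $p = P(|t| > |t^*|)$. It is easy to see that

$$\text{p-value} < \alpha \iff |t^*| > t(1-\alpha/2; n-2).$$

Thus

- If p-value $\ge \alpha$, then accept $H_0$.
- If p-value $< \alpha$, then reject $H_0$.

One-sided test: for example, to check whether $\beta_1$ is positive (or negative),

$$H_0: \beta_1 \ge 0, \qquad H_a: \beta_1 < 0.$$

Under $H_0$, we have

$$t = \frac{b_1}{s(b_1)} = \frac{b_1 - \beta_1}{s(b_1)} + \frac{\beta_1}{s(b_1)} \sim t(n-2) + \text{a nonnegative term}.$$

Suppose the significance level is $\alpha$ (usually [...]). Calculate $t$, say $t^*$.

- If $t^* \ge t(\alpha; n-2)$, then accept $H_0$.
- If $t^* < t(\alpha; n-2)$, then reject $H_0$.

Inference about $\beta_0$

Sample distribution of $b_0$: $b_0 = \bar{Y} - b_1 \bar{X}$.

Theorem. For model (1) with normal assumption of $\varepsilon_i$,

$$b_0 \sim N\left(\beta_0,\ \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\right).$$

[Proof. The expectation is

$$E b_0 = E\{\bar{Y}\} - E(b_1)\bar{X} = (\beta_0 + \beta_1 \bar{X}) - \beta_1 \bar{X} = \beta_0.$$

Let $k_i = (X_i - \bar{X})/\sum_{i=1}^n (X_i - \bar{X})^2$; then (see the proof at the beginning of this part)

$$b_1 = \beta_1 + \sum_{i=1}^n k_i \varepsilon_i.$$

Thus

$$b_0 = \bar{Y} - b_1 \bar{X} = \beta_0 + \frac{1}{n}\sum_{i=1}^n \varepsilon_i - \bar{X}\sum_{i=1}^n k_i \varepsilon_i = \beta_0 + \sum_{i=1}^n \Big[\frac{1}{n} - k_i \bar{X}\Big]\varepsilon_i.$$

The variance is, using $\sum_{i=1}^n k_i = 0$ and $\sum_{i=1}^n k_i^2 = 1/\sum_{i=1}^n (X_i - \bar{X})^2$,

$$\mathrm{Var}(b_0) = \sum_{i=1}^n \Big[\frac{1}{n} - k_i \bar{X}\Big]^2 \sigma^2 = \left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\sigma^2.$$

Therefore the theorem follows.]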
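The confidence interval and the two-sided t test for $\beta_1$ described above can be computed step by step in R. The sketch below is illustrative only: the data are simulated (hypothetical stand-ins for the chapter's 10 observations), and the significance level `alpha = 0.05` is an assumption chosen to match a 95% interval.

```r
# Hypothetical data: simulated stand-in for the chapter's 10 observations
set.seed(1)
n <- 10
X <- 1:n
Y <- 1 + 0.8 * X + rnorm(n)
fit <- lm(Y ~ X)

b1   <- unname(coef(fit)["X"])
MSE  <- sum(resid(fit)^2) / (n - 2)
s_b1 <- sqrt(MSE / sum((X - mean(X))^2))     # standard error s(b1)

alpha  <- 0.05                               # assumed significance level
t_crit <- qt(1 - alpha / 2, df = n - 2)      # t(1 - alpha/2; n - 2)

# Confidence interval b1 -/+ t(1 - alpha/2; n - 2) * s(b1)
ci_b1 <- c(b1 - t_crit * s_b1, b1 + t_crit * s_b1)

# Two-sided test of H0: beta1 = 0
t_star    <- b1 / s_b1
p_value   <- 2 * pt(-abs(t_star), df = n - 2)   # P(|t| > |t*|)
reject_H0 <- abs(t_star) > t_crit
```

The interval `ci_b1` matches the row for `X` returned by `confint(fit)`, and the decision based on `t_crit` always agrees with the decision based on `p_value < alpha`. The same steps apply to $b_0$ with $s(b_0)$ in place of $s(b_1)$.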
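Estimated variance of $b_0$ (by replacing $\sigma^2$ with MSE):

$$s^2(b_0) = MSE\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right].$$

$s(b_0)$ is the standard error (SE) of $b_0$ (also called its estimated standard deviation).

Sample distribution of $(b_0 - \beta_0)/s(b_0)$:

$$\frac{b_0 - \beta_0}{s(b_0)} \ \text{follows}\ t(n-2) \ \text{for model (1)}.$$

Confidence interval for $\beta_0$: with confidence $1-\alpha$, we have the confidence interval

$$b_0 - t(1-\alpha/2, n-2)\, s(b_0) \le \beta_0 \le b_0 + t(1-\alpha/2, n-2)\, s(b_0).$$

Test of $\beta_0$

Two-sided test: to check whether $\beta_0$ is 0,

$$H_0: \beta_0 = 0, \qquad H_a: \beta_0 \neq 0.$$

Under $H_0$, we have

$$t = \frac{b_0}{s(b_0)} \sim t(n-2).$$

Suppose the significance level is $\alpha$ (usually [...]). Calculate $t$, say $t^*$.

- If $|t^*| \le t(1-\alpha/2; n-2)$, then accept $H_0$.
- If $|t^*| > t(1-\alpha/2; n-2)$, then reject $H_0$.

Similarly, the test can also be done based on the p-value, defined as $p = P(|t| > |t^*|)$. It is easy to see that

$$\text{p-value} < \alpha \iff |t^*| > t(1-\alpha/2; n-2).$$

Thus

- If p-value $\ge \alpha$, then accept $H_0$.
- If p-value $< \alpha$, then reject $H_0$.

One-sided test: to check whether $\beta_0$ is positive (or negative),

$$H_0: \beta_0 \le 0, \qquad H_a: \beta_0 > 0.$$

Example. For the example above, with significance level [...]:

1. Test $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \neq 0$
2. Test $H_0': \beta_1 = 0$ versus $H_1': \beta_1 \neq 0$
3. Test $H_0'': \beta_0 \ge 0$ versus $H_1'': \beta_0 < 0$

Answer:

1. Since $n = 10$, $t([\ldots], 8) = [\ldots]$ and $|t^*| = [\ldots] < [\ldots]$. Thus, we accept $H_0$. (Another approach: p-value $= [\ldots] > [\ldots]$, so we accept $H_0$.)
2. The t-value is $|t^*| = [\ldots] > [\ldots]$, thus we reject $H_0'$, i.e. $b_1$ is significantly different from 0. (Another approach: p-value $= [\ldots] < [\ldots]$, so we reject $H_0'$.)
3. $t([\ldots], 8) = [\ldots]$; since $t^* = [\ldots] > [\ldots]$, we accept $H_0''$.

How to find these tests from the output of the R code?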
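One way to locate these tests in R is to read them from the coefficient table of `summary()`. The following sketch assumes simulated data (hypothetical, not the chapter's 10 observations). The table columns `Estimate`, `Std. Error`, `t value` and `Pr(>|t|)` are the ones printed by `summary()` for an `lm` fit, and the reported p-values are two-sided.

```r
# Hypothetical data: simulated stand-in for the chapter's 10 observations
set.seed(1)
n <- 10
X <- 1:n
Y <- 1 + 0.8 * X + rnorm(n)
fit <- lm(Y ~ X)

tab <- summary(fit)$coefficients   # rows: (Intercept), X

# Two-sided tests: t* and p-value are read directly from the table
t_b0 <- tab["(Intercept)", "t value"]
p_b0 <- tab["(Intercept)", "Pr(>|t|)"]
t_b1 <- tab["X", "t value"]
p_b1 <- tab["X", "Pr(>|t|)"]

# One-sided test H0: beta0 >= 0 versus Ha: beta0 < 0 uses the lower tail,
# which is not printed by summary() and is computed from t* instead
p_b0_lower <- pt(t_b0, df = n - 2)

# Standard error of b0 from the formula s^2(b0) = MSE * [1/n + Xbar^2 / Sxx]
MSE  <- sum(resid(fit)^2) / (n - 2)
s_b0 <- sqrt(MSE * (1 / n + mean(X)^2 / sum((X - mean(X))^2)))
```

In this sketch `s_b0` reproduces the `Std. Error` entry of the `(Intercept)` row, and each `t value` is the estimate divided by its standard error. The one-sided p-value is half of the two-sided one when $t^*$ falls on the side of the alternative, and one minus that half otherwise.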