Transcription of Introduction to Econometrics with R
Introduction to Econometrics with R
Christoph Hanck, Martin Arnold, Alexander Gerber, and Martin Schmelzer
2022-04-03

Contents

Preface

1 Introduction
    Colophon
    A Very Short Introduction to R and RStudio

2 Probability Theory
    Random Variables and Probability Distributions
    Random Sampling and the Distribution of Sample Averages
    Exercises

3 A Review of Statistics using R
    Estimation of the Population Mean
    Properties of the Sample Mean
    Hypothesis Tests Concerning the Population Mean
    Confidence Intervals for the Population Mean
    Comparing Means from Different Populations
    An Application to the Gender Gap of Earnings
    Scatterplots, Sample Covariance and Sample Correlation
    Exercises

4 Linear Regression with One Regressor
    Simple Linear Regression
    Estimating the Coefficients of the Linear Regression Model
    Measures of Fit
    The Least Squares Assumptions
    The Sampling Distribution of the OLS Estimator
    Exercises

5 Hypothesis Tests and Confidence Intervals in the Simple Linear Regression Model
    Testing Two-Sided Hypotheses Concerning the Slope Coefficient
    Confidence Intervals for Regression Coefficients
    Regression when X is a Binary Variable
    Heteroskedasticity and Homoskedasticity
    The Gauss-Markov Theorem
    Using the t-Statistic in Regression When the Sample Size Is Small
    Exercises

6 Regression Models with Multiple Regressors
    Omitted Variable Bias
    The Multiple Regression Model
    Measures of Fit in Multiple Regression
    OLS Assumptions in Multiple Regression
    The Distribution of the OLS Estimators in Multiple Regression
    Exercises

7 Hypothesis Tests and Confidence Intervals in Multiple Regression
    Hypothesis Tests and Confidence Intervals for a Single Coefficient
    An Application to Test Scores and the Student-Teacher Ratio
    Joint Hypothesis Testing Using the F-Statistic
    Confidence Sets for Multiple Coefficients
    Model Specification for Multiple Regression
    Analysis of the Test Score Data Set
    Exercises

8 Nonlinear Regression Functions
    A General Strategy for Modelling Nonlinear Regression Functions
    Nonlinear Functions of a Single Independent Variable
    Interactions Between Independent Variables
    Nonlinear Effects on Test Scores of the Student-Teacher Ratio
    Exercises

9 Assessing Studies Based on Multiple Regression
    Internal and External Validity
    Threats to Internal Validity of Multiple Regression Analysis
    Internal and External Validity when the Regression is Used for Forecasting
    Example: Test Scores and Class Size
    Exercises

10 Regression with Panel Data
    Panel Data
    Panel Data with Two Time Periods: "Before and After" Comparisons
    Fixed Effects Regression
    Regression with Time Fixed Effects
    The Fixed Effects Regression Assumptions and Standard Errors for Fixed Effects Regression
    Drunk Driving Laws and Traffic Deaths
    Exercises

11 Regression with a Binary Dependent Variable
    Binary Dependent Variables and the Linear Probability Model
    Probit and Logit Regression
    Estimation and Inference in the Logit and Probit Models
    Application to the Boston HMDA Data
    Exercises

12 Instrumental Variables Regression
    The IV Estimator with a Single Regressor and a Single Instrument
    The General IV Regression Model
    Checking Instrument Validity
    Application to the Demand for Cigarettes
    Where Do Valid Instruments Come From?
    Exercises

13 Experiments and Quasi-Experiments
    Potential Outcomes, Causal Effects and Idealized Experiments
    Threats to Validity of Experiments
    Experimental Estimates of the Effect of Class Size Reductions
    Quasi Experiments
    Exercises

14 Introduction to Time Series Regression and Forecasting
    Using Regression Models for Forecasting
    Time Series Data and Serial Correlation
    Autoregressions
    Can You Beat the Market? (Part I)
    Additional Predictors and The ADL Model
    Lag Length Selection Using Information Criteria
    Nonstationarity I: Trends
    Nonstationarity II: Breaks
    Can You Beat the Market? (Part II)

15 Estimation of Dynamic Causal Effects
    The Orange Juice Data
    Dynamic Causal Effects
    Dynamic Multipliers and Cumulative Dynamic Multipliers
    HAC Standard Errors
    Estimation of Dynamic Causal Effects with Strictly Exogenous Regressors
    Orange Juice Prices and Cold Weather

16 Additional Topics in Time Series Regression
    Vector Autoregressions
    Orders of Integration and the DF-GLS Unit Root Test
    Cointegration
    Volatility Clustering and Autoregressive Conditional Heteroskedasticity

Preface

Chair of Econometrics
Department of Business Administration and Economics
University of Duisburg-Essen
Essen, Germany

Last updated on Sunday, April 03, 2022

In recent years, the statistical programming language R has become an integral part of the curricula of the econometrics classes we teach at the University of Duisburg-Essen.
We regularly found that a large share of the students, especially in our introductory undergraduate econometrics courses, have not been exposed to any programming language before and thus have difficulty learning R on their own. With little background in statistics and econometrics, it is natural for beginners to have a hard time understanding the benefits of R skills for learning and applying econometrics. These benefits include, in particular, the ability to conduct, document and communicate empirical studies, and the means to program simulation studies, which are helpful for, e.g., comprehending and validating theorems that are not easily grasped by merely brooding over formulas. As applied economists and econometricians, we value all of these capabilities and wish to share them with our students.

Instead of confronting students with pure coding exercises and complementary classic literature like the book by Venables and Smith (2010), we figured it would be better to provide interactive learning material that blends R code with the contents of the well-received textbook Introduction to Econometrics by Stock and Watson (2015), which serves as a basis for the lecture.
This material is gathered in the present book, Introduction to Econometrics with R, an empirical companion to Stock and Watson (2015). It is an interactive script in the style of a reproducible research report. It enables students not only to learn how the results of case studies can be replicated with R but also strengthens their ability to use the newly acquired skills in other empirical projects.

Conventions Used in this Book

Italic text indicates new terms, names, buttons and the like.

Constant width text is generally used in paragraphs to refer to R code. This includes commands, variables, functions, data types, databases and file names.

Constant width text on gray background indicates R code that can be typed literally by you. It may appear in paragraphs to better distinguish executable from non-executable code statements, but it will mostly be encountered in the form of large blocks of R code. These blocks are referred to as code chunks.

Acknowledgement

We thank the Stifterverband für die Deutsche Wissenschaft and the Ministry of Culture and Science of North Rhine-Westphalia for their financial support. Also, we are grateful to Alexander Blasberg for proofreading and for his help with programming the exercises.
Special thanks go to Achim Zeileis (University of Innsbruck) and Christian Kleiber (University of Basel) for their advice and constructive criticism. Further thanks go to Rebecca Arnold from the Münster University of Applied Sciences for several suggestions regarding the website design and for providing us with her nice designs for the book cover, logos and icons. We are also indebted to all past students of our introductory econometrics courses at the University of Duisburg-Essen for their feedback.

Chapter 1 Introduction

The interest in the freely available statistical programming language and software environment R (R Core Team, 2021) is soaring. By the time we wrote the first drafts for this project, more than 11000 add-ons (many of them providing cutting-edge methods) had been made available on the Comprehensive R Archive Network (CRAN), an extensive network of FTP servers around the world that store identical and up-to-date versions of R code and its documentation. R dominates other (commercial) software for statistical computing in most fields of research in applied statistics.
The benefits of it being freely available and open source, and of having a large and constantly growing community of users who contribute to CRAN, render R more and more appealing for empirical economists and econometricians alike.

A striking advantage of using R in econometrics is that it enables students to explicitly document their analysis step by step, such that it is easy to update and to expand. This makes it possible to reuse code for similar applications with different data. Furthermore, R programs are fully reproducible, which makes it straightforward for others to comprehend and validate results.

In recent years, R has thus become an integral part of the curricula of the econometrics classes we teach at the University of Duisburg-Essen. In some sense, learning to code is comparable to learning a foreign language, and continuous practice is essential for learning success. Needless to say, presenting bare R code on slides does not encourage students to gain hands-on experience on their own. This is why R is crucial. As for accompanying literature, there are some excellent books that deal with R and its applications to econometrics, e.g., Kleiber and Zeileis (2008).
However, such sources may be somewhat beyond the scope of undergraduate students in economics who have little understanding of econometric methods and barely any experience in programming. Consequently, we started to compile a collection of reproducible reports for use in class. These reports provide guidance on how to implement selected applications from the textbook Introduction to Econometrics (Stock and Watson, 2015), which serves as a basis for the lecture and the accompanying tutorials. This process was facilitated considerably by knitr (Xie, 2022b) and R markdown (Allaire et al., 2022). In conjunction, both R packages provide powerful functionality for dynamic report generation, which makes it possible to seamlessly combine pure text, LaTeX, R code and its output in a variety of formats, including PDF and HTML. Moreover, writing and distributing reproducible reports for use in academia has been enriched tremendously by the bookdown package (Xie, 2022a), which has become our main tool for this project. bookdown builds on top of R markdown and makes it possible to create appealing HTML pages like this one, among other things. Inspired by Using R for Introductory Econometrics (Heiss, 2016) and with this powerful toolkit at hand, we wrote up our own empirical companion to Stock and Watson (2015).