Getting Started in Fixed/Random Effects Models using R

Oscar Torres-Reyna, 2010

Panel data (also known as longitudinal or cross-sectional time-series data) is a dataset in which the behavior of entities is observed across time. These entities could be states, companies, individuals, countries, etc. Panel data looks like this: one row per entity and year, with columns such as country, year, Y, X1, X2.

For a brief introduction to the theory behind panel data analysis, please see the document referenced in the original slides. The contents of this document rely heavily on "Panel Data Econometrics in R: the plm package" and on notes from the ICPSR's Summer Program in Quantitative Methods of Social Research (summer 2010).

Exploring panel data

library(foreign)
Panel <- read.dta(" ")   # the dataset path was not preserved in the transcription
coplot(y ~ year | country, type="l", data=Panel)   # Lines
coplot(y ~ year | country, type="b", data=Panel)   # Points and lines
# The bars at the top indicate which panel corresponds to which country,
# from left to right starting on the bottom row (Muenchen/Hilbe: 355)

library(car)
scatterplot(y ~ year | country, boxplots=FALSE, smooth=TRUE, data=Panel)
# one additional argument in this call was not preserved in the transcription
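Because the data file path was not preserved in this transcription, the sketch below (not part of the original slides) simulates a small balanced panel with the same column names used throughout (country, year, y, x1). The seed, year range and coefficient values are arbitrary assumptions, included only so the exploration commands above can be tried; if you have the original data file, skip this and use read.dta() as in the slides.

set.seed(123)
countries <- LETTERS[1:7]                     # 7 entities, matching n = 7 in the slides' data
years <- 1990:1999                            # 10 periods (T = 10); the actual years are assumptions
Panel <- expand.grid(country = countries, year = years)
alpha <- rnorm(length(countries), sd = 2e9)   # country-specific, time-invariant effects
Panel$x1 <- rnorm(nrow(Panel))
Panel$y  <- 1.5e9 * Panel$x1 + alpha[Panel$country] + rnorm(nrow(Panel), sd = 1e9)
coplot(y ~ year | country, type = "l", data = Panel)   # same exploration call as above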

FIXED-EFFECTS MODEL
(Covariance Model, Within Estimator, Individual Dummy Variable Model, Least Squares Dummy Variable Model)

Fixed effects: heterogeneity across countries (or entities)

library(foreign)
Panel <- read.dta(" ")   # dataset path not preserved in the transcription
library(gplots)
plotmeans(y ~ country, main="Heterogeneity across countries", data=Panel)
# plotmeans draws a 95% confidence interval around the means
detach("package:gplots")   # Remove package gplots from the workspace

Heterogeneity: unobserved variables that do not change over time.

Fixed effects: heterogeneity across years

library(gplots)
plotmeans(y ~ year, main="Heterogeneity across years", data=Panel)
# plotmeans draws a 95% confidence interval around the means
detach("package:gplots")   # Remove package gplots from the workspace

Heterogeneity: unobserved variables that do not change over time.
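plotmeans() comes from the gplots package. As a rough illustration of what it draws, the sketch below (not from the original slides; it assumes the simulated Panel from the earlier sketch) computes each country's mean of y together with a 95% t-based confidence interval by hand.

ci_by_country <- t(sapply(split(Panel$y, Panel$country), function(v) {
  m  <- mean(v)                       # country mean of y
  se <- sd(v) / sqrt(length(v))       # standard error of that mean
  c(mean = m,
    lower = m - qt(0.975, df = length(v) - 1) * se,
    upper = m + qt(0.975, df = length(v) - 1) * se)
}))
ci_by_country   # one row per country: roughly the quantities plotmeans(y ~ country) displays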

OLS regression

> library(foreign)
> Panel <- read.dta(" ")   # dataset path not preserved in the transcription
> ols <- lm(y ~ x1, data=Panel)
> summary(ols)

(The numeric values of the summary output were not preserved in the transcription. What is recoverable: the intercept is significant at the 5% level, x1 is not significant, the residual standard error is on 68 degrees of freedom, and the F-statistic is on 1 and 68 DF.)

> yhat <- ols$fitted
> plot(Panel$x1, Panel$y, pch=19, xlab="x1", ylab="y")
> abline(lm(Panel$y ~ Panel$x1), lwd=3, col="red")

Regular OLS regression does not consider heterogeneity across groups or time.

Fixed effects using the least squares dummy variable (LSDV) model

> fixed.dum <- lm(y ~ x1 + factor(country) - 1, data=Panel)   # the object name was elided in the source; fixed.dum is assumed here
> summary(fixed.dum)

(Again the numeric values were not preserved. What is recoverable: x1 is significant at the 5% level, factor(country)D at the 1% level, the residual standard error is on 62 degrees of freedom, and the F-statistic is on 8 and 62 DF.)
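The LSDV estimator is numerically identical to the "within" estimator: demeaning y and x1 by country and running OLS on the demeaned data gives the same slope on x1. A short sketch of this equivalence (not in the original slides; it assumes the simulated Panel and the fixed.dum object from above):

# Demean y and x1 by country, then regress: the slope equals the LSDV slope on x1.
Panel$y_dm  <- Panel$y  - ave(Panel$y,  Panel$country)
Panel$x1_dm <- Panel$x1 - ave(Panel$x1, Panel$country)
within_by_hand <- lm(y_dm ~ x1_dm - 1, data = Panel)
coef(within_by_hand)            # "within" slope computed by hand
coef(fixed.dum)["x1"]           # slope from the LSDV regression above: same value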

For the theory behind fixed effects, please see the reference in the original slides.

Least squares dummy variable model

> yhat <- fixed.dum$fitted
> library(car)
> scatterplot(yhat ~ Panel$x1 | Panel$country, boxplots=FALSE, xlab="x1", ylab="yhat", smooth=FALSE)
> abline(lm(Panel$y ~ Panel$x1), lwd=3, col="red")   # OLS regression line, for comparison

Comparing OLS vs. the LSDV model: each component of the factor variable (country) absorbs the effects particular to each country. Predictor x1 was not significant in the OLS model; once differences across countries are controlled for, x1 becomes significant in the OLS_DUM (i.e. LSDV) model.

> library(apsrtable)
> apsrtable(ols, fixed.dum, model.names = c("OLS", "OLS_DUM"))   # Displays a table in LaTeX form
> cat(apsrtable(ols, fixed.dum, model.names = c("OLS", "OLS_DUM"), Sweave=F), file=" ")   # Exports the table to a text file (in LaTeX code); the output file name was not preserved

(The exported LaTeX table is not reproduced here: its numeric entries were lost in the transcription. It lists the OLS and OLS_DUM models side by side, with standard errors in parentheses and N = 70 for both models; the intercept is starred in the OLS column, while x1 and factor(country)D are starred in the OLS_DUM column.)

The coefficient of x1 in the LSDV model indicates how much Y changes over time, controlling for differences across countries, when X increases by one unit; notice that x1 is significant in the LSDV model. The coefficient of x1 in the OLS model indicates how much Y changes when X increases by one unit; notice that x1 is not significant in the OLS model.
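The apsrtable package may no longer install cleanly from CRAN. As one possible alternative (an assumption, not part of the original slides), the texreg package produces a similar side-by-side comparison of the two models:

library(texreg)
screenreg(list(ols, fixed.dum), custom.model.names = c("OLS", "OLS_DUM"))   # plain-text table in the console
# texreg(list(ols, fixed.dum), custom.model.names = c("OLS", "OLS_DUM"))    # LaTeX version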

Fixed effects: n entity-specific intercepts (using plm)

> library(plm)
> fixed <- plm(y ~ x1, data=Panel, index=c("country", "year"), model="within")
> summary(fixed)
Oneway (individual) effect Within Model

Call:
plm(formula = y ~ x1, data = Panel, model = "within", index = c("country", "year"))

Balanced Panel: n=7, T=10, N=70

(The residual summary and sums of squares were not preserved in the transcription. In the coefficients table, x1 has estimate 2475617827 with standard error 1106675594 and is significant at the 5% level; the F-statistic is on 1 and 62 DF.)

> fixef(fixed)   # Display the fixed effects (the constant for each country)
         A           B           C           D           E           F           G
 880542404 -1057858363 -1722810755  3162826897  -602622000  2010731793  -984717493

> pFtest(fixed, ols)   # Testing for fixed effects; null: OLS better than fixed
F test for individual effects
data: y ~ x1
F = (not preserved), df1 = 6, df2 = 62, p-value = (not preserved)
alternative hypothesis: significant effects
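The country constants returned by fixef() are the same quantities estimated by the country dummies in the LSDV regression, and the two approaches agree on the x1 slope. A quick check (not in the original slides; it assumes the fixed and fixed.dum objects from above):

# The within (plm) and LSDV (lm) estimates coincide.
all.equal(as.numeric(fixef(fixed)),
          as.numeric(coef(fixed.dum)[grep("factor\\(country\\)", names(coef(fixed.dum)))]))
c(plm = coef(fixed)["x1"], lsdv = coef(fixed.dum)["x1"])   # identical slope on x1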

In the plm() call above, model="within" is the fixed effects option, y is the outcome variable, x1 is the predictor variable, and index=c("country", "year") is the panel setting. In the summary output, n = the number of groups/panels, T = the number of years, and N = the total number of observations. Pr(>|t|) reports the two-tail p-value testing the hypothesis that each coefficient is different from 0. To reject this, the p-value has to be lower than 0.05 (95% confidence; you could also choose another alpha level); if that is the case, you can say that the variable has a significant influence on your dependent variable (y). The model's overall p-value comes from an F test of whether all the coefficients in the model are different from zero; if it is below 0.05, your model is OK. If the p-value of the pFtest above is below 0.05, the fixed effects model is a better choice than pooled OLS. The coefficient of x1 indicates how much Y changes over time, on average per country, when X increases by one unit.

RANDOM-EFFECTS MODEL
(Random Intercept, Partial Pooling Model)

Random effects (using plm)

> random <- plm(y ~ x1, data=Panel, index=c("country", "year"), model="random")
> summary(random)
Oneway (individual) effect Random Effect Model (Swamy-Arora's transformation)

Call:
plm(formula = y ~ x1, data = Panel, model = "random", index = c("country", "year"))

Balanced Panel: n=7, T=10, N=70

(The effects table of variance components, i.e. the idiosyncratic and individual variances, standard deviations and shares, and the residual summary were not preserved in the transcription.)

(In the coefficients table, the intercept has estimate 1037014284 with standard error 790626206, and x1 has estimate 1247001782 with standard error 902145601; the t-values, p-values, R-squared and the F-statistic, which is on 1 and 68 DF, were not preserved in the transcription.)

# Setting the data as panel data (an alternative way to run the model above)
Panel.set <- pdata.frame(Panel, index = c("country", "year"))   # the object and function names were elided in the source; pdata.frame() from plm is assumed here

# Random effects using the panel setting (same output as above)
random.set <- plm(y ~ x1, data = Panel.set, model="random")   # the object name is assumed
summary(random.set)

In the plm() call, model="random" is the random effects option, y is the outcome variable, x1 is the predictor variable, and the pdata.frame (or the index argument) provides the panel setting. As before, n = the number of groups/panels, T = the number of years, N = the total number of observations, and Pr(>|t|) reports the two-tail p-value testing the hypothesis that each coefficient is different from 0. To reject this, the p-value has to be lower than 0.05 (95% confidence; another alpha level could be chosen); if so, the variable has a significant influence on your dependent variable (y). If the model's overall p-value, from an F test of whether all the coefficients in the model are different from zero, is below 0.05, your model is OK.
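The n, T and N figures that summary() reports can also be checked directly on the panel object; a one-line sketch (assuming the Panel.set object created above) using plm's pdim():

library(plm)
pdim(Panel.set)   # reports n (groups), T (periods), N (observations) and whether the panel is balanced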

Interpretation of the coefficients is tricky, since they include both the within-entity and the between-entity effects. In the case of TSCS data, the coefficient of x1 represents the average effect of X over Y when X changes across time and between countries by one unit. For the theory behind random effects, please see the reference in the original slides.

FIXED OR RANDOM?

Fixed or random: the Hausman test

To decide between fixed or random effects you can run a Hausman test, where the null hypothesis is that the preferred model is random effects and the alternative is fixed effects (see Greene, 2008, chapter 9). It basically tests whether the unique errors (u_i) are correlated with the regressors; the null hypothesis is that they are not. Run a fixed effects model and save the estimates, then run a random effects model and save the estimates, then perform the test. If the p-value is significant (for example < 0.05), use fixed effects; if not, use random effects.
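The transcription ends before the corresponding R code, but the plm package provides this test as phtest(); a minimal sketch, assuming the fixed and random objects estimated above:

library(plm)
phtest(fixed, random)   # Hausman test; null hypothesis: the random effects model is preferred
# If the p-value is below 0.05, reject the null and use fixed effects; otherwise use random effects.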

