Transcription of Data Analysis with Stata Cheat Sheet
Data Analysis with Stata Cheat Sheet
For more info, see Stata's reference manual.
Tim Essam, Laura Hughes. Follow us @StataRGIS and @flaneuseks.
inspired by RStudio's awesome Cheat Sheets
updated May 2021
CC BY. We are not affiliated with Stata. But we like it.

Examples use (sysuse auto, clear) unless otherwise noted.
Results are stored as either r-class or e-class. See Programming Cheat Sheet.

SUMMARIZE DATA

univar price mpg, boxplot
    calculate univariate summary with box-and-whiskers plot (ssc install univar)
stem mpg
    return stem-and-leaf display of mpg
summarize price mpg, detail
    calculate a variety of univariate summary statistics
ci mean mpg price, level(99)
    compute standard errors and confidence intervals
correlate mpg price
    return correlation or covariance matrix
pwcorr price mpg weight, star( )
    return all pairwise correlation coefficients with sig. levels
mean price mpg
    estimates of means, including standard errors
proportion rep78 foreign
    estimates of proportions, including standard errors for categories identified in varlist
ratio price/mpg
    estimates of ratio, including standard errors
total price
    estimates of totals, including standard errors

STATISTICAL TESTS

tabulate foreign rep78, chi2 exact expected
    tabulate foreign and repair record and return chi2 and Fisher's exact statistic alongside the expected values
ttest mpg, by(foreign)
    estimate t test on equality of means for mpg by foreign
prtest foreign ==
    test of proportions
ksmirnov mpg, by(foreign) exact
    Kolmogorov-Smirnov equality-of-distributions test
ranksum mpg, by(foreign)
    equality tests on unmatched data (independent samples)
webuse systolic, clear
anova systolic drug
    analysis of variance and covariance
pwmean mpg, over(rep78) pveffects mcompare(tukey)
    estimate pairwise comparisons of means with equal variances; include multiple-comparison adjustment

ESTIMATION WITH CATEGORICAL & FACTOR VARIABLES

CONTINUOUS VARIABLES measure something
CATEGORICAL VARIABLES identify a group to which an observation belongs
INDICATOR VARIABLES denote whether something is true or false (T/F)

OPERATOR  DESCRIPTION / EXAMPLE
i.        specify indicator variables
          regress price i.rep78
          specify rep78 variable to be an indicator variable
ib.       specify base indicator
          regress price ib(3).rep78
          set the third category of rep78 to be the base category
fvset     command to change base
          fvset base frequent rep78
          set the base to most frequently occurring category for rep78
c.        treat variable as continuous
          regress price i.foreign#c.mpg
          treat mpg as a continuous variable and specify an interaction between foreign and mpg
#         specify interactions
          regress price mpg c.mpg#c.mpg
          create a squared mpg term to be used in regression
o.        omit a variable or indicator
          regress price io(2).rep78
          set rep78 as an indicator; omit observations with rep78 == 2
##        specify factorial interactions
          regress price c.mpg##c.mpg
          create all possible interactions with mpg (mpg and mpg^2)

DECLARE DATA

By declaring data type, you enable Stata to apply data munging and analysis functions specific to certain data types.

TIME SERIES
webuse sunspot, clear

tsset time, yearly
    declare sunspot data to be yearly time series
tsreport
    report time-series aspects of a dataset
generate lag_spot = L1.spot
    create a new variable of annual lags of sunspots
tsline spot
    plot time series of sunspots
arima spot, ar(1/2)
    fit an autoregressive model with 2 lags

[Figure: tsline plot, number of sunspots (0 to 200) by year (1850 to 1950)]

TIME-SERIES OPERATORS
L.    lag                            x_{t-1}
L2.   2-period lag                   x_{t-2}
F.    lead                           x_{t+1}
F2.   2-period lead                  x_{t+2}
D.    difference                     x_t - x_{t-1}
D2.   difference of difference       x_t - x_{t-1} - (x_{t-1} - x_{t-2})
S.    seasonal difference            x_t - x_{t-1}
S2.   lag-2 (seasonal difference)    x_t - x_{t-2}

USEFUL ADD-INS
tscollap
    compact time series into means, sums, and end-of-period values
carryforward
    carry nonmissing values forward from one obs. to the next
tsspell
    identify spells or runs in time series

SURVIVAL ANALYSIS
webuse drugtr, clear

stset studytime, failure(died)
    declare data to be survival-time data
stsum
    summarize survival-time data
stcox drug age
    fit a Cox proportional hazards model

PANEL / LONGITUDINAL
webuse nlswork, clear

xtset id year
    declare national longitudinal data to be a panel
xtdescribe
    report panel aspects of a dataset
xtsum hours
    summarize hours worked, decomposing standard deviation into between and within components
xtline ln_wage if id <= 22, tlabel(#3)
    plot panel data as a line plot
xtreg ln_w ## ttl_exp, fe vce(robust)
    fit a fixed-effects model with robust standard errors

[Figure: xtline plot, wage relative to inflation by year (1970 to 1990), panels id 1 to id 4]

SURVEY DATA
webuse nhanes2b, clear

svyset psuid [pweight = finalwgt], strata(stratid)
    declare survey design for a dataset
svydescribe
    report survey-data details
svy: mean age, over(sex)
    estimate a population mean for each subpopulation
svy, subpop(rural): mean age
    estimate a population mean for rural areas
svy: tabulate sex heartatk
    report two-way table with tests of independence
svy: reg zinc ## female weight rural
    estimate a regression using survey weights

1 FIT MODELS
stores results as e-class

regress price mpg weight, vce(robust)
    fit ordinary least-squares (OLS) model of price on mpg and weight, apply robust standard errors
regress price mpg weight if foreign == 0, vce(cluster rep78)
    regress price only on domestic cars, cluster standard errors
rreg price mpg weight, genwt(reg_wt)
    estimate robust regression to reduce the influence of outliers
probit foreign turn price, vce(robust)
    estimate probit regression with robust standard errors
logit foreign headroom mpg, or
    estimate logistic regression and report odds ratios
bootstrap, reps(100): regress mpg weight gear foreign
    estimate regression with bootstrapping
jackknife r(mean): sum mpg
    jackknife standard error of sample mean

2 DIAGNOSTICS
some are inappropriate with robust SEs

estat hettest
    test for heteroskedasticity
estat ovtest
    test for omitted-variable bias
estat vif
    report variance inflation factor
dfbeta length
    calculate measure of influence
rvfplot, yline(0)
    plot residuals against fitted values
avplots
    plot all partial-regression leverage plots in one graph

Type help regress postestimation plots for additional diagnostic plots.

[Figure: rvfplot, residuals against fitted values]
[Figure: avplots, price against mpg, rep78, headroom, and weight]

3 POSTESTIMATION
commands that use a fitted model

regress price headroom length
    used in all postestimation examples
display _b[length]
display _se[length]
    return coefficient estimate or standard error for length from most recent regression model
margins, dydx(length)
    return the estimated marginal effect for length
margins, eyex(length)
    return the estimated elasticity for length
    (margins returns e-class information when the post option is used)
predict yhat if e(sample)
    create predictions for sample on which model was fit
predict double resid, residuals
    calculate residuals based on last fitted model
test headroom = 0
    test linear hypothesis that the headroom coefficient equals zero
lincom headroom - length
    estimate linear combination (headroom - length)

ADDITIONAL MODELS
Blinder-Oaxaca decomposition
instrumental variables          ivregress, ivreg2
principal components analysis   pca
factor analysis                 factor
count outcomes                  poisson, nbreg
censored data                   tobit
regression discontinuity        rd
dynamic panel estimator         xtabond, xtdpdsys
propensity score matching       teffects psmatch
synthetic control

built-in Stata commands: ivregress, pca, factor, poisson, nbreg, tobit, xtabond, xtdpdsys, teffects psmatch
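
The time-series operators listed under Declare data (L., F., D., S. and their 2-period forms) can be written out explicitly. The block below is a sketch of those definitions in LaTeX, using the sheet's own x_t notation. The expanded form of D2. and the numeric values in the last lines are illustrative additions, not part of the original sheet.

```latex
% Lag, lead, difference, and seasonal-difference operators applied to x_t
\begin{align*}
\texttt{L.}x_t  &= x_{t-1},       & \texttt{L2.}x_t &= x_{t-2},\\
\texttt{F.}x_t  &= x_{t+1},       & \texttt{F2.}x_t &= x_{t+2},\\
\texttt{D.}x_t  &= x_t - x_{t-1}, & \texttt{D2.}x_t &= (x_t - x_{t-1}) - (x_{t-1} - x_{t-2})
                                                     = x_t - 2x_{t-1} + x_{t-2},\\
\texttt{S.}x_t  &= x_t - x_{t-1}, & \texttt{S2.}x_t &= x_t - x_{t-2}.
\end{align*}

% Illustrative values (assumed, not from the sheet): x_{t-2} = 5, x_{t-1} = 8, x_t = 12
\begin{align*}
\texttt{D.}x_t  &= 12 - 8 = 4,\\
\texttt{D2.}x_t &= (12 - 8) - (8 - 5) = 1,\\
\texttt{S2.}x_t &= 12 - 5 = 7.
\end{align*}
```

D2. and S2. differ: D2. differences the first difference again, while S2. subtracts the value two periods back. At one period the two operators coincide, so D. and S. give the same result.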