Transcription of stepwise — Stepwise estimation - Stata
Syntax

    stepwise [, options]: command

    options             Description
    ------------------------------------------------------------------
    Model
      pr(#)             significance level for removal from the model
      pe(#)             significance level for addition to the model

    Model 2
      forward           perform forward-stepwise selection
      hierarchical      perform hierarchical selection
      lockterm1         keep the first term
      lr                perform likelihood-ratio test instead of Wald test

    Reporting
      display_options   control column formats and line width
    ------------------------------------------------------------------
    At least one of pr(#) or pe(#) must be specified.
    by and xi are allowed; see [U] Prefix commands.
    Weights are allowed if command allows them; see [U] weight.
    All postestimation commands behave as they would after command without
    the stepwise prefix; see the postestimation manual entry for command.
    See [U] 20 Estimation and postestimation commands for more capabilities
    of estimation commands.

Menu

    Statistics > Other > Stepwise estimation

Description

stepwise performs stepwise estimation.
Typing

    . stepwise, pr(#): command

performs backward-selection estimation for command. The stepwise selection method is determined by the following option combinations:

    options                 Description
    ------------------------------------------------------------
    pr(#)                   backward selection
    pr(#) hierarchical      backward hierarchical selection
    pr(#) pe(#)             backward stepwise
    pe(#)                   forward selection
    pe(#) hierarchical      forward hierarchical selection
    pr(#) pe(#) forward     forward stepwise
    ------------------------------------------------------------

command defines the estimation command to be executed. The following Stata commands are supported by stepwise:

    clogit      nbreg       regress
    cloglog     ologit      scobit
    glm         oprobit     stcox
    intreg      poisson     stcrreg
    logistic    probit      streg
    logit       qreg        tobit

stepwise expects command to have the following form:

    command_name [depvar] term [term ...] [if] [in] [weight] [, command_options]

where term is either varname or (varlist) (a varlist in parentheses indicates that this group of variables is to be included or excluded together).
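The term syntax above can be illustrated with a small Python sketch. This is not how Stata parses the command; parse_terms is a hypothetical helper that only shows how bare names and parenthesized groups map to terms. It does not expand varlist shorthand such as r1-r4.

```python
import re


def parse_terms(spec):
    """Split a term list such as 'x1 x2 (d1 d2 d3) (x4 x5)' into terms.

    Each term is returned as a list of variable names: a bare name is a
    one-variable term, and a parenthesized group is one multi-variable term.
    Hypothetical helper for illustration only.
    """
    terms = []
    # Each match is either a parenthesized group or a bare token.
    for group, name in re.findall(r"\(([^()]*)\)|([^\s()]+)", spec):
        if name:
            terms.append([name])
        else:
            variables = group.split()
            if not variables:
                raise ValueError("empty parenthesized term")
            terms.append(variables)
    return terms


print(parse_terms("x1 x2 (d1 d2 d3) (x4 x5)"))
# [['x1'], ['x2'], ['d1', 'd2', 'd3'], ['x4', 'x5']]
```

Under this reading, writing each variable in its own parentheses yields the same terms as writing the variables bare.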
depvar is not present when command_name is stcox, stcrreg, or streg; otherwise, depvar is assumed to be present. For intreg, depvar is actually two dependent variable names (depvar1 and depvar2).

sw is a synonym for stepwise.

Options

Model

pr(#) specifies the significance level for removal from the model; terms with p >= pr() are eligible for removal.

pe(#) specifies the significance level for addition to the model; terms with p < pe() are eligible for addition.

Model 2

forward specifies the forward-stepwise method and may be specified only when both pr() and pe() are also specified. Specifying both pr() and pe() without forward results in backward-stepwise selection. Specifying only pr() results in backward selection, and specifying only pe() results in forward selection.

hierarchical specifies hierarchical selection.

lockterm1 specifies that the first term be included in the model and not be subjected to the selection criteria.

lr specifies that the test of term significance be the likelihood-ratio test.
The default is the less computationally expensive Wald test; that is, the test is based on the estimated variance-covariance matrix of the estimators.

Reporting

display_options: cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.

Remarks and examples

Remarks are presented under the following headings:

    Introduction
    Search logic for a step
    Full search logic
    Examples
    Estimation sample considerations
    Messages
    Programming for stepwise

Introduction

Typing

    . stepwise, pr(.10): regress y1 x1 x2 d1 d2 d3 x4 x5

performs a backward-selection search for the regression model y1 on x1, x2, d1, d2, d3, x4, and x5. In this search, each explanatory variable is said to be a term. Typing

    . stepwise, pr(.10): regress y1 x1 x2 (d1 d2 d3) (x4 x5)

performs a similar backward-selection search, but the variables d1, d2, and d3 are treated as one term, as are x4 and x5.
That is, d1, d2, and d3 may or may not appear in the final model, but they appear or do not appear together.

Example 1

Using the automobile dataset, we fit a backward-selection model of mpg:

    . use ...
    . generate weight2 = weight*weight
    . stepwise, pr(.2): regress mpg weight weight2 displ gear turn headroom foreign price
    begin with full model
    p = ... >= ...  removing headroom
    p = ... >= ...  removing displacement
    p = ... >= ...  removing price

    (regression table: Number of obs = 74, F(5, 68); the remaining values did not survive transcription)

This estimation treated each variable as its own term and thus considered each one separately.
The engine displacement and gear ratio should really be considered together:

    . stepwise, pr(.2): regress mpg weight weight2 (displ gear) turn headroom foreign price
    begin with full model
    p = ... >= ...  removing headroom
    p = ... >= ...  removing displacement gear_ratio
    p = ... >= ...  removing price

    (regression table: Number of obs = 74, F(4, 69); the remaining values did not survive transcription)

Search logic for a step

Before discussing the complete search logic, consider the logic for a step (the first step) in detail. The other steps follow the same logic.
If you type

    . stepwise, pr(.20): regress y1 x1 x2 (d1 d2 d3) (x4 x5)

the logic is

    1. Fit the model y1 on x1 x2 d1 d2 d3 x4 x5.
    2. Consider dropping x1.
    3. Consider dropping x2.
    4. Consider dropping d1 d2 d3.
    5. Consider dropping x4 x5.
    6. Find the term above that is least significant. If its significance level is >= 0.20, remove that term.

If you type

    . stepwise, pr(.20) hierarchical: regress y1 x1 x2 (d1 d2 d3) (x4 x5)

the logic would be different because the hierarchical option states that the terms are ordered. The initial logic would become

    1. Fit the model y1 on x1 x2 d1 d2 d3 x4 x5.
    2. Consider dropping x4 x5 (the last term).
    3. If the significance of this last term is >= 0.20, remove the term.

The process would then stop or continue. It would stop if x4 x5 were not dropped, and otherwise, stepwise would continue to consider the significance of the next-to-last term, d1 d2 d3.

Specifying pe() rather than pr() switches to forward estimation.
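The backward step described above, with and without hierarchical, can be sketched in Python. This is a minimal illustration, not Stata's implementation: backward_step and the p-values in toy_p are made up for the example, and in a real search the significance levels would come from Wald or likelihood-ratio tests on the fitted model and would be recomputed after every refit.

```python
def backward_step(terms, p_value, pr, hierarchical=False):
    """Return the term to remove in one backward step, or None to stop.

    terms        : terms currently in the model, in the order they were typed
    p_value      : hypothetical callable giving a term's significance level
                   in the currently fitted model
    pr           : removal threshold; a term with p >= pr is removed
    hierarchical : if True, only the last term is considered
    """
    if not terms:
        return None
    if hierarchical:
        candidate = terms[-1]
    else:
        # The least significant term is the one with the largest p-value.
        candidate = max(terms, key=p_value)
    return candidate if p_value(candidate) >= pr else None


# Made-up significance levels, for illustration only.
toy_p = {
    ("x1",): 0.03,
    ("x2",): 0.45,
    ("d1", "d2", "d3"): 0.25,
    ("x4", "x5"): 0.10,
}
terms = list(toy_p)

print(backward_step(terms, toy_p.get, pr=0.20))                     # ('x2',)
print(backward_step(terms, toy_p.get, pr=0.20, hierarchical=True))  # None
```

With these made-up values, the unrestricted step removes x2, while the hierarchical step stops immediately because the last term, x4 x5, falls below the removal threshold.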
If you type

    . stepwise, pe(.20): regress y1 x1 x2 (d1 d2 d3) (x4 x5)

stepwise performs a forward-selection search. The logic for the first step is

    1. Fit a model of y1 on nothing (meaning a constant).
    2. Consider adding x1.
    3. Consider adding x2.
    4. Consider adding d1 d2 d3.
    5. Consider adding x4 x5.
    6. Find the term above that is most significant. If its significance level is < 0.20, add that term.

As with backward estimation, if you specify hierarchical,

    . stepwise, pe(.20) hierarchical: regress y1 x1 x2 (d1 d2 d3) (x4 x5)

the search for the most significant term is restricted to the next term:

    1. Fit a model of y1 on nothing (meaning a constant).
    2. Consider adding x1 (the first term).
    3. If the significance is < 0.20, add the term.

If x1 were added, stepwise would next consider x2; otherwise, the search process would stop.

stepwise can also use a stepwise selection logic that alternates between adding and removing terms.
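The forward step can be sketched in Python as follows. This is an illustration under stated assumptions rather than Stata's implementation: forward_step and the values in toy_p are hypothetical, and a real search would obtain each candidate's significance level by testing its addition to the current model.

```python
def forward_step(excluded, p_value, pe, hierarchical=False):
    """Return the term to add in one forward step, or None to stop.

    excluded     : terms not yet in the model, in the order they were typed
    p_value      : hypothetical callable giving the significance level a
                   term would have if added to the current model
    pe           : entry threshold; a term with p < pe is added
    hierarchical : if True, only the next term in order is considered
    """
    if not excluded:
        return None
    if hierarchical:
        candidate = excluded[0]
    else:
        # The most significant term is the one with the smallest p-value.
        candidate = min(excluded, key=p_value)
    return candidate if p_value(candidate) < pe else None


# Made-up significance levels, for illustration only.
toy_p = {
    ("x1",): 0.30,
    ("x2",): 0.01,
    ("d1", "d2", "d3"): 0.15,
    ("x4", "x5"): 0.60,
}
excluded = list(toy_p)

print(forward_step(excluded, toy_p.get, pe=0.20))                     # ('x2',)
print(forward_step(excluded, toy_p.get, pe=0.20, hierarchical=True))  # None
```

With these made-up values, the unrestricted step adds x2, whereas the hierarchical step stops because x1, the next term in order, does not meet the entry threshold.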
The full logic for all the possibilities is given below.

Full search logic

    Option                    Logic
    ----------------------------------------------------------------------
    pr()                      Fit the full model on all explanatory variables.
    (backward selection)      While the least-significant term is
                              insignificant, remove it and reestimate.

    pr() hierarchical         Fit full model on all explanatory variables.
    (backward hierarchical    While the last term is insignificant, remove it
    selection)                and reestimate.

    pr() pe()                 Fit full model on all explanatory variables.
    (backward stepwise)       If the least-significant term is insignificant,
                              remove it and reestimate; otherwise, stop.
                              Do that again: if the least-significant term is
                              insignificant, remove it and reestimate;
                              otherwise, stop.
                              Repeatedly,
                                if the most-significant excluded term is
                                significant, add it and reestimate;
                                if the least-significant included term is
                                insignificant, remove it and reestimate;
                              until neither is possible.

    pe()                      Fit empty model.
    (forward selection)       While the most-significant excluded term is
                              significant, add it and reestimate.

    pe() hierarchical         Fit empty model.
    (forward hierarchical     While the next term is significant, add it
    selection)                and reestimate.

    pr() pe() forward         Fit empty model.
    (forward stepwise)        If the most-significant excluded term is
                              significant, add it and reestimate; otherwise,
                              stop.
                              Do that again: if the most-significant excluded
                              term is significant, add it and reestimate;
                              otherwise, stop.
                              Repeatedly,
                                if the least-significant included term is
                                insignificant, remove it and reestimate;
                                if the most-significant excluded term is
                                significant, add it and reestimate;
                              until neither is possible.
    ----------------------------------------------------------------------

Examples

The following two statements are equivalent; both include solely single-variable terms:

    . stepwise, pr(.2): regress price mpg weight displ
    . stepwise, pr(.2): regress price (mpg) (weight) (displ)

The following two statements are equivalent; the last term in each is r1, ..., r4:

    . stepwise, pr(.2) hierarchical: regress price mpg weight displ (r1-r4)
