Title    stata.com

bootstrap — Bootstrap sampling and estimation

Syntax    Menu    Description    Options    Remarks and examples    Stored results    Methods and formulas    References    Also see

Syntax

    bootstrap exp_list [, options eform_option] : command

    options                  Description
    ------------------------------------------------------------------------
    Main
      reps(#)                perform # bootstrap replications; default is reps(50)

    Options
      strata(varlist)        variables identifying strata
      size(#)                draw samples of size #; default is _N
      cluster(varlist)       variables identifying resampling clusters
      idcluster(newvar)      create new cluster ID variable
      saving(filename, ...)  save results to filename; save statistics in double precision; save results to filename every # replications
      bca                    compute acceleration for BCa confidence intervals
      ties                   adjust BC/BCa confidence intervals for ties
      mse                    use MSE formula for variance estimation

    Reporting
      level(#)               set confidence level; default is level(95)
      notable                suppress table of results
      noheader               suppress table header
      nolegend               suppress table legend
      verbose                display the full table legend
      nodots                 suppress replication dots
      noisily                display any output from command
      trace                  trace command
      title(text)            use text as title for bootstrap results
      display_options        control column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling
      eform_option           display coefficient table in exponentiated form

    Advanced
      nodrop                 do not drop observations
      nowarn                 do not warn when e(sample) is not set
      force                  do not check for weights or svy commands; seldom used
      reject(exp)            identify invalid results
      seed(#)                set random-number seed to #

      group(varname)         ID variable for groups within cluster()
      jackknifeopts(jkopts)  options for jackknife; see [R] jackknife
      coeflegend             display legend instead of statistics
    ------------------------------------------------------------------------
    weights are not allowed in command.
    group(), jackknifeopts(), and coeflegend do not appear in the dialog box.
    See [U] 20 Estimation and postestimation commands for more capabilities of estimation commands.

    exp_list contains    (name: elist)
                         elist
                         eexp
    elist contains       newvar = (exp)
                         (exp)
    eexp is              specname
                         [eqno]specname
    specname is          _b
                         _b[]
                         _se
                         _se[]
    eqno is              ##
                         name

    exp is a standard Stata expression; see [U] 13 Functions and expressions.

    Distinguish between [ ], which are to be typed, and [], which indicate optional arguments.

Menu

    Statistics > Resampling > Bootstrap estimation

Description

bootstrap performs bootstrap estimation. Typing

    . bootstrap exp_list, reps(#): command

executes command multiple times, bootstrapping the statistics in exp_list by resampling observations (with replacement) from the data in memory # times. This method is commonly referred to as the nonparametric bootstrap.

command defines the statistical command to be executed.
Most Stata commands and user-written programs can be used with bootstrap, as long as they follow standard Stata syntax; see [U] 11 Language syntax. If the bca option is supplied, command must also work with jackknife; see [R] jackknife. The by prefix may not be part of command.

exp_list specifies the statistics to be collected from the execution of command. If command changes the contents in e(b), exp_list is optional and defaults to _b.

Because bootstrapping is a random process, if you want to be able to reproduce results, set the random-number seed by specifying the seed(#) option or by typing

    . set seed #

where # is a seed of your choosing, before running bootstrap; see [R] set seed.

Many estimation commands allow the vce(bootstrap) option. For those commands, we recommend using vce(bootstrap) over bootstrap because the estimation command already handles clustering and other model-specific details for you.
The bootstrap prefix command is intended for use with nonestimation commands, such as summarize, user-written commands, or functions of coefficients.

bs and bstrap are synonyms for bootstrap.

Options

Main

reps(#) specifies the number of bootstrap replications to be performed. The default is 50. A total of 50-200 replications are generally adequate for estimates of standard error and thus are adequate for normal-approximation confidence intervals; see Mooney and Duval (1993, 11). Estimates of confidence intervals using the percentile or bias-corrected methods typically require 1,000 or more replications.

Options

strata(varlist) specifies the variables that identify strata. If this option is specified, bootstrap samples are taken independently within each stratum.

size(#) specifies the size of the samples to be drawn. The default is _N, meaning to draw samples of the same size as the data. If specified, # must be less than or equal to the number of observations within strata().
If cluster() is specified, the default size is the number of clusters in the original dataset. For unbalanced clusters, resulting sample sizes will differ from replication to replication. For cluster sampling, # must be less than or equal to the number of clusters within strata().

cluster(varlist) specifies the variables that identify resampling clusters. If this option is specified, the sample drawn during each replication is a bootstrap sample of clusters.

idcluster(newvar) creates a new variable containing a unique identifier for each resampled cluster. This option requires that cluster() also be specified.

saving(filename[, suboptions]) creates a Stata data file (.dta file) consisting of (for each statistic in exp_list) a variable containing the replicates.

    double specifies that the results for each replication be saved as doubles, meaning 8-byte reals. By default, they are saved as floats, meaning 4-byte reals.
    This option may be used without the saving() option to compute the variance estimates by using double precision.

    every(#) specifies that results be written to disk every #th replication. every() should be specified only in conjunction with saving() when command takes a long time for each replication. This option will allow recovery of partial results should some other software crash your computer. See [P] postfile.

    replace specifies that filename be overwritten if it exists. This option does not appear in the dialog box.

bca specifies that bootstrap estimate the acceleration of each statistic in exp_list. This estimate is used to construct BCa confidence intervals. Type estat bootstrap, bca to display the BCa confidence interval generated by the bootstrap command.

ties specifies that bootstrap adjust for ties in the replicate values when computing the median bias used to construct BC and BCa confidence intervals.

mse specifies that bootstrap compute the variance by using deviations of the replicates from the observed value of the statistics based on the entire dataset.
By default, bootstrap computes the variance by using deviations from the average of the replicates.

Reporting

level(#); see [R] estimation options.

notable suppresses the display of the table of results.

noheader suppresses the display of the table header. This option implies nolegend. This option may also be specified when replaying estimation results.

nolegend suppresses the display of the table legend. This option may also be specified when replaying estimation results.

verbose specifies that the full table legend be displayed. By default, coefficients and standard errors are not displayed. This option may also be specified when replaying estimation results.

nodots suppresses display of the replication dots. By default, one dot character is displayed for each successful replication. A red x is displayed if command returns an error or if one of the values in exp_list is missing.

noisily specifies that any output from command be displayed. This option implies nodots.

trace causes a trace of the execution of command to be displayed.
This option implies noisily.

title(text) specifies a title to be displayed above the table of bootstrap results. The default title is the title stored in e(title) by an estimation command, or if e(title) is not filled in, Bootstrap results is used. title() may also be specified when replaying estimation results.

display_options: noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.

eform_option causes the coefficient table to be displayed in exponentiated form; see [R] eform_option. command determines which of the following are allowed (eform(string) and eform are always allowed):

    eform_option     Description
    ------------------------------------------------------------
    eform(string)    use string for the column title
    eform            exponentiated coefficient, string is exp(b)
    hr               hazard ratio, string is Haz. Ratio
    shr              subhazard ratio, string is SHR
    irr              incidence-rate ratio, string is IRR
    or               odds ratio, string is Odds Ratio
    rrr              relative-risk ratio, string is RRR
    ------------------------------------------------------------

Advanced

nodrop prevents observations outside e(sample) and the if and in qualifiers from being dropped before the data are resampled.

nowarn suppresses the display of a warning message when command does not set e(sample).

force suppresses the restriction that command not specify weights or be a svy command. This is a rarely used option. Use it only if you know what you are doing.

reject(exp) identifies an expression that indicates when results should be rejected. When exp is true, the resulting values are reset to missing values.

seed(#) sets the random-number seed. Specifying this option is equivalent to typing the following command prior to calling bootstrap:

    . set seed #

The following options are available with bootstrap but are not shown in the dialog box:

group(varname) re-creates varname containing a unique identifier for each group across the resampled clusters. This option requires that idcluster() also be specified.

This option is useful for maintaining unique group identifiers when sampling clusters with replacement. Suppose that cluster 1 contains 3 groups. If the idcluster(newclid) option is specified and cluster 1 is sampled multiple times, newclid uniquely identifies each copy of cluster 1.
If group(newgroupid) is also specified, newgroupid uniquely identifies each copy of each group.

jackknifeopts(jkopts) identifies options that are to be passed to jackknife when it computes the acceleration values for the BCa confidence intervals; see [R] jackknife. This option requires the bca option and is mostly used for passing the eclass, rclass, or n(#) option to jackknife.

coeflegend; see [R] estimation options.

Remarks and examples

Remarks are presented under the following headings:

    Introduction
    Regression coefficients
    Expressions
    Combining bootstrap datasets
    A note about macros
    Achieved significance level
    Bootstrapping a ratio
    Warning messages and e(sample)
    Bootstrapping statistics from data with a complex structure

Introduction

With few assumptions, bootstrapping provides a way of estimating standard errors and other measures of statistical precision (Efron 1979; Efron and Stein 1981; Efron 1982; Efron and Tibshirani 1986; Efron and Tibshirani 1993; also see Davison and Hinkley [1997]; Guan [2003]; Mooney and Duval [1993]; Poi [2004]; and Stine [1990]).
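
To make the resampling idea concrete, the following is a minimal sketch, written in Python rather than Stata, of the nonparametric bootstrap described under Description. It draws samples with replacement, recomputes the statistic on each, and estimates the standard error from the replicates. It is not Stata's implementation. The function name bootstrap_se, its arguments, and the data values are hypothetical. The divisors used (reps - 1 for deviations from the replicate average, reps for deviations from the observed value, mirroring the mse option) are assumptions of this sketch and are not stated in the text above.

```python
import random
import statistics


def bootstrap_se(data, stat, reps=50, seed=None, mse=False):
    """Return (observed, se, replicates) for a nonparametric bootstrap of stat.

    Hypothetical helper for illustration; not part of Stata.
    """
    rng = random.Random(seed)   # private generator, so a given seed reproduces the run
    n = len(data)
    observed = stat(data)       # statistic computed on the entire dataset

    replicates = []
    for _ in range(reps):
        # Draw n observations with replacement from the data
        sample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(sample))

    if mse:
        # Deviations from the observed value (assumed divisor: reps)
        center, divisor = observed, reps
    else:
        # Deviations from the average of the replicates (assumed divisor: reps - 1)
        center, divisor = statistics.fmean(replicates), reps - 1

    variance = sum((r - center) ** 2 for r in replicates) / divisor
    return observed, variance ** 0.5, replicates


# Hypothetical data, for illustration only
values = [22, 17, 22, 20, 15, 18, 26, 20, 16, 19]

observed, se_default, _ = bootstrap_se(values, statistics.fmean, reps=200, seed=12345)
_, se_mse, _ = bootstrap_se(values, statistics.fmean, reps=200, seed=12345, mse=True)
print(f"observed mean = {observed:.3f}")
print(f"bootstrap SE (deviations from replicate average) = {se_default:.3f}")
print(f"bootstrap SE (deviations from observed value)    = {se_mse:.3f}")
```

Passing the same seed twice yields identical replicates, which is the reproducibility that seed(#) or set seed is meant to provide. The two standard errors differ only in the centering value and divisor.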