
Syntax - Stata




codebook - Describe data contents

Contents: Syntax, Menu, Description, Options, Remarks and examples, Stored results, References, Also see

Syntax

    codebook [varlist] [if] [in] [, options]

    options                  Description
    -----------------------------------------------------------------------
    Options
      all                    print complete report without missing values
      header                 print dataset name and last saved date
      notes                  print any notes attached to variables
      mv                     report pattern of missing values
      tabulate(#)            set tables/summary statistics threshold;
                               default is tabulate(9)
      problems               report potential problems in dataset
      detail                 display detailed report on the variables;
                               only with problems
      compact                display compact report on the variables
      dots                   display a dot for each variable processed;
                               only with compact

    Languages
      languages[(namelist)]  use with multilingual datasets;
                               see [D] label language for details

Menu

    Data > Describe data > Describe data contents (codebook)

Description

codebook examines the variable names, labels, and data to produce a codebook describing the dataset.

Options

all is equivalent to specifying the header and notes options. It provides a complete report, which excludes only performing mv.

header adds to the top of the output a header that lists the dataset name, the date that the dataset was last saved, etc.

notes lists any notes attached to the variables; see [D] notes.

mv specifies that codebook search the data to determine the pattern of missing values. This is a CPU-intensive task.

tabulate(#) specifies the number of unique values of the variables to use to determine whether a variable is categorical or continuous. Missing values are not included in this count. The default is 9; when there are more than nine unique values, the variable is classified as continuous. Extended missing values will be included in the tabulation.

problems specifies that a summary report is produced describing potential problems that have been diagnosed:

    Variables that are labeled with an undefined value label
    Incompletely value-labeled variables
    Variables that are constant, including always missing
    Leading, trailing, and embedded spaces in string variables
    Embedded binary 0 (\0) in string variables
    Noninteger-valued date variables

See the discussion of these problems and advice on overcoming them following the example.

detail may be specified only with the problems option. It specifies that the detailed report on the variables not be suppressed.

compact specifies that a compact report on the variables be displayed. compact may not be specified with any options other than dots.

dots specifies that a dot be displayed for every variable processed. dots may be specified only with compact.

Languages

languages[(namelist)] is for use with multilingual datasets; see [D] label language. It indicates that the codebook pertains to the languages in namelist or to all defined languages if no such list is specified as an argument to languages(). The output of codebook lists the data label and variable labels in these languages and which value labels are attached to variables in these languages.

Problems are diagnosed in all these languages, as well. The problem report does not provide details in which language problems occur. We advise you to rerun codebook for problematic variables; specify detail to produce the problem report again.

If you have a multilingual dataset but do not specify languages(), all output, including the problem report, is shown in the active language.

Remarks and examples

codebook, without arguments, is most usefully combined with log to produce a printed listing for enclosure in a notebook documenting the data; see [U] 15 Saving and printing output: log files. codebook is, however, also useful interactively, because you can specify one or a few variables.

Example 1

codebook examines the data in producing its results.

For variables that codebook thinks are continuous, it presents the mean; the standard deviation; and the 10th, 25th, 50th, 75th, and 90th percentiles. For variables that it thinks are categorical, it presents a tabulation. In part, codebook makes this determination by counting the number of unique values of the variable. If the number is nine or fewer, codebook reports a tabulation; otherwise, it reports summary statistics.

codebook distinguishes the standard missing values (.) and the extended missing values (.a through .z, denoted by .*). If extended missing values are found, codebook reports the number of distinct missing value codes that occurred in that variable. Missing values are ignored with the tabulate option when determining whether a variable is treated as continuous or categorical.

    . use
    (ccdb46, 52-54)

    . codebook fips division, all

    Dataset:
    Last saved:  6 Mar 2013 22:20
    Label:  ccdb46, 52-54
    Number of variables:  42
    Number of observations:  956
    Size:  145,312 bytes ignoring labels, etc.

    _dta:
      1.  confirmed data with steve on 7/22

    fips                                            state/place code
        type:  numeric (long)
        range:  [10060,560050]           units:  1
        unique values:  956              missing .:  0/956
        mean:  256495
        std. dev:  156998
        percentiles:     10%      25%      50%      75%      90%
                       61462   120426   252848   391360   482530

    division                                         Census Division
        type:  numeric (int)
        label:  division
        range:  [1,9]                    units:  1
        unique values:  9                missing .:  4/956
        unique mv codes:  2              missing .*:  2/956
        tabulation:  Freq.   Numeric  Label
                        69         1  N. Eng.
                        97         2  Mid Atl
                       202         3  E.N.C.
                        78         4  W.N.C.
                       115         5  S. Atl.
                        46         6  E.S.C.
                        89         7  W.S.C.
                        59         8  Mountain
                       195         9  Pacific
                         4         .
                         2        .a

Because division has nine unique nonmissing values, codebook reported a tabulation. If division had contained one more unique nonmissing value, codebook would have switched to reporting summary statistics, unless we had included the tabulate(#) option.

Example 2

The mv option is useful. It instructs codebook to search the data to determine patterns of missing values. Different kinds of missing values are not distinguished in the patterns.

    . use
    (City Temperature Data)

    . codebook cooldd heatdd tempjan tempjuly, mv

    cooldd                                       Cooling degree days
        type:  numeric (int)
        range:  [0,4389]                 units:  1
        unique values:  438              missing .:  3/956
        mean:
        std. dev:
        percentiles:     10%      25%      50%      75%      90%
                         411      615      940     1566     2761
        missing values:  heatdd==mv <-> cooldd==mv
                         tempjan==mv --> cooldd==mv
                         tempjuly==mv --> cooldd==mv

    heatdd                                       Heating degree days
        type:  numeric (int)
        range:  [0,10816]                units:  1
        unique values:  471              missing .:  3/956
        mean:
        std. dev:
        percentiles:     10%      25%      50%      75%      90%
                        1510     2460     4950     6232     6919
        missing values:  cooldd==mv <-> heatdd==mv
                         tempjan==mv --> heatdd==mv
                         tempjuly==mv --> heatdd==mv

    tempjan                              Average January temperature
        type:  numeric (float)
        range:  [ , ]                    units:  .1
        unique values:  310              missing .:  2/956
        mean:
        std. dev:
        percentiles:     10%      25%      50%      75%      90%
        missing values:  tempjuly==mv <-> tempjan==mv

    tempjuly                                Average July temperature
        type:  numeric (float)
        range:  [ , ]                    units:  .1
        unique values:  196              missing .:  2/956
        mean:
        std. dev:
        percentiles:     10%      25%      50%      75%      90%
        missing values:  tempjan==mv <-> tempjuly==mv

codebook reports that if tempjan is missing, tempjuly is also missing, and vice versa. In the output for the cooldd variable, codebook also reports that the pattern of missing values is the same for cooldd and heatdd. In both cases, the correspondence is indicated with <->.

For cooldd, codebook also states that tempjan==mv --> cooldd==mv.
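The arrow notation can be reproduced on a small constructed dataset. This is a minimal sketch, not output from the manual: the variables x, y, and z and their values are hypothetical. x and y are missing in the same observations, while z is missing in only one of them.

```stata
* Hypothetical data: x and y are missing together (obs 2 and 3);
* z is missing only in obs 3
clear
input x y z
1 10 100
. .  200
. .  .
4 40 400
end

* For x, codebook should report y==mv <-> x==mv (identical pattern)
* and z==mv --> x==mv (z missing implies x missing, not the reverse)
codebook x y z, mv
```

Because mv makes codebook search the data for these patterns, it may be worth restricting the varlist on large datasets.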

The one-way arrow means that a missing tempjan value implies a missing cooldd value but that a missing cooldd value does not necessarily imply a missing tempjan value.

Another feature of codebook (this one for numeric variables) is that it can determine the units of the variable. For instance, in the example above, tempjan and tempjuly both have units of 0.1, meaning that temperature is recorded to tenths of a degree. codebook handles precision considerations in making this determination (tempjan and tempjuly are floats; see [U] Precision and problems therein). If we had a variable in our dataset recorded in 100s (for example, 21,500 or 36,800), codebook would have reported the units as 100. If we had a variable that took on only values divisible by 5 (5, 10, 15, etc.), codebook would have reported the units as 5.

Example 3

We can use the label language command (see [D] label language) and the label command (see [D] label) to create German value labels for our auto dataset.

These labels are reported by codebook:

    . use
    (1978 Automobile Data)

    . label language en, rename
    (language default renamed en)

    . label language de, new
    (language de now current language)

    . label data "1978 Automobile Daten"

    . label variable foreign "Art Auto"

    . label values foreign origin_de

    . label define origin_de 0 "Innen" 1 "Ausländish"

    . codebook foreign

    foreign                                                 Art Auto
        type:  numeric (byte)
        label:  origin_de
        range:  [0,1]                    units:  1
        unique values:  2                missing .:  0/74
        tabulation:  Freq.   Numeric  Label
                        52         0  Innen
                        22         1  Ausländish

    . codebook foreign, languages(en de)

    foreign                                          in en: Car type
                                                     in de: Art Auto
        type:  numeric (byte)
        label in en:  origin
        label in de:  origin_de
        range:  [0,1]                    units:  1
        unique values:  2                missing .
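The tabulation threshold and the units detection described earlier can be tried on the same auto dataset. The following is a minimal sketch rather than output from the manual: it assumes the auto dataset that ships with Stata (loaded here with sysuse), and price100 is a hypothetical variable created only for illustration.

```stata
* Reload the shipped auto dataset (assumption: available via sysuse)
sysuse auto, clear

* rep78 has fewer than ten unique values, so the default
* threshold, tabulate(9), should yield a tabulation
codebook rep78

* Lowering the threshold to 3 should switch rep78 to summary statistics
codebook rep78, tabulate(3)

* Hypothetical variable: price rounded to the nearest 100,
* for which codebook would be expected to report units of 100
generate price100 = round(price, 100)
codebook price100

* One line per variable for the whole dataset
codebook, compact
```

Reloading with clear discards the German labels defined above, so the sketch starts from the dataset as shipped.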

