Syntax - Stata

Equality tests on unmatched data

Syntax, Menu, Description, Options for ranksum, Options for median, Remarks and examples, Stored results, Methods and formulas, References, Also see

Syntax

    Wilcoxon rank-sum test

        ranksum varname [if] [in], by(groupvar) [porder]

    Nonparametric equality-of-medians test

        median varname [if] [in] [weight], by(groupvar) [median_options]

    ranksum_options       Description
    -------------------------------------------------------------------------
    Main
      by(groupvar)        grouping variable
      porder              probability that variable for first group is larger
                          than variable for second group
    -------------------------------------------------------------------------

    median_options        Description
    -------------------------------------------------------------------------
    Main
      by(groupvar)        grouping variable
      exact               perform Fisher's exact test
      medianties(below)   assign values equal to the median to below group
      medianties(above)   assign values equal to the median to above group
      medianties(drop)    drop values equal to the median from the analysis
      medianties(split)   split values equal to the median equally between
                          the two groups
    -------------------------------------------------------------------------

    by is allowed with ranksum and median; see [D] by.
    fweights are allowed with median; see [U] 11.1.6 weight.

Menu

    ranksum
        Statistics > Nonparametric analysis > Tests of hypotheses > Wilcoxon rank-sum test

    median
        Statistics > Nonparametric analysis > Tests of hypotheses > K-sample equality-of-medians test

Description

ranksum tests the hypothesis that two independent samples (that is, unmatched data) are from populations with the same distribution by using the Wilcoxon rank-sum test, which is also known as the Mann-Whitney two-sample statistic (Wilcoxon 1945; Mann and Whitney 1947).

median performs a nonparametric k-sample test on the equality of medians. It tests the null hypothesis that the k samples were drawn from populations with the same median. For two samples, the chi-squared test statistic is computed both with and without a continuity correction.

ranksum and median are for use with unmatched data. For equality tests on matched data, see [R] signrank.

Options for ranksum

    Main

    by(groupvar) is required. It specifies the name of the grouping variable.

    porder displays an estimate of the probability that a random draw from the first population is larger than a random draw from the second population.

Options for median

    Main

    by(groupvar) is required. It specifies the name of the grouping variable.

    exact displays the significance calculated by Fisher's exact test. For two samples, both one- and two-sided probabilities are displayed.

    medianties(below|above|drop|split) specifies how values equal to the overall median are to be handled. The median test computes the median for varname by using all observations and then divides the observations into those falling above the median and those falling below the median. When values for an observation are equal to the sample median, they can be dropped from the analysis by specifying medianties(drop); added to the group above or below the median by specifying medianties(above) or medianties(below), respectively; or, if there is more than 1 observation with values equal to the median, equally divided into the two groups by specifying medianties(split).
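The classification step and the two chi-squared statistics described above can be sketched in Python. This is an illustrative sketch, not Stata's implementation. The function names are hypothetical, the grand median is taken as the midpoint of the two central values when the pooled sample size is even (an assumption), and the continuity correction is assumed to be the Yates-style correction. medianties(split) is left out because the text does not say how tied observations are allocated across samples.

```python
def median_test_table(groups, ties="below"):
    """Count observations above and not above the grand median for each group.

    ties mirrors medianties(below|above|drop); "split" is not sketched here.
    Returns (above_counts, below_counts), one entry per group.
    """
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    mid = n // 2
    # Assumption: midpoint of the two central values when n is even.
    grand = pooled[mid] if n % 2 else (pooled[mid - 1] + pooled[mid]) / 2
    above, below = [], []
    for g in groups:
        hi = sum(1 for v in g if v > grand)
        lo = sum(1 for v in g if v < grand)
        eq = len(g) - hi - lo  # observations equal to the grand median
        if ties == "below":
            lo += eq
        elif ties == "above":
            hi += eq
        elif ties != "drop":
            raise ValueError("ties must be 'below', 'above', or 'drop'")
        above.append(hi)
        below.append(lo)
    return above, below


def pearson_chi2(table, continuity=False):
    """Pearson chi-squared for a table given as a list of rows.

    continuity=True applies a Yates-style correction (meant for 2 x 2 tables).
    Assumes every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed - expected)
            if continuity:
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    return chi2


# Hypothetical data: the pooled median is 3, and three observations tie with it.
above, below = median_test_table([[1, 2, 3, 4], [3, 3, 5, 6, 7]], ties="below")
print(above, below)
print(pearson_chi2([above, below]))
```

Applied to the 2 x 2 counts of the fuel-additive example in this entry (5 and 7 above the median, 7 and 5 not above), pearson_chi2 gives about 0.6667 without the correction and about 0.1667 with it.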

If this option is not specified, medianties(below) is assumed.

Remarks and examples

Example 1

We are testing the effectiveness of a new fuel additive. We run an experiment with 24 cars: 12 cars with the fuel treatment and 12 cars without. We input these data by creating a dataset with 24 observations. mpg records the mileage rating, and treat records 0 if the mileage corresponds to untreated fuel and 1 if it corresponds to treated fuel.

    . use [...]
    . ranksum mpg, by(treat)

    Two-sample Wilcoxon rank-sum (Mann-Whitney) test

           treat |      obs    rank sum    expected
    -------------+---------------------------------
       untreated |       12         128         150
         treated |       12         172         150
    -------------+---------------------------------
        combined |       24         300         300

    unadjusted variance     [...]
    adjustment for ties     [...]
    adjusted variance       [...]

    Ho: mpg(treat==untreated) = mpg(treat==treated)
                 z = [...]
        Prob > |z| = [...]

These results indicate that the medians are not statistically different at any level smaller than [...]. Similarly, the median test,

    . median mpg, by(treat) exact

    Median test

       Greater | whether car received
      than the |    fuel additive
        median | untreated    treated |     Total
    -----------+----------------------+----------
            no |         7          5 |        12
           yes |         5          7 |        12
    -----------+----------------------+----------
         Total |        12         12 |        24

              Pearson chi2(1) = [...]   Pr = [...]
               Fisher's exact = [...]
       1-sided Fisher's exact = [...]

       Continuity corrected:
              Pearson chi2(1) = [...]   Pr = [...]

fails to reject the null hypothesis that there is no difference between the fuel with the additive and the fuel without the additive.

Compare these results from these two tests with those obtained from the signrank and signtest where we found significant differences; see [R] signrank. An experiment run on 24 different cars is not as powerful as a before-and-after comparison using the same 12 cars.

Stored results

ranksum stores the following in r():

    Scalars
      r(N_1)       sample size n1
      r(N_2)       sample size n2
      r(z)         z statistic
      r(Var_a)     adjusted variance
      r(group1)    value of variable for first group
      r(sum_obs)   actual sum of ranks for first group
      r(sum_exp)   expected sum of ranks for first group
      r(porder)    probability that draw from first population is larger than draw from second population

median stores the following in r():

    Scalars
      r(N)          sample size
      r(chi2)       Pearson's chi-squared
      r(p)          significance of Pearson's chi-squared
      r(p_exact)    Fisher's exact p
      r(groups)     number of groups compared
      r(chi2_cc)    continuity-corrected Pearson's chi-squared
      r(p_cc)       continuity-corrected significance
      r(p1_exact)   one-sided Fisher's exact p

Methods and formulas

For a practical introduction to these techniques with an emphasis on examples rather than theory, see Acock (2014), Bland (2000), or Sprent and Smeeton (2007). For a summary of these tests, see Snedecor and Cochran (1989).

Methods and formulas are presented under the following headings:

    ranksum
    median

ranksum

For the Wilcoxon rank-sum test, there are two independent random variables, $X_1$ and $X_2$, and we test the null hypothesis that $X_1 \sim X_2$. We have a sample of size $n_1$ from $X_1$ and another of size $n_2$ from $X_2$.

The data are then ranked without regard to the sample to which they belong. If the data are tied, averaged ranks are used.
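To make the averaged-rank rule concrete, here is a minimal Python sketch. It is not Stata code; the function name and the two small samples are hypothetical. The pooled values are ranked from 1 to n, and each run of tied values receives the mean of the ranks that run occupies.

```python
def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they occupy."""
    # Indices of the values in ascending order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end + 2) / 2  # mean of 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


x1 = [20, 23, 21]      # hypothetical first sample
x2 = [21, 25, 21, 19]  # hypothetical second sample
print(average_ranks(x1 + x2))  # [2.0, 6.0, 4.0, 4.0, 7.0, 4.0, 1.0]
```

The three observations equal to 21 would occupy ranks 3, 4, and 5, so each receives rank 4. The ranks still sum to n(n+1)/2 = 28.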

Wilcoxon's test statistic (1945) is the sum of the ranks for the observations in the first sample:

\[ T = \sum_{i=1}^{n_1} R_{1i} \]

Mann and Whitney's $U$ statistic (1947) is the number of pairs $(X_{1i}, X_{2j})$ such that $X_{1i} > X_{2j}$. These statistics differ only by a constant:

\[ U = T - \frac{n_1(n_1 + 1)}{2} \]

Again Fisher's principle of randomization provides a method for calculating the distribution of the test statistic, ties or not. The randomization distribution consists of the $\binom{n}{n_1}$ ways to choose $n_1$ ranks from the set of all $n = n_1 + n_2$ ranks and assign them to the first sample.

It is a straightforward exercise to verify that

\[ E(T) = \frac{n_1(n + 1)}{2} \qquad \text{and} \qquad \mathrm{Var}(T) = \frac{n_1 n_2 s^2}{n} \]

where $s$ is the standard deviation of the combined ranks, $r_i$, for both groups:

\[ s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (r_i - \bar{r})^2 \]

This formula for the variance is exact and holds both when there are no ties and when there are ties and we use averaged ranks. (Indeed, the variance formula holds for the randomization distribution of choosing $n_1$ numbers from any set of $n$ numbers.)
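The quantities above can be checked numerically with a short Python sketch. It is an illustration under stated assumptions, not Stata's implementation: the function names and sample data are hypothetical, and the sketch also forms the normal-approximation z = (T - E(T)) / sqrt(Var(T)) and the ratio U / (n1 n2) that this entry associates with the porder option.

```python
import math


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end + 2) / 2  # mean of 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_stats(x1, x2):
    """Compute T, U, E(T), Var(T), z, and U/(n1*n2) from two samples.

    Assumes the pooled data are not all identical (otherwise Var(T) is 0).
    """
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    ranks = average_ranks(list(x1) + list(x2))
    t = sum(ranks[:n1])                 # sum of ranks in the first sample
    u = t - n1 * (n1 + 1) / 2           # with averaged ranks, a tied pair counts 1/2
    expected = n1 * (n + 1) / 2         # E(T)
    mean_rank = sum(ranks) / n
    s2 = sum((r - mean_rank) ** 2 for r in ranks) / (n - 1)
    variance = n1 * n2 * s2 / n         # Var(T), valid with or without ties
    z = (t - expected) / math.sqrt(variance)
    return {"T": t, "U": u, "E(T)": expected, "Var(T)": variance,
            "z": z, "porder": u / (n1 * n2)}


# Hypothetical samples with a three-way tie at 21.
stats = rank_sum_stats([24, 23, 21], [21, 25, 21, 19])
print(stats)

# Twelve observations per group and no ties, as in the sample sizes of Example 1.
balanced = rank_sum_stats(list(range(1, 13)), list(range(13, 25)))
print(balanced["E(T)"], balanced["Var(T)"])
```

For the hypothetical samples, T = 14, U = 8, and E(T) = 12, and the variance 52/7 is smaller than the no-ties value n1 n2 (n + 1) / 12 = 8. With 12 observations per group and no ties, E(T) = 150, which matches the expected rank sum shown in the ranksum output of Example 1, and the variance is 300.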

Using a normal approximation, we calculate

\[ z = \frac{T - E(T)}{\sqrt{\mathrm{Var}(T)}} \]

When the porder option is specified, the probability

\[ p = \frac{U}{n_1 n_2} \]

is computed.

Technical note

We follow the great majority of the literature in naming these tests for Wilcoxon, Mann, and Whitney. However, they were independently developed by several other researchers in the late 1940s and early 1950s. In addition to Wilcoxon, Mann, and Whitney, credit is due to Festinger (1946), Whitfield (1947), Haldane and Smith (1947), and van der Reyden (1952). Leon Festinger (1919–1989), John Burdon Sanderson Haldane (1892–1964), and Cedric Austen Bardell Smith (1917–2002) are well known for other work, but little seems to be known about Whitfield or van der Reyden. For a detailed study, including information on these researchers, see Berry, Mielke, and Johnston (2012).

median

The median test examines whether it is likely that two or more samples came from populations with the same median.

The null hypothesis is that the samples were drawn from populations with the same median. The alternative hypothesis is that at least one sample was drawn from a population with a different median. The test should be used only with ordinal or interval data.

Assume that there are score values for k independent samples to be compared. The median test is performed by first computing the median score for all observations combined, regardless of the sample group. Each score is compared with this computed grand median and is classified as being above the grand median, below the grand median, or equal to the grand median. Observations with scores equal to the grand median can be dropped, added to the above group, added to the below group, or split between the two groups.

Once all observations are classified, the data are cast into a $2 \times k$ contingency table, and a Pearson's chi-squared test or Fisher's exact test is performed.

Henry Berthold Mann (1905–2000) was born in Vienna, Austria, where he completed a doctorate in algebraic number theory.

He moved to the United States in 1938 and for several years made his livelihood by tutoring in New York. During this time, he proved a celebrated conjecture in number theory and studied statistics at Columbia with Abraham Wald, with whom he wrote three papers. After the war, he taught at Ohio State and the Universities of Wisconsin and Arizona. In addition to his work in number theory and statistics, he made major contributions to algebra and combinatorics.

Donald Ransom Whitney (1915–2007) studied at Oberlin, Princeton, and Ohio State Universities and worked at the latter throughout his career. His PhD thesis under Henry Mann was on nonparametric statistics. It was this work that produced the test that bears their names.

References

Acock, A. C. 2014. A Gentle Introduction to Stata. 4th ed. College Station, TX: Stata Press.
Berry, K. J., P. W. Mielke, Jr., and J. E. Johnston. 2012.

The two-sample rank-sum test: Early development. Electronic Journal for History of Probability and Statistics 8: 1–[...].
Bland, M. 2000. An Introduction to Medical Statistics. 3rd ed. Oxford: Oxford University Press.
Conroy, R. M. 2012. What hypotheses do nonparametric two-group tests actually test? Stata Journal 12: 182–[...].
Feiveson, A. H. 2002. Power by simulation. Stata Journal 2: 107–[...].
Festinger, L. 1946. The significance of difference between means without reference to the frequency distribution function. Psychometrika 11: 97–[...].
Fisher, R. A. 1935. The Design of Experiments. Edinburgh: Oliver & Boyd.
Goldstein, R. 1997. sg69: Immediate Mann-Whitney and binomial effect-size tests. Stata Technical Bulletin 36: 29–31. Reprinted in Stata Technical Bulletin Reprints, vol. 6, pp. 187–189. College Station, TX: Stata Press.
Haldane, J. B. S., and C. A. B. Smith. 1947. A simple exact test for birth-order effect. Annals of Human Genetics 14: 117–[...].
Harris, T., and J. W. Hardin. 2013. Exact Wilcoxon signed-rank and Wilcoxon Mann-Whitney ranksum tests. Stata Journal 13: 337–[...].
[...], W.
