
Non-parametric tests - Main < Vanderbilt …



Transcription of Non-parametric tests - Main < Vanderbilt …

Non-parametric tests
Robbins Scholars Series
June 24, 2010

Outline
- One Sample Test: Wilcoxon Signed-Rank
- Two Sample Test: Wilcoxon Mann Whitney
- Confidence Intervals
- Summary

Introduction
- T-tests: tests for the means of continuous data
  - One sample: $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$
  - Two sample: $H_0: \mu_1 - \mu_2 = 0$ versus $H_A: \mu_1 - \mu_2 \neq 0$
- Underlying these tests is the assumption that the data arise from a normal distribution
- T-tests do not actually require normally distributed data to perform reasonably well in most circumstances
- Parametric methods: assume the data arise from a distribution described by a few parameters (Normal distribution with mean $\mu$ and variance $\sigma^2$)

- Nonparametric methods: do not make parametric assumptions (most often based on ranks as opposed to raw values)
- We discuss non-parametric alternatives to the one and two sample t-tests

Examples of when the parametric t-test goes wrong
- Extreme outliers
  - Example: t-test comparing two sets of measurements
    - Sample 1: 1 2 3 4 5 6 7 8 9 10
    - Sample 2: 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    - Sample averages: 5.5 and 13.5, t-test p-value p = [value missing]
  - Example: t-test comparing two sets of measurements
    - Sample 1: 1 2 3 4 5 6 7 8 9 10
    - Sample 2: 7 8 9 10 11 12 13 14 15 16 17 18 19 20 200
    - Sample averages: 5.5 and 25.9, t-test p-value p = [value missing]
- T-statistic: $t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
- For two sample tests: $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
- In the first dataset: $s_1^2 = 9.17$, $s_2^2 = 17.5$
- In the second dataset: $s_1^2 = 9.17$, $s_2^2 = 2335$
- Upper detection limits
  - Example: Fecal calprotectin was being evaluated as a possible biomarker of Crohn's disease severity
  - Median can be calculated (mean cannot)

[Figure: Fecal calprotectin (axis from 0 to 2500) by disease activity group: No or Mild Activity, Moderate or Severe Activity; n = 8, n = 18; values above the detection limit are marked.]

When to use non-parametric methods
- With correct assumptions (e.g., normal distribution), parametric methods will be more efficient / powerful than non-parametric methods, but often not as much as you might think
  (The large-sample efficiency of the Wilcoxon test compared to the t test is $3/\pi = 0.9549$.)
- If the normality assumption is grossly violated, nonparametric tests can be much more efficient and powerful than the corresponding parametric test
- Non-parametric methods provide a well-founded way to deal with circumstances in which parametric methods perform poorly

Non-parametric methods
- Many non-parametric methods convert raw values to ranks and then analyze ranks
- In case of ties, midranks are used, e.g., if the raw data were 105 120 120 121 the ranks would be 1 2.5 2.5 4

  Parametric Test      Nonparametric Counterpart
  1-sample t           Wilcoxon signed-rank
  2-sample t           Wilcoxon 2-sample rank-sum
  k-sample ANOVA       Kruskal-Wallis
  Pearson r            Spearman $\rho$

One Sample Test: Wilcoxon Signed-Rank
One sample tests
- Non-parametric analogue to the one sample t-test
- Almost always used on paired data where the column of values represents differences (e.g., $D = Y_{post} - Y_{pre}$)

- Sign test: the simplest test for the median difference being zero in the population
  - Examine all values of D after discarding those in which D = 0
  - Count the number of positive Ds
  - Tests $H_0: \mathrm{Prob}[D > 0] = \frac{1}{2}$ versus $H_A: \mathrm{Prob}[D > 0] \neq \frac{1}{2}$
  - Under $H_0$ it is equally likely in the population to have a value below zero as it is to have a value above zero
  - Note that it ignores magnitudes completely, so it is inefficient (low power)

One sample tests: Wilcoxon signed rank
- In the pre-post analysis
  - D = pre - post
  - Retain the sign of D (+/-)
  - Rank = rank of |D| (absolute value of D)
  - Signed rank, SR = Sign * Rank
- Base analyses on SR
- Observations with zero differences are ignored
- Example: A pre-post study

  Post    Pre    D    Sign    Rank of |D|    Signed Rank
  [table values missing]
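The sign test and signed-rank steps above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the slides: the function names are made up for the example, and the values in `diffs` are hypothetical, chosen only to include a tie and a zero difference. Midranks are computed with a simple counting rule (number of smaller values, plus half of one more than the number of equal values).

```python
from math import comb

def midranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def signed_ranks(diffs):
    """Signed rank SR = sign(D) * rank(|D|), ignoring zero differences."""
    nonzero = [d for d in diffs if d != 0]
    ranks = midranks([abs(d) for d in nonzero])
    return [r if d > 0 else -r for d, r in zip(nonzero, ranks)]

def sign_test_p(diffs):
    """Exact two-sided sign test P-value under H0: Prob[D > 0] = 1/2."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)
    tail = min(positives, n - positives)
    # Double the smaller binomial tail (fair-coin probabilities), capped at 1.
    return min(1.0, 2 * sum(comb(n, k) for k in range(tail + 1)) / 2 ** n)

# Hypothetical differences, not data from the slides.
diffs = [0.5, -0.25, 1.0, 0.0, 0.5, 2.0]
print(midranks([105, 120, 120, 121]))  # the midrank example from the slides
print(signed_ranks(diffs))             # the zero difference is dropped first
print(sign_test_p(diffs))
```

The sign test uses only the count of positive differences, while the signed ranks also keep the relative magnitudes, which is why the signed-rank test is usually the more powerful of the two.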

One sample tests
- A good approximation to an exact P-value (not discussed) may be obtained by computing
  $z = \frac{\sum SR_i}{\sqrt{\sum SR_i^2}}$,
  where $SR_i$ is the signed rank for observation $i$
- One can then compare |z| to the normal distribution
- In the example, z = 7 [rest of expression missing] and by surfstat the 2-tailed P-value is [value missing]
- If all differences are positive or all are negative, the exact 2-tailed P-value is $\frac{1}{2^{n-1}}$
  - This implies that n must exceed 5 for any possibility of significance at the $\alpha = 0.05$ level for a 2-tailed test

One sample tests: Sleep Dataset
- Compare the effects of two soporific drugs
- Each subject receives Drug 1 and Drug 2
- Study question: Is Drug 1 or Drug 2 more effective at increasing sleep?
- Dependent variable: Difference in hours of sleep comparing Drug 2 to Drug 1
- $H_0$: For any given subject, the difference in hours of sleep is equally likely to be positive or negative

  Subject    Drug 1    Drug 2    Diff (2-1)    Sign    Rank
  [table values missing]

  Table: Hours of extra sleep on drugs 1 and 2, differences, signs and ranks of sleep study data

One sample / paired test example
- Approximate p-value calculation: $\sum_{i=1}^{9} SR_i = 39$, $\sum_{i=1}^{9} SR_i^2 =$ [value missing], $z =$ [value missing], and the two sided test yields a p-value equal to 2*([value missing]) = [value missing]
- Wilcoxon signed rank test statistical program output:

    Wilcoxon signed rank test
    data: [value missing]
    V = 42, p-value = [value missing]
    alternative hypothesis: true location is not equal to 0

- Thus, we reject $H_0$ and conclude Drug 2 increases sleep by more hours than Drug 1 (p = [value missing])
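The approximate p-value calculation above can be retraced with a short Python sketch. The individual signed ranks did not survive in this transcription, so the list `sr` below is an assumption: an illustrative set of nine signed midranks (one negative, one tie) chosen only so that they sum to 39 and the positive ranks sum to 42, matching the totals quoted on the slide. The function names are likewise hypothetical.

```python
from math import erfc, sqrt

def signed_rank_z(signed_ranks):
    """Normal approximation z = sum(SR) / sqrt(sum(SR^2))."""
    return sum(signed_ranks) / sqrt(sum(sr * sr for sr in signed_ranks))

def two_sided_p(z):
    """Two-tailed P-value from the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

def exact_p_same_sign(n):
    """Exact 2-tailed P-value when all n nonzero differences share one sign."""
    return 0.5 ** (n - 1)

# Assumed signed ranks (illustrative, not recovered from the slide's table).
sr = [-3, 8, 4.5, 4.5, 2, 7, 1, 9, 6]
z = signed_rank_z(sr)
p = two_sided_p(z)
print(sum(sr), sum(x for x in sr if x > 0))  # totals quoted on the slide
print(round(z, 3), round(p, 4))

# n = 5 cannot reach the 0.05 level even in the most extreme case; n = 6 can.
print(exact_p_same_sign(5), exact_p_same_sign(6))
```

Because the formula uses the observed signed ranks directly, tied ranks are handled without a separate correction term.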

One sample / paired test example
- We could also perform the sign test on the sleep data
  - If the drugs are equally effective, we should have the same number of positives and negatives (i.e., Prob(D > 0) = .5)
  - Analogous to the earlier coin flip example
  - In the observed data: 1 negative and 8 positives (we throw out 1 "no change")
  - One sided p-value: probability of observing 0 or 1 negatives
  - Two sided p-value: probability of observing 0, 1, 8, or 9 negatives
  - p = 20/512 = 0.039, reject $H_0$ at $\alpha = 0.05$

Wilcoxon signed rank test
- Assumes the distribution of differences is symmetric
- When the distribution is symmetric, the signed rank test tests whether the median difference is zero
- In general it tests that, for two randomly chosen observations i and j with values (differences) $x_i$ and $x_j$, the probability that $x_i + x_j > 0$ is $\frac{1}{2}$
- The estimator that corresponds exactly to the test in all situations is the pseudomedian, the median of all possible pairwise averages of $x_i$ and $x_j$, so one could say that the signed rank test tests $H_0$: pseudomedian = 0
- To test $H_0: \eta = \eta_0$, where $\eta$ is the population median (not a difference) and $\eta_0$ is some constant, we create the n values $x_i - \eta_0$ and feed those to the signed rank test, assuming the distribution is symmetric
- When all nonzero values are of the same sign, the test reduces to the sign test and the 2-tailed P-value is $(\frac{1}{2})^{n-1}$, where n is the number of nonzero values

Two Sample Test: Wilcoxon Mann Whitney
Two sample WMW test
- The Wilcoxon Mann Whitney (WMW) 2-sample rank sum test is for testing for equality of central tendency of two distributions (for unpaired data)
- Ranking is done by combining the two samples and ignoring which sample each observation came from
- Example:

    Females              120   118   121   119
    Males                124   120   133
    Ranks for females    3.5   1     5     2
    Ranks for males      6     3.5   7
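The pooled ranking in the females/males example can be reproduced with a small Python sketch. The `midranks` helper is a hypothetical name for a simple counting rule, not code from the slides. It ranks the combined sample without regard to group and gives the two tied values of 120 the average of ranks 3 and 4.

```python
def midranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

females = [120, 118, 121, 119]
males = [124, 120, 133]

# Combine the two samples, rank them together, then split the ranks back out.
ranks = midranks(females + males)
female_ranks = ranks[:len(females)]
male_ranks = ranks[len(females):]

print("Ranks for females:", female_ranks)
print("Ranks for males:  ", male_ranks)
```

The ranks always sum to n(n+1)/2 for the pooled sample size n (here 28), which is a quick check that the ties were handled consistently.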

Two sample WMW test
- Doing a 2-sample t-test using these ranks as if they were raw data and computing the P-value against 4+3-2=5 d.f. will work quite well
- Loosely speaking the WMW test tests whether the population medians of the two groups are the same
- More accurately and more generally, it tests whether observations in one population tend to be larger than observations in the other
- Letting $x_1$ and $x_2$ respectively be randomly chosen observations from populations one and two, WMW tests $H_0: C = \frac{1}{2}$, where $C = \mathrm{Prob}[x_1 > x_2]$

Two sample WMW test
- Wilcoxon rank sum test statistic: $W = R - \frac{n_1(n_1 + 1)}{2}$, where R is the sum of the ranks in group 1
- Under $H_0$, $\mu_W = \frac{n_1 n_2}{2}$ and $\sigma_W = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$, and $z = \frac{W - \mu_W}{\sigma_W}$ follows a N(0,1) distribution

Two sample WMW test
- The C index (concordance)
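As a rough sketch of the rank-sum statistic and its normal approximation, the Python below computes W and z for the females/males example, then revisits the outlier example from the start of the talk. The function names are hypothetical. The variance formula is used exactly as written on the slide, with no tie correction, and with samples this small the normal approximation is only indicative. Replacing the extreme value 200 with 21 is an assumption made purely to show that only the ordering of the data matters.

```python
from math import erfc, sqrt

def midranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def wmw(group1, group2):
    """Return (W, z) using W = R - n1(n1+1)/2 and the N(0,1) approximation."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r = sum(ranks[:n1])                         # R: rank sum of group 1
    w = r - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    return w, (w - mu) / sigma

def two_sided_p(z):
    """Two-tailed P-value from the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

# Females/males example from the slides.
w, z = wmw([120, 118, 121, 119], [124, 120, 133])
print(w, round(z, 3), round(two_sided_p(z), 3))

# Outlier example: the size of the extreme value does not change the ranks.
sample1 = list(range(1, 11))
sample2 = list(range(7, 21))
with_outlier = wmw(sample1, sample2 + [200])
with_mild_value = wmw(sample1, sample2 + [21])  # 21 is a hypothetical stand-in
print(with_outlier == with_mild_value)
```

The two-sample t statistic for the same data depends heavily on the value 200 through the pooled variance, whereas the rank-based statistic is identical in both cases.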

