
Lecture 7: Hypothesis Testing and ANOVA



Goals

• Introduction to ANOVA
• Review of common one and two sample tests
• Overview of key elements of hypothesis testing

Hypothesis Testing

• The intent of hypothesis testing is to formally examine two opposing conjectures (hypotheses), $H_0$ and $H_A$
• These two hypotheses are mutually exclusive and exhaustive, so that one is true to the exclusion of the other
• We accumulate evidence (collect and analyze sample information) for the purpose of determining which of the two hypotheses is true and which is false

The Null and Alternative Hypothesis

The null hypothesis, $H_0$:
• States the assumption (numerical) to be tested
• Begin with the assumption that the null hypothesis is TRUE
• Always contains the = sign

The alternative hypothesis, $H_A$:

• Is the opposite of the null hypothesis
• Challenges the status quo
• Never contains just the = sign
• Is generally the hypothesis that is believed to be true by the researcher

One and Two Sided Tests

• Hypothesis tests can be one or two sided (tailed)
• One tailed tests are directional:
  $H_0: \mu_1 - \mu_2 \le 0$
  $H_A: \mu_1 - \mu_2 > 0$
• Two tailed tests are not directional:
  $H_0: \mu_1 - \mu_2 = 0$
  $H_A: \mu_1 - \mu_2 \ne 0$

P-values

• Calculate a test statistic in the sample data that is relevant to the hypothesis being tested
• After calculating a test statistic we convert this to a P-value by comparing its value to the distribution of the test statistic under the null hypothesis
• The P-value is a measure of how likely the test statistic value is under the null hypothesis
  P-value $\le \alpha$: reject $H_0$ at level $\alpha$
  P-value $> \alpha$: do not reject $H_0$ at level $\alpha$

When To Reject H0

• One sided: $\alpha$ = …
• Rejection region:

Set of all test statistic values for which $H_0$ will be rejected

• Level of significance, $\alpha$: specified before an experiment to define the rejection region
• One sided: critical value = …
• Two sided: $\alpha$ = …, critical values = … and + …

Notation

• In general, critical values for an $\alpha$ level test are denoted as:
  One sided test: $X_{\alpha}$
  Two sided test: $X_{\alpha/2}$
  where $X$ depends on the distribution of the test statistic
• For example, if $X \sim N(0,1)$:
  One sided test: $z_{\alpha}$ (…)
  Two sided test: $z_{\alpha/2}$ (…)

Errors in Hypothesis Testing: Type I and II Errors

                       Actual situation ("truth")
Decision               $H_0$ true                                     $H_0$ false
Do not reject $H_0$    Correct decision, $1 - \alpha$                 Incorrect decision, $\beta$ (Type II error)
Reject $H_0$           Incorrect decision, $\alpha$ (Type I error)    Correct decision, $1 - \beta$
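The critical-value notation and the P-value decision rule can be illustrated numerically for a standard normal test statistic. The following is a minimal sketch using `statistics.NormalDist` from Python's standard library; the significance level `alpha = 0.05`, the observed statistic `z_obs = 2.1`, and the helper names are assumptions chosen for illustration, not values taken from the lecture.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def critical_values(alpha):
    """Return the upper critical values (z_alpha, z_{alpha/2}) of N(0, 1)."""
    z_one = STD_NORMAL.inv_cdf(1.0 - alpha)        # one sided test
    z_two = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)  # two sided test
    return z_one, z_two

def p_values(z_obs):
    """Return (upper-tailed, two-tailed) P-values for an observed z statistic."""
    upper = 1.0 - STD_NORMAL.cdf(z_obs)
    two_sided = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z_obs)))
    return upper, two_sided

alpha = 0.05  # hypothetical significance level
z_obs = 2.1   # hypothetical observed test statistic

z_one, z_two = critical_values(alpha)
p_upper, p_two = p_values(z_obs)

print(f"one sided critical value: {z_one:.3f}")
print(f"two sided critical value: {z_two:.3f}")
print(f"upper-tailed P-value: {p_upper:.4f}, reject H0: {p_upper <= alpha}")
print(f"two-tailed P-value:   {p_two:.4f}, reject H0: {p_two <= alpha}")
```

Comparing the P-value with `alpha` and comparing the statistic with the critical value are two views of the same decision: the statistic falls in the rejection region exactly when its P-value is at most `alpha`.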

$$\alpha = P(\text{Type I Error}), \qquad \beta = P(\text{Type II Error})$$

Power = $1 - \beta$

Parametric and Non-Parametric Tests

• Parametric tests: rely on theoretical distributions of the test statistic under the null hypothesis and assumptions about the distribution of the sample data (for example, normality)
• Non-parametric tests: referred to as "distribution free" as they do not assume that data are drawn from any particular distribution

Whirlwind Tour of One and Two Sample Tests

                                             Type of data
Goal                                         Gaussian             Non-Gaussian                  Binomial
Compare one group to a hypothetical value    One sample t-test    Wilcoxon test                 Binomial test
Compare two paired groups                    Paired t-test        Wilcoxon test                 McNemar's test
Compare two unpaired groups                  Two sample t-test    Wilcoxon-Mann-Whitney test    Chi-square or Fisher's exact test

General Form of a t-test

One sample:
  Statistic: $T = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
  df: $t_{\alpha,\,n-1}$

Two sample:
  Statistic: $T = \dfrac{\bar{x} - \bar{y} - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$
  df: $t_{\alpha,\,m+n-2}$

Non-Parametric Alternatives

• Wilcoxon test: non-parametric analog of the one sample t-test
• Wilcoxon-Mann-Whitney test: non-parametric analog of the two sample t-test

Hypothesis Tests of a Proportion

• Large sample test (…):
  $$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
• Small sample test (…): calculated directly from the binomial distribution

Confidence Intervals

• Confidence interval: an interval of plausible values for the parameter being estimated, where the degree of plausibility is specified by a confidence level
• General form:

$$\bar{X} \pm \text{critical value} \times se$$

$$\bar{x} - \bar{y} \pm t_{\alpha,\,m+n-2}\; s_p\sqrt{\frac{1}{m} + \frac{1}{n}}$$

Interpreting a 95% CI

• We calculate a 95% CI for a hypothetical sample mean to be between … and …. Does this mean there is a 95% probability the true population mean is between … and …? NO!
• Correct interpretation relies on the long-run frequency interpretation of probability
• Why is this so?

Hypothesis Tests of 3 or More Means

• Suppose we measure a quantitative trait in a group of N individuals and also genotype a SNP in our favorite candidate gene. We then divide these N individuals into the three genotype categories to test whether the average trait value differs among genotypes.
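The long-run frequency interpretation of a 95% CI mentioned above can be made concrete with a small simulation: if the sampling procedure is repeated many times, about 95% of the resulting intervals should contain the true mean. This is only a sketch under stated assumptions. It uses a z-interval with a known standard deviation instead of the t-based interval from the lecture (Python's standard library has no t quantile function), and the true mean, `sigma`, sample size, repeat count, and function names are all hypothetical.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, level=0.95):
    """CI for the mean with known sigma: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * sigma / len(sample) ** 0.5
    xbar = mean(sample)
    return xbar - half_width, xbar + half_width

def coverage(true_mean, sigma, n, repeats, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma)
        if lo <= true_mean <= hi:
            hits += 1
    return hits / repeats

# Hypothetical population and design, chosen only for illustration.
rate = coverage(true_mean=10.0, sigma=2.0, n=25, repeats=2000)
print(f"fraction of intervals covering the true mean: {rate:.3f}")
```

Any single interval either contains the true mean or it does not; the 95% describes the procedure across repeated samples, which is why the printed fraction should come out close to 0.95.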

• What statistical framework is appropriate here?
• Why not perform all pair-wise t-tests?

Basic Framework of ANOVA

• Want to study the effect of one or more qualitative variables on a quantitative outcome variable
• Qualitative variables are referred to as factors (for example, SNP)
• Characteristics that differentiate factors are referred to as levels (for example, the three genotypes of a SNP)

One-Way ANOVA

• Simplest case is One-Way (Single Factor) ANOVA
• The outcome variable is the variable you're comparing
• The factor variable is the categorical variable being used to define the groups
  - We will assume k samples (groups)

• The "one-way" is because each value is classified in exactly one way
• ANOVA easily generalizes to more factors

Assumptions of ANOVA

• Independence
• Normality
• Homogeneity of variances (aka homoscedasticity)

One-Way ANOVA: Null Hypothesis

• The null hypothesis is that the means are all equal
• The alternative hypothesis is that at least one of the means is different
• Think about the Sesame Street game where three of these things are kind of the same, but one of these things is not like the other. They don't all have to be different, just one of them.

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$

Motivating ANOVA

• A random sample of some quantitative trait was measured in individuals randomly sampled from a population
• Genotyping of a single SNP:
  AA: 82, 83, 97
  AG: 83, 78, 68
  GG: 38, 59, 55

Rationale of ANOVA

• Basic idea is to partition total variation of the data into two sources:
  1. Variation within levels (groups)
  2. Variation between levels (groups)
• If $H_0$ is true, the standardized variances are equal to one another

The Details

• Our data:
  AA: 82, 83, 97
  AG: 83, 78, 68
  GG: 38, 59, 55
• Let $x_{ij}$ denote the data from the $i$th level and $j$th observation
• Overall, or grand mean, is:

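$$\bar{x}_{..} = \sum_{i=1}^{K}\sum_{j=1}^{J} \frac{x_{ij}}{N}$$

$$\bar{x}_{..} = \frac{82+83+97+83+78+68+38+59+55}{9} \approx 71.44$$

$$\bar{x}_{1.} = (82+83+97)/3 \approx 87.33$$

$$\bar{x}_{2.} = (83+78+68)/3 \approx 76.33$$

$$\bar{x}_{3.} = (38+59+55)/3 \approx 50.67$$

Total Variation

• Recall, variation is simply average squared deviations from the mean

$$SST = SST_G + SST_E$$

$$\sum_{i=1}^{K}\sum_{j=1}^{J} (x_{ij} - \bar{x}_{..})^2 = \sum_{i=1}^{K} n_i (\bar{x}_{i.} - \bar{x}_{..})^2 + \sum_{i=1}^{K}\sum_{j=1}^{J} (x_{ij} - \bar{x}_{i.})^2$$

• $SST$: sum of squared deviations about the grand mean across all N observations
• $SST_G$: sum of squared deviations for each group mean about the grand mean
• $SST_E$: sum of squared deviations for all observations within each group from that group mean, summed across all groups

In Our Example

$$SST = SST_G + SST_E$$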
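The partition of total variation into a between-group and a within-group part can be checked numerically on the example genotype data. The sketch below uses only Python's standard library; the trait values come from the lecture's example, while the function and variable names are illustrative choices.

```python
from statistics import mean

# Trait values by genotype, taken from the lecture's example.
groups = {
    "AA": [82, 83, 97],
    "AG": [83, 78, 68],
    "GG": [38, 59, 55],
}

def sums_of_squares(groups):
    """Return (SST, SSTG, SSTE) for a one-way layout."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Total: every observation about the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between groups: each group mean about the grand mean, weighted by n_i.
    sstg = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within groups: each observation about its own group mean.
    sste = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return sst, sstg, sste

sst, sstg, sste = sums_of_squares(groups)
print(f"SST  = {sst:.2f}")
print(f"SSTG = {sstg:.2f}")
print(f"SSTE = {sste:.2f}")
print(f"SSTG + SSTE = {sstg + sste:.2f}")
```

On these nine observations the sums come out to SST ≈ 2630.22, SSTG ≈ 2124.22 and SSTE = 506.00, so most of the total variation lies between the genotype groups rather than within them.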

