Sample Size - vanbelle.org

Transcription of Sample Size - vanbelle.org

2 Sample Size

The first question faced by a statistical consultant, and frequently the last, is, "How many subjects (animals, units) do I need?" [...] Type II error, and one-sided versus two-sided [...] question is to settle the type of variable (endpoint) the consultee has in mind: Is it continuous, discrete, or something else? For continuous measurements the normal distribution is the default model; for distributions with binary outcomes, [...] sample size calculation, for one or two groups, are:

Type I Error ($\alpha$): Probability of rejecting the null hypothesis when it is true
Type II Error ($\beta$): Probability of not rejecting the null hypothesis when it is false
Power $= 1 - \beta$: Probability of rejecting the null hypothesis when it is false
$\sigma_0^2$ and $\sigma_1^2$: Variances under the null and alternative hypotheses (may be the same)
$\mu_0$ and $\mu_1$: Means under the null and alternative hypotheses
$n_0$ and $n_1$: Sample sizes in two groups (may be the same)

The choice of the alternative hypothesis is [...] they knew the value of the alternative hypothesis, they would not need to do the study.

There is also debate about which is the null hypothesis and which is the alternative hypothesis.

[Figure: sampling distributions under the null hypothesis $H_0$ and the alternative hypothesis $H_1$, each with standard error $\sigma\sqrt{2/n}$, showing the critical value, the two $\alpha/2$ rejection tails, and power $= 1 - \beta$. Two-sided alternative, equal variances under null and alternative hypotheses.]

[...] site is safe or hazardous as the null hypothesis? Millard (1987a) [...] in most research settings the null hypothesis is reasonably assumed to [...] a need to become familiar with the research area in order to be of more than marginal use to the investigator. In terms of the alternative hypothesis, it is salutary to read the comments of Wright (1999) in a completely different context, but very applicable to the researcher: "an alternative [...] sense of the data, do so with an essential simplicity, and shed light on other areas."

This provides some challenging guidance to the selection of an alternative [...] Type I error, is [...] can refer to the error as such, or the probability of making a Type I error. It will usually be clear from the context which is [...], whether the test is one-sided or two-sided, and the probability of a Type I error ($\alpha$) [...] hypothesis then defines the power and the Type II error ($\beta$). Notice that moving the curve associated with the alternative hypothesis to the right (equivalent to increasing the distance between null and alternative hypotheses) increases the area of the curve over the rejection region and thus increases the power. The critical value defines the boundary between the rejection and [...] sample situation:

$$0 + z_{1-\alpha/2}\,\sigma\sqrt{\frac{2}{n}} = \delta - z_{1-\beta}\,\sigma\sqrt{\frac{2}{n}}.$$

If the variances, and sample sizes, are not equal, then the standard deviations in equation ( ) are replaced by the values associated with the null and alternative hypotheses, and individual sample sizes are inserted as follows,

$$0 + z_{1-\alpha/2}\,\sigma_0\sqrt{\frac{1}{n_0} + \frac{1}{n_1}} = \delta - z_{1-\beta}\sqrt{\frac{\sigma_0^2}{n_0} + \frac{\sigma_1^2}{n_1}}.$$

This formulation is the most general and is the basis for virtually all two [...] sample situations by assuming that one of the samples has an infinite [...]

BEGIN WITH A BASIC FORMULA FOR SAMPLE SIZE: LEHR'S EQUATION

Introduction

Start with the basic sample size formula for two groups, with a two-sided alternative, normal distribution with homogeneous variances ($\sigma_0^2 = \sigma_1^2 = \sigma^2$) and equal sample sizes ($n_0 = n_1 = n$).

Rule of Thumb

The basic formula is

$$n = \frac{16}{\Delta^2},$$

where

$$\Delta = \frac{\mu_0 - \mu_1}{\sigma} = \frac{\delta}{\sigma}$$

is the treatment difference to be detected in units of the standard deviation [...] sample case the numerator is 8 [...] single sample is compared with a [...]

[...] the standardized difference, $\Delta$, is 0.5, then $16/0.5^2 = 64$ [...] the study requires only one group, then a total of [...]

Table [...]. Numerator for Sample Size Equation ( ); Two-Sided Alternative Hypothesis, Type I Error, $\alpha = 0.05$

Type II Error ($\beta$)    Power ($1 - \beta$)    Numerator for Sample Size Equation ( ): One Sample, Two Sample
[...]

[...] sample scenario will require 128 subjects, the one-sample scenario one-fourth of that number. This illustrates the rule that the two-sample scenario requires four times as many observations as the one-sample situation [...] two means have to be estimated, doubling the variance, and, additionally, requires two [...] population means, $\mu_0$ and $\mu_1$, with common variance, $\sigma^2$, is

$$n = \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{\left(\dfrac{\mu_0 - \mu_1}{\sigma}\right)^2}.$$

This equation is derived from equation ( ).
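As a rough illustration of the rule of thumb and the general formula above, the following Python sketch computes the per-group sample size both ways. The function names `lehr_n` and `exact_n` are hypothetical, introduced only for this example, and the inputs to `exact_n` (means 10 and 9, standard deviation 2) are made-up values chosen to give a standardized difference of 0.5.

```python
from statistics import NormalDist


def lehr_n(std_diff, two_sample=True):
    """Rule of thumb: n = 16 / Delta^2 per group (8 / Delta^2 for one sample)."""
    numerator = 16 if two_sample else 8
    return numerator / std_diff ** 2


def exact_n(mu0, mu1, sigma, alpha=0.05, beta=0.20):
    """n = 2 (z_{1-alpha/2} + z_{1-beta})^2 / ((mu0 - mu1) / sigma)^2 per group."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    numerator = 2 * (z(1 - alpha / 2) + z(1 - beta)) ** 2
    return numerator / ((mu0 - mu1) / sigma) ** 2


per_group = lehr_n(0.5)                      # 64.0 per group
total_two_sample = 2 * per_group             # 128.0 subjects in total
one_sample = lehr_n(0.5, two_sample=False)   # 32.0, one-fourth of 128

# Hypothetical inputs: means 10 and 9, sigma 2, so Delta = 0.5.
print(per_group, total_two_sample, one_sample, round(exact_n(10, 9, 2), 1))
```

With unrounded normal quantiles the general formula gives a value slightly below 64, which is why 16 works as a convenient rounded numerator.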

For $\alpha = 0.05$ and $\beta = 0.20$ the values of $z_{1-\alpha/2}$ and $z_{1-\beta}$ are 1.96 and 0.84, respectively; and $2(z_{1-\alpha/2} + z_{1-\beta})^2 = 15.68$, which can be rounded up to 16, [...] appears in Snedecor and Cochran (1980), the equation was suggested by Lehr (1992). The two key ingredients are the difference to be detected, $\delta = \mu_0 - \mu_1$, and the inherent variability of the observations indicated by $\sigma^2$. The numerator can be calculated for other values of Type I and Type II error. [...] frequently used to evaluate new drugs in Phase III clinical trials (usually double-blind comparisons of new drug with placebo or standard); see Lakatos (1998). One advantage of a [...] that it bases the inferences on con[...] most common sample size situations involve one or two [...] 8 for the one-sample case, this illustrates that the two-sample situation requires four times as many observations as the one [...] the researcher does not know the variability and cannot be led to an estimate, the discussion of sample size will have to be addressed in [...] lack of knowledge about variability of the measurements indicates that substantial education is [...] ( ) can be used to calculate detectable difference for a given sample size, $n$.

Inverting this equation gives

$$\Delta = \frac{4}{\sqrt{n}},$$

or

$$\mu_0 - \mu_1 = \frac{4\sigma}{\sqrt{n}}.$$

In words, the detectable standardized difference in the two-sample case is [...] (non-standardized) difference is [...] sample case the numerator 4 is replaced by 2, and the equation is interpreted as the detectable deviation from some parameter value [...] power and detectable differences for the case of Type I [...] figure also can be used for estimating sample sizes in connection with correlation, [...] (71). This rule of thumb, represented by equation ( ), is [...]

CALCULATING SAMPLE SIZE USING THE COEFFICIENT OF VARIATION

[...] consulting session:

"What kind of treatment effect are you anticipating?"
"Oh, I'm looking for a 20% change in the mean."
"Mmm, and how much variability is there in your observations?"
"About 30%."

The dialogue indicates how researchers frequently think about relative treatment effects and variability.

How to address this question? It turns out, fortuitously, [...]

[Figure: axes "n = number per group (two samples)" and "n = number in group (one sample)"; [...] hand side for one [...]]

[...] linked specifi[...],

$$\frac{\mu_0 - \mu_1}{\mu_0} = 1 - \frac{\mu_1}{\mu_0}.$$

The question is [...]:

$$n = \frac{16\,(\mathrm{CV})^2}{(\ln \mu_0 - \ln \mu_1)^2},$$

where CV is the coefficient of variation ($\mathrm{CV} = \sigma_0/\mu_0 = \sigma_1/\mu_1$).

Table [...]. Range of Coefficients of Variation and Percentage Change in Means: Two Sample Tests, Two-sided Alternatives with Type I [...]

Coefficient of          Percentage change in means
variation (%)        5      10      15      20      30      40      50
[...]
 15                137      33      14       8       3       2       1
 20                244      58      25      14       6       3       2
 30                548     130      55      29      12       6       3
 40                974     231      97      52      21      10       6
 50              >1000     361     152      81      32      16       9
 75              >1000     811     341     181      71      35      19
100              >1000   >1000     606     322     126      62      34

Illustration

For the situation described in the consulting session the ratio of the means is calculated to be $1 - 0.20 = 0.80$ and the sample size becomes

$$n = \frac{16\,(0.30)^2}{(\ln 0.80)^2} = 28.9 \approx 29.$$

[...] the treatment is to be compared with a standard, that is, only one group is needed, [...] specification of a coefficient of variation implies that the standard deviation is proportional [...] stabilize the variance a [...] it is shown that the variance in the log scale is approximately equal to the coefficient [...], to a first order approximation, [...] ( ) for values of CV ranging from 5% to 100% and values of PC ranging from 5% to 50%. [...] way to estimating sample sizes for the situation where the specifications are made in terms of percentage change and coefficient [...]

[Figure: percent change plotted against n = number per group (two samples).]

[...] means can be defined in two ways using either $\mu_0$ or $\mu_1$ in the denominator. Suppose we define $\bar{\mu} = (\mu_0 + \mu_1)/2$, that is, just the arithmetic average of the two means, and define the quantity

$$\mathrm{PC} = \frac{\mu_0 - \mu_1}{\bar{\mu}}.$$

Then the sample size is estimated remarkably accurately by

$$n = \frac{16\,\mathrm{CV}^2}{\mathrm{PC}^2}.$$
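The two coefficient-of-variation formulas above can be compared numerically. The sketch below is only an illustration under stated assumptions: the function names (`n_from_cv_log`, `pc_from_ratio`, `n_from_cv_pc`) are hypothetical, and the inputs are the 30% coefficient of variation and 20% change in the mean from the consulting dialogue.

```python
from math import log


def n_from_cv_log(cv, ratio):
    """n = 16 CV^2 / (ln mu0 - ln mu1)^2, where ratio = mu1 / mu0."""
    return 16 * cv ** 2 / log(ratio) ** 2


def pc_from_ratio(ratio):
    """PC = (mu0 - mu1) / mu_bar, with mu_bar the average of the two means."""
    return (1 - ratio) / ((1 + ratio) / 2)


def n_from_cv_pc(cv, pc):
    """n = 16 CV^2 / PC^2."""
    return 16 * cv ** 2 / pc ** 2


ratio = 1 - 0.20                                    # 20% change in the mean
n_log = n_from_cv_log(0.30, ratio)                  # about 28.9
n_pc = n_from_cv_pc(0.30, pc_from_ratio(ratio))     # about 29.2
print(round(n_log, 1), round(n_pc, 1))
```

For these inputs the two formulas differ by well under one subject per group, which is consistent with the claim that the percentage-change version is a close approximation.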

Sometimes the researcher will not have any idea about the variability. In biological systems a coefficient of variation of 35% is [...] handy rule for sample size is then,

$$n = \frac{2}{\mathrm{PC}^2}.$$

For example, under this scenario a 20% [...] good [...] (after some algebra) produces $n = 49$ [...] For additional discussion see van Belle and Martin (1993).

NO FINITE POPULATION CORRECTION FOR SURVEY SAMPLE SIZE

Introduction

Survey sampling questions are frequently addressed in terms of wanting to know a population proportion with a specifi[...] (which makes $z_{1-\beta} = 0$). The numerator for the two-sample situation then becomes 8, and 4 for the one-sample situation ( ). Survey sampling typically deals with a finite population of size $N$ with a [...] Specifically, a sample of size $n$ is taken without replacement from a population of size $N$, [...], $\bar{x}$ is

$$\mathrm{SE}(\bar{x}) = \sqrt{\frac{N - n}{nN}}\,\sigma.$$

Rule of Thumb

The finite population correction can be ignored in initial discussions of survey [...] taken without replacement from a [...] finite population correction leads to a [...] finite population correction can be written as

$$\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}\sqrt{1 - \frac{n}{N}}.$$

So the finite population correction, $\sqrt{1 - n/N}$, is a number less than one, and the square root operation pulls it [...] the sample is 10% of the population, the finite population correction is 0.95 or there is a 5% [...] the population is very large, as is frequently the case, the finite population correction [...], and the precision of the estimate of the population mean is proportional to $\sqrt{n}$ (for fixed standard deviation, $\sigma$).

[...] is frequently helpful to be able to estimate quickly the variability in a [...]

$$[\ldots\,(n-1)\,\ldots] \le s \le \frac{n}{n-1}\cdot\frac{\mathrm{Range}}{2}.$$

Illustration

Consider the following sample of eight observations: 44, 48, 52, 60, 61, 63, 66, [...] - 44 = 25. On the basis of this value the standard deviation is bracketed by

$$2.6 \le s \le 14.3.$$

The actual value for the standard deviation is [...]

Basis of the Rule

This rule is based on the following two [...], consider a sample of observations with range $\mathrm{Range} = x_{\text{maximum}} - x_{\text{minimum}}$. The largest possible value for the standard deviation occurs when half the observations are placed at $x_{\text{minimum}}$ and the other half at $x_{\text{maximum}}$.
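The claim about the largest possible standard deviation can be checked numerically. This is a minimal sketch, assuming an even sample size: the helper `max_sd_for_range` is a hypothetical name, and the sample values are invented for the example. Only the sample size (8) and range (25) are taken from the illustration.

```python
from statistics import stdev


def max_sd_for_range(n, value_range):
    """Sample s.d. when half of n observations (n even) sit at each extreme."""
    half = n // 2
    sample = [0.0] * half + [float(value_range)] * half
    return stdev(sample)


# Same n and range as the illustration, with half the points at each end.
extreme = max_sd_for_range(8, 25)

# A hypothetical sample with the same range but the other points in the middle.
clustered = stdev([0.0, 25.0, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5])

print(round(extreme, 2), round(clustered, 2))
```

Moving observations away from the two extremes, while keeping the range fixed, lowers the standard deviation, which is why the half-and-half arrangement gives the upper limit.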

