Transcription of THE ELEMENTS OF QUANTITATIVE ANALYSIS KYLE …
CHAPTER 11
KYLE GORMAN AND DANIEL EZRA JOHNSON

A sociolinguist who has gathered so much data that it has become difficult to make sense of the raw observations may turn to graphical presentation, and to descriptive statistics, techniques for distilling a collection of data into a few key numerical values, allowing the researcher to focus on specific, meaningful properties of the data set (see Johnson in press). However, a sociolinguist is rarely satisfied with a mere snapshot of linguistic behavior, and desires not just to describe, but also to evaluate hypotheses about the connections between linguistic behavior, speakers, […] (e.g., Lucas, Bayley, & Valli 2001: 43). A sociolinguist who suspects that women and men in a certain speech community differ in the rate at which they realize the final consonant of a word ending in <ing> with coronal [n] rather than velar [ŋ]
would collect tokens of these words in the speech of women and men, […] in the form of a descriptive statistic or an appropriate graph, could suggest that women differ from men in the rate at which they use these competing variants, […] however, […] a single interview makes up only a tiny fraction of any speaker's lifetime of language, […] where there are always more possible subjects to run or stimuli to present, […] it is always possible that the sample differs quantitatively from the population, […] but the women in a sample, for instance, may not be representative […] usually an observed difference, in the sample does extend to the population is called the alternative hypothesis, whereas the opposing view that there is no real difference […] if a sociolinguist is interested in the association between gender and speech rate, then the null hypothesis is that speech rate is constant across genders, […] (e.g., a Z-score, t statistic, F statistic, or chi-square statistic), then compute the probability, henceforth the p-value, that a test statistic as large or larger would have occurred under the null hypothesis (i.e., no difference in the population).
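The step from a test statistic to a p-value can be sketched in a few lines. The following is only an illustration, assuming a two-proportion Z test with a pooled standard error; the function names and the token counts are hypothetical and do not come from any study discussed here.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Z statistic for the difference between two sample proportions.

    k1/n1 and k2/n2 are the rates of one variant in two groups; the
    standard error is pooled, as is usual under the null hypothesis
    that both groups share a single underlying rate.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def two_sided_p_from_z(z):
    """Probability of a standard-normal statistic at least as extreme as z."""
    # For Z ~ N(0, 1), P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts: 60 of 200 tokens from women and 90 of 200 tokens
# from men show the variant of interest.
z = two_proportion_z(60, 200, 90, 200)
p = two_sided_p_from_z(z)
print(f"z = {z:.3f}, p = {p:.5f}")
```

A small p-value here says only that a difference this large would be unusual if the null hypothesis were true; it does not by itself say how large or how important the difference is.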
Although this threshold is arbitrary, a result where p < […] sciences, […] p < […]lation. In the foregoing example, the alternative hypothesis only requires that there be some difference between groups, […] as the label "significant" […] generally with help from a computer, to calculate a test statistic and p-value from a set of data; […] the contents of the sample are shaped by convenience factors, such as speakers' […] For instance, a researcher interested in stigmatized speech may unfortunately discover that low-prestige speakers are the least likely to agree to an interview with a stranger. […] the researcher may deploy proportional stratified sampling (e.g., Cedergren 1973); if the population consists of middle-class speakers, who account for 25 percent of the population, and working-class speakers, accounting for the remaining 75 percent, the researcher ensures that this 1:3 ratio of middle- to working-class speakers (and tokens) […] (Bayley 2002: 118).
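The arithmetic of proportional stratified sampling can be sketched as follows, assuming the 25/75 split of middle- and working-class speakers given above. The function name and the total sample size of 40 are hypothetical, and the largest-remainder rounding is simply one reasonable way to obtain whole numbers of speakers.

```python
def proportional_allocation(population_shares, total_sample):
    """Split total_sample across strata in proportion to population share.

    population_shares maps a stratum label to its share of the population
    (shares should sum to 1). Fractional slots are rounded by the
    largest-remainder method so the allocations sum to total_sample.
    """
    raw = {s: share * total_sample for s, share in population_shares.items()}
    alloc = {s: int(v) for s, v in raw.items()}
    leftover = total_sample - sum(alloc.values())
    # Hand the remaining slots to the strata with the largest fractions.
    by_remainder = sorted(raw, key=lambda s: raw[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


# Shares from the text; the sample size of 40 speakers is an assumption.
shares = {"middle": 0.25, "working": 0.75}
print(proportional_allocation(shares, 40))
```

With 40 speakers this yields 10 middle-class and 30 working-class speakers, preserving the 1:3 ratio of the population.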
While it is in some sense impossible to include every predictor that might be relevant to the outcomes of interest, a statistical model is of little use for inferring a causal connection between predictors and outcomes if one or more important predictors have been omitted. For instance, […] and finds that both are significant, […] but when they are combined in the same regression model, only one of the two (e.g., phonological context) is significant; the other predictor (e.g., grammatical category) is said to have been suppressed (e.g., Tagliamonte & Temple 2005). Such a situation could arise if the two predictors are correlated, for example, if certain grammatical categories tend to co-occur with certain phonological contexts ([…]), but […]
[…] predictors stand in a causal relationship with the outcome (e.g., both phonological context and grammatical category increase rate of deletion), […] orthogonal, that is, […]linear (i.e., strongly nonorthogonal) […] (2010) gives an example of a spurious sociolinguistic finding due to multicollinearity between measures of socioeconomic status, and demonstrates the method of residualization, […] both in the field and the laboratory, to gather many data points from each speaker or subject, […] it is necessary to distinguish between a gender effect in the population and the presence in the sample of a few speakers who just happen to be male and furthermore are "outliers" from the rest of the sample; […] even after gender, age, and social status are taken into account (Guy 1980, 1991: 5), speaker identity is a strong predictor of linguistic behavior, […] etc.; every token from Celeste S.
also has the same value for the gender predictor ("female"), age (45), etc. […] whether predictors or outcomes, on a continuous or integer scale, but converts these values to a few-valued (often binary) […] to treat data that are naturally many-valued as a few-valued scale […] it usually increases the chance of Type II error, the error of failing to reject the null hypothesis in the case when this null hypothesis is in fact false (Cohen 1983). If a researcher posits a sound change in progress in a speech community, then a 78-year-old speaker should be less advanced with respect to this change than a 60-year-old speaker, but if these two speakers are placed together into the "60 years of age and older" bin, […]: binning usually requires the researcher to arbitrarily choose the number and location of the cutpoint(s) between bins, […] "founder effect" of VARBRUL and its descendants, […] it is incorrect to assume that VARBRUL's feature set delimits the set of possible sociolinguistic analyses, and the use of continuous predictors and/or outcomes in sociolinguistics dates back at least as far as Lennig's (1978) […] and more specifically, […] which a number of studies have found to be curvilinear, with interior social classes using the highest rates of a nonstandard variant of a stable linguistic variable (Labov 2001: 3if.)
In such cases, the appropriate response to this problem, though, is not ad hoc dichotomization, but rather for the researcher to explore the relationships observed in the data (e.g., by plotting the predictor and outcome), and to choose appropriate "transformations" […] the exemplar theory of lenition (e.g., Bybee 2002) predicts a relationship between the logarithm of word frequency and the rate of lenition, […] (2001: 16-26) […] (categorical […]). The following section considers methods for continuous outcomes, with a focus on acoustic measurements of vowels. The concluding section discusses some recent trends in the field of statistics of relevance to sociolinguists.

METHODS FOR BINARY VARIABLES

Interpreting Cross-Tabulations

Many quantitative sociolinguistic studies compare two distinct, discrete, semantically equivalent variants in complementary distribution.
[…] William Labov elicited tokens of the phrase "fourth floor" from employees in three Manhattan department stores for the purpose of studying the social stratification of post[…] (Labov 2006: chapter 4), first published in 1966, does not include any inferential statistics, the cross-tabulation of the data ([…]) lends itself to a simple stat[…] […]s pronounce post-vocalic r in 125 tokens, and do not in 211 tokens; r is present about 37 percent of the time (= 125/336). […] the department store representing the upper class, has a 48 percent rate of r […] effect is due to chance, these counts are used to compute a test statistic called Pearson's chi-[…] probability of a test statistic of this size or larger being obtained for a sample of this size simply by chance using the two-tailed chi-square distribution. The […] representing this possibility is p = […]E-16, […] the null hypothesis that there are no differences in the realization […] different department stores, and the average rates of r presence […]
[…] that post-vocalic r is realized more often by speakers from […]

Fisher's exact test

The chi-square test is not very appropriate for small amounts of data, since it is based on an approximation that is true under the obviously false assumption of an infinite sample; […] we favor a related technique known as Fisher's exact test, which computes the "exact" […] the Fisher p-value is somewhat smaller than the Pearson chi-square p-value ([…]). […] value is often difficult to compute by hand, but since it can be computed for huge data sets by a modern computer in the blink of an eye, it should always be used rather than the chi-square […] Labov feigned misunderstanding after the first "fourth floor," usually causing the speaker to repeat him- or herself, […] Labov recorded whether each token comes from "fourth" or "floor."
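The two tests discussed in this section can be sketched with the Python standard library alone. This is a minimal illustration, not the procedure used to produce the figures reported above. The closed-form chi-square tail probabilities cover only 1 or 2 degrees of freedom, the two-sided Fisher p-value follows the common convention of summing all tables no more probable than the observed one, and every count is invented for the example.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value(stat, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum every table at most as probable as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Invented counts (rows: two groups; columns: r present, r absent).
toy = [[3, 1], [1, 3]]
stat, df = pearson_chi2(toy)
print("chi-square:", stat, "df:", df, "p:", chi2_p_value(stat, df))
print("Fisher exact p:", fisher_exact_2x2(3, 1, 1, 3))
```

With counts this small the two p-values differ noticeably, which is the situation in which the exact test matters most; with large counts they tend to agree closely.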
[…] word and department store are significant predictors, […] it is prefer[…] which predicts a binary outcome using one or more independent predictor(s), and which will be familiar to many readers as the model underlying VARBRUL, […] the outcome is either r or zero; the predictors, all categorical, are word ("fourth" vs. "floor"), repetition ([…]), and store ([…]). Modern regression software also allows the user to include what are generally called interaction effects, […]

[Table: (r) cross-tabulation, chi-square, and Fisher exact test. Rows include "fourth" and "floor"; columns include p-value (chi-square) and p-value (Fisher exact). Cell values were not recoverable.]

In this case, an interaction between word and department store allows the researcher to probe whether, in addition to any differences between "fourth" and "floor" and the different department stores, there is any difference in the difference between "fourth" and "floor" […] "fourth" versus "floor" at Saks different from "fourth" versus "floor" […]?
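The "difference in the difference" reading of an interaction can be made concrete on the log-odds scale, which is the scale on which logistic regression (the model underlying VARBRUL) operates. In a model with two binary predictors and their interaction, the fitted interaction term equals the difference between the two within-store log-odds differences, so it can be computed directly from cell counts. The sketch below assumes exactly that setting; the counts and the store label "OtherStore" are hypothetical.

```python
import math


def log_odds(r_count, zero_count):
    """Log odds of r versus zero for one cell of the cross-tabulation."""
    return math.log(r_count / zero_count)


def word_effect(counts, store):
    """Log-odds difference between "floor" and "fourth" within one store."""
    return (log_odds(*counts[(store, "floor")])
            - log_odds(*counts[(store, "fourth")]))


def interaction(counts, store_a, store_b):
    """Difference between the word effects of two stores (log-odds scale)."""
    return word_effect(counts, store_a) - word_effect(counts, store_b)


# Hypothetical (r, zero) token counts; not Labov's data.
counts = {
    ("Saks", "fourth"): (20, 40),
    ("Saks", "floor"): (30, 30),
    ("OtherStore", "fourth"): (10, 50),
    ("OtherStore", "floor"): (15, 45),
}

print("word effect at Saks:", word_effect(counts, "Saks"))
print("word effect at OtherStore:", word_effect(counts, "OtherStore"))
print("interaction:", interaction(counts, "Saks", "OtherStore"))
```

An interaction near zero would mean that the "fourth" versus "floor" contrast is about the same in both stores; judging whether a nonzero value is significant still requires fitting the regression and examining its test statistic.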