Transcription of THE ELEMENTS OF QUANTITATIVE ANALYSIS KYLE GORMAN …
CHAPTER 11

KYLE GORMAN AND DANIEL EZRA JOHNSON

A sociolinguist who has gathered so much data that it has become difficult to make sense of the raw observations may turn to graphical presentation, and to descriptive statistics, techniques for distilling a collection of data into a few key numerical values, allowing the researcher to focus on specific, meaningful properties of the dataset (see Johnson in press). However, a sociolinguist is rarely satisfied with a mere snapshot of linguistic behavior, and desires not just to describe, but also to evaluate hypotheses about the connections between linguistic behavior, speakers, [...] ([...] Lucas, Bayley, & Valli 2001:43). A sociolinguist who suspects that women and men in a certain speech community differ in the rate at which they realize the final consonant of a word ending in <ing> with coronal [n] rather than velar [ŋ]
would collect tokens of these words in the speech of women and men, [...] in the form of a descriptive statistic or an appropriate graph, could suggest that women differ from men in the rate at which they use these competing variants, theset ech, [...] however, [...] a single interview makes up only a tiny fraction of any speaker's lifetime of language, [...] where there are always more possible subjects to run or stimuli to present, [...] it is always possible that the sample differs quantitatively from the population, [...] but the women in a sample, for instance, may not be representa[...] usually an observed difference, in the sample does extend to the population is called the alternative hypothesis, whereas the opposing view that there is no real differ[...] if a sociolinguist is interested in the association between gender and speech rate, then the null hypothesis is that speech rate is constant across genders, [...] ([...] a Z-score, t statistic, F statistic, or chi-square statistic), then compute the probability, henceforth the p-value, that a test statistic as large or larger would have occurred under the null hypothesis ([...] no difference in the population).
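As a rough illustration of the procedure just described, and not a method taken from this chapter, the following Python sketch computes a Z-score for the difference between two groups' rates of a binary variant, then the two-tailed p-value under the null hypothesis of no difference. The counts are hypothetical, and the function name `two_proportion_z_test` is an assumption introduced here for illustration.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-tailed p) for H0: both groups share one underlying rate."""
    rate_a = hits_a / n_a
    rate_b = hits_b / n_b
    # Under the null hypothesis, the best estimate of the common rate pools both groups.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    # Assumes the pooled rate is strictly between 0 and 1; otherwise std_err is zero.
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Probability of a |Z| at least this large under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: tokens realized with coronal [n] out of all <ing> tokens.
z, p = two_proportion_z_test(hits_a=60, n_a=100, hits_b=40, n_b=100)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value here says only that a difference this large would be unusual if the population rates were equal; it says nothing about how large the population difference is.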
Although this threshold is arbitrary, a result where p < [...] Sciences, [...] p < [...]lation. In the foregoing example, the alternative hypothesis only requires that there be some difference between groups, [...] as the label "significant" [...] generally with help from a computer, to calculate a test statistic and p-value from a set of data; [...] the contents of the sample are shaped by convenience factors, such as speakers [...] for instance, a researcher interested in stigmatized speech may unfortunately discover that low-prestige speakers are the least likely to agree to an interview with a stranger [...] the researcher may deploy proportional stratified sampling ([...] Cedergren 1973); if the population consists of middle-class speakers, who account for 25 percent of the population, and working-class speakers, accounting for the remaining 75 percent, the researcher ensures that this 1:3 ratio of middle- to working-class speakers (and tokens) [...] (Bayley 2002:118).
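The arithmetic behind proportional stratified sampling can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `proportional_allocation`, the sample size of 40 speakers, and the largest-remainder rounding rule are all introduced here and are not drawn from the studies cited above.

```python
def proportional_allocation(strata_shares, sample_size):
    """Split sample_size across strata in proportion to their population shares."""
    raw = {name: share * sample_size for name, share in strata_shares.items()}
    # Start from the whole-number part of each stratum's ideal quota.
    allocation = {name: int(quota) for name, quota in raw.items()}
    # Hand any leftover slots to the strata with the largest fractional remainders.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Population shares from the example in the text: 25 percent and 75 percent.
shares = {"middle class": 0.25, "working class": 0.75}
print(proportional_allocation(shares, sample_size=40))
```

With these shares, a hypothetical sample of 40 speakers splits 10 to 30, preserving the 1:3 ratio of the population.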
While it is in some sense impossible to include every predictor that might be relevant to the outcomes of interest, a statistical model is of little use for inferring a causal connection between predictors and outcomes if one or more important predictors have been omitted. For instance, [...] and finds that both are significant, [...] but when they are combined in the same regression model, only one of the two (e.g., phonological context) is significant; the other predictor (e.g., grammatical category) is said to have been suppressed (e.g., Tagliamonte & Temple 2005). Such a situation could arise if the two predictors are correlated, for example, if certain grammatical categories tend to co-occur with certain phonological contexts ([...]), but [...]
[...] predictors stand in a causal relationship with the outcome ([...] both phonological context and grammatical category increase rate of deletion), [...] "orthogonal," that is, [...] linear ([...] strongly nonorthogonal) [...] (2010) gives an example of a spurious sociolinguistic finding due to multicollinearity between measures of socioeconomic status, and demonstrates the method of residualization, [...] both in the field and the laboratory, to gather many data points from each speaker or subject, [...] it is necessary to distinguish between a gender effect in the population and the presence in the sample of a few speakers who just happen to be male and furthermore are "outliers" from the rest of the sample; [...] even after gender, age, and social status are taken into account (Guy 1980, 1991:5), speaker identity is a strong predictor of linguistic behavior, [...] etc.; every token from "Celeste S."
also has the same value for the gender predictor ("female"), age (45), etc. [...] whether predictors or outcomes, on a continuous or integer scale, but converts these values to a few-valued (often binary) [...] ([...] to treat data that are naturally many-valued as a few-valued scale) [...] it usually increases the chance of Type II error, the error of failing to reject the null hypothesis in the case when this null hypothesis is in fact false (Cohen 1983). If a researcher posits a sound change in progress in a speech community, then a 78-year-old speaker should be less advanced with respect to this change than a 60-year-old speaker, but if these two speakers are placed together into the "60 years of age and older" bin [...]
Binning usually requires the researcher to arbitrarily choose the number and location of the cutpoint(s) between bins, [...] "founder effect" of VARBRUL and its descendants, [...] it is incorrect to assume that VARBRUL's feature set delimits the set of possible sociolinguistic analyses, and the use of continuous predictors and/or outcomes in sociolinguistics dates back at least as far as Lennig's (1978) [...] and more specifically, [...] which a number of studies have found to be curvilinear, with interior social classes using the highest rates of a nonstandard variant of a stable linguistic variable (Labov 2001:3if.). In such cases, the appropriate response to this problem, though, is not ad hoc dichotomization, but rather for the researcher to explore the relationships observed in the data ([...] by plotting the predictor and outcome), and choosing appropriate "transformations" [...] the exemplar theory of lenition ([...] Bybee 2002) predicts a relationship between the logarithm of word frequency and the rate of lenition, [...] (2001:16-26) [...] (categorical [...] la, [...]).
The following section considers methods for continuous outcomes, with a focus on acoustic measurements of vowels. The concluding section discusses some recent trends in the field of statistics of relevance to sociolinguists.

METHODS FOR BINARY VARIABLES

Interpreting Cross-Tabulations

Many quantitative sociolinguistic studies compare two distinct discrete semantically equivalent variants in complementary distribution. [...] The [...] William Labov elicited tokens of the phrase "fourth floor" from employees in three Manhattan department stores for the purpose of studying the social stratification of post[...] (Labov 2006: chapter 4) first published in 1966, does not include any inferential statistics, the cross-tabulation of the data ([...]) lends itself to a simple stat[...] [...]'s pronounce post-vocalic r in 125 tokens, and do not in 211 tokens; r is present 37 percent of the time (= 125/336).
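The rate quoted above is simply one row of a cross-tabulation divided through by its total. A minimal sketch follows; the function name `variant_rates` and the row label are assumptions introduced here, and only the counts 125 and 211 come from the text.

```python
def variant_rates(crosstab):
    """Map each row label to the share of its tokens in which the variant is present."""
    return {
        label: present / (present + absent)
        for label, (present, absent) in crosstab.items()
    }

# Each row is (tokens with r present, tokens with r absent).
# The label is a placeholder; the counts are the ones reported in the text.
counts = {"store reported in the text": (125, 211)}
for label, rate in variant_rates(counts).items():
    print(f"{label}: {rate:.0%}")
```

Further rows can be added to `counts` in the same shape to compare rates across stores.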
[...] the department store representing the upper class, has a 48 percent rate of r[...] effect is due to chance, these counts are used to compute a test statistic called Pearson's chi-[...] [...]ability of a test statistic of this size or larger being obtained for a sample of this size simply by chance using the two-tailed chi-square distribution. The [...] representing this possibility is p = [...]E-16, [...] the null hypothesis that there are no differences in the [...] realization [...] the different department stores, and the average rates of r presence [...] that post-vocalic r is realized more often by speakers from [...]

Fisher's exact test

The chi-square test is not very appropriate for small amounts of data, since it is based on an approximation that is true under the obviously false assumption of an infinite sample.
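To make the chi-square computation under discussion concrete, here is a hedged Python sketch of Pearson's statistic for a table of stores by outcome. The function names are assumptions introduced here. Only the first row of the example table (125, 211) comes from the text; the other two rows are invented placeholders, so the printed values do not reproduce the chapter's result. The p-value helper uses the closed form exp(-x/2), which is valid only for 2 degrees of freedom, the case of three stores and two outcomes. The smallest expected count is returned as well, because small expected counts are where the approximation is least trustworthy.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, smallest expected cell count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if store and outcome were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, min_expected

def chi_square_p_value_2dof(statistic):
    """Upper-tail probability of the chi-square distribution, 2 degrees of freedom only."""
    return math.exp(-statistic / 2)

# Rows are stores; columns are (r present, r absent).
# Only the first row comes from the text; the other rows are invented placeholders.
table = [(125, 211), (80, 90), (20, 200)]
stat, dof, min_expected = pearson_chi_square(table)
assert dof == 2  # the closed-form p-value above applies only in this case
print(f"chi-square = {stat:.2f}, p = {chi_square_p_value_2dof(stat):.3g}")
print(f"smallest expected count = {min_expected:.1f}")
```

For tables with other degrees of freedom, a statistics library would be needed for the p-value; this sketch deliberately avoids claiming more generality than the closed form supports.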
[...] we favor a related technique known as Fisher's exact test, which computes the "exact" ([...]) [...] the Fisher p-value is somewhat smaller than the Pearson chi-square p-value ([...]). [...] value is often difficult to compute by hand, but since it can be computed for huge datasets by a modern computer in the blink of an eye, it should always be used rather than the chi[...] Labov feigned misunderstanding after the first "fourth floor," usually causing the speaker to repeat him- or herself, [...] Labov recorded whether each token comes from "fourth" or "floor." [...]; word and department store are significant predictors, [...] it is prefer[...] the p[...] which predicts binary outcome using one or more independent predictor(s), and which will be familiar to many readers as the model underlying VARBRUL, [...] the outcome is either r or zero; the predictors, all categorical, are word ("fourth" vs. [...]