Example: bachelor of science

The Relationship Between Precision-Recall and ROC Curves

TheRelationshipBetween Precision-Recalland ROC of ComputerSciencesandDepartment of BiostatisticsandMedicalInformatics,Unive rsity ofWisconsin-Madison,1210 WestDaytonStreet,Madison,WI,53706 USAA bstractReceiverOperatorCharacteristic(RO C)curvesarecommonlyusedtopresent ,whendealingwithhighlyskewed datasets, Precision-Recall (PR) Curves give a moreinformative pictureof an algorithm' show thata deepconnectionexistsbetweenROCspaceandPR space,such thata curve dominatesinROCspaceif andonlyif it corollaryis thenotionofanachievablePRcurve, which hasproper-tiesmuch like theconvex hullin ROCspace;we show ane cient , we alsonotedi erencesin thetwo types of Curves aresigni cant example,in PRspaceit is incorrectto ,algorithmsthatopti-mizetheareaundertheR OCcurve IntroductionIn machinelearning,current research hasshiftedawayfromsimplypresentingaccura cyresultswhenperform-inganempiricalvalid ationof al.

as functions that act on the underlying confusion ma-trix which de nes a point in either ROC space or PR space. Thus, givenaconfusionmatrixA, RECALL(A) returns the Recall associated with A. 3. Relationship between ROC Space and PR Space ROC and PR curves are typically generated to evalu-ate the performance of a machine learning algorithm on ...

Tags:

  Trix, Ma trix

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Transcription of The Relationship Between Precision-Recall and ROC Curves

1 TheRelationshipBetween Precision-Recalland ROC of ComputerSciencesandDepartment of BiostatisticsandMedicalInformatics,Unive rsity ofWisconsin-Madison,1210 WestDaytonStreet,Madison,WI,53706 USAA bstractReceiverOperatorCharacteristic(RO C)curvesarecommonlyusedtopresent ,whendealingwithhighlyskewed datasets, Precision-Recall (PR) Curves give a moreinformative pictureof an algorithm' show thata deepconnectionexistsbetweenROCspaceandPR space,such thata curve dominatesinROCspaceif andonlyif it corollaryis thenotionofanachievablePRcurve, which hasproper-tiesmuch like theconvex hullin ROCspace;we show ane cient , we alsonotedi erencesin thetwo types of Curves aresigni cant example,in PRspaceit is incorrectto ,algorithmsthatopti-mizetheareaundertheR OCcurve IntroductionIn machinelearning,current research hasshiftedawayfromsimplypresentingaccura cyresultswhenperform-inganempiricalvalid ationof al.

2 (1998)havearguedthatsimplyusingaccuracyr esultscanbe useReceiver OperatorCharacter-istic(ROC) Curves ,which show how thenumber of cor-rectlyclassi edpositive examplesvarieswiththenum-ber of incorrectlyclassi ednegative ,ROCcurves canpresent an overlyoptimisticviewof analgorithm'sperformanceif thereis a largeskewAppearinginProceedingsof the23rdInternationalCon-ference on MachineLearning, Pittsburgh,PA, 2006by theauthor(s)/owner(s).in (2000;2004)have recommendedusingcostcurves to areanexcellent alternative toROCcurves,butdiscussingthemis beyondthescopeof (PR) Curves ,oftenusedin Informa-tionRetrieval (Manning& Schutze,1999;Raghavanet al.,1989),have beencitedas analternative to ROCcurves fortaskswitha largeskewin theclassdis-tribution(Bockhorst& Craven,2005;Bunescuet al.)

3 ,2004;Daviset al.,2005;Goadrich et al.,2004;Kok&Domingos,2005;Singla& Domingos,2005).Anim-portant di erencebetweenROCspaceandPRspaceis thevisualrepresentationof erencesbetweenalgo-rithmsthatarenotappar ent in andPRcurves areshownin Figures1(a)and1(b)respectively. Thesecurves,takenfromthesamelearnedmodel sona highly-skewed cancerdetec-tiondataset,highlight thevisualdi erencebetweenthesespaces(Daviset al.,2005).Thegoalin ROCspaceis to be in theupper-left-handcorner,andwhenonelooks at theROCcurves in Figure1(a)theyap-peartobe to be in theupper-right-handcorner,andthePRcurves in Figure1(b)show thatthereis thealgorithmsappearto be com-parablein ROCspace,however,in PRspacewe canseethatAlgorithm2 hasa clearadvantageover Algo-rithm1.

4 Thisdi erenceexistsbecausein thisdomainthenumber of negative examplesgreatlyexceedsthenumber of positives , a largechangein thenumber of falsepositives canleadto asmallchangein thefalsepositive rateusedin ,ontheotherhand,by comparingfalsepositives to truepositives ratherthantrueneg-atives,capturesthee ectof thelargenumber of neg-ative examplesonthealgorithm' de believe it is important to studytheconnectionbe-TheRelationshipBetw eenPrecision-RecallandROCC urves 0 1 0 1 True Positive RateFalse Positive RateAlgorithm 1 Algorithm 2(a)Comparisonin ROCspace 0 1 0 1 PrecisionRecallAlgorithm 1 Algorithm 2(b)Comparisonin erencebetweencomparingalgorithmsin ROCvs PRspacetweenthesetwo spaces,andwhethersomeof thein-terestingpropertiesof show thatforany dataset,andhencea xednumber of positive andnegative examples,theROCcurve andPRcurve fora givenalgorithmcon-tainthe\samepoints.

5 "ThereforethePRcurves forAlgorithmI andAlgorithmII in Figure1(b)are,in asensethatwe formallyde ne,equivalent to theROCcurves forAlgorithmI andAlgorithmII, respectivelyin Figure1(a).Basedon thisequivalenceforROCandPRcurves,we show thata curve dominatesin ROCspaceif andonlyif it dominatesin ,we introducethePRspaceanalogto theconvexhullin ROCspace,which we calltheachievablePRcurve. We show thatdueto theequivalenceof thesetwo spaceswe cane demonstratethatin PRspaceit is insu cient to , we show thatanalgorithmthatoptimizestheareaunder theROCcurve is notguaranteedto Reviewof ROC and Precision-RecallIna binarydecisionproblem,a classi erlabelsex-amplesaseitherpositive theclassi ercanbe representedin a struc-tureknownas a confusionmatrixor :Truepositives (TP)areexamplescorrectlylabeledas (FP)referto negative (TN)correspondto negatives correctlylabeledas , falsenegatives (FN)referto positive examplesincorrectlylabeledas confusionmatrixis shownin Figure2(a).

6 Thecon-fusionmatrixcanbe usedto constructa point in eitherROCspaceor theconfusionmatrix,we areableto de nethemetricsusedin each spaceas in Figure2(b).In ROCspace,oneplotstheFalsePositive Rate(FPR)onthex-axisandtheTruePos-itive Rate(TPR) negative examplesthataremisclassi- , thesameas TPR,whereasPre-cisionmeasuresthatfractio nof examplesclassi edaspositive thataretrulypositive. Figure2(b)gives thede nitionsforeach willtreatthemetricsas functionsthatactontheunderlyingconfusion ma-trixwhich de nesa point in eitherROCspaceor ,given a confusionmatrixA, RECALL(A) Relationshipbetween ROC Spaceand PR SpaceROCandPRcurves aretypicallygeneratedto evalu-atetheperformanceof a machinelearningalgorithmon a given datasetcontainsa xednum-ber of positive andnegative show herethatthereexistsa givendatasetof positiveandnegativeexamples,there existsa one-to-onecorrespon-dence Between a curvein ROCspace anda curvein PRspace, suchthatthecurvescontainexactlythesameco nfusionmatrices,if Recall6= PF PpredictednegativeF NT N(a)

7 ConfusionMatrixRecall=T PT P+F NPrecision=T PT P+F PTruePositive Rate=T PT P+F NFalsePositive Rate=F PF P+T N(b)De nitionsof point inROCspacede nesauniqueconfusionmatrixwhenthedataseti s PRspacewe ignoreT N, onemight worrythateach point may correspondto ,witha xednumber of positiveandnegative examples,giventheotherthreeentriesin a matrix,T Nis Recall=0, we areunableto recoverF P, andthus cannot nda , we have a one-to-onemappingbetweenconfusionmatrice sandpoints in alsohave a one-to-onemappingbetweenpoints (each de nedby a confusionmatrix)in ROCspaceandPRspace;hence,we cantranslatea curvein ROCspaceto de nitionwe needforournexttheoremis thenotionthatonecurve dominatesanothercurve,\meaningthatall arebeneathit or equalto it (Provostet al.)

8 ,1998)." a xed number of positiveandneg-ativeexamples,onecurvedom inatesa secondcurveinROCspace if andonlyif the rstdominatesthesecondin Precision-Recall ()): If a curve dominatesin ROCspacethenit dominatesin PR space. Proof have curve I andcurve II(asshownin Figure3)such thatcurve I dominatesin ROCspace,yet,oncewe translatethesecurves inPRspace,curve I does notdominatein PRspace,thereexistssomepointAoncurve II such thatthepointBoncurveI withidenticalRecallhaslower ,P RECISION(A)>P RECISION(B)yetRECALL(A)=RECALL(B).SinceR ECALL(A) =RECALL(B) andRecallis iden-ticaltoT P R, we have thatT P R(A) =T P R(B).Sincecurve I dominatescurve II inROCspaceF P R(A) F P R(B).Rememberthattotalpos-itivesandtotal negativesare xedandsinceT P R(A) =T P R(B):T P R(A) =T PATotalPositivesT P R(B) =T PBTotalPositiveswe now haveT PA=T PBandthusdenotebothasT P.

9 Remember thatF P R(A) F P R(B) andF P R(A) =F PATotalNegativesF P R(B) =F PBTotalNegativesThisimpliesthatF PA F PBbecauseP RECISION(A) =T PF PA+T PP RECISION(B) =T PF PB+T PwenowhavethatP RECISION(A) P RECISION(B).Butthiscontradictsourorigina lassumptionthatP RECISION(A)>P RECISION(B).Claim2 ((): If a curve dominatesin PR spacethenit dominatesin ROC by have curve I andcurve II (asshownin Figure4) such thatcurve I dominatescurve IIin PRspace,butoncetranslatedin ROCspacecurve I does notdominatein ROCspace,thereexistssomepointAoncurve IIsuch thatpointBoncurve I withidenticalT P RyetF P R(A)< T P R(B). SinceRECALLandT P Rarethesame,we getthatRECALL(A) =RECALL(B).Becausecurve I dominatesin PRspacewe know thatP RECISION(A) P RECISION(B).)

10 RememberTheRelationshipBetweenPrecision- RecallandROCC urves 0 1 0 1 True Positive RateFalse Positive Rate A BCurve ICurve II(a)Case1:F P R(A)> F P R(B) 0 1 0 1 True Positive RateFalse Positive Rate B = ACurve ICurve II(b)Case2:F P R(A) =F P R(B) casesforClaim1 of 0 1 0 1 PrecisionRecall B ACurve ICurve II(a)Case1:P RECISION(A)<P RECISION(B) 0 1 0 1 PrecisionRecallB = ACurve ICurve II(b)Case2:P RECISION(A) =P RECISION(B) casesof Claim2 of (A) =RECALL(B) andRECALL(A) =T PATotalPositivesRECALL(B) =T PBTotalPositivesWe know thatT PA=T PB, sowe willnow denotethemsimplyasT P. BecauseP RECISION(A) P RECISION(B) andP RECISION(A) =T PT P+F PAP RECISION(B) =T PT P+F PBwe ndthatF PA F PB.


Related search queries