Transcription of Predicting Good Probabilities With Supervised Learning
Ithaca NY 14853

Abstract

We show that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1 yielding a [...] Bayes, which make unrealistic independence assumptions, push probabilities toward 0 [...] experiment with two ways of correcting the biased probabilities predicted by some learning meth[...] much data they [...], random forests, [...]

Introduction

In many applications it is important to predict well calibrated probabilities; good accuracy or area under the ROC curve [...]: SVMs, neural nets, decision trees, memory-based learning, bagged trees, random forests, boosted trees, boosted stumps, naive [...] show how maximum margin methods such as SVMs, boosted trees, and boosted stumps tend to push predicted probabilities away from 0 [...] predict and yields a [...] bayes have the opposite bias and tend to push predictions closer to 0 [...] such as bagged trees and neural nets have [...] (or lack of) characteristic to each learning method, we experiment with two [...]:

Platt Scaling: a method for transforming SVM outputs from [−∞, +∞] to posterior probabilities (Platt, 1999)
Isotonic Regression: the method used by Zadrozny and Elkan (2002; 2001) to calibrate predictions from boosted naive bayes, SVM, and decision tree models

Platt Scaling is most effective when the distortion in the predicted probabilities is [...] a more powerful calibration method that can correct any [...], this extra power comes at a [...] learning curve anal[...]

[...], Bonn, Germany, [...] (s)/owner(s).
[...] learning, bagged trees, random forests, boosted trees, boosted stumps, naive bayes and logistic regression. We show how maximum margin methods such as SVMs, boosted trees, and boosted stumps tend to push predicted probabilities away from 0 and 1. This hurts the quality of the probabilities they predict and yields a characteristic sigmoid-shaped
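
The two calibration methods named in the text, Platt Scaling and Isotonic Regression, can be illustrated with a small sketch. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names, the toy scores and labels, and the use of plain gradient descent for the sigmoid fit are all hypothetical choices made for brevity. Platt Scaling fits a sigmoid p = 1 / (1 + exp(A*f + B)) to raw model outputs f, while Isotonic Regression fits a non-decreasing step function, here via the pool-adjacent-violators procedure.

```python
import math


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit A, B in p = 1 / (1 + exp(A*f + B)) by minimizing the log loss.

    Plain gradient descent is an assumption made for brevity; Platt (1999)
    describes a more careful optimizer and smoothed targets.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # With z = A*f + B, the derivative of the log loss w.r.t. z is (y - p).
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_platt(a, b, f):
    """Map a raw score f to a calibrated probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(a * f + b))


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit of a non-decreasing step function.

    Returns (starts, values): the lowest score of each block and the block mean.
    Assumes distinct scores; tied scores would need pooling first.
    """
    blocks = []  # each block is [sum of labels, count, lowest score in block]
    for f, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, f])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c, _ = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    starts = [blk[2] for blk in blocks]
    values = [blk[0] / blk[1] for blk in blocks]
    return starts, values


def predict_isotonic(starts, values, f):
    """Return the value of the last block whose start is <= f."""
    out = values[0]
    for s, v in zip(starts, values):
        if f >= s:
            out = v
    return out


# Hypothetical toy data: raw model outputs and binary labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
starts, values = fit_isotonic(scores, labels)
```

In practice both maps would be fitted on a held-out calibration set rather than on the data used to train the model, so that the calibration step does not inherit the model's overfitting. The sigmoid has only two parameters, whereas the step function is free to take any monotonic shape, which makes it more flexible but more demanding of data.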