Transcription of Using Random Forest to Learn Imbalanced Data
Using Random Forest to Learn Imbalanced Data

Chao of Statistics, UC Berkeley
Andy Research, Merck Research Labs
Leo of Statistics, UC Berkeley

Abstract

In this paper we propose two ways to deal with the imbalanced data classification problem using random forest. One is based on cost-sensitive learning, and the other is based on a sampling technique. Performance metrics such as precision and recall, false positive rate and false negative rate, F-measure, and weighted accuracy are computed. Both methods are shown to improve the prediction accuracy of the minority class, and have favorable performance compared to the existing algorithms.

Introduction

Many practical classification problems are imbalanced; that is, at least one of the classes constitutes only a very small minority of the data. For such problems, the interest usually leans towards correct classification of the rare class (which we will refer to as the positive class). Examples of such problems include fraud detection, network intrusion, rare disease diagnosis, etc.
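The abstract names the two approaches (cost-sensitive learning and sampling) without giving their details in this excerpt. The following is only a rough sketch of what each might look like at the data level. It assumes a balanced bootstrap for the sampling approach (each tree sees equally many rows from every class) and inverse-frequency class weights for the cost-sensitive approach. Both choices, and the function names balanced_bootstrap and inverse_frequency_weights, are illustrative assumptions rather than the procedure defined in the paper.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree's training sample.

    Assumption: draw, with replacement, as many rows from each class
    as there are rows in the smallest class.
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


def inverse_frequency_weights(labels):
    """Return a weight per class that is larger for rarer classes.

    Assumption: weight = n_rows / (n_classes * class_count), a common
    heuristic; the source does not specify how weights are chosen.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Hypothetical data: 95 negatives (0) and 5 positives (1).
labels = [0] * 95 + [1] * 5
sample = balanced_bootstrap(labels, random.Random(0))
weights = inverse_frequency_weights(labels)
```

In this sketch the sampled indices would be used to grow a single tree, and the weights would scale each class's contribution when a tree evaluates splits or votes. How the paper actually applies them is not stated in this excerpt.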
We use metrics such as true negative rate, true positive rate, weighted accuracy, G-mean, precision, recall, and F-measure to evaluate the performance of learning algorithms on imbalanced data.
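As a minimal sketch, the metrics listed above can be computed from the four confusion-matrix counts, with the rare class treated as positive. The definitions used here are the standard ones; the weight beta in the weighted accuracy, its default of 0.5, and the function name imbalance_metrics are assumptions made for illustration.

```python
import math


def _ratio(num, den):
    # Return 0.0 instead of failing when a denominator is empty.
    return num / den if den else 0.0


def imbalance_metrics(tp, fn, fp, tn, beta=0.5):
    """Compute evaluation metrics from confusion-matrix counts.

    tp, fn: positives (rare class) predicted correctly / incorrectly
    fp, tn: negatives predicted incorrectly / correctly
    beta:   assumed weight on the true positive rate in weighted accuracy
    """
    tpr = _ratio(tp, tp + fn)          # true positive rate, same as recall
    tnr = _ratio(tn, tn + fp)          # true negative rate
    precision = _ratio(tp, tp + fp)
    return {
        "true_positive_rate": tpr,
        "true_negative_rate": tnr,
        "false_positive_rate": 1.0 - tnr,
        "false_negative_rate": 1.0 - tpr,
        "precision": precision,
        "recall": tpr,
        "g_mean": math.sqrt(tpr * tnr),
        "weighted_accuracy": beta * tpr + (1.0 - beta) * tnr,
        "f_measure": _ratio(2.0 * precision * tpr, precision + tpr),
        "accuracy": _ratio(tp + tn, tp + fn + fp + tn),
    }


# Hypothetical counts: 50 positives and 950 negatives.
metrics = imbalance_metrics(tp=40, fn=10, fp=60, tn=890)
```

With counts like these, overall accuracy is dominated by the majority class, while precision, recall, and F-measure describe performance on the rare class.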