Transcription of Correlation-based Feature Selection for Machine Learning
Department of Computer Science
Hamilton, New Zealand

Correlation-based Feature Selection for Machine Learning

Mark A. Hall

This thesis is submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy at The University of Waikato.

1999

© 1999 Mark A. Hall

Abstract

A central problem in machine learning is identifying a representative set of features from which to construct a classification model for a particular task. This thesis addresses the problem of feature selection for machine learning through a correlation-based approach. The central hypothesis is that good feature sets contain features that are highly correlated with the class, yet uncorrelated with each other. A feature evaluation formula, based on ideas from test theory, provides an operational definition of this hypothesis. CFS (Correlation-based Feature Selection) is an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy. CFS was evaluated by experiments on artificial and natural datasets.
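The evaluation formula itself is not reproduced in this excerpt. As a rough illustration of the hypothesis, the sketch below assumes the commonly cited form of the CFS merit heuristic, Merit = k * r_cf / sqrt(k + k(k - 1) * r_ff), where k is the number of features in the subset, r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation. The function name `cfs_merit` and its parameter names are hypothetical, and the correlations are assumed to lie in [0, 1]. The choice of correlation measure and the search strategy are left out.

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Score a feature subset from its size and two average correlations.

    Assumed form (not quoted from this excerpt):
        merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff)
    Both correlations are assumed to lie in [0, 1].
    """
    if k < 1:
        raise ValueError("a feature subset must contain at least one feature")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


if __name__ == "__main__":
    # Four features, each moderately predictive and mutually uncorrelated.
    print(cfs_merit(4, 0.5, 0.0))  # 1.0
    # The same four features, but fully redundant with one another.
    print(cfs_merit(4, 0.5, 1.0))  # 0.5
```

Under these assumptions, four uncorrelated features score twice as high as four fully redundant ones. The redundant subset scores the same as a single feature with the same class correlation, which is the behaviour the central hypothesis describes.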
Feature selection degraded machine learning performance in cases where some of the eliminated features were highly predictive of very small areas of the instance space. Further experiments compared CFS with a wrapper, a well-known approach to feature selection that employs the target learning algorithm to evaluate feature sets. In many cases