
Foundations of Machine Learning

Foundations of Machine Learning
Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar

Adaptive Computation and Machine Learning
Thomas Dietterich, Editor
Christopher Bishop, David Heckerman, Michael Jordan, and Michael Kearns, Associate Editors
A complete list of books published in the Adaptive Computation and Machine Learning series appears at the back of this book.

The MIT Press
Cambridge, Massachusetts
London, England

© 2012 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA.

This book was set in LaTeX by the authors.



Transcription of Foundations of Machine Learning


Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data
Mohri, Mehryar.
Foundations of machine learning / Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.
p. cm. - (Adaptive computation and machine learning series)
Includes bibliographical references and index.
ISBN 978-0-262-01825-8 (hardcover : alk. paper) 1. Machine learning. 2. Computer algorithms. I. Rostamizadeh, Afshin. II. Talwalkar, Ameet. III. Title.
10 9 8 7 6 5 4 3 2 1

Contents

Preface
1 Introduction
2 The PAC Learning Framework
3 Rademacher Complexity and VC-Dimension
4 Support Vector Machines
5 Kernel Methods
6 Boosting
7 On-Line Learning
8 Multi-Class Classification
9 Ranking
10 Regression
11 Algorithmic Stability
12 Dimensionality Reduction
13 Learning Automata and Languages
14 Reinforcement Learning
Conclusion
A Linear Algebra Review
B Convex Optimization
C Probability Review
D Concentration Inequalities
E Notation

Preface

This book is a general introduction to machine learning that can serve as a textbook for students and researchers in the field.

It covers fundamental modern topics in machine learning while providing the theoretical basis and conceptual tools needed for the discussion and justification of algorithms. It also describes several key aspects of the application of these algorithms. We have aimed to present the most novel theoretical tools and concepts while giving concise proofs, even for relatively advanced results. In general, whenever possible, we have chosen to favor succinctness. Nevertheless, we discuss some crucial complex topics arising in machine learning and highlight several open research questions. Certain topics often merged with others or treated with insufficient attention are discussed separately here and with more emphasis: for example, a different chapter is reserved for multi-class classification, ranking, and regression. While we cover a very wide variety of important topics in machine learning, we have chosen to omit a few important ones, including graphical models and neural networks, both for the sake of brevity and because of the current lack of solid theoretical guarantees for some of them. This book is intended for students and researchers in machine learning, statistics and other related areas.

It can be used as a textbook for both graduate and advanced undergraduate classes in machine learning or as a reference text for a research seminar. The first three chapters of the book lay the theoretical foundation for the subsequent material. Other chapters are mostly self-contained, with the exception of chapter 5, which introduces some concepts that are extensively used in later ones. Each chapter concludes with a series of exercises, with full solutions presented separately. The reader is assumed to be familiar with basic concepts in linear algebra, probability, and analysis of algorithms. However, to further help the reader, we present in the appendix a concise linear algebra and probability review, and a short introduction to convex optimization. We have also collected in the appendix a number of useful tools for concentration bounds used in this book. To our knowledge, there is no single textbook covering all of the material presented here.

The need for a unified presentation has been pointed out to us every year by our machine learning students. There are several good books for various specialized areas, but these books do not include a discussion of other fundamental topics in a general manner. For example, books about kernel methods do not include a discussion of other fundamental topics such as boosting, ranking, reinforcement learning, learning automata or online learning. There also exist more general machine learning books, but the theoretical foundation of our book and our emphasis on proofs make our presentation quite distinct. This book is based on the course (Foundations of Machine Learning) taught by the first author at the Courant Institute of Mathematical Sciences in New York University over the last seven years. This book has considerably benefited from the comments and suggestions from students in these classes, along with those of many friends, colleagues and researchers to whom we are deeply indebted. We are particularly grateful to Corinna Cortes and Yishay Mansour, who have both made a number of key suggestions for the design and organization of the material presented, with detailed comments that we have fully taken into account and that have greatly improved the presentation.

We are also grateful to Yishay Mansour for using a preliminary version of the book for teaching and for reporting his feedback to us. We also thank, for discussions, suggested improvements, and contributions of many kinds, the following colleagues and friends from academic and corporate research laboratories: Cyril Allauzen, Stephen Boyd, Spencer Greenberg, Lisa Hellerstein, Sanjiv Kumar, Ryan McDonald, Andrés Muñoz Medina, Tyler Neylon, Peter Norvig, Fernando Pereira, Maria Pershina, Ashish Rastogi, Michael Riley, Umar Syed, Csaba Szepesvári, Eugene Weinstein, and Jason Weston. Finally, we thank the MIT Press publication team for their help and support in the development of this book.

1 Introduction

Machine learning can be broadly defined as computational methods using experience to improve performance or to make accurate predictions. Here, experience refers to the past information available to the learner, which typically takes the form of electronic data collected and made available for analysis.

This data could be in the form of digitized human-labeled training sets, or other types of information obtained via interaction with the environment. In all cases, its quality and size are crucial to the success of the predictions made by the learner.

Machine learning consists of designing efficient and accurate prediction algorithms. As in other areas of computer science, some critical measures of the quality of these algorithms are their time and space complexity. But, in machine learning, we will need additionally a notion of sample complexity to evaluate the sample size required for the algorithm to learn a family of concepts. More generally, theoretical learning guarantees for an algorithm depend on the complexity of the concept classes considered and the size of the training sample. Since the success of a learning algorithm depends on the data used, machine learning is inherently related to data analysis and statistics. More generally, learning techniques are data-driven methods combining fundamental concepts in computer science with ideas from statistics, probability and optimization.

1.1 Applications and problems

Learning algorithms have been successfully deployed in a variety of applications, including:

- Text or document classification, e.g., spam detection;
- Natural language processing, e.g., morphological analysis, part-of-speech tagging, statistical parsing, named-entity recognition;
- Speech recognition, speech synthesis, speaker verification;
- Optical character recognition (OCR);
- Computational biology applications, e.g., protein function or structured prediction;
- Computer vision tasks, e.g., image recognition, face detection;
- Fraud detection (credit card, telephone) and network intrusion;
- Games, e.g., chess, backgammon;
- Unassisted vehicle control (robots, navigation);
- Medical diagnosis.
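The notion of sample complexity mentioned above can be made concrete with a toy sketch (not from the book): a learner for a one-dimensional threshold concept, where the target labels a point as positive iff it lies at or above an assumed threshold of 0.5, and the learner simply returns the smallest positive example it has observed. All names and the setup here are illustrative assumptions.

```python
import random

# Toy illustration: sample complexity for a 1-D threshold concept.
# The target concept labels x as 1 iff x >= TRUE_THRESHOLD; the learner
# returns the smallest positively labeled point it has seen so far.

TRUE_THRESHOLD = 0.5

def label(x):
    return 1 if x >= TRUE_THRESHOLD else 0

def train(sample):
    """Consistent learner: threshold at the smallest positive example."""
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else 1.0

def generalization_error(threshold):
    # Under the uniform distribution on [0, 1], the learner errs exactly
    # on the interval between the learned and the true threshold.
    return abs(threshold - TRUE_THRESHOLD)

random.seed(0)
for m in (10, 100, 1000):
    sample = [(x, label(x)) for x in (random.random() for _ in range(m))]
    print(f"m = {m:4d}  error = {generalization_error(train(sample)):.4f}")
```

Running the loop shows the generalization error shrinking as the sample size m grows, which is exactly the dependence on training-sample size that theoretical learning guarantees quantify.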

