
Machine Learning and Data Mining Lecture Notes

CSC 411 / CSC D11
Computer Science Department, University of Toronto
Version: February 6, 2012
Copyright © 2010 Aaron Hertzmann and David Fleet

Contents

Conventions and Notation
1 Introduction to Machine Learning: types of machine learning; a simple problem
2 Linear Regression: the 1D case; multidimensional inputs; multidimensional outputs
3 Nonlinear Regression: basis function regression; overfitting and regularization; neural networks; K-nearest neighbors
4 Quadratics: optimizing a quadratic
5 Basic Probability Theory: logic; basic definitions and rules; random variables; Binomial and Multinomial distributions; expectation
6 Probability Density Functions (PDFs): expectation, mean, and variance; uniform distributions; the Gaussian distribution; estimating a binomial distribution; Bayes' Rule





7 Estimation: MAP, ML, and Bayes' estimates; learning Gaussians
8 Classification: class conditionals; logistic regression; neural networks; K-nearest neighbors classification; generative vs. discriminative models; classification by LS regression; naive Bayes; discrete input features
9 Gradient Descent: finite differences
10 Cross Validation
11 Bayesian Methods: Bayesian regression; hyperparameters; Bayesian model selection
12 Monte Carlo Methods: sampling Gaussians; importance sampling; Markov Chain Monte Carlo (MCMC)
13 Principal Components Analysis: the model and learning; reconstruction; properties of PCA; whitening; modeling; probabilistic PCA

14 Lagrange Multipliers: examples; least-squares PCA in one dimension; multiple constraints; inequality constraints
15 Clustering: K-means clustering; mixtures of Gaussians; EM learning; numerical issues; the free energy; proofs; relation to K-means; degeneracy; determining the number of clusters
16 Hidden Markov Models: Markov models; hidden Markov models; the Viterbi algorithm; the forward-backward algorithm; EM (the Baum-Welch algorithm); numerical issues and renormalization; free energy; most likely state sequences
17 Support Vector Machines: maximizing the margin; slack variables for non-separable datasets

17 Support Vector Machines (continued): loss functions; the Lagrangian and the kernel trick; choosing parameters; software
18 AdaBoost: decision stumps; why does it work?; early stopping

Acknowledgements

Graham Taylor and James Martens assisted with preparation of these notes.

Conventions and Notation

Scalars are written with lower-case italics, e.g., x. Column vectors are written in bold lower-case: x. Matrices are written in bold upper-case. The set of real numbers is represented by R; N-dimensional Euclidean space is written R^N. Text in aside boxes provides extra background or information that you are not required to know for this course.

1 Introduction to Machine Learning

Machine Learning is a set of tools that, broadly speaking, allow us to teach computers how to perform tasks by providing examples of how they should be done.

For example, suppose we wish to write a program to distinguish between valid email messages and unwanted spam. We could try to write a set of simple rules, for example, flagging messages that contain certain features (such as the word "viagra" or obviously-fake headers). However, writing rules to accurately distinguish which text is valid can actually be quite difficult to do well, resulting either in many missed spam messages or, worse, many lost emails. Worse, the spammers will actively adjust the way they send spam in order to trick these strategies (e.g., writing "vi@gr@"). Writing effective rules and keeping them up-to-date quickly becomes an insurmountable task. Fortunately, machine learning has provided a solution. Modern spam filters are learned from examples: we provide the learning algorithm with example emails which we have manually labeled as "ham" (valid email) or "spam" (unwanted email), and the algorithm learns to distinguish between them automatically.

Machine Learning is a diverse and exciting field, and there are multiple ways of defining it:

1. The Artificial Intelligence View. Learning is central to human knowledge and intelligence, and, likewise, it is also essential for building intelligent machines.

Years of effort in AI has shown that trying to build intelligent computers by programming all the rules cannot be done; automatic learning is crucial. For example, we humans are not born with the ability to understand language; we learn it, and it makes sense to try to have computers learn language instead of trying to program it all in by hand.

2. The Software Engineering View. Machine learning allows us to program computers by example, which can be easier than writing code the traditional way.

3. The Stats View. Machine learning is the marriage of computer science and statistics: computational techniques are applied to statistical problems. Machine learning has been applied to a vast number of problems in many contexts, beyond the typical statistics problems. Machine learning is often designed with different considerations than statistics (e.g., speed is often more important than accuracy).

Often, machine learning methods are broken into two phases:

Training: A model is learned from a collection of training data.
Application: The model is used to make decisions about some new test data.

For example, in the spam filtering case, the training data constitutes email messages labeled as ham or spam, and each new email message that we receive (and which we must classify) is test data. However, there are other ways in which machine learning is used as well.

1.1 Types of Machine Learning

Some of the main types of machine learning are:

1. Supervised Learning, in which the training data is labeled with the correct answers, e.g., "spam" or "ham." The two most common types of supervised learning are classification (where the outputs are discrete labels, as in spam filtering) and regression (where the outputs are real-valued).

2. Unsupervised Learning, in which we are given a collection of unlabeled data, which we wish to analyze and discover patterns within.
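The two phases can be made concrete with a tiny spam filter. This is a minimal naive-Bayes-style sketch with made-up toy data, not code from these notes: the training phase counts words in manually labeled emails, and the application phase scores a new, unseen message.

```python
from collections import Counter
import math

def train_filter(emails, labels):
    """Training phase: count word occurrences per class in labeled emails."""
    counts = {"ham": Counter(), "spam": Counter()}
    priors = Counter(labels)
    for text, label in zip(emails, labels):
        counts[label].update(text.lower().split())
    return counts, priors

def classify(text, counts, priors):
    """Application phase: pick the label with the higher log-posterior,
    using add-one smoothing so unseen words do not zero out a class."""
    vocab = set(counts["ham"]) | set(counts["spam"])
    best_label, best_score = None, float("-inf")
    for label in ("ham", "spam"):
        total = sum(counts[label].values())
        score = math.log(priors[label])
        for word in text.lower().split():
            score += math.log((counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training data: emails manually labeled as ham or spam.
emails = ["cheap viagra now", "meeting at noon", "win money fast", "lunch tomorrow"]
labels = ["spam", "ham", "spam", "ham"]
counts, priors = train_filter(emails, labels)

# New test emails the filter has never seen.
print(classify("cheap money", counts, priors))    # -> spam
print(classify("lunch at noon", counts, priors))  # -> ham
```

Note that the "model" here is just word counts; real filters use richer features, but the training/test split is the same.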

The two most important examples are dimension reduction and clustering.

3. Reinforcement Learning, in which an agent (e.g., a robot or controller) seeks to learn the optimal actions to take based on the outcomes of past actions.

There are many other types of machine learning as well, for example: semi-supervised learning, in which only a subset of the training data is labeled; time-series forecasting, such as in financial markets; anomaly detection, such as used for fault-detection in factories and in surveillance; active learning, in which obtaining data is expensive, and so an algorithm must determine which training data to acquire; and many others.

1.2 A simple problem

Figure 1 shows a 1D regression problem. The goal is to fit a 1D curve to a few points. Which curve is best to fit these points? There are infinitely many curves that fit the data, and, because the data might be noisy, we might not even want to fit the data precisely. Hence, machine learning requires that we make certain choices:

1. How do we parameterize the model we fit? For the example in Figure 1, how do we parameterize the curve; should we try to explain the data with a linear function, a quadratic, or a sinusoidal curve?

2. What criteria (i.e., objective function) do we use to judge the quality of the fit? For example, when fitting a curve to noisy data, it is common to measure the quality of the fit in terms of the squared error between the data we are given and the fitted curve. When minimizing the squared error, the resulting fit is usually called a least-squares estimate.

3. Some types of models and some model parameters can be very expensive to optimize well. How long are we willing to wait for a solution, or can we use approximations (or hand-tuning) instead?

4. Ideally we want to find a model that will provide useful predictions in future situations.

That is, although we might learn a model from training data, we ultimately care about how well it works on future test data. When a model fits training data well but performs poorly on test data, we say that the model has overfit the training data; i.e., the model has fit properties of the input that are not particularly relevant to the task at hand (e.g., Figure 1, top row and bottom left). Such properties are referred to as noise. When this happens we say that the model does not generalize well to the test data; rather, it produces predictions on the test data that are much less accurate than you might have hoped for given the fit to the training data.

Machine learning provides a wide selection of options by which to answer these questions, along with the vast experience of the community as to which methods tend to be successful on a particular class of data-set. Some more advanced methods provide ways of automating some of these choices, such as automatically selecting between alternative models, and there is some beautiful theory that assists in gaining a deeper understanding of learning.
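The choices above, and the gap between training and test error, can be seen in a small numpy sketch. The sinusoidal data, noise level, and polynomial degrees here are illustrative assumptions, not taken from the notes; the criterion is the squared error from choice 2:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy 1D samples of a smooth underlying curve (illustrative stand-in for Figure 1).
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.2, x_test.size)

def squared_error(coeffs, x, y):
    """Sum of squared residuals between the fitted curve and the data."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    errors[degree] = (squared_error(coeffs, x_train, y_train),
                      squared_error(coeffs, x_test, y_test))
    print(f"degree {degree}: train {errors[degree][0]:.4f}, test {errors[degree][1]:.4f}")
```

Raising the degree always reduces the training error (a degree-9 polynomial can pass through all 10 points), but past some point the test error typically grows: the model has overfit.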

