
Machine Learning - Introduction to Machine Learning

Machine Learning
Introduction to Machine Learning
Marek Petrik
January 26, 2017

Some of the figures in this presentation are taken from "An Introduction to Statistical Learning, with applications in R" (Springer, 2013) with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani

What is Machine Learning?
Arthur Samuel (1959, IBM): Field of study that gives computers the ability to learn without being explicitly programmed

The rise of Machine learning
ICML: International Conference on Machine Learning
2009: 500 attendees
2015: 1600 attendees
2016: 3300 attendees

Data is everywhere!



IBM Watson: Computers Beat Humans in Jeopardy
(Image: fair use)

Computers Beat Humans in Go
(Photograph by Saran Poroong, Getty Images/iStockphoto)

Personalized Product Recommendations
Online retailers mine purchase history to make recommendations

Identifying Crops from Space
Identify the type of crop / vegetation / urban space from satellite observations

Other Applications
1. Health-care: Identify risks of getting a disease
2. Health-care: Predict effectiveness of a treatment
3. Detect spam in emails
4. Recognize hand-written text
5. Speech recognition (speech to text)
6. Machine translation
7. Predict probability of an employee leaving
Activity: any other applications?

This Course: Introduction to Machine Learning
- Build a foundation for practice and research in ML
- Basic machine learning concepts: max likelihood, cross validation
- Fundamental machine learning techniques: regression, model-selection, deep learning
- Educational goals:
  1. How to apply basic methods
  2. Reveal what happens inside
  3. What are the pitfalls
  4. Expand understanding of linear algebra, statistics, and optimization

Course Overview
- Grading:
  50%  6 Assignments
  15%  Midterm exam
  30%  Final exam
  15%  Class project
- Assignments: posted on myCourses and the website
- Discuss questions:
- Language: R (or Python, but discouraged)

Course Texts
- Textbooks:
  ISL  James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning
  ELS  Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. Springer Series in Statistics (2nd ed.)
- Other sources:
  DL   Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. [online pdf on github]
  LA   Strang, G. (2016). Introduction to Linear Algebra.
  CO   Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. [online pdf]
  RL   Sutton, R. S., & Barto, A. (2012). Reinforcement Learning. 2nd edition [online pdf draft]

Class Plan I
Date    Day  Topic
Jan 24  Tue  Statistical learning and R language
Jan 26  Thu  Linear regression with one variable
Jan 31  Tue  Linear regression with multiple variables
Feb 02  Thu  No class
Feb 07  Tue  Classification and logistic regression
Feb 09  Thu  Classification, naive Bayes and LDA
Feb 14  Tue  Linear algebra for machine learning. Review
Feb 16  Thu  Linear algebra and optimization
Feb 21  Tue  Overfitting and resampling methods
Feb 23  Thu  Cross-validation and bootstrapping
Feb 28  Tue  Linear model selection, priors
Mar 02  Thu  Linear model selection and regularization

Class Plan II
Date    Day  Topic
Mar 07  Tue  Midterm review
Mar 09  Thu  Midterm exam; material until 2/23
Mar 14  Tue  Spring break, no class
Mar 16  Thu  Spring break, no class
Mar 21  Tue  Building nonlinear features
Mar 23  Thu  Nearest neighbor methods and GAMs
Mar 28  Tue  Tree-based methods and boosting
Mar 30  Thu  Support vector machines and other techniques
Apr 04  Tue  Unsupervised learning, PCA
Apr 06  Thu  Unsupervised learning. K-means
Apr 11  Tue  Reinforcement learning
Apr 13  Thu  Neural networks and deep learning
Apr 18  Tue  Neural networks and deep learning
Apr 20  Thu  Big data and machine learning

Class Plan III
Date       Day  Topic
Apr 25     Tue  Machine learning in practice
Apr 27     Thu  Project presentations
May 02     Tue  Guest speaker
May 04     Thu  Final exam review
May 11-17?      Final exam

What is Machine Learning
- Discover unknown function $f$: $Y = f(X)$
- $X$ = set of features, or inputs
- $Y$ = target, or response
[Figure: Sales plotted against TV, Radio, and Newspaper]
Sales = f(TV, Radio, Newspaper)

Notation
$Y = f(X) = f(X_1, X_2, X_3)$
Sales = f(TV, Radio, Newspaper)
- $Y$ = Sales
- $X_1$ = TV
- $X_2$ = Radio
- $X_3$ = Newspaper
Vector:

$X = (X_1, X_2, X_3)$

Dataset
Sales = f(TV, Radio, Newspaper)
[Figure: Sales plotted against TV, Radio, and Newspaper]
Dataset: Y X1 X2 X3
10101203520 6641851110143782525106153105111
rows are samples

Errors in Machine Learning: World is Noisy
- World is too complex to model precisely
- Many features are not captured in data sets
- Need to allow for errors in $f$: $Y = f(X) + \epsilon$

Machine Learning Algorithm
- Input: Training data set with features and targets
- Output: Prediction function $f$

Parametric Prediction Methods
[Figure: Income plotted against Years of Education and Seniority]
Linear models (linear regression)
$\text{income} = f(\text{education}, \text{seniority}) = \beta_0 + \beta_1 \cdot \text{education} + \beta_2 \cdot \text{seniority}$

Why Estimate f?
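Estimating $f$ for the linear model above means choosing the coefficients $\beta_0, \beta_1, \beta_2$ from training data. The following is a minimal sketch, assuming ordinary least squares solved through the normal equations. The data rows are made up for illustration (they are not from the slides and are noise-free, generated from income = 20 + 3·education + 0.5·seniority), and the helper names `solve_linear_system`, `fit_linear_model`, and `predict` are hypothetical.

```python
# Hypothetical, noise-free rows: (education_years, seniority, income).
# Generated from income = 20 + 3*education + 0.5*seniority for illustration.
rows = [
    (10, 20, 60.0),
    (12, 40, 76.0),
    (14, 10, 67.0),
    (16, 60, 98.0),
    (18, 30, 89.0),
    (20, 80, 120.0),
]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copied so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(features, targets):
    """Least-squares fit of targets ~ beta0 + beta1*x1 + beta2*x2 + ..."""
    design = [[1.0] + [float(v) for v in x] for x in features]
    p = len(design[0])
    # Normal equations: (A^T A) beta = A^T y.
    ata = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    aty = [sum(row[i] * y for row, y in zip(design, targets))
           for i in range(p)]
    return solve_linear_system(ata, aty)


def predict(beta, x):
    """Evaluate the fitted linear model at feature vector x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


features = [(edu, sen) for edu, sen, _ in rows]
incomes = [income for _, _, income in rows]
beta = fit_linear_model(features, incomes)
prediction = predict(beta, (15, 50))
print("coefficients:", beta)
print("predicted income for education=15, seniority=50:", prediction)
```

The same fitted coefficients can be used in two ways: `predict` produces an income estimate for a new person, while the individual entries of `beta` describe how income changes with each feature. With noisy real data the recovered coefficients would only approximate the underlying relationship.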

[Figure: Sales plotted against TV and Radio]
- Prediction: Make predictions about future: Best medium mix to spend ad money?
- Inference: Understand the relationship: What kind of ads work? Why?

Prediction or Inference?
Application                                  Prediction  Inference
Identify risk of getting a disease
Predict effectiveness of a treatment
Recognize hand-written text
Speech recognition
Predict probability of an employee leaving

Statistical View of Machine Learning
- Probability space $\Omega$: Set of all adults
- Random variable $X(\omega) \in \mathbb{R}$: Years of education
- Random variable $Y(\omega) \in \mathbb{R}$: Salary
[Figure: Income plotted against Years of Education, two panels]

How Good are Predictions?

- Learned function $\hat f$
- Test data: $(x_1, y_1), (x_2, y_2), \ldots$
- Mean Squared Error (MSE): $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat f(x_i))^2$
- This is the estimate of: $\mathrm{MSE} = \mathbb{E}[(Y - \hat f(X))^2] = \frac{1}{|\Omega|} \sum_{\omega \in \Omega} (Y(\omega) - \hat f(X(\omega)))^2$
- Important: Samples $x_i$ are ...

... We Need Test Data?
- Why not just test on the training data?
[Figure: squared error plotted against flexibility]
- Flexibility is the degree of polynomial being fit
- Gray line: training error, red line: testing error

Bias-Variance Decomposition
$Y = f(X) + \epsilon$
Mean Squared Error can be decomposed as:
$\mathrm{MSE} = \mathbb{E}(Y - \hat f(X))^2 = \underbrace{\mathrm{Var}(\hat f(X))}_{\text{Variance}} + \underbrace{(\mathbb{E}(\hat f(X)) - f(X))^2}_{\text{Bias}^2} + \mathrm{Var}(\epsilon)$
- Bias: How well would method work with infinite data
- Variance: How much does output change with different datasets

Bias-Variance Trade-off

... of Function f
Regression: continuous target, $f: X \to \mathbb{R}$
[Figure: Income plotted against Years of Education and Seniority]
Classification: discrete target, $f: X \to \{1, 2, 3, \ldots, k\}$
[Figure: scatter plot of samples over X1 and X2]

Regression or Classification?
Application                                  Regression  Classification
Identify risk of getting a disease
Predict effectiveness of a treatment
Recognize hand-written text
Speech recognition
Predict probability of an employee leaving

Error Rate In Classification
- Learned function $\hat f$
- Test data: $(x_1, y_1), (x_2, y_2), \ldots$
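Returning to the question of why test data is needed: the comparison of training and testing error as flexibility grows can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not the setup behind the slide's figure: the true function `sin(3x)`, the noise level, the sample sizes, the polynomial degrees, and all helper names are hypothetical choices. Polynomials are fit by least squares on the training set only, and the MSE is then computed on both sets.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    design = [[x ** k for k in range(degree + 1)] for x in xs]
    p = degree + 1
    ata = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    aty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(ata, aty)


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    """Mean squared error: (1/n) * sum of (y_i - f_hat(x_i))^2."""
    return sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(n):
    """Draw n noisy samples from the assumed true function sin(3x)."""
    xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.sin(3.0 * x) + random.gauss(0.0, 0.3) for x in xs]
    return xs, ys


random.seed(0)  # fixed seed so repeated runs use the same data
train_x, train_y = sample(20)
test_x, test_y = sample(200)

results = {}
for degree in (1, 3, 8):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mse(coeffs, train_x, train_y),
                       mse(coeffs, test_x, test_y))
    print(f"degree {degree}: train MSE {results[degree][0]:.4f}, "
          f"test MSE {results[degree][1]:.4f}")
```

Training MSE cannot increase as the degree grows, because each higher-degree polynomial family contains the lower-degree ones. Test MSE carries no such guarantee and typically rises once the fit starts following the noise, which is why the training error alone is a poor guide to prediction quality.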

