Transcription of Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
Vijay Gadepally
Jeremy Kepner, Lauren Milechin, Siddharth Samsi

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

© 2020 Massachusetts Institute of Technology. Delivered to the Government with Unlimited Rights, as defined in DFARS Part or 7014 (Feb 2014). Notwithstanding any copyright notice, Government rights in this work are defined by DFARS or DFARS as detailed above. Use of this work other than as specifically authorized by the Government may violate any copyrights that exist in this work.

Contributions from: Siddharth Samsi, Albert Reuther, Jeremy Kepner, David Martinez, Lauren Milechin

AI and ML - 2 VNG 010720: Outline
  Artificial Intelligence Overview
  Machine Learning Deep Dives
    Supervised Learning
    Unsupervised Learning
    Reinforcement Learning
  Conclusions/Summary

AI and ML - 3: What is Artificial Intelligence?
  Narrow AI: The theory and development of computer systems that perform tasks that augment human intelligence, such as perceiving, classifying, learning, abstracting, reasoning, and/or acting.
  General AI: Full autonomy.
  Definition adapted from the Oxford dictionary and inputs from Prof. Patrick Winston (MIT).

AI and ML - 4: AI: Why Now?
  Convergence of High Performance Computing, Big Data, and Algorithms that enable widespread AI development.
  Figure labels: Big Data; Machine Learning Algorithms; Compute Power.
  Image sources: DARPA / Public domain.

AI and ML - 5: AI Canonical Architecture
  Inputs: Sensors; Sources; Structured Data; Unstructured Data
  Data Conditioning
  Algorithms: Knowledge-Based; Unsupervised and Supervised Learning; Transfer Learning; Reinforcement Learning; etc.
  Human-Machine Teaming (CoA), on a spectrum: Human; Human-Machine Complement; Machine
  Outputs along the chain: Information; Knowledge; Insight; Users (Missions)
  Modern Computing: CPUs; GPUs; TPU; Neuromorphic; Custom; Quantum
  Robust AI: Explainable AI; Metrics and Bias Assessment; Verification & Validation; Security (e.g., counter AI); Policy, Ethics, Safety and Training
  GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action

© IEEE. Figure 1 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9. All rights reserved. This content is excluded from our Creative Commons license.

AI and ML - 8: Select History of Artificial Intelligence
  1950: "Computing Machinery and Intelligence" (the "Turing Test"), published by MIND, vol. LIX
  1955: Western Joint Computer Conference Session on Learning Machines
  1956: Dartmouth Summer Research Project on AI (J. McCarthy, M. Minsky, N. Rochester, O. Selfridge, C. Shannon, others)
  1957: Memory Test Computer, first computer to simulate the operation of neural networks
  1957: Frank Rosenblatt, Neural Networks, "Perceiving and Recognizing Automaton"
  1958: National Physical Laboratory in the UK, Symposium on the Mechanization of Thought Processes
  1959: Arthur Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM Journal of R&D
  1960: Recognizing hand-written characters, Robert Larson of SRI AI Center
  1961: James Slagle, Solving Freshman Calculus (Minsky student), MIT
  1974-1980 and 1987-1993: AI Winters
  1979: "An Assessment of AI from a Lincoln Laboratory Perspective," internal MIT LL publication (J. Forgie and J. Allen)
  1982: Expert Systems, pioneer DENDRAL project at Stanford
  1984: Hidden Markov models
  1986-present: The return of neural networks
  1988: Statistical Machine Translation
  1989: Convolutional Neural Networks
  1994: Human-level spontaneous speech recognition
  1997: IBM Deep Blue defeats reigning chess champion (Garry Kasparov)
  2001-present: The availability of very large data sets
  2005: Google's Arabic and Chinese to English translation
  2007: DARPA Grand Challenge ("Urban Challenge")
  2011: IBM Watson defeats former Jeopardy! champions (Brad Rutter and Ken Jennings)
  2012: Team from U. of Toronto (Geoff Hinton's lab) wins the ImageNet Large Scale Visual Recognition Challenge with deep-learning software
  2014: Google's GoogleNet, object classification at near human performance
  2015: DeepMind achieves human expert level of play on Atari games (using only raw pixels and scores)
  2016: DeepMind AlphaGo defeats top human Go player (Lee Sedol)
  2016: DARPA Cyber Grand Challenge
  MIT LL staff papers shown on the timeline: "Programming Pattern Recognition"; "Pattern Recognition and Modern Computers"; "Generalization of Pattern Recognition in a Self-Organizing System"
  Also named on the timeline: G. Clark, O. Selfridge
  Timeline axis: 1950 to 2017
  Adapted from: The Quest for Artificial Intelligence, Nils J. Nilsson, 2010, and MIT Lincoln Laboratory Library and Archives.

AI and ML - 9: Artificial Intelligence Evolution
  Figure: Four Waves of AI
  Wave headings: REASONING*; LEARNING; CONTEXT; ABSTRACTION
  Figure labels: Handcrafted Knowledge; Statistical Learning; Contextual Adaptation; System Evolution
  Each wave is rated on four capabilities: Perceiving; Learning; Abstracting; Reasoning
  Annotations: Lots of data enabled non-expert systems; Adding context to AI systems; Ability of system to abstract
  * Waves adapted from John Launchbury, Director I2O, DARPA

AI and ML - 10: Spectrum of Commercial Organizations in the Machine Intelligence Field
Source:
Shivon Zilis, 2016. Shivon Zilis and James Cham, designed by Heidi Skinner. All rights reserved. This content is excluded from our Creative Commons license. For more information, see […]-use/

AI and ML - 11: Data is Critical To Breakthroughs in AI
  1994: Human-level read-speech recognition
    Dataset (first available): Spoken Wall Street Journal articles and other texts (1991)
    Algorithm (first proposed): Hidden Markov Model (1984)
  1997: IBM Deep Blue defeated Garry Kasparov
    Dataset (first available): 700,000 Grandmaster chess games, aka "The Extended Book" (1991)
    Algorithm (first proposed): Negascout planning algorithm (1983)
  2005: Google's Arabic- and Chinese-to-English translation
    Dataset (first available): […] trillion tokens from Google Web and News pages (collected in 2005)
    Algorithm (first proposed): Statistical Machine Translation algorithm (1988)
  2011: IBM Watson became the world Jeopardy! champion
    Dataset (first available): […] million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2010)
    Algorithm (first proposed): Mixture-of-Experts algorithm (1991)
  2014: Google's GoogleNet, object classification at near-human performance
    Dataset (first available): ImageNet corpus of […] million labeled images and 1,000 object categories (2010)
    Algorithm (first proposed): Convolutional neural network algorithm (1989)
  2015: Google's DeepMind achieved human parity in playing 29 Atari games by learning general control from video
    Dataset (first available): Arcade Learning Environment dataset of over 50 Atari games (2013)
    Algorithm (first proposed): Q-Learning algorithm (1992)
  Average number of years to breakthrough: 3 years after the dataset became available; 18 years after the algorithm was proposed.
  Source: Train AI 2017

AI and ML - 12: AI Canonical Architecture
  Repeats the AI Canonical Architecture figure from AI and ML - 5.

AI and ML - 13: Unstructured and Structured Data
  Data Conditioning/Storage Technologies: Data to Information
  Technologies and the capabilities they provide:
    Infrastructure/Databases: indexing/organization/structure; domain specific languages; high performance data access; declarative interfaces
    Data Curation: unsupervised machine learning; dimensionality reduction; clustering/pattern recognition; outlier detection
    Data Labeling: initial data exploration; highlight missing or incomplete data; reorient sensors/recapture data; look for errors/biases in collection
  Data conditioning often takes up 80+% of overall AI/ML development work.
  Structured and unstructured data types shown: Speech; Network Logs; Metadata; Social Media; Human Behavior; Side Channel; Sensors; Reports

AI and ML - 14: Machine Learning Algorithms Taxonomy
  The Five Tribes of Machine Learning* :
    Symbolists (e.g., exp. sys.)
    Bayesians (e.g., naive Bayes)
    Analogizers (e.g., SVM)
    Connectionists (e.g., DNN)
    Evolutionaries (e.g., genetic programming)
  * "The Five Tribes of Machine Learning," Pedro Domingos
  Nested figure, outermost to innermost: Artificial Intelligence; Machine Learning; Neural Nets; Deep Neural Nets
  Image adapted from Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville.
  DNN = Deep Neural Networks; SVM = Support Vector Machines; Exp. Sys. = Expert Systems

AI and ML - 15: Modern AI Computing Engines
  Computing class and what it provides to AI:
    CPU: most popular computing platform; general purpose compute
    GPU: used by most for training algorithms (good for NN backpropagation)
    TPU: speeds up inference time (domain specific architecture)
    Neuromorphic: active research area
    Custom: ability to speed up specific computations of interest (e.g., graphs)
    Quantum: benefits unproven until now; recent results on HHL (linear system of equations)
  Selected results:
    Figure: SpGEMM Performance using Graph Processor (G102). Axes: Traversed Edges per Second vs. Watts. Series: ASIC Graph Processor (Projected); FPGA Graph Processor (Measured); Cray XK7 Titan (Measured); Cray XT4 Franklin (Measured). Regions: Embedded Applications; Data Center Applications. Configurations: 8 Nodes (Mini-Chassis); 64 Nodes (Chassis); 256 Nodes (Rack); 1024 Nodes (4 Racks); 4k Nodes (16 Racks); 16k Nodes (64 Racks).
    Figure: Alexnet comparison, Forward-Backward Pass.
  GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; HHL = Harrow-Hassidim-Lloyd quantum algorithm

AI and ML - 16: Neural Network Processing Performance
  Figure axes: Peak Power (W) vs. Peak GOps/Second, with reference lines at 100 GigaOps/W, 1 TeraOps/W, and 10 TeraOps/W.
  Legend:
    Computation precision: Int8; Int16; Float16; Float16 -> Float32; Float32; Float64
    Form factor: Chip; Card; System
    Computation type: Inference; Training
  Accelerators plotted: MIT Eyeriss; MovidiusX; JetsonTX1; JetsonTX2; Xavier; DGX-1; DGX-Station; DGX-2; WaveSystem; WaveDPU; TrueNorth; TrueNorthSys; GraphCoreNode; GraphCoreC2; K80; P100; V100; Turing; 2xSkyLakeSP; Phi7210F; Phi7290F; Arria; Nervana; Goya; TPU1; TPU2; TPU3; TPUEdge
  Reuther, Albert, et al. "Survey and Benchmarking of Machine Learning Accelerators." arXiv preprint (2019).
  © IEEE. Figure 2 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9. All rights reserved. This content is excluded from our Creative Commons license.

AI and ML - 17: Robust AI: Preserving Trust
  Figure: Confidence Level vs. Consequence of Actions
  Axes: Consequence of Actions (Low to High); Confidence Level in the Machine Making the Decision (Low to High)
  Regions: Best Matched to Machines; Machines Augmenting Humans; Best Matched to Humans

AI and ML - 18: Importance of Robust AI
  Robust AI features: Security; Metrics; Policy, Ethics, Safety, and Training; Explainable AI; Validation & Verification
  Issues and example solutions:
    Issue: System vulnerable to adversarial action (both cyber and physical)
      Solutions: Model failure detection, red teaming
    Issue: Unknown relationship between arbitrary input and machine output
      Solutions: Explainability, dimensionality reduction, feature importance inference
    Issue: Unwanted actions when controlling heavy or dangerous machinery
      Solutions: Risk sensitivity, robust inference, high decision thresholds
    Issue: User unfamiliarity or mistrust leads to lack of adoption
      Solutions: Seamless integration, model expansion, transparent uncertainty
    Issue: Algorithms need to meet mission specifications
      Solutions: Robust training, portfolio methods, regularization

AI and ML - 19
  Figure labels: Knowledge; Insight; Human-Machine Teaming (CoA)