Example: quiz answers

Foundations of Data Science - Cornell University

Foundations of data Science Avrim Blum, John Hopcroft, and Ravindran KannanThursday 4thJanuary, 2018 Copyright 2015. All rights reserved1 Contents1 Introduction92 High-Dimensional Introduction .. The Law of Large Numbers .. The Geometry of High Dimensions .. Properties of the Unit Ball .. of the Unit Ball .. Near the Equator .. Generating Points Uniformly at Random from a Ball .. Gaussians in High Dimension .. Random Projection and Johnson-Lindenstrauss Lemma .. Separating Gaussians .. Fitting a Spherical Gaussian to data .. Bibliographic Notes .. Exercises .. 323 Best-Fit Subspaces and Singular Value Decomposition (SVD) Introduction .. Preliminaries.

1 Introduction 9

Tags:

  Data

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Transcription of Foundations of Data Science - Cornell University

1 Foundations of data Science Avrim Blum, John Hopcroft, and Ravindran KannanThursday 4thJanuary, 2018 Copyright 2015. All rights reserved1 Contents1 Introduction92 High-Dimensional Introduction .. The Law of Large Numbers .. The Geometry of High Dimensions .. Properties of the Unit Ball .. of the Unit Ball .. Near the Equator .. Generating Points Uniformly at Random from a Ball .. Gaussians in High Dimension .. Random Projection and Johnson-Lindenstrauss Lemma .. Separating Gaussians .. Fitting a Spherical Gaussian to data .. Bibliographic Notes .. Exercises .. 323 Best-Fit Subspaces and Singular Value Decomposition (SVD) Introduction .. Preliminaries.

2 Singular Vectors .. Singular Value Decomposition (SVD) .. Best Rank-kApproximations .. Left Singular Vectors .. Power Method for Singular Value Decomposition .. Faster Method .. Singular Vectors and Eigenvectors .. Applications of Singular Value Decomposition .. data .. Component Analysis .. a Mixture of Spherical Gaussians .. Documents and Web Pages .. Application of SVD to a Discrete Optimization Problem .. Bibliographic Notes .. Exercises .. 674 Random Walks and Markov Stationary Distribution .. Markov Chain Monte Carlo .. Algorithm .. Sampling .. Areas and Volumes .. Convergence of Random Walks on Undirected Graphs.

3 Normalized Conductance to Prove Convergence .. Electrical Networks and Random Walks .. Random Walks on Undirected Graphs with Unit Edge Weights .. Random Walks in Euclidean Space .. The Web as a Markov Chain .. Bibliographic Notes .. Exercises .. 1185 Machine Introduction .. The Perceptron algorithm .. Kernel Functions .. Generalizing to New data .. Overfitting and Uniform Convergence .. Illustrative Examples and Occam s Razor .. Disjunctions .. s Razor .. : Learning Decision Trees .. Regularization: Penalizing Complexity .. Online Learning .. Example: Learning Disjunctions .. Halving Algorithm .. Perceptron Algorithm .. : Inseparable data and Hinge Loss.

4 Online to Batch Conversion .. Support-Vector Machines .. VC-Dimension .. Definitions and Key Theorems .. Examples: VC-Dimension and Growth Function .. Proof of Main Theorems .. VC-Dimension of Combinations of Concepts .. Other Measures of Complexity .. Strong and Weak Learning - Boosting .. Stochastic Gradient Descent .. Combining (Sleeping) Expert Advice .. Deep Learning .. Generative Adversarial Networks (GANs) .. Further Current Directions .. Semi-Supervised Learning .. Active Learning .. Multi-Task Learning .. Bibliographic Notes .. Exercises .. 1766 Algorithms for Massive data Problems: Streaming, Sketching, Introduction .. Frequency Moments of data Streams.

5 Of Distinct Elements in a data Stream .. of Occurrences of a Given Element.. Elements .. Second Moment .. Matrix Algorithms using Sampling .. Multiplication using Sampling .. Length Squared Sampling in Two Passes .. of a Large Matrix .. Sketches of Documents .. Bibliographic Notes .. Exercises .. 2047 Introduction .. General Assumptions on the Form of Clusters .. Clustering .. Clustering .. Maximum-Likelihood Motivation .. Properties of thek-Means Objective .. s Algorithm .. s Algorithm .. Clustering on the Line .. Clustering .. Finding Low-Error Clusterings .. Spectral Clustering .. Project? .. Algorithm .. Separated by (1) Standard Deviations.

6 Spectral clustering .. Approximation Stability .. Conceptual Idea .. this Formal .. and Analysis .. High-Density Clusters .. Linkage .. Linkage .. Kernel Methods .. Recursive Clustering based on Sparse Cuts .. Dense Submatrices and Communities .. Community Finding and Graph Partitioning .. Spectral clustering applied to social networks .. Bibliographic Notes .. Exercises .. 2408 Random TheG(n,p) Model .. Distribution .. of Triangles inG(n,d/n) .. Phase Transitions .. Giant Component .. of a giant component .. other large components .. case ofp <1/n.. Cycles and Full Connectivity .. of Cycles .. Connectivity .. forO(lnn) Diameter.

7 Phase Transitions for Increasing Properties .. Branching Processes .. CNF-SAT .. in practice .. Transitions for CNF-SAT .. Nonuniform Models of Random Graphs .. Component in Graphs with Given Degree Distribution .. Growth Models .. Model Without Preferential Attachment .. Model With Preferential Attachment .. Small World Graphs .. Bibliographic Notes .. Exercises .. 3019 Topic Models, Nonnegative Matrix Factorization, Hidden Markov Mod-els, and Graphical Topic Models .. An Idealized Model .. Nonnegative Matrix Factorization - NMF .. NMF with Anchor Terms .. Hard and Soft Clustering .. The Latent Dirichlet Allocation Model for Topic Modeling.

8 The Dominant Admixture Model .. Formal Assumptions .. Finding the Term-Topic Matrix .. Hidden Markov Models .. Graphical Models and Belief Propagation .. Bayesian or Belief Networks .. Markov Random Fields .. Factor Graphs .. Tree Algorithms .. Message Passing in General Graphs .. Graphs with a Single Cycle .. Belief Update in Networks with a Single Loop .. Maximum Weight Matching .. Warning Propagation .. Correlation Between Variables .. Bibliographic Notes .. Exercises .. 35710 Other Ranking and Social Choice .. Randomization .. Examples .. Compressed Sensing and Sparse Vectors .. Unique Reconstruction of a Sparse Vector .. Efficiently Finding the Unique Sparse Solution.

9 Applications .. Biological .. Low Rank Matrices .. An Uncertainty Principle .. Sparse Vector in Some Coordinate Basis .. A Representation Cannot be Sparse in Both Time and FrequencyDomains .. Gradient .. Linear Programming .. The Ellipsoid Algorithm .. Integer Optimization .. Semi-Definite Programming .. Bibliographic Notes .. 381611 Dilation .. The Haar Wavelet .. Wavelet Systems .. Solving the Dilation Equation .. Conditions on the Dilation Equation .. Derivation of the Wavelets from the Scaling Function .. Sufficient Conditions for the Wavelets to be Orthogonal .. Expressing a Function in Terms of Wavelets .. Designing a Wavelet System .. Bibliographic Notes.

10 Exercises .. 40312 Definitions and Notation .. Asymptotic Notation .. Useful Relations .. Useful Inequalities .. Probability .. Sample Space, Events, and Independence .. Linearity of Expectation .. Union Bound .. Indicator Variables .. Variance .. Variance of the Sum of Independent Random Variables .. Median .. The Central Limit Theorem .. Probability Distributions .. Bayes Rule and Estimators .. Bounds on Tail Probability .. Chernoff Bounds .. More General Tail Bounds .. Applications of the Tail Bound .. Eigenvalues and Eigenvectors .. Symmetric Matrices .. Relationship between SVD and Eigen Decomposition .. Extremal Properties of Eigenvalues.


Related search queries