Example: stock market

High-Dimensional Probability - University of California ...

High-Dimensional ProbabilityAn Introduction with Applications in Data ScienceRoman VershyninUniversity of California , IrvineJune 9, 2020 ~rvershyn/ContentsPrefaceviAppetizer: using Probability to cover a geometric set11 Preliminaries on random quantities associated with random classical of sums of independent random concentration inequalities? s s : degrees of random Hoeffding s and Khintchine s s vectors in high of the matrices and principal component of High-Dimensional distributions in higher : Grothendieck s inequality and semidefinite : Maximum cut for trick, and tightening of Grothendieck s on , covering numbers and packing : error correcting bounds on random sub-gaussian : community detection in bounds on sub-gaussian : covariance estimation a

Galyna Livshyts, Jelani Nelson, Ekkehard Schnoor, Martin Spindler, Dominik St oger, Tim Sullivan, Terence Tao, Joel Tropp, Katarzyna Wyczesany, Yifei Shen and Haoshu Xu for many valuable suggestions and corrections, especially to Sjo-erd Dirksen, Larry Goldstein, Wu Han, Han Wu, and Mahdi Soltanolkotabi for

Tags:

  High, Dimensional, Probability, Nelson, High dimensional probability

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Transcription of High-Dimensional Probability - University of California ...

1 High-Dimensional ProbabilityAn Introduction with Applications in Data ScienceRoman VershyninUniversity of California , IrvineJune 9, 2020 ~rvershyn/ContentsPrefaceviAppetizer: using Probability to cover a geometric set11 Preliminaries on random quantities associated with random classical of sums of independent random concentration inequalities? s s : degrees of random Hoeffding s and Khintchine s s vectors in high of the matrices and principal component of High-Dimensional distributions in higher : Grothendieck s inequality and semidefinite : Maximum cut for trick, and tightening of Grothendieck s on , covering numbers and packing : error correcting bounds on random sub-gaussian : community detection in bounds on sub-gaussian : covariance estimation and without of Lipschitz functions on the on other metric measure.

2 Johnson-Lindenstrauss Bernstein s : community detection in sparse : covariance estimation for general forms, symmetrization and of anisotropic random matrices with : matrix concepts and s bounds on Gaussian s minoration dimension, stable rank, and Gaussian projections of s : empirical : statistical learning s majorizing measure and comparison s of random matrices and geometric deviation matrices, random projections and covariance Lemma for infinite sections:M bound and Escape High-Dimensional signal recovery Signal recovery based onM Recovery of sparse Low-rank matrix Exact recovery and the restricted isometry Lasso algorithm for sparse Notes26711 Dvoretzky-Milman s Deviations of random matrices with respect to general Johnson-Lindenstrauss embeddings and sharper Chevet Dvoretzky-Milman s Notes279 Bibliography280 Index290 PrefaceWho is this book for?

3 This is a textbook in Probability in high dimensions with a view toward applica-tions in data sciences. It is intended for doctoral and advanced masters studentsand beginning researchers in mathematics, statistics, electrical engineering, com-putational biology and related areas, who are looking to expand their knowledgeof theoretical methods used in modern research in data this book?Data sciences are moving fast, and probabilistic methods often provide a foun-dation and inspiration for such advances. A typical graduate Probability courseis no longer sufficient to acquire the level of mathematical sophistication thatis expected from a beginning researcher in data sciences today.

4 The proposedbook intends to partially cover this gap. It presents some of the key probabilis-tic methods and results that may form an essential toolbox for a mathematicaldata scientist. This book can be used as a textbook for a basic second course inprobability with a view toward data science applications. It is also suitable is this book about? High-Dimensional Probability is an area of Probability theory that studies randomobjects inRnwhere the dimensionncan be very large. This book places par-ticular emphasis on random vectors, random matrices, and random teaches basic theoretical skills for the analysis of these objects, which includeconcentration inequalities, covering and packing arguments, decoupling and sym-metrization tricks, chaining and comparison techniques for stochastic processes,combinatorial reasoning based on the VC dimension, and a lot Probability provides vital theoretical tools for applicationsin data science.

5 This book integrates theory with applications for covarianceestimation, semidefinite programming, networks, elements of statistical learning,error correcting codes, clustering, matrix completion, dimension reduction, sparsesignal recovery, sparse regression, and essential prerequisites for reading this book are a rigorous course in probabil-ity theory (on Masters or level), an excellent command of undergraduatelinear algebra, and general familiarity with basic notions about metric, normedviPrefaceviiand Hilbert spaces and linear operators.

6 Knowledge of measure theory is notessential but would be word on exercisesExercises are integrated into the text. The reader can do them immediately tocheck his or her understanding of the material just presented, and to preparebetter for later developments. The difficulty of the exercises is indicated by thenumber of coffee cups; it can range from easiest (K) to hardest (KKKK).Related readingThis book covers only a fraction of theoretical apparatus of high -dimensionalprobability, and it illustrates it with only a sample of data science chapter in this book is concluded with aNotessection, which has pointersto other texts on the matter.

7 A few particularly useful sources should be notedhere. The now classical book [8] showcases the probabilistic method in applica-tions to discrete mathematics and computer science. The forthcoming book [20]presents a panorama of mathematical data science, and it particularly focuseson applications in computer science. Both these books are accessible to gradu-ate and advanced undergraduate students. The lecture notes [210] are pitchedfor graduate students and present more theoretical material in feedback from my many colleagues was instrumental in perfecting this special thanks go to Florent Benaych-Georges, Jennifer Bryson, Lukas Gr atz,Remi Gribonval, Ping Hsu, Mike Izbicki, Yi Li, George Linderman, Cong Ma,Galyna Livshyts, Jelani nelson , Ekkehard Schnoor, Martin Spindler, DominikSt oger, Tim Sullivan, Terence Tao, Joel Tropp, Katarzyna Wyczesany, Yifei Shenand Haoshu Xu for many valuable suggestions and corrections.

8 Especially to Sjo-erd Dirksen, Larry Goldstein, Wu Han, Han Wu, and Mahdi Soltanolkotabi fordetailed proofreading of the book, Can Le, Jennifer Bryson and my son IvanVershynin for their help with many pictures in this this book had been published, many colleagues offered further suggestionsand noticed many typos, gaps and inaccuracies. I am grateful to Karthik Abinav,Diego Armentano, Aaron Berk, Soham De, Hu Fu, Adam Jozefiak, Nick Harvey,Harrie Hendriks, Yi Li, Chris Liaw, Hengrui Luo, Sikander Randhawa, KarthikSankararaman, and especially to Aryeh Kontorovich, Jake Knigge, Mark Meckes,Abbas Mehrabian, Holger Rauhut, and Hao Xing who offered substantial feedbackthat lead to significant issues that are found after publication have been corrected in the electronicversion.

9 Please feel free to send me any further , CaliforniaJune 9, 2020 Appetizer: using Probability to cover a geometric setWe begin our study of High-Dimensional Probability with an elegant argumentthat showcases the usefulness of probabilistic reasoning in that aconvex combinationof pointsz1,..,zm Rnis a linear combi-nation with coefficients that are non-negative and sum to 1, it is a sum of theformm i=1 iziwhere i 0 andm i=1 i= 1.( )Theconvex hullof a setT Rnis the set of all convex combinations of all finitecollections of points inT:conv(T) :={convex combinations ofz1.}

10 ,zm Tform N};see Figure for convex hull of a set of points representing major citiesThe numbermof elements defining a convex combination inRnis not restricteda priori. However, the classical Caratheodory s theorem states that one can alwaystakem n+ (Caratheodory s theorem).Every point in the convex hull of asetT Rncan be expressed as a convex combination of at mostn+ boundn+ 1 can not be improved, as it is clearly attained for a simplexT(a set ofn+ 1 points in general position). Suppose, however, that we only want12 Appetizer: using Probability to cover a geometric settoapproximatea pointx conv(T) rather than exactly represent it as a convexcombination.


Related search queries