Transcription of Data Science Cheatsheet 2
Data Science Cheatsheet
Updated June 19, 2021

Distributions

Discrete

Binomial - x successes in n events, each with p probability: $\binom{n}{x} p^x q^{n-x}$, with $\mu = np$ and $\sigma^2 = npq$. If n = 1, this is a Bernoulli distribution.

Geometric - first success with p probability on the nth trial: $q^{n-1} p$, with $\mu = 1/p$ and $\sigma^2 = \frac{1-p}{p^2}$

Negative Binomial - number of failures before r successes

Hypergeometric - x successes in n draws, no replacement, from a size N population with X items of that feature: $\frac{\binom{X}{x}\binom{N-X}{n-x}}{\binom{N}{n}}$, with $\mu = \frac{nX}{N}$

Poisson - number of successes x in a fixed time interval, where success occurs at an average rate $\lambda$: $\frac{\lambda^x e^{-\lambda}}{x!}$, with $\mu = \sigma^2 = \lambda$

Continuous

Uniform - all values between a and b are equally likely: $\frac{1}{b-a}$, with $\mu = \frac{a+b}{2}$ and $\sigma^2 = \frac{(b-a)^2}{12}$, or $\frac{n^2-1}{12}$ if discrete

Normal/Gaussian $N(\mu, \sigma)$, Standard Normal $Z \sim N(0, 1)$
Central Limit Theorem - sample mean of i.i.d. data approaches a normal distribution as the sample size grows
Empirical Rule - 68%, 95%, and 99.7% of values lie within one, two, and three standard deviations of the mean
Normal Approximation - discrete distributions such as Binomial and Poisson can be approximated using z-scores when np, nq, and $\lambda$ are greater than 10

Exponential - memoryless time between independent events occurring at an average rate $\lambda$: $\lambda e^{-\lambda x}$, with $\mu = \frac{1}{\lambda}$

Gamma - time until n independent events occurring at a ...
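The discrete formulas above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names and the parameter values (n = 10, p = 0.3, N = 50, X = 20, rate 4) are illustrative assumptions, not part of the cheatsheet. It sums each probability mass function over its support (truncated for the infinite ones) and compares the result with the stated mean and variance, then checks the Empirical Rule against the standard normal CDF.

```python
import math
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """P(x successes in n events), each event succeeding with probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def geometric_pmf(n, p):
    """P(first success occurs on the nth trial)."""
    return (1 - p) ** (n - 1) * p


def hypergeometric_pmf(x, n, N, X):
    """P(x successes in n draws without replacement, population N with X successes)."""
    return math.comb(X, x) * math.comb(N - X, n - x) / math.comb(N, n)


def poisson_pmf(x, lam):
    """P(x successes in a fixed interval) at average rate lam."""
    return lam**x * math.exp(-lam) / math.factorial(x)


def mean_and_variance(pmf, support):
    """Mean and variance of a discrete distribution summed over `support`."""
    mean = sum(k * pmf(k) for k in support)
    var = sum((k - mean) ** 2 * pmf(k) for k in support)
    return mean, var


# Illustrative parameters (assumptions chosen for this sketch).
n, p = 10, 0.3
q = 1 - p

# Binomial: compare with mu = np and sigma^2 = npq.
binom_mean, binom_var = mean_and_variance(
    lambda k: binomial_pmf(k, n, p), range(n + 1)
)

# Geometric: infinite support, truncated where the tail is negligible.
# Compare with mu = 1/p and sigma^2 = (1 - p) / p^2.
geom_mean, geom_var = mean_and_variance(
    lambda k: geometric_pmf(k, p), range(1, 500)
)

# Hypergeometric: compare with mu = nX/N.
N, X = 50, 20
hyper_mean, _ = mean_and_variance(
    lambda k: hypergeometric_pmf(k, n, N, X), range(n + 1)
)

# Poisson: truncated support; compare with mu = sigma^2 = lam.
lam = 4.0
pois_mean, pois_var = mean_and_variance(
    lambda k: poisson_pmf(k, lam), range(60)
)

# Empirical Rule: probability mass within 1, 2, and 3 standard deviations.
Z = NormalDist(0, 1)
within = [Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)]

print(binom_mean, binom_var)
print(geom_mean, geom_var)
print(hyper_mean)
print(pois_mean, pois_var)
print(within)
```

Truncating the geometric and Poisson supports is a convenience for the sketch; the omitted tail probability is far below floating-point precision for these parameter values.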
Linear Discriminant Analysis

Supervised method that maximizes separation between classes and minimizes variance within classes for a labeled dataset

1. Compute the mean and variance of each independent variable for every class $C_i$
2. Calculate the within-class ($\sigma_w^2$) and between-class ($\sigma_b^2$) variance
3. Find the matrix $W = (\sigma_w^2)^{-1}(\sigma_b^2)$ that ...
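The three steps can be sketched in plain Python for a toy dataset with two features and two classes. This is an illustration under stated assumptions, not the cheatsheet's own implementation: the data are made up, the helper names are hypothetical, and scatter matrices (sums of outer products) stand in for the within-class and between-class variances in the multivariate case. The sketch stops at computing W, since the excerpt is cut off before saying what is done with it.

```python
def mean_vector(rows):
    """Per-feature mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def mat_add(A, B):
    """Element-wise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    """Matrix product A @ B."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def inv_2x2(M):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def lda_matrix(classes):
    """Return W = S_w^{-1} S_b for a dict mapping label -> list of 2-feature rows."""
    all_rows = [r for rows in classes.values() for r in rows]
    overall = mean_vector(all_rows)
    S_w = [[0.0, 0.0], [0.0, 0.0]]  # within-class scatter
    S_b = [[0.0, 0.0], [0.0, 0.0]]  # between-class scatter
    for rows in classes.values():
        m = mean_vector(rows)  # step 1: class mean
        for r in rows:  # step 2: spread of each class around its own mean
            d = [r[j] - m[j] for j in range(2)]
            S_w = mat_add(S_w, outer(d, d))
        # step 2: spread of the class means around the overall mean
        diff = [m[j] - overall[j] for j in range(2)]
        weighted = [[len(rows) * v for v in row] for row in outer(diff, diff)]
        S_b = mat_add(S_b, weighted)
    return mat_mul(inv_2x2(S_w), S_b)  # step 3


# Made-up toy data: two classes, two features each.
toy = {
    "a": [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]],
    "b": [[6.0, 5.0], [7.0, 8.0], [8.0, 6.0], [7.0, 7.0]],
}
W = lda_matrix(toy)
det_W = W[0][0] * W[1][1] - W[0][1] * W[1][0]
trace_W = W[0][0] + W[1][1]
print(W)
```

With only two classes the between-class scatter has rank one, so W should be singular up to rounding, which makes `det_W` a convenient sanity check on the arithmetic.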