
Data Science Cheatsheet 2

Data Science Cheatsheet, Updated June 19, 2021

Distributions

Discrete

Binomial - $x$ successes in $n$ events, each with $p$ probability: $\binom{n}{x} p^x q^{n-x}$, with $\mu = np$ and $\sigma^2 = npq$. If $n = 1$, this is a Bernoulli distribution.

Geometric - first success with $p$ probability on the $n$th trial: $q^{n-1} p$, with $\mu = 1/p$ and $\sigma^2 = \frac{1-p}{p^2}$.

Negative Binomial - number of failures before $r$ successes.

Hypergeometric - $x$ successes in $n$ draws, no replacement, from a size $N$ population with $X$ items of that feature: $\frac{\binom{X}{x}\binom{N-X}{n-x}}{\binom{N}{n}}$, with $\mu = \frac{nX}{N}$.

Poisson - number of successes $x$ in a fixed time interval, where success occurs at an average rate $\lambda$: $\frac{\lambda^x e^{-\lambda}}{x!}$, with $\mu = \sigma^2 = \lambda$.

Continuous

Uniform - all values between $a$ and $b$ are equally likely: $\frac{1}{b-a}$, with $\mu = \frac{a+b}{2}$ and $\sigma^2 = \frac{(b-a)^2}{12}$, or $\frac{n^2-1}{12}$ if discrete.

Normal/Gaussian $N(\mu, \sigma)$, Standard Normal $Z \sim N(0, 1)$.

Central Limit Theorem - sample mean of data approaches normal distribution.

Empirical Rule - 68%, 95%, and 99.7% of values lie within one, two, and three standard deviations of the mean.

Normal Approximation - discrete distributions such as Binomial and Poisson can be approximated using z-scores when $np$, $nq$, and $\lambda$ are greater than 10.

Exponential - mem
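The binomial and Poisson formulas listed above can be checked numerically. The following is a minimal sketch, not part of the cheatsheet: the helper names (`binomial_pmf`, `poisson_pmf`, `mean_and_variance`) and the parameter values are assumptions chosen for illustration. It evaluates each probability mass function directly and compares the resulting mean and variance with the closed forms $np$, $npq$, and $\lambda$.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P(X = x): x successes in n events, each with probability p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


def poisson_pmf(x, lam):
    """P(X = x): x successes in a fixed interval at average rate lam."""
    return lam**x * exp(-lam) / factorial(x)


def mean_and_variance(pairs):
    """Mean and variance of a distribution given as (value, probability) pairs."""
    mean = sum(v * pr for v, pr in pairs)
    var = sum((v - mean) ** 2 * pr for v, pr in pairs)
    return mean, var


# Illustrative parameters (assumed, not from the cheatsheet)
n, p = 20, 0.3
binom = [(x, binomial_pmf(x, n, p)) for x in range(n + 1)]
binom_mean, binom_var = mean_and_variance(binom)

lam = 4.0
# The Poisson support is infinite; the tail beyond x = 100 is negligible for lam = 4
pois = [(x, poisson_pmf(x, lam)) for x in range(100)]
pois_mean, pois_var = mean_and_variance(pois)

print(binom_mean, n * p)            # both should be close to np
print(binom_var, n * p * (1 - p))   # both should be close to npq
print(pois_mean, pois_var, lam)     # mean and variance should both be close to lam
```

Summing over the support keeps the check independent of any statistics library, at the cost of truncating the Poisson tail.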

Principal Component Analysis

Projects data onto orthogonal vectors that maximize variance. Remember, given an $n \times n$ matrix $A$, a nonzero vector $\vec{x}$, and a scalar $\lambda$, if $A\vec{x} = \lambda\vec{x}$ then $\vec{x}$ and $\lambda$ are an eigenvector and eigenvalue of $A$. In PCA, the eigenvectors are uncorrelated and represent principal components.

1. Start with the covariance matrix of ...
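The eigenvector relation above can be illustrated with a short NumPy sketch. The preview cuts off after the first step, so everything beyond "start with the covariance matrix" is an assumption here: it follows one common formulation of PCA (center the data, eigendecompose the covariance matrix, project onto the top eigenvectors), which may differ from the cheatsheet's own steps. The function name `pca` and the synthetic data are hypothetical.

```python
import numpy as np


def pca(data, n_components):
    """Project the rows of `data` onto its top principal components.

    Assumed formulation: eigendecomposition of the sample covariance matrix.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Reorder so the direction with the largest variance comes first
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    projected = centered @ eigenvectors[:, :n_components]
    return projected, eigenvalues, eigenvectors


# Synthetic 2-D data with strongly correlated columns (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=200)])

projected, eigenvalues, eigenvectors = pca(data, n_components=1)

# Check A x = lambda x for the first principal component
cov = np.cov(data, rowvar=False)
lhs = cov @ eigenvectors[:, 0]
rhs = eigenvalues[0] * eigenvectors[:, 0]
print(np.allclose(lhs, rhs))
print(eigenvalues / eigenvalues.sum())  # share of variance per component
```

Because the covariance matrix is symmetric, `np.linalg.eigh` returns orthonormal eigenvectors, which is what makes the projected components uncorrelated.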
