Probability Cheatsheet v2.0

Probability Cheatsheet by William Chen ( ) and Joe Blitzstein, with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based on Joe Blitzstein's (@stat110) lectures ( ) and Blitzstein/Hwang's Introduction to Probability textbook ( ). Licensed under CC BY-NC-SA. Please share comments, suggestions, and errors. Updated September 4, 2015.

Counting

Multiplication Rule

[Figure: tree diagram of a compound experiment, a cake or waffle cone followed by one of S, V, C]

Let's say we have a compound experiment (an experiment with multiple components). If the 1st component has n_1 possible outcomes, the 2nd component has n_2 possible outcomes, ..., and the rth component has n_r possible outcomes, then overall there are n_1 n_2 \cdots n_r possibilities for the whole experiment.

Sampling Table

The sampling table gives the number of possible samples of size k out of a population of size n, under various assumptions about how the sample is collected.

                        Order Matters        Order Doesn't Matter
With Replacement        n^k                  \binom{n+k-1}{k}
Without Replacement     n!/(n-k)!            \binom{n}{k}
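As a quick sanity check on the four entries of the sampling table, each count can be computed with Python's standard library (a minimal sketch; n = 5 and k = 3 are arbitrary illustration values, not from the cheatsheet):

from math import comb, perm

n, k = 5, 3  # population size and sample size (arbitrary example values)

print(n ** k)              # 125: order matters, with replacement (n^k)
print(perm(n, k))          # 60:  order matters, without replacement (n!/(n-k)!)
print(comb(n, k))          # 10:  order doesn't matter, without replacement (n choose k)
print(comb(n + k - 1, k))  # 35:  order doesn't matter, with replacement (n+k-1 choose k)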

Naive Definition of Probability

If all outcomes are equally likely, the probability of an event A happening is:
P_{naive}(A) = (number of outcomes favorable to A) / (number of outcomes)

Thinking Conditionally

Independence

Independent Events: A and B are independent if knowing whether A occurred gives no information about whether B occurred. More formally, A and B (which have nonzero probability) are independent if and only if one of the following equivalent statements holds:
P(A \cap B) = P(A)P(B)
P(A|B) = P(A)
P(B|A) = P(B)

Conditional Independence: A and B are conditionally independent given C if P(A \cap B|C) = P(A|C)P(B|C). Conditional independence does not imply independence, and independence does not imply conditional independence.

Unions, Intersections, and Complements

De Morgan's Laws: A useful identity that can make calculating probabilities of unions easier by relating them to intersections, and vice versa. Analogous results hold with more than two sets.
(A \cup B)^c = A^c \cap B^c
(A \cap B)^c = A^c \cup B^c

Joint, Marginal, and Conditional

Joint Probability: P(A \cap B) or P(A, B), the probability of A and B.
Marginal (Unconditional) Probability: P(A), the probability of A.
Conditional Probability: P(A|B) = P(A, B)/P(B), the probability of A given that B occurred.
Conditional Probability is Probability: P(A|B) is a probability function for any fixed B.
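These definitions can be checked mechanically by enumerating a finite, equally likely sample space. A minimal sketch, with the events "first die shows 6" and "the sum is 7" for two fair dice chosen as an assumed illustration:

from itertools import product
from fractions import Fraction

# Equally likely sample space: ordered outcomes of two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    # Naive definition: (outcomes favorable to the event) / (total outcomes).
    return Fraction(sum(event(w) for w in omega), len(omega))

A = lambda w: w[0] == 6          # first die shows 6
B = lambda w: w[0] + w[1] == 7   # the sum is 7

print(prob(A), prob(B))                         # 1/6 1/6
print(prob(lambda w: A(w) and B(w)))            # 1/36 = P(A)P(B), so A and B are independent
print(prob(lambda w: A(w) and B(w)) / prob(B))  # 1/6 = P(A|B) = P(A)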

Any theorem that holds for probability also holds for conditional probability.

Probability of an Intersection or Union

Intersections via Conditioning:
P(A, B) = P(A)P(B|A)
P(A, B, C) = P(A)P(B|A)P(C|A, B)

Unions via Inclusion-Exclusion:
P(A \cup B) = P(A) + P(B) - P(A \cap B)
P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)

Simpson's Paradox

[Figure: Dr. Hibbert vs. Dr. Nick, heart surgery vs. band-aid removal]

It is possible to have
P(A|B, C) < P(A|B^c, C) and P(A|B, C^c) < P(A|B^c, C^c)
yet also P(A|B) > P(A|B^c).

Law of Total Probability (LOTP)

Let B_1, B_2, B_3, ..., B_n be a partition of the sample space (i.e., they are disjoint and their union is the entire sample space).
P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ... + P(A|B_n)P(B_n)
P(A) = P(A \cap B_1) + P(A \cap B_2) + ... + P(A \cap B_n)

For LOTP with extra conditioning, just add in another event C!
P(A|C) = P(A|B_1, C)P(B_1|C) + ... + P(A|B_n, C)P(B_n|C)
P(A|C) = P(A \cap B_1|C) + P(A \cap B_2|C) + ... + P(A \cap B_n|C)

Special case of LOTP with B and B^c as partition:
P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
P(A) = P(A \cap B) + P(A \cap B^c)
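A quick numerical illustration of the B, B^c special case (the probabilities below are made-up illustration values, not from the cheatsheet):

# Assumed illustration: B = "it rains", A = "commute is late".
p_B, p_A_given_B, p_A_given_Bc = 0.3, 0.8, 0.2

# Special case of LOTP: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
p_A = p_A_given_B * p_B + p_A_given_Bc * (1 - p_B)
print(p_A)  # 0.38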

Bayes' Rule

Bayes' Rule, and with extra conditioning (just add in C!):
P(A|B) = P(B|A)P(A) / P(B)
P(A|B, C) = P(B|A, C)P(A|C) / P(B|C)

We can also write
P(A|B, C) = P(A, B, C) / P(B, C) = P(B, C|A)P(A) / P(B, C)

Odds Form of Bayes' Rule
P(A|B) / P(A^c|B) = [P(B|A) / P(B|A^c)] [P(A) / P(A^c)]
The posterior odds of A are the likelihood ratio times the prior odds.

Random Variables and their Distributions

PMF, CDF, and Independence

Probability Mass Function (PMF): Gives the probability that a discrete random variable takes on the value x.
p_X(x) = P(X = x)
The PMF satisfies p_X(x) \geq 0 and \sum_x p_X(x) = 1.

Cumulative Distribution Function (CDF): Gives the probability that a random variable is less than or equal to x.
F_X(x) = P(X \leq x)
The CDF is an increasing, right-continuous function with F_X(x) \to 0 as x \to -\infty and F_X(x) \to 1 as x \to \infty.

Independence: Intuitively, two random variables are independent if knowing the value of one gives no information about the other. Discrete random variables X and Y are independent if, for all values of x and y,
P(X = x, Y = y) = P(X = x)P(Y = y)
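A standard way to exercise both forms of Bayes' Rule is a diagnostic-test calculation; the prevalence, sensitivity, and false-positive rate below are assumed illustration values:

from fractions import Fraction

# Assumed illustration: A = "has the condition", B = "test is positive".
p_A = Fraction(1, 100)            # prior P(A)
p_B_given_A = Fraction(95, 100)   # P(B|A)
p_B_given_Ac = Fraction(5, 100)   # P(B|A^c)

# LOTP for the denominator, then Bayes' Rule.
p_B = p_B_given_A * p_A + p_B_given_Ac * (1 - p_A)
print(p_B_given_A * p_A / p_B)    # 19/118, about 0.16

# Odds form: posterior odds = likelihood ratio * prior odds.
posterior_odds = (p_B_given_A / p_B_given_Ac) * (p_A / (1 - p_A))
print(posterior_odds / (1 + posterior_odds))   # 19/118 again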

Expected Value and Indicators

Expected Value and Linearity

Expected Value (a.k.a. mean, expectation, or average) is a weighted average of the possible outcomes of our random variable. Mathematically, if x_1, x_2, x_3, ... are all of the distinct possible values that X can take, the expected value of X is
E(X) = \sum_i x_i P(X = x_i)

[Figure: a table of paired outcomes x_i, y_i, and x_i + y_i, illustrating that E(X) = (1/n) \sum_{i=1}^{n} x_i, E(Y) = (1/n) \sum_{i=1}^{n} y_i, and E(X + Y) = (1/n) \sum_{i=1}^{n} (x_i + y_i), so E(X + Y) = E(X) + E(Y)]

Linearity: For any random variables X and Y, and constants a, b, c,
E(aX + bY + c) = aE(X) + bE(Y) + c

Same distribution implies same mean: If X and Y have the same distribution, then E(X) = E(Y) and, more generally, E(g(X)) = E(g(Y)).

Conditional Expected Value is defined like expectation, only conditioned on any event A.
E(X|A) = \sum_x x P(X = x|A)

Indicator Random Variables

Indicator Random Variable is a random variable that takes on the value 1 or 0. It is always an indicator of some event: if the event occurs, the indicator is 1; otherwise it is 0. They are useful for many problems about counting how many events of some kind occur. Write
I_A = 1 if A occurs, 0 if A does not occur.
Note that I_A^2 = I_A, I_A I_B = I_{A \cap B}, and I_{A \cup B} = I_A + I_B - I_A I_B.

Distribution: I_A \sim Bern(p) where p = P(A).

Fundamental Bridge: The expectation of the indicator for event A is the probability of event A: E(I_A) = P(A).
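Linearity plus the fundamental bridge makes many counting expectations immediate: the expected number of events that occur is the sum of their probabilities. A small sketch using the classic matching (fixed-point) problem, chosen here as an assumed illustration:

from itertools import permutations
from fractions import Fraction

n = 4  # small enough to enumerate every permutation exactly

# X = number of fixed points of a uniformly random permutation.
# Writing X = I_1 + ... + I_n, where I_j indicates "position j is fixed",
# linearity and the fundamental bridge give E(X) = n * P(position fixed) = n * (1/n) = 1.
perms = list(permutations(range(n)))
average_fixed_points = Fraction(sum(sum(p[j] == j for j in range(n)) for p in perms), len(perms))
print(average_fixed_points)  # 1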

Variance and Standard Deviation

Var(X) = E(X - E(X))^2 = E(X^2) - (E(X))^2
SD(X) = \sqrt{Var(X)}

Continuous RVs, LOTUS, UoU

Continuous Random Variables (CRVs)

What's the probability that a CRV is in an interval? Take the difference in CDF values (or use the PDF as described later).
P(a \leq X \leq b) = P(X \leq b) - P(X \leq a) = F_X(b) - F_X(a)
For X \sim N(\mu, \sigma^2), this becomes
P(a \leq X \leq b) = \Phi((b - \mu)/\sigma) - \Phi((a - \mu)/\sigma)

What is the Probability Density Function (PDF)? The PDF f is the derivative of the CDF F.
F'(x) = f(x)
A PDF is nonnegative and integrates to 1. By the fundamental theorem of calculus, to get from PDF back to CDF we can integrate:
F(x) = \int_{-\infty}^{x} f(t) dt
To find the probability that a CRV takes on a value in an interval, integrate the PDF over that interval.
F(b) - F(a) = \int_{a}^{b} f(x) dx

How do I find the expected value of a CRV? Analogous to the discrete case, where you sum x times the PMF, for CRVs you integrate x times the PDF.
E(X) = \int_{-\infty}^{\infty} x f(x) dx

LOTUS

Expected value of a function of a random variable: The expected value of X is defined this way:
E(X) = \sum_x x P(X = x) (for discrete X)
E(X) = \int_{-\infty}^{\infty} x f(x) dx (for continuous X)
The Law of the Unconscious Statistician (LOTUS) states that you can find the expected value of a function of a random variable, g(X), in a similar way, by replacing the x in front of the PMF/PDF by g(x) but still working with the PMF/PDF of X:
E(g(X)) = \sum_x g(x) P(X = x) (for discrete X)
E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x) dx (for continuous X)
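One way to make LOTUS concrete is to evaluate both integrals numerically for a simple continuous distribution. A sketch assuming SciPy is available; the choices X \sim Expo(1) and g(x) = x^2 are illustration assumptions:

import numpy as np
from scipy.integrate import quad

# X ~ Expo(1) has PDF f(x) = e^{-x} for x > 0.
f = lambda x: np.exp(-x)

print(quad(lambda x: x * f(x), 0, np.inf)[0])      # E(X) = 1.0
print(quad(lambda x: x**2 * f(x), 0, np.inf)[0])   # E(X^2) = 2.0 by LOTUS
# Note that g(E(X)) = 1^2 = 1, which differs from E(g(X)) = 2.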

What's a function of a random variable? A function of a random variable is also a random variable. For example, if X is the number of bikes you see in an hour, then g(X) = 2X is the number of bike wheels you see in that hour and h(X) = \binom{X}{2} = X(X - 1)/2 is the number of pairs of bikes such that you see both of those bikes in that hour.

What's the point? You don't need to know the PMF/PDF of g(X) to find its expected value. All you need is the PMF/PDF of X.

Universality of Uniform (UoU)

When you plug any CRV into its own CDF, you get a Uniform(0,1) random variable. When you plug a Uniform(0,1) random variable into an inverse CDF, you get a random variable with that CDF. For example, let's say that a random variable X has CDF
F(x) = 1 - e^{-x}, for x > 0
By UoU, if we plug X into this function then we get a uniformly distributed random variable.
F(X) = 1 - e^{-X} \sim Unif(0,1)
Similarly, if U \sim Unif(0,1) then F^{-1}(U) has CDF F. The key point is that for any continuous random variable X, we can transform it into a Uniform random variable and back by using its CDF.
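The Expo(1) CDF in the example above lends itself to a quick simulation of both directions of UoU (a NumPy sketch; the seed and sample size are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)

# Inverse-CDF direction: F(x) = 1 - e^{-x} gives F^{-1}(u) = -log(1 - u),
# so x should behave like an Expo(1) sample (mean 1, variance 1).
x = -np.log(1 - u)
print(x.mean(), x.var())   # both approximately 1

# Plugging X back into its own CDF recovers a Uniform(0,1) sample:
# mean approximately 1/2, variance approximately 1/12.
v = 1 - np.exp(-x)
print(v.mean(), v.var())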

Moments and MGFs

Moments

Moments describe the shape of a distribution. Let X have mean \mu and standard deviation \sigma, and Z = (X - \mu)/\sigma be the standardized version of X. The kth moment of X is \mu_k = E(X^k) and the kth standardized moment of X is m_k = E(Z^k). The mean, variance, skewness, and kurtosis are important summaries of the shape of a distribution.

Mean: E(X) = \mu_1
Variance: Var(X) = \mu_2 - \mu_1^2
Skewness: Skew(X) = m_3
Kurtosis: Kurt(X) = m_4 - 3

Moment Generating Functions

MGF: For any random variable X, the function
M_X(t) = E(e^{tX})
is the moment generating function (MGF) of X, if it exists for all t in some open interval containing 0. The variable t could just as well have been called u or v. It's a bookkeeping device that lets us work with the function M_X rather than the sequence of moments.

Why is it called the Moment Generating Function? Because the kth derivative of the moment generating function, evaluated at 0, is the kth moment of X.
\mu_k = E(X^k) = M_X^{(k)}(0)
This is true by Taylor expansion of e^{tX}, since
M_X(t) = E(e^{tX}) = \sum_{k=0}^{\infty} E(X^k) t^k / k! = \sum_{k=0}^{\infty} \mu_k t^k / k!

MGF of linear functions: If we have Y = aX + b, then
M_Y(t) = E(e^{t(aX + b)}) = e^{bt} E(e^{(at)X}) = e^{bt} M_X(at)
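To see the "generating" in action, one can differentiate a known MGF symbolically and read off the moments. A sketch assuming SymPy is available; the standard Normal MGF M(t) = e^{t^2/2} is a standard closed form used here only as an example:

import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)   # MGF of the standard Normal

# mu_k = M^(k)(0): the kth derivative of the MGF evaluated at 0.
moments = [sp.diff(M, t, k).subs(t, 0) for k in range(1, 5)]
print(moments)   # [0, 1, 0, 3]: mean 0, Var = 1 - 0^2 = 1, Skew = 0, Kurt = 3 - 3 = 0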

Uniqueness: If it exists, the MGF uniquely determines the distribution. This means that for any two random variables X and Y, they are distributed the same (their PMFs/PDFs are equal) if and only if their MGFs are equal.

Summing Independent RVs by Multiplying MGFs: If X and Y are independent, then
M_{X+Y}(t) = E(e^{t(X+Y)}) = E(e^{tX}) E(e^{tY}) = M_X(t) \cdot M_Y(t)
The MGF of the sum of two random variables is the product of the MGFs of those two random variables.

Joint PDFs and CDFs

Joint Distributions

The joint CDF of X and Y is
F(x, y) = P(X \leq x, Y \leq y)
In the discrete case, X and Y have a joint PMF
p_{X,Y}(x, y) = P(X = x, Y = y).
In the continuous case, they have a joint PDF
f_{X,Y}(x, y) = \frac{\partial^2}{\partial x \partial y} F_{X,Y}(x, y).
The joint PMF/PDF must be nonnegative and sum/integrate to 1.

Conditional Distributions

Conditioning and Bayes' rule for discrete random variables:
P(Y = y|X = x) = P(X = x, Y = y) / P(X = x) = P(X = x|Y = y) P(Y = y) / P(X = x)
Conditioning and Bayes' rule for continuous random variables:
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x) = f_{X|Y}(x|y) f_Y(y) / f_X(x)
Hybrid Bayes' rule:
f_X(x|A) = P(A|X = x) f_X(x) / P(A)

Marginal Distributions

To find the distribution of one (or more) random variables from a joint PMF/PDF, sum/integrate over the unwanted random variables.
Marginal PMF from joint PMF:
P(X = x) = \sum_y P(X = x, Y = y)
Marginal PDF from joint PDF:
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) dy
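A tiny discrete example makes the marginalization and conditioning recipes mechanical; the joint PMF below (a fair die X together with its parity Y) is an assumed illustration:

from collections import defaultdict
from fractions import Fraction

# Assumed illustration: X = result of a fair die, Y = 1 if X is even, else 0.
joint = {(x, int(x % 2 == 0)): Fraction(1, 6) for x in range(1, 7)}

# Marginal PMF of Y: sum the joint PMF over the unwanted variable X.
marginal_Y = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_Y[y] += p
print(dict(marginal_Y))   # P(Y = 0) = P(Y = 1) = 1/2

# Conditional PMF of X given Y = 1: P(X = x, Y = 1) / P(Y = 1).
cond_X = {x: p / marginal_Y[1] for (x, y), p in joint.items() if y == 1}
print(cond_X)             # 1/3 each for x in {2, 4, 6}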

Independence of Random Variables

Random variables X and Y are independent if and only if any of the following conditions holds:
- Joint CDF is the product of the marginal CDFs
- Joint PMF/PDF is the product of the marginal PMFs/PDFs
- Conditional distribution of Y given X is the marginal distribution of Y
Write X \perp Y to denote that X and Y are independent.

Multivariate LOTUS

LOTUS in more than one dimension is analogous to the 1D LOTUS. For discrete random variables:
E(g(X, Y)) = \sum_x \sum_y g(x, y) P(X = x, Y = y)
For continuous random variables:
E(g(X, Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X,Y}(x, y) dx dy

Covariance and Transformations

Covariance and Correlation

Covariance is the analog of variance for two random variables.
Cov(X, Y) = E((X - E(X))(Y - E(Y))) = E(XY) - E(X)E(Y)
Note that
Cov(X, X) = E(X^2) - (E(X))^2 = Var(X)
Correlation is a standardized version of covariance that is always between -1 and 1.
Corr(X, Y) = Cov(X, Y) / \sqrt{Var(X) Var(Y)}

Covariance and Independence: If two random variables are independent, then they are uncorrelated. The converse is not necessarily true (e.g., consider X \sim N(0, 1) and Y = X^2; a simulation check appears below).
X \perp Y \Rightarrow Cov(X, Y) = 0 \Rightarrow E(XY) = E(X)E(Y)

Covariance and Variance: The variance of a sum can be found by
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
Var(X_1 + X_2 + ... + X_n) = \sum_{i=1}^{n} Var(X_i) + 2 \sum_{i<j} Cov(X_i, X_j)
If X and Y are independent then they have covariance 0, so
X \perp Y \Rightarrow Var(X + Y) = Var(X) + Var(Y)
If X_1, X_2, ...
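The X \sim N(0, 1), Y = X^2 example of dependence without correlation, along with the variance-of-a-sum identity, is easy to check by simulation (a NumPy sketch; seed and sample size are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
y = x**2   # Y is a deterministic function of X, so they are clearly dependent

# Sample covariance and correlation are both near 0: uncorrelated but not independent.
cov = np.mean(x * y) - np.mean(x) * np.mean(y)
corr = cov / (x.std() * y.std())
print(cov, corr)

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) also holds for the sample moments.
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov)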

