
Probability Cheatsheet v2.0



Probability Cheatsheet by William Chen and Joe Blitzstein, with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based on Joe Blitzstein's (@stat110) lectures and the Blitzstein/Hwang Introduction to Probability textbook. Licensed under CC BY-NC-SA. Please share comments, suggestions, and errors. Updated September 4, 2015.

Counting

Multiplication Rule
[Figure: a tree diagram pairing cone choices (cake, waffle) with ice cream flavors S, V, C.]
Let's say we have a compound experiment (an experiment with multiple components). If the 1st component has n_1 possible outcomes, the 2nd component has n_2 possible outcomes, ..., and the rth component has n_r possible outcomes, then overall there are n_1 · n_2 · ... · n_r possibilities for the whole experiment.

Sampling Table
[Figure: a jar of numbered balls illustrating sampling.]
The sampling table gives the number of possible samples of size k out of a population of size n, under various assumptions about how the sample is collected.

                        Order Matters      Order Doesn't Matter
With Replacement        n^k                (n+k−1 choose k)
Without Replacement     n!/(n−k)!          (n choose k)
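As a quick sanity check (not part of the original sheet), the four entries of the sampling table correspond to four itertools constructions. This minimal Python sketch, with hypothetical values n = 4 and k = 2, enumerates each case and compares the count against the formula.

from itertools import product, permutations, combinations, combinations_with_replacement
from math import comb, factorial

n, k = 4, 2                    # hypothetical population size and sample size
pop = range(n)

# Order matters, with replacement: n^k
assert len(list(product(pop, repeat=k))) == n**k
# Order matters, without replacement: n!/(n-k)!
assert len(list(permutations(pop, k))) == factorial(n) // factorial(n - k)
# Order doesn't matter, without replacement: (n choose k)
assert len(list(combinations(pop, k))) == comb(n, k)
# Order doesn't matter, with replacement: (n+k-1 choose k)
assert len(list(combinations_with_replacement(pop, k))) == comb(n + k - 1, k)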

Naive Definition of Probability
If all outcomes are equally likely, the probability of an event A happening is:
P_naive(A) = (number of outcomes favorable to A) / (total number of outcomes)

Thinking Conditionally

Independence
Independent Events: A and B are independent if knowing whether A occurred gives no information about whether B occurred. More formally, A and B (which have nonzero probability) are independent if and only if one of the following equivalent statements holds:
P(A ∩ B) = P(A)P(B)
P(A|B) = P(A)
P(B|A) = P(B)
Conditional Independence: A and B are conditionally independent given C if P(A ∩ B|C) = P(A|C)P(B|C). Conditional independence does not imply independence, and independence does not imply conditional independence.

Unions, Intersections, and Complements
De Morgan's Laws: A useful identity that can make calculating probabilities of unions easier by relating them to intersections, and vice versa. Analogous results hold with more than two sets.
(A ∪ B)^c = A^c ∩ B^c
(A ∩ B)^c = A^c ∪ B^c
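To make the naive definition and the independence criterion concrete, here is a small illustrative computation of my own (not from the sheet): for one fair die roll, the events A = "roll is even" and B = "roll is at most 4" satisfy P(A ∩ B) = P(A)P(B), so they are independent.

from fractions import Fraction

outcomes = set(range(1, 7))                  # one fair six-sided die
A = {x for x in outcomes if x % 2 == 0}      # event A: roll is even
B = {x for x in outcomes if x <= 4}          # event B: roll is at most 4

def P(event):
    # naive definition: favorable outcomes over total outcomes
    return Fraction(len(event), len(outcomes))

# Independence: P(A ∩ B) == P(A) P(B)
print(P(A & B), P(A) * P(B))    # both are 1/3, so A and B are independent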

Joint, Marginal, and Conditional
Joint Probability: P(A ∩ B) or P(A, B) — probability of A and B.
Marginal (Unconditional) Probability: P(A) — probability of A.
Conditional Probability: P(A|B) = P(A, B)/P(B) — probability of A, given B.
Conditional Probability is Probability: P(A|B) is a probability function for any fixed B. Any theorem that holds for probability also holds for conditional probability.

Probability of an Intersection or Union
Intersections via Conditioning:
P(A, B) = P(A)P(B|A)
P(A, B, C) = P(A)P(B|A)P(C|A, B)
Unions via Inclusion-Exclusion:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)

Simpson's Paradox
[Figure: Dr. Hibbert vs. Dr. Nick performing heart surgeries and band-aid removals.]
It is possible to have
P(A|B, C) < P(A|B^c, C) and P(A|B, C^c) < P(A|B^c, C^c)
yet also P(A|B) > P(A|B^c).
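Two-event inclusion-exclusion is easy to verify under the naive definition. This sketch (a made-up example, with Python's | and & denoting set union and intersection) reuses the die events from above.

from fractions import Fraction

outcomes = set(range(1, 7))
A = {2, 4, 6}                  # event A: even roll
B = {1, 2, 3, 4}               # event B: roll at most 4

def P(event):
    return Fraction(len(event), len(outcomes))

lhs = P(A | B)                       # P(A ∪ B) computed directly
rhs = P(A) + P(B) - P(A & B)         # inclusion-exclusion
assert lhs == rhs                    # both equal 5/6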

Law of Total Probability (LOTP)
Let B_1, B_2, B_3, ..., B_n be a partition of the sample space (i.e., they are disjoint and their union is the entire sample space).
P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ... + P(A|B_n)P(B_n)
P(A) = P(A ∩ B_1) + P(A ∩ B_2) + ... + P(A ∩ B_n)
For LOTP with extra conditioning, just add in another event C!
P(A|C) = P(A|B_1, C)P(B_1|C) + ... + P(A|B_n, C)P(B_n|C)
P(A|C) = P(A ∩ B_1|C) + P(A ∩ B_2|C) + ... + P(A ∩ B_n|C)
Special case of LOTP with B and B^c as partition:
P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
P(A) = P(A ∩ B) + P(A ∩ B^c)

Bayes' Rule
Bayes' Rule, and with extra conditioning (just add in C!):
P(A|B) = P(B|A)P(A) / P(B)
P(A|B, C) = P(B|A, C)P(A|C) / P(B|C)
We can also write
P(A|B, C) = P(A, B, C)/P(B, C) = P(B, C|A)P(A) / P(B, C)
Odds Form of Bayes' Rule:
P(A|B)/P(A^c|B) = [P(B|A)/P(B|A^c)] · [P(A)/P(A^c)]
The posterior odds of A are the likelihood ratio times the prior odds.

Random Variables and their Distributions

PMF, CDF, and Independence
Probability Mass Function (PMF): Gives the probability that a discrete random variable takes on the value x.
p_X(x) = P(X = x)
The PMF satisfies p_X(x) ≥ 0 and Σ_x p_X(x) = 1.
Cumulative Distribution Function (CDF): Gives the probability that a random variable is less than or equal to x.
F_X(x) = P(X ≤ x)
The CDF is an increasing, right-continuous function with F_X(x) → 0 as x → −∞ and F_X(x) → 1 as x → ∞.
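The B/B^c special case of LOTP and Bayes' Rule above combine in the classic testing setup. This sketch uses made-up numbers (1% prevalence, 95% sensitivity, 10% false-positive rate) that are not from the sheet.

from fractions import Fraction

p_B      = Fraction(1, 100)    # P(B): prior probability of the condition
p_pos_B  = Fraction(95, 100)   # P(A|B): test positive given condition
p_pos_Bc = Fraction(10, 100)   # P(A|B^c): false-positive rate

# LOTP with B and B^c as the partition: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
p_pos = p_pos_B * p_B + p_pos_Bc * (1 - p_B)

# Bayes' Rule: P(B|A) = P(A|B)P(B) / P(A)
p_B_pos = p_pos_B * p_B / p_pos
print(p_B_pos)     # 19/217, roughly 0.088 despite the accurate-looking test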

Independence
Intuitively, two random variables are independent if knowing the value of one gives no information about the other. Discrete random variables X and Y are independent if for all values of x and y
P(X = x, Y = y) = P(X = x)P(Y = y)

Expected Value and Indicators

Expected Value and Linearity
Expected Value (a.k.a. expectation, or average) is a weighted average of the possible outcomes of our random variable. Mathematically, if x_1, x_2, x_3, ... are all of the distinct possible values that X can take, the expected value of X is
E(X) = Σ_i x_i P(X = x_i)
[Table: paired samples of X, Y, and X + Y, illustrating that (1/n)Σ_{i=1}^n x_i + (1/n)Σ_{i=1}^n y_i = (1/n)Σ_{i=1}^n (x_i + y_i), i.e., E(X) + E(Y) = E(X + Y).]
Linearity: For any random variables X and Y, and constants a, b, c,
E(aX + bY + c) = aE(X) + bE(Y) + c
Same distribution implies same mean: If X and Y have the same distribution, then E(X) = E(Y) and, more generally, E(g(X)) = E(g(Y)).
Conditional Expected Value is defined like expectation, only conditioned on any event A.
E(X|A) = Σ_x x P(X = x|A)
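Linearity holds even when X and Y are dependent, which a simulation makes vivid. This is my own illustration (not from the sheet): X is a die roll and Y = 6 − X, so the two are perfectly dependent, yet E(X + Y) = E(X) + E(Y) still holds.

import random

random.seed(0)
n = 100_000
xs = [random.randint(1, 6) for _ in range(n)]   # X: fair die roll
ys = [6 - x for x in xs]                        # Y: completely dependent on X

mean = lambda v: sum(v) / len(v)
# E(X + Y) = E(X) + E(Y) regardless of dependence
print(mean([x + y for x, y in zip(xs, ys)]))    # exactly 6.0
print(mean(xs) + mean(ys))                      # also 6.0 (up to rounding)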

Indicator Random Variables
Indicator Random Variable is a random variable that takes on the value 1 or 0. It is always an indicator of some event: if the event occurs, the indicator is 1; otherwise it is 0. They are useful for many problems about counting how many events of some kind occur. Write
I_A = 1 if A occurs, 0 if A does not occur.
Note that I_A^2 = I_A, I_A I_B = I_{A∩B}, and I_{A∪B} = I_A + I_B − I_{A∩B}. Also, I_A ~ Bern(p) where p = P(A).
Fundamental Bridge: The expectation of the indicator for event A is the probability of event A: E(I_A) = P(A).

Variance and Standard Deviation
Var(X) = E(X − E(X))^2 = E(X^2) − (E(X))^2
SD(X) = √Var(X)
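Indicators plus linearity give expected counts without touching the full distribution. As an illustration of the technique (my example, not the sheet's): the expected number of fixed points of a random permutation of n items is Σ_j E(I_j) = Σ_j P(position j is fixed) = n · (1/n) = 1, which a simulation confirms.

import random

random.seed(0)
n, trials = 10, 100_000
total = 0
for _ in range(trials):
    perm = list(range(n))
    random.shuffle(perm)
    # sum of indicators: I_j = 1 if position j is a fixed point
    total += sum(1 for j in range(n) if perm[j] == j)

print(total / trials)   # close to 1, the exact answer by the fundamental bridge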

Continuous RVs, LOTUS, UoU

Continuous Random Variables (CRVs)
What's the probability that a CRV is in an interval? Take the difference in CDF values (or use the PDF as described later).
P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a) = F_X(b) − F_X(a)
For X ~ N(μ, σ^2), this becomes
P(a ≤ X ≤ b) = Φ((b − μ)/σ) − Φ((a − μ)/σ)
What is the Probability Density Function (PDF)? The PDF f is the derivative of the CDF F.
F'(x) = f(x)
A PDF is nonnegative and integrates to 1. By the fundamental theorem of calculus, to get from PDF back to CDF we can integrate:
F(x) = ∫_{−∞}^{x} f(t) dt
[Figure: side-by-side plots of a PDF and its corresponding CDF.]
To find the probability that a CRV takes on a value in an interval, integrate the PDF over that interval.
F(b) − F(a) = ∫_a^b f(x) dx
How do I find the expected value of a CRV? Analogous to the discrete case, where you sum x times the PMF, for CRVs you integrate x times the PDF.
E(X) = ∫_{−∞}^{∞} x f(x) dx

LOTUS
Expected value of a function of a random variable: The expected value of X is defined this way:
E(X) = Σ_x x P(X = x) (for discrete X)
E(X) = ∫_{−∞}^{∞} x f(x) dx (for continuous X)
The Law of the Unconscious Statistician (LOTUS) states that you can find the expected value of a function of a random variable, g(X), in a similar way, by replacing the x in front of the PMF/PDF by g(x) but still working with the PMF/PDF of X:
E(g(X)) = Σ_x g(x) P(X = x) (for discrete X)
E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx (for continuous X)
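LOTUS is easy to check numerically in the discrete case. This sketch (my own example) computes E(X^2) for a fair die directly from the PMF of X, never deriving the distribution of X^2, and notes that E(X^2) differs from (E(X))^2.

from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}     # PMF of a fair die

E_X  = sum(x * p for x, p in pmf.items())          # E(X) = 7/2
# LOTUS with g(x) = x^2: E(g(X)) = sum of g(x) * P(X = x)
E_X2 = sum(x**2 * p for x, p in pmf.items())       # E(X^2) = 91/6
print(E_X2, E_X**2)    # 91/6 vs 49/4; their difference is Var(X) = 35/12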

What's a function of a random variable? A function of a random variable is also a random variable. For example, if X is the number of bikes you see in an hour, then g(X) = 2X is the number of bike wheels you see in that hour and h(X) = (X choose 2) = X(X−1)/2 is the number of pairs of bikes such that you see both of those bikes in that hour.
What's the point? You don't need to know the PMF/PDF of g(X) to find its expected value. All you need is the PMF/PDF of X.

Universality of Uniform (UoU)
When you plug any CRV into its own CDF, you get a Uniform(0,1) random variable. When you plug a Uniform(0,1) into an inverse CDF, you get an r.v. with that CDF. For example, let's say that a random variable X has CDF
F(x) = 1 − e^{−x}, for x > 0
By UoU, if we plug X into this function then we get a uniformly distributed random variable:
F(X) = 1 − e^{−X} ~ Unif(0,1)
Similarly, if U ~ Unif(0,1) then F^{−1}(U) has CDF F. The key point is that for any continuous random variable X, we can transform it into a Uniform random variable and back by using its CDF.
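UoU underlies inverse-CDF sampling. This sketch uses the sheet's F(x) = 1 − e^{−x} example: draw U ~ Unif(0,1), apply F^{−1}(u) = −log(1 − u) to get Expo(1) draws, then plug them back into F to recover uniforms (the checks against the means 1 and 1/2 are my own additions).

import math, random

random.seed(0)
n = 100_000
# F(x) = 1 - e^{-x}  =>  F^{-1}(u) = -log(1 - u)
xs = [-math.log(1 - random.random()) for _ in range(n)]   # Expo(1) via UoU
print(sum(xs) / n)      # close to 1, the Expo(1) mean

# Plugging X back into its own CDF recovers Uniform(0,1):
us = [1 - math.exp(-x) for x in xs]
print(sum(us) / n)      # close to 1/2, the Unif(0,1) mean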

Moments and MGFs

Moments
Moments describe the shape of a distribution. Let X have mean μ and standard deviation σ, and Z = (X − μ)/σ be the standardized version of X. The kth moment of X is μ_k = E(X^k) and the kth standardized moment of X is m_k = E(Z^k). The mean, variance, skewness, and kurtosis are important summaries of the shape of a distribution.
Mean: E(X) = μ_1
Variance: Var(X) = μ_2 − μ_1^2
Skewness: Skew(X) = m_3
Kurtosis: Kurt(X) = m_4 − 3

Moment Generating Functions
MGF: For any random variable X, the function
M_X(t) = E(e^{tX})
is the moment generating function (MGF) of X, if it exists for all t in some open interval containing 0. The variable t could just as well have been called u or v. It's a bookkeeping device that lets us work with the function M_X rather than the sequence of moments.
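As a numeric illustration of the standardized moments (mine, not the sheet's): for Expo(1) the skewness is 2 and the excess kurtosis is 6, and a simulation estimates both straight from the definitions above.

import math, random

random.seed(0)
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]   # Expo(1) sample

mu = sum(xs) / n
sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
zs = [(x - mu) / sigma for x in xs]                # standardized values

m3 = sum(z**3 for z in zs) / n     # skewness estimate, true value 2
m4 = sum(z**4 for z in zs) / n     # m4 - 3 is excess kurtosis, true value 6
print(m3, m4 - 3)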

Why is it called the Moment Generating Function? Because the kth derivative of the moment generating function, evaluated at 0, is the kth moment of X.
μ_k = E(X^k) = M_X^{(k)}(0)
This is true by Taylor expansion of e^{tX}, since
M_X(t) = E(e^{tX}) = Σ_{k=0}^{∞} E(X^k) t^k / k! = Σ_{k=0}^{∞} μ_k t^k / k!
MGF of linear functions: If we have Y = aX + b, then
M_Y(t) = E(e^{t(aX+b)}) = e^{bt} E(e^{(at)X}) = e^{bt} M_X(at)
Uniqueness: If it exists, the MGF uniquely determines the distribution. This means that for any two random variables X and Y, they are distributed the same (their PMFs/PDFs are equal) if and only if their MGFs are equal.
Summing Independent RVs by Multiplying MGFs: If X and Y are independent, then
M_{X+Y}(t) = E(e^{t(X+Y)}) = E(e^{tX})E(e^{tY}) = M_X(t) · M_Y(t)
The MGF of the sum of two random variables is the product of the MGFs of those two random variables.

Joint PDFs and CDFs

Joint Distributions
The joint CDF of X and Y is
F(x, y) = P(X ≤ x, Y ≤ y)
In the discrete case, X and Y have a joint PMF
p_{X,Y}(x, y) = P(X = x, Y = y).
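To tie the joint definitions to code, here is a small sketch of my own: it tabulates the joint PMF of two independent fair dice, recovers a marginal PMF by summing the joint PMF over the other variable, and checks the independence factorization p_{X,Y}(x, y) = p_X(x) p_Y(y).

from fractions import Fraction

# Joint PMF of two independent fair dice
joint = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

# Marginal PMFs: sum the joint PMF over the other coordinate
p_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(1, 7)}
p_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(1, 7)}

# Independence: the joint PMF factors into the product of the marginals
assert all(joint[(x, y)] == p_X[x] * p_Y[y] for (x, y) in joint)
print(p_X[3])   # 1/6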

