Example: tourism industry

Probability inequalities

CHAPTER 15 Probability inequalitiesWe already used several types of inequalities , and in this Chapter we give a more systematicdescription of the inequalities and bounds used in Probability and Boole s inequality, Bonferroni inequalitiesBoole s inequality(or theunion bound) states that for any at most countable collection ofevents, the Probability that at least one of the events happens is no greater than the sum ofthe probabilities of the events in the (Boole s inequality)Suppose(S,F,P)is a Probability space, andE1,E2,.. Fare events. ThenP( i=1Ei)6 i=1P(Ei). only give a proof for a finite collection of events, and we mathematicalinduction on the number of then= 1we see thatP(E1)6P(E1).Suppose that for somenand any collection of eventsE1,..,Enwe haveP(n i=1Ei)6n i=1P(Ei).Recall that by ( ) for any eventsAandBwe haveP(A B) =P(A) +P(B) P(A B).We apply it toA= ni=1 EiandB=En+1and using the associativity of the union n+1i=1Ei=A B, we get thatP(n+1 i=1Ei)=P(n i=1Ei) +P(En+1) P((n i=1Ei) En+1).

Here we revisit Chebyshev's inequality Proposition 14.1 we used previously. This results shows that the di erence between a random variable and its expectation is controlled by its variance. Informally we can say that it shows how far the random variable is from its mean on average. Proposition 15.4 (Chebyshev's inequality)

Tags:

  Name, Inequalities, Probability, Expectations, Probability inequalities

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Advertisement

Transcription of Probability inequalities

1 CHAPTER 15 Probability inequalitiesWe already used several types of inequalities , and in this Chapter we give a more systematicdescription of the inequalities and bounds used in Probability and Boole s inequality, Bonferroni inequalitiesBoole s inequality(or theunion bound) states that for any at most countable collection ofevents, the Probability that at least one of the events happens is no greater than the sum ofthe probabilities of the events in the (Boole s inequality)Suppose(S,F,P)is a Probability space, andE1,E2,.. Fare events. ThenP( i=1Ei)6 i=1P(Ei). only give a proof for a finite collection of events, and we mathematicalinduction on the number of then= 1we see thatP(E1)6P(E1).Suppose that for somenand any collection of eventsE1,..,Enwe haveP(n i=1Ei)6n i=1P(Ei).Recall that by ( ) for any eventsAandBwe haveP(A B) =P(A) +P(B) P(A B).We apply it toA= ni=1 EiandB=En+1and using the associativity of the union n+1i=1Ei=A B, we get thatP(n+1 i=1Ei)=P(n i=1Ei) +P(En+1) P((n i=1Ei) En+1).

2 19519615. Probability INEQUALITIESBy the first axiom of probabilityP(n i=1Ai An+1)>0,and therefore we haveP(n+1 i=1Ei)6P(n i=1Ei)+P(En+1).Thus using theinduction hypothesiswe see thatP(n+1 i=1Ei)6n i=1P(Ei) +P(En+1) =n+1 i=1P(Ei). One of the interpretations of Boole s inequality is what is known as -sub-additivityinmeasure theory applied here to the Probability s inequality can be extended to get lower and upper bounds on Probability of unions ofevents known asBonferroni inequalities . As before suppose(S,F,P)is a Probability space,andE1,E2,..En Fare events. DefineS1:=n i=1P(Ei),S2:= 16i<j6nP(Ei Ej)Sk:= 16i1< <ik6nP(Ei1 Eik),k= 3,.., (Bonferroni inequalities )For oddkin1,..,nP(n i=1Ei)6k j=1( 1)j 1Sj,for evenkin2,..,nP(n i=1Ei)>k j=1( 1)j omit the proof which starts with considering the casek= 1for which we need to MARKOV S INEQUALITY197P(n i=1Ei)61 j=1( 1)j 1Sj=S1=n i=1P(Ei),which is Boole s inequality. Whenk= 2P(n i=1Ai)>2 j=1( 1)j 1Sj=S1 S1=n i=1P(Ei) 16i<j6nP(Ei Ej).

3 Which forn= 2is the inclusion-exclusion identity (Proposition ).Example we placendistinguishable balls intomdistinguishable boxes atrandom (n > m). LetEbe the event that a box is empty. The sample space can be describedas ={ = ( 1,.., n) : 16 i6m}withP( ) =1mn. DenoteEl:={ : i6=lfor alli= 1,..,n}forl= 1,2,..,m. Then,E=E1 .. Em 1sinceEmis empty, and we can include it or not, this does not changethe can see that for anymwe haveP(Ei1 .. Eik) =(m k)nkn=(1 km) we can use the inclusion-exclusion principle to getP(E) =m(1 1m)n (m2)(1 2m)n+..+ ( 1)m 2(mm 1)(1 m 1m)nThe last term is zero, since all boxes can not be empty. The expression is quite if we use Bonferroni inequalities we see thatm(1 1m)n (m2)(1 2m)n6P(E)6m(1 1m)nThis gives a good estimate whennis large compared tom. For example, ifm= 10then10 ( )n 45 ( )n6P(E)610 ( ) particular, forn= 50, then45 ( )50= , which is the difference betweenthe left and right sides of the estimates. This gives a rather good Markov s inequalityProposition (Markov s inequality)SupposeXis a nonnegative random variable, then for anya >0we haveP(X>a)6 EXa19815.

4 Probability only give the proof for a continuous random variable, the case of a discreterandom variable is similar. SupposeXis a positive continuous random variable, we canwriteEX= xfX(x)dxX>0= 0xfX(x)dxa>0> axfX(x)dxx>a> aafX(x)dx=a afX(x)dx=aP(X>a).ThereforeaP(X>a)6 EXwhich is what we wanted to prove. Example we observe that Boole s inequality can be interpreted as expectationsof the number of occurred events. Suppose(S,F,P)is a Probability space, andE1,E2,.. Fare events. DefineXi:= 1,ifEioccurs0, :=X1+..+Xnis the number of events that occur. ThenEX=P(E1) +..+P(En).Now we would like to prove Boole s inequality using Markov s inequality. Note thatXis anonnegative random variable, so we can apply Markov s inequality. Fora= 1we getP(X>1)6EX=P(E1) +..+P(En).Finally we see that the eventX>1means that at least one of the eventsE1,E2,..Enoccur,soP(X>1) =P( i=1Ei),thereforeP( i=1Ei)=P(X>1)6EX=P(E1) +..+P(En)which completes the Binom (n,p). We would like to use Markov s inequality tofind an upper bound onP(X>qn)forp < q < thatXis a nonnegative random variable andEX=np.

5 By Markov s inequality, wehaveP(X>qn)6 EXqn= CHEBYSHEV S Chebyshev s inequalityHere we revisit Chebyshev s inequality Proposition we used previously. This resultsshows that the difference between a random variable and its expectation is controlled by itsvariance. Informally we can say that it shows how far the random variable is from its meanon (Chebyshev s inequality)SupposeXis a random variable, then for anyb >0we haveP(|X EX|>b)6 Var (X) := (X EX)2, thenYis a nonnegative random variable and we canapply Markov s inequality (Proposition ) toY. Then forb >0we haveP(Y>b2) thatEY=E(X EX)2= Var (X),P(Y>b2)=P((X EX)2>b2)=P(|X EX|>b)which completes the proof. Example againX Binom (n,p). We now will use Chebyshev s inequalityto find an upper bound onP(X>qn)forp < q < thatEX=np. By Chebyshev s inequality withb= (q p)n >0we haveP(X>qn) =P(X np>(q p)n)6P(|X np|>(q p)n)6 Var (X)((q p)n)2=p(1 p)n((q p)n)2=p(1 p)(q p) Probability Chernoff boundsProposition (Chernoff bounds)SupposeXis a random variable and we denote bymX(t)its moment generatingfunction, then for anya RP(X>a)6mint>0e tamX(t),P(X6a)6mint<0e tamX(t).

6 (X>a) =P(etX>eta),t >0,P(X6a) =P(etX>eta),t < that note thatetXis a positive random variable for anyt R. Therefore we can applyMarkov s inequality (Proposition ) to see thatP(X>a) =P(etX>eta)6 EetXeta,t >0,P(X6a) =P(etX>eta)6 EetXeta,t < thatEetXis the moment generating functionmX(t), and so we haveP(X>a)6mX(t)eta,t >0,P(X6a)6mX(t)eta,t < the minimum over appropriatetwe get the result. Example againX Binom (n,p). We now will use Chernoff bounds forP(X>qn)forp < q <1. Recall that in Example we found the moment generatingfunction forXas followsmX(t) = (pet+ (1 p)) a Chernoff bound givesP(X>qn)6mint>0e tqn(pet+ (1 p)) find the minimum ofg(t) =e tqn(pet+ (1 p))nwe can take its derivative and usingthe only critical point of this function , we can see that the minimum on(0, )is achievedatt such CHERNOFF BOUNDS201et =q(1 p)(1 q)p,and sog(t ) =(q(1 p)(1 q)p) qn(pq(1 p)(1 q)p+ (1 p))n=(q(1 p)(1 q)p) qn(1 p1 q)n=(pq)qn(1 p1 q) qn(1 p1 q)n=(pq)qn(1 p1 q)(1 q) the Chernoff bound givesP(X>qn)6(pq)qn(1 p1 q)(1 q) (Comparison of Markov s, Chebyshev s inequalities and Chernoff bounds).

7 These three inequalities for the binomial random variableX Binom (n,p)giveMarkov s inequalityP(X>qn)6pq,Chebyshev s inequalityP(X>qn)6p(1 p)(q p)2n,Chernoff boundP(X>qn)6(pq)qn(1 p1 q)(1 q) the right-hand sides are very different: Markov s inequality gives a bound indepen-dent ofn, and the Chernoff bound is the strongest with exponential convergence to0asn . For example, forp= 1/2andq= 3/4we haveMarkov s inequalityP(X>3n4)623,Chebyshev s inequalityP(X>3n4)64n,Chernoff boundP(X>3n4)6(1627) example, forp= 1/3andq= 2/3we have20215. Probability INEQUALITIESM arkov s inequalityP(X>3n4)612,Chebyshev s inequalityP(X>3n4)62n,Chernoff boundP(X>3n4)62 Cauchy-Schwarz inequalityThis inequality appears in a number of areas of mathematics including linear algebra. Wewill apply it to give a different proof for the bound for correlation coefficients. Note that theCauchy-Schwarz inequality can be easily generalized to random (Cauchy-Schwarz inequality)SupposeXandYare two random variables, then(EXY)26EX2 EY2,and the equality holds if and only ifX=aYfor some constanta the random variableU:= (X sY)2which is a nonnegative randomvariable for anys R.

8 Then06EU=E(X sY)2=EX2 2sEXY+ (s) :=EX2 2sEXY+s2EY2which is a quadratic polynomial ins. What we knowis thatg(s)is nonnegative for alls. Completing the square we see thatg(s) =EY2s2 2 EXY s+EX2=( EY2s EXY EY2)2+EX2 (EXY)2EY2,sog(s)>0for allsif and only ifEX2 (EXY)2EY2>0,which is what we needed to deal with the last claim, observe that ifU >0with Probability one, theng(s) =EU > happens only ifEX2 (EXY)2EY2> ifEX2 (EXY)2EY2= 0, theng(EXYEY2)=EU= 0, which only can be true ifX EXYEY2Y= 0,that is,Xis a scalar multiple ofY. JENSEN S INEQUALITY203 Example can use the Cauchy-Schwartz inequality to prove one of the proper-ties of correlation coefficient in Proposition Namely, supposeXandYare randomvariables, then| (X,Y)|61. Moreover,| (X,Y)|= 1if and only if there are constantsa,b Rsuch thatX=a+ will use normalized random variables as before, namely,U:=X EX VarX,V:=Y EY 0,EU2=EV2= 1. We can use the Cauchy-Schwartz inequality forUandVto see that|EUV|6 EU2 EV2= 1and the identity holds if and only ifU=aVfor somea Equation ( ) (X,Y) =E(UV),which gives the bound we need.

9 Note that ifU=aV, thenX EX VarX=a(Y EY VarY),thereforeX=a VarX(Y EY VarY)+EX=a VarX VarYY a VarX VarYEY+EX,which completes the Jensen s inequalityRecall that a functiong:R Risconvexon[a,b]if for eachx,y [a,b]and each [0,1]we haveg( x+ (1 )y)6 g(x) + (1 )g(y).Note that for a convex functiongthis property holds for anyconvex linear combinationofpoints in[a,b], that is,g( 1x1+..+ nxn)6 1g(x1) +..+g( nxn), 1+..+ n= 1,06 1,.., n61,x1,..,xn [a,b].20415. Probability INEQUALITIESI fgis twice differentiable, then we have a simple test to see if a function is convex, namely,gis convex ifg (x)>0for allx [a,b]. Geometrically one can show that ifgis convex, thenif we draw a line segment between any two points on the graph of the function, the entiresegment lies above the graph ofg, as we show formally bellow. A functiongisconcaveif gis convex. Typical examples of convex functions areg(x) =x2andg(x) =ex. Examplesof concave functions areg(x) = x2andg(x) = logx. Convex and concave functions arealways functions lie above tangentsSupposea < c < bandg: [a,b] Rbe convex.

10 Then there existA,B Rsuchthatg(c) =Ac+Band for allx [a,b]we haveg(x)>Ax+ < c < y6bwe can writecas a convex combination ofxandy,namely,C= x+ (1 )ywith =y cy x [0,1]. Thereforeg(c)6 g(x)) + (1 )g(y)which implies thatg(c) g(x)c x6g(y) g(c)y we can takesupx<cg(c) g(x)c x6A6infy>cg(y) g(c)y c,so that we have for allx < yin[a,b]thatg(x)>A(x c) +g(c) =Ax+ (g(c) Ac). Proposition (Jensen s inequality)SupposeXis a random variable such thatP(a6X6b) = 1. Ifg:R Ris convexon[a,b], thenEg(X)>g(EX).Ifgis concave, thenEg(X)6g(EX). constant, then there is nothing to prove, so assumeXis not we havea <EX < :=EX. Theng(x)>AX+Bandg(EX) =AEX+Bfor someA,B R. Also note that|g(X)|6|A||X|+|B|6|Amax{|a|,|b|}|X|+ |B|, JENSEN S INEQUALITY205soE|g(X)|< and thereforeEg(X)is well defined. Now we can useAX+B6g(X)tosee thatg(EX) =AEX+B=E(AX+B)6Eg(X) Example (Arithmetic-geometric mean inequality).Supposea1, ..,anare positivenumbers, andXis a discrete random variable with the mass densityfX(ak) =1nfork= 1.