Transcription of Chapter 4 Multivariate Random Variables, Correlation, and ...
Chapter 4 Multivariate Random Variables, Correlation, and Error Propagation

Of course I would not propagate error for its own sake. To do so would be not merely wicked, but diabolical.
Thomas Babington Macaulay, speech to Parliament, April 14

Introduction

So far we have dealt with a single random variable X, which we can use to model collections of scalar, and similar, data, such as the distance between points or the times between magnetic reversals. What we mean by similar is that all the data are (we assume in the model) of the same kind of thing, or measurement, so that a single, scalar random variable can serve as the probabilistic model. In this chapter, we generalize this to pairs, triples, ..., m-tuples of random variables. We may use such multiple variables either to represent vector-valued, but still similar, quantities (such as velocity, the magnetic field, or angular motions between tectonic plates); or we may use them to model situations in which we have two or more different kinds of quantities that we wish to model. In particular, we want to have probability models that can include cases in which the data appear to depend on each other. Such a relationship occurs, for example, if one data type is the magnitude of an earthquake and the other is the rupture length of the fault that caused it. A plot of such data [1], shown in the accompanying figure, displays considerable scatter, and so is appropriate for probability modeling.
But we need a model that can express the observed fact that larger earthquake magnitudes correspond to longer rupture lengths. The plot displays a common, and important, aspect of data analysis, which is to transform the data in whatever way makes the relationship [...]. In this case we have taken the logarithm of the rupture length and, of course, in using earthquake magnitude, (approximately) the logarithm of the radiated energy.

This generalization to more than one random variable is called multivariate probability, though it might better be called [...]. We touched on the two-dimensional case in Chapters 2 and 3 as needed to discuss combinations of two independent random variables; in this chapter we extend and formalize our treatment. In particular, we describe the idea of correlation and covariance, and describe how multivariate probability is applied to the problem of propagating errors, though not in the sense of the quotation above.

Multivariate PDFs

Suppose we have an m-dimensional random variable $\vec{X}$, which has as components m scalar random variables: $\vec{X} = (X_1, X_2, \ldots, X_m)$.
We can easily generalize our definition of a univariate pdf to say that the random variable $\vec{X}$ is distributed according to a joint probability density function (which we will, as in the univariate case, just call a pdf):

$$\phi(x_1, x_2, \ldots, x_m) \stackrel{\mathrm{def}}{=} \phi(\vec{x})$$

which is now the derivative of a multivariate distribution function $\Phi$:

$$\phi(\vec{x}) = \frac{\partial^m \Phi(\vec{x})}{\partial x_1 \, \partial x_2 \cdots \partial x_m}$$

The distribution $\Phi$ is an integral of $\phi$; if the domain of $\phi$ is not all of m-dimensional space, this integral needs to be done with appropriate limits. If the domain is all of m-dimensional space, we can write the integral as:

$$\Phi(x_1, x_2, \ldots, x_m) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_m} \phi(\vec{x}) \, d^m \vec{x}$$

Given a region R in m-dimensional space (it does not have to be any particular shape), the probability of the random variable $\vec{X}$ falling inside R is just the integral of $\phi$ over R:

$$P(\vec{X} \in R) = \int_R \phi(\vec{x}) \, d^m \vec{x}$$

from which come the properties that $\phi$ must be everywhere nonnegative and that the integral of $\phi$ over the whole region of applicability must be 1.

[1] Wells, D. L., and K. J. Coppersmith (1994). New empirical relationships among magnitude, rupture length, rupture width, rupture area, and surface displacement, Bull. Seismol. Soc. Amer., 84, 974-1002.

2008 Probability 4-2

Of course, it is not easy to visualize functions in m-dimensional space if m is greater than three, or to plot them if m is greater than two. Our examples will therefore focus on the case m = 2, for which the pdf becomes a bivariate pdf. The accompanying figure shows what such a pdf might look like, plotting contours of equal values of $\phi$.
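The region integral above can be approximated on a grid. The following is a minimal sketch, not the pdf plotted in the chapter: we assume NumPy, and the pdf (a unit-variance bivariate normal), the function names, and the grid settings are all hypothetical choices made for illustration.

```python
import numpy as np

def bivariate_normal_pdf(x1, x2, rho=0.0):
    """Illustrative pdf (an assumption): unit-variance bivariate normal, correlation rho."""
    norm = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2))
    q = (x1**2 - 2.0 * rho * x1 * x2 + x2**2) / (1.0 - rho**2)
    return norm * np.exp(-0.5 * q)

def region_probability(pdf, in_region, half_width=8.0, n=801):
    """Approximate P(X in R) by summing pdf * cell area over grid points inside R."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    x1, x2 = np.meshgrid(x, x, indexing="ij")
    mask = in_region(x1, x2)          # boolean array: True where the point lies in R
    return float(np.sum(pdf(x1, x2)[mask]) * dx * dx)

def whole_plane(x1, x2):
    """Region covering the entire grid, used to check normalization."""
    return np.ones_like(x1, dtype=bool)

def unit_disk(x1, x2):
    """Region R: the disk of radius 1 centered on the origin."""
    return x1**2 + x2**2 <= 1.0

total = region_probability(bivariate_normal_pdf, whole_plane)   # should be close to 1
p_disk = region_probability(bivariate_normal_pdf, unit_disk)
print(total, p_disk)
```

For this uncorrelated example the disk probability can be compared with the closed form $1 - e^{-1/2}$; the region need not have any particular shape, so any other boolean test could be passed in place of `unit_disk`.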
(We have intentionally made this pdf a somewhat complicated one; the dots and dashed lines will be explained below.) It will be evident that the probability of (say) $X_2$ falling in a certain range is not unrelated to the probability of $X_1$ falling in a certain (perhaps different) range: for example, if $X_1$ is around zero, $X_2$ will tend to be; if $X_1$ is far from zero, $X_2$ will be [...]. We will see how to formalize this below. It is this ability to express relationships that makes multivariate probability such a useful tool.

Reducing the Dimension: Conditionals and Marginals

We can, from a multivariate pdf, find two kinds of other, lower-dimensional, pdfs. We start with examples for the bivariate case, for which (obviously) the only smaller number of dimensions than m is 1, so our reduction in dimension gives univariate pdfs.

We first consider the marginal pdf. For either variable this is the result of integrating the bivariate pdf over the other variable. So, for example, for $X_1$ the marginal pdf is the pdf for $X_1$ irrespective of the value of $X_2$. If $\Phi(x_1, x_2)$ is the bivariate cumulative distribution, then the marginal cumulative distribution for $X_1$ is given by $\Phi(x_1, \infty)$:
$$\Phi(x_1, \infty) = P(X_1 \le x_1, X_2 \le \infty) = P(X_1 \le x_1)$$

But what is probably easier to visualize is the marginal density function, which comes from integrating the bivariate density function over all values of (say) $x_2$, or, to put it another way, collapsing all the density onto one axis. The integral is then

$$\phi(x_1) = \int_{-\infty}^{\infty} \phi(x_1, x_2) \, dx_2$$

The accompanying figure shows the marginal pdfs for the bivariate pdf plotted earlier, with $\phi(x_1)$ retaining a little of the multimodal character evident in the bivariate case, though $\phi(x_2)$ does not.

The conditional pdf is something quite different. It has this name because it is, for random variables, the expression of conditional probability: it gives the probability (that is, the pdf) of (say) $X_2$ for known $x_1$; note that we write the conditioning variable in lowercase to show that it is a conventional variable, not a random one. The conditional pdf is found from $\phi(x_1, x_2)$ by computing

$$\phi_c(x_2) = \frac{\phi(x_1, x_2)}{\int_{-\infty}^{\infty} \phi(x_1, x_2) \, dx_2}$$

(We could write the conditional as $\phi_{X_2 | X_1 = x_1}(x_2)$, but while this is complete it is probably confusing.)
We see that $x_1$ is held fixed in the integral in the denominator. The conditional pdf $\phi_c$ is essentially a slice through the multivariate pdf with one variable held fixed, normalized by its own integral so that it will integrate to 1, as it must to be a pdf. Of course, we could equally well look at the conditional pdf of $X_1$ for $x_2$ held fixed, or indeed the pdf along some arbitrary slice (or even a curve) through the full bivariate pdf $\phi$.

The dashed lines in the contour plot of the bivariate pdf show two slices for $x_1$ and $x_2$ held fixed, and a separate figure shows the resulting conditional pdfs. (The dots will be explained in a later section.) Note that these conditional pdfs peak at much higher values than does the bivariate pdf. This illustrates the general fact that as the dimension of a random variable increases, its pdf tends to have smaller values, as it must in order to still integrate to 1 over the whole of the relevant space.

Another name for the marginal pdf is the unconditional pdf, in the sense that the marginal pdf for (say) $X_2$ describes the behavior of $X_2$ if we consider all possible values of $X_1$.
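Both reductions are easy to try numerically once a bivariate pdf is tabulated on a grid. The sketch below is only an illustration under stated assumptions: it uses NumPy, and the pdf (a correlated bivariate normal with rho = 0.6), the grid, and the slice location x1 = 1 are hypothetical choices, not the pdf shown in the chapter's figures.

```python
import numpy as np

def bivariate_normal_pdf(x1, x2, rho=0.6):
    """Illustrative pdf (an assumption): unit-variance bivariate normal, correlation rho."""
    norm = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2))
    q = (x1**2 - 2.0 * rho * x1 * x2 + x2**2) / (1.0 - rho**2)
    return norm * np.exp(-0.5 * q)

# Tabulate phi[i, j] = phi(x1_i, x2_j) on a grid wide enough to hold nearly all the density
x = np.linspace(-8.0, 8.0, 801)
dx = x[1] - x[0]
x1_grid, x2_grid = np.meshgrid(x, x, indexing="ij")
phi = bivariate_normal_pdf(x1_grid, x2_grid)

# Marginal pdf of X1: integrate the bivariate pdf over all x2 (axis 1)
marginal_x1 = phi.sum(axis=1) * dx

# Conditional pdf of X2 for x1 held fixed: take a slice, then normalize it
# by its own integral so that it integrates to 1
i_fixed = int(np.argmin(np.abs(x - 1.0)))      # grid index closest to x1 = 1
phi_slice = phi[i_fixed, :]
conditional_x2 = phi_slice / (phi_slice.sum() * dx)

# Mean of the conditional pdf, to show that it depends on the fixed x1
conditional_mean = float(np.sum(x * conditional_x2) * dx)

print(marginal_x1.sum() * dx, conditional_x2.sum() * dx)
print(phi.max(), conditional_x2.max(), conditional_mean)
```

In this example the conditional pdf peaks well above the bivariate pdf, in line with the remark above about pdf values shrinking as the dimension grows.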
We can generalize both of these dimension-reduction strategies to more dimensions than two. Starting with a multidimensional pdf $\phi(\vec{x})$, we may either hold the variable values fixed for k dimensions (k being less than m), to get a conditional pdf of dimension m − k; or we may integrate over k dimensions to get a marginal pdf of dimension m − k. For example, if we have a pdf in 3 dimensions we might:

- Integrate over one direction (it does not have to be along one of the axes) to get a bivariate marginal pdf.
- Integrate over two directions to get a univariate marginal pdf; for example, integrate over a plane, say over the $x_2$-$x_3$ plane, to get a function of $x_1$ only.
- Sample over a plane (again, it does not have to be along the axes) to get a bivariate conditional pdf.
- Sample along a line to get a univariate conditional pdf.

As we will see below, when we discuss regression, a particularly important case occurs when we take the conditional pdf for k = m − 1, which makes the conditional pdf univariate.

Generating Uniform Variates on the Sphere

We can apply the ideas just discussed to the problem of generating points that are uniformly distributed on the surface of a sphere, something with obvious interest in geophysics. The Cartesian coordinates of points that are uniformly distributed on the surface of the unit sphere can be written as three random variables, $X_1$, $X_2$, and $X_3$. What distribution do these obey? First of all, the conditional probability distribution of $(X_1, X_2)$ for any given $X_3$ must be uniform on a circle of radius $(1 - X_3^2)^{1/2}$ (that is, around the circle, not within it).

Next, consider the marginal distribution of any X, say $X_3$ (obviously, they all have to be the same). For an interval $dx_3$, the area of the corresponding slice of the sphere is proportional to $dx_3$ and does not depend on $x_3$; hence the marginal distribution of each X is uniform on [−1, 1].

Now suppose we generate a pair of uniform variates $U_1$ and $U_2$, each distributed uniformly between −1 and 1; then accept any pair for which $S = U_1^2 + U_2^2 < 1$: that is, the points are inside the unit circle. Then S will be uniform on [0, 1], so 1 − 2S is uniform on [−1, 1]. Hence if we set

$$X_1 = 2 U_1 \sqrt{1 - S}, \qquad X_2 = 2 U_2 \sqrt{1 - S}, \qquad X_3 = 1 - 2S$$

we see that $X_1$, $X_2$, and $X_3$ all satisfy the conditions for a uniform distribution on the sphere, provided that 1 − 2S is independent of $U_1/\sqrt{S}$ and $U_2/\sqrt{S}$, which it is.

Moments of Multivariate PDFs

We can easily generalize from moments of univariate pdfs to moments of multivariate pdfs.
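Samples whose moments are known in advance are a convenient check on such definitions, and the sphere recipe of the preceding section supplies them. A minimal Python sketch of that accept/reject recipe follows; NumPy, the function name `sphere_points`, and the `seed` argument are assumptions made for illustration and are not part of the original text.

```python
import numpy as np

def sphere_points(n, seed=0):
    """Return n points on the unit sphere via the accept/reject recipe described above."""
    rng = np.random.default_rng(seed)
    points = np.empty((0, 3))
    while points.shape[0] < n:
        # Pairs (U1, U2), each uniform on [-1, 1]
        u = rng.uniform(-1.0, 1.0, size=(2 * n, 2))
        s = np.sum(u**2, axis=1)
        keep = s < 1.0                      # accept only pairs inside the unit circle
        u1, u2, s = u[keep, 0], u[keep, 1], s[keep]
        root = np.sqrt(1.0 - s)
        batch = np.column_stack((2.0 * u1 * root, 2.0 * u2 * root, 1.0 - 2.0 * s))
        points = np.vstack((points, batch))
    return points[:n]

pts = sphere_points(20000)
radii = np.sqrt(np.sum(pts**2, axis=1))     # every point should lie on the unit sphere
sample_means = pts.mean(axis=0)             # sample first moments of X1, X2, X3
sample_variances = pts.var(axis=0)          # sample variances of X1, X2, X3
print(radii.min(), radii.max())
print(sample_means, sample_variances)
```

Since each coordinate should be uniform on [−1, 1], its sample mean should be near 0 and its sample variance near 1/3.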
The zero-order moment, being the integral over the entire domain of the pdf, is still 1. But there are m first moments, instead of one; these are defined by [3]

$$\mu_i \stackrel{\mathrm{def}}{=} E[X_i] = \int \cdots \int x_i \, \phi(\vec{x}) \, d^m \vec{x}$$

which, as in the univariate case, expresses the location of the i-th variable, though not always very well.

The second moments are more varied, and more interesting, than in the univariate case: for one thing, there are $m^2$ of them.

[2] Marsaglia, G. (1972). Choosing a point from the surface of a sphere, Ann. Math. Statist., 43.

[3] Alert: we make a slight change in usage from that in Chapter 2, using the subscript to denote different moments of the same degree, rather than the degree of the moment; this degree is implicit in the number of subscripts.
As in the univariate case, we could consider second moments about zero, or about the expected value (the first moments); in practice, nobody ever considers anything but the second kind, making the expression for the second moments

$$\mu_{ij} \stackrel{\mathrm{def}}{=} \int \cdots \int (x_i - \mu_i)(x_j - \mu_j) \, \phi(\vec{x}) \, d^m \vec{x}$$

We can, as with univariate pdfs, describe the variance as

$$V[X_i] \stackrel{\mathrm{def}}{=} \mu_{ii} = E[(X_i - \mu_i)(X_i - \mu_i)]$$

which as before expresses something about the spread of this variable. But the more interesting moments are the covariances between two variables, which are defined as

$$C[X_j, X_k] \stackrel{\mathrm{def}}{=} \mu_{jk} = E[(X_j - \mu_j)(X_k - \mu_k)] = \int \cdots \int (x_j - \mu_j)(x_k - \mu_k) \, \phi(x_1, x_2, \ldots, x_m) \, d^m \vec{x}$$

From this, it is clear that the variances are special cases of the covariances, with the variance being the covariance of a random variable with itself: $V[X_j] = C[X_j, X_j]$.

Covariance has the very useful property of showing the degree of linear association between the two variables. To lay this out in detail:
A.