
POL 571: Convergence of Random Variables

Kosuke Imai
Department of Politics, Princeton University
March 28, 2006

1 Random Sample and Statistics

So far we have learned about various random variables and their distributions. These concepts are, of course, all mathematical models rather than the real world itself. In practice, we do not know the true models of human behavior, and they may not even correspond to probability models. George Box once said that there is no true model, but there are useful models. Even if there is such a thing as the true probability model, we can never observe it! Therefore, we must connect what we can observe with our theoretical models. The key idea here is that we use the probability model (i.e., a random variable and its distribution) to describe the data generating process.



What we observe, then, is a particular realization (or a set of realizations) of this random variable. The goal of statistical inference is to figure out the true probability model given the data you have.

Definition 1 Random variables $X_1, X_2, \dots, X_n$ are said to be independent and identically distributed (or i.i.d.) if they are independent and share the same distribution function $F(x)$. It is also called a (an i.i.d.) random sample of size $n$ from the population, $F(x)$.

If we use $f(x)$ to denote the probability density (or mass) function associated with $F(x)$, then the joint probability density (or mass) function given a particular set of realizations $(x_1, x_2, \dots, x_n)$ is given by $\prod_{i=1}^n f(x_i)$. Of course, if the random variables are not i.i.d., then the joint density (or mass) function will be much more complicated, $f(x_n \mid x_{n-1}, \dots, x_1) \cdots f(x_2 \mid x_1) f(x_1)$.

The above definition is an example of what is sometimes called an infinite (or super) population model because in theory one can obtain an infinite number of random samples from the population (hence, the population size is infinite). In the social sciences, this often requires one to think about a hypothetical population from which a particular realization is drawn. For example, what does it mean to say, "the outcome of the 2000 presidential election is a particular realization from the population model"?

Another important framework is a finite population model where we consider the population size to be finite. This might be appropriate in a survey sampling context where a sample of respondents is drawn from a particular population, which is of course finite.
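To make the contrast between the i.i.d. product form and the general chain-rule factorization concrete, the following sketch enumerates the joint probability mass function for two draws from a small finite population. The three-unit population and the helper names `joint_with_replacement` and `joint_without_replacement` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite population; the values are an assumption for illustration.
population = [1, 2, 3]
N = len(population)

def joint_with_replacement(x1, x2):
    """i.i.d. draws: the joint pmf is the product of the marginals f(x1) f(x2)."""
    return Fraction(1, N) * Fraction(1, N)

def joint_without_replacement(x1, x2):
    """Dependent draws: the joint pmf is f(x2 | x1) f(x1) by the chain rule."""
    f_x1 = Fraction(1, N)
    f_x2_given_x1 = Fraction(0) if x2 == x1 else Fraction(1, N - 1)
    return f_x2_given_x1 * f_x1

pairs = list(product(population, repeat=2))

# Both joint pmfs sum to one over all ordered pairs.
total_iid = sum(joint_with_replacement(a, b) for a, b in pairs)
total_dep = sum(joint_without_replacement(a, b) for a, b in pairs)

# Marginal pmf of the second draw at x = 1 under sampling without replacement.
marginal_x2 = sum(joint_without_replacement(a, 1) for a in population)

print(total_iid, total_dep, marginal_x2)
```

In this sketch the marginal probability of the second draw is still 1/3 without replacement, but the joint pmf no longer factors into the marginals, since a pair with two equal values has probability zero.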

The concept will also play a role in analyzing randomized experiments as well as the statistical method called bootstrap. An example of this model is given next,

Definition 2 Consider a population of size $N$, $\{x_1, x_2, \dots, x_N\}$, with $N \in \mathbb{N}$. Random variables $X_1, X_2, \dots, X_n$ are called a simple random sample if these units are sampled with equal probability and without replacement.

Note that $P(X_i = x) = 1/N$ for all $i = 1, 2, \dots, n$ if $x$ is a distinct element of $\{x_1, x_2, \dots, x_N\}$. This implies that the marginal distribution of $X_i$ is the same as in the case of sampling with replacement. However, a simple random sample is no longer independent because the conditional distribution of $X_2$ given $X_1$, for example, depends on the observed value of $X_1$. Of course, this is one of the simplest probability sampling methods, and there are more sophisticated sampling methods.

Given a random sample, we can define a statistic,

Definition 3 Let $X_1, \dots, X_n$ be a random sample of size $n$ from a population, and $\Omega$ be the sample space of these random variables. If $T(x_1, \dots, x_n)$ is a function where $\Omega$ is a subset of the domain of this function, then $Y = T(X_1, \dots, X_n)$ is called a statistic, and the distribution of $Y$ is called the sampling distribution of $Y$.

What you should take away from this definition is that a statistic is simply a function of the data and that since your data set is a random sample from a population, a statistic is also a random variable and has its own distribution. Three common statistics are given below,

Definition 4 Let $X_1, \dots, X_n$ be a random sample from a population. Then,
1. The sample mean is defined by $\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$.
2. The sample variance is defined by $S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2$ where $S = \sqrt{S^2}$ is called the sample standard deviation.

These statistics are good guesses of their population counterparts as the following theorem shows.

Theorem 1 (Unbiasedness of Sample Mean and Variance) Let $X_1, \dots, X_n$ be an i.i.d. random sample from a population with mean $\mu < \infty$ and variance $\sigma^2 < \infty$. If $\bar{X}$ is the sample mean and $S^2$ is the sample variance, then
1. $E(\bar{X}) = \mu$, and $\mathrm{var}(\bar{X}) = \sigma^2 / n$.
2. $E(S^2) = \sigma^2$.

The theorem says that on average the sample mean and variance are equal to their population counterparts. That is, over repeated samples, you will get the answer right on average. This property is called unbiasedness. But, of course, typically we only have one random sample, and so the answer you get from the particular sample you have may or may not be close to the truth. For example, each $X_i$ is also an unbiased estimator of $\mu$, although the sample mean is perhaps a better estimator because its variance is smaller. We will revisit this issue later in the course. This theorem can also be generalized to any function $g(X_i)$ provided that $E[g(X)]$ and $\mathrm{var}[g(X)]$ exist. You should be able to show $E[\sum_{i=1}^n g(X_i)/n] = E[g(X)]$ and $\mathrm{var}[\sum_{i=1}^n g(X_i)/n] = \mathrm{var}[g(X)]/n$.

There are several useful properties of the sample mean and variance, which we use later in the course, when the population distribution is normal.

Theorem 2 (Sample Mean and Variance of Normal Random Variables) Let $X_1, X_2, \dots, X_n$ be an i.i.d. sample from the Normal distribution with mean $\mu$ and variance $\sigma^2$. Let $\bar{X}$ and $S^2$ be the sample mean and variance, respectively. Then,
1. $\bar{X} \sim N(\mu, \sigma^2/n)$.
2. $(n-1) S^2 / \sigma^2 \sim \chi^2_{n-1}$.
3. $\sqrt{n}(\bar{X} - \mu)/S \sim t_{n-1}$.

What is important about the last result of this theorem is that the distribution of the statistic $\sqrt{n}(\bar{X} - \mu)/S$ does not depend on the variance of $X$. That is, regardless of the value of $\sigma^2$, the exact distribution of the statistic is $t_{n-1}$. We also consider the distribution of the ratio of two sample variances.

Theorem 3 (Ratio of the Sample Variances) Let $X_1, X_2, \dots, X_n$ be an i.i.d. sample from the Normal distribution with mean $\mu_X$ and variance $\sigma^2_X$. Similarly, let $Y_1, Y_2, \dots, Y_m$ be an i.i.d. sample from the Normal distribution with mean $\mu_Y$ and variance $\sigma^2_Y$. If $S^2_X$ and $S^2_Y$ are the sample variances, then the statistic,
$$\frac{S^2_X / S^2_Y}{\sigma^2_X / \sigma^2_Y} = \frac{S^2_X / \sigma^2_X}{S^2_Y / \sigma^2_Y},$$
is distributed as the $F$ distribution with $n-1$ and $m-1$ degrees of freedom.

Finally, we give another class of statistics, which is a bit more complicated than the sample mean and variance.

Definition 5 Let $X_1, X_2, \dots, X_n$ be an i.i.d. random sample from a population. The order statistics $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ can be obtained by arranging this random sample in non-decreasing order, $X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}$, where $(1), (2), \dots, (n)$ is a (random) permutation of $1, 2, \dots, n$. In particular, we define the sample median as $X_{((n+1)/2)}$ if $n$ is odd and $(X_{(n/2)} + X_{(n/2+1)})/2$ if $n$ is even.

We will see later in the course that the sample median is less affected by extreme observations than the sample mean. Here, we consider the marginal distribution of the order statistics.

Theorem 4 (Order Statistics) Let $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ be the order statistics from an i.i.d. random sample from a population.
1. If the population distribution is discrete with the probability mass function $f_X(x)$ and $x_1 < x_2 < \cdots$ are possible values of $X$ in ascending order, then
$$P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[ q_i^k (1 - q_i)^{n-k} - q_{i-1}^k (1 - q_{i-1})^{n-k} \right],$$
where $q_i = \sum_{k=1}^{i} P(X = x_k) = P(X \leq x_i)$ and $q_0 = 0$.
2. If the population distribution is continuous with the probability density function $f_X(x)$ and the distribution function $F_X(x)$, then
$$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!} f_X(x) [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j}.$$

With this theorem, we can easily answer the following question,

Example 1 Let $X_1, X_2, \dots, X_n$ be an i.i.d. random sample from Uniform(0,1). What is the distribution of the $j$th order statistic?

2 Convergence of Random Variables

The final topic of probability theory in this course is the convergence of random variables, which plays a key role in asymptotic statistical inference. We are interested in the behavior of a statistic as the sample size goes to infinity. That is, we ask what happens if we can collect data of infinite size. Of course, in practice, we never have a sample of infinite size. However, if we have a data set that is large enough, then we might be able to use the large-sample result as a way to find a good approximation for the finite sample case.
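A small simulation can illustrate how the behavior of a statistic changes as the sample size grows, using the sample mean from Theorem 1. The sketch below assumes an Exponential(1) population (mean 1, variance 1), a fixed seed, and a hypothetical helper `simulate_sample_means`; none of these choices come from the notes.

```python
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Draw `reps` i.i.d. samples of size n from Exponential(1); return their sample means."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

rng = random.Random(571)  # fixed seed so the run is reproducible
results = {}
for n in (5, 50, 500):
    means = simulate_sample_means(n, reps=2000, rng=rng)
    # The average of the sample means estimates E(X-bar) = mu = 1;
    # their variance estimates var(X-bar) = sigma^2 / n = 1 / n.
    results[n] = (statistics.fmean(means), statistics.variance(means))

for n, (avg, var) in results.items():
    print(f"n={n:4d}  mean of X-bar={avg:.3f}  var of X-bar={var:.5f}  sigma^2/n={1 / n:.5f}")
```

Across repeated samples the sample mean should stay centered near the population mean while its variance shrinks roughly like $1/n$, which is the sense in which a larger sample brings a single realization closer to the truth.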

We consider four different modes of convergence for random variables,

Definition 6 Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables and $X$ be a random variable.
1. $\{X_n\}_{n=1}^{\infty}$ is said to converge to $X$ in the $r$th mean where $r \geq 1$, if $\lim_{n \to \infty} E(|X_n - X|^r) = 0$.
2. $\{X_n\}_{n=1}^{\infty}$ is said to converge to $X$ almost surely, if $P(\lim_{n \to \infty} X_n = X) = 1$.
3. $\{X_n\}_{n=1}^{\infty}$ is said to converge to $X$ in probability, if for any $\epsilon > 0$, $\lim_{n \to \infty} P(|X_n - X| < \epsilon) = 1$.
4. $\{X_n\}_{n=1}^{\infty}$ is said to converge to $X$ in distribution, if at all points $x$ where $P(X \leq x)$ is continuous, $\lim_{n \to \infty} P(X_n \leq x) = P(X \leq x)$.

Almost sure convergence is sometimes called convergence with probability 1 (do not confuse this with convergence in probability). Some people also say that a random variable converges almost everywhere to indicate almost sure convergence. The notation $X_n \xrightarrow{a.s.} X$ is often used for almost sure convergence, while the common notation for convergence in probability is $X_n \xrightarrow{p} X$ or $\mathrm{plim}_{n \to \infty} X_n = X$.
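Convergence in probability can be checked numerically in a case where the probability is known exactly. As an assumed example not taken from the notes, let $X_n$ be the maximum of $n$ i.i.d. Uniform(0,1) draws. Then for $0 < \epsilon < 1$, $P(|X_n - 1| < \epsilon) = 1 - (1 - \epsilon)^n$, which tends to 1, so $X_n$ converges to 1 in probability. The sketch compares this exact value with a Monte Carlo estimate; the function names are hypothetical.

```python
import random

def exact_prob(n, eps):
    """P(|max(U_1..U_n) - 1| < eps) for i.i.d. Uniform(0,1) draws and 0 < eps < 1."""
    return 1.0 - (1.0 - eps) ** n

def monte_carlo_prob(n, eps, reps, rng):
    """Estimate the same probability as the fraction of simulated maxima within eps of 1."""
    hits = 0
    for _ in range(reps):
        x_n = max(rng.random() for _ in range(n))
        if abs(x_n - 1.0) < eps:
            hits += 1
    return hits / reps

rng = random.Random(2006)  # fixed seed for reproducibility
eps = 0.1
sizes = (1, 10, 100)
exact = [exact_prob(n, eps) for n in sizes]
estimate = [monte_carlo_prob(n, eps, reps=5000, rng=rng) for n in sizes]

for n, p, p_hat in zip(sizes, exact, estimate):
    print(f"n={n:3d}  exact={p:.4f}  simulated={p_hat:.4f}")
```

Here $X_n$ is the largest order statistic $X_{(n)}$ of a Uniform(0,1) sample, and the probability of landing within $\epsilon$ of the limit rises toward 1 as $n$ grows, as item 3 of the definition requires.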

