
0.1 Markov Chains - Stanford University




Generalities. A Markov chain consists of a countable (possibly finite) set $S$ (called the state space) together with a countable family of random variables $X_0, X_1, X_2, \ldots$ with values in $S$ such that
\[ P[X_{l+1} = s \mid X_l = s_l, X_{l-1} = s_{l-1}, \ldots, X_0 = s_0] = P[X_{l+1} = s \mid X_l = s_l]. \]
We refer to this fundamental equation as the Markov property. The random variables $X_0, X_1, X_2, \ldots$ are dependent. Markov chains are among the few sequences of dependent random variables which are of a general character and have been successfully investigated, with deep results about their behavior. Later we will discuss martingales, which also provide examples of sequences of dependent random variables; martingales have many applications to probability theory. One often thinks of the subscript $l$ of the random variable $X_l$ as representing time (discretely), and the random variables represent the evolution of a system whose behavior is only probabilistically known.

The Markov property expresses the assumption that knowledge of the present (i.e., $X_l = s_l$) is relevant to predictions about the future of the system; however, additional information about the past ($X_j = s_j$, $j \le l-1$) is irrelevant. What we mean by the system is explained later in this subsection. These ideas will be clarified by examples. Since the state space is countable (or even finite) it is customary (but not always the case) to use the integers $\mathbb{Z}$ or a subset such as $\mathbb{Z}_+$ (non-negative integers), the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$, or $\{0, 1, 2, \ldots, m\}$ as the state space. The specific Markov chain under consideration often determines the natural notation for the state space. In the general case where no specific Markov chain is singled out, we often use $\mathbb{N}$ or $\mathbb{Z}_+$ as the state space. We set
\[ P^{l,l+1}_{ij} = P[X_{l+1} = j \mid X_l = i]. \]
For fixed $l$ the (possibly infinite) matrix $P_l = (P^{l,l+1}_{ij})$ is called the matrix of transition probabilities (at time $l$). In our discussion of Markov chains, the emphasis is on the case where the matrix $P_l$ is independent of $l$, which means that the law of the evolution of the system is time independent.
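The transition-matrix convention above can be made concrete with a small sketch; the three-state matrix below is hypothetical, chosen only to illustrate that row $i$ is the conditional distribution of the next state given the current state $i$.

```python
import numpy as np

# Hypothetical 3-state transition matrix: P[i, j] = P[X_{l+1} = j | X_l = i].
# The specific entries are made up for illustration.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Each row is a conditional probability distribution, so rows sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# Time homogeneity means this same P is used at every time step l.
print(P[0, 2])  # P[X_{l+1} = 2 | X_l = 0] -> 0.2
```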

For this reason one refers to such Markov chains as time homogeneous or as having stationary transition probabilities. Unless stated to the contrary, all Markov chains considered in these notes are time homogeneous; therefore the subscript $l$ is omitted and we simply represent the matrix of transition probabilities as $P = (P_{ij})$. $P$ is called the transition matrix. The non-homogeneous case is generally called time inhomogeneous or non-stationary in time. The matrix $P$ is not arbitrary. It satisfies
\[ P_{ij} \ge 0, \qquad \sum_j P_{ij} = 1 \quad \text{for all } i. \]
A Markov chain determines the matrix $P$, and a matrix $P$ satisfying these conditions determines a Markov chain. A matrix satisfying these conditions is called Markov or stochastic. Given an initial distribution $P[X_0 = i] = p_i$, the matrix $P$ allows us to compute the distribution at any subsequent time. For example,
\[ P[X_1 = j, X_0 = i] = P_{ij}\, p_i, \]
and more generally
\[ P[X_l = j_l, \ldots, X_1 = j_1, X_0 = i] = P_{j_{l-1} j_l} P_{j_{l-2} j_{l-1}} \cdots P_{i j_1}\, p_i. \]
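The update of an initial distribution by $P$ can be sketched in a few lines; the two-state matrix and the initial distribution are hypothetical, and the row vector times matrix product gives the distribution of $X_1$.

```python
import numpy as np

# Sketch of the distribution update: given an initial distribution p with
# p_i = P[X_0 = i], the distribution at time 1 is the row vector p @ P.
# The matrix entries here are hypothetical.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
p0 = np.array([1.0, 0.0])   # start in state 0 with certainty

p1 = p0 @ P                 # distribution of X_1
assert np.allclose(p1.sum(), 1.0)  # still a probability distribution
print(p1)  # [0.9 0.1]
```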

Thus the distribution at time $l = 1$ is given by the row vector $(p_1, p_2, \ldots)P$, and more generally at time $l$ by the row vector
\[ (p_1, p_2, \ldots)\underbrace{P P \cdots P}_{l \text{ times}} = (p_1, p_2, \ldots)P^l. \]
For instance, for $l = 2$, the probability of moving from state $i$ to state $j$ in two units of time is the sum of the probabilities of the events
\[ i \to 1 \to j, \quad i \to 2 \to j, \quad i \to 3 \to j, \quad \ldots, \quad i \to n \to j, \]
since they are mutually exclusive. Therefore the required probability is $\sum_k P_{ik} P_{kj}$, which is accomplished by matrix multiplication. Note that $(p_1, p_2, \ldots)$ is a row vector multiplying $P$ on the left side. This justifies the use of matrices in describing Markov chains, since the transformation of the system after $l$ units of time is described by $l$-fold multiplication of the matrix $P$ with itself. This basic fact is of fundamental importance in the development of Markov chains. It is convenient to make use of the notation $P^l = (P^{(l)}_{ij})$. Then for $r + s = l$ ($r$ and $s$ non-negative integers) we have
\[ P^l = P^r P^s, \qquad \text{or} \qquad P^{(l)}_{ij} = \sum_k P^{(r)}_{ik} P^{(s)}_{kj}. \]
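The factorization $P^l = P^r P^s$ (the Chapman-Kolmogorov relation) can be checked numerically; the three-state matrix below is hypothetical.

```python
import numpy as np

# Chapman-Kolmogorov check on a hypothetical 3-state transition matrix:
# P^l = P^r P^s whenever r + s = l.
P = np.array([
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
    [0.1, 0.3, 0.6],
])

l, r, s = 5, 2, 3
Pl = np.linalg.matrix_power(P, l)
Pr_Ps = np.linalg.matrix_power(P, r) @ np.linalg.matrix_power(P, s)
assert np.allclose(Pl, Pr_Ps)

# Entry (i, j) of P^l is the l-step transition probability P^{(l)}_{ij},
# and each row of P^l is still a probability distribution.
assert np.allclose(Pl.sum(axis=1), 1.0)
```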

Example (random walk on $\mathbb{Z}/n$). Let $\mathbb{Z}/n$ denote the integers mod $n$, and let $Y_1, Y_2, \ldots$ be a sequence of independent identically distributed (from now on iid) random variables with values in $\mathbb{Z}/n$ and density function $P[Y = k] = p_k$. Set $Y_0 = 0$ and $X_l = Y_0 + Y_1 + \cdots + Y_l$, where addition takes place in $\mathbb{Z}/n$. Using $X_{l+1} = Y_{l+1} + X_l$, the validity of the Markov property and time stationarity are easily verified, and it follows that $X_0, X_1, X_2, \ldots$ is a Markov chain with state space $\mathbb{Z}/n = \{0, 1, 2, \ldots, n-1\}$. The equation $X_{l+1} = Y_{l+1} + X_l$ also implies that the transition matrix $P$ is
\[ P = \begin{pmatrix}
p_0 & p_1 & p_2 & \cdots & p_{n-2} & p_{n-1} \\
p_{n-1} & p_0 & p_1 & \cdots & p_{n-3} & p_{n-2} \\
p_{n-2} & p_{n-1} & p_0 & \cdots & p_{n-4} & p_{n-3} \\
\vdots & & & & & \vdots \\
p_1 & p_2 & p_3 & \cdots & p_{n-1} & p_0
\end{pmatrix}. \]
We refer to this Markov chain as the general random walk on $\mathbb{Z}/n$. Rather than starting at 0 ($X_0 = Y_0 = 0$), we can start at some other point by setting $Y_0 = m$ where $m \in \mathbb{Z}/n$. A possible way of visualizing the random walk is by assigning to $j \in \mathbb{Z}/n$ the point $e^{2\pi i j/n}$ on the unit circle in the complex plane. If for instance $p_k = 0$ for $k \ne 0, \pm 1$, then imagine particles at any and all locations $e^{2\pi i j/n}$ which, after the passage of one unit of time, stay at the same place, or move one unit counterclockwise or clockwise with probabilities $p_1$, $p_{-1}$ respectively and independently of each other.
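The circulant structure $P_{ij} = p_{(j-i) \bmod n}$ can be built directly; the step distribution below is a hypothetical choice on $\mathbb{Z}/5$ allowing stay, $+1$, and $-1$ moves.

```python
import numpy as np

def random_walk_matrix(p):
    """Transition matrix of the general random walk on Z/n with step
    distribution p = (p_0, ..., p_{n-1}): P[i, j] = p_{(j - i) mod n}."""
    n = len(p)
    return np.array([[p[(j - i) % n] for j in range(n)] for i in range(n)])

# Hypothetical step distribution on Z/5: stay (p_0), step +1 (p_1),
# step -1 (p_4, i.e. p_{-1}).
p = [0.5, 0.3, 0.0, 0.0, 0.2]
P = random_walk_matrix(p)

# Row 0 is p itself, and every row is a cyclic shift of the previous one.
assert np.allclose(P[0], p)
assert np.allclose(P[1], np.roll(P[0], 1))
```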

The fact that moving counterclockwise/clockwise or staying at the same location have the same probabilities for all locations $j$ expresses the property of spatial homogeneity, which is specific to random walks and not shared by general Markov chains. This property is expressed by the rows of the transition matrix being shifts of each other, as observed in the expression for $P$. For general Markov chains there is no relation between the entries of the rows (or columns) except as specified by the stochasticity conditions. Note that the transition matrix of the general random walk on $\mathbb{Z}/n$ has the additional property that the column sums are also one, not just the row sums. A stochastic matrix with the additional property that column sums are 1 is called doubly stochastic. We continue with the preceding example and make some modifications. Assume $Y_0 = m$ where $1 \le m \le n-2$, and $p_j = 0$ unless $j = 1$ or $j = -1$ (which is the same thing as $n-1$ since addition is mod $n$).
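The doubly stochastic property of the random-walk matrix is easy to verify numerically; the step distribution on $\mathbb{Z}/4$ below is hypothetical.

```python
import numpy as np

# A circulant random-walk matrix on Z/n is doubly stochastic: each column
# sum is sum_i p_{(j-i) mod n} = sum_k p_k = 1, not just each row sum.
# The step distribution here is made up for illustration.
p = [0.2, 0.5, 0.0, 0.3]          # hypothetical distribution on Z/4
n = len(p)
P = np.array([[p[(j - i) % n] for j in range(n)] for i in range(n)])

assert np.allclose(P.sum(axis=1), 1.0)  # rows sum to 1: stochastic
assert np.allclose(P.sum(axis=0), 1.0)  # columns too: doubly stochastic
```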

Set $P[Y = 1] = p$ and $P[Y = -1] = q = 1 - p$. Modify the matrix $P$ by leaving $P_{ij}$ unchanged for $1 \le i \le n-2$ and defining
\[ P_{00} = 1, \quad P_{0j} = 0, \quad P_{n-1,n-1} = 1, \quad P_{n-1,k} = 0, \qquad j \ne 0, \; k \ne n-1. \]
This is still a Markov chain. The states 0 and $n-1$ are called absorbing states, since transition out of them is impossible. Note that this Markov chain describes the familiar Gambler's Ruin Problem.

Remark. In this example we can replace $\mathbb{Z}/n$ with $\mathbb{Z}$, or more generally $\mathbb{Z}^m$, so that addition takes place in $\mathbb{Z}^m$. In other words, we can start with an iid sequence of random variables $Y_1, Y_2, \ldots$ with values in $\mathbb{Z}^m$ and define $X_0 = 0$, $X_{l+1} = Y_{l+1} + X_l$. By the same reasoning as before, the sequence $X_0, X_1, X_2, \ldots$ is a Markov chain with state space $\mathbb{Z}^m$. It is called the general random walk on $\mathbb{Z}^m$. If $m = 1$ and the random variable $Y$ (i.e., any of the $Y_j$'s) takes only the values $\pm 1$, then it is called a simple random walk on $\mathbb{Z}$, and if in addition the values $\pm 1$ are assumed with equal probability $\frac{1}{2}$, then it is called the simple symmetric random walk on $\mathbb{Z}$.
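The modified matrix with absorbing endpoints can be sketched directly; the size $n = 6$ and win probability $p = 0.4$ are hypothetical.

```python
import numpy as np

def gamblers_ruin_matrix(n, p):
    """Transition matrix of the Gambler's Ruin chain on {0, ..., n-1}:
    from 1 <= i <= n-2 move to i+1 w.p. p and to i-1 w.p. 1-p;
    states 0 and n-1 are absorbing."""
    P = np.zeros((n, n))
    P[0, 0] = 1.0
    P[n - 1, n - 1] = 1.0
    for i in range(1, n - 1):
        P[i, i + 1] = p
        P[i, i - 1] = 1.0 - p
    return P

# Hypothetical parameters for illustration.
P = gamblers_ruin_matrix(6, 0.4)
assert np.allclose(P.sum(axis=1), 1.0)
assert P[0, 0] == 1.0 and P[5, 5] == 1.0   # absorbing states 0 and n-1
```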

The analogous definition for $\mathbb{Z}^m$ is obtained by assuming that $Y$ only takes the $2m$ values
\[ (\pm 1, 0, \ldots, 0), \; (0, \pm 1, 0, \ldots, 0), \; \ldots, \; (0, \ldots, 0, \pm 1), \]
each with probability $\frac{1}{2m}$. One similarly defines the notions of simple and symmetric random walks on $\mathbb{Z}/n$. In a basic course on probability it is generally emphasized that the underlying probability space should be clarified before engaging in the solution of a problem. Thus it is important to understand the underlying probability space in the discussion of Markov chains. This is most easily demonstrated by looking at the Markov chain $X_0, X_1, X_2, \ldots$ with finite state space $\{1, 2, \ldots, n\}$, specified by an $n \times n$ transition matrix $P = (P_{ij})$. Assume we have $n$ biased dice, with each die having $n$ sides. There is one die corresponding to each state. If the Markov chain is in state $i$, then the $i$th die is rolled. The die is biased, and side $j$ of die number $i$ appears with probability $P_{ij}$. For definiteness assume $X_0 = 1$.
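The dice description translates directly into a simulation: rolling die number $i$ is sampling from row $i$ of $P$. The two-state matrix, seed, and horizon below are hypothetical.

```python
import numpy as np

# The "biased dice" picture made executable: whenever the chain is in
# state i, sample the next state from row i of P. Matrix is hypothetical;
# states are labeled 0..n-1 here rather than 1..n.
rng = np.random.default_rng(0)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])

def simulate(P, x0, L, rng):
    """Return the path (x0, k_1, ..., k_L) of the chain started at x0."""
    path = [x0]
    for _ in range(L):
        # "Roll die number i": weighted sample from row path[-1] of P.
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

path = simulate(P, 0, 10, rng)
assert len(path) == 11 and path[0] == 0
```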

If we are interested in investigating questions about the Markov chain in $L$ units of time (i.e., the subscript $l \le L$), then we are looking at all possible sequences $1 k_1 k_2 k_3 \cdots k_L$ if $L < \infty$ (or infinite sequences $1 k_1 k_2 k_3 \cdots$ if $L = \infty$). The sequence $1 k_1 k_2 k_3 \cdots k_L$ is the event that die number 1 was rolled and side $k_1$ appeared; then die number $k_1$ was rolled and side $k_2$ appeared; then die number $k_2$ was rolled and side $k_3$ appeared, and so on. The probability assigned to this event is
\[ P_{1 k_1} P_{k_1 k_2} P_{k_2 k_3} \cdots P_{k_{L-1} k_L}. \]
We can graphically represent each event $1 k_1 k_2 k_3 \cdots k_L$ as a function consisting of broken line segments joining the point $(0, 1)$ to $(1, k_1)$, $(1, k_1)$ to $(2, k_2)$, $(2, k_2)$ to $(3, k_3)$, and so on. Alternatively, one can look at the event $1 k_1 k_2 k_3 \cdots k_L$ as a step function taking value $k_m$ on the interval $[m, m+1)$. Either way, the horizontal axis represents time and the vertical axis the state or site. Naturally one refers to a sequence $1 k_1 k_2 k_3 \cdots k_L$ or its graph as a path, and each path represents a realization of the Markov chain.
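The product of transition probabilities assigned to a path can be computed in one line; the two-state matrix and the sample path are hypothetical.

```python
import numpy as np

def path_probability(P, path):
    """Probability P_{k0 k1} P_{k1 k2} ... P_{k_{L-1} k_L} assigned to a
    path (k0, k1, ..., kL), where k0 is the deterministic starting state."""
    return float(np.prod([P[a, b] for a, b in zip(path, path[1:])]))

# Hypothetical 2-state chain started at state 0.
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
prob = path_probability(P, [0, 1, 1, 0])   # P_{01} * P_{11} * P_{10}
assert abs(prob - 0.3 * 0.8 * 0.2) < 1e-12
```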

Graphic representations are useful devices for understanding Markov chains. The underlying probability space $\Omega$ is the set of all possible paths, in whatever representation one prefers. Probabilities (or measures, in more sophisticated language) are assigned to events $1 k_1 k_2 k_3 \cdots k_L$, or paths (assuming $L < \infty$), as described above. We often deal with conditional probabilities such as $P[\,\cdot \mid X_0 = i]$. The appropriate probability space in this case, for example, will be all paths of the form $i k_1 k_2 k_3 \cdots$.

Example. Assume $L = \infty$, so that each path is an infinite sequence $1 k_1 k_2 k_3 \cdots$ in the context described above, and $\Omega$ is the set of all such paths. Assume $P^{(l)}_{ij} = \alpha > 0$ for some given $i$, $j$ and $l$. How is this statement represented in the space $\Omega$? In this case we consider all paths $i k_1 k_2 k_3 \cdots$ such that $k_l = j$, with no condition on the remaining $k_m$'s. The statement $P^{(l)}_{ij} = \alpha > 0$ means this set of paths in $\Omega$ has probability $\alpha$. What makes a random walk special is that instead of having one die for every site, the same die (or an equivalent one) is used for all sites.
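The interpretation of $P^{(l)}_{ij}$ as the measure of the set of paths with $k_l = j$ can be checked by Monte Carlo: sample many paths from state $i$ and count how often the chain sits at $j$ after $l$ steps. The matrix, seed, and trial count below are hypothetical.

```python
import numpy as np

# Monte Carlo sketch: the fraction of sampled paths i k_1 k_2 ... with
# k_l = j approximates the measure P^{(l)}_{ij} of that set of paths.
# Transition matrix and parameters are made up for illustration.
rng = np.random.default_rng(1)
P = np.array([[0.6, 0.4],
              [0.3, 0.7]])
i, j, l = 0, 1, 3

exact = np.linalg.matrix_power(P, l)[i, j]

trials = 20_000
hits = 0
for _ in range(trials):
    x = i
    for _ in range(l):
        x = rng.choice(2, p=P[x])   # one step of the chain
    hits += (x == j)

estimate = hits / trials
assert abs(estimate - exact) < 0.03   # estimate close to P^{(l)}_{ij}
```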

