Transcription of Gaussian Markov Processes
1 C. E. rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006,ISBN 2006 Massachusetts Institute of BGaussian Markov ProcessesParticularly when the index set for a stochastic process is one-dimensional suchas the real line or its discretization onto the integer lattice, it is very interestingto investigate the properties of GaussianMarkovprocesses (GMPs). In thisAppendix we useX(t) to define a stochastic process with continuous time pa-rametert. In the discrete time case the process is ,X 1,X0,X1,..etc. We assume that the process has zero mean and is, unless otherwise stated, discrete-time autoregressive (AR) process of orderpcan be written asAR processXt=p k=1akXt k+b0Zt,( )whereZt N(0,1) and allZt s are.
2 Notice the order-pMarkov propertythat given the historyXt 1,Xt 2,..,Xtdepends only on the previousp X relationship can be conveniently expressed as a graphical model; part ofan AR(2) process is illustrated in Figure The name autoregressive stemsfrom the fact thatXtis predicted from theppreviousX s through a regressionequation. If one stores the currentXand thep 1 previous values as a statevector, then the AR(p) scalar process can be written equivalently as a vectorAR(1) ..Figure : Graphical model illustrating an AR(2) from the discrete time to the continuous time setting, the questionarises as to how generalize the Markov notion used in the discrete-time ARprocess to define a continuoous-time AR process.
3 It turns out that the correctgeneralization uses the idea of having not only the function value but alsopofits derivatives at timetgiving rise to the stochastic differential equation (SDE)1 SDE: stochasticdifferential equationC. E. rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006,ISBN 2006 Massachusetts Institute of Markov ProcessesapX(p)(t) +ap 1X(p 1)(t) +..+a0X(t) =b0Z(t),( )whereX(i)(t) denotes theith derivative ofX(t) andZ(t) is a white Gaus-sian noise process with covariance (t t ). This white noise process can beconsidered the derivative of the Wiener process. To avoid redundancy in thecoefficients we assume thatap= 1. A considerable amount of mathemati-cal machinery is required to make rigorous the meaning of such equations, ksendal [1985].
4 As for the discrete-time case, one can write eq. ( ) asa first-order vector SDE by defining the state to beX(t) and its firstp begin this chapter with a summary of some Fourier analysis results insection Fourier analysis is important to linear time invariant systems suchas equations ( ) and ( ) becausee2 istis an eigenfunction of the corre-sponding difference (resp differential) operator. We then move on in to discuss continuous-time Gaussian Markov Processes on the real line andtheir relationship to the same SDE on the circle. In section we describediscrete-time Gaussian Markov Processes on the integer lattice and their re-lationship to the same difference equation on the circle. In section weexplain the relationship between discrete-time GMPs and the discrete samplingof continuous-time GMPs.
5 Finally in section we discuss generalizationsof the Markov concept in higher dimensions. Much of this material is quitestandard, although the relevant results are often scattered through differentsources, and our aim is to provide a unified treatment. The relationship be-tween the second-order properties of the SDEs on the real line and the circle,and difference equations on the integer lattice and the regular polygon is, toour knowledge, Fourier AnalysisWe follow the treatment given by Kammler [2000]. We consider Fourier analysisof functions on the real lineR, of periodic functions of periodlon the circleTl, of functions defined on the integer latticeZ, and of functions onPN, theregularN-polygon, which is a discretization sufficiently well-behaved functions onRwe havef(x) = f(s)e2 isxds, f(s) = f(x)e 2 isxdx.
6 ( )We refer to the equation on the left as thesynthesisequation, and the equationon the right as functions onTlwe obtain the Fourier series representationsf(x) = k= f[k]e2 ikx/l, f[k] =1l l0f(x)e 2 ikx/ldx,( )1 Theakcoefficients in equations ( ) and ( ) are not intended to have a close relation-ship. An approximate relationship might be established through the use of finite-differenceapproximations to E. rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006,ISBN 2006 Massachusetts Institute of Fourier Analysis209where f[k] denotes the coefficient ofe2 ikx/lin the expansion. We use squarebrackets [ ] to denote that the argument is discrete, so thatXtandX[t] areequivalent forZwe obtainf[n] = l0 f(s)e2 isn/lds, f(s) =1l n= f[n]e 2 isn/l.
7 ( )Note that f(s) is periodic with periodland so is defined only for 0 s < ltoavoid aliasing. Often this transform is defined for the special casel= 1 but thegeneral case emphasizes the duality between equations ( ) and ( ).Finally, for functions onPNwe have the discrete Fourier transformf[n] =N 1 k=0 f[k]e2 ikn/N, f[k] =1NN 1 n=0f[n]e 2 ikn/N.( )Note that there are other conventions for Fourier transforms, particularly thoseinvolving = 2 s. However, this tends to destroy symmetry between theanalysis and synthesis equations so we use the definitions given the case of stochastic Processes , the most important Fourier relationshipis between the covariance function and the power spectrum; this is known asthe Wiener-Khintchine theorem, see Chatfield [1989].
8 Sampling and PeriodizationWe can obtain relationships between functions and their transforms onR,Tl,Z,PNthrough the notions : Given a functionfonRand a spacing parameterh-samplingh >0, we construct a corresponding discrete function onZusing [n] =f(nh), n Z.( ) Similarly we can discretize a function defined onTlontoPN, but in this casewe must takeh=l/Nso thatNsteps of sizehwill equal the by summation: Letf(x)be a function onRthatperiodization bysummationrapidly approaches 0 asx . We can sum translates of the function toproduce thel-periodic functiong(x) = m= f(x ml),( )forl >0. Analogously, when is defined onZand [n]rapidly approaches 0asn we can construct a function onPNbyN-summation by setting [n] = m= [n mN].
9 ( ) C. E. rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006,ISBN 2006 Massachusetts Institute of Markov ProcessesLet [n] be obtained byh-sampling fromf(x), with corresponding Fouriertransforms (s) and f(s). Then we have [n] =f(nh) = f(s)e2 isnhds,( ) [n] = l0 (s)e2 isn/lds.( )By breaking up the domain of integration in eq. ( ) we obtain [n] = m= (m+1)lml f(s)e2 isnhds( )= m= l0 f(s +ml)e2 inh(s +ml)ds ,( )using the change of variables =s ml. Now sethl= 1 and usee2 inm= 1forn, mintegers to obtain [n] = l0( m= f(s+ml))e2 isn/lds,( )which implies that (s) = m= f(s+ml),( )withl= 1/h. Alternatively settingl= 1 one obtains (s) =1h m= f(s+mh).
10 Similarly iffis defined onTland [n] =f(nlN) is obtained by sampling then [n] = m= f[n+mN].( )Thus we see that sampling inx-space causes periodization in Fourier consider the periodization of a functionf(x) withx Rto give thel-periodic functiong(x), m= f(x ml). Let g[k] be the Fourier coefficientsofg(x). We obtain g[k] =1l l0g(x)e 2 ikx/ldx=1l l0 m= f(x ml)e 2 ikx/ldx( )=1l f(x)e 2 ikx/ldx=1l f(kl),( )assuming thatf(x) is sufficiently well-behaved that the summation and inte-gration operations can be exchanged. A similar relationship can be obtainedfor the periodization of a function defined onZ. Thus we see that periodizationinx-space gives rise to sampling in Fourier E. rasmussen & C.