Neural Networks for Time Series Prediction

Transcription of Neural Networks for Time Series Prediction

1 Neural Networks for Time Series Prediction
15-486/782: Artificial Neural Networks, Fall 2006
(based on earlier slides by Dave Touretzky and Kornel Laskowski)

What is a Time Series?
A sequence of vectors (or scalars) which depend on time t. In this lecture we will deal exclusively with scalars:

    {x(t_0), x(t_1), ..., x(t_{i-1}), x(t_i), x(t_{i+1}), ...}

It is the output of some process P that we are interested in: P → x(t)

2 Examples of Time Series
• Dow-Jones Industrial Average
• sunspot activity
• electricity demand for a city
• number of births in a community
• air temperature in a building
These phenomena may be discrete or continuous.

3 Discrete Phenomena
• Dow-Jones Industrial Average closing value each day
• sunspot activity each day
Sometimes data have to be aggregated to be meaningful.

Births per minute might not be as useful as births per month.

4 Continuous Phenomena
t is real-valued, and x(t) is a continuous signal. To get a discrete time series {x[t]}, we must sample the signal at discrete points in time. Under uniform sampling, if our sampling period is Δt, then

    {x[t]} = {x(0), x(Δt), x(2Δt), x(3Δt), ...}    (1)

To ensure that x(t) can be recovered from x[t], Δt must be chosen according to the Nyquist sampling theorem.

5 Nyquist Sampling Theorem
If f_max is the highest frequency component of x(t), then we must sample at a rate at least twice as high:

    1/Δt = f_sampling > 2 f_max    (2)

Why? Otherwise we will see aliasing of frequencies in the range [f_sampling/2, f_max].
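A tiny sketch of why the theorem matters (an illustration, not part of the original slides): a sinusoid at frequency f, sampled at rate f_sampling, is indistinguishable from one at |f − k·f_sampling| for any integer k, so its apparent frequency is the nearest such value. The helper name `apparent_frequency` is mine.

```python
def apparent_frequency(f, fs):
    """Frequency a sinusoid of true frequency f appears to have after
    uniform sampling at rate fs. Sampling cannot distinguish f from
    f - k*fs for any integer k, so we report the alias closest to 0."""
    return abs(f - fs * round(f / fs))

# fs = 10 can faithfully represent frequencies up to fs/2 = 5:
print(apparent_frequency(3, 10))   # → 3   (below Nyquist: unchanged)
print(apparent_frequency(7, 10))   # → 3   (above Nyquist: aliased)
print(apparent_frequency(12, 10))  # → 2   (aliased again)
```

Sampling at f_sampling > 2 f_max avoids every such collision, which is exactly the condition in equation (2).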

6 Studying Time Series
In addition to describing either discrete or continuous phenomena, time series can also be deterministic vs. stochastic, governed by linear vs. nonlinear dynamics, etc.
Time series are the focus of several overlapping disciplines:
• Information Theory deals with describing stochastic time series.
• Dynamical Systems Theory deals with describing and manipulating mostly nonlinear deterministic time series.
• Digital Signal Processing deals with describing and manipulating mostly linear time series, both deterministic and stochastic.
We will use concepts from all three.

7 Possible Types of Processing
• predict future values of x[t]
• classify a series into one of a few classes (e.g. price will go up, price will go down, no change → sell now)
• describe a series using a few parameter values of some model
• transform one time series into another (e.g. oil prices → interest rates)

8 The Problem of Predicting the Future
Extending backward from time t, we have time series {x[t], x[t-1], ...}.

From this, we now want to estimate x at some future time t+s:

    x̂[t+s] = f(x[t], x[t-1], ...)

s is called the horizon of prediction. We will come back to this; in the meantime, let's predict just one time sample into the future, s = 1.
This is a function approximation problem. Here's how we'll solve it:
1. Assume a generative model.
2. For every point x[t_i] in the past, train the generative model with what preceded t_i as the inputs and what followed t_i as the target.
3. Now run the model to predict x̂[t+s] from {x[t], ...}.

9 Embedding
Time is constantly moving forward. Temporal data is hard to visualize. If we set up a shift register of delays, we can retain successive values of our time series.

Then we can treat each past value as an additional spatial dimension in the input space to our predictor. This implicit transformation of a one-dimensional time vector into an infinite-dimensional spatial vector is called embedding. The input space to our predictor must be finite: at each instant t, we truncate the history to only the T most recent values. T is called the embedding dimension.

10 Using the Past to Predict the Future
[Figure: a tapped delay line of delay elements holds x(t-1), x(t-2), ..., x(t-T); these feed a function f whose output is the prediction x̂(t+1).]

11 Linear Systems
It's possible that P, the process whose output we are trying to predict, is governed by linear dynamics. The study of linear systems is the domain of Digital Signal Processing (DSP).
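Before turning to linear systems: the tapped-delay-line embedding above amounts to sliding a window of the T most recent values over the series. A minimal sketch (the function name `embed` is mine, not the lecture's):

```python
def embed(series, T):
    """Turn a scalar time series into (input, target) pairs for one-step
    prediction, using the T most recent values as the input vector
    (the tapped delay line with embedding dimension T)."""
    pairs = []
    for t in range(T, len(series)):
        window = series[t - T:t]           # x[t-T], ..., x[t-1]
        pairs.append((window, series[t]))  # target is x[t]
    return pairs

# Example with embedding dimension T = 3:
print(embed([1, 2, 3, 4, 5], 3))
# → [([1, 2, 3], 4), ([2, 3, 4], 5)]
```

Each pair is one training example for the predictor f of the previous slide.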

DSP is concerned with linear, translation-invariant (LTI) operations on data streams. These operations are implemented by filters; the analysis and design of filters effectively forms the core of this discipline. Filters operate on an input sequence u[t], producing an output sequence x[t]. They are typically described in terms of their frequency response, i.e. low-pass, high-pass, band-stop, etc.
There are two basic filter architectures, known as the FIR filter and the IIR filter.

12 Finite Impulse Response (FIR) Filters
Characterized by q+1 coefficients:

    x[t] = Σ_{i=0}^{q} β_i u[t-i]    (3)

FIR filters implement the convolution of the input signal with a given coefficient vector {β_i}.
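Equation (3) is just a convolution; a direct, unoptimized sketch (the helper name `fir` is mine; samples before t = 0 are assumed zero):

```python
def fir(u, beta):
    """FIR filter: x[t] = sum_{i=0}^{q} beta[i] * u[t-i]  (eq. 3),
    with u[t] taken to be zero for t < 0."""
    x = []
    for t in range(len(u)):
        x.append(sum(b * u[t - i] for i, b in enumerate(beta) if t - i >= 0))
    return x

# Impulse input: the response is just the coefficient vector, then zeros —
# hence "finite impulse response".
print(fir([1, 0, 0, 0, 0], [0.5, 0.3, 0.2]))  # → [0.5, 0.3, 0.2, 0.0, 0.0]
```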

They are known as Finite Impulse Response because, when the input u[t] is the impulse function, the output x is only as long as q+1, which must be finite.
[Figure: an impulse input and the resulting finite-length impulse response.]

13 Infinite Impulse Response (IIR) Filters
Characterized by p coefficients:

    x[t] = Σ_{i=1}^{p} α_i x[t-i] + u[t]    (4)

In IIR filters, the input u[t] contributes directly to x[t] at time t, but, crucially, x[t] is otherwise a weighted sum of its own past values. These filters are known as Infinite Impulse Response because, in spite of both the impulse function and the vector {α_i} being finite in duration, the response only asymptotically decays to zero. Once one of the x[t]'s is non-zero, it will make non-zero contributions to future values of x[t] ad infinitum.

14 FIR and IIR Differences
In DSP notation:
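The IIR recursion of equation (4) can be sketched the same way; the impulse-response demo below shows the asymptotic decay just described (helper name `iir` is mine; outputs before t = 0 are assumed zero):

```python
def iir(u, alpha):
    """IIR filter: x[t] = sum_{i=1}^{p} alpha[i-1] * x[t-i] + u[t]  (eq. 4),
    with x[t] taken to be zero for t < 0."""
    x = []
    for t in range(len(u)):
        feedback = sum(a * x[t - i] for i, a in enumerate(alpha, start=1)
                       if t - i >= 0)
        x.append(feedback + u[t])
    return x

# Impulse input with p = 1, alpha_1 = 0.5: the response decays geometrically
# but never reaches exactly zero — "infinite impulse response".
print(iir([1, 0, 0, 0, 0], [0.5]))  # → [1, 0.5, 0.25, 0.125, 0.0625]
```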

[Figure: block diagrams of the two architectures. The IIR filter feeds back x[t-1], x[t-2], ..., x[t-p] through coefficients α_1, α_2, ..., α_p and adds u[t]; the FIR filter passes u[t], u[t-1], ..., u[t-q] through coefficients β_0, β_1, ..., β_q.]

15 DSP Process Models
We're interested in modeling a particular process, for the purpose of predicting future inputs. Digital Signal Processing (DSP) theory offers three classes of possible linear process models:
• Autoregressive (AR[p]) models
• Moving Average (MA[q]) models
• Autoregressive Moving Average (ARMA[p, q]) models

16 Autoregressive (AR[p]) Models
An AR[p] model assumes that at its heart is an IIR filter applied to some (unknown) internal signal, ε[t]. p is the order of that filter.

    x[t] = Σ_{i=1}^{p} α_i x[t-i] + ε[t]    (5)

This is simple, but adequately describes many complex phenomena (e.g.

speech production over short intervals).
If on average ε[t] is small relative to x[t], then we can estimate x[t] using

    x̂[t] ≡ x[t] - ε[t]    (6)
         = Σ_{i=1}^{p} w_i x[t-i]    (7)

This is an FIR filter! The w_i's are estimates of the α_i's.

17 Estimating AR[p] Parameters
Batch version:

    x̂[t] ≈ x[t]    (8)
         = Σ_{i=1}^{p} w_i x[t-i]    (9)

    [ x̂[p+1] ]   [ x[1]  x[2]  ...  x[p]   ]
    [ x̂[p+2] ] = [ x[2]  x[3]  ...  x[p+1] ] w    (10)
    [   ...   ]   [  ...   ...        ...   ]

Can use linear regression.
Example: speech recognition. Assume that over small windows of time, speech is governed by a static AR[p] model. Learn the w_i's to characterize the vocal tract during that window. This is called Linear Predictive Coding (LPC).
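The batch estimate above is ordinary least squares on the design matrix of equations (8)-(10). A self-contained sketch in pure Python (function name is mine; in practice one would use a library solver such as numpy.linalg.lstsq):

```python
def fit_ar(series, p):
    """Batch least-squares estimate of AR[p] weights w: predict x[t] from
    [x[t-1], ..., x[t-p]] by linear regression, solving the normal
    equations X^T X w = X^T y with Gaussian elimination."""
    X = [[series[t - i] for i in range(1, p + 1)] for t in range(p, len(series))]
    y = [series[t] for t in range(p, len(series))]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            m = A[r][c] / A[c][c]
            for j in range(c, p):
                A[r][j] -= m * A[c][j]
            b[r] -= m * b[c]
    w = [0.0] * p
    for r in range(p - 1, -1, -1):
        w[r] = (b[r] - sum(A[r][j] * w[j] for j in range(r + 1, p))) / A[r][r]
    return w

# Recover the weights of a noiseless AR[2] process x[t] = 0.5 x[t-1] + 0.3 x[t-2]:
x = [1.0, 2.0]
for _ in range(30):
    x.append(0.5 * x[-1] + 0.3 * x[-2])
print(fit_ar(x, 2))  # ≈ [0.5, 0.3]
```

With ε[t] = 0 the regression recovers the α_i exactly; with noise present it returns the least-squares w_i of equation (10).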

18 Estimating AR[p] Parameters
Incremental version (same equation):

    x̂[t] ≈ x[t] = Σ_{i=1}^{p} w_i x[t-i]

For each sample, modify each w_i by a small Δw_i to reduce the sample squared error (x[t] - x̂[t])^2. One iteration of gradient descent (the LMS rule).
Example: noise cancellation. Predict the next sample x̂[t] and generate -x̂[t] at the next time step t. Used in noise cancelling headsets for office, car, aircraft, etc.

19 Moving Average (MA[q]) Models
An MA[q] model assumes that at its heart is an FIR filter applied to some (unknown) internal signal, ε[t]. q+1 is the order of that filter.

    x[t] = Σ_{i=0}^{q} β_i ε[t-i]    (11)

Sadly, we cannot assume that ε[t] is negligible here; if it were, x[t] would have to be negligible too.
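The incremental update described in slide 18 is one step of gradient descent on the sample squared error. A sketch on a noiseless sinusoid, which satisfies an exact AR[2] recurrence, so the one-step prediction error should shrink as the weights adapt (function name and learning rate are illustrative choices of mine):

```python
import math

def lms(series, p, lr):
    """Incremental (LMS) estimation of AR[p] weights: after each sample,
    nudge each w_i downhill on the squared error (x[t] - x_hat[t])^2."""
    w = [0.0] * p
    errors = []
    for t in range(p, len(series)):
        x_hat = sum(w[i] * series[t - 1 - i] for i in range(p))
        err = series[t] - x_hat
        for i in range(p):
            w[i] += lr * err * series[t - 1 - i]  # gradient-descent step
        errors.append(err)
    return w, errors

x = [math.cos(0.5 * t) for t in range(2000)]
w, errors = lms(x, p=2, lr=0.05)
early = sum(abs(e) for e in errors[:100]) / 100
late = sum(abs(e) for e in errors[-100:]) / 100
print(late < early)  # → True: the adaptive predictor improves over time
```

This same online loop, run on a microphone signal with the sign of the prediction flipped, is the noise-cancellation example of slide 18.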

