Event Detection from Time Series Data - Data Analysis and ...

Event Detection from time Series data Valery Guralnik, Jaideep Srivastava Department of Computer Science University of Minnesota { guralnik , Abstract whether the sampling is fixed or adaptive. The In the past few years there has been increased interest in overall goal is to obtain an accurate picture of the using data -mining techniques to extract interesting patterns phenomenon with minimum sampling effort. Examples from time Series data generated by sensors monitoring of such observations include highway traffic monitoring, temporally varying phenomenon. Most work has assumed electro-cardiograms, and monitoring of oil refineries.}

That raw data is somehow processed to generate a sequence In the past few years there has been increased interest of events , which is then mined for interesting episodes. In in using data mining techniques to extract interesting some cases the rule for determining when a sensor reading patterns from temporal sequences [SA95, MTV97, should generate an Event is well known. However, if the PT96]. A standard assumption has been that the raw phenomenon is ill-understood, stating such a rule is difficult. data collected from sensors is somehow processed to Detection of events in such an environment is the focus generate a sequence of events , which is then mined for of this paper.

Consider a dynamic phenomenon whose behavior changes enough over time to be considered a interesting episodes [MTV95, HKM+96]. While there qualitatively significant change. The problem we investigate is no strict definition of an episode, intuitively it is a is of identifying the time points at which the behavior pattern of events occurring in some order, and close change occurs. In the statistics literature this has been enough to each other in time . Recent research has called the change-point Detection problem. The standard developed languages for specifying temporal patterns approach has been to (a) upriori determine the number [MT96, PT96, GWS98], and algorithms have been of change-points that are to be discovered, and (b) decide proposed that takes advantage of the specified pattern the function that will be used for curve fitting in the to increase the computational efficiency of the mining interval between successive change-points.

In this paper process. we generalize along both these dimensions. We propose an However, an issue that has received scant attention iterative algorithm that fits a model to a time segment, and uses a likelihood criterion to determine if the segment should is of deriving an Event sequence from raw sensor data . be partitioned further, if it contains a new change- In some cases the rule for determining when a sensor point. In this paper we present algorithms for both the batch reading should generate an Event is well known, and incremental versions of the problem, and evaluate their if the temperature of a boiler goes above a certain behavior with synthetic and real data .

Finally, we present threshold, then sound an alarm. However, if the initial results comparing the change-points detected by the phenomenon is ill-understood or changes its behavior batch algorithm with those detected by people using visual unpredictably, adapting the threshold such that Event inspection. reporting is accurate becomes very difficult. Thus, a more systematic approach is required for processing the 1 Introduction raw sensor data to generate an Event sequence. This is the focus of our paper. Sensor-based monitoring of any phenomenon creates time Series data . The spacing between successive Consider a dynamic phenomenon whose behavior readings may be constant or varying, depending on changes enough over time so as to be considered a qualitatively significant change.

Each such change is Permission to make digital or hard topics of all or part of this work ~CII an Event . An example is the change of highway traffic personal or classroom use is granted without fee provided that copies from light to heavy to congested. Another example is are not made or distributed lb profit or commercial advantage and that the change of a boiler from normal to super-heated. copies bear this notice and the full citation on the tirst page. To copy The specific problem we address is of applying data otherwise, to republish, to post on scrvcrs or to redistribute to lists. requires prior specific permission and/or a fee.

Mining techniques to identify the time points at which KDD-99 San Diego CA USA the changes, events , occur. In the statistics Copyright ACM 1999 l-58113-143-7/99 $ literature this has been called the change point Detection 33. problem. The standard approach has been to (a) of parameters in the model, or perhaps even the change apriori determine the number of change points to of the model itself, at unknown time (s). be discovered, and (b) decide the model to be used This problem is widely known as the change-point for fitting the subsequence between successive change Detection problem in the field of statistics.

A number points. Thus, the problem becomes one of finding the of approaches have been proposed to solve the change- best set of the predetermined number of points that point. Detection problem [SO94, Hus93, Haw76, HM73, minimizes the error in fitting the pre-decided function Gut74]. The standard assumption is that the phe- [SO94, Hus93, Haw76, HM73, Gut74]. [KS971 addresses nomenon can be approximated by a known, stationary the problem of approximating a sequence of sensor (usually linear) model. However, this assumption may readings by a set of Ic linear segments as a pre-processing not be true in some domains, creating the need for an step.]

This too can be considered a version of the change- approach that works without this assumption. In this point Detection problem. In the proposed approach, we paper we propose an approach that simultaneously ad- address both limitations of standard approaches. First, dresses the issue of model selection and change-point we place no constraint on the class of functions that Detection . will be fitted to the subsequences between successive change points. Second, the number of change points is Formal Statement of the Problem not, fixed apriori. Rather, the appropriate set is found Consider a time Series denoted by y(t), t = 1,2.

N. by using maximum likelihood methods [Hud66]. where t is a time variable. In this paper we study two versions of the change We would like to find a piecewise segmented model point Detection problem, namely the batch and the M, given by incremental versions. In the batch version the entire Y = fl(t,wl) + cl(t), (1 < t I h), data set is available, as in the case of 24-hour data from = f2(t,w2) + e2(t), (6 < t 5 f32), traffic sensors, from which the best, set of change points (t,.Wlc) +ek(tj, .ie,l;.(t' ~j;~. can be determined. In the incremental version, the algorithm receives new data points one at a time , and determines if the new observation causes a new change- An fi(t, wi) is the function (with its vector of point to be discovered.)))

Event Detection from Time Series Data - Data Analysis and ...

Tags:

Information

Transcription of Event Detection from Time Series Data - Data Analysis and ...

Related search queries

Event Detection from Time Series Data - Data Analysis and ...

Tags:

Information

Documents from same domain

Web Mining — Concepts, Applications, and Research …

Related documents

F ast Algorithms for Mining Asso ciation Rules

LECTURE NOTES ON DATA MINING& DATA …

MANAGING AND MINING GRAPH DATA - Charu Aggarwal

Event Log Analysis with the LogCluster Tool - estpak.ee

Related search queries