
A Survey of Sequential Pattern Mining

Data Science and Pattern Recognition, © 2017, ISSN XXXX-XXXX, Ubiquitous International. Volume 1, Number 1, February 2017.

Philippe Fournier-Viger, School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town Xili, Shenzhen, China

Jerry Chun-Wei Lin, School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town Xili, Shenzhen, China

R. Uday Kiran, University of Tokyo, Tokyo, Japan; National Institute of Information and Communication Technology, Tokyo, Japan

Y. Sing Koh, Department of Computer Science, University of Auckland, Auckland, New Zealand

R. Thomas, Department of Computer Science and Engineering, SCT, Bhopal, India




Abstract. Discovering unexpected and useful patterns in databases is a fundamental data mining task.

In recent years, a trend in data mining has been to design algorithms for discovering patterns in sequential data. One of the most popular data mining tasks on sequences is sequential pattern mining. It consists of discovering interesting subsequences in a set of sequences, where the interestingness of a subsequence can be measured in terms of various criteria such as its occurrence frequency, length, and profit. Sequential pattern mining has many real-life applications since data is encoded as sequences in many fields such as bioinformatics, e-learning, market basket analysis, text analysis, and webpage click-stream analysis.

This paper surveys recent studies on sequential pattern mining and its applications. The goal is to provide both an introduction to sequential pattern mining and a survey of recent advances and research opportunities. The paper is divided into four main parts. First, the task of sequential pattern mining is defined and its applications are reviewed. Key concepts and terminology are introduced. Moreover, main approaches and strategies to solve sequential pattern mining problems are presented. Limitations of traditional sequential pattern mining approaches are also highlighted, and popular variations of the task of sequential pattern mining are presented.

The paper also presents research opportunities and the relationship to other popular pattern mining problems. Lastly, the paper discusses open-source implementations of sequential pattern mining algorithms.

Keywords: Sequential pattern mining, Sequences, Frequent pattern mining, Itemset mining, Data mining

Data mining consists of extracting information from data stored in databases to understand the data and/or take decisions. Some of the most fundamental data mining tasks are clustering, classification, outlier analysis, and pattern mining [6, 58].

Pattern mining consists of discovering interesting, useful, and unexpected patterns in databases. This field of research emerged in the 1990s with the seminal paper of Agrawal and Srikant [1]. That paper introduced the Apriori algorithm, designed for finding frequent itemsets, that is, groups of items (symbols) frequently appearing together in a database of customer transactions. For example, the Apriori algorithm can be used to discover patterns such as {carrot juice, salad, kiwi} in a retail store database, indicating that these products are frequently bought together by customers.

The interest in pattern mining techniques comes from their ability to discover patterns that can be hidden in large databases and that are interpretable by humans, and hence useful for understanding the data and for decision-making.
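The idea of a frequent itemset can be made concrete with a small sketch. The code below is a simplified, level-wise search in the spirit of Apriori, not the implementation from [1]; the function name `frequent_itemsets`, the `min_support` parameter, and the toy transactions are all assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Hypothetical helper: a simplified level-wise (Apriori-style) search.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1: every distinct item is a candidate itemset of size 1.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only those
        # whose k-subsets are all frequent (downward-closure pruning).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Toy retail database (illustrative data, not from the paper).
transactions = [
    {"carrot juice", "salad", "kiwi"},
    {"carrot juice", "salad", "kiwi", "bread"},
    {"carrot juice", "salad"},
    {"bread", "milk"},
]
result = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these toy transactions and a minimum support of 2, the itemset {carrot juice, salad, kiwi} is reported because it appears in two transactions, while {milk} is discarded because it appears in only one. Note that each transaction is a set, so the order in which items were bought plays no role.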

For example, a pattern {milk, chocolate cookies} can be used to understand customer behavior and take strategic decisions to increase sales, such as co-promoting products and offering discounts.

Although pattern mining has become very popular due to its applications in many domains, several pattern mining techniques, such as those for frequent itemset mining [1, 53, 116, 86, 106] and association rule mining [1], are aimed at analyzing data where the sequential ordering of events is not taken into account. Thus, if such pattern mining techniques are applied to data with time or sequential ordering information, this information will be ignored.

This may result in the failure to discover important patterns in the data, or in finding patterns that may not be useful because they ignore the sequential relationship between events or elements. In many domains, the ordering of events or elements is important. For example, to analyze texts, it is often relevant to consider the order of words in sentences [94]. In network intrusion detection, the order of events is also important [93].

To address this issue, the task of sequential pattern mining was proposed. It is a prominent solution for analyzing sequential data [2, 98, 117, 4, 51, 89, 3, 47, 30, 111, 31, 32, 27, 28, 22, 100, 79].

It consists of discovering interesting subsequences in a set of sequences, where the interestingness of a subsequence can be measured in terms of various criteria such as its occurrence frequency, length, and profit. Sequential pattern mining has numerous real-life applications because data is naturally encoded as sequences of symbols in many fields such as bioinformatics [108], e-learning [22, 124], market basket analysis [98], text analysis [94], energy reduction in smart homes [104], and webpage click-stream analysis [25].
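To illustrate the definition above, the following sketch counts the occurrence frequency (support) of a candidate subsequence in a small set of sequences. It assumes a simplified setting in which each sequence is a plain list of single symbols; the helper names `is_subsequence` and `support` and the click-stream data are hypothetical and chosen only for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the symbols of pattern occur in sequence in the same order.

    Gaps between matched symbols are allowed.
    """
    remaining = iter(sequence)
    # "symbol in remaining" advances the iterator past the first match,
    # so each later symbol must be found after the earlier ones.
    return all(symbol in remaining for symbol in pattern)


def support(pattern, database):
    """Number of sequences in database that contain pattern as a subsequence."""
    return sum(1 for sequence in database if is_subsequence(pattern, sequence))


# Toy click-stream database: one list of visited pages per session
# (illustrative data, not from the paper).
sessions = [
    ["home", "search", "cart"],
    ["home", "cart", "search"],
    ["search", "home", "cart"],
    ["home", "search"],
]

forward = support(["home", "search"], sessions)
backward = support(["search", "home"], sessions)
print(forward, backward)
```

Here the pattern (home, search) is supported by three sessions, whereas the reversed pattern (search, home) is supported by only one. An itemset-based view would treat both as the same set {home, search}, which is exactly the ordering information that sequential pattern mining preserves.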

Moreover, sequential pattern mining can also be applied to time series (e.g. stock data), when discretization is performed as a pre-processing step [66].

Sequential pattern mining is a very active research topic, where hundreds of papers present new algorithms and applications each year, including numerous extensions of sequential pattern mining for specific needs. Because of this, it can be difficult for newcomers to get an overview of the field. To address this issue, a survey was published in 2010 [79]. However, that survey is no longer up to date, as it does not discuss the most recent techniques, advances, and challenges in the field.

In this paper, we aim to address this issue by presenting an up-to-date survey of sequential pattern mining that can serve both as an introduction and as a guide to recent advances and opportunities in the field. The rest of this paper is organized as follows. The next section describes the problem of sequential pattern mining and the main techniques employed in sequential pattern mining. Then, the paper discusses popular extensions of the problem of sequential pattern mining, and other problems in data mining that are closely related to sequential pattern mining.

