
The Handbook of Data Mining - pudn.com




Transcription of The Handbook of Data Mining - pudn.com

THE HANDBOOK OF DATA MINING
Edited by Nong Ye

Human Factors and Ergonomics
Gavriel Salvendy, Series Editor

  Hendrick, H., and Kleiner, B. (Eds.): Macroergonomics: Theory, Methods, and Applications
  Hollnagel, E. (Ed.): Handbook of Cognitive Task Design
  Jacko and Sears, A. (Eds.): The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications
  Meister, D., and Enderwick, T. (Eds.): Human Factors in System Design, Development, and Testing
  Stanney, Kay M. (Ed.): Handbook of Virtual Environments: Design, Implementation, and Applications
  Stephanidis, C. (Ed.): User Interfaces for All: Concepts, Methods, and Tools
  Ye, Nong (Ed.): The Handbook of Data Mining

Also in this Series

HCI 1999 Proceedings 2-Volume Set
  Bullinger and Ziegler, J. (Eds.): Human-Computer Interaction: Ergonomics and User Interfaces
  Bullinger and Ziegler, J. (Eds.): Human-Computer Interaction: Communication, Cooperation, and Application Design

HCI 2001 Proceedings 3-Volume Set
  Smith, Salvendy, G., Harris, D., and Koubek (Eds.): Usability Evaluation and Interface Design: Cognitive Engineering, Intelligent Agents, and Virtual Reality
  Smith and Salvendy, G. (Eds.): Systems, Social, and Internationalization Design Aspects of Human-Computer Interaction
  Stephanidis, C. (Ed.): Universal Access in HCI: Towards an Information Society for All

For more information about LEA titles, please contact Lawrence Erlbaum Associates, Publishers.

THE HANDBOOK OF DATA MINING
Edited by Nong Ye
Arizona State University

LAWRENCE ERLBAUM ASSOCIATES, PUBLISHERS
2003
Mahwah, New Jersey
London

Senior Acquisitions Editor: Debra Riegert
Editorial Assistant: Jason Planer
Cover Design: Kathryn Houghtaling Lacey
Textbook Production Manager: Paul Smolenski
Full-Service Compositor: TechBooks
Text and Cover Printer: Hamilton Printing Company

This book was typeset in 10/12 pt. Times, Italic, Bold, Bold Italic, and Courier. The heads were typeset in Americana Bold.

Copyright 2003 by Lawrence Erlbaum Associates, Inc. All rights reserved. No part of this book may be reproduced in any form, by photostat, microfilm, retrieval system, or any other means, without prior written permission of the publisher.

Lawrence Erlbaum Associates, Inc., Publishers
10 Industrial Avenue
Mahwah, New Jersey 07430

The editor, authors, and the publisher have made every effort to provide accurate and complete information in this handbook but the handbook is not intended to serve as a replacement for professional advice. Any use of this information is at the reader's discretion. The editor, authors, and the publisher specifically disclaim any and all liability arising directly or indirectly from the use or application of any information contained in this handbook. An appropriate professional should be consulted regarding your specific situation.

Library of Congress Cataloging-in-Publication Data

The handbook of data mining / edited by Nong Ye.
p. cm. (Human factors and ergonomics)
Includes bibliographical references and index.
ISBN 0-8058-4081-8
1. Data mining. I. Ye, Nong. II. Series.
H385 dc21 2002156029

Books published by Lawrence Erlbaum Associates are printed on acid-free paper, and their bindings are chosen for strength and durability.

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

Contents

Foreword (Gavriel Salvendy)
Preface (Nong Ye)
About the Editor
Advisory Board
Contributors

I: METHODOLOGIES OF DATA MINING

1 Decision Trees (Johannes Gehrke)
  Introduction
  Problem Definition
  Classification Tree Construction
  Split Selection
  Data Access
  Tree Pruning
  Missing Values
  A Short Introduction to Regression Trees
  Problem Definition
  Split Selection
  Data Access
  Applications and Available Software
  Cataloging Sky Objects
  Decision Trees in Today's Data Mining Tools
  Summary
  References

2 Association Rules (Geoffrey I. Webb)
  Introduction
  Market Basket Analysis
  Association Rule Discovery
  The Apriori Algorithm
  The Power of the Frequent Item Set Strategy
  Measures of Interestingness
  Lift
  Leverage
  Item Set Discovery
  Techniques for Frequent Item Set Discovery
  Closed Item Set Strategies
  Long Item Sets
  Sampling
  Techniques for Discovering Association Rules without Item Set Discovery
  Associations with Numeric Values
  Applications of Association Rule Discovery
  Summary
  References

3 Artificial Neural Network Models for Data Mining (Jennie Si, Benjamin J. Nelson, and George C. Runger)
  Introduction to Multilayer Feedforward Networks
  Gradient Based Training Methods for MFN
  The Partial Derivatives
  Nonlinear Least Squares Methods
  Batch versus Incremental Learning
  Comparison of MFN and Other Classification Methods
  Decision Tree Methods
  Discriminant Analysis Methods
  Multiple Partition Decision Tree
  A Growing MFN
  Case Study 1: Classifying Surface Texture
  Experimental Conditions
  Quantitative Comparison Results of Classification Methods
  Closing Discussions on Case 1
  Introduction to SOM
  The SOM Algorithm
  SOM Building Blocks
  Implementation of the SOM Algorithm
  Case Study 2: Decoding Monkey's Movement Directions from Its Cortical Activities
  Trajectory Computation from Motor Cortical Discharge Rates
  Using Data from Spiral Tasks to Train the SOM
  Using Data from Spiral and Center Out Tasks to Train the SOM
  Average Testing Result Using the Leave-K-Out Method
  Closing Discussions on Case 2
  Final Conclusions and Discussions
  References

4 Statistical Analysis of Normal and Abnormal Data (Connie M. Borror)
  Introduction
  Univariate Control Charts
  Variables Control Charts
  Attributes Control Charts
  Cumulative Sum Control Charts
  Exponentially Weighted Moving Average Control Charts
  Choice of Control Charting Techniques
  Average Run Length
  Multivariate Control Charts
  Data Description
  Hotelling T² Control Chart
  Multivariate EWMA Control Charts
  Summary
  References

5 Bayesian Data Analysis (David Madigan and Greg Ridgeway)
  Introduction
  Fundamentals of Bayesian Inference
  A Simple Example
  A More Complicated Example
  Hierarchical Models and Exchangeability
  Prior Distributions in Practice
  Bayesian Model Selection and Model Averaging
  Model Selection
  Model Averaging
  Model Assessment
  Bayesian Computation
  Importance Sampling
  Markov Chain Monte Carlo (MCMC)
  An Example
  Application to Massive Data
  Importance Sampling for Analysis of Massive Data Sets
  Variational Methods
  Bayesian Modeling
  BUGS and Models of Realistic Complexity via MCMC
  Bayesian Predictive Modeling
  Bayesian Descriptive Modeling
  Available Software
  Discussion and Future Directions
  Summary
  Acknowledgments
  References

6 Hidden Markov Processes and Sequential Pattern Mining (Steven L. Scott)
  Introduction to Hidden Markov Models
  Parameter Estimation in the Presence of Missing Data
  The EM Algorithm
  MCMC Data Augmentation
  Missing Data Summary
  Local Computation
  The Likelihood Recursion
  The Forward-Backward Recursions
  The Viterbi Algorithm
  Understanding the Recursions
  A Numerical Example Illustrating the Recursions
  Illustrative Examples and Applications
  Fetal Lamb Movements
  The Business Cycle
  HMM Stationary and Predictive Distributions
  Stationary Distribution of d_t
  Predictive Distributions
  Posterior Covariance of h
  Available Software
  Summary
  References

7 Strategies and Methods for Prediction (Greg Ridgeway)
  Introduction to the Prediction Problem
  Guiding Examples
  Prediction Model Components
  Loss Functions: What We Are Trying to Accomplish
  Common Regression Loss Functions
  Common Classification Loss Functions
  Cox Loss Function for Survival Data
  Linear Models
  Linear Regression
  Classification
  Generalized Linear Model
  Nonlinear Models
  Nearest Neighbor and Kernel Methods
  Tree Models
  Smoothing, Basis Expansions, and Additive Models
  Neural Networks
  Support Vector Machines
  Boosting
  Availability of Software
  Summary
  References

8 Principal Components and Factor Analysis (Daniel W. Apley)
  Introduction
  Examples of Variation Patterns in Correlated Multivariate Data
  Overview of Methods for Identifying Variation Patterns
  Representation and Illustration of Variation Patterns in Multivariate Data
  Principal Components Analysis
  Definition of Principal Components
  Using Principal Components as Estimates of the Variation Patterns
  Factor Rotation
  Capabilities and Limitations of PCA
  Methods for Factor Rotation
  Blind Source Separation
  The Classic Blind Source Separation Problem
  Blind Separation Principles
  Fourth-Order Blind Separation Methods
  Additional Manufacturing Applications
  Available Software
  Summary
  References

9 Psychometric Methods of Latent Variable Modeling (Edward Ip, Igor Cadez, and Padhraic Smyth)
  Introduction
  Basic Latent Variable Models
  The Basic Latent Class Model
  The Basic Finite Mixture Model
  The Basic Latent Trait Model
  The Basic Factor Analytic Model
  Common Structure
  Extension for Data Mining
  Extending the Basic Latent Class Model
  Extending the Basic Mixture Model
  Extending the Latent Trait Model
  Extending the Factor Analytic Model
  An Illustrative Example
  Hierarchical Structure in Transaction Data
  Individualized Mixture Models
  Data Sets
  Experimental Results
  References and Tools
  References
  Tools
  Summary
  References

10 Scalable Clustering (Joydeep Ghosh)
  Introduction
  Clustering Techniques: A Brief Survey
  Partitional Methods
  Hierarchical Methods
  Discriminative versus Generative Models
  Assessment of Results
  Visualization of Results
  Clustering Challenges in Data Mining
  Transactional Data Analysis
  Next Generation Clickstream Clustering
  Clustering Coupled Sequences
  Large Scale Remote Sensing
  Scalable Clustering for Data Mining
  Scalability to Large Number of Records or Patterns, N
  Scalability to Large Number of Attributes or Dimensions, d
  Balanced Clustering
  Sequence Clustering Techniques
  Case Study: Similarity Based Clustering of Market Baskets and Web Logs
  Case Study: Impact of Similarity Measures on Web Document Clustering
  Similarity Measures: A Sampler
  Clustering Algorithms and Text Data Sets
  Comparative Results
  Clustering Software
  Summary
  Acknowledgments
  References

11 Time Series Similarity and Indexing (Gautam Das and Dimitrios Gunopulos)
  Introduction
  Time Series Similarity Measures
  Euclidean Distances and Lp Norms
  Normalization Transformations
  General Transformations
  Dynamic Time Warping
  Longest Common Subsequence Similarity
  Piecewise Linear Representations
  Probabilistic Methods
  Other Similarity Measures
  Indexing Techniques for Time Series
  Indexing Time Series When the Distance Function Is a Metric
  A Survey of Dimensionality Reduction Techniques
  Similar Time-Series Retrieval When the Distance Function Is Not a Metric
  Subsequence Retrieval
  Summary
  References

12 Nonlinear Time Series Analysis (Ying-Cheng Lai, Zonghua Liu, Nong Ye, and Tolga Yalcinkaya)
  Introduction
  Embedding Method for Chaotic Time Series Analysis
  Reconstruction of Phase Space
  Computation of Dimension
  Detection of Unstable Periodic Orbits
  Computing Lyapunov Exponents from Time Series
  Time-Frequency Analysis of Time Series
  Analytic Signals and Hilbert Transform
  Method of EMD
  Summary
  Acknowledgment
  References

13 Distributed Da

