
A Model for Data Quality Assessment


Baba Piprani (SICOM, Canada), Denise Ernst (DSI Now, Canada)
babap@attglobal.net, denise.mcconnell@rogers.com



Abstract. One of the major causes of the failure of information systems to deliver can be attributed to data quality. Gartner's figures and other similar studies show the failure rate for data warehouses hovering at a plateau of 50% since 2004. While the true cause of poor data quality can be attributed to a lack of supporting business processes and insufficient analysis techniques, along with the self-protective introduction of data quality firewalls for incoming data, the question has to be raised as to whether a data quality assessment of the existing data would be worthwhile or plausible.

This paper defines a data quality assessment model that enables a methodology to assess data quality and assign ratings using a score-card approach. A by-product of this model helps establish sluice-gate parameters that allow data to pass through data quality filters and data quality firewalls.

Keywords: data quality assessment, data quality firewall, data quality filter, data lineage, type instance

1 Introduction

Did you know that in September 1999 a metric mishap caused the crash landing of a Mars-bound spacecraft? NASA lost a $125 million Mars orbiter because two engineering teams used different units of measure for a key spacecraft operation, and the mismatch caused the data transfer to fail [2]. Did you know that a referee in a World Cup Soccer 2006 match between Australia and Croatia handed a player three yellow cards before the player was sent off? The rule is that two yellows result in a player being sent off [1]. And did you know that data warehouse success measures, or more appropriately stated, failure rates or limited-acceptance rates, have ranged from 92% in the late 1990s to greater than 50% for 2007 [3][6]? A dismal record indeed.

So what do we mean by "failure"? The meaning of the term has been amplified by the Standish Group [4]: success refers to the project being completed on time and on budget with all features and functions as initially specified; challenged refers to the project being completed and operational, but over budget, over the time estimate, and offering only a subset of the features and functions originally specified; and impaired refers to the project being cancelled at some point during the development cycle.

According to the Standish Group's 2003 CHAOS report, 15% of IT projects failed and another 51% were considered challenged, while 82% of IT projects experienced significant schedule slippage and only 52% of required features and functions were delivered. For 2004, results show that 29% of all projects succeeded (delivered on time, on budget, with required features and functions); 53% were challenged; and 18% failed (cancelled prior to completion, or delivered and never used). A staggering 66% of IT projects proved unsuccessful in some measure, whether they failed completely, exceeded their allotted budget, weren't completed on schedule, or were rolled out with fewer features and functions than promised [5].

2 Root cause of failures

Quality appears to be missing from the meaning of success or failure of an IT project, and lack of data quality appears to be the major culprit in the failure of IT projects. The traditional project management triangle of cost, scope and schedule is shown in Figure 1; quality should be injected throughout the cycle, yet often enough there is no associated project deliverable for it. This gap is unexpected yet understood, so how do we assess and close it?

Fig. 1. Project Management triangle.

It is important to observe that in any typical manufacturing process, quality is injected into every process from the very start. For example, a casting metal foundry technician systematically monitors a melt to verify that the required composition of metal compounds (carbon, iron, nickel, chromium, zinc, etc.) is in place prior to pouring, to ensure the quality of the desired metal casting. Similarly, it is imperative that data quality be injected into every phase of information system design and implementation, with due diligence to governance, monitoring and auditing, among other things. In this paper we explore how we can define a similar quality control assessment for data.

3 Issues to tackle

Where do we start? In our experience, examining the IT projects we have been called into to salvage and steer the usually sinking project towards 100% success, we have observed the following issues:

- Business requirements documentation is non-existent, not maintained upon change, or too high-level; it lacks an integrated enterprise viewpoint and supporting business processes
- Business rules are buried in program code, which results in higher maintenance costs, dependency on specialized skills, and a lack of awareness
- Definitions are undocumented and semantics are missing
- There is no ability to audit and monitor changes to the architecture and the contained data

These are only samplings of the issues encountered that contribute to the data quality chasm in the building of information systems, both existing and under development.

This list demonstrates a need to address data quality assessments throughout the solution's Systems Development Lifecycle (SDLC), as in Figure 2 (Requirements, Analysis/Design, Construction, Transition, with data quality assessment spanning all phases).

Fig. 2. Generic SDLC and data quality assessments.

4 Data quality assessment objective

The objective of the assessment is to identify the quality of the data in the identified business activity. One organization could be primarily in the service industry while another is in the regulation sector. The assessment results determine the accuracy, completeness, consistency, precision, reliability, temporal reliability, uniqueness and validity of the data.
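Several of the dimensions listed above can be expressed as executable checks over a record set. The following is a minimal sketch, assuming an illustrative record layout, field names, and a date-format validity rule that are not taken from the paper:

```python
# Illustrative completeness, uniqueness, and validity checks.
# Field names and the ISO date validity rule are assumptions for the example.
from datetime import datetime

records = [
    {"id": 1, "amount": "100.50", "posted": "2007-03-01"},
    {"id": 2, "amount": None,     "posted": "2007-03-02"},
    {"id": 2, "amount": "75.00",  "posted": "03/02/2007"},  # duplicate id, bad date
]

def completeness(rows, field):
    """Fraction of rows where the field is populated."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def uniqueness(rows, field):
    """Fraction of rows whose value for the field is not repeated."""
    values = [r[field] for r in rows]
    return sum(1 for v in values if values.count(v) == 1) / len(rows)

def validity(rows, field, fmt="%Y-%m-%d"):
    """Fraction of rows whose value parses under the expected format."""
    def ok(v):
        try:
            datetime.strptime(v, fmt)
            return True
        except (TypeError, ValueError):
            return False
    return sum(1 for r in rows if ok(r.get(field)))/ len(rows)

print(round(completeness(records, "amount"), 2))
print(round(uniqueness(records, "id"), 2))
print(round(validity(records, "posted"), 2))
```

Each function returns a ratio in [0, 1], which makes the individual dimensions directly comparable when rolled up into an overall assessment.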

Assessment standard criteria are used when conducting the assessment. When conducting an assessment, the business requirements are your window to the world. This can be a daunting task when assessing the quality of data from an enterprise perspective; remember to scope the tests within the assessment criteria to ensure a balanced cost/benefit for the organization. For example, if financial data is vital to the business operations, then the quality of such data will be an important factor in key business decision-making. Quality assessment can happen in several manners, generally as either detection tests or penetration tests. Assessment detection tests assess data quality, identify risks, and can be used to determine risk mitigation efforts.
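A detection test of the kind described above can be sketched as a set of rule checks that record every violation for later risk review. The rule names and sample financial rows below are assumptions made for illustration, not rules defined by the paper:

```python
# Sketch of an assessment detection test: rule-based checks over rows,
# collecting (row index, rule name) pairs for risk-mitigation review.
# Rules and sample data are illustrative assumptions.
rules = {
    "amount_non_negative": lambda r: r["amount"] >= 0,
    "currency_known":      lambda r: r["currency"] in {"CAD", "USD", "EUR"},
}

rows = [
    {"amount": 250.0, "currency": "CAD"},
    {"amount": -10.0, "currency": "CAD"},   # violates amount_non_negative
    {"amount": 99.0,  "currency": "XXX"},   # violates currency_known
]

def detect(rows, rules):
    """Return (row_index, rule_name) for every failed check."""
    return [(i, name) for i, row in enumerate(rows)
            for name, check in rules.items() if not check(row)]

violations = detect(rows, rules)
print(violations)  # [(1, 'amount_non_negative'), (2, 'currency_known')]
```

Because a detection test only observes and reports, it can be run safely against production data, unlike a penetration test, which deliberately injects faulty data.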

Assessment penetration tests, in addition to assessment detection, penetrate systems with faulty data and monitor the effect and result. This helps to identify process deficiencies as well as determine the quality of the data. The summary of assessment tests should reveal data quality scorecard metrics vital to the organization's business and operations.

5 Methodology

Assessing data quality should not be like trying to pick up jello! Nor should it be an exercise in throwing darts on a Saturday afternoon in a pub! What is needed is an approach to methodically put in place data quality measures and standards sufficiently applicable at any stage of the life cycle, even when being parachuted into any part of the life cycle.
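The scorecard metrics mentioned above can be rolled up into a single rated score. The sketch below assumes illustrative dimensions, weights, and rating bands; the paper does not prescribe any of these values:

```python
# Minimal scorecard roll-up: weighted sum of per-dimension scores (0..1)
# mapped to a traffic-light rating. Weights and bands are assumptions.
scores  = {"completeness": 0.95, "validity": 0.80, "uniqueness": 0.99}
weights = {"completeness": 0.4,  "validity": 0.4,  "uniqueness": 0.2}

overall = sum(scores[d] * weights[d] for d in scores)

def rating(score):
    """Map a 0..1 score to an illustrative traffic-light band."""
    if score >= 0.95:
        return "green"
    if score >= 0.80:
        return "amber"
    return "red"

print(round(overall, 3), rating(overall))
```

Weighting lets the organization emphasize the dimensions most vital to its operations, e.g. giving financial-data validity a heavier weight, in line with the cost/benefit scoping discussed earlier.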

It then becomes a primary requirement to be able to assess data quality across both the earlier and later stages of the development life cycle from any given point within it. Not only that, it should be possible to home in precisely on any given stage of the development life cycle to enable the establishment of subsequent correctional measures going forward. Table 1 highlights how data quality assessment criteria are addressed by NIAM- and ORM-based modeling. Data quality assessment tests can be conducted in level-pair constructs across the three data lineage levels to determine the resultant data quality. The assessment test examples can be performed based on the data lineage level of the attribute, by allocating each attribute in the implementation a class-term, which simply groups similar attribute types based on similarity of concepts: amounts, dates, ratios, counts, quantities.
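The class-term allocation described above can be sketched as a simple mapping from attribute names to class-terms, so that attributes of the same type share the same assessment tests. The suffix naming convention used here is an assumption for illustration; the paper does not prescribe how class-terms are assigned:

```python
# Sketch of allocating attributes to class-terms by name suffix, so similar
# attribute types (amounts, dates, counts, ...) share assessment tests.
# The suffix convention is an illustrative assumption.
CLASS_TERMS = {
    "_amt": "amount",
    "_dt":  "date",
    "_cnt": "count",
    "_qty": "quantity",
    "_pct": "ratio",
}

def class_term(attribute_name):
    """Return the class-term for an attribute, or 'unclassified'."""
    for suffix, term in CLASS_TERMS.items():
        if attribute_name.endswith(suffix):
            return term
    return "unclassified"

attrs = ["invoice_amt", "ship_dt", "line_cnt", "discount_pct", "customer_name"]
allocation = {a: class_term(a) for a in attrs}
print(allocation)
```

Attributes left unclassified would then be flagged for manual review, since an assessment test cannot be selected for them automatically.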

