Monday, 13 July 2009

2.1 Data quality definition

"Data has quality if it satisfies the requirements of its intended use. It lacks quality to the extent that it does not satisfy the requirement. In other words, data quality depends as much on the intended use as it does on the data itself. To satisfy the intended use, the data must be accurate, timely, relevant, complete, understood, and trusted." (Olson, J. (2003). Data quality: The accuracy dimension. p.24)
Data quality is generally defined along the six dimensions named in this quotation.
Accuracy: the quality of being near to the true value. (Wordnet.princeton.edu. (2009). Accuracy definition.) Olson considers accuracy the most important dimension. (Olson, J. (2003). Data quality: The accuracy dimension. p.3)
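Timely: available and up to date when the data is needed. This is the opposite of timelessness, defined as "unaffected by time" (Wordnet.princeton.edu. (2009). Timelessness definition.), because the value of timely data depends on when it is delivered.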
Relevant: the degree to which search results meet the requirements or expectations implicit in the query. (WhamTech . (n.d). Glossary of less-than-usual terms used in the Web site.)
Complete: brought to a whole, with all the necessary parts or elements present.
Understood: perceived mentally, so that users grasp what the data means.
Trusted: believed and relied on by the people who use the data.
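
Some of these dimensions lend themselves to simple measurement. The following is a minimal sketch, assuming that accuracy is taken as the share of stored values that match a trusted reference and completeness as the share of fields that are filled in. Both conventions, the records, and the field names are illustrative assumptions, not definitions taken from the sources cited above.

```python
# Hypothetical customer records; None marks a missing value.
stored = [
    {"name": "Alice", "city": "Paris", "phone": "0102030405"},
    {"name": "Bob", "city": "Lyon", "phone": None},
    {"name": "Carol", "city": "Nice", "phone": "0607080910"},
]

# Hypothetical trusted reference holding the "true" values for the same records.
reference = [
    {"name": "Alice", "city": "Paris", "phone": "0102030405"},
    {"name": "Bob", "city": "Lille", "phone": "0203040506"},
    {"name": "Carol", "city": "Nice", "phone": "0607080910"},
]


def completeness(records):
    """Share of fields that hold a value (assumed convention)."""
    values = [v for record in records for v in record.values()]
    return sum(v is not None for v in values) / len(values)


def accuracy(records, truth):
    """Share of fields whose stored value equals the reference value."""
    pairs = [(rec[k], ref[k]) for rec, ref in zip(records, truth) for k in ref]
    return sum(a == b for a, b in pairs) / len(pairs)


print(f"completeness: {completeness(stored):.0%}")
print(f"accuracy: {accuracy(stored, reference):.0%}")
```

In this sketch a missing value also counts as inaccurate, since it cannot match the reference; a real assessment might treat the two dimensions separately.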

Each of these dimensions only needs to reach the level that the intended use requires. As stated above, everything depends on the intended use of the information. For example, a database that is 70% accurate may still be valuable to some company departments (e.g. marketing, for estimations) because the accurate 70% of the data is exploitable.
On the other hand, it can be useless to others, e.g. an accounting department, which cannot release a balance sheet that is only 70% accurate.
Data quality is a complex topic, and further dimensions can be considered depending on how the data is used, such as:
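
The 70% example can be expressed as a simple fitness-for-use check. This is only a sketch: the per-department thresholds below are hypothetical figures chosen for illustration, not values from the cited sources.

```python
# Hypothetical minimum accuracy each department needs; the figures are
# illustrative assumptions, not values taken from the cited sources.
REQUIRED_ACCURACY = {
    "marketing": 0.60,   # rough estimations tolerate some errors
    "accounting": 1.00,  # a balance sheet must be exact
}


def fit_for_use(measured_accuracy, department):
    """Return True if the measured accuracy meets the department's need."""
    return measured_accuracy >= REQUIRED_ACCURACY[department]


database_accuracy = 0.70
for department in REQUIRED_ACCURACY:
    verdict = "usable" if fit_for_use(database_accuracy, department) else "not usable"
    print(f"{department}: {verdict}")
```

The same database gets a different verdict for each department, which is the sense in which quality depends on the intended use as much as on the data itself.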

Accessibility, Accuracy, Amount of data, Applicability, Attractiveness, Availability, Believability, Completeness, Concise representation, Consistent representation, Cost effectiveness, Customer support, Currency, Documentation, Duplicates, Ease of operation, Expiration, Flexibility, Granularity, Interactive, Internal consistency, Interpretability, Latency, Maintainable, Novelty, Objectivity, Ontology, Organization, Price, Relevancy, Reliability, Reputation, Response time, Security, Specialization, Source's information, Timeliness, Understandability, Validity, Value-added. (Muñoz, C./Moraga, A./Piattini, M. (2008). Handbook of Research on Web Information Systems Quality. p.138)
