The use of inaccurate data can have significant consequences for organizations that rely on that data to develop and strengthen relationships with customers. Data silos are also a major problem for many businesses today. Even so, companies remain reluctant to stop capturing data, or to delete it, for fear of missing out on potential future value. The sheer growth in the velocity, volume, and variety of data fans this appetite: organizations keep gathering data without knowing its value or whether and when it will be useful. This article by Malcolm Chisholm at The Data Administration Newsletter (TDAN) discusses the challenges of data accuracy.
About Data Accuracy
Data accuracy is one of the “dimensions” of data quality. Data quality management aims to measure each dimension and take action when anomalies appear. However, the best practices for doing so vary from dimension to dimension.
Definition of Accuracy
According to Wikipedia, the definition of accuracy is “The degree of closeness of measurements of a quantity to that quantity’s true value.”
Here ‘true’ refers to a representation of reality and is very important for understanding accuracy. Aristotle defined truth in a way that emphasized the relationship between a representation and the reality that it seeks to depict.
Definition of Data Accuracy
According to TDAN, the definition of data accuracy is “The degree to which a data value represents what it purports to represent.”
The concept of data quality is challenging because there are disagreements about what it is and how it should be defined. However, TDAN believes its definition can serve as the basis for determining whether a data value represents what it purports to represent.
Data based on observations can never be 100% accurate, as two great minds in quality control have demonstrated. Walter A. Shewhart noted that all measurement systems contain error. W. Edwards Deming mentioned that human error is inherent in all measurement systems.
You can assess data accuracy by comparing a sample of data values with curated reference data. However, in some instances, data is not based on external observations, such as the values of a bank account.
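The sampling comparison described above can be sketched in a few lines of Python. This is only an illustration, not a method taken from the TDAN article: the function name `estimate_accuracy`, the customer keys, and the city values are all hypothetical, and the sketch assumes a trusted, curated reference is available for the sampled records.

```python
import random


def estimate_accuracy(records, reference, sample_size, seed=0):
    """Estimate the share of sampled values that match a curated reference.

    records and reference are dicts keyed by record id (hypothetical layout).
    Returns None when no sampled record can be checked against the reference.
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    keys = sorted(records)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    # Only records that also appear in the reference can be verified.
    checked = [k for k in sample if k in reference]
    if not checked:
        return None
    matches = sum(1 for k in checked if records[k] == reference[k])
    return matches / len(checked)


# Hypothetical data: customer id -> city as stored vs. city as verified.
records = {"c1": "London", "c2": "Paris", "c3": "Berlin", "c4": "Rome"}
reference = {"c1": "London", "c2": "Paris", "c3": "Munich", "c4": "Rome"}

# Sampling all four records: three of the four values match the reference.
accuracy = estimate_accuracy(records, reference, sample_size=4)
print(accuracy)  # 0.75
```

The result is an estimate of agreement with the reference, so it is only as trustworthy as the curated data it is compared against.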
Data accuracy is hard to achieve when the information is derived solely from observations. Moreover, you cannot estimate the accuracy of a result from the data alone.
To read the original article, click on https://tdan.com/the-hard-truth-of-data-accuracy/29020