The extensive discussion triggered by my question about the value of an incremental investment in data quality (Economics and Quality Shared) led to the statement that is the title of this post. This conclusion can be re-stated as a mathematical solution for Potential Value.
Potential Value = (data × quality)^sharing
Potential Value is currency-neutral in this context. One could imagine solving this problem in the currencies of economic, environmental, or societal value.
Data in this context has a length scale of time. The longer the period of record, the greater the potential value.
Quality in this context has a scale that is the product of the rigor of the quality management system and the defensibility of compliance with that system. The quality factor has dimensions equal to the service objectives of the quality management framework, typically accuracy, timeliness, and reliability.
Data with the highest potential for extensive re-usability are collected to internationally recognized standards complete with auditable traceability of full compliance.
A more reasonable quality objective for most data providers would be compliance with local or regional standards that are meaningful in context, useful for the primary client, and affordable, and hence achievable, traceable, and defensible.
The quality is passed along with the data as a metadata payload that can be investigated for analysis of fitness for purpose.
Sharing, the exponent in this equation, has two scaling factors. One is curation of both the data and the metadata. Curation is a metric for data life-cycle management, which determines the likelihood of the data being intact 10, 100, or 500 years into the future. Curation of the metadata payload includes a requirement for cataloguing to make the data searchable and discoverable. The other scaling factor is interoperability.
Imagine you chiseled your data into stone tablets. These data would have great enduring value; however, they would have low interoperability for data sharing.
Data with a sharing exponent of zero would resolve to a potential value of 1 (relative to whatever currency and scaling factor for value that you choose). In other words, single-purpose data have finite potential value. In my post Dark Data, I equate data sharing with recycling. Further, I argue that it is unethical not to share because you can never predict the value of the information in your data to a user that you don’t know.
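For readers who like to see the arithmetic, here is a minimal sketch of the equation in Python. It rests on several assumptions that are mine and not part of the argument above: every factor is treated as a dimensionless, non-negative score; the function names are hypothetical; and the two sharing factors (curation and interoperability) are simply multiplied together, because the post says there are two scaling factors without saying how they combine.

```python
def quality_factor(rigor: float, defensibility: float) -> float:
    """Quality as the product of QMS rigor and defensibility of compliance."""
    return rigor * defensibility


def sharing_exponent(curation: float, interoperability: float) -> float:
    """Sharing exponent from its two scaling factors.

    Assumption: the factors are multiplied. The post does not specify
    how curation and interoperability combine.
    """
    return curation * interoperability


def potential_value(data: float, quality: float, sharing: float) -> float:
    """Potential Value = (data x quality) ** sharing, in arbitrary units."""
    if data < 0 or quality < 0 or sharing < 0:
        raise ValueError("all factors are assumed to be non-negative")
    return (data * quality) ** sharing


if __name__ == "__main__":
    # Illustrative numbers only: a 30-year record with made-up scores.
    q = quality_factor(rigor=0.9, defensibility=0.8)

    # Single-purpose data: a sharing exponent of zero.
    print(potential_value(30.0, q, sharing=0.0))

    # Stone tablets: well curated, but barely interoperable.
    print(potential_value(30.0, q, sharing_exponent(2.0, 0.1)))

    # Well curated and interoperable.
    print(potential_value(30.0, q, sharing_exponent(2.0, 1.0)))
```

With a sharing exponent of zero the result is 1 regardless of how long or how good the record is, which is the single-purpose case described above. Because sharing sits in the exponent, improvements in curation or interoperability compound the value of the record and its quality; they do not merely add to it.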
It is, perhaps, useful to visualize this equation in terms of its length vectors:
These vectors each represent manageable elements of any monitoring program. It is also useful to consider the role of time in each of these vectors. Time is explicit in the length vector for period of record. Time is implicit in the service objective of timeliness for the quality vectors. Time is also implicit in the sharing vector. This can be interpreted as meaning that data continue to accrue in potential value for as long as they are properly curated.
Some would argue that potential value is also a function of location. Data from a water-rich, data-rich region are surely less valuable than data from a water-poor, data-poor region. I am not so sure. No one can predict with certainty who will discover great value in your data, for what purpose, or why. We are witnessing the slow-motion train wreck that is the collision of land-use change with climate change, which is creating a growing need for water data at all locations and across all time and space scales.
What are the social/economic/political/technological/ethical barriers that need to be broken down to increase the rate at which water data are shared? Breaching these barriers will exponentially increase the cumulative potential value of our global data assets for resolving the wicked problems ahead.
Do you agree?