Environmental agencies and organizations invest huge amounts of money in the hardware and software infrastructure needed to collect and store data from field sensors, so that they can extract the valuable information about the environment hidden in the time series. If the sensor measurements do not accurately represent the environmental parameter of interest, the extracted information will be misleading, and decisions based on false information can even put human life at risk.

## Mission-Critical Hydrological Data and Accuracy in Models

As an example, it is extremely important for consumers of mission-critical hydrological data, such as flood forecasting centers, to feed high-quality data with a defined level of resolution and accuracy into their models, in order to minimize the risk of the devastating consequences of poor and inaccurate forecasts.

Organizations that are responsible for managing and publishing hydrological data in near real-time generally formulate data production standards in terms of resolution and accuracy for each parameter. The question is: once the data quality standards are defined, how are they applied to the data? What we are looking for are reliable and efficient techniques and tools to validate acquired sensor data against the data quality standards and to correct it before it is published and passed into data analysis and modeling tools.

I’d like to start this discussion with a little background on measurement accuracy and error, and then move on to automated validation and correction of hydrological data and the latest technological breakthroughs in this area.

## Definitions

The first basic question that comes to mind is what accuracy means, and how it differs from terms like resolution, error and uncertainty. These terms are often used interchangeably in day-to-day conversation. Here are their definitions [1] [2]:

- The resolution of a measuring device or technique is the smallest incremental change of the input parameter that can be detected in the output signal. For instance, a data logger and pressure transducer will often resolve a stage measurement to 1 mm, but the accuracy may be poorer than this because of different types of sensor errors, which will be discussed in some detail later.
- The accuracy of a measurement relates to how well it expresses the true value. However, as the true value is often unknown, the accuracy of a hydrological measurement usually has to be expressed in terms of statistical probabilities. Accuracy is a qualitative term, although it is not unusual to see it used quantitatively. As such, it only has validity if used in an indicative sense; any serious estimate should be in terms of uncertainty (defined below).
- The error in a result is the difference between the measured value and the true value of the quantity measured. Errors can commonly be classified as systematic, random or spurious:

a) Systematic error is the type of error that either remains constant over a number of measurements of the same value of a given quantity (e.g. offset error) or varies according to a definite law when conditions change (e.g. drift error). I will talk about this type of error in detail later.

b) Random error varies in an unpredictable manner, in magnitude and in sign, when measurements of the same variable are made under the same conditions.

c) Spurious errors, or outliers, are deterministic and known errors due to human mistakes or instrument malfunction.
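To make the three error classes concrete, here is a minimal sketch that adds each of them to a constant "true" stage value. Everything in it is an assumption made for illustration: the function name `simulate_readings`, the 2.000 m true stage, the 20 mm offset, the linear drift rate, the 3 mm noise level and the single 1.5 m spike are all hypothetical values, not taken from any real sensor.

```python
import random

def simulate_readings(true_stage_m=2.000, n=100, offset_m=0.020,
                      drift_m_per_step=0.0005, noise_sd_m=0.003,
                      spike_index=50, spike_m=1.5, seed=42):
    """Return (readings, errors) for a constant true stage with three error types added.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    readings, errors = [], []
    for i in range(n):
        # Systematic error: a constant offset plus a drift that follows a definite (linear) law
        systematic = offset_m + drift_m_per_step * i
        # Random error: unpredictable in magnitude and sign
        random_part = rng.gauss(0.0, noise_sd_m)
        # Spurious error: one isolated outlier, e.g. an instrument glitch
        spurious = spike_m if i == spike_index else 0.0
        error = systematic + random_part + spurious
        errors.append(error)                  # error = measured value - true value
        readings.append(true_stage_m + error)
    return readings, errors

readings, errors = simulate_readings()
print(f"error at first sample: {errors[0]:+.4f} m")
print(f"error at last sample:  {errors[-1]:+.4f} m")
print(f"error at the spike:    {errors[50]:+.4f} m")
```

In this sketch the error grows slowly from the first sample to the last because of the drift, scatters by a few millimetres because of the random component, and jumps by about 1.5 m at the single spurious sample.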

- Uncertainty is the range within which the true value of a measured quantity can be expected to lie, at a stated probability (or confidence level). The numerical value of the uncertainty is the product of the true standard deviation of the errors and a numerical parameter α that depends on the confidence level. In mathematical form, e = αS, where e is the uncertainty, α is the parameter associated with the confidence level, and S is the standard deviation, a measure of the dispersion of values about their mean.

The following table lists α for random errors with a Gaussian distribution:

| Confidence level | α |
| --- | --- |
| 0.50 | 0.674 |
| 0.60 | 0.842 |
| 0.66 | 0.954 |
| 0.80 | 1.282 |
| 0.90 | 1.645 |
| 0.95 | 1.960 |
| 0.98 | 2.326 |
| 0.99 | 2.576 |
| 0.999 | 3.291 |
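The α values above are the two-sided quantiles of the standard normal distribution, so they can be reproduced rather than looked up. The sketch below does this with Python's standard-library `statistics.NormalDist` and then applies e = αS to a small set of errors. The helper names `alpha_for_confidence` and `uncertainty`, and the sample error values, are assumptions made for illustration; the sample standard deviation is used here as an estimate of the true S.

```python
from statistics import NormalDist, stdev

def alpha_for_confidence(confidence):
    """Two-sided Gaussian multiplier for a confidence level between 0 and 1."""
    # A central interval covering `confidence` leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def uncertainty(errors, confidence=0.95):
    """Uncertainty e = alpha * S for a list of measurement errors."""
    s = stdev(errors)  # sample standard deviation, used as an estimate of S
    return alpha_for_confidence(confidence) * s

# Reproduce a few rows of the table
for level in (0.50, 0.90, 0.95, 0.99):
    print(f"{level:.2f} -> {alpha_for_confidence(level):.3f}")

# Hypothetical stage errors in metres (measured minus true)
sample_errors = [0.004, -0.002, 0.001, -0.003, 0.002]
print(f"uncertainty at 95%: {uncertainty(sample_errors):.4f} m")
```

Note that this only makes sense for the random part of the error; a systematic offset or drift has to be removed first, otherwise S no longer describes dispersion about a meaningful mean.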

References:

[1] Guide to Hydrological Practices, World Meteorological Organization, WMO-No. 168, 2008

[2] J.J. Carr, Sensors and Circuits, Prentice Hall, 1993

Stay tuned for Part 2, where we take a closer look at the various types and sources of sensor errors…
