How much information does your hydrometric data contain?

I have written previously about the measurement of measurement uncertainty (Dec 9, 2011).

The inverse of this problem is the measurement of the information contained in the data.

One way of thinking about this is to imagine that our sensors are robotic students who are assigned the task of learning everything they can about some environmental condition. We then ask them to tell us what they have learned and we evaluate what they tell us against some ratable test.

In parallel to the way that human students are assigned the task of learning everything on a course curriculum, the extent to which the information has been ‘learned’ can be tested against reference data. Human students are graded on the basis of the percentage of correct answers. The problem is that not all tests are commensurate. To mitigate this we may ‘grade on the curve’ and assign letter grades so that excellent results can be distinguished from good, fair, poor and failing results. We can also apply ‘weightings’ to different tests so that a quiz has a different weight than a final exam.

The concept of assigning a grade (a.k.a. quality code, symbol, or qualifier) to hydrometric data has been around for some time. The ultimate test comes with each field visit, where we can ask of the sensor: what is the water level at this instant in time? In contrast, a professor would ask the students a large number of questions to evaluate how fully they have absorbed the information since the last test. There is no point in asking the question: ‘what was the exact time and magnitude of the peak water level?’ because we do not know the true answer. We can instead ask the question: ‘since the last test, are the data free from anomalies or discontinuities?’ In other words, in the absence of evidence that the data are all true, do we at least have evidence to support our ‘belief’ that the data are likely true?
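
For illustration only, here is a minimal sketch of the kind of screening I have in mind, assuming a simple series of time-stamped water levels; the gap and step thresholds are arbitrary placeholders, not recommended values:

    # Minimal sketch: screen a water-level series for obvious anomalies and
    # discontinuities between two field visits. Thresholds are arbitrary
    # placeholders, not recommended values.
    from datetime import datetime, timedelta

    def screen_series(readings, max_step=0.25, max_gap=timedelta(minutes=30)):
        """readings: list of (datetime, level_in_metres), sorted by time.
        Returns a list of (timestamp, issue) tuples; an empty list means no
        obvious faults were found -- which is not the same as evidence that
        the data are true."""
        issues = []
        for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
            if t1 - t0 > max_gap:
                issues.append((t1, "gap of %s" % (t1 - t0)))
            if abs(v1 - v0) > max_step:
                issues.append((t1, "step of %.3f m" % (v1 - v0)))
        return issues

    if __name__ == "__main__":
        series = [
            (datetime(2012, 7, 1, 0, 0), 1.204),
            (datetime(2012, 7, 1, 0, 15), 1.206),
            (datetime(2012, 7, 1, 1, 15), 1.790),  # late and a suspicious jump
        ]
        for timestamp, issue in screen_series(series):
            print(timestamp, issue)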

The metaphor of grading students breaks down, however, when it comes to the much more subjective grading of data. There are well-established protocols and traditions for measuring the information content of students.

Not so much for hydrometric data.

There are several problems with measuring the information content of hydrometric data:

  • the result is closely tied to fitness-for-purpose and any given dataset can be used for multiple purposes;
  • the timing and frequency of field visits are generally inadequate for a robust measure of information content; and
  • evaluation of information content by circumstantial evidence (e.g. inspection for serial auto-correlation in the time-series) will often identify obvious faults, but evidence that there are no obvious faults is not the same as evidence that the data are true (a minimal sketch of such a check follows this list).
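
To illustrate that last point, here is a minimal sketch of a lag-1 serial-correlation check; the series and the interpretation are invented for illustration, and a ‘clean’ result remains circumstantial evidence only:

    # Illustrative only: a drop in lag-1 serial correlation can hint at spikes or
    # noise, but the absence of such a symptom is not evidence that the data are true.
    def lag1_autocorrelation(values):
        n = len(values)
        mean = sum(values) / n
        num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
        den = sum((v - mean) ** 2 for v in values)
        return num / den if den else 0.0

    smooth = [1.20, 1.21, 1.22, 1.24, 1.27, 1.31, 1.36, 1.42]
    spiked = [1.20, 1.21, 9.99, 1.24, 1.27, 1.31, 1.36, 1.42]  # one obvious fault

    print("smooth record: r1 = %.2f" % lag1_autocorrelation(smooth))  # positive
    print("spiked record: r1 = %.2f" % lag1_autocorrelation(spiked))  # much lower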

Similarly, there are well-established protocols and traditions for aggregating grades for students (i.e. the Grade Point Average) both within a course, and within a collection of courses (e.g. a degree program).

I can find no internationally recognized standard method for aggregating grades on hydrometric data.

The widely practiced protocol for aggregating data grades holds that grades can neither be summed nor averaged, so a GPA is not possible. Instead, it is commonly accepted that the ‘least grade wins’ rule is invoked within any collection of data. This would mean that if I failed a 15-minute pop quiz in a course, I would get a failing grade for the whole course even if I aced the 3-hour final exam. That failing grade would then win over all the other courses I took for the duration of the program, and hence I would fail the program.
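
To make the contrast concrete, here is a minimal sketch comparing a weighted, GPA-style aggregation with the ‘least grade wins’ rule; the numeric grade scale and the weights are invented for illustration and do not represent any agency’s standard:

    # Illustration of the two aggregation rules discussed above.
    # The grade scale (A=4 ... F=0) and the weights are arbitrary assumptions.
    GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

    def weighted_gpa(graded_blocks):
        """graded_blocks: list of (grade_letter, weight), e.g. hours of record."""
        total_weight = sum(w for _, w in graded_blocks)
        return sum(GRADE_POINTS[g] * w for g, w in graded_blocks) / total_weight

    def least_grade_wins(graded_blocks):
        return min((g for g, _ in graded_blocks), key=GRADE_POINTS.get)

    record = [("A", 3.0), ("A", 3.0), ("F", 0.25), ("B", 3.0)]  # failed 15-minute 'pop quiz'
    print("weighted aggregate:", round(weighted_gpa(record), 2))  # ~3.6
    print("least grade wins:  ", least_grade_wins(record))        # F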

Data grades may represent only a lower boundary on the information content of the data, not to be confused with the full potential for information in the data. Once we agree on methods for objective quantification of hydrometric uncertainty, that uncertainty estimate may represent only an upper boundary on the information content of the data. The true information in the data may not be well represented by either grades or uncertainty, especially in the context of epistemic errors. The truth probably lies somewhere in between subjective data grading and objective quantification of uncertainty.


 


 

7 responses to “How much information does your hydrometric data contain?”

  1. Dave Gunderson July 20, 2012 at 12:36 am

    As usual, an excellent post Stu.

    Here’s my take on the subject. Are there any fixed rules or conventions about supplemental Data? Nothing I’ve seen.

    First, when you say supplemental or support data, this can mean several things. You mention that there can be ‘flags’ that accompany the data in a log. This is actually standard in one Data Logger family (Sutron 9210 XLite). The Data Logger uses a ‘Quality Flag’ as a component of the log entry. This is a single character of “G” for good, “B” for bad or “U” for unknown. What makes the quality? Success of the measurement.

    This is a start in providing supplemental information about a measurement, but it is limited. In my own data collection system, I use a range of data quality flag types for both real-time telemetry and data logging (for analysis and record generation). Let me give some examples.

    For qualification of real-time data, how do you determine good from bad data? How do we pass the quality flag if the only component passed is the data? Simple. If the Data is faulty, substitute a numeric flag for the Data. We use a number that you would NEVER get in the collection. I use the number -999.xxx as the flag(s) in my collection. Let’s look at the process that happens at the Data Logger:

    Measurement>>>Process Data>>>Data Poll (from Base Station)

    At the Measurement phase – if we get a valid read from a Sensor, the Data is passed. If the Measurement was bad, I substitute -999.777 as the value.

    Once the Data is Measured (and not -999.777), we Process the Data. If this step is successful, the Data gets Processed and passed. If the process failed, I substitute -999.666 as the value.

    At this time, the Data will reside at the Data Logger until Polled from the Base Station. Upon completion of the Poll – a good result is the expected Data. A missed Data Poll gets -999.888 substituted.

    Now, at the Base Station after the Polling sequence we look at the result:

    -999.777 = The Sensor was Bad. Send the Tech out to the site with another sensor.
    -999.666 = Something at the site is not right. However it is not the sensor.
    -999.888 = We have either a communications issue or the site has died…
    Anything else = You had valid data returned. Was it OK?
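
    To sketch the decode step at the Base Station in rough terms (the function and variable names here are invented for illustration and are not my actual programs):

        # Rough sketch of decoding the sentinel values at the Base Station.
        # The sentinel numbers follow the convention described above; everything
        # else (names, structure) is invented for illustration.
        SENTINELS = {
            -999.777: "sensor failed -- send a tech to the site with another sensor",
            -999.666: "processing failed at the site -- not the sensor",
            -999.888: "missed poll -- comms issue or the site has died",
        }

        def triage(polled_value):
            """Return a diagnostic string for a polled value."""
            for sentinel, meaning in SENTINELS.items():
                if abs(polled_value - sentinel) < 1e-6:
                    return meaning
            return "valid data returned (%.3f) -- still needs review" % polled_value

        for value in (3.456, -999.888, -999.777):
            print(value, "->", triage(value))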

    You see that data flags can help a great deal in diagnosing the problems that a collection system may have.

    The methods I employ in providing supplemental support Data for analysis and record generation deal with collection and logging techniques. I’ll cover those in another post.

    It’s getting late and I have a long day in the field tomorrow.

    Dave

  2. Daniel Fundisi July 20, 2012 at 9:27 am

    Thanks for the insightful information with an interesting example of data loggers as robotic students. This makes the issue of uncertainty in hydrologic measurements very clear.

  3. Dave Gunderson July 26, 2012 at 1:26 am

    My previous post talked about how data collection in a real-time system can be leveraged to indicate problems in the collection. Real-time data is what we call provisional data. It is not the final product. To become the final record the data gets corrected. If a sensor needed calibration, the record needs to be corrected for it. If there were gaps in the data, the gaps need to be corrected as well.

    The log from the data logger contains the sensor data. A written log book is also maintained at the site. It contains field notes and observations noted during field visits.

    Think of the data logger’s log as the main product. The information contained here is the combination of sensor data and time stamps. The method of collection and the setup of the sensor determine the character of the data collection. Here’s what I mean – the log entry normally happens every 15 minutes. The measurement can be a single reading from the time period or it can be a series of measurements averaged over the 15 minute time frame. What is the better product? Is it an instant reading or an averaged reading? Considering the amount of space available in the log – why not collect both readings? At this point, we have two entries of STAGE in our log but they have different characteristics (time-wise). The instant reading tells you the exact level at the time stamp; the averaged reading tells you the STAGE for the time frame.

    The data collection process can be leveraged to provide a better end product. The next level actually is in the sensor. Most sensors we use are microprocessor based and utilize the SDI-12 interface. The sensors (depending on type) can be programmed to output multiple data. The instruments can also be programmed to do internal averaging during a single measurement request.

    Look at this scenario. Let’s say we can take a STAGE measurement every minute from a shaft encoder from the data logger. We can pre-program that shaft encoder to take a series of readings within that one-minute time frame. The shaft encoder can take a series of one second measurements over a 30 second time frame. The encoder can then be queried to give the MEAN AVERAGE of that time period and also the MIN & MAX values as well. What have we accomplished by doing this? We have gotten a more accurate reading for dynamic conditions at the site.
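
    In rough terms, the burst sampling looks something like this (names and numbers invented for illustration, not my actual logger program):

        # Illustration of the shaft-encoder scenario described above: a burst of
        # one-second readings over a 30-second window, reduced to mean, min and max.
        # The simulated readings are invented for the example.
        import random

        def sample_burst(read_sensor, seconds=30):
            """Take one reading per second for `seconds` seconds and summarize."""
            readings = [read_sensor() for _ in range(seconds)]
            return {
                "mean": sum(readings) / len(readings),
                "min": min(readings),
                "max": max(readings),
            }

        # Stand-in for the real shaft encoder: a level of ~1.80 m with wave action.
        fake_encoder = lambda: 1.80 + random.uniform(-0.03, 0.03)

        summary = sample_burst(fake_encoder)
        print("mean %.3f  min %.3f  max %.3f" % (summary["mean"], summary["min"], summary["max"]))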

    I mentioned that logging in modern data loggers can be pretty flexible. In the old days, we had a single log with a single logging interval. Remember that, Stu? Today, we can build logs that contain different logging intervals or logging by event. That being said – why not create a secondary log that contains data logged every minute? In this log we can log the MAX & MIN values as well. When you have secondary data like this, you can make determinations on site conditions and see events as they happen. In my system, I call this the DISCHARGE log.

    As a general rule, if an instrument can give any helpful secondary information – I log it. The data logger has the room. Why not store it?

    Stu, on your post – you ask about qualifying data in a way that improves our knowledge of the conditions at the remote site. This is how I address it.

    Site uncertainty arises for a variety of reasons. You can have instability in site conditions and in the sensor itself. The methods of collection and logging different parameters often point us in the right direction in determining the cause.

  4. Hi Dave,

    There are several ideas in your comment that I think are interesting and distinct extensions from where I left off. For now, I will follow just one tangent for your consideration.

    The concept that data filtering can be combined with the substitution of information about exactly which gate in the data path resulted in a filtering event seems very efficient and informative.

    One thing that makes me a bit nervous is the processing of information upstream of an auditable data management system. Specifically, modern sensors and loggers allow quite a bit of end-user control using on-board programming, and the fate of these algorithms is often disassociated from the fate of the data. Years later, how do I trace the provenance of data when both the sensor and the logger have the power to do any number of transformations to the data but I no longer have access to the algorithms that were used?

    Suppose, say, the sensor fault was a programmatic error invoking the European numeric convention, so that 3.456 is communicated to the logger as 3,456, triggering a failed sanity test and the substitution of 3,456 with -999.777. Alternatively, the processing algorithm on the logger might produce a misplaced decimal, so that 3.456 is rendered as 3456, which fails a sanity test (e.g. one based on an arbitrary hard-coded threshold of 1000), resulting in the substitution of 3456 with -999.666. While sanity tests at each point in the communication path are diagnostically useful, I think a safer approach is to pass all raw data through to the final destination, where provenance can be properly managed before any data are ‘touched’.
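
    Purely to illustrate those two hypothetical failure modes (the parsing logic and the threshold here are invented for the example, not a description of your system):

        # Illustration of the two hypothetical failure modes described above.
        # The 1000 threshold and the sentinel values mirror the discussion; the
        # parsing logic is invented for the example.
        def parse_reading(text):
            """A naive parser that mishandles the European decimal convention."""
            try:
                return float(text)            # "3,456" -> ValueError; "3456" -> 3456.0
            except ValueError:
                return None

        def sanity_check(value, threshold=1000.0):
            return value is not None and 0.0 <= value < threshold

        for raw in ("3.456", "3,456", "3456"):
            value = parse_reading(raw)
            if value is None:
                print(raw, "-> sensor-side failure, substitute -999.777")
            elif not sanity_check(value):
                print(raw, "-> logger-side failure, substitute -999.666")
            else:
                print(raw, "-> passed, value =", value)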

    Basically, what I would like you to comment on at more length is your opinion about management (including archive if you think that is in scope) of sensor and logger programs that are part of the chain of custody for data provenance. For example, I may have a sensor that provides electronic ‘stilling’ of a turbulent water level by averaging 30 readings obtained over 150 seconds – do I need to document and manage this information? Does it matter if I periodically alter the ‘stilling’ effect over time? How should I be storing the data, with an explicit start time and end time to cover the sampling period, or simply with an instantaneous time?

  5. Stu–

    A very interesting post, and this sort of thing is a long-standing question for hydrologic modelers. The only way to know, as you put it, the information contained in a single data point is to have an a priori notion of what a value could be at any given point based on the conditions at that point, i.e. if you have some functional relationship between other variables and your variable of interest. In this approach, you essentially assume that your a priori model represents the system and serves as your control to identify when (and perhaps why) your target data do not reflect expected results. A data coding scheme like that mentioned above by Dave would delineate physical or electrical problems versus underlying measurement quality.
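
    As a toy illustration of what I mean (the linear relationship, its coefficients and the tolerance are invented, not a real model):

        # Toy illustration: use an a priori functional relationship between an
        # upstream gauge and the gauge of interest as a control, and flag departures.
        # The coefficients and tolerance are invented for the example.
        def expected_downstream(upstream_stage):
            return 0.85 * upstream_stage + 0.10   # hypothetical a priori model

        def flag_departures(pairs, tolerance=0.05):
            """pairs: list of (upstream_stage, observed_downstream_stage)."""
            flagged = []
            for upstream, observed in pairs:
                if abs(observed - expected_downstream(upstream)) > tolerance:
                    flagged.append((upstream, observed))
            return flagged

        observations = [(1.00, 0.96), (1.10, 1.03), (1.20, 1.45)]  # last one departs
        print("departures from the a priori model:", flag_departures(observations))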

  6. I’ll try to reply to Dave and Andrew at the same time.

    Dave is talking about technology that is capable of both creating new, useful information from the raw sensor readings and storing all of the metadata required to understand and interpret this information.

    Andrew is talking about the ability to verify information independently. When you get to the checkout at the grocery store you can trust the till receipt with more confidence if you already have done the summation in your head.

    My understanding of what is possible is:

    1. Logical tests can be conducted on the information, resulting in machine-generated metadata based on pass/fail criteria. These are binary measures of some quality of the data and as such cannot logically be combined in any way with any other measures of data quality. In other words, a value at any given time-stamp can accumulate several of these indicators, all of which are relevant for forensic analysis of pathological data. The nature of the test may determine whether the data are still usable or not and, if not, then the data value may be censored from visibility further downstream. (A minimal sketch follows this list.)

    2. Human grading of the data based on conformance with an a priori model of what seems reasonable. This model may be informed by: other variables (e.g. what an upstream gauge is doing); evidence that the gauge is working correctly (e.g. inspection of on-board diagnostics); and evidence that the gauge has been operated in conformance with standard operating procedures (e.g. inspection of the written log book). Supervised grading is inherently rational and done a posteriori and so a different set of rules is required for this type of metadata.

    3. Contextual metadata (e.g. the algorithms programmed into the sensor and logger) can be stored in a text format and associated with blocks of data. Unfortunately, this metadata tends not to be machine readable, and so a receiving system does not know what to do with it. It is also difficult to link this metadata to the data in a way that ensures it will be discoverable at any time in the future.
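
    For point 1, here is a minimal sketch of what I mean by accumulating binary indicators against a single time-stamped value; the test names and thresholds are invented for illustration:

        # Minimal sketch of machine-generated pass/fail metadata accumulating on a
        # single time-stamped value. The test names are invented for illustration;
        # the flags are kept separate rather than combined into a single score.
        def run_tests(value, previous_value):
            tests = {
                "in_plausible_range": 0.0 <= value <= 10.0,
                "step_ok": previous_value is None or abs(value - previous_value) <= 0.5,
                "not_sentinel": value > -999.0,
            }
            failed = [name for name, passed in tests.items() if not passed]
            usable = "not_sentinel" not in failed      # some failures censor the value
            return {"value": value, "failed_tests": failed, "usable": usable}

        print(run_tests(1.23, 1.20))
        print(run_tests(-999.777, 1.20))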

    This all sounds good, but there are several aspects that I struggle with. One is the case of a shaft encoder giving the max, min and mean for a 30 second sample of 1-second readings. This provides a precise mean value as well as a measure of dispersion, so you lose neither information about the water level nor about the turbulence. What time-stamp do you give this information: the beginning, middle or end of the sample period? How do you aggregate a series of these readings for an hour, or for the duration of a rating measurement? You need to be very careful about averaging a series of averages if you are interested in making inferences about the value during the unsampled time-frame.
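
    As a purely hypothetical illustration of why averaging a series of averages needs care, consider windows that do not contain equal numbers of valid readings:

        # Hypothetical illustration: combining window means correctly requires the
        # count behind each mean. A naive mean-of-means weights sparse windows too
        # heavily when the windows do not contain equal numbers of readings.
        windows = [
            {"mean": 1.20, "count": 30},   # full 30-second burst
            {"mean": 1.25, "count": 30},
            {"mean": 1.60, "count": 5},    # mostly missing readings
        ]

        naive = sum(w["mean"] for w in windows) / len(windows)
        weighted = sum(w["mean"] * w["count"] for w in windows) / sum(w["count"] for w in windows)

        print("naive mean of means:    %.3f" % naive)     # 1.350
        print("count-weighted mean:    %.3f" % weighted)  # closer to the true average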

    Unbundling this context efficiently would, I think, require machine-readable metadata. What would be helpful is some sort of industry-wide standard for encoding sensor and logger programming metadata.

  7. Dave Gunderson August 1, 2012 at 1:08 am

    I was wondering when my last post – would post. I knew that there would be more questions and concerns by you, Stu.

    Something I learned about data acquisition and measurements was our perception of them. Take, for instance, the timed entries (measurements) in a data logger’s log. The reading is static, the time-stamp linear. The site properties are dynamic. When you’re on site, you see them as they happen: random boat traffic causing waves that affect the stilling well, the quick changes seen in a canal or a dam’s tail-bay. You begin to think in terms of your medium as a moving target and you’re shooting for accuracy. You also learn the unique relationship between taking measurements and timing. As you correctly stated, Stu, there is a timing consideration: you look at a series of averaged data and think of a ‘time lag’. True? Yes, but that’s not the way to look at it. You are looking at the mean average for a time duration. As I said earlier, I collect data for both the Instant (last measured reading) and the Mean Average for the time period.

    @Andrew: As a Modeler, what component would be more valuable – The Instant reading or the Average reading? For someone doing water accounting, what would be more representative of the condition?

    Stu, your other point – there are no standards or adopted conventions for high-tech data logging. True. When was the last conversation you had with anyone in a group that addressed this? As much as I get around, I haven’t discussed the topic with anyone who had the interest.

    One thought that I’d like to throw out is the resistance people generally have to learning new technologies. Equipment has changed a great deal in the last 20 years. Learning and becoming proficient often requires a learning curve that many do not want to undertake. If I said that learning to program and marrying the technology to the task was simple and could be mastered in less than a year – I’d be lying.

    The other universal truth is that the best knowledge wasn’t learned in the classroom, it was acquired in the field. Where was the documentation? Or who had the time to write it all down? The USGS pub also didn’t cover the topic or the situation at hand in depth. Only generally.

    To answer you about the end data and the hydrographer – they are fully aware of the collection methods and algorithms. It’s part of the final record in the documentation. The interesting thing in all the development of the process is that there is always some refinement. It often happens after we have a sit-down and brainstorm things out.

    Making their jobs easier in maintaining the site and generating a good record is my main goal.
