Closing the Gap in Hydrometric Data – A Call for Your Participation

How long is the tail of hydrometry?

A solution for the gap between data availability and the impacts of water variability, across all scales of interest, on people and the environment is needed. One of my great hopes for the development of OGC standards for interoperable hydrometric data is that it will shed light on the dark data under the long tail of hydrometry. It is my opinion, unsubstantiated by quantitative surveys, that there are far more hydrometric data out there than are readily accessible from the major hydrometric data providers.

There are more projects with fewer than 10 gauges than there are agencies running more than 10,000 gauges. How many more: one hundred times; one thousand times; ten thousand times? Most of our accessible data are from the large national hydrometric programs – it is difficult to discover the size of the total data potential.

Hydrology is location-based.

There is no ‘ideal’ density for a hydrometric network. More data are always better because even closely placed gauges can represent quite different scaling, climatic, anthropogenic (e.g. effect of extractions, dams, diversions), and landscape processes. Hydrologic misfortune is too often a result of sole reliance on synoptic scale monitoring to predict hydrologic variability at a local scale for planning and management decisions. If you need to understand water at a local scale you need data at a local scale.

This need is largely met by project-specific monitoring done at a very small scale, often by independent stream hydrographers running a handful of gauges. These hydrographers do not have ready access to the resources of the large data providers for data management, archiving and dissemination, resulting in data that are unsearchable, undiscoverable and inaccessible. As a result, these data tend to be collected, often at considerable expense, for one-time use.

Re-use of such data could greatly expand our ability to understand and manage hydrological variability across all scales of interest. Data re-use implies effective metadata management. Evaluation of ‘fitness for purpose’ for third-party use of data requires relevant, reliable and trustworthy information about the data.

Quantifying the size of the opportunity for increasing our global hydrometric data asset is a daunting task.

I would like to get at least a small sense for this opportunity with an informal survey of readers of this blog.

Please take a few moments to answer a few questions:

A simple conversation about the opportunity to make the most of our global hydrometric data investment seems like a good place to start.

There will be no cost to participate in the WebEx teleconference, which I will schedule for some time in late September. If needed, the teleconference might be in two parts to accommodate diverse time zones.

The readers of this blog might be just the right group of people to start the conversation.

Please pass the link to this post along to any colleagues who you believe are knowledgeable about the problem and/or who should be part of the solution.

7 responses to “Closing the Gap in Hydrometric Data – A Call for Your Participation”

  1. Solution to your problem requires both a “standardization support” and the right methodology. I have some ideas which I could share with you. I am a standardization expert, with a large experience in EU Research Projects.

    Do not hesitate to contact me.

    Best regards,

    André PIRLET, MScE
    Standardization&Research Belgium

    • Hi André,
      I absolutely agree that standardization and methodology are integral to the solution. There are well established groups working on these problems (the WMO Commission for Hydrology, ISO Technical Committee 113, and the OGC Hydrology Domain Working Group, to name just three) and I am very confident in these groups being able to fulfill their mandate.

      However, what if you designed the best possible standards and no one adopted them?

      The perspective I am hoping to learn from with this conversation is the long tail perspective.

      It is not possible to get these players together at a conference or workshop. Their natural element is in the field, as far away from stuffy meetings and protocols (which are implicit in the standards development process) as they can get.

      My intention is to provide a relatively painless and informal opportunity for the small operators to share their stories. It will be these stories that will inform the scope of opportunity to unleash the latent power of their ‘hidden’ data.

  2. The quest to assemble hydrologic data in a single location accessible to all is a noble one, but requires a HUGE effort that is generally beyond the scope, capability and funding of any one entity. Almost in the realm of establishing a “World Government”. The USEPA has tried this with the STORET water-quality database over the last 30 years and met with some success. The US Geological Survey has managed this with the National Water Information System (NWIS), but this applies primarily to USGS data collection and that was/is difficult enough. The problem is that almost everyone does their data collection, interpretation and analysis a little differently and resists conformity–and many are generally incapable or unwilling to provide the level of “metadata” required to support sufficient knowledge of data “quality” so that the end-user can make their own judgements. Of course someone could offer to provide this service for a small fee………… Chuck

    • Hi Chuck
      The large government databases are very successful for sharing data that exists in the head of the distribution. That is not my concern.

      There have been efforts to rescue orphaned data and otherwise provide a home for ‘small’ data but most give up trying within a few years.

      I don’t think previous failures are good predictors for future outcomes. There are certainly lessons to be learned about what not to repeat, but those approaches were based on unverified assumptions within a very different social and technological reality.

      If we aren’t evolving into a data sharing society then how can you explain the success of Facebook and Instagram?

      There is a lot of good work being done on the technological side which gives me great hope for workable solutions for the dark data problem but no technology will work unless there is a social will to participate.

      One piece that is chronically missing in top-down planning and design is consultation with the most critical stakeholders: the small data providers. Any solution that does not address their key constraints and motivations is bound to fail.

      How do we find out what the real challenges are? I think a bottom-up conversation is a good place to start.

  3. @Chuck Danby’s comments are spot on. My own thoughts are:

    1. It doesn’t matter how large or small the organization is. We all support the data collection that is important to our needs. Even within an organization like the USGS (that operates with a unified plan), it often defaults to how the local office conducts its own work. Some are better than others.

    2. Chuck also mentions Meta Data within the collection. What constitutes the Meta Data from a measurement and what are the standards of the Data being collected? I’d LOVE to talk with others that would like to discuss this topic.

    3. Where do we talk about best practices and the methods we employ? The best venue that I’ve seen was the USGS Surface Water Convention. Other venues usually rest with the vendors of our data collection equipment. The advancement of the webinar provides another method to connect people who have the interest but not the time/money to attend these special events. Our own agency is looking into video conferencing to conduct meetings of this type.

  4. Hi Stuart,

    You made several very good points. First, never give up! Second, in this era of “big data” and social media the ability and desire to share data may trump some of the problems of the past. In addition, one of the bigger challenges of bringing diverse hydrologic data, of known quality, into a shareable “database” deals with information collected in the past. There is always the future, and with someone taking the lead in establishing the appropriate protocols, a new foundation can be built plumb and square. I am reminded of the early days of GIS–very little data available and very little metadata. But now it is a very different story–good data everywhere. In that situation a software vendor (ESRI) took the lead and drove the process with assistance from state and federal agencies. Perhaps aquainformatics can do the same?
    Chuck

    • Hi Chuck,
      I like your attitude!

      Your example about the rapid evolution of a data sharing ethic in the geospatial world, with concurrent rapid advances in technological and methodological standardization, is well placed. I share your optimism for the future potential for water data and further believe that water data sharing will ultimately be less visible but more impactful on people’s lives than geospatial data sharing, which I acknowledge is a very bold statement. Not many people can imagine how they ever lived without a GIS-enabled device in their pocket. Water is so fundamental to our collective health, culture, environment, and economic well-being that any improvement in data, understanding and knowledge will benefit us all much more than being able to find the nearest Starbucks on our smart phone.

      The geospatial data sharing example proves that we have the collective smarts to build the technology and standards for inter-operability to make it work. What we don’t yet have evidence for is whether the long tail hydrometric community will actually be motivated to share their data once the technological solutions are in place. We can’t even specify what the size of the opportunity is. The difference between the hydrology and geospatial data communities is that the owners of geospatial data assets are mostly large government agencies with a clear mandate to create a ‘public good’. The equivalent government agencies in hydrometry are all on board with a well developed data sharing ethic. That part is pretty much a solved problem.

      Designing to meet the needs of large government agencies is one thing. They can articulate a coherent use case that can be resolved into requirements and specifications.

      The long tail data producers are each unique in some way. Long tail data tend to be heterogeneous, hand generated, produced by unique procedures, individually curated, held in institutional repositories, not maintained, obscured or protected, seldom reused, and currently unnoticed. Building to meet the needs of long tail data producers will require that some compromises be made.

      How can decisions be made that yield the greatest level of participation for sharing of the highest quality long tail data?

      There will inevitably be some data that are not worthy of being shared. How can a data sharing ethic and technology be developed that protects end-users from disinformation?

      Design of a solution without an adequate understanding of the problem to be solved is unlikely to lead to success. In order to meet the needs of the ‘silent majority’ of hydrometric data producers we need for them to tell their story. What are their requirements? What is the fate of their data now? Would they wish for a different fate if they could?
