A data-centric culture is difficult to cultivate without effective data infrastructure and systems. This session will explore the technological underpinnings and user value of new and established data systems. Operational data systems (deployed software that stores, organizes, and provides APIs or other interfaces to access or manipulate data) make data more accessible to more people, make it easier to manipulate and analyze, and enable larger numbers of users to work with larger volumes of data in a systematic way.
The Challenges of Modernizing Enterprise Software in the Federal Space: The NWIS Case Study
Daniel K. Pearson, USGS
Lessons from 18 Years of Building Operational Systems at the National Earthquake Information Center
Mike Hearne, USGS
Data-Driven Streamflow Drought Forecasts for the Conterminous United States (CONUS): Preparing for the Upcoming Launch of an Operational Tool to Enhance Drought Early Warning
John Hammond, USGS
Last Mile Data Delivery for the National Water Availability Assessment Data Companion
Megan Hines, Kaycee Faunce, USGS
Automated Georeferencing and Feature Extraction from Geologic Maps Using the Polymer Web Application
Margaret Goldman, Joshua Rosera, Graham Lederer, Garth Graham, David Watkins, USGS
==========
Descriptions:
The Challenges of Modernizing Enterprise Software in the Federal Space: The NWIS Case Study
Daniel K. Pearson, USGS
The National Water Information System (NWIS) Modernization program has been on a 5-year journey to deliver necessary improvements to NWIS, the world's largest authoritative enterprise water information system. After recognizing in 2019 that the legacy NWIS was both inflexible and burdened by extensive technical debt, the USGS Water Mission Area kicked off a $10M/year investment to reduce the risk of system failure due to aging infrastructure. A modernized NWIS was needed to support a robust, authoritative enterprise water information system, which is foundational to advancing WMA priorities and meeting the needs of USGS stakeholders. This talk will focus on "lessons learned," highlight accomplishments, and preview what is next for NWIS!
Lessons from 18 Years of Building Operational Systems at the National Earthquake Information Center
Mike Hearne, USGS
The Real-Time Products (RTP) team at the National Earthquake Information Center (NEIC) has been creating earthquake-triggered products since before 2007. These products include ShakeMap, a system that estimates and maps ground shaking in the region around an earthquake; PAGER, a system that estimates shaking-related fatalities and economic losses; and gmprocess, software that automatically downloads, processes, and derives peak ground motions from seismometer records. These systems, among others, feed information to each other and to the Earthquake Hazards Program website. Most are deployed on premises, but we have had recent success migrating ShakeMap to the Amazon cloud and hope, in time, to replicate this with other models and products. This talk will focus on the experience gained from working on these 24/7 mission-critical systems, the expertise required, and the decisions that must be made to facilitate deployment and operations.
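Although the talk centers on operational lessons, the products these systems generate are distributed through public web services. As a hedged illustration only (this is not NEIC's internal tooling), the sketch below queries the documented USGS FDSN event web service for recent events that have a ShakeMap product attached:

```python
# Illustrative sketch: query the public USGS earthquake catalog for recent
# events with a ShakeMap product. Endpoint and parameters follow the
# documented FDSN event web service at earthquake.usgs.gov; the specific
# query values are arbitrary examples.
import requests

API = "https://earthquake.usgs.gov/fdsnws/event/1/query"

params = {
    "format": "geojson",
    "starttime": "2024-01-01",
    "minmagnitude": 6.0,
    "producttype": "shakemap",  # restrict to events that have a ShakeMap
}

resp = requests.get(API, params=params, timeout=30)
resp.raise_for_status()

for feature in resp.json()["features"]:
    props = feature["properties"]
    # "detail" is a URL to the event's full detail feed, including products.
    print(f'M{props["mag"]} {props["place"]} -> {props["detail"]}')
```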
Data-Driven Streamflow Drought Forecasts for the Conterminous United States (CONUS): Preparing for the Upcoming Launch of an Operational Tool to Enhance Drought Early Warning
John Hammond, USGS
Hydrological drought, defined as abnormally low streamflows and groundwater levels, has direct impacts on agriculture, hydropower, ecosystems, public water supply, and recreation. Unlike more readily available precipitation forecasts, forecasting streamflow drought requires accounting for storage (snow and groundwater), human modifications (diversions and reservoirs), and complex terrestrial processes. To address this challenge, the U.S. Geological Survey Water Mission Area Drought Program is working to advance early warning capacity for hydrological drought onset, duration, and severity using data-driven models. We use gradient-boosted decision tree and long short-term memory neural network modeling approaches to forecast 1- to 13-week streamflow percentiles across the conterminous United States (CONUS) using gridded meteorology and meteorological forecasts, modeled snow and soil moisture, and watershed properties. We forecast drought at moderate (20%), severe (10%), and extreme (5%) intensity levels using seasonally varying drought thresholds. Models show a strong ability to forecast severe droughts via variable streamflow percentiles in the near term but have weaker predictive capacity for regulated basins, drier areas of the CONUS, increasingly intense droughts, and longer lead times. For these reasons, modified approaches are being explored to improve model performance. As we prepare for the launch of the streamflow drought assessment and forecasting tool later this year, we are incorporating stakeholder input to design a website that complements existing drought and water supply prediction tools.
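For readers unfamiliar with the modeling setup described above, the sketch below shows the gradient-boosted piece of such a forecast in miniature. The feature names and data are hypothetical placeholders, not the USGS model inputs, and the fixed 20/10/5 percentile thresholds stand in for the seasonally varying thresholds the abstract describes:

```python
# Minimal sketch of a gradient-boosted streamflow-percentile forecast.
# All features and targets here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5_000

# Stand-ins for the predictor classes named in the abstract: gridded
# meteorology/forecasts, modeled snow and soil moisture, watershed properties.
X = np.column_stack([
    rng.normal(size=n),   # precipitation forecast anomaly (hypothetical)
    rng.normal(size=n),   # modeled snow water equivalent (hypothetical)
    rng.normal(size=n),   # modeled soil moisture (hypothetical)
    rng.normal(size=n),   # static watershed property, e.g. drainage area
])
# Target: streamflow percentile (0-100) at some lead time, faked here as a
# noisy function of the predictors.
y = np.clip(50 + 15 * X[:, 0] + 10 * X[:, 2] + rng.normal(scale=5, size=n),
            0, 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = HistGradientBoostingRegressor(max_iter=300).fit(X_tr, y_tr)
pct = model.predict(X_te)

# Flag forecasts against fixed intensity thresholds from the abstract
# (the operational thresholds vary seasonally).
for name, thresh in [("moderate", 20), ("severe", 10), ("extreme", 5)]:
    print(f"{name} drought forecasts: {(pct <= thresh).sum()}")
```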
Last Mile Data Delivery for the National Water Availability Assessment Data Companion
Megan Hines, Kaycee Faunce, USGS
The National Water Availability Assessment Data Companion (NWDC) is a centralized website providing U.S. Geological Survey-derived water availability, supply, and use information that underlies the National Water Availability Assessment. The NWDC also extends Water Data for the Nation’s publicly available observed water data by providing modeled data that are spatially and temporally continuous, filling in spatial gaps between monitoring stations and temporal gaps between periodic sampling at these stations. These nationally consistent datasets are available at the monthly timescale and sub-watershed spatial scale (12-digit hydrologic unit codes).
Designing a novel delivery system that integrates multiple streams of national-level data presents numerous challenges. Original research outputs do not always align with the desired spatial scales for delivery, so normalization steps are necessary for integration. The transformations required differ for each dataset and need automated testing to ensure the outputs are correct. Fast-moving science also tends to be done as effectively as possible at the time of the research, but without input or frequent review from the delivery team, so integration steps must be taken at many stages. Finally, research data production and coding still typically do not focus on operational use, forcing the delivery team to juggle test datasets rather than having easy access to continuously updated outputs of the modelers' latest work.
This presentation will explore innovative pipeline approaches, built in R with the targets package, that the NWDC team developed to power our integrated, dynamic website and associated tools. These approaches allow for the central management of website content and the transformation of model data outputs, enabling repeatable, just-in-time data integration and improved accessibility of the data at a national level.
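The team's pipelines are written in R with the targets package; as a language-neutral illustration of the kind of repeatable transformation step such a pipeline manages, the sketch below aggregates a hypothetical modeled daily output to the monthly, HUC12 scale the NWDC delivers and applies a simple automated check. Column names are illustrative, not the NWDC schema:

```python
# Hypothetical sketch of one normalization step: roll a modeled daily output
# up to monthly, 12-digit hydrologic unit (HUC12) values, then verify the
# result. Column names and values are invented for illustration.
import pandas as pd

daily = pd.DataFrame({
    "huc12": ["010100020101"] * 4 + ["010100020102"] * 4,
    "date": pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-02-01", "2023-02-02"] * 2),
    "modeled_et_mm": [1.2, 1.4, 0.9, 1.1, 2.0, 2.2, 1.8, 1.6],
})

monthly = (
    daily
    .assign(month=daily["date"].dt.to_period("M"))
    .groupby(["huc12", "month"], as_index=False)["modeled_et_mm"]
    .sum()
)

# Automated test: each HUC12/month pair appears exactly once, and totals
# are preserved by the aggregation.
assert not monthly.duplicated(["huc12", "month"]).any()
assert abs(monthly["modeled_et_mm"].sum() - daily["modeled_et_mm"].sum()) < 1e-9
print(monthly)
```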
Automated Georeferencing and Feature Extraction from Geologic Maps Using the Polymer Web Application
Margaret Goldman, Joshua Rosera, Graham Lederer, Garth Graham, David Watkins, USGS
The Polymer web application is a human-machine interface (HMI) designed to support geologic map compilation, georeferencing, and data extraction for mineral resource assessment workflows. Geologic maps provide essential data for assessments, including information about lithology, geologic structure, geomorphology, and eviden