Analyzing the effects of data splitting and covariate shift on machine learning based streamflow prediction in ungauged basins

Published: 2025

Authors: Pin-Ching Li, Sayan Dey, Venkatesh Merwade 

Abstract

Machine learning (ML) models are alternatives to traditional hydrologic modeling for streamflow prediction in ungauged basins (PUB). The variability in watershed characteristics of ungauged basins, however, adds uncertainty to PUB frameworks based on ML models. This uncertainty arises from covariate shift: an inconsistency between the statistical distribution of the dataset used to train and test an ML model and that of the real-world (global) dataset on which the trained model is implemented. Covariate shift is a widespread issue for ML in real-world applications, but it has not been investigated in hydrological applications. This study evaluates the uncertainty in ML-based PUB methods, including Random Forest (RF) and Artificial Neural Network (ANN) models, under the influence of covariate shift. The Monte Carlo method is applied to aggregate RF and ANN simulations from various data splitting configurations into predictive distributions. The results indicate that ML performance is not robust under covariate shift. ML performance is influenced by watershed characteristics that display heterogeneity, such as drainage area, dam density, and urbanized area. Between 20% and 48% of the simulation results depart from the normal distribution under different covariate shift scenarios. Furthermore, the efficiency and limitations of Random Forest models for PUB are highlighted by investigating their biased predictions in watersheds with varying dam density, drainage area, and meteorological variables, such as annual snowfall and annual precipitation.
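The idea of aggregating simulations over many data splits into a predictive distribution can be illustrated with a minimal sketch. It is not the authors' implementation: a one-variable least-squares line stands in for the RF and ANN models, the basin data are synthetic, and the names `fit_line` and `monte_carlo_predictions`, the 70% training fraction, and the 200 runs are all assumptions chosen for illustration. The sketch compares the spread of predictions for a target basin inside the training range with one far outside it, a simple stand-in for a covariate shift.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for RF/ANN)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def monte_carlo_predictions(basins, target_x, n_runs=200, train_frac=0.7, seed=0):
    """Refit on many random training subsets and collect the
    predictions made for a single target basin."""
    rng = random.Random(seed)
    n_train = int(train_frac * len(basins))
    preds = []
    for _ in range(n_runs):
        train = rng.sample(basins, n_train)  # one random data split
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        preds.append(a + b * target_x)
    return preds


# Synthetic "gauged" basins (assumed data, not from the study):
# x = log10 drainage area, y = log10 mean annual flow.
data_rng = random.Random(42)
basins = []
for _ in range(100):
    x = data_rng.uniform(1.0, 3.0)
    y = 0.5 + 0.9 * x + data_rng.gauss(0.0, 0.2)
    basins.append((x, y))

# Target inside the range of the training covariate.
in_range = monte_carlo_predictions(basins, target_x=2.0)
# Target far outside the training range (a simple covariate shift).
shifted = monte_carlo_predictions(basins, target_x=5.0)

spread_in = statistics.stdev(in_range)
spread_shifted = statistics.stdev(shifted)
print("spread of predictions, in-range target:", spread_in)
print("spread of predictions, shifted target: ", spread_shifted)
```

In this toy setting the predictions for the out-of-range basin vary far more across splits than those for the in-range basin. The collected predictions could then be checked against a normal distribution, as the study does for its own simulations, but that step is omitted here.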