Stefan F. Gary

and 6 more

River sediment microbial respiration is a key indicator of ecosystem functioning, and the biogeochemical fluxes across this critical zone link surface and subsurface waters. As such, there is tremendous interest in measuring and mapping these respiration rates. Respiration observations are expensive and labor intensive, so limited data are available to the community. An open science, collaborative initiative is collecting samples for respiration rate analysis and multi-scale metadata; this evolving data set is being used to make machine learning (ML) predictions at unsampled sites to help inform continued community engagement. However, it is a challenge to find an optimum configuration for ML models to work with this feature-rich (i.e., 100+ possible input variables) data set. Here, we present results from a two-tiered approach to managing the analysis of this complex data set: 1) a stacked ensemble of models that automatically optimizes hyperparameters and manages the training of many models and 2) feature permutation importance to detect the most important features in the models. The major elements of this workflow are modular, portable, open, and cloud-based, thus making this implementation a potential template for other applications. The models developed here predict that sediment organic matter chemistry is one of the most important features for predicting sediment respiration rate. Other larger-scale, important features fall into the categories of climatic, ecological, geological, and fluvial settings. Leveraging these larger-scale features to generate data-driven estimates of river sediment respiration rates reveals spatially consistent but heterogeneous patterns across the river network of the Columbia River Basin.
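
The feature permutation importance step mentioned above can be illustrated with a minimal sketch. It is not the project's implementation: the data are synthetic, `toy_model` is a hypothetical stand-in for a trained stacked ensemble, and the score (mean drop in R^2 after shuffling one feature column) is one common choice among several.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R^2 when each feature column is shuffled in turn."""
    shuffler = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            shuffler.shuffle(column)
            # Rebuild the table with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - r2_score(y, [predict(r) for r in X_perm]))
        importances.append(mean(drops))
    return importances


# Synthetic stand-in data (hypothetical): the target depends strongly on
# feature 0, weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(200)]
y = [5.0 * row[0] + 0.5 * row[1] for row in X]


def toy_model(row):
    """Stand-in for a trained model; here it is the exact generating rule."""
    return 5.0 * row[0] + 0.5 * row[1]


scores = permutation_importance(toy_model, X, y)
for name, score in zip(["feature_0", "feature_1", "feature_2"], scores):
    print(f"{name}: {score:.3f}")
```

A feature the model ignores scores zero because shuffling it leaves every prediction unchanged, which is what makes the method useful for ranking 100+ candidate inputs.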

Yunxiang Chen

and 16 more

Streambed grain sizes and hydro-biogeochemistry (HBGC) control river functions. However, measuring their quantities, distributions, and uncertainties is challenging due to the diversity and heterogeneity of natural streams. This work presents a photo-driven, artificial intelligence (AI)-enabled, and theory-based workflow for extracting the quantities, distributions, and uncertainties of streambed grain sizes and HBGC parameters from photos. Specifically, we first trained You Only Look Once (YOLO), an object detection AI, using 11,977 grain labels from 36 photos collected from 9 different stream environments. We demonstrated its accuracy with a coefficient of determination of 0.98, a Nash–Sutcliffe efficiency of 0.98, and a mean absolute relative error of 6.65% in predicting the median grain size of 20 testing photos. The AI is then used to extract the grain size distributions and determine their characteristic grain sizes, including the 5th, 50th, and 84th percentiles, for 1,999 photos taken at 66 sites. With these percentiles, the quantities, distributions, and uncertainties of HBGC parameters are further derived using existing empirical formulas and our new uncertainty equations. From the data, the median grain size and HBGC parameters, including Manning’s coefficient, Darcy-Weisbach friction factor, interstitial velocity magnitude, and nitrate uptake velocity, are found to follow log-normal, normal, positively skewed, near log-normal, and negatively skewed distributions, respectively. Their most likely values are 6.63 cm, 0.0339 s·m^(-1/3), 0.18, 0.07 m/day, and 1.2 m/day, respectively, while their average uncertainties are 7.33%, 1.85%, 15.65%, 24.06%, and 13.88%, respectively. Major uncertainty sources in grain sizes and their subsequent impact on HBGC are further studied.
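
The three accuracy metrics quoted above can be computed as in the following sketch. The abstract does not say which definition of the coefficient of determination was used, so the squared Pearson correlation here is an assumption, and the grain-size values are made-up placeholders rather than data from the study.

```python
from statistics import mean


def nash_sutcliffe(obs, pred):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    obs_bar = mean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - obs_bar) ** 2 for o in obs)
    return 1.0 - residual / spread


def pearson_r2(obs, pred):
    """Squared Pearson correlation between observations and predictions."""
    obs_bar, pred_bar = mean(obs), mean(pred)
    cov = sum((o - obs_bar) * (p - pred_bar) for o, p in zip(obs, pred))
    var_o = sum((o - obs_bar) ** 2 for o in obs)
    var_p = sum((p - pred_bar) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def mean_abs_relative_error(obs, pred):
    """Mean of |pred - obs| / obs; multiply by 100 for a percentage."""
    return mean(abs(p - o) / o for o, p in zip(obs, pred))


# Hypothetical median grain sizes in cm (measured vs. predicted from photos).
measured_d50 = [2.0, 4.0, 6.0, 8.0]
predicted_d50 = [2.2, 3.8, 6.3, 7.6]

print("R^2 :", pearson_r2(measured_d50, predicted_d50))
print("NSE :", nash_sutcliffe(measured_d50, predicted_d50))
print("MARE:", mean_abs_relative_error(measured_d50, predicted_d50))
```

Unlike the squared correlation, the Nash–Sutcliffe efficiency penalizes systematic bias, which is one reason to report both.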

Timothy Scheibe

and 1 more

The rhizosphere is a complex system in which many diverse and heterogeneous small-scale components (e.g., plant roots, fluids, microbes, and mineral surfaces) interact with one another, often in nonlinear ways, giving rise to emergent system behaviors. Ecosystem-scale perturbations, such as nitrogen limitation or drought, drive changes in micro-environments through a cascade of complex interacting processes, leading to a bidirectional feedback across scales between microbial and plant habitats at the microscale and ecosystem function at the macroscale. We are developing a conceptual and numerical framework for multiscale simulation of organic carbon transport, transformation, and disposition in the soil-microbe-plant continuum. The conceptual model comprises a set of directed graphs, with nodes representing system processes and states and edges representing process-state relationships. The graphs are coded in the graphviz syntax, enabling dynamic web visualization. Graph nodes are hyperlinked to metadata pages summarizing current understanding of each process or state and its representation in current numerical codes. This conceptual model is available via a git repository and can guide identification of opportunities for coupling (data exchange) between codes operating at different length scales. The numerical implementation of the conceptual model is based on execution of integrated data processing and multiscale modeling scientific workflows. The numerical framework is enabled by a recent development in information technology known as orchestration, a class of solutions to problems of deployment and execution of cloud-oriented software. Orchestration technology is well-suited to automating complex scientific workflows, both in model-coupling efforts and experimental analysis pipelines. Here it is used to flexibly define workflow steps based on preceding events (such as arrival of a new model output in the data repository). It is being applied to integrate several community software packages spanning scales from molecules to ecosystems, linked to experimental data from the Environmental Molecular Sciences Laboratory (a national scientific user facility), to address critical scientific questions related to soil nutrient cycling.
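
As a rough illustration of the directed-graph conceptual model described above, the sketch below emits graphviz DOT text in which each node carries a `URL` attribute pointing at a metadata page. The node names, edge labels, and page paths are hypothetical and are not taken from the project's repository; drawing processes as boxes and states as ellipses is likewise an assumed convention.

```python
def to_dot(graph_name, nodes, edges):
    """Build DOT source for a directed graph.

    nodes: {node_id: (kind, label, url)} with kind "process" or "state"
    edges: [(source_id, target_id, label)]
    """
    shapes = {"process": "box", "state": "ellipse"}
    lines = [f"digraph {graph_name} {{"]
    for node_id, (kind, label, url) in nodes.items():
        lines.append(
            f'    {node_id} [shape={shapes[kind]}, label="{label}", URL="{url}"];'
        )
    for source, target, label in edges:
        lines.append(f'    {source} -> {target} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical fragment of a rhizosphere carbon graph.
nodes = {
    "root_exudation": ("process", "Root exudation", "meta/root_exudation.html"),
    "dissolved_oc": ("state", "Dissolved organic carbon", "meta/dissolved_oc.html"),
    "microbial_uptake": ("process", "Microbial uptake", "meta/microbial_uptake.html"),
}
edges = [
    ("root_exudation", "dissolved_oc", "produces"),
    ("dissolved_oc", "microbial_uptake", "consumed by"),
]

dot_source = to_dot("rhizosphere", nodes, edges)
print(dot_source)
```

Rendering this text to SVG with the graphviz `dot` tool turns the `URL` attributes into clickable links, which is one way to get the hyperlinked web visualization the abstract mentions.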

Yunxiang Chen

and 10 more

Quantifying the multiscale feedback between hydrodynamics and biogeochemistry is key to reliable modeling of river corridor systems. However, accurate and efficient hydrodynamics models over large spatiotemporal scales have not yet been established due to limited surveys of riverbed roughness and high computational costs. This work presents a semi-automated workflow that combines topographic and water stage surveys, computational fluid dynamics modeling, distributed wall resistance modeling, and high-performance computing to simulate flow in a 30-kilometer-long reach of the Columbia River during 2011-2019. The results show that this workflow achieves high accuracy in modeling water stage at all seven survey locations during the calibration (1 month) and validation (65 months) periods. It is also computationally efficient: streamflow over a 58-month solution time is modeled in less than 6 days of wall-clock time, with a mesh number, time step, and CPU usage of about 1.2 million, 3 seconds, and 1.1 million hours, respectively. Using the well-validated results, we show that riverbed dynamic pressure is randomly distributed over all spatiotemporal scales, with its cross-sectional average values approximately following a normal distribution with a mean and standard deviation of -0.353 m and 0.0352 m; bed shear stress is affected by flowrate and large- and small-scale topographic features, with cross-sectional maximum values following a smooth but asymmetric distribution with 90% of its values falling between 5 Pa and 35 Pa; and hydrostatic pressure is influenced by flowrate and large-scale topographic features, with cross-sectional maximum values following a discontinuous and skewed distribution determined by streamwise topographic variations.
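
The figures quoted above can be turned into a few back-of-the-envelope quantities. This sketch is an illustration, not part of the study: the average month length of 30.44 days is an assumption, the 6-day wall-clock time is treated as an upper bound, and the 95% interval simply applies the stated normal approximation for cross-sectional average dynamic pressure.

```python
from statistics import NormalDist

DAYS_PER_MONTH = 30.44        # assumption: average calendar month length
SECONDS_PER_DAY = 86_400

simulated_days = 58 * DAYS_PER_MONTH     # 58-month solution time
wall_clock_days = 6.0                    # upper bound ("less than 6 days")
time_step_s = 3.0

# Ratio of simulated time to wall-clock time (a lower bound on the speed-up).
speedup = simulated_days / wall_clock_days

# Approximate number of time steps needed to cover the solution time.
n_steps = simulated_days * SECONDS_PER_DAY / time_step_s

# Central 95% interval of the stated normal approximation (values in m).
dynamic_pressure = NormalDist(mu=-0.353, sigma=0.0352)
low = dynamic_pressure.inv_cdf(0.025)
high = dynamic_pressure.inv_cdf(0.975)

print(f"speed-up over real time: at least {speedup:.0f}x")
print(f"time steps: about {n_steps:.2e}")
print(f"95% of cross-sectional averages: {low:.3f} m to {high:.3f} m")
```

Under these assumptions the model runs a few hundred times faster than real time, and nearly all cross-sectional average dynamic pressures fall within about 0.07 m of the mean.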

Firnaaz Ahamed

and 4 more

Priming refers to significant changes in the decomposition rate of organic matter (OM) in natural ecosystems induced by minimal treatments. A fundamental understanding of priming effects is critical to accurately predict biogeochemical dynamics and carbon/nitrogen OM cycles in natural ecosystems. However, we poorly understand how the priming effect is mechanistically induced and what factors govern the process among microbial activities and environmental constraints. Here, we propose a generalizable theory to collectively explain diverse patterns of priming effects via the cybernetic approach, which accounts for regulation as a key feature of microbial growth. The cybernetic model treats microorganisms as dynamic systems that optimally regulate metabolic functions with respect to environmental conditions to safeguard their survival. Motivated by the priming phenomenon observed in the hyporheic corridor of a riverine ecosystem, we formulated our model to investigate how the addition of exogenous labile OM primes the microbial respiration of polymeric OM. Our model accounts for interspecies interactions between various assortments of microbial groups with distinct metabolic traits to enable prediction of both increase (positive priming) and decrease (negative priming) of OM turnover using the same model structure. Our modeling framework reveals that: (1) the priming effects are manifestations of microbial regulatory response to diverse environmental conditions, and (2) priming magnitude and direction are highly dependent on the polymeric OM richness and the extent of treatment with labile OM. Beyond elucidating qualitative understanding of the phenomenon, our model also suggests that interspecies interactions between microbial groups with distinct metabolic traits (i.e., population turnover, sensitivity to labile OM, and efficiency in degrading polymeric OM) potentially drive the priming effects. By integrating contextual knowledge and a generalizable theory, our holistic modeling framework is effective for investigation and prediction of biogeochemical dynamics of natural ecosystems across diverse biological and environmental settings.
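
The abstract does not give the model equations. As a hedged illustration of what "optimally regulate metabolic functions" can look like, the sketch below uses the matching and proportional laws that are widely used in the cybernetic modeling literature: resources for enzyme synthesis are allocated in proportion to each pathway's return (u), and enzyme activity is scaled relative to the best pathway (v). The Monod kinetics, the two-pathway setup, and all parameter values are assumptions chosen for illustration only.

```python
def monod_rate(k_max, half_sat, conc):
    """Monod-type uptake rate (assumed kinetics for this illustration)."""
    return k_max * conc / (half_sat + conc)


def cybernetic_controls(rates):
    """Return (u, v) for a list of pathway rates.

    u_i = r_i / sum(r)  (matching law: enzyme synthesis allocation)
    v_i = r_i / max(r)  (proportional law: enzyme activity control)
    """
    total = sum(rates)
    best = max(rates)
    u = [r / total for r in rates]
    v = [r / best for r in rates]
    return u, v


# Hypothetical parameters: a fast pathway on labile OM and a slow pathway
# on polymeric OM, evaluated at made-up concentrations.
labile_rate = monod_rate(k_max=1.0, half_sat=0.5, conc=2.0)
polymeric_rate = monod_rate(k_max=0.3, half_sat=1.0, conc=5.0)

u, v = cybernetic_controls([labile_rate, polymeric_rate])
print("synthesis allocation u:", u)
print("activity control v:   ", v)
```

In this toy setting, adding labile OM raises the labile pathway's rate and shifts allocation toward it, which hints at how a regulatory response alone could change polymeric OM turnover.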

Timothy Scheibe

and 18 more

River corridors, the spatial domains around rivers in which river water interacts with surrounding sediment and rock, are important components of watersheds. They comprise extremely complex ecosystems: heterogeneous at all spatial scales with strong temporal dynamics, coupled biological, geochemical, and hydrologic processes, and ubiquitous human impacts. We present several ways that our project, focused on the 75 km Hanford Reach of the Columbia River but with multiple connections to other systems, is addressing this challenge. These include 1) deployment of intensive, automated sensor networks supplemented by data from the Hanford Environmental Information System (HEIS) for hyporheic zone monitoring, 2) assimilation of these and other data into models using joint hydrologic and geophysical inversion, 3) integration of MASS2 model outputs and bathymetry data using machine learning to classify hydromorphologic features, 4) a community-based effort to develop broad understanding of organic carbon biogeochemistry and microbiomes in diverse river systems, and 5) use of multi-‘omics data to develop new biogeochemical reaction networks. These underpin the incorporation of process understanding and diverse data into high-resolution mechanistic models, and the use of those models to develop reduced-order models that can be applied at large scales while retaining the effects of local features and processes. In so doing, we are contributing to the reduction of uncertainties associated with major Earth system biogeochemical fluxes, thus improving predictions of environmental and human impacts on water quality and riverine ecosystems and supporting environmentally responsible management of linked energy-water systems.