Simulating N2O Emission from Fertilized Mesocosm Using Knowledge Guided Machine Learning
  • Licheng LIU,
  • Shaoming Xu,
  • Zhenong Jin,
  • Jinyun Tang,
  • Timothy Griffis,
  • Kaiyu Guan,
  • Alexander Frie,
  • Taegon Kim,
  • Bin Peng,
  • Yufeng Yang,
  • Wang Zhou,
  • Vipin Kumar
Licheng LIU
University of Minnesota Twin Cities

Corresponding Author:[email protected]

Shaoming Xu
University of Minnesota Twin Cities
Zhenong Jin
University of Minnesota-Twin Cities
Jinyun Tang
Lawrence Berkeley Natl Lab
Timothy Griffis
Univ Minnesota
Kaiyu Guan
University of Illinois at Urbana Champaign
Alexander Frie
University of California Riverside
Taegon Kim
University of Minnesota Twin Cities
Bin Peng
University of Illinois at Urbana Champaign
Yufeng Yang
University of Minnesota Twin Cities
Wang Zhou
University of Illinois at Urbana Champaign
Vipin Kumar
University of Minnesota Twin Cities

Abstract

Nitrous oxide (N2O) is an important greenhouse gas (GHG), with a global warming potential 265 times that of carbon dioxide (CO2). About 60% of anthropogenic N2O emissions come from agricultural production. Estimating N2O emissions from cropland remains challenging because the underlying microbial processes (e.g., incomplete nitrification and denitrification) are controlled by diverse climate, soil, plant, and human-activity factors. In this study, we developed a machine learning (ML) model that incorporates physical and biogeochemical domain knowledge, namely knowledge guided machine learning (KGML), to simulate daily N2O fluxes from agricultural ecosystems. The model structure is built on the Gated Recurrent Unit (GRU). Several ideas were implemented to optimize model performance, including 1) a hierarchical structure based on causal relations among variables, 2) intermediate variable (IMV) prediction and transfer, 3) inputting initial IMV values as constraints, 4) model pre-training and re-training, and 5) multitask learning. The KGML model was pre-trained on millions of synthetic data points generated by an advanced process-based (PB) model, ecosys, and then re-trained on observations from six mesocosm chambers over three growing seasons. Six pure ML models were developed using the same mesocosm chamber data to serve as benchmarks for the KGML model. The results show that KGML consistently outperforms the PB model in efficiency and the pure ML models in prediction accuracy, capturing both the magnitude and the dynamics of N2O fluxes. In addition, the reasonable predictions of IMVs increase the interpretability of KGML. We believe the KGML development in this study will stimulate a new body of research on interpretable machine learning for biogeochemistry and related geoscience processes.
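
The hierarchical structure, IMV transfer, IMV initial values, and multitask learning named in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class names (`GRULayer`, `HierarchicalKGML`), the two-stage layout, the omission of bias terms, the layer sizes, and the `imv_weight` loss weight are all assumptions made for illustration, and the abstract does not say which IMVs are used. Training, including the pre-train on synthetic ecosys data and the re-train on chamber observations, is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRULayer:
    """Hypothetical single-layer GRU with a linear read-out (forward only, no biases)."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        n_cat = n_in + n_hidden
        self.n_hidden = n_hidden
        self.Wz = rand_matrix(n_hidden, n_cat, rng)  # update gate
        self.Wr = rand_matrix(n_hidden, n_cat, rng)  # reset gate
        self.Wh = rand_matrix(n_hidden, n_cat, rng)  # candidate state
        self.Wo = rand_matrix(n_out, n_hidden, rng)  # read-out

    def forward(self, xs):
        h = [0.0] * self.n_hidden
        outputs = []
        for x in xs:  # one step per day
            xh = x + h
            z = [sigmoid(a) for a in matvec(self.Wz, xh)]
            r = [sigmoid(a) for a in matvec(self.Wr, xh)]
            xrh = x + [ri * hi for ri, hi in zip(r, h)]
            cand = [math.tanh(a) for a in matvec(self.Wh, xrh)]
            h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
            outputs.append(matvec(self.Wo, h))
        return outputs


class HierarchicalKGML:
    """Assumed two-stage layout: drivers -> IMVs -> N2O flux."""

    def __init__(self, n_drivers, n_imv, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # Stage 1: daily drivers plus initial IMV values -> daily IMVs
        self.imv_net = GRULayer(n_drivers + n_imv, n_hidden, n_imv, rng)
        # Stage 2: daily drivers plus predicted IMVs -> daily N2O flux
        self.n2o_net = GRULayer(n_drivers + n_imv, n_hidden, 1, rng)

    def forward(self, drivers, imv_init):
        stage1_in = [d + imv_init for d in drivers]
        imv_pred = self.imv_net.forward(stage1_in)
        stage2_in = [d + m for d, m in zip(drivers, imv_pred)]  # IMV transfer
        n2o_pred = [out[0] for out in self.n2o_net.forward(stage2_in)]
        return imv_pred, n2o_pred


def multitask_loss(imv_pred, imv_obs, n2o_pred, n2o_obs, imv_weight=0.5):
    """Weighted sum of the N2O MSE and the IMV MSE (weight is an assumption)."""
    n2o_mse = sum((p - o) ** 2 for p, o in zip(n2o_pred, n2o_obs)) / len(n2o_obs)
    imv_sq = [(p - o) ** 2
              for ps, obs in zip(imv_pred, imv_obs)
              for p, o in zip(ps, obs)]
    return n2o_mse + imv_weight * sum(imv_sq) / len(imv_sq)


# Toy usage with random inputs: 5 days, 3 drivers, 2 IMVs (all made-up sizes).
rng = random.Random(1)
days, n_drivers, n_imv = 5, 3, 2
drivers = [[rng.random() for _ in range(n_drivers)] for _ in range(days)]
imv_init = [0.2, 0.4]
model = HierarchicalKGML(n_drivers, n_imv)
imv_pred, n2o_pred = model.forward(drivers, imv_init)
loss = multitask_loss(imv_pred, [[0.0] * n_imv] * days, n2o_pred, [0.0] * days)
```

Routing the predicted IMVs into the second stage is what makes them inspectable, and the shared loss lets IMV observations constrain the N2O prediction. In practice a framework with automatic differentiation would replace the hand-written cell.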