The linear relationship between gross primary productivity (GPP) and evapotranspiration (ET), evidenced by site-scale observations, is well recognized as an indicator of the close interactions between carbon and hydrologic processes in terrestrial ecosystems. However, it is not clear whether this relationship holds at the catchment scale and, if so, which factors control its slope and intercept. This study proposes and examines a generalized GPP-ET relationship at 380 near-natural catchments across various climatic and landscape conditions in the contiguous U.S., based on monthly remote-sensing-based GPP data, vegetation phenology, and several hydrometeorological variables. We demonstrate the validity of this GPP-ET relationship at the catchment scale, with Pearson’s r ≥ 0.6 for 97% of the 380 catchments. Furthermore, we propose a regionalization strategy for estimating the slope and intercept of the generalized GPP-ET relationship at the catchment scale by linking the parameter values a priori with hydrometeorological data. We validate the monthly GPP predicted from the relationship and regionalized parameters against a remote-sensing-based GPP product, yielding Kling-Gupta Efficiency (KGE) values ≥ 0.5 for 92% of the catchments. Finally, we verify the relationship and its parameter regionalization at 35 AmeriFlux sites, with KGE ≥ 0.5 for 25 sites, demonstrating that the new relationship is transferable across the site, catchment, and regional scales. The relationship will be valuable for diagnosing coupled water–carbon simulations in land surface and Earth system models and for constraining remote-sensing-based estimation of monthly ET.
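
The two evaluation statistics named in this abstract can be illustrated with a minimal sketch. It assumes the plain linear form GPP = slope × ET + intercept rather than the generalized relationship (whose exact form is not given here), and it uses the commonly cited 2009 formulation of KGE. The function names `fit_gpp_et` and `kge` and the sample numbers are hypothetical and serve illustration only.

```python
import numpy as np


def fit_gpp_et(et, gpp):
    """Least-squares fit of GPP = slope * ET + intercept.

    Returns (slope, intercept, pearson_r). Assumes a plain linear form,
    not the generalized relationship described in the abstract.
    """
    et = np.asarray(et, dtype=float)
    gpp = np.asarray(gpp, dtype=float)
    slope, intercept = np.polyfit(et, gpp, 1)
    r = np.corrcoef(et, gpp)[0, 1]
    return slope, intercept, r


def kge(sim, obs):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2).

    r     : Pearson correlation between simulated and observed series
    alpha : ratio of standard deviations (sim / obs)
    beta  : ratio of means (sim / obs)
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical monthly values for one catchment (illustrative numbers only).
et_monthly = np.array([20.0, 35.0, 60.0, 90.0, 110.0, 95.0])
gpp_monthly = 2.0 * et_monthly + 5.0  # exactly linear by construction

slope, intercept, r = fit_gpp_et(et_monthly, gpp_monthly)
gpp_predicted = slope * et_monthly + intercept
score = kge(gpp_predicted, gpp_monthly)
```

Because the sample data are exactly linear, the fit recovers the slope and intercept used to build them and the KGE equals its upper bound of 1. Real catchment data would give lower values.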

Licheng LIU

and 12 more

Improving the estimation of CO2 exchange between the atmosphere and terrestrial ecosystems is critical to reducing the large uncertainty in the global carbon budget. Much of the atmospheric CO2 assimilated by plants returns to the atmosphere through ecosystem respiration (Reco), which comprises plant autotrophic respiration (Ra) and soil microbial heterotrophic respiration (Rh). However, Ra and Rh are challenging to estimate at large regional scales because of the limited understanding of the complex interactions among physical, chemical, and biological processes and the resulting high spatio-temporal variability. Traditional approaches for estimating Reco, including process-based (PB) models, are constrained by current process knowledge, which limits their accuracy and efficiency. The accumulation of in situ observations of net ecosystem exchange (NEE), weather, and soil, together with satellite data on gross primary productivity (GPP), leaf area index (LAI), and soil moisture, makes it possible to apply data-driven machine learning (ML) approaches. However, pure ML approaches omit domain knowledge and lack interpretability. Here we propose a novel knowledge-guided machine learning (KGML) method for predicting daily Ra and Rh in U.S. crop fields. Using the Gated Recurrent Unit (GRU) as the basis, we develop KGML models with a hierarchical ML structure and a mass balance constraint. The KGML models were pre-trained using synthetic data generated by an advanced agroecosystem model, ecosys, and re-trained with real-world FLUXNET observation data. We extrapolate the best KGML model to crop fields across the U.S. with the help of satellite data, reanalysis climate forcings, and a soil database to reveal the spatio-temporal variations in Ra and Rh and their key controlling factors. We believe this study advances the concept of interpretable machine learning for carbon cycle estimation and will shed light on many other areas of process-based biogeochemistry research.
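
One way a mass balance constraint could enter a training loss is sketched below. This is an assumption for illustration, not the formulation used in the study: it supposes the model emits separate predictions for Ra, Rh, and NEE, adopts the sign convention NEE = Ra + Rh − GPP (positive meaning net release to the atmosphere), and penalizes any closure error. The function name `mass_balance_loss` and the `weight` parameter are hypothetical.

```python
import numpy as np


def mass_balance_loss(ra_pred, rh_pred, nee_pred, gpp, nee_obs, weight=1.0):
    """Data misfit on NEE plus a penalty on carbon-balance closure error.

    Assumed convention (not from the source): NEE = Ra + Rh - GPP.
    All inputs are arrays of daily fluxes in the same units.
    """
    ra_pred = np.asarray(ra_pred, dtype=float)
    rh_pred = np.asarray(rh_pred, dtype=float)
    nee_pred = np.asarray(nee_pred, dtype=float)
    gpp = np.asarray(gpp, dtype=float)
    nee_obs = np.asarray(nee_obs, dtype=float)

    # Supervised term: how far predicted NEE is from the observations.
    data_term = np.mean((nee_pred - nee_obs) ** 2)

    # Physics term: predicted components should close the carbon balance.
    closure_residual = nee_pred - (ra_pred + rh_pred - gpp)
    balance_term = np.mean(closure_residual ** 2)

    return data_term + weight * balance_term


# Hypothetical two-day example in which the balance closes exactly.
ra = np.array([1.0, 2.0])
rh = np.array([0.5, 1.0])
gpp = np.array([3.0, 4.0])
nee_consistent = ra + rh - gpp  # [-1.5, -1.0]

loss_closed = mass_balance_loss(ra, rh, nee_consistent, gpp, nee_consistent)
loss_violated = mass_balance_loss(ra, rh, nee_consistent + 1.0, gpp, nee_consistent)
```

In the second call the predicted NEE is offset by one unit, so both the data term and the closure term contribute, and the loss rises from 0 to 2. In a real training setup the same expression would be written with a differentiable tensor library so that gradients reach the network weights.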

Ryan G Knox

and 14 more

Licheng LIU

and 11 more

Nitrous oxide (N2O) is an important greenhouse gas (GHG), with a global warming potential 265 times that of carbon dioxide (CO2). About 60% of anthropogenic N2O emissions come from agricultural production. To date, estimating N2O emissions from cropland remains challenging because the underlying microbial processes (e.g., incomplete nitrification and denitrification) are controlled by diverse factors relating to climate, soil, plants, and human activities. In this study, we developed a machine learning (ML) model that incorporates physical and biogeochemical domain knowledge, namely knowledge-guided machine learning (KGML), for simulating daily N2O fluxes from agricultural ecosystems. The Gated Recurrent Unit (GRU) was used as the basis of the model structure. A range of ideas were implemented to optimize model performance, including 1) a hierarchical structure based on causal relations among variables, 2) intermediate variable (IMV) prediction and transfer, 3) inputting initial IMV values as constraints, 4) model pre-training and re-training, and 5) multitask learning. The KGML model was pre-trained on millions of synthetic data points generated by an advanced process-based (PB) model, ecosys, and then re-trained on observations from six mesocosm chambers over three growing seasons. Six pure ML models were developed using the same mesocosm chamber data to serve as benchmarks for the KGML model. The results show that KGML consistently outperforms the PB model in efficiency and the pure ML models in the accuracy with which it captures N2O flux magnitude and dynamics. In addition, the reasonable predictions of IMVs increase the interpretability of KGML. We believe the KGML development in this study will stimulate a new body of research on interpretable machine learning for biogeochemistry and other related geoscience processes.

Jinyun Tang

and 4 more

In studying problems like plant-soil-microbe interactions in environmental biogeochemistry and ecology, one usually has to quantify and model how substrates control the growth of, and interaction among, biological organisms. To address these substrate-consumer relationships, many substrate kinetics and growth rules have been developed, including the famous Monod kinetics for single-substrate growth and Liebig’s law of the minimum for growth co-limited by multiple nutrients. However, the mechanistic basis of these concepts and mathematical formulations, and the implications of their parameters, are often quite uncertain. Here we show that an analogy based on Ohm’s law in electric circuit theory is able to unify many of these different concepts and mathematical formulations. In this Ohm’s law analogy, a resistor is defined by a combination of consumers’ and substrates’ kinetic traits. In particular, the resistance is equal to the mean first passage time that renewal theory has used to derive both the Michaelis-Menten kinetics for a single substrate under substrate-replete conditions and the predation rate of individual organisms. We further show that this analogy leads to important insights into various biogeochemical problems, such as (1) multiple-nutrient co-limited biological growth, (2) denitrification, (3) fermentation under aerobic conditions, (4) metabolic temperature sensitivity, and (5) the accuracy of Monod kinetics for describing bacterial growth. We expect our approach will help both modelers and non-modelers to better understand and formulate hypotheses when studying certain aspects of environmental biogeochemistry and ecology.
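
A minimal numerical sketch of how a resistor picture can reproduce Monod kinetics follows. It rests on an algebraic identity: the reciprocal of the Monod rate Vmax·S/(K + S) equals 1/Vmax + K/(Vmax·S), which can be read as two resistances in series (a processing time and a substrate-capture time). Whether this matches the exact construction used by the authors is an assumption. The function names and parameter values are hypothetical.

```python
def monod_rate(s, vmax, k):
    """Classical Monod / Michaelis-Menten uptake rate for substrate level s."""
    return vmax * s / (k + s)


def series_resistance_rate(s, vmax, k):
    """Rate obtained as 1 / (sum of two 'resistances' in series).

    r_process : time to process one unit of substrate once captured (1 / vmax)
    r_capture : time to encounter one unit of substrate (k / (vmax * s))
    Requires s > 0, because the capture time diverges as s approaches zero.
    """
    r_process = 1.0 / vmax
    r_capture = k / (vmax * s)
    return 1.0 / (r_process + r_capture)


# Hypothetical parameter values, chosen only to exercise the identity.
vmax, k = 4.0, 2.0
substrate_levels = [0.1, 1.0, 2.0, 10.0, 100.0]
pairs = [
    (monod_rate(s, vmax, k), series_resistance_rate(s, vmax, k))
    for s in substrate_levels
]
```

The two functions agree at every substrate level. At s = k the capture and processing resistances are equal and the rate is half of vmax, which is the usual interpretation of the half-saturation constant.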