Jessica A Eisma and 3 more

High-quality citizen science data can be instrumental in advancing science toward new discoveries and a deeper understanding of under-observed phenomena. However, the error structure of citizen scientist (CS) data must be well-defined. Within a citizen science program, the errors in submitted observations vary, and their occurrence may depend on CS-specific characteristics. This study develops a graphical Bayesian inference model of error types in CS data. The model assumes that: (1) each CS observation is subject to a specific error type, each with its own bias and noise; and (2) an observation’s error type depends on the error community of the CS, which in turn relates to characteristics of the CS submitting the observation. Given a set of CS observations and corresponding ground-truth values, the model can be calibrated for a specific application, yielding (i) the number of error types and error communities, (ii) the bias and noise of each error type, (iii) the error distribution of each error community, and (iv) the error community to which each CS belongs. The model, applied to Nepal CS rainfall observations, identifies five error types and sorts CSs into four model-inferred communities. In the case study, 73% of CSs submitted data with errors in fewer than 5% of their observations. The remaining CSs submitted data with unit, meniscus, unknown, and outlier errors. A CS’s assigned community, coupled with model-inferred error probabilities, can identify observations that require verification. With such a system, the onus of validating CS data is partially transferred from human effort to machine-learned algorithms.
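As a rough illustration of the generative structure behind assumptions (1) and (2), the sketch below simulates observations in which each CS belongs to an error community, each community defines a distribution over error types, and each error type contributes its own bias and noise. All error-type names, probabilities, and bias/noise values here are hypothetical placeholders; the actual model infers these quantities by Bayesian calibration against ground-truth values rather than fixing them.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical error types: (additive bias in mm, noise std in mm).
# A "unit" error might in reality be multiplicative; this sketch keeps
# everything additive for simplicity.
error_types = {
    "negligible": (0.0, 0.2),
    "meniscus":   (0.5, 0.3),
    "unit":       (10.0, 1.0),
    "outlier":    (0.0, 25.0),
}
type_names = list(error_types)

# Hypothetical error communities: each row is a categorical distribution
# over error types (rows sum to 1).
communities = np.array([
    [0.97, 0.01, 0.01, 0.01],   # mostly error-free observers
    [0.60, 0.30, 0.05, 0.05],   # meniscus-prone observers
    [0.55, 0.05, 0.35, 0.05],   # unit-error-prone observers
])

def simulate_observation(true_value, community_id):
    """Draw one CS observation under the sketched generative assumptions."""
    probs = communities[community_id]
    k = rng.choice(len(type_names), p=probs)      # latent error type
    bias, noise = error_types[type_names[k]]
    return true_value + bias + rng.normal(0.0, noise), type_names[k]

# Example: a CS from community 1 reports a true rainfall of 12.4 mm.
obs, etype = simulate_observation(12.4, community_id=1)
print(f"reported {obs:.2f} mm, latent error type: {etype}")
```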

Gaby J Gründemann and 6 more

Quantifying the magnitude and frequency of extreme precipitation events is key to translating climate observations into planning and engineering design. Past efforts have mostly focused on the estimation of daily extremes from gauge observations. The recent development of high-resolution global precipitation products now allows the estimation of extremes at the global scale. This research aims to quantitatively characterize the spatiotemporal behavior of precipitation extremes by calculating extreme precipitation return levels for multiple durations on the global domain using the Multi-Source Weighted-Ensemble Precipitation (MSWEP) dataset. Both classical and novel extreme value distributions are used to provide insight into the spatial patterns of precipitation extremes. Our results show that the traditional Generalized Extreme Value (GEV) distribution and Peak-Over-Threshold (POT) methods, which use only the largest events to estimate precipitation extremes, yield estimates that are not spatially coherent. The recently developed Metastatistical Extreme Value (MEV) distribution, which includes all precipitation events, leads to smoother spatial patterns of local extremes. While the GEV and POT methods predict a consistent shift from heavy to thin tails with increasing duration, the heaviness of the tail obtained with MEV is relatively unaffected by the precipitation duration. The generated extreme precipitation return levels and corresponding parameters are provided as the Global Precipitation EXtremes (GPEX) dataset. These data can be useful for studying the underlying physical processes causing the spatiotemporal variations in the heaviness of extreme precipitation distributions.
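As a concrete illustration of the classical block-maxima approach mentioned above (a GEV distribution fitted to the largest events only), the sketch below fits a GEV to annual maxima of a synthetic daily series and computes return levels with SciPy. The data and parameters are invented; the study itself uses MSWEP and also applies the POT and MEV estimators, which are not shown here.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)

# Synthetic stand-in for a daily precipitation series (mm/day) at one grid cell;
# in the study this would come from the MSWEP product instead.
years, days_per_year = 40, 365
daily = rng.gamma(shape=0.4, scale=8.0, size=(years, days_per_year))

# Block-maxima approach: annual maxima of the chosen duration (1 day here).
# For multi-day durations, aggregate with a rolling sum before taking maxima.
annual_maxima = daily.max(axis=1)

# Fit the GEV distribution to the annual maxima.
shape, loc, scale = genextreme.fit(annual_maxima)

# Return level for a T-year return period: exceeded with probability 1/T per year.
for T in (10, 50, 100):
    level = genextreme.isf(1.0 / T, shape, loc, scale)
    print(f"{T:>3}-year return level: {level:.1f} mm/day")
```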

Nick Van De Giesen and 1 more

The ratio between slow or thermal (<2.2 km/s) and fast (>2.2 km/s) neutrons is known to be a good measure of the amount of water present within a radius of about 300 m of the measurement. COSMOS detectors use this principle and measure neutrons by means of the helium isotope 3He. COSMOS has been in use for some time now, and its large-scale observations are central to bridging the scaling gap between direct gravimetric observation of soil moisture (<<1 m2) and the scale at which soil moisture is represented in hydrological models and satellite observations (>100 m2). The main sources of 3He were nuclear warheads. The fortunate decline of nuclear weapon stockpiles has had the less fortunate consequence that 3He has become expensive, leading to a search for more affordable alternatives. Here, we present laboratory results for a boron-based neutron detector called BLOSM. About 20% of naturally occurring boron is 10B, which has a large cross-section for thermal neutrons. When 10B absorbs a neutron, it decays into lithium and an alpha particle. The alpha particle can then be detected by ZnS(Ag), which sends out UV photons. Because real estate is at a premium in most neutron detection applications, most boron detectors are based on relatively expensive enriched boron with >99% 10B. In hydrology, space is usually less of an issue, so one innovation here is that we use natural boron in a detector that is simply a bit larger than one based on enriched boron, but much cheaper. A second innovation, put forward by Jeroen Plomp of the Delft Reactor Institute, is the use of wavelength-shifting fibers that capture the UV photons by downshifting their wavelength to green. Green photons have a wider angle of total internal reflection and tend to stay in the fiber until they exit at the end. Here, a third innovation comes into play, inspired by Spencer Axani’s $100 muon detector: the use of simple electronics and silicon photomultipliers (SiPMs). Because we want to know the ratio between fast and slow neutrons, we need two detectors: one that simply counts the thermal neutrons that continuously zap around and through us, and one covered by a moderator that slows faster neutrons down to thermal levels so that they can be detected. Presently, we can build two detectors for about EUR 1000. We expect that, after the development of some custom electronics, this will come down to around EUR 500. Ideally, we would like to build a network of these detectors in Africa in conjunction with the TAHMO network (www.tahmo.org).
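The two-detector setup (one bare, one moderated) reduces, at the data level, to a counting-statistics calculation: the thermal-to-fast proxy is the ratio of the two count rates, each subject to Poisson noise. The sketch below shows only that step, with invented counts; the calibration from this ratio to soil moisture is not described in the abstract and is not attempted here.

```python
import math

def count_ratio(bare_counts: int, moderated_counts: int):
    """Thermal-to-fast proxy: ratio of bare to moderated detector counts.

    Both detectors are assumed to count over the same interval, so the
    rate ratio equals the count ratio. Counting noise is Poisson, so the
    relative uncertainty of each channel is 1/sqrt(N).
    """
    if bare_counts <= 0 or moderated_counts <= 0:
        raise ValueError("need at least one count in each channel")
    ratio = bare_counts / moderated_counts
    # First-order error propagation for a quotient of two independent
    # Poisson counts.
    rel_err = math.sqrt(1.0 / bare_counts + 1.0 / moderated_counts)
    return ratio, ratio * rel_err

# Example with made-up hourly counts from the two BLOSM channels.
ratio, sigma = count_ratio(bare_counts=420, moderated_counts=650)
print(f"thermal/fast ratio: {ratio:.3f} ± {sigma:.3f}")
```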

Niels Drost and 15 more

The eWaterCycle platform (https://www.ewatercycle.org/) is a fully Open Source system designed explicitly to advance the state of Open and FAIR hydrological modelling. Reproducibility is a key ingredient of FAIR and one of the driving principles of eWaterCycle. While working with hydrologists to create a fully Open and FAIR comparison study, we noticed that many ad hoc tools and scripts are used to create input (forcing, parameters) for a hydrological model from source datasets such as climate reanalyses and land-use data. To make this part of the modelling process more reproducible and transparent, we have created a common forcing pre-processing pipeline based on an existing climate model analysis tool: ESMValTool (https://www.esmvaltool.org/). Using ESMValTool, the eWaterCycle platform can perform commonly required pre-processing steps such as cropping, re-gridding, and variable derivation in a standardized manner. If needed, it also allows custom steps for a specific hydrological model. Our pre-processing pipeline directly supports commonly used datasets such as ERA-5, ERA-Interim, and CMIP climate model data, and creates ready-to-run forcing data for a number of hydrological models. Besides creating forcing data, the eWaterCycle platform allows scientists to run hydrological models in a standardized way from Jupyter notebooks, wrapping the models inside a container environment and interfacing with them through the Basic Model Interface (BMI, https://bmi.readthedocs.io/). The container environment (based on Docker) stores the entire software stack, including the operating system and libraries, in such a way that a model run can be reproduced in an identical software environment on any other computer. The reproducible processing of forcing and a reproducible software environment are important steps towards our goal of fully reproducible, Open, and FAIR hydrological modelling. Ultimately, we hope to make it possible to fully reproduce a hydrological model experiment, from data pre-processing to analysis, with only a few clicks.
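To illustrate what interfacing through BMI looks like in practice, here is a minimal sketch of a BMI run loop in Python. The method names follow the published BMI specification (https://bmi.readthedocs.io/); the model object, configuration path, and variable name are placeholders, not the eWaterCycle platform's actual class or variable names.

```python
import numpy as np

# BMI defines methods such as initialize, update, get_current_time,
# get_end_time, get_value, and finalize. In eWaterCycle the model runs
# inside a Docker container behind a BMI-speaking wrapper; "model" below
# is any object implementing that interface.

def run_model(model, forcing_config: str, variable: str):
    """Generic BMI control flow: initialize, step to the end time, collect output."""
    model.initialize(forcing_config)           # config points at pre-processed forcing
    series = []
    while model.get_current_time() < model.get_end_time():
        model.update()                          # advance one model time step
        dest = np.empty(model.get_grid_size(model.get_var_grid(variable)))
        model.get_value(variable, dest)         # copy current values into dest
        series.append(dest.copy())
    model.finalize()
    return np.array(series)

# Hypothetical usage; the variable name would be a CSDMS standard name in practice.
# discharge = run_model(my_wrapped_model, "forcing_config.yaml", "discharge")
```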