Andrew Bennett

and 2 more

The hydrologic cycle is a complex and dynamic system of interacting processes. Hydrologists seeking to understand and predict this system develop models of varying complexity and compare model output to observations, either to evaluate performance or to diagnose shortcomings within the models. Often, these analyses take into account only single variables or isolated aspects of the hydrologic system. To explore how process interactions affect model performance, we have developed a general framework based on information theory and conditional probabilities. We compare how conditional mutual information and mean square error are related under a variety of hydrometeorological conditions. By exploring different regions of phase space, we can quantify model strengths and weaknesses in terms of both process accuracy and classical performance. By considering a range of conditions, we can evaluate and compare models outside of their average behavior. We apply this analysis to physically based models (based on SUMMA), statistical models, and observations from FluxNet towers at a number of hydro-climatically diverse sites. By focusing on how the turbulent heat fluxes are affected by shortwave radiation, air temperature, and relative humidity, we go beyond simple error metrics and are able to reason about model behavior in a physically motivated way. We find that the statistically based models, while showing better performance in the mean field, often do not represent the underlying physics as well as the physically based models. The statistically based models' over-reliance on shortwave radiation inputs limits their ability to reproduce more complex phenomena.
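
To make the central quantity concrete, the following is a minimal sketch of a conditional mutual information estimate, I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The abstract does not state which estimator the authors use, so the equal-width histogram (plug-in) approach, the bin count, and all variable names and synthetic data here are assumptions chosen purely for illustration.

```python
import math
import random
from collections import Counter


def discretize(series, bins):
    """Map each value to an equal-width bin index in 0..bins-1."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in series]


def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


def conditional_mutual_information(x, y, z, bins=8):
    """Plug-in estimate of I(X; Y | Z) from binned samples."""
    xd, yd, zd = (discretize(s, bins) for s in (x, y, z))
    return (
        entropy(list(zip(xd, zd)))
        + entropy(list(zip(yd, zd)))
        - entropy(list(zip(xd, yd, zd)))
        - entropy(zd)
    )


# Synthetic, hypothetical data (not FluxNet or SUMMA output).
rng = random.Random(0)
n = 20000
temperature = [rng.uniform(0, 30) for _ in range(n)]
shortwave = [rng.uniform(0, 1000) for _ in range(n)]
# A flux that responds to shortwave radiation, and one that ignores it.
flux_driven = [0.4 * s + rng.gauss(0, 20) for s in shortwave]
flux_unrelated = [rng.gauss(100, 20) for _ in range(n)]

cmi_driven = conditional_mutual_information(shortwave, flux_driven, temperature)
cmi_unrelated = conditional_mutual_information(shortwave, flux_unrelated, temperature)
print(f"I(SW; flux | T), driven flux:    {cmi_driven:.3f} bits")
print(f"I(SW; flux | T), unrelated flux: {cmi_unrelated:.3f} bits")
```

Histogram estimators carry a positive bias that grows with the number of bins relative to the sample size, so the value for the unrelated flux is small but not exactly zero. Restricting the samples to a subset of conditions (for example, one temperature range) before calling the function is one way to probe a single region of phase space.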

Mashrekur Rahman

and 2 more

Recent advances in computational linguistics and machine learning, including a variety of toolboxes for Natural Language Processing (NLP), facilitate the analysis of vast electronic corpora for a multitude of objectives. Research papers published as electronic text files in different journals offer windows into trending topics and developments, and NLP allows us to extract information and insight about these trends. This project applies Latent Dirichlet Allocation (LDA) topic modeling to bibliometric analyses of all abstracts in selected high-impact (Impact Factor > 0.9) journals in hydrology. Topic modeling uses statistical algorithms to extract semantic information from a collection of texts and has become an emerging quantitative method for assessing large volumes of text. The generated topics are interpretable based on our prior knowledge of hydrology and related sub-disciplines. We performed comparative topic trend, term, and document-level cluster analyses for different time periods. These analyses revealed that topics such as climate change research have gained popularity in hydrology over the last decade. An inter-topic correlation analysis also revealed the nature of information exchange and absorption between various communities within the hydrology domain. The primary objective of this work is to allow researchers to explore new branches and connections in the hydrology literature and to facilitate comprehensive and inclusive literature reviews. We aim to combine these results with probability distributions across topics, journals, and authors to create an ontology that scientists and environmental consultants can use to explore relevant literature based on topics and topic relationships.
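
As an illustration of what LDA topic modeling produces, the sketch below fits LDA to a toy corpus with a collapsed Gibbs sampler written in plain Python. The abstract does not say which LDA implementation or inference method was used, so the sampler, the hyperparameters `alpha` and `beta`, the function name `fit_lda`, and the toy token lists are all assumptions made for illustration only.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (vocab, theta, phi): theta[d][k] is the share of topic k in
    document d, and phi[k][v] is the probability of word v under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    topics = range(n_topics)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in topics]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[t] + n_words * beta) for c in topic_word[t]]
        for t in topics
    ]
    return vocab, theta, phi


# Toy, hand-written token lists standing in for preprocessed abstracts.
toy_docs = [
    "streamflow runoff catchment rainfall runoff".split(),
    "rainfall runoff streamflow flood catchment".split(),
    "climate warming precipitation trend climate".split(),
    "warming climate trend precipitation drought".split(),
]
vocab, theta, phi = fit_lda(toy_docs, n_topics=2)

for k, word_probs in enumerate(phi):
    top = sorted(zip(word_probs, vocab), reverse=True)[:3]
    print(f"topic {k}:", [word for _, word in top])
```

The per-document topic shares in `theta` are the kind of quantity that trend and inter-topic correlation analyses can be built on, for example by averaging shares by publication year or correlating topic columns across documents.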