
Transferring hydrologic data across continents -- leveraging US data to improve hydrologic prediction in other countries
  • Kai Ma,
  • Dapeng Feng,
  • Kathryn Lawson,
  • Wen-Ping Tsai,
  • Chuan Liang,
  • Xiaorong Huang,
  • Ashutosh Sharma,
  • Chaopeng Shen
Kai Ma
Sichuan University
Dapeng Feng
Pennsylvania State University
Kathryn Lawson
Pennsylvania State University
Wen-Ping Tsai
Pennsylvania State University
Chuan Liang
Sichuan University
Xiaorong Huang
Sichuan University
Ashutosh Sharma
Pennsylvania State University
Chaopeng Shen
Pennsylvania State University

Corresponding Author:[email protected]


Abstract

Global streamflow gauge and catchment property data are drastically imbalanced geographically, and data characteristics vary widely between regions, so models calibrated in one region normally cannot be migrated to another. In data-poor regions, non-transferable machine learning models are therefore habitually trained on small local datasets. Here we show that transfer learning (TL), in the sense of weight initialization and weight freezing, allows long short-term memory (LSTM) streamflow models trained over the Conterminous United States (CONUS, the source dataset) to be transferred to catchments on other continents (the target regions) without the need for extensive catchment attributes. We demonstrate this for regions where data are dense (664 basins in the UK), moderately dense (49 basins in central Chile), and scarce, with only globally available attributes (5 basins in China). In both China and Chile, the TL models significantly improved performance compared to locally trained models. The benefits of TL increased with the amount of data in the source dataset, but even 50-100 basins from the CONUS dataset provided significant value. The benefits of TL were also greater than those of pre-training the LSTM on the outputs of an uncalibrated hydrologic model. These results suggest that hydrologic data around the world share commonalities that deep learning can leverage, and that a simple modification of the currently predominant workflows can yield significant synergies, greatly expanding the reach of existing big data. Finally, this work diversifies existing global streamflow benchmarks.
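To make "weight initialization and weight freezing" concrete, the following is a minimal PyTorch sketch, not the authors' implementation. The class `StreamflowLSTM`, the function `transfer`, the layer layout, the input counts, and the choice to freeze the recurrent layer are all assumptions made for illustration; the abstract does not specify which layers are copied or frozen. The sketch copies the LSTM and output head from a source model into a target model that has its own input layer (so the target region may supply a different set of inputs), then freezes the copied LSTM weights before fine-tuning.

```python
import torch
import torch.nn as nn


class StreamflowLSTM(nn.Module):
    """Hypothetical streamflow model: input layer -> LSTM -> linear head."""

    def __init__(self, n_inputs: int, hidden_size: int = 64):
        super().__init__()
        self.input_layer = nn.Linear(n_inputs, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.input_layer(x))
        out, _ = self.lstm(h)
        return self.head(out)  # one discharge value per time step


def transfer(source: StreamflowLSTM, n_target_inputs: int,
             freeze_lstm: bool = True) -> StreamflowLSTM:
    """Build a target model initialized from a source model (assumed scheme)."""
    target = StreamflowLSTM(n_target_inputs, source.lstm.hidden_size)
    # Weight initialization: reuse the source LSTM and head parameters.
    target.lstm.load_state_dict(source.lstm.state_dict())
    target.head.load_state_dict(source.head.state_dict())
    # Weight freezing: exclude the copied LSTM from gradient updates.
    if freeze_lstm:
        for p in target.lstm.parameters():
            p.requires_grad = False
    return target


torch.manual_seed(0)
# Stand-in for a model trained on the source dataset (untrained here).
source_model = StreamflowLSTM(n_inputs=10)
# Target region with fewer available inputs (6 is an arbitrary choice).
target_model = transfer(source_model, n_target_inputs=6)

# Only the unfrozen parameters are handed to the optimizer.
trainable = [p for p in target_model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One fine-tuning step on random placeholder data (batch, time, inputs).
x = torch.randn(4, 30, 6)
y = torch.randn(4, 30, 1)
loss = nn.functional.mse_loss(target_model(x), y)
loss.backward()
optimizer.step()
```

Giving the target model a fresh input layer is one way to accommodate regions where only a subset of attributes is available; freezing fewer or more layers is a tunable choice in a sketch like this.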