
On Robustness of the Explanatory Power of Machine Learning Models
  • Banamali Panigrahi, University of Saskatchewan
  • Saman Razavi, University of Saskatchewan (corresponding author: [email protected])
  • Lorne E Doig, University of Saskatchewan
  • Blanchard Cordell, University of Saskatchewan
  • Hoshin V. Gupta, The University of Arizona
  • Karsten Liber, University of Saskatchewan

Abstract

Machine learning (ML) is increasingly considered the solution to environmental problems where only limited or no physico-chemical process understanding is available. But when support is needed for high-stakes decisions, where the ability to explain possible solutions is key to their acceptability and legitimacy, ML can fall short. Here, we develop a method, rooted in formal sensitivity analysis (SA), that can detect the primary controls on the outputs of ML models. Unlike many common methods for explainable artificial intelligence (XAI), this method can account for the complex multivariate distributional properties of input-output data commonly observed in environmental systems. We apply this approach to a suite of ML models developed to predict various water quality variables in a pilot-scale experimental pit lake.
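
The archive entry contains no code; as a loose illustration of the general idea of probing a trained ML model with a variance-based sensitivity analysis, the sketch below estimates first-order Sobol indices for a random-forest surrogate using the Saltelli pick-and-freeze estimator. The synthetic data, the independent uniform inputs, and the choice of surrogate are all illustrative assumptions; the method developed in the paper additionally accounts for the multivariate distributional properties of real input-output data, which this simple sketch does not.

```python
# Illustrative sketch only: first-order Sobol sensitivity indices for a trained
# ML model, using the Saltelli "pick-and-freeze" estimator. Inputs are assumed
# independent and uniform, which real environmental data generally are not.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: 3 explanatory inputs, one water-quality-like output.
X_train = rng.uniform(0.0, 1.0, size=(2000, 3))
y_train = (
    np.sin(2 * np.pi * X_train[:, 0])        # strong control
    + 0.5 * X_train[:, 1] ** 2               # moderate control
    + 0.05 * rng.normal(size=len(X_train))   # noise; input 2 is irrelevant
)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

def first_order_sobol(predict, n_inputs, n_samples=10_000, seed=1):
    """Estimate first-order indices S_i = V_i / Var(Y) by pick-and-freeze sampling."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(0.0, 1.0, size=(n_samples, n_inputs))
    B = rng.uniform(0.0, 1.0, size=(n_samples, n_inputs))
    fA, fB = predict(A), predict(B)
    var_y = np.var(np.concatenate([fA, fB]))
    S = np.empty(n_inputs)
    for i in range(n_inputs):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                  # vary only input i between A and ABi
        S[i] = np.mean(fB * (predict(ABi) - fA)) / var_y
    return S

print(first_order_sobol(model.predict, n_inputs=3))
```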
A critical finding is that subtle alterations in the design of an ML model (such as variations in the random seed used for initialization, the functional class, hyperparameters, or data splitting) can lead to entirely different representational interpretations of how the outputs depend on the explanatory inputs. Further, models from different ML families (decision-tree-based, connectionist, or kernel-based) appear to focus on different aspects of the information contained in the data, despite displaying similar levels of predictive power. Overall, this underscores the importance of employing ensembles of ML models when explanatory power is sought. Not doing so may compromise the ability of the analysis to deliver robust and reliable predictions, especially when generalizing to conditions beyond the training data.
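
The sketch below is a hedged, self-contained illustration of the ensemble idea described above, not the authors' procedure: it trains several models that differ only in random seed (plus one model from a different family) on the same hypothetical data with correlated inputs, uses permutation importance as a stand-in for the paper's SA-based attribution, and compares the resulting input rankings.

```python
# Illustrative sketch: gauge how stable an ML model's "explanation" is by
# training an ensemble of models that differ only in random seed (plus one
# model from a different family) and comparing their input-importance rankings.
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)

# Hypothetical data with two correlated inputs, as is common in environmental data.
n = 2000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + 0.6 * rng.normal(size=n)   # strongly correlated with x0
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = x0 + 0.5 * x2 + 0.1 * rng.normal(size=n)

models = [RandomForestRegressor(n_estimators=200, random_state=s) for s in range(4)]
models.append(MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0))

importances = []
for m in models:
    m.fit(X, y)
    imp = permutation_importance(m, X, y, n_repeats=10, random_state=0).importances_mean
    importances.append(imp)

# Pairwise rank correlations of the importance scores: values well below 1 mean
# that nominally similar models attribute the output to different inputs,
# i.e. the explanation is not robust to the model's design choices.
for j in range(1, len(importances)):
    rho, _ = spearmanr(importances[0], importances[j])
    print(f"model 0 vs model {j}: Spearman rank correlation = {rho:.2f}")
```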
14 Mar 2024: Submitted to ESS Open Archive
15 Mar 2024: Published in ESS Open Archive