3 Analytical case

3.1 Problem statement

A simple model with three unknown parameters is employed to illustrate the proposed subsampling ANOVA approaches. The model is expressed as follows:
where X1, X2 and X3 are independent variables uniformly distributed within [0, 1]. This simplified model was proposed by Chen et al. (2019). Its purpose here is to explore how the sensitivity indices of the model parameters change with different subsampling methods in ANOVA-based sensitivity analysis. In our study, “5” means that five levels are first selected equidistantly within the initial parameter range. The five levels are then subsampled (see Section 2.1), giving a total of 10 ($C_5^2 = 10$) combinations of level pairs for two-level ANOVA. Similarly, “2” means that only two levels (the maximum and minimum values) of the parameter are selected from the range, without subsampling. For example, “522” means that five levels of X1 are selected equidistantly from the range before subsampling, while only two levels each of X2 and X3 are selected. In the same way, we define 252, 225, 552, 525, 255, 222, 333, 444 and 555 for the different ANOVA approaches. For 522, 252 and 225, only one of the three parameters is subsampled; these represent single-subsampling ANOVA. For 552, 525 and 255, two of the three parameters are subsampled; these represent multiple-subsampling ANOVA. Finally, 222, 333, 444 and 555 represent full-subsampling ANOVA with different numbers of parameter levels.
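The labelling scheme can be illustrated with a short Python sketch. It assumes that the equidistant levels include both end points of [0, 1], which the text implies for “2” (maximum and minimum) but does not state for “5”; the function names `level_pairs` and `scheme_pair_counts` are hypothetical and serve only this illustration.

```python
from itertools import combinations
from math import comb


def level_pairs(n_levels):
    """All two-level pairs drawn from n equidistant levels on [0, 1].

    Assumption: the levels include both end points of the range.
    """
    if n_levels < 2:
        raise ValueError("at least two levels are needed for two-level ANOVA")
    levels = [i / (n_levels - 1) for i in range(n_levels)]
    return list(combinations(levels, 2))


def scheme_pair_counts(scheme):
    """Number of level pairs per parameter for a label such as '522'."""
    return [comb(int(ch), 2) for ch in scheme]


print(len(level_pairs(5)))        # 10 pairs from five levels
print(scheme_pair_counts("522"))  # [10, 1, 1]: only X1 is subsampled
print(scheme_pair_counts("555"))  # [10, 10, 10]: full subsampling
```

A parameter labelled “2” contributes a single pair (minimum, maximum), so it is not subsampled, whereas a parameter labelled “5” contributes ten pairs.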

3.2 Influence of subsampled parameter

Figure 1 presents the sensitivity indices of the individual parameters and their interactions under different subsampling ANOVA approaches. Figure 1(a) represents single-subsampling ANOVA and Figure 1(b) represents multiple-subsampling ANOVA. First, the sensitivities of the parameters differ from one another. In detail, the sensitivity ranges of X1, X2, X3 and the interactions are 4.1%-41.2%, 25.1%-78.5%, 7.5%-47.3% and 7.0%-15%, respectively. In most cases, X2 is the most sensitive parameter. Second, the individual sensitivity of each parameter varies significantly with the subsampling scheme. For single-subsampling ANOVA, the minimum value (the red bar) of X1’s sensitivity is obtained in 522, where only X1 is subsampled. Similarly, the minimum values (the red bars) of X2’s and X3’s sensitivities are obtained in 252 and 225, respectively. These results indicate that, in single-subsampling ANOVA, the individual sensitivity of a parameter drops sharply when that parameter is subsampled. For multiple-subsampling ANOVA in Figure 1(b), the maximum value (the blue bar) of X1’s sensitivity is obtained in 255, where only X1 is not subsampled. Similarly, the maximum values of X2’s and X3’s sensitivities are obtained in 525 and 552, respectively, which indicates that, in multiple-subsampling ANOVA, the individual sensitivity of the non-subsampled parameter increases. Third, the black bars in Figure 1 represent the individual and interaction sensitivity indices of the three parameters obtained by Sobol’s method. Compared with Sobol’s results, the subsampling process reduces the individual sensitivity of the subsampled parameter and increases the individual sensitivity of the non-subsampled parameter. Lastly, the subsampling process changes not only the values of the parameter sensitivities but also their ordering (as shown in Figures S1-S3 of the supporting material).
For example, the order of sensitivity under the 522 scheme is X2 > X3 > interaction > X1, whereas the 252 scheme yields a different order: X3 > X1 > X2 > interaction. This also indicates that the results of both the single- and multiple-subsampling schemes are biased. Consequently, the full-subsampling ANOVA approach is employed in the following sections with the aim of reducing this deviation.

3.3 Influence of parameter levels

In the full-subsampling ANOVA approach, different numbers of levels can be chosen for each parameter from its variation range. In this study, four scenarios are tested, with each parameter having 2, 3, 4 or 5 levels (i.e. 222, 333, 444 and 555, respectively). Figure 2 shows the influence of the number of parameter levels on the individual and interaction sensitivities. The sensitivities of the three parameters change as the number of levels changes. As the scheme goes from 222 to 555, the individual sensitivities of X1 and X3 gradually increase from 11.7% and 19.4% to 19.1% and 24.1%, respectively. At the same time, the interaction sensitivity gradually decreases from 18.1% to 5.5%. The individual sensitivity of X2, which makes the largest contribution, remains relatively stable, ranging from 50.9% to 52.2%. These results show that, for the full-subsampling ANOVA method, the individual and interaction sensitivities are affected by the number of subsampled parameter levels. Increasing the number of levels slightly increases the sensitivity of the less sensitive parameters and decreases the interaction sensitivity. It is also worth noting that the order of the parameter sensitivities changes when the number of levels increases from 2 to 3. When 3 or more levels are chosen, the variation in the results is relatively small and the order of the parameter sensitivities remains consistent with that of Sobol’s method. Overall, the full-subsampling ANOVA approach with 3 or more levels is suggested to reduce the deviation.

3.4 Comparison with Sobol’s method

To evaluate the accuracy of the different subsampling ANOVA approaches, Sobol’s method is used as a benchmark. It is widely used in hydrological modelling (Zhang et al., 2013, Wang et al., 2018, Song et al., 2015, Sobol’, 2010) as an effective approach to globally characterize single-parameter and multiple-parameter interactive sensitivities (Tang et al., 2007). In this study, taking the sensitivity indices calculated by Sobol’s method as base values, the deviation between subsampling ANOVA and Sobol’s method can be evaluated as , where is the sensitivity indices calculated by the subsampling ANOVA approaches, is the sensitivity indices calculated by Sobol’s method. All the sensitivity indices calculated by subsampling ANOVA and Sobol’s method are available in the supporting material, and the deviations between the two are presented in Figure 3.
The deviations between the results of subsampling ANOVA and Sobol’s method vary (0.0008-0.114) with the subsampling scheme and the number of parameter levels. A lower deviation indicates that the calculated individual and interaction sensitivities are more accurate. For the single-subsampling and multiple-subsampling ANOVA approaches, the corresponding deviations range from 0.024 to 0.114. As expected, significantly better performance (deviations ranging from 0.001 to 0.016) is obtained with the full-subsampling ANOVA method. Moreover, the deviations are lower than 0.002 if 3 or more parameter levels are chosen in full-subsampling ANOVA. These deviations indicate that the sensitivity indices obtained through the single- and multiple-subsampling ANOVA methods are biased. The negligible bias of the full-subsampling ANOVA method shows that the parameter sensitivities are very close to the “true value” when the number of subsampled parameter levels is 3 or more. Therefore, to obtain more reliable parameter sensitivity results, the full-subsampling scheme with 3 or more parameter levels is necessary when applying subsampling ANOVA methods.
Many studies point out that Sobol’s method is computationally expensive (Tang et al., 2008, Tian, 2013, Reusser et al., 2011). Here, to illustrate the computational advantages of the subsampling ANOVA methods, the number of model runs and the number of variance calculations required by the subsampling ANOVA methods and by Sobol’s method are presented in Table 1. In general, N*(M+2) model evaluations are required to apply Sobol’s method, where N is the random sample size and M is the number of parameters; for more details about Sobol’s method, see Sobol’ (1990) and Nossent et al. (2011). In this case study, to obtain a stable sensitivity analysis result, different sample sizes N were tested in Sobol’s method. We found that the results remained relatively stable when N was larger than 2000. For this simple three-parameter model, the number of model runs is therefore 2000*(3+2) = 10,000, which is a barely acceptable computing requirement.
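The run count quoted above follows directly from the N*(M+2) rule. As a small illustration, the sketch below evaluates it for the case-study values; the function name `sobol_model_runs` is hypothetical and the code only reproduces the arithmetic in the text, not Sobol’s method itself.

```python
def sobol_model_runs(n_samples, n_params):
    """Model evaluations implied by the N*(M+2) rule quoted in the text."""
    return n_samples * (n_params + 2)


# Case-study values: N = 2000 random samples, M = 3 parameters.
print(sobol_model_runs(2000, 3))  # 10000
```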
Fortunately, the subsampling ANOVA methods can significantly reduce the computational requirements while achieving accuracy close to that of Sobol’s method. For example, in the full-subsampling scheme “444”, the model needs to be run only 64 times (64 = 4*4*4), which yields 64 sets of model responses. Through the resampling process, 216 sets (216 = $6^3$, where $6 = C_4^2$) of 2*2*2 combinations can be obtained, and each combination yields one set of variance results. Thus, 216 sets of variance results are obtained. The final sensitivity results are obtained by averaging and homogenizing the 216 sets of variances. The number of model runs determines the computing requirements. By reducing the number of model runs, the subsampling ANOVA methods provide an effective and feasible sensitivity analysis with relatively low computational requirements. Reducing the number of model runs is very important, especially for models with few parameters but high computational demand.
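The full-subsampling procedure for a three-parameter model can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the implementation used in the study: `toy_model` is a hypothetical stand-in (it is not the model of Chen et al., 2019), all interaction terms are lumped into one index, and "averaging and homogenizing" is read as averaging the variance fractions over all 2*2*2 blocks and rescaling them to sum to 1.

```python
from itertools import combinations, product


def toy_model(x1, x2, x3):
    # Hypothetical stand-in model; NOT the model of Chen et al. (2019).
    return x1 + 2.0 * x2 + x3 + x1 * x3


def anova_fractions(block):
    """Variance fractions [X1, X2, X3, interactions] for one 2*2*2 block.

    block maps (i, j, k) with i, j, k in {0, 1} to a model response.
    Assumes the block has non-zero total variance.
    """
    cells = list(product((0, 1), repeat=3))
    grand = sum(block[c] for c in cells) / 8.0
    ss_total = sum((block[c] - grand) ** 2 for c in cells)
    ss_main = []
    for axis in range(3):
        ss = 0.0
        for level in (0, 1):
            # Four cells share each level of a factor.
            level_mean = sum(block[c] for c in cells if c[axis] == level) / 4.0
            ss += 4.0 * (level_mean - grand) ** 2
        ss_main.append(ss)
    ss_inter = ss_total - sum(ss_main)  # all interactions lumped together
    return [ss / ss_total for ss in ss_main] + [ss_inter / ss_total]


def full_subsampling_anova(model, n_levels):
    """Return (model runs, number of 2*2*2 blocks, averaged indices)."""
    levels = [i / (n_levels - 1) for i in range(n_levels)]
    # One model run per cell of the full factorial grid.
    grid = {idx: model(*(levels[i] for i in idx))
            for idx in product(range(n_levels), repeat=3)}
    pairs = list(combinations(range(n_levels), 2))
    sums = [0.0] * 4
    n_blocks = 0
    for chosen in product(pairs, repeat=3):
        # Reuse stored responses; no additional model runs are needed.
        block = {c: grid[tuple(chosen[a][c[a]] for a in range(3))]
                 for c in product((0, 1), repeat=3)}
        for m, frac in enumerate(anova_fractions(block)):
            sums[m] += frac
        n_blocks += 1
    means = [s / n_blocks for s in sums]
    total = sum(means)
    return len(grid), n_blocks, [m / total for m in means]


n_runs, n_blocks, indices = full_subsampling_anova(toy_model, 4)
print(n_runs, n_blocks)  # 64 216
print([round(v, 3) for v in indices])
```

The key point of the design is that every 2*2*2 block is assembled from responses already stored in the grid, so the cost in model runs is fixed by the grid size (64 for “444”) regardless of how many blocks are analysed.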