4、Practical case study

4.1 Problem statement

To further illustrate the applicability of the subsampling ANOVA methods, the proposed approaches are applied for parameter sensitivity analyses in hydrological simulation through the conceptual model GR4J, as shown in Figure 4. The study area is Zengjiang River which is one tributary of Dongjiang River located in the Pear River Delta, China as shown in Figure 5. The meteorological data (daily evaporation and daily precipitation) are collected from Qilinzui Hydrological Station (as shown in Figure 5) during the period of 2009-2015 which were obtained from Hydrological Data of Pearl River Basin, Annual Hydrology Report, P. R. China. The total drainage area above the Qilinzui Hydrological Station is 2866 km2, accounting for 91% of the Zengjiang River basin (3160 km2). The mean annual temperature and precipitation are 21.6°C and 2188 mm, respectively. More details about Zengjiang River basin can be found in (Tao et al., 2011).
GR4J model is a rainfall-runoff model which is based on four free parameters from daily rainfall data. In GR4J, the production components include an interception of raw rainfall and potential evapotranspiration by an interception reservoir of null capacity, a soil moisture accounting procedure to calculate effective rainfall and a water exchange term to model water losses to or gains from deep aquifers. Its routing module includes two flow components with constant volumetric split (10–90%), two unit hydrographs, and a non-linear routing store (as shown in Figure 4). The descriptions and initial fluctuating ranges of GR4J model parameters are presented in Table 1. For more details of GR4J model, please refer to the literature (Perrin et al., 2003). The initial fluctuating ranges of GR4J model parameters are wide considering the structure varies in different basins. However, for a specific watershed, the appropriate parameter range should be obtained through the calibration process that produce an acceptable level of model performance (Freer et al., 1996, Pianosi et al., 2016, Shin et al., 2013). It’s reported that the parameters sensitivities were strongly influenced by the range of parameter values used (Shin et al., 2013, Wang et al., 2013), therefore it is important to obtain an appropriate parameter range corresponding to considered model performance before SA (Shin et al., 2013, Saltelli et al., 2019). Therefore, in this study, the model calibration based on the MH algorithm is used prior to SA in order to constraint the input variability space. The details about MH algorithm are presented in supporting materials. Nash–Sutcliffe model efficiency (NSE) is used to assess the predictive power of model results which involves standardization of the residual variance. Here, the objective functions adopted can be represented as follows (Nash and Sutcliffe, 1970, Legates and Jr, 1999):
where \(Q_{\text{sim}}\ \)is the simulated runoff, \(Q_{\text{obs}}\) is the observed runoff, \(\overset{\overline{}}{Q_{\text{obs}}\ }\) is the mean value of the observed runoff and \(n\) is the sample size.
For the MH algorithm parameterization, the markov chain with 10,000 iterations for each parameter are examined and the model performance of each iteration are presented in Figure 6. The NSE is greater than 0.74 and remained stable after a number of iterations. In this study, the first 50% of the samples in markov chain are ruled out as a warm-up period. The last 50% of the samples passed the Heidelberger and Welch Convergence Diagnostics (Heidelberger and Welch, 1983). The posterior distributions are obtained from the last 5000 parameter sets and the posterior PDFs of each parameter are presented in Figure 7. All parameters ranges appear in a relatively small interval which is different from their initial value. The posterior PDFs of four parameters are approximately normal distribution characteristics, indicating that the parameters in GR4J are well identified after a number of iterations even with a wide range of prior densities. The predictive intervals of stream obtained by the last 5000 parameter sets are presented in Figure 8. It can be observed that the obtained predictive intervals can generally match the observations, except for some overestimations in high-flow periods.
With the appropriate parameter range, the next work is to evaluate how much each parameter contributes to the stream uncertainty. Sensitivities of parameters in a rainfall–runoff model structure are specific to a site, and cannot be assumed from previous work in other catchments (Van Griensven et al., 2006, Shin et al., 2013). In this study, the proposed subsampling ANOVA methods are applied for analyzing parameters sensitivities of GR4J model in Zengjiang River basin. Based on the calibration of GR4J, the ranges of the four parameters are determined as presented in Table 2. Similar to Section 3.1, different subsampling ANOVA approaches, including single-subsampling ANOVA (5222, 2522, 2252 and 2225), multiple-subsampling ANOVA (5522, 5252, 5225, 2552, 2525, 2255, 5552, 5525, 5255 and 2555), and full-subsampling ANOVA with different parameters level (2222, 3333, 4444 and 5555) are defined and applied here.

4.2 Influence of subsampled parameter

With only one parameter subsampling, the contributions of individual and interactive effects for the four parameters in GR4J model are shown in Figure 9. There are several findings as follows. Firstly, taking sobol’s results as the reference results, X1 makes the largest independent contribution to GR4J model uncertainty in Zenjiang River and followed by the interactive effects of the four parameters. X1(the first parameter) represents the maximum capacity of the production store. The high sensitivity of parameter X1 indicates that runoff generation in Zengjiang basin is highly affected by the maximum capacity of the production store. The maximum capacity of the production store (X1) increases to handle with an overestimation of rainfall and decreases to handle with underestimation, thus adapting its capacity to hold and evaporate different amounts of water (Oudin et al., 2006). Secondly, the parameters sensitivities obtained change significantly with different subsampling scheme. For example, the contributions of X1 are 0.109, 0.230, 0.275 and 0.205 for the four single-subsampling schemes (where X1, X2, X3, and X4 are subsampled separately). The lowest sensitivity value for X1 obtained in 5222, which X1 decomposed into five levels and take subsampling. The other three parameters have the same basic behaviors as the X1. Therefore, similar with section 3.2, the subsampling procedure also lead to a lower sensitivity value for one parameter which is subsampled in. Thirdly, the ranking of parameter sensitivity is influenced by different single-subsampling schemes. In order to check if hierarchy is kept by the methods more clearly, the bar plots which is close to Figure 9,but grouped by simulations and not by parameter are presented in supplementary materials Figure S4-S6. For instance, the sensitivity order is Interactions >X3 > X4 > X1 > X2 in 5222 scheme, while in 2252 scheme, the sensitivity order is Interactions >X1 > X3 > X4 > X2. Such results indicate that the single-subsampling ANOVA approach generate unreliable sensitivity values, which are highly influenced by the parameter to be subsampled. In multiple-subsampling ANOVA, more than one parameters in GR4J model are subsampled. The contributions of individual and interactive effects for GR4J model parameters under different multiple-subsampling schemes are presented in Figure 10.
It can be found that, for each parameters, the red bar values are significantly lower than the blue bar values. The mean values of the red bars for X1, X2, X3 and X4 are 0.184, 0.033, 0.124 and 0.078, respectively. Meanwhile the mean values for the blue bars for X1, X2, X3 and X4 are 0.306, 0.098, 0.264 and 0.225, respectively. For each parameter, the mean value without subsampling (blue bars) is more than twice as much as the mean value with subsampling (red bars). This indicates that, similar with the single-subsampling scheme, the multiple-subsampling ANOVA approaches also generate unreliable results, and the subsampling-procedure would significantly reduce the resulting individual sensitivity value. In other words, the difference of sampling densities among parameters has great influence on quantification of parametric sensitivities in hydrologic modeling.

4.3 Influence of subsampled parameter level in full-subsampling

In the full-subsampling ANOVA approach, all the four parameters in GR4J model are subsampled. However, different levels for each parameter can be chosen before the subsampling procedure. Similar with the analytic case in section 3, four levels (2 to 5) are going to be chosen for each parameter in GR4J. The contributions of individual and interactions for GR4J model parameters under different parameter levels in full-subsampling ANOVA are presented in Figure 11. As the parameter levels increase from 2222 to 5555, the sensitivity of X1, X2 and X4 gradually increase from 20.1%, 3.7% and 4.7% to 31.0%, 7.6% and 15.8%, respectively. At the same time, the contribution of X3 and the interaction gradually decrease from 21.7% to 17.8% and 48.9% to 25.9%. The results indicate that the parameters levels will affect the individual and interactive sensitivities in the full-subsampling ANOVA approach. In details, the sensitivity of the most sensitive parameter and interaction generally decrease, while that of the other parameters increase with the parameter level increased. The obtained results would vary most significantly when the parameter level increases from 2 to 3. At the same time, the variation of the obtained results is relatively small and the order of parameters sensitivity would not change when the parameter levels are higher than three.

4.4 Compared with sobol’s

The sensitivity values for the four parameters in GR4J model from the subsampling ANOVA (i.e. single-subsampling, multiple-subsampling and full-subsampling) and sobol’s are discussed above. To compare the subsampling ANOVA and sobol’s methods further, the deviations for parameter sensitivity values and the calculation cost are presented in Figure 12 and Table 3. There are significant differences between the deviations obtained with different subsampling ANOVA methods. The large deviations (> 0.08) are obtained with 2225, 2255 and 2525, meanwhile small deviations (< 0.01) are obtained with 3333, 4444, and 5555. Except for 2222, other full-subsampling schemes perform very well in the sensitivity analysis of GR4J model parameters. Therefore, in order to get reliable parameter sensitivity results, the three or more parameter levels in the full-subsampling ANOVA approach are recommended.
In this GR4J model study, 3 million random samples are applied in sobol’s in order to get a stable result of the parameters sensitivities. So totally, the GR4J model need to run 3000000*(5+2) times, which is a very large computational requirement. However, the subsampling ANOVA methods can significantly reduce the calculation requirements while achieve similar calculation accuracy in GR4J model. For example, in full-sampling ”4444” where the deviation is only 0.0006, the model only need to run 256 times (256=4*4*4*4). Similar with section 3.4, for the four parameters conceptual model, 1296 sets variance results can be obtained through subsampling process where 1296= and . The final sensitivity results obtained by averaging and homogenizing the 1296 sets of variance. Through reducing the number of model runs, the subsampling ANOVA methods are effective and feasible sensitivity analysis methods with relatively low computational requirements.