Unpacking some of the linkages between uncertainties in observational
data and the simulation of different hydrological processes using the
Pitman model in the data scarce Zambezi River basin.
D.A. Hughes¹* and F. Farinosi²
¹ Institute for Water Research (IWR), Rhodes University, Grahamstown, South Africa
² European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
* Corresponding author’s email: D.Hughes@ru.ac.za
Abstract: The main objective of this study was to use an uncertainty
version of a widely used monthly time step, semi-distributed model (the
Pitman model) to explore the equifinalities in the way in which the main
hydrological processes are simulated and any identifiable linkages with
uncertainties in the available observational data. The study area is the
Zambezi River basin and 18 gauged sub-basins have been included in the
analyses. Unfortunately, it is not generally possible to quantify some
of the observational uncertainties in such a data scarce area and mostly
we are limited to identifying where these data are clearly deficient
(i.e. erroneous or non-representative). The overall conclusion is that
the equifinalities in the model are hugely dominant in terms of the
uncertainties in the relative occurrence of different runoff generating
processes, although water use uncertainties in the semi-arid parts of
the basin can contribute to these uncertainties. The identification of
landscape features that suggest the occurrence of saturation excess
surface runoff provides some information to constrain the model.
Improved independent estimates of groundwater recharge are also
identified as a key source of observational data that would help a great
deal in constraining the model parameter space and therefore reducing
some of the model equifinality.
Keywords: Processes; Hydrological models; Observations; Uncertainty;
Zambezi River basin
INTRODUCTION
Models are typically developed to simulate the response of a system to
driving forces in the absence of observations of the response. This is
true for many different kinds of models, including environmental models
(hydrology, geomorphology, oceanographic, climate, etc.), economic
models, health models (drug pharmacokinetics, for example) and others.
Models may also be constructed to improve our understanding of the
internal dynamics (processes) of the system (Ward, 1985; Fenicia et al.,
2008; Beven, 2012), even if there are observations of both the driving
forces and the response. The dilemma is that we need some observational
data to be able to develop and validate the model structure (McMillan et
al., 2011). A further problem lies in the reality that, for many
systems, and notably environmental systems, the observational data that
are available (including the driving forces) are often deficient in
terms of accuracy or representativeness and are therefore uncertain
(Beven, 2009; Westerberg and McMillan, 2015). Observational ‘data’ may
also refer to different things; some may be ‘hard’ quantitative data
(direct measurements), while some may be ‘soft’ qualitative data
(Winsemius et al., 2009). While the sources of uncertainty in hard and
soft data may be different, both are subject to errors (McMillan et al.,
2012) that will potentially influence the development of the model, or
the model results (Gan et al., 1997). From a hydrological perspective,
models may be developed on the basis of largely soft conceptual data
(classical hydrological process theory; Ward, 1984), tested in places
(or at times) where hard data are available, and applied in places and
times when there are few hard data, through a process of extrapolation
(parameter regionalisation, for example) using a combination of hard and
soft data (Seibert and McDonnell, 2002).
Models may be constructed in a way that largely ignores the internal
processes and concentrates on establishing a quantitative relationship
between the inputs (driving forces) and the output responses (Todini,
2011). Alternatively, models may be designed to explicitly simulate the
internal processes of the system, using different levels of complexity
(Chien and Mackay, 2014). Arguably, the latter require more
observational data if we wish to not only validate the responses, but
also the realism of the internal process simulations (Kirchner, 2006;
Euser et al., 2013). The issue of model complexity has been a recurring
theme in the hydrological modelling literature for many years
(Hrachowitz et al., 2013), and there have been arguments presented in
favour of both simple (or parsimonious) models, as well as more complex
models (Jakeman and Hornberger, 1993; Perrin et al., 2001). Arguably, a
simple model is easier to apply from a mathematical perspective,
particularly if the model has a short time step and many spatial
elements, and if automatic calibration (or uncertainty ensemble output)
methods are to be used. Clearly, a model with a small parameter space
defining the hydrological response characteristics of each spatial
element will take less time to run, and will probably converge to a
unique solution more quickly (less equifinality; Beven, 2006), than a
model with a larger parameter
space. For models that are applied with coarse spatial (sub-basins) and
time scales (monthly), the issue of model complexity becomes less of a
problem from a computer run time perspective, but the issue of
equifinality remains. It might also be argued that there is little point
in having a complex model structure if it is applied at coarse spatial
and temporal scales because all the individual hydrological processes
are subsumed in the total sub-basin response characteristics. However,
this argument relies on the assumption that the total response cannot be
decomposed into the sub-basin scale effects of individual processes.
There is evidence to suggest that this argument is false in at least
some regions and that it is possible to infer (or hypothesise) the
relative effects of different processes from the total basin response
(Clarke et al., 2009; Hughes, 2013, 2016). A model that includes,
implicitly or explicitly, the range of different hydrological processes
can be used to assess the validity of process hypotheses (Gallart et
al., 2007; Beven, 2012), through any number of different uncertainty
analysis methods (Pechlivanidis et al., 2011). This type of approach
would not be possible with a much simpler model structure where
processes are lumped together in model algorithms that are designed to
represent the total response, but not individual processes.
The detailed outputs of a more complex model can be compared to any data
(hard and soft) that might be available to support the presence and
importance, or even partially quantify, specific process activity. This
is an important point, given the ever increasing availability of global
data sets based on increasingly more sophisticated (and presumably more
accurate and representative) methods of collecting and processing remote
sensing data that can tell us something about, inter alia, vegetation,
evapotranspiration, soil and groundwater storage regimes
and their variation over space and time (Pekel et al., 2016; Lucey et
al., 2020; Sadeghi et al., 2020). The regional context of this
contribution is southern Africa, where it is hard enough to maintain
even basic hydrometeorological observation networks (rainfall and stream
flow), and therefore the likelihood of ever having detailed ground-based
observations that might help to resolve questions about process activity
is extremely remote. One motivation for more complex models is that we
wish to know whether we are modelling the response for the correct
reason (Kirchner, 2006), given the many different limitations and
uncertainties inherent in the model and available forcing data. Perhaps
the key questions are what is the value of this information, how would
we benefit from it, and why is it important to generate realistic
outputs for the right reason? Apart from the rather esoteric answer that
as scientists we want to know if we are correct, this knowledge could be
valuable for applying the same model in areas that are not gauged. This
introduces the other key theme that has perplexed hydrological modellers
for a number of decades: what are the best ways of transferring the
knowledge about a model and its functioning from gauged basins to
ungauged basins? There have been many contributions to this topic and
many suggestions for different approaches (Blöschl et al., 2013;
Hrachowitz et al., 2013). Perhaps the two main approaches are those
based on parameter regionalisation using basin physical properties and
their relationships with calibrated parameter values (Pokhrel and Gupta,
2009), and those based on the regionalisation of basin response indices
against which the ungauged basin model outputs can be compared, or
constrained (Westerberg et al., 2016; Kabuya et al., 2020; McMillan,
2020). There are also some model packages that provide direct methods of
calculating parameter values from basin physical properties. All of
these approaches rely upon some observational data associated with the
main driving variables (climate), total basin response and landscape
characteristics, which will inevitably be subject to uncertainties that
will impact on the validity of the parameter estimation methods and
simulation results.
The main purpose of this contribution is to unpack some of the
uncertainties associated with the observational data as well as the
model, and to explore how these uncertainties affect hypotheses about
the key hydrological processes that are active within different parts of
the basin. The real point is to investigate how this approach might be
useful in conjunction with an understanding of the links between process
activity and a conceptual interpretation of the landscape to help with
parameterising the model in ungauged areas. The term ‘landscape’ is used
here to represent the many different characteristics that might
influence the dynamics of the runoff response and includes topography,
vegetation, soils, geology, drainage pattern, etc. The term ‘conceptual’
assumes that the interpretation could be based on a mixture of both soft
(or subjective) and hard (numerical analysis of available data)
information. The geographic context is the Zambezi River basin in
southern Africa, where different climate zones are represented, where
data are typically scarce and often of unknown quality, but where well
informed water resource management decisions are required that
frequently rely on simulated information. The model is a version
(Hughes, 2013) of the Pitman (1973) monthly time-step model that has
been widely used in the region and is typically applied at relatively
coarse spatial scales (in a semi-distributed, sub-basin structure).
However, the principles of the approach are considered to be equally
applicable to any other model where individual hydrological processes
are represented either implicitly or explicitly.
THE PITMAN MODEL AND PROCESS INTERPRETATION
Most of the original structure (Pitman, 1973), as well as more recent
additions (Hughes, 2004; Hughes and Mazibuko, 2019) to the model have
been designed to represent processes explicitly (Figure 1), albeit at
the sub-basin scale, using approaches that are similar to the
probability distributed principle of Moore (1985). Given the rather
large parameter space (20 parameters covering the full range of natural
hydrological processes), any form of calibration (manual or automatic)
can become a daunting task and experience suggests (Hughes, 2013) that
to benefit from the explicit representation of the processes, it is
important to understand the conceptualisation of the model algorithms.
Figure 1 summarises the main model structure, while the following
sub-sections provide a little more detail. The model outputs include
some details of the simulations of the individual processes so that
these can be compared to any available observational data as well as the
total sub-basin output.
Interception and evapotranspiration
Interception depth is defined by a storage parameter that can vary
seasonally, while evapotranspiration losses are dependent upon soil
moisture storage, input values of potential evapotranspiration (PET) and
a parameter (0≤R≤1, with lower values implying higher relative actual
losses). Spatial and temporal variations in vegetation cover can be
readily obtained from satellite imagery such as Leaf Area Index (LAI) or
MODIS Normalized Difference Vegetation Index (NDVI) data, but these data
do not provide direct measures of interception loss.
Surface runoff
There are two methods of generating surface runoff in the Pitman model
(ISQ and SSQ in Figure 1). The first is effectively a saturation-excess
surface runoff process (Hughes and Mazibuko, 2018), while the second is
a function only of rainfall depth and represents an infiltration (or
absorption) excess surface runoff process. The key parameter of the
first function (SSR) defines the wetness value (ST*SSR) at which
saturation-excess runoff is initiated. This function was added to
account for the presence of relatively flat valley
bottom areas (Dambos) that remain wet during the dry seasons, due to
interflow from the surrounding hillslopes (von der Heyden, 2004). These
features are relatively straightforward to identify using Google Earth
imagery, while Lampitlaw and Gens (2006) refer to quantitative mapping
methods using satellite imagery and topographic analysis. Hughes and
Mazibuko (2018) demonstrated that the inclusion of this function
improved the seasonal distributions of simulated flows in catchments
where such landscape features are known to exist, but gave poorer
simulations if used in other areas. The second surface runoff function
uses a triangular distribution of catchment absorption rates defined by
two parameters and the area under the cumulative frequency curve for a
given rainfall depth represents the depth of surface runoff. There are
no data sources that can directly help with quantifying these
parameters.
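The principle behind the second surface runoff function can be sketched as follows. This is a minimal illustration only, not the model's own algorithm: it assumes a symmetric triangular distribution between two hypothetical parameters (here called z_min and z_max, in mm per month), and it integrates the cumulative frequency curve analytically up to the monthly rainfall depth.

```python
def surface_runoff_depth(rain_mm, z_min, z_max):
    """Runoff depth (mm) as the area under the cumulative frequency curve
    of a symmetric triangular distribution on [z_min, z_max], integrated
    from z_min up to the rainfall depth. Parameter names are hypothetical."""
    if z_max <= z_min:
        raise ValueError("z_max must exceed z_min")
    width = z_max - z_min
    z_mid = 0.5 * (z_min + z_max)
    if rain_mm <= z_min:
        # Rainfall below the lowest rate: no surface runoff.
        return 0.0
    if rain_mm <= z_mid:
        # Rising limb of the triangle: CDF = 2 (z - z_min)^2 / width^2.
        return 2.0 * (rain_mm - z_min) ** 3 / (3.0 * width ** 2)
    if rain_mm <= z_max:
        # Falling limb: CDF = 1 - 2 (z_max - z)^2 / width^2.
        return (rain_mm - z_mid) + 2.0 * (z_max - rain_mm) ** 3 / (3.0 * width ** 2)
    # Rainfall exceeds all rates: runoff is rainfall minus the mean rate.
    return rain_mm - z_mid


if __name__ == "__main__":
    for rain in (25.0, 50.0, 100.0, 150.0):
        print(rain, round(surface_runoff_depth(rain, 0.0, 100.0), 3))
```

Under these assumptions the runoff depth depends only on the rainfall depth and the two distribution parameters, which is why this function produces a different seasonal pattern from the moisture-dependent saturation-excess function.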
Interflow runoff
Interflow runoff depth is determined from a non-linear power
relationship (Figure 1; IQ) and there are no observational data that can
directly support the determination of the parameters (FT, SL and POW).
However, topography and soils data can at least point to the likelihood
that interflow is either a dominant or largely irrelevant process. The
key signal in the observed stream flow data is the shape of the wet
season recession, where slow (or fast) recessions suggest relatively
high (or low) proportions of interflow.
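A minimal sketch of a non-linear power relationship of this kind is given below. The functional form is an assumption (a commonly quoted Pitman-type expression), since the exact equation appears only in Figure 1: interflow is taken as FT multiplied by the relative storage above SL raised to the power POW, where the storage argument and the maximum storage ST are in mm and FT is the interflow depth at full storage.

```python
def interflow_depth(storage, st, sl, ft, power):
    """Interflow depth (mm per month) from an assumed power relationship:
    ft * ((storage - sl) / (st - sl)) ** power, and zero at or below sl.

    storage: current moisture storage (mm)
    st:      maximum moisture storage (mm)
    sl:      storage below which no interflow occurs (mm)
    ft:      interflow depth at maximum storage (mm per month)
    power:   exponent of the storage-runoff relationship (POW)
    """
    if st <= sl:
        raise ValueError("st must exceed sl")
    if storage <= sl:
        return 0.0
    # Cap the relative storage at 1 so that output never exceeds ft.
    relative = min((storage - sl) / (st - sl), 1.0)
    return ft * relative ** power


if __name__ == "__main__":
    # A higher power concentrates interflow in the wettest months.
    for pow_value in (1.0, 2.0, 4.0):
        print(pow_value, interflow_depth(100.0, 200.0, 0.0, 10.0, pow_value))
```

In a form like this, FT scales the overall volume while POW controls how quickly interflow declines as the storage drains, which is one reason why different combinations of the two can give similar recession shapes.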
Groundwater recharge and discharge to stream flow
The recharge function is the same form as the interflow function (Figure
1) and is routed through a conceptual groundwater storage (influenced by
drainage density and storativity parameters), while outflows to the
river channel are mostly determined by a transmissivity parameter and
the level of storage (used with drainage density to estimate the
hydraulic gradient towards the channel). An additional parameter defines
the proportion of the sub-basin area that represents the riparian strip
from which groundwater can be lost to evapotranspiration. Further
details of the structure and algorithms can be found in Hughes (2004).
Experience within South Africa suggests that the best information for
constraining some of the parameters comes from independent evaluations
of groundwater recharge rates and the geological characteristics of the
underlying aquifers (DWAF, 2005).
Water use functions
Apart from an option to account for large reservoirs, there are also
functions to allow for direct abstractions from the river, and for
storage and abstractions from distributed small dams. The available data
for quantifying storages and abstractions in southern Africa are
typically almost non-existent (or at least not available), while some
global data sets are available to quantify the maximum surface area of
water bodies (Pekel et al., 2016; Gonzalez-Sanchez et al., 2020) and
areas under irrigation (IFPRI, 2019), but converting these to useful
information on patterns of water use is also subject to a great deal of
uncertainty (discussed later).
Equifinality between and within the different process representations
The two surface runoff functions both determine the patterns of moderate
to high flows, but they have quite different seasonal distributions
because the first is driven by the sub-basin moisture status and
rainfall, while the second is driven only by rainfall. Resolving some of
the equifinality therefore relies on an assessment of the shape of the
wet season stream flow response, or clearly identifying the presence of
Dambo type features. The interflow and both surface runoff functions
partly determine the shape of the middle part of the flow duration
curves and it is never very straightforward to determine the most
appropriate parameter combinations. The simulation of low flow patterns
is affected by equifinalities between the interflow and groundwater
recharge functions, within the two functions (the interplay between the
scaling (FT and GW) parameters and their respective power parameters
(POW and GPOW), Figure 1), as well as between the recharge and the
amount lost to riparian evaporation. It is often possible to identify
signals in the observed stream flow data that can resolve at least some
of these equifinalities, but there almost always remains a quite broad
range of plausible parameter sets that produce similar responses. Low
flow simulations are also affected by equifinalities between the natural
hydrology functions and the impacts of distributed water use.
STUDY AREA and DATA
While there are many gauged sub-basins within the southern Africa region
that could be used, the focus is on the Zambezi River basin, largely
because this basin has recently been the subject of a model calibration
(Hughes et al., 2020) and climate change assessment (Hughes and
Farinosi, 2020) study conducted under the auspices of the African Union
- NEPAD African Network of Centres of Excellence on Water Sciences and
Technology - ACEWATER phase 2 project. The primary objective of those
studies was to achieve an acceptable calibration of the model across the
76 defined sub-basins (Figure 2) and to investigate the range of
uncertainties in the water resources availability in the future. While
they were not ignored, there was less focus on the likely realism of the
modelled processes or the observational data uncertainties, which are
the main concern of this paper.
The Zambezi River basin covers a total area of some 1 350 000 km² and
has eight riparian countries (Angola, Botswana, Malawi, Mozambique,
Namibia, Tanzania, Zambia and Zimbabwe).
The rainfall is highly seasonal and occurs mostly in the summer months
between October and March. Annual rainfall amounts vary from about
1 200 mm y⁻¹ in the upper areas of the Shire and Kafue sub-basins, to
less than 700 mm y⁻¹ in the semi-arid sub-basins of Zimbabwe (Hughes et
al., 2020). There are a number of
gauging stations in the basin, some in the headwater areas and others on
the main rivers. This study concentrates on 18 headwater gauged
sub-basins (Figure 2, Tables 1 and 2) most having records dating back to
about 1960. They have been selected to represent the range of climate
conditions, as well as the type and range of uncertainties that are
expected to exist in the observational data that are available to assist
with establishing behavioural model set ups. Table 1 provides the
gauging station details, but the remainder of the paper refers to these
sites using the model setup sub-area names given in Figure 2 and the
first column of Table 1. Additional information about these sub-basins
is contained within the results section, where it is considered relevant
to the interpretation of the model outputs.
While many of the main tributaries are gauged, the Zambezi River basin
is typical of many other parts of southern Africa in that it is largely
a data scarce region, particularly with respect to local climate data.
Even the available stream flow data contain a number of uncertainties,
partly related to possible rating curve problems, and partly related to
periods of missing data (Hughes et al., 2020). The original model was
forced with the University of East Anglia, Climate Research Unit data
(https://crudata.uea.ac.uk/~timm/grid/CRU_TS_2_1.html, accessed
during Oct. 2019), available from 1901 to 2017 at a grid scale of
0.5° (Harris et al., 2014). Additional rainfall data
(for the same period and spatial resolution) from the University of
Delaware (UNIDEL; Willmott and Matsuura, 2001) were used to assist with
identifying key rainfall data uncertainties. Both of these rainfall
products are based on extrapolation from sparse ground stations and are
expected to contain large uncertainties, particularly in the
representativeness of individual monthly rainfall depths. Comparisons
between them suggest that in most places they agree quite well, but
there are some of the Lake Malawi/Nyasa sub-basins where there are
substantial differences in the mean annual rainfall suggested by the two
datasets (Table 2).
The potential evaporation (PET) data are based on the LISVAP
calculations (Alfieri et al., 2019) using the ERA5 data for 1979 to 2018
(https://confluence.ecmwf.int/display/CKB/ERA5+data+documentation,
accessed during Oct. 2019), which are also expected to contain a number
of uncertainties. However, given that the Pitman model uses a single
annual PET depth and a fixed seasonal distribution for each sub-area,
the main data uncertainties are expected to be in the mean annual values
and the uncertainty range in the model has been set to ±10% of the
LISVAP values. Estimates of LAI are expected to be useful for
constraining simulated interception depths (annual means, seasonal
distributions and even time series values). The major uncertainties are
not expected to be in the conversion of LAI into depths of interception
for a given climate regime (De Groen and Savenije, 2006; Wu et al.,
2019; Návar, 2020). The LAI data (Mao and Yan, 2019) used in the study
are long-term (1981 to 2015) monthly means
(https://daac.ornl.gov/VEGETATION/guides/Mean_Seasonal_LAI.html,
accessed during July 2020) and the seasonal range for all sub-basins
used in this study (plotted against their aridity index), as well as
some sample seasonal distributions are given in Figure 3. MODIS actual
evapotranspiration data (AET) could help with partially resolving some
of the annual or long-term water balance (stream flow = rainfall –
evaporative losses) uncertainties. However, the MODIS AET data are
themselves subject to uncertainties (Velpuri et al., 2017) related, in
part, to the availability of local climate data, as well as the
interpretation of vegetation reflection signals.
Groundwater recharge data are potentially very useful for resolving some
of the equifinalities between simulated interflow and groundwater
contributions to stream flow, and estimates for the different geological
and climate zones of the basin are available from the British Geological
Survey (MacDonald et al., 2012). However, the level of uncertainty is
largely unknown as it is not very clear how the estimates were derived.
Some remotely sensed soil moisture data were investigated during this
study. Although these were not expected to be useful for constraining or
checking the simulated soil moisture storage regime (largely due to the
shallow depth of penetration of the sensors), it was considered that the
data could be useful to identify landscape features (such as Dambos)
that have different patterns of near surface moisture storage to other
areas and therefore assist with setting the parameter of the saturated
surface runoff function. To test this hypothesis, we used the European
Space Agency (ESA) Climate Change Initiative Soil Moisture dataset
(ESA-CCI v0.47: https://www.esa-soilmoisture-cci.org/node/238, accessed
during August 2020; Dorigo et al., 2017; Gruber et al., 2017, 2019) and
the NASA-JPL Soil Moisture Active Passive (SMAP) products (Level 4, 9 km:
https://nsidc.org/data/SPL4SMAU/versions/5; and Level 2, 3 km:
https://nsidc.org/data/SPL2SMAP_S/versions/2, accessed during
August 2020; Das et al., 2019).
Water use data are notoriously difficult to obtain in most parts of
southern Africa, but some indications of agricultural water use can be
obtained from GIS analysis of land use data (IFPRI, 2019) to identify
areas of irrigation. The uncertainties lie in the accuracy of the
remotely sensed land use data as well as any assumptions made about
irrigation application rates. Similarly, it is not always clear where
the water is obtained from (reservoir, run-of-river or groundwater
supplies). There are data available on the maximum surface area of
reservoirs, including quite small farm dams (Pekel et al., 2016;
Gonzalez-Sanchez et al., 2020); however, translating the areas into
storage volumes is highly uncertain (Hughes and Mantel, 2010; Busker et
al., 2019), as is defining the contributing catchment areas of the dams.
This issue is particularly relevant to the Zimbabwe sub-basins (Figure 2
and Table 2).
METHODS of ANALYSIS
The main approach to this study has been to use an uncertainty version
of the Pitman model to explore different parameter combinations that
generate similarly ‘good’ reproductions of the observed streamflow
response. The version of the model used allows for any or all of the
parameter inputs to be defined by minimum and maximum values, which are
independently randomly sampled (uniform distribution) during each of
(typically) 10 000 ensemble runs. The parameter values, a range of
summary statistics (e.g. mean monthly values of runoff volume, recharge
depth and depth of the four main modelled processes) and goodness-of-fit
statistics (objective functions) for each ensemble are part of the model
outputs. To prevent a single objective function statistic from dominating
the selection of ‘good’, or behavioural, simulations, a simple combined
statistic (CS) is used that combines the Nash coefficient of efficiency
values (CE) and the % bias in mean monthly runoff (%Bias), based on
untransformed and natural log (ln) transformed values.
\(CS = CE + CE(ln) + 2 - \left|\frac{\%Bias}{100}\right| - \left|\frac{\%Bias(ln)}{100}\right|\) (Equation 1)
The maximum value is 4.0 for a perfect fit, while behavioural ensembles
can be selected as those that have CS values greater than (say) 95% of
the highest (best fit) value for the whole ensemble set.
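The combined statistic and the behavioural selection rule can be sketched as below. This is a minimal illustration and not the model's own code: the form of CS used is the one that yields 4.0 for a perfect fit, as stated above, and the calculation of %Bias(ln) from the means of the ln-transformed flows is an assumption. The function names are hypothetical, and flows must be strictly positive for the ln transformation.

```python
import math


def nash_ce(obs, sim):
    """Nash coefficient of efficiency (CE); 1.0 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev


def pct_bias(obs, sim):
    """Percentage bias of the simulated mean relative to the observed mean.
    Undefined if the observed mean is zero."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    return 100.0 * (mean_sim - mean_obs) / mean_obs


def combined_statistic(obs, sim):
    """CS for one ensemble member (maximum 4.0 for a perfect fit)."""
    ln_obs = [math.log(o) for o in obs]
    ln_sim = [math.log(s) for s in sim]
    return (nash_ce(obs, sim) + nash_ce(ln_obs, ln_sim) + 2.0
            - abs(pct_bias(obs, sim)) / 100.0
            - abs(pct_bias(ln_obs, ln_sim)) / 100.0)


def select_behavioural(cs_values, fraction=0.95):
    """Indices of ensemble members whose CS exceeds `fraction` of the best
    CS. Assumes the best CS is positive."""
    threshold = fraction * max(cs_values)
    return [i for i, cs in enumerate(cs_values) if cs > threshold]


if __name__ == "__main__":
    observed = [12.0, 30.0, 55.0, 20.0, 8.0]
    simulated = [10.0, 33.0, 50.0, 22.0, 9.0]
    print(round(combined_statistic(observed, simulated), 3))
    print(select_behavioural([3.563, 3.40, 3.30, 2.0]))
```

With a best CS of 3.563, for example, the 95% rule gives a threshold of about 3.38.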
The methods are therefore simple, but the process of setting appropriate
parameter ranges and interpreting the results is often more complex,
particularly when many parameters are set to be uncertain in the same
run. Previous experience (Hughes, 2016) therefore suggests that several
runs of the model focussing on different groups of parameter interaction
(or different process components of the water balance) are frequently
necessary to be able to explore the equifinalities in detail. In the
context of this Special Issue of the journal, the possible effects of
uncertainties in either the forcing climate data or the observed stream
flow data are also explored, as well as the value of any other hard or
soft observational data (referred to in the previous section) that can
be used to resolve some of the equifinalities. The latter would
typically be used to either constrain some of the parameter ranges, or
exclude ensemble members that do not generate outputs consistent with
the data.
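The two ways of using additional observational data described above (narrowing a parameter range before sampling, or excluding ensemble members afterwards) can be sketched as follows. This is an illustrative outline only: the parameter ranges, the dictionary layout and the function names are hypothetical, and the model run that would produce outputs such as recharge depth is not represented.

```python
import random


def sample_parameter_sets(ranges, n_runs, seed=None):
    """Draw n_runs parameter sets, sampling each parameter independently
    from a uniform distribution between its (minimum, maximum) bounds."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_runs)
    ]


def filter_members(members, key, lower, upper):
    """Keep only members whose value for `key` lies within [lower, upper].
    `key` may be a sampled parameter or a stored model output."""
    return [m for m in members if lower <= m[key] <= upper]


if __name__ == "__main__":
    # Hypothetical prior ranges for three of the model parameters.
    prior_ranges = {"SSR": (0.3, 0.9), "FT": (0.0, 30.0), "POW": (1.0, 4.0)}
    ensemble = sample_parameter_sets(prior_ranges, 10000, seed=42)

    # Option 1: exclude members inconsistent with independent evidence,
    # here an assumed plausible interval for SSR.
    constrained = filter_members(ensemble, "SSR", 0.55, 0.65)
    print(len(ensemble), len(constrained))

    # Option 2: narrow the sampled range itself and re-run the ensemble.
    narrowed = dict(prior_ranges, SSR=(0.55, 0.65))
    ensemble_2 = sample_parameter_sets(narrowed, 10000, seed=42)
    print(min(m["SSR"] for m in ensemble_2), max(m["SSR"] for m in ensemble_2))
```

Filtering after the run keeps the original ensemble intact and allows several constraints to be compared, whereas narrowing the range concentrates all of the runs within the constrained space.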
RESULTS
There is insufficient space to present the full results for all 18
sub-basins, and some sub-basins are presented in more detail to
represent specific elements of uncertainty, while some of the pertinent
details of the simulations are presented in Table 3 for all sub-basins.
Arguably KAF4 represents the sub-basin with the least amount of
uncertainty in the observational data used, and apart from the generic
uncertainties in the rainfall, interception and evapotranspiration input
data, a key issue is the extent of Dambo occurrence and the effects on
saturated surface runoff. The maximum CS value within the 10 000
ensembles is 3.563, close to the optimal value of 4, and all those
(98) with CS values greater than 3.38 (approximately 95% of the maximum),
and with no %Bias or %Bias{ln} values outside the range of ±5.0, were
accepted as behavioural. There are very few
differences between the minimum and maximum parameter values within the
behavioural ensembles compared to the total ensemble set, implying a
high degree of equifinality in the model, as is normally the case with a
model with so many parameters. The main differences are that the maximum
behavioural interception, saturated surface runoff and recharge
parameters are somewhat less than the maximum input values. The runoff
ratio for the behavioural ensembles lies between 10.2% and 11.3%,
while the full ensemble set range is 2.4% to 22.1%. The implication
is that the depth of AET (the main determinant of the overall water
balance together with rainfall, which is not considered uncertain for
this sub-basin) is relatively insensitive to uncertainties in the PET
observational data (assumed to be ±10% of the available estimates).
Further analysis of the simulations of interception was achieved by
setting only the interception and evapotranspiration parameters as
uncertain. The results confirmed that the overall model fit is almost
totally insensitive to the simulated interception depth (in the range of
56.4 to 175.1 mm y⁻¹) and that higher interception is
compensated for by less effective evapotranspiration from the moisture
store (and vice versa). The main impact is a slight shift forwards in
time in the seasonal distribution of simulated stream flow for the
higher interception depths.
The ranges of mean annual groundwater recharge values are 8.8 to 69.2 mm
and 7.2 to 154.8 mm for the behavioural and total ensemble sets,
respectively. The BGS values for this part of the Zambezi are between
109 and 146 mm, clearly suggesting that the available observational data
are too uncertain to constrain the model. There is a wide range of
possible combinations of individual processes within the behavioural
ensemble set, and no clear differences between those with high and low
input PET values, suggesting that uncertainties in the PET data have a
low impact on the simulation of individual processes. A relatively
simple analysis of Google Earth images to approximately quantify the
surface area of Dambo features (Figure 4a), suggests that their maximum
area is ~15% of the total sub-basin area. Figure 4b
shows the relationship between relative moisture content and saturated
area calculated by the model for different values of the SSR parameter.
The Google Earth observational data suggest that this parameter could be
constrained to between about 0.55 and 0.65, allowing for quite high
uncertainty in the interpretation of the Google Earth images. This
reduces the behavioural ensemble set to 52, but has little impact on the
possible combinations of individual processes. Some of the grid cells of
the SMAP_L4 9 km, 3-hourly soil moisture data showed characteristics that
might be expected from the presence of Dambos (more consistently wet
during the wet season and slower drying into the dry season, for
example), and most of these could be linked to areas that can be
identified as having a high density of Dambos. However, there are other
areas where Dambos are clearly visible on Google Earth that do not show
the same patterns in the soil moisture data. Part of the problem may be
related to the spatial resolution and part to the shallow depth of the
observational soil moisture data sample. The ESA product has a
resolution of a quarter degree, which is too coarse to identify Dambo
areas, while the highest resolution data (SMAP_L2/Sentinel 1A/B, 1 and
3 km) are available only for scattered portions of the basin and only
every few days, making it difficult to clearly identify signals of
Dambo presence. Furthermore, the limited data available for the
higher spatial resolution soil moisture data showed very little
variation across the sub-basin. Similar conclusions were reached for the
other sub-basins and the soil moisture data, in their current stage of
development, were not found to be useful for constraining the model or
resolving any uncertainties in process simulations.
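The screening step described above for KAF4, in which the Dambo area estimate narrows the SSR parameter to between about 0.55 and 0.65, amounts to a simple filter on the behavioural ensemble set. A minimal Python sketch follows; the record layout, the function name and the sampled SSR values are hypothetical illustrations and are not taken from the SPATSIM implementation.

```python
# Hypothetical ensemble records: each has an id and a sampled SSR value.
ensembles = [
    {"id": 1, "SSR": 0.48},
    {"id": 2, "SSR": 0.57},
    {"id": 3, "SSR": 0.63},
    {"id": 4, "SSR": 0.71},
]

def constrain_by_parameter(members, name, lower, upper):
    """Keep ensemble members whose parameter lies within [lower, upper]."""
    return [m for m in members if lower <= m[name] <= upper]

# Range suggested in the text from the Google Earth Dambo area estimate.
constrained = constrain_by_parameter(ensembles, "SSR", 0.55, 0.65)
constrained_ids = [m["id"] for m in constrained]
print(constrained_ids)
```

The same filter could be applied to any other parameter for which independent observational evidence suggests a narrower range.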
Figure 5 shows the partitioning of total runoff for two ensembles (low
and high recharge) and there are clearly substantial differences in the
way in which the model can simulate the observed stream flow response,
that are largely independent of any of the observational data
uncertainties. No uncertainties in the observed stream flow data have
been included, largely because there are no stage-discharge rating data
readily available upon which to base quantitative estimates. However,
they are expected to be low relative to other sub-basins and the main
impact would be simply to increase the number of ensemble members
considered to be behavioural. The other conclusions for this site would
not substantially change.
KAF11 is similar to KAF4 except that the extent of Dambo features
appears to be much less, and there are additional uncertainties
associated with water use for mining and irrigation (mostly from direct
river abstractions). The patterns within the behavioural ensembles are
similar to KAF4, although the SSR parameters are generally much higher,
consistent with fewer Dambo features, while the recharge values tend to
be higher (42 to 100 mm y-1). The runoff ratio varies
between 20.5% and 22.5%, which might reflect the smaller size, and
more headwater location, of KAF11 relative to KAF4.
BAR3, BAR4, BAR7 and CHB2 represent the sub-basins of the upper Zambezi
River and, apart from BAR3, are mostly underlain by deep Kalahari sand
deposits. The results for BAR3 are very similar to KAF4, with
behavioural runoff ratios of 9.1% to 10.0%, and a recharge range of 24
to 70 mm y-1. The uncertainty in the distribution of
process contributions is also similar to that shown in Figure 5. During
the initial calibration of the model (Hughes et al., 2020) acceptable
simulations for BAR7 and CHB2 could not be achieved. It was also
concluded that the observed stream flow data for BAR7 were erroneous as
they show much higher low flows than at BAR5, BAR6 and ZAM1 further
downstream and below the Barotse floodplain (within BAR5; Figure 2). The
application of the uncertainty version of the model suggests that
acceptable simulations are obtainable at the sub-basin outlets, while
the mismatch with observed data downstream remains a major source of
uncertainty in either the observational data, or the model (including
the simulation of the wetland impacts of the Barotse floodplain), or
both.
The runoff ratios for BAR4 are much lower (5.4% to 6.2%), and
surprisingly the behavioural recharge range is only 12 to 38 mm
y-1, contrary to the expectation that groundwater
would play an important role in the area underlain by Kalahari sands.
The model simulates the majority of the low flows as interflow in all of
the behavioural simulations, despite all the groundwater parameters
having wide enough input ranges. More consistent with expectations is
the low contribution made by saturated surface runoff (no clear
indications of Dambos). In contrast, BAR7 is totally dominated by
groundwater in the small number of behavioural ensembles (Table 3), with
a narrow range of recharge values of 151 to 204 mm
y-1, and higher runoff ratios (16.7% to 17.1%). CHB2
has an overall much worse fit to the observed data and very few
behavioural ensembles, low runoff ratios (3.8% to 4.2%) and recharge
between 19 and 30 mm y-1. It is also more dominated by
groundwater outflow contributions (52 to 89% of total flow), and is
therefore similar to BAR7. One possible check on the simulations of the
sub-areas dominated by Kalahari sands, and especially the high low flows
and low high flows suggested by the observed data at BAR7, is to check
the downstream simulations at BAR6 (which are also consistent with the
observed data at ZAM1 and ZAM2 further downstream). However, this
assumes that the dynamics of the Barotse floodplain are simulated
appropriately. Unfortunately, the model is not able to simulate the high
flows, as well as the delayed peak in the wet season evident from the
observed flows at BAR6 (Figure 6), despite quite good simulations for
more than 50% of the upstream area (BAR3, BAR4, BAR7), and the fact
that all the evidence suggests that most of the ungauged sub-areas
(BAR1, BAR2 and BAR5) are unlikely to generate much higher wet season
flows (underlain by Kalahari sands). While Figure 6 illustrates that the
wetland sub-model is able to account for some of the peak flow delays,
this is achieved (as might be expected) at the expense of the peak
flows. The uncertainty issues therefore remain unresolved; are the
observed data and new simulations at BAR7 behavioural, and the main
problem associated with the wetland simulations, or are the observed
data and simulations at BAR7 wrong, thus preventing the wetland
sub-model from achieving a realistic downstream simulation?
For the semi-arid Zimbabwe sub-basins, the CS values in Table 3 only use
the CE and %Bias values because the values based on log transformed
flows are often misleading due to the large number of zero and very low
flows. The selection of behavioural ensembles is further limited to those
that have similar numbers of zero flow months to the observed data.
These sub-basins are also impacted by water use (mostly agricultural,
but some urban and mining supplies). The estimates from the
observational data (see also Hughes and Farinosi, 2020) are assumed to
be relatively uncertain and the input parameter ranges have been set at
±20% of the expected values. The real values could also be
non-stationary over the gauging period (starting in the late 1950s),
adding another source of unknown uncertainty.
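The selection criteria described above for the semi-arid sub-basins (CE and %Bias on untransformed flows, plus a similar number of zero flow months) can be sketched in Python as follows. This is an illustration only: the way the components are combined into the CS statistic is not reproduced here, the sign convention for %Bias and the tolerance used to judge "similar" zero flow frequencies are assumptions, and the flow values are invented.

```python
def nash_efficiency(obs, sim):
    """Nash-Sutcliffe coefficient of efficiency (CE) for paired flows."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def percent_bias(obs, sim):
    """Percentage bias of simulated volume (assumed sign: sim minus obs)."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)

def zero_flow_fraction(flows, threshold=0.0):
    """Fraction of months with flow at or below the threshold."""
    return sum(1 for q in flows if q <= threshold) / len(flows)

def similar_zero_flows(obs, sim, tolerance=0.1):
    """True if zero flow fractions differ by no more than the tolerance."""
    return abs(zero_flow_fraction(obs) - zero_flow_fraction(sim)) <= tolerance

# Invented monthly flows for illustration.
obs = [0.0, 0.0, 4.0, 10.0, 6.0, 0.0]
sim = [0.0, 0.5, 5.0, 9.0, 5.0, 0.0]

ce = nash_efficiency(obs, sim)
bias = percent_bias(obs, sim)
keeps_zero_flows = similar_zero_flows(obs, sim, tolerance=0.1)
print(round(ce, 3), bias, keeps_zero_flows)
```

In this invented case the simulation has good CE and %Bias values but would still be rejected, because one of the three observed zero flow months is simulated as a small non-zero flow, which is the situation reported for MAZ2.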
The behavioural ensembles for GWA3 do not have substantially different
parameter ranges than the full input range, despite there only being 11
ensembles accepted, further reinforcing the high level of equifinality
in the model structure. An exception is that the lower range of the
input PET values is not included in the behavioural set. The BGS
recharge estimates (> 60 mm y-1) are far
greater than the range of 2 to 17 mm y-1 simulated by
the model. The runoff ratio range is 4.5% to 5.3%, consistent with
semi-arid conditions and some water use. GWA4 has quite a large amount
of water use and this is reflected in much lower runoff ratios of 1.2%
to 1.6%, while the minimum recharge estimates are higher than GWA3 (a
range of 9 to 19 mm y-1). There is a weak positive
relationship for both GWA3 and GWA4 between the parameters determining
low flows (FT, POW, GW and GPOW) and the amount of assumed water use,
suggesting some impacts of uncertainty in observational data on water
use. For MAZ2 it was not possible to reproduce the observed number of
zero flow months (42%) within the ensembles with the best CS values.
While the simulated dry season flows are very low, they are not
actually zero. This is one of the more recent observed stream flow
records (2003 to 2017) and this result may be a reflection of hidden
uncertainties in some of the other sites related to the non-stationarity
of the water use data. The range of runoff ratios is between 12.9% and
13.9%, while simulated recharge is between 38 and 53 mm
y-1, both of which can be considered high for this
semi-arid sub-basin. This is one of the few sub-basins where a single
parameter (GW) has a much reduced range (15 to 20 mm
month-1) compared to the full input range (2 to 20 mm
month-1). One of the main problems with MAP2 is the
fact that the total stream flow record (1951 to 2017) shows a great deal
of non-stationarity, with a wetter period up to about 1984 and a much
drier period with less frequent and generally lower flows from 1984
onwards. There is some evidence to suggest that the main wet season
rainfalls were lower in the later period, while a contributory effect
may be changes in land and water use. The extent to which this affects
the model's interpretation of the dominant processes is difficult to
determine without more reliable information. The main difference between
this sub-area and the previous semi-arid ones is the low combined
contribution of interflow and groundwater outflow. MAP3 has a range of
runoff ratios of 5.0% to 6.9% and recharge depths of 1.8 to 20.5 mm
y-1, and appears to be dominated by adsorption excess
surface runoff. However, the minimum values for the other processes
given in Table 3 are not very representative of all the behavioural
ensembles, which tend to have greater proportions of saturated excess
surface and groundwater runoff. MAP4 is quite similar to MAP3 with
slightly higher runoff ratios, but with a maximum recharge depth of 36
mm y-1 amongst the behavioural ensembles. As with the
previous Zimbabwe sub-areas the observed stream flow data are
non-stationary with lower overall discharge volumes in the second half
of the record (from the mid 1980s).
The Lake Malawi/Nyasa sub-basins are subject to rainfall and observed
stream flow data uncertainties to varying degrees (Table 2). MODIS
actual evapotranspiration data (AET: 2000 to 2014) has been used to try
and resolve some of the uncertainties in the rainfall data, by comparing
both CRU and UNIDEL mean annual rainfall data with the values derived
from a simple water balance of observed stream flow depth plus MODIS AET
depth (Table 4). However, this approach also has to take into account
the potential uncertainties in the observed stream flow data (Table 4,
‘Comments’ row), as well as any differences related to the choice of the
period used for the water balance checks (determined by the available
stream flow data). Despite these additional uncertainties, the decision
to use either CRU or UNIDEL rainfall data for RUK2 and RUH2, was quite
clear, while either rainfall data set appears to be suitable for RUH1
and NAM1. For RUK1 and RUK3, neither rainfall data set appears to be
suitable and both would need to be scaled to achieve a similar value to
the water balance derived estimate.
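The water balance check described above can be written out as a short Python sketch. It assumes that long-term storage changes are negligible, so that mean annual rainfall is approximated by observed stream flow depth plus MODIS AET depth. The function names and all numerical values are hypothetical and are not the values reported in Table 4.

```python
def water_balance_rainfall(runoff_depth_mm, aet_depth_mm):
    """Rainfall implied by a simple long-term water balance (P = Q + AET)."""
    return runoff_depth_mm + aet_depth_mm

def rainfall_scaling_factor(product_rainfall_mm, target_rainfall_mm):
    """Factor needed to scale a rainfall product to the target mean annual value."""
    return target_rainfall_mm / product_rainfall_mm

def runoff_ratio(runoff_depth_mm, rainfall_mm):
    """Mean annual runoff depth as a fraction of mean annual rainfall."""
    return runoff_depth_mm / rainfall_mm

# Invented mean annual depths (mm) for illustration only.
observed_runoff = 400.0
modis_aet = 900.0
product_rainfall = 1000.0

target = water_balance_rainfall(observed_runoff, modis_aet)
scale = rainfall_scaling_factor(product_rainfall, target)
ratio_before = runoff_ratio(observed_runoff, product_rainfall)
ratio_after = runoff_ratio(observed_runoff, target)
print(target, scale, ratio_before, round(ratio_after, 3))
```

With these invented numbers the unscaled product implies a runoff ratio of 40%, which falls to about 31% once the rainfall is scaled to the water balance estimate. Any error in the observed stream flow or the AET data propagates directly into the target rainfall.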
For RUK1 the initial runs with the CRU rainfall data generate
behavioural simulations that consistently under-estimate the higher
flows in the flow duration curve and have runoff ratios that are over
40%, a very high value even for a topographically steep area. The model
was re-run with UNIDEL rainfall scaled to generate a mean annual value
of ~1 380 mm y-1 (Table 4), after
which the runoff ratio varies from 26.8% to 29.7%, the number of
behavioural ensembles increases substantially and high flows are better
estimated. It was, however, necessary to adjust the input parameter
ranges to account for the much higher rainfall (notably the maximum soil
moisture content was increased and the maximum values of the interflow
and recharge parameters, FT and GW, were reduced). The two entries in
Table 3 for this sub-area indicate that the effects of the input data
uncertainties on the modelled processes are evident, with surface runoff
normally playing a more important role in the UNIDEL forced simulations.
The range of possible recharge depths is also greater in the UNIDEL
simulations (29 to 144 mm y-1), compared to the CRU
forced simulations (72 to 139 mm y-1). The initial
uncertainty model runs for RUK2 did not yield ensembles that had as good
statistics as the original manual calibrations (Hughes et al., 2020),
suggesting that the manual calibration parameters were a relatively
unique combination that could not be found even with 10 000 total
ensembles. Reducing the range of some of the parameter inputs made a
substantial difference and generated 85 behavioural ensembles with
runoff ratio and recharge ranges of 11.9% to 14.0% and 13.4 to 40.8 mm
y-1, respectively. While most of the full ranges of
the input parameters are represented in the behavioural ensembles, the
lower estimates of PET were not. Despite increasing the rainfall input
to RUK3 (Table 4), the runoff ratios remain extremely high (50.4% to
57.0%), while the simulated recharge is also very high (120 to 305 mm
y-1), and even if the UNIDEL rainfall data are used
(Table 4), the runoff ratios remain at greater than 40%. The
distribution of runoff generation processes remains similar to the other
Lake Malawi/Nyasa sub-basins, although the high contribution of
intensity excess surface runoff is more consistent across the ensembles
than in other sub-basins. There remains a large amount of uncertainty in
the input climate data, as well as the response characteristics of this
sub-basin.
The runoff ratio range for RUH2 is high at 38.2% to 43.0%, with a
recharge range of 65 to 312 mm y-1, and as with some
other sub-basins the lower PET estimates do not seem to be valid. Both
of these values are high and the estimated recharge is quite close to
the BGS estimates (~146 mm y-1). RUH1
is downstream of RUH2 and in order to search for behavioural ensembles
independently of the effects of RUH2, the parameters of RUH2 are fixed
at values representing one of the best ensembles. RUH1 has the highest
number of behavioural ensembles and the highest maximum CS value of all
the sub-areas. The range of runoff ratios is 21.9% to 26.9%, and
recharge varies from 16.8 to 163 mm y-1 (a wide range
that also includes the BGS estimate). Table 3 illustrates that
behavioural results can be obtained with a mix of different processes
and this is further illustrated in Figures 7 and 8, and Table 5, based
on four selected ensemble members. They are those with the lowest and
highest proportion of saturated area surface runoff, and the lowest and
highest proportion of total low flow processes (interflow and
groundwater runoff). Generally, a decrease in saturated area surface
runoff is associated with increases in both intensity (adsorption)
excess surface runoff and interflow, while groundwater runoff is quite
stable across all ensemble members (Figure 8). The range of process
proportions amongst the ten best ensemble members is much lower, with
both surface runoff processes being ~25%, interflow
between 6% and 18%, while groundwater runoff is 33% to 40%. A closer
inspection of all the simulated years (1991 to 2008), suggests that
different behavioural ensemble members perform better in some years than
others. To what extent this can be associated with any uncertainties in
the accuracy of the input climate or observed stream flow data, or is
just part of the overall modelling uncertainty, is almost impossible to
resolve without more information. For NAM1 the first noticeable effect
is the consistently low values of the saturated surface runoff parameter
(SSR) within the behavioural ensembles, and this result is appropriate
given clear evidence of Dambo features. The range of runoff ratios is
9.5% to 11.1%, and that of recharge depths is 4.8 to 30.0 mm
y-1, both of which are quite similar to the results
for RUK2 (in a similar geographic location). As already noted for some
of this group of sub-basins, the lower estimates of PET are not included
in the behavioural ensembles.
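The selection of the four RUH1 ensemble members used for Figures 7 and 8 (lowest and highest saturated area surface runoff, and lowest and highest total low flow contribution) can be illustrated with the following Python sketch. The member identifiers, field names and process proportions are invented for illustration and do not correspond to the values in Table 5.

```python
# Invented process proportions (fractions of total runoff) per member.
members = [
    {"id": "A", "sat_surface": 0.10, "intensity_surface": 0.30,
     "interflow": 0.25, "groundwater": 0.35},
    {"id": "B", "sat_surface": 0.40, "intensity_surface": 0.15,
     "interflow": 0.10, "groundwater": 0.35},
    {"id": "C", "sat_surface": 0.25, "intensity_surface": 0.35,
     "interflow": 0.05, "groundwater": 0.35},
    {"id": "D", "sat_surface": 0.20, "intensity_surface": 0.10,
     "interflow": 0.30, "groundwater": 0.40},
]

def low_flow_share(member):
    """Combined proportion of interflow and groundwater runoff."""
    return member["interflow"] + member["groundwater"]

selected = {
    "lowest_sat": min(members, key=lambda m: m["sat_surface"])["id"],
    "highest_sat": max(members, key=lambda m: m["sat_surface"])["id"],
    "lowest_low_flow": min(members, key=low_flow_share)["id"],
    "highest_low_flow": max(members, key=low_flow_share)["id"],
}
print(selected)
```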
Tables 2 and 4 both refer to some of the uncertainties in the observed
stream flow data for the Lake Malawi/Nyasa sub-basins, but these have
largely been ignored in the presentation of the results. The main reason
is that the differences between certain parts of the records are far too
large to be considered just uncertainties. Figure 9 illustrates the
problem in three sub-basins, and they all have substantial periods of
missing data within them. The majority of the total record (1957 to
2009) for NAM1 shows a consistent response and it is only the last 9
years where all flows are higher by a factor of ~4.7 on
average, a figure too high to attribute to any reasons apart from
errors. A similar situation arises for RUH1 (total record of 1972 to
2018), where the stream flows for the last 6 years are much higher,
after an extended period of 5 years of almost totally missing data. One
of the main problems with RUH1 is that almost all of the low flow months
are missing after 2012. For RUK3 it is the earlier part of the record
where the problems exist, while several years in the middle (1981 to
1994) and the later (2003 to 2018) period are consistent with each
other. This may be related to an error in the data records and failure
to convert stage observations in feet to metres. This error has been
noted in some other early Tanzanian records, and is quite simple to fix
if the raw observational data and rating curve information are made
available (which they are typically not). However, there is an
additional uncertainty issue in RUK3 and the model fit for the later
period is far better than the middle period, the former having a maximum
CS value of 3.148, compared to 1.223 for the middle period (CS = 3.112
for the total period used in the model). The implication is that very
different results would be obtained if these two periods were used
separately.
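The possible feet-to-metres error mentioned above can be illustrated with a short Python sketch. The power-law rating curve form and its parameters (a, h0, b) are hypothetical, since the actual rating information for RUK3 is not available; only the conversion factor (1 ft = 0.3048 m) is exact.

```python
FEET_TO_METRES = 0.3048  # exact definition of the international foot

def stage_feet_to_metres(stage_ft):
    """Convert a stage reading from feet to metres."""
    return stage_ft * FEET_TO_METRES

def rating_discharge(stage_m, a=10.0, h0=0.0, b=2.0):
    """Power-law rating Q = a * (h - h0) ** b; parameters are hypothetical."""
    effective = max(stage_m - h0, 0.0)
    return a * effective ** b

stage_recorded = 3.0  # reading taken in feet but stored without units

# Discharge if the reading is wrongly treated as metres.
q_uncorrected = rating_discharge(stage_recorded)
# Discharge after converting the reading to metres first.
q_corrected = rating_discharge(stage_feet_to_metres(stage_recorded))

inflation = q_uncorrected / q_corrected
print(round(q_corrected, 3), q_uncorrected, round(inflation, 2))
```

With this assumed rating exponent of 2, an unconverted stage inflates the discharge by a factor of about 10.8. The factor depends on the exponent and the datum h0, so the size of the error in any real record can only be established from the raw stage data and the rating curve.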
DISCUSSION AND CONCLUSIONS
The CS statistic used in this study represents a useful approach for
identifying the most behavioural ensemble members. The part of the
statistic that includes the % bias objective functions assesses the
long-term water balance of the simulations, while the Nash coefficients
assess the simulations with respect to the individual monthly stream
flows. Table 3 illustrates that there is quite a wide range of maximum
CS values and a large part of this variation is expected to be related
to spatial variations in the representation of real individual monthly
rainfall values, in the absence of enough local data. However, one of
the key general observations is that uncertainties in some of the
climate inputs and the observational stream flow data are unlikely to be
the main effects causing uncertainties in the relative proportions of
the main four runoff generation processes. The results suggest that the
equifinalities in the model structure will always dominate. While it
might be postulated that the behavioural ranges given in Table 3 could
contain some outliers, Figure 10 suggests that this is generally not the
case. The weighted cumulative frequency is based on the CS values for
each ensemble member (i.e. giving slightly more weight to those with
better overall fits to the observed data). For RUH1 the ranges of all
processes could be reduced slightly as the curves have flattened tops
and bottoms. This is not the case for the semi-arid MAZ2 sub-basin.
Clearly, there are only some combinations of different processes that
are behavioural (or not), but as noted above (and illustrated in Figure
7) for RUH1, isolating these combinations is not straightforward and
simulations of similar total stream flow response can be made up of a
number of different combinations, particularly in the wetter sub-basins
where all four processes can play a substantial role. While many of the
uncertainties will be associated with inadequate representation of the
real monthly rainfall variations, additional uncertainties exist in some
of the Lake Malawi/Nyasa sub-basins where the rainfall data (both CRU
and UNIDEL) can be systematically biased (Table 4). However, even though
this may result in unrealistic simulations of the runoff ratio, the
distribution of simulated processes (and the associated uncertainties)
remains broadly the same, as illustrated by RUK1 in Table 3.
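The weighted cumulative frequency curves referred to above (Figure 10) can be sketched in Python as follows. The exact weighting scheme is not restated in this section, so the sketch assumes that each ensemble member is weighted in direct proportion to its CS value. The process proportions and CS values are invented.

```python
def weighted_cumulative_frequency(values, weights):
    """Return (value, cumulative weighted frequency) pairs sorted by value."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    curve = []
    for value, weight in pairs:
        cumulative += weight
        curve.append((value, cumulative / total))
    return curve

# Invented groundwater runoff proportions (%) and CS values per member.
groundwater_pct = [35, 20, 50, 40]
cs_values = [3.0, 2.0, 1.0, 2.0]

curve = weighted_cumulative_frequency(groundwater_pct, cs_values)
for value, freq in curve:
    print(value, freq)
```

A curve with long flat sections at the top or bottom would indicate that a few extreme members carry little weight, which is the basis for the suggestion that the RUH1 ranges could be reduced slightly.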
The best way to reduce some of the uncertainties is to add more
observational data, even if those data are not directly quantifying the
model process components, or are themselves uncertain. In this study
MODIS AET data have been used in some sub-basins (Table 4) to resolve
some of the long-term water balance uncertainties, largely related to
the rainfall data. LAI data have also been used to guide the
parameterisation of the interception parameters in the model, but mostly
in relative terms across the sub-basins in different parts of the basin,
as well as between the warm wet season and cooler dry season. However,
there are too many uncertainties in the conversion of LAI values
reported in the literature (see for example De Groen and Savenije, 2006;
Wu et al., 2019; Návar, 2020) into interception depths for different
rainfall regimes to allow the LAI values to be used directly to
constrain simulated interception depths. The example provided for KAF4
also suggests that the model can compensate for quite large
uncertainties in the simulation of interception depth by changing the
simulations of soil moisture evapotranspiration without substantially
affecting the accuracy of the stream flow simulations.
Some of the example sub-basins show clear evidence of Dambos (KAF4,
BAR3, NAM1 and RUK2) on Google Earth imagery, while others show some
evidence (KAF11, BAR3, GWA3, MAZ2 and RUH1). A previous study linked
these features to the occurrence of saturation excess surface runoff and
demonstrated improved simulations when this process was included in the
model (Hughes and Mazibuko, 2018). It should therefore be possible to
reduce the uncertainty ranges of the ‘Sat. surface’ column in Table 3 by
conducting a more detailed analysis (rather than the simple visual
assessment used here; Figure 4) of the frequency of Dambo occurrence.
Figure 8 illustrates that, by reducing the saturated area uncertainties, the
uncertainties in at least some of the other process simulations could
also be reduced. During this study it was thought that remotely sensed
soil moisture data might be useful to support this type of analysis.
However, while some of the patterns in the observational data could be
linked to landscape features, most could not. It is possible that a more
detailed investigation of the different soil moisture remote sensing
data available might reveal improved linkages with landscape features
and hydrological processes, and therefore offer some benefits for
setting up models. However, this was rather beyond the scope of this
study and should probably be conducted in areas where more ground-truth
data are available than in the large sub-basins of the Zambezi River
basin. Further development of these earth observation data, however,
could represent an important contribution to understanding processes in
data scarce areas, as demonstrated recently for the correction of
precipitation reanalyses data (Brocca et al., 2019), or for the
validation of river flow observation data (Brocca et al., 2020).
Within South Africa there is a national coverage of groundwater recharge
estimates (DWAF, 2005), which has proved to be very useful for
constraining model simulations and removing some of the equifinality in
the simulation of low flow generation processes (Tanner and Hughes,
2013). Unfortunately, the BGS estimates appear to be too uncertain for
that purpose in the Zambezi River basin. Some of the uncertainty in the
groundwater recharge estimates for this study is also related to the
inclusion of a riparian evapotranspiration component in the model. To
achieve the same groundwater outflow pattern, it is possible to have
relatively high recharge combined with a large riparian area or vice
versa. Any data that could limit the range of the riparian loss
parameter, would therefore be useful. In theory, it should be possible
to use remote sensing data for this purpose (to identify denser
vegetation, or enhanced actual evapotranspiration areas close to
channels). In practice, this could be quite difficult and time consuming
for the large areas covered by the Zambezi sub-basins.
In those basins where water use is expected to have a substantial impact
on gauged stream flows (mostly the semi-arid Zimbabwe sub-basins used in
this study), the uncertainties in the water use data have an inevitable
impact on the simulated processes. This mostly affects low flows
(through groundwater recharge and outflow processes) as the water use
volumes are relatively small compared to the wet season stream flows.
While issues of spatial scale pervade the whole modelling exercise due
to the large size of the sub-basins, the water use uncertainties can be
exacerbated by the model spatial structure, particularly when the main
water uses are from distributed farm dams. For example, in MAZ2 the
majority of the water use and farm dams are in the headwater areas,
which may be higher runoff areas on the basis of rainfall spatial
variations. The proportion of the sub-basin area that contributes to
these dams (a model parameter) should therefore take into account the
expected ‘sub-grid’ variations in runoff generation, introducing yet
another source of uncertainty.
For some of the sub-basins it can be quite easily demonstrated that
there are large uncertainties in both the forcing rainfall data, as well
as parts of the observed stream flow data records used to evaluate the
simulations. Most of these occur within the Lake Malawi/Nyasa group of
sub-basins (Tables 2 and 4), but tend to have little influence on the
distribution of simulated processes. There are, of course, some
uncertainties in the rainfall data (including those inherent in the use
of a monthly time step) that are impossible to quantify without more
local observational data, but these are reflected more in the overall
quality of the simulations, rather than the simulation of individual
processes. The group of sub-basins above the Barotse floodplain
represents a situation where some identified uncertainties in the
observed stream flow data could impact on the dominant processes
simulated by the model. There are some incompatibilities between the
upstream (BAR3, BAR4 and BAR7) and downstream (BAR5 and ZAM1) observed
data that cannot be readily accounted for by the impacts of the wetland
(Figure 6). The main issue appears to be in the representation of the
peak wet season flows, particularly from the quite large sub-basin BAR7.
The simulations for BAR7 are acceptable compared to observed stream
flows and the process representations (mostly interflow and groundwater
outflow) are consistent with the other sub-basins underlain by deep
Kalahari sand deposits (Table 3). However, to achieve a match to the
downstream observed data, this sub-basin would require much higher wet
season peaks generated by surface runoff processes, similar to BAR7
(Table 3). This study was not able to resolve these incompatibilities
and further assessments of the hydrological responses in the western
headwaters of the Zambezi are strongly recommended.
In terms of the potential benefits to simulating ungauged sub-basins,
referred to in the introduction, there is still too much uncertainty in
the simulation of individual processes and not enough observational data
to support their identification. The use of regionalised indices of the
total response of sub-basins, both internationally (Westerberg et al.,
2016), as well as for southern Africa (Hughes, 2019; Kabuya et al.,
2020), seems to remain the best recommendation for dealing with ungauged
sub-basins. This paper therefore reaches similar conclusions to McMillan
(2020) that some of the indices (or signatures) are related to multiple
processes that are difficult to disentangle. This study suggests that
improved, model independent, quantification of groundwater recharge
depths offers some potential gains, as does the mapping of landscape
features (Dambos and others) that are likely to generate saturation
excess surface runoff.
ACKNOWLEDGEMENTS
The work presented in this paper was partially conducted within the
activities of the African Union - NEPAD African Network of Centres of
Excellence on Water Sciences and Technology - ACEWATER phase 2 project.
Contribution from the European Commission, in particular the
Directorate-General for International Cooperation and Development
(DEVCO) and the Joint Research Centre (JRC), is gratefully acknowledged.
The authors would like to thank the Zambezi Watercourse Commission
(ZAMCOM) for making available the stream flow information used in the
analyses. We are grateful to Dr Sukhmani Mantel of Rhodes University for
helping to process some of the soil moisture data.
SOFTWARE AND DATA AVAILABILITY
The Pitman model is available as part of the SPATSIM modelling framework
available from https://www.ru.ac.za/iwr/research/spatsim/. Further
details about the Pitman model are included in the documentation
included with the download (see the Pitman_Guide.pptx file in the
SPATSIM_V3/doc folder). The model setup (including the forcing data,
parameter sets, simulation results, etc.) can be obtained on request
from one of the authors, subject to some restrictions on the
distribution of the observed streamflow data.
REFERENCES
Alfieri L, Lorini V, Hirpa F, Harrigan S, Zsoter E, Prudhomme C, Salamon
P. 2019. A global streamflow reanalysis for 1980-2018, Journal of
Hydrology X, 6. https://doi.org/10.1016/j.hydroa.2019.100049.
Beven K. 2006. A manifesto for the equifinality thesis. Journal of
Hydrology 320: 18–36. https://doi.org/10.1016/j.jhydrol.2005.07.007.
Beven KJ. 2009. Environmental modelling: An uncertain future? Routledge,
Abingdon, UK.
Beven KJ. 2012. Causal models as multiple working hypotheses about
environmental processes. Comptes Rendus Geoscience 344: 77–88.
https://doi.org/10.1016/j.crte.2012.01.005.
Blöschl G, Sivapalan M, Wagener T, Viglione A, Savenije H. (Eds.). 2013.
Runoff Prediction in Ungauged Basins. Synthesis Across Processes, Places
and Scales. Cambridge University Press, UK.
Brocca L, Filippucci P, Hahn S, Ciabatta L, Massari C, Camici S,
Schüller L, Bojkov B, Wagner W. 2019. SM2RAIN–ASCAT (2007–2018):
Global Daily Satellite Rainfall Data from ASCAT Soil Moisture
Observations. Earth System Science Data, 11 (4): 1583–1601.
https://doi.org/10.5194/essd-11-1583-2019.
Brocca L, Massari C, Pellarin T, Filippucci P, Ciabatta L, Camici S,
Kerr YH, Fernández-Prieto D. 2020. River Flow Prediction in Data Scarce
Regions: Soil Moisture Integrated Satellite Rainfall Products Outperform
Rain Gauge Observations in West Africa. Scientific Reports 10
(1): 12517. https://doi.org/10.1038/s41598-020-69343-x.
Busker T, de Roo A, Gelati E, Schwatke C, Adamovic M, Bisselink B, Pekel
J-F, Cottam A. 2019. A global lake and reservoir volume analysis using a
surface water dataset and satellite altimetry. Hydrol. Earth Syst. Sci.,
23, 669–690. https://doi.org/10.5194/hess-23-669-2019.
Chien H, Mackay DS. 2014. How much complexity is needed to simulate
watershed streamflow and water quality? A test combining time series and
hydrological models. Hydrological Processes 28(22): 5624–5636.
https://doi.org/10.1002/hyp.10066.
Clark MP, Rupp DE, Woods RA, Tromp-van Meerveld HJ, Peters NE, Freer JE.
2009. Consistency between hydrological models and field observations:
Linking processes at the hillslope scale to hydrological responses at
the watershed scale. Hydrological Processes, 23(2): 311-319.
https://doi.org/10.1002/hyp.7154.
Das NN, Entekhabi D, Dunbar RS, Chaubell MJ, Colliander A, Yueh S,
Jagdhuber T et al. 2019. The SMAP and Copernicus Sentinel 1A/B Microwave
Active-Passive High Resolution Surface Soil Moisture Product. Remote
Sensing of Environment 233: 111380.
https://doi.org/10.1016/j.rse.2019.111380.
De Groen MM, Savenije HHG. 2006. A monthly interception equation based
on the statistical characteristics of daily rainfall. Water Resources
Research, 42: 1-10. https://doi.org/10.1029/2006WR005013.
Dorigo W, Wagner W, Albergel C, Albrecht F, Balsamo G, Brocca L, Chung D
et al. 2017. ESA CCI Soil Moisture for Improved Earth System
Understanding: State-of-the Art and Future Directions. Remote Sensing of
Environment 203: 185–215.
https://doi.org/10.1016/j.rse.2017.07.001.
DWAF. 2005. Groundwater Resource Assessment II. Department of Water
Affairs and Forestry, Pretoria, South Africa.
Euser T, Winsemius HC, Hrachowitz M, Fenicia F, Uhlenbrook S, Savenije
HHG. 2013. A framework to assess the realism of model structures using
hydrological signatures. Hydrology and Earth System Sciences, 17:
1893–1912. https://doi.org/10.5194/hess-17-1893-2013.
Fenicia F, Savenije HHG, Matgen P, Pfister L. 2008. Understanding
catchment behavior through stepwise model concept improvement. Water
Resources Research, 44. http://dx.doi.org/10.1029/2006WR005563.
Gallart F, Latron J, Llorens P, Beven K. 2007. Using internal catchment
information to reduce the uncertainty of discharge and baseflow
predictions. Advances in Water Resources 30(4): 808–823.
https://doi.org/10.1016/j.advwatres.2006.06.005.
Gan TY, Dlamini EM, Biftu GF. 1997. Effects of model complexity
and structure, data quality, and objective functions on hydrologic
modelling. Journal of Hydrology, 192(1-4): 81-103.
http://dx.doi.org/10.1016/S0022-1694(96)03114-9.
Gonzalez Sanchez R, Seliger R, Fahl F, De Felice L, Ouarda TBMJ,
Farinosi F. 2020. Freshwater use of the energy sector in Africa. Appl.
Energy, 270: 115171.
https://doi.org/10.1016/j.apenergy.2020.115171.
Gruber A, Dorigo WA, Crow W, Wagner W. 2017. Triple Collocation-Based
Merging of Satellite Soil Moisture Retrievals. IEEE Transactions on
Geoscience and Remote Sensing 55(12): 6780–92.
https://doi.org/10.1109/TGRS.2017.2734070.
Gruber A, Scanlon T, van der Schalie R, Wagner W, Dorigo W. 2019.
Evolution of the ESA CCI Soil Moisture Climate Data Records and Their
Underlying Merging Methodology. Earth System Science Data 11(2):
717–39. https://doi.org/10.5194/essd-11-717-2019.
Harris I, Jones PD, Osborn TJ, Lister DH. 2014. Updated high-resolution
grids of monthly climatic observations – the CRU TS3.10 Dataset.
International Journal of Climatology, 34 (3): 623–642.
https://doi.org/10.1002/joc.3711.
Hrachowitz M, Savenije HHG, Blöschl G, McDonnell JJ, Sivapalan M,
Pomeroy JW, Arheimer B, Blume T, Clark MP, Ehret U, Fenicia F, Freer JE,
Gelfan A, Gupta HV, Hughes DA, Hut RW, Montanari A, Pande S, Tetzlaff D,
Uhlenbrook S, Wagener T, Winsemius HC, Woods RA. 2013. A decade of
Predictions in Ungauged Basins (PUB) - a review. Hydrological Sciences
Journal, 58(7): 1198-1255. https://doi.org/10.1080/02626667.2013.803183.
Hughes DA. 2004. Incorporating ground water recharge and discharge
functions into an existing monthly rainfall‐runoff model. Hydrological
Sciences Journal, 49(2): 297–311.
https://doi.org/10.1623/hysj.49.2.297.34834.
Hughes DA. 2013. A review of 40 years of hydrological science and
practice in southern Africa using the Pitman rainfall‐runoff model.
Journal of Hydrology, 501: 111–124.
https://doi.org/10.1016/j.jhydrol.2013.07.043.
Hughes DA. 2016. Hydrological modelling, process understanding and
uncertainty in a southern African context: lessons from the northern
hemisphere. Hydrological Processes, 30(14): 2419-2431.
https://doi.org/10.1002/hyp.10721.
Hughes DA. 2019. Facing a future water resources management crisis in
sub-Saharan Africa. Journal of Hydrology: Regional Studies, 23.
https://doi.org/10.1016/j.ejrh.2019.100600
Hughes DA, Farinosi F. 2020. Assessing development and climate
variability impacts on water resources in the data scarce Zambezi River
basin. Part 2: Simulating future scenarios of climate and development.
Journal of Hydrology: Regional Studies. Under review.
Hughes DA, Mantel SK, Farinosi F. 2020. Assessing development and
climate variability impacts on water resources in the data scarce
Zambezi River basin. Part 1: Initial model setup. Journal of Hydrology:
Regional Studies. Under review.
Hughes DA, Mantel SK. 2010. Estimating the uncertainty in simulating the
impacts of small farm dams on streamflow regimes in South Africa.
Hydrological Sciences Journal, 55 (4): 578-592.
https://doi.org/10.1080/02626667.2010.484903.
Hughes DA, Mazibuko S. 2018. Simulating saturation excess surface
run-off in a semi-distributed hydrological model. Hydrological
Processes, 32: 2685-2694. https://doi.org/10.1002/hyp.13182.
IFPRI (International Food Policy Research Institute). 2019. Global
Spatially-Disaggregated Crop Production Statistics Data for 2010 Version
1.1. https://doi.org/10.7910/DVN/PRFF8V, Harvard Dataverse, V3.
Jakeman AJ, Hornberger GM. 1993. How much complexity is warranted in a
rainfall-runoff model? Water Resources Research 29(8): 2637–2649.
Kabuya PM, Hughes DA, Tshimanga RM, Trigg MA, Bates P. 2020.
Establishing uncertainty ranges of hydrologic indices across climate and
physiographic regions of the Congo River Basin, Journal of Hydrology:
Regional Studies, 30. https://doi.org/10.1016/j.ejrh.2020.100710.
Kirchner JW. 2006. Getting the right answers for the right reasons:
Linking measurements, analyses, and models to advance the science of
hydrology. Water Resources Research 42.
https://doi.org/10.1029/2005WR004362.
Limpitlaw D, Gens R. 2006. Dambo mapping for environmental monitoring
using Landsat TM and SAR imagery: Case study in the Zambian Copperbelt.
International Journal of Remote Sensing, 27(21): 4839–4845.
https://doi.org/10.1080/01431160600835846.
Lucey JTD, Reager JT, Lopez SR. 2020. Global partitioning of runoff
generation mechanisms using remote sensing data. Hydrology and Earth
System Sciences, 24: 1415-1427.
https://doi.org/10.5194/hess-24-1415-2020.
MacDonald AM, Bonsor HC, Ó Dochartaigh BÉ, Taylor RG. 2012. Quantitative
Maps of Groundwater Resources in Africa. Environmental Research Letters,
7(2): 024009. https://doi.org/10.1088/1748-9326/7/2/024009.
Mao J, Yan B. 2019. Global Monthly Mean Leaf Area Index Climatology,
1981-2015. ORNL DAAC, Oak Ridge, Tennessee, USA.
https://doi.org/10.3334/ORNLDAAC/1653.
McMillan H. 2020. Linking hydrological signatures to hydrologic
processes: A review. Hydrological Processes, 34, 1393-1409.
https://doi.org/10.1002/hyp.13632.
McMillan HK, Clark MP, Bowden WB, Duncan M, Woods RA. 2011. Hydrological
field data from a modeller’s perspective: Part 1. Diagnostic tests for
model structure. Hydrol. Process. 25: 511–522.
https://doi.org/10.1002/hyp.7841.
McMillan H, Krueger T, Freer J. 2012. Benchmarking observational
uncertainties for hydrology: rainfall, river discharge and water
quality. Hydrol. Process. 26: 4078–4111.
https://doi.org/10.1002/hyp.9384.
Moore RJ. 1985. The probability‐distributed principle and runoff
production at point and basin scales. Hydrological Sciences Journal,
30(2): 273–297. https://doi.org/10.1080/02626668509490989.
Návar J. 2020. Modeling rainfall interception loss components of
forests. Journal of Hydrology, 584.
https://doi.org/10.1016/j.jhydrol.2019.124449.
Pechlivanidis IG, Jackson BM, McIntyre NR, Wheater HS. 2011. Catchment
scale hydrological modelling: A review of model types, calibration
approaches and uncertainty analysis methods in the context of recent
developments in technology and applications. Global Nest Journal, 13
(3): 193-214.
Pekel J-F, Cottam A, Gorelick N, Belward AS. 2016. High-resolution
mapping of global surface water and its long-term changes. Nature,
540(7633): 418–422.
https://doi.org/10.1038/nature20584.
Perrin C, Michel C, Andréassian V. 2001. Does a large number of
parameters enhance model performance? Comparative assessment of common
catchment model structures on 429 catchments. Journal of Hydrology,
242(3–4): 275–301. https://doi.org/10.1016/S0022-1694(00)00393-0.
Pokhrel P, Gupta HV. 2009. Regularized Calibration of a Distributed
Hydrological Model Using Available Information About Watershed
Properties and Signature Measures. IAHS-AISH Publication No. 333:
20–25.
Pitman WV. 1973. A Mathematical Model for Generating Monthly River Flows
from Meteorological Data in South Africa. Report No. 2/73. Hydrological
Research Unit, University of the Witwatersrand, Johannesburg, South
Africa.
Sadeghi M, Gao L, Ebtehaj A, Wigneron J-P, Crow WT, Reager JT, Warrick
AW. 2020. Retrieving global surface soil moisture from GRACE satellite
gravity data. Journal of Hydrology, 584.
https://doi.org/10.1016/j.jhydrol.2020.124717.
Seibert J, McDonnell JJ. 2002. On the dialog between experimentalist and
modeler in catchment hydrology: use of soft data for multicriteria model
calibration. Water Resources Research 38(11): 1241.
https://doi.org/10.1029/2001WR000978.
Tanner JL, Hughes DA. 2013. Assessing uncertainties in surface-water and
groundwater interaction modelling - a case study from South Africa using
the Pitman model. Chapter 9 In: J. Cobbing, S. Adams, I. Dennis and K.
Riemann (Editors), Assessing and Managing Groundwater in Different
Environments, International Association of Hydrogeologists Selected
Papers. CRC Press, Taylor and Francis Group, London UK, 121-134.
https://doi.org/10.1201/b15937.
Todini E. 2011. History and perspectives of hydrological catchment
modelling. Hydrology Research, 42 (2-3): 73-85.
https://doi.org/10.2166/nh.2011.096.
Velpuri NM, Senay GB, Singh RK, Bohms S, Verdin JP. 2013. A
comprehensive evaluation of two MODIS evapotranspiration products over
the conterminous United States: using point and gridded FLUXNET and
water balance ET. Remote Sens. Environ., 139: 35-49.
https://doi.org/10.1016/j.rse.2013.07.013.
von der Heyden CJ. 2004. The hydrology and hydrogeology of dambos: A
review. Progress in Physical Geography, 28(4): 544–564.
https://doi.org/10.1191/0309133304pp424oa.
Ward RC. 1984. On the response to precipitation of headwater streams in
humid areas. Journal of Hydrology, 74 (1-2): 171-189.
https://doi.org/10.1016/0022-1694(84)90147-1.
Ward RC. 1985. Hypothesis-testing by modelling catchment response, II.
An improved model. Journal of Hydrology, 81 (3-4): 355-373.
https://doi.org/10.1016/0022-1694(85)90038-1.
Westerberg IK, McMillan HK. 2015. Uncertainty in hydrological
signatures. Hydrol. Earth Syst. Sci., 19: 3951–3968.
https://doi.org/10.5194/hess-19-3951-2015.
Westerberg IK, Wagener T, Coxon G, McMillan HK, Castellarin A, Montanari
A, Freer J. 2016. Uncertainty in hydrological signatures for gauged and
ungauged catchments. Water Resour. Res., 52: 1847–1865.
https://doi.org/10.1002/2015WR017635.
Willmott CJ, Matsuura K. 2001. Terrestrial Air Temperature and
Precipitation: Monthly and Annual Time Series (1950 - 1999),
http://climate.geog.udel.edu/~climate/html_pages/README.ghcn_ts2.html.
Winsemius HC, Schaefli B, Montanari A, Savenije HHG. 2009. On the
calibration of hydrological models in ungauged basins: a framework for
integrating hard and soft hydrological information. Water Resources
Research, 45. https://doi.org/10.1029/2009WR007706.
Wu J, Liu L, Sun C, Su Y, Wang C, Yang J, Liao J, He X, Li Q, Zhang C,
Zhang H. 2019. Estimating Rainfall Interception of Vegetation Canopy
from MODIS Imageries in Southern China. Remote Sens., 11: 2468.
https://doi.org/10.3390/rs11212468
LIST OF FIGURES
Figure 1 Structure of the main sub-basin runoff generation components of
the Pitman model (the model parameter symbols are shown in italics,
while the full names are given for the state variables, such as IQ, RCH,
S, etc.).
Figure 2 Zambezi River basin, riparian countries and simulated
sub-basins (the 19 gauged areas used in this study are shaded in grey).
The gauge at BAR6 is used to help resolve some of the uncertainties in
the upstream area.
Figure 3 Minimum and maximum mean monthly LAI values (Mao and Yan, 2019)
for all sub-basins and some sample seasonal distributions.
Figure 4 A Google Earth image of part (~2 000 km²) of the KAF4
sub-basin showing the light coloured dambo features (a), and the
relationship between simulated relative soil
moisture content and saturated area for different SSR parameters of the
Pitman model (b).
Figure 5 Runoff processes simulated by two equally behavioural
ensembles, with low and high groundwater recharge estimates.
Figure 6 Simulated inflows and outflows for the Barotse floodplain
sub-basin (BAR5) and observed flows at BAR6.
Figure 7 Observed and simulated (four ensemble members) stream flows for
RUH1. The simulations are drawn from the behavioural ensembles with some
extremes of different process representations.
Figure 8 Details of the process simulations for the same four ensembles
used in Figure 7 for RUH1 (note that differences in the responses for
individual years between Figures 7 and 8 are associated with the
inclusion of upstream flows from RUH2 in the total stream flow data
shown in Figure 7).
Figure 9 Flow duration curves for some of the Lake Malawi/Nyasa
sub-basins using different periods of the observed stream flow records.
Figure 10 Weighted (using the CS values) cumulative frequency
distributions of process proportions for all the behavioural ensemble
members for RUH1 and MAZ2 sub-basins.
Table 1 Details of the gauge station data used in the study.