Results
A genetic algorithm (GA) was used to optimize the three scaling factors
for the RDII impulse response functions (IRFs). The same method was used
to calibrate the total sewer flow simulated by the SWMM RTK method for
comparison. The efficiency of the two RDII estimation methods was
compared using the modified Nash-Sutcliffe coefficient.
\(E_{j}=1-\frac{\sum_{t=1}^{T}W_{t,j}\,(Q_{0}^{t}-Q_{m}^{t})^{2}}{\sum_{t=1}^{T}W_{t,j}\,(Q_{0}^{t}-\overline{Q_{0}})^{2}}\) (15)
where \(Q_{0}^{t}\) is the observed discharge at time t [L³/T],
\(Q_{m}^{t}\) is the modeled discharge at time t [L³/T], and
\(\overline{Q_{0}}\) is the average observed discharge [L³/T]. The
coefficient ranges from -∞ to 1, and E = 1 corresponds to a perfect
match between the observed and modeled discharge. \(W_{j}\) is a
weighting factor whose index j (j = 1, 2, and 3) identifies the flow
class: j = 1 applies to low flows, j = 2 to medium flows, and j = 3 to
peak flows. In the conventional Nash-Sutcliffe method, all three
weighting factors are identical (\(W_{1} = W_{2} = W_{3}\)). In this
study, the weighting factors are adjusted so that the larger RDII peaks
are emphasized. This modified Nash-Sutcliffe method is suitable because
RDII occurs only during storm events.
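As a minimal sketch of how Eq. (15) could be evaluated, the Python
function below computes the weighted coefficient from paired observed
and modeled discharge series. The function name, the use of NumPy, and
the sample values are assumptions made for illustration; they are not
taken from the study.

```python
import numpy as np

def weighted_nse(observed, modeled, weights=None):
    """Weighted Nash-Sutcliffe efficiency in the form of Eq. (15)."""
    q_obs = np.asarray(observed, dtype=float)
    q_mod = np.asarray(modeled, dtype=float)
    if weights is None:
        # Equal weights reduce to the conventional Nash-Sutcliffe coefficient
        w = np.ones_like(q_obs)
    else:
        w = np.asarray(weights, dtype=float)
    numerator = np.sum(w * (q_obs - q_mod) ** 2)
    denominator = np.sum(w * (q_obs - q_obs.mean()) ** 2)
    return 1.0 - numerator / denominator

# Hypothetical discharge values (m3/s) and per-time-step weights
q_obs = [0.10, 0.12, 0.50, 0.90, 0.30]
q_mod = [0.11, 0.13, 0.45, 0.80, 0.35]
w = [1, 1, 2, 3, 1]

e_conventional = weighted_nse(q_obs, q_mod)
e_weighted = weighted_nse(q_obs, q_mod, w)
```

A perfect prediction returns 1, and a prediction equal to the mean of
the observations returns 0, as expected for this family of metrics.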
The calibration period was from May 9, 2009, to June 7, 2009, and the
validation period was from June 9, 2009, to July 8, 2009. The IRF method
has three parameters to calibrate: roof connection scaling factor (R),
sump pump connection scaling factor (S), and leaky lateral scaling
factor (L). The RTK method has nine parameters to calibrate: R1, R2, R3,
T1, T2, T3, K1, K2, and K3. In the RTK method, R is the ratio of I&I
discharge volume to rainfall volume: R1 represents a fast inflow
element, while R2 and R3 represent slower infiltration elements. T is
the time to peak of each hydrograph (typically expressed in hours), and
K is the ratio of the time of recession to the time to peak.
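To make the roles of R, T, and K concrete, the sketch below builds one
triangular unit hydrograph from the definitions above: the base length
is T(1 + K), and the peak is chosen so that the triangle area equals R
per unit rainfall depth. This is the usual textbook form of a triangular
unit hydrograph; the function name, the time step, and the parameter
values are hypothetical and are not the calibrated values of this study.

```python
import numpy as np

def rtk_unit_hydrograph(r, t_peak, k, dt):
    """Ordinates of one triangular unit hydrograph per unit rainfall depth.

    r      : fraction of rainfall volume that enters the sewer
    t_peak : time to peak (hours)
    k      : ratio of recession time to time to peak
    dt     : time step (hours)
    """
    t_base = t_peak * (1.0 + k)
    q_peak = 2.0 * r / t_base  # makes the triangle area equal to r
    times = np.arange(0.0, t_base + dt, dt)
    rising = q_peak * times / t_peak
    falling = q_peak * (t_base - times) / (k * t_peak)
    ordinates = np.where(times <= t_peak, rising, falling)
    # Clip guards against small negative values past the end of the base
    return times, np.clip(ordinates, 0.0, None)

# Hypothetical parameter values for a single hydrograph
times, q = rtk_unit_hydrograph(r=0.05, t_peak=1.0, k=2.0, dt=0.25)

# Trapezoidal area under the hydrograph, which should recover r
area = np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(times))
```

In the RTK method three such hydrographs (fast, medium, slow) are
summed, which is why the method carries nine parameters.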
For the GA optimization, the population size was set to 100 and the
maximum number of generations to 300 for both modeling approaches. The
crossover probability was set to 0.95 for both the IRF and RTK
calibrations, and the mutation probability was set to 0.06.
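The settings above can be illustrated with a small real-coded GA. The
text does not state which selection, crossover, or mutation operators
were used, so the tournament selection, arithmetic crossover, uniform
mutation, and elitism below are assumptions chosen for brevity. The
fitness function is a toy stand-in for the model efficiency, and the
target values are hypothetical.

```python
import numpy as np

def run_ga(fitness, bounds, pop_size=100, n_generations=300,
           p_crossover=0.95, p_mutation=0.06, seed=0):
    """Maximize `fitness` over box-bounded real parameters."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(pop_size, low.size))
    scores = np.array([fitness(ind) for ind in pop])
    for _ in range(n_generations):
        # Binary tournament selection of parents
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(scores[a] >= scores[b], a, b)]
        # Arithmetic crossover with a neighboring parent
        partners = np.roll(parents, 1, axis=0)
        alpha = rng.random((pop_size, 1))
        do_cross = rng.random((pop_size, 1)) < p_crossover
        children = np.where(do_cross,
                            alpha * parents + (1.0 - alpha) * partners,
                            parents)
        # Uniform mutation: resample a gene within its bounds
        mutate = rng.random(children.shape) < p_mutation
        resampled = rng.uniform(low, high, size=children.shape)
        children = np.where(mutate, resampled, children)
        child_scores = np.array([fitness(ind) for ind in children])
        # Elitism: keep the best individual of the previous generation
        best, worst = np.argmax(scores), np.argmin(child_scores)
        if scores[best] > child_scores[worst]:
            children[worst] = pop[best]
            child_scores[worst] = scores[best]
        pop, scores = children, child_scores
    best = np.argmax(scores)
    return pop[best], scores[best]

# Toy fitness: negative squared distance to a hypothetical optimum
target = np.array([0.2, 0.5, 0.8])
best_params, best_score = run_ga(lambda x: -np.sum((x - target) ** 2),
                                 bounds=[(0.0, 1.0)] * 3)
```

In a calibration setting the toy fitness would be replaced by the model
efficiency of a simulation run with the candidate parameters.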
The calibrated parameter solutions for the IRF and RTK methods are
presented in Table 1. The Nash-Sutcliffe model efficiency coefficient of
the IRF solution was 0.534 in the calibration period and 0.560 in the
validation period. The modified Nash-Sutcliffe coefficients for the IRF
solution were 0.892 for the calibration period and 0.866 for the
validation period when the weighting factors were set as \(W_{1}\) = 3
for Q > 90th percentile, \(W_{2}\) = 2 for 80th < Q < 90th percentile,
and \(W_{3}\) = 1 for Q < 80th percentile. Assigning larger weighting
factors to high flows improved the model fit significantly. The
Nash-Sutcliffe coefficient of the best RTK solution was 0.848 in the
calibration period and 0.795 in the validation period.
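The percentile-based weighting described above can be sketched as
follows. The weights 3, 2, and 1 and the 80th and 90th percentile
thresholds come from the text; the function name, the treatment of
values exactly equal to a threshold, and the sample series are
assumptions.

```python
import numpy as np

def flow_class_weights(observed, w_high=3.0, w_mid=2.0, w_low=1.0):
    """Assign a weight to each time step from its observed flow class."""
    q = np.asarray(observed, dtype=float)
    p80, p90 = np.percentile(q, [80, 90])
    weights = np.full(q.shape, w_low)  # Q below the 80th percentile
    weights[q >= p80] = w_mid          # between the 80th and 90th percentiles
    weights[q > p90] = w_high          # Q above the 90th percentile
    return weights

# Hypothetical observed flows: the integers 1 to 100
q_obs = np.arange(1, 101)
w = flow_class_weights(q_obs)
```

The resulting array has one weight per time step and can be used
directly as the weight term in Eq. (15).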
Although the model fit improved with the modified Nash-Sutcliffe method,
the model efficiency of the RTK method was higher, since the RTK method
has three times as many parameters to adjust (nine instead of three).
However, from the calibration period to the validation period, the
Nash-Sutcliffe coefficient increased for the IRF solution (0.534 to
0.560) while it decreased for the RTK solution (0.848 to 0.795). This
suggests that the RTK method may be less consistent and less robust.
The optimal IRF scaling factors found by the GA are R = 3,359 for roof
connections, S = 22,653 for sump pump connections, and L = 19,985 for
leaky laterals. These values can be interpreted in terms of the RDII
volume contribution of each RDII source (Table 1). The contributing flow
volume of each RDII source is obtained by multiplying the per-unit-area
flow volume of its IRF by the corresponding IRF scaling factor (Table
2). The resulting RDII volumes from the roof, sump pump, and lateral
sources are 9,710 m³, 22,653 m³, and 32,543 m³, respectively, or 15%,
35%, and 50% of the total estimated RDII volume. This simple calculation
shows that the IRF result can be interpreted as the volume contribution
of each RDII source, which identifies the most problematic RDII
contributor in the system by volume. These values need to be interpreted
with caution because the IRF model application in this study is only one
realization of a real system, and each sewershed is unique in terms of
the factors that contribute to RDII. However, the result can still
provide insight into the RDII behavior of the system by giving the
solutions a physical meaning.
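The percentage shares quoted above follow directly from the three
reported volumes, as the short check below shows. Only the volumes
stated in the text are used; the variable names are illustrative.

```python
# RDII volumes reported in the text (m3)
rdii_volume_m3 = {"roof": 9710.0, "sump pump": 22653.0, "lateral": 32543.0}

total_m3 = sum(rdii_volume_m3.values())
share_percent = {source: 100.0 * volume / total_m3
                 for source, volume in rdii_volume_m3.items()}

for source, share in share_percent.items():
    print(f"{source}: {share:.0f}% of {total_m3:,.0f} m3")
```

The shares round to 15%, 35%, and 50%, matching the values in the text.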
The IRF approach tends to be more robust because its three parameters
scale three IRFs that represent physically based processes. Each IRF
shape is defined independently using physics-based models, and the
scaling parameters reflect the contribution of each of the three IRFs.
The IRF solution is unique regardless of how the initial population is
randomly selected. In contrast, the RTK method gives a different
solution every time the model is run. As an example, 30 sets of three
RTK hydrograph solutions display widely variable results, as presented
in Figure 5. Within the user-specified range for each hydrograph, the
solution can differ greatly from run to run. The Nash-Sutcliffe
coefficient was 0.848 for the best case and 0.681 for the worst case.
Because the results depend strongly on the user-specified ranges of each
parameter, the performance is not guaranteed.
The RTK method has many locally optimal solutions, which indicates that
the nine coefficients are not independent. The starting points or
constraints of some parameters therefore cause the other parameters to
adjust toward a local optimum that performs similarly well on the
calibration data. Box plots of the nine RTK parameters from the 30 model
runs are presented in Figure 6. Greater variability is observed in the
RTK parameters of the second and third triangular hydrographs,
especially the third. This is because the model adjusts these parameters
according to the constraints imposed by the earlier parameters. In
principle, different RTK local solutions can yield the same model fit: a
change in one hydrograph causes the other two to change so as to achieve
the best fit. This points to a weakness of the RTK method, namely that
physical processes are not reflected in the model.
Figure 7 shows the prediction of the monitored flow hydrograph using the
IRF solution and the best case of the RTK solutions during the
calibration period (Figure 7(a)) and the validation period (Figure
7(b)). On June 24, both methods predict a flow peak that is not observed
in the monitored flow record. The peak may have been so brief that the
flow monitor failed to capture it. Overall, the RTK method follows the
monitored hydrograph well, especially on the falling limbs of peaks,
while the IRF method tends to underestimate the flow on the falling
limbs.
The volume and the peak flow values for the estimated DWF, observed
sewer flow, IRF model result, and RTK model result are summarized in
Table 3. A flow rate of 0.3 m³/s was selected to define the beginning
and the end of each storm. The observed sewer flow, IRF results, and RTK
results are compared to the estimated DWF using the following equation.
\(\text{Compare to DWF}=\frac{\text{Observed sewer flow}}{\text{Estimated DWF}}\times 100\) (16)
The observed sewer flow is three to four times the DWF in volume and
three to six times the DWF in peak flow during the storms. Considering
that the monitoring location serves a sanitary-only sewer, this
indicates that a great deal of RDII exists in the area.
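The storm delineation and the comparison in Eq. (16) can be sketched as
below. The 0.3 m³/s threshold is taken from the text; applying it to the
observed flow, the rectangle-rule volume, the 5-minute time step, and
all sample values are assumptions made for illustration.

```python
import numpy as np

def storm_events(flow, threshold=0.3):
    """Return (start, end) index pairs where flow exceeds the threshold.

    The end index is exclusive, so flow[start:end] is one storm.
    """
    above = np.asarray(flow, dtype=float) > threshold
    padded = np.concatenate(([False], above, [False])).astype(int)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    return list(zip(starts.tolist(), ends.tolist()))

def percent_of_dwf(value, dwf_value):
    """Eq. (16): a quantity expressed as a percentage of its DWF value."""
    return value / dwf_value * 100.0

# Hypothetical series (m3/s) at an assumed 5-minute time step
dt_s = 300.0
observed = np.array([0.1, 0.2, 0.5, 0.9, 0.4, 0.2, 0.1, 0.35, 0.6, 0.2])
dwf = np.full(observed.shape, 0.15)

events = storm_events(observed)
summary = []
for start, end in events:
    vol_obs = observed[start:end].sum() * dt_s  # rectangle-rule volume
    vol_dwf = dwf[start:end].sum() * dt_s
    peak_obs = observed[start:end].max()
    peak_dwf = dwf[start:end].max()
    summary.append((percent_of_dwf(vol_obs, vol_dwf),
                    percent_of_dwf(peak_obs, peak_dwf)))
```

Each entry of `summary` holds the volume and the peak of one storm as a
percentage of the corresponding DWF value.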
The IRF result and RTK result are compared to the observed sewer flow
using the following equation.
\(\text{Compare to observed RDII}=\frac{\text{Predicted RDII}-\text{Observed RDII}}{\text{Observed RDII}}\times 100\) (17)
Both models underestimated the flow volume: the IRF method by 9% to 28%
and the RTK method by 4% to 26% compared to the monitored volume. In
terms of flow peaks, the IRF method overestimated the peak flow rate for
the May 13, May 27, and June 11 storms by 19%, 25%, and 9%,
respectively, and underestimated it for the May 15 and June 16 storms by
15% and 8%, respectively. The RTK method consistently overestimated the
peak flow rate, by 1% to 16%.
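The sign convention of Eq. (17) can be made explicit with a one-line
helper: a negative result means underestimation and a positive result
means overestimation. The function name and the sample numbers are
hypothetical.

```python
def percent_difference(predicted, observed):
    """Eq. (17): relative difference of a prediction from the observation (%)."""
    return (predicted - observed) / observed * 100.0

# Hypothetical storm volumes (m3): the prediction is 9% below the observation
under = percent_difference(predicted=91.0, observed=100.0)

# Hypothetical peak flows (m3/s): the prediction is 25% above the observation
over = percent_difference(predicted=1.25, observed=1.0)
```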
Residual plots of the IRF and the best RTK solutions for the calibration
period and the validation period are presented in Figure 8. A residual
is the difference between the observed value of the dependent variable
and the predicted value. Each data point has one residual, defined by
the following equation.
Residual = Observed value – Predicted value (18)
Residuals are plotted against the observed value on the x-axis. The
clusters of points at low flow rates represent the tails of the
hydrographs. In Figure 8(a), the IRF method underestimates the peaks, as
most of the residuals are on the positive side. These points are from
the storms on May 15, 2009, and May 27, 2009. The same trend exists in
the validation period, where the outliers are from the storms on June
11, 2009, and June 16, 2009 (Figure 8(b)). In the validation period, the
RTK method also underestimated peaks, as most of the high-flow points
are on the positive side. This means that the best RTK solution for the
calibration period loses efficiency in the validation period. This
explains the decrease in the Nash-Sutcliffe coefficient of the RTK
method in the validation period, as presented in Table 1, and supports
the view that the RTK method is more of a curve-fitting method with
limited physical meaning.
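A residual analysis of the kind described above can be sketched as
follows: residuals are computed with Eq. (18), and the share of positive
residuals among high-flow points gives a quick indication of peak
underestimation. The function names, the high-flow threshold, and the
sample values are assumptions for illustration only.

```python
import numpy as np

def residuals(observed, predicted):
    """Eq. (18): observed value minus predicted value."""
    return (np.asarray(observed, dtype=float)
            - np.asarray(predicted, dtype=float))

def positive_fraction(observed, predicted, flow_threshold):
    """Fraction of positive residuals among points above a flow threshold."""
    obs = np.asarray(observed, dtype=float)
    res = residuals(obs, predicted)
    high = obs > flow_threshold
    return float(np.mean(res[high] > 0))

# Hypothetical flows (m3/s)
observed = [0.10, 0.15, 0.60, 0.90, 0.70]
predicted = [0.12, 0.14, 0.50, 0.95, 0.60]

res = residuals(observed, predicted)
frac = positive_fraction(observed, predicted, flow_threshold=0.5)
```

A fraction well above 0.5 at high flows would correspond to the pattern
of positive residuals (underestimated peaks) described for Figure 8.
The sketch assumes at least one point exceeds the threshold.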