\section{Introduction}
\label{sec-1}

\subsection{Approach}
\label{sec-1-1}

When addressing issues that require the consideration of long-term environmental factors, decision makers worldwide are increasingly looking to global and regional climate simulations to inform choices and policies. However, the initial design and evaluation of these simulations will not typically have addressed the specific locale and temporal/spatial scales relevant to individual applications. Rather than taking climate data ‘at face value’, it is crucial to evaluate what relevant information it offers. This is not only a needed ‘sanity check’ for the application of climate information, but also serves as an informative framework when analyzing current models and observational products. The goal of this study is to illustrate a ‘bottom up’ analysis, linking local phenomena to signals within existing climate resources, by evaluating the extent to which Regional Climate Model (RCM) simulations can soundly estimate the odds of one dry year being followed by another dry year at a specific location in West Africa. The purpose of starting the analysis at the local level is to see at what point it becomes tenuous to link local information to that provided by external sources; i.e., points where our analysis ‘dead-ends’. This allows us to critically evaluate where limitations exist in using current resources to describe potential impacts, as opposed to producing a ‘best estimate’. This information can then be used to inform future development of regional climate simulations and (more pressingly) analysis methods. It can further provide climate information users with insights into the spatial and temporal scales, as well as climatic conditions, at which a simulation may be considered “credible.” However, as noted by \citet{smith_limits_2000}, this can vary across the globe, with this study providing a location-specific case example.

Focusing on a given region and impact allows us to define a specific question with respect to this issue, providing more definitive results than notions of general global performance. Large scale indicators of system behaviour are conceptually informative; e.g., \citet{marotzke_forcing_2015}. However, it is important to remember that ‘skill is a function of scale’ \citep{sakaguchi_temporal-_2012}: while, for example, a General Circulation Model (GCM) may skillfully capture the African Easterly Jet, this does not mean that its output is appropriate for management decisions at the city scale. In light of this, here we consider Ghana’s Akosombo Dam, located at the southern end of Lake Volta in the Volta Basin, where the odds of a dry year being followed by a dry year can be considered critical for dam management decisions. We focus on regional climate processes already recognised as central to local weather patterns, and on readily available data sets, including output from the Coordinated Regional Climate Downscaling Experiment (CORDEX) project \citep{giorgi_addressing_2009}.

This study represents the ‘busy work of climate science’: examining simulations and observations to estimate their information content, and setting baselines and directions for further investigations\textsuperscript{1} that can, in the long run, improve the societal applicability of the research.

\subsection{Method behind the Approach}
\label{sec-1-2}

The motivation behind our approach is best illustrated by starting with the ideal end point: the narrative that we would have liked to provide for managers at the Akosombo Dam. Figure 1 below shows estimates of the frequency with which one dry year transitions to another dry year over the Volta basin, and how this may shift in the future under different emission scenarios. The figure shows both decreases and increases in frequency into the future, with a notable increase in the distant future under the high emission scenario, RCP8.5, which could indicate changing interannual rainfall patterns.

In the figure the frequency at which one dry year is followed by another dry year has been estimated for five different 20-year periods. Here the ‘dry’ years are those during which the total summer season rainfall over the Volta basin is in the bottom third of the values observed within the 20-year period\textsuperscript{2}. Simulations for each of the five 20-year periods were produced using the DMI-HIRHAM Regional Climate Model\textsuperscript{3}. Reanalysis forcing was used for the evaluation period (1989-2008), for which simulations were found to match the observational record (or rather, a selected observational record, as outlined in section 2.2.1). For each of the four future time periods two sets of RCM simulations were produced, one forced by the RCP4.5 emission scenario and one forced by the RCP8.5 emission scenario. A standard statistical analysis methodology\textsuperscript{4} is then applied to produce estimates of the frequency at which one dry year is followed by another dry year, for each of the five 20-year periods.
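The dry-year classification and transition counting described above can be sketched in a few lines. The following is a minimal illustration only, using a hypothetical function name and synthetic rainfall totals in place of the actual basin-averaged series; it assumes the bottom-third threshold is taken within the same 20-year period being analysed, as per the ‘moving baseline’ described in footnote 2.

```python
import numpy as np

def dry_year_transition_frequency(rainfall, threshold_quantile=1 / 3):
    """Estimate the frequency with which a dry year is followed by
    another dry year, given one seasonal rainfall total per year.

    A year is 'dry' if its total falls in the bottom third of the
    values within the supplied period (the 'moving baseline').
    """
    rainfall = np.asarray(rainfall, dtype=float)
    threshold = np.quantile(rainfall, threshold_quantile)
    dry = rainfall <= threshold                      # boolean dry-year mask
    n_dry = np.count_nonzero(dry[:-1])               # dry years with a successor
    n_dry_to_dry = np.count_nonzero(dry[:-1] & dry[1:])
    return n_dry_to_dry / n_dry if n_dry else float("nan")

# Synthetic 20-year period (illustrative values only)
rng = np.random.default_rng(0)
totals = rng.gamma(shape=8.0, scale=60.0, size=20)
freq = dry_year_transition_frequency(totals)
```

With only a handful of dry years per 20-year window, the raw frequency is a noisy estimate; the Beta-distribution treatment in footnote 4 quantifies that estimation uncertainty.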

As can be observed in Figure 2, the evaluation period (1989-2008) shows a frequency of recurring dry years of about fifty percent, with wide confidence intervals. For the two simulated emission scenarios, RCP4.5 (blue bars) and RCP8.5 (yellow bars), the frequency varies from period to period. The frequency of recurring dry years under RCP8.5 drops initially, yet increases notably from the present day by the last time-slice. These changes could indicate that interannual rainfall patterns are changing. However, due to the limited sample size (20-year periods), most of the changes are unlikely to be statistically significant. Under the more moderate external forcing of the RCP4.5 scenario, the odds of recurring dry years do not appear to change significantly.

This investigation into the frequency of recurring dry years, and how the odds of such an occurrence might change with time, thus yields a narrative that appears to be of potential value. However, there are a number of caveats to the process through which this figure was produced.

Impact studies often follow such a ‘top down’ trajectory; e.g., McCartney et al. (2012). Information is propagated from models producing global simulations, to those depicting possible regional realizations, to those estimating potential impacts. At every stage, inherent limitations in predictability and in the ability to definitively identify and describe relevant mechanisms cause uncertainties to ‘cascade’ through the system \citep{winsemius_framework_2013}. As such, when moving from global simulations to regional realizations to potential impacts, each stage will typically expand on the number of possible states of the climate system produced by the previous one, essentially defining a broad range of conceivable system states. However, the ability to fully describe the complete extent of possible system states is limited by technical considerations, such as access to the required computing power. This means that the climate data and projections that are currently available represent only an erratic sampling of the possible system states \citep{knutti_challenges_2010}. Even if studies were not limited by technical considerations, limitations in current knowledge and in the predictability of the climate system would still prevent a complete set of possible system states from being produced. If a reasonable sampling of what are considered possible system states could be produced, observational data could then be used to constrain the distribution to plausible realizations \citep{van_oijen_bayesian_2011}. Typically, though, observational data and data produced through simulations do not overlap \citep{annan_climate_2002}, with no clear approach for mapping between them \citep{reichert_linking_2012}. That is, simulations often describe a set of system states different in character from those observed in the ‘real world’; e.g., the location, variability, and/or local response to large scale circulation features may show time-varying differences from what is observed \citep{valdes_built_2011,daron_predicting_2013,liu_why_2014}. An even larger obstacle is that for much of the globe observational records are scarce, or equivalently, the uncertainty in these measurements is undetermined; i.e., their relationship to ‘what actually happened’ is unknown \citep{parker_reviews_2011}. As such, there is often no fixed ‘ground truth’ available by which to evaluate the accuracy of estimates at local scales.

It can be very challenging to determine whether, or how, it is possible to translate data from within the cascade of uncertainties \citep{snyder_complex_2011} described above into information that is pertinent on local scales. As such, analyses of climate simulations often ‘dilute’ such data, either by reporting a plethora of conflicting outputs, or by smoothing away details by averaging across the ensemble of simulated system states. When considering specific impacts, however, we would like to ‘distill’ what, if any, actionable messages can be derived from the available data. Studies that feed such information through local impact models typically include caveats regarding sources of uncertainty, but offer little insight into how to interpret the model output in light of these. That a simulation represents the best information at hand does not preclude it from being irrelevant to decision makers \citep{stainforth_confidence_2007}. Ideally we would estimate all uncertainties relevant to our application and determine whether they allow for any statements that could be made with the confidence required to affect decision making. Due to limits in observational networks and experimental design, however, this is typically impossible by the time the data reaches the analysis stage \citep{smith_what_2002}. The dilemma is that we ‘know better’ than to take climate simulations at ‘face value’ on local scales, but typically do not have the information necessary to construct robust error models.


  1. “What is needed is the development of data-driven methodologies that are guided by theory to constrain search, discover more meaningful patterns, and produce more accurate models.”\citep{faghmous_big_2014}

  2. This approach, working with a ‘moving baseline’, is based on the assumption that demand will be tailored to long term trends. A relatively low rainfall year, even in a ‘wetter future’, will still be considered dry if irrigation, power production, and other water uses are optimized to ‘typical’ levels. This is not an optimistic projection of future policy decisions, but a realistic one. Also, changing the baseline every 20 years keeps the analysis as relative as possible to the ‘model world’ we are investigating. This is needed due to our lack of information about the stationarity of model biases (the extent to which the difference between reality and the model changes over time), as well as about ‘external’ factors such as emission scenarios and GCM reliability.

  3. For this example the RCM is the DMI-HIRHAM RCM forced by the GCM ICHEC EC-Earth: https://www.ichec.ie/research/met_eireann.

  4. Calculating transition frequencies from limited sample sizes induces a degree of uncertainty as to the best estimate for these values. This estimation uncertainty is separate from those listed above, and can be used to determine whether these values vary significantly from each other over time. Methods are described by Welton (2005) and Pasanisi et al. (2012). In short, by considering the matrix of transition frequencies for all states to be the result of samples from a distribution constrained by the recorded transitions, we can describe the probability of a given value being a ‘good’ estimate of the transition frequency between states through a Beta distribution. Parameters for the Beta distribution are defined by the number of occurrences of the initial state and the number of transitions it makes to the state of interest.
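The counting argument in footnote 4 can be made concrete in a few lines. The sketch below is illustrative only: the function name is hypothetical, the counts are invented, and the flat Beta(1, 1) prior is an assumption on our part (the cited methods of Welton and Pasanisi et al. permit other choices); it shows only how the Beta parameters follow from the occurrence and transition counts.

```python
from scipy import stats

def transition_beta(n_initial, n_transitions, prior=(1.0, 1.0)):
    """Beta distribution over a single transition frequency.

    n_initial     -- occurrences of the initial (dry) state that
                     have a recorded successor
    n_transitions -- how many of those were followed by the state
                     of interest (another dry year)
    prior         -- (alpha, beta) pseudo-counts; (1, 1) is flat
    """
    alpha = prior[0] + n_transitions
    beta = prior[1] + (n_initial - n_transitions)
    return stats.beta(alpha, beta)

# Invented example: 7 dry years observed, 4 followed by another dry year
dist = transition_beta(7, 4)
point_estimate = dist.mean()        # posterior mean frequency
lo, hi = dist.interval(0.95)        # equal-tailed 95% credible interval
```

The width of the resulting interval makes explicit why, with only 20-year samples, differences in transition frequency between periods are rarely significant.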