TEXT SIZE

search for



CrossRef (0)
Uncertainty decomposition in climate-change impact assessments: a Bayesian perspective
Communications for Statistical Applications and Methods 2020;27:109-128
Published online January 31, 2020
© 2020 Korean Statistical Society.

Ilsang Ohna, Seung Beom Seob, Seonghyeon Kima, Young-Oh Kimc, Yongdai Kim1,a

aDepartment of Statistics, Seoul National University, Korea;
bInstitute of Engineering Research, Seoul National University, Korea;
cDepartment of Civil and Environmental Engineering, Seoul National University, Korea
Correspondence to: 1Department of Statistics, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul 08826, Korea. E-mail: ydkim0903@gmail.com
Received October 13, 2019; Revised December 2, 2019; Accepted December 3, 2019.
 Abstract
A climate-impact projection usually consists of several stages, and the uncertainty of the projection is known to be quite large. It is necessary to assess how much each stage contributed to the uncertainty. We call an uncertainty quantification method in which relative contribution of each stage can be evaluated as uncertainty decomposition. We propose a new Bayesian model for uncertainty decomposition in climate change impact assessments. The proposed Bayesian model can incorporate uncertainty of natural variability and utilize data in control period. We provide a simple and efficient Gibbs sampling algorithm using the auxiliary variable technique. We compare the proposed method with other existing uncertainty decomposition methods by analyzing streamflow data for Yongdam Dam basin located at Geum River in South Korea.
Keywords : climate change impact study, multimodel ensemble, Bayesian statistics, uncertainty decomposition
1. Introduction

The projected impacts of climate change involve large uncertainties. Considering these large uncertainties is critical to establish robust policies. For example, a water management plan which does not consider uncertainty causes large losses if extreme events occur that are not predicted by a projection of streamflow. Thus, quantifying uncertainties of a climate-impact projections has received significant attention.

A climate-impact projection consists of multiple stages. For example, a streamflow projection includes four stages: emission scenarios, global circulation models (GCMs), bias-correction techniques and hydrological models. Emission scenarios describe how future greenhouse gases emissions could evolve based on the set of assumptions about future energy use, technological change and population levels. GCMs are computational models simulating temperature and precipitation under the selected emission scenario. Bias correction techniques remove the systematic biases of the GCM outputs with respect to the observed data. Hydrological models produce projected values of the streamflow by using projected temperature and precipitation as an input. The projection of the streamflow in past can also be produced but in this case the projection is based on observed greenhouse gas emissions instead of the emission scenario. We use the terms control period and scenario period to distinguish the past and future periods of the projection. Usually the uncertainty of a climate-impact projection is measured by the projected data in the scenario period.

A climate-impact projection usually gathers multiple projection results based on several scenarios/models/techniques in each stage. Figure 1 is an example of a streamflow projection generated by two emission scenarios, five GCMs, three bias correction techniques and four hydrological models, which results in 120 (= 2 × 5 × 3 × 4) streamflow projection results.

The total of the uncertainties, is a mixture of uncertainties produced in each stage. However, quantifying the uncertainties of stages is not straightforward since we have only projected values obtained after the last stage. Various methods have been proposed in order to quantify uncertainties of each stage (e.g., Wilby and Harris, 2006; Bastola et al., 2011; Najafi et al., 2011; Dobler et al., 2012; Lee et al., 2017). Most of these studies quantified uncertainty of each source separately and compared the magnitudes of them. These studies, however, do not guarantee that the total sum of uncertainties of the multiple sources is equal to the total uncertainty, therefore they cannot assess the relative contribution of each source to the total uncertainty. It is also difficult to identify how uncertainty is propagated as uncertainty sources are added. For example, we would like to examine how uncertainty accumulates in a combination of emissions scenarios and GCMs compared to uncertainty with only emissions scenarios alone.

To address these issues, there is a need to quantify the uncertainty of each source such that it is possible to evaluate the relative contribution to the total uncertainty of each uncertainty source. We call such analysis uncertainty decomposition. One of the simplest approaches for uncertainty decomposition is to quantify the contribution of the each uncertainty source by partitioning total variance of the projected values into components attributable to the individual sources, as is done in the analysis of variance (AVOVA) framework. Based on the ANOVA approach, Yip et al. (2011) decomposed the total uncertainty of the surface air temperature projection into uncertainties of natural variability, emission scenarios, GCMs and interactions. Using the same method, Bosshard et al. (2013) decomposed the total uncertainty of the streamflow projection into uncertainties of GCMs, statistical post-processing, hydrological models and their interactions. On the other hand, Kim et al. (2019) proposed a cumulative uncertainty method that provides a unified view of how uncertainty is propagated as each source of the projection proceeds. Kim et al. (2019) considered a cascade of several stages (i.e., sources) for streamflow projection and quantified cumulative uncertainties as stages progress. Kim et al. (2019) defined uncertainties of individual stages as the successive differences of the cumulative uncertainties.

Even though several studies on uncertainty decomposition have been proposed, there are still many areas to be improved. The first problem of the existing methods is that they do not fully utilize the information in projected values. Note that usually projected values are given at every month, but they use the average of all projected values instead of individual values. For example, the previous methods cannot be incorporated interannual variations. This is because the methods quantify uncertainty based on the averages of projected values over the future period (e.g., 50 years from 2030).

The second problem is that there is no room to use observed values in the analysis. Climate projection typically consists of two periods - control and scenario periods. In the control periods, we have projected values as well as observed values of a meteorological/hydrological variable of interest, while we have only projected values in the scenario period. The existing methods totally ignore the projected and observed values in the control period that may not yield a fully efficient result.

This study provides a Bayesian model for decomposing the total uncertainty into uncertainties of multiple sources. A Bayesian approach to analyze multimodel ensemble data has recently attracted much attention. Bayesian methods can express the complex relationships within the data, and between the data and prior knowledge in a transparent and clear structure. In addition, Bayesian methods provide a probabilistic representation of uncertainty. Tebaldi et al. (2005) is the initial work of the Bayesian uncertainty quantification. They obtained a posterior density of mean temperature change based on temporally averaged values of historic observations, GCM outputs for control period and GCM outputs for a scenario period. Tebaldi and Sansó (2009) introduced a Bayesian model that simultaneously analyzed temperature and precipitation. Smith et al. (2009) proposed a single Bayesian model that simultaneously deal with data for several regions known as the “multivariate” model in the sense that projected/observed values over different regions are treated as a multivariate response. Unlike the model of Tebaldi et al. (2005), which treated the data for the single region, the multivariate model of Smith et al. (2009) can distinguish biases of the simulators to the true climate values. Buser et al. (2009) analyzed annual data instead of temporally averaged values. The use of annual data makes it possible to take into account natural variability and linear trends in control and scenario periods.

Our proposed Bayesian model improves existing methods of decomposing the total uncertainty. First, it can model changes of mean as well as interannual variations, which makes it possible to incorporate the uncertainty of natural variability. As an illustration, we use a model in which the mean of projected values can change. In addition, the Bayesian model can utilize the projected and observed values in the observational period scientifically, which we believe leads to a better conclusion.

This article is organized as follows. Section 2 summarizes existing uncertainty decomposition methods. Section 3 provides a proposed Bayesian uncertainty decomposition method. Section 4 describes the study area and data involved and compares the proposed method with other uncertainty decomposition methods by analyzing the data. Finally, Section 5 provides the conclusion.

Terminology

Throughout this article, we refer to both emission scenarios, GCMs, bias correction techniques and hydrological models as “simulators”. There are two reasons: 1) to prevent confusion with statistical “models” for the simulator outputs and the observations; 2) to simplify sentences, e.g., we write “simulator combinations” instead of “combinations of model/scenario/technique”.

2. Review of uncertainty decomposition methods

In this section, we review existing methods for uncertainty decomposition. Suppose there are K many stages of a simulator cascade for projection. For example, a simulator cascade for streamflow projection generally consists of four stages: emission scenarios, GCMs, bias correction techniques and hydrological models. Suppose that for stage k, there are nk simulators denoted by . Let Yv be the projected value of a meteorological/hydrological variable of interest at the scenario period when specific simulators in v are used in projection, where v X : = k = 1 K X k. For example, if v = (2, 4, 3), then Yv is the projected value when simulator 2 in stage 1, simulator 4 in stage 2 and simulator 3 in stage 3 are used.

First we note that the total uncertainty can be defined as

U tot = U ( { Y v : v X } ) ,

where U denotes an uncertainty measure such as range, variance or standard deviation. The objective of the uncertainty decomposition is to quantify the uncertainty Uk of stage k for k = 1, . . . , K, which satisfies k = 1 K U k = U tot.

The ANOVA approach (Yip et al., 2011; Bosshard et al., 2013) is based on the ANOVA decomposition. In the terminology of ANOVA, the stages are “effects”, which could have an influence on the variability of the variable. The ANOVA approach quantifies the influence of the each effect by partitioning the total sum of the squares into sums of squares due to individual effects and the sum of squares due to errors. The uncertainty of stage k defined by the ANOVA approach is then

U k anova = 1 n k i X k ( Y ¯ k , i - Y ¯ ) 2 ,

which is equivalent to the sum of squares due to stage k divided by the number of projected values . Here Y_ denotes the average of all projection values and Y_k,i denotes the average of projection values where simulator i in stage k is involved, that is, and Y ¯ k , i = v : v k = i Y v / l = 1 , l k K n l. We have relationship such that

U tot = k = 1 K U k anova + U err ,

where Utot is equal to and Uerr represents the variability not explained by the effects. Since we can write U k anova = U var ( { Y ¯ k , i : i X k } ) with Uvar being the variance measure, the ANOVA approach can be generalized to any other uncertainty measure U by replacing Uvar with U. This generalization is equivalent to the “standard method” referred in Kim et al. (2019).

The disadvantage of the ANOVA approach is that it ignores relationships or interactions between stages. To clarify the problem, consider a simple example where there are two stages having two simulators each. Suppose that four projection values are given as Y(1,1) = 0, Y(1,2) = 0, Y(2,1) = 10 and Y(2,2) = −10. Even though the final projection values significantly change with respect to the choice of the simulator in the first stage, the uncertainty of the first stage quantified by using the ANOVA approach is 0 since Y_1,1=Y_1,2=0.

To overcome this problem, Lee et al. (2017) proposed an uncertainty quantification method considering the interactions between simulators from different stages. Lee et al. (2017) defined the uncertainty of stage k by

U k lee = 1 n k i X k U ( { Y v : v k = i } ) .

For the previously described example, the uncertainty of the first stage is 50 when the variance measure is used. In this case, Lee’s approach quantifies the uncertainties of individual stages more reasonably than the ANOVA approach. But Lee’s approach does not decompose uncertainty properly because k = 1 K U k lee U tot.

Kim et al. (2019) proposed the cumulative uncertainty approach that is the modified version of Lee’s approach. The cumulative uncertainty approach captures the interaction effect reasonably and decomposes the total uncertainty properly. Kim et al. (2019) defines cumulative uncertainty upto stage k by

CU k = 1 n k + 1 n K i K X K i k + 1 X k + 1 U ( { Y v : v k + 1 = i k + 1 , , v K = i K } ) .

The cumulative uncertainty can be viewed as the average of conditional uncertainties of the stages up to k given the simulators in stages after k. Then they defines uncertainty of stage k by

U k cu = CU k - CU k - 1 .

with CU0 = 0. Kim et al. (2019) showed that U k cu’s are nonnegative for most reasonable uncertainty measures including range, variance and standard deviation. The cumulative uncertainty up to the last stage is equal to the total uncertainty; therefore, by definition, the total uncertainty is equal to the sum of the uncertainties of stages.

3. Bayesian model for uncertainty decomposition

3.1. Model

For the control period, we let Xt,w be the projected value of a meteorological/hydrological variable of interest at time when specific simulators in w are used in projection, where w k = 2 K X k. Here denotes the set of time points in the control period. For the control period, emission scenarios are not used for simulator combinations, thus the projection is generated by the simulators of stage 2 to stage K. We let Xt,o be the observed value at time . Similarly, we let Yt,v be the projected value of time at the scenario period when simulator combination is used, where denotes the set of time points in the scenario period. In this paper, we consider annual data, that is, elements in and are years in the control and scenario periods, respectively. Thus we can ignore seasonal variation. Note that we do not observe the “true” value Yt,o in the scenario period.

For the observed values in the control period, our Bayesian model assumes

X t , o ~ ind N ( μ o + γ c t , σ o 2 ) ,  듼  듼  듼 t T c ,

where γc represents a linear time trend, σ o 2 represents an interannual variation. Here we centered the time variable t around the median year of the control period so that μo can be interpreted as the (true) mean value of the meteorological/hydrological variable at the median year.

For the description of the distributions of the projected values in the control period, we let Δμc,w denote the bias of the mean of Xt,w relative to μo and σ c , w 2 denote the interannual variation of Xt,w. We permit the linear time trends in the projected values to be the same as the trend in the observed values. Our Bayesian model assumes that the projected values in the control period are distributed as

X t , w ~ ind N ( μ o + Δ μ c , w + γ c t , σ c , w 2 ) ,  듼  듼  듼 t T c , w k = 2 K X k .

We also centered the time variable t around the median year of the scenario period. We let μs be the mean of the all projected values at the median year in the scenario period and Δμs,v and σ s , v 2 be the bias of the mean of Yt,v relative to μs and the interannual variation of Yt,v, respectively. We let the linear time trends in the projected values be the same for all the simulator combinations. Our Bayesian model assumes that the projected values in the scenario period are distributed as

Y t , v ~ ind N ( μ s + Δ μ s , v + γ s t , σ s , v 2 ) ,  듼  듼  듼 t T s , v k = 1 K X k .

We impose a certain structural assumption on Δμc,w, Δμs,v, σ w 2, and σ v 2 for uncertainty decomposition. We assume that

Δ μ c , w = k = 2 K Δ μ c , k , w k ,  듼  듼  듼 σ c , w 2 = σ o 2 + k = 2 K σ k , w k 2 Δ μ s , v = k = 1 K Δ μ s , k , v k ,  듼  듼  듼 σ s , v 2 = σ 1 , v 1 2 + ν k = 2 K σ k , v k 2 ,

where σ k , i 2 represents the interannual variation of simulator i in stage k, Δμc,k,i’s and Δμs,k,i’s represent the biases of simulator i in stage k in the control and scenario periods respectively, and ν is the multiplicative change in simulator interannual variations between the control and scenario periods. That is, we assume that the bias (or the interannual variation) of each simulator combination is the sum of the biases (or the interannual variation) of the individual simulators that compose the simulator combination.

Note that we assume that { σ k , i 2 : i X k, k = 2, …, K} are shared by the control and scenario periods to borrow information in the control period for the inference of the scenario period.

Our Bayesian model is similar to the model of Buser et al. (2009). We simplify the model by removing unidentifiable parameters such as additive changes in simulator biases between the control and scenario periods. We complete such a modification since our focus is to decompose the total uncertainty rather than provide physical meanings of the model.

3.2. Uncertainty decomposition

The total uncertainty can be understood as the variance of all detrended projected values Y s = { Y t , v det : t T s , v X }, where Y t , v det = Y t , v - γ t and X = k = 1 K X k. Note that the distribution of a randomly selected value from is a mixture of Gaussians given as

f ( y ) = 1 X v N ( μ s + Δ μ s , v , σ s , v 2 ) .

We define the total uncertainty τtot as the variance of f (y). By the variance formula of mixture distributions, τtot is given as

τ tot = 1 X [ v σ s , v 2 + v ( Δ μ s , v - Δ μ ¯ ) 2 ] ,

where Δ μ ¯ = v Δ μ s , v / X .

Under the assumption in Subsection 3.1, τtot can be rewritten as

τ tot = k = 1 K τ k ,

where

τ k = 1 n k i = 1 n k [ ν k σ k , i 2 + ( Δ μ k , i - Δ μ k ¯ ) 2 ] ,

where νk = 1 for k = 1, νk = ν for k ≥ 2 and Δ μ k ¯ = i = 1 n k Δ μ s , k , i / n k. We can interpret τk as the uncertainty raised by stage k. The total uncertainty is then the sum of uncertainties of each stage, which is a property required by Jones (2000). Furthermore, we define the uncertainty of each simulator by

τ k , i = 1 n k ( ν k σ k , i 2 + ( Δ μ k , i - Δ μ k ¯ ) 2 )

which satisfies a natural requirement that τ k = i = 1 n k τ k , i.

Lee et al. (2017) also quantified the uncertainty of each simulator, but they considered only the dispersion of the projected values which use the corresponding simulator, not the bias to the total mean. However, our method quantifies the simulator uncertainty more reasonably by considering both bias and dispersion.

The above definition of the total uncertainty, uncertainties of each stage and each simulator are given as conditional on the parameters. Our Bayesian method estimates the posterior distribution of the parameters as well as the uncertainties based on the data. Considering the posterior distribution, we report the posterior expectations of the uncertainties as our estimated uncertainties. Let Z be the set of all projected and observed data and E(· |Z) denote the posterior expectation given the data Z. Then we define the total uncertainty, uncertainties of each stage and each simulator as

E ( τ tot Z ) = 1 X E [ v σ s , v 2 + v ( Δ μ s , v - Δ μ ¯ ) 2 Z ] , E ( τ k Z ) = 1 n k i = 1 n k E [ ν k σ k , i 2 + ( Δ μ k , i - Δ μ k ¯ ) 2 Z ] , E ( τ k , i Z ) = 1 n k E [ ν k σ k , i 2 + ( Δ μ k , i - Δ μ k ¯ ) 2 Z ] ,

respectively.

3.3. Prior and Markov chain Monte Carlo algorithm

We let the prior of our model be as

μ o , μ s ~ N ( 0 , q μ 2 ) , γ c , γ s ~ N ( 0 , q γ 2 ) , Δ μ c , k , i , Δ μ s , k , i ~ N ( 0 , 2 ) , ( 2 ) - 1 ~ Gamma ( a , b ) , ( σ o 2 ) - 1 , ( σ k , i 2 ) - 1 ~ Gamma ( a σ , b σ ) , ν - 1 α ~ Gamma ( α , α ) , α ~ Gamma ( a α , b α ) .

We take a hierarchical prior for the simulator biases to prevent a nonidentifiable problem. Note that if any constant C is added to all the Δμs,k,i’s and subtracted from all the Δμs,k′,i’s for k′k, the distributions of the data remain unchanged since they are determined only by the sums of the simulator biases Δ μ s , v = k = 1 K Δ μ s , k , v k. The identifiability can be improved by the prior N(0, 2) with (2)−1 ~ Gamma(a, b), which implies the constraint on the Δμs,k,i’s such that i = 1 n k Δ μ s , k , i = 0 (Jackman, 2009).

The hyperparameters a, b, aσ, bσ, aα, bα, q μ 2, and q γ 2 are selected so that the corresponding priors are diffused. In our case studies, we use a = b = aσ = aσ = aα = bα = 0.001 and q μ 2 = q γ 2 = 10 , 000.

The joint distribution of the data and parameters are proportional to

( 1 σ 0 2 ) T c 2 exp  ( - 1 2 σ 0 2 t T c ( X t , o - μ o - γ c t ) 2 ) × ( w ( 1 σ c , w 2 ) T c 2 ) exp  ( - w 1 2 σ c , w 2 t T c ( X t , w - μ o - Δ μ c , w - γ c t ) 2 ) × ( v ( 1 σ s , v 2 ) T s 2 ) exp  ( - v 1 2 σ s , v 2 t T s ( Y t , v - μ s - Δ μ s , v - γ s t ) 2 ) × exp  ( - 1 2 2 ( k = 2 K i = 1 n k Δ μ c , k , i 2 + k = 1 K i = 1 n k Δ μ s , k , i 2 ) ) × exp  ( - 1 2 q μ 2 ( μ o 2 + μ s 2 ) ) × exp  ( - 1 2 q γ 2 ( γ c 2 + γ s 2 ) ) × [ ( σ o 2 ) - a σ + 1 e - b σ ( σ o 2 ) - 1 ] × [ k = 1 K i = 1 n k ( σ k , i 2 ) - a σ + 1 e - b σ ( σ k , i 2 ) - 1 ] × [ ( ν ) - α + 1 e - α ( ν - 1 ) ] × [ α a α - 1 e - b α α ] × [ ( 2 ) - a + 1 e - b ( 2 ) - 1 ] .

The conditional distributions of the parameters (except the interannual variation parameters) for the Gibbs sampling algorithm are given as:

μ o - ~ N ( μ ˜ o , 1 T c ( ( σ o 2 ) - 1 + w ( σ c , w 2 ) - 1 ) + ( q μ 2 ) - 1 ) , μ s - ~ N ( μ ˜ s , 1 T s v ( σ s , v 2 ) - 1 + ( q μ 2 ) - 1 ) , γ c - ~ N ( γ ˜ c , 1 ( ( σ o 2 ) - 1 + w ( σ c , w 2 ) - 1 ) t T c t 2 + ( q γ 2 ) - 1 ) , γ s - ~ N ( γ ˜ s , 1 ( v ( σ s , v 2 ) - 1 ) t T s t 2 + ( q γ 2 ) - 1 ) , ( 2 ) - 1 - ~ Gamma ( a + 1 2 n 1 + k = 2 K n k , b + 1 2 ( k = 2 K i = 1 n k Δ μ c , k , i 2 + k = 1 K i = 1 n k Δ μ s , k , i 2 ) ) , Δ μ c , k , i - ~ N ( Δ μ ˜ c , k , i , 1 T c w : w k = i ( σ c , w 2 ) - 1 + ( 2 ) - 1 ) , Δ μ s , k , i - ~ N ( Δ μ ˜ s , k , i , 1 T s v : v k = i ( σ s , v 2 ) - 1 + ( 2 ) - 1 ) ,

where

μ ˜ o = ( σ o 2 ) - 1 t T C ( X t , o - γ c t ) + w { ( σ c , w 2 ) - 1 t T c ( X t , w - Δ μ c , w - γ c t ) } T c ( σ o 2 ) - 1 + T c w ( σ c , w 2 ) - 1 μ ˜ s = v { ( σ s , v 2 ) - 1 t T s ( Y t , v - Δ μ s , v - γ s t ) } T s v ( σ s , v 2 ) - 1 , γ ˜ c = ( σ o 2 ) - 1 t T c t ( X t , o - μ o ) + w ( σ c , w 2 ) - 1 t T c t ( X t , w - μ o - Δ μ c , w ) ( ( σ o 2 ) - 1 + w ( σ c , w 2 ) - 1 ) t T c t 2 + ( q γ 2 ) - 1 γ ˜ s = v ( σ s , v 2 ) - 1 t T s t ( Y t , v - μ s - Δ μ s , v ) ( v ( σ s , v 2 ) - 1 ) t T s t 2 + ( q γ 2 ) - 1 , Δ μ ˜ c , k , i = w : w k = i ( σ c , w 2 ) - 1 t T c ( X t , w - μ o - h { 2 , , K } \ { k } Δ μ c , h , w h - γ c t ) T c w : w k = i ( σ c , w 2 ) - 1 + ( 2 ) - 1 , Δ μ ˜ s , k , i = v : v k = i ( σ s , v 2 ) - 1 t T s ( Y t , v - μ s - h { 1 , , K } \ { k } Δ μ s , h , v h - γ s t ) T s v : v k = i ( σ s , v 2 ) - 1 + ( 2 ) - 1 .

The conditional distributions of the interannual variation parameters are awkward forms; therefore, we use an auxiliary variable (Damlen et al., 1999) instead of a straightforward Gibbs sampling algorithm technique. Note that the distributions of Xt,w and Yt,v can be written as

X t , w = d μ o + Δ μ w + γ c t + t , w , o c + k = 2 K t , w , k c , Y t , v = d μ s + Δ μ v + γ s t + k = 1 K t , v , k s ,

where t , w , o c ~ N ( 0 , σ o 2 ) , t , w , k c ~ N ( 0 , σ k , w k 2 ) , t , v , 1 s ~ N ( 0 , σ 1 , v 1 2 ) and t , v , k s ~ N ( 0 , ν σ k , v k 2 ) for k ≥ 2, and they are independent. The joint distribution of t , w , o c , t , w , 2 c , , t , w , K - 1 c , t , w , o c + k = 2 K t , w , k c is the normal distribution with mean 0 and variance

( σ o 2 0 0 σ o 2 0 σ 2 , w 2 2 0 σ 2 , w 2 2 0 0 σ K - 1 , w K - 1 2 σ K - 1 , w K - 1 2 σ o 2 σ 2 , w 2 2 σ K - 1 , w K - 1 2 σ o 2 + k = 2 K σ k , w k 2 ) .

Note that if a multivariate normal random vector x is partitioned into x = (x1, x2), then the distribution of x1 conditional on x2 = a is x 1 x 2 = a ~ N ( μ 1 + Σ 22 - 1 ( a - μ 2 ) , Σ 11 - Σ 12 Σ 22 - 1 Σ 12 ) where μi = Exi, ii = var(xi) for i = 1, 2 and 12 = cov(x1, x2). Thus the distribution of ( t , w , o c , t , w , 2 c , , t , w , K - 1 c ) conditional on t , w , o c + k = 2 K t , w , k c = X t , w - μ o - Δ μ w - γ c t is the normal distribution with mean

m c , w : = X t , w - μ o - Δ μ w - γ c t σ o 2 + k = 2 K σ k , w k 2 Σ w

and variance

S c , w : = diag ( σ o 2 , σ 2 , w 2 2 , , σ K - 1 , w K - 1 2 ) - 1 σ o 2 + k = 2 K σ k , w k 2 Σ w Σ w ,

where Σ w = ( σ o 2 , σ 2 , w 2 2 , , σ K - 1 , w K - 1 2 ) . We generate

( t , w , o c , t , w , 2 c , , t , w , K - 1 c ) ~ N ( m c , w , S c , w )

and calculate

t , w , K c = ( X t , w - μ o - Δ μ w - γ c t ) - ( t , w , o c + k = 2 K - 1 t , w , k c ) .

Similarly we generate

( t , v , 1 s , , t , v , K - 1 s ) ~ N ( m s , v , S s , v ) ,

where we define

m s , v : = ( Y t , v - μ s - Δ μ v - γ s t ) ( σ 1 , v 1 2 + ν k = 2 K σ k , v k 2 ) - 1 Σ v

and

S s , v : = diag  ( σ 1 , v 1 2 , ν σ 2 , v 2 2 , , ν σ K - 1 , w K - 1 2 ) - ( σ 1 , v 1 2 + ν k = 2 K σ k , v k 2 ) - 1 Σ v Σ v ,

where Σ v = ( σ 1 , v 1 2 , ν σ 2 , v 2 2 , , ν σ K - 1 , w K - 1 2 ) . Then we calculate

t , v , K s = ( Y t , v - μ s - Δ μ v - γ s t ) - ( t , v , o s + k = 1 K - 1 t , v , k s ) .

Then we can sample the interannual variation parameters from the following conditional distributions

( σ o 2 ) - 1 - ~ Gamma ( a σ + T c 2 k = 2 K n k , b ˜ σ o ) , ( σ k , i 2 ) - 1 - ~ Gamma  ( a σ + 1 2 ( h = 2 K n h n k T c I { k 1 } + X n k T s ) , b ˜ σ k , i ) , ( ν ) - 1 - ~ Gamma ( α + K - 1 2 X T s , b ˜ ν ) ,

where

b ˜ σ o = b σ + 1 2 { t T c ( X t , o - μ o - γ c t ) 2 + w t T c ( t , w , o c ) 2 } , b ˜ σ k , i = b σ + 1 2 { w : w k = i t T c ( t , w , k c ) 2 I { k 1 } + v : v k = i t T s ( t , v , k s ) 2 } , b ˜ ν = α + 1 2 v k = 2 K ( ( σ k , v k 2 ) - 1 t T s ( t , v , k s ) 2 ) .

Lastly, the Metropolis-Hasting algorithm is used for the updating of α. Following Smith et al. (2009), we propose α* = αeU–1/2 with U being a random number generated from the uniform distribution on (0, 1) and calculate

l ( α ) = α log ( α ) - log ( Γ ( α ) ) + ( α - 1 ) log ( ν ) - α ν - a α log ( α ) - b α α , l ( α * ) = α * log ( α * ) - log ( Γ ( α * ) ) + ( α * - 1 ) log ( ν ) - α * ν - a α log ( α * ) - b α α * .

Then set

α = { α * , with probability min  ( e l ( α * ) - l ( α ) , 1 ) , α , otherwise .

One iteration of the Gibbs sampler is summarized in Algorithm 1.

4. Case studies

4.1. Study area

The target watershed is Yongdam Dam located at the upper basin of the Geum River in South Korea (Figure 2). The Yongdam Dam basin is located at latitude 35°35′–36°00′ and longitude 127°20′−127°45′, and it lies on the borders of Chungcheongnam-do, Jeollabuk-do and Gyeongsangnam-do. The basin area is 930km2 accounting for about 9.5% of the Geum River basin area. The climate of the Geum River region dominated by the East Asian monsoon, which is divided into a warm and wet summer monsoon and a cold and dry winter monsoon. The monsoon drives an intense precipitation during the summer season, but sometimes it leads to drought. Therefore the projections for precipitation and streamflow are associated with a large uncertainty. Various water-related issues such as frequent extreme droughts and increased water demands have recently occurred in the Geum River region; therefore, uncertainty analysis is becoming increasingly important for water resource planning.

One iteration of the Gibbs sampler for Bayesian uncertainty decomposition

1 for w k = 2 K X k do
2  for do
3   Sample t , w , o c , t , w , 1 c , , t , v , K - 1 c according to Eq. (3.10) and compute t , w , K s according to Eq. (3.11)
4  end
5 end
6 for v k = 1 K X k do
7  for do
8   Sample t , v , 1 s , , t , v , K - 1 s according to Eq. (3.12) and compute t , v , K s according to Eq. (3.13)
9  end
10 end
11 Sample μ0 and μs according to Eq. (3.3) and Eq. (3.4).
12 Sample γc and γs according to Eq. (3.5) and Eq. (3.6).
13 Sample (2)−1 according to Eq. (3.7)
14 Sample ν according to Eq. (3.16)
15 Sample α according to Eq. (3.17)
16 Sample ( σ o 2 ) - 1 according to Eq. (3.14)
17 for k = 1, . . . , K do
18  for i = 1, . . . , nk do
19   Sample Δμc,k,i according to Eq. (3.8) (not for k = 1)
20   Sample Δμs,k,i according to Eq. (3.9)
21   Sample ( σ k , i 2 ) - 1 according to Eq. (3.15)
22  end
23 end

4.2. Data

Daily streamflow data for the Yongdam Dam basin were obtained from the Water Resources Management Information System webpage (http://www.wamis.go.kr). For the projection data, we used the emission scenarios, GCMs, bias correction techniques, and hydrological models presented in Table 1 or emission scenarios. We also adopted two representative concentration pathways (RCPs) scenarios, RCP4.5 and RCP8.5. Out of 27 GCMs contained in the Coupled Model Intercomparison Project Phase 5 (CMIP5), we selected the following four GCMs: CESM1-BGC (Moore et al., 2013), HadGEM2-ES (Collins et al., 2011), MIROC-ESM-CHEM (Watanabe et al., 2011), MRI-CGCM3 (Yukimoto et al., 2012). The Katsavounidis-Kuo-Zhang (KKZ) algorithm (Katsavounidis et al., 1994) was used for GCM selection. For bias correction techniques, bias correction spatial disaggregation (BCSD) (Wood et al., 2004) and bias correction constructed analogs (BCCA) (Hidalgo et al., 2008) were used to remove the systematic biases of GCM outputs with respect to the observations. Lastly, we used two hydrological models GR4J (Perrin et al., 2003) and the Tank model (Sugawara 1979). These two hydrological models have been widely applied to hydrological studies including flood control and groundwater modeling (Amireche et al., 2017).

Daily projection and observation data are aggregated to the annual data for two seasons, DJF (December, January and February) and JJA (June, July and August). Our model was applied independently for both seasons. The control period is from 1976 to 2005 and the scenario period is from 2016 to 2045.

Figures 3 and 4 draw the averaged projection values over the scenario period according to the simulators in each stage. For example, the dots above ‘GR4J’ in the right lower panel of Figure 3 are 16 projection values where hydrological model GR4J was used and the diamond shape dot is the average of the 16 projection values. The dots connected by lines are projection values using the same combination of simulators except the simulator at the x-axis. For the JJA season the hydrological model stage seems to be the most influential to the projection. The two hydrological models yield fairly different projection values. In contrast, the choice of bias correction technique does not appear to have a significant impact on projection values. The two projection values from the same combination of models, except the bias correction technique, are almost similar. For the DJF season, emission scenario and GCM seem to be more influential to the projection than the other stages.

4.3. Estimated posterior distributions

We ran a single Markov chain with length 60,000. We saved every 50th sample after a burn-in period of 10,000 samples and obtained 1,000 samples used for further analysis.

Figures 5 and 6 present the estimated posterior densities of some easily interpretable parameters for the JJA and DJF seasons, respectively. For the JJA season, there is an apparent mean shift between the control and scenario periods but not for the DJF season. For both seasons, the dispersion of μs is larger than that of μo, which is due to the absence of the true values Yt,o. On the other hand, the posterior densities of the natural variability parameters are almost similar for both the control and scenario periods in both seasons, which implies that the natural system is expected not to change much.

There is a reduction in the linear trend for the JJA season. It shows that the degree of increase in streamflow gradually decreases over time. For the DJF season, the linear trend is negative in the control period, but positive in the scenario period.

Tables 2 and 3 present the posterior expectations of bias, interannual variation and uncertainty of each simulator for the JJA and DJF seasons, respectively. For both seasons, the interannual variation of RCP4.5 is greater than that of RCP8.5 and the interannual variation of Tank hydrological model is greater than that of GR4J hydrological model. This implies that RCP4.5 and Tank lead to the larger interannual variations. On the other hand, the bias of Tank hydrological model is higher than the bias of the GR4J hydrological model for the JJA season, but the relationship is reverse for the DJF season. It implies that Tank hydrological model projects more extreme streamflow compared to the GR4J hydrological model.

4.4. Uncertainty decomposition

We compared the proposed Bayesian uncertainty decomposition method with the AVOVA approach (Yip et al., 2011; Bosshard et al., 2013) and the cumulative uncertainty approach (Kim et al., 2019). Both the ANOVA and cumulative uncertainty approach used only the average of all projected values over the scenario period.

Our Bayesian model can be applied when there is only data in the scenario period. In this case, we exclude the unnecessary parameters μo and ν. The total uncertainty and the uncertainty of stage k are then computed by using Equation (3.1) and Equation (3.2) with ν = 1, respectively. The purpose of this analysis is to examine the impact of the data in the control period on uncertainty decomposition.

Tables 4 and 5 summarize the results of the uncertainty decomposition using the ANOVA method, the cumulative uncertainty method and the proposed Bayesian method using whole data and the Bayesian method using only the data in the scenario period for the JJA and DJF seasons. The percentage in the bracket indicates the proportion of the uncertainty of each stage that contributed to the total uncertainty. For the uncertainty measure in the cumulative uncertainty approach, we used the variance to quantify uncertainties for fair comparison. For the Bayesian model, the presented uncertainties are posterior expectation values and the proportions are computed as the ratio of each posterior expectation to the posterior expectation of the total uncertainty.

For the JJA season, the bias correction technique is the smallest contributor to the total uncertainty and the hydrological model is the largest contributor for all four methods. The uncertainty of GCMs is larger than that of emission scenarios for the three other methods except for the Bayesian method using whole data. For the DJF season, the hydrological model is the largest contributor among the four simulator stages for the two Bayesian methods but the emission scenario is the largest contributor for the other two methods. This is due to the large interannual variation of the Tank simulator, which is quantified by Bayesian methods.

As mentioned in Section 1, unlike the other methods, the proposed Bayesian method can incorporate the uncertainty of natural variability. When we use the whole data in the control and scenario periods, the estimated proportion of uncertainty of natural variability is the fourth largest for the JJA season and the largest for the DJF season. However, the uncertainty of natural variability when using only the data in the scenario period is similar to the JJA season, but different for the DJF season. The inconsistency of the results is due to the absence of historical observations.

The uncertainty decomposition results between the Bayesian model using whole data and the model using only the data in the scenario period, are quite different. This implies that the information from the data in the control period depends significantly on the uncertainty decomposition results.

5. Concluding remarks

We proposed a Bayesian method for decomposing the total uncertainty into uncertainty of each source with the following advantages: (1) it can borrow the information from projected and observed data in the control period; and (2) it can quantify the uncertainty of natural variability. We also provide a simple and efficient Gibbs sampling algorithm using auxiliary variables.

Future research will apply the proposed Bayesian method to meteorological/hydrological variables that are not expected to follow a normal distribution, e.g., maximum precipitation which may be assumed to follow a generalized extreme value distribution as in Jo et al. (2016). Another potential extension of the proposed Bayesian method is to treat datasets for multiple regions within a single Bayesian model. This multivariate Bayesian model could ease the distributional assumption due to the plenty of the data. For example, the multivariate Bayesian model can introduce changes in parameters between the control and scenario periods or correlations between the uncertainty sources.

Acknowledgements

This work was supported by Korea Environmental Industry & Technology Institute (KEITI) through Advanced Water Management Research Program, funded by Korea Ministry of Environment (Grant. 83082)

Figures
Fig. 1. An example of the hydrological projection consisting of four stages.
Fig. 2. Location of Yongdam Dam basin. This figure is from Lee et al. (2013).
Fig. 3. Averaged projection values over the scenario period for the JJA season. We draw averaged projection values according to simulators in each stage. Each diamond shape dot is the average of the circle shaped dots on the same vertical line. The dots connected by lines are projection values using the same combination of simulators except the simulator at the x-axis. ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.
Fig. 4. Same as Figure 3, but for the DJF season. ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.
Fig. 5. Posterior densities for μo and μs (left); σ o 2 and ν σ o 2 (middle); γc and γs (right) for the JJA season. The solid black lines and the red dashed lines indicate the parameters in the control and scenario periods, respectively.
Fig. 6. Same as Figure 5, but for the DJF season.
TABLES

Table 1

Emission scenarios (ESs), global circulation models (GCMs), bias correction techniques (BCs), and hydrological models (HMs) used for streamflow projections

Stage Simulator
ESs RCP 4.5, RCP 8.5
GCMs CESM1-BGC, HadGEM2-ES, MIROC-ESM-CHEM, MRI-CGCM3
BCs BCCA, BCSD
HMs GR4J, Tank

Table 2

Posterior expectations of bias, variance, and uncertainty of each simulator for the JJA season

Stage Simulator Δμk,i σ k , i 2 τk,i
ES RCP 4.5 0.0358 5.5000 2.8207
RCP 8.5 −0.1292 2.7917 1.4372

GCM CESM1-BGC 0.6574 1.1452 0.3080
HadGEM2-ES 0.7098 6.6361 1.7193
MIROC-ESM-CHEM 0.6176 4.0201 1.0402
MRI-CGCM3 −0.0782 0.6669 0.2542

BC BCCA 0.9462 1.2556 0.6480
BCSD 0.9441 2.1061 1.0825

HM GR4J −0.5993 0.8461 1.5742
Tank 2.4136 14.5822 8.5870

ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.


Table 3

Posterior expectations of bias, variance, and uncertainty of each simulator for the DJF season

Stage Simulator Δμk,i σ k , i 2 τk,i
ES RCP 4.5 0.0697 0.0889 0.0463
RCP 8.5 −0.0644 0.0049 0.0048

GCM CESM1-BGC 0.0591 0.0132 0.0037
HadGEM2-ES 0.1579 0.0477 0.0129
MIROC-ESM-CHEM 0.1054 0.0360 0.0090
MRI-CGCM3 0.0619 0.0107 0.0031

BC BCCA 0.1701 0.0072 0.0036
BCSD 0.1679 0.0167 0.0084

HM GR4J 0.2019 0.0111 0.0058
Tank 0.1599 0.1726 0.0857

ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.


Table 4

Results of decomposition of the total uncertainty into the sources including NV, ESs, GCMs, BCs, and HMs for the JJA season

Source ANOVA Cumulative Bayesian (whole) Bayesian (scenario)
NV - - 2.5331 (11.51%) 2.2324 (9.63%)
ES 0.0150 (0.43%) 0.3027 (8.63%) 4.2579 (19.35%) 1.1619 (5.01%)
GCM 0.4751 (13.55%) 0.5276 (15.05%) 3.3217 (15.1%) 6.8689 (29.63%)
BC 0.0010 (0.03%) 0.0011 (0.03%) 1.7305 (7.86%) 1.2469 (5.38%)
HM 2.6745 (76.29%) 2.6745 (76.29%) 10.1611 (46.18%) 11.6702 (50.34%)
RSS 0.3403 (9.71%) - - -

ANOVA denotes the ANOVA approach and Cumulative denotes the cumulative uncertainty approach, and Bayesian (whole) and Bayesian (scenario) denote the proposed Bayesian methods using the whole data and using only the data in the scenario period, respectively. RSS stands for residual sum of squares for the ANOVA approach. NV = natural variability; ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.


Table 5

Results of decomposition of the total uncertainty into the sources including NV, ESs, GCMs, BCs, and HMs for the DJF season

Source ANOVA Cumulative Bayesian (whole) Bayesian (scenario)
NV - - 0.1314 (41.75%) 0.0310 (9.61%)
ES 0.0066 (28.26%) 0.0158 (67.47%) 0.0511 (16.24%) 0.0859 (26.61%)
GCM 0.0058 (24.93%) 0.0073 (30.99%) 0.0287 (9.13%) 0.0528 (16.36%)
BC 4 × 10−6 (0.02%) 1.48 × 10−4 (0.63%) 0.0120 (3.81%) 0.0349 (10.82%)
HM 2.12 × 10−4 (0.91%) 2.12 × 10−4 (0.91%) 0.0915 (29.07%) 0.1181 (36.60%)
RSS 0.0108 (45.88%) - - -

ANOVA denotes the ANOVA approach and Cumulative denotes the cumulative uncertainty approach, and Bayesian (whole) and Bayesian (scenario) denote the proposed Bayesian methods using whole data and only the data in the scenario period. RSS stands for residual sum of squares for the ANOVA approach. NV = natural variability; ES = emission scenarios; GCM = global circulation model; BC = bias correction technique; HM = hydrological model.


References
  1. Amireche M, Merabtene T, Bermad A, and Boutoutaou D (2017). Comparative assessment between GR model and tank model for rainfall-runoff analysis using Kalman filter-application to Algerian basins, MATEC Web of Conferences, 120, 05006.
    CrossRef
  2. Bastola S, Murphy C, and Sweeney J (2011). The role of hydrological modelling uncertainties in climate change impact assessments of Irish river catchments, Advances in Water Resources, 34, 562-576.
    CrossRef
  3. Bosshard T, Carambia M, Goergen K, Kotlarski S, Krahe P, Zappa M, and Schär C (2013). Quantifying uncertainty sources in an ensemble of hydrological climate-impact projections, Water Resources Research, 49, 1523-1536.
    CrossRef
  4. Buser CM, Künsch HR, Lüthi D, Wild M, and Schär C (2009). Bayesian multi-model projection of climate: bias assumptions and interannual variability, Climate Dynamics, 33, 849-868.
    CrossRef
  5. Collins W, Bellouin N, and Doutriaux-Boucher M et al (2011). Development and evaluation of an earthsystem model-hadgem2, Geoscientific Model Development, 4,.
    CrossRef
  6. Damlen P, Wakefield J, and Walker S (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61, 331-344.
    CrossRef
  7. Dobler C, Hagemann S, Wilby R, and Stötter J (2012). Quantifying different sources of uncertainty in hydrological projections in an alpine watershed, Hydrology and Earth System Sciences, 16, 4343-4360.
    CrossRef
  8. Hidalgo HG, Dettinger MD, and Cayan DR (2008). Downscaling with constructed analogues: daily precipitation and temperature fields over the united states, California Energy Commission PIER Final Project Report CEC-500-2007-123.
  9. Jackman S (2009). Bayesian Analysis for the Social Sciences, John Wiley & Sons, Chichester.
    CrossRef
  10. Jo S, Kim G, and Jeon JJ (2016). Bayesian analysis to detect abrupt changes in extreme hydrological processes, Journal of Hydrology, 538, 63-70.
    CrossRef
  11. Jones RN (2000). Managing uncertainty in climate change projections-issues for impact assessment, Climatic Change, 45, 403-419.
    CrossRef
  12. Katsavounidis I, Kuo CCJ, and Zhang Z (1994). A new initialization technique for generalized Lloyd iteration, IEEE Signal Processing Letters, 1, 144-146.
    CrossRef
  13. Kim Y, Ohn I, Lee JK, and Kim YO (2019). Generalizing uncertainty decomposition theory in climate change impact assessments, Journal of Hydrology X, 3, 100024.
    CrossRef
  14. Lee JK, Kim YO, and Kim Y (2017). A new uncertainty analysis in the climate change impact assessment, International Journal of Climatology, 37, 3837-3846.
    CrossRef
  15. Lee S, Kim J, and Hur JW (2013). Assessment of ecological flow rate by flow duration and environmental management class in the Geum River, Korea, Environmental Earth Sciences, 68, 1107-1118.
    CrossRef
  16. Moore JK, Lindsay K, Doney SC, Long MC, and Misumi K (2013). Marine ecosystem dynamics and biogeochemical cycling in the community earth system model [CESM1 (BGC)]: comparison of the 1990s with the 2090s under the RCP4.5 and RCP8.5 scenarios, Journal of Climate, 26, 9291-9312.
    CrossRef
  17. Najafi M, Moradkhani H, and Jung I (2011). Assessing the uncertainties of hydrologic model selection in climate change impact studies, Hydrological Processes, 25, 2814-2826.
    CrossRef
  18. Perrin C, Michel C, and Andréassian V (2003). Improvement of a parsimonious model for streamflow simulation, Journal of Hydrology, 279, 275-289.
    CrossRef
  19. Smith RL, Tebaldi C, Nychka D, and Mearns LO (2009). Bayesian modeling of uncertainty in ensembles of climate models, Journal of the American Statistical Association, 104, 97-116.
    CrossRef
  20. Sugawara M (1979). Automatic calibration of the tank model/L’étalonnage automatique d’un modèle à cisterne, Hydrological Sciences Journal, 24, 375-388.
    CrossRef
  21. Tebaldi C and Sansó B (2009). Joint projections of temperature and precipitation change from multiple climate models: a hierarchical Bayesian approach, Journal of the Royal Statistical Society: Series A (Statistics in Society), 172, 83-106.
    CrossRef
  22. Tebaldi C, Smith RL, Nychka D, and Mearns LO (2005). Quantifying uncertainty in projections of regional climate change: a Bayesian approach to the analysis of multimodel ensembles, Journal of Climate, 18, 1524-1540.
    CrossRef
  23. Watanabe S, Hajima T, and Sudo K et al (2011). MIROC-ESM 2010: model description and basic results of CMIP5-20c3m experiments, Geoscientific Model Development, 4, 845-872.
    CrossRef
  24. Wilby RL and Harris I (2006). A framework for assessing uncertainties in climate change impacts: low-flow scenarios for the River Thames, UK, Water Resources Research, 42.
    CrossRef
  25. Wood AW, Leung LR, Sridhar V, and Lettenmaier D (2004). Hydrologic implications of dynamical and statistical approaches to downscaling climate model outputs, Climatic Change, 62, 189-216.
    CrossRef
  26. Yip S, Ferro CA, Stephenson DB, and Hawkins E (2011). A simple, coherent framework for partitioning uncertainty in climate predictions, Journal of Climate, 24, 4634-4643.
    CrossRef