search for

CrossRef (0)
The Bivariate Kumaraswamy Weibull regression model: a complete classical and Bayesian analysis
Communications for Statistical Applications and Methods 2018;25:523-544
Published online September 30, 2018
© 2018 Korean Statistical Society.

Juliana B. Fachini-Gomesa, Edwin M. M. Ortegab, Gauss M. Cordeiroc, and Adriano K. Suzuki1,d

aDepartment of Statistics, University of Brasilia, Brazil, bDepartment of Exact Sciences, University of São Paulo, Brazil, cDepartment of Statistics, Federal University of Pernambuco, Brazil, dDepartment of Applied Mathematics and Statistics, University of São Paulo, Brazil
Correspondence to: 1Department of Applied Mathematics and Statistics, University of São Paulo, Avenida Trabalhador São-carlense, 400 - Centro CEP: 13566-590, São Carlos-SP, Brazil. E-mail: suzuki@icmc.usp.br
Received March 20, 2018; Revised June 23, 2018; Accepted August 14, 2018.

Bivariate distributions play a fundamental role in survival and reliability studies. We consider a regression model for bivariate survival times under right-censored based on the bivariate Kumaraswamy Weibull (Cordeiro et al., Journal of the Franklin Institute, 347, 1399–1429, 2010) distribution to model the dependence of bivariate survival data. We describe some structural properties of the marginal distributions. The method of maximum likelihood and a Bayesian procedure are adopted to estimate the model parameters. We use diagnostic measures based on the local influence and Bayesian case influence diagnostics to detect influential observations in the new model. We also show that the estimates in the bivariate Kumaraswamy Weibull regression model are robust to deal with the presence of outliers in the data. In addition, we use some measures of goodness-of-fit to evaluate the bivariate Kumaraswamy Weibull regression model. The methodology is illustrated by means of a real lifetime data set for kidney patients.

Keywords : Bayesian inference, bivariate failure time, censored data, diagnostics, survival analysis
1. Introduction

Statistical applications for the time of occurrence of an event of interest generally use the exponential, Weibull, gamma, log-normal and log-logistic distributions. However, there has been growing interest in new distributions to model skewness, kurtosis and different types of hazard rates. Among them, we cite the exponentiated Weibull (Mudholkar et al., 1995) and beta modified Weibull (Silva et al., 2010) distributions. More recently, Cordeiro et al. (2010) introduced the Kumaraswamy Weibull (KwW) distribution from the generator defined by Cordeiro and de Castro (2011). By considering more than one response variable in the experiment, Cordeiro et al. (2010) proposed the bivariate Kumaraswamy Weibull (BKwW) distribution based on the construction of the bivariate Weibull (Hougaard, 1986) distribution.

In practice, regressor variables associated with the response variable of each observation are always presented; however, statistical analysis always gives consideration to models that account for all the information existing in the observations. Due to these factors, this work extends the well-known KwW distribution to include covariates by means of the scale parameter leading to the BKwW regression model. The inferential part is conducted using the asymptotic distribution of the maximum likelihood estimators (MLEs) subject to restrictions on parameters. To implement this method, we use the adjusted barrier function (Lange, 1999). Situations with small samples may present difficult results to justify. We explore the use of a Bayesian method as an alternative to classic analysis. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian analysis for the regression model.

After fitting the data, it is important to check model assumptions and conduct a robustness study to detect influential or extreme observations that can cause distortions in the results of the analysis. Influence diagnostics is an important step in the analysis of a data set, since it provides an indication of bad model fit or influential observations. Cook (1986) proposed a diagnostic approach named local influence to assess the effect of small perturbations in the model and/or data on parameter estimates. Several authors have applied the local influence method in more general regression models than the normal regression model. Some authors have also investigated the assessment of local influence in survival analysis models. For instance, Ortega et al. (2013) proposed the log-beta Weibull regression model with application to predict recurrence of prostate cancer, Hashimoto et al. (2013) adapted local influence methods to log-generalized gamma regression model for interval-censored data, Ortega et al. (2015) considered the problem of assessing local influence in a power series beta Weibull regression model to predict breast carcinoma and da Cruz et al. (2016) explored global and local influence methods to the log-odd log-logistic Weibull regression model with censored data. For the BKwW regression model, we propose a similar method to detect influential subjects by considering the global and local influence and Bayesian case influence.

The article is organized as follows. In Section 2, we present the BKwW regression model and some properties of the marginal distributions. In Section 3, we examine the performance of the likelihood function by computing the maximum likelihood, while the estimated equations are considered under parameter constraints and derive several diagnostic measures by considering the normal curvatures of local influence under various perturbation schemes. In Section 4, we consider a Bayesian approach and influence diagnostics for the BKwW regression model. In Section 5, we conduct various simulation studies to evaluate the behavior of the estimators in the BKwW regression model. In Section 6, we present a reanalysis of the dataset from patients of a renal insufficiency study reported by McGilchrist and Aisbett (1991). Finally, Section 7 provides some conclusions remark.

2. A bivariate KwW regression model

In practice, the majority of studies involve covariates related to survival times. Regression models can be formulated in various ways. In survival analysis, the class of parametric regression models and Cox regression model are well-known. However, we adopt a reparameterization of the BKwW (Cordeiro et al., 2010) distribution to define a new regression model. The bivariate Kumaraswamy (BKw) cumulative distribution is defined by

F(t1,t2)=1-[1-G(t1,t2)a]b,         t1,t2>0,

where a > 0 and b > 0 are additional shape parameters, and G(t1, t2) is an arbitrary joint cumulative distribution function (cdf). We also consider the bivariate Weibull (Hougaard, 1986) distribution due to its evident applicability and popularity in the literature. Its cdf G(t1, t2) is given by


Let T1 and T2 be two non-negative random variables denoting the survival times of two components of a system. We define a bivariate random variable T = (T1, T2)T having a BKwW distribution. Its cumulative distribution (for tk > 0, k = 1, 2) follows from equations (2.1) and (2.2) as


where a > 0, b > 0, and ck > 0 are shape parameters, λk > 0 is a scale parameter and 0 < α ≤ 1 is an association parameter between T1 and T2. We propose the reparameterization λk = exp(−μk) and ck = σk−1, for which −∞ < μk < ∞ and 0 < σk < ∞. We introduce a vector of regressor variables x = (x0, x1, …, xp)T and the linear structure μ = xTβ, where β = (β0, β1, …, βp)T is the unknown parameter vector associated with the covariates. Therefore, the reparameterization can be expressed as: λk = exp(−xTβk) = exp[−(β0k x0 + β1k x1 + · · · + βpk xp)] and ck = 1/σk, for k = 1, 2. The BKwW regression model is defined by


with probability density function (pdf) given by


where the functions A(·), B(·), C(·), and G(·) are given in the Appendix.

To illustrate some possible shapes of the pdf, we set the parameter values μ1 = 4.5, σ1 = 1.5, μ2 = 5, σ2 = 1.5, a = 1.5, b = 0.9, and take α = 0.99 (when the lifetimes, t1 and t2, are independent), α = 0.50 (when there is some dependency between the lifetimes t1 and t2), and α = 0.01 (when the lifetimes, t1 and t2, are highly dependent), respectively. Figure 1 displays the plots of the pdf.

The corresponding marginal pdfs f(tk|x) and marginal cdfs F(tk|x) are given by




respectively, for k = 1, 2.

If |z| < 1 and b is real non-integer, the power series


holds, where the binomial coefficient is defined for any real number. Let Tk be a random variable with pdf (2.5) for k = 1, 2. We consider here only the general case: the parameters a and b are real non-integers. By applying (2.7) to (2.5) and expanding the binomial term, the marginal density function of Tk reduces to


where gλr,(k),ck (x) denotes the Weibull density function with shape parameter ck = σk−1 and scale parameter λr,k = (r + 1)σk exp(−xTβk) and the coefficients wr are given by


Equation (2.8) reveals that the marginal density function of Tk is an infinite linear combination of Weibull densities. We can easily check using computer software that r=0wr=1 as expected. Thus, the ordinary, inverse and factorial moments and generating function of Tk can be obtained from an infinite weighted linear combination of those quantities for Weibull distributions.

The quantile function of Tk is readily obtained by inverting (2.6) as


where λk = exp(−xTβk) and u is uniform on the unit interval (0, 1).

The survival function associated with (2.3) can be expressed as


where the functions F(t1, t2|x) and F(t1|x), F(t2|x) are defined in (2.3) and (2.6), respectively. Consequently, the joint survival function for the model (2.3) reduces to


The marginal survival function of Tk is given by

3. Classical inference and diagnostics analysis

Let (t1k, δ1k, x1), …, (tnk, δnk, xn) be an observed sample of n independent observations, where tik represents the failure-time or the censoring-time, δik is a censoring indicator and xi = (xi1, …, xip)T is the vector of explanatory variables associated with the ith individual, where k = 1, 2 and i = 1, 2, …, n. The likelihood function for bivariate data was considered by Lawless (2003) and He and Lawless (2005). By taking explanatory variables xi and the joint survival function (2.10), the log-likelihood function for the BKwW regression model follows by summing the contributions from each one of the n individuals


The density function f(ti1, ti2|xi) is defined in (2.4), where Ψ=(a,b,α,βkT,σk)T is an unknown parameter vector and βkT=(β0k,β1k,,βpk) for k = 1, 2. The dimension of Ψ is (2p + 7).

The log-likelihood function (3.1) has the following restrictions on the parameters: a > 0, b > 0, 0 < α ≤ 1, and σk > 0 for k = 1, 2 and i = 1, …, n. So, to continue the estimation process it is necessary to rewrite the log-likelihood function including restrictions on the parametric space. We then consider the general problem of maximizing (3.1) subject to the linear constraints ojTΨ-cj*0, where oj, j = 1, 2, …, q are (2p + 7) × 1 vectors and cj* are scalars, both known and fixed numbers. In this study, cj* has value zero for the cases: a > 0, b > 0, α > 0, and σk > 0 and value one for α ≤ 1. To solve this problem, we use the adaptive barrier method (Lange, 1999). Thus, the log-likelihood function subject to linear constraints on the parameters reduces to


where ϑ > 0 is the multiplier barrier term, ojTΨ-cj* is the linear inequality constraint for j = 1, …, q, Ψ=(a,b,α,βkT,σk)T, and βkT=(β0k,β1k,,βpk).

The MLEs under constraints on the parameters in Ψ can be calculated numerically by maximizing (3.2). We can adopt the statistical software R to compute the estimate Ψ̂. However, it is usually more convenient to use the constrOptim and function to maximize this function numerically. Initial values for Ψ̂ are taken from the fit of the bivariate Weibull regression model corresponding to a = b = 1.

Under standard regularity conditions, the asymptotic distribution of (Ψ̂Ψ) is multivariate normal N(2p+7)(0, I(Ψ)−1), where I(Ψ) is the expected information matrix for the (2p + 7) × 1 vector of parameters. The asymptotic covariance matrix I(Ψ) of Ψ̂−1 can be approximated by the (2p+7)×(2p+7) inverse of the observed information matrix R(Ψ, ϑ) = −{2lR(Ψ, ϑ)/(ΨΨT)} evaluated at Ψ = Ψ̂.

The approximate multivariate normal N(2p+7)(0, − R(Ψ, ϑ)−1) distribution can be used to construct confidence intervals for the parameters in Ψ in the usual way.

Besides estimation, hypothesis tests is another key issue. Let Ψ1 and Ψ2 be proper disjoint subsets of Ψ. Suppose that we want to test H0 : Ψ1 = Ψ10 versus H1 : Ψ1Ψ10, where Ψ2 is a nuisance parameter vector. Let Ψ̂0 be the MLEs under H0, and define the likelihood ratio (LR) statistic w = 2{(Ψ̂)−(Ψ̂0)}. Under H0 and standard regularity conditions, w converges to a chi-square distribution with dim(Ψ1) degrees of freedom.

3.1. Diagnostics analysis: global influence

The first tool to perform sensitivity analysis, is by means of global influence starting from case deletion (Cook, 1977), which is a common approach to study the effect of dropping the ith case from the data. Case deletion for model (2.3) can be expressed as


Accordingly, a quantity with subscript “(i)” means the original quantity with the ith observation deleted. For model (3.3), the log-likelihood function is denoted by lR(i) (Ψ, ϑ).

Let Ψ^(i)=(a^(i),b^(i),α^(i),β^k(i)T,σ^k(i))T be the MLEs under constraints on the parameters in Ψ obtained by maximizing lR(i)(Ψ, ϑ). To assess the influence of the ith observation on the MLEs under constraints on the parameters in Ψ, the idea is to compare the difference between Ψ̂(i) and Ψ̂. If deletion of an observation seriously influences the estimates, more attention should be directed to that observation. Hence, if Ψ̂(i) is far from Ψ̂, then the ith case is regarded as an influential observation. A first global influence measure of the ith observation is the standardized norm of Ψ̂(i)Ψ̂ (generalized Cook distance; GD) given by


So, we can assess the values of GDi(a), GDi(b), GDi(α), GDi(βk), and GDi(σk) to estimate the impact of the ith observation on the estimates of a, b, α, βk, and σk, respectively. Another popular measure of the difference between Ψ̂(i) and Ψ̂ is the likelihood displacement (LD)


3.2. Diagnostics analysis: local influence

Since regression models are sensitive to the underlying model assumptions, performing sensitivity analysis in general is strongly advisable. Another approach is suggested by Cook (1986), where instead of removing observations, weights are given to them. Local influence calculation can be conducted for model (2.3). If likelihood displacement LD(ω) = 2{l(Ψ̂) − l(Ψ̂ω)} is used, where Ψ̂ω denotes the MLE under the perturbed model, the normal curvature for Ψ at the direction d, ||d|| = 1, is given by Cd(Ψ) = 2|dTΔT[(Ψ)]−1Δd|, where Δ is a (2p + 7) × n matrix that depends on the perturbation scheme, and whose elements are given by Δvi = 2l(Ψ|ω)/Ψv∂ωi, i = 1, …, n and v = 1, 2, …, (2p + 7) evaluated at Ψ̂ and ω0, where ω0 is the no perturbation vector (Cook, 1986). For the bivariate regression model, the elements of (Ψ) can be evaluated numerically. We can also calculate normal curvatures Cd(a), Cd(b), Cd(α), Cd(βk), and Cd(σk) to perform various index plots, for instance, the index plot of dmax, the eigenvector corresponding to Cdmax, the largest eigenvalue of the matrix B = −ΔT[(Ψ)]−1Δ. The index plots of Cdi(a), Cdi(b), Cdi(α), Cdi(βk), and Cdi(σk), called total local influence, where di denotes an n × 1 vector of zeros with one at the ith position. Thus, the curvature at direction di takes the form Ci=2ΔiT[L¨(Ψ)]-1Δi, where ΔiT denotes the ith row of Δ. It is usual to show those cases such that Ci ≥ 2, where C¯=(1/n)i=1nCi.

Consider the vector of weights ω = (ω1, …, ωn)T. Under three perturbation schemes (case-weight perturbation, response perturbation and explanatory variable perturbation), we can easily derive from the log-likelihood (3.2) the matrix


where v = 1, …, (2p + 7) and i = 1, …, n.

4. Bayesian inference and influence diagnostics

In this section, we consider the Bayesian method as an alternative analysis to incorporate previous knowledge of the parameters through informative prior density functions.

Let (t1k, δ1k, x1), …, (tnk, δnk, xn) be an observed sample of n independent observations, where tik represents the failure-time or the censoring-time, δik is a censoring indicator and xi = (xi1, …, xip)T is the vector of explanatory variables associated with the ith individual, where k = 1, 2 and i = 1, …, n. The log-likelihood function l(Ψ) for the model parameters Ψ=(a,b,α,βkT,σ1,σ2)T of the BKwW regression model is given by equation (3.1).

We consider that a, b, α, βk, σ1, σ2 have independent priors,




Further, we assume the following prior distributions βj*k ~ N(0, 102) for k = 1, 2 and j* = 0, …, p, log(α/(1 + α)) ~ N(0, 102), log(σ1) ~ N(0, 102), log(σ2) ~ N(0, 102), log(a) ~ N(0, 102), and log(b) ~ N(0, 102). The normal distribution with mean μ and variance τ2 is denoted by N(μ, τ2). All the hyper-parameters have been specified to express non-informative priors.

By combining the likelihood function (exponential of the log-likelihood (3.1)) and the prior distribution (4.1), we obtain the joint posterior distribution, which is analytically intractable. Then, we have based our inference on the MCMC simulation methods. By changing variables ξ = (log[α/(1 + α)], log(σ1), log(σ2), βk, log(a), log(b)), the parameter space is transformed into ℛ(2p+7) (necessary for the work with Gaussian densities).

To implement the Metropolis-Hastings algorithm, we proceed as follows.

  • Start with any point ξ(0) and stage indicator ν = 0;

  • Generate a point ξ′ according to the transitional kernel Q(ξ′, ξν) = Np+2(ξν, ∑̃), where ∑̃ is the covariance matrix of ξ, which is the same in any stage;

  • Update ξ(ν) to ξ(ν+01) = ξ′ with probability pν*=min{1,π(ξD)/π(ξ(ν)D)}, or keep Ψ(ν) with probability 1-pν*;

  • Repeat Steps (2) and (3) by increasing the stage indicator until the process has reached a stationary distribution.

All computations are performed in R software (R Development Core Team, 2016). In all the work, after 20,000 sample burn-in, we use every tenth sample from the 60,000 MCMC posterior samples to reduce the autocorrelations and yield better convergence results, thus obtaining an effective sample of size 6,000 upon which the posterior is based on. We monitor the convergence of the Metropolis-Hasting algorithm using the method proposed by Geweke (1992) as well as trace plots.

4.1. Model comparison criteria

A variety of methodologies can be applied for comparing several competing models for a given dataset and selecting those which provide the best fits to the data. We consider some of the Bayesian model selection criteria, namely, the deviance information criterion (DIC) proposed by Spiegelhalter et al. (2002), the expected Akaike information criterion (EAIC) by Brooks (2002) and the expected Bayesian (or Schwarz) information criterion (EBIC) by Carlin and Louis (2001). Let Ψ(1), …, Ψ(Q) be a sample of size Q of after the burn-in, where denote the full data. They are based on the posterior mean of the deviance, which can be approximated by d¯=q*=1Qd(Ψq*)/Q, where d(Ψ)=-2i=1nlog[f(t1i,t2iΨ)]. The DIC criterion can be estimated using the MCMC output by DIC^=d¯+ρ^d=2d¯-d^, where ρD is the effective number of parameters defined as E{d(Ψ)} − d{E(Ψ)}, and d{E(Ψ)} is the deviance evaluated at the posterior mean.

Similarly, the EAIC and EBIC criteria can be estimated by EAIC^=d¯+2#(Ψ) and EBIC^=d¯+#(Ψ)log(n), where #(Ψ) is the number of model parameters. Comparing alternative models, the preferred model is the one with the smallest criteria values.

Other criteria such as LPML is derived from conditional predictive ordinate (CPO) statistics. Let denote the data with the deleted ith observation. We denote the posterior density of Ψ given by , i = 1, …, n. For the ith observation, CPOi is given by


The CPOi can be interpreted as the height of the marginal density of the time for an event at ti. Therefore, high CPOi implies a better fit of the model. No closed-form of CPOi is available for the proposed model. However, a Monte Carlo estimate of CPOi can be obtained by using a single MCMC sample from the posterior distribution . A Monte Carlo approximation of CPOi (Ibrahim et al., 2001) is given by


For model comparisons, we use the log pseudo marginal likelihood (LPML) defined by LPML=i=1nlog(CPOi^). The higher the LPML value, the better the fit of the model.

4.2. Bayesian case influence diagnostics

It is well known that regression models may be sensitive to underlying model assumptions. Therefore, a sensitivity analysis is strongly advisable. Cook (1986) uses this idea to motivate his assessment of influence analysis and suggests that more confidence should be put in a model relatively stable under small modifications. In order to investigate if some of the observations are influential for the Bayesian analysis, we consider the Bayesian case-deletion influence diagnostic measures for the joint posterior distribution based on the ψ-divergence (Peng and Dey, 1995)

Let Dψ(P, P(−i)) denote the ψ-divergence between P and P(−i), in which P denotes the posterior distribution of Ψ for the full data, and P(−i) denotes the posterior distribution of Ψ without the ith case. Therefore,


where ψ is a convex function with ψ(1) = 0. Several choices concerning the ψ are given by Dey and Birmiwal (1994). For example, ψ(z*) = −log(z*) defines the Kullback-Leibler (K-L) divergence, ψ(z*) = 0.5|z* − 1| defines the L1 norm (or variational distance) and ψ(z*) = (z* − 1) log(z*) gives the J-distance (or the symmetric version of the K-L divergence).

Let Ψ(1), . . . ,Ψ(Q) be a sample of size Q from . Then, Dψ(P, P(−i)) can be calculated by


where CPOi^={(1/Q)q*=1Q1/f(t1i,t2iΨ(q*))}-1 is the numerical approximation of the conditional predictive ordinate statistic of the ith observation (Ibrahim, 2001).

Note that Dψ(P, P(−i)) can be interpreted as the ψ-divergence of the effect of deleting the ith case from the full data on the joint posterior distribution of Ψ. As pointed out by Peng and Dey (1995), it may be difficult for a practitioner to judge the cutoff point of the divergence measure so as to determine whether a small subset of observations is influential or not. By using a biased coin procedure (Peng and Dey, 1995), which has probability value φ, the ψ-divergence between the biased and unbiased coins is given by


where f0(y) = φy(1 − φ)1−y and f1(y) = 0.5, y = 0, 1. If Dψ(f0, f1) = dψ(φ), it can be easily checked that dψ satisfies the following equation


It is not difficult to see for the divergence measures considered that dψ increases as φ moves away from 0.5. In addition, dψ(φ) is symmetric about φ = 0.5 and dψ achieves its minimum value at φ = 0.5. For this point, dψ(0.5) = 0 and f0 = f1. Therefore, if we consider φ > 0.90 (or φ ≤ 0.10) as a strong bias in a coin, then dK-L(0.90) = 0.51, dJ(0.90) = 0.88, and dL1(0.90) = 0.4. In addition, an observation which dJ > 0.88 can also be considered influential if we use the J-distance. Similarly, using the K-L divergence and the L1 norm, we can consider an influential observation when dK-L > 0.51 and dL1 > 0.4, respectively.

5. Simulation study: maximum likelihood and Bayesian estimation

A simulation study is conducted to evaluate the parameter estimates for the proposed model. The results are obtained from 1,000 Monte Carlo simulations using the R software. In each replication, a random sample of size n is drawn from the BKwW regression model and the parameters are estimated by maximum likelihood and Bayesian estimation.

The samples denoted by (t11, t21), . . . , (t1n, t2n) are generated according to the following steps:

  • Step 0: Set up the sample size n and start with stage indicator i = 1.

  • Step 1: Generate the covariate x1i from a Bernoulli distribution with parameter 0.5 and the censorship time C1i from a uniform distribution U(0, τ1), where τ1 controls the percentage of censored observations.

  • Step 2: Use the quantile function given in (2.9) to obtain


    where ui1 ~ U(0, 1) and λ1i = exp(−β01β11x1i).

  • Step 3: Compare T1i with C1i in order to determine the indicator of censorship δ1i and the observed value given by t1i = min(T1i,C1i).

  • Step 4: Generate the covariate x2i from a Bernoulli distribution with parameter 0.5 and the censorship time C2i from a uniform distribution U(0, τ2), where τ2 controls the percentage of censored observations.

  • Step 5: Next, T2i is generated using a random variable ηi ~ U(0, 1) and the solution of the nonlinear equation, exp(−[{−log(1 − (1 − (1 − ui1)1/b)1/a)}1/α + (−log(1 − u2i))1/α]α) − 1 + u2i + (1 − (1 − ui1)1/b)1/aηiui1 = 0, by considering T2i = (1/λ2i)[−log{1 − [1 − (1 − u2i)1/b]1/a}]σ2, where λ2i = exp(−β02β21x2i).

  • Step 6: Compare T2i with C2i in order to determine the indicator of censorship δ2i and the observed value given by t2i = min(T2i,C2i).

  • Step 7: Do i = i + 1. If i = n stop, else return to Step 1.

The simulation study is performed for n = 50, 100, and 150. We consider the following values for the parameters of the model: α = 0.75, σ1 = 1.0, β01 = 4.0, β11 = 0.75, σ2 = 1.25, β02 = 3.25, β12 = 1.75, a = 1.0, and b = 0.75. For all sample sizes, the percentage of censored observations in the times 1 and 2 are approximately 30% and 20%, respectively.

Tables 1 and 2 provide the averages (Mean), biases and the mean square errors (MSEs) of the MLEs and Bayesian estimates of the parameters in the BKwW regression model, respectively. We can note that the MSE values decrease when the sample size increases in agreement with first-order asymptotics.

5.1. Influence of outlying observations

A goal of this study is to show the need for robust models to deal with the presence of outliers in the data. We generate a sample of length 200 with fixed parameters α = 0.5, σ1 = 3.0, β01 = −2.5, β11 = −3.5, σ2 = 1.5, β02 = −2.0, β12 = −3.0, a = 1.25, and b = 0.5.

We select cases 35, 95, and 132 for perturbation. The perturbation scheme is structured as following. To create influential observations artificially in the dataset, we choose one, two or three of these selected cases. For each case, we perturb one or both lifetimes as follows i = ti + 5St, i = 1, 2, where St is the standard deviation of the ti’s. For case 35, we perturb only the lifetime t1 and, for case 132, the lifetime t2, and for case 95, both lifetimes are perturbed.

In this study, we consider eight setups. Dataset a: original dataset without outliers; Dataset b: data with outlier 35; Dataset c: data with outlier 95; Dataset d: data with outlier 132; Dataset e: data with outliers 35 and 95; Dataset f: data with outliers 35 and 132; Dataset g: data with outliers 95 and 132; and dataset h: data with outliers 35, 95, and 132. The MCMC computations are made similar to those in the last subsection using the Geweke’s convergence diagnostic (Geweke, 1992) to monitor the convergence of the Gibbs samples as well as trace plots.

Table 3 reveals that the posterior inferences about the parameters are sensitive to the perturbation of the selected case(s), except the β parameters. Table 4 gives the Monte Carlo estimates of the measures DIC, EAIC, EBIC, and LPML for each perturbed version of the original data. As expected, the original simulated data (dataset a) yields the best fit.

We consider the sample from the posterior distributions of the parameters of the BKwW regression model to calculate the ψ-divergence measures in (4.3). The results in Table 5 reveal, before perturbation (Dataset a), that the selected cases are not influential according to all ψ-divergence measures. However, after perturbation (Dataset b–h), the measures increase, which indicates that the perturbed cases are influential.

Figures 24 display the four ψ-divergence measures for cases (a), (f), and (h), respectively. We can note that all measures identify influential case(s) and provide larger ψ-divergence measures in comparison to the other cases.

6. Application: renal insufficiency data

In this section, we perform a statistical analysis on the renal insufficiency study originally presented by McGilchrist and Aisbett (1991) using the BKwW regression model. Recently, these data are analyzed by Barriga et al. (2010) by considering a location scale model for bivariate survival times based on the proposal of a copula to model the dependence of bivariate survival data. The dataset refers to the occurrence times of two distinct infection events in patients suffering from renal insufficiency. The patients use portable dialysis machines and the occurrence of infection at the catheter insertion point is considered. When this occurs, the catheter must be removed until the infection is cured, when the catheter is reinserted. The times between insertion of the catheter and occurrence of an infection are recorded. Therefore, various infections can occur in the patients, but only two are recorded in this dataset. There are also situations when the catheter must be removed for reasons other than infection, in which case there is right censoring of the data. The response variables of interest are the times between catheter insertion and infection. For each patient i, i = 1, 2, …, 38, the associated variables are: ti1: time to occurrence of the first infection, in weeks; ti2: time to occurrence of the second infection, in weeks; δi1: censoring indicator of event 1; δi2: censoring indicator of event 2; xi1: patients sex (0: male, 1: female).

We perform an exploratory analysis, mainly considering the variables referring to the occurrence times of the two events of interest. The estimated Kendall correlation coefficient is ρ̂K = 0.03, thus indicating an absence of correlation between the times of the events of interest. Despite this very low correlation, we adopt these data to illustrate the application of the BKwW regression model. The parameter referring to the association between the responses in the regression model must detect this independence, without impairing the model’s adjustment to these data. The Kaplan-Meier survival estimates and the marginal KwW distributions fitted to both events 1 and 2 are displayed in Figure 5. So, it is adequate to assume the BKwW model for the survival times. Since the KwW distribution is adequate for the times to events 1 and 2, it is coherent, for the case of bivariate data, to consider the bivariate Weibull cumulative distribution (Hougaard, 1986).

6.1. Results: classical inference

Based on the results of Sections 2 and 3, Table 6 lists the MLEs of the parameters for the BKwW regression model, their standard errors and p-values. The results indicate that male patients have accelerated times for the second infection than female patients. This fact can be confirmed visually by means of Figure 6. The estimated association parameter indicates that the lifetimes are independent and confirm the result obtained at the beginning of this section, when calculating ρK.

Diagnostic analysis

After modeling, it is important to verify whether there are observations that influence the fit of the BKwW regression model. To investigate this, we calculate global influence measures, likelihood distance (LDi(Ψ)) and generalized Cook distance (GDi(Ψ)), as defined in Section 3, using the R software. The results of these influence measure index plots are displayed in Figure 7. Based on these plots, we note that case ♯21 is possibly an influential observation.

We apply the local influence theory developed in Section 3.2, where case-weight perturbation is used, and obtain the value of the maximum curvature Cdmax (Ψ) = 2.83. Figure 8 shows the plot of the eigenvector corresponding to dmax and Ci versus the observation index.

The influence of perturbations on the observed survival times is now analyzed (response variable perturbation). The value of the maximum curvature is Cdmax (Ψ) = 1677.50. In Figure 9, we plot dmax and Ci versus the observation index. Figure 9 indicates that the case #29 is distinguished from the others.

Impact of the detected influential observations

We conclude that the diagnostic analysis (global influence and local influence) detect as potentially influential observations, the following two cases: ♯21 and ♯29. The observation ♯21 corresponds to a male with a longer time to first occurrence of the infection and a median time to occurrence of infection 2. The observation ♯29 is related to a male with less time until the occurrence of infection and a low time until the occurrence of infection 2. In order to reveal the impact of the two observations on the parameter estimates, we refit the model under some situations. First, we individually eliminate each one of these two observations. Next, we remove from the original dataset the totality of potentially influential observations.

Table 7 gives the relative changes (in percentages) of the estimates defined by RCΨv = [(Ψ̂vΨ̂v(I))/Ψ̂v] × 100, and the corresponding p-values, where Ψ̂v(I) is the MLE of Ψv after the “set I” of observations is removed.

The figures in Table 7 reveal that there are substantial changes in the estimated parameter values as well as changes in the set of parameters that show the model’s significance. Despite this sensitivity of the model, inclusion and exclusion of these points do not imply changes in the interpretation of the results, since there is no change in the sign of the coefficient referring to the sex variable, which indicates that men tend to suffer infections sooner than women.

6.2. Results: Bayesian analysis

Table 8 provides the means, standard deviations (SDs) and 95% Bayesian credible intervals (CI (95%)) for the estimates of the parameters in the BKwW regression model fitted to the original dataset and when two observations (♯15 and ♯21) are dropped.

The sample considered for the posterior distributions of the parameters of the BKwW regression model and the ψ-divergence measures in (4.3) are calculated. Figure 10 displays the index plot of the three ψ-divergence measures. The cases ♯15 and ♯21 are possible influential observations in the posterior distribution. The three influence diagnostic measures for these two observations are given in Table 9.

Also, we obtain the DIC, EAIC, EBIC, and LPML values to compare the impact of the influential points. Table 10 presents the values for the complete data, without observation ♯15, ♯21 and both of two cases. By dropping the observation ♯21, we obtain a better fit compared to the case without the observation ♯15.

6.3. Goodness-of-fit

We check the quality of the fitted regression model by means of plots of the Kaplan-Meier survival function and the estimated marginal survival functions for the BKwW regression model. Figure 11 shows that the proposed model is well adjusted, because its fitted survival function follows the Kaplan-Meier curve closely.

7. Conclusions and remarks

We propose the BKwW regression model as an extension of the bivariate Weibull distribution. We estimate the model parameters using the maximum likelihood methodology subject to linear restrictions in the parameters. We also perform a sensitivity analysis to assess the robustness of the results. We carry out the estimation method using the R software. Several functions are implemented as well as new functions with the partial derivatives of the log-likelihood function.

The optimization process is sensitive to the choice of the initial values used in the algorithms. To choose the initial regression parameters, we obtain the estimates considering marginal models for each response variable, while we choose the values a = 1 and b = 1 for the other parameters that therefore represent the special bivariate Weibull distribution. By using the global influence, local influence and total local influence techniques, we identify some possible influential points. We examine these points to detect possible errors in managing the dataset, but discard this possibility. Further, we discuss the use of MCMC methods as an alternative way to perform Bayesian inference for lifetime data that are supposed to follow the BKwW regression model. We also adopt Bayesian case influence diagnostics based on the K-L divergence to study the sensitivity of the Bayesian estimates under perturbations in the model/data. We verify the robustness of the results after re-estimating the model parameters after individual and joint exclusion of the identified points. Despite small changes in the estimates, the results and their interpretations are not influenced by the possible influential points indicated by the diagnostic techniques. The functions and techniques presented allow the utility of the BKwW regression model to analyze survival data with two response variables and covariates, and to perform sensitivity analysis to confirm the adequacy of the model assumptions and validate the obtained results.


Expressions for A(·), B(·), C(·), and G(·) described in Section 2 are given below:



Fig. 1. Joint probability density function for some parameter values.
Fig. 2. -divergence measures from Dataset (a).
Fig. 3. -divergence measures from Dataset (f).
Fig. 4. -divergence measures from Dataset (h).
Fig. 5. Marginal survival estimates by Kaplan-Meier and Kumaraswamy Weibull distribution based on the renal insufficiency data.
Fig. 6. Survival curves marginally estimated by the Kaplan-Meier method for the renal insufficiency data for each event by means of the sex variable.
Fig. 7. (a) Index plot of GDi() (generalized Cook distance). (b) Index plot of LDi() (likelihood distance) to the renal insufficiency data.
Fig. 8. (a) Index plot of max for (case-weight perturbation) and (b) total local influence for (case-weight perturbation) based on the fit of the BKwW model to the renal insufficiency data.
Fig. 9. (a) Index plot of max for (simultaneous response perturbation) and (b) total local influence for (simultaneous response perturbation) based on the model fitted to the renal insufficiency data.
Fig. 10. Index plots of -divergence measures for the renal insufficiency data.
Fig. 11. Kaplan-Meier curves stratified by gender (0 = masculine, 1 = feminine) and estimated survival functions for the renal insufficiency data.

Table 1

Mean estimates, Bias and MSEs of the MLEs of the parameters in the bivariate Kumaraswamy Weibull regression model

Parameter (true value)N = 50N = 100N = 150

α (0.75)0.82910.07910.01530.82650.07650.01050.82370.07370.0085
σ1 (1.00)1.44840.44840.28781.38760.38760.19081.37540.37540.1655
β01 (4.00)4.49530.49530.32994.55880.55880.35684.55280.55280.3350
β11 (0.75)0.7097−0.04030.10780.6589−0.09110.06390.6473−0.10270.0484
σ2 (1.25)1.58930.33930.28781.51720.26720.19081.50640.25640.1655
β02 (3.25)2.8722−0.37780.19352.8918−0.35820.15492.9043−0.34570.1355
β12 (1.75)1.7191−0.03090.07421.7039−0.04610.03861.6916−0.05840.0274
a (1.00)0.5705−0.42950.21160.5710−0.42900.19770.5669−0.43310.1963
b (0.75)0.90490.15490.07070.92560.17560.05170.91270.16270.0413

MSE = biases and the mean square error; MLE = maximum likelihood estimator.

Table 2

Mean estimates, biases and MSEs of the Bayesian estimates of the parameters in the bivariate Kumaraswamy Weibull regression model

Parameter (true value)N = 50N = 100N = 150

α (0.75)0.90270.15270.04330.86150.11150.02630.84660.09660.0187
σ1 (1.00)1.43230.43230.35821.38110.38110.22311.36890.36890.1859
β01 (4.00)4.53090.53090.46264.56990.56990.40574.55820.55820.3619
β11 (0.75)0.7189−0.03110.24390.6658−0.08420.11830.6542−0.09580.0815
σ2 (1.25)1.56420.31420.35821.50610.25610.22311.49230.24230.1859
β02 (3.25)2.8727−0.37730.24552.8966−0.35340.17162.9022−0.34780.1515
β12 (1.75)1.7407−0.00930.16191.7200−0.03000.07311.6987−0.05130.0493
a (1.00)0.5730−0.42700.25150.5732−0.42680.21400.5645−0.43550.2084
b (0.75)0.91530.16530.29870.91900.16900.10580.89730.14730.0559

MSE = biases and the mean square error; MLE = maximum likelihood estimator.

Table 3

Posterior means and standard deviations (SDs) for the bivariate Kumaraswamy Weibull regression model according to different perturbation schemes

DatasetPerturbed caseασ1β01β11σ2β02β12ab
Mean (SD)Mean (SD)Mean (SD)Mean (SD)Mean (SD)Mean (SD)Mean (SD)Mean (SD)Mean (SD)




e{35 95}0.61212.8289−2.2190−3.83971.1567−2.7151−3.6568)1.04160.5838

f{35, 132}0.39222.1309−2.2725−3.70820.7735−2.7021−3.46451.84191.2165

g{95, 132}0.39582.1574−2.2769−3.72040.7797−2.7047−3.51081.81691.1387

h{35, 95, 132}0.38032.0017−2.2713−3.76510.7451−2.7274−3.59251.94301.1302

Table 4

Bayesian criteria for each perturbed version by fitting the bivariate Kumaraswamy Weibull regression model according to different perturbation schemes

Data namesBayesian criteria


EAIC = expected Akaike information criterion; EBIC = expected Bayesian information criterion; DIC = deviance information criterion; LPML = log pseudo marginal likelihood.

Table 5

Case influence diagnostics for the simulated data

Data namesCase numberψ-divergence measure


Table 6

MLEs for the bivariate Kumaraswamy Weibull regression model based on the renal insufficiency data

ParameterEstimateStandard errorp-value

MLEs = maximum likelihood estimators.

Table 7

Relative changes, estimates, and the corresponding p-values in parentheses

SetIcompleteI – {21}I – {29}I – {21 and 29}









Table 8

Case influence diagnostics for renal insufficiency data

ParameterComplete data setDropped observations 15 and 21

MeanSDCI (95%)MeanSDCI (95%)

SD = standard deviation; CI = credible intervals.

Table 9

Case influence diagnostics for the renal insufficiency data

Case numberψ-divergence measure


Table 10

Bayesian criteria

Dropped observationCriterion

15 and 21628.4176642.6693618.2174−309.4217

EAIC = expected Akaike information criterion; EBIC = expected Bayesian information criterion; DIC = deviance information criterion; LPML = log pseudo marginal likelihood.

  1. Barriga, GDC, Louzada-Neto, F, Ortega, EMM, and Cancho, VG (2010). A bivariate regression model for matched paired survival data: local influence and residual analysis. Statistics Methods and Applications. 19, 477-496.
  2. Brooks, SP (2002). Discussion on the paper by Spiegelhalter, Best, Carlin, and van der Linde. Journal of the Royal Statistical Society Series B. 64, 616-618.
  3. Carlin, BP, and Louis, TA (2001). Bayes and Empirical Bayes Methods for Data Analysis. Boca Raton: Chapman & Hall/CRC
  4. Cook, RD (1977). Detection of influential observations in linear regression. Technometrics. 19, 15-18.
  5. Cook, RD (1986). Assessment of local influence (with discussion). Journal of the Royal Statistical Society B. 48, 133-169.
  6. Cordeiro, GM, and de Castro, M (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation. 81, 883-898.
  7. Cordeiro, GM, Ortega, EMM, and Nadarajah, S (2010). The Kumaraswamy Weibull distribution with application to failure data. Journal of the Franklin Institute. 347, 1399-1429.
  8. da Cruz, JN, Ortega, EMM, and Cordeiro, GM (2016). The log-odd log-logistic Weibull regression model: modelling, estimation, influence diagnostics and residual analysis. Journal of Statistical Computation and Simulation. 86, 1516-1538.
  9. Dey, D, and Birmiwal, L (1994). Robust Bayesian analysis using divergence measures. Statistics and Probability Letters. 20, 287-294.
  10. Geweke, J (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. Bayesian Statistics, Bernardo, JM, Berger, JO, Dawid, AP, and Smith, AFM, ed: Oxford University Press, pp. 169-188
  11. Hashimoto, EM, Ortega, EMM, Cancho, VG, and Cordeiro, GM (2013). On estimation and diagnostics analysis in log-generalized gamma regression model for interval-censored data. Statistics. 47, 379-398.
  12. He, W, and Lawless, JF (2005). Bivariate location-scale models for regression analysis, with applications to lifetime data. Journal of the Royal Statistical Society. 67, 63-78.
  13. Hougaard, P (1986). A class of multivariate failure time distributions. Biometrika. 73, 671-678.
  14. Ibrahim, JG, Chen, MH, and Sinha, D (2001). Bayesian Survival Analysis. New York: Springer
  15. Lange, K (1999). Numerical Analysis for Statisticians. New York: Springer
  16. Lawless, JF (2003). Statistical Models and Methods for Lifetime Data. New York: Wiley
  17. McGilchrist, CA, and Aisbett, CW (1991). Regression with frailty in survival analysis. Biometrics. 47, 461-466.
    Pubmed CrossRef
  18. Mudholkar, GS, Srivastava, DK, and Friemer, M (1995). The exponentiated Weibull family: a reanalysis of the bus-motor-failure data. Technometrics. 37, 436-445.
  19. Ortega, EMM, Cordeiro, GM, and Kattan, MW (2013). The log-beta Weibull regression model with application to predict recurrence of prostate cancer. Statistical Papers. 54, 113-132.
  20. Ortega, EMM, Cordeiro, GM, Campelo, AK, Kattan, MW, and Cancho, VG (2015). A power series beta Weibull regression model for predicting breast carcinoma. Statistics in Medicine. 34, 1366-1388.
    Pubmed CrossRef
  21. Peng, F, and Dey, D (1995). Bayesian analysis of outlier problems using divergence measures. Canadian Journal of Statistics. 23, 199-213.
  22. R Core Team (2016). R: a language and environment for statistical computing.URL https://www.R-project.org
  23. Silva, GO, Ortega, EMM, and Cordeiro, GM (2010). The beta modified Weibull distribution. Lifetime Data Analysis. 16, 409-430.
    Pubmed CrossRef
  24. Spiegelhalter, DJ, Best, NG, Carlin, BP, and van der Linde, A (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society Series B. 64, 583-639.