Recurrent event data are frequently encountered in clinical and observational studies. Examples include tumor recurrences, recurrent heart attacks, repeated hospitalization and insurance claims and so on. Such data can provide a history of the symptom of interest and be utilized to evaluate the long term effects of covariates. To analyze recurrent event data, various methods have been proposed for either the intensity function (Andersen and Gill, 1982) and the marginal rate function (Wei
Sometimes, a portion of subjects would not experience recurrent events of interest and these subjects might be regarded as cured or non-susceptible. A similar situation is discussed in a context of a failure time data. A cure rate model assumes there exists a risk-free group explaining the Kaplan-Meier survival curve with a long and stable plateau at the tail. An introduction of cure rate model makes it possible to estimate the effect of covariate on both the incidence as well as the latency of the event. Sy and Taylor (2000) applied a proportional hazard model and adopted an EM algorithm to recover the unknown susceptibility of censored subjects. Peng and Dear (2000) considered a frailty effect for a cure probability and Kim and Jhun (2008) also applied a frailty effect to interval censored failure time with cure proportion and Kim (2017) extended to bivariate interval censored data. There have been several recent studies using joint models that simultaneously consider both zero-inflated recurrent events and terminal events. For example, Liu
The motivation for this study is infection record data obtained from 210 child patients with acute myeloid leukemia (AML). During the chemotherapy course, patients had been repeatedly infected by bacterial, viral or fungal infection. The causes are classified into two types, the bacterial infection and the fungal or viral, as examples of a bivariate recurrent event. The bacterial infection more often occurred and 156 patients (74%) had experienced more than one infection while only 73 patients (35%) had been infected with fungal or viral. In order to reflect such different occurrence rates, separate cure rate models are implemented for bivariate recurrent event data. We consider each event has a separate susceptibility model for more general extension of univariate recurrent event with zero inflation to bivariate case.
In this article, we present a joint model for bivariate recurrent event data with zero inflation. The remainder of the article is organized as follows. After introducing notations and assumptions, a joint model is presented in Section 2 and an estimation procedure is described in Section 3. In Section 4, a simulation study is performed for assessing the finite sample properties of the proposed method. In Section 5, the proposed method is applied to a leukemia dataset. Finally, some concluding remarks are provided in Section 6.
Suppose there are
Let
Logistic regression models are used to model the susceptible probabilities,
A bivariate frailty effect
where
Denote
where
For easy implementation of computation, a piecewise constant rate is adopted as a baseline intensity function of recurrent event. Lawless and Zhan (1998), Liu
where
Let
where
The marginal log likelihood is obtained by integrating (3.3) with respect to a bivariate frailty density
An EM algorithm is applied to recover unknown quantities
We present the results from a simulation study to investigate the performance of the proposed estimation procedures. Five hundred replicates are generated and each one has a sample size of 100. A single binary covariate
For the values of
where
At the first and second settings, both events are assumed to have same susceptible rates. About 30% of subjects are unsusceptible(cure) and another 15% of susceptible subjects had no recurrent events. All estimation procedures are performed using Proc NLMIXED in SAS under a piecewise baseline intensity assumption. The results of Tables 1 and 2 with the same proportions of two recurrent events indicate that all parameter estimates have small biases and standard error (SEE) and empirical values (SEM) are almost same. Coverage probabilities (CP) are also close to 95%. To evaluate the effect of zero inflation model, the simulation results ignoring a cure rate model are presented and show
In this section, the proposed method was applied to a leukemia clinical trial data in which the chemotherapy was administered to 210 patients with AML from October 2002 to October 2008. During this period, patients experienced repeated bacteria, viral or fungal related infections. In this study, we investigated the effects of some covariates on two kinds of infections. The first infection is caused by bacteria and the second one is related to viral or fungal ones. The bacterial infection occurred more often and 156 patients had more than one infections while only 73 patients had been infected with fungal and viral. The range of each infection is [1, 6] and [1, 4] and the average recurrent numbers were 2.08 and 1.21, respectively.
To reflect this discrepancy, a joint model with a separate zero inflation rate model was implemented. Three covariates (age at the diagnosis of AML, the leukemia risk level and the dose of cytarabine given for the first course of chemotherapy) are included in the logistic models and hazard models. The average of age was 8.65 (year). The risk has three classes (low = 0 (
Using piecewise constants baseline hazards for
In the preceding sections, we presented a joint model for regression analysis of bivariate recurrent events in the presence of zero inflation. Bivariate frailty effect is also applied to model the correlation between two recurrent events. In particular, bivariate recurrent events are assumed to have different susceptibilities for each event. The simulation results indicated that the model ignoring zero inflation brings biased estimates of covariate effects and variance. Such modelling can be extended to connect the recurrent event and susceptibility. For example, as a referee pointed out, one type of infection can be applied with an ordinary intensity model without a cure model because of a more frequent occurrence.
Another extension is to include a terminal event process. A similar model was studied in the context of univariate recurrent event data with zero inflation as well as a terminal event by Liu
where
proc nlmixed data=spptotal
parms h1=5 h2=5 h3=5 h4=5 h5=5 beta11=0.5 gam01=-1.0 gam11=0.1
g1=5 g2=5 g3=5 g4=5 g5=5 beta21=0.5 gam02=-1.0 gam12=0.1
theta1=0.5 theta=0.5;
bounds theta theta1>0;
if $event_type$=1 then do; /** for the first event **/
eta1=gam01+gam11*x1; /*** cure fraction **/
prob1=1/(1+exp(-eta1));
basehaz1=h1*e1+h2*e2+h3*e3+h4*e4+h5*e5;
cumhaz1=h1*d1+h2*d2+h3*d3+h4*d4+h5*d5;
mu1=x1*beta11+nu1;
loglik01=-exp(mu1)*cumhaz1;
if delta=0 & rep=0 then loglik1=log(prob1+(1-prob1)*exp(loglik01));
/** log likelihood for subjects without the 1st recurrent event **/
if delta=1 then loglik1=log(basehaz1)+mu1;
/** log likelihood for the 1st recurrent event **/
if delta=0 & rep>0 then loglik1=log(1-prob1)+loglik01;
/*** log likelihood for subjects with 1st recurrent event and last
observation **/
end;
if $event_type$=2 then do; /** for the second event **/
eta2=gam02+gam12*x1; /*** cure fraction**/
prob2=1/(1+exp(-eta2));
basehaz2=g1*e1+g2*e2+g3*e3+g4*e4+g5*e5;
cumhaz2=g1*d1+g2*d2+g3*d3+g4*d4+g5*d5;
mu2=x1*beta21+nu2;
loglik02=-exp(mu2)*cumhaz2;
if delta=0 & rep=0 then loglik1=log(prob2+(1-prob2)*exp(loglik02));
/** log likelihood for subjects without the 2nd recurrent event **/
if delta=1 then loglik1=log(basehaz2)+mu2;
/**log likelihood for the 2nd recurrent event ***/
if delta=0 & rep>0 then loglik1=log(1-prob2)+loglik02;
/***log likelihood for subjects with 2nd recurrent event and last
observation **/
end;
loglik=loglik1;
model time~general(loglik);
random nu1 nu2 ~ normal([0,0],[ theta, theta12, theta1]) subject=id;
run;
Authors thank Drs. Rubnitz, Stanley Pounds and Liang Zhu for providing the dataset. This work was supported by a Korea Research Grant (NRF-2017R1D1A1B03030578).
Simulation 1: Same zero rates at two recurrent events with positive correlation
Parameter | Bivariate frailty | No zero inflation | |||||||
---|---|---|---|---|---|---|---|---|---|
Est | SSE | SEM | CP | Est | SSE | SEM | CP | ||
1st event | 0.496 | 0.196 | 0.184 | 0.940 | 0.375 | 0.345 | 0.336 | 0.926 | |
−1.031 | 0.350 | 0.342 | 0.954 | ||||||
0.124 | 0.462 | 0.489 | 0.964 | ||||||
2nd event | 0.497 | 0.185 | 0.185 | 0.954 | 0.380 | 0.349 | 0.339 | 0.926 | |
−1.021 | 0.369 | 0.339 | 0.950 | ||||||
0.120 | 0.490 | 0.467 | 0.956 | ||||||
Covariance | 0.486 | 0.115 | 0.115 | 0.930 | 2.372 | 0.600 | 0.520 | 0.000 | |
0.493 | 0.113 | 0.112 | 0.924 | 2.404 | 0.564 | 0.520 | 0.000 | ||
0.397 | 0.098 | 0.087 | 0.942 | 0.220 | 0.321 | 0.286 | 0.922 |
Est = estimate; SEE = standard error; SEM = empirical values; CP = coverage probabilities.
Simulation 2: Same zero rates at two recurrent events with negative correlation
Parameter | Bivariate frailty | No zero inflation | |||||||
---|---|---|---|---|---|---|---|---|---|
Est | SSE | SEE | CP | Est | SSE | SEE | CP | ||
1st event | 0.505 | 0.191 | 0.187 | 0.940 | 0.394 | 0.324 | 0.340 | 0.916 | |
−0.993 | 0.336 | 0.337 | 0.954 | ||||||
0.102 | 0.459 | 0.415 | 0.932 | ||||||
2nd event | 0.503 | 0.190 | 0.184 | 0.940 | 0.381 | 0.357 | 0.334 | 0.916 | |
−1.044 | 0.363 | 0.343 | 0.932 | ||||||
0.129 | 0.515 | 0.491 | 0.940 | ||||||
Covariance | 0.479 | 0.104 | 0.108 | 0.920 | 2.420 | 0.584 | 0.531 | 0.000 | |
0.484 | 0.119 | 0.109 | 0.896 | 2.336 | 0.542 | 0.510 | 0.000 | ||
−0.295 | 0.090 | 0.087 | 0.942 | −0.236 | 0.277 | 0.283 | 0.954 |
Est = estimate; SEE = standard error; SEM = empirical values; CP = coverage probabilities.
Simulation 3: Different zero rates at two recurrent events with positive correlation
Parameter | Bivariate frailty | No zero inflation | |||||||
---|---|---|---|---|---|---|---|---|---|
Est | SSE | SEE | CP | Est | SSE | SEE | CP | ||
1st event | 0.511 | 0.241 | 0.223 | 0.920 | 0.258 | 0.635 | 0.594 | 0.910 | |
0.103 | 0.297 | 0.294 | 0.960 | ||||||
0.088 | 0.422 | 0.411 | 0.944 | ||||||
2nd event | 0.495 | 0.194 | 0.189 | 0.938 | 0.401 | 0.351 | 0.336 | 0.938 | |
−1.012 | 0.346 | 0.341 | 0.960 | ||||||
0.098 | 0.490 | 0.471 | 0.962 | ||||||
Covariance | 0.481 | 0.137 | 0.136 | 0.924 | 6.441 | 1.133 | 1.608 | 0.000 | |
0.488 | 0.106 | 0.110 | 0.930 | 2.372 | 0.587 | 0.519 | 0.000 | ||
0.295 | 0.104 | 0.104 | 0.946 | 0.254 | 0.525 | 0.504 | 0.948 |
Est = estimate; SEE = standard error; SEM = empirical values; CP = coverage probabilities.
Simulation 4: Different zero rates at two recurrent events with negative correlation
Parameter | Bivariate frailty | No zero inflation | |||||||
---|---|---|---|---|---|---|---|---|---|
Est | SSE | SEE | CP | Est | SSE | SEE | CP | ||
1st event | 0.487 | 0.221 | 0.222 | 0.948 | 0.248 | 0.639 | 0.593 | 0.912 | |
0.105 | 0.315 | 0.294 | 0.942 | ||||||
0.082 | 0.432 | 0.414 | 0.942 | ||||||
2nd event | 0.489 | 0.192 | 0.185 | 0.930 | 0.380 | 0.315 | 0.336 | 0.944 | |
−1.022 | 0.345 | 0.341 | 0.970 | ||||||
0.095 | 0.464 | 0.470 | 0.970 | ||||||
Covariance | 0.468 | 0.132 | 0.133 | 0.902 | 6.383 | 1.168 | 1.593 | 0.000 | |
0.484 | 0.114 | 0.109 | 0.902 | 2.371 | 0.588 | 0.519 | 0.000 | ||
−0.289 | 0.103 | 0.103 | 0.940 | −0.168 | 0.513 | 0.497 | 0.954 |
Est = estimate; SEE = standard error; SEM = empirical values; CP = coverage probabilities.
Leukemia patients’ infection data
Parameter | Bivariate frailty | No zero inflation | Independence | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Est | SE | Est | SE | Est | SE | ||||||
Bacterial infection | logistic model | intercept | −2.562 | 7.384 | 0.729 | −2.010 | 1.751 | 0.242 | |||
age | −5.630 | 9.346 | 0.547 | −1.240 | 4.216 | 0.764 | |||||
Risk 1 | −0.115 | 6.418 | 0.987 | −1.333 | 2.390 | 0.564 | |||||
Risk 2 | 0.941 | 4.527 | 0.835 | 0.897 | 1.897 | 0.636 | |||||
dose | 0.230 | 3.844 | 0.952 | 0.896 | 1.611 | 0.579 | |||||
recurrent event | age | −0.025 | 0.009 | 0.012 | −0.024 | 0.009 | 0.015 | −0.025 | 0.009 | 0.006 | |
Risk 1 | −0.244 | 0.140 | 0.084 | −0.041 | 0.144 | 0.772 | −0.244 | 0.138 | 0.078 | ||
Risk 2 | −0.435 | 0.161 | 0.008 | −0.229 | 0.162 | 0.157 | −0.413 | 0.160 | 0.010 | ||
dose | 0.075 | 0.121 | 0.534 | 0.077 | 0.120 | 0.520 | 0.083 | 0.121 | 0.488 | ||
Fungal or viral infection | logistic model | intercept | −1.589 | 4.717 | 0.736 | −0.283 | 3.286 | 0.931 | |||
age | −4.625 | 6.426 | 0.473 | −1.872 | 2.668 | 0.484 | |||||
Risk 1 | 0.278 | 5.045 | 0.956 | −1.362 | 2.941 | 0.644 | |||||
Risk 2 | 1.086 | 1.540 | 0.481 | −1.089 | 1.358 | 0.423 | |||||
dose | 0.412 | 2.384 | 0.862 | 0.374 | 0.574 | 0.516 | |||||
recurrent event | age | −0.010 | 0.019 | 0.609 | −0.004 | 0.023 | 0.836 | −0.015 | 0.019 | 0.430 | |
Risk 1 | −0.183 | 0.290 | 0.530 | 0.338 | 0.412 | 0.413 | −0.140 | 0.269 | 0.602 | ||
Risk 2 | −0.170 | 0.324 | 0.606 | 0.406 | 0.491 | 0.408 | −0.213 | 0.297 | 0.474 | ||
dose | 0.311 | 0.252 | 0.217 | 0.214 | 0.301 | 0.475 | 0.231 | 0.226 | 0.307 | ||
covariance | 0.111 | 0.081 | 0.174 | 0.096 | 0.066 | 0.152 | 0.080 | 0.071 | 0.260 | ||
0.599 | 0.339 | 0.078 | 1.186 | 1.009 | 0.239 | 0.257 | 0.229 | 0.262 | |||
0.239 | 0.113 | 0.035 | 0.338 | 0.147 | 0.022 |
Est = cure rates; SE = standard errors.