In this paper, we propose a new four-parameter long-term lifetime distribution for a competing risks scenario, namely the Weibull-Poisson long-term distribution, whose hazard rate function can be decreasing, increasing, or unimodal. The distribution arises from a latent competing risks scenario in which the lifetime associated with a particular risk is not observable and only the minimum lifetime among all risks is observed, in a long-term context. It can nevertheless be used in any other situation, as long as it fits the data well. The new exponential-Poisson long-term distribution and the Weibull long-term distribution are obtained as particular cases of the Weibull-Poisson long-term distribution. We discuss the properties of the proposed distribution, including its probability density, survival, and hazard functions, and give explicit algebraic formulas for its order statistics. Assuming censored data, we consider the maximum likelihood approach for parameter estimation. Simulation studies were performed for different parameter settings, sample sizes, and censoring percentages to study the mean squared error of the maximum likelihood estimates and to compare the performance of the proposed model with that of its particular cases. The Akaike information criterion, the Bayesian information criterion, and the likelihood ratio test were used for model selection. The relevance of the approach is illustrated on two real datasets, where the new model is compared with its particular cases, showing its potential and competitiveness.
Survival data in the presence of competing risks arise in several areas, such as public health, actuarial science, biomedical studies, demography, and industrial reliability. A classical competing risks scenario occurs when there is no information about which risk was responsible for the component failure (or individual death) and only the minimum lifetime among all risks is observed. In many situations this information is not accessible, or it is impossible to specify the true cause of failure. This occurs, for example, when the interest lies in the lifetime of a series system, whose lifetime depends on a set of components.
In recent years, several authors have proposed probability distributions that properly accommodate survival data in the presence of latent competing risks. For example, Adamidis and Loukas (1998) proposed a compound distribution, the exponential geometric (EG) distribution. Kuş (2007) proposed another compound distribution for the same setting, the exponential-Poisson (EP) distribution. Tahmasbi and Rezaei (2008) introduced the logarithmic exponential distribution. Chahkandi and Ganjali (2009) introduced the exponential power series family, which contains the EG, EP, and logarithmic exponential distributions as special cases. Louzada
Another characteristic of survival data arises when part of the population is not susceptible to the event of interest and is considered immune or cured. Models that allow for a cured fraction have been widely developed and are usually called long-term survival models. For instance, in clinical studies part of a population can respond favorably to a treatment and therefore be considered cured. Perhaps the most popular type of long-term model was introduced by Boag (1949) and Berkson and Gage (1952). In this model, it is assumed that a certain proportion of the patients, say
In this paper, we propose a Weibull-Poisson (WP) long-term model. This new model is based on the WP distribution, Bereta
Owing to its flexibility in accommodating various forms of the risk function, the new model can be used in a variety of survival data modeling problems in a latent competing risks scenario with long-term survivors. In addition, the LWP model is suitable for testing the goodness of fit of some special sub-models, such as the new exponential-Poisson long-term distribution (LEP) and the Weibull long-term distribution (LW). We demonstrate, by means of an application to real data, that the LWP model can produce better fits than some other known models. It therefore represents a good alternative for lifetime data analysis, and we hope this generalization may attract wider applications in survival analysis. Inference for the model is carried out using the asymptotic distribution of the maximum likelihood estimators. Because the LEP and LW models are nested in the LWP model, the likelihood ratio (LR) test can be used to discriminate between them. Monte Carlo simulation studies were conducted to evaluate the performance of the LWP distribution through the means and mean squared errors (MSEs) of the maximum likelihood estimates (MLEs), the power of the test, and the Akaike information criterion (AIC) and Bayesian information criterion (BIC) for model selection.
This paper is organized as follows. Section 2 introduces the LWP distribution and presents some forms of its hazard rate function. We also described its
Let
where
The WP distribution results from latent competing risks scenarios in which the lifetime associated with a particular risk is not observable and only the minimum lifetime among all risks is observed. Applications of the WP distribution in survival studies were investigated by Bereta
where
The pdf (
Figure 1 illustrates some of the possible shapes of the hazard function
The LWP distribution opens new possibilities for fitting several types of data. It is observed that when
Order statistics play an important role in quality control testing and reliability, where a practitioner needs to predict the failure of future items based on the times of a few early failures. These predictors are often based on moments of order statistics (Louzada
We now derive an explicit expression for the density of the
where
But the order statistic of a WP distribution is given by,
where
Let
where
MLEs for parameter vector
The components of the score vector
where
Because the system of likelihood equations has no closed-form solution, the estimates of the parameters were obtained numerically by an iterative process. We used the command
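To make the numerical step concrete, the following is a minimal sketch of the censored log-likelihood that an iterative optimizer would maximize. It assumes a Weibull baseline with survival `exp(-alpha * t**gamma)`, a zero-truncated Poisson number of latent risks with rate `lam`, and a cured proportion `p`. These symbols, the parameterization, and the function names are illustrative assumptions, not the authors' notation or code.

```python
import math

def wp_survival(t, alpha, gamma, lam):
    """Survival of the minimum of M Weibull lifetimes, M ~ zero-truncated Poisson(lam)."""
    s0 = math.exp(-alpha * t ** gamma)            # assumed Weibull baseline survival
    return math.expm1(lam * s0) / math.expm1(lam)

def wp_pdf(t, alpha, gamma, lam):
    """Density obtained by differentiating wp_survival with respect to t."""
    s0 = math.exp(-alpha * t ** gamma)
    f0 = alpha * gamma * t ** (gamma - 1) * s0    # Weibull baseline density
    return lam * f0 * math.exp(lam * s0) / math.expm1(lam)

def lwp_loglik(params, times, events):
    """Log-likelihood of a mixture cure (long-term) model under right censoring.

    events[i] = 1 for an observed failure, 0 for a censored time.
    """
    alpha, gamma, lam, p = params
    ll = 0.0
    for t, d in zip(times, events):
        if d == 1:
            # Failures can only come from the susceptible fraction (1 - p).
            ll += math.log((1.0 - p) * wp_pdf(t, alpha, gamma, lam))
        else:
            # A censored individual is either cured or still at risk.
            ll += math.log(p + (1.0 - p) * wp_survival(t, alpha, gamma, lam))
    return ll
```

In practice the negative of `lwp_loglik` would be handed to a general-purpose optimizer, with the constraints `alpha, gamma, lam > 0` and `0 < p < 1` handled through bounds or a reparameterization.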
The AIC, the BIC, and the LR test were used to select the model that best fits the data. The AIC and BIC are defined by
where
Because the LWP distribution reduces to the LEP and LW distributions, the LR test can be used to discriminate between these nested models. We compute the maximum values of the unrestricted and restricted log-likelihoods to construct LR statistics for testing the sub-models. The statistic (
To compare the performance of the LWP model with that of the LEP and LW models and to assess the performance of the MLEs for the parameters of the new model, a simulation study was carried out for sample sizes 40, 100, 400, 600, and 800, with 40% and 60% censored observations in each sample. For each setting, 1,000 random samples were generated. In this study, the survival time of
Generate
Generate
where
Generate the censoring variable
Find
If
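The data-generating idea can be sketched directly from the latent competing risks construction: a cured individual never fails, and a susceptible individual fails at the minimum of a zero-truncated Poisson number of Weibull lifetimes. The sketch below is written under stated assumptions, namely uniform censoring on (0, `c_max`), Python's scale/shape Weibull parameterization, and hypothetical names (`simulate_lwp`, `p_cured`, `c_max`). It is not the authors' simulation code, and the paper's own steps may differ in detail.

```python
import math
import random

def rztpois(lam, rng):
    """Draw from a zero-truncated Poisson(lam) by inversion of its cdf."""
    u = rng.random()
    m = 1
    prob = lam * math.exp(-lam) / (1.0 - math.exp(-lam))  # P(M = 1)
    cdf = prob
    # Guard on m and prob so rounding in the far tail cannot loop forever.
    while u > cdf and prob > 0.0 and m < 1000:
        m += 1
        prob *= lam / m
        cdf += prob
    return m

def simulate_lwp(n, scale, shape, lam, p_cured, c_max, seed=0):
    """Return n pairs (observed time, event indicator) with right censoring."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        c = rng.uniform(0.0, c_max)            # assumed uniform censoring time
        if rng.random() < p_cured:
            t = math.inf                        # cured: the event never occurs
        else:
            m = rztpois(lam, rng)               # number of latent risks
            t = min(rng.weibullvariate(scale, shape) for _ in range(m))
        sample.append((min(t, c), int(t <= c)))
    return sample
```

With this design the censoring percentage is controlled jointly by `p_cured` and `c_max`, so `c_max` would have to be tuned to reach a target such as 40% or 60%.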
For each combination of
Table 1 shows the proportion of times that the AIC and BIC of the LWP model were lower than those of the LW and LEP models. The power of the test at the 5% significance level was also calculated for different sample sizes, percentages of censored observations, and parameter values of the new model. Table 1 indicates that the proportion of times the AIC of the LWP model was lower than that of the LW model was between 35.2% and 82.5% for
For the BIC, the proportion varied between 0.15% and 46.60%. For
The highest AIC proportions occurred in the scenario in which the parameter
The power of the test varied between 8.3% and 78.3%. However, the minimum power of the test was 40.4% when
A simulation study was then carried out with censoring times
In this section, we compare the LWP distribution with the LW and LEP distributions on two datasets to identify an appropriate survival time model. The first dataset concerns the time (in days) until recurrence to crime of 477 individuals serving sentences in a semi-open regime, in which 60% of the observations are censored. The second dataset was extracted from Kalbfleisch and Prentice (2002) and refers to the survival time (in days) of 195 patients with carcinoma of the oropharynx, in which 30% of the observations are censored. Before fitting a model, we examined the shape of the risk function of the observed times using a graphical method based on the total time on test (TTT), known as the TTT plot. This method is useful for obtaining qualitative information about the shape of the risk function of the studied variable. According to Aarset (1987), the empirical version of the TTT plot is given by
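For reference, the following is a minimal sketch of the empirical TTT transform in its standard form, G(r/n) = [Y_(1) + ... + Y_(r) + (n - r) Y_(r)] / [Y_(1) + ... + Y_(n)], where Y_(1) <= ... <= Y_(n) are the ordered times and r = 1, ..., n. The function name `ttt_transform` is hypothetical, and the sketch treats all observed times alike, ignoring the censoring indicator, which is a simplifying assumption.

```python
def ttt_transform(times):
    """Return the points (r/n, G(r/n)) of the empirical TTT plot."""
    y = sorted(times)
    n = len(y)
    total = sum(y)
    points = []
    running = 0.0
    for r, y_r in enumerate(y, start=1):
        running += y_r                              # sum of the r smallest times
        g = (running + (n - r) * y_r) / total       # scaled total time on test
        points.append((r / n, g))
    return points

if __name__ == "__main__":
    for x, g in ttt_transform([3.0, 1.0, 2.0]):
        print(f"{x:.3f} {g:.3f}")
```

Plotting G(r/n) against r/n and comparing the curve with the diagonal gives the usual reading: a concave curve suggests an increasing hazard and a convex curve a decreasing one.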
We also compared the LWP distribution with its particular cases using the AIC and BIC. The AIC and BIC values are presented in Table 4 together with the log-likelihood
Because the LWP distribution reduces to the LEP and LW distributions, a likelihood ratio test was used to select the model that best fits the data. The test procedure is described in Section 3.2. The hypotheses for the LEP and LWP distributions were:
This paper proposed an LWP model. This model is an extension of the WP distribution proposed by Bereta
Proportion of the AIC, BIC, and power of the test of LWP, LW, and LEP models for different values of
| | Sample size | % of censorship | LW model: AIC | LW model: BIC | LW model: power of the test | LEP model: AIC | LEP model: BIC | LEP model: power of the test |
|---|---|---|---|---|---|---|---|---|
| 0.20 | 40 | 40% | 0.089 | 0.015 | 0.048 | 0.217 | 0.002 | 0.142 |
| | | 60% | 0.052 | 0.010 | 0.070 | 0.270 | 0.023 | 0.168 |
| | 100 | 40% | 0.298 | 0.045 | 0.213 | 0.317 | 0.009 | 0.197 |
| | | 60% | 0.019 | 0.026 | 0.132 | 0.233 | 0.027 | 0.298 |
| | 400 | 40% | 0.670 | 0.216 | 0.636 | 0.687 | 0.071 | 0.531 |
| | | 60% | 0.409 | 0.205 | 0.372 | 0.549 | 0.102 | 0.404 |
| | 600 | 40% | 0.766 | 0.257 | 0.745 | 0.612 | 0.178 | 0.508 |
| | | 60% | 0.465 | 0.212 | 0.448 | 0.532 | 0.270 | 0.470 |
| | 800 | 40% | 0.825 | 0.258 | 0.758 | 0.849 | 0.400 | 0.783 |
| | | 60% | 0.748 | 0.232 | 0.449 | 0.803 | 0.303 | 0.709 |
| 0.30 | 40 | 40% | 0.118 | 0.001 | 0.006 | 0.230 | 0.005 | 0.083 |
| | | 60% | 0.109 | 0.005 | 0.005 | 0.165 | 0.071 | 0.170 |
| | 100 | 40% | 0.296 | 0.002 | 0.225 | 0.263 | 0.005 | 0.108 |
| | | 60% | 0.209 | 0.011 | 0.171 | 0.144 | 0.018 | 0.100 |
| | 400 | 40% | 0.750 | 0.199 | 0.632 | 0.622 | 0.050 | 0.485 |
| | | 60% | 0.352 | 0.190 | 0.403 | 0.532 | 0.069 | 0.470 |
| | 600 | 40% | 0.759 | 0.349 | 0.701 | 0.667 | 0.227 | 0.487 |
| | | 60% | 0.355 | 0.204 | 0.409 | 0.618 | 0.213 | 0.462 |
| | 800 | 40% | 0.819 | 0.466 | 0.733 | 0.609 | 0.231 | 0.768 |
| | | 60% | 0.550 | 0.346 | 0.549 | 0.575 | 0.470 | 0.596 |
AIC = Akaike’s information criterion; BIC = Bayesian information criterion; LWP = Weibull-Poisson long-term, LW = Weibull long-term; LEP = exponential-Poisson long-term.
Mean and mean squared error (within parentheses) of the estimates of the parameters of Weibull-Poisson long-term model with
| Sample size | Parameters | 40% | 60% | 40% | 60% |
|---|---|---|---|---|---|
| 40 | | 0.776 (9.061) | 0.798 (8.817) | 0.612 (9.686) | 0.573 (9.837) |
| | | 0.839 (0.179) | 0.859 (0.206) | 0.900 (0.242) | 0.989 (0.313) |
| | | 1.610 (0.259) | 1.731 (0.271) | 1.479 (0.373) | 1.653 (0.262) |
| | | 0.128 (0.009) | 0.281 (0.015) | 0.175 (0.020) | 0.380 (0.009) |
| 100 | | 0.614 (9.856) | 0.813 (9.098) | 2.112 (7.107) | 0.420 (10.673) |
| | | 0.845 (0.159) | 0.854 (0.178) | 1.883 (0.631) | 0.991 (0.282) |
| | | 1.511 (0.273) | 1.621 (0.202) | 1.549 (0.234) | 1.530 (0.271) |
| | | 0.135 (0.005) | 0.295 (0.011) | 0.265 (0.001) | 0.386 (0.008) |
| 400 | | 2.329 (9.122) | 1.561 (10.125) | 1.512 (10.343) | 0.608 (14.399) |
| | | 0.799 (0.138) | 0.755 (0.125) | 0.873 (0.201) | 0.953 (0.244) |
| | | 1.494 (0.268) | 1.604 (0.177) | 1.453 (0.532) | 1.493 (0.274) |
| | | 0.137 (0.004) | 0.300 (0.010) | 0.143 (0.001) | 0.388 (0.007) |
| 600 | | 1.455 (6.859) | 2.074 (7.477) | 0.563 (8.688) | 1.563 (10.226) |
| | | 0.737 (0.117) | 0.697 (0.101) | 0.898 (0.200) | 0.795 (0.165) |
| | | 1.508 (0.251) | 1.612 (0.162) | 1.367 (0.407) | 1.475 (0.290) |
| | | 0.137 (0.004) | 0.302 (0.010) | 0.193 (0.001) | 0.377 (0.006) |
| 800 | | 5.596 (6.045) | 1.845 (6.905) | 2.834 (4.214) | 6.408 (9.554) |
| | | 1.103 (0.201) | 0.712 (0.111) | 1.599 (0.007) | 1.119 (0.206) |
| | | 1.771 (0.058) | 1.584 (0.194) | 1.718 (0.172) | 1.683 (0.111) |
| | | 0.217 (0.004) | 0.292 (0.011) | 0.231 (0.110) | 0.265 (0.001) |
MLEs (and corresponding standard errors in parentheses) for the parameters of the fitted distributions for two datasets
| Datasets | Distributions | | | | |
|---|---|---|---|---|---|
| Recurrence to the crime | LEP | 0.007e–01 (0.001e–01) | - | 0.001 (0.028) | 0.420 (0.062) |
| | LW | 0.001 (7.589e–04) | 1.563 (0.104) | - | 0.551 (0.026) |
| | LWP | 0.007e–01 (0.002e–01) | 1.735 (0.125) | 2.228 (1.459) | 0.544 (0.028) |
| Carcinoma of the oropharynx | LEP | 0.001 (0.001e–01) | - | 0.007e–01 (0.021) | 0.131 (0.040) |
| | LW | 0.002 (0.001e–01) | 1.453 (0.103) | - | 0.208 (0.034) |
| | LWP | 0.001 (0.003e–01) | 1.656 (0.132) | 3.458 (2.013) | 0.201 (0.036) |
MLE = maximum likelihood estimate; LEP = exponential-Poisson long-term distribution; LW = Weibull long-term distribution; LWP = long-term Weibull-Poisson.
The log-likelihood
| Distributions | Recurrence to the crime: log-likelihood | Recurrence to the crime: AIC | Recurrence to the crime: BIC | Carcinoma of the oropharynx: log-likelihood | Carcinoma of the oropharynx: AIC | Carcinoma of the oropharynx: BIC |
|---|---|---|---|---|---|---|
| LEP | −1727.523 | 3461.036 | 3473.539 | −1082.608 | 2171.216 | 2181.035 |
| LW | −1712.804 | 3431.608 | 3444.111 | −1082.608 | 2151.398 | 2161.217 |
| LWP | −1710.762 | 3429.524 | 3446.194 | −1082.608 | 2150.508 | 2162.232 |
AIC = Akaike’s information criterion; BIC = Bayesian information criterion; LEP = exponential-Poisson long-term distribution; LW = Weibull long-term distribution; LWP = long-term Weibull-Poisson.
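As a quick consistency check on the criteria reported above, the sketch below recomputes the AIC and BIC from the log-likelihoods of the recurrence data using the standard definitions AIC = -2ℓ + 2k and BIC = -2ℓ + k log n, and forms the LR statistic for LW against LWP. It assumes k = 3 parameters for LW, k = 4 for LWP, n = 477 observations, and the usual chi-square reference distribution with one degree of freedom. The function names are illustrative.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k free parameters."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return -2.0 * loglik + k * math.log(n)

def lr_test_1df(loglik_restricted, loglik_full):
    """LR statistic and p-value against a chi-square with 1 degree of freedom."""
    stat = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

if __name__ == "__main__":
    n = 477                                   # recurrence dataset size
    ll_lw, ll_lwp = -1712.804, -1710.762      # log-likelihoods from Table 4
    print("LW :", round(aic(ll_lw, 3), 3), round(bic(ll_lw, 3, n), 3))
    print("LWP:", round(aic(ll_lwp, 4), 3), round(bic(ll_lwp, 4, n), 3))
    print("LR (LW vs LWP):", lr_test_1df(ll_lw, ll_lwp))
```

Under these assumptions the recomputed AIC and BIC agree with the tabulated values for the LW and LWP fits to the recurrence data to three decimals.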