The Pareto distribution is widely used to analyze data in actuarial science, reliability, finance, and climatology. Its unknown parameters are usually estimated by the maximum likelihood method, which may yield inadequate inference for small sample sizes and heavily censored data. In this paper, a new approach based on the regression framework is proposed to estimate the unknown parameters of the Pareto distribution under the progressive Type-II censoring scheme. The proposed method provides a new regression-type estimator that employs the spacings of exponential progressive Type-II censored samples. In addition, the proposed estimator is consistent and outperforms the maximum likelihood estimators in terms of mean squared error and bias. The validity of the proposed method is assessed through Monte Carlo simulations and real data analysis.
Since its introduction in Cohen (1963), the progressive Type-II censoring scheme has gained considerable popularity and has been extensively studied in, for instance, Balakrishnan
In this study, a new estimation method based on the regression framework is proposed under the progressive Type-II censoring scheme. It focuses on estimating the unknown parameters of the Pareto distribution. The cumulative distribution function (cdf) and the probability density function (pdf) of the random variable
respectively, where
The paper is organized as follows. In Section 2, existing methods for estimating the unknown parameters of the Pareto distribution are presented. In Section 3, a new approach based on the weighted least squares method is proposed. In Section 4, a Monte Carlo simulation is conducted, and real data analysis is performed to assess the proposed method. Finally, Section 5 concludes the paper.
This section provides the results of existing inferences for the scale parameter
The MLE
which has the inverse gamma distribution with parameters (
that has the gamma distribution with parameters (
respectively. Note that both
respectively.
In this section, a new approach is proposed to obtain an estimator that is superior to the MLE in terms of MSE and bias.
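The construction in this section rests on the normalized spacings of an exponential progressive Type-II censored sample being independent standard exponential variables. The following minimal sketch checks that property empirically by simulating the censoring mechanism directly. The function names, the notation `gamma` for the number of units on test just before each failure, the seed, and the number of replications are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def progressive_type2_sample(lifetimes, scheme, rng):
    """Censor complete lifetimes: after the i-th failure, withdraw scheme[i] survivors."""
    alive = list(lifetimes)
    observed = []
    for r in scheme:
        # The next observed failure is the smallest lifetime still on test.
        observed.append(alive.pop(int(np.argmin(alive))))
        # Withdraw r of the surviving units at random.
        for _ in range(r):
            alive.pop(int(rng.integers(len(alive))))
    return np.array(observed)

def normalized_spacings(z, scheme):
    """Return gamma_i * (z_i - z_{i-1}) with z_0 = 0."""
    r = np.asarray(scheme)
    n = r.size + r.sum()
    # gamma_i = number of units on test just before the i-th failure
    gamma = n - np.concatenate(([0], np.cumsum(r[:-1] + 1)))
    return gamma * np.diff(z, prepend=0.0)

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
scheme = [1, 0, 2, 0, 3, 2, 0, 4]         # device lifetime scheme quoted in Section 4
n = len(scheme) + sum(scheme)             # total number of units placed on test

spacings = np.array([
    normalized_spacings(
        progressive_type2_sample(rng.exponential(size=n), scheme, rng), scheme)
    for _ in range(5000)
])

# If the spacings are standard exponential, every column should have
# mean and variance close to 1.
print(spacings.mean(axis=0).round(2))
print(spacings.var(axis=0).round(2))
```

Simulating the withdrawals explicitly, rather than generating the spacings directly, keeps the check independent of the property being verified.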
Let
Then,
which are independent and identically distributed standard exponential random variables. Then,
have the exponential distribution with the mean
and can lead to the following linear regression model:
where
by minimizing the squared distance
where
Let
Then, by the same argument, another estimator of
where
by Theorem 7.2.1 in Balakrishnan and Cramer (2014). To find the weight
It is clear that
Then, both (
which converges to a constant as
The estimator
by Lemma 1, the fraction term in (
In this section, the proposed estimators are assessed by Monte Carlo simulations; in addition, two real data sets are presented.
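A Monte Carlo comparison of this kind can be sketched as follows for the maximum likelihood estimator of the shape parameter. Several assumptions are made here because the formulas are not reproduced in this text: the Pareto cdf is taken as 1 - (theta / x)^lam for x >= theta, the MLEs are the standard ones for that form, and the names `theta`, `lam`, the seed, and the number of replications are hypothetical. Samples are generated from independent standard exponential spacings and then transformed to the Pareto scale.

```python
import numpy as np

def simulate_pareto_progressive(theta, lam, scheme, rng):
    """Draw one progressive Type-II censored sample under the assumed Pareto form."""
    r = np.asarray(scheme)
    n = r.size + r.sum()
    # gamma_i = number of units on test just before the i-th failure
    gamma = n - np.concatenate(([0], np.cumsum(r[:-1] + 1)))
    # Censored standard exponential sample built from iid Exp(1) spacings
    z = np.cumsum(rng.exponential(size=r.size) / gamma)
    # If Z ~ Exp(1), then theta * exp(Z / lam) has cdf 1 - (theta / x)**lam
    return theta * np.exp(z / lam)

def pareto_mle(x, scheme):
    """MLEs of (scale, shape) under the assumed parameterization."""
    r = np.asarray(scheme)
    theta_hat = x[0]  # smallest observed failure time
    lam_hat = x.size / np.sum((r + 1) * np.log(x / theta_hat))
    return theta_hat, lam_hat

def monte_carlo_shape(theta, lam, scheme, reps=10000, seed=1):
    """Estimate the MSE and bias of the shape MLE by simulation."""
    rng = np.random.default_rng(seed)
    est = np.array([
        pareto_mle(simulate_pareto_progressive(theta, lam, scheme, rng), scheme)[1]
        for _ in range(reps)
    ])
    return np.mean((est - lam) ** 2), est.mean() - lam

# Hypothetical setting: complete sample of size 20, scale 1, shape 0.5
mse, bias = monte_carlo_shape(theta=1.0, lam=0.5, scheme=[0] * 20)
print(f"MSE = {mse:.3f}, bias = {bias:.3f}")
```

The output can be set beside the matching rows of Table 1, bearing in mind that the column headers of that table are not recoverable from this text.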
The estimators discussed in Sections 2 and 3 are compared in terms of MSE and bias. Unlike estimators
Table 1 shows that the proposed estimator
Two real data sets (device lifetime and business failure) are considered in Fernandez (2014). Wu
with the censoring scheme (1, 0, 2, 0, 3, 2, 0, 4) from the device lifetime data. The business failure data in Nigm and Hamdy (1987) represent the time (in years) for which a business operates until failure. A sample of fifteen businesses was used. Fernandez (2014) used a progressive Type-II censored sample
with the censoring scheme (9*0, 5) from the business failure data. Here, progressive Type-II censored samples were used to obtain the estimates discussed in Sections 2 and 3 (Table 2).
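The shorthand used for censoring schemes can be expanded mechanically. The sketch below assumes that `k*v` denotes `k` consecutive removals of size `v`, a reading that is consistent with the fifteen businesses stated for the scheme (9*0, 5). The helper names are hypothetical.

```python
def expand_scheme(text):
    """Expand shorthand such as '(9*0, 5)' into a full list of removal counts."""
    scheme = []
    for token in text.strip("() ").split(","):
        token = token.strip()
        if "*" in token:
            # 'k*v' is read as k repetitions of the value v (assumed convention)
            count, value = token.split("*")
            scheme.extend([int(value)] * int(count))
        else:
            scheme.append(int(token))
    return scheme

def sample_sizes(scheme):
    """Return (n, m): units placed on test and number of observed failures."""
    m = len(scheme)
    return m + sum(scheme), m

business = expand_scheme("(9*0, 5)")
device = expand_scheme("(1, 0, 2, 0, 3, 2, 0, 4)")
print(sample_sizes(business))  # (15, 10)
print(sample_sizes(device))    # (20, 8)
```

Each observed failure accounts for one unit and each removal count for the units withdrawn at that failure, so the total on test is the number of failures plus the sum of the scheme.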
A new approach was proposed to estimate the unknown parameters of the Pareto distribution under the progressive Type-II censoring scheme in the regression framework; subsequently, it was proved that the proposed estimator,
MSEs (biases) for estimators of
| | | | | Censoring scheme | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.5 | 20 | 20 | (20*0) | 0.021 (0.056) | 0.015 (0.000) | 0.017 (0.002) | 0.028 (0.111) | 0.013 (0.000) |
| 1 | 0.5 | 20 | 10 | (2*0, 1, 0, 2, 0, 2, 2*0, 5) | 0.071 (0.125) | 0.036 (0.000) | 0.045 (0.027) | 0.028 (0.111) | 0.014 (0.000) |
| 1 | 0.5 | 30 | 30 | (30*0) | 0.012 (0.036) | 0.009 (0.000) | 0.011 (0.000) | 0.011 (0.071) | 0.005 (0.000) |
| 1 | 0.5 | 30 | 20 | (9*0, 10, 10*0) | 0.021 (0.056) | 0.015 (0.000) | 0.016 (−0.005) | 0.011 (0.071) | 0.005 (0.000) |
| 1 | 0.5 | 30 | 15 | (5, 6*0, 10, 7*0) | 0.034 (0.077) | 0.021 (0.000) | 0.023 (−0.006) | 0.011 (0.071) | 0.005 (0.000) |
| 1 | 0.5 | 40 | 40 | (40*0) | 0.008 (0.026) | 0.007 (0.000) | 0.008 (0.000) | 0.006 (0.053) | 0.003 (0.000) |
| 1 | 0.5 | 40 | 30 | (10*0, 5, 7*0, 3, 10*0, 2) | 0.012 (0.036) | 0.009 (0.000) | 0.011 (0.001) | 0.006 (0.053) | 0.003 (0.000) |
| 1 | 0.5 | 40 | 20 | (8*0, 2*10, 10*0) | 0.021 (0.056) | 0.015 (0.000) | 0.016 (−0.009) | 0.006 (0.053) | 0.003 (0.000) |
| 1 | 1.5 | 20 | 20 | (20*0) | 0.191 (0.167) | 0.132 (0.000) | 0.152 (0.007) | 0.002 (0.034) | 0.001 (0.000) |
| 1 | 1.5 | 20 | 10 | (2*0, 1, 0, 2, 0, 2, 2*0, 5) | 0.643 (0.375) | 0.321 (0.000) | 0.405 (0.081) | 0.002 (0.034) | 0.001 (0.000) |
| 1 | 1.5 | 30 | 30 | (30*0) | 0.107 (0.107) | 0.083 (0.000) | 0.095 (0.001) | 0.001 (0.023) | 0.001 (0.000) |
| 1 | 1.5 | 30 | 20 | (9*0, 10, 10*0) | 0.191 (0.167) | 0.132 (0.000) | 0.148 (−0.014) | 0.001 (0.023) | 0.001 (0.000) |
| 1 | 1.5 | 30 | 15 | (5, 6*0, 10, 7*0) | 0.303 (0.231) | 0.188 (0.000) | 0.204 (−0.017) | 0.001 (0.023) | 0.001 (0.000) |
| 1 | 1.5 | 40 | 40 | (40*0) | 0.074 (0.079) | 0.061 (0.000) | 0.072 (−0.001) | 0.001 (0.017) | 0.000 (0.000) |
| 1 | 1.5 | 40 | 30 | (10*0, 5, 7*0, 3, 10*0, 2) | 0.107 (0.107) | 0.083 (0.000) | 0.095 (0.002) | 0.001 (0.017) | 0.000 (0.000) |
| 1 | 1.5 | 40 | 20 | (8*0, 2*10, 10*0) | 0.191 (0.167) | 0.132 (0.000) | 0.147 (−0.028) | 0.001 (0.017) | 0.000 (0.000) |
MSE = mean squared error.
Estimates for real data
Device | 0.17350 | 0.13012 | 0.09001 | 0.00980 | 0.00657 |
Business | 2.16083 | 1.72867 | 1.79410 | 1.01000 | 0.97538 |