In this paper, we study statistical inference on the maximum likelihood estimation of a normal distribution when data are randomly censored. Likelihood equations are derived under the assumption that the censoring distribution does not involve any parameters of interest. The maximum likelihood estimators (MLEs) of the censored normal distribution have no explicit form and must be obtained iteratively. We consider a simple method that yields an explicit form of approximate MLEs, requiring no iteration, by expanding the nonlinear parts of the likelihood equations in Taylor series around suitable points. The points are closely related to the Kaplan-Meier estimator. Using the same method, the observed Fisher information is also approximated to obtain asymptotic variances of the estimators. An illustrative example is presented, and a simulation study is conducted to compare the performance of the estimators. In addition to having an explicit form, the approximate MLEs are as efficient as the MLEs in terms of variance.
In the statistical analysis of lifetime data, well-known common distributions are the exponential, Weibull, lognormal, and gamma distributions. The lognormal distribution is especially useful when the hazard rate initially increases and then decreases. We also need inference for a normal distribution, since the logarithm of a lognormal variable follows a normal distribution. As for censoring types, the most common and simplest censoring schemes are type I and type II censoring. For various censoring types, see Tableman and Kim (2004) and Lee and Wang (2003).
Gupta (1952), Cohen (1959, 1961), and Kim (2014b) studied the estimation of a type II censored normal distribution. Balakrishnan
Random censoring occurs frequently in survival studies. In this paper, we consider parameter estimation of a normal distribution for randomly censored data by applying the same approximation method used in Kim (2014b) and Balakrishnan
The approximation method was first developed by Balakrishnan (1989) to find the approximate MLE of the scale parameter of the Rayleigh distribution under left and right type II censoring. The method approximates the nonlinear part of the likelihood equations, and many researchers have since applied it to other distributions under various censoring schemes, most often progressive type II censoring. Balakrishnan
In Section 2, we derive the MLEs and approximate MLEs for a normal distribution under random censoring. In Section 3, we provide expressions for the observed Fisher information to obtain approximate variances of the estimators. Section 4 presents simulation results comparing the MLEs and approximate MLEs. Section 5 concludes the paper with some remarks.
Let
The observed random pairs (
with
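In code, the random-censoring setup above looks like the following sketch; the sample size, parameter values, and the normal censoring distribution are all illustrative assumptions, not choices made in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 50, 0.0, 1.0           # hypothetical true parameters

t = rng.normal(mu, sigma, n)          # latent lifetimes on the log scale, N(mu, sigma^2)
c = rng.normal(1.0, 1.0, n)           # independent censoring times (assumed normal here)
x = np.minimum(t, c)                  # observed value: min(lifetime, censoring time)
delta = (t <= c).astype(int)          # censoring indicator: 1 = event observed, 0 = censored
```

Only the pairs `(x, delta)` are observed; the latent `t` is available here only because the data are simulated.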
Usually lifetimes are positive. When we assume the lifetimes follow a lognormal distribution, the logarithm of the lifetimes follows a normal distribution. In this case we need inference for a normal distribution. Let
Here, the lifetime itself should be
where
Using
with
where
The likelihood
The process can be done by approximating
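To illustrate the kind of linearization involved, a plausible stand-in for the nonlinear term is the standard normal hazard ξ(z) = φ(z)/(1 − Φ(z)), which arises in censored-normal likelihoods; a first-order Taylor expansion around a point z₀ replaces it with a linear function. This is a sketch of the technique, not the paper's exact derivation.

```python
import math

def hazard(z):
    """Standard normal hazard xi(z) = phi(z) / (1 - Phi(z))."""
    pdf = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    sf = 0.5 * math.erfc(z / math.sqrt(2))   # survival 1 - Phi(z), numerically stable
    return pdf / sf

def linearize(z0):
    """First-order Taylor expansion xi(z) ~ a + b*z around z0,
    using the identity xi'(z) = xi(z) * (xi(z) - z)."""
    h = hazard(z0)
    b = h * (h - z0)          # slope: derivative of the hazard at z0
    a = h - b * z0            # intercept so that a + b*z0 = xi(z0)
    return a, b
```

Replacing ξ(z) by a + bz turns the likelihood equations into linear equations in the parameters, which is what produces the closed-form approximate MLEs.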
Substituting
From (
with
Substituting
with
Using (
with
and
Since we can show
Now let us think about the explicit form of
where Φ^{−1} is the inverse of Φ, and
with
The second equality in (
From
Let
from the likelihood
For randomly censored data, the Kaplan-Meier estimator
The estimator has been studied by Kaplan and Meier (1958), Efron (1967), Breslow and Crowley (1974), and Meier (1975). Michael and Schucany (1986) suggested the modified Kaplan-Meier estimator
that reduces to
The last term in
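A minimal product-limit (Kaplan-Meier) computation under these conventions can be sketched as follows, assuming distinct observed times for simplicity; ties and the modification discussed above are omitted.

```python
import numpy as np

def kaplan_meier(x, delta):
    """Product-limit estimate of the survival function at the observed times.

    x     : observed times (assumed distinct, for simplicity)
    delta : 1 if the observation is an event, 0 if censored
    """
    order = np.argsort(x)
    x, delta = np.asarray(x)[order], np.asarray(delta)[order]
    n = len(x)
    at_risk = n - np.arange(n)                 # number at risk just before each x_i
    factors = np.where(delta == 1, (at_risk - 1) / at_risk, 1.0)
    surv = np.cumprod(factors)                 # S_hat evaluated at each observed time
    return x, surv
```

For example, with times (1, 2, 3) and the middle observation censored, the estimate drops to 2/3 at the first event, stays flat at the censored point, and drops to 0 at the last event.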
Now we have defined
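Since the expansion points are closely related to the Kaplan-Meier estimator and involve Φ⁻¹, one plausible sketch of the mapping is to send estimated survival probabilities to standard-normal quantiles; the exact plotting positions used in the paper may differ.

```python
import numpy as np
from scipy.stats import norm

def anchor_points(surv_hat):
    """Map estimated survival probabilities S_hat(x_i) to standard-normal
    quantiles z_i = Phi^{-1}(1 - S_hat(x_i)), usable as Taylor expansion points.

    This mapping is an illustrative assumption, not the paper's exact formula."""
    p = 1.0 - np.asarray(surv_hat, dtype=float)
    p = np.clip(p, 1e-6, 1 - 1e-6)   # keep the quantiles finite at the extremes
    return norm.ppf(p)
```

Decreasing survival estimates map to increasing quantiles, so the expansion points track the ordered observations.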
In this section, the observed Fisher information is computed to give the asymptotic variances and covariance of the MLEs
we obtain
From (
we get
From (
By inverting (
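Numerically, obtaining the asymptotic variance-covariance matrix from the observed Fisher information is a 2×2 matrix inversion; the entries below are illustrative placeholders, not values from the paper.

```python
import numpy as np

# Hypothetical observed Fisher information matrix for (mu, sigma);
# the numbers are placeholders for illustration only.
info = np.array([[40.0, 5.0],
                 [5.0, 80.0]])

cov = np.linalg.inv(info)            # asymptotic variance-covariance matrix
var_mu, var_sigma = cov[0, 0], cov[1, 1]
cov_mu_sigma = cov[0, 1]
```

The diagonal entries of the inverse give the asymptotic variances of the two estimators and the off-diagonal entry gives their covariance.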
From the approximate likelihood
and the approximate observed Fisher information matrix
We compare the performance of the MLEs with that of the approximate MLEs through a simulation study. To control the proportion of censored data, we use two different random censoring models. One is the Koziol and Green (1976) censorship model, that is
where
and we call
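Under the Koziol-Green model the censoring survival function is a power of the lifetime survival function, S_C(t) = S_T(t)^β, so censoring times can be simulated by inversion; the sketch below uses illustrative parameter values, and β is the hypothetical power controlling the censoring proportion.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, mu, sigma = 2000, 0.0, 1.0       # illustrative values
beta = 0.5                          # hypothetical power: S_C(t) = S_T(t) ** beta

t = rng.normal(mu, sigma, n)                          # lifetimes on the log scale
v = rng.uniform(size=n)
c = mu + sigma * norm.ppf(1.0 - v ** (1.0 / beta))    # inversion: S_C(c) = v
x = np.minimum(t, c)
delta = (t <= c).astype(int)
# under Koziol-Green, P(event observed) = 1 / (1 + beta)
```

With β = 0.5 about two thirds of the observations are uncensored, matching the known event probability 1/(1 + β) for this model.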
When the second component is parallel with
We call it the P model. The model is closely related to the generalized exponential distribution; see Gupta and Kundu (2007) and Kim (2014a). In this case, the expected censoring ratio is
The model is also considered in Kim (2011). The two models are the same when
Table 1 gives the averages of the MLEs of
From Tables 1
As an illustrative example, we consider the tumor-free time data of 30 rats fed a saturated-fat diet. The data set was collected to investigate the relationship between diet and tumor development. The study divided 90 rats into three groups and fed them low-fat, saturated-fat, and unsaturated-fat diets. The data were originally reported by King
The data are sorted in ascending order and given below; the other two groups are not shown.
Lee and Wang (2003, Chapters 8 and 9) examined the Cox-Snell residual plot for the fitted lognormal model and performed goodness-of-fit tests based on asymptotic likelihood inference and the AIC and BIC. By these procedures, the lognormal distribution is preferred over the exponential and Weibull distributions.
When we compute the MLEs of the parameters, we have
The variance-covariance matrices in (
We can see that the results are very close to each other.
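The iterative fit that the approximate MLEs avoid can be reproduced with a generic optimizer. The sketch below maximizes the standard censored-normal log-likelihood (events contribute the density, censored points the survival function); the data are hypothetical placeholders, not the rat-tumor values.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglik(params, x, delta):
    """Negative log-likelihood of a normal under random censoring."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    z = (x - mu) / sigma
    ll = np.where(delta == 1,
                  norm.logpdf(z) - np.log(sigma),   # observed lifetimes
                  norm.logsf(z))                    # censored observations
    return -ll.sum()

# hypothetical censored sample for illustration
x = np.array([0.5, 1.2, 1.9, 2.3, 2.8, 3.1])
delta = np.array([1, 1, 0, 1, 0, 1])

res = minimize(neg_loglik, x0=[x.mean(), x.std()], args=(x, delta),
               method="Nelder-Mead")
mu_hat, sigma_hat = res.x
```

Starting the search from the naive sample mean and standard deviation, the optimizer iterates to the censored-data MLEs; the approximate MLEs of the paper would provide such estimates in a single closed-form step.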
In this study, approximate MLEs for a normal distribution under random censoring are proposed by linearizing the nonlinear functions in the likelihood equations. As a result, they have an explicit form and, unlike the MLEs, require no iteration. In addition, the simulation study shows that the approximate MLEs are as efficient as the MLEs in terms of bias and variance. As for censoring models, we considered the Koziol-Green model and the P model. The censoring models do not appear to have a clear influence on the estimation, since we assumed that the censoring distribution does not involve any parameters of interest and ignored the information contained in the censoring model.
This paper assumes no covariates; the problem with covariates remains a good topic for future research.
Averages of the MLEs
n | μ̂ | σ̂ | Var(μ̂) | Var(σ̂) | Cov(μ̂, σ̂) | Var(μ̂) | Var(σ̂) | Cov(μ̂, σ̂) |
---|---|---|---|---|---|---|---|---|---
20 | 0.016 | 0.962 | 0.122 | 0.077 | 0.053 | 0.126 | 0.079 | 0.057 | |
30 | 0.009 | 0.976 | 0.072 | 0.045 | 0.030 | 0.074 | 0.048 | 0.032 | |
40 | 0.004 | 0.974 | 0.051 | 0.033 | 0.021 | 0.052 | 0.034 | 0.022 | |
50 | −0.001 | 0.982 | 0.039 | 0.026 | 0.016 | 0.041 | 0.027 | 0.017 | |
20 | 0.003 | 0.957 | 0.082 | 0.052 | 0.026 | 0.084 | 0.055 | 0.029 | |
30 | 0.010 | 0.976 | 0.056 | 0.035 | 0.017 | 0.055 | 0.036 | 0.018 | |
40 | 0.008 | 0.989 | 0.042 | 0.026 | 0.012 | 0.041 | 0.027 | 0.013 | |
50 | 0.001 | 0.987 | 0.032 | 0.021 | 0.009 | 0.032 | 0.021 | 0.010 | |
20 | −0.003 | 0.955 | 0.063 | 0.039 | 0.011 | 0.061 | 0.038 | 0.012 | |
30 | 0.003 | 0.974 | 0.042 | 0.027 | 0.008 | 0.041 | 0.026 | 0.008 | |
40 | −0.004 | 0.981 | 0.032 | 0.019 | 0.005 | 0.031 | 0.019 | 0.006 | |
50 | −0.003 | 0.984 | 0.027 | 0.016 | 0.005 | 0.025 | 0.015 | 0.004 | |
20 | −0.001 | 0.960 | 0.056 | 0.035 | 0.008 | 0.057 | 0.034 | 0.008 | |
30 | −0.001 | 0.974 | 0.039 | 0.021 | 0.004 | 0.038 | 0.022 | 0.005 | |
40 | 0.000 | 0.981 | 0.028 | 0.017 | 0.003 | 0.028 | 0.017 | 0.003 | |
50 | −0.005 | 0.986 | 0.023 | 0.014 | 0.003 | 0.023 | 0.013 | 0.003 |
MLEs = maximum likelihood estimators.
Averages of the approximate MLEs
n | μ̃ | σ̃ | μ̂ − μ̃ | σ̂ − σ̃ | Var(μ̃) | Var(σ̃) | Cov(μ̃, σ̃) | Var(μ̃) | Var(σ̃) | Cov(μ̃, σ̃) |
---|---|---|---|---|---|---|---|---|---|---|---
20 | 0.009 | 0.970 | 0.0069 | −0.0075 | 0.120 | 0.073 | 0.052 | 0.129 | 0.079 | 0.059 | |
30 | 0.004 | 0.979 | 0.0051 | −0.0034 | 0.071 | 0.043 | 0.029 | 0.075 | 0.048 | 0.033 | |
40 | 0.000 | 0.976 | 0.0036 | −0.0021 | 0.051 | 0.032 | 0.020 | 0.053 | 0.034 | 0.022 | |
50 | −0.004 | 0.983 | 0.0030 | −0.0013 | 0.039 | 0.026 | 0.015 | 0.041 | 0.027 | 0.017 | |
20 | −0.003 | 0.961 | 0.0056 | −0.0039 | 0.081 | 0.051 | 0.026 | 0.086 | 0.055 | 0.030 | |
30 | 0.006 | 0.978 | 0.0037 | −0.0019 | 0.055 | 0.034 | 0.016 | 0.056 | 0.036 | 0.018 | |
40 | 0.006 | 0.990 | 0.0027 | −0.0011 | 0.041 | 0.026 | 0.012 | 0.042 | 0.027 | 0.013 | |
50 | −0.001 | 0.988 | 0.0022 | −0.0007 | 0.032 | 0.021 | 0.009 | 0.032 | 0.021 | 0.010 | |
20 | −0.006 | 0.957 | 0.0031 | −0.0015 | 0.062 | 0.039 | 0.012 | 0.062 | 0.038 | 0.012 | |
30 | 0.002 | 0.975 | 0.0019 | −0.0009 | 0.042 | 0.027 | 0.008 | 0.042 | 0.026 | 0.008 | |
40 | −0.005 | 0.981 | 0.0015 | −0.0004 | 0.032 | 0.019 | 0.005 | 0.031 | 0.019 | 0.006 | |
50 | −0.004 | 0.985 | 0.0010 | −0.0005 | 0.027 | 0.016 | 0.005 | 0.025 | 0.015 | 0.004 | |
20 | −0.003 | 0.961 | 0.0022 | −0.0009 | 0.055 | 0.035 | 0.008 | 0.057 | 0.034 | 0.008 | |
30 | −0.003 | 0.974 | 0.0013 | −0.0004 | 0.039 | 0.021 | 0.004 | 0.038 | 0.022 | 0.005 | |
40 | −0.001 | 0.981 | 0.0009 | −0.0004 | 0.028 | 0.017 | 0.003 | 0.029 | 0.017 | 0.003 | |
50 | −0.006 | 0.986 | 0.0008 | −0.0002 | 0.023 | 0.014 | 0.003 | 0.023 | 0.013 | 0.003 |
MLEs = maximum likelihood estimators.
Averages of the MLEs

n | μ̂ | σ̂ | Var(μ̂) | Var(σ̂) | Cov(μ̂, σ̂) | Var(μ̂) | Var(σ̂) | Cov(μ̂, σ̂) |
---|---|---|---|---|---|---|---|---|---
20 | 0.015 | 0.964 | 0.107 | 0.070 | 0.040 | 0.120 | 0.073 | 0.050 | |
30 | 0.008 | 0.975 | 0.075 | 0.047 | 0.030 | 0.074 | 0.046 | 0.029 | |
40 | 0.005 | 0.979 | 0.049 | 0.033 | 0.019 | 0.052 | 0.032 | 0.020 | |
50 | −0.001 | 0.983 | 0.041 | 0.026 | 0.016 | 0.041 | 0.026 | 0.015 | |
20 | 0.020 | 0.966 | 0.088 | 0.053 | 0.027 | 0.088 | 0.058 | 0.031 | |
30 | 0.012 | 0.971 | 0.055 | 0.035 | 0.017 | 0.055 | 0.036 | 0.018 | |
40 | 0.003 | 0.978 | 0.042 | 0.026 | 0.013 | 0.040 | 0.026 | 0.013 | |
50 | 0.002 | 0.986 | 0.034 | 0.022 | 0.011 | 0.032 | 0.021 | 0.010 | |
20 | 0.010 | 0.966 | 0.064 | 0.041 | 0.011 | 0.062 | 0.042 | 0.014 | |
30 | 0.017 | 0.985 | 0.043 | 0.028 | 0.009 | 0.042 | 0.028 | 0.009 | |
40 | 0.002 | 0.984 | 0.030 | 0.021 | 0.006 | 0.031 | 0.020 | 0.006 | |
50 | 0.004 | 0.982 | 0.023 | 0.016 | 0.005 | 0.024 | 0.016 | 0.005 | |
20 | 0.001 | 0.965 | 0.056 | 0.037 | 0.008 | 0.056 | 0.036 | 0.009 | |
30 | 0.008 | 0.983 | 0.038 | 0.025 | 0.006 | 0.038 | 0.024 | 0.006 | |
40 | 0.006 | 0.988 | 0.026 | 0.019 | 0.003 | 0.028 | 0.018 | 0.004 | |
50 | 0.001 | 0.985 | 0.022 | 0.014 | 0.003 | 0.022 | 0.014 | 0.003 |
MLEs = maximum likelihood estimators.
Averages of the approximate MLEs
n | μ̃ | σ̃ | μ̂ − μ̃ | σ̂ − σ̃ | Var(μ̃) | Var(σ̃) | Cov(μ̃, σ̃) | Var(μ̃) | Var(σ̃) | Cov(μ̃, σ̃) |
---|---|---|---|---|---|---|---|---|---|---|---
20 | 0.010 | 0.987 | 0.0044 | −0.0229 | 0.108 | 0.063 | 0.042 | 0.126 | 0.074 | 0.053 | |
30 | 0.004 | 0.987 | 0.0041 | −0.0122 | 0.075 | 0.044 | 0.030 | 0.076 | 0.046 | 0.030 | |
40 | 0.002 | 0.987 | 0.0030 | −0.0082 | 0.049 | 0.031 | 0.019 | 0.053 | 0.032 | 0.020 | |
50 | −0.003 | 0.989 | 0.0025 | −0.0059 | 0.041 | 0.025 | 0.016 | 0.041 | 0.025 | 0.015 | |
20 | 0.014 | 0.970 | 0.0058 | −0.0040 | 0.087 | 0.051 | 0.027 | 0.090 | 0.058 | 0.032 | |
30 | 0.008 | 0.974 | 0.0037 | −0.0021 | 0.054 | 0.034 | 0.017 | 0.055 | 0.036 | 0.018 | |
40 | 0.001 | 0.979 | 0.0026 | −0.0011 | 0.042 | 0.026 | 0.013 | 0.041 | 0.026 | 0.013 | |
50 | 0.000 | 0.987 | 0.0022 | −0.0007 | 0.034 | 0.022 | 0.011 | 0.033 | 0.021 | 0.010 | |
20 | 0.007 | 0.965 | 0.0029 | 0.0006 | 0.063 | 0.040 | 0.011 | 0.063 | 0.042 | 0.014 | |
30 | 0.015 | 0.985 | 0.0018 | 0.0005 | 0.043 | 0.027 | 0.009 | 0.042 | 0.028 | 0.009 | |
40 | 0.001 | 0.983 | 0.0012 | 0.0002 | 0.030 | 0.021 | 0.006 | 0.031 | 0.020 | 0.006 | |
50 | 0.003 | 0.981 | 0.0010 | 0.0003 | 0.023 | 0.016 | 0.005 | 0.024 | 0.016 | 0.005 | |
20 | −0.001 | 0.965 | 0.0018 | 0.0006 | 0.056 | 0.036 | 0.008 | 0.056 | 0.036 | 0.009 | |
30 | 0.007 | 0.982 | 0.0011 | 0.0005 | 0.038 | 0.024 | 0.006 | 0.038 | 0.024 | 0.006 | |
40 | 0.005 | 0.988 | 0.0008 | 0.0003 | 0.026 | 0.019 | 0.003 | 0.028 | 0.018 | 0.004 | |
50 | 0.000 | 0.985 | 0.0006 | 0.0002 | 0.022 | 0.014 | 0.003 | 0.022 | 0.014 | 0.003 |
MLEs = maximum likelihood estimators.