Diagnostic tests in medical fields detect or diagnose a disease with results measured by continuous or discrete ordinal data. The performance of a diagnostic test is summarized using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The diagnostic test is considered clinically useful if the outcomes in actually-positive cases are higher than those in actually-negative cases and the ROC curve is concave. In this study, we apply the stochastic ordering method in a Bayesian hierarchical model to estimate the proper ROC curve and AUC when the diagnostic test results are measured in discrete ordinal data. We compare the conventional binormal model and the binormal model under stochastic ordering. The simulation results and the real data analysis for breast cancer indicate that the binormal model under stochastic ordering can estimate the proper ROC curve with a small bias even when the sample sizes are small or the number of actually-negative cases differs greatly from the number of actually-positive cases. Therefore, it is appropriate to consider the binormal model under stochastic ordering when the sample sizes of the actually-negative and actually-positive groups differ greatly.
Diagnostic tests in medical fields detect or diagnose a disease with results measured by continuous or discrete ordinal data. Receiver operating characteristic (ROC) analysis is widely used to assess diagnostic test performance, and its result is often presented as the ROC curve, which graphically represents the relationship between the false positive and true positive rates. The false positive rate equals “1 – specificity”, the probability that a truly non-diseased individual displays a positive test result, and the true positive rate equals “sensitivity”, the probability that a diseased individual shows a positive test result. The area under the curve (AUC) is often used to summarize the accuracy of the diagnostic test in a single number.
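As a minimal illustration of these definitions, the sketch below computes empirical operating points (FPR, TPR) and a trapezoidal AUC from ordinal rating counts. It assumes that a higher category indicates stronger suspicion of disease, and it uses the artificial film mammography counts tabulated later in this paper. The function names `empirical_roc` and `trapezoid_auc` are hypothetical and are not part of the models studied here.

```python
from itertools import accumulate

def empirical_roc(neg_counts, pos_counts):
    """Empirical (FPR, TPR) points for ordinal ratings.

    Assumption: a case is called positive when its rating is >= k, so the
    threshold k is swept from the highest category down to the lowest.
    """
    n_neg, n_pos = sum(neg_counts), sum(pos_counts)
    # Cumulative counts, starting from the highest category.
    false_pos = list(accumulate(reversed(neg_counts)))
    true_pos = list(accumulate(reversed(pos_counts)))
    points = [(0.0, 0.0)]
    points += [(f / n_neg, t / n_pos) for f, t in zip(false_pos, true_pos)]
    return points  # the last point is (1.0, 1.0)

def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the operating points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Counts per category 1..5 (artificial breast cancer data from this paper).
neg = [11837, 2839, 1002, 322, 0]   # patients without breast cancer
pos = [37, 12, 13, 24, 14]          # patients with breast cancer

points = empirical_roc(neg, pos)
auc = trapezoid_auc(points)
print(points)
print(round(auc, 3))
```

The trapezoidal AUC is a nonparametric summary of the empirical points, so it need not agree with the model-based estimates discussed in the rest of the paper.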
Numerous ROC curve estimation methods for continuous or discrete ordinal data have been proposed using a parametric, semiparametric, and nonparametric approach based on the frequentist or Bayesian method (Gonçalves
If the diagnostic test is effective, the ROC curve should be concave (Dorfman
To estimate the proper ROC curve for discrete ordinal data, Metz and Pan (1999) proposed a proper binormal model and a new algorithm using the monotonic transformation of the likelihood ratio. In the frequentist method, the bi-gamma (Dorfman
Pisano
In this study, we apply the stochastic ordering method in a Bayesian hierarchical model to estimate the proper ROC curve and AUC when diagnostic test results are measured with discrete ordinal data. We describe the conventional binormal model and the binormal model under stochastic ordering in Section 2, and compare the two models for various sample sizes using simulations in Section 3. In Section 4, we apply the two models to breast cancer data. Finally, conclusions are discussed in Section 5.
We consider
Suppose that the probability of a response in category
Suppose that the latent decision-variable axis is partitioned into
Let
Assume the decision-variables are distributed
Finally, we assume the standard logistic distribution with location 0 and scale 1 as a prior for the boundaries
and the joint prior distribution for
In the conventional binormal model without considering the stochastic ordering, the two CDFs are
and the probabilities of a response in category
Thus, the joint posterior density of
The conditional posterior distributions in the conventional binormal model are derived in the same way as those in the binormal model under stochastic ordering, and we use the grid method for the Markov chain Monte Carlo (MCMC) computations.
The stochastic order of two populations is defined as:
The ROC curve is proper when ROC(
where Φ(
Thus, the joint posterior density of
We use the Griddy Gibbs sampler to generate samples from the posterior density because the conditional posterior distributions are not standard form (Lee
The conditional posterior distribution of
and the parameter
where logit(
The conditional posterior distribution of
and the parameter
Finally, the conditional posterior distributions of the boundaries
and we also transform the parameter
and the samples of
The convergence of the MCMC algorithm is assessed using trace plots and autocorrelation plots. Because we use a single chain, convergence is also checked using the Geweke test, which compares the mean of the first 10% of the samples with the mean of the last 50% of the samples (Geweke, 1992).
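The idea behind the Geweke comparison can be sketched as follows. This is a simplified illustration, not the computation used in this study: Geweke (1992) estimates the variance of each segment mean from the spectral density at frequency zero, whereas the sketch substitutes a batch-means estimate. The names `geweke_z`, `batch_means_var_of_mean`, and `two_sided_p`, and the toy AR(1) chain, are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance

def batch_means_var_of_mean(x, n_batches=20):
    """Batch-means estimate of Var(mean(x)) for an autocorrelated series.

    Needs at least n_batches values; this replaces the spectral estimate
    used in Geweke's original diagnostic.
    """
    size = len(x) // n_batches
    batch_avgs = [mean(x[i * size:(i + 1) * size]) for i in range(n_batches)]
    return variance(batch_avgs) / n_batches

def geweke_z(chain, first=0.1, last=0.5):
    """Z-score comparing the mean of the first 10% and the last 50% of a chain."""
    n = len(chain)
    early = chain[:int(first * n)]
    late = chain[n - int(last * n):]
    se = math.sqrt(batch_means_var_of_mean(early) + batch_means_var_of_mean(late))
    return (mean(early) - mean(late)) / se

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Toy stationary AR(1) chain standing in for MCMC output.
rng = random.Random(1)
chain, x = [], 0.0
for _ in range(5000):
    x = 0.5 * x + rng.gauss(0.0, 1.0)
    chain.append(x)

z = geweke_z(chain)
p_value = two_sided_p(z)
print(z, p_value)
```

A large p-value gives no evidence that the early and late parts of the chain have different means; a small p-value suggests that the chain has not yet reached its stationary distribution.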
The ROC curves are estimated by
We conduct a simulation to compare the performance of the binormal model under stochastic ordering and the conventional binormal model. Metz and Pan (1999) noted that the true population ROC curve does not have a hook, but the empirical ROC curve may have one when the sample size is small. In Figure 9 of Metz and Pan (1999), the true ROC curve is concave, but the ROC curve estimated from simulated data with 5 categories (30 actually-negative and 40 actually-positive cases) has a hook. This population ROC curve has operating points (FPR, TPR) = (0.00025, 0.10), (0.000328, 0.25), (0.03040, 0.50), (0.28114, 0.85) (Dorfman and Berbaum, 1995) and the true AUC is 0.879 (Metz and Pan, 1999). The probabilities of the 5 categories are calculated from the operating points as:
We generate 100 simulated data from the multinomial distribution with probabilities
Simulation results for the AUC using 100 simulated data sets with true AUC = 0.879 are summarized in Table 1. The mean AUCs in the binormal model under stochastic ordering are closer to 0.879 than those in the conventional binormal model. For every sample size, the AB, RAB, RPMSE, width of the credible interval, and width of the HPD credible interval are smaller in the binormal model under stochastic ordering than in the conventional binormal model. The coverage probabilities in the binormal model under stochastic ordering are generally closer to 0.95 than those in the conventional binormal model; averaged over the sample sizes, the coverage of the 95% credible interval is 0.889 versus 0.811. Figure 1 and Figure 2 show the ROC curves estimated from the 100 simulated data sets by the binormal model under stochastic ordering and the conventional binormal model, together with the true ROC curve. In both models, the estimated ROC curves get closer to the true ROC curve and their variation decreases as the sample size increases. Many estimated ROC curves are improper, with a hook, when the sample size is small; in addition, a few estimated ROC curves have a slight hook when the ratio of actually-negative to actually-positive cases is large. Therefore, this simulation study indicates that the binormal model under stochastic ordering can estimate the proper ROC curve with a small bias even when the sample sizes are small or the number of actually-negative cases differs greatly from the number of actually-positive cases.
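The data-generating step of this simulation can be sketched as follows. The sketch converts the quoted operating points into category probabilities by differencing and then draws multinomial counts. It assumes that the operating points are ordered from the strictest threshold (category 5 only) to the most lenient (categories 2 to 5) and that sample sizes are listed as (actually-negative, actually-positive); both are assumptions made for illustration, as are the helper names `category_probs` and `draw_counts` and the random seed.

```python
import random

# Operating points (FPR, TPR) quoted in the text.
operating_points = [(0.00025, 0.10), (0.000328, 0.25),
                    (0.03040, 0.50), (0.28114, 0.85)]

def category_probs(cumulative):
    """Turn P(rating >= k) for k = 5, 4, 3, 2 into P(rating = k) for k = 1..5."""
    upper = list(cumulative) + [1.0]      # append P(rating >= 1) = 1
    lower = [0.0] + list(cumulative)
    high_to_low = [u - l for u, l in zip(upper, lower)]
    return high_to_low[::-1]              # reorder as categories 1..5

p_neg = category_probs([fpr for fpr, _ in operating_points])
p_pos = category_probs([tpr for _, tpr in operating_points])

def draw_counts(rng, n, probs):
    """One multinomial draw of size n, returned as counts per category."""
    draws = rng.choices(range(len(probs)), weights=probs, k=n)
    return [draws.count(k) for k in range(len(probs))]

rng = random.Random(2016)                 # arbitrary seed for reproducibility
neg_counts = draw_counts(rng, 45, p_neg)  # one simulated data set, size (45, 45)
pos_counts = draw_counts(rng, 45, p_pos)
print(p_neg, p_pos)
print(neg_counts, pos_counts)
```

Repeating the two draws 100 times for each sample-size pair would give simulated data sets of the kind summarized in Table 1. Because the highest categories are very rare for actually-negative cases, small samples often contain empty cells.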
Pisano
We apply the binormal model under stochastic ordering and the conventional binormal model. We draw 60,000 samples from the conditional posterior distribution for each parameter and select every 5th iterate after discarding the first 10,000 samples to obtain 10,000 samples for inference. All p-values of the Geweke test are greater than 0.1 and the effective sample sizes are higher than 9,000 in both models.
Table 3 summarizes the posterior mean (PM), posterior standard deviation (PSD), 95% credible interval (CI), and 95% HPD CI of the AUC for the breast cancer data. The PM is 0.764, PSD is 0.026, 95% CI is (0.714, 0.815), and 95% HPD CI is (0.714, 0.814) in the binormal model under stochastic ordering. The PM is 0.695, PSD is 0.044, 95% CI is (0.607, 0.777), and 95% HPD CI is (0.612, 0.780) in the conventional binormal model. The PM from the binormal model under stochastic ordering is larger than that from the conventional binormal model because the fitted ROC curve obtained from the conventional binormal model has a hook. The PSD and the widths of the CI and HPD CI are smaller in the binormal model under stochastic ordering than in the conventional binormal model.
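The difference between an equal-tailed credible interval and an HPD interval can be illustrated with a short sketch that works on posterior draws. It uses a common sample-based approach (the shortest interval containing 95% of the sorted draws), which is not necessarily the exact procedure used for Table 3. The function names and the Beta-distributed toy draws are hypothetical stand-ins for posterior samples of the AUC.

```python
import math
import random

def equal_tail_interval(samples, level=0.95):
    """Interval between the lower and upper (1 - level) / 2 sample quantiles."""
    s = sorted(samples)
    n = len(s)
    lo = s[int(math.floor((1.0 - level) / 2.0 * n))]
    hi = s[int(math.ceil((1.0 + level) / 2.0 * n)) - 1]
    return lo, hi

def hpd_interval(samples, level=0.95):
    """Shortest interval that contains a fraction `level` of the sorted draws."""
    s = sorted(samples)
    n = len(s)
    k = int(math.ceil(level * n))         # number of draws inside the interval
    widths = [s[i + k - 1] - s[i] for i in range(n - k + 1)]
    start = widths.index(min(widths))
    return s[start], s[start + k - 1]

# Toy left-skewed draws on (0, 1), standing in for posterior AUC samples.
rng = random.Random(3)
draws = [rng.betavariate(30, 10) for _ in range(10000)]

ci = equal_tail_interval(draws)
hpd = hpd_interval(draws)
print(ci, hpd)
```

For a skewed posterior the HPD interval is never wider than the equal-tailed interval at the same level, which is consistent with the slightly different endpoints of the two intervals reported for the same model.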
In Figure 3, we display the ROC curves and the posterior densities of the AUC from the binormal model under stochastic ordering and the conventional binormal model for the breast cancer data. The circles represent the empirical operating points. The fitted ROC curve obtained from the conventional binormal model is improper, with a hook, whereas the curve from the binormal model under stochastic ordering is proper. The posterior distribution of the AUC in the conventional binormal model is more dispersed and is shifted to the left of that in the binormal model under stochastic ordering.
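To see how a conventional binormal fit can produce a hook, the sketch below evaluates the binormal ROC curve in its standard intercept/slope form, ROC(t) = Φ(a + b Φ⁻¹(t)), with AUC = Φ(a / √(1 + b²)). The symbols a and b and their values are assumptions chosen for illustration, not estimates from the breast cancer data, and the parameterization may differ from the one used in Section 2. Whenever b ≠ 1, the curve crosses the chance diagonal; for b < 1 this happens at t = Φ(a / (1 − b)).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def binormal_roc(fpr, a, b):
    """TPR of the binormal ROC curve at a false positive rate in (0, 1)."""
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpr))

def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return STD_NORMAL.cdf(a / math.sqrt(1.0 + b * b))

def hook_region(a, b, grid_size=9999):
    """FPR grid values at which the curve falls below the chance diagonal."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return [t for t in grid if binormal_roc(t, a, b) < t]

a, b = 1.0, 0.5                           # hypothetical parameters
below = hook_region(a, b)
crossing = STD_NORMAL.cdf(a / (1.0 - b))  # analytic crossing point for b < 1

print(binormal_auc(a, b))
print(min(below), crossing)
print(len(hook_region(1.0, 1.0)))         # no hook when b = 1
```

With these values the curve dips below the diagonal near the upper right corner, so it is not concave there. Constraining the fit so that such crossings cannot occur is what distinguishes a proper ROC curve from the conventional binormal one.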
In the case of low-prevalence diseases, the number of people who actually have a disease in screening diagnostic tests is generally lower compared to those without a disease. Pesce
In this study, we apply the stochastic ordering method in a Bayesian hierarchical model to estimate the proper ROC curve and AUC when diagnostic test results are measured with discrete ordinal data, and we compare the conventional binormal model with the binormal model under stochastic ordering. The simulation study indicates that the binormal model under stochastic ordering can estimate the proper ROC curve with a small bias even when the sample sizes are small or the numbers of actually-negative and actually-positive cases differ greatly. For the breast cancer data, the fitted ROC curve derived from the conventional binormal model is improper, with a hook, whereas the curve from the binormal model under stochastic ordering is proper. The posterior distribution of the AUC derived from the conventional binormal model is more variable and is shifted to the left relative to that from the binormal model under stochastic ordering. Therefore, it is appropriate to consider the binormal model under stochastic ordering when the sample sizes of the actually-negative and actually-positive groups differ greatly.
This work was supported by a grant from the 2016 Research Funds of Andong National University.
Table 1: Simulation results for AUCs using 100 simulated data with true AUC = 0.879

(1) Conventional binormal model

Sample size | AUC | AB | RAB | RPMSE | C-CI | W-CI | C-HPD | W-HPD |
---|---|---|---|---|---|---|---|---|
(60, 30) | 0.818 | 0.063 | 0.072 | 0.091 | 0.82 | 0.230 | 0.88 | 0.225 |
(45, 45) | 0.820 | 0.060 | 0.068 | 0.082 | 0.87 | 0.200 | 0.90 | 0.196 |
(30, 60) | 0.770 | 0.109 | 0.124 | 0.131 | 0.69 | 0.277 | 0.75 | 0.268 |
(600, 300) | 0.863 | 0.017 | 0.019 | 0.024 | 0.84 | 0.061 | 0.86 | 0.060 |
(450, 450) | 0.850 | 0.029 | 0.033 | 0.033 | 0.51 | 0.060 | 0.53 | 0.059 |
(1000, 100) | 0.867 | 0.017 | 0.020 | 0.033 | 0.97 | 0.100 | 0.99 | 0.098 |
(10000, 100) | 0.861 | 0.023 | 0.026 | 0.036 | 0.89 | 0.098 | 0.90 | 0.097 |
(15000, 100) | 0.861 | 0.024 | 0.028 | 0.037 | 0.90 | 0.097 | 0.92 | 0.096 |
Average | 0.839 | 0.043 | 0.049 | 0.058 | 0.811 | 0.140 | 0.841 | 0.137 |

(2) Binormal model under stochastic ordering

Sample size | AUC | AB | RAB | RPMSE | C-CI | W-CI | C-HPD | W-HPD |
---|---|---|---|---|---|---|---|---|
(60, 30) | 0.830 | 0.052 | 0.059 | 0.074 | 0.82 | 0.186 | 0.83 | 0.183 |
(45, 45) | 0.827 | 0.053 | 0.060 | 0.073 | 0.88 | 0.180 | 0.89 | 0.177 |
(30, 60) | 0.800 | 0.079 | 0.090 | 0.097 | 0.71 | 0.209 | 0.78 | 0.206 |
(600, 300) | 0.871 | 0.011 | 0.013 | 0.019 | 0.96 | 0.054 | 0.98 | 0.054 |
(450, 450) | 0.862 | 0.017 | 0.019 | 0.022 | 0.85 | 0.053 | 0.87 | 0.052 |
(1000, 100) | 0.875 | 0.014 | 0.016 | 0.027 | 1.00 | 0.084 | 1.00 | 0.083 |
(10000, 100) | 0.873 | 0.017 | 0.019 | 0.028 | 0.94 | 0.079 | 0.94 | 0.079 |
(15000, 100) | 0.873 | 0.018 | 0.020 | 0.028 | 0.95 | 0.078 | 0.95 | 0.077 |
Average | 0.851 | 0.033 | 0.037 | 0.046 | 0.889 | 0.115 | 0.905 | 0.114 |
AUC = area under the curve; AB = absolute bias; RAB = relative absolute bias; RPMSE = root posterior mean squared error; C-CI = coverage probability of the 95% credible interval; W-CI = width of the 95% credible interval; C-HPD = coverage probability of the 95% HPD credible interval; W-HPD = width of the 95% HPD credible interval; HPD = highest posterior density.
Table 2: Artificial breast cancer data of film mammography screening

Group | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 |
---|---|---|---|---|---|
Patients without breast cancer | 11837 | 2839 | 1002 | 322 | 0 |
Patients with breast cancer | 37 | 12 | 13 | 24 | 14 |
Table 3: Posterior mean, standard deviation, 95% credible intervals, and HPD CI of the area under the curve for the breast cancer data

Model | PM | PSD | 95% CI | 95% HPD CI |
---|---|---|---|---|
Binormal model under stochastic ordering | 0.764 | 0.026 | (0.714, 0.815) | (0.714, 0.814) |
Conventional binormal model | 0.695 | 0.044 | (0.607, 0.777) | (0.612, 0.780) |
PM = posterior mean; PSD = posterior standard deviation; CI = credible interval; HPD = highest posterior density.