Health insurance is becoming increasingly popular as healthcare costs rise around the world. People buy health insurance to reduce their risk of personal bankruptcy due to medical expenses. One of the most popular types of private health insurance is group health insurance. People tend to enroll in group health insurance offered by their employers because it often provides coverage at a lower premium than individual health insurance.
Meanwhile, insurers are faced with the challenge of accurately predicting healthcare claims for risk management, liability assessment, and premium calculation. Therefore, many studies related to claims forecasting have been conducted, but there have also been criticisms of each methodology. For instance, a standard regression model can be applied to predict claims, yet the assumptions regarding linearity and independence are often inappropriate for real-world data.
More recently, Bayesian parametric (BP) models, which use prior distributions to account for the complexity of claims data, have been proposed as an alternative to standard regression models. Bayesian models can be used for claims prediction and premium calculation through credibility. A well-known credibility model in actuarial science is the Bühlmann model (2005). However, as stated by Huang and Meng (2020), BP models require the data to meet certain distributional assumptions, which real-world data often violate. Hong
Moreover, Fellingham
Although the BNP model has been shown to outperform the BP model, the application of the BNP model to predict group health claims is still rare compared to automobile insurance claims. One of the contributions of this research is to show using real data that the BNP model is more effective than the BP model in predicting group health insurance claims.
There are also fewer applications of the BNP model to insurance data from developing countries compared to developed countries. Previous studies such as Fellingham
Therefore, this study extends Fellingham
Our findings are as follows. First, we fit the BNP and BP models to real data from 2018 and then validate the predictions using the corresponding data from 2019. Consistent with Fellingham
The remainder of this paper is organized as follows. Sections 2 and 3 present the two Bayesian models for predicting group health claims. A Bayesian parametric (BP) model is explained first and then extended to a nonparametric one. In Section 4, we present two numerical examples, one using simulated data and one using real data from a private life insurer in Indonesia, to compare the performance of the two models. Finally, the conclusions are summarized in Section 5.
For modeling group health claims using a Bayesian model, we first define a likelihood function and a prior distribution of the parameters associated with this function. As noted by Fellingham
where
for
In the Bayesian models, the choice of prior distributions is crucial since it affects the accuracy of the predictions. More informative priors will result in more accurate predictions. As the first step in specifying the prior distributions in the BP model, we need to choose the random effects distribution for all parameters such as (
Hence, considering the non-negativity constraint of the parameters in the BP model, we choose the random effects distributions of (
where
with 0 <
Next, we need to specify the hyperprior distributions for the hyperparameters. In the context of a BP model, a hyperprior distribution is a prior distribution placed on the hyperparameters of the random effects distribution. We assume that
In summary, the posterior for full parameters of the BP model is proportional to:
where
Analyzing the posterior distributions of the BP model requires high-dimensional integration that is computationally infeasible. An alternative is to use the MCMC method to generate samples from the posterior distribution. We describe the MCMC method used for the BP and BNP models in
The random-effects distributions for the parameters (
As pointed out by MacEachern (2016), the Dirichlet process (DP) prior is one of the most frequently used priors in Bayesian models. The DP prior has two parameters: a precision parameter, denoted by
where
where
Next, we will define the DP priors in the BNP model. We assume that the probability of zero claims and the distribution of claims in a group are independent. Therefore, we choose to handle parameters
assuming each parameter is independent. The parameters,
For the distributions of
where the hyperparameters are assumed to follow the same distributions as in the BP model.
Next, the marginalized version of the model can be found by integrating
where
To obtain the posterior distributions using the MCMC method, we need to obtain the priors for each
and
where
The expressions in (
Two cases of data are generated to describe the possible distribution shapes of parameters in real data, particularly the unimodal case and the multimodal case. The random effects parameters of the unimodal case are drawn from unimodal distributions while the random effects parameters of the multimodal case are drawn from multimodal distributions.
To obtain simulated data, we first generate the random effects parameters (
The first case is the unimodal case. Here, the
For analyzing the BP model in both the unimodal case and the multimodal case, we use the following fixed values for hyperparameters, specifically
For fitting the BNP model, we also apply the same values of hyperparameters for the centering distributions and choose
The plots of the simulated parameters and posterior densities under both the BP and BNP models are given in Figure 1. The histograms show the generated parameters, the dashed lines show the BP fit, and the solid lines show the BNP fit. Both the BP and BNP models capture the shapes of the parameter distributions quite closely in the unimodal case. In the multimodal case, however, the BNP model is superior to the BP model in replicating the modes of the parameters. The BP model will not replicate the modes unless we define them explicitly in the random effects distribution. In contrast, defining multiple modes is not necessary under the BNP model, since the DP priors can adapt to the distribution shapes of the parameters.
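To illustrate why a DP prior can represent several modes without specifying them in advance, the following minimal sketch draws one truncated realization of a DP using the standard stick-breaking construction. This construction is not described in the text above; the function name `stick_breaking_dp`, the truncation level `n_atoms`, the precision value, and the gamma base distribution are all assumptions chosen for illustration.

```python
import numpy as np

def stick_breaking_dp(alpha, base_sampler, n_atoms, rng):
    """Draw a truncated realization of DP(alpha, G0) by stick-breaking.

    alpha        : precision parameter (larger alpha spreads mass over more atoms)
    base_sampler : hypothetical callable (rng, n) -> n draws from the base distribution G0
    n_atoms      : truncation level (an approximation; a true DP has infinitely many atoms)
    """
    # Break off a Beta(1, alpha) fraction of the remaining stick at each step.
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    # Atom locations are independent draws from the base distribution.
    atoms = base_sampler(rng, n_atoms)
    return atoms, weights

rng = np.random.default_rng(0)
# Assumed base distribution for illustration only: Gamma(shape=2, scale=1).
base = lambda rng, n: rng.gamma(shape=2.0, scale=1.0, size=n)
atoms, weights = stick_breaking_dp(alpha=2.0, base_sampler=base, n_atoms=200, rng=rng)

# Sampling parameters from the realized (discrete) random distribution:
# a few atoms carry most of the mass, which is what produces clustering and modes.
draws = rng.choice(atoms, size=1000, p=weights / weights.sum())
print("largest five weights:", np.sort(weights)[-5:][::-1])
```

Because the realized distribution is discrete, groups sharing an atom form a cluster, and the number and location of clusters are learned from the data rather than fixed beforehand.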
Next, following Albert (2009), we perform posterior predictive checks to check the convergence of the BP and BNP models. One way to perform the posterior predictive checks is to compare summary statistics or visualization of posterior predictive densities. First, we generate posterior predictive samples using the likelihood function (
The formula used for calculating MSE is as below:
where
A better model yields a smaller MSE value. Table 1 shows that the MSE values under the BNP model are smaller than those under the BP model for both the unimodal and the multimodal case. This indicates that the BNP model predicts the claims of the simulated data more accurately than the BP model.
Then, we choose one group from the simulated data for each case and draw the posterior predictive densities, as shown in Figure 2. The histograms show the generated claim amounts, the dashed lines show the BP fit, and the solid lines show the BNP fit. The posterior predictive densities under the BNP model are closer to the generated data than those under the BP model.
We also examine the MSE values for this group, as shown in Table 2. The table gives the same result as before: the MSE values under the BP model are greater than those under the BNP model. Therefore, it can be concluded that the BNP model outperforms the corresponding BP model in predicting the claims of the simulated data.
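As a minimal sketch of the comparison metric, the snippet below computes the MSE under its usual definition, the mean of squared differences between actual and predicted values. The exact pairing of actual and predicted claims used in this study is not recoverable from the text, so the function name `mse` and the toy inputs are assumptions for illustration.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error between two equally sized sequences of claim amounts."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

# Toy values for illustration only, not data from the study.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(mse(actual, predicted))
```

Since the errors are squared, the MSE is expressed in squared units of the claim amount, so its magnitude depends strongly on the scale of the data.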
The real data consist of group health claims from a private life insurer in Indonesia for policies in 2018 and 2019. All claim severities are expressed in Indonesian rupiah (IDR). As demonstrated by Fellingham
Table 3 provides some statistics such as the number of participants (insured), number of groups,
As previously mentioned, our data is from an insurer in a developing country. Since our data has some different characteristics compared to the data from an insurer in a developed country used by Fellingham
In the context of this study, we only analyze the renewal groups or the existing groups from the previous year. Also, we exclude some groups which have no claims and obtain 25 groups to be analyzed. For fitting both BP and BNP models, we use the same hyperparameters as simulated data, specifically
Figure 3 below gives the plot of posterior distributions for
Furthermore, we choose one company, called Company F, as the most representative group in 2018 and 2019 and predict its claims under the BP and BNP models. As displayed in Table 3, Company F has 162 participants in 2018 and 193 participants in 2019. We do not consider the presence of the same participants in both years. For the analysis, we obtain the posterior predictive densities of Company F using the corresponding parameters (
Additionally, we compare the BP and BNP models by calculating the absolute value of the mean difference between the actual and predicted claims of Company F and the MSE values of claims paid using the formula as in (
Under the BP model, the absolute value of the mean difference is 3,430 and the MSE is 9.857 × 10⁵. Under the BNP model, both values are smaller: the absolute value of the mean difference is 1,358 and the MSE is 1.928 × 10⁵. This indicates that the predicted values under the BNP model are closer to the real claim amounts. This finding is in line with the results from the simulated data and shows that the BNP model performs better than the BP model in predicting the group health claims of an insurer in Indonesia.
One of the most commonly used models to predict claims in the insurance industry is the BP model. However, the BP model is not flexible enough to accurately capture the complex distributions of health insurance claims. To overcome this limitation, the BNP model has been proposed, which employs a flexible prior distribution and offers a higher accuracy in predicting health insurance claims.
In the context of group health insurance, the BNP model proposed by Fellingham
In this study, we implement this BNP model to predict group health insurance claims in Indonesia and compare its performance with a BP model. Our findings show that the BNP model is more suitable for describing non-standard forms of distributions, especially multi-modality behavior, in the posterior distributions of real data in a developing country. In contrast, the BP model cannot predict such behavior. Therefore, the use of the BNP model can improve risk management for insurers.
However, this study also has some limitations as follows. We only implement the Bayesian models for predicting actual claims for an insurer in a developing country for the renewal groups. To predict claims for new business groups, different models need to be developed. Furthermore, since we only focus on the performance of the BP and BNP models, covariate information is omitted. Therefore, considering the possibility of covariates in all models can improve the results of this study in the future.
As mentioned earlier, the posterior distributions for both the BP and BNP models can be obtained by applying the MCMC method. Specifically, the Metropolis-Hastings (M-H) algorithm is used to sample from the multivariate distributions in the Bayesian models. The M-H algorithm produces a Markov chain based on a proposal distribution.
For instance, we want to produce samples from a posterior distribution
Generate a candidate from a proposal distribution
Accept the candidate value or set
and
We have to note that the distribution will be stationary to
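The M-H steps described above can be sketched as follows. This is a minimal one-dimensional illustration, assuming a symmetric normal random-walk proposal so that the acceptance probability reduces to the ratio of target densities; the function name `metropolis_hastings`, the toy normal target, and all tuning values are assumptions, not the models fitted in this study.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_iter, step, rng):
    """Random-walk Metropolis-Hastings for a one-dimensional target.

    log_target : log of the (possibly unnormalized) target density
    step       : standard deviation of the normal proposal (fixed variance)
    """
    samples = np.empty(n_iter)
    x = x0
    log_p = log_target(x)
    for t in range(n_iter):
        # Generate a candidate from a normal proposal centered at the current state.
        cand = x + step * rng.standard_normal()
        log_p_cand = log_target(cand)
        # Accept with probability min(1, target(cand) / target(x));
        # the proposal terms cancel because the proposal is symmetric.
        if np.log(rng.uniform()) < log_p_cand - log_p:
            x, log_p = cand, log_p_cand
        # On rejection the chain stays at the current value.
        samples[t] = x
    return samples

# Toy target for illustration only: a normal density with mean 3 and variance 1.
log_target = lambda x: -0.5 * (x - 3.0) ** 2
rng = np.random.default_rng(1)
chain = metropolis_hastings(log_target, x0=0.0, n_iter=20000, step=1.0, rng=rng)
kept = chain[2000:]  # discard an initial burn-in portion
print("posterior mean estimate:", kept.mean())
```

Working with log densities avoids numerical underflow, and only the ratio of target densities is needed, so the normalizing constant of the posterior never has to be computed.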
Predicted data under the BP model can be obtained by drawing new parameters using the marginalized version of the model. We first do MCMC iterations to obtain the current values of hyperparameters. We use a normal proposal distribution with the current state as mean and a fixed variance to draw the new hyperparameters. Then, given the current hyperparameters, new parameters (
In the BNP model, we use density functions defined in (
Let
Draw a candidate
Accept the candidate value or
where
Repeat the above steps a few times.
Next, we need to update the (
Let (
Draw a candidate (
Set
and
Repeat the above steps a few times.
Then the predicted values for each renewal group are drawn from function (
As previously stated, the MCMC method discards an initial set of samples as burn-in. A longer burn-in gives the chain more time to reach its stationary distribution before samples are retained. Therefore, to determine the appropriate burn-in length for the BNP model, we compare model performance using two burn-in lengths: 1,000 iterations and 10,000 iterations. The results of the comparison on simulated data can be seen in Table B.1, and the comparison for one group of real data is shown in Table B.2. Both tables indicate that a burn-in of 10,000 iterations yields a lower MSE than a burn-in of 1,000 iterations for the BNP model. The large MSE value on real data reflects the scale of the currency in which the claim amounts are recorded.
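The effect of burn-in length can be illustrated with a toy chain that starts far from the target. The sketch below is an assumption-laden illustration, not the BNP sampler of this study: it uses a standard normal target, a hypothetical helper `random_walk_chain`, and much shorter burn-in lengths than those compared in the tables.

```python
import numpy as np

def random_walk_chain(log_target, x0, n_iter, step, rng):
    """Random-walk Metropolis chain with a normal proposal (illustrative helper)."""
    out = np.empty(n_iter)
    x, log_p = x0, log_target(x0)
    for t in range(n_iter):
        cand = x + step * rng.standard_normal()
        log_p_cand = log_target(cand)
        if np.log(rng.uniform()) < log_p_cand - log_p:
            x, log_p = cand, log_p_cand
        out[t] = x
    return out

rng = np.random.default_rng(2)
# Standard normal target (true mean 0), with a deliberately poor starting value.
chain = random_walk_chain(lambda x: -0.5 * x ** 2, x0=50.0, n_iter=6000, step=0.5, rng=rng)

# Error of the estimated mean when almost nothing is discarded versus a longer burn-in.
err_short = abs(chain[10:].mean())
err_long = abs(chain[2000:].mean())
print("error with short burn-in:", err_short)
print("error with long burn-in: ", err_long)
```

With a short burn-in, the retained samples still include the initial transient in which the chain travels from its starting value toward the high-density region, which biases any summary computed from them.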
MSE of all groups in the simulated data
Model | Unimodal case | Multimodal case |
---|---|---|
BP | 0.689 | 82.386 |
BNP | 0.204 | 9.254 × 10⁻⁶ |
MSE of one group in the simulated data
Model | Unimodal case | Multimodal case |
---|---|---|
BP | 4.680 | 0.042 |
BNP | 0.419 | 0 |
Descriptive statistics of real data from policies in 2018 and 2019
Data | Year | n insured | n group | Mean | Std. Dev. | Maximum | Skewness | Kurtosis | |
---|---|---|---|---|---|---|---|---|---|
All Groups | 2018 | 9,338 | 112 | 3,224 | 13,440 | 463,333 | 13.47 | 284.93 | 67.18% |
All Groups | 2019 | 8,843 | 47 | 4,951 | 21,101 | 728,832 | 15.72 | 379.38 | 56.09% |
Company F | 2018 | 162 | 1 | 6,350 | 15,308 | 171,800 | 8.30 | 85.90 | 29.01% |
Company F | 2019 | 193 | 1 | 7,369 | 14,865 | 101,027 | 3.89 | 17.18 | 56.09% |
MSE of a representative group in real data
Model | Absolute mean difference | MSE |
---|---|---|
BP | 3,430 | 9.857 × 10⁵ |
BNP | 1,358 | 1.928 × 10⁵ |
MSE of BNP model for simulated data per burn-in length
Data | Unimodal case, 1,000 burn-in | Unimodal case, 10,000 burn-in | Multimodal case, 1,000 burn-in | Multimodal case, 10,000 burn-in |
---|---|---|---|---|
All groups | 5.295 × 10⁸ | 0.204 | 2.179 × 10⁸ | 9.254 × 10⁻⁶ |
One group | 1.881 × 10⁸ | 0.419 | 0.839 × 10⁶ | 0 |
MSE of BNP model for real data per burn-in length
Model | MSE, 1,000 burn-in | MSE, 10,000 burn-in |
---|---|---|
BNP | 4.655 × 10⁸ | 1.928 × 10⁵ |