search for

CrossRef (0)
Predictive analysis in insurance: An application of generalized linear mixed models
Communications for Statistical Applications and Methods 2023;30:437-451
Published online September 30, 2023
© 2023 Korean Statistical Society.

Rosy Oh1,a, Nayoung Woob, Jae Keun Yoob, Jae Youn Ahnb

aDepartment of Mathematics, Korea Military Academy, Korea;
bDepartment of Statistics, Ewha Womans University, Korea
Correspondence to: 1 Department of Mathematics, Korea Military Academy, PO Box 77-1 Nowon-gu, Seoul 01805, Korea. E-mail: rosy.oh5@gmail.com
Received February 7, 2023; Revised June 7, 2023; Accepted July 11, 2023.
Generalized linear models and generalized linear mixed models (GLMMs) are fundamental tools for predictive analyses. In insurance, GLMMs are particularly important, because they provide not only a tool for prediction but also a theoretical justification for setting premiums. Although thousands of resources are available for introducing GLMMs as a classical and fundamental tool in statistical analysis, few resources seem to be available for the insurance industry. This study targets insurance professionals already familiar with basic actuarial mathematics and explains GLMMs and their linkage with classical actuarial pricing tools, such as the Bühlmann premium method. Focus of the study is mainly on the modeling aspect of GLMMs and their application to pricing, while avoiding technical issues related to statistical estimation, which can be automatically handled by most statistical software.
Keywords : generalized linear model, generalized linear mixed model, predictive analysis, ratemaking, Buhlmann premium, premium, insurance
1. Introduction

Generalized linear models (GLMs) and generalized linear mixed models (GLMMs) are fundamental tools for predictive analysis. These statistical tools have motivated many modern insurance-pricing techniques (Nelder and Verrall, 1997; Antonio and Beirlant, 2007; Wuthrich, 2019). Meanwhile, the credibility theory is a field within actuarial science that focuses on calculating risk premiums. It originated from Mowbray (1914) and Whitney (1918), who aimed to determine a premium that balanced the experience of an individual policyholder with that of a group of similar policyholders (Nelder and Verrall, 1997). While GLMMs and credibility theory are both widely used in risk classification in non-life insurance sectors, such as auto insurance, practitioners in these sectors tend to be more familiar with credibility theory.

GLMMs have several advantages over credibility theory. They offer a wider range of models and can be easily fitted and tested using standard statistical software. As a result, GLMMs provide additional modeling options for various actuarial problems, such as premium ratings and claims reserves. Furthermore, it is well known that credibility theory is closely related to the framework of GLMMs (Nelder and Verrall, 1997). While the insurance literature has some initial discussion on credibility theory based on GLMMs (Ohlsson, 2008; Lai and Sun, 2012), most assume above-average knowledge of the statistical aspects of GLMM, which makes its application inaccessible to traditional insurance practitioners.

This study aims to explain GLMs and GLMMs in plain actuarial language and demonstrate how to obtain a simple and fair premium. The remainder of the article is organized as follows. Section 2 introduces the notations used. Section 3 provides the general settings of risk classification procedures for insurance using the language of GLMs. Sections 4 and 5 explain the modeling and prediction methods for GLMs and GLMMs, respectively. Section 6 provides the theoretical link between GLMMs and credibility theory. Finally, the results are discussed in Section 7.

2. Definitions and notations

We denote ℕ, ℕ0, ℝ, and ℝ+ as the set of natural numbers, the set of non-negative integers, the set of real numbers, and the set of positive real numbers, respectively. We define Nit as the number of claims in the tth policy year for the ith policyholder. We denote the realization of Nit as nit. Actuarial science literature often refers to Nit as the frequency of insurance claims.

We differentiate a fixed effect from a random effect. The former is based on the policyholders’ observed risk characteristics, whereas the latter is based on the policyholders’ unobserved risk characteristics after controlling for fixed effects. Let the data matrix Xi of the ith policyholder have a dimension of τ × (m + 1), where τ is the number of observed years and m is the total number of explanatory variables. Denote


where Xit := (1, Xit1, . . ., Xitm) is the (m+1)-dimensional row vector of the a priori risk characteristics of the ith policyholder in the tth year. Let γi denote the policyholder-specific random effect of the ith policyholder. For simplicity, we assume that the random effect γi does not have the subscript t, because we assume that the a priori risk characteristics remain constant over time. Under GLM and GLMM, the distribution within the exponential dispersion family (EDF) has been addressed by McCullagh and Nelder (1989) and Breslow and Clayton (1993). EDF can be viewed as an abstraction of many well-known distributions, including normal, gamma, and Poisson distributions. This provides a unified approach to statistical modeling. However, owing to its abstract form, it may be difficult to understand intuitively. Therefore, we do not explore the benefits of a generalized form of EDF in this study, nor do we attempt to identify the best parametric assumption for the distribution that fits the data. Instead, we focus primarily on Poisson distribution because the general idea can easily be extended to other distributions within the EDF.

We use Pois(λ) to denote a Poisson distribution with mean λ. Using gamma(μ, ψ), we denote a gamma distribution whose mean and variance parameters are μ and μ2ψ. Gamma(μ, ψ) can also be interpreted as a gamma distribution with the shape parameter 1/ψ and scale parameter μψ. We use N(μ,σ2) to denote a normal distribution, whose mean and variance are given by μ and σ2, respectively. For Y ~ N(μ,σ2), we define the distribution of exp(Y) as lognormal(μ,σ2).

3. General setting

In terms of data, for each policyholder i = 1, . . ., z in τ policy years, with m explanatory variables, the dataset has the following format:


where z is the total number of policyholders and xit = (xit1, xit2, . . ., xitm). An example of a typical dataset is listed in Table 1.

First, we are interested in a fair premium, defined as , for the ith policyholder in calendar year τ* = τ + 1. Throughout this study, although we do not specify so in the mathematical notation, we assume that the explanatory variables are always known and conditioned. Hence, for example, the fair premium should be understood as


The fair premium for each policy year is determined solely by the fixed effect. In insurance, a fair premium based on the fixed effect is called the a priori rate and the risk classification process based on the fixed effect is called the a priori risk classification procedure. The premium determined by the a priori risk classification procedure can be further updated based on the claim history as follows:


which is called the a posteriori rate. This premium adjustment mechanism based on claim history is called the a posteriori risk classification procedure. Typically, in insurance, the risk characteristics for a year are known at the beginning of each year. Hence, in predicting the premium in the τ*th year, the explanatory variables xi1, . . ., xiτ* are assumed to be known. In the following sections, we examine the a posteriori risk classification procedure in (3.2) under various model assumptions.

4. Generalized linear model

GLMs are a flexible method that extend the fixed-effect linear regression model to response variables, which are not normally distributed. The expected value of Nit depends on the observed risk characteristics with the following relation:


where Xitβ is the linear predictor and η(·) is the link function, which relates to a linear predictor.

4.1. Modelling and estimation of the GLM

Model 1. (Poisson GLM)Consider z different policyholders with the observation given in the form of (3.1) and assume mutual independence among policyholders. Furthermore, we assume the following joint distribution of (Ni1, . . ., Niτ*):

  • The distribution of the ithpolicyholder’s frequency Nitin the tthyear is specified as

    Nit~Pois (Λit),


    Λit:=exp (Xitβ)=exp (β0+β1Xit1++βmXitm)

    with the coefficient parameters (fixed effects) β := (β0, β1, . . ., βm)′. Here, we assume the link function η to be a log function that gives the exponential expression in (4.1).

  • Ni1, . . ., Niτ*are independent.

In Model 1, the only unknown parameters are the coefficients β shared by each policyholder. Hence, once the coefficients β are known, the conditional distribution of frequency for each policyholder in the future calendar year τ* = t + 1 can be determined exactly as

Niτ*~Pois (Λiτ*),

with the a priori rate for the ith policyholder in calendar year τ* given by .

We assume that we have a dataset in the form of (3.1) with z policyholders and that each policyholder is observed for τ years. Then, the log-likelihood function for Model 1 is written as

(β):=i=1zt=1τlog (fβ(nit)),

where fβ (nit) is the probability mass function of the Poisson distribution given by

fβ(nit):=exp (-Λit)(Λit)nitnit!,

where Λit = exp(Xitβ). For the estimation of unknown coefficients β, typically, maximum likelihood estimation (MLE) is considered and the estimator of β is defined as


4.2. Prediction in the GLM

In this subsection, we predict the a posteriori rate in (3.2), which is the premium in the τ*th year. given the claim history up to the τth year. Under the settings in Model 1, the a posteriori rate defined in (3.2) can be obtained as


based on the assumption of independence in Model 1. The a posteriori rate in (4.2) coincides with the a priori rate. As (4.2) shows, the a posteriori rate is not a function of the past claim history under the setting in Model 1 and there is no premium adjustment by the claim history ni1, . . ., niτ.

4.3. Data example

Throughout this study, we use the dataset for collision coverage on vehicles collected by the Wisconsin Local Government Property Insurance Fund (LGPIF) from 2006 to 2011 (Frees et al., 2016). Collision coverage covers the impact of a vehicle on an object, the impact of a vehicle on an attached vehicle, or the overturning of a vehicle. The datasets contain 497 government entities. Each policyholder has six levels of entity types and average coverage over time as explanatory variables as shown in Table 2. For further explanation of the dataset, refer to Oh et al. (2020).

To fit the GLM in Model 1, we can use the glm function in R, whose form is given by

 > glm(formula, family=familytype(link=linkfunction), data=dataset,. . .). 

The specific form of the function under the setting in Model 1 is given by

 > glm(Frequency~factor(entityType)+avgCoverage, family=poisson(link=log), data=mydata). 

If the categorical variable (entity type) is presented as a dummy variable (e.g., TypeCity, TypeCounty) in a dataset, the formula specifying the fixed effects part of the function is modified as follows:

 > glm(Frequency~TypeCity+TypeCounty+· · · +avgCoverage, family=poisson(link=log), data=mydata). 

The estimation results of this model are summarized in Table 3. The asterisk (*) signs indicate that the parameters are significant at the significance level of 0.05. Based on the table, the a posteriori rate, which is equivalent to the a priori rate, can be obtained as

E[Nit]^=exp (Xitβ^).

Hence, the estimated parameter β̂1, for example, can be interpreted as the effect of the entity type of the county compared to the baseline, miscellaneous entity type, in the logarithm of the estimated a posteriori rate for the same average coverage. The table indicates that the county entity type is expected to incur more accidents than any other entity type having the same amount of coverage. Within the same entity type, a higher average coverage corresponds to a higher expected accident count.

Moreover, the third policyholder in the dataset, for example, is in a county and has an average coverage of 2.495, so that under Model 1, the estimated expectation is

exp (β^0+β^2+β^62.495)=exp (-2.339+2.633+0.195×2.495)=2.182.
5. Generalized linear mixed model

GLMMs are a popular method to model correlated data; they extend GLMs by including random effects in the predictor. The conditional expected value of Nit|γi depends on the observed risk characteristics using the following relation:


where γi is the random effect and η(·) is a link function that converts into a linear predictor Xitβ. The random effect γi measures the difference between the average score of policyholder i and the average score in the portfolio.

5.1. Modelling and estimation of the GLMM

In Model 1, we assume that the selected explanatory variable Xi1, . . ., Xiτ* can explain the distribution of the frequency. However, in reality, it is almost impossible to describe human behavior and contributors are often not observed or overlooked. These unobserved components can be used in a regression setting, as shown in the following example:

Example 1. Consider Model 1 with three explanatory variables: Xit1, Xit2, and Xit3. If we assume that they are observed, then the a priori premium is

Λit=exp (β0+β1Xit1+β2Xit2+β3Xit3).

Now, we assume that only the first explanatory variable Xit1 is observed and the other explanatory variables, Xit2 and Xit3, are not. Then, a reasonable description of the a priori premium for frequency under this assumption is

Λit:=exp (β0+β1Xit1).

If the unobserved explanatory variables are denoted as


where the explanatory variables are assumed invariant over time, we have the following conditional model:

Nitγi~Pois (exp (β0+β1Xit1+γi)).

By further defining Ri := exp(γi), the conditional distribution in (5.1) can be represented as

NitRi~Pois (RiΛit).

Regardless of whether observed or unobserved, the explanatory variables can also be assumed to be random variables. By placing an additional distributional assumption on the unobserved explanatory variables, Example 5.1 can be formally represented as follows:

Model 2. (Poisson GLMM)Consider z policyholders with the observation given in the form of (3.1), assuming mutual independence among policyholders. Furthermore, we assume the following conditional joint distribution of (Ni1, . . ., Niτ* ):

  • Conditional on the random effect Ri, the distribution of the ithpolicyholder’s frequency Nitin the tthyear is specified as

    NitRi~Pois (RiΛit),


    Λit:=exp (xitβ)=exp (β0+β1Xit1++βmXitm)

    with the coefficient parameters (fixed effects) β := (β0, β1, . . ., βm)′. Here, we assume the link function η to be a log function that gives the exponential expression in (5.2).

  • The random effect Rifollows distribution function G with a parameter ψ.

  • Furthermore, we assume that Ni1, . . ., Niτ*are independent conditional on Ri.

In the case of a normal random effect, that is, γi ~ N(0, σ2), we have Ri ~ lognormal(0, σ2). Otherwise, if the random effect distribution is non-normal, the mixed model is called a hierarchical generalized linear mixed model (HGLMM) (Lee and Nelder, 1996). However, we do not differentiate between GLMMs and HGLMMs in this study. Although it is possible to assume that some of the parameters in (5.2) are random effects, known as random slopes in the classical GLMM literature, we do not pursue this direction for the sake of simplicity. We refer the reader to Zimmer (2018) for information on the impact of random slopes on insurance pricing.

In Model 2, the mean of the random effect Ri is often set to be one for the identifiability issue. For example, we may assume

Ri~gamma (1,ψ)

with the mean of 1 and the dispersion parameter ψ. We refer to Section 2 for the details of parameterization. Note that using the random effect Ri introduces longitudinal dependence among Nits. Specifically, in Model 2, the law of total covariance is derived as

cov [Nit1,Ni,t2]=cov [E[Nit1Ri],E[Nit2Ri]]+E[cov [Nit1,Nit2Ri]]=cov [Λi,t1Ri,Λi,t2Ri]+0=Λit1Λit2Var(Ri).

However, in Model 1, we have

cov [Nit1,Nit2]=0.

An important implication in the comparison of the covariance in (5.4) under Model 2 and the covariance in (5.5) under Model 1 is that the random effect induces dependence among the observations. Thus, the random effect not only explains the unobserved explanatory components but also introduces dependence among observations. Hence, because the observations of the ith policyholder,


are not independent under Model 2, the corresponding likelihood function under Model 2 cannot be written as the product of the likelihood function. Instead, we have the following log-likelihood function for the observations:

i(β,ψ):=log (t=1τfβ(nitri)gψ(ri)dri),

which requires one-dimensional integration. Here, fβ(· | ri) is the probability mass function of the Poisson distribution defined by

fβ(nitri):=exp (-Λitri)(Λitri)nitnit!,

and gψ is the probability density function of the random effect in (5.3). From a purely mathematical perspective, the introduction of a random effect in Model 2 is for modeling dependence within the observations. In this sense, the random-effect model is linked to the copula model (Frees and Valdez, 1998; Embrechts, 2009; Krupskii and Joe, 2013).

Next, for data in the form of (3.1), with z policyholders and each policyholder observed for τ years, the log-likelihood function for Model 2 is written as


For the estimation of unknown coefficients β and parameter ψ, MLE is usually considered, and the estimators of β and ψ are typically defined as


5.2. Prediction in the GLMM

The goal of the a posteriori risk classification is to predict the net premium for each policyholder. In Model 2, by definition, we have the following conditional expectation:


Because the random effect for the ith policyholder, Ri, is an unknown random variable, the conditional expectation in (5.8) cannot be used directly to predict the a posteriori rate in (3.2). However, in combination with the prediction of a random effect,


the conditional expectation in (5.8) can calculate the a posteriori rate in (3.2) as follows:


Hence, the prediction of the a posteriori rate in Model 2 is reduced to the estimation of predicting the random effect in (5.9). Note that the prediction of the random effect in (5.9) allows for a closed-form expression depending on the choice of distributional assumption for the random effect. Otherwise, the random effect in (5.9) can be predicted numerically, either based on numerical integration or a Monte Carlo simulation. The next subsection explains a specific method for predicting the random effect in (5.9).

5.3. Parametric assumptions for the random effect in the generalized linear mixed model

Model 3. Considering the settings in Model 2, we further assume that the parametric assumption for Riin (5.3) is

Ri~gamma (1,ψ).

In Model 3, the likelihood function in (5.6) can be analytically expressed as

i(β,ψ)=log (t=1τexp (-rλit)(rλit)nitnit!1Γ(1/ψ)ψ1/ψ(r1/ψ-1)exp (-r/ψ)dr)=log (1Γ(1/ψ)ψ1/ψΓ(1/ψ+t=1τnit)(1/ψ+t=1τλit)-(1/ψ+t=1τnit)t=1τλitnitnit!),

where λit denotes the realization of Λit in (4.1).

Hence, the statistical estimation procedure is based on the simple optimization of (5.11), which does not require complicated numerical integration. Furthermore, since the gamma distribution is a conjugate prior of the Poisson distribution, the posterior mean of the random effect in (5.9) can be analytical obtained as follows:


from the fact that the conditional distribution Ri|ni1, . . ., niτ follows a gamma distribution with shape and scale parameters given by

1/ψ+t=1τnit         and         11/ψ+t=1τλit,

respectively. In the conjugate families, the posterior mean of the individual’s random effect in (5.12) can be represented by a weighted average between the prior mean of Ri (weighted by its precision) and the ratio of the total number of insurance claims across τ years for policyholder i to the sum of the respective rate parameters, (t=1τnit)/(t=1τλit) (weighted by the sum of the rate parameters):


Because later acts as a multiplier of the expectation from the fixed effects alone (see the example at the end of Section 5, under Model 3), suggests no modification in the premium. Additionally, the ratio (t=1τnit)/(t=1τλit) indicates how the observed number of claims relates to the expected number of claims over time for policyholder i. For example, if t=1τnit>t=1τλit, then this would lead to a ratio greater than one, resulting in .

So far, we have analyzed the properties of GLMMs in Model 2, in which the random effect Ri is assumed to follow a gamma distribution. This distributional assumption is convenient because it provides an analytical expression for the likelihood function in (5.11) and the analytical conditional distribution of Ri. However, inherited from the linear mixed model, in which both the response and random variables are assumed to follow a normal distribution, the more popular choice of distributional assumption for Ri is lognormal distribution. The following model provides a GLMM model with lognormal assumption for Ri.

Model 4. We now consider the settings in Model 2 and we further assume that the parametric assumption for Riis

Ri~lognormal (0,ψ).

We have

E[Ri]=exp(ψ/2)         and         VabRi=(exp(ψ)-1)exp(ψ)

under the parameterization in Model 4. Unlike the gamma distributional assumption for the random effect as in Model 3, the lognormal distributional assumption for the random effect does not lead to a closed-form expression for the likelihood function in (5.6). Hence, for the statistical estimation procedure in (5.7), numerical integration is required in (5.6). Standard numerical integration methods, such as the Laplace approximation (Laplace, 1986) and Gaussian quadrature (Golub and Welsch, 1969), can efficiently approximate the likelihood function for GLMMs. We explain that the a posteriori rate in (5.10) can be approximated as an affine function of observations in Section 6.1. In the next subsection the use of statistical programming for GLMMs is explained.

5.4. Data example

To fit the GLMMs in Model 2, we consider the hglm function in the hglm package (Rönnegård et al., 2010) in R. The function is for fitting HGLMs via extended quasi-likelihood (Lee et al., 2006), which extends GLMs by relaxing the assumption that error components are independent. Its grammatical form is as follows:

> hglm(formula, family=familytype(link=linkfunction),
random=var, rand.family=familytype(link=linkfunction),
data=dataset,. . .).

This is similar to the glm function in R but with additional terms for the random effect: random and rand.family. First, we specify the policyholder-specific random effect of interest as random=~1|Polic yNum. To easily specify the distribution of the random effect, we consider the conditional expectation of frequency as

E[Nitγi]=η-1(xitβ+γi)=exp (xitβ+γi),

and having a log link function. To use the gamma distribution for the random effect in the function, the random effect γi is given by γi = log(Ri) and Ri ~ gamma(1, ψ). Thus, we need a log link function to specify the distribution for the random effect as rand.family=gamma(link=log). Similarly, for the log-normal distribution, we can use the code rand.family=gaussian(link=identity), which implies that γi ~ N(0, ψ); that is, Ri ~ lognormal(0, ψ). The specific form of the function under the setting of Model 3 is

> hglm(Frequency~TypeCity+· · · +avgCoverage,

Under the settings in Model 4,

> hglm(Frequency~TypeCity+· · · +avgCoverage,

Alternative functions to fit GLMM are the glmer function of the lme4 package and the glmmPQL function of the MASS package. The
glmer function is commonly used, and it fits the model via maximum likelihood. Meanwhile, glmmPQL uses the penalized quasi-likelihood approach to evaluate the likelihood of GLMM. The functions in Model 4 are

> glmer(Frequency~TypeCity+ · · · +avgCoverage+(1|PolicyNum),


> glmmPQL(Frequency~TypeCity+· · · +avgCoverage,

The estimation results of these models for the parameters are summarized in Table 4. The estimates for the fixed effects differ slightly depending on the distribution of the random effect; however, we have the same interpretation for the tendency in risk characteristics. In this example, the function provides an estimate of the dispersion parameters for the random effect. Because the dispersion parameter of the gamma distribution is the reciprocal of the shape parameter and has a mean of 1, ψ is implied. Furthermore, using the code print(summary(res), print.ranef = TRUE), we obtain estimates of random effects for all policyholders. Table 5 lists the estimates for selected policyholders in Models 3 and 4. This function provides a different form of estimates depending on the link function of the random effect. From the estimation results in Tables 4 and 5, we obtain the estimated conditional expectation as follows:

E^[Niτ*ni1,,niτ]=exp (xiτ*β^+γ^i).

Here, γ̂i changes over time, based on historical observations, whereas the corresponding expectation of GLMs in (4.3) remains invariant with respect to time and historical observations. For example, the third policyholder in the dataset is in a county and has an average coverage of 2.495; thus, under Model 3, the estimated conditional expectation is

exp (β^0+β^2+0.38256β^6)·R^3=exp (-2.356+1.940+0.383×2.495)×2.074=3.555,

and under Model 4, it is

exp (β^0+β^2+0.38256β^6+γ^3)=exp (-2.849+2.229+0.337×2.495+0.995)=3.375.
6. Bühlmann method and GLMM

Although GLMMs offer a unified approach for calculating the a posteriori insurance rate, they often lack closed-form expressions, except for a few distributional assumptions; for instance, see Model 3. In this section, we demonstrate the incorporation of the credibility theory into the framework of GLMMs by providing a closed-form approximation of the a posteriori rate.

For simplicity, we consider the settings in Model 2 with the further assumption that the risk characteristics are the same across years; that is, Xi1 = · · · = Xiτ* . While one of the main interests of insurance is the prediction of the net premium of each policyholder defined in (5.8) using the a posteriori rate defined in (3.2), the closed form of the a posteriori rate defined in (3.2) is generally not possible. Alternatively, we may consider approximating the a posteriori rate for the ith policyholder defined in (3.2) as the affine function of the historical observations, by the following:



(α^0,,α^τ):=arg min(α0,,αr)τE[(E[Niτ*Ri]-(α0+α1Ni1++ατNiτ))2].

Following the classical procedure in the Bühlmann premium (Bühlmann and Gisler, 2006), the optimal premium for the ith policyholder in (6.1) is obtained as

α^0=(1-Zτ)E[Niτ*]         and         α^1==α^τ=Zττ,

where Zτ, the Bühlmann factor, is given by


Furthermore, , and can be obtained as


from equations



u(Ri):=E[Niτ*Ri]=RiΛi         and         v(Ri):=Var(Niτ*Ri)=RiΛi.
7. Discussion and conclusion

In this study, we present an intuitive explanation of the structure of GLMs and GLMMs and their relationship with insurance pricing methods; and we explain their connection with the Bühlmann method in insurance. With recent advances in computing tools, such as deep neural networks, insurance pricing can become more accurate through the use of complex predictive expressions. However, the traditional credibility method, which seeks an intuitive representation of insurance prices, seems at odds with modern statistical techniques. An interesting future topic is to explore how to adapt modern statistical computing tools to the credibility method in order to improve accuracy while preserving its intuitive and simple form.

Here, we mainly focus on the Poisson distribution in GLM and GLMMs to model the frequency of insurance claims, which are commonly used for count data. However, based on the assumption that the variance and mean are the same in the distribution, there are limitations, such as incorrect inference when overdispersion is present in the dataset. To address overdispersion in the Poisson model, overdispersion adjustments can be made using the quasi-likelihood method (quasi-Poisson model) or an alternative distribution method, such as negative binomial distribution. We note that, with some limitation (Lee et al., 2020), the problem of overdispersion can be reduced for Poisson distribution by using the GLMM framework.


Table 1

Typical data example

IDCalendar yearSalary (Xit1)Home value (Xit2)· · ·Age (Xitm)Frequency (Nit)
12017150020500· · ·370
12018147020900· · ·381
12019156320500· · ·390
1022017140145000· · ·522
1022018157347000· · ·530
1022019156250000· · ·540

Table 2

Observable policy characteristics used as covariates

Continuous variableDescriptionProportions
Type of local government entity
Entity typeCounty11.47%

Continuous variableMinMeanMax

avgCoverageCollision coverage amount0.0091.09713.135

Table 3

Estimation results under Model 1

Fixed effect (Intercept)β̂0−2.3390.302−7.758<0.0001*
Type = Cityβ̂11.6180.3145.158<0.0001*
Type = Countyβ̂22.6330.3078.572<0.0001*
Type = Schoolβ̂31.0210.3093.310<0.0001*
Type = Townβ̂4−0.7300.387−1.8880.0591
Type = Villageβ̂50.8350.3172.6320.0085*

Table 4

Estimation results for the parameters under Model 3 and Model 4

ParameterModel 3Model 4

Fixed effect (Intercept)β̂0−2.3560.319−7.373<0.0001*−2.8490.364−7.829<0.0001*
Type = Cityβ̂11.0480.3702.8330.0047*1.3230.4173.1770.0015*
Type = Countyβ̂21.9400.3795.121<0.0001*2.2290.4225.284<0.0001*
Type = Schoolβ̂30.9200.3332.7660.0057*0.9170.3792.4180.0157*
Type = Townβ̂4−0.9090.390−2.3310.0199*−0.7130.442−1.6120.1072
Type = Villageβ̂50.4340.3491.2450.21310.6130.3961.5490.1216

Random effectψ1.0721.191

Table 5

Selected estimation results for random effects under Models 3 and Model 4

Model 3Model 4


  1. Antonio K and Beirlant J (2007). Actuarial statistics with generalized linear mixed models. Insurance: Mathematics and Economics, 40, 58-76.
  2. Breslow NE and Clayton DG (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9-25.
  3. B체hlmann H and Gisler A (2005). A Course in Credibility Theory and Its Applications (317), Berlin, Springer.
  4. Embrechts P (2009). Copulas: A personal view. Journal of Risk and Insurance, 76, 639-650.
  5. Frees EW, Lee G, and Yang L (2016). Multivariate frequency-severity regression models in insurance. Risks, 4, 4.
  6. Frees EW and Valdez EA (1998). Understanding relationships using copulas. North American Actuarial Journal, 2, 1-25.
  7. Golub GH and Welsch JH (1969). Calculation of gauss quadrature rules. Mathematics of Computation, 23, 221-230.
  8. Krupskii P and Joe H (2013). Factor copula models for multivariate data. Journal of Multivariate Analysis, 120, 85-101.
  9. Lai TL and Sun KH (2012). Evolutionary credibility theory: A generalized linear mixed modeling approach. North American Actuarial Journal, 16, 273-284.
  10. Laplace PS (1986). Memoir on the probability of the causes of events. Statistical Science, 1, 364-378.
  11. Lee W, Kim J, and Ahn JY (2020). The poisson random effect model for experience ratemaking: Limitations and alternative solutions. Insurance: Mathematics and Economics, 91, 26-36.
  12. Lee Y and Nelder JA (1996). Hierarchical generalized linear models. Journal of the Royal Statistical Society: Series B (Methodological), 58, 619-656.
  13. McCullagh P and Nelder JA (1989). Generalized Linear Models (2nd ed), London, Chapman and Hall.
  14. Mowbray AH (1914). How extensive a payroll exposure is necessary to give a dependable pure premium. Proceedings of the Casualty Actuarial Society, 1, 24-30.
  15. Nelder JA and Verrall RJ (1997). Credibility theory and generalized linear models. ASTIN Bulletin: The Journal of the IAA, 27, 71-82.
  16. Oh R, Shi P, and Ahn JY (Array). Bonus-malus premiums under the dependent frequency-severity modeling. Scandinavian Actuarial Journal, 172-195.
  17. Ohlsson E (Array). Combining generalized linear models and credibility models in practice. Scandinavian Actuarial Journal, 301-314.
  18. R철nneg책rd L, Shen X, and Alam M (2010). hglm: A package for fitting hierarchical generalized linear models. The R Journal, 2, 20-28.
  19. Whitney A (1918). The theory of experience rating. Proceedings of the Casualty Actuarial Society, 4, 274-292.
  20. Wuthrich MV (2021) From Generalized Linear Models to Neural Networks, and Back. [S.l.]: SSRN .
  21. Zimmer DM (2018). The heterogeneous impact of insurance on health care demand among young adults: A panel data analysis. Journal of Applied Statistics, 45, 1277-1291.