search for

CrossRef (0)
A spatial heterogeneity mixed model with skew-elliptical distributions
Communications for Statistical Applications and Methods 2022;29:373-391
Published online May 31, 2022
© 2022 Korean Statistical Society.

Mohadeseh Alsadat Farzammehr1,a, Geoffrey J. McLachlanb

aIranian Judiciary Research Institute, Iran;
bDepartment of Mathematics, University of Queensland, Australia
Correspondence to: 1 Iranian Judiciary Research Institute, Tehran, Iran. E-mail: ma.farzammehr@gmail.com
Received November 9, 2021; Revised January 23, 2022; Accepted March 24, 2022.
The distribution of observations in most econometric studies with spatial heterogeneity is skewed. Usually, a single transformation of the data is used to approximate normality and to model the transformed data with a normal assumption. This assumption is however not always appropriate due to the fact that panel data often exhibit non-normal characteristics. In this work, the normality assumption is relaxed in spatial mixed models, allowing for spatial heterogeneity. An inference procedure based on Bayesian mixed modeling is carried out with a multivariate skew-elliptical distribution, which includes the skew-t, skew-normal, student-t, and normal distributions as special cases. The methodology is illustrated through a simulation study and according to the empirical literature, we fit our models to non-life insurance consumption observed between 1998 and 2002 across a spatial panel of 103 Italian provinces in order to determine its determinants. Analyzing the posterior distribution of some parameters and comparing various model comparison criteria indicate the proposed model to be superior to conventional ones.
Keywords : panel data, linear mixed model, spatial heterogeneity, multivariate skew-elliptical distributions, Bayesian hierarchical approach, MCMC
1. Introduction

In panel data, observations are compiled over several time periods on a cross section. In spatial econometrics, spatial panels are cross-sections of spatial units (such as regions, provinces, or countries), in which panel data analysis has received much attention.

The spatial heterogeneity structure can be used to analyze economic data that explicitly incorporates spatial information. Differences in time or space might cause spatial heterogeneity. Fundamentally, spatial heterogeneity is introduced by enabling the model to have heterogeneous spatial and temporal effects. Chesson (1985) described three types of variations in these models. The distinction is between pure spatial variation, which is constant from one time period to another, pure temporal variation, which is constant across time and space, and pure spatiotemporal variation, for which both spatial and temporal variations can occur simultaneously. The first two types consider spatial and temporal perspectives separately, so they omit the possibility of spatio-temporal variations when studying spatial heterogeneity.

A spatial heterogeneity involving spatial-temporal variations is examined in this paper. This is achieved through incorporating spatial sampling units (space-specific effects) into model structures. In such a setting, four types of space-specific effects as presented by Zheng et al. Zheng et al. (2008) are considered. Models for describing space-specific effects include (i) Homogeneous model (homogeneous), (ii) Heterogeneous model (heterogeneous), (iii) Fixed-effects model (fixed), and (iv) Random-effects model (random). These models are explained in Section 3.1.

Most spatial panel data exhibit characteristics like skewness in practice. A common technique in empirical econometric models is to transform the response and explanatory variables in order to apply classical models that are based on the normality assumption efficiently. Even though using a transformation to handle departures from normality may result in reasonable empirical results, it is in many cases inappropriate and restrictive. Frequently, an alternative theoretical model (that addresses skewness directly) performs better than data transformation (Farzammehr et al., 2019). Furthermore, data transformation comes with several limitations, such as reduced information and no guarantee of normality of the transformed data. Additionally, the transformed variables can be difficult to interpret, and the transformations may not be applicable to other datasets. In light of this, we propose new spatial heterogeneity panel models that offer greater flexibility in the distributional assumption of error terms to reduce the impact of the unrealistic normality assumption.

It is becoming increasingly popular as an analytical tool for data sets that exhibit non-normal characteristics to use multivariate skew-elliptical distributions. These distributions have additional features such as asymmetry and heavy tails that are not available in traditional normal and student-t distributions. The connections between the various multivariate skew-t distributions are analogues to those between the skew-normal distributions discussed in many studies (see, for example, the papers by Azzalini (2005), Arellano-Valle and Genton (2005), Arellano-Valle and Azzalini (2006), Lee and Mchlachlan (2013) and Azzalini (2014)).

Several extensions of standard linear mixed models have been proposed in the statistical literature that replace the normality assumption for the errors. Using space-specific effects, mixed models also provide a convenient framework for modeling spatial heterogeneity. To the best of our knowledge, hardly any work has been performed to discuss the simultaneous effects induced by spatial heterogeneity and non-normality, which are inherent to spatial panel data.

The spatial heterogeneity perspective should always be considered when dealing with panel analyses based on sets of spaces, since additional methodological issues arise with panels of spaces, where the relevant markets within spaces are well defined by the heterogeneity structure affecting panels studies. As such, insurance market is widely believed to be important for the economic growth of a country. Based on premiums per capita, we analyze non-life insurance consumption in Italy’s 103 provinces from 1998 to 2002. In order to accomplish this, we adopt an intermediate approach to the empirical study of insurance consumption, one that bridges existing panel studies and analyses of datasets on household income, wealth, and consumption. Note that the availability of data for Italian insurance premiums limits the analysis to the provincial level. Overall, the goal of this paper is to estimate the parameters of symmetric and asymmetric models. To do so, we follow the Bayesian hierarchical framework described in Jara et al. (2008), Lunn et al. (2009), which uses Markov chain Monte Carlo (MCMC) algorithm for estimating spatial heterogeneity panel model with multivariate skew-elliptical distributions. This alternative modeling approach has the advantage that models are easily fitted in free software R under OpenBUGS (Lunn et al., 2009), and that the computational effort is almost the same as for the standard model.

The rest of the paper is organized as follow. Section 2 presents the spatial heterogeneity panel model and Section 3 explains multivariate skew-elliptical distributions. Further, we discuss model selection criteria that may be used to evaluate the performance of our models. Section 4 is devoted to a simulation study examining the appropriateness of the proposed skew models. Bayesian analysis of non-life insurance data set is presented in Section 5. The main goal of the paper is to investigate the determinants of the development of the non-life insurance market for Italy. Finally, Section 6 includes concluding remarks.

2. The model

In order to begin, we define a regression model that is expressed as,


for i = 1, . . ., N space units, and t = 1, . . ., T time periods. In the above, we let yit denote the response variable at space i and time t, xit is a known 1×K vector of explanatory variables, and β is a K×1 vector of fixed but unknown regression parameters. The variable εit denotes an independently and identically distributed (iid) error term for space i and time t with zero mean and variance σ2. For λit, there are various assumptions depending on the application. One of them is λit = αi + ηt, which is typically used for models that include both time-dependent and space-dependent effects simultaneously. Other commonly adopted assumptions include λit = αi for models that have space-specific effects only. If only time-specific effects are required and expressed, the assumption λit = ηt can also be used. The present study examines models with space-specific effects.

Using (2.1), the matrix form of the simple model on panel data is as follows,


where yt = (y1t, . . ., yNt)’ and Xt = (x1t, . . ., xNt)’, respectively, denote the vector of the response variables and the matrix of the explanatory variables in all spaces at time t. Thus, ut = (u1t, . . ., uNt)’ is the model error component involving the sum of two disturbances. One of these disturbances is the N × 1 vector of αt = (α1, . . ., αN)’ for which, with appropriate specifications, we can obtain the four types of spatial panel models described in Zhang et al. (2008) (refered to as homogeneous, heterogeneous, fixed and random). This allows the model to account for differences among different spaces which is refered to as spatial heterogeneity structure. The other disturbance in ut is the vector of the remainder disturbances εt = (ε1t, . . ., εNt)’ in each period.

Let y=(y1,,yT) and X=(X1,,XT) be a concatenated form of yt and Xt, respectively, such that they denote the response and explanatory variables, respectively, where y denotes an NT ×1 vector of the response variable and X is a NT × K matrix of the non-stochastic exogenous regressors. In a similar manner, the (concatenated form of the) disturbance vector can be written as


where α = (INιT)αt, (ιp is a p × 1 vector of ones, Ip is a p × p identity matrix and the symbol ⊗ denotes the Kronecker product) and ɛ=(ɛ1,,ɛT) denotes the NT × 1 idiosyncratic error vector.

As previously mentioned, the common assumption for spatial panel models is that u follows a normal distribution. When the responses are skewed, the normality assumption may be unrealistic, so it would be useful to examine skew-elliptical distributions as an alternative to the normal distribution to account for skewness in the data. Here, for the homogeneous, heterogeneous, fixed and random space-specific effects, we consider the skew-elliptical distributions for the remainder error term.

Bayesian inference in the aforementioned models requires defining the prior distribution for all unknown parameters (2.2), from which a posterior distribution could be derived for model parameters. One solution for selecting prior distribution is to consider the conditionally non-informative conjugate priors proposed by Gelman and Hill (2006).

To achieve well-defined and proper posteriors, we assign conjugate weakly-informative priors, as in Jara et al (2008) methodology. It is important to note that a family of prior distributions p(ϕ) is conditionally conjugate for ϕ if the conditional posterior distribution p(ϕ|y) is also in that class. Besides these parameters, we also need to specify the prior distribution and derive the posterior distribution for different specification of unobserved effects αt.

It is necessary to decide, in methodological papers and also when applying spatial models to real data, whether the unobserved space-specific effect αi is to be treated as a random or a fixed effect. In conventional usage, αi is called a random effect when treated as a random variable, and a fixed effect when treated as a parameter to be estimated for each cross-sectional observation i. When αi refers to an unobserved random effect, it is frequently assumed that they are not correlated with the xit; that is, it is assumed that E(αi|xi1, . . ., xiT) = 0.

3. Multivariate skew-elliptical distributions

In the following, we give the formal definition of the unrestricted multivariate skew-normal (uMSN) and unrestricted multivariate skew-t (uMST) distributions according to Lee and McLachlan (2013) formulations as special cases of the class of skew-elliptical distributions by Arellano-Valle and Genton (2005). This type of multivariate skew-normal distribution was studied in Sahu et al. (2003). Suppose z0 and z1 are jointly normally distributed as


Let z denote a n-dimensional random vector. Then z = μ + Δ|z0| + z1 defines the convolution-type stochastic representation of the unrestricted multivariate skew-normal distribution, and its density is given by


where Ω = Σ+ΔΔ’ and Λ = InΔΩ−1Δ = (In+ΔΣ−1Δ)−1. In the above, μ is a n×1 vector of location parameters, Σ is a n × n diagonal positive definite scale matrix and Δ is a n × n diagonal matrix of skewness parameters. Further, φn(·; μ, Σ) and Φn(·; μ, Σ), respectively, denote the probability density function (pdf) and the cumulative distribution function (cdf) of a Nn(μ, Σ) random variable. Note that when μ = 0 and Σ = In; φn(·; μ, Σ) and Φn(·; μ, Σ) will be denoted by φn(·) and Φn(·) respectively.

We shall follow the terminology of Lee and McLachlan (2013) and adopt the notation z ~ uSNn(μ, Σ, Δ) if z has the uMSN density. It is noted that if Δ = 0, the second part of (3.2) evaluates to 2n, and we again recover the multivariate normal density φn(z; μ, Σ). Then the density of z reduces to the usual symmetric multivariate density, Nn(μ, Σ).

Following Sahu et al. (2003), the uMSN distribution admits a convenient hierachial representation. Let w = |z0|, the hierarchical representation of (3.2) is given by


where HNn(0, Σ) represents the n-dimensional half-normal distribution with mean 0 and scale matrix Σ.

Analogous to the skew-normal case, the uMST distribution has a similar stochastic representation to the uMSN distribution. Specifically, z = μ+Δ|z0|+ z1 has the uMST distribution, where conditional on the gamma variable r,


It follows that the density of z is given by


where c(z) = ΔΩ−1(zμ) and d(z) = (zμ)TΩ−1(zμ). This density is expressed as the product of a multivariate student-t pdf and a multivariate student-t cdf. In fact, Tn(·; μ, Σ, τ), is the cdf of tn(·; μ, Σ, τ). Here, the notation z ~ uSTn(μ, Σ, Δ, ν) will be used. In the latter case, ν corresponds to the degrees of freedom parameters. When ν→∞, the multivariate skew-t distribution approaches the multivariate skew-normal distribution and reduces to the multivariate normal distribution when Δ = 0.

Similar to the uMSN version, the uMST admits a convenient hierarchial representation,


Here, for the homogeneous, heterogeneous and fixed region-specific effects, we only consider the skew-elliptical distributions (uMSN and uMST) for the remainder error component. For the random region-specific effect, we allow both εt and αt to have a skew-t, skew-normal, student-t, and normal distributions.

The proposed panel model incorporates all types of spatial heterogenity mentioned above and is referred to as a skew-elliptical linear spatial panel model. The MCMC computations for the skew-elliptical panel model will be derived based on the following spatial hierarchical Bayes model. The hierarchical representation is important for obtaining the parameter estimates and will facilitate easy implementation in commonly used softwares such as R and OpenBUGS. In this case, from the stochastic representation, and by introducing the N × 1 random vector wk, the hierarchical representation of the model can be formulated as follows. For t = 1, . . ., T we have,


where I(wk > 0) is an indicator function which is 1 if every element of wk is greater than zero, and zero otherwise, and wk = |ξ| with ξ ~ Nk(0, Ik). Here, |ξ| denotes the vector with ith element equal to the magnitude of the ith element of ξ.

3.1. Distributions of priors and posteriors

In order to implement Bayesian inference for the aforementioned models, we need to specify the prior distribution for all unknown parameters in the models. A popular choice is to consider proper conditionally non-informative conjugate priors as suggested by Gelman and Hill (2006). To obtain well defined and appropriate posteriors we assign conjugate weakly informative priors. Note that, a family of prior distributions p(ϕ) is conditionally conjugate for ϕ if the conditional posterior distribution (ϕ|y) is also in that class.

The regression parameter β is customarily modelled using a multivariate normal distribution N(β0, Sβ0). Usually vague priors are specified by taking Sβ0 to be a diagonal matrix with large elements. The variance components σɛ2 and σα2 are taken to have an inverse gamma distribution and the skewness parameters δε and δα are taken to have independent truncated normal distributions. More specifically, we adopt IG(σɛ02,γɛ0) and IG(σα02,γα0) for σɛ2 and σα2, respectively, and N(με, γε)I(δε > 0) and N(μα, γα)I(δα > 0) for δεand δα respectively.

To implement Gibbs sampling methodology and, in particular, to simulate samples from the posterior distribution iteratively, it is critical to obtain the conditional posterior distributions of the parameters. The full conditional distributions of β, Wεt, σɛ2, δε, can be obtained after some algebraic manipulations and are given by






and where,


Additionally, we must specify the prior distribution and derive the posterior distribution for different specifications of unobserved effects αt.

As previously mentioned, in this paper we assume that the αi are unobserved effects and that they may vary with spatial and temporal variation depending on the specifications. We will consider all four spatial-temporal types of models specified in Zheng et al. (2008). The following sections provide technical details.

3.1.1. Homogeneous model

In the case of the homogeneous model, the unobserved region-specific effects at time t are assumed to be region-homogeneous and time-homogeneous with αt = αιN, where α is an overall mean. The prior distribution for αt can be specified as follows,


where Γα0 = γα0IN. Thus the full conditional distribution of αt is,


where Aαt=(NT/σɛ2+1/γα0)-1IN and at=(yt-Xtβ-δɛwɛt)/σɛ2+α0/γα0.

3.1.2. Heterogeneous model

In the heterogeneous model, the unobserved region-specific effects are assumed to be region-homogeneous but time-inhomogeneous with αt = αtιN, where αt is a mean across regions at time t. Suppose that α = (α1, . . ., αT)’ denotes the vector of the means over time. Also, as the regression coefficients and the spatial autocorrelation coefficient are assumed to be region-homogeneous and time-inhomogeneous, we can write β=(β1,,βT), where βt = (β1t, β2t, . . ., βkt)’. By using the aforementioned stochastic representation, we have


where = (1, . . ., T)’. Let t = ιtXt, where ιt denotes the tth row of IT .

We obtain the full conditional distribution of β as follows,


where Aβ=Γβ0-1+t=1TX˜tX˜t/σɛ2 and aβ=Γβ0-1β0+t=1TX˜t(yt-αt-δɛwɛt)/σɛ2, and the full conditional distribution of σɛ2 is given by,


where σɛn2=σɛ02+NT/2 and γɛn=γɛ0+t=1T(yt-X˜tβ-αt-δɛwɛt)'(yt-X˜tβ-αt-δɛwɛt)/2. In addition, we assume that


with Γα0 = γα0INT, and thus the full conditional distribution of α is given by,


where Aα=Γα0-1+σɛ2A and aα=Γα0-1α0+σɛ2B, such that A is a T × T diagonal matrix with diagonal elements A(t,t)=ιNιN and B is a T dimensional vector with the tth element B(t)=ιN(yt-X˜tβ-δɛwɛt).

3.1.3. Fixed-effects model

In this model, the unobserved region-specific effects are assumed to be time invariant and thus we can set αt = α̃ with no time subscript. Note that, however, this vector is assumed to be time-homogeneous but region-inhomogeneous, that is, αt = α̃ = (α1, . . ., αN)’ in which αi is a mean over time for region i. We use the following prior distribution for α̃,


with Γα0 = γα0IN. Thus the full conditional distribution of α̃ is,


where Aα˜=Γα0-1+T/σɛ2 and aα˜=Γα0-1α0+t=1T(yt-Xtβ-δɛwɛt).

3.1.4. Random-effects model

In this case, the unobserved region-specific effects vector are assumed to be time-homogeneous but region-inhomogeneous with αt = α̃ = (α1, . . ., αN)’, where the αi’s are iid random variables with N(α,σα2) distribution. Thus the full conditional distribution of α̃ is,


where Aα˜=(1/σα2+T/σɛ2)IN and aα˜=αιN/σα2+t=1T(yt-Xtβ-δɛwɛt)/σɛ2. We use the following prior distributions for α and σα2,


The full conditional distribution of α is then,


where αn=γαn(α0/γα0+ιNα˜/σα2) and γαn=(1/γα0+N/σα2)-1. The full conditional distribution of σα2 is,


When αt has the skew-t distribution, we have that


and so the full conditional distribution of Wαt given αt, σα2, and δα is given by


where Awαt=(δα2/σα2+1)IN and awαt=(δα/σα2)αt. Similarly, the full conditional distribution of σα2 given αt, δα, and wαt is given by


where σαn2=σα02+NT/2 and γαn=γα0+t=1T(αt-δαwαt)(αt-δαwαt)/2. Further, we have


where μδα=σδα2{μα/γα+t=1Twαt(αt)/σα2} and σδα2=(1/γα+t=1Twαtwαt/σα2)-1.

In order to determine whether the model fits the data adequately, key selection criteria must be applied. These criteria are based on the values of the penalized log likelihood, the Akaike information criterion (AIC) (Akaike, 1974), the Bayesian information criterion (BIC) (Schwarz, 1978) and the Hannan Quinn criterion (HQC) (Hannan and Quinn, 1979) are used to assess the performance of asymmetric models and compare it to the symmetric models. They are defined by AIC= + 2P, BIC= + Plog(N), and HQC= + 2Plog(log(N)), respectively, where is −2 times log likelihood evaluated at the maximum likelihood estimate, P is the number of parameters, and N is the sample size. It should be noted that a model with a smaller value of AIC, BIC, and HQC is preferred when comparing different fitted results.

4. Simulation study

The following section presents simulated data to be used in assessing the performance of the panel models with spatial heterogeneity taken into account. We compare the results of skew models to those obtained by traditional models. Following is a representation of a simple panel data model that provides heterogeneity across units. To generate random samples for our simulations, we can use it as a simple method. For a given observation, an intercept varying over units results in the structure:


The true values of the model parameters are chosen such that all simulated data make sense and resemble a realistic scenario. Specifically, the true values of the model parameters are (β1, β2, β3) = (1.5, 6, 9) and for the model generating the εit ~ SN1(0, 1, 10) and εit ~ ST1(0, 1, 10). The xrit is generated similar to Nervlove (1971) by taking xrit = 0.1t + 0.5xr,i,t–1 + zrit, where zrit is uniformly distributed on the interval (−0.5, 0.5) and xri0 = 5 + 10zri0. There are four interpretations of αi in this context: as a parameter to be estimated in the model (the so-called homogeneous, heterogeneous and fixed models) or alternatively, as a component of the disturbance process, giving rise to a composite error term [αi + εit] (the so-called random model), where αi ~ SN1(0, 4, 5) and αi ~ ST1(0, 4, 5, 3) in skew version of αi. Under each interpretation, αi is taken as a random variable. The sample sizes (N, T) in the different experiments were chosen as (25,5).

In this simulation, 100 Monte Carlo data sets were simulated from the model (4.1) according to the additional specifications described below. For the sake of comparison, prior distributions on parameters were the same for all type of models. Fixed effects are assigned a N3(0, 102I3) distribution. For the scale parameter of error distribution, σε and σα, we use IG(0.01, 0.01) as the prior, and for the skewness parameters δε and δα, independent truncated N(0, 102) distributions are assigned. Note that all these prior distributions are considered close to being non-informative (with large variances). Thus, we expect robust results with respect to prior distributions. For the purpose of undertaking inference, we have used 50,000 iterates in a generated chain after discarding the initial 2,000 iterates. In Table 1 and Table 2, summaries of the Monte Carlo results for the posterior mean average, standard deviation average, and the equitailed 95% credible intervals average for the parameters for the four types of skew elliptical spatial panel models are recorded. These listed results for the equitailed 95% credible intervals are the averages of their lower and upper endpoints. We note that an another measure that could have been used for these intervals was their coverage probabilities.

It is found that, for any of the 100 data sets for each model, no value of the AIC, BIC and HQC criteria leads to the selection of the normal model. This indicates an obvious departure from normality and suggests strong evidence of skewness. We also found that the parameter estimates among the four skew-elliptical models are different from those obtained with symmetric (normal and student-t) and asymmetric (skew-normal and skew-t) distributions. The parameters estimated for each of the four spatial panel models with skew-normal and skew-t error terms can be summarized as follows: (i) Equivalent 95% credible intervals associated with each parameter fitted value of symmetric models are larger than those of the corresponding asymmetric models; (ii) The parameters estimated for skew models have a smaller standard deviation compared to those for symmetric models; (iii) the estimate of the variance component of error σɛ2, and σα2 for the random model is smaller in the class of skew models compared to the normal and student-t ones, primarily because of skewness; (iv) the estimated equitailed 95% credible intervals of δε, and δα for the multivariate skew-normal and skew-t models do not include zero, confirming positive skewness of the responses.

In our empirical analysis, we expect the asymmetry parameters to be positive since the log transformation is the most popular transformation for positively skewed distributions in linear regression. The results also show that consideration of departures from normality can help in improving model fit.

The results shown in Table 1 and Table 2 indicate that we can recover the true values of the parameters used to generate the data with a reasonable level of accuracy. In fact, we analysed the simulated data generated from the model to check our ability in recovering the true values of the parameters. In most of the different samples, the true values of each of these parameters lie within their respective posterior equilibrated 95% credible intervals, indicating that we are able to recover the true values of the parameters used to generate the datasets. As demonstrated by this simulation study, the skew-t model can provide a better fit for all types of models with unobserved spatial heterogeneity than the model that places the normality assumptions on error terms. Particularly, it seems that our evaluations of spatial heterogeneity models with the skew-t assumption are very useful in estimating the regression coefficients β.

5. Analysis of real demand data (Italian non-life insurance)

Using a dataset derived from records of insurance activities in Italy, we illustrate the proposed spatial panel models in this section. A country’s economic development depends on the insurance industry, which is usually classified into life and non-life insurance. Life and non-life insurance consumption have different needs, so they’re usually analyzed separately. Here we focus on non-life insurance consumption. Purchase of non-life insurance is an economic strategy used to provide protection for future losses by paying a premium today, thus transferring wealth from an indefinite to a definite state. This branch of the insurance market is believed to play an important role in promoting the welfare and growth of the industry. It does so by protecting families and companies against financial hardship caused by unexpected/unusual events such as fire, theft, disease, and accidents.

In our study, we examine non-life insurance consumption across 103 provinces in Italy spanning 1998–2002 in order to assess its determinants. The records include: real per-capita GDP (Gross Domestic Product) (rgdp), real per-capita bank deposits (bank), inhabitants density per square Km (den), real interest rate on lending to families and small enterprises (rirs), density of insurance agencies per 1000 inhabitants (agen), share of people with second grade schooling or more (school), share of value added, agriculture sector (vaagr), average number of family members (fam), judicial inefficiency index: average years to settle first degree of civil case (inef), survey result to the question “do you trust others?” (trust), real non-life insurance premiums per capita (ppcd), the province code (code) and the year of observation (year), reported in Millo and Carmeci (2010). These data are publicly available from splm package in R (Millo and Piras, 2012).

As outlined by Millo and Carmeci (2010), these data are challenging to analyze. In particular, it is difficult to estimate the determinants of the development of the nonlife insurance market for Italy and hence good proxies are needed. There is also a high spatial differentiation across the Italian provinces as well as a high degree of spatial correlation in insurance density. For this illustration, the explanatory variables are rgdp, bank, den, rirs, agen, school, vaagr, fam, inef and trust and the response variable is ppcd. For a discussion of the determinants of insurance, see the papers by Browne et al. (2000), Beck and Webb (2003), and Esho et al. (2004).

Let us consider the simple panel regression model (2.1) for the total number of non-life insurance premiums years in Italian provinces. The purpose of the study is to investigate the relationship between non-life insurance premiums and some determinants of insurance consumption in the model. In the context of non-life insurance premiums, some observable (income, wealth, etc) as well as unobservable effects (capital stock, litigation attitudes, etc) may affect the panel results.

There is a limit to the level of analysis for Italian insurance premiums. Due to this, we expect some heterogeneity within provinces. Another option would be to eliminate the differences between provinces that are associated with heterogeneity.

The unobserved space-specific effects are captured by the αi terms. Therefore, following the arguments in Section 3.1, we consider different specifications for the unobserved vector α and then assess the spatial panel models based on the estimation results.

In the homogeneous model the provincial unobserved effects are the same across both time and provinces, such that α can be defined as


which is a 515 × 1 vector. In the heterogeneous model the provincial unobserved effects are the same across the provinces but are different over time. In this case, α has the following structure,


On the other hand, in the fixed model and random model, the provincial unobserved effects are assumed to be the same over time but are different across provinces. Hence, in these cases, α is defined as follows,


where, in the random setting, α follows a specified distribution. For our model, the skew-elliptical distribution is used for the αi terms. The above-mentioned structures provide four different ways to introduce unobserved region-specific heterogeneity in the constant terms or random variables of the panel data model. We employ the approach described in Section 3 for the estimation of model parameters. These models allow us to account for the spatial heterogeneity among different regions. In fact, we follow the Bayesian methodology to perform the inference procedure. We assign independent prior distributions for β, σɛ2,σα2, δε and δα and obtain samples from the posterior distribution of the parameters using MCMC methods.

In Millo and Carmeci (2010) the log transformation of the data is used to approximate normality, and then the normal process is applied to model the transformed data. However, it can be observed from the histogram of log-transformed data in Figure 1(a) that the data do not appear normal even after log transformation. The histogram of the raw insurance consumption data in Figure 1(b) also shows that the data look asymmetric. This indicates that the normality assumption is not satisfied and that fitting a skew model to the data set seems more appropriate. Furthermore, we note that the skewness index for the raw non-life insurance premiums is 0.47.

As shown in Figure 2, the histograms display the distribution of residual values obtained after fitting the four types of spatial panel models to the normal distribution assumption. Evidently, the residual distributions are positively skewed and heavily tailed. We do not need to transform data since we use the proposed spatial heterogeneity panel models for the skewed data.

Analysis of the posterior distributions of some of the parameters and different model comparison criteria in Table 3 and Table 4 suggest that the proposed models outperform standard ones used in the literature. It motivates us to use skew models for skew data. Based on the model selection criteria values shown in the bottom of Table 3 and Table 4, the results indicate that skew-t model may be the best fitting model and skew-normal model is next, supporting the contention of a departure from normality (not shown). The estimates of other parameters of the skew-elliptical models show similar qualitative behaviour as seen in the analysis of the simulation data when compared with the normal models.

For the spatial heterogeneity panel models, details of implementation of the Gibbs sampler algorithm are as follows. For the skew-elliptical proposal distributions, to proceed with Bayesian inference as in Section 3, we ran the Gibbs sampler algorithm with 40,000 iterations, discarded the 20,000 initial iterates and stored every 20th iteration. Our findings in Table 3 and Table 4 are rather diverse, with coefficients often changing sign and numerical values, which can be seen as evidence in favour of the need to investigate on skewness and unobserved heterogeneity in panel studies.

Most authors (e.g., Beenstock et al. (1988), Outreville (1990), Enz (2000), Esho et al. (2004), Millo and Carmeci (2010)) have commented on the elasticity of insurance consumption with respect to income and wealth. Elasticity estimates are frequently used as one of the basic indicators as they are unit-free, easily interpreted, and comparable across studies. In econometrics models with log-transformed data, elasticity values are estimated as the coefficients of the independent variables when the features of the dependent variable, estimation technique, and model structure are determined. However, there are only minor differences among the coefficient estimates and elasticites.

In a wider economic sense, the elasticity value of insurance consumption to income is an interesting subject for research. According to Millo and Carmeci (2010), model specifications and real GDP per capita account for both income and the general level of economic activity, and real bank deposits per capita, as an instrument for the stock of wealth. A negative value for income elasticity is considered consistent with the hypothesis of insurance as an inferior good (Millo and Carmeci, 2010). In empirical studies, it is discussed whether insurance is a superior or a normal good. If income elasticity is positive and smaller than one, then insurance is considered as a normal good. On the contrary, if elasticity is greater than one, insurance is a superior good.

Tables 3 and Table 4 provide estimates of the regression coefficients (posterior means) that appear reasonable based on the observable insurance determinantes. A study of the elasticity of insurance consumption with respect to income and wealth, which is relevant to spatial econometrics applications, is presented in this paper. Although there are some numerical differences between the posterior means for skew panel models with unobserved space-specific heterogeneity in the constant terms, they are mostly consistent. Focusing on the results of the random model with skew-t distribution in Table 4, the elasticity of insurance consumption with respect to income (1.049) turns out to be statistically positive and greater than one, thus asserting the view of non-life insurance as a superior good. The elasticity of wealth is positive but lower than one (0.420), suggesting that the tendency to insure is actually decreasing with wealth. The density of people in this area proves to be negative and significant, indicating that it is a poor agent for risk conditions. The effect of interest rates on insurance consumption is significantly negative. It is evident that the density of agencies is a positive incentive, consistent with the idea that insurance is a complex product with a substantial cost associated with locating an appropriate contract. In the non-life insurance market, family numerousness appears to play a weak role with a negative but significant coefficient, and so does education; the market appears not to be affected by person capital. Agriculture has a positive and significant share of the economy. Judicial system inefficiency has a meaningfully negative impact on insurance. The results confirm the argument that bad implementation of property laws negatively impacts people’s willingness to insure. For probably similar reasons as in Guiso et al. (2004), trust is an important positive determinant of insurance. These findings are important for understanding Italy’s non-life insurance market and its economic development.

The Markov chain history and density for some parameters under random skew-t spatial panel model is presented in Figures 3 and 4. The following plots confirm convergence of the MCMC.

6. Conclusion

Previous studies have rarely examined the behaviour of the error terms when using spatial heterogeneity, i.e. variation in relationships across spatially dispersed areas. In this paper, we examined multivariate skew-elliptical distributions for the remaining error components of the spatial panel data model. Specifically, our methodologies can be used to model data with skewness. The data can be modeled directly without the need for transformations. The standard approach of including skewness in the error terms of the econometrics is new and differs from the methods used by other studies to calculate other econometrics parameters estimates. Bayesian hierarchical parameter estimation was used for our model. Applying a number of criteria, we investigated the model fitting results for each sub model of our skew-elliptical spatial heterogeneity model. To illustrate the effectiveness of our proposed methodology, we conducted a simulation study and applied our models to real data sets, namely the insurance consumption data sets. The results from these analyses demonstrated the necessity of taking into account skewness in the data. By using a spatial panel data model with skew-elliptical parameters for the response models, more accurate estimates of econometric parameters can be obtained. In order to bring this about, we used R to implement the MCMC sampling scheme. Analysis of the non-life insurance consumption data set is reported to demonstrate the proposed methodologies. We have also discussed the elasticity of insurance consumption in relation to income and wealth. Panel models using non-normal remainder error components not only provide better results, but they also avoid the need for transformations and work with unrealistic normality assumptions.

Fig. 1. (a) Histogram and Q-Q plot of Italian insurance premiums with log-transformed non-life insurance data. (b) Histogram and Q-Q plot of Italian insurance premiums with raw non-life insurance data.
Fig. 2. Data on non-life insurance consumption in Italy. Histogram of residuals for each type of spatial effect after fitting the multivariate normal distribution, for each space.
Fig. 3. Markov chain history for parameters under model of the Italian non-life insurance data.
Fig. 4. Markov chain density for parameters under model of the Italian non-life insurance data.

Table 1

The average mean (average standard deviation) and the equivalence 95% credible intervals for the estimated parameters of the simulation study are given in large parentheses

ModelParameterRegion-specific effect type

(a) Normalβ11.97 (0.49)2.00 (0.51)2.42 (0.55)1.75 (0.68)
(0.79, 3.33)(0.84, 2.99)(0.96, 3.23)(0.08, 3.01)

β25.62 (0.66)5.68 (0.89)5.67 (0.74)5.89 (0.90)
(5.01, 6.78)(5.66, 6.75)(5.22, 6.97)(4.77, 6.22)

β38.53 (0.55)8.25 (0.57)8.94 (0.54)8.57 (0.59)
(7.66, 9.93)(7.59, 9.21)(7.93, 9.90)(7.14, 9.60)

α---10.62 (1.05)
---(8.43, 10.67)

σɛ240.03 (4.55)39.32 (5.01)39.11 (5.65)42.78 (5.88)
(30.86, 48.74)(26.38, 49.08)(29.99, 49.62)(32.14, 50.00)

σα2---1.11 (1.79)
---(0.11, 5.87)

(b) Skew-Normalβ11.55 (0.50)1.32 (0.49)1.43 (0.47)1.55 (0.31)
(0.60, 1.93)(0.49, 1.76)(0.57, 1.64)(0.55, 1.62)

β25.42 (0.62)5.45 (0.60)4.99 (0.64)5.04 (0.81)
(4.44, 5.93)(4.77, 5.90)(4.29, 5.98)(4.33, 5.35)

β37.30 (0.32)7.61 (0.41)7.56 (0.35)7.27 (0.55)
(7.14, 8.99)(7.17, 8.01)(7.12, 8.79)(7.20, 8.72)

σɛ21.60 (1.29)1.64 (1.38)1.78 (1.34)1.84 (1.10)
(0.08, 4.23)(0.13, 4.35)(0.10, 4.49)(0.37, 4.04)

σα2---4.06 (1.77)
---(1.56, 5.90)

δε9.01 (0.88)8.87 (0.76)8.09 (0.69)9.92 (0.99)
(7.09, 9.64)(7.02, 10.03)(7.77, 9.89)(8.59, 10.10)

δα---4.08 (1.60)
---(0.65, 6.02)

Table 2

average mean (average standard deviation) and equitailed 95% credible intervals average in big parentheses for the estimated parameters of the simulation study

ModelParameterRegion-specific effect type

(a) Student-Tβ11.64 (0.45)1.87 (0.48)1.78 (0.51)1.43 (0.65)
(0.88, 2.57)(0.85, 2.81)(0.73, 3.31)(0.90, 2.88)

β25.43 (0.61)5.12 (0.62)5.01 (0.49)5.03 (0.52)
(4.78, 6.22)(4.12, 6.09)(4.48, 6.66)(4.90, 6.01)

β38.48 (0.57)8.795 (0.59)8.88 (0.52)8.60 (0.55)
(7.23, 9.87)(7.60, 8.93)(7.80, 9.10)(7.76, 9.46)

α---10.07 (1.11)
---(8.77, 10.46)

σɛ239.39 (5.80)38.81 (5.39)39.06 (5.10)37.95 (5.44)
(29.60, 49.04)(27.69, 50.01)(29.80, 48.98)(30.70, 49.57)

σα2---1.01 (1.67)
---(0.19, 5.59)

(b) Skew-Tβ11.47 (0.44)1.53 (0.46)1.52 (0.45)1.52 (0.39)
(0.67, 1.72)(0.52, 1.70)(0.58, 1.60)(0.50, 1.59)

β25.22 (0.53)5.28 (0.51)5.02 (0.52)5.05 (0.32)
(4.69, 5.72)(4.44, 5.70)(4.11, 5.29)(4.42, 5.28)

β37.72 (0.32)7.46 (0.46)7.57 (0.56)7.85 (0.47)
(6.12, 7.86)(6.26, 7.99)(7.11, 8.27)(7.13, 8.22)

σɛ21.49 (1.20)1.55 (1.27)1.98 (1.25)1.02 (1.14)
(0.39, 4.50)(0.18, 4.32)(0.36, 4.39)(0.33, 3.96)

σα2---4.22 (3.13)
---(0.71, 4.93)

δε9.23 (0.75)9.03 (0.71)8.90 (0.66)9.97 (0.62)
(7.40, 9.51)(7.53, 9.80)(7.62, 9.09)(8.83, 10.06)

δα---5.01 (1.23)
---(0.89, 5.72)

Table 3

The posterior mean (standard deviation) and equitailed 95% credible intervals for the parameters in the skew-normal spatial-temporal models fitted to the Italian non-life insurance data

rgdp0.119 (0.007)0.122 (0.008)0.120 (0.007)0.120 (0.007)
(0.105, 0.133)(0.108, 0.137)(0.107, 0.134)(0.108, 0.013)

bank0.009 (0.001)0.009 (0.001)0.009 (0.001)0.009 (0.001)
(0.008, 0.105)(0.007, 0.105)(0.007, 0.0103)(0.007, 0.010)

den0.013 (0.004)0.013 (0.005)0.013 (0.004)0.013 (0.004)
(0.005, 0.021)(0.004, 0.021)(0.004, 0.021)(0.005, 0.022)

rirs−10.48 (1.722)−10.17 (1.847)−10.64 (1.684)−10.05 (1.628)
(−14.32, −7.298)(−13.46, −6.624)(−14.09, −7.497)(−13.07, −6.854)

agen−1.058 (8.997)−1.341 (8.999)−0.079 (8.897)−0.100 (9.506)
(−17.44, 16.46)(−18.80, 16.62)(−17.71, 17.23)(−19.03, 18.31)

school−0.882 (0.308)−0.885 (0.402)−1.015 (0.322)−0.964 (0.360)
(−1.515, −0.296)(−1.710, −0.126)(−1.692, −0.450)(−1.756, −0.271)

vaagr0.486 (0.713)0.448 (0.681)0.394 (0.734)0.372 (0.670)
(−0.881, 1.903)(−0.947, 1.848)(−1.064, 1.788)(−0.996, 1.685)

fam−26.85 (5.192)−25.86 (6.149)−27.49 (6.645)−28.07 (6.277)
(−36.92, −16.16)(−37.37, −14.13)(−39.88, −15.86)(−40.11, −15.81)

inef−4.360 (1.280)−4.305 (1.244)−4.453 (1.315)−4.515 (1.321)
(−6.911, −1.99)(−6.678, −1.871)(−7.088, −1.680)(−6.987, −1.865)

trust14.00 (4.970)10.80 (5.834)15.73 (7.128)14.54 (6.204)
(4.548, 24.07)(0.847, 23.54)(3.061, 31.43)(0.689, 25.47)

σɛ2307.4 (58.67)302.7 (58.61)284.5 (56.38)294.5 (60.45)
(209.3, 443.6)(207.1, 432.6)(190.0, 412.7)(184.5, 422.2)

δε48.12 (2.507)47.88 (2.473)48.04 (2.584)47.95 (2.647)
(43.13, 52.86)(43.07, 52.73)(43.02, 52.99)(42.49, 53.17)

−log likelihood2383.523782359.52366

Table 4

The posterior mean (standard deviation) and equitailed 95% credible intervals for the parameters in the skew-t spatial-temporal models fitted to the Italian non-life insurance data

rgdp0.012 (0.001)0.011 (0.001)0.012 (0.001)0.012 (0.001)
(0.011, 0.013)(0.010, 0.012)(0.010, 0.013)(0.011, 0.013)

bank0.010 (0.001)0.010 (0.001)0.009 (0.001)0.001 (0.000)
(0.009, 0.011)(0.008, 0.011)(0.008, 0.011)(0.009, 0.011)

den0.015 (0.003)0.012 (0.003.)0.017 (0.004)0.015 (0.003)
(0.009, 0.022)(0.006, 0.018)(0.010, 0.024)(0.008, 0.021)

rirs−7.034 (1.522)−9.835 (1.902)−7.572 (1.593)−7.577 (1.423)
(−9.768, −3.906)(−13.55, −6.491)(−9.986, −3.383)(−10.32, −4.819)

agen−38.07 (1.977)−20.93 (1.725)−20.04 (1.747)−31.48 (1.969)
(−74.96, −1.522)(−65.36, 6.566)(−58.44, 4.77)(−71.72, 1.646)

school−0.5176 (0.242)−0.401 (0.253)−0.773 (0.286)−0.362 (0.198)
(−0.945, −0.031)(−0.924, 0.052)(−1.392, −0.241)(−0.799, −0.026)

vaagr0.6778 (0.555)0.027 (0.545)0.479 (0.595)0.549 (0.558)
(−0.497, 1.726)(−1.018, 1.138)(−0.681, 1.661)(−0.523, 1.687)

fam−46.81 (5.183)−28.70 (7.510)−51.36 (5.045)−42.62 (4.768)
(−56.67, −36.08)(−44.41, −14.41)(−62.19, −41.9)(−50.65, −32.54)

inef−3.022 (0.967)−3.689 (1.08)−3.007 (1.094)−3.435 (1.159)
(−4.708, −0.971)(−5.598, −1.455)(−5.147, −0.822)(−5.615, −0.993)

trust18.82 (5.752)12.86 (5.426)28.16 (4.893)15.77 (4.967)
(8.165, 28.43)(4.427, 24.180)(18.90, 37.03)(6.391, 25.16)

σɛ273.04 (25.86)75.36 (25.48)72.84 (26.45)63.71 (2.126)
(34.45, 134.3)(36.62, 139.3)(30.74, 134.3)(30.31, 107.9)

δε38.99 (3.273)37.42 (3.005)36.44 (3.26)39.95 (3.012)
(32.53, 45.34)(31.14, 42.65)(29.90, 42.49)(34.09, 46.1)

−log likelihood2244.5225922442202

  1. Akaike H (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716-723.
  2. Arellano-Valle RB and Genton MG (2005). On fundamental skew distributions. Journal of Multivariate Analysis, 96, 93-116.
  3. Arellano-Valle RB and Azzalini A (2006). On the unification of families of skew-normal distributions. Scandinavian Journal of Statistics, 33, 561-574.
  4. Azzalini A (2005). The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics, 32, 159-188.
  5. Azzalini A (2014). The Skew-Normal and Related Families, Cambridge University Press.
  6. Beck T and Webb I (2003). Economic, demographic and institutional determinants of life insurance consumption across countries. World Bank Economic Review, 17, 51-88.
  7. Beenstock M, Dickinson G, and Khajuria S (1988). The relationship between propertyliability insurance premiums and income: An international analysis. The Journal of Risk and Insurance, 55, 259-272.
  8. Browne MJ, Chung J, and Frees EW (2000). International property-liability insurance consumption. The Journal of Risk and Insurance, 67, 73-90.
  9. Chesson PL (1985). Coexistence of competitors in spatially and temporally varying environments: A look at the combined effects of different sorts of variability. Theoretical Population Biology, 28, 263-287.
  10. Enz R (2000). The s-curve relation between per-capita income and insurance penetration. Geneva Papers on Risk and Insurance: Issues and Practice, 25, 396-406.
  11. Esho NA, Kirievsky A, Ward D, and Zurbruegg R (2004). Law and the determinants of property-casualty insurance. The Journal of Risk and Insurance, 71, 265-283.
  12. Farzammehr MA, Zadkarami MR, and McLachlan GJ (2019). Skew-normal generalized spatial panel data model. Communications in Statistics-Simulation and Computation, 50, 1-29.
  13. Gelman A and Hill J (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press.
  14. Guiso L, Sapienza P, and Zingales L (2004). The role of social capital in financial development. American Economic Review, 94, 526-556.
  15. Hannan EJ and Quinn BG (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society Series B, 41, 190-195.
  16. Jara A, Quintana F, and San Martin E (2008). Linear mixed models with skew-elliptical distributions: A Bayesian approach. Computational Statistics and Data Analysis, 52, 5033-5045.
  17. Lee SX and Mclachlan GJ (2013). On mixtures of skew normal and skew t-distributions. Advances in Data Analysis and Classification, 7, 241-266.
  18. Lunn D, Spiegelhalter D, Thomas A, and Best N (2009). The BUGS project: Evolution, critique and future directions. Statistics in Medicine, 28, 3049-3067.
    Pubmed CrossRef
  19. Millo G and Carmeci G (2010). Non-life insutrance consumption in Italy: A sub-regional panel data analysis. Journal of Geographical Systems, 13, 273-298.
  20. Millo G and Piras G (2012). splm: Spatial panel data models in R. Journal of Statistical Software, 47, 1-38.
  21. Nerlove M (1971). Further evidence on the estimation of dynamic economic relations from a time series of cross sections. Econometrica, 39, 359-382.
  22. Outreville JF (1990). The economic significance of insurance markets in developing countries. The Journal of Risk and Insurance, 18, 487-498.
  23. Sahu SK, Dey DK, and Branco MD (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Canadian Journal of Statistics, 31, 129-150.
  24. Schwarz G (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461-464.
  25. Zheng Y, Zhu J, and Li D (2008). Analyzing spatial panel data of cigarette demand: A Bayesian hierarchical modeling approach. Journal of Data Science, 6, 467-489.