Many applications have a statistical structure for which a constrained linear regression model is appropriate. When the information available in an applied regression is incomplete, it is often reasonable to impose constraints on the parameters to be estimated. For example, the yield of an apple tree naturally increases as the tree grows older. With too little data, however, the estimates may contradict this common-sense ordering, and placing an order constraint on the parameters prevents this. Chen and Deely (1996) considered one such application and illustrated why the Bayesian model is both attractive and appropriate. They used the normal distribution for the error terms in a constrained multiple linear regression model. However, the normality assumption is not always appropriate for real data, which in many fields have heavy tails or are asymmetric. A more flexible assumption on the error terms is therefore desirable. We use the skew normal distribution, which contains the normal distribution as a special case, as a flexible error distribution for the Bayesian ordered multiple linear regression model.
The skew normal distribution was first introduced by O’Hagan and Leonard (1976). Azzalini (1985) studied the construction of the family of univariate skew normal distributions. Azzalini and Dalla-Valle (1996) extended the univariate skew normal distribution to the multivariate case. The skew normal distribution is a family of distributions with an additional parameter that governs asymmetry (skewness). Some applications of skew normal regression under a classical (frequentist) approach are presented in Azzalini and Capitanio (1999). Sahu
Our goal is to propose Bayesian inference for an ordered multiple regression model with skew normal error terms. We then show that Bayesian methodology is particularly well suited to the constrained problem compared with frequentist methodology. We use the Markov chain Monte Carlo (MCMC) method to handle the complicated integrals that arise in Bayesian inference. We also perform simulation studies and apply the proposed model to the New Zealand apple data previously analyzed by Chen and Deely (1996). We verify the convergence of MCMC for all parameters using the Gelman-Rubin shrinkage factor. To compare the normal error model with the skew normal error model, we use the Bayes factor (BF) and the deviance information criterion (DIC) given by Spiegelhalter
The paper is organized as follows. In Section 2, we review the skew normal distribution and ordered multiple linear regression. In Section 3, we discuss a Bayesian ordered multiple regression model with skew normal errors and its computation using the MCMC method. We also show that the associated posterior density is proper under the given improper priors, and we present model comparison using BF and DIC. The measure
According to Azzalini (1985), a random variable
First, we introduce the general form of the linear regression model:
The model we consider is:
According to Sahu
We consider noninformative priors on
We provide a proof of Theorem 1 in the Appendix.
The integral involved makes it difficult to compute the full conditional density of each parameter directly for Bayesian computation. Using the data augmentation approach suggested by Tanner and Wong (1987), we consider
To sample from the complete posterior distribution
Define
Next, the full conditional density of
Finally, the full conditional density of
It can be seen that the full conditional distributions in (3.8) and (3.9) are truncated normal distributions. Therefore,
Step 1: Start with initial point (
Step 2: Set
Step 3: for
Step 4: for
Step 5: Using the Metropolis-Hastings algorithm, generate
Step 6: Using the Metropolis-Hastings algorithm, generate
Step 7:
Step 8: Repeat Steps 3–7,
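The truncated normal draws that the steps above rely on can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `sample_truncated_normal` and `ordered_sweep`, the lower bound of zero, and the conditional means and standard deviations passed in are all hypothetical placeholders, not the expressions in (3.8) and (3.9). It shows how an inverse-CDF draw restricted to the interval between neighbouring coefficients keeps the ordering intact within one sweep.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_truncated_normal(mean, sd, lower, upper, rng):
    """Inverse-CDF draw from N(mean, sd^2) restricted to [lower, upper]."""
    p_lo = _STD.cdf((lower - mean) / sd)
    p_hi = _STD.cdf((upper - mean) / sd)
    u = p_lo + (p_hi - p_lo) * rng.random()
    # Keep u strictly inside (0, 1) so inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    draw = mean + sd * _STD.inv_cdf(u)
    # Guard against rounding error when the interval lies far in a tail.
    return min(max(draw, lower), upper)


def ordered_sweep(beta, cond_means, cond_sds, rng):
    """Update each coefficient in turn, bounded by its current neighbours.

    Assumption: coefficients are non-negative and non-decreasing.
    cond_means / cond_sds stand in for the real full conditional moments.
    """
    k = len(beta)
    for j in range(k):
        lower = beta[j - 1] if j > 0 else 0.0
        upper = beta[j + 1] if j < k - 1 else float("inf")
        beta[j] = sample_truncated_normal(
            cond_means[j], cond_sds[j], lower, upper, rng
        )
    return beta


rng = random.Random(0)
beta = ordered_sweep([0.1, 0.4, 0.9], [0.2, 0.5, 1.0], [0.3, 0.3, 0.3], rng)
print(beta)  # three values that remain in non-decreasing order
```

The inverse-CDF approach is simple but loses accuracy when the interval sits far in a tail of the untruncated normal; a rejection sampler would be a more robust choice in that case.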
Here,
Generally, for testing
It is natural to use a criterion based on the trade-off between the fit of the model to the data and the complexity of the model. Therefore, we compare the models based on DIC given by Spiegelhalter
To see the effect of a single observation
Pettit and Young (1990) suggested that an observation with |
Therefore, the difference between including and not including
The New Zealand Apple and Pear Marketing Board (NZAPB) is a statutory authority that trades and manages every contract for exporting New Zealand apples. As a result, the more than 1,500 apple growers in New Zealand act in the global market as a single grower when making international trade contracts. It is important to note that the recorded data consist of the total harvest for each grower and each apple variety. However, the NZAPB has also recorded the number of trees at each age, for each grower and each variety. For the purposes of the NZAPB, tree ages vary from 2 to 11, where age “11” means 11 or older and denotes a full-grown tree. Age 1 is not included in the analysis because production at that age is close to zero. The data set comprises 207 records, each consisting of the total amount of fruit produced and the number of trees of each age. See Chen and Deely (1996) for more details of the NZAPB apple data. It is reasonable to use a linear regression model in which the quantity of fruit produced is regressed on the number of trees at each age, beginning at physical age 2 and continuing for 10 years up to a full-grown tree. For notational convenience, we let ages vary from 1 to 10, so that age 1 corresponds to physical age 2, and so on. Thus, we let, for
We investigated the robustness of the ordered multiple linear regression model with skew normal errors to different values of
To compare the skew normal model with the original normal model informally, we compute BF_{01} in (3.13), the effective number of parameters
Table 3 shows the Bayesian estimates of the regression, variation, and skewness parameters under the two rival models. All ten regression parameters
In Figure 1, the plot
In this paper, we presented a Bayesian ordered multiple linear regression model with skew normal errors. Because the full conditional densities are hard to compute directly, we use the data augmentation method to expand the observed likelihood. The parameters can then be estimated easily by MCMC. We show that the posterior density associated with our Bayesian skew normal model is proper under the improper priors; that is, the marginal posterior density is finite.
We can detect which observations support the normal error model or the skew normal error model using the
Finally, we apply our model to data that has various types of
We first consider the integral of the likelihood in (3.2) times the prior in (3.5). Let
It is sufficient to show that
Then, the inner integral in the brackets of equation (A.2) with respect
DICs for values of
 | 1 | 10 | 100 |
---|---|---|---|
−1.0 | 2832.941 | 2832.997 | 2833.964 |
−0.5 | 2832.924 | 2832.906 | 2833.867 |
0.0 | 2832.792 | 2832.865 | 2833.693 |
0.5 | 2832.713 | 2832.799 | 2833.547 |
1.0 | 2832.670 | 2832.683 | 2833.444 |
DIC = deviance information criterion.
Model comparison for apple data
Model | DIC | ||
---|---|---|---|
Normal error | 2831.803 | 11.1669 | 2822.9699 |
Skew normal error | 2731.566 | 51.1039 | 2782.6699 |
DIC = deviance information criterion.
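As a reminder of how DIC is usually assembled from MCMC output, the sketch below uses the standard definitions: the posterior mean deviance, the effective number of parameters taken as that mean minus the deviance at the posterior mean of the parameters, and DIC as their sum. The toy normal model with known standard deviation, the data, the draws, and the function names `normal_deviance` and `dic_summary` are all illustrative assumptions and are unrelated to the apple data or to the values in the table above.

```python
import math
from statistics import fmean


def normal_deviance(y, mu, sigma):
    """Deviance (-2 * log-likelihood) of i.i.d. N(mu, sigma^2) data."""
    n = len(y)
    rss = sum((yi - mu) ** 2 for yi in y)
    return n * math.log(2.0 * math.pi * sigma ** 2) + rss / sigma ** 2


def dic_summary(deviance_draws, deviance_at_posterior_mean):
    """Return (mean deviance, effective number of parameters, DIC)."""
    d_bar = fmean(deviance_draws)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar, p_d, d_bar + p_d


# Toy inputs (hypothetical): five observations and five posterior draws of mu.
y = [1.2, 0.7, 1.9, 1.4, 0.8]
mu_draws = [0.9, 1.1, 1.3, 1.2, 1.0]
sigma = 1.0  # treated as known in this toy example

dev_draws = [normal_deviance(y, m, sigma) for m in mu_draws]
dev_at_mean = normal_deviance(y, fmean(mu_draws), sigma)
d_bar, p_d, dic = dic_summary(dev_draws, dev_at_mean)
print(d_bar, p_d, dic)
```

When two models are compared on the same data, the one with the smaller DIC is preferred.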
Normal error and skew normal error estimation
Parameter | Normal error: estimate | Normal error: standard error | Skew normal error: estimate | Skew normal error: standard error |
---|---|---|---|---|
Regression parameter 1 | 0.01374 | 0.00806 | 0.01723 | 0.00019 |
Regression parameter 2 | 0.02496 | 0.00820 | 0.01723 | 0.00019 |
Regression parameter 3 | 0.17749 | 0.01113 | 0.18122 | 0.00016 |
Regression parameter 4 | 0.31129 | 0.04111 | 0.29833 | 0.00158 |
Regression parameter 5 | 0.55423 | 0.07618 | 0.55195 | 0.00601 |
Regression parameter 6 | 0.78064 | 0.03463 | 0.81104 | 0.00126 |
Regression parameter 7 | 0.81682 | 0.04071 | 0.81131 | 0.00129 |
Regression parameter 8 | 0.93711 | 0.09977 | 0.86252 | 0.04474 |
Regression parameter 9 | 1.04100 | 0.11945 | 0.89909 | 0.05384 |
Regression parameter 10 | 1.22821 | 0.17487 | 1.29236 | 0.28248 |
Variation parameter | 53469.3 | 5406.4 | 51267.8 | 5002.1 |
Skewness parameter | · | · | 2.03320 | 0.08099 |