Nonnegative estimates of variance components in a two-way random model

Jaesung Choi^{1,a}

^a Department of Statistics, Keimyung University, Korea
Correspondence to: ^1 Department of Statistics, Keimyung University, 1095 Dalgubeoldaero, Dalseogu, Daegu 42601, Korea.
E-mail: jschoi@kmu.ac.kr
Received January 22, 2018; Revised May 8, 2018; Accepted June 5, 2018.
Abstract

This paper discusses a method for obtaining nonnegative estimates of the variance components in a random effects model. A variance component should be nonnegative by definition. Nevertheless, estimates of variance components sometimes turn out negative, which is not desirable. The proposed method is based on two basic ideas. One is the identification of orthogonal vector subspaces according to the factors, and the other is the determination of the projection onto each orthogonal vector subspace. The observation vector can then be written as the sum of these projections, and the method always produces nonnegative estimates from them. Hartley's synthesis is used for the calculation of the expected values of the quadratic forms. The paper also discusses how to set up a residual model for each projection.

Keywords : projections, random effects, vector space, quadratic forms, synthesis
1. Introduction

When a linear model contains more than one random component, the model is said to be a random effects model or a mixed effects model, depending on whether fixed effects are present. A random effect is a random variable representing the effect of a randomly chosen level from the population of levels that a random factor can assume. Many papers have tried to solve the problem of negative estimates of the variance components of the random effects in a linear model, which is an underlying model for the analysis of data from a certain experiment. Henderson (1953) suggested three methods to estimate variance components that came to be known as Henderson's Methods 1, 2, and 3. Searle (1971) dealt with them in detail. Many papers are devoted to the estimation of variance components; unfortunately, each of the methods discussed in them can produce negative estimates of variances. Milliken and Johnson (1984) suggested a proposal for eliminating negative estimates. El-Leithy et al. (2016) and Searle et al. (2009) discussed the nonnegative estimation of variance components in mixed linear models. This paper proposes a method that always yields nonnegative estimates of the variance components. It is based on the concept of projections defined on a vector space; projections and the related concepts are well described in Johnson and Wichern (1988). Milliken and Johnson (1984) used the following data as an example for estimating variance components. The data are used again here to compare the estimates from Henderson's Method 1 with those from the method proposed in this paper, which produces nonnegative estimates of the variance components.

2. Data and model

The data in Table 1 are from Milliken and Johnson (1984). R and C denote the random factors of a two-way treatment structure in a completely randomized design structure.

First, consider a more general two-way random components model for analyzing such data. The model in general form is given by

$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b; \; k = 1, 2, \ldots, n_{ij},$

where $y_{ijk}$ is the response from the $k$th experimental unit in the treatment combination of the $i$th level of R and the $j$th level of C, $\mu$ is the grand mean, $\alpha_i$ is the effect of the $i$th level of the random factor R, assumed to be distributed i.i.d. $N(0, \sigma_\alpha^2)$, and $\beta_j$ is the effect of the $j$th level of the random factor C, assumed to be distributed i.i.d. $N(0, \sigma_\beta^2)$. The effect $(\alpha\beta)_{ij}$ of the interaction R × C is assumed to be distributed i.i.d. $N(0, \sigma_{\alpha\beta}^2)$, and $\epsilon_{ijk}$ is the random error associated with $y_{ijk}$, where the $\epsilon_{ijk}$ are assumed to be distributed i.i.d. $N(0, \sigma_\epsilon^2)$. These assumptions allow the variances and covariances of the observations to be evaluated. The variance of an observation is

$\mathrm{Var}(y_{ijk}) = \mathrm{Var}(\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}) = \sigma_\alpha^2 + \sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma_\epsilon^2.$

There are four components in the variance of yijk. The above model can be described in matrix notation as

$y = j\mu + X_1\alpha + X_2\beta + X_3(\alpha\beta) + \epsilon,$ (2.1)

where $j$ is an $n \times 1$ vector of ones ($n = \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}$), $X_1$ is the $n \times a$ coefficient matrix of $\alpha$, $\alpha$ is the $a \times 1$ random vector assumed to be distributed $N(0, \sigma_\alpha^2 I_a)$, $X_2$ is the $n \times b$ coefficient matrix of $\beta$, $\beta$ is the $b \times 1$ random vector assumed to be distributed $N(0, \sigma_\beta^2 I_b)$, $X_3$ is the $n \times ab$ coefficient matrix of $(\alpha\beta)$, $(\alpha\beta)$ is the $ab \times 1$ random vector assumed to be distributed $N(0, \sigma_{\alpha\beta}^2 I_{ab})$, and $\epsilon$ is the $n \times 1$ random vector assumed to be distributed $N(0, \sigma_\epsilon^2 I_n)$. The covariance matrix of the observation vector $y$, denoted by $\Sigma$, is

$\Sigma = \mathrm{Var}(y) = \mathrm{Var}(j\mu + X_1\alpha + X_2\beta + X_3(\alpha\beta) + \epsilon) = \sigma_\alpha^2 X_1X_1' + \sigma_\beta^2 X_2X_2' + \sigma_{\alpha\beta}^2 X_3X_3' + \sigma_\epsilon^2 I_n.$

The analysis of variance method known as Henderson's Method 1 can be used for the estimation of the variance components. The problem is that it sometimes yields negative values, which are unacceptable as estimates of a variance. This can be avoided by means of projections. Let $x$ and $y$ be vectors in an $n$-dimensional vector space. The projection of $y$ on the vector $x$, denoted by $p_x$, is

$p_x = x(x'x)^{-1}x'y.$

Let $X$ be the model matrix from (2.1), defined as $X = (j, X_1, X_2, X_3)$. The projection of $y$ onto the vector subspace generated by $X$ is

$p_X = X(X'X)^{-}X'y = XX^{-}y,$ (2.2)

where $X^{-}$ denotes a generalized inverse and $XX^{-} = X(X'X)^{-}X'$. The g-inverse is well defined in Graybill (1983). If $X$ is of full rank, $(X'X)^{-1}$ is used instead of $(X'X)^{-}$.
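As a numerical illustration (not part of the paper), the projector $XX^{-}$ can be formed with the Moore-Penrose inverse, which is one valid choice of g-inverse. The helper name `projector` is an assumption of this sketch; it only checks the defining properties of the projector.

```python
# Sketch of p_X = X(X'X)^- X' y with numpy; np.linalg.pinv supplies the
# Moore-Penrose inverse, so X @ pinv(X) is the orthogonal projector onto
# the column space of X even when X is rank-deficient.
import numpy as np

def projector(X):
    """Orthogonal projector onto the column space of X."""
    return X @ np.linalg.pinv(X)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
X = np.column_stack([X, X[:, 0] + X[:, 1]])  # deliberately rank-deficient
P = projector(X)
assert np.allclose(P @ P, P)   # idempotent
assert np.allclose(P, P.T)     # symmetric
y = rng.standard_normal(6)
p_X = P @ y                    # projection of y onto the column space of X
assert np.allclose(P @ p_X, p_X)
```

Because the projector depends only on the column space, any choice of g-inverse in $X(X'X)^{-}X'$ yields the same matrix $XX^{-}$.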

3. Projection method

A linear model is called a random components model when all of its components except $\mu$ are random. Random components may arise from the treatment structure or the design structure (or both) of an experiment. Henderson (1953) suggested Method 1 for the estimation of variance components in a random model. Since Method 1 allows impermissible estimates of variances, an alternative method that always produces nonnegative values is discussed. An observation vector $y$ in a vector space can be decomposed into projections defined on orthogonal vector subspaces generated by the model matrices obtained from fitting sub-models of model (2.1). This implies that the projection defined on each orthogonal vector subspace has its own nonnegative variation, unaffected by the other projections. Hence, there is the possibility of finding nonnegative variance estimates. To find nonnegative estimates of the variance components in model (2.1), first consider fitting the full model. In fitting (2.1), we get (2.2). After fitting the whole model we get the residual vector $r$,

$r = (I - XX^{-})(X\delta + \epsilon) = (I - XX^{-})\epsilon,$ (3.1)

where $\delta$ denotes $(\mu, \alpha, \beta, (\alpha\beta))'$. From (3.1) we see that the error vector subspace is generated by the sub-model matrix $(I - XX^{-})$. Let the residual model matrix be $X_f$. The residual sum of squares can be obtained as the squared distance of the projection of $r$ onto the error space. The residual random vector $r$ has only one random component, $\epsilon$, which carries the information about $\sigma_\epsilon^2$. The projection of $r$, denoted by $p_e$, is given by

$p_e = [(I - XX^{-})][(I - XX^{-})]^{-}r = X_fX_f^{-}r,$

where $X_f = I - XX^{-}$. So, the residual sum of squares, denoted by $SS_e$, is

$p_e'p_e = r'X_fX_f^{-}r = y'(I - XX^{-})y.$ (3.2)

$XX^{-}y$ is the projection of $y$ onto the vector subspace generated by the model matrix $X$. Second, consider the residual random model for the estimation of $\sigma_{\alpha\beta}^2$ and $\sigma_\epsilon^2$. The model is

$r_b = (I - X_bX_b^{-})(X_b\delta_b + X_3(\alpha\beta) + \epsilon) = (I - X_bX_b^{-})X_3(\alpha\beta) + \epsilon_b,$ (3.3)

where $X_b = (j, X_1, X_2)$, $\delta_b = (\mu, \alpha, \beta)'$, and $\epsilon_b = (I - X_bX_b^{-})\epsilon$. Letting $SS_b$ be the sum of squares adjusted for $\delta_b$, $SS_b = r_b'r_b$. $E(SS_b)$ reveals information about the two variance components $\sigma_{\alpha\beta}^2$ and $\sigma_\epsilon^2$. The projection of $r_b$ onto the vector space generated by the sub-model matrix is

$p_{ab} = [(I - X_bX_b^{-})X_3][(I - X_bX_b^{-})X_3]^{-}r_b = X_{\alpha\beta}X_{\alpha\beta}^{-}r_b,$

where $X_{\alpha\beta} = (I - X_bX_b^{-})X_3$. The sum of squares due to $(\alpha\beta)$ is the squared length of the projection, $p_{ab}'p_{ab}$. Then we can see that $SS_b = p_{ab}'p_{ab} + p_e'p_e$. From (3.3), the residual random vector $r_b$ has two variance components, $\sigma_{\alpha\beta}^2$ and $\sigma_\epsilon^2$. These are estimated separately from the corresponding vector subspaces, which are orthogonal to each other. Third, for the estimation of the variance component $\sigma_\beta^2$, the model to be fitted is

$r_a = (I - X_aX_a^{-})(X_a\delta_a + X_2\beta + X_3(\alpha\beta) + \epsilon) = (I - X_aX_a^{-})X_2\beta + \epsilon_a,$ (3.4)

where $X_a = (j, X_1)$, $\delta_a = (\mu, \alpha)'$, and $\epsilon_a = (I - X_aX_a^{-})(X_3(\alpha\beta) + \epsilon)$. It should be noted from (3.4) that $r_a$ has three variance components, one of which is estimated from the estimation space while the others come from the error space. Hence, it has the information about $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_\epsilon^2$. Let $SS_a$ be $r_a'r_a$. The projection, denoted by $p_b$, is given as

$p_b = [(I - X_aX_a^{-})X_2][(I - X_aX_a^{-})X_2]^{-}r_a = X_\beta X_\beta^{-}r_a,$

where $X_\beta = (I - X_aX_a^{-})X_2$. The sum of squares due to $\beta$ is given by the squared distance of the projection, $p_b'p_b$. So, $SS_a = p_b'p_b + p_{ab}'p_{ab} + p_e'p_e$. Lastly, the fitting model for $\sigma_\alpha^2$ is

$r_j = (I - X_jX_j^{-})(X_j\delta_j + X_1\alpha + X_2\beta + X_3(\alpha\beta) + \epsilon) = (I - X_jX_j^{-})X_1\alpha + \epsilon_j,$ (3.5)

where $X_j = j$, $\delta_j = \mu$, and $\epsilon_j = (I - X_jX_j^{-})(X_2\beta + X_3(\alpha\beta) + \epsilon)$. Let $SS_j$ be $r_j'r_j$. Since this residual random vector has all four random components, it has the information about $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_\epsilon^2$. From (3.5), $\sigma_\alpha^2$ can be estimated in the estimation space independently of the others. The projection of $r_j$ onto the space generated by $(I - X_jX_j^{-})X_1$ is

$p_a = [(I - X_jX_j^{-})X_1][(I - X_jX_j^{-})X_1]^{-}r_j = X_\alpha X_\alpha^{-}r_j,$

where $X_\alpha = (I - X_jX_j^{-})X_1$. The sum of squares adjusted for $\alpha$ is $p_a'p_a$. Hence, $SS_j = p_a'p_a + p_b'p_b + p_{ab}'p_{ab} + p_e'p_e$. It should be noted that the residual random models are derived in reverse order using the residual vectors. After establishing the model in each step, identify the model matrix associated with the random effects and then derive the corresponding projection using that matrix. Then calculate the sum of squares due to the random effects in terms of the projection.
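The step-down decomposition $SS_j = p_a'p_a + p_b'p_b + p_{ab}'p_{ab} + p_e'p_e$ can be verified numerically on the Table 1 data. The sketch below is an illustration, not the author's code: the helper `P` and the incidence-matrix construction are assumptions, and it uses the fact that, for nested model matrices, the difference of two projectors is the projector onto the orthogonalized subspace.

```python
import numpy as np

def P(M):
    """Orthogonal projector onto the column space of M."""
    return M @ np.linalg.pinv(M)

# Table 1 cells: (R level, C level) -> observations
cells = {(0, 0): [10, 12, 11], (0, 1): [13, 15], (0, 2): [21, 19],
         (1, 0): [16, 18], (1, 1): [13, 19, 14], (1, 2): [11, 13]}
y, r, c = [], [], []
for (i, j), obs in cells.items():
    y += obs
    r += [i] * len(obs)
    c += [j] * len(obs)
y = np.asarray(y, float)
n = y.size
X1 = np.eye(2)[r]                                   # incidence matrix of R
X2 = np.eye(3)[c]                                   # incidence matrix of C
X3 = np.eye(6)[[3 * i + j for i, j in zip(r, c)]]   # incidence matrix of R x C
ones = np.ones((n, 1))
X = np.hstack([ones, X1, X2, X3])
Xb = np.hstack([ones, X1, X2])
Xa = np.hstack([ones, X1])
I = np.eye(n)

SSe = y @ (I - P(X)) @ y     # p_e'p_e
SSb = y @ (I - P(Xb)) @ y    # p_ab'p_ab + p_e'p_e
SSa = y @ (I - P(Xa)) @ y    # p_b'p_b + p_ab'p_ab + p_e'p_e
SSj = y @ (I - P(ones)) @ y  # p_a'p_a + p_b'p_b + p_ab'p_ab + p_e'p_e

# For nested spans, P(X) - P(Xb) projects onto span((I - Xb Xb^-) X3), etc.
pab2 = y @ (P(X) - P(Xb)) @ y
pb2 = y @ (P(Xb) - P(Xa)) @ y
pa2 = y @ (P(Xa) - P(ones)) @ y
assert abs(SSj - (pa2 + pb2 + pab2 + SSe)) < 1e-6
```

The four residual sums of squares agree with the values reported in Section 5, and the squared projection lengths are nonnegative by construction.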

4. Expectations of quadratic forms

Variance components are estimated by equating the sums of squares to their expected values. The sums of squares are quadratic forms in the observations. Hartley's synthesis (Hartley, 1967) is used for the calculation of the expected values of the quadratic forms. The method yields the coefficients of the variance components in the expected value of a quadratic form in the observations for each variation source. Let $q = y'Qy$ be a quadratic form in the observation vector $y$, where $Q$ is assumed to be a symmetric matrix of constants. Then the expectation of $q$ is given by

$E(y'Qy) = \mathrm{Tr}[QE(yy')] = \mathrm{Tr}(Q\Sigma) + \mu^2 j'Qj.$

The matrix of a quadratic form in $y$ for the analysis of variance can be constructed such that $\mu^2 j'Qj = 0$. Hence, the expectation of the quadratic form does not depend on $\mu$:

$E(y'Qy) = \mathrm{Tr}(Q\Sigma) = \mathrm{Tr}[Q(\sigma_\alpha^2 X_1X_1' + \sigma_\beta^2 X_2X_2' + \sigma_{\alpha\beta}^2 X_3X_3' + \sigma_\epsilon^2 I_n)] = \sigma_\alpha^2\mathrm{Tr}(X_1'QX_1) + \sigma_\beta^2\mathrm{Tr}(X_2'QX_2) + \sigma_{\alpha\beta}^2\mathrm{Tr}(X_3'QX_3) + \sigma_\epsilon^2\mathrm{Tr}(Q),$

where $\mathrm{Tr}(Q)$ denotes the trace of the square matrix $Q$. It should be noted that $\mathrm{Tr}(X_i'QX_i)$ can vary because the sub-model matrices $X_i$ are not orthogonal to each other. For the calculation of the coefficients of the variance components, orthogonal sub-model matrices are therefore used instead of the $X_i$. First, consider the coefficients in $E(SS_e)$. From equation (3.2), the expectation of $SS_e$ is

$E(p_e'p_e) = E(r'(I - XX^{-})(I - XX^{-})^{-}r) = E[y'(I - XX^{-})y] = c_{e1}\sigma_\epsilon^2 + c_{e2}\sigma_{\alpha\beta}^2 + c_{e3}\sigma_\beta^2 + c_{e4}\sigma_\alpha^2,$ (4.1)

where $y$ is assumed to consist of orthogonal component vectors. First, the coefficient of $\sigma_\epsilon^2$, $c_{e1}$, is obtained by using $X_f$, the residual model matrix of (3.1). That is,

$\mathrm{Tr}(X_f'(I - XX^{-})X_f) = \mathrm{Tr}((I - XX^{-})'(I - XX^{-})(I - XX^{-})) = \mathrm{Tr}(I - XX^{-}).$

Second, the coefficient of $\sigma_{\alpha\beta}^2$, $c_{e2}$, is given as

$\mathrm{Tr}(X_{\alpha\beta}'(I - XX^{-})X_{\alpha\beta}) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})(I - XX^{-})(I - X_bX_b^{-})X_3) = 0,$

which shows that $X_{\alpha\beta}$ and $I - XX^{-}$ are orthogonal matrices. Third, the coefficient of $\sigma_\beta^2$, $c_{e3}$, is given as

$\mathrm{Tr}(X_\beta'(I - XX^{-})X_\beta) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})(I - XX^{-})(I - X_aX_a^{-})X_2) = 0,$

where $X_\beta$ denotes the model matrix of (3.4). It shows that $X_\beta$ and $I - XX^{-}$ are orthogonal matrices. Fourth, the coefficient of $\sigma_\alpha^2$, $c_{e4}$, is given as

$\mathrm{Tr}(X_\alpha'(I - XX^{-})X_\alpha) = \mathrm{Tr}(X_1'(I - X_jX_j^{-})(I - XX^{-})(I - X_jX_j^{-})X_1) = 0,$

where $X_\alpha$ is the model matrix of (3.5). This also shows that $X_\alpha$ and $I - XX^{-}$ are orthogonal matrices. All matrices associated with the variance components $\sigma_{\alpha\beta}^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$ are orthogonal to the coefficient matrix of the random error vector $\epsilon$. Hence, (4.1) reduces to

$E(p_e'p_e) = \mathrm{Tr}(I - XX^{-})\sigma_\epsilon^2 = c_{e1}\sigma_\epsilon^2.$

Next, we compute $E(SS_b)$, which has two variance components because the random residual vector $r_b$ has random effects related to $\sigma_{\alpha\beta}^2$ and $\sigma_\epsilon^2$. $E(SS_b)$ is

$E(r_b'r_b) = E(y'(I - X_bX_b^{-})y) = c_{b1}\sigma_\epsilon^2 + c_{b2}\sigma_{\alpha\beta}^2 + c_{b3}\sigma_\beta^2 + c_{b4}\sigma_\alpha^2.$

First, the coefficient of $\sigma_\epsilon^2$, $c_{b1}$, is obtained using the model matrix $I - XX^{-}$ of (3.1), denoted by $X_f$. That is,

$\mathrm{Tr}(X_f'(I - X_bX_b^{-})X_f) = \mathrm{Tr}((I - XX^{-})(I - X_bX_b^{-})(I - XX^{-})) = \mathrm{Tr}(I - XX^{-}).$

Second, the coefficient of $\sigma_{\alpha\beta}^2$, $c_{b2}$, is obtained using the model matrix $(I - X_bX_b^{-})X_3$ of (3.3), denoted by $X_{\alpha\beta}$. This turns out to be

$\mathrm{Tr}(X_{\alpha\beta}'(I - X_bX_b^{-})X_{\alpha\beta}) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})(I - X_bX_b^{-})(I - X_bX_b^{-})X_3) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3).$

Third, the coefficient of $\sigma_\beta^2$, $c_{b3}$, is obtained using the model matrix $(I - X_aX_a^{-})X_2$ of (3.4), denoted by $X_\beta$. That is,

$\mathrm{Tr}(X_\beta'(I - X_bX_b^{-})X_\beta) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})(I - X_bX_b^{-})(I - X_aX_a^{-})X_2) = 0,$

where the two matrices $X_\beta$ and $I - X_bX_b^{-}$ are orthogonal. Lastly, the coefficient of $\sigma_\alpha^2$, $c_{b4}$, is obtained using the model matrix $(I - X_jX_j^{-})X_1$ of (3.5), denoted by $X_\alpha$. That is,

$\mathrm{Tr}(X_\alpha'(I - X_bX_b^{-})X_\alpha) = \mathrm{Tr}(X_1'(I - X_jX_j^{-})(I - X_bX_b^{-})(I - X_jX_j^{-})X_1) = 0.$

Hence, the expectation of SSb is represented by

$E(r_b'r_b) = \mathrm{Tr}(I - XX^{-})\sigma_\epsilon^2 + \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3)\sigma_{\alpha\beta}^2 = c_{b1}\sigma_\epsilon^2 + c_{b2}\sigma_{\alpha\beta}^2.$

Since $r_a$ defined in (3.4) has three random effects, $E(SS_a)$ has the information about the coefficients of $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_\epsilon^2$. $E(SS_a)$ is

$E(r_a'r_a) = E(y'(I - X_aX_a^{-})y) = c_{a1}\sigma_\epsilon^2 + c_{a2}\sigma_{\alpha\beta}^2 + c_{a3}\sigma_\beta^2 + c_{a4}\sigma_\alpha^2.$

First, the coefficient of $\sigma_\epsilon^2$, $c_{a1}$, is obtained using the model matrix of (3.1), denoted by $X_f$. That is,

$\mathrm{Tr}(X_f'(I - X_aX_a^{-})X_f) = \mathrm{Tr}((I - XX^{-})(I - X_aX_a^{-})(I - XX^{-})) = \mathrm{Tr}(I - XX^{-}).$

Second, the coefficient of $\sigma_{\alpha\beta}^2$, $c_{a2}$, is obtained using the model matrix of (3.3), denoted by $X_{\alpha\beta}$. That is,

$\mathrm{Tr}(X_{\alpha\beta}'(I - X_aX_a^{-})X_{\alpha\beta}) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})(I - X_aX_a^{-})(I - X_bX_b^{-})X_3) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3).$

Third, the coefficient of $\sigma_\beta^2$, $c_{a3}$, is found using the model matrix of (3.4), denoted by $X_\beta$. That is,

$\mathrm{Tr}(X_\beta'(I - X_aX_a^{-})X_\beta) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})(I - X_aX_a^{-})(I - X_aX_a^{-})X_2) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})X_2).$

Lastly, the coefficient of $\sigma_\alpha^2$, $c_{a4}$, is obtained using the model matrix of (3.5), denoted by $X_\alpha$. That is,

$\mathrm{Tr}(X_\alpha'(I - X_aX_a^{-})X_\alpha) = \mathrm{Tr}(X_1'(I - X_jX_j^{-})(I - X_aX_a^{-})(I - X_jX_j^{-})X_1) = 0.$

Hence, E(SSa) is represented by

$E(r_a'r_a) = \mathrm{Tr}(I - XX^{-})\sigma_\epsilon^2 + \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3)\sigma_{\alpha\beta}^2 + \mathrm{Tr}(X_2'(I - X_aX_a^{-})X_2)\sigma_\beta^2 = c_{a1}\sigma_\epsilon^2 + c_{a2}\sigma_{\alpha\beta}^2 + c_{a3}\sigma_\beta^2.$

Finally, E(SSj) is

$E(r_j'r_j) = E(y'(I - X_jX_j^{-})y) = c_{j1}\sigma_\epsilon^2 + c_{j2}\sigma_{\alpha\beta}^2 + c_{j3}\sigma_\beta^2 + c_{j4}\sigma_\alpha^2.$ (4.2)

The coefficients of the variance components in (4.2) can be obtained as before with the model matrices $X_f$, $X_{\alpha\beta}$, $X_\beta$, and $X_\alpha$. With $X_f$, the coefficient of $\sigma_\epsilon^2$, $c_{j1}$, is given as

$\mathrm{Tr}(X_f'(I - X_jX_j^{-})X_f) = \mathrm{Tr}((I - XX^{-})(I - X_jX_j^{-})(I - XX^{-})) = \mathrm{Tr}(I - XX^{-}).$

With $X_{\alpha\beta}$, the coefficient of $\sigma_{\alpha\beta}^2$, $c_{j2}$, is given as

$\mathrm{Tr}(X_{\alpha\beta}'(I - X_jX_j^{-})X_{\alpha\beta}) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})(I - X_jX_j^{-})(I - X_bX_b^{-})X_3) = \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3).$

With $X_\beta$, the coefficient of $\sigma_\beta^2$, $c_{j3}$, is given as

$\mathrm{Tr}(X_\beta'(I - X_jX_j^{-})X_\beta) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})(I - X_jX_j^{-})(I - X_aX_a^{-})X_2) = \mathrm{Tr}(X_2'(I - X_aX_a^{-})X_2).$

With $X_\alpha$, the coefficient of $\sigma_\alpha^2$, $c_{j4}$, is given as

$\mathrm{Tr}(X_\alpha'(I - X_jX_j^{-})X_\alpha) = \mathrm{Tr}(X_1'(I - X_jX_j^{-})(I - X_jX_j^{-})(I - X_jX_j^{-})X_1) = \mathrm{Tr}(X_1'(I - X_jX_j^{-})X_1).$

Hence, the expectation of $SS_j$ is represented by

$E(r_j'r_j) = \mathrm{Tr}(I - XX^{-})\sigma_\epsilon^2 + \mathrm{Tr}(X_3'(I - X_bX_b^{-})X_3)\sigma_{\alpha\beta}^2 + \mathrm{Tr}(X_2'(I - X_aX_a^{-})X_2)\sigma_\beta^2 + \mathrm{Tr}(X_1'(I - X_jX_j^{-})X_1)\sigma_\alpha^2 = c_{j1}\sigma_\epsilon^2 + c_{j2}\sigma_{\alpha\beta}^2 + c_{j3}\sigma_\beta^2 + c_{j4}\sigma_\alpha^2.$
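Since every nonzero coefficient above is a trace, the synthesis can be replayed numerically. The sketch below evaluates $c_{e1}$, $c_{b2}$, $c_{a3}$, and $c_{j4}$ on the Table 1 data; the helper `P` and the incidence-matrix construction are assumptions of this illustration, not the author's code.

```python
import numpy as np

def P(M):
    """Orthogonal projector onto the column space of M."""
    return M @ np.linalg.pinv(M)

cells = {(0, 0): [10, 12, 11], (0, 1): [13, 15], (0, 2): [21, 19],
         (1, 0): [16, 18], (1, 1): [13, 19, 14], (1, 2): [11, 13]}
r, c = [], []
for (i, j), obs in cells.items():
    r += [i] * len(obs)
    c += [j] * len(obs)
n = len(r)
X1 = np.eye(2)[r]
X2 = np.eye(3)[c]
X3 = np.eye(6)[[3 * i + j for i, j in zip(r, c)]]
ones = np.ones((n, 1))
I = np.eye(n)
X = np.hstack([ones, X1, X2, X3])
Xb = np.hstack([ones, X1, X2])
Xa = np.hstack([ones, X1])

ce1 = np.trace(I - P(X))                   # coefficient of sigma_eps^2 in E(SS_e)
cb2 = np.trace(X3.T @ (I - P(Xb)) @ X3)    # coefficient of sigma_ab^2 in E(SS_b)
ca3 = np.trace(X2.T @ (I - P(Xa)) @ X2)    # coefficient of sigma_beta^2 in E(SS_a)
cj4 = np.trace(X1.T @ (I - P(ones)) @ X1)  # coefficient of sigma_alpha^2 in E(SS_j)
print(round(ce1, 2), round(cb2, 2), round(ca3, 2), round(cj4, 2))  # 8.0 4.52 9.14 7.0
```

The rounded values are the coefficients that appear in the worked example of Section 5.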

The equations to be solved for the variance components are

$SS_e = c_{e1}\hat{\sigma}_\epsilon^2 + c_{e2}\hat{\sigma}_{\alpha\beta}^2 + c_{e3}\hat{\sigma}_\beta^2 + c_{e4}\hat{\sigma}_\alpha^2,$
$SS_b = c_{b1}\hat{\sigma}_\epsilon^2 + c_{b2}\hat{\sigma}_{\alpha\beta}^2 + c_{b3}\hat{\sigma}_\beta^2 + c_{b4}\hat{\sigma}_\alpha^2,$
$SS_a = c_{a1}\hat{\sigma}_\epsilon^2 + c_{a2}\hat{\sigma}_{\alpha\beta}^2 + c_{a3}\hat{\sigma}_\beta^2 + c_{a4}\hat{\sigma}_\alpha^2,$
$SS_j = c_{j1}\hat{\sigma}_\epsilon^2 + c_{j2}\hat{\sigma}_{\alpha\beta}^2 + c_{j3}\hat{\sigma}_\beta^2 + c_{j4}\hat{\sigma}_\alpha^2.$ (4.3)

The solutions of the system of linear equations (4.3) in the $\hat{\sigma}_i^2$ are the nonnegative estimates of the variance components for the assumed model (2.1).
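Because the coefficients above the diagonal vanish, the system is triangular and can be solved by forward substitution. The sketch below uses the rounded coefficients and sums of squares quoted in Section 5; small discrepancies from the reported estimates are due to this rounding.

```python
# Forward substitution through the triangular system (4.3), with the
# rounded coefficients of the Section 5 example.
ce1 = cb1 = ca1 = cj1 = 8.0
cb2 = ca2 = cj2 = 4.52
ca3 = cj3 = 9.14
cj4 = 7.0
SSe, SSb, SSa, SSj = 30.66667, 139.81176, 154.57143, 155.2143

var_eps = SSe / ce1                                            # sigma_eps^2
var_ab = (SSb - cb1 * var_eps) / cb2                           # sigma_ab^2
var_b = (SSa - ca1 * var_eps - ca2 * var_ab) / ca3             # sigma_beta^2
var_a = (SSj - cj1 * var_eps - cj2 * var_ab - cj3 * var_b) / cj4  # sigma_alpha^2
print(var_eps, var_ab, var_b, var_a)
```

Each step subtracts the contributions already accounted for, so every estimate is a nonnegative sum of squares divided by a positive trace.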

5. Example

In this section the data presented in Section 2 are used as an example of obtaining nonnegative estimates for the four variance components. From the model fitting procedure discussed previously we get $SS_e = 30.66667$, $SS_b = 139.81176$, $SS_a = 154.57143$, and $SS_j = 155.2143$. The expectations of these are given as

$E(SS_e) = 8\sigma_\epsilon^2,$
$E(SS_b) = 8\sigma_\epsilon^2 + 4.52\sigma_{\alpha\beta}^2,$
$E(SS_a) = 8\sigma_\epsilon^2 + 4.52\sigma_{\alpha\beta}^2 + 9.14\sigma_\beta^2,$
$E(SS_j) = 8\sigma_\epsilon^2 + 4.52\sigma_{\alpha\beta}^2 + 9.14\sigma_\beta^2 + 7\sigma_\alpha^2.$

By equating the sums of squares to their respective expected sums of squares, we get

$\hat{\sigma}_\epsilon^2 = 3.8333, \quad \hat{\sigma}_{\alpha\beta}^2 = 24.1597, \quad \hat{\sigma}_\beta^2 = 1.6143, \quad \hat{\sigma}_\alpha^2 = 0.0918.$

Nonnegative estimates are obtained by the projection method. These values can be compared with the Type I estimates from Milliken and Johnson (1984), which are

$\hat{\sigma}_\epsilon^2 = 3.8333, \quad \hat{\sigma}_{\alpha\beta}^2 = 22.4620, \quad \hat{\sigma}_\beta^2 = -10.5877, \quad \hat{\sigma}_\alpha^2 = -8.0329.$
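The whole procedure can be replayed end to end on the Table 1 data. The following sketch (an illustration with an assumed helper `P`, not the author's code) forms the four residual sums of squares, synthesizes the coefficient matrix from the orthogonalized model matrices, and solves for the estimates; they come out nonnegative and agree with the projection-method values above to rounding.

```python
import numpy as np

def P(M):
    """Orthogonal projector onto the column space of M."""
    return M @ np.linalg.pinv(M)

cells = {(0, 0): [10, 12, 11], (0, 1): [13, 15], (0, 2): [21, 19],
         (1, 0): [16, 18], (1, 1): [13, 19, 14], (1, 2): [11, 13]}
y, r, c = [], [], []
for (i, j), obs in cells.items():
    y += obs
    r += [i] * len(obs)
    c += [j] * len(obs)
y = np.asarray(y, float)
n = y.size
X1 = np.eye(2)[r]
X2 = np.eye(3)[c]
X3 = np.eye(6)[[3 * i + j for i, j in zip(r, c)]]
ones = np.ones((n, 1))
I = np.eye(n)
Qe = I - P(np.hstack([ones, X1, X2, X3]))  # full-model residual projector
Qb = I - P(np.hstack([ones, X1, X2]))
Qa = I - P(np.hstack([ones, X1]))
Qj = I - P(ones)

SS = np.array([y @ Q @ y for Q in (Qe, Qb, Qa, Qj)])
Z = [Qe, Qb @ X3, Qa @ X2, Qj @ X1]  # X_f, X_ab, X_beta, X_alpha
C = np.array([[np.trace(Zl.T @ Qk @ Zl) for Zl in Z]
              for Qk in (Qe, Qb, Qa, Qj)])
est = np.linalg.solve(C, SS)  # (sigma_eps^2, sigma_ab^2, sigma_beta^2, sigma_alpha^2)
print(est)
```

Because the coefficient matrix `C` is triangular with positive diagonal entries, the solve amounts to the forward substitution described in Section 4.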
6. Conclusion

This paper suggests a method for estimating the variance components in a two-way random model. The method is based on the concept of orthogonal projections of an observation vector in $n$-dimensional space onto vector subspaces. Orthogonal projections are used for the estimation of the variance components associated with the random effects of the assumed model. Many methods are available in the literature; however, most of them can still produce negative estimates of variance components. To avoid this, it is necessary to build proper models that yield the right projections onto vector subspaces that are orthogonal to each other. Therefore, it is important to identify the residual random vector in each step and then construct a residual random model for it.

The corresponding orthogonal projection is obtained by projecting the residual random vector onto the vector subspace spanned by the residual model matrix of the residual random model. The observation vector can be denoted as the sum of projections since projections are defined on their own vector subspaces that are orthogonal.

Hartley's synthesis was used for the calculation of the expectations of the sums of squares, expressed as sums of squared distances of orthogonal projections. Each projection is defined on its own vector subspace, orthogonal to the others; therefore, the expectation of the sum of squares for the residual vector defined at each stage has nonzero coefficients only for the variance components of its related random effects.

A variance, as a measure of the variation of data, can never be negative by definition; nevertheless, a negative estimate is often accredited as evidence that the true value of the component is zero. The projection method, in contrast, always yields nonnegative estimates of the variance components.

Acknowledgements

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant No.2018R1D1A1B07043021).


### Table 1

Data for two-way design

| RC11 | RC12 | RC13 | RC21 | RC22 | RC23 |
|------|------|------|------|------|------|
| 10   | 13   | 21   | 16   | 13   | 11   |
| 12   | 15   | 19   | 18   | 19   | 13   |
| 11   |      |      |      | 14   |      |

References
1. El-Leithy AH, Abdel-Wahed AZ, and Abdallah SM (2016). On non-negative estimation of variance components in mixed linear models. Journal of Advanced Research, 7, 59-68.
2. Graybill FA (1983). Matrices with Applications in Statistics, California, Wadsworth.
3. Hartley HO (1967). Expectations, variances and covariances of ANOVA mean squares by "synthesis". Biometrics, 23, 105-114.
4. Henderson CR (1953). Estimation of variance and covariance components. Biometrics, 9, 226-252.
5. Johnson RA and Wichern DW (1988). Applied Multivariate Statistical Analysis, New Jersey, Prentice-Hall.
6. Milliken GA and Johnson DE (1984). Analysis of Messy Data, New York, Van Nostrand Reinhold.
7. Searle SR (1971). Linear Models, New York, John Wiley & Sons.
8. Searle SR, Casella G, and McCulloch CE (2009). Variance Components, New York, John Wiley & Sons.