Application of covariance adjustment to seemingly unrelated multivariate regressions
Communications for Statistical Applications and Methods 2018;25:577-590
Published online November 30, 2018
© 2018 Korean Statistical Society.

Lichun Wang1,a, Lawrence Pettitb

aDepartment of Mathematics, Beijing Jiaotong University, China;
bSchool of Mathematical Sciences, Queen Mary University of London, UK
Correspondence to: 1Department of Mathematics, Beijing Jiaotong University, No.3 Shangyuancun Haidian District, Beijing 100044, China. E-mail: lchwang@bjtu.edu.cn
Received February 27, 2018; Revised August 24, 2018; Accepted September 21, 2018.

Employing the covariance adjustment technique, we show that in the system of two seemingly unrelated multivariate regressions the estimator of the regression coefficients can be expressed as a matrix power series, and we conclude that the series has a unique simpler form. When the covariance matrix of the system is unknown, we define a two-stage estimator of the regression coefficients, which is shown to be unique and unbiased. Numerical simulations are presented to illustrate its superiority over the ordinary least squares estimator. As an example, we also apply our results to seemingly unrelated growth curve models.

Keywords : seemingly unrelated regressions, matrix power series, two-stage estimator
1. Introduction

The system of seemingly unrelated regressions (SUR), which can be used to model subtle interactions among individual statistical relationships, has been investigated by many authors since the pioneering works of Zellner (1962, 1963). For more details, readers are referred to Revankar (1974), Schmidt (1977), Wang (1989), Percy (1992), Liu (2000), and Liu (2002). Among these, the cases of orthogonal regressors (Zellner, 1963), triangular SUR models (Revankar, 1974), and SUR with unequal numbers of observations (Schmidt, 1977) are particularly notable. Examples in the econometrics literature (Srivastava and Giles, 1987) suggest that the SUR model is appropriate and useful for a wide range of applications. Further, Velu and Richards (2008) focus on applications of the reduced-rank model in the context of SUR. Alkhamisi (2010) proposes two SUR-type estimators based on combining SUR ridge regression with the restricted least squares method and evaluates their performance by means of designated criteria. Zhou et al. (2011) employ seemingly unrelated nonparametric regression models to fit multivariate panel data. Shukur and Zeebari (2012) consider median regression for SUR models with the same explanatory variables in each equation and obtain an interesting feature of the generalized least absolute deviations method. In this paper, we show some interesting facts about the SUR system by employing the covariance adjustment technique. We start from the system of seemingly unrelated multivariate regressions (Gupta and Kabe, 1998), namely


Y1 = X1B1 + E1,    Y2 = X2B2 + E2,    (1.1)

where Yi (i = 1, 2) are n × q observation matrices; Xi (i = 1, 2) are n × pi matrices with full column rank; Bi (i = 1, 2) are pi × q unknown regression coefficient matrices; E1 and E2 are random error matrices, and the rows of (E1, E2) follow a common unspecified multivariate distribution with mean zero and covariance matrix V, where V is a non-diagonal 2 × 2 partitioned matrix given by


V = ( V1  D
      Dᵀ  V2 ),    (1.2)

where Vi is the variance-covariance matrix of each row of Ei (i = 1, 2) and D denotes the covariance matrix between a row of E1 and the corresponding row of E2. Different rows of (E1, E2) are assumed to be uncorrelated. The multivariate SUR case is common in the biological sciences. For instance, if the ith row of Y1 denotes the observations of the weight of the ith rabbit at q different time points, the ith row of Y2 denotes the observations of the length of the ith rabbit at the same q time points, and the observations of different rabbits are uncorrelated, then the multivariate SUR (1.1) provides a reasonable model for the interactions among the weight and length observations of the n rabbits.

If one neglects the correlation between Y1 and Y2, i.e., takes D to be zero, then from the first equation of the system (1.1) alone one obtains the least squares estimator (LSE) of Vec(B1),

Vec^(B1) = [Iq ⊗ (X1ᵀX1)⁻¹X1ᵀ]Vec(Y1),    (1.3)

and correspondingly the LSE of the coefficient matrix B1 is B^1 = (X1ᵀX1)⁻¹X1ᵀY1, where Vec(A) denotes the vectorization of the matrix A (its columns stacked into a single vector), and ⊗ and Iq are the Kronecker product operator and the identity matrix of order q, respectively.
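As a sketch, the LSE (1.3) can be computed either in matrix form or in vectorized form; the two agree because Vec(AB) = (Iq ⊗ A)Vec(B). The dimensions, seed, and simulated data below are hypothetical illustrations, not the paper's data.

```python
import numpy as np

# Hypothetical dimensions: n = 50 observations, p1 = 3 regressors, q = 2 responses.
rng = np.random.default_rng(0)
n, p1, q = 50, 3, 2
X1 = rng.normal(size=(n, p1))                 # full column rank (a.s.)
B1 = rng.normal(size=(p1, q))
Y1 = X1 @ B1 + rng.normal(size=(n, q))        # first equation of (1.1)

# Matrix form: B1_hat = (X1' X1)^{-1} X1' Y1
B1_hat = np.linalg.solve(X1.T @ X1, X1.T @ Y1)

# Vectorized form (1.3): Vec(B1_hat) = [I_q (x) (X1'X1)^{-1} X1'] Vec(Y1),
# where Vec stacks columns (column-major order).
vec = lambda A: A.reshape(-1, order="F")
vec_B1_hat = np.kron(np.eye(q), np.linalg.inv(X1.T @ X1) @ X1.T) @ vec(Y1)

# The two forms agree, since Vec(AB) = (I (x) A) Vec(B).
assert np.allclose(vec(B1_hat), vec_B1_hat)
```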

However, if we denote Y = (Y1, Y2), B = (B1, B2), and E = (E1, E2), then the system (1.1) can also be represented as

Vec(Y) = diag(Iq ⊗ X1, Iq ⊗ X2)Vec(B) + Vec(E),    Cov(Vec(E)) = V ⊗ In,    (1.4)

where Vec(B)ᵀ = (Vec(B1)ᵀ, Vec(B2)ᵀ). Hence, from (1.4) one can obtain the LSE of Vec(B), say Vec¯(B), and accordingly another estimator of Vec(B1), denoted by Vec¯(B1), since Vec¯(B)ᵀ = (Vec¯(B1)ᵀ, Vec¯(B2)ᵀ). It is natural to expect that Vec¯(B1) and its two-stage version Vec¯(B1)2-stage (for the case of unknown V) should outperform Vec^(B1) in (1.3), since they also incorporate the information about B1 contained in the second equation.

The covariance adjustment technique is usually employed to obtain an optimal unbiased estimator of a vector parameter θ by linearly combining an unbiased estimator of θ, say T1, with an unbiased estimator of a zero vector, say T2 (Rao, 1967; Baksalary, 1991).

Applying the covariance adjustment technique to the estimator Vec^(B1), which uses only the first equation's information on Vec(B1), we first use (Iq ⊗ N2)Vec(Y2) to improve Vec^(B1), noting that E[(Iq ⊗ N2)Vec(Y2)] = (Iq ⊗ N2)(Iq ⊗ X2)Vec(B2) = 0, and obtain Vec^(B1)(1); we then improve Vec^(B1)(1) by (Iq ⊗ N1)Vec(Y1), since E[(Iq ⊗ N1)Vec(Y1)] = 0, and get Vec^(B1)(2). Repeating this process, we obtain the following estimator sequence (k ≥ 1) for Vec(B1):


where Vec^(B1)(0) = Vec^(B1), Ni = In − Xi(XiᵀXi)⁻¹Xiᵀ (i = 1, 2), and A⁻ denotes a generalized inverse of the matrix A.

Note that Cov(Vec(Yi), Vec(Yi)) = Vi ⊗ In (i = 1, 2) and Cov(Vec(Y1), Vec(Y2)) = D ⊗ In. By some algebra, we obtain that for k ≥ 1


Denote

V⁻¹ = ( V^{11}  V^{12}
        V^{21}  V^{22} )

and Q = (V^{11})⁻¹V^{12}(V^{22})⁻¹V^{21}. By (1.2) and the formula for the inverse of a partitioned matrix, we have


and (V^{11})⁻¹V^{12} = −DV2⁻¹. Thus we have


where Pi = In − Ni = Xi(XiᵀXi)⁻¹Xiᵀ, and we use the facts that (Q ⊗ P2P1)^0 = Iq ⊗ In = Iqn, (V^{11})⁻¹(Qᵀ)^kV^{11} = Q^k for k ≥ 0, and X1ᵀ(N2N1)^k = −X1ᵀ(P2P1)^{k−1}P2N1 for k ≥ 1.
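The last identity, X1ᵀ(N2N1)^k = −X1ᵀ(P2P1)^{k−1}P2N1, can be checked numerically; the sketch below uses randomly generated full-column-rank design matrices with hypothetical sizes.

```python
import numpy as np

# Verify X1'(N2 N1)^k = -X1'(P2 P1)^(k-1) P2 N1 for k >= 1.
rng = np.random.default_rng(1)
n, p1, p2 = 30, 3, 2
X1 = rng.normal(size=(n, p1))
X2 = rng.normal(size=(n, p2))

def hat(X):
    # Orthogonal projection onto the column space of X.
    return X @ np.linalg.solve(X.T @ X, X.T)

P1, P2 = hat(X1), hat(X2)
N1, N2 = np.eye(n) - P1, np.eye(n) - P2

for k in range(1, 5):
    lhs = X1.T @ np.linalg.matrix_power(N2 @ N1, k)
    rhs = -X1.T @ np.linalg.matrix_power(P2 @ P1, k - 1) @ P2 @ N1
    assert np.allclose(lhs, rhs)
```

The k = 1 case follows directly from X1ᵀN1 = 0; the general case follows by induction.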

We summarize the above conclusions in the following theorem, which gives the limit of the covariance adjustment sequence and the covariance of Vec¯(B1).

Theorem 1

For the system (1.1), the limit of the covariance adjustment sequence of Vec(B1) equals Vec¯(B1), i.e., lim_{k→∞} Vec^(B1)(k) = Vec¯(B1), and




The first conclusion follows from the above discussion. Denote Vec¯(B1) = M(Q){Vec(Y1) + [(V^{11})⁻¹V^{12} ⊗ N2]Vec(Y2)} with

M(Q) = [Iq ⊗ (X1ᵀX1)⁻¹X1ᵀ]{Iqn − Σ_{i=0}^{∞} (Q ⊗ P2P1)^i (Q ⊗ P2N1)}.

Then we have


where we use the following fact


Together with the expression of M(Q), we have


Using QV1 = DV2⁻¹Dᵀ, X1ᵀN2X1 = X1ᵀ(In − P1P2P1)X1, and X1ᵀP1 = X1ᵀ, we have


where the last step uses the facts that (P1P2P1)^0 = In and Q^iDV2⁻¹Dᵀ = DV2⁻¹Dᵀ(Qᵀ)^i for i ≥ 0.

The proof of Theorem 1 is finished.

Note that Q^0DV2⁻¹Dᵀ = DV2⁻¹Dᵀ ≥ 0, In − P1P2P1 ≥ 0, and for i ≥ 1


and (P1P2P1)^i − (P1P2P1)^{i+1} ≥ 0. Hence


Further, since Cov(Vec^(B1)) = V1 ⊗ (X1ᵀX1)⁻¹, we have


which means that Vec¯(B1) is superior to Vec^(B1) in the sense of having a smaller covariance matrix. This is consistent with the fact that Vec^(B1) uses only the information on Vec(B1) in the first regression equation, whereas Vec¯(B1) combines the second regression equation with the first via covariance adjustment.

2. The characteristics of matrix series

Note that for i = 1, 2, . . . ,


We only need to prove that the right-hand equality implies the left-hand one. Note that X1ᵀ(P2P1)^{i−1}P2N1N2 = 0 implies X1ᵀ(P2P1)^{i−1}N2N1N2 = 0; hence X1ᵀ(P2P1)^{i−1}N2N1N2(P1P2)^{i−1}X1 = 0, and thus X1ᵀ(P2P1)^{i−1}N2N1 = 0, where we use N1² = N1. Further, replacing N2 by In − P2 and noting that X1ᵀN1 = 0 and P1N1 = 0, we obtain X1ᵀ(P2P1)^{i−1}P2N1 = 0.

Therefore, (2.1) implies that for i = 1, 2, . . . ,


which further shows that for i = 1, 2, . . . ,


where we note that Q = DV2⁻¹DᵀV1⁻¹ and Q^i(V^{11})⁻¹V^{12} = −Q^iDV2⁻¹, that D is the covariance matrix between the rows of E1 and E2, and that both Q and Q^i(V^{11})⁻¹V^{12} are invertible.



The following theorem shows that the matrix series (1.10) has exactly one degenerate form, Vec¯(B1)s.

Theorem 2

Vec¯(B1)s is the unique simpler form of Vec^(B1)(∞).


Note that for any fixed i (i ≥ 1), if X1ᵀ(P2P1)^{i−1}P2N1 = 0, then X1ᵀ(P2P1)^iP2N1 = X1ᵀ(P2P1)^{i−1}P2(In − N1)P2N1 = 0. Proceeding step by step, we arrive at

X1ᵀ(P2P1)^{k−1}P2N1 = 0,    k = i + 1, i + 2, . . . .

Thus, we find

Q^i(X1ᵀX1)⁻¹X1ᵀ(P2P1)^{i−1}P2N1 = 0 for any fixed i (i ≥ 1)  ⟹  Q^k(X1ᵀX1)⁻¹X1ᵀ(P2P1)^{k−1}P2N1 = 0,    k = i + 1, i + 2, . . . .

On the other hand, if for any fixed i (i ≥ 2) one has X1ᵀ(P2P1)^{i−1}P2N1 = 0, then it is easy to see that


where the last step comes from the fact (2.1). Thus, step by step we conclude that

Q^i(X1ᵀX1)⁻¹X1ᵀ(P2P1)^{i−1}P2N1 = 0 for any fixed i (i ≥ 2)  ⟹  Q^k(X1ᵀX1)⁻¹X1ᵀ(P2P1)^{k−1}P2N1 = 0,    k = 1, 2, . . . , i − 1.

Combining (2.6) with (2.8), we know that for any fixed i (i ≥ 1), if


then the infinite series


and by (2.3) we conclude at the same time that the infinite series


Hence, Vec^(B1)(∞) has the unique simpler form Vec¯(B1)s in the sense that if one term in (2.10) or (2.11) is zero, then both infinite sums vanish.

The proof of Theorem 2 is finished.

3. The properties of two-stage estimator

If the covariance matrix V is unknown, then neither Vec^(B1)(∞) nor the simpler form Vec¯(B1)s is available. Setting X̃ = (X1, X2), we estimate V by

V̂ = [n − R(X̃)]⁻¹ (Y1, Y2)ᵀ(In − P_X̃)(Y1, Y2),    (3.1)

where R(X̃) is the rank of X̃ and P_X̃ = X̃(X̃ᵀX̃)⁻X̃ᵀ.
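A minimal numerical sketch of this estimator of V follows; the dimensions, coefficients, and seed are hypothetical, and the true covariance is taken of the form used later in Section 3.

```python
import numpy as np

# V_hat = (Y1,Y2)'(I_n - P_Xt)(Y1,Y2) / (n - R(Xt)), with Xt = (X1, X2).
rng = np.random.default_rng(2)
n, p1, p2, q, rho = 200, 3, 2, 2, 0.5
X1 = rng.normal(size=(n, p1))
X2 = rng.normal(size=(n, p2))
Xt = np.hstack([X1, X2])

V = np.block([[np.eye(q), rho * np.eye(q)],
              [rho * np.eye(q), np.eye(q)]])          # true covariance, as in (1.2)
E = rng.multivariate_normal(np.zeros(2 * q), V, size=n)
Y = np.hstack([X1 @ rng.normal(size=(p1, q)),
               X2 @ rng.normal(size=(p2, q))]) + E    # Y = (Y1, Y2)

P_Xt = Xt @ np.linalg.pinv(Xt)                        # projection onto col(Xt)
V_hat = Y.T @ (np.eye(n) - P_Xt) @ Y / (n - np.linalg.matrix_rank(Xt))
```

The pseudoinverse form of the projection also covers the rank-deficient case, matching the generalized inverse in the definition of P_X̃.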

Following from E(aᵀAb) = trace[A Cov(b, a)] + (Ea)ᵀA(Eb) and (In − P_X̃)Xi = 0 (i = 1, 2), where a and b denote two random vectors, we have E[Yiᵀ(In − P_X̃)Yi] = Vi[n − R(X̃)] (i = 1, 2) and E[Y1ᵀ(In − P_X̃)Y2] = D[n − R(X̃)], which show that


Substituting the estimator V̂ for V in the expressions of Vec^(B1)(∞) and Vec¯(B1)s, we obtain the following two two-stage estimators


with M(Q̂) = [Iq ⊗ (X1ᵀX1)⁻¹X1ᵀ]{Iqn − Σ_{i=0}^{∞} (Q̂ ⊗ P2P1)^i (Q̂ ⊗ P2N1)} and Q̂ = D̂V̂2⁻¹D̂ᵀV̂1⁻¹, and


Similar to Theorem 2, Vec¯(B1)s,2-stage is the unique simpler form of Vec^(B1)(∞)2-stage. Hence, we focus on the performance of Vec¯(B1)s,2-stage.

The matrix-variate normal distribution is a commonly used distribution in the class of matrix elliptically symmetric distributions. It plays an important role in the investigation of multivariate regression models such as the growth curve model (GCM). In what follows, in order to establish the unbiasedness of Vec¯(B1)s,2-stage, we first briefly present the definition of the matrix-variate normal distribution as well as two related properties and then make some assumptions on the distributions of random error matrices Ei (i = 1, 2).

Definition 1

A random matrix Z with order n×q is said to follow a matrix-variate normal distribution if its probability function is of the form

f(Z) = (2π)^{−nq/2}[det(Σ)]^{−q/2}[det(Ω)]^{−n/2} exp(−(1/2) trace{Ω⁻¹[Z − M]ᵀΣ⁻¹[Z − M]}),

where M, Σ > 0, and Ω > 0 are n × q, n × n, and q × q matrices, respectively, and det(A) is the determinant of the square matrix A. In this case, we write Z ~ Nn,q(M, Σ, Ω).

The following two lemmas give the relationship between the matrix-variate and vector-variate normal distributions and show that an affine transformation of a matrix-variate normal variable also follows a matrix-variate normal distribution. Readers are referred to the first chapter of Pan and Fang (2007) for more details.

Lemma 1

Let Z be an n × q random matrix and z = Vec(Z). Then Z ~ Nn,q(M, Σ, Ω) if and only if z ~ Nnq(Vec(M), Ω ⊗ Σ).
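Lemma 1 also gives a practical way to draw from Nn,q(M, Σ, Ω): sample Vec(Z) from Nnq(Vec(M), Ω ⊗ Σ) and reshape in column-major order. A sketch follows, in which Σ, Ω, and the dimensions are hypothetical examples.

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 4, 3
M = np.zeros((n, q))
Sigma = np.eye(n)                         # row covariance
A = rng.normal(size=(q, q))
Omega = A @ A.T + q * np.eye(q)           # a positive definite column covariance

# vec(Z) ~ N_{nq}(vec(M), Omega (x) Sigma), reshaped column-major into Z.
z = rng.multivariate_normal(M.reshape(-1, order="F"), np.kron(Omega, Sigma))
Z = z.reshape((n, q), order="F")          # Z ~ N_{n,q}(M, Sigma, Omega)
```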

Lemma 2

Suppose Z ~ Nn,q(M, Σ, Ω), and that C, A1 > 0, and A2 > 0 are given matrices with orders n × q, n × n, and q × q, respectively. Then A1ZA2 + C ~ Nn,q(A1MA2 + C, A1ΣA1ᵀ, A2ΩA2ᵀ).

In the following, we assume that in the system (1.1) the random error matrices Ei (i = 1, 2) follow the matrix-variate normal distributions Nn,q(0, In, Vi), which indicates that the rows of Ei are iid random vectors with common distribution Nq(0, Vi) (i = 1, 2). Thus, the rows of E = (E1, E2) are iid random vectors with common distribution N2q(0, V), i.e., E ~ Nn,2q(0, In, V). Hence, by Lemmas 1 and 2 we know that


Denote Yi = (y1^(i), y2^(i), . . . , yq^(i)) (i = 1, 2). Then the matrix D̂ = [n − R(X̃)]⁻¹(dij)q×q with elements


where the 2q × 2q matrix Oi,q+j consists of all zeros except that the element in the ith row and the (q + j)th column is one. Similarly, the (i, j)th element of V̂2 is equal to


where the 2q × 2q matrix Oq+i,q+j consists of all zeros except that the element in the (q + i)th row and the (q + j)th column is one.

Note that [Iq ⊗ (X1ᵀX1)⁻¹X1ᵀN2]Vec(Y2) = [0_{qp1×nq}, Iq ⊗ (X1ᵀX1)⁻¹X1ᵀN2]Vec(Y). Hence, using X1ᵀN2[In − P_X̃] = 0, the criterion for independence of a linear function and a quadratic function of normal variables, and the following easily verified facts:




We know that


Thus, we obtain the following theorem, which states the unbiasedness of the two-stage estimator.

Theorem 3

Under the assumptions that Ei ~ Nn,q(0, In, Vi) (i = 1, 2), the two-stage estimatorVec¯(B1)s,2-stageis unbiased, i.e.,E[Vec¯(B1)s,2-stage]=Vec(B1).

In the following, we refer to Grunfeld's data in Maddala (1977) and present two simulation studies comparing the performance of Vec¯(B1)s,2-stage with that of Vec^(B1), first when there is a known relationship between the design matrices X1 and X2, and then when there is no relationship between X1 and X2.

(I) The case that X1 = (X2, L)

Here the system (1.1) is of the form Yi = XiBi + Ei (i = 1, 2) with E = (E1, E2) ~ Nn,4(0, In, V), and

B1 = ( 1  1
       1  2
       1  3 ),    B2 = (  1  6
                         −3  2 ),    V = ( 1  0  ρ  0
                                           0  1  0  ρ
                                           ρ  0  1  0
                                           0  ρ  0  1 ).    (3.10)

Set S(B1) = (Y1 − X1B1)ᵀ(Y1 − X1B1). Note that the estimator B^1 = (X1ᵀX1)⁻¹X1ᵀY1 given by (1.3), which corresponds to the LSE Vec^(B1), simultaneously minimizes the residual sum of squares (in the sense of nonnegative definiteness), the trace of S(B1), the determinant of S(B1), and the largest eigenvalue of S(B1) (Muirhead, 1982). Therefore, under these four criteria, if only the first equation Y1 = X1B1 + E1 is used, the LSEs of the regression coefficient B1 are completely identical (Fang and Zhang, 1990). Thus, without loss of generality, we illustrate the superiority of Vec¯(B1)s,2-stage by comparing trace(S(B^1)) with trace(S(B¯1,s,2-stage)), where
B¯1,s,2-stage = (X1ᵀX1)⁻¹X1ᵀY1 + (X1ᵀX1)⁻¹X1ᵀN2Y2(−D̂V̂2⁻¹)ᵀ corresponds to Vec¯(B1)s,2-stage. We also present the values of trace(S(B¯1,s)) for contrast, where B¯1,s = (X1ᵀX1)⁻¹X1ᵀY1 + (X1ᵀX1)⁻¹X1ᵀN2Y2(−DV2⁻¹)ᵀ corresponds to (2.4).
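One replication of simulation (I) can be sketched as follows. This is a simplified illustration, not the authors' exact code; the extra column L, the seed, and the sample size are hypothetical. Since B^1 minimizes trace(S(B)) over all B, the two-stage estimator can never have a smaller within-sample residual trace, in line with Table 1.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, rho = 50, 2, 0.7
B1 = np.array([[1., 1.], [1., 2.], [1., 3.]])
B2 = np.array([[1., 6.], [-3., 2.]])
X2 = rng.normal(size=(n, 2))
X1 = np.hstack([X2, rng.normal(size=(n, 1))])   # the case X1 = (X2, L)

V = np.block([[np.eye(q), rho * np.eye(q)],
              [rho * np.eye(q), np.eye(q)]])
E = rng.multivariate_normal(np.zeros(2 * q), V, size=n)
Y1, Y2 = X1 @ B1 + E[:, :q], X2 @ B2 + E[:, q:]

# Two-stage pieces: V_hat as in (3.1), then the D_hat and V2_hat blocks.
Xt = np.hstack([X1, X2])
M = np.eye(n) - Xt @ np.linalg.pinv(Xt)
Y = np.hstack([Y1, Y2])
V_hat = Y.T @ M @ Y / (n - np.linalg.matrix_rank(Xt))
D_hat, V2_hat = V_hat[:q, q:], V_hat[q:, q:]

N2 = np.eye(n) - X2 @ np.linalg.pinv(X2)
B1_hat = np.linalg.solve(X1.T @ X1, X1.T @ Y1)                   # LSE (1.3)
B1_bar_s2 = B1_hat + np.linalg.solve(X1.T @ X1, X1.T @ N2 @ Y2) \
            @ (-D_hat @ np.linalg.inv(V2_hat)).T                 # two-stage form

S = lambda B: np.trace((Y1 - X1 @ B).T @ (Y1 - X1 @ B))
assert S(B1_bar_s2) >= S(B1_hat) - 1e-9
```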

In Table 1, based on different combinations of the correlation ρ and the sample size, we present numerical comparisons of trace(S(B¯1,s,2-stage)) with trace(S(B^1)) and trace(S(B¯1,s)), exhibiting the performance of the simplified two-stage estimator Vec¯(B1)s,2-stage for relatively small and moderate sample sizes. We find that the performance of the two-stage estimator tends to improve as the sample size increases. It also depends on the correlation ρ; in particular, when n ≥ 20 and ρ ≥ 0.5 we see that |trace(S(B¯1,s,2-stage)) − trace(S(B¯1,s))| < |trace(S(B^1)) − trace(S(B¯1,s))|, which shows that the two-stage estimator Vec¯(B1)s,2-stage is closer to Vec¯(B1)s.

(II) The case that there are no relationships between X1 and X2

In this case we assume that the system (1.1) has the same form as (3.10), except that there is no relationship between X1 and X2. From Table 2, we see that trace(S(B¯1,s,2-stage)) gets closer to trace(S(B¯1,s)); this implies that the two-stage estimator Vec¯(B1)s,2-stage becomes better than the LSE Vec^(B1) as the sample size grows (n ≥ 20 or larger), and the improvement again depends on the value of the correlation ρ (ρ ≥ 0.5). This is because, from the viewpoint of covariance adjustment, the one-step covariance adjustment estimator Vec^(B1)(1), which is exactly equal to Vec¯(B1)s, has a smaller covariance than Vec^(B1) even when there is no relationship between X1 and X2. Hence, the simulation study discloses the tendency of Vec¯(B1)s,2-stage to perform better, consistent with a two-stage estimator that incorporates more information.

4. An illustrating example

The GCM is a generalized multivariate analysis-of-variance model which is especially useful for investigating growth problems with short time series in economics, biology, and medical research (see Lee and Geisser, 1972; Pan and Fang, 2007). The seemingly unrelated GCMs are defined as


where Yi are n × q observation matrices, Xi and Zi are known design matrices of full column rank and full row rank, respectively, and the regression parameters B1 and B2 are unknown. The assumptions on E1 and E2 are the same as those in the system (1.1).

Therefore, without considering the interactions between the two equations, we obtain the LSE of B1 from the first equation as


which is unbiased with covariance Cov(B^1) = Cov(Vec(B^1)) = (Z1V1⁻¹Z1ᵀ)⁻¹ ⊗ (X1ᵀX1)⁻¹. However, combining the information of the second equation and the assumption X1ᵀX2 = 0, we obtain the system LSE of B1 as


which is unbiased and has the smaller covariance


which is less than Cov(B^1) since V1⁻¹ ≤ (V1 − DV2⁻¹Dᵀ)⁻¹ = V^{11} and correspondingly (Z1V1⁻¹Z1ᵀ)⁻¹ ≥ (Z1V^{11}Z1ᵀ)⁻¹.
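For concreteness, the single-equation GCM estimator discussed above can be sketched as follows. We assume the standard form B^1 = (X1ᵀX1)⁻¹X1ᵀY1V1⁻¹Z1ᵀ(Z1V1⁻¹Z1ᵀ)⁻¹, which is consistent with the covariance (Z1V1⁻¹Z1ᵀ)⁻¹ ⊗ (X1ᵀX1)⁻¹ quoted above; the dimensions and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p1, m1, q = 40, 3, 2, 4
X1 = rng.normal(size=(n, p1))        # full column rank (a.s.)
Z1 = rng.normal(size=(m1, q))        # full row rank (a.s.)
B1 = rng.normal(size=(p1, m1))
V1 = np.eye(q)
Y1 = X1 @ B1 @ Z1 + rng.normal(size=(n, q))   # first growth curve equation

# Assumed estimator: B1_hat = (X1'X1)^{-1} X1' Y1 V1^{-1} Z1' (Z1 V1^{-1} Z1')^{-1}
V1i = np.linalg.inv(V1)
B1_hat = np.linalg.solve(X1.T @ X1, X1.T @ Y1) \
         @ V1i @ Z1.T @ np.linalg.inv(Z1 @ V1i @ Z1.T)
```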

In the case that the covariance matrix V is unknown, under the assumption that E = (E1, E2) ~ Nn,2q(0, In, V), we use an estimator of the same form as (3.1) to estimate V, which is easily shown to be unbiased. Hence, a two-stage estimator of B1 is defined as


where V̂^{11} = (V̂1 − D̂V̂2⁻¹D̂ᵀ)⁻¹ and V̂^{21} = −V̂2⁻¹D̂ᵀV̂^{11}. Analogous to the previous discussion, we can establish the unbiasedness of the estimator B¯1,2-stage.

In the following, we present a simulation study to compare the performance of B¯1,2-stage with that of B^1 under the matrix 2-norm criterion, where the 2-norm of a matrix A is given by ‖A‖2 = ‖Vec(A)‖2 = (Σi Σj aij²)^{1/2}. The performance of B¯1 is also presented as a contrast. In each simulation, a sample of size n is randomly generated from a 2q-variate normal distribution with mean zero and covariance matrix V, which serves as the error matrix E_{n×2q} = (E1, E2). Next, B^1, B¯1,2-stage, and B¯1 are calculated. The simulations are repeated 500 times, and the matrix 2-norms of the average values of B^1 − B1, B¯1,2-stage − B1, and B¯1 − B1 are given in Table 3.
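As defined above, this criterion coincides with the Frobenius norm of A; a one-line check (the matrix below is a hypothetical example):

```python
import numpy as np

# ||A||_2 = ||Vec(A)||_2 = (sum_ij a_ij^2)^(1/2), i.e. the Frobenius norm.
A = np.array([[3., 0.], [0., 4.]])
norm2 = np.sqrt((A ** 2).sum())
assert np.isclose(norm2, np.linalg.norm(A, "fro"))
```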

Three cases are studied: the first corresponds to n = 10, the second to n = 20, and the third to n = 50. All cases adopt the same V as in (3.10), but with the correlation ρ taking a number of alternative values.

Simulations for the case (i) with

X1ᵀ = (616-30219231925261711614-3022715202662425837-28125-4-2914),    X2ᵀ = (14256352582634216853),

B1 = ( 1  1
       1  2
       1  3 ),    B2 = (  1  6
                         −3  2 ),    Z1 = (1234),    Z2 = (1560.5).

Simulations for the case (ii) with X1T being




where B1, B2, Z1, and Z2 are the same as in case (i).

Simulations for the case (iii) with X1 = [a1, a2, a3]50×3 randomly generated and X2 = [a4, a5]50×2 obtained from the null space of X1ᵀ; in this case B1, B2, Z1, and Z2 remain the same as in case (i).

From Table 3, except for the situations ρ = 0.2 and ρ = 0.5 with n = 10, we find that norm(B¯1,2-stage − B1) is uniformly smaller than norm(B^1 − B1), which shows that the two-stage estimator B¯1,2-stage is closer to the true value B1 than the LSE B^1.

5. Concluding remarks

In summary, we have investigated the estimation of regression coefficients in the system of two multivariate SURs. We focus on the estimation of B1 since the positions of B1 and B2 are symmetric. In Section 1, we show that, by incorporating the information of the other equation, the estimator of the regression coefficients can be expressed as a matrix power series via the method of covariance adjustment. In Section 2, we further show that the matrix series has exactly one simpler form, which is just the one-step covariance adjustment estimator of the regression coefficients. In Section 3, for the case that the covariance matrix of the system is unknown, we show that the degenerate form of the two-stage estimator sequence is unique, propose an unbiased two-stage estimator, and present numerical simulations to verify its superiority. The results established in this paper enrich the existing literature since they include Zellner's univariate SURs as a special case.


Table 1

Comparisons between the two-stage estimator and the least square estimator

ρ n trace(S(B^1)) trace(S(B¯1,s,2-stage)) trace(S(B¯1,s))
0.2 10 12.7118 12.8775 12.7718
20 30.0456 30.2155 30.0873
50 141.1215 141.9674 141.3323

0.5 10 15.3561 15.4368 15.5200
20 35.3589 35.3747 35.3775
50 108.2523 108.2895 108.2894

0.7 10 7.7018 8.0020 7.9504
20 42.4375 45.8153 44.6528
50 103.9155 104.3501 104.2045

0.9 10 18.6055 18.6758 18.6689
20 35.6630 37.5563 37.2077
50 119.9993 122.4540 122.2700

Table 2

Comparisons between the two-stage estimator and the least square estimator

ρ n trace(S(B^1)) trace(S(B¯1,s,2-stage)) trace(S(B¯1,s))
0.2 10 16.4822 16.7363 16.5106
20 45.1567 45.3013 45.1656
50 90.6848 90.7475 91.0296

0.5 10 11.7975 12.4334 12.0509
20 30.6931 31.4717 31.2451
50 91.1112 91.1782 91.2101

0.7 10 15.0877 15.4275 15.2887
20 34.0316 34.9262 34.6478
50 78.8424 79.1979 79.4065

0.9 10 7.7416 8.8334 9.0277
20 37.1639 39.6293 38.6530
50 106.1452 106.7684 106.7741

Table 3

Comparisons between several estimators under the matrix 2 norm

ρ n norm(B^1 − B1) norm(B¯1,2-stage − B1) norm(B¯1 − B1)
0.2 10 0.1931 0.2668 0.1879
20 0.0183 0.0193 0.0182
50 0.2259 0.2266 0.2205

0.5 10 0.2013 0.2283 0.1679
20 0.0186 0.0177 0.0171
50 0.2557 0.2271 0.2198

0.7 10 0.1896 0.1872 0.1383
20 0.0184 0.0134 0.0127
50 0.2532 0.1820 0.1763

0.9 10 0.1923 0.1138 0.0847
20 0.0173 0.0085 0.0079
50 0.2471 0.1133 0.1103

  1. Alkhamisi, MA (2010). Simulation study of new estimators combining the SUR ridge regression and the restricted least squares methodologies. Statistical Papers. 51, 651-672.
  2. Baksalary, JK (1991). Covariance adjustment in biased estimation. Computational Statistics & Data Analysis. 12, 221-230.
  3. Fang, KT, and Zhang, YT (1990). Generalized Multivariate Analysis. Berlin and Beijing: Springer-Verlag and Science Press
  4. Gupta, AK, and Kabe, DG (1998). A note on a result for two SUR models. Statistical Papers. 39, 417-421.
  5. Lee, JC, and Geisser, S (1972). Growth curve prediction. Sankhya A. 34, 393-412.
  6. Liu, AY (2002). Efficient estimation of two seemingly unrelated regression equations. Journal of Multivariate Analysis. 82, 445-456.
  7. Liu, JS (2000). MSEM dominance of estimators in two seemingly unrelated regressions. Journal of Statistical Planning and Inference. 88, 255-266.
  8. Maddala, GS (1977). Econometrics. New York: McGraw-Hill
  9. Muirhead, RJ (1982). Aspects of Multivariate Statistical Theory. New York: Wiley and Sons
  10. Pan, JX, and Fang, KT (2007). Growth Curve Models and Statistical Diagnostics. Beijing: Science Press
  11. Percy, DF (1992). Prediction for seemingly unrelated regressions. Journal of the Royal Statistical Society Series B (Methodological). 54, 243-252.
  12. Rao, CR (1967). Least square theory using an estimated dispersion matrix and its application to measurement of signal. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, LeCam, LM, and Neyman, J, ed, pp. 355-372
  13. Revankar, NS (1974). Some finite sample results in the context of two seemingly unrelated regression equations. Journal of the American Statistical Association. 69, 187-190.
  14. Schmidt, P (1977). Estimation of seemingly unrelated regressions with unequal numbers of observations. Journal of Econometrics. 5, 365-377.
  15. Shukur, G, and Zeebari, Z (2012). Median regression for SUR models with the same explanatory variables in each equation. Journal of Applied Statistics. 39, 1765-1779.
  16. Srivastava, VK, and Giles, DEA (1987). Seemingly Unrelated Regression Equations Models. New York: Marcel Dekker
  17. Velu, R, and Richards, J (2008). Seemingly unrelated reduced-rank regression model. Journal of Statistical Planning and Inference. 138, 2837-2846.
  18. Wang, SG (1989). A new estimate of regression coefficients in seemingly unrelated regression system. Science in China Series A. 32, 808-816.
  19. Zellner, A (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association. 57, 348-368.
  20. Zellner, A (1963). Estimators of seemingly unrelated regression equations: some exact finite sample results. Journal of the American Statistical Association. 58, 977-992.
  21. Zhou, B, Xu, Q, and You, J (2011). Efficient estimation for error component seemingly unrelated nonparametric regression models. Metrika. 73, 121-138.