MBRDR: R-package for response dimension reduction in multivariate regression
Communications for Statistical Applications and Methods 2024;31:179-189
Published online March 31, 2024
© 2024 Korean Statistical Society.

Heesung Ahn, Jae Keun Yoo

Department of Statistics, Ewha Womans University, Korea
Correspondence to: Jae Keun Yoo, Department of Statistics, Ewha Womans University, 52 Ewhayeodae-gil, Seodaemun-gu, Seoul 03760, Korea. E-mail: peter.yoo@ewha.ac.kr
Received January 4, 2024; Revised February 3, 2024; Accepted February 3, 2024.
 Abstract
In multivariate regression with a high-dimensional response Y ∈ ℝ^r and a relatively low-dimensional predictor X ∈ ℝ^p (where r ≥ 2), the statistical analysis of such data presents significant challenges due to the exponential increase in the number of parameters as the dimension of the response grows. Most existing dimension reduction techniques focus on reducing the dimension of the predictors X, not that of the response Y. Yoo and Cook (2008) introduced a response dimension reduction method that preserves information about the conditional mean E(Y|X). Building upon this foundational work, Yoo (2018) proposed two semi-parametric methods, principal response reduction (PRR) and principal fitted response reduction (PFRR), which Yoo (2019) later extended to unstructured principal fitted response reduction (UPFRR). This paper reviews these four response dimension reduction methodologies and introduces the implementation of the mbrdr package in R. The mbrdr package is a unique tool in the R community: it is specifically designed for response dimension reduction, setting it apart from existing dimension reduction packages that focus solely on predictors.
Keywords : multivariate regression, nonparametric reduction, principal response reduction, principal fitted response reduction, unstructured principal fitted response reduction, R-package
1. Introduction

Sufficient dimension reduction (SDR) in regression replaces the p-dimensional predictor X with a lower-dimensional linear projection M^T X without loss of information on Y|X; this is expressed as Y ⫫ X | M^T X. For any M ∈ ℝ^{p×d} satisfying this relation, the subspace S(M) spanned by the columns of M is called a sufficient dimension reduction (SDR) subspace. The intersection of all possible dimension reduction subspaces is called the central subspace, which is the target of SDR. Sliced inverse regression (SIR) (Li, 1991) and sliced average variance estimation (SAVE) (Cook and Weisberg, 1991) are among the most popular SDR methods: SIR uses the inverse mean E(X|Y) and SAVE uses the inverse covariance cov(X|Y) to estimate the central subspace.

Multivariate regression of a multi-dimensional response Y ∈ ℝ^r on X ∈ ℝ^p, with r ≥ 2, is common in many fields, including repeated measures, longitudinal studies, time series, and functional data analysis. However, the analysis of such data is often challenging: the dimension of the response is high while the dimension of the predictors is relatively low, so the number of parameters grows exponentially as the dimension of the response increases. Reducing the dimension of the responses while still capturing the information in the regression is therefore helpful in handling such data. Unfortunately, most dimension reduction methodologies, including SIR and SAVE, focus on reducing the dimension of the predictors X, not of the response Y.

Yoo and Cook (2008) proposed a response dimension reduction methodology for multivariate regression that preserves information on the conditional mean E(Y|X). They defined two types of response dimension reduction, called linear and conditional response reduction, and provided a non-parametric method to estimate the linear response reduction.

Yoo (2013) investigated the theoretical relation between linear and conditional response reduction under the envelope model setting (Cook et al., 2010) and opened the possibility of model-based response dimension reduction. Following this seminal work, Yoo (2018) proposed two semi-parametric response dimension reduction approaches, called principal response reduction (PRR) and principal fitted response reduction (PFRR), and showed through numerical studies and a real data example that these semi-parametric approaches outperform the non-parametric method of Yoo and Cook (2008). Yoo (2019) developed another model-based approach called unstructured principal fitted response reduction (UPFRR), which does not assume the covariance structure required by PFRR in Yoo (2018).

For response dimension reduction in multivariate regression, the mbrdr package has recently been developed in R. It implements the Yoo-Cook method and the three versions of model-based response dimension reduction mentioned above. The mbrdr package is available on CRAN (https://cran.r-project.org/web/packages/mbrdr/index.html). It is the very first package for response dimension reduction in the sufficient dimension reduction context; the existing dr package reduces only the dimension of the predictors, not the responses. This uniqueness makes mbrdr valuable in the R community.
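The package can be installed from CRAN and loaded in the usual way:

#install once, then load
install.packages("mbrdr")
library(mbrdr)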

The organization of the paper is as follows. Section 2 reviews the four response dimension reduction methodologies mentioned above: the Yoo-Cook method, PRR, PFRR, and UPFRR. The implementation of mbrdr is introduced in Section 3, and Section 4 summarizes the work.

We will use the following notation throughout the rest of the paper. A p-dimensional random variable X is denoted as X ∈ ℝ^p, so X ∈ ℝ^p refers to a random variable even when this is not stated explicitly. For X ∈ ℝ^p and Y ∈ ℝ^r, Σ_x and Σ_y denote the covariance matrices of X and Y, respectively, and both are assumed to be positive-definite.

2. Collection of response dimension reduction methodologies in mbrdr

2.1. Yoo-Cook method

For a multivariate regression of Y ∈ ℝ^r on X ∈ ℝ^p, suppose there exists an r × q matrix L with the smallest rank among all matrices satisfying the following relation for E(Y|X):

E(Y|X) = E{P_L(Σ_y) Y | X},   (2.1)

where q ≤ r and P_L(Σ_y) = L(L^T Σ_y L)^{-1} L^T Σ_y is the orthogonal projection operator relative to the inner product ⟨ω_1, ω_2⟩_Σy = ω_1^T Σ_y ω_2.

Equation (2.1) indicates that the predictors X affect the components of the conditional mean E(Y|X) only through P_L(Σ_y). Hence the q-dimensional linearly transformed response P_L(Σ_y) Y can replace the original response Y without loss of information on E(Y|X). In Yoo and Cook (2008), this response dimension reduction is called a linear response reduction.
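As a minimal numerical sketch (assuming a basis matrix L and a covariance matrix Sigma.y are given; this is not code from the package), the projection in (2.1) can be formed directly:

#P = L (L^T Sigma_y L)^{-1} L^T Sigma_y, the projection onto span(L)
#relative to the Sigma_y inner product
proj.L <- function(L, Sigma.y) {
  L %*% solve(t(L) %*% Sigma.y %*% L) %*% t(L) %*% Sigma.y
}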

Next, suppose that there exists an r × k matrix K satisfying the following equivalences:

E(Y|X) = E{E(Y|X, K^T Y) | X} = E{E(Y|K^T Y) | X} = E{g(K^T Y) | X},   (2.2)

where k ≤ r, K ≠ I_r and g(·) is an unknown function.

By the last equivalence, another dimension reduction of Y is possible if k < r, and this response reduction is called a conditional response reduction. In Yoo and Cook (2008), the column spaces of L and K are called response dimension reduction subspaces.

Yoo and Cook (2008) prove that the column spaces of L and K in equations (2.1) and (2.2), respectively, coincide under the following condition: A1. E(Y|K^T Y = a) is linear in a. The condition holds if Y is elliptically distributed; if condition A1 is not satisfied, Y is usually power-transformed toward normality. Under condition A1, the quantity Σ_y^{-1} cov(Y, X) Σ_x^{-1} is proposed in Yoo and Cook (2008) to estimate L and K.
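For illustration only (a rough sketch, not the package's internal estimator), a sample version of this quantity and an orthonormal basis of its column space can be computed as:

#sample version of Sigma_y^{-1} cov(Y, X) Sigma_x^{-1}; its leading left
#singular vectors span an estimate of the response reduction subspace
yc.kernel <- function(Y, X) {
  M <- solve(cov(Y)) %*% cov(Y, X) %*% solve(cov(X))
  svd(M)$u
}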

2.2. Principal response reduction

We consider the following multivariate regression model, assuming E(Y) = 0 and E(X) = 0 without loss of generality:

Y = Γν_x + ε,   (2.3)

where Γ ∈ ℝ^{r×d} with Γ^T Γ = I_d and d ≤ r, ε ~ N(0, Σ), cov(ν_x, ε) = 0, and ν_x is a d-dimensional unknown random function of the predictors X with a positive definite sample covariance matrix and sample mean zero, ∑_x ν_x = 0. If ν_x = X, the model in (2.3) reduces to the multivariate linear regression model.

A crucial assumption required for the model in (2.3) is that S(Γ) is an invariant and reducing subspace of Σ. This guarantees that Σ = ΓΩΓ^T + Γ_0Ω_0Γ_0^T, where Γ_0 ∈ ℝ^{r×(r−d)} with Γ_0^T Γ_0 = I_{r−d} and Γ_0^T Γ = 0, Ω = Γ^T ΣΓ and Ω_0 = Γ_0^T ΣΓ_0.

According to Yoo (2018), for model (2.3) we have E(Y|X) = E{P_Γ(Σ_y) Y | X}. That is, the original response Y can be reduced through Γ without loss of information on E(Y|X).

Then, the unknown Γ in model (2.3) is estimated by maximizing its likelihood function, because the normality of ε is assumed. Denote Σ̂_y the usual moment estimator of Σ_y. Yoo (2018) proves that the maximum likelihood estimator (MLE) of Γ is the set of eigenvectors corresponding to the d largest eigenvalues of Σ̂_y. This dimension reduction is called principal response reduction (PRR).
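A minimal sketch of PRR (assuming a response data matrix Y with n rows):

#PRR: the MLE of Gamma is given by the eigenvectors of the moment
#estimator of Sigma_y for the d largest eigenvalues
prr.basis <- function(Y, d) {
  eigen(cov(Y))$vectors[, 1:d, drop = FALSE]
}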

2.3. Principal fitted response reduction (PFRR)

For PRR, the information in X is excluded from the estimation of Γ. To incorporate the predictors X, it is assumed that ν_x = ψf_x:

Y = Γψf_x + ε,   (2.4)

where ψ is an unknown d × q matrix, and f_x ∈ ℝ^q is a known q-dimensional vector-valued function of X with ∑_x f_x = 0. For convenience, the following notation is defined.

  • Y (n × r): the data matrix for the responses.

  • X (n × p): the data matrix for the predictors.

  • F (q × n): the matrix constructed by stacking f_x over the sample.

  • Σ̂_fit: the sample covariance matrix of the fitted values from regressing the responses on f_x, and Σ̂_res = Σ̂_y − Σ̂_fit.

As candidates for f_x, Yoo (2018) considers X, X², exp(X), their combinations, and the cluster indicators of X constructed from the K-means clustering algorithm. If f_x = X, the fit is equal to the ordinary least squares fit, and hence Σ̂_fit is the regression sum of squares and cross-products matrix.

The MLE of Γ in model (2.4) does not have a closed form. According to Yoo (2018), the log-likelihood as a function of Γ is

L(Γ, Γ_0) = −(n/2) log |Γ_0^T Σ̂_y Γ_0| − (n/2) log |Γ^T Σ̂_res Γ|.

Therefore, the MLE of Γ is clearly affected by both Σ̂_y and Σ̂_res. A sequential selection algorithm over the set of all eigenvectors of Σ̂_y, Σ̂_fit and Σ̂_res is adopted from Cook (2007, Section 6.2). This approach to estimating Γ is called principal fitted response reduction (PFRR).

2.4. Unstructured principal fitted response reduction

We consider the following model, assuming ε ~ N(0, Σ) with Σ > 0 and cov(ν_x, ε) = 0:

Y = Γν_x + ε.   (2.5)

The difference between models (2.3) and (2.5) lies in the structure of Σ: in model (2.5), the condition that Σ = ΓΩΓ^T + Γ_0Ω_0Γ_0^T is no longer required.

In Yoo (2019), it is shown that S(Γ) is an invariant and reducing subspace of Σ if and only if it is an invariant and reducing subspace of Σ_y; that is, the invariance condition for Σ is equivalent to that for Σ_y. Yoo (2019) further establishes that E(Y|X) = E{P_Γ(Σ_y) Y | X} under model (2.5) if the invariance of S(Γ) for Σ_y holds. So, hereafter, the invariance condition of Γ for Σ_y will be assumed in model (2.5).

To incorporate the information of X in the estimation of Γ, its fitted-component version is constructed as follows:

Y = Γψf_x + ε.   (2.6)

We define the following quantities:

  • For a matrix E, E_d denotes the matrix of eigenvectors corresponding to the d largest eigenvalues of E, and S(E_d) the column subspace of E_d.

  • B = Σ̂^{-1/2} Σ̂_fit Σ̂^{-1/2}, B_res = Σ̂_res^{-1/2} Σ̂_fit Σ̂_res^{-1/2}, and B_y = Σ̂_y^{-1/2} Σ̂_fit Σ̂_y^{-1/2}.

  • Λ = (λ_1, . . . , λ_q) and V = (γ_1, . . . , γ_q) are the ordered eigenvalues and corresponding eigenvectors of B_res.

  • K_d = diag(0, . . . , 0, λ_{d+1}, . . . , λ_q).

Then, under model (2.6), the following results are derived in Yoo (2019):

  • Σ̂ = Σ̂_res + Σ̂_res^{1/2} V̂ K̂_d V̂^T Σ̂_res^{1/2} = Σ̂_res^{1/2} (I_r + V̂ K̂_d V̂^T) Σ̂_res^{1/2}.

  • L^d_UPFRR = −(n/2) log |Σ̂_res| − (n/2) ∑_{i=d+1}^{q} log(1 + λ̂_i), which follows from |Σ̂| = |Σ̂_res| ∏_{i=d+1}^{q} (1 + λ̂_i).

The response reduction through model (2.6) will be called unstructured principal fitted response reduction (UPFRR).
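A minimal sketch of these UPFRR quantities (assuming Σ̂_fit and Σ̂_res computed as in the PFRR sketch above; only the leading min(q, r) eigenvalues of B_res are nonzero):

#maximized log-likelihood of UPFRR at dimension d
upfrr.loglik <- function(S.res, S.fit, d, n) {
  e   <- eigen(S.res)
  Shi <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)  #S.res^(-1/2)
  lam <- eigen(Shi %*% S.fit %*% Shi, symmetric = TRUE)$values    #eigenvalues of B_res
  q   <- sum(lam > 1e-12)                      #number of nonzero eigenvalues
  tail <- if (d < q) sum(log(1 + lam[(d + 1):q])) else 0
  -(n / 2) * log(det(S.res)) - (n / 2) * tail
}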

3. Illustration of mbrdr package

3.1. Outline of mbrdr package

The R package mbrdr implements the four response dimension reduction methodologies discussed in Section 2. Its arguments are as follows:

mbrdr(formula, method = "upfrr", data, subset, na.action = na.fail, weights)

The main function mbrdr creates an object of class “mbrdr” with one of four subclasses depending on the value of method. The values of method and the resulting subclasses are listed below; the default is "upfrr".

method = "yc": Yoo-Cook method/ “yc” subclass
method = "prr": principal response reduction / “prr” subclass
method = "pfrr": principal fitted response reduction / “pfrr” subclass
method = "upfrr": unstructured principal fitted response reduction / “upfrr” subclass.

The function mbrdr provides the eigenvectors for response dimension reduction and test statistics for deciding its dimension. For the estimation of the true dimension d, the cumulative sum of eigenvalues is provided for all four methods. The methods follow a sequence of hypothesis tests, H0: dy = m vs. H1: dy = r (Rao, 1965). Starting from m = 0, if H0 is rejected, m is incremented by 1 and the test is repeated; dy is determined as the first m for which H0 is not rejected. The test statistic of the Yoo-Cook method is given in Yoo and Cook (2008, Section 3.3). Unlike the Yoo-Cook method, prr, pfrr, and upfrr use maximum likelihood estimation of Γ, and these three methods adopt the likelihood ratio test (LRT) to decide the optimal dimension for reduction. For pfrr and upfrr, d can be estimated by χ² statistics with q(r − m) and (q − m)(r − m) degrees of freedom, respectively. However, the χ² statistic cannot be applied to prr: Ω is not estimable there, so the covariance matrix cannot be estimated either. In pfrr, by contrast, Ω and Ω_0 can be estimated by Γ̂^T Σ̂_res Γ̂ and Γ̂_0^T Σ̂_y Γ̂_0, respectively. For this reason, the mbrdr package provides the χ²-test for the dimension only for method = "pfrr" and method = "upfrr".
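A minimal sketch of this sequential rule (assuming a vector of p-values whose (m+1)-th entry tests H0: dy = m):

#return the smallest m whose H0: d = m is not rejected at level alpha
choose.dim <- function(pvals, alpha = 0.05) {
  for (m in seq_along(pvals) - 1) {
    if (pvals[m + 1] > alpha) return(m)
  }
  length(pvals)   #every H0 rejected: take the full dimension
}

For instance, the p-values in Table 1(b) give choose.dim(c(0.0000, 0.1819, 0.9319, 0.8950)) = 1.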

To fit method = "pfrr" and method = "upfrr" with the mbrdr function, users should select f_x via the fx.choice option. The option fx.choice is implemented through the function choose.fx, which returns the n × q matrix of f_x; q depends on the choice of f_x. When users run mbrdr with fx.choice specified, choose.fx is invoked automatically. Inside choose.fx, the predictors X are normalized to have zero sample means and a sample correlation matrix in place of a sample covariance matrix. The option fx.choice takes the following four values.

fx.choice = 1: f_x = X.
fx.choice = 2: f_x = (X, X²).
fx.choice = 3: f_x = (X, exp(X)).
fx.choice = 4: (c − 1) dummy variables for c-cluster indicators constructed through K-means clustering of X.
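As a rough illustration of what these choices correspond to (assuming a predictor matrix X; the actual construction is done internally by choose.fx):

Xs  <- scale(X)                           #normalized predictors
fx1 <- Xs                                 #fx.choice = 1
fx2 <- cbind(Xs, Xs^2)                    #fx.choice = 2
fx3 <- cbind(Xs, exp(Xs))                 #fx.choice = 3
cl  <- kmeans(Xs, centers = 5)$cluster    #fx.choice = 4 with nclust = 5
fx4 <- model.matrix(~ factor(cl))[, -1]   #(c - 1) dummy variables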

For fx.choice = 4, nclust should be specified in the mbrdr function. The command set.seed(0) will be used to reproduce the same clustering results in the following examples. The value of fx.choice must be one of 1, 2, 3 and 4. If users want to supply their own f_x other than those in fx.choice, the option fx should be used; if a matrix is given in fx, it supersedes any value of fx.choice. The default values for fx.choice, nclust and fx are 1, 5 and NULL, respectively. The candidate function f_x can be chosen by examining scatter plot matrices between Y and X: if a linear trend is observed, f_x = X is recommended; in the case of a non-linear trend, f_x = (X, X²), exp(X), or (X, exp(X)) may be considered; and if there is no pattern between Y and X, the cluster indicators from the K-means clustering algorithm can be used as f_x. According to numerical studies, f_x = X is a suitable default choice. For illustration of the mbrdr function, the data set named mps is included in the mbrdr package, and we apply the package to the mps data in the following section.

3.2. Real data example 1: Minneapolis elementary school data

To illustrate the outputs of mbrdr, the Minneapolis elementary school data from 1972 are adopted. The data set, named mps, is available in mbrdr and consists of 63 observations on 4 responses and 11 predictors. The responses are A4 and B4, the percentages of 4th graders scoring above and below average on a standard 4th grade vocabulary test in 1972, and A6 and B6, the percentages of 6th graders scoring above and below average on a standard 6th grade comprehension test in 1972. Nine predictors are used in the model: (1) the percentage of children receiving aid to families with dependent children (AFDC), (2) the average percentage of children in attendance during the year (Attend), (3) the percentage of children in the school not living with both parents (B), (4) the number of children enrolled in the school (Enrol), (5) the percentage of adults in the school area who have completed high school (HS), (6) the percentage of minority children in the area (Minority), (7) the percentage of children who started in a school but did not finish there (Mobility), (8) the percentage of persons in the school area who are above the federal poverty level (Poverty), and (9) the pupil-teacher ratio (PTR). This data set is well suited to response dimension reduction, since the dimension of the response is relatively high considering the sample size and the dimension of the predictors.

The function mbrdr with the default options, method = "upfrr" and fx.choice = 1, is called as follows; it uses the predictors X in normalized form.

library(mbrdr)
attach(mps)
X <- cbind(AFDC, Attend, B, Enrol, HS, Minority, Mobility, Poverty, PTR)
Y <- cbind(A4, B4, A6, B6)
rdr0 <- mbrdr(Y ~ X)
summary(rdr0)

The results of the function mbrdr are in Table 1. The χ²-test determines that dy = 1 with p-value 0.0000 at significance level α = 0.05. Users can then extract the eigenvector that reduces the dimension of Y with the command summary(rdr0)$evectors[,1]. Interpreting the response reduction through the eigenvectors in Table 1(a), the first direction is a linear combination of the 4 responses that places more weight on A6 and B6 than on the others; the second direction barely involves B4; and the third has a strong focus on A4 and little weight on B6.
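For example, the one-dimensional reduced response can be formed directly:

#reduce Y with the first estimated direction
d1 <- summary(rdr0)$evectors[, 1]
Y.red <- Y %*% d1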

When users want to use the predictors without normalization, they should specify fx in the mbrdr function. In the following code, fx0 is the original predictor matrix X. The function update(object, ...) can be used to update the previous call and re-fit the model.

#non-normalized fx
fx0 <- as.matrix(mps[, c(5:7, 9:14)])
rdr.upfrr.fx0 <- update(rdr0, fx = fx0)
summary(rdr.upfrr.fx0)

The test statistics, p-values and eigenvectors in Table 2 are similar to those for the normalized version in Table 1; the test result is likewise dy = 1 with p-value 0.0000.

#fitting pfrr, fx.choice = 1
set.seed(0)
rdr.pfrr1 <- update(rdr0, method = "pfrr", fx.choice = 1)
summary(rdr.pfrr1)

For method = "pfrr", users should specify fx.choice; additionally, if fx.choice = 4, nclust should be given. In Table 3, dy is returned as 1, because the test of d = 0 versus d ≥ 1 is rejected at the 5% level while the test of d = 1 versus d ≥ 2 is not. On the other hand, in Table 4, which implements method = "pfrr", fx.choice = 4, nclust = 6, the χ²-test fails to reject even the hypothesis d = 0, indicating that response dimension reduction is not appropriate for the f_x specified. For nclust = 8, 9 and 10, the χ²-test of d = 0 versus d ≥ 1 is significant, and the dimension is determined to be 1 at α = 0.05.

#fitting pfrr, fx.choice = 4, nclust = 6
set.seed(0)
rdr.pfrr4.6 <- update(rdr0, method = "pfrr", fx.choice = 4, nclust = 6)
summary(rdr.pfrr4.6)

#fitting yoo-cook
rdr.yc <- update(rdr0, method = "yc")
summary(rdr.yc)

#fitting prr
rdr.prr <- update(rdr0, method = "prr")
summary(rdr.prr)

For method = "yc" and method = "prr", the χ²-test is not provided. Instead, the cumulative sums of the eigenvalues corresponding to the eigenvectors in Tables 5 and 6 are returned in stats; Tables 5 and 6 show these cumulative sums for the Yoo-Cook method and prr, respectively. The dimension d is determined by min(which(summary(rdr.yc)$stat/sum(rdr.yc$evalues) > 0.95)). Although the cumulative sums of eigenvalues in Tables 5(b) and 6(b) look completely different, the cumulative proportions are similar. The cumulative proportion is calculated by summary(rdr.yc)$stat/sum(rdr.yc$evalues), which is part of the command for choosing d. For the Yoo-Cook method, the cumulative proportions are 63.01, 10.05 and 1.22 for d = 0, 1, 2, respectively; for prr, they are 63, 12.76 and 3.94. At the 95% level, d for both the Yoo-Cook method and prr is determined to be 1.
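The command above can be wrapped as a small helper (a sketch, using the rdr.yc fit from earlier):

#smallest d whose cumulative proportion of eigenvalues exceeds the level
choose.d.cumprop <- function(fit, level = 0.95) {
  min(which(summary(fit)$stat / sum(fit$evalues) > level))
}
choose.d.cumprop(rdr.yc)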

3.3. Real data example 2: Epilepsy data

Another real data analysis is performed with the epilepsy data from Thall and Vail (1990). The data set concerns a clinical trial of 59 epilepsy patients, randomly divided into two groups: one received the anti-epileptic drug progabide and the other a placebo. The number of seizures during the preceding two weeks was recorded four times, covering a total of eight weeks. The four two-week seizure counts are the response variables, while the treatment (placebo or progabide), the baseline seizure count, and age are the predictors. The treatment variable is coded 0 for placebo and 1 for progabide, and the baseline seizure count and age are used on the logarithm scale. Response dimension reduction suits this data set because the number of responses exceeds the number of predictors. In this section, only the UPFRR method is applied.

#fitting upfrr
#(assumes the Thall-Vail epilepsy data are loaded as epil in wide form,
# with two-week seizure counts y1-y4 and covariates trt, base, age, id)
Y_ep <- cbind(epil$y1, epil$y2, epil$y3, epil$y4)
X_ep <- cbind(epil$trt, log(epil$base), log(epil$age))
rdr_ep0 <- mbrdr(Y_ep ~ X_ep, data = epil)
summary(rdr_ep0)

The χ²-test for UPFRR determines that dy = 1 with p-value 0.0000 at significance level α = 0.05. The scatter plot in Figure 1 between the dimension-reduced response and the baseline predictor suggests that patient #207, whose baseline count is much higher than the other patients', is an outlier. After deleting this observation, the choice of dimension dy is still 1, but the eigenvectors change substantially; the result after eliminating the outlier is in Table 8. The dimension-reduced response can be obtained with the command Y_ep %*% rdr_ep$evectors[,1].
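A sketch reproducing the plot in Figure 1 (axis labels assumed):

#dimension-reduced response against the log baseline seizure count
plot(log(epil$base), Y_ep %*% rdr_ep0$evectors[, 1],
     xlab = "log(baseline seizure count)", ylab = "reduced response")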

#epil0: the original epilepsy data as loaded above; patient 207 is removed
epil <- epil0[-which(epil0$id == 207), ]
Y_ep <- cbind(epil$y1, epil$y2, epil$y3, epil$y4)
X_ep <- cbind(epil$trt, log(epil$base), log(epil$age))

#upfrr after eliminating outlier
rdr_ep <- mbrdr(Y_ep ~ X_ep, data = epil)
summary(rdr_ep)
Y_ep %*% rdr_ep$evectors[, 1]
4. Discussion

In multivariate regression of Y ∈ ℝ^r on X ∈ ℝ^p with r ≥ 2, the multi-dimensional response Y makes the analysis difficult because of the curse of dimensionality. Response dimension reduction helps to relieve this difficulty as long as it retains the original information on E(Y|X). Recently, many response dimension reduction methodologies have been developed. Section 2 reviews four of them: the Yoo-Cook method, principal response reduction (PRR), principal fitted response reduction (PFRR), and unstructured principal fitted response reduction (UPFRR).

In Section 3.2, the R package mbrdr is described using the data set mps, which is available in the package, and another real data analysis is conducted with the epilepsy data in Section 3.3 for a clearer understanding of the package. The package mbrdr implements the four response dimension reduction methods reviewed in Section 2, providing the test for the optimal dimension d and the eigenvectors for response dimension reduction, and thus enables users to fit these methods in multivariate regression.

Acknowledgements

For Jae Keun Yoo and Heesung Ahn, this work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean Ministry of Education (RS-2023-00240564 and RS-2023-00217022).

Figures
Fig. 1. Scatterplot between dimension-reduced response and baseline.
TABLES

Table 1

The result for mbrdr in upfrr

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 0.3336 −0.376512 0.8292
B4 −0.3784 0.001862 0.398
A6 0.6515 −0.489923 0.3486
B6 −0.5666 −0.786264 −0.1789
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 144.013 36 0.0000
1D vs ≥ 2D 30.087 24 0.1819
2D vs ≥ 3D 7.071 14 0.9319
3D vs ≥ 4D 2.253 6 0.8950

Table 2

The result for mbrdr in non-normalized upfrr

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 0.3082 −0.34713 0.8836
B4 −0.3578 −0.05721 0.1658
A6 0.6721 −0.52751 −0.4205
B6 −0.5704 −0.77328 −0.1221
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 142.730 36 0.0000
1D vs ≥ 2D 29.038 24 0.2187
2D vs ≥ 3D 6.186 14 0.9616
3D vs ≥ 4D 1.645 6 0.9493

Table 3

The result for mbrdr in pfrr (fx.choice = 1)

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 −0.4310 −0.2592 0.6927
B4 0.3335 −0.6947 −0.4140
A6 −0.6166 −0.5494 −0.1943
B6 0.5681 −0.3852 0.5577
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 144.013 36 0.0000
1D vs ≥ 2D 38.489 27 0.0704
2D vs ≥ 3D 23.617 18 0.1680
3D vs ≥ 4D 8.985 9 0.4386

Table 4

The result for mbrdr in pfrr (fx.choice = 4, nclust = 6)

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 −0.2650 0.4786 0.5218
B4 −0.6901 −0.3662 0.4824
A6 −0.5577 0.5896 −0.5377
B6 −0.3774 −0.5378 −0.4538
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 22.217 20 0.3288
1D vs ≥ 2D 14.344 15 0.4996
2D vs ≥ 3D 6.670 10 0.7562
3D vs ≥ 4D 2.117 5 0.8327

Table 5

The result for mbrdr in Yoo-Cook method

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 −0.5511 0.6679 0.4138
B4 −0.5971 −0.6192 0.4102
A6 0.4311 −0.2800 0.6328
B6 0.3924 0.3035 0.5099
(b) The cumulative sum of eigenvalues

Stat
H0: d = 0 0.2076
H0: d = 1 0.0331
H0: d = 2 0.004

Table 6

The result for mbrdr in prr

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
A4 0.4786 0.6617 0.2650
B4 −0.3662 −0.3895 0.6901
A6 0.5896 −0.2356 0.5577
B6 −0.5378 0.5957 0.3774
(b) The cumulative sum of eigenvalues

Stat
H0: d = 0 39187
H0: d = 1 7936
H0: d = 2 2455

Table 7

The result for mbrdr in upfrr

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
y1 0.3307 0.78189 −0.4145
y2 0.4992 0.61812 −0.5210
y3 0.5108 0.07975 −0.1132
y4 0.6169 0.01504 0.7375
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 42.970 12 0.0000
1D vs ≥ 2D 4.427 5 0.6191
2D vs ≥ 3D 1.344 2 0.5106
3D vs ≥ 4D 0.000 0 1.0000

Table 8

The result for mbrdr in upfrr after deleting an outlier

(a) The eigenvector for the dimension reduction

Dir1 Dir2 Dir3
y1 0.5365 0.7467 −0.2838
y2 0.2959 −0.5954 −0.4050
y3 0.4172 −0.2229 −0.4960
y4 0.6712 −0.1959 0.7137
(b) The χ2-test result for the dimension

Stat df p-value
0D vs ≥ 1D 63.813 12 0.0000
1D vs ≥ 2D 6.519 6 0.3676
2D vs ≥ 3D 0.969 2 0.6160
3D vs ≥ 4D 0.000 0 1.0000

References
  1. Cook RD (2007). Fisher lecture: Dimension reduction in regression. Statistical Science, 22, 1-26.
  2. Cook RD, Li B, and Chiaromonte F (2010). Envelope models for parsimonious and efficient multivariate linear regression. Statistica Sinica, 20, 927-960.
  3. Cook RD and Weisberg S (1991). Sliced inverse regression for dimension reduction: Comment. Journal of the American Statistical Association, 86, 328-332.
  4. Li KC (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316-327.
  5. Rao CR (1965). Linear Statistical Inference and Its Applications, New York, Wiley.
  6. Thall PF and Vail SC (1990). Some covariance models for longitudinal count data with overdispersion. Biometrics, 46, 657-671.
  7. Yoo JK and Cook RD (2008). Response dimension reduction for the conditional mean in multivariate regression. Computational Statistics & Data Analysis, 53, 334-343.
  8. Yoo JK (2013). A theoretical view of the envelope model for multivariate linear regression as response dimension reduction. Journal of the Korean Statistical Society, 42, 143-148.
  9. Yoo JK (2018). Response dimension reduction: Model-based approach. Statistics: A Journal of Theoretical and Applied Statistics, 52, 409-425.
  10. Yoo JK (2019). Unstructured principal fitted response reduction in multivariate regression. Journal of the Korean Statistical Society, 48, 561-567.