A note on the test for the covariance matrix under normality

Hyo-Il Park

Department of Statistics, Cheongju University, Korea
Correspondence to: Department of Statistics, Cheongju University, 298 Dae-Sung Ro, Cheongwon-gu, Cheongju-si, Chungcheongbuk-do 28503, Korea. E-mail: hipark@cju.ac.kr
Received October 11, 2017; Revised December 4, 2017; Accepted December 5, 2017.
Abstract

In this study, we consider the likelihood ratio test for the covariance matrix of the multivariate normal data. For this, we propose a method for obtaining null distributions of the likelihood ratio statistics by the Monte-Carlo approach when it is difficult to derive the exact null distributions theoretically. Then we compare the performance and precision of distributions obtained by the asymptotic normality and the Monte-Carlo method for the likelihood ratio test through a simulation study. Finally we discuss some interesting features related to the likelihood ratio test for the covariance matrix and the Monte-Carlo method for obtaining null distributions for the likelihood ratio statistics.

Keywords : asymptotic normality, likelihood ratio test, Monte-Carlo method, multivariate data
1. Introduction

Inference about a scale parameter or variance in the univariate case has been studied extensively in the literature. In particular, the likelihood ratio (LR) procedure for testing the variance has been fully developed, and its efficiency and uniqueness under normality have been verified. For example, LR test statistics follow chi-square distributions exactly under the null hypothesis, and the LR tests themselves are optimal in the sense of power. However, for the multivariate case under multivariate normality, only asymptotic procedures are available, even though the statistics for the covariance matrix have been derived by applying the LR principle. The reason may be that the distributional theory for matrix-valued statistics has not been fully investigated. Nor is theoretical progress likely in the near future, because of the complexity or non-existence of distributions for matrix-valued statistics. For this reason, several modifications for high-dimensional cases have been reported (Bai et al., 2009; Cai and Ma, 2013; Gupta and Bodnar, 2014), and the bootstrap, a re-sampling method, has been applied (Beran and Srivastava, 1985). Pinto and Mingoti (2015) also compared the asymptotic LR test with the VMAX proposed by Costa and Machado (2008).

For tests of the covariance matrix, the LR functions have mainly been expressed in terms of the eigenvalues of the sample covariance matrix. Even though these eigenvalues form a vector rather than a matrix, the distributions and properties of the LR functions or related statistics have not been fully investigated or justified theoretically. All results to date have been confined to limiting distributions based on log likelihood arguments. We may therefore question how close the limiting distributions are to the exact ones, if the latter are obtainable, and whether conclusions based on the limiting distributions are reliable when the p-value is very close to the significance level, say 0.05. For these reasons, a sensible method for obtaining the null distributions of LR functions or related statistics is needed.

In multivariate analysis under normality, the distributions of LR statistics have been fully studied and tabulated systematically for many cases. When it is difficult to derive the exact distribution theoretically, one may instead derive the limiting distribution using log likelihood arguments. Alternatively, one may obtain null distributions using one of the popular re-sampling methods, such as the bootstrap or permutation methods, which depend heavily on computing power. One may also obtain the null distribution of an LR statistic, LR, by the following idea. Let E0(LR) be the expectation of LR under the null hypothesis. Since we can generate pseudo random vectors under the scenario of the null hypothesis, the value of LR computed from the generated vectors may be considered an unbiased estimator of E0(LR) under the null hypothesis. If we iterate this process many times, say M times, we obtain a sample of size M from a population with mean E0(LR) but unknown distribution function, say G, under the null hypothesis. From this sample, one may construct an empirical distribution function, ĜM, which can be considered an estimator of G. Since ĜM is a consistent estimator of G by the Glivenko-Cantelli lemma (Chung, 2001), one can consistently estimate a quantile or critical value for any given probability or significance level. We call this process of obtaining the null distribution of an LR statistic the Monte-Carlo (MC) method.

In this research, we consider obtaining null distributions of the LR functions for testing covariance matrices under the multivariate normal distribution and compare them with the limiting distributions. The rest of this paper is organized as follows. In the next section, we review the LR tests with limiting distributions in some detail and propose the MC method to obtain quantiles as critical values for the LR functions. In Section 3, we illustrate the use of these distributions with numerical examples for deciding the structure of covariance matrices and compare the precision of the two methods by obtaining empirical powers through a simulation study. In Section 4, we discuss some interesting features related to the LR functions and the MC method.

2. Likelihood ratio test for the covariance matrix

Let X1, …, Xn be a random sample of size n of q-variate column vectors from a q-variate normal distribution with mean vector μ and covariance matrix Σ. We are interested in testing

$H_0: \Sigma = \Sigma_0,$

where the mean vector μ is unknown and Σ0 is a pre-specified positive definite q × q matrix. Without loss of generality, we may assume that Σ0 = I, since $\Sigma_0^{-1/2} X$ has I as its covariance matrix under H0 : Σ = Σ0, where I is the q × q identity matrix. It is well known that

$S_n = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})^T$

is the maximum likelihood estimator of Σ, where $\bar{X}$ is the sample mean vector, which is also the maximum likelihood estimator of μ, and $(\cdot)^T$ denotes the transpose of a vector or matrix. We also assume that Sn is positive definite for each n. To discuss the LR function L(Σ; X1, …, Xn) for testing H0 : Σ = I, let Λjn be the jth eigenvalue of Sn, j = 1, …, q. With the notation that |A| and TR(A) are the determinant and trace of the matrix A, respectively, the LR function for testing H0 : Σ = I against H1 : Σ ≠ I can be expressed as

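$L(\Sigma; X_1, \ldots, X_n) = \frac{\sup\left\{\prod_{i=1}^{n} f(X_i; \mu, \Sigma \mid H_0)\right\}}{\sup\left\{\prod_{i=1}^{n} f(X_i; \mu, \Sigma \mid H_0 \cup H_1)\right\}} = |S_n|^{n/2} \exp\left[-\frac{n}{2}\mathrm{TR}(S_n) + \frac{qn}{2}\right] = \left(\prod_{j=1}^{q}\Lambda_{jn}\right)^{n/2} \exp\left[-\frac{n}{2}\sum_{j=1}^{q}\Lambda_{jn} + \frac{qn}{2}\right] = \prod_{j=1}^{q}\left\{\Lambda_{jn}^{n/2} \exp\left[-\frac{n}{2}\Lambda_{jn} + \frac{n}{2}\right]\right\}. \qquad (2.1)$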

Here f denotes the q-variate normal probability density function. By the LR principle, the testing rule is to reject H0 : Σ = I in favor of H1 : Σ ≠ I for small positive values of L(Σ; X1, …, Xn). To complete the multivariate test of H0 : Σ = I, we need a null distribution of the LR function in any of the forms listed above. However, no exact null distribution of any form of the LR statistic has been reported, and only an asymptotic result based on log likelihood arguments is available. In the following, we state a limiting distribution under H0 : Σ = I. For the proof, see Silvey (1975) or Mardia et al. (1979).

### Lemma 1

The distribution of

$l = -2\log L(\Sigma; X_1, \ldots, X_n)$

is asymptotically chi-square with q(q + 1)/2 degrees of freedom (df) under H0 : Σ = I.
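Taking logarithms in the LR function given before the lemma yields a form of l that is convenient for computation. The first display follows directly from that expression; the second, for a general Σ0, follows from the reduction to Σ0 = I described earlier and is included only as a worked restatement.

$l = n\left[\mathrm{TR}(S_n) - \log|S_n| - q\right] = n\sum_{j=1}^{q}\left(\Lambda_{jn} - \log\Lambda_{jn} - 1\right),$

$l = n\left[\mathrm{TR}(\Sigma_0^{-1}S_n) - \log|\Sigma_0^{-1}S_n| - q\right] \quad \text{for } H_0: \Sigma = \Sigma_0.$

Each summand Λ − log Λ − 1 is nonnegative and vanishes only at Λ = 1, so l ≥ 0 with equality exactly when Sn = I.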

The testing rule based on l is then to reject H0 : Σ = I for large values of l. Thus one may complete the test of H0 : Σ = I asymptotically by consulting a table of the chi-square distribution. One may also take the MC approach to obtain the null distribution of l, in the following order.

(I) Generate pseudo random normal vectors of size n from Nq(0, I).

(II) Compute the LR statistic l from the generated vectors.

(III) Iterate (I) and (II) M times and order the M values of l.

(IV) From the ordered values of l, obtain (or estimate) the pth quantile for any given probability p.

(V) Repeat (I) to (IV) K times to obtain K estimates of the pth quantile, and average them.

Then one can carry out the LR test by obtaining the critical values for any given significance levels or p-values using the procedure from (I) to (V).
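The steps (I) to (V) can be sketched in Python as follows. This is a minimal illustration, not the SAS/IML program used for the tables: the function names (`lr_statistic`, `mc_quantile`), the order-statistic rule used for the pth quantile, and the small default values of M and K are assumptions made here for brevity.

```python
import math
import random

def sample_covariance_mle(xs):
    """MLE covariance S_n (divisor n) of a list of q-dimensional rows."""
    n, q = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(q)]
    s = [[0.0] * q for _ in range(q)]
    for x in xs:
        d = [x[j] - mean[j] for j in range(q)]
        for a in range(q):
            for b in range(q):
                s[a][b] += d[a] * d[b] / n
    return s

def log_det_spd(a):
    """log|A| of a symmetric positive definite matrix via Cholesky."""
    q = len(a)
    low = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(q))

def lr_statistic(xs):
    """l = -2 log L = n * (TR(S_n) - log|S_n| - q) for H0: Sigma = I."""
    n, q = len(xs), len(xs[0])
    s = sample_covariance_mle(xs)
    trace = sum(s[j][j] for j in range(q))
    return n * (trace - log_det_spd(s) - q)

def mc_quantile(n, q, p, M=2000, K=5, seed=0):
    """Average of K Monte-Carlo estimates of the p-th null quantile of l."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(K):                                   # step (V)
        stats = []
        for _ in range(M):                               # step (III)
            xs = [[rng.gauss(0.0, 1.0) for _ in range(q)]
                  for _ in range(n)]                     # step (I)
            stats.append(lr_statistic(xs))               # step (II)
        stats.sort()
        # step (IV): assumed rule, take the ceil(p*M)-th order statistic
        idx = min(M - 1, max(0, math.ceil(p * M) - 1))
        estimates.append(stats[idx])
    return sum(estimates) / K

print(mc_quantile(n=10, q=2, p=0.95))
```

With n = 10, q = 2 and p = 0.95 the printed value should lie near the corresponding MC entry of Table 1, up to Monte-Carlo error from the much smaller M and K used here.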

To investigate the behavior of quantiles obtained from the MC method and compare them with quantiles from the limiting chi-square distributions, we obtained quantiles (or critical values) of l, the log likelihood ratio statistic, for sample sizes 10, 15, 20, 25, and 30 and probabilities (or significance levels) 0.01, 0.05, 0.1, 0.9, 0.95, and 0.99, choosing M = 100,000 and K = 2,000, for N2(0, I) and N3(0, I). The results are tabulated in Tables 1 and 2 for N2(0, I) and N3(0, I), respectively. We also included the quantiles $\chi^2_p(3)$ and $\chi^2_p(6)$ of the chi-square distributions with 3 and 6 df for comparison with the quantiles obtained by the MC method. We note that the quantiles of l obtained from the MC method approach $\chi^2_p(3)$ and $\chi^2_p(6)$ as the sample size increases. One may therefore conclude that quantiles obtained from the MC method should be used especially when sample sizes are small. The simulation study will confirm this observation later. We also note that as the dimension q increases, the difference between the quantiles from the MC method and those from the chi-square distributions tends to widen. All computations were conducted using the SAS/IML PC-version.

We may then complete the test of H0 : Σ = I by obtaining a critical value for l with the MC method at the given significance level.

It would be interesting to compare the performance and precision of the two LR tests. This is done in the next section through a simulation study. In the tables of the next section, MC denotes the LR test based on the MC method and AS denotes the LR test that applies the chi-square distribution asymptotically. We begin the next section with some numerical examples.

3. Examples and a simulation study

We first illustrate the two tests, MC and AS, with the head data of brothers (Frets, 1921) summarized in Mardia et al. (1979) and the turtles data in Jolicoeur and Mosimann (1960). The brothers data set is bivariate with sample size 25, and the turtles data set is trivariate with sample size 24. Therefore we used the chi-square distributions with 3 and 6 df for the AS tests. For the brothers data, one may use the null distribution of l obtained from the MC method in Table 1 with n = 25. However, we applied the MC method again to obtain the p-values for the comparison between the two LR tests in both cases. Mardia et al. (1979) were interested in investigating the structure of the covariance matrix Σ for the head lengths of the first and second sons by testing whether they are independent or not. However, their conclusions were implicit since they could not obtain exact p-values even though they are asymptotic. In this study, we also investigate the structure of the covariance matrix Σ by using the two LR tests for the following two hypotheses:

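$H_{01}: \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} 100 & 50 \\ 50 & 100 \end{pmatrix}$

and

$H_{02}: \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} 100 & 0 \\ 0 & 100 \end{pmatrix}.$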

We obtained the p-values of the MC and AS tests for the two null hypotheses, H01 and H02, and tabulated them in Tables 3 and 4. The two tests, MC and AS, show similar patterns in the p-values for each case. Therefore one may choose H01 for the covariance matrix Σ.

As another example, 24 turtles were collected and, for each turtle, the carapace dimensions were measured in three mutually perpendicular directions of space: length, maximum width, and height. For more detailed definitions and explanations of these measurements, see Jolicoeur and Mosimann (1960). Each specimen is therefore represented in this study by a set of three measurements. Originally, Jolicoeur and Mosimann (1960) were interested in the principal component analysis of the three variables. In this study, however, we are interested in detecting the structure of the covariance matrix as described by the two null hypotheses, H03 and H04:

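$H_{03}: \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{pmatrix} = \begin{pmatrix} 140 & 75 & 35 \\ 75 & 50 & 20 \\ 35 & 20 & 10 \end{pmatrix}$

and

$H_{04}: \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{pmatrix} = \begin{pmatrix} 140 & 0 & 0 \\ 0 & 50 & 0 \\ 0 & 0 & 10 \end{pmatrix}.$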

We obtained 6.2096584 and 110.33072 as the values of l under H03 and H04, respectively. The respective p-values from the two methods are summarized in Tables 5 and 6. The tables show no evidence against H03 and strong evidence against H04, so one may choose H03 for the covariance matrix.
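The two kinds of p-values can be sketched as follows for the reported value l = 6.2096584 (n = 24, q = 3). The paper does not spell out how the MC p-value is computed; the sketch assumes it is the proportion of simulated null values of l that are at least as large as the observed one. To keep the example short, null values of l are drawn through the Bartlett decomposition of the Wishart matrix nSn rather than by generating raw normal vectors, which is a substitution made here and not the paper's procedure. `chi2_sf`, `simulate_l` and `mc_p_value` are hypothetical helper names, and `chi2_sf` relies on the closed-form chi-square tail that holds for integer df.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for integer df (closed form)."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)                 # tail for df = 2
        total = term
        for k in range(1, df // 2):            # add (x/2)^k e^{-x/2} / k!
            term *= half / k
            total += term
        return total
    total = math.erfc(math.sqrt(half))         # tail for df = 1
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):      # step df up by 2 each pass
        total += term
        term *= x / (2 * k + 1)
    return total

def simulate_l(n, q, rng):
    """One null draw of l, using n*S_n = T T^T with T lower triangular."""
    sum_sq = 0.0      # TR(n*S_n): sum of all squared entries of T
    log_det = 0.0     # log|n*S_n|: sum of log T_ii^2
    for i in range(q):
        # squared diagonal entry: chi-square with n-1-i df
        diag_sq = rng.gammavariate((n - 1 - i) / 2.0, 2.0)
        sum_sq += diag_sq
        log_det += math.log(diag_sq)
        for _ in range(i):                     # below-diagonal N(0,1) entries
            sum_sq += rng.gauss(0.0, 1.0) ** 2
    return sum_sq - n * log_det + n * q * math.log(n) - n * q

def mc_p_value(l_obs, n, q, M=20000, seed=0):
    """Share of simulated null statistics at least as large as l_obs."""
    rng = random.Random(seed)
    hits = sum(simulate_l(n, q, rng) >= l_obs for _ in range(M))
    return hits / M

p_as = chi2_sf(6.2096584, 6)                   # q(q+1)/2 = 6 df for q = 3
p_mc = mc_p_value(6.2096584, n=24, q=3)
print(round(p_as, 4), round(p_mc, 3))
```

The asymptotic value agrees with the AS p-value 0.4001 reported for H03, and the simulated value should fall near the reported MC p-value 0.4683, up to Monte-Carlo error.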

Now we compare the performance and precision of the two LR tests, MC and AS, by obtaining empirical powers through a simulation study under several scenarios for the bivariate normal case. We generated bivariate normal pseudo-random vectors with a zero mean vector, varying the components of the covariance matrix, for six sample sizes, 5, 10, 15, 20, 25, and 30, in order to inspect the behavior of the two tests in small samples. In the tables, (1, 1, 0) means that $\sigma_1^2 = \sigma_2^2 = 1$ and $\sigma_{12} = 0$, that is, Σ = I. We applied the chi-square distribution with 3 df for the AS test since we deal with the bivariate case. In Table 7, we vary only the value of $\sigma_{12}$ with $\sigma_1^2 = \sigma_2^2 = 1$. In Table 8, we vary the value of $\sigma_2^2$ with $\sigma_1^2 = 1$ and $\sigma_{12} = 0$. Finally, in Table 9, we consider the case in which both variances vary independently ($\sigma_{12} = 0$) and dependently ($\sigma_{12} = 0.5$). We chose 0.05 as the nominal significance level for all cases, with 10,000 iterations for each simulation. All computations were conducted with the SAS/IML PC-version.

We first note from Table 7 that the results are almost symmetric when the covariance takes values of opposite signs. For this reason, we consider only nonnegative values of the covariance in Tables 8 and 9. In Table 7, the MC test achieves its nominal significance level well, while the AS test always exceeds the nominal significance level. The reason may be that the quantiles of the chi-square distribution with 3 df are lower than those from the MC method for all sample sizes, as observed in Table 1. However, as the sample size increases, the empirical significance level of the AS test approaches the nominal one. It is therefore recommended to use quantiles obtained from the MC method in the small sample case. In Table 8, one may observe a reversal in the empirical powers of the AS test: for some cases, the empirical power decreases even though the sample size increases. In all the tables, as the difference between the two variances increases and/or the covariance approaches 1, the empirical powers increase. Finally, we note that the empirical powers of MC are all lower than those of AS, as expected, since the quantiles of l approach $\chi^2_{0.05}(3)$ from above.
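The design of the simulation study can be sketched for a single configuration as follows. This is an illustrative reconstruction, not the SAS/IML code behind Tables 7 to 9: the function names, the number of replications, and the seed are assumptions, and the two critical values for n = 10 are copied from Table 1 (7.8147 for AS, 9.7171 for MC).

```python
import math
import random

def lr_stat_bivariate(sample):
    """l = n * (s11 + s22 - log(s11*s22 - s12^2) - 2) for H0: Sigma = I."""
    n = len(sample)
    m1 = sum(x for x, _ in sample) / n
    m2 = sum(y for _, y in sample) / n
    s11 = sum((x - m1) ** 2 for x, _ in sample) / n
    s22 = sum((y - m2) ** 2 for _, y in sample) / n
    s12 = sum((x - m1) * (y - m2) for x, y in sample) / n
    return n * (s11 + s22 - math.log(s11 * s22 - s12 ** 2) - 2.0)

def rejection_rate(n, var1, var2, cov, critical_value, reps=4000, seed=0):
    """Share of replications in which l exceeds the critical value."""
    rng = random.Random(seed)
    # Cholesky factor of [[var1, cov], [cov, var2]]
    a = math.sqrt(var1)
    b = cov / a
    c = math.sqrt(var2 - b * b)
    rejections = 0
    for _ in range(reps):
        sample = []
        for _ in range(n):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            sample.append((a * z1, b * z1 + c * z2))
        if lr_stat_bivariate(sample) > critical_value:
            rejections += 1
    return rejections / reps

# Empirical size under (1, 1, 0) and power under (1, 1, 0.8), n = 10
for label, crit in (("AS", 7.8147), ("MC", 9.7171)):
    size = rejection_rate(10, 1.0, 1.0, 0.0, crit)
    power = rejection_rate(10, 1.0, 1.0, 0.8, crit)
    print(label, size, power)
```

Because both tests use the same statistic and differ only in the critical value, the AS rejection rate can never fall below the MC rate on the same simulated data; the comparison of interest is whether each size stays near the nominal 0.05.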

4. Concluding remarks

In (2.1), we expressed the LR function as a product of q LR functions of the individual eigenvalues of the sample covariance matrix Sn, and noted that (2.1) is a product of terms in the Λjn's, which are independent and distributed as chi-square with n − 1 df. Using this expression, we tried to obtain a reasonable test procedure for covariance matrices based on the Λjn's but failed. The failure may be because the relation between a matrix and its eigenvalues has not been fully investigated, in the sense of which eigenvalue corresponds to which component of the matrix. It would therefore be valuable to have a precise or reasonable relation between a matrix and its eigenvalues.

In Tables 1 and 2, we have already noticed that as the sample size increases, the quantiles obtained from the MC method approach the limiting quantiles of the chi-square distribution. This is the standard behavior predicted by large sample approximation theory. We therefore recommend applying the MC method when sample sizes are small. Even for moderately large sample sizes, the computation time needed to obtain a distribution with the MC method is negligible.

The MC method can be applied to statistics other than the LR statistics for testing the covariance matrix, provided the null hypothesis is unambiguous and well defined. For example, Park (2017) used the MC method to obtain a null distribution of LR statistics for the multivariate simultaneous test. However, it would be difficult to apply the MC method to a nonparametric test, since the class of population distributions under the null hypothesis is too broad to choose a specific one. One may also note that Kim and Cheon (2013) applied the MC method to estimate the posterior distribution in Bayesian analysis.

Finally, we note that the quantiles from the chi-square distribution in Table 1 are lower than those obtained by the MC method in all cases. This is why the empirical powers of the AS test in Tables 7 to 9 are higher than those of the MC test for all cases. One may also wonder why the empirical significance levels of the MC test are not exactly 0.0500 under H0 : Σ = I even though the same MC method was used to obtain the quantiles of l. The reason is that we used different seed numbers to generate the pseudo random vectors for each case.

TABLES

### Table 1

Quantiles (or critical values) for some selected probabilities (or significance levels) and sample sizes for l for N2(0, I)

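| Quantile | $\chi^2_p(3)$ | n = 10 | n = 15 | n = 20 | n = 25 | n = 30 |
|---|---|---|---|---|---|---|
| $q_{0.01}$ | 0.1148 | 0.1438 | 0.1328 | 0.1279 | 0.1251 | 0.1232 |
| $q_{0.05}$ | 0.3518 | 0.4402 | 0.4067 | 0.3917 | 0.3829 | 0.3775 |
| $q_{0.10}$ | 0.5844 | 0.7310 | 0.6754 | 0.6505 | 0.6361 | 0.6270 |
| $q_{0.90}$ | 6.2514 | 7.7819 | 7.2094 | 6.9489 | 6.8005 | 6.6736 |
| $q_{0.95}$ | 7.8147 | 9.7171 | 9.0078 | 8.6839 | 8.4984 | 8.3793 |
| $q_{0.99}$ | 11.3449 | 14.0750 | 13.0628 | 12.5967 | 12.3305 | 12.1624 |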

### Table 2

Quantiles (or critical values) for some selected probabilities (or significance levels) and sample sizes for l for N3(0, I)

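| Quantile | $\chi^2_p(6)$ | n = 10 | n = 15 | n = 20 | n = 25 | n = 30 |
|---|---|---|---|---|---|---|
| $q_{0.01}$ | 0.8721 | 1.1344 | 1.0315 | 0.9865 | 0.9611 | 0.9453 |
| $q_{0.05}$ | 1.6354 | 2.1273 | 1.9342 | 1.8503 | 1.8028 | 1.7728 |
| $q_{0.10}$ | 2.2041 | 2.8672 | 2.6068 | 2.4939 | 2.4295 | 2.3893 |
| $q_{0.90}$ | 10.6446 | 13.8392 | 12.5819 | 12.0369 | 11.7311 | 11.5359 |
| $q_{0.95}$ | 12.5916 | 16.3709 | 14.8826 | 14.2378 | 13.8763 | 13.6454 |
| $q_{0.99}$ | 16.8119 | 21.8668 | 19.8689 | 19.0038 | 18.5257 | 18.2169 |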

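### Table 3

p-values for H01

| Test | p-value |
|---|---|
| MC | 0.2722 |
| AS | 0.2349 |

### Table 4

p-values for H02

| Test | p-value |
|---|---|
| MC | 0.0005 |
| AS | 0.0042 |

### Table 5

p-values for H03

| Test | p-value |
|---|---|
| MC | 0.4683 |
| AS | 0.4001 |

### Table 6

p-values for H04

| Test | p-value |
|---|---|
| MC | 0.0000 |
| AS | 0.0000 |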

### Table 7

Empirical powers by varying covariance only

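Column headings give $(\sigma_1^2, \sigma_2^2, \sigma_{12})$.

| Test | n | (1, 1, 0) | (1, 1, 0.2) | (1, 1, −0.2) | (1, 1, 0.5) | (1, 1, −0.5) | (1, 1, 0.8) | (1, 1, −0.8) |
|---|---|---|---|---|---|---|---|---|
| MC | 5 | 0.0637 | 0.0751 | 0.0736 | 0.1324 | 0.1350 | 0.3304 | 0.3341 |
| MC | 10 | 0.0503 | 0.0743 | 0.0711 | 0.2282 | 0.2168 | 0.7265 | 0.7283 |
| MC | 15 | 0.0493 | 0.0813 | 0.0824 | 0.3452 | 0.3419 | 0.9419 | 0.9383 |
| MC | 20 | 0.0500 | 0.0970 | 0.0906 | 0.4583 | 0.4556 | 0.9921 | 0.9924 |
| MC | 25 | 0.0510 | 0.1152 | 0.1120 | 0.5944 | 0.5842 | 0.9999 | 0.9996 |
| MC | 30 | 0.0490 | 0.1214 | 0.1229 | 0.6886 | 0.6810 | 1.0000 | 1.0000 |
| AS | 5 | 0.1839 | 0.1969 | 0.2000 | 0.2973 | 0.2924 | 0.6144 | 0.6158 |
| AS | 10 | 0.0994 | 0.1291 | 0.1276 | 0.3286 | 0.3201 | 0.8790 | 0.8821 |
| AS | 15 | 0.0806 | 0.1201 | 0.1209 | 0.4340 | 0.4287 | 0.9835 | 0.9823 |
| AS | 20 | 0.0723 | 0.1286 | 0.1255 | 0.5384 | 0.5359 | 0.9983 | 0.9975 |
| AS | 25 | 0.0673 | 0.1377 | 0.1356 | 0.6427 | 0.6282 | 0.9999 | 0.9999 |
| AS | 30 | 0.0626 | 0.1457 | 0.1457 | 0.7299 | 0.7237 | 1.0000 | 1.0000 |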

### Table 8

Empirical powers by varying only one variance with 0 covariance

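Column headings give $(\sigma_1^2, \sigma_2^2, \sigma_{12})$.

| Test | n | (1, 1.2, 0) | (1, 0.8, 0) | (1, 1.5, 0) | (1, 0.5, 0) | (1, 1.8, 0) | (1, 0.2, 0) |
|---|---|---|---|---|---|---|---|
| MC | 5 | 0.0766 | 0.0609 | 0.1104 | 0.0841 | 0.1537 | 0.2022 |
| MC | 10 | 0.0700 | 0.0550 | 0.1306 | 0.1200 | 0.2148 | 0.5617 |
| MC | 15 | 0.0716 | 0.0573 | 0.1547 | 0.1879 | 0.2830 | 0.8632 |
| MC | 20 | 0.0726 | 0.0617 | 0.1842 | 0.2598 | 0.3550 | 0.9722 |
| MC | 25 | 0.0741 | 0.0629 | 0.1935 | 0.4602 | 0.3615 | 0.9990 |
| MC | 30 | 0.0767 | 0.0711 | 0.2100 | 0.5412 | 0.4354 | 0.9998 |
| AS | 5 | 0.1749 | 0.2111 | 0.1794 | 0.2955 | 0.2038 | 0.5930 |
| AS | 10 | 0.0974 | 0.1298 | 0.1338 | 0.2812 | 0.2001 | 0.8439 |
| AS | 15 | 0.0869 | 0.1172 | 0.1480 | 0.3490 | 0.2603 | 0.9683 |
| AS | 20 | 0.0805 | 0.1149 | 0.1730 | 0.4311 | 0.3300 | 0.9963 |
| AS | 25 | 0.0843 | 0.1159 | 0.2028 | 0.5108 | 0.3965 | 0.9995 |
| AS | 30 | 0.0844 | 0.1202 | 0.2406 | 0.5912 | 0.4708 | 1.0000 |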

### Table 9

Empirical powers by varying both variances and covariance

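Column headings give $(\sigma_1^2, \sigma_2^2, \sigma_{12})$.

| Test | n | (1.2, 0.8, 0) | (1.2, 0.8, 0.5) | (1.5, 0.5, 0) | (1.5, 0.5, 0.5) | (1.8, 0.2, 0) | (1.8, 0.2, 0.5) |
|---|---|---|---|---|---|---|---|
| MC | 5 | 0.0734 | 0.1443 | 0.1343 | 0.2416 | 0.3315 | 0.7114 |
| MC | 10 | 0.0739 | 0.2647 | 0.2256 | 0.5155 | 0.7313 | 0.9990 |
| MC | 15 | 0.0830 | 0.4046 | 0.3414 | 0.7595 | 0.9421 | 1.0000 |
| MC | 20 | 0.0994 | 0.5433 | 0.4616 | 0.9021 | 0.9906 | 1.0000 |
| MC | 25 | 0.1137 | 0.6919 | 0.5996 | 0.9777 | 0.9994 | 1.0000 |
| MC | 30 | 0.1249 | 0.7814 | 0.6880 | 0.9935 | 1.0000 | 1.0000 |
| AS | 5 | 0.1996 | 0.3171 | 0.2957 | 0.4701 | 0.6155 | 0.9713 |
| AS | 10 | 0.1266 | 0.3780 | 0.3264 | 0.6765 | 0.8812 | 1.0000 |
| AS | 15 | 0.1208 | 0.5023 | 0.4310 | 0.8597 | 0.9809 | 1.0000 |
| AS | 20 | 0.1282 | 0.6213 | 0.5397 | 0.9501 | 0.9973 | 1.0000 |
| AS | 25 | 0.1363 | 0.7308 | 0.6427 | 0.9840 | 0.9997 | 1.0000 |
| AS | 30 | 0.1496 | 0.8153 | 0.7254 | 0.9958 | 1.0000 | 1.0000 |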

References
1. Bai, Z, Jiang, D, Yao, JF, and Zheng, S (2009). Corrections to LRT on large-dimensional covariance matrix by RMT. The Annals of Statistics. 37, 3822-3840.
2. Beran, R, and Srivastava, MS (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. The Annals of Statistics. 13, 95-115.
3. Cai, TT, and Ma, Z (2013). Optimal hypothesis testing for high dimensional covariance matrices. Bernoulli. 19, 2359-2388.
4. Chung, KL (2001). A Course in Probability Theory. New York: Academic Press.
5. Costa, AFB, and Machado, MAG (2008). A new chart for monitoring the covariance matrix of bivariate processes. Communications in Statistics - Simulation and Computation. 37, 1453-1465.
6. Frets, GP (1921). Heredity of headform in man. Genetica. 3, 193-384.
7. Gupta, AK, and Bodnar, T (2014). An exact test about the covariance matrix. Journal of Multivariate Analysis. 125, 176-189.
8. Jolicoeur, P, and Mosimann, JE (1960). Size and shape variation in the painted turtle: a principal component analysis. Growth. 24, 339-354.
9. Kim, J, and Cheon, S (2013). Bayesian multiple change-point estimation and segmentation. Communications for Statistical Applications and Methods. 20, 439-454.
10. Mardia, KV, Kent, JT, and Bibby, JM (1979). Multivariate Analysis. New York: Academic Press.
11. Park, HI (2017). A simultaneous inference for the multivariate data. Journal of the Korean Data Analysis Society. 19, 557-564.
12. Pinto, LP, and Mingoti, SA (2015). On hypothesis tests for covariance matrices under multivariate normality. Pesquisa Operacional. 35, 123-142.
13. Silvey, SD (1975). Statistical Inference. London: Chapman and Hall.