A comparison of tests for homoscedasticity using simulation and empirical data
Communications for Statistical Applications and Methods 2024;31:1-35
Published online January 31, 2024
© 2024 Korean Statistical Society.

Anastasios Katsileros (1,a), Nikolaos Antonetsis (a), Paschalis Mouzaidis (b), Eleni Tani (a), Penelope J. Bebeli (a), Alex Karagrigoriou (c)

(a) Department of Crop Science, Agricultural University of Athens, Greece;
(b) Department of Forestry and Natural Resources, Democritus University of Thrace, Greece;
(c) Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, Greece
Correspondence to: (1) Laboratory of Plant Breeding and Biometry, Department of Crop Science, Agricultural University of Athens, Iera Odos 75, 11855, Athens, Greece. E-mail: katsileros@aua.gr
Received April 23, 2023; Revised October 7, 2023; Accepted October 14, 2023.
 Abstract
The assumption of homoscedasticity is one of the most crucial assumptions for many parametric tests used in the biological sciences. The aim of this paper is to compare the empirical probability of type I error and the power of ten parametric and two non-parametric tests for homoscedasticity with simulations under different types of distributions, number of groups, number of samples per group, variance ratio and significance levels, as well as through empirical data from an agricultural experiment. According to the findings of the simulation study, when there is no violation of the assumption of normality and the groups have equal variances and equal number of samples, the Bhandary-Dai, Cochran’s C, Hartley’s Fmax, Levene (trimmed mean) and Bartlett tests are considered robust. The Levene (absolute and square deviations) tests show a high probability of type I error in a small number of samples, which increases as the number of groups rises. When data groups display a non-normal distribution, researchers should utilize the Levene (trimmed mean), O’Brien and Brown-Forsythe tests. On the other hand, if the assumption of normality is not violated but diagnostic plots indicate unequal variances between groups, researchers are advised to use the Bartlett, Z-variance, Bhandary-Dai and Levene (trimmed mean) tests. Assessing the tests being considered, the test that stands out as the most well-rounded choice is the Levene’s test (trimmed mean), which provides satisfactory type I error control and relatively high power. According to the findings of the study and for the scenarios considered, the two non-parametric tests are not recommended. In conclusion, it is suggested to initially check for normality and consider the number of samples per group before choosing the most appropriate test for homoscedasticity.
Keywords : power of test, type I error, violation of normality, sample size, homogeneity of variance, experimental data analysis
1. Introduction

The validity of statistical analysis of most biological traits depends on whether the underlying assumptions are met. The assumptions of normality, homoscedasticity and independence underlie most parametric statistical tests, including Student’s t-test, analysis of variance (ANOVA) and regression analysis. Homoscedasticity, sometimes referred to as homogeneity of variances, assumes that different groups (treatments) have the same variance, even if they come from different populations.

It is very important for researchers to test for homoscedasticity because when this assumption is violated, there is a greater probability of falsely rejecting the null hypothesis, even if the distributions are normal (Wilcox, 2003). When the assumption of homoscedasticity is not met, the researcher either applies a data transformation, which often results in contradictory conclusions (Stroup, 2015), or uses statistical tests that do not require homoscedasticity, such as Welch’s ANOVA and the Brown-Forsythe test (Lix et al., 1996; Dag et al., 2018; Delacre et al., 2019), together with corresponding tests for the multiple comparisons of means such as the Games-Howell, C, T2 and T3 tests (Hsiung and Olejnik, 1994).

Therefore, it is necessary for researchers to validate the assumption of homoscedasticity. This validation can be explored with diagnostic plots and/or statistical tests. The most commonly used diagnostic tools are the residuals plot, which is a scatterplot that shows whether the residuals per group have similar dispersions, and the box-and-whisker plot. Diagnostic plots are useful, but some expertise is required in reading and understanding them (Kozak and Piepho, 2018). Therefore, in most cases, statistical tests are used to confirm the visual conclusions obtained through graphical methods. Many tests for homoscedasticity are available in the literature and in statistical software packages, and they have been studied by many researchers (Piepho, 1996; Bhandary and Dai, 2008; Parra-Frutos, 2013; Hatchavanich, 2014; Wang et al., 2017; Mirtagioglu et al., 2017; Onifade and Olanrewaju, 2020). Hence, the researcher faces a fundamental problem, namely, how to choose the most appropriate test for a specific dataset.

The difficulty in selecting the appropriate statistical test is even more evident in biological and agricultural experiments that are conducted both in the laboratory and in the field. These experiments often deviate from the assumptions of normality, homoscedasticity and independence, due to the small and unequal number of samples (Kim et al., 2021), skewed distributions (Webster and Lark, 2019), spatial dependence (Rossoni and Lima, 2019; Yamamotto et al., 2022), the use of untreated controls (Raudonius, 2017) and unbalanced designs (Lix et al., 1996; Parra-Frutos, 2013).

This research paper focuses on the evaluation of different parametric and non-parametric tests for homoscedasticity. It employs simulation and empirical datasets under a range of conditions and scenarios to determine the most appropriate tests for analyzing agricultural datasets.

2. Materials and methods

Tests for homoscedasticity

Homoscedasticity, or the homogeneity of variances, is an assumption of ANOVA in which the population variances of two or more groups (treatments) are considered equal. The null and alternative hypotheses for testing homogeneity of variances are:

$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2, \qquad H_1: \sigma_i^2 \neq \sigma_j^2 \quad \text{for at least one pair } (i, j),\ i \neq j,$$

where: k denotes the number of groups-treatments and i, j = 1, 2, . . . , k.

Although various statistical tests have been suggested in the literature for assessing the assumption of homoscedasticity, the Levene and Bartlett tests are commonly employed for this purpose and are readily available in most statistical software packages (Mirtagioglu et al., 2017). In the present study, ten parametric and two non-parametric tests were evaluated, including both commonly used tests and their variations as well as lesser known but promising tests. The parametric tests for homoscedasticity that were evaluated are as follows: Hartley’s Fmax (HF), Cochran’s C (CC), Levene (absolute deviations-LE), Levene (square deviations-LS), Levene (trimmed mean-LT), Brown–Forsythe (BF), O’Brien (OB), Bartlett (BA), Z-variance (ZV) and Bhandary-Dai (BD) tests. Furthermore, two non-parametric tests for homoscedasticity were evaluated: the Fligner-Killeen (FL) and Conover’s Squared Ranks (CO) tests. Non-parametric tests are commonly used to check for homoscedasticity when the assumptions of parametric tests are violated or when dealing with non-normal or skewed data. They are robust and provide reliable results even with a small number of samples or unequal variances across groups (Conover and Iman, 1981).

The Hartley’s test (Hartley, 1950), also known as the Fmax-test, is used to test if k groups have equal variances based on the ratio between maximum and minimum group variance. It is a test sensitive to violations of the normality assumption and is defined as:

$$F_{\max} = \frac{s_{i\,\max}^2}{s_{i\,\min}^2},$$

where: $s_{i\,\max}^2$ is the maximum sample variance and $s_{i\,\min}^2$ is the minimum sample variance among the k sample variances $s_i^2$.

Hartley’s test rejects the hypothesis that the variances are equal if the computed $F_{\max}$ is greater than the critical value from the table of the sampling distribution of $F_{\max}$ (David, 1952).

The Cochran’s C test (Cochran, 1941) is used to test if k groups have equal variances based on the ratio between the maximum group variance and the sum of the group variances. It is more powerful than Hartley’s Fmax test, at least for small equal number of samples and is defined as:

$$C = \frac{s_{i\,\max}^2}{\sum_{i=1}^{k} s_i^2},$$

where: $s_{i\,\max}^2$ is the maximum sample variance.

Cochran’s C test rejects the hypothesis that all variances are equal if the computed C is greater than the upper critical value $C_{UL}(\alpha, n, k)$, and thus, the population variance associated with $s_{i\,\max}^2$ is significantly larger than all other population variances. $C_{UL}(\alpha, n, k)$ is defined as:

$$C_{UL}(\alpha, n, k) = \left[1 + \frac{k-1}{F\left(1 - \alpha/k;\ (n-1);\ (k-1)(n-1)\right)}\right]^{-1},$$

where: k is the number of groups, n is the common number of samples per group and α is the level of significance.
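As a minimal illustration in R of the two ratio statistics above, the following sketch computes Fmax and C for a small balanced data set and compares C with the critical value $C_{UL}$. The data values and the helper name `cochran_c_upper` are hypothetical and are not taken from the study; the critical value for Fmax is not computed because it requires the tabulated sampling distribution.

```r
# Hypothetical balanced data: k = 3 groups, n = 5 observations each
y <- c(4.1, 5.0, 4.6, 5.3, 4.8,
       3.9, 6.2, 5.1, 4.4, 6.0,
       2.0, 7.5, 4.9, 8.1, 3.3)
g <- factor(rep(c("A", "B", "C"), each = 5))

s2 <- tapply(y, g, var)       # sample variance of each group
f_max <- max(s2) / min(s2)    # Hartley's Fmax
c_stat <- max(s2) / sum(s2)   # Cochran's C

# Upper critical value C_UL(alpha, n, k), following the formula in the text
cochran_c_upper <- function(alpha, n, k) {
  f_crit <- qf(1 - alpha / k, df1 = n - 1, df2 = (k - 1) * (n - 1))
  1 / (1 + (k - 1) / f_crit)
}

c_crit <- cochran_c_upper(alpha = 0.05, n = 5, k = 3)
reject_cochran <- unname(c_stat > c_crit)   # TRUE means H0 is rejected
```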

Levene’s test (Levene, 1960) is used to test if k groups have equal variances by carrying out an analysis of variance on the absolute deviations of observations from the group mean. The test statistic is defined as:

$$W \text{ or } F_{\text{Levene}} = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^{k} n_i (\bar{Z}_{i.} - \bar{Z}_{..})^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (Z_{ij} - \bar{Z}_{i.})^2},$$

where: k is the number of groups, $n_i$ is the number of samples of the ith group, N is the total number of observations, $Y_{ij}$ is the jth observation of the ith group, $\bar{Y}_{i.}$ is the mean of the ith group, $Z_{ij} = |Y_{ij} - \bar{Y}_{i.}|$ or $Z_{ij} = |Y_{ij} - \bar{Y}_{i.}|^2$ are the absolute deviations (LE) or the square deviations (LS) of observations from the group mean, $\bar{Z}_{i.}$ is the mean of the $Z_{ij}$ in the ith group and $\bar{Z}_{..}$ is the overall mean of the $Z_{ij}$.

Levene’s test rejects the hypothesis that the variances are equal if the computed W is greater than the upper critical value of the F distribution with k – 1 and N – k degrees of freedom at a significance level of α.
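The W statistic can be computed directly from its definition. The sketch below is one possible R implementation for the LE and LS variants; the function name `levene_w` and the simulated data are assumptions made for illustration, and the code is not the implementation used in the study.

```r
# Levene's W for absolute (LE) or square (LS) deviations from the group mean
levene_w <- function(y, g, type = c("absolute", "square")) {
  type <- match.arg(type)
  g <- factor(g)
  dev <- y - ave(y, g)                        # deviations from group means
  z <- if (type == "absolute") abs(dev) else dev^2
  k <- nlevels(g)
  N <- length(y)
  ni <- tapply(z, g, length)
  zi <- tapply(z, g, mean)                    # group means of Z
  between <- sum(ni * (zi - mean(z))^2)
  within <- sum((z - ave(z, g))^2)
  w <- (N - k) / (k - 1) * between / within
  c(W = w, p.value = pf(w, k - 1, N - k, lower.tail = FALSE))
}

# Hypothetical example: three groups of ten standard normal observations
set.seed(1)
y <- rnorm(30)
g <- rep(c("A", "B", "C"), each = 10)
levene_le <- levene_w(y, g, "absolute")
levene_ls <- levene_w(y, g, "square")
```

Because W is the F ratio of a one-way ANOVA on the $Z_{ij}$, the same value can be obtained by fitting `lm(z ~ g)` on the deviations and reading the F value from `anova()`.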

Brown and Forsythe (1974) extended Levene’s test to use either the median (BF) or the trimmed mean (LT) in addition to the mean. The corresponding deviations are:

$$Z_{ij} = |Y_{ij} - \tilde{Y}_{i.}|,$$

where: $\tilde{Y}_{i.}$ is the median of the ith group, and

$$Z_{ij} = |Y_{ij} - \bar{Y}_{i.}^{p}|,$$

where: $\bar{Y}_{i.}^{p}$ is the p% trimmed mean of the ith group.

The Bartlett’s test (Snedecor and Cochran, 1983) is used to test if k groups have equal variances and is sensitive to departures from normality. The test is defined as:

$$\chi^2_{\text{Bartlett}} = \frac{(N-k)\ln s_p^2 - \sum_{i=1}^{k}(n_i - 1)\ln s_i^2}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{n_i - 1} - \frac{1}{N-k}\right)},$$

where: k is the number of groups, $s_i^2$ is the variance of the ith group, $n_i$ is the number of samples of the ith group, N is the total number of observations and $s_p^2 = \left(\sum_{i=1}^{k}(n_i - 1)s_i^2\right)/(N-k)$ is the pooled variance.

Bartlett’s test rejects the hypothesis that the variances are equal if the computed $\chi^2_{\text{Bartlett}}$ is greater than the critical value of the chi-square distribution with k – 1 degrees of freedom at a significance level of α.
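To make the terms of the Bartlett statistic concrete, the following R sketch evaluates the formula step by step for unbalanced data. The function name `bartlett_chi2` and the simulated data are hypothetical; base R also provides `bartlett.test()` in the stats package, which can be used to check the hand computation.

```r
# Bartlett's chi-square statistic computed from its definition
bartlett_chi2 <- function(y, g) {
  g <- factor(g)
  k <- nlevels(g)
  N <- length(y)
  ni <- tapply(y, g, length)
  s2 <- tapply(y, g, var)
  sp2 <- sum((ni - 1) * s2) / (N - k)                       # pooled variance
  num <- (N - k) * log(sp2) - sum((ni - 1) * log(s2))
  den <- 1 + (sum(1 / (ni - 1)) - 1 / (N - k)) / (3 * (k - 1))
  stat <- num / den
  c(statistic = stat,
    p.value = pchisq(stat, df = k - 1, lower.tail = FALSE))
}

# Hypothetical unbalanced example: group sizes 6, 9 and 12
set.seed(2)
g <- factor(rep(c("A", "B", "C"), times = c(6, 9, 12)))
y <- rnorm(length(g))
bartlett_manual <- bartlett_chi2(y, g)
bartlett_builtin <- bartlett.test(y, g)   # reference value from base R
```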

O’Brien’s test (O’Brien, 1979; 1981) is used to test for homoscedasticity by carrying out an analysis of variance on the transformed observations. The observations are transformed to:

$$u_{ij} = \frac{n_i (n_i - 1.5)(Y_{ij} - \bar{Y}_{i.})^2 - 0.5\, s_i^2 (n_i - 1)}{(n_i - 1)(n_i - 2)},$$

where: $\bar{Y}_{i.}$ is the mean of the ith group, $n_i$ is the number of samples of the ith group and $s_i^2$ is the variance of the ith group.

O’Brien’s test rejects the hypothesis that the variances are equal if the computed F ($F_{\text{O'Brien}}$) of the analysis of variance on the transformed observations is greater than the upper critical value of the F distribution with k – 1 and N – k degrees of freedom at a significance level of α.
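A useful way to read the transformation is that the mean of the $u_{ij}$ within each group equals that group's sample variance $s_i^2$, so the ANOVA on $u_{ij}$ compares variances through means. The R sketch below illustrates this under stated assumptions: the function name `obrien_transform` and the simulated data are hypothetical and the code is not the study's implementation.

```r
# O'Brien transformation of the observations, following the formula in the text
obrien_transform <- function(y, g) {
  g <- factor(g)
  ni <- ave(y, g, FUN = length)   # group size repeated for each observation
  mi <- ave(y, g, FUN = mean)     # group mean repeated for each observation
  s2 <- ave(y, g, FUN = var)      # group variance repeated for each observation
  (ni * (ni - 1.5) * (y - mi)^2 - 0.5 * s2 * (ni - 1)) / ((ni - 1) * (ni - 2))
}

# Hypothetical example with unequal group sizes
set.seed(3)
g <- factor(rep(c("A", "B", "C"), times = c(8, 10, 12)))
y <- rnorm(length(g), sd = rep(c(1, 2, 3), times = c(8, 10, 12)))
u <- obrien_transform(y, g)

group_mean_u <- tapply(u, g, mean)   # equals the group variances
group_var_y <- tapply(y, g, var)

# One-way ANOVA on the transformed values gives the O'Brien F statistic
f_obrien <- anova(lm(u ~ g))[["F value"]][1]
```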

The Z-variance test (Overall and Woodward, 1974) is used to test if k groups have equal variances. The test statistic is defined as:

$$F = \frac{\sum_{i=1}^{k} Z_i^2}{k-1}.$$

$Z_i$ is a transformation of the group variance:

$$Z_i = \sqrt{\frac{c\,(n_i - 1)\, s_i^2}{MS_{error}}} - \sqrt{c\,(n_i - 1) - \frac{c}{2}},$$

where: $c = 2 + 1/n_i$, $s_i^2$ is an unbiased estimate of variance for the ith group, $n_i$ is the number of samples of the ith group and $MS_{error}$ is the pooled within-group error variance.

The Z-variance test rejects the hypothesis that the variances are equal if the computed F-ratio is greater than the upper critical value of the F distribution with k − 1 and ∞ degrees of freedom at a significance level of α.

The Bhandary and Dai’s test (Bhandary and Dai, 2008; 2013) is employed to examine whether k groups have equal variances using a Bonferroni-type adjustment procedure. The algorithm of the test is given as follows:

Algorithm 1

The steps for computing the Bhandary and Dai’s test

Step 1: Initially, the variance ($s_i^2$) of the ith group (i = 1, . . . , k) and the pooled variance of the remaining groups ($s_{p,i}^2$) are calculated. The degrees of freedom of the pooled variance are determined as $r_i = \sum_{j=1,\, j \neq i}^{k} (n_j - 1)$, where $n_i$ is the number of samples of the ith group. The F-test statistics are as follows: $F_{i,calc.} = s_i^2 / s_{p,i}^2$ and $\bar{F}_{i,calc.} = s_{p,i}^2 / s_i^2$.
Step 2: The P-values are defined as the right-tail probabilities of the statistics calculated in Step 1, such that $P_i = P(X > F_{i,calc.})$, where $X \sim F_{n_i - 1,\, r_i}$, and $\bar{P}_i = P(X > \bar{F}_{i,calc.})$, where $X \sim F_{r_i,\, n_i - 1}$.
Step 3: The 2k P-values obtained in Step 2 are sorted in ascending order and denoted by $P_{(1)}, P_{(2)}, \ldots, P_{(2k)}$. $H_0$ is rejected if $P_{(i)} < \alpha/(2k)$ for some i, which is equivalent to the smallest P-value $P_{(1)}$ falling below $\alpha/(2k)$.
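The three steps of Algorithm 1 can be sketched in R as follows. This is an illustrative reading of the algorithm as stated above, not the implementation from the homnormal package; the function name `bhandary_dai` and the simulated data are hypothetical.

```r
# Sketch of the Bhandary-Dai procedure (Steps 1 to 3 of Algorithm 1)
bhandary_dai <- function(y, g, alpha = 0.05) {
  g <- factor(g)
  k <- nlevels(g)
  ni <- tapply(y, g, length)
  s2 <- tapply(y, g, var)
  p <- numeric(2 * k)
  for (i in seq_len(k)) {
    # Step 1: pooled variance of the remaining groups and its d.f.
    r_i <- sum(ni[-i] - 1)
    sp2_i <- sum((ni[-i] - 1) * s2[-i]) / r_i
    f_i <- s2[i] / sp2_i
    # Step 2: right-tail P-values of F_i and of its reciprocal
    p[2 * i - 1] <- pf(f_i, ni[i] - 1, r_i, lower.tail = FALSE)
    p[2 * i] <- pf(1 / f_i, r_i, ni[i] - 1, lower.tail = FALSE)
  }
  # Step 3: sort and compare the smallest P-value with alpha / (2k)
  p_sorted <- sort(p)
  list(p.sorted = p_sorted, reject = p_sorted[1] < alpha / (2 * k))
}

# Hypothetical example: the third group has a much larger variance
set.seed(4)
g <- factor(rep(c("A", "B", "C"), each = 20))
y <- c(rnorm(20), rnorm(20), rnorm(20, sd = 10))
bd_result <- bhandary_dai(y, g)
```

For each group the two P-values in Step 2 sum to one, so only the smaller of each pair can fall below the adjusted level.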

The Fligner-Killeen test (Fligner and Killeen, 1976) is a non-parametric test for homoscedasticity based on ranks. The test jointly ranks the absolute values $|Y_{ij} - \tilde{Y}_j|$, where $\tilde{Y}_j$ denotes the median of the jth group, and assigns increasing scores $a_{N,i} = \Phi^{-1}\left(\frac{1 + i/(N+1)}{2}\right)$, based on the ranks of all observations. The test statistic is defined as:

$$\chi^2_{FK} = \frac{\sum_{j=1}^{k} n_j (\bar{A}_j - \bar{a})^2}{V^2},$$

where: $N = \sum_{j=1}^{k} n_j$, $\bar{A}_j$ is the mean score in the jth group, $\bar{a} = (1/N)\sum_{i=1}^{N} a_{N,i}$ is the overall mean score and $V^2 = (1/(N-1))\sum_{i=1}^{N}(a_{N,i} - \bar{a})^2$.

The Fligner-Killeen test rejects the hypothesis that the k variances are equal if the computed $\chi^2_{FK}$ is greater than the upper critical value of the $\chi^2$ distribution with k – 1 degrees of freedom at a significance level of α.
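The scoring and the statistic can be traced in a few lines of R. The sketch below assumes median centring of each group, which is also what `fligner.test()` in base R uses; the function name `fligner_killeen` and the simulated data are hypothetical.

```r
# Fligner-Killeen statistic computed from its definition (median-centred)
fligner_killeen <- function(y, g) {
  g <- factor(g)
  N <- length(y)
  k <- nlevels(g)
  d <- abs(y - ave(y, g, FUN = median))      # absolute deviations from group medians
  a <- qnorm((1 + rank(d) / (N + 1)) / 2)    # normal scores a_{N,i} of the joint ranks
  nj <- tapply(a, g, length)
  Aj <- tapply(a, g, mean)                   # mean score per group
  a_bar <- mean(a)
  V2 <- sum((a - a_bar)^2) / (N - 1)
  stat <- sum(nj * (Aj - a_bar)^2) / V2
  c(statistic = stat,
    p.value = pchisq(stat, df = k - 1, lower.tail = FALSE))
}

# Hypothetical example with unequal spreads
set.seed(5)
g <- factor(rep(c("A", "B", "C"), each = 15))
y <- rnorm(45, sd = rep(c(1, 1, 3), each = 15))
fk_manual <- fligner_killeen(y, g)
fk_builtin <- fligner.test(y, g)   # reference value from base R
```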

Conover’s Squared Ranks test (Conover and Iman, 1981) is a non-parametric test for homoscedasticity based on ranks. The test statistic is defined as:

$$T = \frac{\sum_{i=1}^{k} (S_i^2 / n_i) - N\bar{S}^2}{\frac{1}{N-1}\left(\sum_{i=1}^{k}\sum_{j=1}^{n_i} r_{ij}^4 - N\bar{S}^2\right)},$$

where: $r_{ij} = \text{rank}(|Y_{ij} - \bar{Y}_{i.}|)$ is the rank of the absolute deviation of the jth observation from the mean of the ith group, $S_i$ is the sum of squared ranks in group i, $\bar{S} = (1/N)\sum_{i=1}^{k} S_i$ and $n_i$ is the number of samples of the ith group.

Conover’s Squared Ranks test rejects the hypothesis that the k variances are equal if the computed T is greater than the upper critical value of the $\chi^2$ distribution with k – 1 degrees of freedom at a significance level of α.

Simulations

Since a theoretical comparison is not feasible, a simulation procedure was used to evaluate these homoscedasticity tests. Initially, 10,000 simulations were carried out in which different numbers of groups (k = 3, 6, 9, 12, 15) with different numbers of samples per group (n = 5, 10, 15, 20, 30, 40, 50) and equal variances were generated from the standard normal, Student’s t, chi-square, and skewed normal distributions. Moreover, simulations were carried out in which the above combinations of number of groups (k) and number of samples per group (n), but with unequal numbers of samples, were generated from the standard normal distribution. The simulations with unequal numbers of samples were conducted in the following scenarios: a) one group has half or double the number of samples of the other groups (for example k = 3, 0.5n : n : n or 2n : n : n), b) two-thirds of the groups have double the number of samples of the remaining one-third of the groups (for example k = 3, n : 2n : 2n), and c) the number of samples progressively increases (each one-third of the groups has n, 2n, and 3n samples, respectively). Then, the empirical probability of type I error, which is defined as the number of times the null hypothesis is rejected divided by the total number of simulations (10,000), was evaluated, and the results of the simulations are presented as percentages in Tables B.1–B.11.
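The empirical type I error described above can be estimated with a short Monte Carlo loop. The R sketch below covers a single configuration (k = 3, n = 10, standard normal data) and uses Bartlett's test only as an example; the number of replications is reduced from the study's 10,000 to keep the sketch fast, and all object names are hypothetical.

```r
set.seed(123)
n_sim <- 2000    # assumption: fewer replications than the 10,000 used in the study
k <- 3           # number of groups
n <- 10          # samples per group
alpha <- 0.05
g <- factor(rep(seq_len(k), each = n))

# Each replicate draws all groups from N(0, 1), so H0 (equal variances) is true
reject <- replicate(n_sim, {
  y <- rnorm(k * n)
  bartlett.test(y, g)$p.value < alpha
})

# Empirical type I error: proportion of replicates in which H0 was rejected
type1_error <- mean(reject)
```

Replacing `rnorm()` with another generator, or giving the groups different sizes, reproduces the other equal-variance scenarios in the same way.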

To assess the robustness of type I error, further investigation was performed using Bradley’s criteria (Bradley, 1978). Bradley (1978, 1980) specified three criteria of robustness: stringent, moderate, and liberal. Under the stringent criterion, the type I error of a robust test should lie in the range α ± 0.1α; under the moderate criterion, in the range α ± 0.2α; and under the liberal criterion, in the range α ± 0.5α. This study categorizes tests based on their adherence to Bradley’s liberal criterion. Tests falling outside the limits of the liberal criterion are deemed non-robust (for example < 0.025 or > 0.075 for α = 0.05), while those falling between the moderate and liberal limits are considered sufficiently robust (for example α = 0.05, 0.025–0.04 and 0.06–0.075). Finally, tests within the moderate limits (for example α = 0.05, 0.04–0.06) are considered robust.
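The three-way classification used in this study can be written as a small helper. The R sketch below is illustrative only; the function name `bradley_class` and the example error rates are hypothetical.

```r
# Classify empirical type I error rates following the categories in the text:
# within alpha +/- 0.2*alpha -> "robust"; within alpha +/- 0.5*alpha ->
# "sufficiently robust"; otherwise "non-robust"
bradley_class <- function(p_hat, alpha = 0.05) {
  dev <- abs(p_hat - alpha)
  ifelse(dev <= 0.2 * alpha, "robust",
         ifelse(dev <= 0.5 * alpha, "sufficiently robust", "non-robust"))
}

# Hypothetical empirical error rates at alpha = 0.05
classes <- bradley_class(c(0.052, 0.030, 0.090))
```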

In order to compare the power of the tests for homoscedasticity, a simulation process was carried out in which the above combinations of number of groups (k) and number of samples (n) with unequal variances were generated from the standard normal distribution. The simulations with unequal variances were conducted in the following scenarios: a) one of the groups has two, three, four or eight times the variance of the other groups (for example k = 3, variance ratio: 1 : 1 : 2, 1 : 1 : 3, 1 : 1 : 4 and 1 : 1 : 8), b) two-thirds of the groups have double or triple variance compared to the variance of the remaining one-third of the groups (for example k = 3, variance ratio: 1 : 2 : 2 or 1 : 3 : 3), and c) the group variances progressively increased (the ratio of the variances for each one-third of the groups is 1 : 2 : 3). The empirical power of the test, which is calculated as the ratio of the number of times the null hypothesis is rejected over 10,000 (the number of simulations) when the alternative hypothesis of non-homogeneity is true, was evaluated and the results are presented as percentages in Tables B.12–B.18.

In addition, empirical data from a wheat evaluation experiment, in which normality was violated and the dispersions of the residuals were unequal, were used to confirm the results of the study.

Tukey’s contaminated normal model (Tukey, 1960) was used to create the skewed normal distributions. The contaminated distributions were mixtures of two normal distributions with different means and variances, which resulted in skewed distributions of the kind that are common for crop yields (Hennessy, 2009).

$$F(x) = (1 - \varepsilon)\, N(\mu_1, \sigma_1^2) + \varepsilon\, N(\mu_2, \sigma_2^2) \quad \text{and} \quad 0 < \varepsilon < 1.$$
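Sampling from such a two-component mixture is straightforward. The R sketch below draws each observation from the second component with a fixed mixing probability and from the first component otherwise; the function name `r_contaminated` and every parameter value are hypothetical, since the settings used in the study are not stated here.

```r
# Draw n values from (1 - eps) * N(mu1, sd1^2) + eps * N(mu2, sd2^2)
r_contaminated <- function(n, eps, mu1 = 0, sd1 = 1, mu2 = 3, sd2 = 2) {
  from_second <- rbinom(n, size = 1, prob = eps) == 1   # component indicator
  ifelse(from_second, rnorm(n, mu2, sd2), rnorm(n, mu1, sd1))
}

# Hypothetical settings: 10% contamination shifted to the right
set.seed(6)
x <- r_contaminated(1e5, eps = 0.1)

sample_mean <- mean(x)                      # theoretical mean: 0.9 * 0 + 0.1 * 3 = 0.3
third_moment <- mean((x - sample_mean)^3)   # positive for a right-skewed sample
```

Placing the contaminating component below the main one (for example a negative `mu2`) gives a left-skewed distribution instead.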

The simulations were performed with the statistical software R 4.1, using the onewaytests, outliers, PMCMRplus, car, agricolae and homnormal packages. Functions were written for the Z-variance and Conover’s Squared Ranks tests and are presented in the Appendix.

3. Results

Type I error - Equal variance and number of samples

The empirical probability of the type I error for the 0.01, 0.05 and 0.10 levels of significance with equal variances and equal numbers of samples is presented in Tables B.1–B.3 as the percentage of rejected null hypotheses relative to the total number of simulations conducted. All tests performed well under homoscedasticity for all levels of significance, numbers of samples and numbers of groups used, except for the Levene (LE, LS) and Conover’s Squared Ranks tests, which showed an increased probability of type I error for a small number of samples (n = 5) per group. As the number of groups (k = 3, 6, . . . , 15) increases, the probability of committing a type I error in these tests also increases. The Brown-Forsythe and O’Brien tests are considered conservative tests, as they consistently exhibit the lowest probability of committing a type I error. According to Bradley’s criteria, when k = 3 groups are evaluated, the Bhandary-Dai, Cochran’s C, Hartley’s Fmax, and Levene (LS and LT) tests are considered robust for all numbers of samples (n = 5, 10, . . . , 50) per group because the empirical type I error satisfies Bradley’s moderate/stringent criteria (green area, Figure 1). As the number of groups increases, the tests considered to be robust are the Bartlett, Bhandary-Dai, Cochran’s C, Hartley’s Fmax and Levene (LT) tests. The Levene (LE and LS), Brown-Forsythe, O’Brien, Fligner-Killeen and Conover’s Squared Ranks tests are considered non-robust when evaluating more than six groups with a small to moderate number of samples (n = 5–15) because the empirical type I error fails to satisfy Bradley’s liberal criterion (red area, Figures 2 and 3).

Type I error - Equal variance and unequal number of samples

When one group has half or double the number of samples of the other groups, most of the tests performed well under homoscedasticity for all numbers of samples and numbers of groups used (Tables B.4–B.5). When k = 3 and the number of samples is small (n = 5), the Levene (LE, LS) and Conover’s Squared Ranks tests have an increased probability of type I error. Moreover, Levene’s tests (LE and LS) and Cochran’s C show an increased probability of type I error as the number of groups and the number of samples increase. According to Bradley’s criteria, the Levene (LT), Bartlett, Z-variance and Bhandary-Dai tests are considered sufficiently robust or robust for all numbers of groups and numbers of samples, while the Brown-Forsythe, O’Brien, Conover’s Squared Ranks and Fligner-Killeen tests are considered non-robust.

Tables B.6–B.7 summarize the empirical probability of the type I error at a significance level of α = 0.05 for unequal numbers of samples (two-thirds of the groups have double the number of samples of the remaining one-third of the groups, and progressively increasing numbers of samples). The Hartley’s Fmax, Levene (LE and LS) and Cochran’s C tests show an increased probability of type I error when the number of samples is small (n = 5–10). According to Bradley’s criteria, the Levene (LT), Bartlett, Z-variance and Bhandary-Dai tests are considered sufficiently robust or robust for all numbers of groups and numbers of samples, while the Brown-Forsythe, O’Brien, Conover’s Squared Ranks and Fligner-Killeen tests are considered non-robust.

Type I error - Equal variance and non-normality

The empirical probabilities of the type I error for groups with data generated from the t(3) distribution (Student’s t-distribution with 3 degrees of freedom (d.f.)), which is a symmetric distribution with higher kurtosis than the normal distribution, are presented in Table B.8. The Bartlett, Cochran’s C, Hartley’s Fmax, Z-variance and Bhandary-Dai tests showed an increased probability of type I error as the number of groups and the number of samples increase. The Levene (LE) and Conover’s Squared Ranks tests show an increased probability of type I error for a small number of samples and a large number of groups, which decreases as the number of samples increases. According to Bradley’s criteria, the Levene (LT) test is considered sufficiently robust or robust.

The empirical probabilities of the type I error for groups with data generated from the χ2(4) distribution (chi-squared distribution with 4 d.f.), which is an asymmetric distribution, are presented in Table B.9. The Bartlett, Cochran’s C, Hartley’s Fmax, Conover’s Squared Ranks, Z-variance and Bhandary-Dai tests show an increased probability of type I error as the number of groups and the number of samples increase. The Levene (LE, LS and LT) tests show an increased probability of type I error for a small number of samples and a large number of groups, which decreases as the number of samples increases. According to Bradley’s criteria, the Brown-Forsythe and O’Brien tests are considered sufficiently robust or robust.

Under the contaminated normal distributions with positive and negative skewness (Tables B.10–B.11), the Bartlett, Cochran’s C, Hartley’s Fmax, Bhandary-Dai and Z-variance tests showed an increased probability of type I error as the number of groups and the number of samples increase. The Levene (LE) and Conover’s Squared Ranks tests showed an increased probability of type I error for a small number of samples and a large number of groups, which decreases as the number of samples increases. The Levene (LT) test showed an increased probability of type I error for small and moderate numbers of samples. According to Bradley’s criteria, the Levene (LT) test is considered sufficiently robust or robust.

Power of the test - Unequal variance

Tables B.12–B.15 summarize the empirical powers at a significance level of α = 0.05 for unequal variances (one group has two, three, four, or eight times the variance of the other groups). The simulation results demonstrate that the power of the tests increases with the number of samples. In the cases where one group has two, three or four times the variance of the other groups, the Levene (LE) and Bartlett tests have the highest power for a small number of samples (n = 5); the Cochran’s C, Z-variance, Bhandary-Dai, Bartlett and Levene (LS) tests have the highest power for a small to moderate number of samples (n = 10–15); and the Cochran’s C, Z-variance, Bhandary-Dai and Levene (LS) tests have the highest power for a moderate to large number of samples (n = 20–30) (Tables B.12–B.14). In the case where one group has eight times the variance of the other groups (Table B.15), all tests are more powerful than in the previous cases. The Cochran’s C, Z-variance, Bhandary-Dai and Bartlett tests have the highest power for a small number of samples (n = 5–10), while the Brown-Forsythe and O’Brien tests have the lowest power. The non-parametric Fligner-Killeen test performs the poorest among all tests in various scenarios in terms of power. While the Conover test is also one of the weaker tests, it consistently outperforms the Fligner-Killeen test.

Tables B.16–B.18 summarize the empirical powers at a significance level of α = 0.05 for unequal variances (two-thirds of the groups have double or triple the variance of the remaining one-third of the groups, and progressively increasing variances). The Bartlett, Z-variance, and Levene (LE) tests have the highest power, which increases as the number and size of groups increase, while the Levene (LS), Brown-Forsythe and O’Brien tests have the lowest power across all group numbers and sizes.

Comparative evaluation of the tests

Based on the performance of the tests in terms of controlling type I errors across various scenarios, they can be divided into two categories. The first category comprises tests such as Levene (LT), Brown-Forsythe, and O’Brien, which demonstrate appropriateness across all situations. In contrast, the second category consists of the Z-variance, Bhandary-Dai, Bartlett and Levene (LT) tests that are deemed appropriate when dealing with normally distributed data. Combining the above findings along with the outcomes of the power simulations, it appears that the Levene (LT) test stands out as the most well-balanced option, effectively ensuring type I error control while maintaining high statistical power compared to the other tests of the two categories (Figures 4–5).

Experimental data analysis

In this section we present, for illustrative purposes, the analysis of an experiment that took place at the facilities of the Agricultural University of Athens (AUA), Greece. In this experiment, k = 4 treatments (varieties) of durum wheat were evaluated for the yield response variable, according to the randomized complete block (RCB) design. Figure 6 shows the diagnostic tools (QQ plot and residual plot) for normality and homoscedasticity testing. Statistical testing of normality was conducted with the Shapiro-Wilk test, which rejected the null hypothesis of normality (p-value = 0.019). The residual distribution of the data has positive skewness (0.91) and kurtosis (3.55).

The results of the tests for homoscedasticity are presented in Table 1. The tests show conflicting results. The Hartley’s Fmax, Levene (LS), O’Brien, Bartlett, Z-variance and Bhandary-Dai tests show almost similar results by rejecting the hypothesis of the equality of variance, while the rest of the tests did not reject the null hypothesis. Some possible choices for the researcher in this experiment are as follows: a) to perform a non-parametric analysis with the Kruskal-Wallis test and multiple comparisons of rank means with the Nemenyi test; b) to perform analysis of variance and multiple comparisons of means with the Tukey test; and c) to perform analysis with the Welch-F test and multiple comparisons of means with the Games-Howell test. Due to the violation of normality in this case, we recommend that the researcher accept the equality of variances and perform analysis of variance, which is quite robust against violations of the normality assumption, instead of performing a non-parametric analysis that has less statistical power.

4. Discussion and conclusion

This comprehensive study evaluates the performance of ten parametric and two non-parametric tests for homoscedasticity with simulations involving differences in the types of distributions, number of groups, number of samples per group, variance ratio and significance levels. The tests encompassed both widely utilized ones and their variations, along with less familiar yet innovative alternative tests (e.g. the Bhandary-Dai test). All the tests underwent evaluation across numerous and diverse scenarios encountered in the field of biological sciences, as well as in empirical data derived from an agricultural experiment. The various simulations were chosen to cover a wide range of scenarios that could appear in practice and at the same time to demonstrate the capabilities of the tests under investigation. In addition, R code for conducting two tests (the Conover and Z-variance tests) is provided in the Appendix.

The simulation results show that when there is no violation of the assumption of normality and the groups have equal variances and equal or unequal number of samples per group, the Bhandary-Dai, Cochran’s C, Hartley’s Fmax, Levene (LT) and Bartlett tests are considered robust according to Bradley’s criteria. Additionally, although Levene’s test (LE) is included in most statistical software, it should be avoided when the numbers of samples are small and unequal. These findings, mainly regarding the Levene test (LE), support the results of simulations by other researchers (Bhandary and Dai, 2009; Gorbunova and Lemeshko, 2012; Hatchavanich, 2014). When the assumption of normality is violated, the Bartlett, Hartley’s Fmax, Cochran’s C and Z-variance tests are not robust. Under such circumstances, it is advisable to avoid using the Levene test (LE) and instead consider employing the O’Brien and Brown–Forsythe tests. These findings align with the results of prior research (Lee et al., 2010; Sharma and Kibria, 2013).

In situations where the group variances are unequal, the power of all tests is enhanced, and this effect becomes more pronounced with the larger number of samples. The Cochran’s C, Bhandary-Dai, Z-variance, and Levene (LS) tests exhibit higher power when the variance of one group differs from the variances of the remaining groups. On the other hand, the Bartlett, Z-variance, and Levene (LT and LE) tests demonstrate increased power when the variance ratios vary between the groups. These simulation findings are consistent with the results of previous studies (Bhandary and Dai, 2009; Gorbunova and Lemeshko, 2012; Parra-Frutos, 2012; Wang et al., 2017).

The findings confirm that non-parametric tests are less reliable in terms of size (type I error) but more stable in terms of power. The Fligner-Killeen test appears to be unreliable in terms of size at least for small sample sizes (n < 30). At the same time the Conover test appears to be unstable with the size of the test ranging from over the nominal level (for a very small number of samples) to under the nominal level (for a larger number of samples). Both tests fail for non-normal distributions with sizes ranging from 0% to over 10% for the Fligner-Killeen test and as much as over 30% for the Conover test. In terms of power the Fligner-Killeen test is the worst among all tests considered for all scenarios. The Conover test is one of the worst tests but almost always better than the Fligner-Killeen test. According to our findings and for the scenarios considered the two non-parametric tests are not recommended.

In conclusion, agricultural research data often deviate from the assumptions, so researchers must be meticulous in selecting the appropriate statistical analysis. Researchers should first assess normality using the available diagnostic plots and statistical tests and then, based on the number of samples per group, choose the most appropriate test for their data. Based on the findings of the comprehensive simulation study conducted in the preceding section, when dealing with a small number of samples and a suspected non-normal distribution, researchers should consider the Levene (LT), O’Brien, and Brown–Forsythe tests as the preferred options. If the assumption of normality is not violated but diagnostic plots suggest unequal variances between groups, it is advisable to use the Levene (LT), Bhandary-Dai, Bartlett and Z-variance tests, which provide both sufficient type I error control and relatively high power. According to the findings presented above, the Levene (LT) test stands out as the most well-rounded test and should be incorporated into existing statistical software. In most situations, it maintains robustness against type I errors and almost invariably has higher statistical power than the tests of O’Brien (OB) and Brown–Forsythe (BF), which are more conservative. Researchers, depending on their experience, should combine diagnostic plots with tests for homoscedasticity in order to select the appropriate test for group-treatment differences and to obtain reliable results. The conclusions of this study could prove very helpful in the analysis of data from agricultural experiments in the future.
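Since the Levene (LT) test is recommended above but is not always available in statistical software, a minimal base R sketch is given here. It assumes the usual Levene construction: a one-way ANOVA on the absolute deviations of each observation from its group’s trimmed mean. The function name `levene_trimmed`, the trimming fraction of 0.1 and the simulated data are illustrative assumptions and are not taken from the paper.

```r
# Levene-type test centred on the group-wise trimmed mean (sketch).
# Assumptions: trim = 0.1 is an arbitrary illustrative trimming fraction;
# levene_trimmed() is a hypothetical helper name.
levene_trimmed <- function(y, group, trim = 0.1) {
  group <- factor(group)
  # trimmed mean of each group, repeated for every observation in that group
  centre <- ave(y, group, FUN = function(v) mean(v, trim = trim))
  dev <- abs(y - centre)              # absolute deviations from the centre
  tab <- anova(lm(dev ~ group))       # one-way ANOVA on the deviations
  c(F = tab[["F value"]][1], p.value = tab[["Pr(>F)"]][1])
}

# Simulated example: three groups of 20, the third with a larger spread
set.seed(1)
group <- rep(c("A", "B", "C"), each = 20)
y <- c(rnorm(20, sd = 1), rnorm(20, sd = 1), rnorm(20, sd = 4))

res <- levene_trimmed(y, group)
print(res)
```

Setting `trim = 0` in this sketch centres on the ordinary group mean, which corresponds to the absolute-deviation form of the test.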

Appendix A: Functions

Conover square rank test

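conover_square_rank_test=function(trt,Y,data) {
# trt and Y are the grouping factor and response; data must contain the column trt
fit=aov(Y~trt)
# rank the absolute residuals (deviations from the group means)
data$Residual=residuals(fit)
data$Absolute=abs(data$Residual)
data$Rank=rank(data$Absolute)
# group sizes and group sums of squared ranks
ni=tapply(data$Rank^2, data$trt,length)
N=sum(ni)
Si=tapply(data$Rank^2, data$trt,sum)
S=sum(Si)
Ssqn=sum(Si^2/ni)
R4=sum(data$Rank^4)
# chi-square statistic with k-1 degrees of freedom
xsq=(Ssqn-S^2/N)/((R4-S^2/N)/(N-1))
k=length(unique(data$trt))
df=k-1
p.value=pchisq(xsq, df, lower.tail = FALSE)
print(p.value)
}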

Z-variance test

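Z_variance_test=function(trt,Y,data) {
# data must contain the response column Y and the grouping column trt
fit=aov(Y~trt,data)
# residual mean square from the one-way ANOVA table
MSE=summary(fit)[[1]][2,3]
ni=tapply(data$Y, data$trt,length)
s2i=tapply(data$Y, data$trt,var)
k=length(unique(data$trt))
ci=2+1/ni
Zi=sqrt((ci*(ni-1)*s2i)/MSE)-sqrt(ci*(ni-1)-ci/2)
F_stat=sum(Zi^2)/(k-1)
df1=k-1
df2=Inf
# upper-tail probability: a large statistic indicates unequal variances
p.value=pf(F_stat, df1, df2, lower.tail = FALSE)
print(p.value)
}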
Appendix B

Table B.1

The empirical probability of type I error for equal variances and equal numbers of samples (α = 0.01). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, var. ratio 1 : 1 : 1, N(0, 1), α = 0.01
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 0.8 1.4 1.3 0.5 0.6 0.0 0.1 1.0 0.1 0.6 0.0 0.9
10 0.4 0.6 1.0 0.7 1.1 1.0 0.4 1.6 0.4 0.8 0.3 1.1
15 0.6 0.8 0.8 0.8 0.8 0.2 0.2 1.0 0.4 0.9 0.3 0.5
20 0.9 0.6 0.9 0.3 1.0 0.7 1.3 0.6 0.3 0.8 0.4 0.5
30 0.8 0.7 1.0 1.6 1.1 0.7 0.6 0.9 0.3 1.1 0.4 1.4
50 1.1 0.9 0.9 0.5 0.9 0.7 0.4 0.7 0.6 1.0 0.4 1.0
k = 6, var. ratio (5x)1 : 1, N(0, 1), α = 0.01
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 1.1 1.2 2.9 3.2 1.1 0.0 0.1 1.1 0.1 1.3 0.0 0.9
10 1.1 0.5 1.6 1.1 0.8 0.6 0.5 0.8 0.3 0.6 0.3 0.7
15 0.3 0.7 1.1 1.1 0.9 0.3 0.3 0.9 1.3 0.6 0.2 1.0
20 0.6 1.5 1.3 0.8 1.4 0.5 0.5 0.7 0.3 1.1 0.7 1.3
30 0.4 1.3 0.9 1.2 0.9 0.1 0.5 0.5 0.6 0.9 0.8 1.3
50 0.8 0.9 0.7 1.0 0.5 0.6 0.7 0.5 1.1 1.0 0.9 1.1
k = 9, var. ratio (8x)1 : 1, N(0, 1), α = 0.01
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 0.9 0.6 3.3 3.9 0.9 0.0 0.0 1.1 0.3 1.0 0.0 1.6
10 1.0 1.2 2.0 1.9 1.5 0.6 0.4 0.9 0.8 0.8 0.4 0.9
15 1.3 1.2 2.2 0.7 0.9 0.4 0.2 1.0 0.5 0.7 0.1 1.1
20 1.2 1.0 1.6 1.1 0.5 0.6 0.5 0.8 0.6 1.3 0.6 1.5
30 0.6 0.9 0.8 0.9 0.9 0.8 0.6 1.6 1.1 1.4 1.0 0.8
50 0.6 0.5 0.8 0.7 1.4 0.8 0.3 1.5 0.6 1.3 0.8 0.7
k = 12, var. ratio (11x)1 : 1, N(0, 1), α = 0.01
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 1.2 1.2 4.7 3.2 0.7 0.0 0.0 0.9 0.6 0.7 0.0 1.7
10 1.1 1.1 2.3 1.7 1.2 0.2 0.6 1.7 0.7 0.9 0.1 1.2
15 1.3 0.8 1.5 1.7 0.9 0.1 0.1 1.3 0.8 0.6 0.2 1.4
20 0.8 0.5 1.2 1.3 1.1 0.2 0.4 1.3 1.0 1.1 0.4 1.2
30 0.8 1.0 1.0 0.9 0.8 1.0 0.7 1.3 0.8 1.1 0.5 1.1
50 1.8 0.9 1.3 1.1 1.0 0.9 0.4 1.3 0.6 0.8 0.4 0.5
k = 15, var. ratio (14x)1 : 1, N(0, 1), α = 0.01
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 0.9 1.0 4.8 5.3 0.8 0.0 0.0 1.1 0.1 0.7 0.0 1.5
10 0.8 1.3 2.5 2.2 1.5 0.2 0.2 0.8 0.5 1.2 0.0 1.4
15 1.2 1.4 1.2 1.6 1.1 0.2 0.0 1.3 0.6 0.9 0.1 1.2
20 1.7 1.1 1.3 1.5 0.9 0.4 0.4 1.0 0.7 1.1 0.2 0.9
30 1.2 0.6 1.5 1.9 0.5 0.2 0.3 1.2 0.7 1.0 0.8 1.4
50 1.4 0.6 0.8 1.9 0.5 0.7 0.9 1.1 0.5 1.3 0.9 1.3

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.
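To make concrete how a cell of such a table is obtained, the following R sketch estimates the empirical type I error of a single test by Monte Carlo simulation. It is only an illustration under stated assumptions: base R’s `bartlett.test()` stands in for the Bartlett test, `n_sim = 1000` is a hypothetical number of replications, and `type1_error` is a hypothetical helper name.

```r
# Monte Carlo estimate of the empirical type I error, in percent (sketch).
# Assumptions: n_sim = 1000 is an illustrative choice and bartlett.test()
# from base R stands in for the Bartlett (BA) column.
type1_error <- function(k, n, alpha, n_sim = 1000) {
  group <- factor(rep(seq_len(k), each = n))
  rejections <- replicate(n_sim, {
    y <- rnorm(k * n)                 # N(0, 1) in every group: H0 is true
    bartlett.test(y, group)$p.value < alpha
  })
  100 * mean(rejections)              # percentage of rejected null hypotheses
}

set.seed(2024)
est <- type1_error(k = 3, n = 10, alpha = 0.05)
print(est)
```

With a well-calibrated test the estimate should fall near the nominal level, up to Monte Carlo error of roughly 0.7 percentage points at 1000 replications and α = 0.05.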

Table B.2

The empirical probability of type I error for equal variances and equal numbers of samples (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, var. ratio 1 : 1 : 1, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 4.6 5.2 9.0 4.1 4.9 0.3 0.6 4.9 0.0 4.3 0.1 7.7
10 5.2 4.4 6.0 4.3 5.0 3.8 3.8 3.7 2.6 5.1 2.2 5.6
15 4.8 5.3 7.2 4.3 5.1 3.7 3.4 5.3 2.7 4.3 2.1 4.7
20 4.8 4.0 5.8 5.1 4.5 4.3 3.3 5.4 3.4 5.1 3.3 6.0
30 4.1 4.7 5.0 5.7 4.8 3.6 4.2 5.2 3.7 4.9 4.6 3.2
50 5.3 5.0 4.4 5.8 4.4 4.1 3.6 6.1 5.0 5.4 4.6 5.6
k = 6, var. ratio (5x)1 : 1, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 3.9 4.3 10.9 9.2 6.3 0.4 0.3 3.3 2.5 4.1 0.2 6.6
10 6.0 4.7 6.6 8.4 5.6 2.7 2.1 3.6 3.0 3.2 2.2 5.6
15 5.6 5.6 6.7 6.7 5.0 2.9 2.1 4.2 3.9 5.3 2.5 6.0
20 4.3 5.4 6.7 5.1 5.1 4.3 3.9 5.2 4.0 6.9 3.4 4.9
30 3.8 4.6 5.6 5.0 5.3 4.1 3.5 5.3 4.4 4.7 4.0 4.5
50 4.4 4.3 5.2 4.5 5.5 3.7 4.5 4.0 3.5 4.3 3.3 6.0
k = 9, var. ratio (8x)1 : 1, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.4 5.6 10.9 10.1 5.9 0.2 0.2 3.8 2.9 5.0 0.1 9.4
10 4.8 4.1 8.6 7.8 5.8 2.3 2.8 5.0 3.0 5.1 2.3 6.8
15 5.4 4.4 7.9 8.1 5.1 1.7 2.2 5.8 5.2 5.8 1.7 5.9
20 3.6 4.7 5.0 4.8 4.0 2.3 2.9 5.6 4.6 4.7 2.5 5.3
30 5.7 4.4 5.7 5.4 3.4 3.9 2.5 5.5 4.2 5.3 2.4 5.6
50 4.3 5.0 6.4 5.2 4.5 3.4 3.3 4.6 4.0 4.9 3.9 4.9
k = 12, var. ratio (11x)1 : 1, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.2 4.7 12.9 12.0 6.0 0.1 0.1 5.0 2.1 5.3 0.1 8.5
10 5.7 4.9 6.7 8.5 5.4 2.3 2.4 5.6 3.8 5.1 2.0 7.3
15 5.1 5.4 6.3 5.6 5.9 2.1 1.7 4.6 3.7 6.1 1.1 6.7
20 5.1 5.4 7.3 5.4 4.7 3.0 2.9 5.5 5.2 4.4 2.3 5.5
30 4.7 5.6 5.3 5.4 5.7 3.0 3.3 4.9 5.3 5.2 2.4 4.3
50 4.9 4.6 4.3 4.9 4.5 3.7 4.4 4.8 5.1 5.0 2.9 4.3
k = 15, var. ratio (14x)1 : 1, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.3 4.0 13.5 14.1 4.6 0.1 0.1 4.9 2.7 5.2 0.1 10.3
10 5.1 4.8 10.5 8.0 5.6 2.0 2.6 4.6 3.8 5.2 1.6 6.5
15 3.9 4.9 6.9 7.6 5.9 1.7 1.7 5.3 4.0 4.9 1.3 6.1
20 4.7 5.7 6.8 6.3 4.9 3.3 2.1 4.4 4.3 4.2 2.4 5.3
30 5.1 4.1 5.8 6.8 4.4 2.8 2.6 5.3 4.8 5.0 3.3 5.2
50 5.1 5.2 5.8 6.4 3.5 4.5 4.0 4.5 5.6 4.2 3.4 4.9

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.3

The empirical probability of type I error for equal variances and equal numbers of samples (α = 0.10). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, var. ratio 1 : 1 : 1, N(0, 1), α = 0.10
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 9.7 10.7 15.1 12.3 10.6 1.5 2.0 8.9 5.1 8.9 0.7 15.4
10 11.0 10.4 12.2 10.4 9.0 6.8 7.4 10.5 6.9 9.2 7.1 11.5
15 10.4 10.6 12.6 11.1 10.1 7.9 7.6 10.3 9.0 8.2 6.8 11.9
20 9.5 10.5 10.6 11.1 10.5 8.4 9.6 10.3 9.1 8.6 8.9 9.7
30 9.9 11.1 10.1 9.6 9.1 8.3 8.2 10.2 8.1 8.4 9.0 8.7
50 11.2 10.1 11.2 10.7 11.3 7.9 9.4 11.4 10.9 8.5 8.3 10.6
k = 6, var. ratio (5x)1 : 1, N(0, 1), α = 0.10
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.2 9.6 18.5 15.4 11.0 0.7 1.0 10.5 5.6 10.3 0.0 14.7
10 9.6 10.8 13.3 11.6 10.0 5.8 5.8 10.2 8.1 9.1 6.7 9.0
15 10.9 10.1 10.2 13.0 9.4 6.2 6.0 8.7 9.8 8.9 5.2 12.1
20 10.4 9.4 8.8 12.0 9.4 7.2 7.7 9.1 10.5 9.1 6.2 12.7
30 11.4 12.4 11.7 10.4 9.5 7.1 7.3 9.6 9.7 10.8 7.8 8.6
50 10.4 8.8 11.7 10.0 11.0 10.9 9.2 9.6 11.7 9.9 8.9 11.5
k = 9, var. ratio (8x)1 : 1, N(0, 1), α = 0.10
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.7 10.4 21.7 18.2 11.1 1.1 0.1 8.6 5.7 11.0 0.2 17.8
10 9.7 12.2 15.4 12.5 9.6 5.7 6.4 9.8 7.9 11.4 5.4 13.0
15 10.0 12.0 12.4 10.5 12.1 4.1 6.0 9.5 10.2 8.1 4.9 14.0
20 8.6 10.8 12.1 12.7 9.4 6.4 6.6 9.5 10.5 10.3 6.8 11.7
30 10.2 9.6 11.2 10.0 9.6 8.1 7.5 8.8 9.7 10.0 7.2 10.1
50 10.7 10.1 11.4 9.0 9.7 6.5 9.3 9.8 10.0 7.4 7.3 9.9
k = 12, var. ratio (11x)1 : 1, N(0, 1), α = 0.10
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.4 11.3 25.2 22.6 9.7 0.4 0.3 9.4 8.5 10.2 0.0 17.6
10 10.5 11.8 16.8 13.9 8.6 5.0 5.3 10.1 8.1 10.5 5.0 11.7
15 9.3 9.9 13.5 13.7 9.7 4.2 4.1 10.4 8.7 9.4 4.0 11.7
20 10.2 9.2 12.7 10.6 10.0 5.2 6.4 9.8 7.4 9.2 6.7 12.2
30 10.8 9.1 10.6 11.9 10.1 7.8 6.7 10.0 9.3 11.5 6.5 12.2
50 11.4 9.6 9.8 10.2 10.1 9.0 8.4 11.0 9.8 10.3 7.9 9.9
k = 15, var. ratio (14x)1 : 1, N(0, 1), α = 0.10
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 8.1 9.3 24.5 21.6 10.3 0.2 0.3 11.2 5.6 10.5 0.0 20.2
10 9.7 9.5 16.1 14.9 10.4 4.5 6.6 10.1 11.2 9.1 3.6 14.8
15 10.7 8.7 13.2 13.8 9.5 5.3 3.8 9.4 8.9 9.6 2.6 12.0
20 10.6 9.5 11.8 11.2 10.8 6.8 8.2 10.1 8.6 10.6 4.4 11.5
30 11.1 11.8 12.2 11.6 9.6 8.2 6.4 9.8 10.1 11.4 6.6 11.1
50 11.1 11.1 10.6 11.2 8.3 8.1 8.4 11.0 11.2 9.1 8.6 10.9

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.4

The empirical probability of type I error for equal variances and unequal numbers of samples (2n : n : n) (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, 2n : n : n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.5 6.1 7.7 4.1 4.7 2.2 1.9 4.8 4.4 4.7 1.6 6.8
10 5.5 7.5 6.3 5.1 4.3 4.1 4.2 4.5 5.0 3.8 4.6 5.8
15 6.4 5.4 6.0 5.2 5.4 3.2 3.0 5.2 5.5 4.7 2.7 5.5
20 5.5 5.7 4.5 6.2 5.6 3.7 4.5 5.3 4.1 4.0 4.0 5.0
30 6.4 5.3 6.3 3.7 5.9 4.2 4.4 6.5 5.1 4.5 4.3 5.9
50 6.6 6.6 5.4 4.7 4.8 4.6 5.0 4.0 5.4 3.5 4.0 4.5
k = 6, 2n : (5x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.0 6.0 8.2 7.7 6.4 0.7 0.3 4.9 4.2 5.3 0.4 7.7
10 5.0 5.6 5.6 6.7 4.9 3.0 3.0 5.0 4.4 4.1 3.7 5.0
15 6.7 5.7 5.9 6.9 4.5 2.7 2.1 6.4 5.1 4.2 2.6 6.7
20 6.3 6.5 5.9 5.2 4.8 3.1 3.0 5.5 4.7 5.2 3.5 5.7
30 4.9 6.8 5.5 5.9 5.2 2.6 4.5 6.0 5.7 4.6 3.3 6.3
50 4.5 8.1 4.6 4.9 5.1 3.9 3.3 4.6 5.6 3.8 4.3 5.2
k = 9, 2n : (8x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.7 6.0 9.9 8.4 5.9 0.3 0.4 4.2 3.6 5.2 0.2 8.6
10 5.8 6.5 7.7 6.4 4.2 2.6 2.1 4.2 3.8 4.7 2.7 6.0
15 5.9 6.5 7.0 5.8 5.5 2.4 1.9 4.9 4.1 5.6 2.8 7.1
20 4.5 6.7 6.5 5.8 4.0 3.8 3.2 5.8 5.7 4.9 3.0 6.3
30 5.3 4.8 5.0 5.6 5.6 3.6 4.2 4.7 4.2 4.9 3.4 4.1
50 5.9 6.8 6.5 6.0 4.2 4.4 4.3 4.1 3.4 6.4 4.1 4.6
k = 12, 2n : (11x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 4.4 5.4 13.2 10.9 7.5 0.4 0.2 5.8 3.0 5.4 0.1 10.0
10 6.4 5.8 7.6 7.8 4.6 2.3 2.2 5.5 5.1 6.1 2.6 6.8
15 5.5 5.9 7.3 5.2 6.6 2.0 2.7 5.0 4.3 5.7 1.8 7.7
20 4.8 5.9 5.8 6.6 4.9 2.8 2.8 5.0 5.2 5.7 3.1 5.9
30 6.8 5.7 5.9 5.6 4.0 3.1 4.3 5.0 5.2 3.6 3.0 5.4
50 4.1 5.7 4.9 5.6 4.3 4.2 4.0 3.8 4.2 4.6 3.0 4.9
k = 15, 2n : (14x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.0 5.2 14.1 12.5 5.8 0.2 0.0 4.4 4.1 3.5 0.0 9.9
10 6.0 7.1 8.1 7.7 3.7 2.4 2.1 5.6 5.0 3.5 2.2 6.6
15 5.8 5.5 6.9 6.8 4.9 1.3 1.3 4.9 4.9 4.6 1.2 6.8
20 5.7 5.6 6.1 6.1 5.7 2.1 3.3 5.4 4.1 4.5 3.0 5.1
30 5.1 7.3 3.9 6.2 5.9 3.7 2.9 3.6 4.4 4.9 3.3 5.9
50 5.3 4.8 5.2 5.6 4.8 5.4 4.2 4.5 6.6 6.5 4.5 5.7

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.5

The empirical probability of type I error for equal variances and unequal numbers of samples (0.5n : n : n) (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, 0.5n : n : n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
6 5.9 8.4 9.2 5.6 5.4 2.5 2.3 4.6 0.0 4.7 2.6 6.1
10 4.7 7.2 6.9 3.6 5.6 2.3 2.8 4.6 1.6 4.5 2.2 6.0
14 6.3 7.0 6.1 4.8 3.8 2.7 4.1 5.1 3.3 5.4 4.1 5.8
20 5.9 8.1 7.6 4.0 4.3 2.6 4.4 5.7 3.3 5.6 3.3 5.7
30 5.4 7.0 5.3 4.5 4.8 3.8 4.7 4.0 4.6 3.8 3.2 4.8
50 6.0 5.2 4.8 4.6 4.5 5.4 4.8 3.7 5.5 4.9 3.2 4.8
k = 6, 0.5n : (5x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
6 5.8 7.8 10.6 8.5 6.5 1.5 1.9 4.2 2.0 4.5 2.3 7.9
10 4.6 6.0 6.4 4.6 5.7 3.0 2.7 5.7 4.5 4.3 1.5 6.5
14 6.3 6.8 7.1 6.1 4.5 3.3 2.9 5.3 4.2 4.0 2.0 5.4
20 6.2 5.0 5.9 4.9 3.6 3.2 3.7 4.0 5.0 4.3 2.5 4.1
30 5.6 7.1 4.6 3.4 4.6 3.7 4.6 5.9 4.3 4.7 3.0 4.3
50 5.3 6.5 4.6 5.7 5.6 3.6 5.0 3.7 5.3 5.1 2.5 5.6
k = 9, 0.5n : (8x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
6 5.3 6.6 10.6 8.3 6.0 2.0 1.8 4.7 2.9 5.2 2.0 7.8
10 5.5 6.7 10.8 6.7 3.9 2.1 3.1 4.9 3.5 5.7 2.3 4.8
14 5.4 6.3 5.6 6.4 5.5 2.9 3.0 5.4 4.3 5.1 1.9 6.8
20 4.3 6.3 7.0 7.1 4.4 2.3 3.3 5.0 4.7 5.6 2.2 5.5
30 6.5 5.9 5.7 6.0 5.1 4.0 3.0 5.3 4.8 4.8 2.7 4.8
50 6.0 6.8 4.9 5.1 5.2 3.5 3.5 3.6 4.4 4.9 3.9 5.1
k = 12, 0.5n : (11x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
6 5.5 5.9 12.8 12.4 6.9 2.1 2.5 3.3 3.4 5.4 1.9 7.1
10 5.7 7.0 7.7 8.1 4.4 3.0 1.2 4.8 3.9 4.1 2.1 6.9
14 6.1 8.1 8.9 6.4 5.4 2.7 1.9 4.7 3.5 4.4 2.1 5.7
20 5.1 6.0 6.2 6.6 4.2 3.1 2.6 4.5 4.6 4.9 2.7 5.3
30 4.9 5.6 7.2 6.3 4.5 1.8 2.8 5.2 4.7 5.1 2.7 4.1
50 4.8 6.7 5.0 6.0 5.3 2.8 4.1 4.0 4.9 4.5 3.4 5.3
k = 15, 0.5n : (14x)n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
6 6.3 6.7 12.9 13.1 6.5 2.8 2.3 5.1 2.6 4.9 1.7 9.8
10 5.6 7.2 7.5 8.1 5.3 1.3 1.6 4.9 3.7 4.2 2.1 7.7
14 5.9 6.9 8.4 6.8 3.6 1.3 1.9 5.0 4.3 5.6 2.3 5.9
20 5.9 8.1 7.2 6.9 3.7 2.7 3.0 4.5 4.8 5.4 2.0 5.6
30 5.9 6.4 7.1 5.8 4.6 2.7 2.6 6.1 4.5 5.1 3.3 6.4
50 5.9 7.3 4.2 6.5 6.8 3.9 3.5 4.7 4.5 5.6 4.3 4.5

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.6

The empirical probability of type I error for equal variances and unequal numbers of samples (n : 2n : 2n) (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, n : 2n : 2n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 7.1 5.5 6.5 10.8 6.5 2.7 2.1 4.7 4.3 4.9 2.6 6.8
10 7.3 5.4 6.4 6.7 5.1 4.2 3.0 5.7 5.1 4.9 2.5 5.9
15 5.7 6.3 4.5 6.2 4.7 3.1 3.2 5.3 4.8 4.2 3.7 4.9
20 6.0 6.7 5.0 4.9 5.0 5.1 4.4 5.3 5.3 4.3 3.0 5.3
30 6.7 6.5 4.2 5.1 5.5 4.7 3.6 5.2 5.5 5.3 5.2 5.8
50 5.5 7.4 4.9 5.3 5.6 4.2 4.2 4.3 5.3 4.8 5.3 4.9
k = 6, (2x)n : (4x)2n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.7 7.1 8.0 11.0 5.8 1.8 1.5 5.2 3.0 5.9 1.2 7.0
10 6.4 8.3 6.3 6.7 5.9 3.1 3.7 6.0 4.1 6.1 3.5 6.8
15 5.4 7.3 5.2 6.2 4.5 3.1 4.7 5.5 4.5 5.1 2.8 6.7
20 5.2 7.1 6.2 5.1 4.8 4.1 4.2 4.8 5.6 5.8 3.3 4.9
30 5.2 7.5 5.8 5.2 4.7 4.8 4.0 5.1 4.6 4.3 3.8 4.4
50 7.8 6.6 4.7 5.0 5.5 5.0 3.7 4.4 4.9 4.9 6.0 5.1
k = 9, (3x)n : (6x)2n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.8 8.9 9.2 9.1 4.3 1.6 2.0 4.8 4.3 5.4 1.8 6.3
10 4.7 8.8 6.0 7.1 6.1 3.9 4.5 5.0 5.2 5.0 2.6 6.3
15 7.1 8.4 6.7 7.1 4.4 3.4 4.1 5.1 5.4 5.8 2.5 5.6
20 5.8 7.4 6.2 5.1 5.4 3.1 2.7 4.9 6.1 4.1 3.5 4.7
30 6.5 6.9 5.8 4.5 4.6 3.5 4.4 4.1 4.9 6.2 4.2 6.3
50 7.0 7.4 3.9 4.9 4.1 3.8 4.3 5.8 5.3 4.9 3.9 5.8
k = 12, (4x)n : (8x)2n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.7 9.2 7.7 9.8 4.2 1.5 1.1 5.8 5.2 4.1 0.7 8.0
10 6.6 8.9 7.3 7.4 4.5 3.0 2.9 4.3 4.5 4.9 2.7 5.8
15 5.3 9.4 6.7 7.1 4.5 2.9 3.1 5.8 4.4 5.8 2.9 5.4
20 5.9 8.1 6.9 4.3 4.1 3.5 2.5 4.2 4.8 6.5 2.7 5.7
30 6.6 9.3 5.8 4.3 2.9 3.6 3.3 4.9 4.4 4.7 3.8 5.7
50 8.3 8.3 6.6 5.2 5.3 4.8 3.8 5.1 5.8 4.7 4.4 5.6
k = 15, (5x)n : (10x)2n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 4.9 8.7 8.8 10.1 4.4 1.2 1.3 6.1 5.0 5.5 0.8 6.5
10 5.0 6.8 6.6 7.1 3.5 2.9 2.1 4.5 4.8 4.7 2.6 5.4
15 6.7 8.5 7.0 6.4 4.4 2.6 2.6 5.2 5.3 4.2 1.9 5.8
20 6.0 9.0 5.9 4.8 4.2 4.0 2.9 4.7 5.8 4.5 3.5 6.2
30 6.2 9.1 5.6 4.1 3.5 3.6 4.3 4.3 5.2 6.0 3.1 5.8
50 5.7 9.1 4.3 4.9 5.1 4.6 3.6 5.5 4.7 6.0 4.2 6.3

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.7

The empirical probability of type I error for equal variances and unequal numbers of samples (n : 2n : 3n) (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, n : 2n : 3n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 7.2 8.5 7.6 7.5 4.5 1.5 3.1 5.0 2.8 3.6 2.1 6.1
10 7.3 8.5 5.0 8.1 4.8 3.2 4.0 5.6 3.9 4.8 3.5 5.4
15 7.9 7.7 4.8 5.7 4.7 3.7 4.7 4.3 3.6 4.9 3.9 5.5
20 8.5 7.0 4.6 6.6 4.5 3.5 4.8 4.8 4.2 5.3 3.8 5.9
30 7.8 7.3 4.4 6.1 4.7 5.2 4.1 5.4 4.8 4.6 4.8 5.2
50 8.0 8.7 5.7 5.4 5.1 4.5 4.9 4.9 5.5 3.9 3.9 4.8
k = 6, (2x)n : (2x)2n : (2x)3n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.3 9.0 7.0 8.5 4.3 2.0 2.3 5.3 3.9 5.0 1.2 5.8
10 7.2 9.5 5.4 7.7 3.2 3.8 4.0 6.8 5.2 3.7 3.4 5.1
15 7.3 10.9 4.9 6.4 4.5 3.3 3.5 5.1 3.4 4.3 2.9 5.0
20 6.9 9.3 6.7 5.8 5.1 3.5 3.8 4.7 5.1 4.9 3.9 5.4
30 6.5 9.6 4.4 5.3 4.4 5.0 4.3 5.9 5.4 5.9 4.1 3.9
50 8.2 10.1 4.5 5.9 4.5 4.1 3.9 4.7 6.3 4.4 4.5 5.4
k = 9, (3x)n : (3x)2n : (3x)3n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.1 12.0 7.2 8.1 5.4 1.4 0.9 4.1 4.5 5.3 1.7 7.0
10 5.1 10.8 5.5 5.5 4.5 3.8 3.5 6.4 4.8 5.2 3.1 5.4
15 6.2 10.5 6.4 5.4 5.9 3.0 3.9 5.3 4.9 5.2 3.2 5.5
20 7.9 11.3 5.5 6.4 5.9 4.2 4.0 5.2 5.6 6.0 3.2 4.7
30 8.8 12.0 5.1 5.7 5.1 3.4 4.1 4.1 5.1 5.2 4.6 3.8
50 7.4 10.4 5.6 5.5 3.6 4.7 3.6 3.9 4.7 4.9 4.3 4.4
k = 12, (4x)n : (4x)2n : (4x)3n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.1 12.9 6.9 9.7 5.4 0.9 1.9 4.5 5.1 5.2 0.8 7.0
10 6.2 11.7 5.3 5.4 4.7 3.0 4.0 5.0 4.9 5.0 3.0 4.7
15 6.3 12.9 5.1 5.9 5.4 2.9 3.8 5.4 3.9 4.7 3.6 5.0
20 8.0 13.9 5.9 5.3 5.4 3.3 4.1 4.5 4.6 3.3 3.2 5.4
30 5.8 13.0 5.4 4.5 5.2 3.6 4.2 4.8 5.5 5.0 4.0 4.1
50 6.6 13.2 4.8 5.2 5.6 4.2 4.0 5.3 5.0 5.8 4.3 5.5
k = 15, (5x)n : (5x)2n : (5x)3n, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.8 13.7 7.3 6.8 5.3 1.2 1.5 4.2 3.1 5.6 0.7 6.6
10 5.6 13.7 7.6 5.6 4.5 2.7 2.7 4.9 5.0 4.9 2.7 6.1
15 6.4 15.3 6.1 5.8 5.0 3.1 3.5 5.5 4.7 5.5 2.3 7.1
20 6.7 14.6 3.9 5.8 5.6 3.6 3.6 5.5 6.0 5.0 2.2 6.1
30 6.2 11.3 4.9 5.6 5.0 5.1 4.3 4.0 5.7 4.7 3.9 5.6
50 8.2 15.1 5.5 4.3 5.3 4.2 5.2 6.2 4.3 4.7 4.2 4.5

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.8

The empirical probability of type I error for equal variances and a t-distribution (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, t(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 17.6 22.1 12.0 5.9 4.9 0.8 0.4 18.8 16.0 20.2 0.1 12.3
10 29.8 27.0 8.4 4.5 4.9 4.5 2.8 29.4 27.3 30.6 2.5 9.8
15 37.6 34.4 7.6 4.1 3.6 3.2 3.2 34.9 35.8 31.9 3.4 6.5
20 40.7 35.8 7.1 2.4 3.2 3.1 3.7 40.1 41.4 39.3 4.5 8.9
30 45.5 45.0 6.5 3.3 3.8 4.3 4.3 46.5 47.1 46.3 4.1 8.2
50 52.2 46.1 7.3 3.5 4.7 3.9 4.3 50.8 54.3 51.4 4.5 7.2
k = 6, t(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 23.5 35.3 21.0 8.9 6.2 0.9 0.3 35.0 31.2 31.5 0.1 19.2
10 47.0 47.4 11.7 5.5 4.6 3.4 2.8 51.8 50.2 46.2 3.9 13.6
15 53.3 53.9 10.5 4.4 4.9 3.2 2.7 60.1 62.0 56.5 2.5 11.9
20 60.2 55.4 8.5 4.3 4.7 2.9 3.0 66.2 67.0 63.9 3.6 10.4
30 68.5 64.9 6.9 3.8 4.2 3.4 3.2 71.4 71.8 68.8 4.3 9.8
50 74.1 66.8 7.3 4.0 3.5 3.9 4.4 78.3 82.1 75.2 5.5 7.1
k = 9, t(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 29.3 42.4 29.4 12.3 7.7 0.6 1.5 41.6 43.7 39.3 0.0 24.6
10 55.8 57.7 16.9 6.7 5.9 4.2 3.6 66.8 65.4 57.6 3.8 17.2
15 68.8 64.2 13.0 4.5 5.8 3.0 3.0 74.9 74.3 65.4 2.3 14.9
20 72.9 68.3 11.8 5.3 4.5 3.0 4.2 81.1 79.5 71.3 2.7 11.9
30 81.0 74.5 8.5 4.3 4.3 3.3 4.5 83.8 85.4 78.1 3.5 9.8
50 87.9 81.2 6.9 2.9 4.3 5.0 3.2 90.6 90.2 86.7 4.0 9.4
k = 12, t(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 31.6 47.7 38.7 13.9 6.6 0.8 1.4 53.6 56.2 45.4 0.0 30.6
10 61.4 64.3 20.7 7.9 4.8 3.4 4.4 75.8 75.0 63.9 2.8 20.1
15 74.2 73.7 15.6 5.8 4.4 4.0 4.3 83.4 83.4 73.6 1.8 16.5
20 79.6 77.3 12.0 6.3 4.1 2.7 4.2 87.4 88.6 79.7 3.0 14.9
30 88.1 83.7 11.8 4.9 4.3 4.3 3.4 92.7 92.9 86.8 4.1 11.3
50 94.7 85.6 8.9 4.4 4.0 4.7 4.7 96.0 96.0 89.7 4.4 7.3
k = 15, t(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 38.4 56.4 46.4 17.8 8.9 0.4 1.1 60.7 61.8 53.9 0.1 36.0
10 69.2 71.9 23.8 7.9 4.7 3.6 3.4 80.2 82.1 67.2 2.8 25.6
15 80.2 79.2 21.1 5.8 4.8 2.6 3.2 89.2 91.3 82.7 1.9 18.8
20 88.3 86.1 17.2 5.5 4.8 3.8 3.5 93.7 91.9 83.2 3.6 15.1
30 91.7 88.7 11.0 5.3 4.8 4.1 4.1 95.7 96.1 90.0 4.0 12.6
50 94.9 91.9 10.1 4.0 5.1 4.0 4.6 97.8 98.1 94.5 3.8 9.5

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.9

The empirical probability of type I error for equal variances and a χ2-distribution (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, χ2(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 16.4 17.1 17.0 10.9 10.0 1.2 0.6 16.7 15.2 17.7 0.0 18.5
10 21.2 24.0 16.9 7.4 7.1 4.4 4.1 25.2 22.6 24.8 5.7 21.6
15 24.6 25.4 14.9 7.0 7.7 4.9 3.6 27.3 26.2 28.3 4.2 20.8
20 30.7 27.5 14.9 6.6 5.2 3.4 4.5 30.6 28.9 28.5 5.7 21.8
30 45.5 45.0 6.5 3.3 3.8 4.3 4.3 46.5 31.6 30.8 4.1 28.2
50 52.2 46.1 7.3 3.9 4.7 3.9 4.3 50.8 33.9 31.0 4.5 29.2
k = 6, χ2(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 21.0 23.8 26.8 12.8 10.7 0.4 1.0 27.1 26.2 25.5 0.0 26.2
10 34.3 32.6 23.7 8.4 8.4 4.2 4.4 39.0 38.9 33.6 6.5 33.8
15 41.7 35.4 22.6 8.1 8.6 4.5 3.9 44.4 43.0 36.5 5.7 34.8
20 42.5 35.1 21.3 7.2 7.3 5.4 4.7 48.2 49.9 43.0 7.9 36.3
30 45.7 37.9 20.7 5.0 6.4 4.6 4.1 52.4 50.7 42.0 6.8 38.7
50 50.1 41.8 22.6 7.2 6.8 4.1 3.8 53.1 56.7 46.9 10.0 38.9
k = 9, χ2(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 23.4 30.9 37.1 15.0 9.9 0.7 1.0 37.1 36.3 32.3 0.0 36.5
10 43.5 35.7 29.0 10.4 9.2 5.3 3.3 53.5 53.5 41.5 7.2 40.3
15 47.1 45.7 28.9 9.6 8.1 3.4 4.1 59.7 54.3 48.2 4.5 44.6
20 54.8 47.2 27.7 7.6 6.7 4.5 4.7 60.7 62.3 52.3 7.9 45.2
30 55.7 46.5 24.9 4.7 5.7 4.2 4.5 67.9 63.8 51.2 7.5 46.4
50 63.5 50.5 27.2 5.7 6.4 4.5 5.0 66.6 67.4 57.2 11.0 53.6
k = 12, χ2(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 25.0 35.1 49.2 17.2 12.3 0.6 0.3 43.8 43.0 33.9 0.1 47.9
10 48.0 44.0 37.3 11.4 10.0 3.7 3.3 63.4 64.2 48.5 7.2 50.8
15 57.3 50.6 32.0 9.0 8.9 3.0 3.9 71.1 68.4 54.2 4.0 53.0
20 60.3 49.8 34.8 8.1 5.7 3.9 4.2 73.7 71.7 55.7 10.3 55.9
30 67.4 53.6 32.5 7.5 7.1 4.9 4.5 72.9 74.8 62.6 8.9 61.0
50 68.9 57.7 32.2 5.4 6.2 4.5 4.6 76.8 76.5 63.4 12.1 61.3
k = 15, χ2(3), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 25.7 41.6 54.2 23.5 11.4 0.8 0.6 52.8 50.2 37.0 0.0 54.0
10 50.2 51.6 40.8 14.7 8.6 4.5 3.2 69.8 69.0 52.2 6.4 56.4
15 60.5 55.8 34.6 9.9 8.8 2.1 2.9 75.2 77.8 58.8 4.2 60.7
20 68.3 57.2 35.4 8.8 7.2 4.3 3.7 80.9 79.9 63.6 8.1 63.0
30 69.9 59.1 35.0 6.9 7.7 5.7 4.0 81.9 84.7 64.6 10.2 67.6
50 77.8 63.6 38.8 5.5 6.6 5.0 3.8 85.1 85.4 69.7 13.3 69.1

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.10

The empirical probability of type I error for equal variances and a positively skewed distribution (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, (skew = +1.7), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.1 11.1 9.4 6.7 5.5 1.0 0.7 10.7 8.0 8.9 0.0 10.1
10 13.8 12.8 7.9 4.9 6.6 4.1 3.5 14.3 14.6 14.9 2.8 7.7
15 15.7 17.3 7.8 4.0 5.8 3.1 4.7 18.6 16.9 16.3 3.2 8.6
20 16.3 13.9 6.6 4.6 5.1 3.6 2.8 17.3 18.2 16.9 3.7 7.6
30 17.4 14.0 7.1 4.8 4.0 4.4 5.0 17.9 17.8 17.3 5.5 6.9
50 19.3 15.4 6.2 5.5 5.3 3.2 5.3 18.1 18.3 16.9 4.7 6.5
k = 6, (skew = +1.7), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.6 16.4 14.1 8.9 7.3 0.8 0.1 14.9 14.0 14.6 0.0 11.3
10 20.1 19.4 8.2 8.4 6.4 4.1 3.1 20.9 22.4 22.2 2.6 10.7
15 22.6 18.4 8.9 7.2 6.2 3.2 3.6 23.1 22.0 17.7 2.1 7.1
20 23.9 19.3 8.1 5.7 5.7 4.4 2.3 25.5 25.6 21.2 4.7 8.6
30 24.9 19.2 7.9 4.8 4.7 4.9 3.9 29.2 25.4 21.9 4.8 8.5
50 25.8 22.4 8.1 3.8 4.8 5.1 4.6 28.9 32.0 27.5 4.5 7.6
k = 9, (skew = +1.7), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.5 16.9 18.1 8.0 7.7 0.8 0.6 18.2 20.3 17.2 0.1 11.3
10 20.1 22.4 13.1 8.1 5.4 2.3 3.4 29.4 29.7 21.8 4.4 11.0
15 26.4 20.8 10.8 5.7 6.7 2.3 2.2 31.0 36.0 25.4 3.3 9.2
20 26.8 24.0 6.8 7.6 4.5 3.0 3.9 38.0 33.3 25.4 4.5 9.9
30 30.4 24.3 8.8 5.8 5.6 4.9 5.5 34.1 33.4 26.2 4.1 7.8
50 31.3 21.6 9.2 5.2 5.7 4.4 4.7 37.0 35.3 26.2 4.4 8.3
k = 12, (skew = +1.7), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.5 21.3 23.9 16.1 7.8 0.3 0.3 23.5 23.1 19.6 0.0 19.2
10 22.5 24.6 15.7 8.5 7.5 3.5 3.8 32.6 33.8 21.9 2.8 12.1
15 29.5 25.5 12.5 8.8 5.8 2.6 3.3 37.7 38.8 25.8 2.2 12.5
20 29.3 26.3 9.9 6.2 4.2 3.5 3.2 39.5 39.8 27.0 3.0 10.7
30 33.7 29.1 10.2 6.7 5.3 4.1 4.3 42.7 43.1 30.4 3.3 7.4
50 35.2 26.6 8.5 6.0 5.7 3.6 4.8 42.6 46.1 32.0 3.2 9.4
k = 15, (skew = +1.7), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.6 22.9 25.1 18.6 8.1 0.5 0.3 25.3 26.0 18.3 0.0 22.1
10 23.1 24.7 14.3 10.0 5.5 2.7 3.4 39.5 40.5 23.4 2.1 13.1
15 30.6 28.9 11.9 8.3 6.6 2.9 1.8 45.0 44.4 27.6 1.9 11.9
20 33.3 27.5 12.8 7.7 4.6 3.6 3.8 45.8 47.2 30.6 3.1 12.0
30 37.3 29.3 8.7 6.8 6.6 4.1 5.0 48.8 47.6 32.8 4.2 9.0
50 39.2 27.2 9.2 4.9 5.0 4.3 4.4 48.9 48.9 33.9 4.4 11.3

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.11

The empirical probability of type I error for equal variances and a negatively skewed distribution (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, (skew = −1.8), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.4 12.3 7.4 7.3 7.2 0.4 0.3 11.2 10.2 11.1 0.0 7.5
10 14.0 14.9 8.6 5.4 5.0 3.5 2.8 14.4 14.1 14.9 3.3 7.6
15 15.4 13.6 7.5 6.6 4.7 2.9 4.0 14.3 13.2 14.1 3.5 8.3
20 15.0 16.2 9.1 6.1 5.6 4.5 4.9 18.8 17.5 16.1 5.0 6.4
30 19.8 15.3 7.2 4.5 5.4 3.5 4.0 18.1 15.5 14.8 3.7 7.6
50 18.7 18.4 5.9 3.5 4.3 4.0 5.6 18.3 20.4 18.6 5.1 7.2
k = 6, (skew = −1.8), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.8 13.6 12.4 9.6 7.1 0.5 0.4 15.1 13.8 12.0 0.1 10.8
10 18.5 20.4 11.0 8.6 6.5 3.5 3.8 24.9 22.0 18.5 3.2 10.5
15 23.3 21.5 8.1 4.6 5.6 2.4 3.5 23.9 25.3 22.4 2.4 8.9
20 22.9 19.5 8.4 5.9 5.3 4.3 3.4 26.6 26.2 21.7 3.7 7.7
30 24.0 19.1 6.9 5.4 5.2 3.5 4.3 24.7 26.9 22.3 3.8 8.9
50 23.4 19.6 7.9 5.5 5.2 5.2 4.7 26.8 28.5 24.0 4.3 6.8
k = 9, (skew = −1.8), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.0 18.3 18.4 13.7 7.0 0.8 0.7 17.8 15.5 16.2 0.0 15.5
10 18.6 22.7 13.6 8.1 5.6 2.2 4.1 27.7 28.9 23.2 2.9 12.5
15 25.9 23.0 10.4 6.7 4.8 3.4 4.2 29.1 32.6 23.3 1.9 9.9
20 29.6 24.3 8.2 7.8 4.4 5.1 4.1 35.8 30.6 24.4 4.0 8.8
30 30.9 23.4 7.6 6.0 5.9 2.5 3.4 37.4 36.7 27.0 3.0 8.9
50 32.5 24.4 7.0 5.3 5.3 3.9 4.4 36.6 36.6 27.3 3.8 8.3
k = 12, (skew = −1.8), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.3 21.2 20.4 15.4 9.6 0.1 0.7 23.9 22.9 20.3 0.0 16.9
10 22.4 22.9 13.6 9.3 4.8 2.6 3.1 33.5 34.4 22.9 3.5 11.8
15 27.5 24.4 11.7 6.7 6.8 2.6 2.5 37.6 40.1 27.6 2.0 9.5
20 30.1 27.4 10.2 6.2 5.1 3.7 4.7 41.1 42.2 28.7 2.9 11.5
30 32.9 27.4 10.3 6.4 6.0 4.3 4.4 43.0 43.6 28.5 3.3 9.3
50 33.1 27.4 8.7 5.8 5.1 4.4 5.0 44.6 44.2 32.9 3.1 9.2
k = 15, (skew = −1.8), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.6 22.2 27.6 17.2 7.4 0.4 0.2 26.0 27.3 18.9 0.0 20.8
10 24.1 24.7 14.9 9.3 7.5 3.6 4.2 38.7 42.2 24.8 2.4 12.9
15 27.1 27.3 11.6 9.4 4.9 2.6 2.4 44.0 46.7 27.9 1.9 10.2
20 31.2 25.9 10.9 7.6 4.3 3.2 3.6 45.3 46.1 32.4 3.5 10.2
30 37.8 26.9 12.0 7.1 5.2 4.0 5.0 49.5 48.1 31.5 3.0 10.3
50 37.1 29.5 8.6 5.5 4.0 4.9 4.1 51.2 49.0 32.4 5.2 9.8

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.12

The empirical power for unequal variances (2x) and equal numbers of samples (α = 0.05). The cell values represent the percentage of rejected null hypotheses out of the total number of simulations conducted

k = 3, var. ratio 1 : 1 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 7.6 10.3 12.7 8.6 8.1 0.5 1.2 8.4 7.8 9.3 0.0 11.6
10 14.8 16.9 17.8 15.8 13.2 9.2 10.7 17.4 16.5 16.4 8.8 15.1
15 21.2 29.0 22.6 21.8 21.5 14.2 15.5 24.9 25.6 24.3 11.4 20.5
20 27.0 36.3 28.4 34.5 22.4 22.0 21.7 30.7 32.4 30.3 22.0 25.4
30 45.1 51.6 40.1 46.4 36.0 35.7 38.6 45.7 47.8 43.9 32.6 36.5
50 65.2 73.1 63.0 67.1 65.5 57.3 60.8 66.7 66.9 69.0 55.0 52.6
k = 6, var. ratio (5x)1 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.5 10.2 14.5 10.8 9.1 0.6 0.5 7.7 6.4 8.7 0.1 10.2
10 12.9 18.8 18.3 19.8 14.1 8.6 9.4 16.1 15.7 15.7 8.2 15.2
15 21.3 27.6 21.9 25.5 19.3 13.5 15.0 20.2 23.0 25.1 9.8 18.0
20 25.1 36.9 28.9 34.0 24.2 21.5 19.6 29. 31.7 33.5 16.9 23.6
30 39.8 56.1 43.9 46.0 39.0 36.8 36.8 45.0 46.8 44.7 31.7 31.7
50 64.7 79.1 64.9 68.2 57.1 57.6 59.6 65.3 69.3 70.9 54.1 49.9
k = 9, var. ratio (8x)1 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.8 12.2 16.9 14.7 7.5 0.4 0.8 8.2 6.5 9.4 0.0 14.0
10 10.5 17.5 16.7 19.0 13.8 8.1 7.0 13.7 16.1 16.0 5.4 12.3
15 15.3 29.4 21.6 26.6 20.5 10.5 10.5 19.8 19.9 20.2 8.9 17.0
20 23.0 36.9 28.8 32.0 20.3 18.1 18.8 26.7 31.8 33.4 17.1 18.9
30 39.4 56.0 36.2 48.5 37.5 31.9 31.2 37.7 42.9 46.8 27.5 29.3
50 59.3 59.1 56.4 69.7 53.2 55.7 58.4 56.2 59.9 73.5 49.2 48.9
k = 12, var. ratio (11x)1 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 6.3 9.6 19.6 16.9 9.0 0.5 0.4 6.8 6.9 8.9 0.0 12.3
10 10.9 19.3 20.0 17.8 11.5 8.3 7.5 12.6 13.4 16.0 6.1 13.7
15 15.2 28.3 20.4 26.1 15.1 9.5 11.5 16.5 20.2 22.9 7.4 16.2
20 22.8 35.0 26.1 30.7 22.3 20.1 15.6 25.2 25.2 32.7 12.6 18.3
30 32.9 50.2 35.9 44.5 31.7 27.4 28.8 37.6 38.3 46.9 26.2 24.9
50 58.0 76.0 56.7 69.3 56.0 50.0 54.3 60.2 58.9 70.2 46.4 43.0
k = 15, var. ratio (14x)1 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.1 11.1 19.9 19.5 7.5 0.3 0.1 7.9 6.3 8.0 0.0 13.8
10 9.6 17.4 16.1 19.2 12.8 5.6 5.1 11.5 12.2 15.0 5.1 12.0
15 14.8 25.5 20.0 24.2 15.1 7.2 8.6 17.9 19.0 21.4 5.7 16.1
20 22.3 37.9 26.8 32.5 18.7 16.4 15.3 22.5 25.6 31.4 13.0 17.2
30 34.1 52.3 36.1 45.4 32.4 25.7 28.4 32.4 37.6 47.4 22.0 25.6
50 58.8 75.8 58.8 68.1 50.8 46.9 47.2 58.1 57.0 72.3 43.9 39.2

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.13

The empirical powers for unequal variances (3x) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 1 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.6 21.6 18.9 14.5 12.8 1.8 2.3 14.7 13.5 16.1 0.0 14.9
10 32.9 39.1 33.0 35.1 27.7 23.8 21.6 35.7 37.1 34.9 16.6 26.2
15 48.2 58.8 45.5 54.2 43.9 37.0 36.7 50.6 53.4 58.2 28.0 38.4
20 62.1 70.0 60.6 70.5 54.8 50.5 54.0 65.0 67.0 72.4 45.3 50.7
30 82.1 88.3 77.9 90.8 77.8 74.9 74.6 84.0 82.9 89.3 70.0 71.6
50 96.2 98.7 94.9 99.1 96.0 94.6 94.2 97.2 97.7 98.3 91.5 89.5
k = 6, var. ratio (5x)1 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.6 22.4 24.5 22.1 12.9 2.7 2.3 15.2 15.5 22.9 0.0 17.6
10 26.8 43.3 36.2 45.4 30.6 22.8 22.4 35.2 36.2 43.3 17.0 26.1
15 45.6 64.3 49.9 63.7 43.4 36.6 34.4 51.9 55.5 62.7 29.5 37.8
20 57.8 77.5 62.6 76.9 59.6 54.8 53.0 66.9 68.2 78.8 47.7 47.6
30 82.6 90.5 81.9 93.8 77.4 76.5 77.7 86.1 87.9 91.0 67.9 66.6
50 97.8 99.4 95.8 99.4 96.1 93.6 95.0 97.5 97.6 99.0 93.4 88.4
k = 9, var. ratio (8x)1 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.2 25.0 24.8 27.7 14.0 1.8 2.0 14.8 14.9 23.0 0.1 15.3
10 26.2 44.3 36.7 45.0 27.4 19.7 21.2 32.3 37.8 44.9 15.4 23.9
15 44.7 63.6 49.6 62.9 43.9 36.0 33.6 50.8 58.8 64.5 25.5 35.3
20 59.2 77.8 62.0 79.7 59.2 51.9 51.1 65.8 63.7 76.8 42.3 45.8
30 79.3 91.7 78.7 92.9 76.8 76.4 75.5 83.1 85.3 93.5 67.9 65.4
50 97.0 99.8 95.6 99.5 95.7 95.7 93.4 97.5 97.3 99.3 91.2 87.0
k = 12, var. ratio (11x)1 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 9.7 24.1 26.6 30.4 13.7 1.4 1.1 13.6 16.6 21. 0.1 18.19
10 24.0 48.2 37.0 50.0 25.7 17.5 19.2 29.7 33.7 46. 10.9 23.73
15 38.3 62.9 48.1 64.9 40.4 31.7 31.1 46.9 50.7 63. 22.7 33.40
20 56.0 76.7 58.5 79.7 55.9 50.2 50.3 59.4 65.2 75. 39.3 42.36
30 79.6 89.8 78.0 93.4 72.9 72.8 72.9 80.2 83.5 91. 66.6 60.25
50 96.6 99.2 94.1 99.0 94.1 94.4 93.7 95.5 97.8 99. 90.4 84.96
k = 15, var. ratio (14x)1 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.0 22.9 30.6 33.0 14.3 1.2 0.6 14.2 15.2 22.9 0.0 16.2
10 23.2 48.7 31.3 48.5 29.3 18.1 17.4 28.0 32.8 45.1 11.8 20.8
15 40.9 61.4 46.2 66.5 40.8 31.2 28.5 44.2 47.2 63.8 19.6 29.6
20 58.7 75.7 60.1 80.7 53.5 46.6 46.7 56.7 62.6 75.8 38.6 37.8
30 77.3 91.3 76.7 92.9 71.5 71.5 68.1 78.0 83.3 90.8 63.8 54.8
50 96.0 98.8 93.4 98.9 94.2 92.6 92.6 95.8 96.7 98.8 90.1 81.1

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.14

The empirical powers for unequal variances (4x) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 1 : 4, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 20.2 29.7 26.8 17.1 17.7 5.2 4.6 22.5 23.4 22.3 0.1 19.3
10 50.1 61.3 48.5 44.1 43.2 36.4 35.7 55.6 55.8 57.2 31.4 43.5
15 71.6 82.2 66.2 67.1 63.6 55.6 55.0 75.9 79.0 76.4 49.5 58.3
20 87.4 90.9 81.4 83.1 81.4 74.1 76.9 87.9 88.3 87.5 70.1 72.4
30 95.6 98.2 96.1 97.2 93.3 94.3 92.8 97.9 97.2 96.4 90.0 89.6
50 99.8 100 99.1 99.9 99.8 99.3 99.6 100 100 99.8 98.7 98.8
k = 6, var. ratio (5x)1 : 4, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 16.4 35.3 29.2 28.2 23.1 5.1 5.1 27.0 26.2 31.7 0.1 21.6
10 45.4 66.7 51.9 57.6 48.6 38.1 40.3 56.1 58.6 62.3 27.5 39.8
15 71.6 83.3 74.4 78.4 67.4 63.1 63.0 78.2 80.3 81.9 48.7 58.7
20 85.3 94.1 85.6 90.7 84.1 78.2 80.9 87.2 90.0 93.4 71.8 71.9
30 99.2 99.8 97.9 98.7 97.6 97.8 97.6 99.1 98.0 98.3 95.5 92.0
50 100 100 99.7 99.7 99.8 99.9 99.9 99.9 99.8 99.9 99.3 99.8
k = 9, var. ratio (8x)1 : 4, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 15.6 37.0 34.7 31.1 19.7 4.1 4.4 24.2 27.0 34.1 0.2 23.6
10 44.5 68.9 53.8 61.3 45.9 37.9 37.4 53.2 57.9 62.9 27.4 36.9
15 69.2 83.9 73.1 78.1 68.2 58.3 60.6 76.0 78.7 81.8 46.9 52.9
20 84.9 94.4 82.5 92.2 82.7 77.7 78.8 88.3 89.1 91.0 66.6 68.0
30 97.2 99.5 97.4 98.5 96.0 93.9 93.5 97.2 97.9 99.0 90.5 85.6
50 100 99.9 100 100 99.8 99.7 99.6 99.8 100 100 99.3 99.0
k = 12, var. ratio (11x)1 : 4, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.5 41.4 35.6 36.4 22.6 4.8 3.6 24.0 25.7 33.6 0.1 21.6
10 42.7 68.9 53.3 64.0 43.9 36.7 37.0 51.2 58.9 63.4 25.1 34.8
15 69.4 85.8 70.4 80.5 67.8 58.3 59.7 74.2 76.4 83.4 42.6 49.4
20 83.2 93.6 84.1 89.8 79.4 77.0 75.2 85.0 90.7 93.1 67.3 61.1
30 95.9 99.2 95.6 98.1 93.7 92.9 94.5 96.5 98.0 98.5 89.5 81.5
50 99.9 100 99.7 99.9 99.9 99.7 99.5 99.9 100 99.9 98.8 97.4
k = 15, var. ratio (14x)1 : 4, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 13.4 36.0 36.5 40.8 20.1 2.7 3.5 23.9 25.7 33.8 0.1 24.0
10 43.1 67.0 53.5 63.4 47.1 32.3 33.7 51.9 57.8 61.2 20.9 34.3
15 65.7 86.8 68.7 77.8 67.1 54.2 57.9 70.6 73.7 83.9 39.3 47.4
20 83.3 93.0 83.3 90.3 78.8 74.5 74.5 80.9 87.5 93.5 63.3 62.0
30 96.3 99.1 95.0 98.2 94.2 94.5 92.8 96.3 98.0 98.8 89.1 80.4
50 99.9 100 99.6 99.8 99.5 99.4 99.6 99.9 99.9 100 98.8 96.8

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.15

The empirical powers for unequal variances (8x) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 1 : 8, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 45.5 63.8 46.1 29.9 36.7 13.7 13.4 54.6 57.7 56.8 0.1 42.1
10 88.4 94.6 83.1 77.0 79.4 73.5 74.8 91.1 90.8 90.6 65.0 77.7
15 97.7 98.9 96.9 94.0 95.7 95.5 95.1 98.5 98.9 98.6 86.1 92.6
20 99.5 100 99.6 98.3 99.0 98.9 98.9 99.9 99.7 99.9 97.2 97.5
30 100 100 100 100 99.9 100 100 99.9 100 100 100 99.9
50 100 100 100 100 100 100 100 100 100 100 100 100
k = 6, var. ratio (5x)1 : 8, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 42.1 71.0 58.6 51.0 45.0 22.7 21.6 59.7 63.1 68.8 0.3 39.6
10 87.1 96.2 89.8 89.4 85.4 81.5 80.5 92.6 93.2 95.2 66.7 78.6
15 97.9 99.5 97.6 98.6 97.6 96.7 96.1 98.6 99.2 99.4 91.3 91.4
20 99.5 100 99.8 99.5 99.5 99.7 99.4 99.9 99.7 99.9 98.4 96.7
30 100 100 100 100 100 100 100 100 100 100 100 99.9
50 100 100 100 100 100 100 100 100 100 100 100 100
k = 9, var. ratio (8x)1 : 8, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 43.1 75.8 63.3 62.5 50.9 24.2 22.8 58.5 67.7 70.1 0.9 38.4
10 87.4 97.2 90.0 91.4 86.0 81.4 82.7 92.7 94.5 96.1 65.9 72.9
15 97.6 99.8 98.2 98.4 97.8 97.1 95.8 99.1 99.2 99.5 89.2 88.5
20 99.6 100 99.5 99.7 99.5 99.0 99.7 99.6 99.9 99.7 97.6 97.6
30 100 100 100 100 99.9 100 100 100 100 100 99.6 99.6
50 100 100 100 100 100 100 100 100 100 100 100 100
k = 12, var. ratio (11x)1 : 8, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 38.6 73.1 62.6 66.0 46.0 20.7 21.7 60.2 63.5 69.1 0.4 39.2
10 87.4 96.9 88.5 91.8 85.9 81.6 83.0 91.9 92.7 94.3 63.8 69.9
15 97.6 100 97.5 99.2 97.8 97.2 96.6 98.9 99.1 99.4 86.1 86.2
20 99.6 100 99.5 99.9 99.2 98.8 99.3 99.9 99.7 100 96.7 94.9
30 100 100 100 100 100 100 100 100 100 100 99.9 99.9
50 100 100 100 100 100 100 100 100 100 100 100 99.9
k = 15, var. ratio (14x)1 : 8, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 39.6 72.8 65.8 69.6 50.7 19.9 20.2 57.5 62.8 70.7 0.7 39.0
10 84.6 96.9 89.2 93.9 85.5 79.8 82.9 91.2 93.6 96.4 57.6 65.2
15 97.9 99.8 98.1 98.5 97.0 96.9 96.2 98.3 98.9 99.5 84.6 84.7
20 99.8 99.9 99.5 100 98.8 98.9 99.5 99.7 99.7 100 96.6 93.8
30 100 100 100 100 100 100 99.9 100 100 100 99.8 99.6
50 100 100 100 100 100 100 100 100 100 100 100 100

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.16

The empirical powers for unequal variances (1 : 2 : 2) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 2 : 2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 7.1 7.6 8.7 5.3 7.1 3.7 4.6 9.4 5.3 4.9 3.0 7.6
10 10.0 11.7 12.4 8.2 14.2 11.2 8.6 14.2 12.1 12.6 8.2 15.6
15 14.6 21.3 20.2 12.1 24.2 14.9 12.6 22.6 18.0 18.8 15.3 21.1
20 20.9 22.9 26.8 18.3 31.9 21.3 21.8 30.3 24.5 26.8 23.2 28.3
30 36.1 30.9 39.3 32.2 54.8 42.7 34.7 47.8 43.0 41.8 37.0 41.6
50 63.7 53.8 64.7 60.5 69.9 66.0 63.6 74.8 69.8 71.3 62.3 63.8
k = 6, var. ratio (2x)1 : (2x)2 : (2x)2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.9 9.6 12.4 5.7 8.8 5.0 4.7 10.0 6.0 8.1 4.7 9.4
10 10.1 14.9 18.0 6.3 12.2 9.5 9.6 18.7 16.9 14.8 11.0 15.7
15 14.0 15.0 25.2 11.3 24.6 20.1 17.2 29.5 25.5 15.7 18.9 25.4
20 21.4 22.1 34.4 18.6 32.4 32.3 27.8 44.6 37.2 29.6 30.0 36.3
30 38.2 32.3 55.0 32.2 55.3 52.9 50.2 64.6 62.4 42.6 50.4 56.1
50 72.6 51.8 84.9 63.9 85.2 81.0 81.9 90.4 89.0 71.3 81.4 80.5
k = 9, var. ratio (3x)1 : (3x)2 : (3x)2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 5.4 8.6 10.9 4.5 7.5 5.8 4.0 9.4 5.5 8.1 3.6 11.8
10 9.9 13.1 17.9 7.1 15.0 11.7 9.7 18.1 13.3 13.7 9.5 15.5
15 14.5 16.7 25.6 13.1 25.0 21.6 20.1 29.0 25.7 19.7 17.9 27.0
20 21.6 21.7 35.6 17.6 35.5 32.6 28.2 42.8 36.6 29.5 28.1 35.2
30 41.1 32.3 56.3 34.1 55.7 52.3 49.7 65.9 60.4 44.2 50.5 55.9
50 72.4 50.1 83.0 64.0 82.7 80.9 81.9 91.4 87.7 77.5 82.0 81.3
k = 12, var. ratio (4x)1 : (4x)2 : (4x)2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 4.0 6.8 15.1 4.4 9.4 5.6 4.0 13.1 8.5 9.3 3.7 16.1
10 7.7 12.5 23.1 7.9 21.3 15.2 10.7 26.6 21.6 14.4 14.1 23.7
15 15.0 14.5 40.2 10.3 34.5 27.6 26.6 48.3 37.1 24.7 26.8 37.5
20 23.6 23.3 54.0 16.8 48.9 45.8 41.5 63.8 55.0 29.5 44.3 53.2
30 44.3 32.8 77.9 32.7 78.2 72.5 71.5 85.1 83.1 48.8 73.6 74.9
50 79.3 53.6 96.9 62.6 97.2 96.9 95.9 99.3 99.1 83.1 97.7 96.6
k = 15, var. ratio (5x)1 : (5x)2 : (5x)2, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 4.5 8.0 18.6 6.1 9.6 5.0 4.3 12.6 5.8 9.6 4.8 17.8
10 10.2 13.3 28.1 7.8 20.7 16.4 13.1 30.8 24.3 15.1 16.7 28.7
15 13.7 17.7 44.7 12.6 39.4 34.5 29.7 53.6 41.9 24.9 36.8 43.8
20 25.8 21.1 59.5 16.3 58.0 51.6 48.2 70.2 64.1 30.0 51.6 58.7
30 48.3 29.8 83.9 33.5 80.3 81.7 77.8 90.9 89.9 52.8 82.6 85.0
50 82.9 55.8 99.1 63.5 98.8 98.8 98.4 99.7 99.2 82.6 98.6 99.3

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.17

The empirical powers for unequal variances (1 : 3 : 3) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 3 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.7 15.3 14.8 7.1 10.2 7.0 6.3 17.3 8.3 12.2 6.5 11.7
10 32.8 25.9 30.7 15.8 27.5 22.3 20.5 40.1 32.1 33.9 23.0 33.5
15 52.0 38.6 48.4 30.9 45.2 44.7 42.4 63.9 55.0 53.2 44.9 51.8
20 69.8 49.5 67.0 46.5 66.7 63.2 59.4 79.4 72.0 73.2 62.4 68.8
30 91.6 68.1 88.4 80.9 86.7 87.0 86.0 94.6 92.7 93.0 87.3 88.9
50 99.5 94.8 99.5 98.9 99.2 99.4 98.7 99.7 100 100 99.2 99.4
k = 6, var. ratio (2x)1 : (2x)3 : (2x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 9.3 14.1 18.5 4.6 14.5 8.9 7.9 20.6 10.0 13.9 8.4 18.0
10 29.8 25.3 43.7 14.7 37.2 32.2 27.4 57.5 46.5 35.2 34.5 46.0
15 56.2 33.7 70.9 29.8 66.5 62.7 60.9 80.1 72.6 58.7 61.2 71.4
20 78.5 45.8 86.1 49.3 86.4 84.4 80.0 95.6 92.3 74.4 83.3 87.7
30 97.3 67.4 98.5 83.3 98.0 97.5 98.3 99.6 99.5 95.5 97.6 98.5
50 100 91.3 100 99.3 100 100 100 100 100 100 100 100
k = 9, var. ratio (3x)1 : (3x)3 : (3x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.8 13.4 25.1 6.3 19.0 9.3 8.7 27.4 14.3 14.8 10.8 23.4
10 30.0 26.0 57.4 16.7 46.5 43.9 38.9 71.3 54.9 36.5 40.5 57.4
15 63.1 39.2 85.1 25.9 81.9 78.0 75.5 90.5 88.4 61.8 74.8 86.7
20 82.7 46.5 95.6 47.8 95.4 93.5 92.6 99.5 97.1 83.3 93.6 95.3
30 98.0 68.6 99.7 82.1 99.7 99.5 99.8 99.9 100 98.3 99.7 99.7
50 100 93.6 100 98.8 100 100 100 100 100 100 100 100
k = 12, var. ratio (4x)1 : (4x)3 : (4x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.6 15.8 32.2 8.9 21.8 12.9 11.0 31.2 18.3 17.2 13.0 31.1
10 31.7 28.5 66.9 15.3 57.6 50.6 44.9 80.7 66.6 41.4 52.1 69.9
15 60.3 38.3 90.8 29.3 90.4 86.2 83.1 97.6 94.4 58.8 86.9 92.3
20 84.1 51.2 98.8 47.8 98.5 97.7 96.4 99.8 99.5 83.4 97.2 98.7
30 99.1 70.0 100 83.3 100 100 100 100 100 98.5 100 100
50 100 93.0 100 99.1 100 100 100 100 100 99.9 100 100
k = 15, var. ratio (5x)1 : (5x)3 : (5x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 10.2 12.1 36.1 7.9 25.6 11.7 11.7 36.8 21.8 16.8 11.9 36.8
10 35.0 27.2 74.1 15.0 66.2 59.4 57.4 87.3 75.8 36.5 58.4 79.2
15 66.5 40.2 97.1 30.0 93.8 89.8 90.3 99.4 97.5 67.9 92.1 96.1
20 87.8 54.2 99.9 48.1 99.4 99.1 99.0 100 99.9 90.3 99.2 99.4
30 99.5 72.0 100 82.0 100 100 100 100 100 99.1 100 100
50 100 95.2 100 99.2 100 100 100 100 100 100 100 100

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.

Table B.18

The empirical powers for unequal variances (1 : 2 : 3) and equal number of samples (α = 0.05). The cell values represent the percentage of simulations in which the null hypothesis was rejected.

k = 3, var. ratio 1 : 2 : 3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 12.4 22.5 17.2 10.4 15.0 10.4 6.9 16.6 13.6 13.6 9.1 14.7
10 25.8 41.1 34.2 24.1 31.3 26.0 23.4 37.5 34.7 34.7 23.8 30.4
15 42.0 54.8 52.9 41.8 48.0 44.2 43.8 58.6 56.0 52.8 41.1 47.6
20 59.0 69.7 68.8 59.3 64.4 64.6 60.1 73.0 71.3 69.4 58.7 62.8
30 85.8 88.1 84.9 84.0 84.3 84.3 83.6 91.7 90.5 87.0 83.4 82.2
50 98.1 98.6 98.1 98.7 98.3 98.3 98.0 99.6 99.2 98.5 98.2 97.0
k = 6, var. ratio (2x)1 : (2x)2 : (2x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 11.2 22.0 22.1 9.6 18.6 11.4 10.2 22.5 13.6 16.6 8.0 21.4
10 25.3 44.0 47.5 21.5 40.9 32.5 33.9 54.3 49.1 35.8 31.4 44.4
15 45.9 60.2 69.8 42.5 68.1 62.4 59.7 79.4 75.2 55.0 59.2 65.1
20 70.0 78.0 84.4 57.7 83.3 79.8 81.3 92.2 87.8 75.8 79.1 82.0
30 91.7 91.9 96.3 83.3 97.4 95.6 95.3 98.9 98.7 93.4 94.5 95.3
50 99.9 99.7 100 98.7 100 100 100 100 100 99.7 100 100
k = 9, var. ratio (3x)1 : (3x)2 : (3x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 9.1 20.0 26.9 8.3 19.5 14.5 10.4 29.4 20.1 15.1 11.4 27.8
10 28.6 44.6 59.9 24.8 51.9 44.2 44.0 65.9 62.3 33.7 41.1 56.2
15 53.0 65.2 81.7 41.9 79.3 74.1 74.7 89.9 85.4 53.6 73.1 77.8
20 76.8 78.4 93.4 61.1 93.4 90.9 88.5 98.0 95.5 71.3 91.2 91.0
30 95.0 94.3 99.3 82.3 99.6 99.4 99.3 99.9 99.9 91.1 99.1 99.2
50 100 100 100 98.0 100 100 100 100 100 99.9 100 100
k = 12, var. ratio (4x)1 : (4x)2 : (4x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 8.9 23.1 35.1 11.3 29.5 15.7 11.5 33.9 26.1 16.6 13.6 33.3
10 29.6 43.5 68.1 24.5 60.8 50.5 48.5 80.3 70.5 36.3 52.3 63.3
15 53.7 67.5 89.8 40.7 86.1 82.5 83.3 94.1 93.6 58.8 83.7 87.6
20 78.3 85.1 97.5 60.1 97.2 96.9 96.5 99.3 99.4 75.9 96.3 97.5
30 96.9 96.3 99.9 82.9 99.7 99.9 100 100 100 94.3 99.8 99.8
50 100 100 100 98.7 100 100 100 100 100 99.9 100 100
k = 15, var. ratio (5x)1 : (5x)2 : (5x)3, N(0, 1), α = 0.05
n HF CC LE LS LT BF OB BA ZV BD FL CO
5 8.6 21.5 39.5 9.3 29.0 17.1 14.4 36.9 28.3 17.5 15.7 36.5
10 25.6 49.8 75.4 24.6 70.3 62.4 58.2 86.9 76.0 36.1 59.6 72.8
15 56.5 70.2 94.1 42.0 93.3 90.9 90.1 98.0 96.9 59.7 87.4 93.2
20 80.6 86.3 99.0 60.6 98.6 98.5 99.0 99.9 99.6 76.3 98.0 97.9
30 97.7 97.4 100 83.6 100 100 100 100 100 95.9 100 100
50 100 99.9 100 98.0 100 100 100 100 100 100 100 100

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.
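The cell values in Tables B.12 to B.18 are rejection percentages over repeated simulated data sets. The following is a minimal sketch of that calculation for one test only (Bartlett), not the authors' code. It assumes normally distributed groups with mean 0, and the function names, the number of simulations (`n_sim = 2000`) and the seed are illustrative choices that do not come from the paper. To stay within the Python standard library, the chi-square tail probability uses a closed form that holds only for an even number of degrees of freedom, so the sketch covers k = 3, 9, 15 but not k = 6 or 12.

```python
import math
import random
import statistics


def bartlett_statistic(groups):
    """Bartlett's statistic; approximately chi-square with k - 1 df under H0."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [statistics.variance(g) for g in groups]  # unbiased sample variances
    df_total = sum(dfs)  # N - k
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form needs even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires even degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def empirical_rejection_rate(variances, n, alpha=0.05, n_sim=2000, seed=1):
    """Percentage of simulated data sets in which H0 (equal variances) is rejected.

    variances: one population variance per group, e.g. [1, 1, 4].
    n_sim and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    sds = [math.sqrt(v) for v in variances]
    rejections = 0
    for _ in range(n_sim):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)] for sd in sds]
        p_value = chi2_sf_even_df(bartlett_statistic(groups), len(groups) - 1)
        if p_value < alpha:
            rejections += 1
    return 100.0 * rejections / n_sim


if __name__ == "__main__":
    # Equal variances: the rate estimates the type I error.
    print(empirical_rejection_rate([1, 1, 1], n=20))
    # One group with four times the variance: the rate estimates the power.
    print(empirical_rejection_rate([1, 1, 4], n=20))
```

With equal variances the returned percentage estimates the type I error, and with unequal variances it estimates the power, which is the quantity tabulated above. Results vary with the seed and the number of simulations, so they should only approximately match the BA column.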

Acknowledgements

The authors extend their appreciation to the editor and referees for their valuable comments and suggestions, which have enhanced the quality of the current manuscript’s presentation.

Figures
Fig. 1. The type I error estimates (k = 3, α = 0.05). The horizontal lines represent the limits of Bradley’s criteria: Liberal (dotted), moderate (dotdash) and stringent (dashed). The different colors indicate the robustness of the homoscedasticity tests: Red (non-robust), yellow (sufficiently robust) and green (robust).
Fig. 2. The type I error estimates (k = 9, α = 0.05). The horizontal lines represent the limits of Bradley’s criteria: Liberal (dotted), moderate (dotdash) and stringent (dashed). The different colors indicate the robustness of the homoscedasticity tests: Red (non-robust), yellow (sufficiently robust) and green (robust).
Fig. 3. The type I error estimates (k = 15, α = 0.05). The horizontal lines represent the limits of Bradley’s criteria: Liberal (dotted), moderate (dotdash) and stringent (dashed). The different colors indicate the robustness of the homoscedasticity tests: Red (non-robust), yellow (sufficiently robust) and green (robust).
Fig. 4. The empirical power curves for different numbers of groups (k = 3, 9, 15), unequal variances (one of the groups has four times the variance of the other groups) and equal number of samples per group (n = 5 to 50) (α = 0.05).
Fig. 5. The empirical power curves for different numbers of groups (k = 3, 9, 15), progressively increasing variance (var. ratio 1 : 2 : 3) and equal number of samples per group (n = 5 to 50) (α = 0.05).
Fig. 6. The QQ plot and residuals plot of the experiment (k = 4, n = 10).
TABLES

Table 1

The probability of type I error of tests for homoscedasticity (n = 10, α = 0.05)

HF CC LE LS LT BF
0.025* 0.003ns 0.152ns 0.001* 0.269ns 0.286ns
OB BA ZV BD FL CO
0.286ns 0.014* 0.010* 0.006* 0.327ns 0.291ns

HF = Hartley’s Fmax, CC = Cochran’s C, LE = Levene (absolute deviations), LS = Levene (square deviations), LT = Levene (trimmed mean), BF = Brown–Forsythe, OB = O’Brien, BA = Bartlett, ZV = Z-variance, BD = Bhandary-Dai, FL = Fligner-Killeen, CO = Conover’s Squared Ranks.


References
  1. Bhandary M and Dai H (2008). An alternative test for the equality of variances for several populations when the underlying distributions are normal. Communications in Statistics - Simulation and Computation, 38, 109-117.
  2. Bhandary M and Dai H (2013). An alternative test for the equality of variances for several populations in randomised complete block design. Statistical Methodology, 11, 22-35.
  3. Bradley JV (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144-152.
  4. Bradley JV (1980). Nonrobustness in Z, t, and F tests at large sample sizes. Bulletin of the Psychonomic Society, 16, 333-336.
  5. Brown MB and Forsythe AB (1974). Robust tests for the equality of variances. Journal of the American Statistical Association, 69, 364-367.
  6. Cochran WG (1941). The distribution of the largest of a set of estimated variances as a fraction of their total. Annals of Human Genetics (London), 11, 47-52.
  7. Conover WJ and Iman RL (1981). Rank transformations as a bridge between parametric and nonparametric statistics. The American Statistician, 35, 124-129.
  8. Dag O, Dolgun A, and Konar NM (2018). Onewaytests: An R package for one-way tests in independent groups designs. R Journal, 10, 175-199.
  9. David HA (1952). Upper 5% maximum F-ratio. Biometrika, 39, 422-424.
  10. de Mendiburu F and Yaseen M (2020). Agricolae: Statistical Procedures for Agricultural Research. R package version 1.4.
  11. Delacre M, Leys C, Mora YL, and Lakens D (2019). Taking parametric assumptions seriously: Arguments for the use of Welch’s F-test instead of the classical F-test in one-way ANOVA. International Review of Social Psychology, 32, 1-12.
  12. Fligner MA and Killeen TJ (1976). Distribution-free two sample tests for scale. Journal of the American Statistical Association, 71, 210-213.
  13. Gorbunova AA and Lemeshko BY (2012). Application of parametric homogeneity of variances tests under violation of classical assumption. Proceedings of the 2nd Stochastic Modeling Techniques and Data Analysis International Conference, Chania, Crete, Greece, 1-9.
  14. Hartley HO (1950). The maximum F-ratio as a short cut test for homogeneity of variance. Biometrika, 37, 308-312.
  15. Hatchavanich D (2014). A comparison of type I error and power of Bartlett’s test, Levene’s test and O’Brien’s test for homogeneity of variance tests. Southeast Asian Journal of Sciences, 3, 181-194.
  16. Hennessy DA (2009). Crop yield skewness under law of the minimum technology. American Journal of Agricultural Economics, 91, 197-208.
  17. Hsiung TH and Olejnik S (1994). Contrast analysis for additive non-orthogonal two-factor designs in unequal variance cases. British Journal of Mathematical and Statistical Psychology, 47, 337-354.
  18. Kim SB, Kim DS, and Magana-Ramirez C (2021). Applications of statistical experimental designs to improve statistical inference in weed management. PLoS ONE, 16, e0257472.
  19. Kozak M and Piepho HP (2018). What’s normal anyway? Residual plots are more telling than significance tests when checking ANOVA assumptions. Journal of Agronomy and Crop Science, 204, 86-98.
  20. Lee HB, Katz GS, and Restori AF (2010). A Monte Carlo study of seven homogeneity of variance tests. Journal of Mathematics and Statistics, 6, 359-366.
  21. Levene H (1960). Robust tests for equality of variances, Olkin I (Ed). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, (pp. 278-292), Redwood City, CA, Stanford University Press.
  22. Lix LM, Keselman JC, and Keselman HJ (1996). Consequences of assumption violations revisited: A quantitative review of alternatives to the one-way analysis of variance F-test. Review of Educational Research, 66, 579-619.
  23. Mirtagioglu H, Yiğit S, Mendes E, and Mendes M (2017). A Monte Carlo simulation study for comparing performances of some homogeneity of variances tests. Journal of Applied Quantitative Methods, 12, 1-11.
  24. O’Brien RG (1979). A general ANOVA method for robust test of additive models for variance. Journal of the American Statistical Association, 74, 877-880.
  25. O’Brien RG (1981). A simple test for variance effects in experimental designs. Psychological Bulletin, 89, 570-574.
  26. Onifade OC and Olanrewaju SO (2020). Investigating performances of some statistical tests for heteroscedasticity assumption in generalized linear model: A Monte Carlo simulations study. Open Journal of Statistics, 10, 453-493.
  27. Overall JE and Woodward JA (1974). A simple test for heterogeneity of variance in complex factorial designs. Psychometrika, 39, 311-318.
  28. Parra-Frutos I (2013). Testing homogeneity of variances with unequal sample sizes. Computational Statistics, 28, 1269-1297.
  29. Piepho HP (1996). A Monte Carlo test for variance homogeneity in linear models. Biometrical Journal, 38, 461-473.
  30. Pohlert T (2021). PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended. R package version 1.9.3.
  31. Raudonius S (2017). Application of statistics in plant and crop research: Important issues. Zemdirbyste-Agriculture, 104, 377-382.
  32. Snedecor GW and Cochran WG (1983). Statistical Methods (6th ed), New Delhi, Oxford and IBH.
  33. Rossoni D and Lima RR (2019). Autoregressive analysis of variance for experiments with spatial dependence between plots: A simulation study. Brazilian Journal of Biometrics, 37, 244-257.
  34. Stroup WW (2015). Rethinking the analysis of non-normal data in plant and soil science. Agronomy Journal, 107, 811-827.
  35. Sharma D and Kibria BMG (2013). On some test statistics for testing homogeneity of variances: A comparative study. Journal of Statistical Computation and Simulation, 83, 1944-1963.
  36. Tukey JW (1960). A survey of sampling from contaminated distributions, Olkin I (Ed). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, (pp. 448-485), Redwood City, CA, Stanford University Press.
  37. Wang Y, Rodriguez P, Chen YH, Kromrey JD, Kim ES, Pham T, Nguyen D, and Romano JL (2017). Comparing the performance of approaches for testing the homogeneity of variance assumption in one-factor ANOVA models. Educational and Psychological Measurement, 77, 305-329.
  38. Wilcox RR (2003). Applying Contemporary Statistical Methods, San Diego, CA, Academic Press.
  39. Webster R and Lark RM (2019). Analysis of variance in soil research: Examining the assumptions. European Journal of Soil Science, 70, 990-1000.
  40. Yamamotto ELM, Goncalves MC, Davide LMC, Rossoni DF, and Santos A (2022). Spatial variability in evaluation experiments of corn genotypes in the state of Mato Grosso do Sul, Brazil. Acta Scientiarum Agronomy, 44, e55972.