Jeon et al. (2019) proposed a new criterion to distinguish EMAR from the other missing mechanisms as follows. The missing mechanism of Y_{1} is EMAR if there is a pair of j and j′ satisfying C_{1}, and that of Y_{2} is EMAR if there is a pair of i and i′ satisfying C_{2}.
$$\begin{array}{l}{C}_{1}:\ {\omega}_{j{j}^{\prime}}^{+}<{\omega}_{j{j}^{\prime}}^{\text{min}}\quad\text{or}\quad{\omega}_{j{j}^{\prime}}^{+}>{\omega}_{j{j}^{\prime}}^{\text{max}},\\ {C}_{2}:\ {\omega}_{i{i}^{\prime}}^{+}<{\omega}_{i{i}^{\prime}}^{\text{min}}\quad\text{or}\quad{\omega}_{i{i}^{\prime}}^{+}>{\omega}_{i{i}^{\prime}}^{\text{max}},\end{array}$$
where
$$\begin{array}{l}{\omega}_{i{i}^{\prime}}^{+}=\frac{{\pi}_{i+12}}{{\pi}_{{i}^{\prime}+12}},\quad{\omega}_{i{i}^{\prime}}^{\text{max}}=\underset{j}{\text{max}}\frac{{\pi}_{ij11}}{{\pi}_{{i}^{\prime}j11}},\quad\text{and}\quad{\omega}_{i{i}^{\prime}}^{\text{min}}=\underset{j}{\text{min}}\frac{{\pi}_{ij11}}{{\pi}_{{i}^{\prime}j11}}\quad\text{for }i\ne {i}^{\prime};\\ {\omega}_{j{j}^{\prime}}^{+}=\frac{{\pi}_{+j21}}{{\pi}_{+{j}^{\prime}21}},\quad{\omega}_{j{j}^{\prime}}^{\text{max}}=\underset{i}{\text{max}}\frac{{\pi}_{ij11}}{{\pi}_{i{j}^{\prime}11}},\quad\text{and}\quad{\omega}_{j{j}^{\prime}}^{\text{min}}=\underset{i}{\text{min}}\frac{{\pi}_{ij11}}{{\pi}_{i{j}^{\prime}11}}\quad\text{for }j\ne {j}^{\prime}.\end{array}$$
Therefore, the missing mechanisms of Y_{1} and Y_{2} are MCAR or MNAR when the conditions C_{1} and C_{2} are violated. Note that violating these conditions is, unfortunately, only a necessary condition for MNAR not to fall on a boundary solution, which implies that an MNAR model can still suffer from a boundary solution even though it violates the conditions C_{1} and C_{2}. Any test to differentiate MCAR from MNAR is meaningless in such cases because some of π_{ij21}, π_{ij12}, and π_{ij22} are forced to be zero solely due to mathematical restrictions. In practice, EMAR is used when applying an MNAR model to missing cells suffers from a boundary solution, as discussed before.
One of our interests is whether and how the distance of ${\omega}_{j{j}^{\prime}}^{+}$ from the boundaries ${\omega}_{j{j}^{\prime}}^{\text{min}}$ and ${\omega}_{j{j}^{\prime}}^{\text{max}}$ in C_{1}, and that of ${\omega}_{i{i}^{\prime}}^{+}$ from the corresponding boundaries ${\omega}_{i{i}^{\prime}}^{\text{min}}$ and ${\omega}_{i{i}^{\prime}}^{\text{max}}$ in C_{2}, affect the performance of the LR, Wald, and Score tests based on the observed likelihood. We call $\text{min}({\omega}_{j{j}^{\prime}}^{\text{max}}/{\omega}_{j{j}^{\prime}}^{+},{\omega}_{j{j}^{\prime}}^{+}/{\omega}_{j{j}^{\prime}}^{\text{min}})$ and $\text{min}({\omega}_{i{i}^{\prime}}^{\text{max}}/{\omega}_{i{i}^{\prime}}^{+},{\omega}_{i{i}^{\prime}}^{+}/{\omega}_{i{i}^{\prime}}^{\text{min}})$ the boundary proximity. The closer the boundary proximity is to 1, the closer the solution for α_{i·} and/or β_{·j} under MNAR is to 0 (i.e., a boundary solution).
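As an illustration, the condition C_{1} and the boundary proximity for a pair of columns j and j′ can be computed directly from the cell probabilities. The following is a minimal sketch, not the authors' code; the function names and the example numbers are hypothetical, and the inputs are assumed to be an I × J array of π_{ij11} and a length-J array of π_{+j21}.

```python
import numpy as np

def omega_bounds(pi11, pi_col21, j, jp):
    """Return (omega_plus, omega_min, omega_max) for columns j and jp.

    pi11     : (I, J) array of pi_{ij11} (Y1 and Y2 both observed)
    pi_col21 : length-J array of pi_{+j21} (Y1 missing, Y2 observed)
    """
    ratios = pi11[:, j] / pi11[:, jp]          # pi_{ij11} / pi_{ij'11} over i
    omega_plus = pi_col21[j] / pi_col21[jp]    # pi_{+j21} / pi_{+j'21}
    return omega_plus, ratios.min(), ratios.max()

def satisfies_c1(pi11, pi_col21, j, jp):
    """True when omega_plus lies outside [omega_min, omega_max] (condition C1)."""
    w_plus, w_min, w_max = omega_bounds(pi11, pi_col21, j, jp)
    return bool(w_plus < w_min or w_plus > w_max)

def boundary_proximity(pi11, pi_col21, j, jp):
    """min(omega_max / omega_plus, omega_plus / omega_min)."""
    w_plus, w_min, w_max = omega_bounds(pi11, pi_col21, j, jp)
    return float(min(w_max / w_plus, w_plus / w_min))

# Hypothetical illustrative values (not taken from the paper's tables).
pi11 = np.array([[0.30, 0.20],
                 [0.10, 0.25]])
inside = np.array([0.04, 0.05])    # omega_plus = 0.8, inside [0.4, 1.5]
outside = np.array([0.08, 0.04])   # omega_plus = 2.0, above omega_max = 1.5

print(satisfies_c1(pi11, inside, 0, 1), boundary_proximity(pi11, inside, 0, 1))
print(satisfies_c1(pi11, outside, 0, 1), boundary_proximity(pi11, outside, 0, 1))
```

In this sketch a proximity above 1 corresponds to C_{1} being violated, and a proximity below 1 corresponds to C_{1} being satisfied; the analogous computation for C_{2} swaps the roles of rows and columns.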
The simulations are carried out to compare the three statistics, LR, Wald, and Score, for testing MCAR against MNAR in terms of the significance level α = 0.05 and the power of each test statistic. We assume that Y_{1} is MCAR or MNAR and Y_{2} is EMAR and known so that we focus on only the missing mechanism of Y_{1} for simplicity of discussion. It is straightforward to show that when Y_{1} is MCAR and Y_{2} is EMAR, the maximum likelihood (ML) estimates maximizing the observed likelihood of (2.2) are given by
$$N{\widehat{\pi}}_{ij11}=\frac{{z}_{ij11}{z}_{+j+1}{z}_{++11}}{{z}_{+j11}{z}_{+++1}},\quad{\widehat{\alpha}}_{\cdot\cdot}=\frac{{z}_{++21}}{{z}_{++11}},\quad{\widehat{\beta}}_{i\cdot}=\frac{{z}_{i+12}}{N{\widehat{\pi}}_{i+11}}.$$
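The closed-form estimates under MCAR for Y_{1} can be evaluated from the observed counts as in the sketch below. It assumes that the third and fourth subscripts are the response indicators of Y_{1} and Y_{2} (1 = observed, 2 = missing), so that z_{+j+1} = z_{+j11} + z_{+j21}; the function name and the counts are hypothetical.

```python
import numpy as np

def mcar_estimates(z11, z_col21, z_row12):
    """ML estimates when Y1 is MCAR and Y2 is EMAR (closed form above).

    z11     : (I, J) counts z_{ij11}
    z_col21 : length-J counts z_{+j21} (Y1 missing, Y2 observed)
    z_row12 : length-I counts z_{i+12} (Y1 observed, Y2 missing)
    Returns (N * pi_hat_{ij11}, alpha_hat, beta_hat_{i.}).
    """
    z11 = np.asarray(z11, dtype=float)
    z_col21 = np.asarray(z_col21, dtype=float)
    z_row12 = np.asarray(z_row12, dtype=float)

    z_col11 = z11.sum(axis=0)        # z_{+j11}
    z_col_1 = z_col11 + z_col21      # z_{+j+1} (assumed index convention)
    z_tot11 = z11.sum()              # z_{++11}
    z_tot_1 = z_col_1.sum()          # z_{+++1}

    n_pi11 = z11 * (z_col_1 / z_col11) * (z_tot11 / z_tot_1)
    alpha = z_col21.sum() / z_tot11          # z_{++21} / z_{++11}
    beta = z_row12 / n_pi11.sum(axis=1)      # z_{i+12} / (N * pi_hat_{i+11})
    return n_pi11, alpha, beta

# Hypothetical counts for a 2 x 2 table.
n_pi11, alpha, beta = mcar_estimates([[300, 200], [100, 250]], [40, 50], [30, 20])
print(n_pi11, alpha, beta)
```

A quick sanity check on this sketch is that the fitted counts N π̂_{ij11} sum to z_{++11}, which follows from the formula by summing over i and j.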
When Y_{1} is MNAR and Y_{2} is EMAR, the ML estimates are
$${\widehat{\pi}}_{ij11}=\frac{{z}_{ij11}}{N},\quad{\widehat{\alpha}}_{i\cdot}\ \text{satisfying}\ \sum _{i}{z}_{ij11}{\widehat{\alpha}}_{i\cdot}={z}_{+j21},\quad{\widehat{\beta}}_{i\cdot}=\frac{{z}_{i+12}}{{z}_{i+11}}.$$
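For a square table (I = J, as in the 2 × 2 and 3 × 3 cases considered here), the condition on the α̂_{i·} above is a linear system in the α̂_{i·} and can be solved numerically. The sketch below is only an illustration under that assumption; the function name and counts are hypothetical, and treating a non-positive component of the unconstrained solution as a sign of a boundary solution is a simplification made here, not a procedure taken from the text.

```python
import numpy as np

def mnar_alpha(z11, z_col21):
    """Solve sum_i z_{ij11} * alpha_{i.} = z_{+j21} for alpha (I = J assumed).

    Returns (alpha, interior), where interior is True when every component
    of the unconstrained solution is strictly positive.
    """
    z11 = np.asarray(z11, dtype=float)
    z_col21 = np.asarray(z_col21, dtype=float)
    # Row j of z11.T holds z_{1j11}, ..., z_{Ij11}, the coefficients of alpha.
    alpha = np.linalg.solve(z11.T, z_col21)
    return alpha, bool(np.all(alpha > 0))

# Hypothetical counts: the first case is interior, the second is not.
z11 = [[300, 200], [100, 250]]
print(mnar_alpha(z11, [40, 50]))   # both components positive
print(mnar_alpha(z11, [80, 40]))   # second component negative
```

In the second call the ratio z_{+1 21}/z_{+2 21} = 2 exceeds the largest row-wise ratio z_{i1 11}/z_{i2 11} = 1.5, which is the sample analogue of condition C_{1}.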
Using these ML estimates, we test H_{0}: MCAR vs. H_{a}: MNAR to check the type 1 error and to examine the power of the LR, Wald, and Score test statistics under three scenarios of boundary proximity. Table 2 summarizes the three scenarios with different degrees of MNAR for 2 × 2 × 2 × 2 and 3 × 3 × 2 × 2 contingency tables. In Table 2, α_{1·} = α_{2·} (for the 2 × 2 × 2 × 2 table) and α_{1·} = α_{2·} = α_{3·} (for the 3 × 3 × 2 × 2 table) are equivalent to MCAR. The scenarios, which differ in the boundary proximity $\text{min}({\omega}_{j{j}^{\prime}}^{\text{max}}/{\omega}_{j{j}^{\prime}}^{+},{\omega}_{j{j}^{\prime}}^{+}/{\omega}_{j{j}^{\prime}}^{\text{min}})$, are labeled S1 to S3 for the 2 × 2 × 2 × 2 and S4 to S6 for the 3 × 3 × 2 × 2 contingency tables.
Scenario S1 (and S4) is the furthest from the boundary solution, whereas S3 (and S6) is the closest. The other simulation factors are the sample size (5,000 and 10,000) and the item missing rate of Y_{1} (from 5% to 15%), with a fixed 5% of Y_{2} missing and 2% of unit missing. Large samples are considered in order to secure a sufficient number of missing values to identify the missing mechanism, which ranges from 250 (5% of 5,000) to 1,500 (15% of 10,000). Each simulation combination is repeated 10,000 times.
Table 3 shows the type 1 errors, that is, the probabilities of rejecting H_{0}: Y_{1} is MCAR when MCAR is true, at the nominal level α = 0.05. Except for S6 with a missing rate of 5% and a sample size of 5,000, the type 1 errors of all tests are well maintained near 5%. The type 1 errors get closer to 5% as the sample size increases from 5,000 to 10,000.
Tables 4 and 5 present the powers of the three statistics for the three degrees of boundary proximity under different sample sizes, missing rates, and strengths of MNAR. As with the maintenance of the significance level, no noticeable difference in power was found among the three test statistics. The boundary proximity has a profound effect on the power of the test statistics: the power drops rapidly as the boundary proximity approaches 1 for all three statistics (i.e., as S1 → S2 → S3 and S4 → S5 → S6).
Tables 4 and 5 also show that, as desired, the power increases as the sample size, the missing percentage, or the degree of MNAR increases. However, comparing the powers in Table 4 with those in Table 5 shows that the power decreases as the number of levels in the two-way table, I, increases, because the larger the number of levels, the more likely the boundary proximity is to be close to 1. Since a boundary proximity close to 1 implies that there are levels of MNAR Y_{1} with α_{i·} close to zero and levels of MNAR Y_{2} with β_{·j} close to zero, such levels should be rare events in the incompletely observed data. In particular, as seen in Table 5, the very low powers of the three statistics for boundary proximities close to 1 indicate that one must be cautious about accepting MCAR unless there is prior information that no level of the missing data is a rare event.