We apply the expressions for D_{r} in equation (2.1) to three data sets: the Hald data set (Draper and Smith, 1981) and the rat data set (Cook and Weisberg, 1982), both of which were also analyzed by Cook (1977), and the body fat data set (Neter et al., 1996, p. 261). Using the probabilistic behavior of β̂ − β̂_{(r)} through the spectral decomposition of its covariance matrix cov(β̂ − β̂_{(r)}), Kim (2015) introduced the influence measure ${M}_{r}={x}_{r}^{T}{({X}^{T}X)}^{-2}{x}_{r}/(1-{h}_{rr})$ to investigate the influence of deleting an observation on the least squares estimate β̂; the deletion of multiple cases was considered by Kim (2016). For these three data sets, the results based on the D_{r} values are compared with those based on the M_{r} values.
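The two measures compared in this section can be computed side by side. The following is a minimal sketch, assuming that D_{r} is the usual Cook's distance (β̂ − β̂_{(r)})^{T}X^{T}X(β̂ − β̂_{(r)})/(pσ̂^{2}) and that X already contains the intercept column; the function name `influence_measures` and the synthetic data are hypothetical and are not taken from the data sets analyzed here.

```python
import numpy as np

def influence_measures(X, y):
    """Return (D, M, delta); row r of delta is beta_hat - beta_hat_(r).

    X is the n x p design matrix (intercept column included), y the response.
    """
    n, p = X.shape
    XtX = X.T @ X
    XtX_inv = np.linalg.inv(XtX)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    # Leverages h_rr = x_r^T (X^T X)^{-1} x_r
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # Deletion formula: beta_hat - beta_hat_(r) = (X^T X)^{-1} x_r e_r / (1 - h_rr)
    delta = (X @ XtX_inv) * (resid / (1.0 - h))[:, None]
    # Assumed form of D_r (Cook's distance)
    D = np.einsum("ij,jk,ik->i", delta, XtX, delta) / (p * sigma2_hat)
    # M_r = x_r^T (X^T X)^{-2} x_r / (1 - h_rr)
    M = np.einsum("ij,jk,ik->i", X, XtX_inv @ XtX_inv, X) / (1.0 - h)
    return D, M, delta

# Synthetic illustration only (not the Hald, body fat, or rat data)
rng = np.random.default_rng(0)
X_demo = np.column_stack([np.ones(12), rng.normal(size=(12, 2))])
y_demo = rng.normal(size=12)
D_demo, M_demo, delta_demo = influence_measures(X_demo, y_demo)
print("case with largest D_r:", np.argmax(D_demo) + 1)
print("case with largest M_r:", np.argmax(M_demo) + 1)
```

The two `argmax` results need not coincide, which is the kind of disagreement examined in the subsections below.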
3.1. Hald data
The regression model with the intercept term β_{0} is fitted to the Hald data set which consists of 13 observations on a single dependent variable and four independent variables. The estimated regression coefficients are β̂_{0} = 62.41, β̂_{1} = 1.55, β̂_{2} = 0.51, β̂_{3} = 0.10, and β̂_{4} = −0.14.
Among the D_{r} values, observation 8 has the largest value, D_{8} = 0.394, and observation 3 has the second largest, D_{3} = 0.301. Based on the D_{r} values, observation 8 is thus identified as the most influential observation. However, the influence measure M_{r} shows that observation 3 has the largest influence on the least squares estimate of β (M_{3} = 879.74), whereas observation 8 is not significantly influential (M_{8} = 37.16). To find out why the two results contradict each other, we investigate the sources of the D_{r} values for observations 3 and 8. The eigenvalues of X^{T}X and their associated eigenvectors are given in Tables 1 and 2, respectively.
(a) Each row in Table 3 shows the normalized vector of β̂ − β̂_{(r)}. The values of cos θ_{ri} shown in Table 4 can be regarded as a measure of the closeness of β̂ − β̂_{(r)} to the i^{th} eigenvector g_{i} of X^{T}X: as β̂ − β̂_{(r)} gets closer to g_{i}, the value of cos θ_{ri} approaches one. We note from Tables 2 and 3 that both vectors β̂ − β̂_{(3)} and β̂ − β̂_{(8)} are almost parallel to the last eigenvector g_{5} of X^{T}X, which is confirmed by the cos θ_{ri} values in the last column of Table 4. The second to fifth columns of Table 4 show that both vectors are almost orthogonal to each of the eigenvectors g_{1}, . . . , g_{4} of X^{T}X.
(b) For observations 3 and 8, the ratios cos^{2}θ_{8i}/cos^{2}θ_{3i} listed in Table 5 are much larger than one for all i = 1, 2, 3, 4. Hence observation 8 is located closer to each of the four axes g_{1}, . . . , g_{4} than observation 3 is.
(c) Since the line V_{3} over which β̂ − β̂_{(3)} is distributed is almost parallel to the eigenvector g_{5} of X^{T}X, the component of D_{3} associated with g_{5}, namely l_{5}[(β̂ − β̂_{(3)})^{T}g_{5}]^{2}/(pσ̂^{2}), nearly makes a real contribution to the influence of observation 3 on β̂. These components of D_{r} (r = 3, 8) are given in Table 6. The proportion of l_{5}[(β̂ − β̂_{(r)})^{T}g_{5}]^{2}/(pσ̂^{2}) to D_{r} is about 79% for observation 3 and about 7% for observation 8. On the other hand, the line V_{3} is almost orthogonal to all of the eigenvectors g_{1}, . . . , g_{4} of X^{T}X, and therefore the component of D_{3} associated with g_{1}, . . . , g_{4}, namely ${\sum}_{i=1}^{4}{l}_{i}{[{(\widehat{\beta}-{\widehat{\beta}}_{(3)})}^{T}{g}_{i}]}^{2}/(p{\widehat{\sigma}}^{2})$, is likely to distort the influence of observation 3 on β̂. Observation 8 can be interpreted in the same way. The difference in ${\sum}_{i=1}^{4}{l}_{i}{[{(\widehat{\beta}-{\widehat{\beta}}_{(r)})}^{T}{g}_{i}]}^{2}/(p{\widehat{\sigma}}^{2})$ between observations 3 and 8 (observation 3 minus observation 8) is approximately −0.303, while the corresponding difference in l_{5}[(β̂ − β̂_{(r)})^{T}g_{5}]^{2}/(pσ̂^{2}) is approximately 0.210. The distance D_{8} therefore distorts the influence of observation 8 on β̂ far more severely than D_{3} distorts that of observation 3. The component ${\sum}_{i=1}^{4}{l}_{i}{[{(\widehat{\beta}-{\widehat{\beta}}_{(r)})}^{T}{g}_{i}]}^{2}/(p{\widehat{\sigma}}^{2})$ makes the value of D_{8} large, while it keeps the value of D_{3} relatively small. Hence the distance D_{8} enlarges the influence of observation 8 on β̂, while the distance D_{3} reduces the influence of observation 3.
This is one reason why the D_{r} values identify observation 8, which is not significantly influential, as the most influential observation and fail to detect observation 3 as the most influential one.
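The quantities used in steps (a) to (c) can be sketched in code. The following is an illustration under stated assumptions rather than the computation behind Tables 1 to 6: it assumes D_{r} = ∑ l_{i}[(β̂ − β̂_{(r)})^{T}g_{i}]^{2}/(pσ̂^{2}) with (l_{i}, g_{i}) the eigenpairs of X^{T}X ordered from largest to smallest eigenvalue, and it uses synthetic data. The name `cook_components` is hypothetical, and the case index `r` is 0-based.

```python
import numpy as np

def cook_components(X, y, r):
    """Return (cosines, components) for case r (0-based).

    cosines[i]    = cos(theta_ri), the cosine between beta_hat - beta_hat_(r) and g_i
    components[i] = l_i * [(beta_hat - beta_hat_(r))^T g_i]^2 / (p * sigma2_hat)
    """
    n, p = X.shape
    XtX = X.T @ X
    # eigh returns eigenvalues in ascending order; reverse so that l_1 >= ... >= l_p
    l, G = np.linalg.eigh(XtX)
    l, G = l[::-1], G[:, ::-1]
    beta_hat = np.linalg.solve(XtX, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    # Refit without case r to obtain beta_hat_(r)
    keep = np.arange(n) != r
    beta_r = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    d = beta_hat - beta_r
    proj = G.T @ d                      # projections of d onto each eigenvector
    cosines = proj / np.linalg.norm(d)
    components = l * proj**2 / (p * sigma2_hat)
    return cosines, components

# Synthetic illustration only
rng = np.random.default_rng(1)
X_demo = np.column_stack([np.ones(15), rng.normal(size=(15, 3))])
y_demo = X_demo @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=15)
cosines, components = cook_components(X_demo, y_demo, r=4)
print("share of D_r carried by the last eigenvector:", components[-1] / components.sum())
```

The printed share corresponds to the proportions quoted in (c), such as about 79% for observation 3 and about 7% for observation 8. The sign of each cosine depends on the arbitrary sign of the eigenvector returned by the solver, so only its magnitude is meaningful.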
Although D_{r} was introduced as an overall measure of the combined influence of observation r on all of the estimated regression coefficients, it would be desirable for the D_{r} values also to reveal influential observations for each individual regression coefficient; they do not. The D_{r} values assert that deletion of observation 8 produces the largest change in β̂. However, deletion of observation 8 does not bring about a significant change in any estimated regression coefficient, while deletion of observation 3 produces the largest change in all of the estimated regression coefficients, as shown in what follows. Numerical computations of the values β̂_{k} − β̂_{k(r)}, k = 0, 1, . . . , 4; r = 1, . . . , 13, show that deletion of observation 3 produces the largest change in β̂_{k} for all k = 0, 1, . . . , 4. Table 7 shows the change in β̂_{k} due to deletion of each of observations 3 and 8. After removal of observation 8 from the sample, computations based on the remaining sample of size 12 show that deletion of observation 3 still produces the largest change in β̂_{k} for all k = 0, 1, . . . , 4, as listed in Table 8. After removal of observation 3 from the sample, computations based on the remaining sample of size 12 show that deletion of observation 4 produces the largest change, −75.77, in β̂_{0}; deletion of observation 11 produces the largest change, 0.77, in β̂_{1}; deletion of observation 4 produces the largest change, 0.79, in β̂_{2}; deletion of observation 11 produces the largest change, 0.90, in β̂_{3}; and deletion of observation 4 produces the largest change, 0.75, in β̂_{4}. We note that the M_{r} values provide useful information about influential observations for each regression coefficient.
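The per-coefficient comparison described above amounts to refitting the model once per deleted case and recording β̂_{k} − β̂_{k(r)} for every k. A minimal sketch follows, using brute-force refits on whatever design matrix is supplied; the function names are hypothetical, and the returned case indices are 0-based, whereas the text numbers observations from 1.

```python
import numpy as np

def coefficient_changes(X, y):
    """Row r holds beta_hat - beta_hat_(r), obtained by refitting without case r."""
    n = X.shape[0]
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    changes = np.empty_like(X, dtype=float)
    for r in range(n):
        keep = np.arange(n) != r
        beta_r = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        changes[r] = beta_hat - beta_r
    return changes

def most_influential_per_coefficient(changes):
    """For each coefficient k, the 0-based case whose deletion changes it most in absolute value."""
    return np.argmax(np.abs(changes), axis=0)

# Synthetic illustration only
rng = np.random.default_rng(2)
X_demo = np.column_stack([np.ones(13), rng.normal(size=(13, 4))])
y_demo = rng.normal(size=13)
changes = coefficient_changes(X_demo, y_demo)
print(most_influential_per_coefficient(changes) + 1)  # 1-based case numbers
```

Repeating the call on the sample with one case removed mirrors the second stage described above, where the remaining 12 observations are re-examined.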
3.2. Body fat data
We fit the regression model with the intercept term β_{0} to the body fat data set, which has 20 measurements on a single dependent variable and three independent variables. The least squares estimates of the regression coefficients are β̂_{0} = 117.08, β̂_{1} = 4.33, β̂_{2} = −2.86, and β̂_{3} = −2.19.
Observation 3 has the largest value, D_{3} = 0.299, and observation 1 has the second largest, D_{1} = 0.279. The D_{r} values assert that observation 3 is the most influential observation. However, among the M_{r} values, observation 1 has the largest value, M_{1} = 401.19, while observation 3 has M_{3} = 150.22, which is not even the second largest value. The results are thus contradictory for the body fat data as well. We seek reasons for this contradiction by investigating the sources of the D_{r} values for observations 1 and 3. Detailed computations are not included here. The four eigenvalues of X^{T}X are 81290.24, 294.25, 119.82, and 0.00062. The eigenvector corresponding to the last eigenvalue is (0.99909, 0.03012, −0.02583, −0.01592). The Euclidean norm ||β̂ − β̂_{(r)}|| is 72.92 for observation 1 and 37.47 for observation 3.
(a) An investigation of the closeness between the normalized vector of each β̂ − β̂_{(r)} and each eigenvector of X^{T}X shows that cos θ_{r4} is −0.9999988 for observation 1 and 0.9999891 for observation 3, which implies that both vectors β̂ − β̂_{(1)} and β̂ − β̂_{(3)} are almost parallel to the last eigenvector g_{4} of X^{T}X. Also, both vectors are almost orthogonal to each of the remaining eigenvectors of X^{T}X.
(b) For observations 1 and 3, the ratio cos^{2}θ_{3i}/cos^{2}θ_{1i} is 5.09, 5.18, and 15.26 for i = 1, 2, 3, respectively. Hence observation 3 is located closer to each of the three axes g_{1}, g_{2}, g_{3} than observation 1 is.
(c) In the light of the results in (a), among the components of D_{r} (r = 1, 3) given in the first expression of equation (2.1), only the component
$$\frac{{l}_{4}\hspace{0.17em}{\left[{\left(\widehat{\beta}-{\widehat{\beta}}_{(r)}\right)}^{T}{g}_{4}\right]}^{2}}{p{\widehat{\sigma}}^{2}}$$
nearly makes a real contribution to the influence of observation r on β̂, and its value is 0.133 for observation 1 and 0.035 for observation 3. The proportion of l_{4}[(β̂ − β̂_{(r)})^{T}g_{4}]^{2}/(pσ̂^{2}) to D_{r} is about 48% for observation 1 and about 12% for observation 3. On the other hand, since the line V_{r} (r = 1, 3) is almost orthogonal to all of the remaining eigenvectors g_{1}, g_{2}, g_{3} of X^{T}X, the component of D_{r} associated with the eigenvectors g_{1}, g_{2}, g_{3}, which is
$$\frac{1}{p{\widehat{\sigma}}^{2}}\sum _{i=1}^{3}{l}_{i}\hspace{0.17em}{\left[{\left(\widehat{\beta}-{\widehat{\beta}}_{(r)}\right)}^{T}\hspace{0.17em}{g}_{i}\right]}^{2}$$
is likely to distort the influence of observation r on β̂. The component ${\sum}_{i=1}^{3}{l}_{i}{[{(\widehat{\beta}-{\widehat{\beta}}_{(r)})}^{T}{g}_{i}]}^{2}/(p{\widehat{\sigma}}^{2})$ is 0.146 for observation 1 and 0.264 for observation 3. The distance D_{3} thus distorts the influence of observation 3 on β̂ more severely than D_{1} distorts that of observation 1. This component makes the value of D_{3} large, while it keeps the value of D_{1} relatively small. Hence the distance D_{3} enlarges the influence of observation 3 on β̂, while the distance D_{1} reduces the influence of observation 1.
This is a reason why observation 3 has the largest D_{r} value, D_{3} = 0.299, though it is not identified as a significantly influential observation by the M_{r} values.
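As a quick arithmetic check on the values reported in (c) (a worked restatement, not an additional result), the g_{4} component and the g_{1}, g_{2}, g_{3} component add up to the reported distances:

$$D_{1}\approx 0.133+0.146=0.279,\qquad D_{3}\approx 0.035+0.264=0.299$$

and the shares carried by the g_{4} component match the stated proportions:

$$\frac{0.133}{0.279}\approx 0.48,\qquad \frac{0.035}{0.299}\approx 0.12$$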
Furthermore, the D_{r} values do not provide useful information about influential observations for each regression coefficient, whereas the M_{r} values do, as shown in what follows. Numerical computations of the values β̂_{k} − β̂_{k(r)}, k = 0, 1, 2, 3; r = 1, . . . , 20, show that deletion of observation 1 produces the largest change in β̂_{k} for all k = 0, 1, 2, 3: β̂_{0} − β̂_{0(r)} is −72.86 for observation 1 and 37.44 for observation 3; β̂_{1} − β̂_{1(r)} is −2.12 for observation 1 and 1.02 for observation 3; β̂_{2} − β̂_{2(r)} is 1.88 for observation 1 and −0.87 for observation 3; and β̂_{3} − β̂_{3(r)} is 1.08 for observation 1 and −0.69 for observation 3. Observation 3 is identified as the most influential observation by the D_{r} values, but it does not have a significant influence on any estimate β̂_{k} (k = 0, 1, 2, 3).
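The coefficient changes quoted above can be cross-checked against the norms and cosines reported earlier for the body fat data. The sketch below uses only the rounded values printed in this subsection, so agreement is expected only up to rounding; the variable names are hypothetical.

```python
import numpy as np

# Reported eigenvector for the smallest eigenvalue of X^T X (body fat data)
g4 = np.array([0.99909, 0.03012, -0.02583, -0.01592])
g4 = g4 / np.linalg.norm(g4)  # renormalize to absorb rounding in the printed digits

# Reported changes beta_hat_k - beta_hat_k(r) for k = 0, 1, 2, 3
reported_changes = {
    1: np.array([-72.86, -2.12, 1.88, 1.08]),
    3: np.array([37.44, 1.02, -0.87, -0.69]),
}

results = {}
for r, d in reported_changes.items():
    norm = np.linalg.norm(d)      # compare with the reported 72.92 and 37.47
    cos_theta = d @ g4 / norm     # compare with the reported -0.9999988 and 0.9999891
    results[r] = (norm, cos_theta)
    print(f"observation {r}: norm = {norm:.2f}, cos(theta_r4) = {cos_theta:.7f}")
```

Up to rounding, the recomputed norms and cosines agree with the reported values, which confirms that both difference vectors lie almost entirely along g_{4} and that observation 1 moves β̂ roughly twice as far as observation 3 does.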
3.3. Rat data
The regression model with the intercept term β_{0} is fitted to the rat data set which consists of 19 measurements on a single dependent variable and three independent variables. The least squares estimates of the regression coefficients are β̂_{0} = 0.27, β̂_{1} = −0.02, β̂_{2} = 0.01, and β̂_{3} = 4.18.
For the D_{r} values, observation 3 has the largest value, D_{3} = 0.930. For the M_{r} values, observation 3 also has the largest value, M_{3} = 1864.3. Both influence measures thus lead to the same conclusion that observation 3 is the most influential one. We briefly examine why the two measures agree. Detailed computations are not included here. The four eigenvalues of X^{T}X are 565097.6, 20.5, 0.16, and 0.003. The eigenvector g_{4} corresponding to the last eigenvalue is (0.0213, −0.0052, 0.0005, 0.9998). For observation 3, the Euclidean norm is ||β̂ − β̂_{(3)}|| = 2.684.
The cosine of the angle between β̂ − β̂_{(3)} and g_{4} is 0.9993, which implies that β̂ − β̂_{(3)} is almost parallel to the last eigenvector g_{4} of X^{T}X and almost orthogonal to each of the remaining eigenvectors of X^{T}X. Hence, among the components of D_{3}, only the component l_{4}[(β̂ − β̂_{(3)})^{T}g_{4}]^{2}/(pσ̂^{2}) nearly makes a real contribution to the influence of observation 3 on β̂, and its value is 0.781. The component ${\sum}_{i=1}^{3}{l}_{i}{[{(\widehat{\beta}-{\widehat{\beta}}_{(3)})}^{T}{g}_{i}]}^{2}/(p{\widehat{\sigma}}^{2})$ is 0.149. The proportion of l_{4}[(β̂ − β̂_{(3)})^{T}g_{4}]^{2}/(pσ̂^{2}) to D_{3} is therefore very high, about 84%. Consequently, the distance D_{3} reflects the real influence of observation 3 on β̂ to a very high extent, so that the value D_{3} can yield the same result as the value M_{3}.
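As a quick arithmetic check on the reported values for the rat data (a worked restatement, not an additional result), the two components add up to the reported distance and give the stated proportion:

$$D_{3}\approx 0.781+0.149=0.930,\qquad \frac{0.781}{0.930}\approx 0.84$$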