TEXT SIZE

• •   CrossRef (0) Resistant GPA algorithms based on the M and LMS estimation  Geehong Hyuna, Bo-Hui Leeb, Yong-Seok Choi1,b

aDepartment of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, USA;
bDepartment of Statistics, Pusan National University, Korea
Correspondence to: 1Department of Statistics, Pusan National University, 2, Busandaehak-ro 63beon-gil, Geumjeonggu, Busan 46241, Korea. E-mail: yschoi@pusan.ac.kr
Received October 11, 2018; Revised November 13, 2018; Accepted November 13, 2018.
Abstract

Procrustes analysis is a useful technique useful to measure, compare shape differences and estimate a mean shape for objects; however it is based on a least squares criterion and is affected by some outliers. Therefore, we propose two generalized Procrustes analysis methods based on M-estimation and least median of squares estimation that are resistant to object outliers. In addition, two algorithms are given for practical implementation. A simulation study and some examples are used to examine and compared the performances of the algorithms with the least square method. Moreover since these resistant GPA methods are available for higher dimensions, we need some methods to visualize the objects and mean shape effectively. Also since we have concentrated on resistant fitting methods without considering shape distributions, we wish to shape analysis not be sensitive to particular model.

Keywords : generalized Procrustes analysis, least median of squares estimator, least squares, M-estimation, mean shape, resistant, shape analysis
1. Introduction

The shape is all geometrical information that remains when location, size and rotational effects are filtered from an object (Kendall, 1984). Two objects have the same shape if they can be translated, scaled and rotated to each other such that they match exactly. Procrustes analysis is an important tool for estimating a mean shape as well as measuring, describing and comparing the shapes of objects in many disciplines.

When several objects are fitted using Procrustes analysis the method has been called generalized Procrustes analysis (GPA) (Gower, 1975); however, the method has been called ordinary Procrustes analysis (OPA) when a single objects is fitted to one other. Since the methods use a least squares (LS) technique, they are not resistant to landmarks or object outliers. We consider procedures that are not sensitive to these outliers. Verboon and Heiser (1992) proposed the method using M-estimator in OPA, in this paper we produce the theorem for estimating parameters on high dimension and Dryden and Walker (1999) adapted the least median of squares (LMS) estimator.

In GPA, we propose two algorithms, calling them M GPA and LMS GPA algorithms, having resistant property in low concentration (which implies the data with large variability) as well as high concentration data (which implies the data with small variability) with several object outliers. We then compare these methods with LS method by visualizing the estimated mean shapes. A simulation and some examples is used to demonstrate the performances of the algorithms explicitly.

2. Procrustes analysis

Procrustes analysis is a very useful tools to analyze landmark data and involves fitting the objects using similarity transformations to be as close as possible by minimizing the sum of squared Euclidean distances. There are two techniques are two techniques in Procrustes analysis. One is OPA, which is used for fitting and comparing two configurations. Another is GPA for obtaining a mean shape and to explore the structure of shape variability in a dataset of more than two configurations.

Consider two centered k×m matrices X1 and X2 of coordinates from k landmarks in m dimensions. The estimation of α, Γ, and β by OPA method based on LS (LS OPA) is carried out by minimizing the squared Euclidean distance

$O(X1,X2)=ŌĆ¢X2-βX1Γ-1kαTŌĆ¢2,$

where $ŌĆ¢AŌĆ¢ =tr(ATA)$, α is an m × 1 location, Γ is an m × m rotation such that |Γ| = 1 and β > 0 is a scale parameter (Dryden and Mardia, 2016).

Consider GPA method by LS (LS GPA) with n ≥ 2 configurations of k landmarks in m ≥ 2 dimensions. Let k × m configurations X1,X2, . . . ,Xn be centered and scaled. We wish to minimize following objective function,

$G(X1,…,Xn)=1n∑i=1n-1∑j=i+1nŌĆ¢(βiXiΓi+1kαiT)-(βjXjΓj+1kαjT)ŌĆ¢2=∑i=1nŌĆ¢(βiXiΓi+1kαiT)-μŌĆ¢2,$

subject to a constraint on the centroid size of the mean shape μ,

$C(μ)=ŌĆ¢CμŌĆ¢ =1,$

where αi is location, Γi is rotation such that |Γ| = 1, βi > 0 is scale, i = 1, . . . , n, C(μ) is the centroid size, and C is the k × k centering matrix.

Then the solution to the minimization of equation (2.2) over μ is given by (Dryden and Mardia, 2016)

$μ^=1n∑i=1n(β^iXiΓ^i+1kα^iT).$

An explicit eigenvector solution to the mean shape for two dimensional data can be found as the complex eigenvector corresponding to the largest eigenvalue of the complex sum of squares and products matrix (Kent, 1994). The Procrustes mean shape $μ^$ has to be found iteratively for m ≥ 3 dimensional data. The GPA algorithm of Gower (1975), which is modified by Ten Berge (1977), is useful for a practical implementation when the objects are in m > 2 dimensions.

3. Resistant Procrustes analysis

### 3.1. M and LMS OPA

The LS OPA method in our the fitting problem is not resistant since it is badly affected by landmark outliers; therefore, we need some resistant fitting methods. These methods have been studied for appropriate fit of OPA in shape analysis. In particular, Verboon and Heiser (1992) suggested a resistant fit based on M-estimator. However, the procedure may not work well when a scale parameter is introduced and location is not considered. We would like to consider the fitting under the full set of Euclidean similarity transformations in our application. Dryden and Walker (1999) also adapt the LMS of Rousseeuw (1984) for shape analysis (LMS OPA).

Now we describe resistant ordinary Procrustes analysis using M-estimator (M OPA). Let two matrices X1 and X2 denote configurations of k landmarks (k ≥ 3). Recall that the object X1 can be fitted onto another object X2 by minimizing objective function, O(α,Γ, β), defined as

$O(α,Γ,β)=∑i=1kρ(ŌĆ¢riŌĆ¢),$

where residual ri = x(2)iβΓTx(1)iα, x(1)i, and x(2)i are vectors of ith landmark of X1 and X2 respectively, and α is a location vector, Γ rotation matrix and β positive scale parameter, and ρ(·) represents a loss function.

Then, Verboon and Heiser (1992) considered the use of Huber’s function

$ρ(ŌĆ¢rŌĆ¢)={ŌĆ¢rŌĆ¢22,ŌĆ¢rŌĆ¢ ≤c,cŌĆ¢rŌĆ¢-c22,ŌĆ¢rŌĆ¢ >c$

or biweight function,

$ρ(ŌĆ¢rŌĆ¢)={c2 [1-(1-ŌĆ¢rŌĆ¢2c2)3]/6ŌĆ¢rŌĆ¢ ≤c,c26,ŌĆ¢rŌĆ¢ >c$

as a loss function in (3.1) where c is some tuning constant. For large values of the tuning constant c, objective function reduces to LS OPA.

Theorem 1

Let two matrices X1 and X2denote configurations of k ≥ 3 landmarks in m ≥ 2 dimensions. M OPA estimates of α, Γ, and β can be found out by minimizing the objective function (3.1).

Then the solutions are written as

$α^=(X2T-β^Γ^TX1T)W1k1kTW1k,$$Γ^=UVT,$$β^=tr [Γ^TX1TW (Ik-1k1kTW1kTW1k) X2]tr [X1TW (Ik-1k1kTW1kTW1k) X1],$

where W = diag(w1, . . . ,wk) is a diagonal matrix with weight wi = ψ(||ri||)/||ri||, i = 1, . . . , k, ψ(·) = ρ(·)′, the derivative of ρ(·), and U,VSO(m) satisfying the singular value decomposition

$X1TW (Ik-1k1kTW1kTW1k) X2=UDλVT,$

with Dλ diagonal matrix of singular values.

The objective function O(α, Γ, β) of Theorem 1 can be minimized by an algorithm that is based on the method of iteratively reweighted least squares. There are two steps in the algorithm. First, the weights are calculated by wi = ψ(||ri||) /||ri|| with ρ(·)′ = ψ(·) . We use the weight matrix W = 1k as initial values. Next the parameters α, Γ, and β are calculated by weighted least squares as given by equations (3.4), (3.5), and (3.6), respectively. We stop the iteration if the conditions |O(s+1)O(s)| < ε is satisfied, where O(s) is the values of objective function O(α, Γ, β) on sth iteration.

Another resistant estimator is the LMS estimator, which has a breakdown of almost 50% (Rousseeuw, 1984) as an alternative to the repeated median. The objective function which Dryden and Walker (1999) have considered, is defined by

$s2(E)=median (ŌĆ¢(E)1ŌĆ¢2,…,ŌĆ¢(E)kŌĆ¢2),$

where E is an error matrix of regression equation

$X2=g (X1)+E,$

the elements of gG, a transformation group, are referred to as the similarity transformation parameters α, Γ, and β, and (E)i denotes ith row of E, i = 1, . . . , k.

For practical implementation, they considered the exhaustive selection, random selection such as PROGRESS and intelligent subset method.

### 3.2. M and LMS GPA

Unlike LS method, resistant methods limit the influence of objects outliers and the resulting fit is closer to the true mean shape. Rohlf and Slice (1990) considered GPA fit using the RM technique and gave an algorithm for resistant shape fitting. They compared LS GPA to their methods with wing data from mosquitoes. Er (1998) suggested a robust GPA method via the EM algorithm and compared the method with LS GPA by visualizing the estimates of the mean shape. In this section, we suggest two resistant GPA fitting algorithms based on the M and LMS estimator that are called the M GPA and LMS GPA algorithms.

First we introduce a resistant M GPA algorithm based on the M OPA fit method. Let n matrices X1,X2, . . . ,Xn be configurations of k ≥ 3 landmarks in m ≥ 2 dimensions. Then we can find out the Procrustes mean shape $μ^$ by minimizing objective function of the form

$O(αi,Γi,βi)=∑i=1nρ (ŌĆ¢Ri (Xi,μ)ŌĆ¢),$

subject to C(μ) = 1, where ρ(·) is the loss function like equations (3.2) and (3.3), and Ri (Xi, μ) denotes the residual matrix after OPA fit of Xi onto the mean shape, that is, $Ri (Xi,μ)=μ-βiXiΓi-1kαiT$, i = 1, . . . , n.

The estimate can be found by differentiation the objective function with respect to μ and setting the derivative to zero,

$∂O(αi,Γi,βi)∂μ=0,$

then the solution can be written as

$μ^=∑i=1n(wiβ^iXiΓ^i+wi1kα^iT)1nTW1n,$

where W = diag(w1, . . . , wn) is weight diagonal matrix with wi = ψ(||Ri||)/||Ri||, i = 1, . . . , n, and ψ(·) = ρ(·)′, i.e., the derivative of ρ(·).

For our practical implementation, we can adapt M OPA method to GPA procedures to estimate similarity parameters αi, Γi, and βi on the method of iteratively reweighted least squares. Therefore, the parameters are estimated by fitting each object onto the mean shape via M OPA method. After that, the Procrustes mean shape can be obtained from equation (3.10). The algorithm is as follows.

Algorithm 1. (M GPA algorithm)

Consider a set of k × m configuration matrices X1,X2, . . . ,Xn. For our convenience, we assume that n configurations are centered using Helmert sub-matrix H and scaled so that ||Xi|| = 1.

• Step 1: Fit each Xi onto $μ^$(i) using M OPA method, where $μ^$(i) is initial Procrustes mean shape exceptith object Xi, i = 1, . . . , n.

• Step 2: Calculate the residual Ri(Xi, $μ^$ (i)), and the weights wi = ψ(||Ri||)/||Ri|| are obtained, i = 1, . . . , n.

• Step 3: Find the estimate of mean shape from equation (3.10).

• Step 4: Replace $μ^$(i) with $μ^$, repeat steps 1 to 3, until the criterion |O(s+1)O(s)| < ε is satisfied, where O(s) is the value of objective function O(αi, Γi , βi) on sth iteration.

To achieve resistance, we find the n similarity parameters (αi, Γi, βi) separately using M OPA method, and update the Procrustes mean shape in steps 2 and 3. We do not show that the objective function reduces; however, the algorithm finds the solutions quickly in practice.

Next we develop resistant GPA algorithm based on the LMS estimator. Consider the situation where we have n configuration of k landmarks in m dimensions X1,X2, . . . ,Xn. Then we can fit the each configuration Xi onto mean shape. That is,

$μ=gi (Xi)+Ei,$

where the elements of gi are the similarity transformation parameters αi, Γi, and βi, and Ei is an error matrix, i = 1, . . . , n. Then the LMS objective function can be defined by

$median (ŌĆ¢E1ŌĆ¢2,ŌĆ¢E2ŌĆ¢2,…,ŌĆ¢EnŌĆ¢2).$

We can find the estimate of mean shape $μ^$ by minimizing the objective function.

As the LMS OPA, this algorithm proceeds by repeatedly drawing subsamples of p different objects. For our implementation we use subsamples chosen by random selection method. The algorithm implemented in PROGRESS (Program for Robust reGRESSion) can require large computation time in our GPA procedures. PROGRESS algorithm was proposed to calculate the parameters by resampling method (Rousseeuw and Leroy, 1987). Therefore, we use a further selection method with the criterion based on the centroid size of randomly chosen subsample. Additionally, we exclude the subsample if the values of centroid size of the subsample are too small or large. The algorithm can be described as follows.

Algorithm 2. (LMS GPA Algorithm)

Consider a set of k×m configuration matrices X1,X2, . . . ,Xn. We assume that n configurations are centered and scaled. Let each subsample be Yj for j = 1, . . . , ns where $ns=(np)$ is the total number of subsamples.

• Step 1: Draw a subsample Yj of p different objects X1,X2, . . . ,Xp from n configurations without replacement.

• Step 2: Check chosen subsample Yj with size criterion, if satisfied, estimate the parameters gi j from fitting Yj onto μ by LS GPA, where gi j, i = 1, . . . , n is the similarity parameters for Procrustes fit of each Xi onto μ in drawing jth subsample, otherwise, go step 1.

• Step 3: Determine the corresponding LMS objective function in equation (3.11) with respect to the whole n objects.

• Step 4: Repeat steps 1 to 3, and choose least one of among the objective function’s values. Then the estimate of mean shape is defined as

$μ^=arg min [median (ŌĆ¢E1ŌĆ¢2,ŌĆ¢E2ŌĆ¢2,…,ŌĆ¢EnŌĆ¢2)].$

This algorithm requires more np residuals in step 3. So we calculate the residuals using similarity parameters corresponding to minimum of p residuals in step 2. In practice, our random selection method is implemented with smaller subsamples than ns. The algorithm does not find the Procrustes mean shape faster than M GPA; however, its implementation works well.

4. Examples

We investigate the sensitivity of LS and the resistance of M and LMS GPA methods, and compare their performance by visualizing differences between estimated mean shapes.

A random sample of 20 diamonds are simulated. Each landmark is simulated from mixture models of bivariate normal distributions with different covariances with data generated from (1–ŽĄ)BN(μ, 1)+ ŽĄBN(μ,2), where μ is true mean shape and ∑1 = 0.2I2, ∑2 = 1.0I2 in High of Table 1 and ∑1 = 0.6I2, ∑2 = 2.0I2 in Low of Table 1. The simulation results of Table 1 show that two resistant GPA methods have smaller norm values of the residual than LS. As the graphs (b), (c), and (d) of Figure 1 visualize, M and LMS GPA methods also yield more close estimates to the true mean shape than LS result.

Figure 2(a) shows the centered and scaled landmarks for the T2 mouse vertebrae data (with high concentration) with four artificial object outliers. Graph (b) indicates that the LS GPA method produces a Procrustes mean shapes affected by to object outliers. Figure 2(c) and (d) shows that the estimates by M and LMS GPA methods are closer to the true mean shapes; therefore, the resistant algorithms work well for the estimated mean shape of

Next, consider the 30 handwritten digit 3 data with low concentration and eight artificial object outliers. Graph (b) of Figure 3 shows that the LS GPA method is affected by the object outliers. Graph (c) and (d) that M and LMS GPA algorithms give better Procrustes mean shapes than LS GPA despite the configuration data set having a low concentration and some object outliers. Table 2 shows how the weights byMGPA fit control these object outliers effectively when the objects which have a relatively larger residual than the others with smaller weights (wi = 0.131).

5. Concluding remarks

We have proposed two resistant GPA fitting methods to estimate mean shape avoiding influence of object outliers. M GPA algorithm has found the Procrustes mean shape quickly; however, the LMS GPA algorithm takes more time to compute than M GPA. In LMS GPA, the size criterion has reduced its iteration number effectively; however, the algorithm was somewhat expensive to compute as random selection method.

We need methods that can visualize the objects and mean shape effectively since the OPA and GPA methods are available for higher dimensions. Multivariate techniques such as principle components analysis can be an appropriate method. Resistant fitting methods have been studied without considering shape distributions; however, it is important that the shape analysis should not be sensitive to a particular model such as the complex Bingham distribution or the complex angular central Gaussian distribution. These robustness problem is an issue for further research. An improved method reducing computational time is needed since the LMS GPA algorithm was somewhat expensive to compute as random selection method. We often wish to examine the structure of shape variability after obtaining a mean shape; therefore, a suitable method is needed to investigate the shape variability in a tangent space.

Figures Fig. 1. Procrustes mean shapes displayed with ŽĄ = 0.2 in low data set of ; (a) the centered and scaled landmarks and true mean shape, (b) LS, (c) M, and (d) LMS, where the solid line is true mean shape and the dotted line is the estimates of mean shape. LS = least squares; M = M-estimator; LMS = least median of squares. Fig. 2. Procrustes mean shapes for 23 T2 mouse vertebrae data; (a) landmarks for data with 4 object outliers, (b) LS, (c) M, and (d) LMS, where the solid line is Procrustes mean shapes for the data without object outliers and dotted line is that for the data with object outliers. LS = least squares; M = M-estimator; LMS = least median of squares. Fig. 3. Procrustes mean shapes for 30 handwritten digit 3 data; (a) landmarks for data with 8 object outliers, (b) LS, (c) M, and (d) LMS, where the solid line is Procrustes mean shapes for the data without object outliers and a dotted line is for the data with object outliers. LS = least squares; M = M-estimator; LMS = least median of squares.
TABLES

### Table 1

Norms of the differences of between true mean and estimated mean shapes

ŽĄ High Low

LS M LMS LS M LMS
0.1 0.0292 0.0164 0.0274 0.0728 0.0432 0.0643
0.2 0.0365 0.0283 0.0314 0.0785 0.0595 0.0530
0.3 0.0475 0.0347 0.0329 0.0905 0.0678 0.0757
0.4 0.0598 0.0376 0.0582 0.1346 0.0951 0.0522
0.5 0.0505 0.0367 0.0416 0.1350 0.0804 0.0885

LS = least squares; M = M-estimator; LMS = least median of squares.

### Table 2

Norms of residuals and weights after LS, M, and LMS GPA fits

LS M LMS

||Ri|| ||Ri|| wi ||Ri||
0.507 0.275 1.283 0.327 0.131 0.757 0.900 0.279
0.325 0.258 0.594 0.279 0.484 0.726 0.600 0.269
0.240 0.507 0.453 1.283 0.562 0.131 0.202 0.844
0.278 0.406 0.359 0.545 0.805 0.694 0.233 0.297
0.430 0.298 0.415 0.325 0.646 0.766 0.330 0.286
0.507 0.203 1.283 0.426 0.131 0.576 0.941 0.174
0.327 0.281 0.267 0.322 0.799 0.806 0.279 0.427
0.507 0.302 1.283 0.228 0.131 0.782 1.011 0.182
0.264 0.350 0.331 0.473 0.669 0.598 0.374 0.263
0.507 0.206 1.283 0.246 0.131 0.792 1.011 0.339
0.241 0.305 0.393 0.389 0.772 0.636 0.289 0.185
0.507 0.507 1.283 1.283 0.131 0.131 1.011 1.011
0.393 0.255 0.430 0.287 0.619 0.776 0.242 0.231
0.264 0.507 0.539 1.283 0.485 0.131 0.168 1.011
0.287 0.284 0.264 0.275 0.733 0.784 0.365 0.203

LS = least squares; M = M-estimator; LMS = least median of squares; GPA = generalized Procrustes analysis.

References
1. Dryden, IL, and Mardia, KV (2016). Statistical Shape Analysis with Applications in R. New York: John Wiley & Sons
2. Dryden, IL, and Walker, G (1999). Highly resistant regression and object matching. Biometrics. 55, 820-825.
3. Er, F 1998. Robust methods in statistical shape analysis. PhD thesis. University of Leeds. Leeds, UK.
4. Goodall, C (1991). Procrustes methods in the statistical analysis of shape (with discussion). Journal of Royal Statistical Society, Series B (Methodological). 53, 285-339.
5. Gower, JC (1975). Generalized Procrustes analysis. Psychometrika. 40, 33-51.
6. Kendall, DG (1984). Shape manifolds, Procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society. 16, 81-121.
7. Kent, JT (1994). The complex Bingham distribution and shape analysis. Journal of Royal Statistical Society, Series B (Methodological). 56, 285-299.
8. Rohlf, FJ, and Slice, D (1990). Extensions of the Procrustes method for the optimal superimposition of landmarks. Systematic Zoology. 39, 40-59.
9. Rousseeuw, PJ (1984). Least median of squares regression. Journal of the American Statistical Association. 79, 871-880.
10. Rousseeuw, PJ, and Leroy, AM (1987). Robust Regression and Outlier Detection. New York: Wiley
11. Siegel, AF, and Benson, RH (1982). A robust comparison of biological shapes. Biometrics. 38, 341-350.
12. Ten Berge, JMF (1977). Orthogonal Procrustes rotation for two or more matrices. Psychometrika. 42, 267-276.
13. Verboon, P, and Heiser, WJ (1992). Resistant orthogonal Procrustes analysis. Journal of Classification. 9, 237-256.