Nonparametric logistic regression based on sparse triangulation over a compact domain
Communications for Statistical Applications and Methods 2024;31:557-569
Published online September 30, 2024
© 2024 Korean Statistical Society.

Seoyeon Kim^a, Kwan-Young Bak^{1,b}

^a Department of Statistics, Sungshin Women’s University, Korea;
^b School of Mathematics, Statistics and Data Science, Sungshin Women’s University, Korea

Correspondence to: ^1 School of Mathematics, Statistics and Data Science, Sungshin Women’s University, 2 Bomun-ro 34 Da-gil, Seongbuk-gu, Seoul 02844, Korea. E-mail: kybak@sungshin.ac.kr
This research was supported by National Research Foundation (NRF) of Korea, RS-2022-00165581.
Received February 21, 2024; Revised April 22, 2024; Accepted April 26, 2024.
Abstract

This study investigates nonparametric logistic regression based on sparse triangulation over a compact domain in ℝ², a setting in which little research has extended the triogram model. A primary challenge is the potential instability induced by a large number of vertices, which hinders the effective modeling of complex relationships. To mitigate this challenge, we introduce sparsity to the boundary vertices of the triangulation via the Ramer-Douglas-Peucker algorithm and employ the K-means algorithm for adaptive vertex initialization. A second-order coordinate-wise descent algorithm is adopted to implement the proposed method. The stability of the proposed algorithm and the performance of the estimator are assessed using synthetic and handwritten digit data (LeCun et al., 1989). Results demonstrate the advantages of our method over existing methodologies, particularly when dealing with non-rectangular data domains.
Keywords : barycentric coordinates, coordinate descent algorithm, logistic regression, RDP algorithm, triangulation
1. Introduction

Multiple regression is an important cornerstone of supervised learning in statistics with countless applications. It is used to identify the relationship between multiple predictors and a response variable. The basic idea extends to the generalized linear model, in which the predictor is related to the response variable via a link function when the conditional distribution of the response belongs to an exponential family with some regularity conditions; see Nelder and Wedderburn (1972). A straightforward approach in the (generalized) linear model is to use a linear predictor, a linear combination of the predictor variables. However, this approach is often too restrictive in practical applications, especially when the predictors are related to the mean of the response via a complicated relationship. Nonparametric regression methods have the advantage of uncovering complex relationships between the predictors and the response. Examples include local polynomial regression, kernel regression, and basis expansion methods such as spline and wavelet regression. One may refer to Tsybakov (2008), Hastie et al. (2009), and Wasserman (2006) for an overview of nonparametric regression.

Many nonparametric estimation methods are known to enjoy good theoretical properties at least in the asymptotic sense. However, when examining performance in finite samples, considerations of the domain can lead to significant differences. Even in problems of estimating one-dimensional functions, extensive research has been conducted on methodologies aimed at addressing the impact of domain shape on estimation accuracy. Examples include estimating functions on positive domains (Geenens, 2021; Wright and Zabin, 1994), boundary effects (Müller, 1991), and estimation across the entire real line (Bak et al., 2021). Especially when dealing with multidimensional spaces, nonparametric methods typically require large sample sizes, making the influence of domain shape on estimation even more evident. Therefore, the development of techniques for smoothing and spatial regression applied to datasets distributed across domains with intricate geometries is a significant research topic in nonparametric estimation. To explore related issues and recent research findings, one can refer to Ferraccioli et al. (2021); Ramsay (2002); Sangalli et al. (2013); Wang and Ranalli (2007); Scott-Hayward et al. (2014), as well as the references cited therein.

The impact of domain shape on estimation accuracy is also observed within the generalized linear model framework. However, research into the development of estimation methodologies that reflect this phenomenon is very limited. Within the nonparametric approach to generalized linear models, the standard approach involves considering a tensor product space. In a popular approach using the regression spline model, this corresponds to constructing a tensor product spline basis for estimation. For example, Stone (1994) considered the use of polynomial splines and their tensor products in multivariate function estimation and showed that it leads to desirable statistical properties. However, a possible drawback of the tensor product spline method is that it implicitly assumes the shape of the domain. Specifically, tensor product splines assume that predictors are observed on a rectangular domain. In cases where the shape of the domain is irregular and complex, the supports of tensor product basis functions may not partition the domain appropriately. As a remedy, a nonparametric regression method based on triangulation, which efficiently partitions the domain using triangles, has been developed. Barycentric coordinate functions defined with respect to the resulting triangles form a basis for a space of piecewise polynomial functions over the triangulation. For details concerning triangulation and the barycentric coordinate basis functions, one may refer to Hansen et al. (1998); Lai and Schumaker (2007); Jhong et al. (2022) and the references cited therein.

Striking a good balance between bias and variance is a fundamental issue in nonparametric estimation. In the triogram regression model, this comes down to choosing the optimal number and location of the vertices of the triangulation. Ideally, vertices should be densely placed in regions with high local fluctuation in the regression function, while regions with smooth variation should have fewer vertices. If an appropriate triangular partition can be obtained, the estimator can capture the local trends in the data without compromising the overall smoothness. To this end, Hansen et al. (1998) considered stepwise selection of vertices with the use of the Rao (score) statistic for addition and the Wald statistic for deletion. Koenker and Mizera (2004) used a total variation-type penalty in the quantile regression framework. In a similar vein, Jhong et al. (2022) introduced a sparsity-inducing roughness penalty in the mean regression problem and studied the asymptotic properties of the related estimators.

In this study, we investigate the logistic regression model based on sparse triangulation of a compact domain in ℝ². Despite this promising possibility, very little research has extended the triogram model to logistic regression. One practical reason is that a large number of vertices can compromise the stability of the algorithm, making it challenging to model complex relationships effectively. To address this issue, we introduce sparsity to the boundary vertices of the triangulation based on the Ramer-Douglas-Peucker (RDP) algorithm (Douglas and Peucker, 1973; Ramer, 1972), and employ the K-means algorithm to initialize the interior vertices in a data-adaptive way. Additionally, we adopt the coordinate descent algorithm to enhance the stability of the implementation strategy. We validate the stability of the proposed algorithm and assess the performance of the estimates using synthetic data and handwritten digit data (LeCun et al., 1989). The results illustrate that our method offers advantages over existing methodologies when the data is observed on a non-rectangular domain.

The rest of the paper is organized as follows. Section 2 reviews the basics of triangulation and the associated barycentric coordinate basis functions, and defines the logistic regression estimator. Section 3 describes the implementation scheme, including the proposed triangulation process and coordinate descent algorithm. A numerical study including simulation and analysis of the digit data is presented in Section 4. Section 5 summarizes the findings of this study and discusses possible generalizations of the results.

2. Background and problem set-up

2.1. Preliminaries

This section introduces the concepts of triangulation and the tent spline basis, and defines the notation used throughout the paper. For details concerning triangulation and the corresponding spline space, one may refer to Stone (1994), Hansen et al. (1998), Koenker and Mizera (2004), Jhong et al. (2022), and Lai and Schumaker (2007).

Let Ω be a compact region in ℝ². Let T be a triangle, that is, the convex hull of three points not located on one line. A collection Δ = {T_1, . . . , T_g} of triangles in the plane with pairwise disjoint interiors is called a triangulation of Ω = ∪_{T∈Δ} T. We consider the space of continuous linear splines over a given triangulation Δ. The linear tent spline basis functions {B_j}_{j=1}^J can be defined in terms of the barycentric coordinate functions of the triangles over Δ, where the dimension J equals the number of vertices. Specifically, given a vertex set {v_1, . . . , v_J} of the triangulation Δ, the basis functions are defined as

$$
B_j(x) = \begin{cases} b_j^{T_x}(x), & \text{if } x \in \operatorname{star}(v_j), \\ 0, & \text{otherwise}, \end{cases}
$$

for j = 1, . . . , J, where T_x is the triangle containing x and star(v_j) is the set of all triangles that share the vertex v_j. Here, b_j^{T_x}(·) is the barycentric coordinate function with respect to the triangle T_x. The barycentric coordinates are illustrated in Figure 1.

Upon obtaining the basis functions, any continuous piecewise linear function over Δ can be expressed as

$$
s_b(x) = \sum_{j=1}^{J} b_j B_j(x) \quad \text{for } b \in \mathbb{R}^J.
$$

The functions {B_j}_{j=1}^J are linearly independent since B_j(v_k) = 1 if j = k and B_j(v_k) = 0 otherwise for the vertices {v_k}_{k=1}^J. By defining a basis through the tent splines determined by barycentric coordinates, the nonparametric regression model can be fitted effectively over an arbitrary triangulation of the data domain. This can significantly improve estimation accuracy, especially when the shape of the domain is complex and irregular and when the sample size is small, as illustrated in the numerical study of Section 4.
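To make the tent basis concrete, the following is a minimal R sketch, not taken from the paper, of how the barycentric coordinates of a point with respect to a single triangle can be computed; the function name and interface are illustrative.

```r
# Barycentric coordinates of a point x in R^2 with respect to the triangle
# with vertices v1, v2, v3: solve b1*v1 + b2*v2 + b3*v3 = x, b1 + b2 + b3 = 1.
barycentric <- function(x, v1, v2, v3) {
  A <- rbind(cbind(v1, v2, v3), c(1, 1, 1))  # 3 x 3 system matrix
  solve(A, c(x, 1))                          # coordinate vector (b1, b2, b3)
}

# Example: the centroid has barycentric coordinates (1/3, 1/3, 1/3).
barycentric(c(1/3, 1/3), c(0, 0), c(1, 0), c(0, 1))
```

The tent basis value B_j(x) is then the coordinate attached to the vertex v_j of the triangle containing x, and zero outside star(v_j).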

2.2. Model and estimator

The logistic regression model is introduced to deal with the binary classification problem, in which the response variable Y takes a binary value of 0 or 1. Given the predictors X = x, the conditional distribution of Y | X = x is assumed to follow the Bernoulli distribution with probability p(x) for x ∈ Ω ⊂ ℝ².

The probability function p is modeled by the set of tent spline basis functions {B_j}_{j=1}^J. For b = (b_1, . . . , b_J) ∈ ℝ^J, we denote

$$
p_b(x) = \sigma\Big( \sum_{j=1}^{J} b_j B_j(x) \Big),
$$

where

$$
\sigma(z) = \frac{1}{1 + e^{-z}} \quad \text{for } z \in \mathbb{R}
$$

denotes the logistic function.

Suppose that we are given a set of data {(x_i, y_i)}_{i=1}^n, where x_i ∈ Ω and y_i ∈ {0, 1}. We define the log-likelihood function as

$$
\ell(b) = \sum_{i=1}^{n} \Big[ y_i\, b^\top B(x_i) - \log\big(1 + e^{b^\top B(x_i)}\big) \Big],
\tag{2.2}
$$

where B(x_i) = (B_1(x_i), . . . , B_J(x_i)) ∈ ℝ^J. The maximum likelihood estimator is defined as

$$
\hat{\beta} = \arg\max_{b \in \mathbb{R}^J} \ell(b).
$$

The sparse triogram probability estimator (STriPE) of p is given by

$$
\hat{p} = p_{\hat{\beta}}.
$$
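As a hedged illustration of these definitions, the following R sketch, with illustrative names only, evaluates p_b and the log-likelihood (2.2) given a precomputed design matrix G whose (i, j) entry is B_j(x_i).

```r
sigma <- function(z) 1 / (1 + exp(-z))           # logistic function

# log-likelihood (2.2); G is n x J with G[i, j] = B_j(x_i), y in {0, 1}
loglik <- function(b, G, y) {
  eta <- as.vector(G %*% b)                      # linear predictors b^T B(x_i)
  sum(y * eta - log1p(exp(eta)))                 # log1p for numerical stability
}

p_b <- function(b, G) sigma(as.vector(G %*% b))  # fitted probabilities p_b(x_i)
```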
3. Implementation scheme

3.1. Coordinate descent algorithm

This section summarizes the algorithm for fitting the logistic regression model based on a given triangulation. We first consider the standard Newton-Raphson algorithm. Since maximizing the log-likelihood ℓ : ℝ^J → ℝ is equivalent to minimizing its negative, we work with the negative log-likelihood S(b), defined as

$$
S(b) = -\ell(b) = -\sum_{i=1}^{n} \Big[ y_i\, b^\top B(x_i) - \log\big(1 + e^{b^\top B(x_i)}\big) \Big].
$$

The gradient vector is given by

$$
\nabla S(b) = \sum_{i=1}^{n} \big[ \sigma\big(b^\top B(x_i)\big) - y_i \big] B(x_i).
\tag{3.1}
$$

The Hessian matrix is given by

$$
\nabla^2 S(b) = \sum_{i=1}^{n} \sigma\big(b^\top B(x_i)\big)\big(1 - \sigma\big(b^\top B(x_i)\big)\big) B(x_i) B(x_i)^\top.
\tag{3.2}
$$

The iterative update formula of the pure Newton-Raphson algorithm is given by

$$
\tilde{\beta}^{(k+1)} = \tilde{\beta}^{(k)} - \big(\nabla^2 S(\tilde{\beta}^{(k)})\big)^{-1} \nabla S(\tilde{\beta}^{(k)}) \quad \text{for } k = 0, 1, \ldots.
$$
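In R, the gradient (3.1), the Hessian (3.2), and one Newton step can be sketched as follows, again assuming a design matrix G with rows B(x_i)^⊤; this is an illustration, not the authors' implementation.

```r
sigma <- function(z) 1 / (1 + exp(-z))

# gradient (3.1): sum_i [sigma(b^T B(x_i)) - y_i] B(x_i)
grad_S <- function(b, G, y) {
  p <- sigma(as.vector(G %*% b))
  as.vector(crossprod(G, p - y))
}

# Hessian (3.2): sum_i p_i (1 - p_i) B(x_i) B(x_i)^T
hess_S <- function(b, G) {
  p <- sigma(as.vector(G %*% b))
  crossprod(G, (p * (1 - p)) * G)    # row i of G weighted by p_i (1 - p_i)
}

# one pure Newton-Raphson step
newton_step <- function(b, G, y) b - solve(hess_S(b, G), grad_S(b, G, y))
```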

The algorithm summarized above is outlined in Hansen et al. (1998). However, it may exhibit some instability in practical applications. As the dimension of the spline space increases, the areas of the triangles shrink, leading to convergence issues: a poor condition number of the Gram matrix degrades numerical stability. In response to this issue, we consider a coordinate-wise descent algorithm along with the initialization strategy described in the next subsection, using a second-order Taylor approximation of the univariate objective function to obtain an update formula. Let β̃ = (β̃_1, . . . , β̃_J) denote the current value of the coefficient vector. The univariate objective function for the jth coefficient is

$$
S_j(b_j) = S(\tilde{\beta}_1, \ldots, \tilde{\beta}_{j-1}, b_j, \tilde{\beta}_{j+1}, \ldots, \tilde{\beta}_J).
$$

The univariate objective function is approximated by its second-order Taylor expansion

$$
q_j(b_j) = S(\tilde{\beta}) + S_j'(\tilde{\beta}_j)(b_j - \tilde{\beta}_j) + \frac{1}{2} S_j''(\tilde{\beta}_j)(b_j - \tilde{\beta}_j)^2.
$$

To obtain the closed-form solution for the minimizer, we differentiate the above expression with respect to b_j:

$$
q_j'(b_j) = S_j'(\tilde{\beta}_j) + S_j''(\tilde{\beta}_j)(b_j - \tilde{\beta}_j).
$$

Setting this derivative equal to zero, we derive the minimizer

$$
b_j = \tilde{\beta}_j - \frac{S_j'(\tilde{\beta}_j)}{S_j''(\tilde{\beta}_j)}.
$$

Note that S_j'(β̃_j) and S_j''(β̃_j) are the jth element of ∇S(β̃) and the (j, j)th element of ∇²S(β̃), respectively. Therefore, we obtain the following update formula

$$
\tilde{\beta}_j \leftarrow \tilde{\beta}_j - \eta\, \frac{\big(\nabla S(\tilde{\beta})\big)_j}{\big(\nabla^2 S(\tilde{\beta})\big)_{jj}},
\tag{3.3}
$$

where η is an appropriately chosen step size.
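A compact R sketch of the resulting second-order coordinate-wise descent follows, under the same illustrative G and y as above; the small ridge added to the Hessian entry is our own numerical guard, not part of the paper's algorithm.

```r
sigma <- function(z) 1 / (1 + exp(-z))

# coordinate-wise descent with the damped second-order update (3.3)
coord_descent <- function(G, y, eta = 1, tol = 1e-6, max_iter = 500) {
  b <- rep(1, ncol(G))                          # initialization as in Algorithm 1
  for (iter in seq_len(max_iter)) {
    b_old <- b
    for (j in seq_len(ncol(G))) {
      p   <- sigma(as.vector(G %*% b))          # current fitted probabilities
      g_j <- sum((p - y) * G[, j])              # jth entry of the gradient (3.1)
      h_j <- sum(p * (1 - p) * G[, j]^2)        # (j, j)th entry of the Hessian (3.2)
      b[j] <- b[j] - eta * g_j / (h_j + 1e-10)  # update (3.3); ridge guards h_j ~ 0
    }
    if (sqrt(sum((b - b_old)^2)) < tol) break   # stop when coefficients stabilize
  }
  b
}
```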

3.2. Initial triangulation with Ramer–Douglas–Peucker and K-means algorithm

The RDP algorithm (Douglas and Peucker, 1973; Ramer, 1972), an effective method for polyline simplification, reduces the number of vertices in a given polyline or polygon while preserving its overall shape. The algorithm operates by recursive division of the line, starting with an ordered set of points and a specified distance threshold ε > 0. At the outset, it considers all points between the first and last points of the curve, marking these terminal points as retained. It then identifies the point most distant from the line segment defined by the terminal points. If this point's distance from the segment is less than ε, the curve can be simplified without deviating significantly from the original shape by discarding all points not marked for retention; otherwise, the point is retained and the procedure is applied recursively to the two sub-polylines it delimits.

The numerical instability of the triogram method arises primarily when the number of vertices is large. There are several approaches that mitigate this issue while improving estimation accuracy: Koenker and Mizera (2004) advocated ℓ2 regularization, Jhong et al. (2022) introduced a total variation-type penalty to induce sparsity, and Hansen et al. (1998) proposed stepwise selection of vertices based on the Rao and Wald statistics. However, these approaches primarily address regularization and adaptation at interior vertices, rather than resolving the instability stemming from an increase in boundary vertices. Hence, particularly in logistic regression, where the fitting algorithm tends to be unstable, these approaches do not ensure sufficient stability when the number of boundary vertices is large.

As a remedy, we adopt the RDP algorithm to impose sparsity on the boundary vertices. Previous studies have utilized the convex hull algorithm proposed by Eddy (1977) for initial triangulation; see, for example, Jhong et al. (2022) and Toussaint and Avis (1982). This algorithm is employed to find the minimum convex polygon for a given set of points, which often results in the generation of a large number of somewhat redundant boundary vertices. This dense representation of the boundary can cause instability of the optimization algorithm. If a large number of vertices are selected using the convex hull algorithm, adjusting the number of internal vertices does not significantly improve numerical performance. This is where the RDP algorithm has an advantage because it allows for a sparse representation of the boundary.

Since we are dealing with two-dimensional data, researchers can visually assess whether a sparse triangulation is suitable for the given data; it suffices to compare the sparse representation with a visualization of the data's shape and convex hull. Although the sparsity parameter ε of the RDP algorithm can be tuned by standard validation methods, we find that its choice does not have a significant impact on practical performance as long as it stays within a reasonable range. We recommend choosing ε from {0.01, 0.05, 0.1}, with some validation technique if required.
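A sketch of this initialization step, assuming the CRAN RDP package's RamerDouglasPeucker(x, y, epsilon) interface and grDevices::chull() named in Section 3.3; the helper name init_boundary is ours.

```r
library(RDP)   # provides RamerDouglasPeucker(x, y, epsilon)

# convex hull boundary, then RDP simplification with threshold eps
init_boundary <- function(x, eps = 0.05) {
  hull <- x[chull(x), ]                  # ordered convex hull vertices
  poly <- rbind(hull, hull[1, ])         # close the polygon for simplification
  simp <- RamerDouglasPeucker(poly[, 1], poly[, 2], eps)
  as.matrix(simp[-nrow(simp), ])         # drop the duplicated closing vertex
}
```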

3.3. Selection of interior centroids with K-means algorithm

We adopt the K-means algorithm (MacQueen, 1967) to determine the number and location of the interior vertices in a data-adaptive way. The K-means algorithm groups the data into clusters, and each cluster center is used as an interior vertex in the triangulation process. This ensures that the triangulation is refined in regions with many observations and substantial variation.

In the triangulation process, the tuning parameter K of the K-means algorithm represents the number of interior vertices. The choice of K has a significant impact on the performance of the proposed method. We choose the optimal K via the following Bayesian information criterion (BIC):

$$
\mathrm{BIC} = J \log(n) - 2\,\ell(\hat{\beta}),
$$

where J is the total number of vertices, namely the sum of K and the number of boundary vertices, and ℓ is the log-likelihood in (2.2) evaluated at the fitted coefficients.
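A hedged sketch of the BIC search over K, reusing the helpers sketched above; design_matrix() is a hypothetical stand-in for the routine that triangulates the vertex set and evaluates the tent basis at each observation.

```r
# choose K by minimizing BIC = J log(n) - 2 * loglik over a candidate grid
select_K <- function(x, y, eps = 0.05, K_grid = 1:10) {
  boundary <- init_boundary(x, eps)
  bic <- sapply(K_grid, function(K) {
    centers  <- stats::kmeans(x, centers = K)$centers
    vertices <- rbind(boundary, centers)
    G <- design_matrix(x, vertices)   # hypothetical tent-basis builder
    b <- coord_descent(G, y)          # fit via the Section 3.1 sketch
    nrow(vertices) * log(nrow(x)) - 2 * loglik(b, G, y)
  })
  K_grid[which.min(bic)]
}
```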

Through numerical experiments, we confirmed that the triangulation strategy, combined with the coordinate descent algorithm presented in Section 3.1, significantly enhances numerical stability and estimation accuracy. The proposed algorithm was implemented using R software, employing the grDevices, stats, and RDP packages. The overall implementation algorithm is summarized in Algorithm 1 below.

Algorithm 1: Implementation algorithm for STriPE

1: Input: x : predictors ∈ ℝ^{n×2},
        ε : threshold,
        K : number of interior vertices,
        J : total number of vertices,
        η : step size,
        δ : tolerance,
        max_iter : maximum iterations
2: Functions: grDevices::chull(),
          stats::kmeans(),
          RDP::RamerDouglasPeucker()
3: Initial triangulation:
4: Compute the boundary vertices:
        vertex_chull = x[chull(x), ]
        vertex_simplified = RamerDouglasPeucker(vertex_chull[, 1], vertex_chull[, 2], ε)
5: Choose K using the BIC statistic
6: Compute the interior vertices:
        interior_centroids = kmeans(x, K)$centers
7: Combine the boundary vertices and interior centroids
8: Compute the design matrix G ∈ ℝ^{n×J}
9: Coefficient initialization: β̃ = β̃_old = (1, . . . , 1) ∈ ℝ^J
10: while diff > δ and iteration < max_iter do
11:   β̃_old = β̃
12:   for j = 1 to J do
13:     Compute the gradient entry ∇S(β̃)_j using (3.1)
14:     Compute the Hessian entry ∇²S(β̃)_{jj} using (3.2)
15:     Update β̃_j ← β̃_j − η ∇S(β̃)_j / ∇²S(β̃)_{jj} using (3.3)
16:   end for
17:   diff = ‖β̃ − β̃_old‖
18: end while
19: Output: β̃
4. Numerical studies

4.1. Simulation study

This section illustrates the advantages of the proposed method based on simulation studies. We consider three probability functions defined on non-rectangular domains. Each function is defined as a logistic transformation of a linear combination of basis functions defined on a triangulation obtained by adding 1, 3, and 2 interior vertices to pre-specified boundary vertices, respectively. The contour plots of the example functions and the domain areas can be seen in Figure 2.

We randomly generated x_1, . . . , x_n in each domain and applied the proposed triangulation strategy. The ε parameter of the RDP algorithm is set to 0.05 for all three examples. In Figure 3, the three plots in the top row represent the triangulations obtained from the standard convex hull algorithm, with a total of 21, 24, and 14 vertices, respectively. Although the number of interior vertices is determined as K = 1, 3, and 2, respectively, a dense basis representation is obtained because some of the boundary vertices are redundant, which has a detrimental effect on the stability of the algorithm. On the other hand, the plots in the bottom row depict the result of applying the RDP algorithm, yielding sparse representations with 7, 10, and 7 vertices and regions closely resembling the true functions' domains. This significantly stabilizes the optimization algorithm.

We consider sample sizes of n = 100, 200, 300, 400 and 500. Through 50 replicates, we record the mean squared error (MSE) obtained by making predictions on randomly selected points. The MSE is defined as

$$
\mathrm{MSE}(\hat{p}) = \frac{1}{1000} \sum_{s=1}^{1000} \big( p(t_s) - \hat{p}(t_s) \big)^2,
$$

where the t_s are randomly selected points in Ω, independent of the training data. To illustrate the performance, we compared the proposed method with thin plate regression splines (Wood, 2003), cubic splines (De Boor, 1978), and kernel logistic regression (Zhu and Hastie, 2005). The knots for the cubic spline (CS) method were selected at quantile values of the observations, while the hyperparameters of the kernel logistic regression (KLR) method were determined by cross-validation. The average MSE values and standard errors for the STriPE, thin plate spline (TPS), CS, and KLR methods are summarized in Table 1, with the smallest MSE in boldface. The numerical results confirm that the proposed method exhibits superior performance. While the difference in MSE between the proposed method and the others tends to decrease as the sample size increases, our method still outperforms the others, with particularly large differences in small samples.
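In R, this error metric is a one-line computation given the true and fitted probabilities at the held-out points (vector names are illustrative).

```r
# average squared difference between true and fitted probabilities
# at the 1000 held-out evaluation points t_s
mse <- function(p_true, p_hat) mean((p_true - p_hat)^2)
```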

4.2. Digit data analysis

We apply our method to analyze the handwritten zip code database presented in LeCun et al. (1989), using the features "intensity" and "symmetry". Intensity represents the count of black pixels in an image, and symmetry represents how closely a character resembles its mirror image. We choose digits 7 and 8. Out of a total of 1187 training observations, we exclude outliers, resulting in 1184 training observations; the test set consists of 313 observations. The class label y is set to 0 for digit 7 and 1 for digit 8, and intensity and symmetry are used as the predictor variables x_1 and x_2, respectively.

In the left panel of Figure 4, we observe that the data is not distributed in a rectangular shape. We obtained an initial triangulation using the proposed initialization strategy, with the user-defined parameter ε of the RDP algorithm tuned to 0.01 and the number of centroids in the K-means algorithm tuned to 3. This yields 10 boundary vertices and 3 interior vertices. In the left panel of Figure 4, the red dots and blue dots represent digit 8 and digit 7, respectively, while the black diamonds represent the vertices. We observe that the three interior vertices are positioned near the boundary between the two digit classes, indicating their importance in forming the decision boundary. The right panel of Figure 4 visualizes the triangulation obtained from these vertices.

For x = (x_1, x_2) ∈ Ω, we compute p̂(x). If p̂(x) is greater than 0.5, we predict y as 1; otherwise, we predict y as 0. The decision boundary, where p̂(x) equals 0.5, is depicted as a black line in both plots of Figure 5. The in-sample accuracy on the training data is 0.8472, and the out-of-sample accuracy on the test data is 0.8233. For comparison, fitting logistic models using thin plate splines, cubic splines, and kernel logistic regression resulted in out-of-sample accuracies of 0.81, 0.8066, and 0.7866, respectively. Figure 5 visualizes the training and test data along with the decision boundary. The results indicate a tendency for intensity to decrease as symmetry increases for both digits 7 and 8. Symmetry is a key factor in classifying the digit data: digit 8 exhibits greater symmetry than digit 7, and the region for digit 8 forms on the right side of the decision boundary.
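The plug-in classification rule and the accuracy computation amount to a threshold at 0.5; a minimal sketch with illustrative names:

```r
# predict y = 1 (digit 8) when the fitted probability exceeds 0.5
predict_class <- function(p_hat) as.integer(p_hat > 0.5)

# share of correct predictions on held-out labels
accuracy <- function(y, p_hat) mean(y == predict_class(p_hat))
```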

5. Concluding remarks

This paper proposed a multivariate nonparametric logistic regression method based on sparse triangulation within a compact domain in ℝ². Sparsity on the boundary vertices of the triangulation was imposed by applying the RDP algorithm to the initial vertices obtained from the convex hull algorithm. The complexity of the estimation method is controlled by the threshold ε of the RDP algorithm and the number of centroids K in the K-means algorithm used to obtain the data-dependent interior vertices. This strategy, combined with the coordinate descent algorithm, helps stabilize the convergence of the implementation algorithm. The performance of the proposed method was investigated using synthetic and handwritten digit data. The results illustrate that the proposed method outperforms existing methods when the data is observed on non-rectangular domains.

The results of this paper are expected to provide a foundation for further research. They can be generalized and extended in a few ways. First, we can extend the method to the case of p-dimensional predictors where p > 2. To our knowledge, there has been no research applying spline methodology based on barycentric coordinates to nonparametric function estimation problems for p > 2. It is expected that by efficiently partitioning the domain using simplices and the associated spline basis, one can significantly improve the efficiency of nonparametric function estimation methods.

Second, we can consider combining the proposed methodology with sparsity-inducing penalization. In Jhong et al. (2022), a method for automatically selecting the number of vertices in triangulation-based regression problems using a total-variation type penalty was proposed. Building on this, we can develop a penalization method within the generalized linear model framework for choosing the number of vertices in a data-adaptive way after placing a sufficient number of vertices to ensure the flexibility of the model.

Figures
Fig. 1. The values b_1^{123}(x), b_2^{123}(x), b_3^{123}(x) are the relative areas of the green, blue, and red triangles with respect to the area of the triangle determined by {v_1, v_2, v_3}.
Fig. 2. The plots represent the contour plots of three example functions and the domain of those functions.
Fig. 3. Plot of the example 1, 2, and 3 via initial triangulation based on the vertices determined by the convex hull algorithm (top) and the RDP algorithm (bottom).
Fig. 4. Left panel presents a plot of the training data and the initial vertices. The red dots and blue dots represent Digit 8 and Digit 7, respectively, while the black diamonds represent the initial vertices. The right plot shows the initial triangulation determined by the initial vertices.
Fig. 5. Plot of the training data (left) and test data (right) along with the decision boundary (black solid line).
TABLES

Table 1

The average MSE values and the standard errors (in parentheses) of STriPE, TPS, CS, and KLR with 50 replicated simulations

Example 1

Sample Size STriPE(se) TPS(se) CS(se) KLR(se)
n = 100 0.0160(0.0013) 0.0345(0.0048) 0.0355(0.0053) 0.0425(0.0059)
n = 200 0.0061(0.0004) 0.0305(0.0011) 0.0116(0.0012) 0.0281(0.0028)
n = 300 0.0051(0.0004) 0.0073(0.0004) 0.0069(0.0004) 0.0221(0.0015)
n = 400 0.0036(0.0002) 0.0057(0.0002) 0.0056(0.0002) 0.0198(0.0011)
n = 500 0.0032(0.0002) 0.0046(0.0002) 0.0043(0.0002) 0.0177(0.0014)

Example 2

Sample Size STriPE(se) TPS(se) CS(se) KLR(se)

n = 100 0.0247(0.0014) 0.0377(0.0043) 0.0400(0.0050) 0.0916(0.0020)
n = 200 0.0129(0.0005) 0.0160(0.0009) 0.0150(0.0007) 0.0890(0.0012)
n = 300 0.0112(0.0004) 0.0121(0.0005) 0.0124(0.0005) 0.0900(0.0011)
n = 400 0.0090(0.0003) 0.0098(0.0003) 0.0099(0.0003) 0.0877(0.0009)
n = 500 0.0087(0.0002) 0.0092(0.0004) 0.0091(0.0003) 0.0891(0.0010)

Example 3

Sample Size STriPE(se) TPS(se) CS(se) KLR(se)

n = 100 0.0122(0.0011) 0.0287(0.0045) 0.0271(0.0043) 0.0180(0.0007)
n = 200 0.0055(0.0004) 0.0062(0.0006) 0.0067(0.0009) 0.0154(0.0003)
n = 300 0.0047(0.0003) 0.0047(0.0004) 0.0050(0.0005) 0.0146(0.0002)
n = 400 0.0030(0.0002) 0.0033(0.0003) 0.0034(0.0003) 0.0143(0.0001)
n = 500 0.0026(0.0001) 0.0026(0.0001) 0.0026(0.0002) 0.0141(0.0001)

In bold, best row-wise.


References
  1. Bak K-Y, Jhong J-H, Lee JJ, Shin J-K, and Koo J-Y (2021). Penalized logspline density estimation using total variation penalty. Computational Statistics & Data Analysis, 153, 107060.
  2. De Boor C (1978). A Practical Guide to Splines, Applied Mathematical Sciences 27, Springer-Verlag, New York.
  3. Douglas DH and Peucker TK (1973). Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: The International Journal for Geographic Information and Geovisualization, 10, 112-122.
  4. Eddy WF (1977). A new convex hull algorithm for planar sets. ACM Transactions on Mathematical Software (TOMS), 3, 398-403.
  5. Ferraccioli F, Arnone E, Finos L, Ramsay JO, and Sangalli LM (2021). Nonparametric density estimation over complicated domains. Journal of the Royal Statistical Society Series B: Statistical Methodology, 83, 346-368.
  6. Geenens G (2021). Mellin–Meijer kernel density estimation on ℝ+. Annals of the Institute of Statistical Mathematics, 73, 953-977.
  7. Hansen MH, Kooperberg C, and Sardy S (1998). Triogram models. Journal of the American Statistical Association, 93, 101-119.
  8. Hastie T, Tibshirani R, and Friedman JH (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed), Springer, New York.
  9. Jhong J-H, Bak K-Y, and Koo J-Y (2022). Penalized polygram regression. Journal of the Korean Statistical Society, 51, 1161-1192.
  10. Koenker R and Mizera I (2004). Penalized triograms: Total variation regularization for bivariate smoothing. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66, 145-163.
  11. Lai M-J and Schumaker LL (2007). Spline Functions on Triangulations, Cambridge University Press, Cambridge.
  12. Lai M-J and Wang L (2013). Bivariate penalized splines for regression. Statistica Sinica, 23, 1399-1417.
  13. LeCun Y, Boser B, Denker J, Henderson D, Howard R, Hubbard W, and Jackel L (1989). Handwritten digit recognition with a back-propagation network. Advances in Neural Information Processing Systems, 2.
  14. MacQueen J (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, 281-297, Oakland, CA, USA.
  15. Müller H-G (1991). Smooth optimum kernel estimators near endpoints. Biometrika, 78, 521-530.
  16. Nelder JA and Wedderburn RW (1972). Generalized linear models. Journal of the Royal Statistical Society Series A: Statistics in Society, 135, 370-384.
  17. Ramer U (1972). An iterative procedure for the polygonal approximation of plane curves. Computer Graphics and Image Processing, 1, 244-256.
  18. Ramsay T (2002). Spline smoothing over difficult regions. Journal of the Royal Statistical Society Series B: Statistical Methodology, 64, 307-319.
  19. Sangalli LM, Ramsay JO, and Ramsay TO (2013). Spatial spline regression models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 75, 681-703.
  20. Scott-Hayward LAS, MacKenzie ML, Donovan CR, Walker C, and Ashe E (2014). Complex region spatial smoother (CReSS). Journal of Computational and Graphical Statistics, 23, 340-360.
  21. Stone CJ (1994). The use of polynomial splines and their tensor products in multivariate function estimation. The Annals of Statistics, 22, 118-171.
  22. Toussaint GT and Avis D (1982). On a convex hull algorithm for polygons and its application to triangulation problems. Pattern Recognition, 15, 23-29.
  23. Tsybakov AB (2008). Introduction to Nonparametric Estimation, 1st edition, Springer, New York.
  24. Wang H and Ranalli MG (2007). Low-rank smoothing splines on complicated domains. Biometrics, 63, 209-217.
  25. Wang R, Ramos D, and Fierrez J (2012). Improving radial triangulation-based forensic palmprint recognition according to point pattern comparison by relaxation. In 2012 5th IAPR International Conference on Biometrics (ICB), New Delhi, 427-432, IEEE.
  26. Wasserman L (2006). All of Nonparametric Statistics, Springer Science & Business Media, Berlin.
  27. Wood SN (2003). Thin plate regression splines. Journal of the Royal Statistical Society Series B: Statistical Methodology, 65, 95-114.
  28. Wright GA and Zabin SM (1994). Nonparametric density estimation for classes of positive random variables. IEEE Transactions on Information Theory, 40, 1513-1535.
  29. Zhu J and Hastie T (2005). Kernel logistic regression and the import vector machine. Journal of Computational and Graphical Statistics, 14, 185-205.