In this paper, we study the maximal property of the volume of the convex hull of
Random convex hulls, the convex hulls of independent and identically distributed random points, have been studied for decades since the seminal work of Rényi and Sulanke (1963, 1964). In particular, researchers in stochastic geometry focus on functionals of the random convex hull (the two most important being the volume and the number of faces) and investigate their finite-sample and asymptotic properties.
The random convex hull is also employed in many multivariate statistical procedures. For example, Barnett (1976) defines an ordering of multivariate data based on the notion of convex hull peeling depth. In Cook (1979), the random convex hull generated by the data points is used to identify influential observations in linear regression. It is also used to find an optimal classifier in the machine learning literature (Fawcett and Niculescu-Mizil, 2007; Lim and Won, 2012; Son
In this paper, we are interested in the maximal property of the volume of the convex hull of
The remainder of the paper is organized as follows. In Section 2, we introduce the general notation used in the paper. In addition, we show the invariance of the volume of the random convex hull with respect to rotational and axis-scalable transformations. We also introduce the multivariate location-scale (MLS) family indexed by ∑, denoted by MLS(∑), which is invariant to the two transformations above. In Section 3, we prove the maximal property of the random convex hull under independence when the random vectors are from a distribution in MLS(∑). We then discuss the Gaussian random convex hull as an illustrative example and provide numerical illustrations of the results of that section. Finally, in Section 4, we conclude the paper with a discussion of the extension of the results to serially correlated data.
Suppose we consider
A vertex of a convex set
In this section, we introduce the MLS family that we assume for the distribution of random vectors in the paper.
The MLS family is one of the important parametric families, and many important distributions are included in the location-scale family. The family {
where G(·) is an arbitrary given probability measure on the
Some examples of the MLS family are as follows. Elliptically symmetric distributions (or simply elliptical distributions) belong to a location-scale family (Ollila
where
In this paper, we assume
Let us consider
The following lemma shows that the vertex set of chull(
The proof of (ii) is very similar to that of (i). Thus, we only prove (i) here.
We first show that P
By multiplying P
which implies
We now show that any point in chull(P
Again, by multiplying P
where the second equality follows from the fact that V = {
is equivalent to
which contradicts the fact that
We now present the main results of the paper, which show that the volume of the convex hull of a sample from MLS(∑) is stochastically maximized when the true covariance matrix ∑ is diagonal, equivalently,
Suppose V = {
which follows from the rotation transformation
We now show that v.chull(
where
where
Finally, we conclude the proof using Hadamard’s inequality (Cover and Gamal, 1983), which states that, for the covariance matrix ∑,
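As a numerical sanity check of the two ingredients in this argument, the following Python sketch (standard library only, two dimensions) verifies that an invertible linear map A scales the hull area by |det A|, and that det(∑) is at most the product of the diagonal entries of ∑. The values `sd1`, `sd2`, `rho`, the sample size, and the helper `hull_area` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def hull_area(points):
    """Area of the convex hull of 2-D points (monotone chain + shoelace)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        # Build one chain of the hull, dropping its last point (it starts the other chain).
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half(pts) + half(reversed(pts))
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical covariance: standard deviations sd1, sd2 and correlation rho.
sd1, sd2, rho = 1.0, 2.0, 0.6
det_sigma = (sd1 * sd2) ** 2 * (1.0 - rho ** 2)
diag_product = sd1 ** 2 * sd2 ** 2  # Hadamard: det_sigma <= diag_product

# Cholesky factor of Sigma, so that x = L z has covariance Sigma.
l11, l21, l22 = sd1, rho * sd2, sd2 * math.sqrt(1.0 - rho ** 2)

rng = random.Random(0)
z = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
x_corr = [(l11 * a, l21 * a + l22 * b) for a, b in z]  # correlated coordinates
x_indep = [(sd1 * a, sd2 * b) for a, b in z]           # same variances, rho = 0

area_z = hull_area(z)
area_corr = hull_area(x_corr)    # equals sqrt(det_sigma) * area_z up to rounding
area_indep = hull_area(x_indep)  # equals sd1 * sd2 * area_z up to rounding

print(det_sigma <= diag_product, area_corr <= area_indep)
```

Because the same draws `z` are reused for both samples, the comparison holds sample by sample here; this coupling is a choice made for the illustration, whereas the result above is a statement about distributions.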
The Gaussian random convex hull, the random convex hull for
where
However, Hug (2013) points out that the explicit finite-sample distribution function is still unknown. Instead, some of its asymptotic properties are known. For example, Bárány and Vu (2007) prove a central limit theorem for the volume of the Gaussian random convex hull.
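Since the finite-sample distribution is not available in closed form, it can be explored by simulation. The following is a minimal Monte Carlo sketch in Python (standard library only) for the planar case with standard Gaussian points. The function names, sample sizes, number of replications, and seed are illustrative assumptions.

```python
import random
import statistics


def hull_area(points):
    """Area of the convex hull of 2-D points (monotone chain + shoelace)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half(pts) + half(reversed(pts))
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def simulate_areas(n, reps, seed=0):
    """Hull areas of `reps` independent samples of n standard Gaussian points."""
    rng = random.Random(seed)
    areas = []
    for _ in range(reps):
        pts = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        areas.append(hull_area(pts))
    return areas


# Empirical mean and standard deviation of the area for a few sample sizes.
for n in (10, 50, 250):
    areas = simulate_areas(n, reps=200)
    print(n, round(statistics.mean(areas), 3), round(statistics.stdev(areas), 3))
```

The list returned by `simulate_areas` can be passed to a histogram or box plot routine to inspect the empirical distribution for a given sample size.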
Theorem 1 in Section 3 further shows that, for a general ∑, the volume has a constant multiplicative factor
is invariant to the scale transformation and has the same distribution as
where E(v.chull(
We now numerically illustrate the findings of the previous subsection. The identity (
where
has the same distribution as
and is invariant to the choice of
We generate samples from
and its log-determinant is log det(CS(
and its log-determinant is log det(AR1(
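To make the two covariance structures concrete, the sketch below builds a compound symmetry matrix and an AR(1) matrix and evaluates their log-determinants both numerically and in closed form. It assumes the common unit-variance parameterizations: CS(ρ) has ones on the diagonal and ρ elsewhere, and AR1(ρ) has (i, j) entry ρ^|i−j|. These forms, the function names, and the values of `p` and `rho` are assumptions for illustration and may differ from the definitions used in the paper.

```python
import math


def cs(p, rho):
    """Assumed compound symmetry matrix: 1 on the diagonal, rho elsewhere."""
    return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]


def ar1(p, rho):
    """Assumed AR(1) matrix: entry (i, j) equals rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]


def log_det_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(p))


p, rho = 4, 0.5  # hypothetical values; CS needs -1/(p-1) < rho < 1

# Closed forms under the assumed parameterizations.
closed_cs = (p - 1) * math.log(1.0 - rho) + math.log(1.0 + (p - 1) * rho)
closed_ar1 = (p - 1) * math.log(1.0 - rho ** 2)

print(log_det_spd(cs(p, rho)), closed_cs)
print(log_det_spd(ar1(p, rho)), closed_ar1)
```

Under these parameterizations both log-determinants are at most zero, the log-determinant of the identity matrix, in line with Hadamard’s inequality.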
The box plots of
In this paper, the maximal property of the volume of the convex hull of
A possible direction for future research is to extend this conclusion to random convex hulls from dependent samples, including time series data. In time series data, the lagged plot, the plot (