Predicting depth value of the future depth-based multivariate record
Communications for Statistical Applications and Methods 2023;30:453-465
Published online September 30, 2023
© 2023 Korean Statistical Society.

Samaneh Tat (a), Mohammad Reza Faridrohani (1, a)

(a) Department of Statistics, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
(1) Correspondence to: Department of Statistics, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran 1983969411, Iran. E-mail: m_faridrohani@sbu.ac.ir
Received February 19, 2023; Revised February 25, 2023; Accepted February 25, 2023.
 Abstract
The problem of predicting future records has been discussed by many authors for univariate records, but it has not yet been addressed for multivariate records. Among the various definitions of multivariate records, depth-based records are adopted in this paper. Using the maximum likelihood and conditional median methods, we obtain point and interval predictions of the depth values of future depth-based multivariate records on the basis of the observed ones. The problem is studied for observations drawn from some members of the elliptical family of distributions. Finally, the satisfactory performance of the prediction methods is illustrated via simulation studies and a real dataset on drought in Kermanshah city.
Keywords : multivariate record, depth function, depth-based multivariate record, prediction, elliptical distribution
1. Introduction

Let X1, X2, . . . be a sequence of independent and identically distributed (i.i.d.) random variables with cumulative distribution function (cdf) F(x, θ) and probability density function (pdf) f(x, θ). An observation Xj is called an upper record value if Xj > Xi for every i < j. Lower records are defined analogously by reversing the inequality. Records of i.i.d. random variables thus have a unique and unambiguous definition. Univariate records, their characteristics and their applications are well described in the literature; see Resnick (1973), Arnold et al. (2011), Nevzorov (2001) and the references therein. Estimation and prediction based on univariate records have also been studied comprehensively. Ahsanullah (1980), Smith (1988), Berred (1998), Raqab et al. (2007), Ahmadi et al. (2009) and Ahsanullah and Nevzorov (2015) are some examples of studies on inferential problems based on univariate records.

Introducing the notion of a record in multi-dimensional space is not as simple as in one-dimensional space, because multivariate observations lack the natural ordering of univariate samples. Hence, various definitions of multivariate records have been proposed. Examples include the Pareto record, dominating record and chain record presented by Gnedin (2007) and Hwang and Tsai (2010), and the depth-based record presented by Tat and Faridrohani (2021).

Although several definitions of multivariate records exist, the problem of predicting future multivariate records has not been studied so far. The goal of this paper is to approach this problem through depth-based records. In the depth-based procedure, the position of each observation is determined by its depth value with respect to a global data cloud χs = {X1, . . ., Xs}, and this value decides whether a multivariate observation is a record: an observation is a depth-based record if its depth value is smaller than those of all previous observations. Measuring the depth of observations in a global data cloud is therefore an essential step in recognizing depth-based records, and predicting the depth value of future records is correspondingly important.

Depth-based multivariate records were defined and investigated thoroughly by Tat and Faridrohani (2021), who also presented the marginal and joint distributions of depth-based record times and record values. These results led to maximum likelihood estimation of the parameters of the elliptical distribution family. The same distributional information is what we need in this paper for predicting records.

Let X1, X2, . . . be a sequence of multivariate observations with cdf F(x, θ) and pdf f(x, θ). Let Rm = {R1, R2, . . ., Rm} be the set of the first m depth-based records and Zm = {Z1, Z2, . . ., Zm} be the set of their corresponding depth values. Predicting Rt with t > m, conditional on the observed Rm and Zm, is an interesting problem in the field of multivariate records, and the ultimate goal is to predict the future depth-based record value Rt, t > m. Each record value Ri is considered as the concomitant of its depth value Zi. Predicting the depth value of a future depth-based record is therefore a basic first step towards predicting the record value itself. In this paper, we are concerned with predicting the value of Zt, t > m, given Rm and Zm. This brings us one step closer to predicting the future multivariate record value Rt, t > m.

This paper is organized as follows. Section 2 introduces depth notions and depth-based records, presents the concomitant approach through which the definition of a depth-based record is reconsidered, and studies the distribution of the depth value under four states, in which the observations come from a multivariate normal or multivariate t distribution and the records are recognized through the Mahalanobis or projection depth function. In Section 3, we introduce two methods, the maximum likelihood and conditional median procedures, for point and interval prediction of depth values. The maximum likelihood equations do not have closed-form solutions, so a numerical procedure is needed to find the maximum likelihood predictor (MLP), whereas the conditional median predictor (CMP) can be written in closed form. The performance of the two predictors is evaluated in Section 4 by means of the bias and the mean square prediction error. The corresponding prediction intervals, simulations and an illustrative example are also discussed in that section. Finally, Section 5 contains the concluding remarks.

2. Preliminaries

As pointed out in Tat and Faridrohani (2021), from the depth-based viewpoint an observation is a proper candidate for a record if it is relatively far from the previous observations. We should therefore be able to determine the position of each observation relative to the dataset from which it is drawn, and the notion of depth is a good tool for this purpose. In this section, we first give a brief review of data depth and present the two depth functions used in this paper. We then introduce depth-based multivariate records and review the characteristics needed later.

2.1. Depth notion

Data depth is a device for measuring the centrality of a given point with respect to a multivariate dataset or distribution. Employing the depth values, a center-outward ordering of points and subsequently, center-outward ranking are formed. The center-outward ranking has been widely applied in multivariate nonparametric inferences. It would be beneficial in presenting depth-based multivariate records, too.

Let ℱ be the class of distributions on ℝp and let F ∈ ℱ. An associated depth function D(x, F) is defined to provide a center-outward ordering of points x ∈ ℝp relative to the distribution F. Under the center-outward ordering interpretation, the set of points that globally maximize depth is the center, and points near the center have higher depth (Zuo and Serfling, 2000a). A formal definition of a statistical depth function was given by Zuo and Serfling (2000a) as follows:

Let the mapping D(·; ·) : ℝp × ℱ → ℝ1 be bounded and nonnegative, and let it satisfy the following properties:

  • Affine invariance: $D(Ax + b, F_{AX+b}) = D(x, F_X)$ holds for any random vector X in ℝp, any p × p nonsingular matrix A and any p-vector b.

  • Maximality at center: $D(\theta, F) = \sup_{x \in \mathbb{R}^p} D(x, F)$ holds if F is symmetric about θ in some sense.

  • Monotonicity relative to the deepest point: for any F having deepest point θ, D(x, F) ≤ D(θ + α(x − θ), F) holds for α ∈ [0, 1].

  • Vanishing at infinity: if ‖x‖ → ∞, then D(x, F) → 0.

Then D(·, F) is called a statistical depth function. For more details see Zuo and Serfling (2000a) and Serfling (2006).

D(Xj, F) is a measure of depth value of the observation Xj with respect to the underlying distribution. When F is unknown, D(Xj, F) can be replaced with its sample version, D(Xj, Fs), which calculates the depth value of the point Xj with respect to a corresponding global data cloud, χs = {X1, . . .,Xs}.

Hereafter, D(Xj) = D(Xj, F) and Ds(Xj) = D(Xj, Fs) = D(Xj, χs).

The term depth was first used by Tukey (1975) to introduce the halfspace depth function. Since then, various depth functions have been introduced and applied for measuring the depth value of observations. Here, only the Mahalanobis and projection-based depth functions are presented.

Mahalanobis depth (Liu and Singh, 1993): The Mahalanobis depth of x ∈ ℝp with respect to the underlying distribution F is

$$\mathrm{MD}(x, F) = \frac{1}{1 + (x - \mu_F)' \Sigma_F^{-1} (x - \mu_F)},$$

where $(x - \mu_F)' \Sigma_F^{-1} (x - \mu_F)$ is the (squared) Mahalanobis distance between x and the center vector $\mu_F$ with respect to the dispersion matrix $\Sigma_F$ of the distribution F.

Projection-based depth (Liu, 1992; Zuo, 2003): Define the outlyingness

$$O(x, F) = \sup_{\|u\| = 1} \frac{|u'x - \mu(F_u)|}{\sigma(F_u)},$$

where $F_u$ is the distribution of $u'X$, and the ratio $|u'x - \mu(F_u)| / \sigma(F_u)$ is taken to be 0 if $u'x - \mu(F_u) = \sigma(F_u) = 0$. Then the projection-based depth (PD) of a point x ∈ ℝp with respect to the given F is defined as

$$\mathrm{PD}(x, F) = \frac{1}{1 + O(x, F)}.$$

Remark 1. When F is unknown, the sample versions of the above depth functions are used instead. The sample version of the Mahalanobis depth function, denoted by MD(x, Fs) = MD(x, χs), where s is the number of observations in the global data cloud, is obtained by replacing (μ, Σ) with (χ̄, S), where χ̄ is the centroid of the data and S is the empirical covariance matrix of χs. Similarly, the sample version of PD is denoted by PD(x, Fs) = PD(x, χs). Each choice of the pair (μ, σ) gives a specific PD; the pair (μ, σ) = (Med, MAD) has long been in use, where Med is the median and MAD is the median absolute deviation of χs.
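As an illustration of the sample Mahalanobis depth described in Remark 1, the following is a minimal sketch for bivariate data using only the Python standard library. The function name `sample_mahalanobis_depth` is hypothetical, and the n − 1 divisor in the empirical covariance matrix is an assumption, since the text does not state which divisor is used.

```python
def sample_mahalanobis_depth(x, cloud):
    """Sample Mahalanobis depth MD(x, cloud) of a 2-D point x.

    Computes 1 / (1 + (x - mean)' S^{-1} (x - mean)), where mean is the
    centroid of the cloud and S its empirical covariance matrix
    (assumption: n - 1 divisor).
    """
    n = len(cloud)
    mx = sum(p[0] for p in cloud) / n
    my = sum(p[1] for p in cloud) / n
    sxx = sum((p[0] - mx) ** 2 for p in cloud) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cloud) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cloud) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("empirical covariance matrix is singular")
    dx, dy = x[0] - mx, x[1] - my
    # Quadratic form with the closed-form inverse of a 2 x 2 matrix.
    dist2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + dist2)


cloud = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
depth_center = sample_mahalanobis_depth((1.0, 1.0), cloud)
depth_corner = sample_mahalanobis_depth((2.0, 2.0), cloud)
```

The centroid receives the maximal depth of 1, and points farther from it receive smaller depths, which is the center-outward ordering used to recognize records.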

2.2. Depth-based records

Let χs = {X1, . . ., Xs} be the global data cloud, containing the first s random observations from the p-dimensional distribution function F(x, θ). The observation Xj is a sample depth-based record if either j = 1, or j > 1 and

$$D_s(X_j) = \min\left(D_s(X_1), D_s(X_2), \ldots, D_s(X_j)\right),$$

where Ds(Xj) is the depth value of the observation Xj with respect to the data cloud χs.

The sequences of depth-based record times {Tn} and record values {Rn} are defined as follows:

$$T_1 = 1, \qquad R_1 = X_1,$$
$$T_m = \min\left\{ j > T_{m-1} : D_s(X_j) = \min\left(D_s(X_1), D_s(X_2), \ldots, D_s(X_j)\right) \right\}, \qquad R_m = X_{T_m}.$$

For more familiarity with depth-based record and details on its characteristics see Tat and Faridrohani (2021).
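The record-time definition above can be sketched in a few lines. This is a minimal illustration, assuming the depth values Ds(X1), . . ., Ds(Xs) with respect to the global data cloud have already been computed; the function name `depth_record_times` is hypothetical. A strict inequality is used, which agrees with the definition whenever there are no ties (ties have probability zero for continuous distributions).

```python
def depth_record_times(depths):
    """Return the 1-based record times T_1 < T_2 < ... for a list of depths.

    Observation j is a depth-based record if its depth is smaller than the
    depths of all earlier observations; the first observation is always one.
    """
    times = []
    current_min = float("inf")
    for j, d in enumerate(depths, start=1):
        if d < current_min:
            times.append(j)
            current_min = d
    return times


depths = [0.5, 0.7, 0.3, 0.4, 0.1]
times = depth_record_times(depths)
record_depths = [depths[j - 1] for j in times]
```

The depths at the record times form a decreasing sequence, which is the sequence of lower records used in the next subsection.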

2.3. Distributions

To obtain the predictors, we need the pdf of the depth values of depth-based records.

Let χs = {X1, . . ., Xs} be a random sample from an absolutely continuous distribution function FX(x, θ). Correspondingly, there is a set of depth values {Y1, . . ., Ys}, s ≥ 1, where Yj = Ds(Xj) is a univariate random variable with continuous distribution function FY(y). This set is sorted in descending order by applying the center-outward ordering scheme, so that the last member of the sorted set is a lower record in the sense of univariate records. In this way, the pairs (X[1], Ds(X[1])), . . ., (X[s], Ds(X[s])) of order statistics and their concomitants are obtained. By the concomitant procedure and the definition of depth-based records, Xj is a depth-based record if Ds(Xj) is a lower record in the sequence Y1, . . ., Ys.

Now suppose that Rj = XTj, j = 1, 2, . . ., m. Then r = {r1, . . ., rm} denotes the set of the first m observed depth-based record values from χs. Correspondingly, z = {z1, z2, . . ., zm} are the depth values of the records, with zj = Ds(rj) for j = 1, . . ., m and z1 > z2 > · · · > zm.

According to Tat and Faridrohani (2021), the joint pdf of the depths of the first m depth-based records, Zm = {Z1, Z2, . . ., Zm}, has the same form as that of ordinary univariate lower records obtained by Ahsanullah (2009):

$$f_{Z_1, \ldots, Z_m}(z_1, \ldots, z_m) = \frac{f_Y(z_1)}{F_Y(z_1)} \frac{f_Y(z_2)}{F_Y(z_2)} \cdots \frac{f_Y(z_{m-1})}{F_Y(z_{m-1})} f_Y(z_m). \tag{2.1}$$

The marginal pdf of the depth of the jth depth-based record, Zj, is given by

$$f_{Z_j}(z) = f_Y(z) \frac{\left(-\log F_Y(z)\right)^{j-1}}{(j-1)!}. \tag{2.2}$$

Also, the conditional pdf of Zt given Zm, m < t < s, is

$$f_{t|m}(z_t \mid z_m) = \frac{f_Y(z_t) \left(\log F_Y(z_m) - \log F_Y(z_t)\right)^{t-m-1}}{F_Y(z_m)\,(t-m-1)!}. \tag{2.3}$$

It should be noted that FY(y) depends on FX(x, θ) and cannot be determined independently of it. We now describe the connection between the distribution of the multivariate observations and the distribution of their depth values.

It is known from the literature that if the observations come from an elliptical distribution and the depth function satisfies certain properties, then the distribution of the depth values has a known form.

A random vector X in ℝp has an elliptical distribution, X ~ Ed(h; μ, Σ), if its density function is of the form

$$f(x; \mu, \Sigma) = C_p |\Sigma|^{-1/2}\, h\left((x - \mu)' \Sigma^{-1} (x - \mu)\right), \tag{2.4}$$

where Cp is a constant depending on p and the function h; μ is the center of the distribution; Σ is a positive definite matrix and h(·) is a nonnegative function (Liu et al., 1999).

According to Zuo and Serfling (2000b), if X has the elliptical distribution defined in (2.4) and D(·, F) is affine invariant (property P1) and attains its maximum value at μ (property P2), then D(x, F) can be written in the form

$$D(x, F) = g\left((x - \mu)' \Sigma^{-1} (x - \mu)\right), \tag{2.5}$$

for some nonincreasing function g.

The function g is available for both the Mahalanobis and projection depth functions. For t ≥ 0, g(t) = 1/(1 + t) for the Mahalanobis depth function and, based on Zuo (2003), g(t) = 1/(1 + √t) for the projection depth function.

The multivariate t and multivariate normal distributions are members of the elliptical family and can easily be written in the form of (2.5).

If the multivariate observations come from a multivariate t or multivariate normal distribution and the applied depth function is the Mahalanobis or projection depth, we obtain the following four states.

  • S1. Let X1, X2, . . . be a sequence of i.i.d. observations from Np(μ, Σ), and suppose that the Mahalanobis depth function is applied for measuring the depth values of the observations. Define Y = D(X) = 1/(1 + T), where $T = (X - \mu)' \Sigma^{-1} (X - \mu) \sim \chi^2_p$, a chi-square distribution with p degrees of freedom (DeGroot, 2005). Then the distribution function of Y is

    $$F_Y(y) = \int_{\frac{1-y}{y}}^{\infty} \frac{1}{\Gamma(p/2)} \left(\frac{1}{2}\right)^{p/2} t^{\frac{p}{2} - 1} e^{-t/2}\, dt. \tag{2.6}$$

  • S2. Let X1, X2, . . . be a sequence of i.i.d. observations from Np(μ, Σ), and suppose that the projection depth function is employed for measuring the depth values of the observations. Then Y = D(X) = 1/(1 + √T), where $T = (X - \mu)' \Sigma^{-1} (X - \mu)$. The distribution function of Y is of the form

    $$F_Y(y) = \int_{\left(\frac{1-y}{y}\right)^2}^{\infty} \frac{1}{\Gamma(p/2)} \left(\frac{1}{2}\right)^{p/2} t^{\frac{p}{2} - 1} e^{-t/2}\, dt. \tag{2.7}$$

  • S3. Let X1, X2, . . . be a sequence of i.i.d. observations from tp(μ, Σ, ν). If the Mahalanobis depth function is used for calculating the depth values of the observations, then Y = D(X) = 1/(1 + pF), where $F = T/p = (1/p)(X - \mu)' \Sigma^{-1} (X - \mu) \sim F(p, \nu)$, the F distribution with p numerator degrees of freedom and ν denominator degrees of freedom. The distribution function of Y is given by

    $$F_Y(y) = \int_{\frac{1-y}{py}}^{\infty} \frac{\Gamma((p+\nu)/2)}{\Gamma(\nu/2)\,\Gamma(p/2)} \left(\frac{p}{\nu}\right)^{p/2} \frac{f^{\frac{p-2}{2}}}{\left[1 + (fp/\nu)\right]^{\frac{p+\nu}{2}}}\, df. \tag{2.8}$$

  • S4. Let X1, X2, . . . be a sequence of i.i.d. observations from tp(μ, Σ, ν), and suppose the depth values of the observations are calculated using the projection depth function. Then Y = D(X) = 1/(1 + √(pF)), where $F = T/p = (1/p)(X - \mu)' \Sigma^{-1} (X - \mu) \sim F(p, \nu)$, has the distribution function

    $$F_Y(y) = \int_{\frac{(1-y)^2}{py^2}}^{\infty} \frac{\Gamma((p+\nu)/2)}{\Gamma(\nu/2)\,\Gamma(p/2)} \left(\frac{p}{\nu}\right)^{p/2} \frac{f^{\frac{p-2}{2}}}{\left[1 + (fp/\nu)\right]^{\frac{p+\nu}{2}}}\, df. \tag{2.9}$$

The depth of future depth-based records is predicted under each of the above states in what follows.
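For the bivariate case p = 2, which is the dimension used later in the simulations, the distribution functions under S1 and S2 reduce to closed forms, because a chi-square variable with 2 degrees of freedom satisfies P(T ≥ c) = exp(−c/2). The sketch below is an illustration of this special case only; the function names are hypothetical and the code is not taken from the paper.

```python
import math


def cdf_depth_md_normal_p2(y):
    """F_Y(y) under S1 with p = 2: P(T >= (1 - y) / y), T ~ chi-square(2)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return math.exp(-(1.0 - y) / (2.0 * y))


def cdf_depth_pd_normal_p2(y):
    """F_Y(y) under S2 with p = 2: P(T >= ((1 - y) / y)^2), T ~ chi-square(2)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    r = (1.0 - y) / y
    return math.exp(-0.5 * r * r)


def quantile_depth_md_normal_p2(u):
    """Inverse of cdf_depth_md_normal_p2 for 0 < u < 1."""
    return 1.0 / (1.0 - 2.0 * math.log(u))


def quantile_depth_pd_normal_p2(u):
    """Inverse of cdf_depth_pd_normal_p2 for 0 < u < 1."""
    return 1.0 / (1.0 + math.sqrt(-2.0 * math.log(u)))


md_at_half = cdf_depth_md_normal_p2(0.5)
pd_at_half = cdf_depth_pd_normal_p2(0.5)
```

For general p, or for the t-distribution states S3 and S4, the integrals have no such elementary form and would be evaluated with a numerical chi-square or F distribution routine.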

3. Prediction

In this section, we deal with the problem of predicting the depth of depth-based records. Suppose that we observe the first m depth-based records, r = {r1, . . ., rm}, and the corresponding depth values, z = {z1, z2, . . ., zm}, from FX(x, θ) and FY(y) respectively, where θ is unknown. Our aim is to predict Zt, t > m, having observed z, using two different schemes: maximum likelihood prediction and conditional median prediction.

3.1. Maximum likelihood prediction

In this subsection, our objective is to predict the depth of future depth-based records using the maximum likelihood approach. The joint predictive likelihood function of Zt is given by

$$L(z, Z_t) = f(z)\, h(z_t \mid z) = f(z)\, f_{t|m}(z_t \mid z_m),$$

where the last equality follows from the Markovian property of univariate records. If there exists $Z_t^{0} = Z_t^{\mathrm{mle}}$ such that

$$L(z, Z_t^{0}) = \sup_{z_t} L(z, Z_t),$$

then $Z_t^{0}$ is the maximum likelihood predictor (MLP) of Zt.

Using equations (2.1)–(2.3), the predictive likelihood function is

$$L(z, Z_t) = \prod_{j=1}^{m} \frac{f_Y(z_j)}{F_Y(z_j)} \times \frac{f_Y(z_t) \left(\log F_Y(z_m) - \log F_Y(z_t)\right)^{t-m-1}}{(t-m-1)!},$$

and consequently, the predictive log-likelihood function is given by

$$l(z, Z_t) = \sum_{j=1}^{m} \left(\log f_Y(z_j) - \log F_Y(z_j)\right) + (t-m-1) \log\left(\log F_Y(z_m) - \log F_Y(z_t)\right) + \log f_Y(z_t) - \log (t-m-1)!.$$

The MLP of Zt can be obtained from the following likelihood equation:

$$0 = \frac{\partial l(z, Z_t)}{\partial Z_t} = \frac{f_Y'(z_t)}{f_Y(z_t)} - \frac{(t-m-1)\, f_Y(z_t)}{F_Y(z_t) \left(\log F_Y(z_m) - \log F_Y(z_t)\right)}.$$

This equation cannot be solved in closed form, so a numerical procedure is needed to find the maximum likelihood predictor (MLP).
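To make the numerical step concrete, here is a minimal sketch that maximizes the part of the predictive log-likelihood depending on z_t over (0, z_m). It assumes the bivariate normal case with Mahalanobis depth (state S1 with p = 2), where FY(y) = exp(−(1 − y)/(2y)) and fY(y) = FY(y)/(2y²). Golden-section search is used here purely for illustration in place of the Nelder-Mead routine that the paper applies in R; all function names are hypothetical.

```python
import math


def log_cdf(y):
    """log F_Y(y) under S1 with p = 2, where F_Y(y) = exp(-(1 - y) / (2y))."""
    return 0.5 - 0.5 / y


def log_pdf(y):
    """log f_Y(y), using f_Y(y) = F_Y(y) / (2 y^2)."""
    return log_cdf(y) - math.log(2.0 * y * y)


def predictive_loglik(z_t, z_m, m, t):
    """Terms of the predictive log-likelihood that depend on z_t."""
    k = t - m - 1
    value = log_pdf(z_t)
    if k > 0:
        q = log_cdf(z_m) - log_cdf(z_t)
        if q <= 0.0:
            return -math.inf
        value += k * math.log(q)
    return value


def mlp_golden_section(z_m, m, t, tol=1e-10):
    """Maximize predictive_loglik over z_t in (0, z_m) by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 1e-9, z_m
    a = hi - inv_phi * (hi - lo)
    b = lo + inv_phi * (hi - lo)
    fa = predictive_loglik(a, z_m, m, t)
    fb = predictive_loglik(b, z_m, m, t)
    while hi - lo > tol:
        if fa < fb:
            # The maximum lies in [a, hi].
            lo, a, fa = a, b, fb
            b = lo + inv_phi * (hi - lo)
            fb = predictive_loglik(b, z_m, m, t)
        else:
            # The maximum lies in [lo, b].
            hi, b, fb = b, a, fa
            a = hi - inv_phi * (hi - lo)
            fa = predictive_loglik(a, z_m, m, t)
    return 0.5 * (lo + hi)


mlp_next = mlp_golden_section(0.5, 6, 7)
mlp_two_ahead = mlp_golden_section(0.5, 6, 8)
```

In this special case the objective is unimodal in z_t, which is what golden-section search requires. For t = m + 1 the maximizer is min(z_m, 1/4), so the search returns a value at the boundary z_m whenever z_m < 1/4.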

3.2. Conditional median prediction

Using the conditional distribution of Zt given Zm is another approach to predicting future records. Here, we use this method for predicting the depth of future depth-based records.

Under the assumptions of this section and using the Markovian property of univariate records, the conditional density of Zt given z is

$$f(z_t \mid z) = f(z_t \mid z_m) = \frac{f_Y(z_t)}{F_Y(z_m)} \times \frac{\left(\log F_Y(z_m) - \log F_Y(z_t)\right)^{t-m-1}}{(t-m-1)!}, \qquad z_t < z_m. \tag{3.3}$$

The median of this distribution is called the conditional median predictor (CMP). The CMP is a function of zm, so write

$$\hat{Z}_t^{\mathrm{CMP}} = h(z_m) = \omega.$$

Hence,

$$P\left(h(z_m) \le Z_t \le z_m \mid Z_m = z_m\right) = \frac{1}{2}.$$

Using equation (3.3),

$$\int_{\omega}^{z_m} \frac{f_Y(z_t)}{F_Y(z_m)} \times \frac{\left(\log F_Y(z_m) - \log F_Y(z_t)\right)^{t-m-1}}{(t-m-1)!}\, dz_t = \frac{1}{2}.$$

If Zm = zm has been observed, then by taking q = log FY(zm) − log FY(zt), the above equality can be rewritten as

$$\int_{0}^{\log \frac{F_Y(z_m)}{F_Y(\omega)}} \frac{q^{t-m-1}}{(t-m-1)!}\, e^{-q}\, dq = \frac{1}{2}.$$

So,

$$\hat{Z}_t^{\mathrm{CMP}} = F_Y^{-1}\left(F_Y(z_m) \times e^{-\mathrm{Med}(W)}\right), \tag{3.4}$$

where W ~ G(t − m, 1) has a gamma distribution with shape parameter t − m and scale parameter 1, and Med(W) stands for the median of W.

Suppose that M = log(FY(zm)/FY(Zt)). It is easily shown that, given zm, $2M \sim \chi^2_{2(t-m)}$. Thus, the exact (1 − γ)100% prediction interval for Zt is

$$I = \left(F_Y^{-1}\left(\exp\left\{-\tfrac{1}{2}\chi^2_{2(t-m),\,1-(\gamma/2)}\right\} F_Y(z_m)\right),\; F_Y^{-1}\left(\exp\left\{-\tfrac{1}{2}\chi^2_{2(t-m),\,\gamma/2}\right\} F_Y(z_m)\right)\right), \tag{3.5}$$

where $\chi^2_{r,p}$ is the 100pth percentile of the χ2 distribution with r degrees of freedom.
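The CMP and the prediction interval of this subsection can be sketched as follows. The sketch assumes the bivariate normal case with Mahalanobis depth (state S1 with p = 2), for which FY(y) = exp(−(1 − y)/(2y)) has an explicit inverse. It uses the facts that the Gamma(k, 1) cdf with integer shape k equals 1 − e^{−x} Σ_{i<k} x^i/i!, and that a chi-square quantile with 2k degrees of freedom is twice the corresponding Gamma(k, 1) quantile. All function names are hypothetical.

```python
import math


def gamma_cdf_int_shape(x, k):
    """CDF at x of Gamma(shape=k, scale=1) for a positive integer k."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= x / i
        total += term
    return 1.0 - math.exp(-x) * total


def gamma_quantile_int_shape(prob, k, tol=1e-12):
    """Quantile of Gamma(shape=k, scale=1) found by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf_int_shape(hi, k) < prob:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf_int_shape(mid, k) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cdf_y(y):
    """F_Y(y) under S1 with p = 2, for 0 < y < 1."""
    return math.exp(-(1.0 - y) / (2.0 * y))


def quantile_y(u):
    """Inverse of cdf_y for 0 < u < 1."""
    return 1.0 / (1.0 - 2.0 * math.log(u))


def conditional_median_predictor(z_m, m, t):
    """CMP: F_Y^{-1}(F_Y(z_m) * exp(-Med(W))), W ~ Gamma(t - m, 1)."""
    med = gamma_quantile_int_shape(0.5, t - m)
    return quantile_y(cdf_y(z_m) * math.exp(-med))


def prediction_interval(z_m, m, t, gamma=0.05):
    """Exact (1 - gamma) prediction interval for Z_t given z_m."""
    chi2_hi = 2.0 * gamma_quantile_int_shape(1.0 - gamma / 2.0, t - m)
    chi2_lo = 2.0 * gamma_quantile_int_shape(gamma / 2.0, t - m)
    f_m = cdf_y(z_m)
    lower = quantile_y(math.exp(-0.5 * chi2_hi) * f_m)
    upper = quantile_y(math.exp(-0.5 * chi2_lo) * f_m)
    return lower, upper


cmp_value = conditional_median_predictor(0.5, 6, 7)
pi_lower, pi_upper = prediction_interval(0.5, 6, 7)
```

In this special case the CMP simplifies to z_m / (1 + 2 z_m Med(W)), so it always lies strictly below the last observed record depth z_m.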

4. Simulation and data analysis

In this section, we conduct simulation studies to assess the performance of the MLP and the CMP of the depth value of a future depth-based record, and we illustrate the predictions on depth-based record data extracted from a practical dataset under the multivariate normal distribution. For this purpose, we employ the multivariate normal and multivariate t distributions along with the Mahalanobis and projection-based depth functions. All computations are performed using R.

4.1. Simulation results

Here, we evaluate the performance of the different methods of predicting the depth of the tth future depth-based record, conditional on observing the first m record values. Precisely, we evaluate the MLP and the CMP in terms of bias and mean square prediction error (MSPE).

For data simulation, the multivariate normal distribution Np(μ, Σ) and the multivariate t distribution tp(μ, Σ, ν) with p = 2 are employed under the following two sets of parameters:

$$\mu = \begin{pmatrix} 5 \\ 7 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 20 & 17 \\ 17 & 15 \end{pmatrix}, \qquad \nu = 3, 10, \tag{4.1}$$

$$\mu = \begin{pmatrix} 5 \\ 7 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 20 & 10 \\ 10 & 15 \end{pmatrix}, \qquad \nu = 3, 10. \tag{4.2}$$

In the above sets, Σ is the covariance matrix for the multivariate normal distribution and the scatter matrix for the multivariate t distribution. Only the off-diagonal entries of Σ differ between the two parameter sets. For parameter set (4.1) the correlation is stronger, so the density is more concentrated and its elliptical contours are more elongated.
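As an illustration of the two parameter sets, the sketch below computes the correlation implied by each Σ and draws bivariate normal observations through a Cholesky factor. It uses only the Python standard library, whereas the paper's computations are done in R; the function names and the seed are hypothetical choices for this example.

```python
import math
import random

MU = (5.0, 7.0)
SIGMA_41 = ((20.0, 17.0), (17.0, 15.0))  # parameter set (4.1)
SIGMA_42 = ((20.0, 10.0), (10.0, 15.0))  # parameter set (4.2)


def correlation(sigma):
    """Correlation coefficient implied by a 2 x 2 covariance matrix."""
    return sigma[0][1] / math.sqrt(sigma[0][0] * sigma[1][1])


def cholesky_2x2(sigma):
    """Lower-triangular factor (l11, l21, l22) with L L' = sigma."""
    (a, b), (_, d) = sigma
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def sample_bivariate_normal(mu, sigma, n, rng):
    """Draw n observations from N_2(mu, sigma) as mu + L e, e ~ N_2(0, I)."""
    l11, l21, l22 = cholesky_2x2(sigma)
    sample = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        sample.append((mu[0] + l11 * e1, mu[1] + l21 * e1 + l22 * e2))
    return sample


rng = random.Random(2023)  # arbitrary seed, for reproducibility only
cloud = sample_bivariate_normal(MU, SIGMA_41, 100, rng)
rho_41 = correlation(SIGMA_41)  # 17 / sqrt(300), about 0.98
rho_42 = correlation(SIGMA_42)  # 10 / sqrt(300), about 0.58
```

The two correlations, roughly 0.98 and 0.58, quantify how much stronger the dependence is under parameter set (4.1).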

In each setting, 200 independent random samples are generated sequentially, and the m MD-based and PD-based records, r = {r1, . . ., rm}, and their corresponding depth values, z = {z1, . . ., zm}, are extracted for m = 6, 7, 8, 9. Using the observed records and their depth values, point predictors of the depth values are computed, together with the corresponding prediction intervals (PIs), for t > m.

The MLP is derived from the solution of equation (3.2), with FY given by equations (2.6) to (2.9). Since this equation cannot be solved analytically, we use the Nelder-Mead method implemented in the maxLik function of the R package maxLik (Henningsen and Toomet, 2011) to maximize the likelihood function with respect to the parameter of interest, which here is the depth of the record to be predicted. The optimization algorithm requires an initial value for this parameter; we suggest the average of the depth values of the m observed depth-based records.

The CMP of the depth value of the tth depth-based record is obtained from equation (3.4), and its PI is derived from the interval (3.5).

The biases and MSPEs of the predictors are computed for each method over the 200 replications and are presented in Tables 1–3. Table 4 reports the average lengths (ALs) of the 95% PIs for the depth value of the future tth depth-based record, based on relation (3.5).
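The two summary measures can be computed as in the following minimal sketch. It assumes that the bias is the average of (prediction − actual value) over replications and the MSPE is the average squared difference; the sign convention for the bias is an assumption, since it is not stated explicitly in the text, and the function name and the toy numbers are hypothetical.

```python
def bias_and_mspe(predictions, actuals):
    """Return (bias, MSPE) of predictions against the actual depth values."""
    errors = [p - a for p, a in zip(predictions, actuals)]
    n = len(errors)
    bias = sum(errors) / n
    mspe = sum(e * e for e in errors) / n
    return bias, mspe


# Toy numbers for illustration only; they are not simulation results.
bias, mspe = bias_and_mspe([0.10, 0.20], [0.08, 0.24])
```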

Some general facts can be derived from the results reported in Tables 1–3. The outcomes show that the depth values of the future tth depth-based records are predicted well. As the number of observed records increases, the predicted value gets closer to the actual value. Moreover, the smaller the gap between m and t, the more accurate the predictions.

Regardless of the depth function and the distribution, the biases and MSPEs of the CMP are smaller than those of the MLP. Hence, the conditional median procedure leads to more accurate predictions. Also, the depth values of future Mahalanobis-based records are predicted more precisely than those of projection-based records.

Another conclusion from the above tables is that when the Mahalanobis-based records are generated from the multivariate normal distribution with the parameters in (4.1), which has more concentrated contours, the MLPs of the depth values have smaller biases and MSPEs. This pattern cannot be observed for the other distributions and depth functions.

In general, increasing the number of observed records, m, improves prediction under both the maximum likelihood and conditional median procedures.

From Table 4, it is observed that the PIs for projection-based records under the multivariate normal distribution are narrower than those for Mahalanobis-based records, and the difference between the lengths of the Mahalanobis-based and projection-based prediction intervals is notable. The opposite holds for the multivariate t distribution with ν = 3: the PIs for the depth values of future Mahalanobis-based records from the t distribution with 3 degrees of freedom are shorter than those for projection-based records.

According to the PI results, there is clear evidence that predictions of depth values under parameter set (4.1) perform the same as under parameter set (4.2). This is true for both the Mahalanobis and projection depth functions.

4.2. Data analysis

By considering a real dataset, we illustrate the performance in practice of the ML and CM predictors of the depth value of future depth-based records. The dataset concerns drought in Kermanshah city. It consists of two measurements, total monthly precipitation (TMP) and average monthly temperature (AMT), over 66 water years, 1951–2016. A water year is the 12-month period over which precipitation totals are measured; in Iran, it runs from September 23rd of one year to September 22nd of the following year. The data were gathered by the department of meteorology of Kermanshah city, which is a branch of the Islamic Republic (I.R.) of Iran Meteorological Organization (IRIMO), and are accessible from “www.kermanshahmet.ir”. These data have previously been analyzed by Tat and Faridrohani (2021).

Since the Kermanshah data did not follow a multivariate normal distribution, we applied a Box-Cox transformation to obtain a dataset with a multivariate normal distribution. The MD-based and PD-based records, along with the corresponding depth values, were extracted from this transformed dataset. The information about the records is reported in Table 5.

Under the depth-based approach, 6 MD-based and 5 PD-based records are recognized during the 66 water years for Kermanshah city. These records include both wetness and dryness records; for more details see Tat and Faridrohani (2021). In order to evaluate the performance of the prediction procedures in practice, we removed the last depth-based record and predicted its depth value, conditional on the earlier records, via the ML and CM procedures from equations (3.2) and (3.4).

The predictions of the depth values of the tth depth-based records, conditional on the first m records, are reported in Table 6, together with the 95% prediction interval calculated from relation (3.5).

According to the results in Table 6, the CMP is closer to the actual value than the MLP for both the Mahalanobis and projection depth functions. It should also be noted that the Mahalanobis depth function yields more accurate predictions than the projection depth function.

5. Concluding remarks

Prediction is an important issue in the field of records. It has been studied thoroughly for univariate records, whereas prediction of multivariate records has not been addressed before. To approach it, we selected the depth-based record from among the different definitions of multivariate records, so this paper can be considered a step forward in multivariate record prediction. Since a depth-based record is recognized by its depth value, we decided to study the prediction problem in two parts. In the first part, we focused on predicting the depth values of depth-based records, which is an essential step towards the second part. For this purpose, we supposed that the observations come from multivariate normal or multivariate t distributions and that records are recognized from these observations by the Mahalanobis and projection depth functions.

In this paper, we studied the prediction of the depth values of future depth-based records through the maximum likelihood (ML) and conditional median (CM) procedures. We also constructed a prediction interval for the depth values. Finally, we evaluated the performance of both procedures by simulation studies and a real dataset on drought in Kermanshah city. The results of both the simulations and the real dataset demonstrated satisfactory performance for both the MLP and the CMP.

The second part, dealing with the prediction of depth-based record values, will be considered in future work.

TABLES

Table 1

Bias and MSPE of MLP and CMP for multivariate normal distribution with parameter sets (4.1) and (4.2)

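MD = Mahalanobis depth, PD = projection depth.

| Parameter set | m | t | MD MLP Bias | MD MLP MSPE | MD CMP Bias | MD CMP MSPE | PD MLP Bias | PD MLP MSPE | PD CMP Bias | PD CMP MSPE |
|---|---|---|---|---|---|---|---|---|---|---|
| (4.1) | 6 | 7 | 0.0501 | 0.0055 | 0.0030 | 2.2 × 10⁻⁴ | 0.0789 | 0.0111 | −0.0144 | 5.8 × 10⁻⁴ |
| (4.1) | 6 | 8 | 0.0636 | 0.0072 | 0.0039 | 2.2 × 10⁻⁴ | 0.0731 | 0.0103 | −0.0260 | 1.2 × 10⁻³ |
| (4.1) | 6 | 9 | 0.0738 | 0.0087 | 0.0040 | 2.4 × 10⁻⁴ | 0.0871 | 0.0122 | −0.0342 | 1.7 × 10⁻³ |
| (4.1) | 6 | 10 | 0.0833 | 0.0104 | 0.0044 | 2.5 × 10⁻⁴ | 0.0903 | 0.0132 | −0.0420 | 2.3 × 10⁻³ |
| (4.1) | 7 | 8 | 0.0421 | 0.0042 | 0.0009 | 1.2 × 10⁻⁴ | 0.0725 | 0.0093 | −0.0127 | 3.9 × 10⁻⁴ |
| (4.1) | 7 | 9 | 0.0523 | 0.0052 | 0.0010 | 1.7 × 10⁻⁴ | 0.0698 | 0.0080 | −0.0216 | 7.5 × 10⁻⁴ |
| (4.1) | 7 | 10 | 0.0618 | 0.0064 | −0.0015 | 1.8 × 10⁻⁴ | 0.0816 | 0.0105 | −0.0300 | 1.2 × 10⁻³ |
| (4.1) | 8 | 9 | 0.0345 | 0.0031 | −0.0011 | 9.7 × 10⁻⁵ | 0.0677 | 0.0074 | −0.0096 | 2.1 × 10⁻⁴ |
| (4.1) | 8 | 10 | 0.0440 | 0.0040 | −0.0007 | 1.0 × 10⁻⁴ | 0.0700 | 0.0076 | −0.0184 | 5.2 × 10⁻⁴ |
| (4.1) | 9 | 10 | 0.0291 | 0.0025 | −0.0024 | 6.9 × 10⁻⁴ | 0.0642 | 0.0067 | −0.0091 | 1.7 × 10⁻⁴ |
| (4.2) | 6 | 7 | 0.0545 | 0.0058 | −0.0005 | 2.9 × 10⁻⁴ | 0.0679 | 0.0094 | 0.0054 | 4.4 × 10⁻⁴ |
| (4.2) | 6 | 8 | 0.0705 | 0.0078 | 0.0031 | 3.3 × 10⁻⁴ | 0.0691 | 0.0099 | 0.0080 | 5.9 × 10⁻⁴ |
| (4.2) | 6 | 9 | 0.0794 | 0.0090 | 0.0039 | 3.4 × 10⁻⁴ | 0.0784 | 0.0114 | 0.0076 | 6.3 × 10⁻⁴ |
| (4.2) | 6 | 10 | 0.0932 | 0.0119 | 0.0041 | 4.2 × 10⁻⁴ | 0.0792 | 0.0116 | 0.0040 | 5.2 × 10⁻⁴ |
| (4.2) | 7 | 8 | 0.0494 | 0.0047 | −0.0007 | 1.6 × 10⁻⁴ | 0.0585 | 0.0069 | 0.0015 | 2.8 × 10⁻⁴ |
| (4.2) | 7 | 9 | 0.0610 | 0.0062 | 0.0012 | 1.6 × 10⁻⁴ | 0.0603 | 0.0073 | 0.0015 | 2.9 × 10⁻⁴ |
| (4.2) | 7 | 10 | 0.0717 | 0.0079 | 0.0017 | 1.7 × 10⁻⁴ | 0.0698 | 0.0088 | 0.0017 | 3.0 × 10⁻⁴ |
| (4.2) | 8 | 9 | 0.0401 | 0.0037 | −0.0004 | 9.7 × 10⁻⁵ | 0.0570 | 0.0061 | 0.0013 | 2.1 × 10⁻⁴ |
| (4.2) | 8 | 10 | 0.0503 | 0.0044 | −0.0005 | 1.1 × 10⁻⁴ | 0.0654 | 0.0089 | 0.0043 | 2.2 × 10⁻⁴ |
| (4.2) | 9 | 10 | 0.0337 | 0.0025 | −0.0017 | 6.7 × 10⁻⁵ | 0.0599 | 0.0061 | 0.0044 | 1.3 × 10⁻⁴ |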

Table 2

Bias and MSPE of MLP and CMP for multivariate t distribution with 3 degrees of freedom and parameter sets (4.1) and (4.2)

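MD = Mahalanobis depth, PD = projection depth.

| Parameter set | m | t | MD MLP Bias | MD MLP MSPE | MD CMP Bias | MD CMP MSPE | PD MLP Bias | PD MLP MSPE | PD CMP Bias | PD CMP MSPE |
|---|---|---|---|---|---|---|---|---|---|---|
| (4.1) | 6 | 7 | 0.0331 | 0.0043 | 0.0042 | 3.1 × 10⁻⁴ | −0.0015 | 0.0047 | 0.0055 | 4.6 × 10⁻⁴ |
| (4.1) | 6 | 8 | 0.0484 | 0.0060 | 0.0072 | 3.1 × 10⁻⁴ | 0.0153 | 0.0048 | 0.0115 | 5.8 × 10⁻⁴ |
| (4.1) | 6 | 9 | 0.0580 | 0.0073 | 0.0074 | 3.9 × 10⁻⁴ | 0.0280 | 0.0051 | 0.0144 | 5.8 × 10⁻⁴ |
| (4.1) | 6 | 10 | 0.0638 | 0.0083 | 0.0051 | 4.2 × 10⁻⁴ | 0.0412 | 0.0063 | 0.0152 | 6.1 × 10⁻⁴ |
| (4.1) | 7 | 8 | 0.0222 | 0.0033 | 0.0008 | 2.0 × 10⁻⁴ | 0.0168 | 0.0041 | 0.0019 | 2.3 × 10⁻⁴ |
| (4.1) | 7 | 9 | 0.0319 | 0.0042 | 0.0032 | 1.4 × 10⁻⁴ | 0.0315 | 0.0045 | 0.0071 | 3.3 × 10⁻⁴ |
| (4.1) | 7 | 10 | 0.0375 | 0.0049 | 0.0035 | 1.5 × 10⁻⁴ | 0.0459 | 0.0051 | 0.0073 | 3.3 × 10⁻⁴ |
| (4.1) | 8 | 9 | 0.0103 | 0.0026 | 0.0001 | 6.7 × 10⁻⁵ | 0.0305 | 0.0036 | 0.0011 | 1.8 × 10⁻⁴ |
| (4.1) | 8 | 10 | 0.0160 | 0.0030 | 0.0020 | 6.9 × 10⁻⁵ | 0.0442 | 0.0045 | 0.0022 | 2.1 × 10⁻⁴ |
| (4.1) | 9 | 10 | −0.0018 | 0.0022 | 0.0002 | 1.5 × 10⁻⁵ | 0.0295 | 0.0033 | −0.0024 | 1.3 × 10⁻⁴ |
| (4.2) | 6 | 7 | 0.0375 | 0.0046 | 0.0036 | 3.1 × 10⁻⁴ | 0.0395 | 0.0049 | 0.0051 | 4.5 × 10⁻⁴ |
| (4.2) | 6 | 8 | 0.0531 | 0.0069 | 0.0080 | 3.0 × 10⁻⁴ | 0.0535 | 0.0070 | 0.0135 | 6.9 × 10⁻⁴ |
| (4.2) | 6 | 9 | 0.0609 | 0.0076 | 0.0082 | 3.2 × 10⁻⁴ | 0.0635 | 0.0086 | 0.0162 | 7.1 × 10⁻⁴ |
| (4.2) | 6 | 10 | 0.0686 | 0.0091 | 0.0084 | 3.3 × 10⁻⁴ | 0.0637 | 0.0086 | 0.0160 | 7.2 × 10⁻⁴ |
| (4.2) | 7 | 8 | 0.0243 | 0.0035 | 0.0021 | 1.5 × 10⁻⁴ | 0.0248 | 0.0034 | 0.0042 | 3.8 × 10⁻⁴ |
| (4.2) | 7 | 9 | 0.0341 | 0.0043 | 0.0043 | 1.5 × 10⁻⁴ | 0.0342 | 0.0043 | 0.0092 | 4.6 × 10⁻⁴ |
| (4.2) | 7 | 10 | 0.0411 | 0.0050 | 0.0049 | 1.6 × 10⁻⁴ | 0.0405 | 0.0049 | 0.0098 | 5.0 × 10⁻⁴ |
| (4.2) | 8 | 9 | 0.0127 | 0.0027 | 0.0004 | 6.0 × 10⁻⁵ | 0.0125 | 0.0026 | 0.0013 | 1.7 × 10⁻⁴ |
| (4.2) | 8 | 10 | 0.0194 | 0.0030 | 0.0020 | 6.7 × 10⁻⁵ | 0.0201 | 0.0033 | 0.0038 | 2.4 × 10⁻⁴ |
| (4.2) | 9 | 10 | 7.3 × 10⁻⁵ | 0.0019 | −7.1 × 10⁻⁶ | 2.9 × 10⁻⁵ | 0.0021 | 0.0011 | −0.0011 | 1.4 × 10⁻⁴ |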

Table 3

Bias and MSPE of MLP and CMP for multivariate t distribution with 10 degrees of freedom and parameter sets (4.1) and (4.2)

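MD = Mahalanobis depth, PD = projection depth.

| Parameter set | m | t | MD MLP Bias | MD MLP MSPE | MD CMP Bias | MD CMP MSPE | PD MLP Bias | PD MLP MSPE | PD CMP Bias | PD CMP MSPE |
|---|---|---|---|---|---|---|---|---|---|---|
| (4.1) | 6 | 7 | 0.0138 | 0.0031 | 9.6 × 10⁻⁴ | 3.2 × 10⁻⁴ | 0.0199 | 0.0040 | 0.0078 | 4.6 × 10⁻⁴ |
| (4.1) | 6 | 8 | 0.0291 | 0.0041 | 0.0060 | 3.9 × 10⁻⁴ | 0.0337 | 0.0049 | 0.0140 | 8.6 × 10⁻⁴ |
| (4.1) | 6 | 9 | 0.0429 | 0.0052 | 0.0068 | 4.2 × 10⁻⁴ | 0.0492 | 0.0065 | 0.0170 | 8.7 × 10⁻⁴ |
| (4.1) | 6 | 10 | 0.0511 | 0.0060 | 0.0070 | 5.1 × 10⁻⁴ | 0.0550 | 0.0071 | 0.0186 | 9.0 × 10⁻⁴ |
| (4.1) | 7 | 8 | 0.0058 | 0.0025 | 3.8 × 10⁻⁴ | 2.2 × 10⁻⁴ | 0.0099 | 0.0029 | 0.0033 | 4.3 × 10⁻⁴ |
| (4.1) | 7 | 9 | 0.0195 | 0.0030 | 0.0019 | 2.4 × 10⁻⁴ | 0.0219 | 0.0035 | 0.0074 | 4.5 × 10⁻⁴ |
| (4.1) | 7 | 10 | 0.0277 | 0.0034 | 0.0043 | 2.4 × 10⁻⁴ | 0.0315 | 0.0041 | 0.0100 | 5.0 × 10⁻⁴ |
| (4.1) | 8 | 9 | 0.0042 | 0.0021 | −0.0022 | 1.5 × 10⁻⁴ | 0.0022 | 0.0023 | 8.9 × 10⁻⁴ | 2.3 × 10⁻⁴ |
| (4.1) | 8 | 10 | 0.0083 | 0.0023 | 0.0014 | 1.7 × 10⁻⁴ | 0.0104 | 0.0024 | 0.0041 | 2.6 × 10⁻⁴ |
| (4.1) | 9 | 10 | −0.0013 | 0.0017 | 1.2 × 10⁻⁴ | 6.3 × 10⁻⁵ | −0.0070 | 0.0018 | −1.9 × 10⁻⁵ | 1.4 × 10⁻⁴ |
| (4.2) | 6 | 7 | 0.0104 | 0.0032 | 0.0012 | 3.2 × 10⁻⁴ | 0.0024 | 0.0045 | 0.0083 | 8.6 × 10⁻⁴ |
| (4.2) | 6 | 8 | 0.0249 | 0.0042 | 0.0076 | 4.1 × 10⁻⁴ | 0.0066 | 0.0046 | 0.0144 | 9.3 × 10⁻⁴ |
| (4.2) | 6 | 9 | 0.0350 | 0.0047 | 0.0084 | 4.5 × 10⁻⁴ | 0.0173 | 0.0047 | 0.0165 | 9.4 × 10⁻⁴ |
| (4.2) | 6 | 10 | 0.0460 | 0.0054 | 0.0085 | 4.8 × 10⁻⁴ | 0.0217 | 0.0051 | 0.0167 | 9.6 × 10⁻⁴ |
| (4.2) | 7 | 8 | 0.0011 | 0.0026 | 0.0017 | 1.8 × 10⁻⁴ | 0.0036 | 0.0036 | 0.0032 | 4.5 × 10⁻⁴ |
| (4.2) | 7 | 9 | 0.0139 | 0.0028 | 0.0042 | 2.2 × 10⁻⁴ | 0.0180 | 0.0037 | 0.0065 | 4.8 × 10⁻⁴ |
| (4.2) | 7 | 10 | 0.0237 | 0.0031 | 0.0050 | 2.3 × 10⁻⁴ | 0.0277 | 0.0038 | 0.0068 | 4.9 × 10⁻⁴ |
| (4.2) | 8 | 9 | −0.0060 | 0.0021 | 9.3 × 10⁻⁴ | 1.2 × 10⁻⁴ | 0.0125 | 0.0032 | −6.1 × 10⁻⁵ | 2.1 × 10⁻⁴ |
| (4.2) | 8 | 10 | 0.0043 | 0.0023 | 0.0011 | 1.3 × 10⁻⁴ | 0.0267 | 0.0033 | −1.1 × 10⁻⁴ | 3.0 × 10⁻⁴ |
| (4.2) | 9 | 10 | −0.0010 | 0.0018 | −1.3 × 10⁻⁴ | 7.6 × 10⁻⁵ | 0.0110 | 0.0027 | −3.4 × 10⁻⁴ | 1.6 × 10⁻⁴ |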

Table 4

ALs of PIs for multivariate normal and t distributions under parameter sets (4.1) and (4.2)

        Parameter set (4.1)                                   Parameter set (4.2)
m   t   Normal            t with ν = 3      t with ν = 10     Normal            t with ν = 3      t with ν = 10

Mahalanobis depth function
6   7   (0.0603, 0.1122)  (0.0045, 0.0598)  (0.0306, 0.0952)  (0.0608, 0.1138)  (0.0044, 0.0584)  (0.0307, 0.0954)
6   8   (0.0489, 0.1065)  (0.0013, 0.0505)  (0.0192, 0.0874)  (0.0493, 0.1080)  (0.0012, 0.0495)  (0.0193, 0.0877)
6   9   (0.0420, 0.0981)  (0.0004, 0.0380)  (0.0131, 0.0761)  (0.0423, 0.0994)  (0.0004, 0.0373)  (0.0132, 0.0763)
6   10  (0.0372, 0.0893)  (0.0001, 0.0269)  (0.0093, 0.0647)  (0.0374, 0.0904)  (0.0001, 0.0265)  (0.0094, 0.0650)

7   8   (0.0557, 0.0965)  (0.0031, 0.0394)  (0.0257, 0.0744)  (0.0557, 0.0969)  (0.0030, 0.0383)  (0.0259, 0.0749)
7   9   (0.0459, 0.0924)  (0.0009, 0.0335)  (0.0164, 0.0689)  (0.0459, 0.0927)  (0.0008, 0.0325)  (0.0165, 0.0694)
7   10  (0.0398, 0.0860)  (0.0003, 0.0254)  (0.0113, 0.0608)  (0.0398, 0.0863)  (0.0003, 0.0247)  (0.0113, 0.0612)

8   9   (0.0509, 0.0831)  (0.0020, 0.0245)  (0.0215, 0.0592)  (0.0512, 0.0840)  (0.0020, 0.0251)  (0.0222, 0.0610)
8   10  (0.0426, 0.0800)  (0.0006, 0.0209)  (0.0139, 0.0552)  (0.0428, 0.0809)  (0.0006, 0.0214)  (0.0142, 0.0569)

9   10  (0.0469, 0.0730)  (0.0012, 0.0150)  (0.0175, 0.0455)  (0.0471, 0.0733)  (0.0013, 0.0157)  (0.0184, 0.0482)

Projection depth function
6   7   (0.1058, 0.1128)  (0.0282, 0.0904)  (0.0943, 0.1387)  (0.1433, 0.1594)  (0.0258, 0.0913)  (0.0952, 0.1402)
6   8   (0.1029, 0.1124)  (0.0153, 0.0845)  (0.0781, 0.1354)  (0.1369, 0.1582)  (0.0154, 0.0853)  (0.0788, 0.1369)
6   9   (0.1005, 0.1115)  (0.0088, 0.0751)  (0.0663, 0.1299)  (0.1321, 0.1563)  (0.0089, 0.0758)  (0.0670, 0.1313)
6   10  (0.0985, 0.1106)  (0.0053, 0.647)   (0.0570, 0.1235)  (0.1280, 0.1541)  (0.0053, 0.0654)  (0.0575, 0.1248)

7   8   (0.0927, 0.0969)  (0.0242, 0.0782)  (0.0929, 0.1363)  (0.1452, 0.1613)  (0.0243, 0.0785)  (0.0941, 0.1381)
7   9   (0.0909, 0.0967)  (0.0131, 0.0730)  (0.0770, 0.1330)  (0.1388, 0.1602)  (0.0131, 0.0733)  (0.0779, 0.1349)
7   10  (0.0893, 0.0962)  (0.0076, 0.0649)  (0.0654, 0.1277)  (0.1338, 0.1583)  (0.0076, 0.0651)  (0.0662, 0.1295)

8   9   (0.0808, 0.0834)  (0.0199, 0.0647)  (0.0888, 0.1296)  (0.1440, 0.1593)  (0.0208, 0.0673)  (0.0899, 0.1312)
8   10  (0.0795, 0.0833)  (0.0108, 0.0604)  (0.0737, 0.1266)  (0.1377, 0.1582)  (0.0112, 0.0628)  (0.0745, 0.1282)

9   10  (0.0715, 0.0733)  (0.0162, 0.0530)  (0.0835, 0.1213)  (0.1406, 0.1547)  (0.0170, 0.0553)  (0.0838, 0.1218)

Table 5

MD- and PD-based records in terms of the TMP and AMT variables

MD-based record                         PD-based record
Record time  Year  (TMP, AMT)    Depth  Record time  Year  (TMP, AMT)    Depth
1            1951  (0.99, 2.36)  0.290  1            1951  (0.99, 2.36)  0.294
13           1963  (0.99, 2.38)  0.233  2            1952  (0.99, 2.44)  0.203
21           1971  (0.99, 2.34)  0.179  13           1963  (0.99, 2.38)  0.197
48           1998  (0.99, 2.60)  0.144  57           2007  (0.99, 2.55)  0.163
57           2007  (0.99, 2.55)  0.097  66           2016  (0.99, 2.54)  0.127
66           2016  (0.99, 2.54)  0.070  -            -     -             -

Table 6

MLP and CMP of the depth value related to the t-th depth-based record

Depth function  m  s  Z_t    Ẑ_t^MLP  Ẑ_t^CMP (PI)
Mahalanobis     5  6  0.70   0.82     0.79 (0.565, 0.965)
Projection      4  5  0.127  0.143    0.138 (0.126, 0.163)
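
As a reading aid for Table 6, the following Python sketch recomputes the prediction errors and checks whether each reported prediction interval contains the Z_t value. It assumes the Z_t column holds the observed depth value; the numbers are copied from Table 6, while the dictionary layout and the names `table6` and `summarize` are illustrative and not part of the paper.

```python
# Values copied from Table 6; the data layout and names are illustrative assumptions.
table6 = {
    "Mahalanobis": {"observed": 0.70, "mlp": 0.82, "cmp": 0.79, "pi": (0.565, 0.965)},
    "Projection": {"observed": 0.127, "mlp": 0.143, "cmp": 0.138, "pi": (0.126, 0.163)},
}


def summarize(row):
    """Return prediction errors (predicted minus observed) and PI coverage."""
    lower, upper = row["pi"]
    return {
        "mlp_error": round(row["mlp"] - row["observed"], 3),
        "cmp_error": round(row["cmp"] - row["observed"], 3),
        "pi_covers_observed": lower <= row["observed"] <= upper,
    }


for depth_name, row in table6.items():
    print(depth_name, summarize(row))
```

Under that reading of the columns, the CMP lies closer to Z_t than the MLP in both rows, and both intervals contain Z_t.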

References
  1. Ahmadi J, Jafari Jozani M, Marchand É, and Parsian A (2009). Prediction of k-records from a general class of distributions under balanced type loss functions. Metrika, 70, 19-33.
  2. Ahsanullah M (1980). Linear prediction of record values for the two parameter exponential distribution. Annals of the Institute of Statistical Mathematics, 32, 363-368.
  3. Ahsanullah M (2009). Records and concomitants. Bulletin of the Malaysian Mathematical Sciences Society Second Series, 32, 101-117.
  4. Ahsanullah M and Nevzorov VB (2015). Records via Probability Theory, Paris, Atlantis Press.
  5. Arnold BC, Balakrishnan N, and Nagaraja HN (2011). Records, New York, John Wiley & Sons.
  6. Berred AM (1998). Prediction of record values. Communications in Statistics-Theory and Methods, 27, 2221-2240.
  7. DeGroot MH (2005). Optimal Statistical Decisions, New York, John Wiley & Sons.
  8. Gnedin A (2007). The chain records. Electronic Journal of Probability, 12, 767-786.
  9. Henningsen A and Toomet O (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics, 26, 443-458.
  10. Hwang HK and Tsai TH (2010). Multivariate records based on dominance. Electronic Journal of Probability, 15, 1863-1892.
  11. Liu RY (1992). Data depth and multivariate rank tests. In Dodge Y (Ed), pp. 279-294.
  12. Liu RY, Parelius JM, and Singh K (1999). Multivariate analysis by data depth: Descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh). The Annals of Statistics, 27, 783-858.
  13. Liu RY and Singh K (1993). A quality index based on data depth and multivariate rank tests. Journal of the American Statistical Association, 88, 252-260.
  14. Nevzorov VB (2001). Records: Mathematical Theory, Providence, RI, American Mathematical Society.
  15. Raqab MZ, Ahmadi J, and Doostparast M (2007). Statistical inference based on record data from Pareto model. Statistics, 41, 105-118.
  16. Resnick SI (1973). Record values and maxima. The Annals of Probability, 1, 650-662.
  17. Serfling R (2006). Depth functions in nonparametric multivariate inference. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 72, 1-16.
  18. Smith RL (1988). Forecasting records by maximum likelihood. Journal of the American Statistical Association, 83, 331-338.
  19. Tat S and Faridrohani MR (2021). A new type of multivariate records: Depth-based records. Statistics, 55, 296-320.
  20. Tukey J (1975). Mathematics and picturing data. Proceedings of International Congress of Mathematicians, 2, 523-531.
  21. Zuo Y (2003). Projection-based depth functions and associated medians. The Annals of Statistics, 31, 1460-1490.
  22. Zuo Y and Serfling R (2000a). General notions of statistical depth function. The Annals of Statistics, 28, 461-482.
  23. Zuo Y and Serfling R (2000b). Structural properties and convergence results for contours of sample statistical depth functions. The Annals of Statistics, 28, 483-499.