Other approaches to bivariate ranked set sampling
Communications for Statistical Applications and Methods 2018;25:283-296
Published online May 31, 2018
© 2018 Korean Statistical Society.

Mohammad Fraiwan Al-Saleh$^{1,a}$ and Hadeel Mohammad Alshboul$^{a}$

$^{a}$Department of Statistics, Yarmouk University, Jordan
Correspondence to: Department of Statistics, College of Science, Yarmouk University, Irbid-Jordan. E-mail: m-saleh@yu.edu.jo
Received February 3, 2018; Revised April 4, 2018; Accepted April 6, 2018.
 Abstract

Ranked set sampling, as introduced by McIntyre (Australian Journal of Agricultural Research, 3, 385–390, 1952), dealt with the estimation of the mean of one population. To deal with two or more variables, different forms of bivariate and multivariate ranked set sampling were suggested. For a technique to be useful, it should be easy to implement in practice. Bivariate ranked set sampling, as introduced by Al-Saleh and Zheng (Australian & New Zealand Journal of Statistics, 44, 221–232, 2002), is not easy to implement in practice, because it requires judgment ranking of each combination of the order statistics of the two characteristics. This paper investigates two modifications that make the method easier to use. The first modification is based on ranking one variable and noting the rank of the other variable for one cycle, and doing the reverse for another cycle. The second approach is based on ranking one variable and giving the second variable the same rank (concomitant order statistic) for one cycle, and doing the reverse for the other cycle. The two procedures are investigated for the estimation of the means of some well-known distributions. It is shown that the suggested approaches can be used in practice and can be more efficient than simple random sampling (SRS). A real data set is used to illustrate the procedure.

Keywords : ranked set sampling, simple random sampling, bivariate ranked set sampling, bivariate normal distribution, Downton’s bivariate exponential distribution, concomitant variable
1. Introduction

Statistics is the science of collecting data from a sample of a population in order to make inferences about that population. The accuracy of the inference depends on how well the sample represents the population, which is largely controlled by the size of the sample and the technique used to select it. Several sampling techniques are available for obtaining data that give an informative picture of a large population of interest.

Simple random sampling is the basis of almost all other sampling techniques. If a sample of size $n$ is drawn from a finite population of size $N$ in such a way that every possible sample of size $n$ has the same chance of being selected, then the sample is called a simple random sample (SRS); the probability that any particular sample of size $n$ is the chosen one is $1/\binom{N}{n}$. As a consequence of the definition, all elements of the population have the same inclusion probability, $n/N$. For details about other well-known sampling techniques, see Scheaffer et al. (2011).

Ranked set sampling (RSS) was first proposed by McIntyre (1952). This technique of data collection is suitable for situations where taking actual measurements on sample units is difficult (costly or time-consuming) compared with ranking them by judgment. The ranked set sampling technique can be executed as follows:

  • Randomly draw m SRSs each of size m from the population of interest.

  • The elements within each set are ranked from smallest to largest by judgment with respect to the variable of interest. It is assumed that each element can be ranked visually or by a method of negligible cost that does not require actual quantification.

  • From the $i$th set, take for actual quantification the element judged to be the $i$th order statistic, $i = 1, 2, \ldots, m$.

  • Repeat Steps 1–3 $r$ times (cycles), if necessary, to obtain a RSS of size $n = rm$ units.
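The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes perfect ranking (units are ranked by their true values), and the names `ranked_set_sample` and `draw_unit` are hypothetical helpers introduced only for this example.

```python
import random


def ranked_set_sample(draw_unit, m, r, rng):
    """Return a ranked set sample of size r * m, assuming perfect ranking.

    draw_unit(rng) is a hypothetical callable that returns one value of the
    variable of interest for a randomly selected population unit.
    """
    sample = []
    for _cycle in range(r):
        for i in range(1, m + 1):
            # Steps 1-2: draw a fresh SRS of size m and rank it.
            ranked = sorted(draw_unit(rng) for _ in range(m))
            # Step 3: quantify only the unit ranked i-th in the i-th set.
            sample.append(ranked[i - 1])
    return sample


rng = random.Random(0)
rss = ranked_set_sample(lambda g: g.random(), m=3, r=4, rng=rng)
rss_mean = sum(rss) / len(rss)  # RSS estimate of the population mean
```

Each quantified unit costs a full set of $m$ ranked units, so the sketch draws $rm^2$ units in total but measures only $rm$ of them.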

The observations of this sample can be denoted by $\{X_{(i:m)}^{(j)}: j = 1, 2, \ldots, r,\ i = 1, 2, \ldots, m\}$, where $X_{(i:m)}^{(j)}$ is the $i$th order statistic from the $i$th set of size $m$ in the $j$th cycle. When sampling from an infinite population with probability density function (pdf) $f$ and absolutely continuous cumulative distribution function (cdf) $F$, then for each $i$, $X_{(i:m)}^{(1)}, X_{(i:m)}^{(2)}, \ldots, X_{(i:m)}^{(r)}$ are independent and identically distributed (iid) with pdf $f_{(i:m)}$, mean $\mu_{(i:m)}$ and variance $\sigma_{(i:m)}^2$, while for each $j$, $X_{(1:m)}^{(j)}, X_{(2:m)}^{(j)}, \ldots, X_{(m:m)}^{(j)}$ are independent but not identically distributed. The pdf of the $i$th order statistic is

$$f_{(i:m)}(x) = \frac{m!}{(i-1)!\,(m-i)!}\,F^{i-1}(x)\,[1-F(x)]^{m-i}\,f(x).$$

In McIntyre’s RSS procedure, it is assumed that the researcher can order a set of $m$ units with respect to the characteristic of interest with perfect ranking. He claimed (without proof) that the mean of the ranked set sample, $\hat{\mu}_{RSS} = \frac{1}{mr}\sum_{j=1}^{r}\sum_{i=1}^{m} X_{(i:m)}^{(j)}$, is an unbiased estimator of the population mean $\mu$ regardless of any error in judgment ranking, and that for typical unimodal distributions the mean of such a sample is nearly $(m + 1)/2$ times as efficient as the mean of a simple random sample of the same size $n = mr$, $\hat{\mu}_{SRS} = \frac{1}{n}\sum_{j=1}^{n} X_j$. The supporting mathematical theory of RSS was provided by Takahasi and Wakimoto (1968). They showed that:

$$\sum_{i=1}^{m} f_{(i:m)}(x) = m f(x), \qquad \sum_{i=1}^{m} \mu_{(i:m)} = m\mu, \qquad \text{and} \qquad m\sigma^2 = \sum_{i=1}^{m} \sigma_{(i:m)}^2 + \sum_{i=1}^{m} (\mu_{(i:m)} - \mu)^2.$$

Based on these identities, $\hat{\mu}_{RSS}$ is an unbiased estimator of $\mu$ and has smaller variance than $\hat{\mu}_{SRS}$. They also established the following bounds for the efficiency of $\hat{\mu}_{RSS}$ with respect to $\hat{\mu}_{SRS}$, where MSE denotes the mean squared error:

$$1 \le \mathrm{eff}(\hat{\mu}_{RSS}; \hat{\mu}_{SRS}) = \frac{\mathrm{MSE}(\hat{\mu}_{SRS})}{\mathrm{MSE}(\hat{\mu}_{RSS})} \le \frac{m+1}{2}.$$

The upper bound is achieved if the underlying distribution is uniform and the lower bound is achieved if it is degenerate.

Al-Saleh and Al-Kadiri (2000) introduced the double ranked set sampling (DRSS) method as a procedure that increases the efficiency of RSS without increasing the set size $m$. Al-Saleh and Al-Omari (2002) introduced the multistage ranked set sampling (MSRSS) method as a generalization of the DRSS method. The efficiency of estimators using MSRSS is found to be between 1 and $m^2$ for all distributions.

Moving extreme ranked set sampling (MERSS) is a modification of the usual RSS technique. It was introduced by Al-Odat and Al-Saleh (2001) and investigated further by Al-Saleh and Al-Hadrami (2003a, 2003b). In this procedure, only the extreme values of sets of varied size are identified by judgment. Unlike RSS, MERSS allows the set size $m$ to be increased without introducing too much ranking error.

1.1. Bivariate ranked set sampling

In many real applications, such as in agriculture or medicine, two variables are closely related and one of them is easy to measure. Stokes (1977) studied RSS with concomitant variables. She assumed that each sampling unit has a bivariate response, where $X$ is the characteristic of interest and $Y$ is a concomitant variable that can be measured easily, so that ranking on $X$ can be done according to $Y$. Several authors have considered estimating bivariate characteristics using RSS. McIntyre (1952) suggested applying the RSS procedure to a single selected characteristic and making an assumption regarding the performance of the method for the other characteristics. Patil et al. (1994) investigated two different approaches for dealing with multiple characteristics. In the first approach, as in McIntyre, units were ranked by judgment with respect to one chosen characteristic, and all other characteristics were given the same ranking order as the ranked characteristic. The second approach was based on the concept of size-biased permutation. This approach allowed the ranking of units to depend upon several or all characteristics collectively (Norris et al., 1995).

A multivariate version of RSS was introduced by Al-Saleh and Zheng (2002a). For simplicity, the procedure was introduced for two characteristics and was referred to as ‘Bivariate ranked set sampling (BVRSS)’. The following steps describe a BVRSS procedure:

  • For a given set size $m$, identify a random sample of size $m^4$ from the population and randomly allocate it into $m^2$ pools of size $m^2$ each, where each pool is a square matrix with $m$ rows and $m$ columns.

  • In the first pool, identify the minimum value by judgment with respect to the first characteristic, for each of the $m$ rows.

  • For the $m$ minima obtained in Step 2, choose the pair that corresponds to the minimum value of the second characteristic, identified by judgment, for actual quantification. This pair, which corresponds to the label (1, 1), is the first element of the BVRSS sample.

  • Repeat Steps 2 and 3 for the second pool, but for actual quantification choose the pair that corresponds to the first minimum value with respect to the first characteristic and the second minimum value with respect to the second characteristic. This pair corresponds to the label (1, 2).

  • Continue the process until the pair with label $(m, m)$ is obtained from the $m^2$th (last) pool.

This procedure produces a BVRSS sample of size $m^2$.
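The BVRSS steps can be sketched as below. This is only an illustrative sketch under stated assumptions: ranking is perfect on both characteristics, and `draw_pair` is a hypothetical generator of correlated pairs (it is not taken from the paper).

```python
import random


def draw_pair(rng):
    """Hypothetical population: a correlated pair (x, y); not from the paper."""
    x = rng.gauss(0.0, 1.0)
    return (x, 0.8 * x + 0.6 * rng.gauss(0.0, 1.0))


def bvrss(m, rng):
    """Return a BVRSS of size m**2, one pair per label (i, j)."""
    sample = []
    for i in range(1, m + 1):
        for j in range(1, m + 1):
            # One pool: an m-by-m matrix of units (m**4 units over all pools).
            pool = [[draw_pair(rng) for _ in range(m)] for _ in range(m)]
            # In each row keep the unit with the i-th smallest first characteristic.
            row_picks = [sorted(row, key=lambda u: u[0])[i - 1] for row in pool]
            # Among these m units keep the one with the j-th smallest second one.
            sample.append(sorted(row_picks, key=lambda u: u[1])[j - 1])
    return sample


bv_sample = bvrss(m=3, rng=random.Random(1))
```

The nested ranking in each pool is what makes the procedure demanding in practice: $m^4$ units must be ranked to quantify only $m^2$ pairs.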

Ridout (2003) considered several techniques for taking a RSS from a population when multiple characteristics are of interest; one of them is similar to one of the two methods investigated here. His investigation was based on simulation from the bivariate normal distribution. Samawi and Al-Saleh (2007) studied the relative performance of BVRSS with respect to RSS and SRS, and investigated the estimation of the population mean using ratio and regression methods. Al-Saleh and Diab (2009) estimated the parameters of Downton’s bivariate exponential distribution using MERSS with a concomitant variable. For more work on RSS see Al-Saleh and Al-Kadiri (2000), Al-Saleh and Zheng (2002b), Al-Saleh and Al-Ananbeh (2007), Al-Omari and Al-Saleh (2009), Hanandeh and Al-Saleh (2013), Al-Saleh and Na’amneh (2014), Al-Saleh and Ababneh (2015), Zamanzade and Mohammadi (2016), and Al-Saleh and Aldarabseh (2017).

BVRSS as introduced by Al-Saleh and Zheng (2002a) can sometimes be difficult to implement in practice. This paper introduces and investigates two modifications of BVRSS. The first modification is the content of Section 2, and the second modification is the content of Section 3. An actual data set is used for illustration in Section 4. Conclusions and some possible future work are given in Section 5.

2. First modification

2.1. First modification of BVRSS (MBVRSS1)

Suppose $(X, Y)$ is a bivariate random vector with pdf $f(x, y)$. The first suggested approach to obtain a BVRSS can be explained as follows:

  • Randomly draw a SRS of size $m$ from the population of interest; denote the elements of the chosen sample by $(X_1^{(1)}, Y_1^{(1)}), (X_2^{(1)}, Y_2^{(1)}), \ldots, (X_m^{(1)}, Y_m^{(1)})$. Then identify the minimum with respect to the X-variable and note the rank of the other variable. Denote the identified element by $(X_{(1)}^{(1)}, Y_{(j_1)}^{(1)})$, where $j_1 \in \{1, 2, \ldots, m\}$.

  • Randomly draw another SRS of size $m$, identify the second minimum with respect to the X-variable and note the rank of the other variable. Denote the identified element by $(X_{(2)}^{(2)}, Y_{(j_2)}^{(2)})$, $j_2 \in \{1, 2, \ldots, m\}$.

  • Continue the process until the maximum with respect to the X-variable is identified, and the rank of the other variable noted, in an $m$th SRS of size $m$. Denote the identified element by $(X_{(m)}^{(m)}, Y_{(j_m)}^{(m)})$, $j_m \in \{1, 2, \ldots, m\}$.

  • Randomly draw a SRS of size $m$ from the population of interest, $(X_1^{(1)*}, Y_1^{(1)*}), (X_2^{(1)*}, Y_2^{(1)*}), \ldots, (X_m^{(1)*}, Y_m^{(1)*})$; then identify the minimum with respect to the Y-variable and note the rank of the other variable. Denote the identified element by $(X_{(i_1)}^{(1)*}, Y_{(1)}^{(1)*})$, where $i_1 \in \{1, 2, \ldots, m\}$.

  • Randomly draw another SRS of size $m$, identify the second minimum with respect to the Y-variable and note the rank of the other variable. Denote the identified element by $(X_{(i_2)}^{(2)*}, Y_{(2)}^{(2)*})$, where $i_2 \in \{1, 2, \ldots, m\}$.

  • Continue the process until the maximum with respect to the Y-variable is identified, and the rank of the other variable noted, in an $m$th SRS of size $m$, i.e., $(X_{(i_m)}^{(m)*}, Y_{(m)}^{(m)*})$, $i_m \in \{1, 2, \ldots, m\}$.

Steps 1–3 make up the first cycle, while Steps 4–6 make up the second cycle.

The elements of the set $\{(X_{(k)}^{(k)}, Y_{(j_k)}^{(k)}), (X_{(i_k)}^{(k)*}, Y_{(k)}^{(k)*});\ k = 1, 2, \ldots, m\}$ are the elements of a MBVRSS1 of size $2m$. Cycles 1 and 2 can be repeated $r$ times if needed to obtain a MBVRSS1 of size $n = 2rm$. Table 1 illustrates the above procedure by a numerical example with $m = 3$. The MBVRSS1 of size 6 is {(50, 143), (67, 158), (62, 155), (52, 148), (71, 188), (76, 174)}.
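As a check on the numerical example, the selection rule can be applied to the six sets of Table 1. The sketch below is illustrative only; `pick_by_rank` is a hypothetical helper name, and ranking is done on the recorded values, which corresponds to perfect judgment ranking.

```python
def pick_by_rank(units, rank, key_index):
    """Select the unit whose coordinate `key_index` has the given rank
    (1 = smallest) in the set, and note the rank of its other coordinate."""
    other = 1 - key_index
    chosen = sorted(units, key=lambda u: u[key_index])[rank - 1]
    other_rank = sorted(u[other] for u in units).index(chosen[other]) + 1
    return chosen, other_rank


# The six sets of size m = 3 listed in Table 1, as (X, Y) pairs.
cycle1 = [
    [(50, 143), (81, 169), (72, 176)],
    [(77, 181), (67, 158), (64, 177)],
    [(43, 147), (62, 155), (55, 163)],
]
cycle2 = [
    [(52, 148), (83, 160), (49, 150)],
    [(85, 189), (75, 169), (71, 188)],
    [(59, 171), (48, 154), (76, 174)],
]

# Cycle 1 ranks on X (index 0); cycle 2 ranks on Y (index 1).
picks = [pick_by_rank(s, k, 0) for k, s in enumerate(cycle1, start=1)]
picks += [pick_by_rank(s, k, 1) for k, s in enumerate(cycle2, start=1)]

sample = [unit for unit, _ in picks]
noted_ranks = [rank for _, rank in picks]  # (j_1, j_2, j_3, i_1, i_2, i_3)
```

Here `sample` reproduces the six pairs listed above, and `noted_ranks` gives the ranks $(j_1, j_2, j_3) = (1, 1, 2)$ and $(i_1, i_2, i_3) = (2, 1, 3)$ of the variable that was not used for ranking.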

2.2. Estimation of the population mean using MBVRSS1

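Assume that $\{(X_{(k)}^{(k)}, Y_{(j_k)}^{(k)}), (X_{(i_k)}^{(k)*}, Y_{(k)}^{(k)*});\ k = 1, 2, \ldots, m\}$ is a MBVRSS1 of size $2m$ from $f(x, y)$. Let $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$, and $\rho$ be the mean of $X$, the mean of $Y$, the variance of $X$, the variance of $Y$ and the correlation between $X$ and $Y$, respectively. We want to use the overall mean to estimate $\mu_x$ (for simplicity, we denote $\mu_x$ by $\mu$ and $\sigma_x^2$ by $\sigma^2$). For a given $(i_1, i_2, \ldots, i_m)$, let

$$\hat{\mu} = \frac{\sum_{k=1}^{m} X_{(k)}^{(k)} + \sum_{k=1}^{m} X_{(i_k)}^{(k)*}}{2m}.$$

From Takahasi and Wakimoto (1968), we have $\sum_{k=1}^{m} f_{(k)}(x) = m f(x)$. Thus, $E\left(\sum_{k=1}^{m} X_{(k)}^{(k)}\right) = m\int_{-\infty}^{\infty} x f(x)\,dx = m\mu$ and $E\left(\sum_{k=1}^{m} X_{(i_k)}^{(k)*}\right) = \sum_{k=1}^{m} \mu_{(i_k)}$. Therefore, $E(\hat{\mu}) = \frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}$. Thus, $\hat{\mu}$ can be a biased estimator. The estimator is unbiased if $i_k = k$ for $k = 1, 2, \ldots, m$. Also,

$$\mathrm{MSE}(\hat{\mu}) = \mathrm{Var}(\hat{\mu}) + (\mathrm{Bias}(\hat{\mu}))^2 = \frac{1}{4m^2}\left[\sum_{k=1}^{m} \sigma_{(k)}^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + \left(-\frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2.$$

Also, from Takahasi and Wakimoto (1968), we have $\sum_{k=1}^{m} \sigma_{(k)}^2 = m\sigma^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu)^2$. Thus,

$$\mathrm{MSE}(\hat{\mu}) = \frac{1}{4m^2}\left[m\sigma^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + \left(-\frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2.$$

Let $X_1, X_2, \ldots, X_{2m}$ be a SRS with common pdf $f(x)$, and let $\hat{\mu}_{SRS} = \sum_{k=1}^{2m} X_k/(2m)$, so that $\mathrm{Var}(\hat{\mu}_{SRS}) = \sigma^2/(2m)$. Then the efficiency of $\hat{\mu}$ w.r.t. $\hat{\mu}_{SRS}$ for a given $(i_1, i_2, \ldots, i_m)$ is

$$\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{\sigma^2}{\frac{1}{2m}\left[m\sigma^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + 2m\left(-\frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2}.$$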

2.3. Efficiency of the MBVRSS1 for some specific distributions

In this subsection, we investigate the performance of the MBVRSS1 method for estimating the mean of the uniform, exponential, and normal marginal distributions.

• Uniform distribution

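Let $X_1, X_2, \ldots, X_{2m}$ be a SRS from $U(0, \theta)$, with pdf $f(x)$ and cdf $F(x)$:

$$f(x) = \begin{cases} \frac{1}{\theta}, & 0 < x < \theta, \\ 0, & \text{otherwise}, \end{cases} \qquad F(x) = \begin{cases} 0, & x < 0, \\ \frac{x}{\theta}, & 0 \le x \le \theta, \\ 1, & x > \theta. \end{cases}$$

The pdf of the $k$th order statistic, $X_{(k)}$, of a SRS of size $m$ is $f_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!}\left(\frac{x}{\theta}\right)^{k-1}\left(1 - \frac{x}{\theta}\right)^{m-k}\frac{1}{\theta}$. It can be shown that $X_{(k)}/\theta$ has the beta distribution with $\alpha = k$ and $\beta = m - k + 1$. Thus, $E(X_{(k)}) = \mu_{(k)} = \frac{\alpha}{\alpha + \beta}\theta = \frac{k}{m + 1}\theta$ and $\mathrm{Var}(X_{(k)}) = \sigma_{(k)}^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\theta^2 = \frac{k(m-k+1)}{(m+1)^2(m+2)}\theta^2$. For the uniform distribution $U(0, \theta)$, we have $E(X) = \mu = \theta/2$ and $\mathrm{Var}(X) = \sigma^2 = \theta^2/12$. Thus, $E(\hat{\mu}_{SRS}) = \theta/2$ and $\mathrm{Var}(\hat{\mu}_{SRS}) = \theta^2/(24m)$. The efficiency does not depend on $\theta$, so taking $\theta = 1$, the efficiency of $\hat{\mu}$ w.r.t. $\hat{\mu}_{SRS}$ for a given $(i_1, i_2, \ldots, i_m)$ is

$$\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{1}{\frac{6}{m}\left[\frac{m}{12} - \sum_{k=1}^{m}\left(\frac{k}{m+1} - \frac{1}{2}\right)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + 24m\left(-\frac{1}{4} + \frac{1}{2m}\sum_{k=1}^{m} \frac{i_k}{m+1}\right)^2}.$$

The values of $\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS})$ for some values of $m$ and $(i_1, i_2, \ldots, i_m)$ are given in Table 2. The best case occurs when $i_k = k$ for $k = 1, 2, \ldots, m$. In this case, the values of $\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS})$ for $m = 2, 3, 4$ are 1.50, 2.00, and 2.50, respectively. These are the same as the efficiency (eff) of the usual RSS with set size $m$ and sample size $m$. The efficiency depends strongly on the values of $i_k$, which in turn depend strongly on the relation between the two variables.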
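The uniform-case efficiency can be evaluated exactly with rational arithmetic. The sketch below takes $\theta = 1$ and uses the order-statistic moments quoted above; the function name `uniform_efficiency` is a hypothetical helper introduced for this illustration.

```python
from fractions import Fraction


def uniform_efficiency(m, ranks):
    """eff(mu_hat; mu_hat_SRS) for U(0, 1), given the noted ranks (i_1, ..., i_m)."""

    def mean(k):  # E(X_(k)) for a set of size m
        return Fraction(k, m + 1)

    def var(k):  # Var(X_(k)) for a set of size m
        return Fraction(k * (m - k + 1), (m + 1) ** 2 * (m + 2))

    spread = sum((mean(k) - Fraction(1, 2)) ** 2 for k in range(1, m + 1))
    bracket = Fraction(m, 12) - spread + sum(var(i) for i in ranks)
    bias = -Fraction(1, 4) + sum(mean(i) for i in ranks) / (2 * m)
    return 1 / (Fraction(6, m) * bracket + 24 * m * bias ** 2)


best_cases = [uniform_efficiency(m, tuple(range(1, m + 1))) for m in (2, 3, 4)]
mixed_case = float(uniform_efficiency(3, (3, 2, 2)))
```

For the best case $i_k = k$ this gives $3/2$, $2$ and $5/2$ for $m = 2, 3, 4$, and for $m = 3$ with $(i_1, i_2, i_3) = (3, 2, 2)$ it gives $20/13 \approx 1.54$, in agreement with the corresponding entries of Table 2.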

•Exponential distribution

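Let $X_1, X_2, \ldots, X_{2m}$ be a SRS from $\exp(\theta)$, with pdf $f(x)$ and cdf $F(x)$:

$$f(x) = \begin{cases} \frac{1}{\theta}e^{-x/\theta}, & x \ge 0, \\ 0, & \text{otherwise}, \end{cases} \qquad F(x) = \begin{cases} 0, & x < 0, \\ 1 - e^{-x/\theta}, & x \ge 0. \end{cases}$$

The pdf of the $k$th order statistic of a random sample of size $m$ is:

$$f_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!}\left(1 - e^{-x/\theta}\right)^{k-1}\left(e^{-x/\theta}\right)^{m-k}\frac{1}{\theta}e^{-x/\theta}, \qquad x \ge 0.$$

If $X$ is $\exp(\theta)$, then $X/\theta$ is $\exp(1)$; $E(X) = \mu = \theta$ and $\mathrm{Var}(X) = \sigma^2 = \theta^2$. Thus, $E(\hat{\mu}_{SRS}) = \theta$ and $\mathrm{Var}(\hat{\mu}_{SRS}) = \theta^2/(2m)$.

$$E(X_{(k)}) = \mu_{(k)} = \int_0^{\infty} x f_{(k)}(x)\,dx = \int_0^{\infty} \frac{x}{\theta}\,\frac{m!}{(k-1)!\,(m-k)!}\left(1 - e^{-x/\theta}\right)^{k-1}\left(e^{-x/\theta}\right)^{m-k+1}dx.$$

The above integral can be evaluated for each value of $k$ using the Scientific WorkPlace package. Also,

$$\mathrm{MSE}(\hat{\mu}) = \frac{1}{4m^2}\left(m\theta^2 - \sum_{k=1}^{m} (\mu_{(k)} - \theta)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right) + \left(-\frac{\theta}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2.$$

Thus, taking $\theta = 1$ (the efficiency does not depend on $\theta$), the efficiency of $\hat{\mu}$ with respect to $\hat{\mu}_{SRS}$ for estimating the mean of the exponential distribution is

$$\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{1}{\frac{1}{2m}\left[m - \sum_{k=1}^{m} (\mu_{(k)} - 1)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + 2m\left(-\frac{1}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2}.$$

For example, if $m = 2$, then $\mu_{(1)} = 0.5$, $\mu_{(2)} = 1.5$, $\sigma_{(1)}^2 = 0.25$, $\sigma_{(2)}^2 = 1.25$. Thus, for $i_k = k$, $\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = 4/3$. The values of $\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS})$ for some values of $m$ and $(i_1, i_2, \ldots, i_m)$ are given in Table 3. The comments on this table are similar to those on Table 2.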
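The exponential case can also be computed without numerical integration. The sketch below assumes $\theta = 1$ and relies on the standard closed forms $\mu_{(k)} = \sum_{j=m-k+1}^{m} 1/j$ and $\sigma_{(k)}^2 = \sum_{j=m-k+1}^{m} 1/j^2$ for exponential order statistics; these closed forms are a well-known result that is not stated in the text, and the function names are hypothetical.

```python
from fractions import Fraction


def exp_order_moments(k, m):
    """Mean and variance of the k-th order statistic of m iid exp(1) variables
    (standard closed form; an assumption not taken from the paper)."""
    terms = range(m - k + 1, m + 1)
    return (sum(Fraction(1, j) for j in terms),
            sum(Fraction(1, j * j) for j in terms))


def exponential_efficiency(m, ranks):
    """eff(mu_hat; mu_hat_SRS) for exp(1), given the noted ranks (i_1, ..., i_m)."""
    moments = {k: exp_order_moments(k, m) for k in range(1, m + 1)}
    spread = sum((mu - 1) ** 2 for mu, _ in moments.values())
    bracket = m - spread + sum(moments[i][1] for i in ranks)
    bias = -Fraction(1, 2) + sum(moments[i][0] for i in ranks) / (2 * m)
    return 1 / (bracket / (2 * m) + 2 * m * bias ** 2)


eff_m2_best = exponential_efficiency(2, (1, 2))
eff_m2_high = exponential_efficiency(2, (2, 2))
eff_m3_best = float(exponential_efficiency(3, (1, 2, 3)))
```

With $m = 2$ the helper returns the moments quoted above, and the efficiencies $4/3$ for $(i_1, i_2) = (1, 2)$ and $4/5$ for $(2, 2)$ match the entries 1.33 and 0.80 of Table 3.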

• Normal distribution

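Let $X_1, X_2, \ldots, X_{2m}$ be a SRS from $N(\mu, \sigma^2)$ with pdf $f(x)$ and cdf $F(x)$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2} = \frac{1}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right), \qquad F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right), \qquad -\infty < x < \infty.$$

The pdf of the $k$th order statistic of a random sample of size $m$ is

$$f_{(k)}(x) = \frac{m!}{(k-1)!\,(m-k)!}\left(\Phi\left(\frac{x-\mu}{\sigma}\right)\right)^{k-1}\left(1 - \Phi\left(\frac{x-\mu}{\sigma}\right)\right)^{m-k}\frac{1}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right).$$

Thus,

$$\mathrm{MSE}(\hat{\mu}) = \frac{1}{4m^2}\left(m\sigma^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu)^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right) + \left(-\frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2.$$

The efficiency for the normal distribution does not depend on the parameters $\mu$ and $\sigma^2$, so we can compute $\mu_{(k)}$ and $\sigma_{(k)}^2$ from the standard normal distribution. Thus,

$$\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{1}{\frac{1}{2m}\left[m - \sum_{k=1}^{m} \mu_{(k)}^2 + \sum_{k=1}^{m} \sigma_{(i_k)}^2\right] + 2m\left(\frac{1}{2m}\sum_{k=1}^{m} \mu_{(i_k)}\right)^2}.$$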

The values of eff(μ̂; μ̂SRS) for some values of m and (i1, i2, …, im) are given in Table 4. Again, the best scenario occurs when ik = k for k = 1, 2, …,m.
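For the normal case the order-statistic moments have no simple closed form, so they must be obtained numerically. The sketch below is one possible way to do this, using a plain trapezoidal rule on $[-8, 8]$ for the standard normal; the integration limits, step count and function names are assumptions made for this illustration and are not the authors' method.

```python
import math


def normal_order_moments(k, m, lo=-8.0, hi=8.0, steps=8000):
    """Mean and variance of the k-th order statistic of m iid N(0, 1) variables,
    by trapezoidal integration of x * f_(k)(x) and x**2 * f_(k)(x)."""
    coef = math.factorial(m) / (math.factorial(k - 1) * math.factorial(m - k))
    h = (hi - lo) / steps
    first = second = 0.0
    for s in range(steps + 1):
        x = lo + s * h
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        dens = coef * cdf ** (k - 1) * (1.0 - cdf) ** (m - k) * pdf
        weight = 0.5 if s in (0, steps) else 1.0  # trapezoid end-point weights
        first += weight * x * dens * h
        second += weight * x * x * dens * h
    return first, second - first * first


def normal_efficiency(m, ranks):
    """eff(mu_hat; mu_hat_SRS) for N(0, 1), given the noted ranks (i_1, ..., i_m)."""
    moments = {k: normal_order_moments(k, m) for k in range(1, m + 1)}
    bracket = m - sum(mu ** 2 for mu, _ in moments.values())
    bracket += sum(moments[i][1] for i in ranks)
    bias = sum(moments[i][0] for i in ranks) / (2 * m)
    return 1.0 / (bracket / (2 * m) + 2 * m * bias ** 2)


eff_m2_best = normal_efficiency(2, (1, 2))
eff_m3_best = normal_efficiency(3, (1, 2, 3))
```

For $m = 2$ the computed mean of the larger order statistic agrees with the known value $1/\sqrt{\pi}$, and the best-case efficiencies round to 1.47 and 1.91, matching Table 4.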

2.4. Overall performance of the proposed BVRSS

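In the previous subsection, the efficiency was investigated for some specific values of $(i_1, i_2, \ldots, i_m)$. In general, $i_1, i_2, \ldots, i_m$ are independent random variables, each taking values in $\{1, 2, \ldots, m\}$. The best case, as seen, is $i_k = k$, $k = 1, 2, \ldots, m$, which occurs when the two variables of interest are highly correlated. In this extreme case, we have two RSSs, each of size $m$. So, in this case, if we use $X_{(k)}^{(k)}, X_{(i_k)}^{(k)*}$ to estimate $\mu$, then

$$\hat{\mu} = \frac{\sum_{k=1}^{m} X_{(k)}^{(k)} + \sum_{k=1}^{m} X_{(i_k)}^{(k)*}}{2m}; \qquad E(\hat{\mu}) = \frac{1}{2m}\left(\sum_{k=1}^{m} E\left(X_{(k)}^{(k)}\right) + \sum_{k=1}^{m} E\left(X_{(i_k)}^{(k)*}\right)\right) = \frac{1}{2m}(m\mu + m\mu) = \mu;$$

thus, $\hat{\mu}$ is an unbiased estimator of $\mu$. Also,

$$\mathrm{Var}(\hat{\mu}) = \frac{1}{4}\left(\mathrm{Var}(\hat{\mu}_{RSS}) + \mathrm{Var}(\hat{\mu}_{RSS}^{*})\right) = \frac{1}{2}\mathrm{Var}(\hat{\mu}_{RSS}),$$

where $\hat{\mu}_{RSS}$ is the estimator of $\mu$ based on a RSS of size $m$. Thus, the efficiency of $\hat{\mu}$ w.r.t. $\hat{\mu}_{SRS}$ for estimating the mean is

$$\mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{\sigma^2/(2m)}{0.5\,\mathrm{Var}(\hat{\mu}_{RSS})} = \frac{\sigma^2/m}{\mathrm{Var}(\hat{\mu}_{RSS})} = \mathrm{eff},$$

which is the efficiency of a RSS of size $m$ w.r.t. a SRS of the same size. It is well known that in this case $1 \le \mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) \le (m + 1)/2$; in particular, the number of cycles has no effect on the efficiency. The worst case occurs when there is no relation between the two variables; in this case $P(i_k = j) = 1/m$ for $j, k = 1, 2, \ldots, m$. The resulting sample consists of a RSS and a SRS of size $m$ each, i.e., $X_{(i_k)}^{(k)*}$, $k = 1, \ldots, m$, are equivalent to a SRS of size $m$:

$$\mathrm{Var}(\hat{\mu}) = \frac{1}{4}\left(\mathrm{Var}(\hat{\mu}_{RSS}) + \mathrm{Var}(\hat{\mu}_{SRS}^{*})\right); \qquad \mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) = \frac{\sigma^2/(2m)}{\frac{1}{4}\left(\mathrm{Var}(\hat{\mu}_{RSS}) + \mathrm{Var}(\hat{\mu}_{SRS}^{*})\right)} = \frac{2\sigma^2/m}{\mathrm{Var}(\hat{\mu}_{RSS})}\Big/\left(1 + \frac{\mathrm{Var}(\hat{\mu}_{SRS}^{*})}{\mathrm{Var}(\hat{\mu}_{RSS})}\right) = \frac{2\,\mathrm{eff}}{1 + \mathrm{eff}},$$

where eff is the efficiency of the usual RSS of set size $m$ w.r.t. a SRS of size $m$. Since $1 \le \mathrm{eff} \le (m + 1)/2$, we have $1 \le \mathrm{eff}(\hat{\mu}; \hat{\mu}_{SRS}) \le (2m + 2)/(m + 3)$. Table 5 contains the upper bounds of the efficiency for the best and worst cases.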
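The two bounds are easy to tabulate. The following sketch (with hypothetical function names) evaluates the best-case upper bound $(m + 1)/2$ and passes it through the worst-case map $2\,\mathrm{eff}/(1 + \mathrm{eff})$, using exact fractions.

```python
from fractions import Fraction


def best_case_bound(m):
    """Upper bound (m + 1) / 2 on the efficiency when i_k = k."""
    return Fraction(m + 1, 2)


def worst_case(eff):
    """Efficiency 2 eff / (1 + eff) when the two variables are unrelated."""
    return 2 * eff / (1 + eff)


bounds = {m: (best_case_bound(m), worst_case(best_case_bound(m)))
          for m in range(2, 7)}
```

For example, `bounds[2]` is $(3/2, 6/5)$ and `bounds[6]` is $(7/2, 14/9)$, which round to the entries 1.5, 1.2 and 3.50, 1.56 of Table 5.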

3. Second modification

3.1. Second modification of BVRSS (MBVRSS2)

In this section, we consider another modification of BVRSS, MBVRSS2. In this modification, we order with respect to the first variable and give the second variable the same rank (concomitant variable) for one cycle, and do the reverse for the other cycle. Let $(X, Y)$ be a bivariate random vector with joint pdf $f(x, y)$. The suggested approach is based on the concomitant variable and uses the following steps:

  • Randomly draw a SRS of size $m$ from the population of interest. Then identify the minimum with respect to the X-variable and give the corresponding Y-variable the same rank. Denote the identified element by $(X_{(1)}^{(1)}, Y_{[1]}^{(1)})$.

  • Randomly draw another SRS of size $m$. Then identify the second minimum with respect to the X-variable and give the corresponding Y-variable the same rank. Denote the identified element by $(X_{(2)}^{(2)}, Y_{[2]}^{(2)})$.

  • Continue the process until the maximum with respect to the X-variable is identified in an $m$th SRS of size $m$ and the corresponding Y-variable is given the same rank. Denote the identified element by $(X_{(m)}^{(m)}, Y_{[m]}^{(m)})$.

  • Randomly draw a SRS of size $m$ from the population of interest. Then identify the minimum with respect to the Y-variable and give the corresponding X-variable the same rank. Denote the identified element by $(X_{[1]}^{(1)*}, Y_{(1)}^{(1)*})$.

  • Randomly draw another SRS of size $m$ from the population of interest. Then identify the second minimum with respect to the Y-variable and give the corresponding X-variable the same rank. Denote the identified element by $(X_{[2]}^{(2)*}, Y_{(2)}^{(2)*})$.

  • Continue the process until the maximum with respect to the Y-variable is identified in an $m$th SRS of size $m$ and the corresponding X-variable is given the same rank. Denote the identified element by $(X_{[m]}^{(m)*}, Y_{(m)}^{(m)*})$.

Steps 1–3 make up the first cycle, while Steps 4–6 make up the second cycle.

The elements of the set $\{(X_{(k)}^{(k)}, Y_{[k]}^{(k)}), (X_{[k]}^{(k)*}, Y_{(k)}^{(k)*}):\ k = 1, 2, \ldots, m\}$ are the elements of the MBVRSS2 of size $2m$ using the concomitant variable. Cycles 1 and 2 can be repeated $r$ times if needed to obtain a MBVRSS2 of size $n = 2rm$. A round bracket on a subscript indicates that the ordering is perfect, while a square bracket indicates that the ordering is imperfect.
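The six steps can be sketched as follows. This is only an illustration under stated assumptions: the ranked variable is ranked perfectly, and `draw_pair` is a hypothetical generator of correlated pairs that is not taken from the paper.

```python
import random


def draw_pair(rng):
    """Hypothetical population: a correlated pair (x, y); not from the paper."""
    x = rng.gauss(0.0, 1.0)
    return (x, 0.8 * x + 0.6 * rng.gauss(0.0, 1.0))


def mbvrss2(m, rng):
    """Return a MBVRSS2 of size 2m: cycle 1 ranks on X, cycle 2 ranks on Y."""
    sample = []
    for key in (0, 1):  # 0: rank on X (Steps 1-3); 1: rank on Y (Steps 4-6)
        for k in range(1, m + 1):
            units = [draw_pair(rng) for _ in range(m)]
            # Keep the unit ranked k-th on the ranking variable; the other
            # coordinate of that unit is the concomitant and inherits rank k.
            sample.append(sorted(units, key=lambda u: u[key])[k - 1])
    return sample


sample2 = mbvrss2(m=3, rng=random.Random(2))
mu_x_hat = sum(x for x, _ in sample2) / len(sample2)
mu_y_hat = sum(y for _, y in sample2) / len(sample2)
```

Unlike MBVRSS1, no rank has to be noted for the second variable, so only one variable is ranked in each set.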

Consider $(X_{[k]}^{(k)*}, Y_{(k)}^{(k)*})$, $k = 1, 2, \ldots, m$. Yang (1977) has shown that the conditional distribution of the concomitant variable $X_{[k]}^{(k)*}$ given the order statistic $Y_{(k)}^{(k)*}$ is the same as the conditional distribution of $X$ given $Y$; i.e.,

$$f_{X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = f_{X \mid Y}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right).$$

Based on this result, we can obtain the mean and the variance of the concomitant order statistic $X_{[k]}^{(k)*}$ in terms of the mean and variance of the order statistic $Y_{(k)}^{(k)*}$.

The performance of MBVRSS2 is investigated for two important distributions: the bivariate normal distribution and Downton’s bivariate exponential distribution.

• Bivariate normal distribution

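Assume that a random vector $(X, Y)$ has a bivariate normal distribution, denoted by $BN(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \rho)$, where $-\infty < x, y, \mu_x, \mu_y < \infty$, $0 < \sigma_x^2, \sigma_y^2 < \infty$, and $-1 < \rho < 1$. Here $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$, $\rho$ are respectively the mean of $X$, the mean of $Y$, the variance of $X$, the variance of $Y$ and the correlation between $X$ and $Y$. The joint pdf is given by:

$$f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right\}.$$

It is well known that the marginals are normal and the conditionals are also normal. In particular:

$$Y \sim N(\mu_y, \sigma_y^2) \qquad \text{and} \qquad (X \mid Y = y) \sim N\left(\mu_x + \rho\frac{\sigma_x}{\sigma_y}(y - \mu_y),\ \sigma_x^2(1-\rho^2)\right).$$

Using the results of Yang (1977), we have

$$f_{X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = f_{X \mid Y}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right).$$

Thus,

$$E\left(X_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = \mu_x + \rho\frac{\sigma_x}{\sigma_y}\left(y_{(k)}^{(k)*} - \mu_y\right) \qquad \text{and} \qquad \mathrm{Var}\left(X_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = \sigma_x^2(1-\rho^2).$$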

Therefore,

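$$E\left(X_{[k]}^{(k)*}\right) = E\left(E\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) = \mu_x + \rho\frac{\sigma_x}{\sigma_y}E\left(Y_{(k)}^{(k)*} - \mu_y\right);$$

also,

$$\mathrm{Var}\left(X_{[k]}^{(k)*}\right) = \mathrm{Var}\left(E\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) + E\left(\mathrm{Var}\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) = \sigma_x^2(1-\rho^2) + \rho^2\frac{\sigma_x^2}{\sigma_y^2}\mathrm{Var}\left(Y_{(k)}^{(k)*}\right).$$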

Let

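$$\hat{\mu}_x = \frac{\sum_{k=1}^{m} X_{(k)}^{(k)} + \sum_{k=1}^{m} X_{[k]}^{(k)*}}{2m},$$

then it can be shown easily that

$$E(\hat{\mu}_x) = \mu_x \qquad \text{and} \qquad \mathrm{Var}(\hat{\mu}_x) = \frac{1}{4m^2}\left(m\sigma_x^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu_x)^2 + m\sigma_x^2(1-\rho^2) + \rho^2\frac{\sigma_x^2}{\sigma_y^2}\sum_{k=1}^{m} \mathrm{Var}\left(Y_{(k)}^{(k)*}\right)\right).$$

Let $X_1, X_2, \ldots, X_{2m}$ be a SRS from $N(\mu_x, \sigma_x^2)$ and let $\hat{\mu}_{SRS} = \sum_{k=1}^{2m} X_k/(2m)$. Then, since $m\sigma_x^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu_x)^2 = m\sigma_x^2/\mathrm{eff}$ and $\sum_{k=1}^{m} \mathrm{Var}(Y_{(k)}^{(k)*}) = m\sigma_y^2/\mathrm{eff}$,

$$\mathrm{eff}(\hat{\mu}_x; \hat{\mu}_{SRS}) = \frac{\sigma_x^2/(2m)}{\frac{1}{4m^2}\left(m\sigma_x^2 - \sum_{k=1}^{m} (\mu_{(k)} - \mu_x)^2 + m\sigma_x^2(1-\rho^2) + \rho^2\frac{\sigma_x^2}{\sigma_y^2}\sum_{k=1}^{m} \mathrm{Var}\left(Y_{(k)}^{(k)*}\right)\right)} = \frac{2m}{(m/\mathrm{eff}) + m(1-\rho^2) + \rho^2(m/\mathrm{eff})} = \frac{2\,\mathrm{eff}}{1 + \mathrm{eff}(1-\rho^2) + \rho^2}.$$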

Table 6 gives the efficiency for m = 2, 3, 4 and different values of ρ. The efficiency is always larger than 1; it is increasing in m for fixed ρ, and increasing in ρ for fixed m. As |ρ| → 1, the efficiency goes to eff, which is the efficiency of μ̂RSS with respect to μ̂SRS for estimating the mean of normal distribution.
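The closed form $2\,\mathrm{eff}/(1 + \mathrm{eff}(1-\rho^2) + \rho^2)$ is simple to evaluate. The sketch below uses the rounded value $\mathrm{eff} = 1.47$ for $m = 2$ from the best case of Table 4 as its input; the function name is a hypothetical helper.

```python
def mbvrss2_efficiency(eff, rho):
    """eff(mu_x_hat; mu_hat_SRS) = 2 eff / (1 + eff (1 - rho^2) + rho^2)."""
    return 2 * eff / (1 + eff * (1 - rho ** 2) + rho ** 2)


# eff = 1.47 is the (rounded) efficiency of the usual RSS for N(0, 1), m = 2.
row_m2 = [round(mbvrss2_efficiency(1.47, rho), 2)
          for rho in (0.0, 0.2, 0.4, 0.6, 0.8)]
limit_m2 = mbvrss2_efficiency(1.47, 1.0)
```

This reproduces the $m = 2$ row of Table 6 (1.19, 1.20, 1.23, 1.28, 1.36), and at $|\rho| = 1$ the expression collapses to eff itself.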

• Downton’s bivariate exponential distribution

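This distribution, denoted by $DBE(\theta_x, \theta_y, \rho)$, was introduced by Downton (1970). Its pdf is given by:

$$f_{X,Y}(x, y) = \frac{1}{\theta_x\theta_y(1-\rho)}\exp\left\{-\left(\frac{x}{\theta_x(1-\rho)} + \frac{y}{\theta_y(1-\rho)}\right)\right\} \times I_0\left(\frac{2\sqrt{\rho x y}}{(1-\rho)\sqrt{\theta_x\theta_y}}\right),$$

where $x, y, \theta_x, \theta_y > 0$, $0 \le \rho < 1$, and $I_0(z) = \sum_{k=0}^{\infty} (z/2)^{2k}/(k!)^2$ is the modified Bessel function of the first kind of order zero.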

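The marginal distributions of $X$ and $Y$ are exponential with parameters $\theta_x$ and $\theta_y$, respectively. Also,

$$E(X \mid Y = y) = (1-\rho)\theta_x + \rho\frac{\theta_x}{\theta_y}y \qquad \text{and} \qquad \mathrm{Var}(X \mid Y = y) = (1-\rho)^2\theta_x^2 + 2\rho(1-\rho)\frac{\theta_x^2}{\theta_y}y.$$

Note that, unlike the bivariate normal distribution, the conditional variance depends on the given value of $Y$. Using the results of Yang (1977), we have

$$f_{X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = f_{X \mid Y}\left(x_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right).$$

Thus,

$$E\left(X_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = (1-\rho)\theta_x + \rho\frac{\theta_x}{\theta_y}y_{(k)}^{(k)*} \qquad \text{and} \qquad \mathrm{Var}\left(X_{[k]}^{(k)*} \mid y_{(k)}^{(k)*}\right) = (1-\rho)^2\theta_x^2 + 2\rho(1-\rho)\frac{\theta_x^2}{\theta_y}y_{(k)}^{(k)*};$$

therefore

$$E\left(X_{[k]}^{(k)*}\right) = E\left(E\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) = (1-\rho)\theta_x + \rho\frac{\theta_x}{\theta_y}E\left(Y_{(k)}^{(k)*}\right);$$

also,

$$\mathrm{Var}\left(X_{[k]}^{(k)*}\right) = \mathrm{Var}\left(E\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) + E\left(\mathrm{Var}\left(X_{[k]}^{(k)*} \mid Y_{(k)}^{(k)*}\right)\right) = \rho^2\frac{\theta_x^2}{\theta_y^2}\mathrm{Var}\left(Y_{(k)}^{(k)*}\right) + (1-\rho)^2\theta_x^2 + 2\rho(1-\rho)\frac{\theta_x^2}{\theta_y}E\left(Y_{(k)}^{(k)*}\right).$$

Summing over $k$ and using $\sum_{k=1}^{m} E(Y_{(k)}^{(k)*}) = m\theta_y$ gives $\sum_{k=1}^{m} \mathrm{Var}(X_{[k]}^{(k)*}) = m\theta_x^2(1-\rho^2) + \rho^2\frac{\theta_x^2}{\theta_y^2}\sum_{k=1}^{m} \mathrm{Var}(Y_{(k)}^{(k)*})$, which has the same form as in the bivariate normal case.

Thus, $\hat{\theta}_x = \left\{\sum_{k=1}^{m} X_{(k)}^{(k)} + \sum_{k=1}^{m} X_{[k]}^{(k)*}\right\}/(2m)$, as in the bivariate normal case, is an unbiased estimator of $\theta_x$. The efficiency of $\hat{\theta}_x$ w.r.t. $\hat{\theta}_{SRS}$ is

$$\mathrm{eff}(\hat{\theta}_x; \hat{\theta}_{SRS}) = \frac{2\,\mathrm{eff}}{1 + \mathrm{eff}(1-\rho^2) + \rho^2}.$$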

From Table 7, the efficiency is always larger than 1, increasing in $m$ for fixed $\rho$, and increasing in $\rho$ for fixed $m$. Also, as $\rho \to 1$, the efficiency goes to eff, which is the efficiency of $\hat{\mu}_{RSS}$ with respect to $\hat{\mu}_{SRS}$ for estimating the mean of the exponential distribution.

3.2. General setting of the technique

In the previous subsection, we investigated the efficiency of μ̂x with respect to μ̂SRS for the mean of some well-known distributions. In this subsection, we will look at a general case.

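Let $\{(X_{(k)}^{(k)}, Y_{[k]}^{(k)}), (X_{[k]}^{(k)*}, Y_{(k)}^{(k)*}):\ k = 1, 2, \ldots, m\}$ be a MBVRSS2 of size $2m$ obtained using the suggested technique explained above. Frey (2007) proposed that the distribution of the concomitant order statistic can be written as a linear combination of the distributions of the order statistics:

$$X_{[k]}^{(k)*} \stackrel{d}{=} \sum_{i=1}^{m} a_i^{(k)} X_{(i)},$$

where, for each value of $k$, $a_i^{(k)} \ge 0$ and $\sum_{i=1}^{m} a_i^{(k)} = 1$ (so the distribution of $X_{[k]}^{(k)*}$ is a convex combination of those of the $X_{(i)}$, $i = 1, \ldots, m$). Now, let

$$\hat{\mu}_x = \frac{\sum_{k=1}^{m} X_{(k)}^{(k)} + \sum_{k=1}^{m} X_{[k]}^{(k)*}}{2m}.$$

Thus, $E(\hat{\mu}_x) = \frac{1}{2m}\left(m\mu + \sum_{k=1}^{m}\sum_{i=1}^{m} a_i^{(k)}\mu_{(i)}\right) = \frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m}\sum_{i=1}^{m} a_i^{(k)}\mu_{(i)}$; thus $\hat{\mu}_x$ can be a biased estimator of $\mu$. Again, let eff be the efficiency of the usual estimator of $\mu$ using a RSS of size $m$ w.r.t. the corresponding estimator using a SRS; the values of eff are well established for most of the well-known distributions. The efficiency of $\hat{\mu}_x$ w.r.t. $\hat{\mu}_{SRS}$ is given by

$$\mathrm{eff}(\hat{\mu}_x; \hat{\mu}_{SRS}) = \frac{\sigma^2/(2m)}{\frac{1}{4}\left[\frac{\sigma^2}{m \times \mathrm{eff}} + \mathrm{Var}\left(\frac{1}{m}\sum_{k=1}^{m}\sum_{i=1}^{m} a_i^{(k)} X_{(i)}\right)\right] + \left(-\frac{\mu}{2} + \frac{1}{2m}\sum_{k=1}^{m}\sum_{i=1}^{m} a_i^{(k)}\mu_{(i)}\right)^2}.$$

$\mathrm{eff}(\hat{\mu}_x; \hat{\mu}_{SRS})$ depends on the values of $a_i^{(k)}$. If $a_i^{(k)} = 1/m$ for all $i$ and $k$, then $E(\hat{\mu}_x) = \mu$ and $\mathrm{Var}(\hat{\mu}_x) = \frac{\sigma^2}{4m}\left(\frac{1}{\mathrm{eff}} + 1\right)$. Thus, in this case

$$\mathrm{eff}(\hat{\mu}_x; \hat{\mu}_{SRS}) = \frac{2\,\mathrm{eff}}{1 + \mathrm{eff}}.$$

If $a_i^{(k)} = 0$ for all $i$ except $a_k^{(k)} = 1$, then the sample is equivalent to two RSSs of size $m$ each; thus, as in Section 2.4, $\mathrm{eff}(\hat{\mu}_x; \hat{\mu}_{SRS}) = \mathrm{eff}$. For other values of $a_i^{(k)}$, the efficiency can be obtained similarly.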

4. Application: trees data

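In this section, data on the heights ($Y$) and diameters ($X$) of 1,083 trees are used to illustrate the MBVRSS2 approach. These data are due to Pordan (1968). For these data, $\mu_x = 23.07$, $\mu_y = 21.656$, $\sigma_x = 6.268$, $\sigma_y = 3.039$, and $\rho = 0.721$ (the reported values 6.268 and 3.039 are standard deviations, as the computations below show). To choose different MBVRSS2s from this bivariate data set, we followed the steps described previously. For each of $m = 3$ and $m = 5$, 10,000 MBVRSS2s were chosen randomly, so 10,000 values were generated for $\hat{\mu}_x$ and $\hat{\mu}_y$.

  • For $m = 3$, based on Table 8 we have:

    $\mathrm{Bias}(\hat{\mu}_x) = E(\hat{\mu}_x) - \mu_x \cong 23.054 - 23.070 = -0.016$, $\mathrm{MSE}(\bar{X}) = \sigma_x^2/(2m) = 6.548$, $\mathrm{MSE}(\hat{\mu}_x) = (2.032)^2 + (-0.016)^2 = 4.129$. Thus, $\mathrm{eff}(\hat{\mu}_x; \bar{X}) = \mathrm{MSE}(\bar{X})/\mathrm{MSE}(\hat{\mu}_x) = 1.586$. Similarly, $\mathrm{eff}(\hat{\mu}_y; \bar{Y}) = 1.539$.

  • For $m = 5$, based on Table 9 we have:

    $\mathrm{eff}(\hat{\mu}_x; \bar{X}) = 3.929/1.991 = 1.973$ and $\mathrm{eff}(\hat{\mu}_y; \bar{Y}) = 0.924/0.462 = 2$.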
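The arithmetic behind these efficiencies can be retraced from the summary statistics in Tables 8 and 9. The sketch below treats 6.268 as the standard deviation of $X$, which is the reading that reproduces the reported $\mathrm{MSE}(\bar{X}) = 6.548$; the function name is a hypothetical helper.

```python
def empirical_efficiency(sd_pop, m, mean_est, sd_est, mean_pop):
    """Return (MSE of the SRS mean of size 2m, MSE of the estimator, efficiency)."""
    mse_srs = sd_pop ** 2 / (2 * m)
    mse_est = sd_est ** 2 + (mean_est - mean_pop) ** 2  # variance + bias^2
    return mse_srs, mse_est, mse_srs / mse_est


# Diameter X: population mean 23.07, population standard deviation 6.268.
mse_srs_x3, mse_x3, eff_x3 = empirical_efficiency(6.268, 3, 23.054, 2.032, 23.07)
_, _, eff_x5 = empirical_efficiency(6.268, 5, 23.047, 1.411, 23.07)
```

With these inputs, `mse_srs_x3` and `mse_x3` round to 6.548 and 4.129, and the efficiencies round to 1.586 for $m = 3$ and 1.97 for $m = 5$.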

5. Concluding remarks and suggested future work

In this paper, two approaches to BVRSS are introduced, both of which are more convenient to apply in practice than the original procedure. The first approach is based on ranking one variable and noting the exact rank of the other variable in one cycle, and doing the reverse in the other cycle. The second approach is based on ranking the first variable in the first cycle and giving the second variable the same rank (concomitant order statistic), and doing the reverse in the second cycle. The two approaches were investigated in general and for some well-known distributions. A real data set was used for illustration. The suggested approaches are shown to be usable in practice and can be more efficient than SRS. The two proposed approaches can also be applied to other modifications of RSS, such as moving extreme ranked set sampling; this is another option that can be investigated in the future. Parametric inference can be carried out if the underlying distribution is known; in particular, the maximum likelihood estimator (MLE) can be obtained. The information in the chosen sample can be measured using the Fisher information number.

Acknowledgements

Our sincere thanks are given to the referees for their careful reading of the paper and for comments that significantly improved the original version of the paper.

TABLES

Table 1

A numerical example of the modification with m = 3

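| Cycle (steps) | Set | (X, Y) | (X, Y) | (X, Y) | Selected pair |
|---|---|---|---|---|---|
| Cycle 1 (Steps 1, 2, 3) | 1 | (50, 143) | (81, 169) | (72, 176) | (50, 143) |
| | 2 | (77, 181) | (67, 158) | (64, 177) | (67, 158) |
| | 3 | (43, 147) | (62, 155) | (55, 163) | (62, 155) |
| Cycle 2 (Steps 4, 5, 6) | 1 | (52, 148) | (83, 160) | (49, 150) | (52, 148) |
| | 2 | (85, 189) | (75, 169) | (71, 188) | (71, 188) |
| | 3 | (59, 171) | (48, 154) | (76, 174) | (76, 174) |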

Table 2

The eff(μ̂; μ̂SRS) for uniform distribution

| m = 2 | Efficiency | m = 3 | Efficiency | m = 4 | Efficiency |
|---|---|---|---|---|---|
| i1 = i2 = 1 | 1.00 | i1 = i2 = i3 = 1 | 0.63 | i1 = i2 = i3 = i4 = 1 | 0.40 |
| i1 = 1, i2 = 2 | 1.50 | i1 = 1, i2 = 2, i3 = 3 | 2.00 | i1 = 1, i2 = 2, i3 = 3, i4 = 4 | 2.50 |
| i1 = i2 = 2 | 1.00 | i1 = 3, i2 = i3 = 2 | 1.54 | i1 = i2 = 3, i3 = 4, i4 = 1 | 2.17 |
| | | i1 = i2 = i3 = 3 | 0.63 | i1 = i2 = i3 = i4 = 4 | 0.40 |

Table 3

The eff(μ̂; μ̂SRS) for exponential distribution

| m = 2 | Efficiency | m = 3 | Efficiency | m = 4 | Efficiency |
|---|---|---|---|---|---|
| i1 = i2 = 1 | 1.33 | i1 = i2 = i3 = 1 | 0.97 | i1 = i2 = i3 = i4 = 1 | 0.71 |
| i1 = 1, i2 = 2 | 1.33 | i1 = 1, i2 = 2, i3 = 3 | 1.64 | i1 = 1, i2 = 2, i3 = 3, i4 = 4 | 1.92 |
| i1 = i2 = 2 | 0.80 | i1 = 3, i2 = i3 = 2 | 1.44 | i1 = i2 = 3, i3 = 4, i4 = 1 | 1.71 |
| | | i1 = i2 = i3 = 3 | 0.49 | i1 = i2 = i3 = i4 = 4 | 0.30 |

Table 4

The eff(μ̂; μ̂SRS) for normal distribution

| m = 2 | Efficiency | m = 3 | Efficiency | m = 4 | Efficiency |
|---|---|---|---|---|---|
| i1 = i2 = 1 | 1.00 | i1 = i2 = i3 = 1 | 0.62 | i1 = i2 = i3 = i4 = 1 | 0.30 |
| i1 = 1, i2 = 2 | 1.47 | i1 = 1, i2 = 2, i3 = 3 | 1.91 | i1 = 1, i2 = 2, i3 = 3, i4 = 4 | 2.35 |
| i1 = i2 = 2 | 1.00 | i1 = 3, i2 = i3 = 2 | 1.60 | i1 = i2 = 3, i3 = 4, i4 = 1 | 1.94 |
| | | i1 = i2 = i3 = 3 | 0.62 | i1 = i2 = i3 = i4 = 4 | 0.39 |

Table 5

The efficiency in general with the best and worst cases eff(μ̂; μ̂SRS)

| m | 2 | 3 | 4 | 5 | 6 | General m |
|---|---|---|---|---|---|---|
| Best case | 1.5 | 2.00 | 2.50 | 3.0 | 3.50 | (m + 1)/2 |
| Worst case | 1.2 | 1.33 | 1.43 | 1.5 | 1.56 | (2m + 2)/(m + 3) |

Table 6

The eff(μ̂x; μ̂SRS) for Bivariate normal distribution

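| m ↓ \ |ρ| → | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | |ρ| → 1 |
|---|---|---|---|---|---|---|
| 2 | 1.19 | 1.20 | 1.23 | 1.28 | 1.36 | 1.47 |
| 3 | 1.31 | 1.33 | 1.38 | 1.48 | 1.64 | 1.92 |
| 4 | 1.40 | 1.46 | 1.50 | 1.64 | 1.90 | 2.35 |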

Table 7

The eff(μ̂x; μ̂SRS) for Downton’s bivariate exponential distribution

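| m ↓ \ |ρ| → | 0 | 0.2 | 0.4 | 0.6 | 0.8 | |ρ| → 1 |
|---|---|---|---|---|---|---|
| 2 | 1.14 | 1.15 | 1.17 | 1.20 | 1.26 | 1.33 |
| 3 | 1.25 | 1.30 | 1.36 | 2.04 | 1.47 | 1.64 |
| 4 | 1.32 | 1.33 | 1.38 | 1.48 | 1.64 | 1.92 |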

Table 8

The summary of the resulted descriptive statistics for m = 3

| Variable | N | Average | Standard deviation | Max | Min |
|---|---|---|---|---|---|
| X | 1083 | 23.054 | 2.032 | 33.175 | 16.567 |
| Y | 1083 | 21.655 | 0.982 | 25.033 | 17.467 |

Table 9

The summary of the resulted descriptive statistics for m = 5

| Variable | N | Average | Standard deviation | Max | Min |
|---|---|---|---|---|---|
| X | 1083 | 23.047 | 1.411 | 29.775 | 18.090 |
| Y | 1083 | 21.647 | 0.680 | 24.010 | 18.890 |

References
  1. Al-Odat, MT, and Al-Saleh, MF (2001). A variation of ranked set sampling. Journal of Applied Statistical Science. 10, 137-146.
  2. Al-Omari, AI, and Al-Saleh, MF (2009). Quartile double ranked set sampling for estimating the population mean. Stochastics and Quality Control. 24, 243-253.
  3. Al-Saleh, MF, and Ababneh, A (2015). Test for accuracy in ranking in moving extreme ranked set sampling. International Journal of Computational and Theoretical Statistics. 2, 67-77.
  4. Al-Saleh, MF, and Al-Ananbeh, AM (2007). Estimation of the means of the bivariate normal using moving extreme ranked set sampling with concomitant variable. Statistical Papers. 48, 179-195.
  5. Al-Saleh, MF, and Aldarabseh, MZ (2017). Inference on the skew normal distribution using ranked set sampling. International Journal of Computational and Theoretical Statistics. 4, 65-76.
  6. Al-Saleh, MF, and Al-Hadrami, SA (2003a). Parametric estimation for the location parameter for symmetric distributions using moving extremes ranked set sampling with application to trees data. Environmetrics. 14, 651-664.
  7. Al-Saleh, MF, and Al-Hadrami, SA (2003b). Estimation of the mean of the exponential distribution using moving extremes ranked set sampling. Statistical Papers. 44, 367-382.
  8. Al-Saleh, MF, and Al-Kadiri, MA (2000). Double-ranked set sampling. Statistics & Probability Letters. 48, 205-212.
  9. Al-Saleh, MF, and Al-Omari, AI (2002). Multistage ranked set sampling. Journal of Statistical Planning and Inference. 102, 273-286.
  10. Al-Saleh, MF, and Diab, YA (2009). Estimation of the parameters of Downton’s bivariate exponential distribution using ranked set sampling scheme. Journal of Statistical Planning and Inference. 139, 277-286.
  11. Al-Saleh, MF, and Na’amneh, AK (2014). Properties of the elements of simple, ranked set and moving extreme ranked set samples. Journal of Applied Statistical Science. 22, 75-85.
  12. Al-Saleh, MF, and Zheng, G (2002a). Estimation of bivariate characteristics using ranked set sampling. Australian & New Zealand Journal of Statistics. 44, 221-232.
  13. Al-Saleh, MF, and Zheng, G (2002b). Modified maximum likelihood estimators based on ranked set sampling. Annals of the Institute of Statistical Mathematics. 54, 641-658.
  14. Downton, F (1970). Bivariate exponential distribution in reliability theory. Journal of the Royal Statistical Society. Series B (Methodological). 32, 408-417.
  15. Frey, JC (2007). New imperfect ranking models for ranked set sampling. Journal of Statistical Planning and Inference. 137, 1433-1445.
  16. Hanandeh, AA, and Al-Saleh, MF (2013). Inference on Downton’s bivariate exponential distribution based on moving extreme ranked set sampling. Austrian Journal of Statistics. 42, 161-179.
  17. McIntyre, GA (1952). A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research. 3, 385-390.
  18. Norris, RC, Patil, GP, and Sinha, AK (1995). Estimation of multiple characteristics by ranked set sampling methods. Coenoses. 10, 95-111.
  19. Patil, GP, Sinha, AK, and Taillie, C (1994). Ranked set sampling for multiple characteristics. International Journal of Ecology and Environmental Sciences. 20, 357-373.
  20. Pordan, M (1968). Forest Biometrics. London: Pergamon Press
  21. Ridout, MS (2003). On ranked set sampling for multiple characteristics. Environmental and Ecological Statistics. 10, 255-262.
  22. Samawi, HM, and Al-Saleh, MF (2007). On bivariate ranked set sampling for ratio and regression estimators. International Journal of Modeling and Simulation. 27, 299-305.
  23. Scheaffer, RL, Mendenhall, W, Ott, RL, and Gerow, KG (2011). Elementary Survey Sampling. London: Duxbury Press
  24. Stokes, SL (1977). Ranked set sampling with concomitant variables. Communications in Statistics-Theory and Methods. 6, 1207-1211.
  25. Takahasi, K, and Wakimoto, K (1968). On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics. 20, 1-31.
  26. Yang, SS (1977). General distribution theory of the concomitant of order statistics. The Annals of Statistics. 5, 996-1002.
  27. Zamanzade, E, and Mohammadi, M (2016). Some modified mean estimators in ranked set sampling using a covariate. Journal of Statistical Theory and Applications. 15, 142-152.