Ranked set sampling, as introduced by McIntyre (
Statistics is the science of collecting data from a sample of a population in order to make inferences about that population. The accuracy of the inference depends on how well the sample represents the population, which is largely controlled by the size of the sample and the technique used to select it. Several sampling techniques are available for obtaining data that give an informative picture of a large population of interest.
Simple random sampling is the basic method for almost all other sampling techniques. If a sample of size
Ranked set sampling (RSS) was first proposed by McIntyre (1952). This data collection technique is suitable for situations where taking actual measurements of the sampled units is difficult (costly and time-consuming) compared with ranking them by judgment. The RSS technique can be executed as follows:
Randomly draw
The elements within each set are ranked from smallest to largest by judgment with respect to the variable of interest. It is assumed that each element can be ranked visually or by a method of negligible cost that does not require actual quantification.
From the
Repeat Steps 1–3
The observations of this sample can be denoted by {
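The cycle structure described in the steps above can be sketched in Python. This is a minimal illustration that assumes perfect ranking (an exact sort stands in for judgment ranking). The names `ranked_set_sample`, `draw`, `m`, and `cycles` are illustrative and are not taken from the paper.

```python
import random


def ranked_set_sample(draw, m, cycles=1, rng=None):
    """Return a ranked set sample of size m * cycles under perfect ranking.

    draw(rng) is a hypothetical callback returning the value of one unit
    drawn at random from the population.
    """
    if rng is None:
        rng = random.Random()
    sample = []
    for _ in range(cycles):
        for i in range(m):
            # Draw a fresh set of m units and rank it. The exact sort is an
            # assumption standing in for cheap judgment ranking.
            ranked = sorted(draw(rng) for _ in range(m))
            # Only the unit with rank i + 1 in the (i + 1)-th set is measured.
            sample.append(ranked[i])
    return sample


# Example: set size 3, four cycles, uniform(0, 1) population.
rss = ranked_set_sample(lambda r: r.random(), m=3, cycles=4, rng=random.Random(0))
```

Each cycle draws `m * m` units but measures only `m` of them, which is why the method pays off when ranking is much cheaper than measurement.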
In McIntyre’s RSS procedure, it is assumed that the researcher could order a set of size units with respect to the characteristic of interest with perfect ranking. He claimed (without proof) that the mean of the ranked set sampling
Based on these identities,
The upper bound is achieved if the underlying distribution is uniform and the lower bound is achieved if it is degenerate.
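The uniform case can be checked exactly. The sketch below uses the standard variance of the i-th order statistic of m independent U(0, 1) values and compares the variance of the RSS mean with that of an SRS mean of the same size. The symbol `m` for the set size and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def uniform_rss_efficiency(m):
    """Var(SRS mean) / Var(RSS mean) for U(0, 1), set size m, one cycle."""
    # Var of the i-th order statistic of m iid U(0, 1) values:
    #   i (m - i + 1) / ((m + 1)^2 (m + 2))
    var_sum = sum(
        Fraction(i * (m - i + 1), (m + 1) ** 2 * (m + 2))
        for i in range(1, m + 1)
    )
    var_rss_mean = var_sum / m**2      # independent sets, perfect ranking
    var_srs_mean = Fraction(1, 12) / m  # Var of U(0, 1) is 1/12
    return var_srs_mean / var_rss_mean


for m in range(2, 7):
    print(m, uniform_rss_efficiency(m))
```

Under these assumptions the ratio simplifies to (m + 1)/2 for every m, which is the upper bound attained by the uniform distribution.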
Al-Saleh and Al-Kadiri (2000) introduced the double ranked set sampling method (DRSS) as a procedure that increases the efficiency of RSS without increasing the set size
Moving extreme ranked set sampling (MERSS) is a modification of the usual RSS technique. It was introduced by Al-Odat and Al-Saleh (2001) and investigated further by Al-Saleh and Al-Hadrami (2003a, 2003b). In this procedure, only the extreme values of sets of varying size are identified by judgment. Unlike RSS, MERSS allows for an increase of set size
In many real applications, such as in agriculture or medicine, two variables are closely related and one of them is easy to measure. Stokes (1977) studied RSS with concomitant variables; she assumed that each sampling unit has a bivariate response where
A multivariate version of RSS was introduced by Al-Saleh and Zheng (2002a). For simplicity, the procedure was introduced for two characteristics and referred to as ‘bivariate ranked set sampling (BVRSS)’. The following steps describe a BVRSS procedure:
For a given set size
In the first pool, identify the minimum value by judgment with respect to the first characteristic, for each of the
For the
Repeat Steps 2 and 3 for the second pool, but for actual quantification choose the pair that corresponds to the first minimum value with respect to the first characteristic and the second minimum value with respect to the second characteristic. This pair resembles the label (1, 2).
Continue the process until the label (
This procedure produces a BVRSS sample of size
Ridout (2003) considered several techniques for taking an RSS from a population when multiple characteristics are of interest; one of these methods is similar to one of the two methods investigated here. His investigation was based on simulation from the bivariate normal distribution. Samawi and Al-Saleh (2007) studied the performance of BVRSS relative to RSS and SRS, and investigated the estimation of the population mean using ratio and regression methods. Al-Saleh and Diab (2009) estimated the parameters of Downton’s bivariate exponential distribution using MERSS with a concomitant variable. For more work on RSS, see Al-Saleh and Al-Kadiri (2000), Hanandeh and Al-Saleh (2013), Al-Saleh and Zheng (2002b), Al-Saleh and Diab (2009), Al-Saleh and Al-Ananbeh (2007), Al-Saleh and Na’amneh (2014), Al-Saleh and Aldarabseh (2017), Zamanzade and Mohammadi (2016), Al-Omari and Al-Saleh (2009), and Al-Saleh and Ababneh (2015).
BVRSS as introduced by Al-Saleh and Zheng (2002b) can sometimes be difficult to implement in practice. This paper introduces and investigates two modifications of BVRSS. The first modification is the subject of Section 2, and the second is the subject of Section 3. An actual data set is used for illustration. Conclusions and possible future work are given in Section 4.
Suppose (
Randomly draw a SRS of size from the population of interest; denote the elements of the chosen sample by
Randomly draw another SRS of size, and identify the second minimum with respect to
The process is continued until we identify the maximum with respect to
Randomly draw a SRS of size
Randomly draw another SRS of size
The process is continued until we identify the maximum w.r.t.
Steps 1–3 make up the first cycle while steps 4–6 make up the second cycle.
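The selection rule in the two cycles can be sketched as follows, using the pairs from the paper's numerical example table with set size 3. This is an illustration only: it assumes the first coordinate of each pair is the variable ranked in the first cycle and the second coordinate is the variable ranked in the second cycle, and the function name `select_cycle` is hypothetical.

```python
def select_cycle(sets, key_index):
    """From the i-th set, keep the pair whose ranking variable has rank i.

    key_index = 0 ranks on the first coordinate, 1 on the second.
    """
    chosen = []
    for i, units in enumerate(sets):
        ranked = sorted(units, key=lambda pair: pair[key_index])
        chosen.append(ranked[i])  # minimum, second minimum, ..., maximum
    return chosen


cycle1_sets = [
    [(50, 143), (81, 169), (72, 176)],
    [(77, 181), (67, 158), (64, 177)],
    [(43, 147), (62, 155), (55, 163)],
]
cycle2_sets = [
    [(52, 148), (83, 160), (49, 150)],
    [(85, 189), (75, 169), (71, 188)],
    [(59, 171), (48, 154), (76, 174)],
]

cycle1 = select_cycle(cycle1_sets, key_index=0)  # rank on first variable
cycle2 = select_cycle(cycle2_sets, key_index=1)  # rank on second variable
```

With these inputs the selected pairs agree with the last column of the numerical example table: (50, 143), (67, 158), (62, 155) in the first cycle and (52, 148), (71, 188), (76, 174) in the second.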
The elements of the set {
Assume that {
From Takahasi and Wakimoto (1968), we have
Also, from Takahasi and Wakimoto (1968), we have
Let
In this subsection, we investigate the performance of this MBVRSS method for estimating the mean of the uniform, exponential, and normal marginal distributions.
Let
The pdf of the
The values of eff(
Let
The pdf of the
If
The above integral can be evaluated for each value of
Thus, the efficiency of
For example, if
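For the exponential case, the efficiency of the usual RSS under perfect ranking can be computed exactly from the known variances of exponential order statistics. The sketch below is an illustration under that assumption; it uses a unit scale parameter (the ratio does not depend on the scale), and the symbol `m` for the set size and the function names are illustrative.

```python
from fractions import Fraction


def exp_order_stat_var(i, m):
    """Var of the i-th order statistic of m iid Exp(1) values.

    Uses the spacing representation: the variance is the sum of
    1 / (m - k + 1)^2 for k = 1, ..., i.
    """
    return sum(Fraction(1, (m - k + 1) ** 2) for k in range(1, i + 1))


def exponential_rss_efficiency(m):
    """Var(SRS mean) / Var(RSS mean) for set size m, one cycle."""
    var_rss_mean = sum(exp_order_stat_var(i, m) for i in range(1, m + 1)) / m**2
    var_srs_mean = Fraction(1, m)  # Var of Exp(1) is 1
    return var_srs_mean / var_rss_mean


for m in (2, 3, 4):
    print(m, float(exponential_rss_efficiency(m)))
```

For set sizes 2, 3, and 4 this gives 4/3, 18/11, and 48/25, which round to the values 1.33, 1.64, and 1.92 that appear in the exponential efficiency table.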
Let
The pdf of the
Thus,
The value of efficiency of normal distribution does not depend on the parameters
The values of eff(
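For the normal case there is no simple closed form for general set size, but the usual RSS efficiency under perfect ranking can be obtained by numerical integration of the order-statistic densities. The following sketch uses only the Python standard library with a basic Simpson rule; the integration range, the grid size, the symbol `m`, and all function names are assumptions made for illustration.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def order_stat_pdf(z, i, m):
    """Density of the i-th order statistic of m iid N(0, 1) values."""
    c = i * math.comb(m, i)  # equals m! / ((i - 1)! (m - i)!)
    return c * Phi(z) ** (i - 1) * (1.0 - Phi(z)) ** (m - i) * phi(z)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def order_stat_var(i, m):
    mean = simpson(lambda z: z * order_stat_pdf(z, i, m), -10.0, 10.0)
    second = simpson(lambda z: z * z * order_stat_pdf(z, i, m), -10.0, 10.0)
    return second - mean**2


def normal_rss_efficiency(m):
    """Var(SRS mean) / Var(RSS mean) for set size m, one cycle."""
    return m / sum(order_stat_var(i, m) for i in range(1, m + 1))
```

Because the order statistics of a normal variable are a location-scale transform of the standard normal ones, the ratio does not involve the mean or the variance. For set size 2 the result agrees with the closed form 1/(1 - 1/π).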
In the previous subsection, the efficiency was investigated for some specific values of (
thus,
which is the efficiency of RSS of size
where eff is the efficiency of the usual RSS of set size
In this section, we consider another modification of BVRSS, MBVRSS2. In this modification, we rank with respect to the first variable and give the second variable the same rank (as a concomitant variable) in one cycle, and do the reverse in the other cycle. Let (
Randomly draw a SRS of size
Randomly draw another SRS of size
The process is continued until we identify the maximum with respect to
Randomly draw a SRS of size
Randomly draw a SRS of size
The process is continued until we identify the maximum with respect to
Steps 1–3 make up the first cycle while steps 4–6 make up the second cycle.
The elements of the set {
Consider (
Based on this result we can obtain the mean and the variance of the concomitant order statistic
The performance of this MBVRSS2 is investigated for two important distributions: the Bivariate Normal Distribution and Downton’s Bivariate Exponential Distribution.
Assume that a random vector (
It is well known that the marginals are normal and the conditionals are also normal. In particular:
Using the results of Yang (1977), we have
Thus,
Therefore,
also,
Let
then it can be shown easily that
Let
Table 6 gives the efficiency for
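The dependence of the efficiency on the correlation can be illustrated with a short calculation. The sketch below is not the paper's derivation. It assumes (i) the estimator of a marginal mean is the plain average of all 2m measured values of that variable, (ii) the two cycles are independent, and (iii) for the bivariate normal the variance of a concomitant equals σ²(1 - ρ²) + ρ²σ² times the variance of the corresponding standard normal order statistic. The function name and the argument `rss_eff` (the usual RSS efficiency for the same set size) are illustrative.

```python
import math


def mbvrss2_normal_efficiency(rho, rss_eff):
    """Efficiency relative to SRS of size 2m under the stated assumptions.

    All variances are expressed as multiples of sigma^2 / m.
    """
    # Cycle in which the variable itself is ranked: usual RSS variance.
    direct = 1.0 / rss_eff
    # Cycle in which the variable is a concomitant of the other variable.
    concomitant = 1.0 - rho**2 + rho**2 / rss_eff
    # SRS of size 2m has variance sigma^2 / (2m); the combined mean has
    # variance (direct + concomitant) * sigma^2 / (4m).
    return 2.0 / (direct + concomitant)


# Set size 2: Var of each standard normal order statistic is 1 - 1/pi.
rss_eff_m2 = 1.0 / (1.0 - 1.0 / math.pi)
for rho in (0.0, 0.2, 0.4, 0.6, 0.8):
    print(rho, mbvrss2_normal_efficiency(rho, rss_eff_m2))
```

For set size 2 these values agree with the first row of Table 6 to within about 0.01, and at |ρ| = 1 the expression reduces to the usual RSS efficiency, as expected when the concomitant ranking is perfect.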
This distribution, denoted by DBE(
where,
The marginal distributions of
Note that unlike the bivariate normal distribution, the conditional variance depends on the given value of
Thus,
therefore
also,
Thus,
From Table 7, the efficiency is always larger than 1, increasing in
In the previous subsection, we investigated the efficiency of
Let {
where for each value of
Thus,
eff(
If
In this section, data on heights (
For
Bias(
For
eff(
In this paper, two modifications of BVRSS are introduced that are more convenient to apply in practice. The first approach is based on ranking one variable and noting the exact rank of the other variable in one cycle, and doing the reverse in the other cycle. The second approach is based on ranking the first variable in the first cycle and giving the second variable the same rank (concomitant order statistic), and doing the reverse in the second cycle. The two approaches were investigated in general and for some well-known distributions, and a real data set was used for illustration. The suggested approaches are practical and can be more efficient than SRS. They can also be applied to other modifications of RSS, such as moving extreme ranked set sampling; this is a direction for future investigation. If the underlying distribution is known, parametric inference can be carried out and the MLE obtained, and the information in the chosen sample can be measured using the Fisher information number.
We sincerely thank the referees for their careful reading of the paper and for comments that significantly improved the original version.
A numerical example of the modification with
| | ( | ( | ( | Steps 1, 2, 3 |
|---|---|---|---|---|
| Cycle 1 | (50, 143) | (81, 169) | (72, 176) → | (50, 143) |
| | (77, 181) | (67, 158) | (64, 177) → | (67, 158) |
| | (43, 147) | (62, 155) | (55, 163) → | (62, 155) |
| | ( | ( | ( | Steps 4, 5, 6 |
| Cycle 2 | (52, 148) | (83, 160) | (49, 150) → | (52, 148) |
| | (85, 189) | (75, 169) | (71, 188) → | (71, 188) |
| | (59, 171) | (48, 154) | (76, 174) → | (76, 174) |
The eff(
Efficiency | Efficiency | Efficiency | |||
---|---|---|---|---|---|
1.00 | 0.63 | 0.40 | |||
1.50 | 2.00 | 2.50 | |||
1.00 | 1.54 | 2.17 | |||
0.63 | 0.40 |
The eff(
Efficiency | Efficiency | Efficiency | |||
---|---|---|---|---|---|
1.33 | 0.97 | 0.71 | |||
1.33 | 1.64 | 1.92 | |||
0.80 | 1.44 | 1.71 | |||
0.49 | 0.30 |
The eff(
Efficiency | Efficiency | Efficiency | |||
---|---|---|---|---|---|
1.00 | 0.62 | 0.30 | |||
1.47 | 1.91 | 2.35 | |||
1.00 | 1.60 | 1.94 | |||
0.62 | 0.39 |
The efficiency in general with the best and worst cases eff(
| | 2 | 3 | 4 | 5 | 6 | General |
|---|---|---|---|---|---|---|
| Best case | 1.50 | 2.00 | 2.50 | 3.00 | 3.50 | ( |
| Worst case | 1.20 | 1.33 | 1.43 | 1.50 | 1.56 | (2 |
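The tabulated best-case and worst-case values are numerically consistent with two simple closed forms in the set size. This is an observation from the numbers in the table, not a formula recovered from the text; the symbol `m` for the set size and both function names are assumptions.

```python
from fractions import Fraction


def best_case(m):
    # Candidate closed form for the best-case row (assumption).
    return Fraction(m + 1, 2)


def worst_case(m):
    # Candidate closed form for the worst-case row (assumption).
    return Fraction(2 * m + 2, m + 3)


# (best, worst) values as printed in the table, keyed by set size.
tabulated = {
    2: (1.5, 1.2),
    3: (2.00, 1.33),
    4: (2.50, 1.43),
    5: (3.0, 1.5),
    6: (3.50, 1.56),
}

for m, (best, worst) in tabulated.items():
    print(m, float(best_case(m)), best, round(float(worst_case(m)), 2), worst)
```

Under this reading, the worst-case value equals 2/(1 + 1/b), where b is the best-case value for the same set size. That is the form obtained when one cycle attains the best-case efficiency and the other does no better than SRS.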
The eff(
| | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | |
|---|---|---|---|---|---|---|
| 2 | 1.19 | 1.20 | 1.23 | 1.28 | 1.36 | 1.47 |
| 3 | 1.31 | 1.33 | 1.38 | 1.48 | 1.64 | 1.92 |
| 4 | 1.40 | 1.46 | 1.50 | 1.64 | 1.90 | 2.35 |
The eff(
| | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | |
|---|---|---|---|---|---|---|
| 2 | 1.14 | 1.15 | 1.17 | 1.20 | 1.26 | 1.33 |
| 3 | 1.25 | 1.30 | 1.36 | 2.04 | 1.47 | 1.64 |
| 4 | 1.32 | 1.33 | 1.38 | 1.48 | 1.64 | 1.92 |
The summary of the resulted descriptive statistics for
Variable↓ | Average | Standard deviation | Max | Min | |
---|---|---|---|---|---|
1083 | 23.054 | 2.032 | 33.175 | 16.567 | |
1083 | 21.655 | 0.982 | 25.033 | 17.467 |
The summary of the resulted descriptive statistics for
Variable↓ | Average | Standard deviation | Max | Min | |
---|---|---|---|---|---|
1083 | 23.047 | 1.411 | 29.775 | 18.090 | |
1083 | 21.647 | 0.680 | 24.010 | 18.890 |