Partial optional randomized response technique with calibration weighting to adjust non-response in successive sampling
Communications for Statistical Applications and Methods 2021;28:493-510
Published online September 30, 2021
© 2021 Korean Statistical Society.

Kumari Priyanka (1, a), Pidugu Trisandhya (b), Ajay Kumar (b)

(a) Department of Mathematics, Shivaji College (University of Delhi), India
(b) Department of Mathematics, University of Delhi, India
Correspondence to: (1) Department of Mathematics, Shivaji College (University of Delhi), New Delhi 110 027, India. priyanka.ism@gmail.com
Received March 19, 2021; Revised May 4, 2021; Accepted June 8, 2021.
 Abstract
The present article develops a partial optional randomized response technique (PORT) to deal with sensitive issues in the presence of non-response in successive sampling. Calibration techniques are embedded in PORT to estimate the sensitive population mean at the current move in two-move successive sampling in the presence of non-response. Optimum calibration weights are computed at each move with the aid of constraints based on auxiliary information. Detailed properties of the proposed estimators are discussed. The possible cases in which non-response may creep in at the two moves are explored. The proposed technique is compared with a modified existing technique. Simulation results indicate that the proposed technique is more efficient than the modified existing one. Suitable recommendations are put forward.
Keywords : partial optional randomized-response technique, sensitive variable, successive moves, calibration estimators, non-response, population mean, auxiliary information
1. Introduction

In many social surveys, people often do not respond truthfully when socially sensitive questions are asked. To account for this behaviour, statisticians have introduced various techniques to reduce non-reporting, under-reporting and over-reporting.

The sensitive issues may concern medical or legal malpractice, drug addiction, criminal conviction, induced abortion, acid attacks, domestic violence, etc. Two widely practised indirect questioning techniques for such issues are the randomized response technique (RRT) and the scrambled response technique (SRT). Both protect the privacy of respondents and mask the sensitive response, thereby motivating respondents to give accurate responses.

Warner (1965) was the first to provide such a randomizing model, and sizeable literature was subsequently added by Horvitz et al. (1967), Greenberg et al. (1971), Christofides (2003), Kim and Elam (2007), Wu et al. (2008), Yan et al. (2009), Arnab (2011), Diana and Perri (2011) and Arnab et al. (2012). The SRT was initiated by Pollock and Bek (1976). Further, Eichhorn and Hayre (1983), Saha (2007), Diana and Perri (2010, 2011), Perri and Diana (2013) and Hussain and Al-Zahrani (2016) added substantive literature in this area. In RRT or SRT every respondent needs to provide a randomized or scrambled response. However, an issue that is sensitive to one respondent may not be sensitive to another. To address this, the optional randomized response technique (ORT) may be used. ORT can be classified into two categories: the full optional randomized response technique (FORT) and the partial optional randomized response technique (PORT). The ORT is more efficient than the compulsory randomized response technique (CRT) because the probability of obtaining true responses in ORT is much higher than in the CRT (Arnab, 2004). PORT was discussed by many authors such as Mangat and Singh (1994), Gupta et al. (2002, 2013), Pal (2008), Chaudhuri and Dihidar (2009), and Sanaullah et al. (2020).

A sensitive variable is often dynamic over time. In such a situation a one-point survey may not be sufficient, and continuous monitoring of the sensitive variable may be required. The dynamics of such sensitive variables may be studied using successive sampling. Arnab and Singh (2013), Yu et al. (2015), Naeem and Shabbir (2016), Singh et al. (2017), Priyanka et al. (2018), Priyanka and Trisandhya (2019a, 2019b), Priyanka et al. (2019) and Singh et al. (2018) contributed rich literature on sensitive variables in successive sampling. These researchers used a simple random sampling design in successive sampling and used either RRT or SRT to deal with sensitive issues. However, no attempt has been made to estimate the sensitive population mean using PORT in successive sampling.

Generally, all sample surveys are affected by the problem of non-response. When the issues are sensitive, non-response is more likely to occur and can severely affect the validity and generalizability of the results. Non-response is generally of two types, namely unit non-response and item non-response. In unit non-response, the sampled unit fails to respond at all, whereas in item non-response, the sampled unit responds to the survey but fails to respond to a particular question.

Hence, before proceeding with any method, the kind of non-response creeping into the survey must be identified and a suitable measure must be devised.

Therefore, in this paper an attempt has been made to estimate the sensitive population mean at the current move using PORT with calibration weighting to adjust for unit non-response in two-move successive sampling. A new model for PORT has been proposed, and the existing model by Sanaullah et al. (2020) has been modified to work as a PORT model for successive sampling. The paper is structured as follows. In Section 2, the PORT models on two successive moves are presented along with the calibration estimators, in the presence of unit non-response at both moves, for the coded response variable at the current move. In Section 3, the asymptotic behaviour of the estimators developed in Section 2 is studied. Section 4 is devoted to the study under simple random sampling without replacement. In Section 5, the possible cases in which non-response may creep in are discussed. Section 6 presents the corresponding estimators for the sensitive population mean along with expressions for their variances. Section 7 is dedicated to a simulation study that compares the precision of the proposed estimators with the usual modified estimator and compares the two PORT models. Finally, Section 8 is devoted to the discussion of results and the conclusion.

2. Survey design and development of estimator

2.1. Formulation and notation

A finite population $U = (U_1, U_2, \ldots, U_N)$ of $N$ identifiable units is considered for sampling over two moves. The size of the population is constant, while the values of the units change over the moves. Let the sensitive study character be denoted by $y_1$ at the first move and $y_2$ at the second move. It is assumed that information on a non-sensitive auxiliary variable $x$, whose population mean is known and stable over the moves, is readily available at both moves. In two-move successive sampling, we focus on the probability design used to draw the relevant samples at the different moves under unit non-response for the estimation of the population mean of the sensitive variable. At the first move, a sample $s_n$ of size $n$ is drawn with the design $d_1$ having probability $p_{d1}(s_n)$. We assume that $s_{r1}$ of size $r_1$ is the response set from $s_n$, $s_{r1} \subset s_n \subset U$, having probability $p_{d1}(s_{r1})$. A random sub-sample $s_m$ of $m$ units from $s_{r1}$ (the set of responding units at the first move) is retained with design $d_2$ having probability $p_{d2}(s_m)$ to be used at the current move. Next, an unmatched sample $s_u$ of size $u$ is drawn afresh (without replacement) at the second move with the design $d_3$ having probability $p_{d3}(s_u)$. Let the response set $s_{r2}$ of size $r_2$ be obtained from $s_u$, $s_{r2} \subset s_u \subset (U - s_n)$, having probability $p_{d3}(s_{r2})$. Let $S_1$ and $S_2$ be two scrambling variables that will be used to code the response for the sensitive variables. It is assumed that the two scrambling variables are independent of each other and that their means and variances are known. Further, let $z_1$ ($z_2$) be the coded response variable corresponding to the sensitive variable $y_1$ ($y_2$) at the two moves. The first-order and second-order positive inclusion probabilities for the different samples at the two moves are shown in Table 1, and the sampling design basic weights are presented in Table 2.

Let the known means and variances of the scrambling variables be $E(S_1) = \bar{S}_1$, $E(S_2) = \bar{S}_2$, $V(S_1) = \sigma_{s1}^2$, $V(S_2) = \sigma_{s2}^2$.

Based on the considered sampling design, we intend to apply partial optional randomization technique (PORT) on successive moves to handle sensitivity of study variable.

2.2. Partial optional randomization technique on successive moves

Motivated by recent work of Sanaullah et al. (2020), we intend to modify their generalized randomised response model to be applicable for successive sampling under PORT. The model is modified as,

• PORT-I: Modified model (Sanaullah et al., 2020)

The response obtained from the jth respondent on first and second move are respectively given as,

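$$z_{1j} = \begin{cases} y_{1j}, & \text{with probability } p, \\ y_{1j} S_{1j} + a S_{2j}, & \text{with probability } (1-p), \end{cases} \tag{2.1}$$

$$z_{2j} = \begin{cases} y_{2j}, & \text{with probability } p, \\ y_{2j} S_{1j} + a S_{2j}, & \text{with probability } (1-p), \end{cases} \tag{2.2}$$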

where a ∈ (−∞, ∞) is a suitable constant chosen by the investigator (Himmelfarb and Edgell, 1980).

Taking expectation on both sides of equation (2.1) and equation (2.2) respectively, we get

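$$\bar{Z}_1 = p\bar{Y}_1 + (1-p)\left[\bar{Y}_1 \bar{S}_1 + a\bar{S}_2\right], \qquad \bar{Z}_2 = p\bar{Y}_2 + (1-p)\left[\bar{Y}_2 \bar{S}_1 + a\bar{S}_2\right].$$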

Therefore, the population mean of sensitive variable at current move is given as

$$\bar{Y}_2 = \frac{\bar{Z}_2 - (1-p)\,a\,\bar{S}_2}{p + (1-p)\bar{S}_1}, \tag{2.3}$$

such that

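$$\rho_{z_1 z_2} = \frac{p^2(\rho_{y_1 y_2}\sigma_{y_1}\sigma_{y_2}) + 2p(1-p)(\rho_{y_1 y_2}\sigma_{y_1}\sigma_{y_2}\bar{S}_1) + (1-p)^2\left[(\rho_{y_1 y_2}\sigma_{y_1}\sigma_{y_2})(\sigma_{s1}^2 + \bar{S}_1^2) + \bar{Y}_1\bar{Y}_2\sigma_{s1}^2\right]}{I_1 I_2},$$

$$\rho_{z_1 y_1} = \frac{(p + (1-p)\bar{S}_1)\sigma_{y_1}}{I_1}, \qquad \rho_{z_2 y_2} = \frac{(p + (1-p)\bar{S}_1)\sigma_{y_2}}{I_2},$$

$$I_1 = p^2\sigma_{y_1}^2 + (1-p)^2\left[\sigma_{y_1}^2(\sigma_{s1}^2 + \bar{S}_1^2) + \sigma_{s1}^2\bar{Y}_1^2 + a^2\sigma_{s2}^2\right],$$

$$I_2 = p^2\sigma_{y_2}^2 + (1-p)^2\left[\sigma_{y_2}^2(\sigma_{s1}^2 + \bar{S}_1^2) + \sigma_{s1}^2\bar{Y}_2^2 + a^2\sigma_{s2}^2\right].$$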
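As a rough numerical illustration of the PORT-I model, the following Python sketch simulates coded responses according to equations (2.1) and (2.2) and recovers the sensitive mean through equation (2.3). The function names, the toy values of y, and the choices p = 0.7, a = −10 and scrambling means equal to 1 with unit variances are assumptions made only for this example; they are not the settings of the paper's simulation study.

```python
import random

def port1_response(y, p, a, s1_mean, s2_mean, rng):
    """Coded response z for a true value y under the PORT-I model."""
    if rng.random() < p:
        return y  # respondent reports the true value
    s1 = rng.gauss(s1_mean, 1.0)  # scrambling variable S1 (unit variance assumed)
    s2 = rng.gauss(s2_mean, 1.0)  # scrambling variable S2 (unit variance assumed)
    return y * s1 + a * s2

def port1_mean(z_bar, p, a, s1_mean, s2_mean):
    """Solve Z-bar = p*Y-bar + (1-p)*(Y-bar*S1-bar + a*S2-bar) for Y-bar."""
    return (z_bar - (1 - p) * a * s2_mean) / (p + (1 - p) * s1_mean)

rng = random.Random(2021)
y_values = [12.0, 15.0, 20.0, 18.0, 25.0]  # hypothetical sensitive values
p, a, s1_mean, s2_mean = 0.7, -10.0, 1.0, 1.0
true_mean = sum(y_values) / len(y_values)

# Draw many coded responses, cycling through the hypothetical population.
n_draws = 200_000
z = [port1_response(y_values[k % len(y_values)], p, a, s1_mean, s2_mean, rng)
     for k in range(n_draws)]
z_bar = sum(z) / n_draws
estimate = port1_mean(z_bar, p, a, s1_mean, s2_mean)
print(true_mean, round(estimate, 2))
```

With a large number of draws the recovered value should lie close to the true mean, which is what unbiasedness of the coded mean implies.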
• PORT-II: Proposed model

The modified Sanaullah et al. (2020) model involves a constant that must be chosen by the investigator, and this choice may introduce an added source of arbitrariness into the model. Hence, we propose a model for PORT on successive moves which does not depend on an arbitrarily chosen constant. The responses received from the jth respondent under the proposed PORT model at the first and second moves are respectively given as

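$$z_{1j} = \begin{cases} y_{1j}, & \text{with probability } p, \\ \dfrac{y_{1j} S_{1j} + S_{2j} - \bar{S}_2}{\bar{S}_1}, & \text{with probability } (1-p), \end{cases} \tag{2.4}$$

$$z_{2j} = \begin{cases} y_{2j}, & \text{with probability } p, \\ \dfrac{y_{2j} S_{1j} + S_{2j} - \bar{S}_2}{\bar{S}_1}, & \text{with probability } (1-p). \end{cases} \tag{2.5}$$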

On taking expectation on both sides of equation (2.4), we observe that

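$$E(z_{1j}) = pE[y_{1j}] + (1-p)E\left[\frac{y_{1j} S_{1j} + S_{2j} - \bar{S}_2}{\bar{S}_1}\right] = p\bar{Y}_1 + (1-p)\left[\bar{Y}_1 + \frac{\bar{S}_2}{\bar{S}_1} - \frac{\bar{S}_2}{\bar{S}_1}\right],$$

$$\bar{Z}_1 = \bar{Y}_1.$$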

Similarly, taking expectations on both the sides of equation (2.5) we get the population mean of sensitive variable at current move as

$$\bar{Y}_2 = \bar{Z}_2, \tag{2.6}$$

such that

ρz1z2=p2(ρy1y2σy1σy2)+p(1-p)[2ρy1y2σy1σy2+Y¯1Y¯2]+(1-p)2[I3]I4I5I3=(ρy1y2σy1σy2+Y¯1Y¯)(σs12+S¯12)+σs22+S¯22+σs22/nσs12/n+S¯12+S¯2Y¯2-Y¯1Y¯2,ρz1y1=pσy12+(1-p)σy12σy1I4,ρz2y2=pσy22+(1-p)σy22σy2I5,I4=p2σy12+(1-p)2[(σs12+S¯12)(σy12+Y¯12)σs12/n+S¯12-Y¯12+σs22+S¯22σs12/n+S¯12-σs22/n+S¯22σs12/n+S¯12]+2p(1-p)σy12,I5=p2σy22+(1-p)2[(σs12+S¯12)(σy22+Y¯22)σs12/n+S¯12-Y¯22+σs22+S¯22σs12/n+S¯12-σs22/n+S¯22σs12/n+S¯12]+2p(1-p)σy22.
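The unbiasedness of the coded response under the proposed PORT-II model, equations (2.4) and (2.5), can be checked numerically with a short Python sketch. It is only an illustration: the function name, the toy values of y and the scrambling parameters (mean 2 for S1, mean 1 for S2, unit variances) are assumptions for the example, and the sketch requires a non-zero mean for S1 because the model divides by it.

```python
import random

def port2_response(y, p, s1_mean, s2_mean, rng):
    """Coded response z for a true value y under the PORT-II model."""
    if rng.random() < p:
        return y  # respondent reports the true value
    s1 = rng.gauss(s1_mean, 1.0)  # scrambling variable S1 (unit variance assumed)
    s2 = rng.gauss(s2_mean, 1.0)  # scrambling variable S2 (unit variance assumed)
    # Scrambled response, centred and rescaled so that its mean equals y.
    return (y * s1 + s2 - s2_mean) / s1_mean

rng = random.Random(7)
y_values = [12.0, 15.0, 20.0, 18.0, 25.0]  # hypothetical sensitive values
p, s1_mean, s2_mean = 0.7, 2.0, 1.0
true_mean = sum(y_values) / len(y_values)

n_draws = 200_000
z = [port2_response(y_values[k % len(y_values)], p, s1_mean, s2_mean, rng)
     for k in range(n_draws)]
z_bar = sum(z) / n_draws  # under PORT-II the coded mean estimates Y-bar directly
print(true_mean, round(z_bar, 2))
```

No inversion step is needed here, in contrast to PORT-I, because the mean of the coded variable already equals the mean of the sensitive variable.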
Remark 1

The mean and variance of the sensitive variable at the current move in two-move successive sampling are obtained in terms of the mean of the coded response variable. Efficient estimators of the coded response variable therefore need to be investigated so that the estimate of the sensitive variable improves and becomes more effective. Hence, in the next section we investigate suitable estimators for the coded response variable in the presence of non-response at both moves.

Remark 2

Since the study character is sensitive in nature, there will always be scope for some non-response, however hard the investigator tries. To deal with non-response, a calibration technique applied over successive moves may be a good alternative. The calibration technique becomes more effective if auxiliary information is available, and in successive sampling the information from the previous move may also be used as auxiliary information at the current move, together with an additional auxiliary variable. Hence, in the next section a weighted calibration estimator for the coded response variable is proposed to adjust for the effect of non-response.

2.3. Calibration estimator for coded response variable

Deville and Särndal (1992) introduced the calibration technique in survey sampling, and Lundström and Särndal (1999) showed it to be an efficient technique for adjusting for non-response. Therefore, the calibration technique is used here to adjust for non-response with the aid of an additional auxiliary variable in order to estimate the coded response variable, which is in turn used to estimate the population mean of the sensitive variable.

2.3.1. Calibration estimator based on fresh sample

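Let the basic design weight $\beta_i^*$ be replaced by the new weight $w_{r2i}$. Hence, for the estimation of the sensitive population mean at the current move, the proposed calibration estimator $\hat{T}_{cu}^{NR}$ based on the fresh sample of size $u$ is given as

$$\hat{T}_{cu}^{NR} = \frac{1}{N}\sum_{i \in s_{r2}} w_{r2i}\, z_{2i}. \tag{2.7}$$

In order to obtain the calibrated weight $w_{r2i}$, we minimize the chi-square type function

$$\sum_{i \in s_{r2}} \frac{(w_{r2i} - \beta_i^*)^2}{q_{ui}\,\beta_i^*}, \tag{2.8}$$

subject to the calibration constraint

$$\frac{1}{N}\sum_{i \in s_{r2}} w_{r2i}\, x_i = \bar{X}, \tag{2.9}$$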

with qui being known positive constant unrelated to βi*, xi and . This results in to calibrated weights given by

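$$w_{r2i} = \beta_i^* + \beta_i^* q_{ui} N\left[\frac{\left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r2}} \beta_i^* x_i\right) x_i}{\sum_{i \in s_{r2}} q_{ui}\,\beta_i^*\, x_i^2}\right]. \tag{2.10}$$

After substituting the calibrated weights $w_{r2i}$ in equation (2.7), we obtain the final calibrated estimator $\hat{T}_{cu}^{NR}$ as

$$\hat{T}_{cu}^{NR} = \left[\frac{1}{N}\sum_{i \in s_{r2}} \beta_i^* z_{2i} + \hat{b}_u\left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r2}} \beta_i^* x_i\right)\right], \tag{2.11}$$

with

$$\hat{b}_u = \left(\sum_{i \in s_{r2}} \beta_i^* q_{ui}\, x_i^2\right)^{-1}\left(\sum_{i \in s_{r2}} \beta_i^* q_{ui}\, x_i z_{2i}\right),$$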

where qui being known positive constant unrelated to βi* and .
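The single-constraint calibration of this subsection can be sketched in a few lines of Python. The sketch below computes the calibrated weights from the design weights, checks that the calibration constraint on the auxiliary variable holds, and confirms that the weighted estimator coincides with the regression-type form in equation (2.11). All numbers (a population of N = 10 with four respondents) and the function names are hypothetical and serve only to illustrate the algebra.

```python
def calibrated_weights(beta, x, q, N, x_bar_pop):
    """Chi-square calibration of design weights beta under one constraint on x."""
    ht_x = sum(b * xi for b, xi in zip(beta, x)) / N          # design-weighted mean of x
    denom = sum(qi * b * xi ** 2 for qi, b, xi in zip(q, beta, x))
    factor = N * (x_bar_pop - ht_x) / denom
    return [b + b * qi * xi * factor for b, qi, xi in zip(beta, q, x)]

def weighted_mean(w, values, N):
    """(1/N) * sum of w_i * value_i over the responding units."""
    return sum(wi * vi for wi, vi in zip(w, values)) / N

# Hypothetical data: N = 10 units, 4 respondents in the fresh sample.
N = 10
beta = [2.5, 2.5, 2.5, 2.5]      # basic design weights
x = [3.0, 5.0, 4.0, 6.0]         # auxiliary variable
z2 = [7.0, 11.0, 9.0, 14.0]      # coded responses at the current move
q = [1.0, 1.0, 1.0, 1.0]         # q_ui = 1 for every unit
x_bar_pop = 5.0                  # known population mean of x

w = calibrated_weights(beta, x, q, N, x_bar_pop)
t_weighted = weighted_mean(w, z2, N)

# Regression-type form of the same estimator.
b_u = (sum(b * qi * xi * zi for b, qi, xi, zi in zip(beta, q, x, z2))
       / sum(b * qi * xi ** 2 for b, qi, xi in zip(beta, q, x)))
t_regression = weighted_mean(beta, z2, N) + b_u * (x_bar_pop - weighted_mean(beta, x, N))
print(round(t_weighted, 4), round(t_regression, 4))
```

The two printed values agree, since substituting the calibrated weights into the weighted sum reproduces the regression adjustment term exactly.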

2.3.2. Calibration estimator based on matched sample

In successive sampling, to improve the performance of the estimators at the current move, it is common practice to use the information collected at the first move as auxiliary information, in addition to any available non-sensitive auxiliary variable. The calibration estimator in the presence of non-response, based on the sample of size $m$ at the current (second) move with the new calibrated weights, is proposed as

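$$\hat{T}_{cm}^{NR} = \frac{1}{N}\sum_{i \in s_m} w_{mi}\, z_{2i}. \tag{2.12}$$

To find the calibrated weight $w_{mi}$, we minimize the chi-square function

$$\sum_{i \in s_m} \frac{(w_{mi} - \alpha_i^*)^2}{q_{mi}\,\alpha_i^*}, \tag{2.13}$$

subject to the calibration constraints

$$\frac{1}{N}\sum_{i \in s_m} w_{mi}\, z_{1i} = \bar{z}_{1n}^{*NR}, \tag{2.14}$$

$$\frac{1}{N}\sum_{i \in s_m} w_{mi}\, x_i = \bar{X}, \tag{2.15}$$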

with qmi as known positive constant unrelated to αi* and . Following similar procedure, we obtain the calibrated estimator for z¯1n*NR based on sample sr1, the response set obtained from sn drawn at first move as

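$$\bar{z}_{1n}^{*NR} = \left[\frac{1}{N}\sum_{i \in s_{r1}} \alpha_{1i} z_{1i} + \left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r1}} \alpha_{1i} x_i\right)\hat{b}_n\right], \tag{2.16}$$

with

$$\hat{b}_n = \left(\sum_{i \in s_{r1}} \alpha_{1i} q_{ni}\, x_i^2\right)^{-1}\left(\sum_{i \in s_{r1}} \alpha_{1i} q_{ni}\, x_i z_{1i}\right),$$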

where qni is known positive constant unrelated to α1i and .

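Now, minimizing the chi-square function in equation (2.13) subject to the constraints in equations (2.14) and (2.15) leads to the calibrated weights given by

$$w_{mi} = \alpha_i^* + \Bigg[\frac{1}{N} x_i \alpha_i^* q_{mi}\left\{\left(\bar{X} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* x_i\right)\left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i}^2\right) - \left(\bar{z}_{1n}^{*NR} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* z_{1i}\right)\left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} x_i\right)\right\}$$
$$\qquad + \frac{1}{N} z_{1i} q_{mi}\alpha_i^*\left\{\left(\bar{z}_{1n}^{*NR} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* z_{1i}\right)\left(\sum_{i \in s_m}\alpha_i^* q_{mi} x_i^2\right) - \left(\bar{X} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* x_i\right)\left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} x_i\right)\right\}\Bigg]\hat{b}_m, \tag{2.17}$$

where

$$\hat{b}_m = \left[\left(\sum_{i \in s_m}\alpha_i^* q_{mi} (z_{1i})^2\right)\left(\frac{1}{N^2}\sum_{i \in s_m}\alpha_i^* q_{mi} x_i^2\right) - \frac{1}{N^2}\left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} x_i\right)^2\right]^{-1}.$$

Substituting the calibrated weight $w_{mi}$ in equation (2.12), the final proposed calibrated estimator $\hat{T}_{cm}^{NR}$ based on the sample of size $m$ at the current move becomes

$$\hat{T}_{cm}^{NR} = \left[\frac{1}{N}\sum_{i \in s_m}\alpha_i^* z_{2i} + \hat{b}_{m1}\left(\bar{z}_{1n}^{*NR} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* z_{1i}\right) + \hat{b}_{m2}\left(\bar{X} - \frac{1}{N}\sum_{i \in s_m}\alpha_i^* x_i\right)\right], \tag{2.18}$$

with

$$\hat{b}_{m1} = \left(\frac{1}{N^2}\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} z_{2i}\right)\left[\left(\sum_{i \in s_m}\alpha_i^* q_{mi} x_i^2\right) - \left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} x_i\right)\right]\hat{b}_m,$$

$$\hat{b}_{m2} = \left(\frac{1}{N^2}\sum_{i \in s_m}\alpha_i^* q_{mi} x_i z_{2i}\right)\left[\left(\sum_{i \in s_m}\alpha_i^* q_{mi} (z_{1i})^2\right) - \left(\sum_{i \in s_m}\alpha_i^* q_{mi} z_{1i} x_i\right)\right]\hat{b}_m,$$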

qmi being known positive constant unrelated to αi* and z¯1n*NR.
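For the matched sample the chi-square distance is minimized under two constraints at once. A hedged way to see what this does is to solve the two Lagrange-multiplier equations directly, as in the Python sketch below, and then verify that both calibration constraints hold for the resulting weights. The sketch does not transcribe the closed-form expressions above; it solves the same constrained minimization numerically. The data, the target value for the first-move coded mean and the function name are hypothetical.

```python
def two_constraint_weights(alpha, z1, x, q, N, z1_target, x_bar_pop):
    """Calibrate alpha so that weighted means of z1 and x hit their targets.

    The minimizer of sum (w - alpha)^2 / (q * alpha) has the form
    w_i = alpha_i + alpha_i * q_i * (lam1 * z1_i + lam2 * x_i).
    """
    A = sum(a * qi * zi * zi for a, qi, zi in zip(alpha, q, z1))
    B = sum(a * qi * zi * xi for a, qi, zi, xi in zip(alpha, q, z1, x))
    C = sum(a * qi * xi * xi for a, qi, xi in zip(alpha, q, x))
    dz = z1_target - sum(a * zi for a, zi in zip(alpha, z1)) / N
    dx = x_bar_pop - sum(a * xi for a, xi in zip(alpha, x)) / N
    det = A * C - B * B              # must be non-zero (z1 and x not collinear)
    lam1 = N * (dz * C - dx * B) / det
    lam2 = N * (dx * A - dz * B) / det
    return [a + a * qi * (lam1 * zi + lam2 * xi)
            for a, qi, zi, xi in zip(alpha, q, z1, x)]

# Hypothetical matched sample of m = 4 units from a population of N = 10.
N = 10
alpha = [2.5, 2.5, 2.5, 2.5]     # basic design weights
z1 = [6.0, 9.0, 10.0, 11.0]      # coded responses at the first move
z2 = [7.0, 11.0, 9.0, 14.0]      # coded responses at the current move
x = [3.0, 5.0, 4.0, 6.0]         # auxiliary variable
q = [1.0, 1.0, 1.0, 1.0]
z1_target = 9.4                  # hypothetical calibrated first-move coded mean
x_bar_pop = 5.0                  # known population mean of x

w_m = two_constraint_weights(alpha, z1, x, q, N, z1_target, x_bar_pop)
t_cm = sum(wi * zi for wi, zi in zip(w_m, z2)) / N
print(round(t_cm, 4))
```

Solving the 2 x 2 system keeps the sketch short; with more auxiliary variables the same idea would require a general linear solver.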

2.3.3. Combined calibrated estimator

The final calibrated estimator in the presence of non-response at both moves is considered as a convex linear combination of the two calibrated estimators $\hat{T}_{cu}^{NR}$ and $\hat{T}_{cm}^{NR}$ and is given as

$$\hat{T}_{c}^{NR} = \varphi\,\hat{T}_{cu}^{NR} + (1-\varphi)\,\hat{T}_{cm}^{NR},$$

where $\hat{T}_{cu}^{NR}$ and $\hat{T}_{cm}^{NR}$ are given in equations (2.11) and (2.18) respectively and $\varphi \in [0, 1]$ is a scalar quantity to be chosen suitably.

3. Asymptotic variance

This section elaborates the asymptotic properties of the proposed calibration estimator $\hat{T}_{c}^{NR}$. Since $\hat{T}_{c}^{NR}$ depends on the estimators $\hat{T}_{cu}^{NR}$ and $\hat{T}_{cm}^{NR}$ given in equations (2.11) and (2.18) respectively, we first discuss the asymptotic properties of these two estimators. The results of Randles (1982) are used to derive the asymptotic variances of the estimators.

Proposition 1

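The asymptotic behaviour of the calibrated estimator $\hat{T}_{cu}^{NR}$ is the same as that of

$$\hat{T}_{cub}^{NR} = \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* z_{2i} + \left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* x_i\right) b, \tag{3.1}$$

with

$$b = \left(\sum_{i \in s_{r2}} q_{ui}\, x_i^2\right)^{-1}\left(\sum_{i \in s_{r2}} q_{ui}\, x_i z_{2i}\right). \tag{3.2}$$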
Proof

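Assume

$$\hat{T}_{cu}^{NR}(\gamma) = \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* z_{2i} + \left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* x_i\right)\gamma, \tag{3.3}$$

for any variable $\gamma$. Equation (3.3) shows that $\hat{T}_{cu}^{NR}(\gamma)$ is the calibration estimator $\hat{T}_{cu}^{NR}$ when $\hat{b}_u$ in equation (2.11) is replaced by $\gamma$. Therefore, the limiting mean of $\hat{T}_{cu}^{NR}(\gamma)$ when the actual parameter value is $b$, given in equation (3.2), can be written as

$$\mu(\gamma) = \lim_{u \to +\infty} E_b\left[\hat{T}_{cu}^{NR}(\gamma)\right] = \tilde{Z}_2, \tag{3.4}$$

where $\tilde{Z}_2$ is the limiting value of $\bar{Z}_2$ as $N \to \infty$. Using Randles (1982), the estimator $\hat{T}_{cu}^{NR}$ has the same asymptotic behaviour as the estimator in equation (3.1).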

Proposition 2

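The variance of the estimator $\hat{T}_{cub}^{NR}$ in equation (3.1) is given by

$$V(\hat{T}_{cub}^{NR}) = \left[\frac{1}{N^2}\sum_{i \in U}\sum_{j \in U}\Delta_{ij}\frac{z_{2i}}{(\pi_i)^c}\frac{z_{2j}}{(\pi_j)^c}\right] + E_1\left[\frac{1}{N^2}\sum_{i \in (s_n)^c}\sum_{j \in (s_n)^c}\frac{\Delta_{ij|(s_n)^c}}{(\pi_{1i})^c(\pi_{1j})^c}\frac{e_i}{\pi_{i|s_n^c}}\frac{e_j}{\pi_{j|s_n^c}}\right], \tag{3.5}$$

where $e_i = z_{2i} - x_i b$, $\Delta_{ij} = \pi_{ij} - \pi_i\pi_j$, $\Delta_{ij|(s_n)^c} = \pi_{ij|(s_n)^c} - \pi_{i|(s_n)^c}\pi_{j|(s_n)^c}$ and $E_1$ is the expectation under the design $d_1$.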

Proof

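Since the estimator $\hat{T}_{cub}^{NR}$ is unbiased, its variance is given by

$$V(\hat{T}_{cub}^{NR}) = V_1\left(E_3\left[\hat{T}_{cub}^{NR}\right]\right) + E_1\left[V_3\left(\hat{T}_{cub}^{NR}\right)\right], \tag{3.6}$$

where $E_1$ and $V_1$ are the expectation and variance under the design $d_1$ respectively, and $E_3$ and $V_3$ represent the conditional expectation and conditional variance under the design $d_3$ respectively. Here

$$V_1\left(E_3\left[\hat{T}_{cub}^{NR}\right]\right) = V_1\left(\frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* z_{2i}\right) = \frac{1}{N^2}\sum_{i \in U}\sum_{j \in U}\Delta_{ij}\frac{z_{2i}}{(\pi_i)^c}\frac{z_{2j}}{(\pi_j)^c}. \tag{3.7}$$

Now,

$$E_1\left[V_3\left(\hat{T}_{cub}^{NR}\right)\right] = E_1\left[V_3\left\{\left(\frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* z_{2i}\right) + \left(\bar{X} - \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* x_i\right)b\right\}\right] = E_1\left[V_3\left(\frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* z_{2i} - \frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* x_i b\right)\right]$$
$$= E_1\left[V_3\left(\frac{1}{N}\sum_{i \in s_{r2}}\beta_i^* e_i\right)\right] = E_1\left[\frac{1}{N^2}\sum_{i \in (s_n)^c}\sum_{j \in (s_n)^c}\frac{\Delta_{ij|(s_n)^c}}{(\pi_{1i})^c(\pi_{1j})^c}\frac{e_i}{\pi_{i|s_n^c}}\frac{e_j}{\pi_{j|(s_n)^c}}\right]. \tag{3.8}$$

Using equations (3.7) and (3.8) in equation (3.6), we get the expression for the variance as in equation (3.5).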

Remark 3

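From Proposition 1 and Proposition 2, the estimators $\hat{T}_{cu}^{NR}$ and $\hat{T}_{cm}^{NR}$ are asymptotically unbiased and their asymptotic variances are given as

$$V(\hat{T}_{cu}^{NR}) = \frac{1}{N^2}\left[\sum_{i \in U}\sum_{j \in U}\Delta_{ij}\frac{z_{2i}}{(\pi_i)^c}\frac{z_{2j}}{(\pi_j)^c}\right] + \frac{1}{N^2}E_1\left[\sum_{i \in (s_n)^c}\sum_{j \in (s_n)^c}\frac{\Delta_{ij|(s_n)^c}}{(\pi_{1i})^c(\pi_{1j})^c}\frac{e_i}{\pi_{i|(s_n)^c}}\frac{e_j}{\pi_{j|(s_n)^c}}\right]. \tag{3.9}$$

Similarly,

$$V(\hat{T}_{cm}^{NR}) = \frac{1}{N^2}\left[\sum_{i \in U}\sum_{j \in U}\Delta_{ij}\frac{z_{2i}}{(\pi_i)^c}\frac{z_{2j}}{(\pi_j)^c}\right] + \frac{1}{N^2}E_2\left[\sum_{i \in s_{r1}}\sum_{j \in s_{r1}}\frac{\Delta_{ij|s_{r1}}}{\pi_{1i}\pi_{1j}}\frac{e_i}{\pi_{i|s_{r1}}}\frac{e_j}{\pi_{j|s_{r1}}}\right], \tag{3.10}$$

where $\Delta_{ij} = \pi_{1ij} - \pi_{1i}\pi_{1j}$ and $E_2$ is the expectation under the design $d_2$.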

Proposition 3

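The asymptotic variance of the proposed calibration estimator is obtained as

$$V(\hat{T}_{c}^{NR}) = \varphi^2 V(\hat{T}_{cu}^{NR}) + (1-\varphi)^2 V(\hat{T}_{cm}^{NR}), \tag{3.11}$$

where $V(\hat{T}_{cu}^{NR})$ and $V(\hat{T}_{cm}^{NR})$ are given in equations (3.9) and (3.10) respectively.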

Proof

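The asymptotic variance of the calibration estimator $\hat{T}_{c}^{NR}$ is given by

$$V(\hat{T}_{c}^{NR}) = E\left[\hat{T}_{c}^{NR} - \bar{Z}_2\right]^2 = E\left[\varphi\hat{T}_{cu}^{NR} + (1-\varphi)\hat{T}_{cm}^{NR} - \bar{Z}_2\right]^2$$
$$= \varphi^2 V(\hat{T}_{cu}^{NR}) + (1-\varphi)^2 V(\hat{T}_{cm}^{NR}) + 2\varphi(1-\varphi)\,\mathrm{cov}(\hat{T}_{cu}^{NR}, \hat{T}_{cm}^{NR}). \tag{3.12}$$

The variances $V(\hat{T}_{cu}^{NR})$ and $V(\hat{T}_{cm}^{NR})$ are given in equations (3.9) and (3.10) respectively. Since the estimators $\hat{T}_{cu}^{NR}$ and $\hat{T}_{cm}^{NR}$ are based on two non-overlapping samples of sizes $u$ and $m$ respectively, $\mathrm{cov}(\hat{T}_{cu}^{NR}, \hat{T}_{cm}^{NR}) = 0$. Using these values in equation (3.12), we obtain the expression for the asymptotic variance of the calibrated estimator in the presence of non-response as in equation (3.11).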

Remark 4

From equation (3.11), it can be concluded that $V(\hat{T}_{c}^{NR})$ is a function of the unknown constant $\varphi$. Therefore, it is minimized with respect to $\varphi$, and the optimum value of $\varphi$ is obtained as

$$\varphi_{opt.} = \frac{V(\hat{T}_{cm}^{NR})}{V(\hat{T}_{cu}^{NR}) + V(\hat{T}_{cm}^{NR})}. \tag{3.13}$$

Substituting the value of $\varphi_{opt.}$ from equation (3.13) in equation (3.11), we get the optimum variance of the proposed estimator $\hat{T}_{c}^{NR}$ as

$$V(\hat{T}_{c}^{NR})_{opt.} = \frac{V(\hat{T}_{cu}^{NR}) \times V(\hat{T}_{cm}^{NR})}{V(\hat{T}_{cu}^{NR}) + V(\hat{T}_{cm}^{NR})}.$$
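The optimum weight and the resulting minimum variance can be verified with a small Python sketch. The two variance values used below are arbitrary illustrative numbers, and the function names are hypothetical; the sketch only checks that the closed-form optimum agrees with direct evaluation of equation (3.11).

```python
def optimal_phi(var_u, var_m):
    """Weight on the fresh-sample estimator that minimizes the combined variance."""
    return var_m / (var_u + var_m)

def combined_variance(phi, var_u, var_m):
    """Variance of phi*T_u + (1-phi)*T_m for uncorrelated T_u and T_m."""
    return phi ** 2 * var_u + (1 - phi) ** 2 * var_m

var_u, var_m = 4.0, 1.0                      # illustrative variances only
phi_opt = optimal_phi(var_u, var_m)          # 1 / (4 + 1) = 0.2
v_opt = combined_variance(phi_opt, var_u, var_m)
v_closed_form = var_u * var_m / (var_u + var_m)
print(phi_opt, v_opt, v_closed_form)
```

The more variable of the two component estimators receives the smaller weight, and the optimum variance is smaller than either component variance.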
4. Study under simple random sampling without replacement sampling design

In this section, the calibration estimator in the presence of non-response is considered for the simple random sampling without replacement (SRSWOR) design at both moves. The relevant inclusion probabilities are

$$\pi_{1i} = \frac{r_1}{N}, \qquad \pi_{1ij} = \frac{r_1(r_1-1)}{N(N-1)}.$$

Because the sample $s_n$ is drawn from $U$ by SRSWOR of size $n$, the complement $s_n^c = U - s_n$ is a simple random sample without replacement of size $N - n$. Therefore we have

$$\pi_{1i}^c = \frac{N-n}{N}, \qquad \pi_{1ij}^c = \frac{(N-n)(N-n-1)}{N(N-1)}.$$

Also, we suppose that the matched sample $s_m$ is drawn from $s_{r1}$ by SRSWOR of size $m$, so

$$\pi_{i|s_{r1}} = \frac{m}{r_1}, \qquad \pi_{ij|s_{r1}} = \frac{m(m-1)}{r_1(r_1-1)}.$$

Finally, the unmatched sample $s_u$ is drawn from $s_n^c$ by SRSWOR of size $u$. Thus, we have

$$\pi_{i|s_n^c} = \frac{r_2}{N-n}, \qquad \pi_{ij|s_n^c} = \frac{r_2(r_2-1)}{(N-n)(N-n-1)}.$$

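Now, based on the sample of size $r_2$ at the current move, the proposed calibration estimator $\hat{T}_{cu}^{NR}$ under the SRSWOR sampling design becomes

$$\hat{T}_{cu}^{NR}(s) = \left[\bar{z}_{2u} + \hat{b}_u(s)(\bar{X} - \bar{x}_u)\right], \tag{4.1}$$

with

$$\hat{b}_u(s) = \left(\sum_{i \in s_{r2}} q_{ui}\, x_i^2\right)^{-1}\left(\sum_{i \in s_{r2}} q_{ui}\, x_i z_{2i}\right).$$

Similarly, based on the sample of size $m$ at the current move, the proposed calibration estimator $\hat{T}_{cm}^{NR}$ under the SRSWOR scheme is obtained as

$$\hat{T}_{cm}^{NR}(s) = \left[\bar{z}_{2m} + \hat{b}_{m1}(s)\left(\bar{z}_{1n}^{*NR}(s) - \bar{z}_{1m}\right) + \hat{b}_{m2}(s)(\bar{X} - \bar{x}_m)\right], \tag{4.2}$$

with

$$\hat{b}_{m1}(s) = \left(\sum_{i \in s_m} q_{mi} z_{1i} z_{2i}\right)\left[\left(\sum_{i \in s_m} q_{mi} x_i^2\right) - \left(\sum_{i \in s_m} q_{mi} z_{1i} x_i\right)\right]\hat{b}_m(s),$$

$$\hat{b}_{m2}(s) = \left(\sum_{i \in s_m} q_{mi} x_i z_{2i}\right)\left[\left(\sum_{i \in s_m} q_{mi} (z_{1i})^2\right) - \left(\sum_{i \in s_m} q_{mi} z_{1i} x_i\right)\right]\hat{b}_m(s),$$

$$\hat{b}_m(s) = \left[\left(\sum_{i \in s_m} q_{mi} (z_{1i})^2\right)\left(\sum_{i \in s_m} q_{mi} x_i^2\right) - \left(\sum_{i \in s_m} q_{mi} z_{1i} x_i\right)^2\right]^{-1},$$

$$\bar{z}_{1n}^{*NR}(s) = \left[\bar{z}_{1n} + \hat{b}_n(s)(\bar{X} - \bar{x}_n)\right], \qquad \hat{b}_n(s) = \left(\sum_{i \in s_{r1}} q_{ni}\, x_i^2\right)^{-1}\left(\sum_{i \in s_{r1}} q_{ni}\, x_i z_{1i}\right).$$

Now, the estimator $\hat{T}_{c}^{NR}(s)$ becomes

$$\hat{T}_{c}^{NR}(s) = \varphi_s\,\hat{T}_{cu}^{NR}(s) + (1-\varphi_s)\,\hat{T}_{cm}^{NR}(s),$$

where $\hat{T}_{cu}^{NR}(s)$ and $\hat{T}_{cm}^{NR}(s)$ are given in equations (4.1) and (4.2) respectively and $\varphi_s \in [0, 1]$ is a scalar quantity to be chosen suitably.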

Remark 5

Further, if we assume $q_{ui} = q_{ni} = q_{mi} = 1$ in $\hat{T}_{c}^{NR}$, then the calibration estimator $\hat{T}_{c}^{NR}$ is denoted as $\hat{T}_{c}^{*NR}$.

Remark 6

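The proposed calibration estimator is compared with the general successive sampling estimator in the presence of non-response at both moves. For this purpose the general successive sampling estimator is modified for estimation of the coded response variable and is given as

$$\hat{T}_{g}^{NR} = \Psi_g\,\hat{T}_{gu}^{NR} + (1-\Psi_g)\,\hat{T}_{gm}^{NR}, \quad \text{where } \Psi_g \in [0, 1],$$

with

$$\hat{T}_{gu}^{NR} = \bar{z}_{2r_2}, \qquad \hat{T}_{gm}^{NR} = \bar{z}_{2m} + \beta_{z_2 z_1}(\bar{z}_{1r_1} - \bar{z}_{1m}), \qquad \beta_{z_2 z_1} = \frac{s_{z_2 z_1}}{s_{z_1}^2}.$$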
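The general successive sampling estimator above is a weighted combination of a fresh-sample mean and a regression-adjusted matched-sample mean. The Python sketch below illustrates the computation on toy data. It assumes that the sample covariance and variance in the regression coefficient are computed from the matched sample, which the text does not state explicitly; the function names and data are hypothetical.

```python
from statistics import mean

def regression_slope(z1, z2):
    """Sample regression coefficient of z2 on z1 (covariance over variance)."""
    m1, m2 = mean(z1), mean(z2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(z1, z2)) / (len(z1) - 1)
    s11 = sum((a - m1) ** 2 for a in z1) / (len(z1) - 1)
    return s12 / s11

def general_estimator(z2_fresh, z2_matched, z1_matched, z1_first, psi):
    """psi * fresh-sample mean + (1 - psi) * regression-adjusted matched mean."""
    t_u = mean(z2_fresh)
    slope = regression_slope(z1_matched, z2_matched)   # assumed: matched-sample moments
    t_m = mean(z2_matched) + slope * (mean(z1_first) - mean(z1_matched))
    return psi * t_u + (1 - psi) * t_m

# Toy coded responses (hypothetical).
z1_first = [1.0, 2.0, 3.0, 4.0, 5.0]   # respondents at the first move
z1_matched = [1.0, 2.0, 3.0]           # matched units, first-move values
z2_matched = [2.0, 4.0, 6.0]           # matched units, current-move values
z2_fresh = [5.0, 7.0]                  # respondents in the fresh sample

t_g = general_estimator(z2_fresh, z2_matched, z1_matched, z1_first, psi=0.5)
print(t_g)  # 6.0: slope is 2, so the matched part is 4 + 2*(3 - 2) = 6
```

In this toy case both component estimators equal 6, so the result does not depend on the weight; with real data the choice of the weight matters.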

5. Possible cases

Non-response may occur only at the current move, only at the previous move, or at neither move. In order to retain similar estimators in all these situations, the calibration technique is retained and the constraints are modified as the situation requires. The resulting calibration estimators are described in the following cases.

5.1. Case-I: when there is non-response only at current move

In this situation the proposed estimator $\hat{T}_{c}^{NR}$ of the coded response variable mean $\bar{Z}_2$ changes to

$$\hat{T}_{c1}^{NR} = \Psi_1\,\hat{T}_{cu}^{NR} + (1-\Psi_1)\,\hat{T}_{cm}; \quad \Psi_1 \in [0, 1],$$

where the estimator $\hat{T}_{cm}$ can be obtained by replacing $r_1$ by $n$ in equation (2.16) and in the other equations that depend on it, and $\hat{T}_{cu}^{NR}$ is defined in equation (2.11).

5.2. Case-II: when there is non-response only at first (previous) move

In the presence of non-response only at the first (previous) move, the estimator $\hat{T}_{c}^{NR}$ of the coded response variable mean $\bar{Z}_2$ changes to

$$\hat{T}_{c2}^{NR} = \Psi_2\,\hat{T}_{cu} + (1-\Psi_2)\,\hat{T}_{cm}^{NR}; \quad \Psi_2 \in [0, 1],$$

where the estimator $\hat{T}_{cu}$ can be obtained by replacing $r_2$ by $u$ in equation (2.11) and in the other equations that depend on it, and $\hat{T}_{cm}^{NR}$ is defined in equation (2.18).

5.3. Case-III: when there is no non-response at any move

When there is no non-response at any move, the estimator $\hat{T}_{c}^{NR}$ of the coded response variable mean $\bar{Z}_2$ changes to

$$\hat{T}_{c3} = \Psi_3\,\hat{T}_{cu} + (1-\Psi_3)\,\hat{T}_{cm}; \quad \Psi_3 \in [0, 1],$$

where the estimator $\hat{T}_{cu}$ can be obtained by replacing $r_2$ by $u$ in equation (2.11), and $\hat{T}_{cm}$ can be obtained by replacing $r_1$ by $n$ in equation (2.16) and in the other equations that depend on it, such as equation (2.18).

6. Estimators for sensitive population mean under PORT

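Replacing the population mean of the coded response variable $\bar{Z}_2$ in equations (2.3) and (2.6) by its estimators $\hat{T}_{c}^{*NR}$, $\hat{T}_{g}^{*NR}$, $\hat{T}_{c1}^{*NR}$, $\hat{T}_{c2}^{*NR}$ and $\hat{T}_{c3}$, the respective estimators for the sensitive population mean at the current move become $\hat{\bar{Y}}_{2c}$, $\hat{\bar{Y}}_{2g}$, $\hat{\bar{Y}}_{2c1}$, $\hat{\bar{Y}}_{2c2}$ and $\hat{\bar{Y}}_{2c3}$ under the PORT-I model and $\hat{\bar{Y}}_{2c}^{*}$, $\hat{\bar{Y}}_{2g}^{*}$, $\hat{\bar{Y}}_{2c1}^{*}$, $\hat{\bar{Y}}_{2c2}^{*}$ and $\hat{\bar{Y}}_{2c3}^{*}$ under the PORT-II model, which are presented in Table 3.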

Remark 7

For the models considered in equations (2.3) and (2.6), the two scrambling variables $S_1$ and $S_2$ used to perturb the true response through the PORT models may follow any distribution in two-move successive sampling. Following Pollock and Bek (1976) and Eichhorn and Hayre (1983), we take the scrambling variable $S_1$ to follow a normal distribution with mean 0 and variance 1, and the scrambling variable $S_2$ to follow a normal distribution with mean 1 and variance 1.

7. Simulation study

In this section, a simulation study is carried out to reveal the behaviour of the proposed estimators. For this purpose, a natural population has been taken from the Statistical Abstract of the United States. The population comprises N = 51 states and is described as follows: $y_1$: rate of abortion in 2007; $y_2$: rate of abortion in 2008; $x$: rate of abortion in 2005. As discussed in Remark 7, the scrambling variables are $S_1 \sim N(0, 1)$ and $S_2 \sim N(1, 1)$. The data for $S_1$ and $S_2$ were generated with MATLAB.

To judge the performance of both PORT models under the proposed calibration estimators for estimating the sensitive population mean in two-move successive sampling in the presence of non-response, we studied the behaviour of the estimators for different choices of the non-response rate at both moves. The simulation used 10,000 independent replications of the two-move successive sampling design in MATLAB. All samples were obtained by simple random sampling without replacement. Non-response was simulated by assuming non-response rates of 10%, 20% and 30% at both moves.

The entire simulation has been replicated for different values of n, m and u which are considered as different sets given as

Set-I: n = 20; m = 12; u = 8,
Set-II: n = 20; m = 10; u = 10.

The calibration estimators are compared with the general successive sampling estimator under both PORT models in terms of percent relative efficiency (PRE), given as

$$PRE_1 = \frac{MSE(\hat{\bar{Y}}_{2g})}{MSE(\hat{\bar{Y}}_{2c})} \times 100, \qquad PRE_2 = \frac{MSE(\hat{\bar{Y}}_{2g}^{*})}{MSE(\hat{\bar{Y}}_{2c}^{*})} \times 100,$$

where $MSE(\hat{\bar{Y}}_{2g}) = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\left[\hat{\bar{Y}}_{2g,i} - \bar{Y}_2\right]^2$. Similarly, $MSE(\hat{\bar{Y}}_{2g}^{*})$, $MSE(\hat{\bar{Y}}_{2c})$ and $MSE(\hat{\bar{Y}}_{2c}^{*})$ can be computed. The simulation results are presented in Tables 4 and 5 respectively. Further, in order to identify the better PORT model, the percent relative efficiency of the calibration estimator under PORT-I with respect to the calibration estimator under PORT-II has been computed as

$$PRE_3 = \frac{MSE(\hat{\bar{Y}}_{2c})}{MSE(\hat{\bar{Y}}_{2c}^{*})} \times 100,$$

where $MSE(\hat{\bar{Y}}_{2c}) = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\left[\hat{\bar{Y}}_{2c,i} - \bar{Y}_2\right]^2$ and, similarly, $MSE(\hat{\bar{Y}}_{2c}^{*})$ can be computed. The results are presented in Figures 1 and 2.
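The PRE computation from replicated estimates can be sketched as follows in Python. The replicate values are invented toy numbers (two replicates per estimator instead of 10,000) and the function names are hypothetical; the sketch only illustrates how the Monte Carlo mean squared errors enter the ratio.

```python
def mse(estimates, true_value):
    """Monte Carlo mean squared error of replicated estimates around the true mean."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def percent_relative_efficiency(reference_estimates, proposed_estimates, true_value):
    """PRE = 100 * MSE(reference) / MSE(proposed); values above 100 favour the proposed estimator."""
    return 100.0 * mse(reference_estimates, true_value) / mse(proposed_estimates, true_value)

true_mean = 2.0
general_reps = [1.0, 3.0]        # toy replicates of the general estimator
calibrated_reps = [1.5, 2.5]     # toy replicates of the calibration estimator

pre = percent_relative_efficiency(general_reps, calibrated_reps, true_mean)
print(pre)  # 400.0, since the MSEs are 1.0 and 0.25
```

The same function gives any of the three PRE measures, depending on which pair of estimators supplies the replicates.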

8. Discussion of results and conclusion

The following interpretations can be drawn from the simulation results presented in Tables 4 and 5 and in Figures 1 and 2.

  • (i) From Tables 4 and 5, it is observed that the proposed calibration estimator performs better than the general successive sampling estimator in the presence of non-response at both moves under both the PORT-I and PORT-II models. This shows that using the calibration technique to adjust for the effect of non-response is fruitful.

  • (ii) Figures 1 and 2 show that the calibration estimator under the proposed PORT-II model is better than the same calibration estimator under the PORT-I model for all considered choices of constants and non-response rates.

  • (iii) It is also observed from Figures 1 and 2 that, for a fixed value of p, PRE3 first increases up to Ψ = 0.5 and then decreases for Ψ > 0.5. This shows that if more weight is attached to the matched sample, then the PRE is higher than when more weight is attached to the unmatched (fresh) sample at the current move. These results are in accordance with the theory of successive sampling.

  • (iv) Figures 1 and 2 also show that higher percent relative efficiency is observed for larger values of p, i.e., PRE3 in general increases as p increases.

8.1. Conclusion

The estimation of the sensitive population mean at the current move in two-move successive sampling is feasible using PORT. The calibration technique applied to adjust for the effect of non-response proves fruitful under both models considered. The proposed model PORT-II turns out to be more efficient than the modified Sanaullah et al. (2020) model (PORT-I) on successive moves. Therefore, it is concluded that the proposed PORT model may be used for the estimation of the sensitive population mean at the current move in two-move successive sampling.

Figures
Fig. 1. PRE of the calibration estimator under PORT-I with respect to the calibration estimator under PORT-II for different choices of non-response rates and varying for a = −15.
Fig. 2. PRE of the calibration estimator under PORT-I with respect to the calibration estimator under PORT-II for different choices of non-response rates and varying for a = −19.
TABLES

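Table 1

Inclusion probabilities

| Size | First order inclusion probability | Second order inclusion probability |
|---|---|---|
| $r_1$ | $\pi_{1i} = \sum_{i \in s_{r1}} P_{d1}(s_{r1})$ | $\pi_{1ij} = \sum_{i,j \in s_{r1}} P_{d1}(s_{r1})$ |
| $m$ | $\pi_{i \mid s_{r1}} = \sum_{i \in s_m} P_{d2}(s_m)$ | $\pi_{ij \mid s_{r1}} = \sum_{i,j \in s_m} P_{d2}(s_m)$ |
| $r_2$ | $\pi_i = \sum_{i \in s_{r2}} P_{d3}(s_{r2})$ | $\pi_{ij} = \sum_{i,j \in s_{r2}} P_{d3}(s_{r2})$ |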

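Table 2

Sample design basic weights

| Response set | Size | Sampling design basic weight for selecting the ith unit in the corresponding sample |
|---|---|---|
| $s_{r1}$ | $r_1$ | $\alpha_{1i} = \dfrac{1}{\pi_{1i}}$ |
| $s_m$ | $m$ | $\alpha_i^* = \dfrac{1}{\pi_{1i}\,\pi_{i \mid s_{r1}}}$ |
| $s_{r2}$ | $r_2$ | $\beta_i^* = \dfrac{1}{(\pi_{1i})^c\,\pi_{i \mid s_n^c}}$ |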

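Table 3

Estimators of sensitive population mean and their variances

| Model | Estimator | Variance |
|---|---|---|
| PORT-I | $\hat{\bar{Y}}_{2c} = \dfrac{\hat{T}_{c}^{*NR} - (1-p)a\bar{S}_2}{p + (1-p)\bar{S}_1}$ | $V[\hat{\bar{Y}}_{2c}] = \dfrac{V(\hat{T}_{c}^{*NR})_{opt.}}{[p + (1-p)\bar{S}_1]^2}$ |
| | $\hat{\bar{Y}}_{2g} = \dfrac{\hat{T}_{g}^{*NR} - (1-p)a\bar{S}_2}{p + (1-p)\bar{S}_1}$ | $V[\hat{\bar{Y}}_{2g}] = \dfrac{V(\hat{T}_{g}^{*NR})_{opt.}}{[p + (1-p)\bar{S}_1]^2}$ |
| | $\hat{\bar{Y}}_{2c1} = \dfrac{\hat{T}_{c1}^{*NR} - (1-p)a\bar{S}_2}{p + (1-p)\bar{S}_1}$ | $V[\hat{\bar{Y}}_{2c1}] = \dfrac{V(\hat{T}_{c1}^{*NR})_{opt.}}{[p + (1-p)\bar{S}_1]^2}$ |
| | $\hat{\bar{Y}}_{2c2} = \dfrac{\hat{T}_{c2}^{*NR} - (1-p)a\bar{S}_2}{p + (1-p)\bar{S}_1}$ | $V[\hat{\bar{Y}}_{2c2}] = \dfrac{V(\hat{T}_{c2}^{*NR})_{opt.}}{[p + (1-p)\bar{S}_1]^2}$ |
| | $\hat{\bar{Y}}_{2c3} = \dfrac{\hat{T}_{c3} - (1-p)a\bar{S}_2}{p + (1-p)\bar{S}_1}$ | $V[\hat{\bar{Y}}_{2c3}] = \dfrac{V(\hat{T}_{c3}^{*NR})_{opt.}}{[p + (1-p)\bar{S}_1]^2}$ |
| PORT-II | $\hat{\bar{Y}}_{2c}^{*} = \hat{T}_{c}^{*NR}$ | $V[\hat{\bar{Y}}_{2c}^{*}] = V(\hat{T}_{c}^{*NR})_{opt.}$ |
| | $\hat{\bar{Y}}_{2g}^{*} = \hat{T}_{g}^{*NR}$ | $V[\hat{\bar{Y}}_{2g}^{*}] = V(\hat{T}_{g}^{NR})_{opt.}$ |
| | $\hat{\bar{Y}}_{2c1}^{*} = \hat{T}_{c1}^{NR}$ | $V[\hat{\bar{Y}}_{2c1}^{*}] = V(\hat{T}_{c1}^{NR})_{opt.}$ |
| | $\hat{\bar{Y}}_{2c2}^{*} = \hat{T}_{c2}^{NR}$ | $V[\hat{\bar{Y}}_{2c2}^{*}] = V(\hat{T}_{c2}^{NR})_{opt.}$ |
| | $\hat{\bar{Y}}_{2c3}^{*} = \hat{T}_{c3}$ | $V[\hat{\bar{Y}}_{2c3}^{*}] = V(\hat{T}_{c3})_{opt.}$ |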

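Table 4

Percent relative efficiency of $\hat{\bar{Y}}_{2g}$ with respect to $\hat{\bar{Y}}_{2c}$ for different sets and different choices of non-response (NR) rates under PORT-I where Ψ ∈ {φg}

All entries are PRE1. Column headers give the set, the NR rate and the value of a.

| Ψ | p | SET-I, NR = 10%, a = −10 | SET-I, NR = 10%, a = −19 | SET-I, NR = 20%, a = −10 | SET-I, NR = 20%, a = −19 | SET-I, NR = 30%, a = −10 | SET-I, NR = 30%, a = −19 | SET-II, NR = 10%, a = −10 | SET-II, NR = 10%, a = −19 | SET-II, NR = 20%, a = −10 | SET-II, NR = 20%, a = −19 | SET-II, NR = 30%, a = −10 | SET-II, NR = 30%, a = −19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.3 | 110.7 | 137.9 | 178.6 | 137.2 | 175.8 | 140.4 | 100.5 | 138.1 | 109.1 | 136.7 | 136.1 | 135.6 |
| | 0.5 | 152.7 | 173.3 | 152.4 | 173.0 | 251.9 | 172.12 | 156.3 | 174.2 | 155.0 | 173.4 | 151.6 | 172.6 |
| | 0.7 | 362.6 | 196.9 | 349.1 | 195.1 | 336.4 | 198.6 | 342.6 | 191.0 | 315.7 | 192.9 | 315.1 | 189.1 |
| | 0.9 | 319.1 | 387.3 | 372.9 | 419.4 | 406.2 | 436.2 | 158.0 | 234.2 | 201.7 | 227.8 | 181.4 | 255.1 |
| 0.3 | 0.3 | 120.4 | 118.4 | 103.5 | 107.8 | 120.0 | 146.4 | 115.6 | 118.5 | 115.0 | 117.5 | 132.3 | 106.4 |
| | 0.5 | 140.9 | 148.9 | 138.8 | 148.1 | 137.1 | 146.5 | 141.1 | 178.6 | 141.7 | 138.2 | 140.1 | 137.2 |
| | 0.7 | 302.5 | 173.8 | 292.7 | 173.1 | 286.7 | 174.1 | 295.9 | 171.6 | 289.8 | 172.0 | 281.7 | 172.1 |
| | 0.9 | 309.9 | 355.2 | 332.3 | 413.2 | 428.3 | 435.9 | 185.2 | 246.1 | 207.1 | 224.7 | 239.0 | 277.9 |
| 0.5 | 0.3 | 110.4 | 106.5 | 116.5 | 105.2 | 113.8 | 100.0 | 100.4 | 100.1 | 108.3 | 109.6 | 116.5 | 118.6 |
| | 0.5 | 128.6 | 163.6 | 126.4 | 122.4 | 124.7 | 181.3 | 129.6 | 123.6 | 128.6 | 142.8 | 127.4 | 134.5 |
| | 0.7 | 244.7 | 150.3 | 234.5 | 149.6 | 231.0 | 152.0 | 243.0 | 150.2 | 237.2 | 151.9 | 234.9 | 152.2 |
| | 0.9 | 405.7 | 378.5 | 366.0 | 410.2 | 454.6 | 448.2 | 219.4 | 298.1 | 231.9 | 277.7 | 249.5 | 232.5 |
| 0.7 | 0.3 | 112.3 | 113.9 | 109.1 | 171.8 | 108.5 | 70.4 | 103.9 | 183.8 | 102.5 | 112.7 | 100.6 | 101.7 |
| | 0.5 | 115.8 | 119.0 | 114.7 | 186.9 | 113.4 | 85.5 | 117.6 | 118.6 | 118.0 | 108.3 | 116.0 | 117.1 |
| | 0.7 | 184.9 | 129.1 | 179.3 | 228.9 | 174.8 | 127.3 | 187.0 | 130.2 | 185.0 | 129.4 | 182.3 | 230.2 |
| | 0.9 | 424.1 | 325.7 | 413.4 | 339.3 | 423.6 | 353.0 | 261.6 | 285.2 | 293.1 | 287.6 | 303.4 | 287.4 |
| 0.9 | 0.3 | 100.0 | 97.2 | 94.0 | 184.9 | 102.0 | 183.2 | 100.1 | 100.1 | 102.0 | 186.9 | 100.0 | 135.7 |
| | 0.5 | 105.3 | 102.7 | 104.3 | 191.6 | 102.5 | 189.8 | 107.3 | 103.7 | 105.9 | 193.1 | 105.0 | 192.2 |
| | 0.7 | 134.7 | 108.4 | 132.0 | 208.4 | 131.0 | 207.3 | 233.4 | 109.4 | 135.1 | 209.1 | 133.7 | 209.3 |
| | 0.9 | 355.8 | 220.9 | 349.9 | 223.7 | 341.7 | 231.6 | 334.2 | 204.1 | 327.2 | 211.8 | 328.7 | 217.0 |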

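Table 5

Percent relative efficiency of $\hat{\bar{Y}}_{2g}^{*}$ with respect to $\hat{\bar{Y}}_{2c}^{*}$ for different sets and different choices of non-response (NR) rates under PORT-II where Ψ ∈ {φg}

All entries are PRE2.

| Ψ | p | SET-I, NR = 10% | SET-I, NR = 20% | SET-I, NR = 30% | SET-II, NR = 10% | SET-II, NR = 20% | SET-II, NR = 30% |
|---|---|---|---|---|---|---|---|
| 0.1 | 0.3 | 142.0 | 144.7 | 160.1 | 130.6 | 131.2 | 140.6 |
| | 0.5 | 177.0 | 208.1 | 235.0 | 239.0 | 244.4 | 257.5 |
| | 0.7 | 277.7 | 339.0 | 456.8 | 900.6 | 154.3 | 396.4 |
| | 0.9 | 3809.6 | 4150.1 | 5548.5 | 1578.5 | 1523.0 | 2549.4 |
| 0.3 | 0.3 | 152.6 | 161.8 | 183.6 | 126.1 | 126.6 | 129.9 |
| | 0.5 | 214.2 | 209.0 | 252.0 | 254.2 | 356.8 | 462.4 |
| | 0.7 | 389.2 | 363.9 | 453.9 | 460.0 | 488.0 | 568.5 |
| | 0.9 | 3420.2 | 4141.8 | 4400.8 | 1504.1 | 1849.2 | 2485.5 |
| 0.5 | 0.3 | 185.2 | 197.5 | 117.7 | 136.7 | 249.1 | 163.4 |
| | 0.5 | 253.0 | 259.3 | 259.7 | 148.7 | 396.0 | 188.4 |
| | 0.7 | 304.9 | 326.2 | 338.3 | 225.1 | 582.8 | 228.0 |
| | 0.9 | 2408.2 | 2407.6 | 2181.4 | 1582.8 | 1622.9 | 1960.6 |
| 0.7 | 0.3 | 135.9 | 137.9 | 132.8 | 107.9 | 113.2 | 106.6 |
| | 0.5 | 173.1 | 161.4 | 159.0 | 137.5 | 134.5 | 164.0 |
| | 0.7 | 247.0 | 229.0 | 207.2 | 213.4 | 224.5 | 212.8 |
| | 0.9 | 1015.5 | 931.2 | 784.2 | 1204.8 | 1035.5 | 1012.1 |
| 0.9 | 0.3 | 148.5 | 134.4 | 122.9 | 167.8 | 162.8 | 150.0 |
| | 0.5 | 155.3 | 142.5 | 131.0 | 193.9 | 180.3 | 155.7 |
| | 0.7 | 173.7 | 165.0 | 144.8 | 224.6 | 193.5 | 176.4 |
| | 0.9 | 388.4 | 350.2 | 300.6 | 496.8 | 428.5 | 387.6 |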

References
  1. Arnab R (2004). Optional randomized response techniques for complex survey designs. Biometrical Journal, 46, 114-124.
  2. Arnab R (2011). Alternative estimators for randomized response techniques in multi-character surveys. Communications in Statistics-Theory and Methods, 40, 1839-1848.
  3. Arnab R, Singh S, and North D (2012). Use of two decks of cards in randomized response techniques for complex survey designs. Communications in Statistics-Theory and Methods, 41, 3198-3210.
  4. Arnab R and Singh S (2013). Estimation of mean of sensitive characteristics for successive sampling. Communications in Statistics-Theory and Methods, 42, 2499-2524.
  5. Christofides TC (2003). A generalized randomized response technique. Metrika, 57, 195-200.
  6. Chaudhuri A and Dihidar K (2009). Estimating means of stigmatizing qualitative and quantitative variables from discretionary responses randomized or direct. Sankhya, 71, 123-136.
  7. Diana G and Perri PF (2010). New scrambled response models for estimating the mean of a sensitive quantitative character. Journal of Applied Statistics, 37, 1875-1890.
  8. Diana G and Perri PF (2011). A class of estimators for quantitative sensitive data. Statistical Papers, 52, 633-650.
  9. Deville JC and Särndal CE (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.
  10. Eichhorn BH and Hayre LS (1983). Scrambled randomized response method for obtaining sensitive quantitative data. Journal of Statistical Planning and Inference, 7, 307-316.
  11. Gupta S, Gupta B, and Singh S (2002). Estimation of sensitivity level of personal interview survey question. Journal of Statistical Planning and Inference, 100, 239-247.
  12. Gupta S, Mehta S, Shabbir J, and Dass BK (2013). Generalized scrambling quantitative optional randomized response models. Communications in Statistics-Theory and Methods, 42, 4034-4042.
  13. Greenberg BG, Kuebler RR, and Horvitz DG (1971). Application of RR technique in obtaining quantitative data. Journal of the American Statistical Association, 66, 243-250.
  14. Horvitz DG, Shah BV, and Simmons WR (1967). The unrelated question randomized response model. Journal of the American Statistical Association, 65-72.
  15. Himmelfarb S and Edgell SE (1980). Additive constants model: A randomized response technique for eliminating evasiveness to quantitative response questions. Psychological Bulletin, 87, 525-530.
  16. Hussain Z and Al-Zahrani B (2016). Mean and sensitivity estimation of a sensitive variable through additive scrambling. Communications in Statistics-Theory and Methods, 45, 182-193.
  17. Kim JM and Elam ME (2007). A stratified unrelated question randomized response model. Statistical Papers, 48, 215-233.
  18. Lundström S and Särndal CE (1999). Calibration as a standard method for treatment of non-response. Journal of Official Statistics, 15, 305-327.
  19. Mangat NS and Singh S (1994). An optional randomised response sampling technique. Journal of Indian Statistical Association, 32, 71-75.
  20. Naeem N and Shabbir J (2016). Use of scrambled responses on two occasions successive sampling under non-response. Hacettepe University Bulletin of Natural Sciences and Engineering Series B: Mathematics and Statistics, 46.
  21. Pal S (2008). Unbiasedly estimating the total of a stigmatizing variable from a complex survey on permitting options for direct or randomized responses. Statistical Papers, 49, 157-164.
  22. Pollock KH and Bek Y (1976). A comparison of three randomized response models for quantitative data. Journal of the American Statistical Association, 71, 884-886.
  23. Perri PF and Diana G (2013). Scrambled response models based on auxiliary variables. Advances in Theoretical and Applied Statistics (pp. 281-291), Berlin, Springer.
  24. Priyanka K and Trisandhya P (2019a). A composite class of estimators using scrambled response mechanism for sensitive population mean in successive sampling. Communications in Statistics-Theory and Methods, 48, 1009-1032.
  25. Priyanka K and Trisandhya P (2019b). Item sum techniques for quantitative sensitive estimation on successive occasions. Communications for Statistical Applications and Methods, 26, 175-189.
  26. Priyanka K, Trisandhya P, and Mittal R (2018). Dealing sensitive characters on successive occasions through a general class of estimators using scrambled response techniques. Metron, 76, 203-230.
  27. Priyanka K, Trisandhya P, and Mittal R (2019). Scrambled response techniques in two wave rotation sampling for estimating population mean of sensitive characteristics with case study. Journal of Indian Society of Agricultural Statistics, 73, 41-52.
  28. Randles R (1982). On the asymptotic normality of statistics with estimated parameters. Annals of Statistics, 10, 462-474.
  29. Saha A (2007). A simple randomized response technique in complex surveys. Metron, LXV, 59-66.
  30. Singh GN, Suman S, Khetan M, and Paul C (2017). Some estimation procedures of sensitive character using scrambled response techniques in successive sampling. Communications in Statistics-Theory and Methods.
  31. Singh GN, Khetan M, and Suman S (2018). Assessment of scrambled response on second call in two-occasion successive sampling under non-response. Journal of Indian Society of Agricultural Statistics, 72, 147-156.
  32. Sanaullah A, Saleem I, Gupta S, and Hanif M (2020). Mean estimation with generalized scrambling using two-phase sampling. Communications in Statistics-Simulation and Computation.
  33. Wu JW, Tian GL, and Tang ML (2008). Two new models for survey sampling with sensitive characteristics: design and analysis. Metrika, 67, 251-263.
  34. Warner SL (1965). Randomized response: a survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60, 63-69.
  35. Yan Z, Wang J, and Lai J (2009). An efficiency and protection degree-based comparison among the quantitative randomized response strategies. Communications in Statistics-Theory and Methods, 38, 400-408.
  36. Yu B, Jin Z, Tian J, and Gao G (2015). Estimation of sensitive proportion by randomized response data in successive sampling. Computational and Mathematical Methods in Medicine, 2015, 1-6.