On the Bayes risk of a sequential design for estimating a mean difference
Communications for Statistical Applications and Methods 2024;31:427-440
Published online July 31, 2024
© 2024 Korean Statistical Society.

Sangbeak Ye1,a, Kamel Rekabb

aMethods Center, University of Tübingen, Germany
bDepartment of Mathematics and Statistics, University of Missouri Kansas City, USA
Correspondence to: 1 Methods Center, University of Tübingen, Haußerstr. 11, Tübingen 72076, Germany. E-mail: sangbeakye@gmail.com
Received January 1, 2024; Revised March 13, 2024; Accepted March 18, 2024.
 Abstract
The problem addressed is that of sequentially estimating the difference between the means of two populations with respect to the squared error loss, where each population distribution is a member of the one-parameter exponential family. A Bayesian approach is adopted in which the population means are estimated by the posterior means at each stage of the sampling process and the prior distributions are not specified but have twice continuously differentiable density functions. The main result determines an asymptotic second-order lower bound, as t → ∞, for the Bayes risk of a sequential procedure that takes M observations from the first population and t − M from the second population, where M is determined according to a sequential design, and t denotes the total number of observations sampled from both populations.
Keywords : Bayes risk, Fatou’s lemma, the martingale convergence theorem, one-parameter exponential family, sequential design, squared error loss, uniform integrability
1. Introduction

Let Ω denote an open interval and let Fθ, θ ∈ Ω, denote a one-parameter exponential family of probability distributions; that is, for each θ ∈ Ω,

dF_\theta(x) = \exp\{\theta x - \psi(\theta)\}\, d\lambda(x) \qquad \text{for } -\infty < x < \infty,

where ψ is a twice continuously differentiable function on Ω and λ is a non-degenerate sigma-finite measure on the Borel sets of (−∞,∞). It is well known that if X is a random variable with a distribution Fθ, then the mean and variance of X are ψ′(θ) and ψ″(θ), respectively (Lehmann, 1959).
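For concreteness, the Bernoulli distribution in its natural parameterization is a standard member of this family (an illustrative aside, not part of the original development): with λ taken as counting measure on {0, 1} and θ = log{p/(1 − p)},

\psi(\theta) = \log(1 + e^{\theta}), \qquad \psi'(\theta) = \frac{e^{\theta}}{1 + e^{\theta}} = p, \qquad \psi''(\theta) = \frac{e^{\theta}}{(1 + e^{\theta})^{2}} = p(1 - p),

so that ψ′ and ψ″ recover the familiar mean and variance of a Bernoulli trial with success probability p.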

Let ℘1 and ℘2 denote populations with independent distributions Fθ1 and Fθ2 , where θ1, θ2 ∈ Ω are unknown. A total of t observations are to be taken from the two populations, and the objective of the study is to estimate the mean difference ψ′(θ1) − ψ′(θ2) with respect to the squared error loss, using a Bayesian approach.

Let X1, X2, … denote observations sampled from the first population, ℘1, and let Y1, Y2, … denote observations from the second population, ℘2. In the Bayesian framework, it is assumed that X1, X2, … are conditionally independent sharing a common distribution Fθ1, given Θ1 = θ1. Similarly, Y1, Y2, … are presumed to be conditionally independent with a common distribution Fθ2, given Θ2 = θ2. Additionally, X1, X2, … are conditionally independent of Y1, Y2, …, given Θ1 = θ1 and Θ2 = θ2; and that Θ1 and Θ2 are independent random variables with respective prior density functions ξ1 and ξ2.

For m ≥ 1 and n ≥ 1, let ℱm,n denote the sigma-algebra generated by X1, … , Xm and Y1, … , Yn. A sequential design 𝒫 is then defined as a sequence of indicators D1, … , Dt, where Dk = 0 if the kth value is sampled from ℘2 and Dk = 1 if it is sampled from ℘1. The constants D1 and D2 satisfy D1 + D2 = 1, and Dk is ℱmk−1,nk−1-measurable for k = 3, … , t, where mk = D1 + · · · + Dk and nk = k − mk for k = 1, … , t. In the remainder of this paper, we denote mt and nt by M and N, and ℱmt,nt by ℱt.
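To make this bookkeeping concrete, the following minimal sketch (illustrative only; the callables sample_pop1, sample_pop2, and choose_population are hypothetical placeholders rather than anything defined in this paper) shows how the indicators Dk and the counts M and N evolve as observations accumulate.

import numpy as np

def run_design(t, sample_pop1, sample_pop2, choose_population):
    """Skeleton of a sequential design D_1, ..., D_t.

    sample_pop1 / sample_pop2 draw one observation from population 1 / 2;
    choose_population(xs, ys) returns 1 or 0 and may use only the data
    observed so far.
    """
    xs, ys = [], []                    # observations from population 1 and 2
    D = np.zeros(t, dtype=int)         # allocation indicators D_1, ..., D_t
    D[0], D[1] = 1, 0                  # the first two observations, one from each population
    xs.append(sample_pop1())
    ys.append(sample_pop2())
    for k in range(2, t):
        D[k] = choose_population(xs, ys)   # decision based only on the data observed so far
        if D[k] == 1:
            xs.append(sample_pop1())
        else:
            ys.append(sample_pop2())
    M, N = len(xs), len(ys)            # M + N = t
    return np.array(xs), np.array(ys), D, M, N

With this convention, M = D1 + · · · + Dt and N = t − M, matching the notation above.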

A sequential procedure for estimating the difference between the two population means, μ(θ1, θ2) = ψ′(θ1) − ψ′(θ2), is the pair (𝒫, μ̂t), where 𝒫 is the sequential design defined above and μ̂t = E{ψ′(Θ1) − ψ′(Θ2)|ℱt}. The Bayes risk incurred by the sequential procedure is defined as

R_t(\mathcal{P}) = E\big[\big(\hat{\mu}_t - \mu(\Theta_1, \Theta_2)\big)^{2}\big].     (1.1)

The problem considered is to find a sequential design for which the risk is minimal and to characterize its optimality. To this end, Woodroofe and Hardwick (1990) devise a quasi-Bayesian approach to derive an asymptotic lower bound for the integrated risk and propose a three-stage procedure for two normal distributions with unit variance. For non-linear estimation, Shapiro (1985) adopts an allocation strategy and shows that the myopic rule is asymptotically optimal. In this context, Rekab (1989, 1992) derives a first-order optimal sequential procedure and an asymptotic lower bound for the Bayes risk, and Benkamra et al. (2015) further derive a nearly second-order asymptotically optimal three-stage design. Song and Rekab (2017) extend the former approaches with a three-stage design that attains first-order efficiency.

In order to achieve a minimal error when estimating a function of the parameters of two populations, a lower bound contributes to a more refined approximation. However, obtaining a closed-form expression for an exact lower bound on the Bayes risk, particularly when the prior density functions are not explicitly specified, is notably challenging. Rekab (1990) derives a first-order lower bound on the Bayes risk for the difference between the means under conjugate priors. Rekab and Tahir (2004) extend this result to a second-order lower bound. Their work, however, assumes conjugate priors, whose forms are provided by Diaconis and Ylvisaker (1979).

Now consider estimating the difference between the means of two populations from the one-parameter exponential family with respect to the Bayes risk. When the conjugate prior is known, the lower bound for the Bayes risk is as follows:

R_t(\mathcal{P}) \geq \frac{E\big[\big(\sqrt{\psi''(\Theta_1)} + \sqrt{\psi''(\Theta_2)}\big)^{2}\big]}{t} + o\!\left(\frac{1}{t}\right)

as t → ∞.

The objective of this study is to establish an asymptotic second-order lower bound for the Bayes risk without an explicit assumption of conjugate priors such as those provided by Diaconis and Ylvisaker (1979). This result extends the lower bound of Woodroofe and Hardwick (1990), which was obtained for the difference between the means of two normal populations with unit variance using the difference of the sample means instead of the Bayes estimator, and further generalizes Rekab (1990).

The paper is organized as follows. In Section 2, we introduce the necessary notation and present the main result with an example. In Section 3, the proof of the main result is provided with supporting lemmas and remarks. In Section 4, we illustrate the implementation of the main result with a numerical simulation showcasing the performance of the Bayes risk lower bound. Section 5 concludes with some remarks on the main result and future directions.

2. An asymptotic second-order lower bound

Let Rt(𝒫) be as in (1.1). Since ψ′(Θ1) and ψ′(Θ2) are conditionally independent given ℱt, it follows that

R_t(\mathcal{P}) = E\big[\mathrm{Var}\{\psi'(\Theta_1) \mid X_1, \ldots, X_M\} + \mathrm{Var}\{\psi'(\Theta_2) \mid Y_1, \ldots, Y_N\}\big].

Furthermore, Lemma A.2 (see Appendix) shows that

\mathrm{Var}\{\psi'(\Theta_1) \mid X_1, \ldots, X_M\} = E\{(\psi'(\Theta_1) - \bar{X}_M)^{2} \mid X_1, \ldots, X_M\} - \frac{1}{M^{2}}\big(E\{\alpha_1(\Theta_1) \mid X_1, \ldots, X_M\}\big)^{2},

and

\mathrm{Var}\{\psi'(\Theta_2) \mid Y_1, \ldots, Y_N\} = E\{(\psi'(\Theta_2) - \bar{Y}_N)^{2} \mid Y_1, \ldots, Y_N\} - \frac{1}{N^{2}}\big(E\{\alpha_2(\Theta_2) \mid Y_1, \ldots, Y_N\}\big)^{2},

where for i = 1, 2,

\alpha_i(\theta) = \begin{cases} \dfrac{\xi_i'(\theta)}{\xi_i(\theta)}, & \text{if } \theta \in \{\theta \in \Omega : \xi_i(\theta) > 0\}, \\ 0, & \text{otherwise}. \end{cases}     (2.1)

Moreover, if ξ1 and ξ2 have compact supports in Ω, it follows from Woodroofe (1985) that

E\{(\psi'(\Theta_1) - \bar{X}_M)^{2} \mid X_1, \ldots, X_M\} = \frac{1}{M} E\{\psi''(\Theta_1) \mid X_1, \ldots, X_M\} + \frac{1}{M^{2}} E\{\beta_1(\Theta_1) \mid X_1, \ldots, X_M\},

and

E\{(\psi'(\Theta_2) - \bar{Y}_N)^{2} \mid Y_1, \ldots, Y_N\} = \frac{1}{N} E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} + \frac{1}{N^{2}} E\{\beta_2(\Theta_2) \mid Y_1, \ldots, Y_N\},

where for i = 1, 2,

\beta_i(\theta) = \begin{cases} \dfrac{\xi_i''(\theta)}{\xi_i(\theta)}, & \text{if } \theta \in \{\theta \in \Omega : \xi_i(\theta) > 0\}, \\ 0, & \text{otherwise}. \end{cases}     (2.2)

Hence, the Bayes risk becomes the following.

R_t(\mathcal{P}) = E\left[\frac{U_M}{M} + \frac{V_N}{N}\right] + E\left[\frac{A_M}{M^{2}}\right] + E\left[\frac{B_N}{N^{2}}\right],     (2.3)

where

U_M = E\{\psi''(\Theta_1) \mid X_1, \ldots, X_M\}, \qquad V_N = E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\},
A_M = E\{\beta_1(\Theta_1) \mid X_1, \ldots, X_M\} - \big[E\{\alpha_1(\Theta_1) \mid X_1, \ldots, X_M\}\big]^{2},
B_N = E\{\beta_2(\Theta_2) \mid Y_1, \ldots, Y_N\} - \big[E\{\alpha_2(\Theta_2) \mid Y_1, \ldots, Y_N\}\big]^{2}.

Next,

E\left[\frac{U_M}{M} + \frac{V_N}{N}\right] = \frac{1}{t} E\big[(\sqrt{U_M} + \sqrt{V_N})^{2}\big] + \frac{1}{t} E\left[\frac{(N\sqrt{U_M} - M\sqrt{V_N})^{2}}{MN}\right].
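The identity above is elementary algebra; for the reader's convenience (this verification is an added aside), multiply through by MNt and use M + N = t:

MN\big(\sqrt{U_M} + \sqrt{V_N}\big)^{2} + \big(N\sqrt{U_M} - M\sqrt{V_N}\big)^{2} = (N^{2} + MN)\, U_M + (M^{2} + MN)\, V_N = t\,(N U_M + M V_N),

and dividing by MNt recovers U_M/M + V_N/N inside the expectation.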

Thus, (2.3) becomes

R_t(\mathcal{P}) = \frac{1}{t} E\big[U_M + V_N + 2\sqrt{U_M V_N}\big] + \frac{1}{t} E\left[\frac{(N\sqrt{U_M} - M\sqrt{V_N})^{2}}{MN}\right] + E\left[\frac{A_M}{M^{2}}\right] + E\left[\frac{B_N}{N^{2}}\right].     (2.4)

In the remainder of this paper, let

C(\theta_1, \theta_2) = \frac{\sqrt{\psi''(\theta_1)}}{\sqrt{\psi''(\theta_1)} + \sqrt{\psi''(\theta_2)}}.

For the following main result, the simple regularity conditions of Woodroofe (1985) are assumed.

Theorem 1

If Θ1 and Θ2 have compact supports in Ω, then for any sequential procedure 𝒫 such that

\frac{M}{t} \longrightarrow C(\Theta_1, \Theta_2) \quad \text{w.p.1 as } t \to \infty,     (2.5)

the Bayes risk of 𝒫 satisfies the following asymptotic lower bound:

\liminf_{t \to \infty}\Big(t^{2} R_t(\mathcal{P}) - t\, E\big[\big(\sqrt{\psi''(\Theta_1)} + \sqrt{\psi''(\Theta_2)}\big)^{2}\big]\Big)
\;\geq\; E\left[\frac{[\gamma'(\psi'(\Theta_1))]^{2}\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)}}{C(\Theta_1, \Theta_2)}\right]
+ E\left[\frac{[\gamma'(\psi'(\Theta_2))]^{2}\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)}}{C(\Theta_2, \Theta_1)}\right]
+ E\left[\frac{\beta_1(\Theta_1) - [\alpha_1(\Theta_1)]^{2}}{[C(\Theta_1, \Theta_2)]^{2}}\right]
+ E\left[\frac{\beta_2(\Theta_2) - [\alpha_2(\Theta_2)]^{2}}{[C(\Theta_2, \Theta_1)]^{2}}\right],

where γ(η) = √(ψ″(g(η))) with g being the inverse of ψ′, and α1(θ), α2(θ), β1(θ), and β2(θ) are defined by (2.1) and (2.2).

The proof of Theorem 1 hinges on lemmas provided in Section 3.
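Condition (2.5) states that, asymptotically, a fraction C(Θ1, Θ2) of the t observations should be allocated to ℘1. One simple way to pursue this in practice, and the spirit of the fully sequential procedure used in Section 4, is to plug the current posterior means of ψ″(Θ1) and ψ″(Θ2) into C and to sample next from ℘1 whenever the observed allocation fraction falls below the plug-in value. The sketch below is only an illustration under these assumptions; the helpers post_mean_psi2_1 and post_mean_psi2_2 are hypothetical stand-ins for whatever posterior computation the chosen priors admit.

import math

def choose_population(xs, ys, post_mean_psi2_1, post_mean_psi2_2):
    """Plug-in allocation rule aiming at M/t approximately C(Theta_1, Theta_2)."""
    u = post_mean_psi2_1(xs)          # current estimate of psi''(Theta_1)
    v = post_mean_psi2_2(ys)          # current estimate of psi''(Theta_2)
    c_hat = math.sqrt(u) / (math.sqrt(u) + math.sqrt(v))
    k = len(xs) + len(ys)             # number of observations taken so far
    return 1 if len(xs) / k < c_hat else 0

Bound to concrete posterior-mean routines (for example via functools.partial), this rule can serve as the choose_population argument of the design skeleton sketched in Section 1.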

Example 1

Suppose that Fθ is the exponential distribution with mean |θ|−1, where θ ∈ Ω = (−∞, 0) and that Θi has p.d.f.

\xi_i(\theta) = \frac{s_i^{r_i}}{\Gamma(r_i)}\, \theta^{r_i - 1} e^{-s_i \theta} \qquad \text{for } -\infty < \theta < 0,

where ri and si are given positive real numbers. Then,

\psi(\theta) = -\ln(-\theta), \qquad \psi'(\theta) = -\frac{1}{\theta}, \qquad \psi''(\theta) = \frac{1}{\theta^{2}}, \qquad \gamma(\eta) = \eta,
\alpha_i(\theta) = s_i - \frac{r_i - 1}{\theta} \qquad \text{and} \qquad \beta_i(\theta) = s_i^{2} - \frac{2 s_i (r_i - 1)}{\theta} + \frac{r_i^{2} - 3 r_i + 2}{\theta^{2}}.

Using the fact that, for i = 1, 2,

E\big[[\psi''(\Theta_i)]^{p}\big] = \frac{s_i^{2p}\, \Gamma(r_i - 2p)}{\Gamma(r_i)}

for any p > 0, yields

R_t(\mathcal{P}) \geq \frac{L_1}{t} + \frac{L_2 + L_3}{t^{2}}

for sufficiently large t, provided that r1 > 3 and r2 > 3, where

L_1 = \frac{s_1^{2}}{(r_1 - 1)(r_1 - 2)} + \frac{s_2^{2}}{(r_2 - 1)(r_2 - 2)} + \frac{2 s_1 s_2}{(r_1 - 1)(r_2 - 1)},
L_2 = \frac{s_1 s_2^{2}}{(r_1 - 1)(r_2 - 2)(r_2 - 3)} + \frac{s_2^{3}}{(r_1 - 1)(r_2 - 2)(r_2 - 3)} + \frac{s_1^{2} s_2^{2}}{(r_1 - 1)(r_2 - 2)(r_2 - 1)^{2}} + \frac{s_1^{3} s_2}{(r_1 - 1)(r_1 - 2)(r_1 - 3)(r_2 - 1)},
L_3 = 1 - \frac{3 r_1 - 5}{r_1 - 2} + \frac{r_2 s_1^{2}(1 - s_2^{2})}{(r_1 - 1)(r_1 - 2) s_2} - \frac{(r_2 - 1) s_1^{2}}{(r_1 - 1)(r_2 - 2)}.
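The moment identity used in Example 1 is the usual negative-moment formula for a gamma law; as a short added check, reading the prior as a gamma(ri, si) density for |Θi| gives

E\big[[\psi''(\Theta_i)]^{p}\big] = E\big[|\Theta_i|^{-2p}\big] = \frac{s_i^{r_i}}{\Gamma(r_i)} \int_0^{\infty} u^{-2p}\, u^{r_i - 1} e^{-s_i u}\, du = \frac{s_i^{r_i}}{\Gamma(r_i)} \cdot \frac{\Gamma(r_i - 2p)}{s_i^{r_i - 2p}} = \frac{s_i^{2p}\, \Gamma(r_i - 2p)}{\Gamma(r_i)},

which is finite provided ri > 2p; the requirement r1 > 3 and r2 > 3 ensures finiteness of the moments with 2p ≤ 3 that enter the bound.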
3. Proof of Theorem 1

The following lemmas are needed for the proof of the theorem.

Lemma 1

Let γ(η) be as in the statement of Theorem 1. Then,

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\} \leq \frac{1}{M} E\{[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1) \mid X_1, \ldots, X_M\} + \frac{1}{M} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1) \,\Big|\, X_1, \ldots, X_M\right\},

and

\mathrm{Var}\{\sqrt{\psi''(\Theta_2)} \mid Y_1, \ldots, Y_N\} \leq \frac{1}{N} E\{[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} + \frac{1}{N} E\left\{\frac{[\gamma(\psi'(\Theta_2)) - \gamma(\bar{Y}_N)]^{2}}{\psi'(\Theta_2) - \bar{Y}_N}\, \alpha_2(\Theta_2) \,\Big|\, Y_1, \ldots, Y_N\right\}

w.p.1.

Proof

Let m ≥ 1 be a value of M, and let x = (x1, … , xm), where xi is a value of Xi. Also, let x̄m = (x1 + · · · + xm)/m denote the corresponding value of X̄M. Then,

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)} \mid M = m, \mathbf{X} = \mathbf{x}\} = \mathrm{Var}\{\gamma(\psi'(\Theta_1)) \mid M = m, \mathbf{X} = \mathbf{x}\} \leq \frac{1}{c_m} \int [\gamma(\psi'(\theta_1)) - \gamma(\bar{x}_m)]^{2} L_m(\theta_1)\, \xi_1(\theta_1)\, d\theta_1 = -\frac{1}{m c_m} \int \frac{[\gamma(\psi'(\theta_1)) - \gamma(\bar{x}_m)]^{2}}{\psi'(\theta_1) - \bar{x}_m}\, L_m'(\theta_1)\, \xi_1(\theta_1)\, d\theta_1,

where

L_m(\theta_1) = \exp\{m \theta_1 \bar{x}_m - m \psi(\theta_1)\} \qquad \text{and} \qquad c_m = \int L_m(\theta_1)\, \xi_1(\theta_1)\, d\theta_1.

Let η = ψ′(θ1). Then θ1 = g(η) and dθ1 = g′(η) dη, so that

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)} \mid M = m, \mathbf{X} = \mathbf{x}\} \leq -\frac{1}{m c_m} \int \frac{[\gamma(\eta) - \gamma(\bar{x}_m)]^{2}}{\eta - \bar{x}_m}\, L_m'(g(\eta))\, \xi_1(g(\eta))\, g'(\eta)\, d\eta = \frac{1}{m c_m} \int \frac{d}{d\eta}\left[\frac{[\gamma(\eta) - \gamma(\bar{x}_m)]^{2}}{\eta - \bar{x}_m}\, \xi_1(g(\eta))\right] L_m(g(\eta))\, d\eta

by performing integration by parts. Next,

\frac{d}{d\eta}\left[\frac{[\gamma(\eta) - \gamma(\bar{x}_m)]^{2}}{\eta - \bar{x}_m}\right] = 2\gamma'(\eta)\, \frac{\gamma(\eta) - \gamma(\bar{x}_m)}{\eta - \bar{x}_m} - \left[\frac{\gamma(\eta) - \gamma(\bar{x}_m)}{\eta - \bar{x}_m}\right]^{2} = -\left[\gamma'(\eta) - \frac{\gamma(\eta) - \gamma(\bar{x}_m)}{\eta - \bar{x}_m}\right]^{2} + [\gamma'(\eta)]^{2} \leq [\gamma'(\eta)]^{2}.

It follows that

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)} \mid M = m, \mathbf{X} = \mathbf{x}\} \leq \frac{1}{m c_m} \int [\gamma'(\eta)]^{2} L_m(g(\eta))\, \xi_1(g(\eta))\, d\eta + \frac{1}{m c_m} \int \frac{[\gamma(\eta) - \gamma(\bar{x}_m)]^{2}}{\eta - \bar{x}_m}\, \xi_1'(g(\eta))\, g'(\eta)\, L_m(g(\eta))\, d\eta
= \frac{1}{m c_m} \int [\gamma'(\psi'(\theta_1))]^{2}\psi''(\theta_1)\, L_m(\theta_1)\, \xi_1(\theta_1)\, d\theta_1 + \frac{1}{m c_m} \int \frac{[\gamma(\psi'(\theta_1)) - \gamma(\bar{x}_m)]^{2}}{\psi'(\theta_1) - \bar{x}_m}\, \xi_1'(\theta_1)\, L_m(\theta_1)\, d\theta_1
= \frac{1}{m} E\{[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1) \mid M = m, \mathbf{X} = \mathbf{x}\} + \frac{1}{m} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_m)]^{2}}{\psi'(\Theta_1) - \bar{X}_m}\, \alpha_1(\Theta_1) \,\Big|\, M = m, \mathbf{X} = \mathbf{x}\right\}.

The first assertion of the lemma follows. A parallel argument leads to the subsequent assertion.

Lemma 2

Let γ(η) be as in the statement of Theorem 1.

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} \leq \frac{1}{M} E\left\{[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1)\psi''(\Theta_2) + \frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1)\, \psi''(\Theta_2) \,\Big|\, \mathcal{F}_t\right\} + \frac{1}{N} E\left\{[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_1)\psi''(\Theta_2) + \frac{[\gamma(\psi'(\Theta_2)) - \gamma(\bar{Y}_N)]^{2}}{\psi'(\Theta_2) - \bar{Y}_N}\, \alpha_2(\Theta_2)\, \psi''(\Theta_1) \,\Big|\, \mathcal{F}_t\right\}

w.p.1.

Proof

\mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} = E\{\psi''(\Theta_1)\psi''(\Theta_2) \mid \mathcal{F}_t\} - \big[E\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\}\big]^{2}
= \big(E\{\psi''(\Theta_1) \mid \mathcal{F}_t\} - [E\{\sqrt{\psi''(\Theta_1)} \mid \mathcal{F}_t\}]^{2}\big) E\{\psi''(\Theta_2) \mid \mathcal{F}_t\} + \big(E\{\psi''(\Theta_2) \mid \mathcal{F}_t\} - [E\{\sqrt{\psi''(\Theta_2)} \mid \mathcal{F}_t\}]^{2}\big)\big[E\{\sqrt{\psi''(\Theta_1)} \mid \mathcal{F}_t\}\big]^{2}
\leq \mathrm{Var}\{\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\}\, E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} + \mathrm{Var}\{\sqrt{\psi''(\Theta_2)} \mid Y_1, \ldots, Y_N\}\, E\{\psi''(\Theta_1) \mid X_1, \ldots, X_M\}.

Now, use Lemma 1 to complete the proof.

Lemma 3

For any sequential procedure 𝒫 that satisfies Condition (2.5),

t\, \mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} \longrightarrow [C(\Theta_1, \Theta_2)]^{-1}[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1)\psi''(\Theta_2) + [C(\Theta_2, \Theta_1)]^{-1}[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_1)\psi''(\Theta_2)

w.p.1 as t → ∞.

Proof

Since

\frac{t}{M} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1) \,\Big|\, X_1, \ldots, X_M\right\} \longrightarrow 0,

and

\frac{t}{N} E\left\{\frac{[\gamma(\psi'(\Theta_2)) - \gamma(\bar{Y}_N)]^{2}}{\psi'(\Theta_2) - \bar{Y}_N}\, \alpha_2(\Theta_2) \,\Big|\, Y_1, \ldots, Y_N\right\} \longrightarrow 0

w.p.1 as t → ∞ by Lemma A.3, it follows from Lemma 2 that

\limsup_{t \to \infty} t\, \mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} \leq \limsup_{t \to \infty} \frac{t}{M} E\{[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1) \mid X_1, \ldots, X_M\}\, E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} + \limsup_{t \to \infty} \frac{t}{N} E\{[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\}\, E\{\psi''(\Theta_1) \mid X_1, \ldots, X_M\}.

Moreover,

\frac{t}{M} E\{[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1) \mid X_1, \ldots, X_M\} \longrightarrow \frac{1}{C(\Theta_1, \Theta_2)}\, [\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1),

and

\frac{t}{N} E\{[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} \longrightarrow \frac{1}{C(\Theta_2, \Theta_1)}\, [\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_2)

w.p.1 as t → ∞, by Condition (2.5) of the theorem and Lemma A.4. Also,

E\{\psi''(\Theta_1) \mid X_1, \ldots, X_M\} \longrightarrow \psi''(\Theta_1) \qquad \text{and} \qquad E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} \longrightarrow \psi''(\Theta_2)

w.p.1 as t → ∞, by Lemma A.4. Combining these results yields

\limsup_{t \to \infty} t\, \mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} \leq [C(\Theta_1, \Theta_2)]^{-1}[\gamma'(\psi'(\Theta_1))]^{2}\psi''(\Theta_1)\psi''(\Theta_2) + [C(\Theta_2, \Theta_1)]^{-1}[\gamma'(\psi'(\Theta_2))]^{2}\psi''(\Theta_1)\psi''(\Theta_2).

To establish the reverse inequality, first, write

t\, \mathrm{Var}\{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)} \mid \mathcal{F}_t\} = \frac{t}{M}\, \mathrm{Var}\{\sqrt{M}\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\}\, E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} + \frac{t}{N}\, \mathrm{Var}\{\sqrt{N}\sqrt{\psi''(\Theta_2)} \mid Y_1, \ldots, Y_N\}\, \big(E\{\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\}\big)^{2},     (3.1)

as in the proof of Lemma 2. Next, use Taylor’s expansion for γ ∘ ψ′ at Θ̂M = E{Θ1|X1, … , XM} to obtain

\sqrt{\psi''(\Theta_1)} = \gamma(\psi'(\Theta_1)) = \gamma(\psi'(\hat{\Theta}_M)) + \gamma'(\psi'(\Theta_M^{*}))\, \psi''(\Theta_M^{*})\, (\Theta_1 - \hat{\Theta}_M),

where ΘM* is an intermediate variable between Θ1 and Θ̂M. It follows that

\frac{t}{M}\, \mathrm{Var}\{\sqrt{M}\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\} = \frac{t}{M} \cdot \frac{1}{\psi''(\hat{\Theta}_M)}\, \mathrm{Var}\big\{\gamma'(\psi'(\Theta_M^{*}))\, \psi''(\Theta_M^{*})\, \sqrt{M \psi''(\hat{\Theta}_M)}\, (\Theta_1 - \hat{\Theta}_M) \,\big|\, X_1, \ldots, X_M\big\}.

Thus,

\liminf_{t \to \infty} \frac{t}{M}\, \mathrm{Var}\{\sqrt{M}\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\}\, E\{\psi''(\Theta_2) \mid Y_1, \ldots, Y_N\} \geq \frac{1}{C(\Theta_1, \Theta_2)}\, \psi''(\Theta_1)\, [\gamma'(\psi'(\Theta_1))]^{2}\, \psi''(\Theta_2)     (3.2)

w.p.1, by first using Fatou’s lemma, then Condition (2.5) of the theorem, the fact that ψ″(Θ̂M) → ψ″(Θ1) w.p.1, the fact that

\mathrm{Var}\big\{\gamma'(\psi'(\Theta_M^{*}))\, \psi''(\Theta_M^{*})\, \sqrt{M \psi''(\hat{\Theta}_M)}\, (\Theta_1 - \hat{\Theta}_M) \,\big|\, X_1, \ldots, X_M\big\} \longrightarrow [\gamma'(\psi'(\Theta_1))]^{2}\, [\psi''(\Theta_1)]^{2}

w.p.1 as t → ∞, since the posterior distribution of √(Mψ″(Θ̂M)) (Θ1 − Θ̂M), given X1, … , XM, is asymptotically normal with mean 0 and variance 1 (see Bickel and Yahav, 1969), and the fact that E{ψ″(Θ2)|Y1, … , YN} → ψ″(Θ2) by Lemma A.4. A similar argument yields

\liminf_{t \to \infty} \frac{t}{N}\, \mathrm{Var}\{\sqrt{N}\sqrt{\psi''(\Theta_2)} \mid Y_1, \ldots, Y_N\}\, \big[E\{\sqrt{\psi''(\Theta_1)} \mid X_1, \ldots, X_M\}\big]^{2} \geq \frac{1}{C(\Theta_2, \Theta_1)}\, \psi''(\Theta_2)\, [\gamma'(\psi'(\Theta_2))]^{2}\, \psi''(\Theta_1)     (3.3)

w.p.1. Now take the liminf in (3.1) and use (3.2) and (3.3) to complete the proof.

Proof of Theorem 1

It follows from (2.4) that

R_t(\mathcal{P}) \geq \frac{1}{t} E\big[U_M + V_N + 2\sqrt{U_M V_N}\big] + E\left[\frac{A_M}{M^{2}}\right] + E\left[\frac{B_N}{N^{2}}\right].     (3.4)

Next, let Wt = E{√(ψ″(Θ1)ψ″(Θ2))|ℱt} and Zt = Var{√(ψ″(Θ1)ψ″(Θ2))|ℱt}. Then, Zt = UMVN − Wt², which implies that

\sqrt{U_M V_N} = W_t + \frac{Z_t}{\sqrt{U_M V_N} + W_t}.

Thus,

E\big[U_M + V_N + 2\sqrt{U_M V_N}\big] = E[U_M] + E[V_N] + 2E[W_t] + E\left[\frac{2 Z_t}{\sqrt{U_M V_N} + W_t}\right] = E\big[\big(\sqrt{\psi''(\Theta_1)} + \sqrt{\psi''(\Theta_2)}\big)^{2}\big] + E\left[\frac{2 Z_t}{\sqrt{U_M V_N} + W_t}\right].     (3.5)

Combining (3.4) and (3.5) yields

t^{2} R_t(\mathcal{P}) - t\, E\big[\big(\sqrt{\psi''(\Theta_1)} + \sqrt{\psi''(\Theta_2)}\big)^{2}\big] \geq E\left[\frac{2 t Z_t}{\sqrt{U_M V_N} + W_t}\right] + E\left[\frac{t^{2}}{M^{2}} A_M\right] + E\left[\frac{t^{2}}{N^{2}} B_N\right].     (3.6)

Furthermore, by Lemma 3 and the martingale convergence theorem,

\frac{2 t Z_t}{\sqrt{U_M V_N} + W_t} \longrightarrow \frac{[\gamma'(\psi'(\Theta_1))]^{2}\, \psi''(\Theta_1)\, \psi''(\Theta_2) + [\gamma'(\psi'(\Theta_2))]^{2}\, \sqrt{\psi''(\Theta_2)}\, [\psi''(\Theta_1)]^{3/2}}{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)}\; C(\Theta_1, \Theta_2)}

w.p.1 as t → ∞; so that

\liminf_{t \to \infty} E\left\{\frac{2 t Z_t}{\sqrt{U_M V_N} + W_t}\right\} \geq E\left[\frac{[\gamma'(\psi'(\Theta_1))]^{2}\, \psi''(\Theta_1)\, \psi''(\Theta_2) + [\gamma'(\psi'(\Theta_2))]^{2}\, \sqrt{[\psi''(\Theta_1)]^{3}\, \psi''(\Theta_2)}}{\sqrt{\psi''(\Theta_1)\psi''(\Theta_2)}\; C(\Theta_1, \Theta_2)}\right],     (3.7)

by Fatou’s lemma. Finally,

\lim_{t \to \infty} E\left[\frac{t^{2}}{M^{2}} A_M\right] = E\left[\frac{\beta_1(\Theta_1) - [\alpha_1(\Theta_1)]^{2}}{[C(\Theta_1, \Theta_2)]^{2}}\right],     (3.8)
\lim_{t \to \infty} E\left[\frac{t^{2}}{N^{2}} B_N\right] = E\left[\frac{\beta_2(\Theta_2) - [\alpha_2(\Theta_2)]^{2}}{[C(\Theta_2, \Theta_1)]^{2}}\right],     (3.9)

by Lemma A.4 (see Appendix), since t²/M² → [C(Θ1, Θ2)]⁻² and t²/N² → [C(Θ2, Θ1)]⁻² w.p.1 by Condition (2.5) of the theorem, AM → β1(Θ1) − [α1(Θ1)]² and BN → β2(Θ2) − [α2(Θ2)]² w.p.1 by Lemma A.4, {AM, t > 0} and {BN, t > 0} are both uniformly integrable martingales, and t/M and t/N are bounded by Condition (2.5) of the theorem.

Then, the theorem follows by taking the liminf in (3.6) and using eqs. (3.7) to (3.9).

4. Numerical illustrations

In this section, we specialize the results of Section 2 to Bernoulli trials; that is, Xi takes values in {0, 1} for i = 1, 2, … with probabilities 1 − θ1 and θ1, respectively. Similarly, Yi takes values in {0, 1} for i = 1, 2, … with probabilities 1 − θ2 and θ2, respectively. Here, Θ1 and Θ2 are independent random variables constrained within 0 < θ1 < 1 and 0 < θ2 < 1, with prior density functions ξ1 and ξ2, respectively. Both ξ1 and ξ2 are standard uniform densities, defined as ξ1(θ1) = 1 for 0 ≤ θ1 ≤ 1 and 0 otherwise, and ξ2(θ2) = 1 for 0 ≤ θ2 ≤ 1 and 0 otherwise.
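A minimal Monte Carlo sketch of this setup is given below (an illustration of the structure only, not necessarily the exact implementation behind Table 1). Under a uniform prior, the posterior mean of θ1 after M Bernoulli observations with sum S is (S + 1)/(M + 2), the Bayes estimate of θ1 − θ2 is the difference of the two posterior means, and the fully sequential rule sketched here allocates the next observation to ℘1 whenever M/k falls below the plug-in estimate of C(θ1, θ2) = √(θ1(1 − θ1)) / (√(θ1(1 − θ1)) + √(θ2(1 − θ2))).

import numpy as np

rng = np.random.default_rng(0)

def bayes_risk_sequential(t, n_rep=100_000):
    """Monte Carlo estimate of the Bayes risk of a fully sequential plug-in
    design in the Bernoulli / uniform-prior setting (illustrative sketch)."""
    sq_err = np.empty(n_rep)
    for r in range(n_rep):
        th1, th2 = rng.uniform(), rng.uniform()      # parameters drawn from the uniform priors
        x_sum, m = rng.binomial(1, th1), 1           # one initial observation from each population
        y_sum, n = rng.binomial(1, th2), 1
        for k in range(2, t):
            p1 = (x_sum + 1) / (m + 2)               # posterior means under uniform priors
            p2 = (y_sum + 1) / (n + 2)
            c_hat = np.sqrt(p1 * (1 - p1)) / (np.sqrt(p1 * (1 - p1)) + np.sqrt(p2 * (1 - p2)))
            if m / k < c_hat:                        # plug-in allocation rule
                x_sum += rng.binomial(1, th1)
                m += 1
            else:
                y_sum += rng.binomial(1, th2)
                n += 1
        est = (x_sum + 1) / (m + 2) - (y_sum + 1) / (n + 2)   # Bayes estimate of theta1 - theta2
        sq_err[r] = (est - (th1 - th2)) ** 2
    return sq_err.mean()

print(bayes_risk_sequential(t=10))

Averaging the squared error over the prior draws approximates ℛt(P); repeating the loop with the allocation held near the oracle fraction M ≈ tC(θ1, θ2) gives one natural counterpart of ℛt(O), and the difference scaled by t² corresponds to the quantity EF reported in Table 1.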

Table 1 displays two columns of Bayes risks, where ℛt(P) and ℛt(O) correspond to the fully sequential procedure and to the optimal sampling scheme as described in Rekab (1990), respectively. In Table 1, we also report EF, the second-order excess of the Bayes risk (a measure of the rate of convergence), defined as

EF = t²{ℛt(P) − ℛt(O)}.

The results indicate that the Bayes risks of both the fully sequential procedure and the optimal sampling scheme decrease with increasing t, and that the absolute excess of the Bayes risk of the fully sequential procedure over that of the optimal sampling scheme diminishes. Overall, the fully sequential procedure performs closely to the optimal sampling scheme in terms of the second-order excess. The excess EF increases up to t = 100 and then diminishes again as t grows. The numerical simulation confirms that the second-order excess remains bounded with increasing t under a non-conjugate prior for a one-parameter exponential family.
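As an added back-of-the-envelope check of the leading term, for Bernoulli populations with uniform priors the first-order constant of the lower bound evaluates to

E\Big[\big(\sqrt{\Theta_1(1 - \Theta_1)} + \sqrt{\Theta_2(1 - \Theta_2)}\big)^{2}\Big] = 2\, E[\Theta_1(1 - \Theta_1)] + 2\,\big(E\big[\sqrt{\Theta_1(1 - \Theta_1)}\big]\big)^{2} = \frac{1}{3} + 2\left(\frac{\pi}{8}\right)^{2} = \frac{1}{3} + \frac{\pi^{2}}{32} \approx 0.642,

and the products tℛt(P) in Table 1 (for example, 300 × 0.0021133 ≈ 0.634) are indeed of this order, consistent with the first-order term of the bound.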

5. Concluding remarks

In this study, we address the problem of estimating the mean difference between two populations, ℘1 and ℘2, modeled by one-parameter exponential families of probability distributions. The objective was to estimate the difference ψ′(θ1) − ψ′(θ2) using a Bayesian approach, with a focus on minimizing the Bayes risk through sequential designs. The central contribution of this work lies in establishing an asymptotic second-order lower bound for the Bayes risk without explicit assumptions on conjugate priors. By extending the results of previous works, such as Woodroofe and Hardwick (1990) and Rekab (1990), we generalized the framework beyond normal distributions with unit variance, offering a more comprehensive approach to Bayesian estimation in the context of one-parameter exponential families. The main result underscores the complexities inherent in Bayesian estimation, particularly in the absence of explicit prior specifications.

Application of the main result to the exponential distribution with a nonstandard gamma prior, as well as a numerical illustration with the Bernoulli distribution under a uniform prior, were given. The numerical simulation illustrates the second-order lower bound by comparing the fully sequential design with the optimal design for several values of the sample size. While the study presents a fully sequential design and does not address stage-wise procedures, other designs are also of interest, including the two-stage design and the myopic design (see Terbeche, 2000). Furthermore, in order to refine the approximation of the Bayes risk further, it may be desirable to attain higher-order optimality (see Martinsek, 1983). Last but not least, this study could further benefit from examining the tightness of the lower bound without specifying a conjugate prior.

Appendix
Lemma A.1

For i = 1, 2, let μ̂in = E{ψ′(Θi)|Xi1, … , Xin}, where X1j = Xj and X2j = Yj. If Θi has a compact support on Ω, then

\hat{\mu}_{in} = \bar{X}_{in} + \frac{1}{n}\, \alpha_{in},

where αin = E{αii)|Xi1, … , Xin}.

Proof

For simplicity, the subscript “i” is omitted in the proof. Let

L_n(\theta) = \exp\{n \theta \bar{x}_n - n \psi(\theta)\} \qquad \text{and} \qquad c_n = \int L_n(\theta)\, \xi(\theta)\, d\theta,

where x1, … , xn are the observed values of X1, … , Xn. Then,

\hat{\mu}_n = E\{\psi'(\Theta) \mid X_1 = x_1, \ldots, X_n = x_n\} = \frac{1}{c_n} \int_{\Omega} \psi'(\theta)\, L_n(\theta)\, \xi(\theta)\, d\theta
= -\frac{1}{n c_n} \int_{\Omega} L_n'(\theta)\, \xi(\theta)\, d\theta + \bar{x}_n
= \frac{1}{n c_n} \int_{\Omega} L_n(\theta)\, \xi'(\theta)\, d\theta + \bar{x}_n
= \frac{1}{n c_n} \int_{\Omega} L_n(\theta)\, \alpha(\theta)\, \xi(\theta)\, d\theta + \bar{x}_n
= \frac{1}{n} E\{\alpha(\Theta) \mid X_1 = x_1, \ldots, X_n = x_n\} + \bar{x}_n

by using integration by parts. The lemma follows.

Lemma A.2

If Θi has a compact support, then

\mathrm{Var}\{\psi'(\Theta_i) \mid X_{i1}, \ldots, X_{in}\} = E\{[\psi'(\Theta_i) - \bar{X}_{in}]^{2} \mid X_{i1}, \ldots, X_{in}\} - \frac{1}{n^{2}}\big[E\{\alpha_i(\Theta_i) \mid X_{i1}, \ldots, X_{in}\}\big]^{2},

where X1 j = Xj and X2 j = Yj.

Proof

For simplicity, the subscript “i” is omitted in the proof. Lemma A.1 yields

\psi'(\Theta) - \hat{\mu}_n = \psi'(\Theta) - \bar{X}_n - \frac{1}{n}\, \alpha_n.

Thus,

\mathrm{Var}\{\psi'(\Theta) \mid X_1, \ldots, X_n\} = E\{[\psi'(\Theta) - \hat{\mu}_n]^{2} \mid X_1, \ldots, X_n\}
= E\{[\psi'(\Theta) - \bar{X}_n]^{2} \mid X_1, \ldots, X_n\} - \frac{2}{n}\, \alpha_n\big[E\{\psi'(\Theta) \mid X_1, \ldots, X_n\} - \bar{X}_n\big] + \frac{1}{n^{2}}\, \alpha_n^{2}
= E\{[\psi'(\Theta) - \bar{X}_n]^{2} \mid X_1, \ldots, X_n\} - \frac{2}{n}\, \alpha_n\left(\frac{1}{n}\, \alpha_n\right) + \frac{1}{n^{2}}\, \alpha_n^{2}
= E\{[\psi'(\Theta) - \bar{X}_n]^{2} \mid X_1, \ldots, X_n\} - \frac{1}{n^{2}}\, \alpha_n^{2}.
Lemma A.3

Let M and N be as in Theorem 1. Then,

\frac{t}{M} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1) \,\Big|\, X_1, \ldots, X_M\right\} \longrightarrow 0,

and

\frac{t}{N} E\left\{\frac{[\gamma(\psi'(\Theta_2)) - \gamma(\bar{Y}_N)]^{2}}{\psi'(\Theta_2) - \bar{Y}_N}\, \alpha_2(\Theta_2) \,\Big|\, Y_1, \ldots, Y_N\right\} \longrightarrow 0

w.p.1 as t → ∞.

Proof

A simple expansion yields

\gamma(\psi'(\Theta_1)) = \gamma(\bar{X}_M) + \gamma'(U_M^{*})\,[\psi'(\Theta_1) - \bar{X}_M],

where U*M is a random variable between ψ′(Θ1) and X̄M. Combining this observation and Lemma A.1 yields

\frac{t}{M} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1) \,\Big|\, X_1, \ldots, X_M\right\} = \frac{t}{M^{2}}\, [\gamma'(U_M^{*})]^{2}\, E\{[\alpha_1(\Theta_1)]^{2} \mid X_1, \ldots, X_M\}.

Next, there exist positive numbers a and b such that |γ′(U*M)| ≤ a w.p.1 and |α1(Θ1)| ≤ b w.p.1, since γ is continuously differentiable on ψ′(Ω1) and α1 is continuously differentiable on Ω1, the compact support of Θ1. Moreover, by Condition (2.5), there exists δ > 0 such that M ≥ δt for all sufficiently large t, w.p.1. It follows from these observations that

\left|\frac{t}{M} E\left\{\frac{[\gamma(\psi'(\Theta_1)) - \gamma(\bar{X}_M)]^{2}}{\psi'(\Theta_1) - \bar{X}_M}\, \alpha_1(\Theta_1) \,\Big|\, X_1, \ldots, X_M\right\}\right| \leq \frac{a^{2} b^{2}\, t}{M^{2}} \leq \frac{a^{2} b^{2}}{\delta^{2} t} \longrightarrow 0

w.p.1 as t → ∞. The second assertion can be established similarly.

Lemma A.4

Let h11) and h22) be continuous functions of Θ1 and Θ2, respectively. Then, E{h11)|X1, … , XM} and E{h22)|Y1, … , YN} are uniformly integrable martingales and

E\{h_1(\Theta_1) \mid X_1, \ldots, X_M\} \longrightarrow h_1(\Theta_1) \qquad \text{and} \qquad E\{h_2(\Theta_2) \mid Y_1, \ldots, Y_N\} \longrightarrow h_2(\Theta_2)

w.p.1 as t → ∞.

Proof

See Theorem 6.6.2 of Ash and Doleans-Dade (2000).


Table 1

Second-order optimality of sequential design with uniform priors

t      ℛt(P)        ℛt(O)        EF
10     0.0471716    0.0458398    0.13317
20     0.0273274    0.0267399    0.23499
40     0.0148670    0.0145854    0.45066
50     0.0120676    0.0118844    0.45818
100    0.0062212    0.0061707    0.50473
200    0.0031577    0.0031458    0.47633
300    0.0021133    0.0021110    0.21545

Note. ℛt(P) represents the Bayes risk incurred by the sequential design; ℛt(O) represents the Bayes risk incurred by the optimal design.


References
  1. Ash R and Doleans-Dade C (2000). Probability and Measure Theory, Academic Press, New York.
  2. Bickel PJ and Yahav JA (1969). Some contributions to the asymptotic theory of Bayes solutions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 11, 257-276.
  3. Benkamra Z, Terbeche M, and Tlemcani M (2015). Nearly second-order three-stage design for estimating a product of several Bernoulli proportions. Journal of Statistical Planning and Inference, 167, 90-101.
  4. Diaconis P and Ylvisaker D (1979). Conjugate priors for exponential families. The Annals of Statistics, 7, 269-281.
  5. Lehmann EL (1959). Testing Statistical Hypotheses, John Wiley & Sons, New York.
  6. Martinsek AT (1983). Second order approximation to the risk of a sequential procedure. The Annals of Statistics, 11, 827-836.
  7. Rekab K (1989). Asymptotic efficiency in sequential designs for estimation. Sequential Analysis, 8, 269-280.
  8. Rekab K (1990). Asymptotic efficiency in sequential designs for estimation in the exponential family case. Sequential Analysis, 9, 305-315.
  9. Rekab K (1992). A nearly optimal two stage procedure. Communications in Statistics-Theory and Methods, 21, 197-201.
  10. Rekab K and Tahir M (2004). An asymptotic second-order lower bound for the Bayes risk of a sequential procedure. Sequential Analysis, 25, 451-464.
  11. Shapiro CP (1985). Allocation schemes for estimating the product of positive parameters. Journal of the American Statistical Association, 80, 449-454.
  12. Song X and Rekab K (2017). Asymptotic efficiency in sequential designs for estimating product of k means in the exponential family case. Journal of Applied Mathematics, 4, 50-69.
  13. Terbeche M (2000). Sequential designs for estimation, Florida Institute of Technology.
  14. Woodroofe M (1985). Asymptotic local minimaxity in sequential point estimation. Annals of Statistics, 13, 676-688.
  15. Woodroofe M and Hardwick J (1990). Sequential allocation for estimation problem with ethical costs. Annals of Statistics, 18, 1358-1377.