Convergence rate of a test statistics observed by the longitudinal data with long memory
Communications for Statistical Applications and Methods 2017;24:481-492
Published online September 30, 2017
© 2017 Korean Statistical Society.

Yoon Tae Kim^a and Hyun Suk Park^{1,a}

^a Department of Finance and Information Statistics, Hallym University, Korea
^1 Corresponding author: Department of Statistics, Hallym University, 1 Hallimdaehak-gil, Chuncheon 24252, Korea. E-mail: hspark@hallym.ac.kr
Received May 25, 2017; Revised August 28, 2017; Accepted August 29, 2017.
 Abstract

This paper investigates the convergence rate of a test statistic given by a two-scale sampling method based on Aït-Sahalia and Jacod (Annals of Statistics, 37, 184–222, 2009). The statistic tests whether longitudinal data exhibit long memory dependence driven by a fractional Brownian motion with Hurst parameter H ∈ (1/2, 1). We obtain an upper bound in the Kolmogorov distance for the normal approximation of this test statistic. As the main tool for our work, we use the recent results in Nourdin and Peccati (Probability Theory and Related Fields, 145, 75–118, 2009; Annals of Probability, 37, 2231–2261, 2009), which are obtained by combining Malliavin calculus with Stein's method for normal approximation.

Keywords : Malliavin calculus, multiple stochastic integrals, central limit theorem, Hurst parameter, longitudinal data, fractional Brownian motion
1. Introduction

A fractional Brownian motion $\{B^H(t),\ t\ge 0\}$ with Hurst parameter $H\in(0,1)$ is a centered Gaussian process with covariance function

$$E\bigl[B^{H}(t)B^{H}(s)\bigr]=\frac{1}{2}\bigl(t^{2H}+s^{2H}-|t-s|^{2H}\bigr),\qquad t,s\ge 0.$$

The Hurst parameter H ∈ (0, 1) characterizes the self-similar behavior of the process: it governs the long-range dependence of the increments and determines the regularity of the sample paths. Therefore, properly estimating the Hurst parameter H is of the utmost importance. Many methods for estimating H of $\{B^H(t),\ t\ge 0\}$ have been proposed, such as wavelets, k-variations, variograms, maximum likelihood, and spectral methods; some of these can be found in the book by Beran (1994).

This paper investigates the convergence rate of the test statistic $F_n$ used to decide whether the error process is a Brownian motion or a genuine fractional Brownian motion in the following longitudinal model:

$$Y(t)=\beta_0+\beta_1 x(t)+B^{H}(t),\qquad t\in[0,T],\tag{1.1}$$

where x(t) is a non-random function. In terms of the Hurst parameter, this test can be formulated as:

$$H_0:\ H=\tfrac{1}{2}\qquad\text{vs.}\qquad H_1:\ H\neq\tfrac{1}{2}.$$

This test statistic $F_n$, based on the ratio of two realized power variations with different sampling frequencies, has the form

$$F_n=\frac{\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}Y\bigr|^{2}}{\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}},\tag{1.2}$$

where $\Delta_{l}^{n}Y=Y(l\Delta_n)-Y((l-1)\Delta_n)$ and $\Delta_{l,k}^{n}Y=Y(lk\Delta_n)-Y((l-1)k\Delta_n)$ for a fixed positive integer $k$. Kim and Park (2015) prove that

$$\frac{1}{\sqrt{\Delta_n}}\bigl(F_n-k^{2H-1}\bigr)\ \overset{d}{\longrightarrow}\ N\!\left(0,\,\frac{k^{4H-2}\sigma^{2}}{T^{2}}\right),\tag{1.3}$$

where $\sigma^{2}$ is given by

$$\sigma^{2}=2T\left\{(k+1)\sum_{j\in\mathbb{Z}}\rho_{H}(j)^{2}-2k^{-2H}\sum_{l\in\mathbb{Z}}\sum_{j=1}^{k}\Bigl(\sum_{r=1}^{k}\rho_{H}(lk+r-j)\Bigr)^{2}\right\}.\tag{1.4}$$

Here $\rho_H$ is the autocovariance function of the increments of fractional Brownian motion, given by

$$\rho_{H}(l)=\frac{1}{2}\bigl(|l+1|^{2H}+|l-1|^{2H}-2|l|^{2H}\bigr).$$

From (1.3), we reject H0 if

$$|F_n-1|>\sqrt{\Delta_n}\,z_{\alpha/2}\,\frac{\sigma}{T},$$

where $\mathbb{P}(Z\ge z_{\alpha/2})=\alpha/2$ and $Z\sim N(0,1)$.
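To make the two-scale construction concrete, the following sketch (an illustration added here, not taken from the paper; the helper names fbm_path and two_scale_statistic are introduced only for this example, and fractional Brownian motion is simulated by a Cholesky factorization of its covariance) computes $F_n$ from a simulated path of model (1.1). Under $H_0$ ($H=1/2$) the ratio concentrates near 1, while under long memory it concentrates near $k^{2H-1}$; the smooth regressor contributes only a negligible $O(\Delta_n)$ amount to each quadratic variation.

```python
import numpy as np

def fbm_path(n, H, T=1.0, rng=None):
    """Simulate B^H on the grid t_l = l*T/n, l = 0, ..., n, by a Cholesky
    factorization of the covariance function given in Section 1."""
    rng = np.random.default_rng(rng)
    t = np.arange(1, n + 1) * (T / n)
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))   # small jitter for stability
    return np.concatenate(([0.0], L @ rng.standard_normal(n)))

def two_scale_statistic(Y, k):
    """F_n of (1.2): realized quadratic variation over spacing k*Delta_n
    divided by the realized quadratic variation over spacing Delta_n."""
    coarse = np.sum(np.diff(Y[::k]) ** 2)   # increments over k*Delta_n
    fine = np.sum(np.diff(Y) ** 2)          # increments over Delta_n
    return coarse / fine

if __name__ == "__main__":
    n, k, T = 2000, 2, 1.0
    t = np.linspace(0.0, T, n + 1)
    x = np.sin(2.0 * np.pi * t)                      # a non-random regressor x(t)
    rng = np.random.default_rng(0)
    for H in (0.5, 0.7, 0.8):
        Y = 1.0 + 0.1 * x + fbm_path(n, H, T, rng)   # model (1.1)
        print(f"H={H}: F_n={two_scale_statistic(Y, k):.3f}, "
              f"k^(2H-1)={k ** (2 * H - 1):.3f}")
```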

Asymptotic analysis only describes the properties of a statistic (in our case, the central limit theorem (CLT)) when the sample size becomes arbitrarily large, whereas in practice the sample size is finite. Our main result therefore gives information on how close the distribution of $F_n$ is to the Gaussian distribution for a given sample size.

If the data $\{Y(t)\}$ have the long memory property, that is, if $H_0$ is rejected, then we may use the model (1.1) for statistical applications. Suppose we observe $\{Y_i(t)\}$ at times $j\Delta_n$, $j=1,\ldots,[T/\Delta_n]$, and at cross sections $i=1,\ldots,d$. Assume that all series in the longitudinal data have the same Hurst parameter $H$. For practical purposes, we first have to estimate the Hurst parameter $H$; a realization of the estimator $\hat H_{ols}(n,d)$ proposed in this paper, computed from the data $Y_i$, is then plugged into $H$ in the model (1.1). The estimator $\hat H_{ols}(n,d)$ has the following form:

$$\hat H_{ols}(n,d)=\frac{\sum_{i=1}^{d}\log\bigl(U_n(i)\bigr)+d\log k}{d\log k^{2}},$$

where

$$U_n(i)=\frac{\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}Y_i\bigr|^{2}}{\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y_i\bigr|^{2}}.$$
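As a companion sketch (again illustrative rather than taken from the paper; fbm_path is redefined so the snippet is self-contained, and hurst_ols is a name introduced only here), the displayed formula for $\hat H_{ols}(n,d)$ can be evaluated on $d$ simulated series sharing the same Hurst parameter.

```python
import numpy as np

def fbm_path(n, H, T=1.0, rng=None):
    """Cholesky simulation of B^H on t_l = l*T/n (as in the earlier sketch)."""
    rng = np.random.default_rng(rng)
    t = np.arange(1, n + 1) * (T / n)
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
    return np.concatenate(([0.0], L @ rng.standard_normal(n)))

def hurst_ols(series, k):
    """H_ols(n, d): sum log(U_n(i)) over the d series, add d*log(k),
    and divide by d*log(k^2), as in the displayed formula."""
    logU = [np.log(np.sum(np.diff(Y[::k]) ** 2) / np.sum(np.diff(Y) ** 2))
            for Y in series]
    d = len(series)
    return (np.sum(logU) + d * np.log(k)) / (d * np.log(k ** 2))

if __name__ == "__main__":
    n, k, d, H = 1500, 2, 10, 0.8
    rng = np.random.default_rng(1)
    t = np.linspace(0.0, 1.0, n + 1)
    series = [0.3 + 0.1 * t + fbm_path(n, H, 1.0, rng) for _ in range(d)]
    print(f"true H = {H}, estimated H = {hurst_ols(series, k):.3f}")
```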

The model (1.1) becomes

$$Y_i(t)=(\beta_0+u_i)+\beta_1 x_i(t)+\varepsilon_i(t),\qquad i=1,\ldots,d,\ t\in[0,T],\tag{1.5}$$

where the error term $\varepsilon_i(t)$ is a fractional Brownian motion with $\varepsilon_i(t+h)-\varepsilon_i(t)\sim N\bigl(0,\sigma^{2}h^{2\hat H_{ols}(n,d)}\bigr)$. After that, we may use standard longitudinal data analysis to estimate the linear regression model (1.5).

The main tool for the proof of the Berry-Esseen bound is the combination of Stein's method and Malliavin calculus, together with the results in Nourdin and Peccati (2009a, 2009b). Recently, Berry-Esseen bounds for various statistics, such as estimators of parameters in stochastic differential equations and stochastic partial differential equations, have been studied extensively (Kim and Park, 2016, 2017a, 2017b).

2. Preliminaries

In this section, we briefly review some facts about Malliavin calculus for Gaussian processes; for a more detailed account, see Nualart (2006). Suppose that $\mathfrak{H}$ is a real separable Hilbert space with scalar product denoted by $\langle\cdot,\cdot\rangle_{\mathfrak{H}}$. Let $X=\{X(h),\ h\in\mathfrak{H}\}$ be an isonormal Gaussian process, that is, a centered Gaussian family of random variables such that $E[X(h)X(g)]=\langle h,g\rangle_{\mathfrak{H}}$. If $X=B^{H}$, then

$$E\bigl[B^{H}(t)B^{H}(s)\bigr]=\bigl\langle 1_{[0,s]},1_{[0,t]}\bigr\rangle_{\mathfrak{H}}=\frac{1}{2}\bigl(t^{2H}+s^{2H}-|t-s|^{2H}\bigr).$$

For every $q\ge 1$, let $\mathcal{H}_q$ be the $q$th Wiener chaos of $X$, that is, the closed linear subspace of $L^2(\Omega)$ generated by $\{H_q(X(h)):\ h\in\mathfrak{H},\ \|h\|_{\mathfrak{H}}=1\}$, where $H_q$ is the $q$th Hermite polynomial. We define a linear isometric mapping $I_q:\mathfrak{H}^{\odot q}\to\mathcal{H}_q$ by $I_q(h^{\otimes q})=H_q(X(h))$, where $\mathfrak{H}^{\odot q}$ denotes the $q$th symmetric tensor product of $\mathfrak{H}$. The following duality formula holds:

$$E\bigl[F\,I_q(h)\bigr]=E\bigl[\bigl\langle D^{q}F,\,h\bigr\rangle_{\mathfrak{H}^{\otimes q}}\bigr],$$

for any element $h\in\mathfrak{H}^{\odot q}$ and any random variable $F\in\mathbb{D}^{q,2}$, where the norm of $\mathbb{D}^{q,2}$ is given by

$$\|F\|_{q,2}^{2}=E[F^{2}]+\sum_{k=1}^{q}E\bigl[\|D^{k}F\|_{\mathfrak{H}^{\otimes k}}^{2}\bigr],$$

and $D^{k}$ denotes the $k$th iterated Malliavin derivative. The linear isometric mapping $I_q$ satisfies $I_q(f)=I_q(\tilde f)$ and

$$E\bigl[I_p(f)I_q(g)\bigr]=\begin{cases}0, & \text{if } p\neq q,\\ p!\,\langle \tilde f,\tilde g\rangle_{\mathfrak{H}^{\otimes p}}, & \text{if } p=q,\end{cases}$$

where $\tilde f$ and $\tilde g$ denote the symmetrizations of $f$ and $g$.

If $f\in\mathfrak{H}^{\odot q}$, the Malliavin derivative of the multiple stochastic integral $I_q(f)$ is given by

$$D_z I_q(f)=q\,I_{q-1}\bigl(f(\cdot,z)\bigr),\qquad z\in[0,T].$$

Let $\{e_l,\ l\ge 1\}$ be a complete orthonormal system in $\mathfrak{H}$.

If $f\in\mathfrak{H}^{\odot p}$ and $g\in\mathfrak{H}^{\odot q}$, the contraction $f\otimes_r g$, $1\le r\le p\wedge q$, is the element of $\mathfrak{H}^{\otimes(p+q-2r)}$ defined by

$$f\otimes_r g=\sum_{l_1,\ldots,l_r=1}^{\infty}\bigl\langle f,\,e_{l_1}\otimes\cdots\otimes e_{l_r}\bigr\rangle_{\mathfrak{H}^{\otimes r}}\otimes\bigl\langle g,\,e_{l_1}\otimes\cdots\otimes e_{l_r}\bigr\rangle_{\mathfrak{H}^{\otimes r}}.$$

Notice that the tensor product $f\otimes g$ and the contraction $f\otimes_r g$, $1\le r\le p\wedge q$, are not necessarily symmetric even when $f$ and $g$ are symmetric. We denote their symmetrizations by $f\,\widetilde{\otimes}\,g$ and $f\,\widetilde{\otimes}_r\,g$, respectively. The following formula for the product of multiple stochastic integrals will be used frequently in the proof of the main result:

Proposition 1

Let $f\in\mathfrak{H}^{\odot p}$ and $g\in\mathfrak{H}^{\odot q}$ be two symmetric functions. Then

$$I_p(f)\,I_q(g)=\sum_{r=0}^{p\wedge q}r!\binom{p}{r}\binom{q}{r}\,I_{p+q-2r}\bigl(f\otimes_r g\bigr).$$
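In the one-dimensional case $\|h\|_{\mathfrak{H}}=1$, where $I_q(h^{\otimes q})=H_q(X(h))$ and every contraction $h^{\otimes p}\otimes_r h^{\otimes q}$ equals $h^{\otimes(p+q-2r)}$, Proposition 1 reduces to the classical linearization formula for probabilists' Hermite polynomials, $H_p(x)H_q(x)=\sum_{r}r!\binom{p}{r}\binom{q}{r}H_{p+q-2r}(x)$. The short check below (an illustration, not part of the paper; the helper hermite is introduced only here) verifies this identity numerically with NumPy's HermiteE basis.

```python
import numpy as np
from math import comb, factorial
from numpy.polynomial import hermite_e as He   # probabilists' Hermite polynomials

def hermite(q, x):
    """Evaluate the probabilists' Hermite polynomial H_q at the points x."""
    return He.hermeval(x, [0.0] * q + [1.0])

# With f = h^{(x)p}, g = h^{(x)q} and ||h|| = 1, every contraction f (x)_r g
# equals h^{(x)(p+q-2r)}, so Proposition 1 reads
#     H_p(x) H_q(x) = sum_{r=0}^{min(p,q)} r! C(p,r) C(q,r) H_{p+q-2r}(x).
p, q = 3, 4
x = np.linspace(-2.0, 2.0, 9)
lhs = hermite(p, x) * hermite(q, x)
rhs = sum(factorial(r) * comb(p, r) * comb(q, r) * hermite(p + q - 2 * r, x)
          for r in range(min(p, q) + 1))
print(np.allclose(lhs, rhs))   # True
```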

Now we introduce the infinitesimal generator $L$ of the Ornstein-Uhlenbeck semigroup and the relation of the operator $L$ with the operators $D$ and $\delta$ (see Subsection 1.4 in Nualart (2006) for more details). Let $F\in L^2(\Omega)$ be a square integrable random variable. For each $n\ge 0$, we denote by $J_n:L^2(\Omega)\to\mathcal{H}_n$ the orthogonal projection onto the $n$th Wiener chaos $\mathcal{H}_n$. The operator $L$ is defined through the projection operators $J_n$, $n=0,1,2,\ldots$, as $LF=\sum_{n=0}^{\infty}-n\,J_nF$, and is called the infinitesimal generator of the Ornstein-Uhlenbeck semigroup. The relationship between the operators $D$, $\delta$, and $L$ is given by $\delta DF=-LF$; that is, for $F\in L^2(\Omega)$ the statement $F\in\operatorname{Dom}(L)$ is equivalent to $F\in\operatorname{Dom}(\delta D)$ (i.e., $F\in\mathbb{D}^{1,2}$ and $DF\in\operatorname{Dom}(\delta)$), and in this case $\delta DF=-LF$. We also define the operator $L^{-1}$, the pseudo-inverse of $L$, as $L^{-1}F=\sum_{n=1}^{\infty}-\frac{1}{n}J_n(F)$. Note that $L^{-1}$ is an operator with values in $\mathbb{D}^{2,2}$ and that $LL^{-1}F=F-E[F]$ for all $F\in L^2(\Omega)$.

3. Main results

In this section, we investigate the convergence rate of the CLT in (1.3). First recall that for every $z\in\mathbb{R}$, the function

$$f_z(x)=e^{\frac{x^{2}}{2}}\int_{-\infty}^{x}\bigl\{1_{(-\infty,z]}(u)-\Phi(z)\bigr\}e^{-\frac{u^{2}}{2}}\,du=\begin{cases}\sqrt{2\pi}\,e^{\frac{x^{2}}{2}}\,\Phi(x)\{1-\Phi(z)\}, & \text{if } x\le z,\\ \sqrt{2\pi}\,e^{\frac{x^{2}}{2}}\,\Phi(z)\{1-\Phi(x)\}, & \text{if } x>z,\end{cases}$$

is a solution to the following Stein equation, and satisfies $\|f_z\|_{\infty}\le\sqrt{2\pi}/4$ and $\|f_z'\|_{\infty}\le 1$:

$$f'(x)-x\,f(x)=1_{(-\infty,z]}(x)-\Phi(z).$$

The derivative of $f_z$ is given by

$$f_z'(x)=\begin{cases}\{1-\Phi(z)\}\bigl\{1+\sqrt{2\pi}\,x\,e^{\frac{x^{2}}{2}}\,\Phi(x)\bigr\}, & \text{if } x\le z,\\ \Phi(z)\bigl[-1+\sqrt{2\pi}\,x\,e^{\frac{x^{2}}{2}}\{1-\Phi(x)\}\bigr], & \text{if } x>z.\end{cases}$$
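The piecewise formulas for $f_z$ and $f_z'$ can be checked numerically; the snippet below (illustrative only, assuming SciPy is available for the normal cdf) confirms that the Stein equation and the bound $\|f_z\|_\infty\le\sqrt{2\pi}/4$ hold on a grid of points.

```python
import numpy as np
from scipy.stats import norm

SQRT2PI = np.sqrt(2.0 * np.pi)

def f_z(x, z):
    """Piecewise solution of Stein's equation displayed above."""
    x, Phi = np.asarray(x, dtype=float), norm.cdf
    lower = SQRT2PI * np.exp(x ** 2 / 2) * Phi(x) * (1.0 - Phi(z))   # x <= z
    upper = SQRT2PI * np.exp(x ** 2 / 2) * Phi(z) * (1.0 - Phi(x))   # x > z
    return np.where(x <= z, lower, upper)

def f_z_prime(x, z):
    """Closed-form derivative of f_z displayed above."""
    x, Phi = np.asarray(x, dtype=float), norm.cdf
    lower = (1.0 - Phi(z)) * (1.0 + SQRT2PI * x * np.exp(x ** 2 / 2) * Phi(x))
    upper = Phi(z) * (-1.0 + SQRT2PI * x * np.exp(x ** 2 / 2) * (1.0 - Phi(x)))
    return np.where(x <= z, lower, upper)

z = 0.7
x = np.linspace(-3.0, 3.0, 401)
# Residual of the Stein equation f'(x) - x f(x) = 1_{(-inf, z]}(x) - Phi(z)
residual = f_z_prime(x, z) - x * f_z(x, z) - ((x <= z).astype(float) - norm.cdf(z))
print(np.max(np.abs(residual)))                       # numerically zero
print(np.max(np.abs(f_z(x, z))) <= SQRT2PI / 4.0)     # sup-norm bound on f_z holds
```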

We use the following lemma, given by Michel and Pfanzagl (1971), to prove our main result.

Lemma 1

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and let $G_n$ and $V_n$ be $\mathcal{F}$-measurable random variables such that $V_n>0$ a.s. for all $n$. Then for any $\varepsilon>0$, we have

$$\sup_{z\in\mathbb{R}}\Bigl|\mathbb{P}\Bigl(\frac{G_n}{V_n}\le z\Bigr)-\mathbb{P}(Z\le z)\Bigr|\le\sup_{z\in\mathbb{R}}\bigl|\mathbb{P}(G_n\le z)-\mathbb{P}(Z\le z)\bigr|+\mathbb{P}\bigl(|V_n-1|\ge\varepsilon\bigr)+\varepsilon.$$

Now we obtain the Berry-Esseen bound of the test statistic $F_n$ given in (1.2).

Theorem 1

Suppose that $|x(t)-x(s)|\le c\,|t-s|$ for some $c>0$, where $x(t)$ is the regressor in (1.1). If $H>1/2$, there exists a constant $c>0$ such that, for sufficiently large $n$,

$$\sup_{z\in\mathbb{R}}\left|\mathbb{P}\left(\frac{T}{\sqrt{\Delta_n}\,k^{2H-1}\sigma}\bigl(F_n-k^{2H-1}\bigr)\le z\right)-\mathbb{P}(Z\le z)\right|\le c\,\min\Bigl\{\Delta_n^{\frac{3-4H}{2}},\,\Delta_n^{\frac{1-H}{2}}\Bigr\},$$

where $\sigma^{2}$ is given in (1.4).

Proof

Throughout this proof, c stands for an absolute constant with possibly different values in different places. Using Lemma 1, we have that for any 0 < ε < 1,

$$\begin{aligned}
&\sup_{z\in\mathbb{R}}\left|\mathbb{P}\left(\frac{T}{\sqrt{\Delta_n}\,k^{2H-1}\sigma}\bigl(F_n-k^{2H-1}\bigr)\le z\right)-\mathbb{P}(Z\le z)\right|\\
&\quad=\sup_{z\in\mathbb{R}}\left|\mathbb{P}\left(\frac{T}{\sqrt{\Delta_n}\,k^{2H-1}\sigma}\cdot\frac{\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}Y\bigr|^{2}-k^{2H-1}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}}{\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}}\le z\right)-\mathbb{P}(Z\le z)\right|\\
&\quad\le\sup_{z\in\mathbb{R}}\left|\mathbb{P}\left(\frac{\Delta_n^{1-2H}}{\sqrt{\Delta_n}\,k^{2H-1}\sigma}\Bigl\{\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}Y\bigr|^{2}-k^{2H-1}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}\Bigr\}\le z\right)-\mathbb{P}(Z\le z)\right|\\
&\qquad+\mathbb{P}\left(\Bigl|\frac{\Delta_n^{1-2H}}{T}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}-1\Bigr|\ge\varepsilon\right)+\varepsilon.
\end{aligned}\tag{3.6}$$

First consider the second term in (3.6). We write

$$\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}=\sum_{l=1}^{[T/\Delta_n]}\Bigl\{\beta_1^{2}\bigl(\Delta_{l}^{n}x\bigr)^{2}+2\beta_1\bigl(\Delta_{l}^{n}x\bigr)\bigl(\Delta_{l}^{n}B^{H}\bigr)+\bigl(\Delta_{l}^{n}B^{H}\bigr)^{2}\Bigr\}:=A_{1}^{n}+A_{2}^{n}+A_{3}^{n}.\tag{3.7}$$

Therefore,

$$\mathbb{P}\left(\Bigl|\frac{\Delta_n^{1-2H}}{T}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}-1\Bigr|\ge\varepsilon\right)\le\frac{1}{\varepsilon}\left\{\frac{\Delta_n^{1-2H}}{T}A_{1}^{n}+\frac{\Delta_n^{1-2H}}{T}E\bigl[\bigl|A_{2}^{n}\bigr|\bigr]+E\left[\Bigl|\frac{\Delta_n^{1-2H}}{T}A_{3}^{n}-1\Bigr|\right]\right\}.\tag{3.8}$$

By the assumption on x(t), the first term in (3.8) can be estimated as

$$\frac{\Delta_n^{1-2H}}{T}A_{1}^{n}\le c\,\beta_1^{2}\,\frac{\Delta_n^{1-2H}}{T}\Bigl[\frac{T}{\Delta_n}\Bigr]\Delta_n^{2}\le c\,\Delta_n^{2(1-H)}.\tag{3.9}$$

By the Cauchy-Schwarz inequality, we get

$$\frac{\Delta_n^{1-2H}}{T}E\bigl[\bigl|A_{2}^{n}\bigr|\bigr]\le c\,\beta_1\,\frac{\Delta_n^{1-2H}}{T}\,\Delta_n\sum_{l=1}^{[T/\Delta_n]}E\bigl[\bigl|\Delta_{l}^{n}B^{H}\bigr|\bigr]\le c\,\beta_1\,\frac{\Delta_n^{1-2H}}{T}\,\Delta_n\sum_{l=1}^{[T/\Delta_n]}\sqrt{E\bigl[\bigl(\Delta_{l}^{n}B^{H}\bigr)^{2}\bigr]}\le c\,\Delta_n^{1-H}.\tag{3.10}$$

As for the third term in (3.8), we estimate

$$E\left[\Bigl|\frac{\Delta_n^{1-2H}}{T}A_{3}^{n}-1\Bigr|\right]=\frac{\Delta_n^{1-2H}}{T}E\left[\Bigl|A_{3}^{n}-\frac{T}{\Delta_n}\Delta_n^{2H}\Bigr|\right]\le\frac{\Delta_n^{1-2H}}{T}E\left[\Bigl|A_{3}^{n}-\Bigl[\frac{T}{\Delta_n}\Bigr]\Delta_n^{2H}\Bigr|\right]+\Delta_n^{2H}\Bigl(\frac{T}{\Delta_n}-\Bigl[\frac{T}{\Delta_n}\Bigr]\Bigr).\tag{3.11}$$

By using the computation of $\operatorname{Var}\bigl(\sum_{l=1}^{[T/\Delta_n]}(\Delta_l^n B^H)^2\bigr)$ in Kim and Park (2015), the first term in (3.11) can be bounded as

$$\frac{\Delta_n^{1-2H}}{T}E\left[\Bigl|A_{3}^{n}-\Bigl[\frac{T}{\Delta_n}\Bigr]\Delta_n^{2H}\Bigr|\right]=\frac{\Delta_n^{1-2H}}{T}E\left[\Bigl|\sum_{l=1}^{[T/\Delta_n]}\bigl(\Delta_{l}^{n}B^{H}\bigr)^{2}-\sum_{l=1}^{[T/\Delta_n]}E\bigl[\bigl(\Delta_{l}^{n}B^{H}\bigr)^{2}\bigr]\Bigr|\right]\le\frac{\Delta_n^{1-2H}}{T}\sqrt{\operatorname{Var}\Bigl(\sum_{l=1}^{[T/\Delta_n]}\bigl(\Delta_{l}^{n}B^{H}\bigr)^{2}\Bigr)}\le c\,\frac{\Delta_n^{1-2H}}{T}\sqrt{\Bigl[\frac{T}{\Delta_n}\Bigr]}\;\Delta_n^{2H}\sqrt{\Bigl[\frac{T}{\Delta_n}\Bigr]^{2H}-\Bigl(\Bigl[\frac{T}{\Delta_n}\Bigr]-1\Bigr)^{2H}}.\tag{3.12}$$

By the mean value theorem, we have $[T/\Delta_n]^{2H}-([T/\Delta_n]-1)^{2H}\le 2H\,[T/\Delta_n]^{2H-1}$. This inequality shows that the right-hand side of (3.12) can be estimated as

$$\frac{\Delta_n^{1-2H}}{T}E\left[\Bigl|A_{3}^{n}-\Bigl[\frac{T}{\Delta_n}\Bigr]\Delta_n^{2H}\Bigr|\right]\le c\,\Delta_n^{1-H}.\tag{3.13}$$

From (3.11) and (3.13), it follows that

$$E\left[\Bigl|\frac{\Delta_n^{1-2H}}{T}A_{3}^{n}-1\Bigr|\right]\le c\bigl(\Delta_n^{1-H}+\Delta_n^{2H}\bigr).\tag{3.14}$$

By combining the above estimates (3.9), (3.10), and (3.14), we obtain that for every ε > 0,

$$\mathbb{P}\left(\Bigl|\frac{\Delta_n^{1-2H}}{T}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}-1\Bigr|\ge\varepsilon\right)\le c\,\frac{\Delta_n^{1-H}}{\varepsilon}.\tag{3.15}$$

Let us set

$$U_n=\frac{\Delta_n^{1-2H}}{\sqrt{\Delta_n}\,k^{2H-1}\sigma}\left\{\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}Y\bigr|^{2}-k^{2H-1}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}Y\bigr|^{2}\right\}.$$

Using the multiplication formula for multiple stochastic integrals in Proposition 1 yields

$$(k\Delta_n)^{1-2H}\sum_{l=1}^{[T/(k\Delta_n)]}\bigl|\Delta_{l,k}^{n}B^{H}\bigr|^{2}=I_2\bigl(f_{n,k,2}\bigr)+(k\Delta_n)\Bigl[\frac{T}{k\Delta_n}\Bigr],\qquad \Delta_n^{1-2H}\sum_{l=1}^{[T/\Delta_n]}\bigl|\Delta_{l}^{n}B^{H}\bigr|^{2}=I_2\bigl(f_{n,1,2}\bigr)+\Delta_n\Bigl[\frac{T}{\Delta_n}\Bigr],$$

where the kernels fn,k,2 are given by

$$f_{n,k,2}=(k\Delta_n)^{1-2H}\sum_{l=1}^{[T/(k\Delta_n)]}1_{[(l-1)k\Delta_n,\,lk\Delta_n]}^{\otimes 2}.$$

Also we define a kernel fn,k,1 as:

$$f_{n,k,1}=2\beta_1(k\Delta_n)^{1-2H}\sum_{l=1}^{[T/(k\Delta_n)]}\bigl(\Delta_{l,k}^{n}x\bigr)\,1_{[(l-1)k\Delta_n,\,lk\Delta_n]}.$$

Hence it follows from (3.16) and (3.17) that

$$U_n=\frac{1}{\sqrt{\Delta_n}\,\sigma}\Bigl[I_1\bigl(f_{n,k,1}-f_{n,1,1}\bigr)+I_2\bigl(f_{n,k,2}-f_{n,1,2}\bigr)\Bigr]+\frac{1}{\sqrt{\Delta_n}\,\sigma}E[U_n].$$

Here $E[U_n]$ is given by

$$E[U_n]=(k\Delta_n)^{1-2H}\sum_{l=1}^{[T/(k\Delta_n)]}\beta_1^{2}\bigl(\Delta_{l,k}^{n}x\bigr)^{2}-\Delta_n^{1-2H}\sum_{l=1}^{[T/\Delta_n]}\beta_1^{2}\bigl(\Delta_{l}^{n}x\bigr)^{2}+(k\Delta_n)\Bigl[\frac{T}{k\Delta_n}\Bigr]-\Delta_n\Bigl[\frac{T}{\Delta_n}\Bigr].$$

Applying Lemma 2.3 in Nourdin and Peccati (2009b) to the first term on the right-hand side of (3.6), we have, using $\|f_z\|_{\infty}\le\sqrt{2\pi}/4$ and $\|f_z'\|_{\infty}\le 1$, that

$$\begin{aligned}\sup_{z\in\mathbb{R}}\bigl|\mathbb{P}(U_n\le z)-\mathbb{P}(Z\le z)\bigr|&=\sup_{z\in\mathbb{R}}\bigl|E\bigl[f_z'(U_n)-U_nf_z(U_n)\bigr]\bigr|\\&=\sup_{z\in\mathbb{R}}\Bigl|E\Bigl[f_z'(U_n)\bigl(1-\langle DU_n,-DL^{-1}U_n\rangle_{\mathfrak{H}}\bigr)\Bigr]-\frac{1}{\sqrt{\Delta_n}\,\sigma}E[U_n]\,E\bigl[f_z(U_n)\bigr]\Bigr|\\&\le\sqrt{E\Bigl[\bigl(1-\langle DU_n,-DL^{-1}U_n\rangle_{\mathfrak{H}}\bigr)^{2}\Bigr]}+\frac{\sqrt{2\pi}}{4\sqrt{\Delta_n}\,\sigma}E[U_n].\end{aligned}\tag{3.19}$$

For simplicity, we set

$$g_{n,1}=\frac{f_{n,k,1}-f_{n,1,1}}{\sqrt{\Delta_n}\,\sigma}\qquad\text{and}\qquad g_{n,2}=\frac{f_{n,k,2}-f_{n,1,2}}{\sqrt{\Delta_n}\,\sigma}.$$

Following the proof of Proposition 3.7 in Nourdin and Peccati (2009b), we estimate

$$\begin{aligned}E\Bigl[\bigl(1-\langle DU_n,-DL^{-1}U_n\rangle_{\mathfrak{H}}\bigr)^{2}\Bigr]&=E\Bigl[\bigl(1-\bigl\langle g_{n,1}+2I_1(g_{n,2}),\,g_{n,1}+I_1(g_{n,2})\bigr\rangle_{\mathfrak{H}}\bigr)^{2}\Bigr]\\&=E\Bigl[\bigl(1-\|g_{n,1}\|_{\mathfrak{H}}^{2}-3\bigl\langle g_{n,1},I_1(g_{n,2})\bigr\rangle_{\mathfrak{H}}-2\bigl\langle I_1(g_{n,2}),I_1(g_{n,2})\bigr\rangle_{\mathfrak{H}}\bigr)^{2}\Bigr]\\&\le 2\bigl(1-\|g_{n,1}\|_{\mathfrak{H}}^{2}-2\|g_{n,2}\|_{\mathfrak{H}^{\otimes2}}^{2}\bigr)^{2}+18\,\bigl\|g_{n,1}\otimes_1 g_{n,2}\bigr\|_{\mathfrak{H}}^{2}+16\,\bigl\|g_{n,2}\otimes_1 g_{n,2}\bigr\|_{\mathfrak{H}^{\otimes2}}^{2}.\end{aligned}\tag{3.20}$$

Obviously, the first term can be estimated as

$$\begin{aligned}\bigl(1-\|g_{n,1}\|_{\mathfrak{H}}^{2}-2\|g_{n,2}\|_{\mathfrak{H}^{\otimes2}}^{2}\bigr)^{2}&\le 2\|g_{n,1}\|_{\mathfrak{H}}^{4}+2\bigl(1-2\|g_{n,2}\|_{\mathfrak{H}^{\otimes2}}^{2}\bigr)^{2}\\&\le\frac{4}{\Delta_n^{2}\sigma^{4}}\bigl(\|f_{n,k,1}\|_{\mathfrak{H}}^{4}+\|f_{n,1,1}\|_{\mathfrak{H}}^{4}\bigr)+\frac{2}{\sigma^{4}}\Bigl\{\sigma^{2}-\frac{2}{\Delta_n}\bigl(\|f_{n,k,2}\|_{\mathfrak{H}^{\otimes2}}^{2}-2\langle f_{n,k,2},f_{n,1,2}\rangle_{\mathfrak{H}^{\otimes2}}+\|f_{n,1,2}\|_{\mathfrak{H}^{\otimes2}}^{2}\bigr)\Bigr\}^{2}\\&:=B_{1}^{n}+B_{2}^{n}.\end{aligned}$$

We note that

$$\rho_{H}(l)=H(2H-1)\,|l|^{2H-2}+o\bigl(|l|^{2H-2}\bigr)\qquad\text{as } |l|\to\infty.\tag{3.21}$$
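As a quick numerical aside (not part of the proof; the helper rho_H is introduced only for this illustration), the asymptotic relation (3.21) can be checked directly for a few lags.

```python
import numpy as np

def rho_H(l, H):
    """Autocovariance of fractional Gaussian noise at lag l (Section 1)."""
    l = np.abs(float(l))
    return 0.5 * ((l + 1.0) ** (2 * H) + abs(l - 1.0) ** (2 * H) - 2.0 * l ** (2 * H))

H = 0.7
for lag in (10, 100, 1000):
    exact = rho_H(lag, H)
    asymptotic = H * (2 * H - 1) * lag ** (2 * H - 2)
    print(f"l={lag}: rho_H={exact:.6f}, H(2H-1)l^(2H-2)={asymptotic:.6f}, "
          f"ratio={exact / asymptotic:.4f}")   # the ratio tends to 1
```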

For sufficiently large n, we estimate, from (3.21),

$$\sum_{l,l'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\le\sum_{|j|<[T/(k\Delta_n)]}\Bigl(\Bigl[\frac{T}{k\Delta_n}\Bigr]-|j|\Bigr)\frac{1}{2}\bigl|\,|j+1|^{2H}+|j-1|^{2H}-2|j|^{2H}\bigr|\le c\,\frac{T}{k}\,\Delta_n^{-1}\Bigl\{1+\sum_{j=1}^{[T/(k\Delta_n)]}j^{2H-2}\Bigr\}\le c\,\frac{T}{k}\,\Delta_n^{-1}\Bigl\{1+\Bigl[\frac{T}{k\Delta_n}\Bigr]^{2H-1}\Bigr\}\le c\bigl(\Delta_n^{-1}+\Delta_n^{-2H}\bigr).\tag{3.22}$$

Direct computation and the estimate (3.22) give

$$B_{1}^{n}\le c\,\frac{1}{\Delta_n^{2}\sigma^{4}}\,\Delta_n^{8-4H}\left[\Bigl\{k\sum_{l,l'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\Bigr\}^{2}+\Bigl\{\sum_{l,l'=1}^{[T/\Delta_n]}\rho_{H}(l-l')\Bigr\}^{2}\right]\le c\bigl(\Delta_n^{4-4H}+\Delta_n^{6-8H}\bigr).\tag{3.23}$$

As for the term $B_2^n$, we compute the three terms appearing in $B_2^n$:

$$\Delta_n^{-1}\bigl\|f_{n,k,2}\bigr\|_{\mathfrak{H}^{\otimes2}}^{2}=\Delta_n^{-1}(k\Delta_n)^{2-4H}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\bigl\langle 1_{[(l-1)k\Delta_n,\,lk\Delta_n]}^{\otimes2},\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}^{\otimes2}\bigr\rangle_{\mathfrak{H}^{\otimes2}}=\Delta_n^{-1}(k\Delta_n)^{2}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')^{2}=k^{2}\sum_{|j|<[T/(k\Delta_n)]}\Delta_n\Bigl(\Bigl[\frac{T}{k\Delta_n}\Bigr]-|j|\Bigr)\rho_{H}(j)^{2}.\tag{3.24}$$

Using a similar argument as for the first term in (3.24) yields

$$\begin{aligned}\Delta_n^{-1}\bigl\langle f_{n,k,2},f_{n,1,2}\bigr\rangle_{\mathfrak{H}^{\otimes2}}&=k^{1-2H}\Delta_n^{1-4H}\sum_{l=1}^{[T/(k\Delta_n)]}\sum_{l'=1}^{[T/\Delta_n]}\bigl\langle 1_{[(l-1)k\Delta_n,\,lk\Delta_n]},\,1_{[(l'-1)\Delta_n,\,l'\Delta_n]}\bigr\rangle_{\mathfrak{H}}^{2}\\&=k^{1-2H}\Delta_n\sum_{l=1}^{[T/(k\Delta_n)]}\sum_{l'=1}^{[T/(k\Delta_n)]}\sum_{j=1}^{k}\bigl\langle 1_{[(l-1)k,\,lk]},\,1_{[(l'-1)k+j-1,\,(l'-1)k+j]}\bigr\rangle^{2}\\&=k^{1-2H}\Delta_n\sum_{l=1}^{[T/(k\Delta_n)]}\sum_{l'=1}^{[T/(k\Delta_n)]}\sum_{j=1}^{k}\Bigl(\sum_{r=1}^{k}\rho_{H}\bigl((l-l')k+r-j\bigr)\Bigr)^{2}\\&=k^{1-2H}\Delta_n\sum_{|l|<[T/(k\Delta_n)]}\sum_{j=1}^{k}\Bigl(\Bigl[\frac{T}{k\Delta_n}\Bigr]-|l|\Bigr)\Bigl(\sum_{r=1}^{k}\rho_{H}(lk+r-j)\Bigr)^{2}.\end{aligned}\tag{3.25}$$

Substituting $k=1$ in (3.24), we have

$$\Delta_n^{-1}\bigl\|f_{n,1,2}\bigr\|_{\mathfrak{H}^{\otimes2}}^{2}=\sum_{|j|<[T/\Delta_n]}\Delta_n\Bigl(\Bigl[\frac{T}{\Delta_n}\Bigr]-|j|\Bigr)\rho_{H}(j)^{2}.\tag{3.26}$$

Combining the above results (3.24), (3.25), and (3.26), we obtain

$$\begin{aligned}B_{2}^{n}&\le\frac{2}{\sigma^{4}}\Biggl[\,2\,\Bigl|kT\sum_{j\in\mathbb{Z}}\rho_{H}(j)^{2}-k^{2}\sum_{|j|<[T/(k\Delta_n)]}\Delta_n\Bigl(\Bigl[\frac{T}{k\Delta_n}\Bigr]-|j|\Bigr)\rho_{H}(j)^{2}\Bigr|^{2}\\&\qquad+4\,\Bigl|Tk^{-2H}\sum_{l\in\mathbb{Z}}\sum_{j=1}^{k}\Bigl(\sum_{r=1}^{k}\rho_{H}(lk+r-j)\Bigr)^{2}-k^{1-2H}\Delta_n\sum_{|l|<[T/(k\Delta_n)]}\sum_{j=1}^{k}\Bigl(\Bigl[\frac{T}{k\Delta_n}\Bigr]-|l|\Bigr)\Bigl(\sum_{r=1}^{k}\rho_{H}(lk+r-j)\Bigr)^{2}\Bigr|^{2}\\&\qquad+2\,\Bigl|T\sum_{j\in\mathbb{Z}}\rho_{H}(j)^{2}-\sum_{|j|<[T/\Delta_n]}\Delta_n\Bigl(\Bigl[\frac{T}{\Delta_n}\Bigr]-|j|\Bigr)\rho_{H}(j)^{2}\Bigr|^{2}\Biggr]\\&:=B_{21}^{n}+B_{22}^{n}+B_{23}^{n}.\end{aligned}$$

Obviously, for sufficiently large n, we have

$$B_{21}^{n}\le\frac{4}{\sigma^{4}}\left[\,kT\sum_{|j|\ge[T/(k\Delta_n)]}\rho_{H}(j)^{2}+kT\sum_{|j|<[T/(k\Delta_n)]}\Bigl\{1-\frac{k\Delta_n}{T}\Bigl[\frac{T}{k\Delta_n}\Bigr]\Bigr\}\rho_{H}(j)^{2}+k^{2}\Delta_n\sum_{|j|<[T/(k\Delta_n)]}|j|\,\rho_{H}(j)^{2}\right]^{2}\le c\Bigl\{\Delta_n^{3-4H}+\Delta_n\bigl(1+\Delta_n^{3-4H}\bigr)+\Delta_n\bigl(1+\Delta_n^{2-4H}\bigr)\Bigr\}^{2}\le c\,\Delta_n^{2(3-4H)}.$$

By a similar estimate as for the term $B_{21}^n$ in (3.27), we get $B_{22}^{n}\le c\,\Delta_n^{2(3-4H)}$ and $B_{23}^{n}\le c\,\Delta_n^{2(3-4H)}$. Thus we have

$$B_{2}^{n}\le c\,\Delta_n^{2(3-4H)}.\tag{3.28}$$

As for the second and third terms in (3.20), observe that $\|g_{n,1}\otimes_1 g_{n,2}\|_{\mathfrak{H}}^{2}\le c\,\|f_{n,k,1}\otimes_1 f_{n,k,2}\|_{\mathfrak{H}}^{2}$. First write

$$f_{n,k,1}\otimes_1 f_{n,k,2}=\frac{2\beta_1(k\Delta_n)^{2-4H}}{\Delta_n\sigma^{2}}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\bigl(\Delta_{l,k}^{n}x\bigr)\bigl\langle 1_{[(l-1)k\Delta_n,\,lk\Delta_n]},\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}\bigr\rangle_{\mathfrak{H}}\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}=\frac{2\beta_1(k\Delta_n)^{2-2H}}{\Delta_n\sigma^{2}}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\bigl(\Delta_{l,k}^{n}x\bigr)\,\rho_{H}(l-l')\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}.\tag{3.29}$$

Therefore,

$$\bigl\|f_{n,k,1}\otimes_1 f_{n,k,2}\bigr\|_{\mathfrak{H}}^{2}=\frac{4\beta_1^{2}(k\Delta_n)^{4}}{\Delta_n^{2}\sigma^{4}}\sum_{l,l',j,j'=1}^{[T/(k\Delta_n)]}\bigl(\Delta_{l,k}^{n}x\bigr)\bigl(\Delta_{j,k}^{n}x\bigr)\rho_{H}(l-l')\rho_{H}(j-j')\rho_{H}(l'-j')\le c\,\Delta_n^{4}\sum_{l,l',j,j'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\rho_{H}(j-j')\rho_{H}(l'-j').\tag{3.30}$$

For the sum in (3.30), we decompose as follows

$$\sum_{l>l'>j>j'}^{[T/(k\Delta_n)]}+\sum_{l'>l>j>j'}^{[T/(k\Delta_n)]}+\cdots.\tag{3.31}$$

For the first term, we have

$$\Delta_n^{4}\sum_{l>l'>j>j'}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\rho_{H}(j-j')\rho_{H}(l'-j')\le\Delta_n^{4}\sum_{l>l'>j>j'}^{[T/(k\Delta_n)]}(l-l')^{2H-2}(j-j')^{4H-4}\le c\,\Delta_n^{2}\sum_{l=1}^{[T/(k\Delta_n)]}l^{2H-2}\sum_{j=1}^{[T/(k\Delta_n)]}j^{4H-4}\le c\,\Delta_n^{2}\Bigl(1+\Bigl[\tfrac{T}{k\Delta_n}\Bigr]^{2H-1}\Bigr)\Bigl(1+\Bigl[\tfrac{T}{k\Delta_n}\Bigr]^{4H-3}\Bigr)\le c\,\Delta_n^{3-2H}.\tag{3.32}$$

Obviously, the same bound also holds for the other terms in (3.31). As for the last term in (3.20), observe that $\|g_{n,2}\otimes_1 g_{n,2}\|_{\mathfrak{H}^{\otimes2}}^{2}\le c\,\|f_{n,k,2}\otimes_1 f_{n,k,2}\|_{\mathfrak{H}^{\otimes2}}^{2}$, and write

$$f_{n,k,2}\otimes_1 f_{n,k,2}=(k\Delta_n)^{2-4H}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\bigl\langle 1_{[(l-1)k\Delta_n,\,lk\Delta_n]},\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}\bigr\rangle_{\mathfrak{H}}\;1_{[(l-1)k\Delta_n,\,lk\Delta_n]}\,\widetilde{\otimes}\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}=(k\Delta_n)^{2-2H}\sum_{l,l'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\;1_{[(l-1)k\Delta_n,\,lk\Delta_n]}\,\widetilde{\otimes}\,1_{[(l'-1)k\Delta_n,\,l'k\Delta_n]}.\tag{3.33}$$

Let us set $\rho_{n,H}(j)=|\rho_{H}(j)|\,1_{\{|j|\le[T/(k\Delta_n)]\}}$. By using the arguments for the quadratic variation of fractional Brownian motion studied in Nourdin (2013), we obtain, from (3.33),

$$\begin{aligned}\bigl\|f_{n,k,2}\otimes_1 f_{n,k,2}\bigr\|_{\mathfrak{H}^{\otimes2}}^{2}&\le c\,\frac{(k\Delta_n)^{4}}{\Delta_n^{2}\sigma^{4}}\sum_{l,l',j,j'=1}^{[T/(k\Delta_n)]}\rho_{H}(l-l')\rho_{H}(j-j')\rho_{H}(l-j)\rho_{H}(l'-j')\\&\le c\,\Delta_n^{2}\sum_{l,l',j,j'=1}^{[T/(k\Delta_n)]}\rho_{n,H}(l-l')\rho_{n,H}(j-j')\rho_{n,H}(l-j)\rho_{n,H}(l'-j')\\&\le c\,\Delta_n\sum_{l\in\mathbb{Z}}\bigl(\rho_{n,H}*\rho_{n,H}\bigr)(l)^{2}\le c\,k^{4}\Delta_n\Bigl(\sum_{|l|\le[T/(k\Delta_n)]}\bigl|\rho_{H}(l)\bigr|^{\frac{4}{3}}\Bigr)^{3}\\&\le c\,\Delta_n\Bigl(1+\Bigl[\frac{T}{k\Delta_n}\Bigr]^{\frac{8H-5}{3}}\Bigr)^{3}\le c\,\Delta_n^{2(3-4H)}.\end{aligned}\tag{3.34}$$

The last term in (3.19) can easily be estimated as

$$\frac{\sqrt{2\pi}}{4\sqrt{\Delta_n}\,\sigma}E[U_n]\le c\bigl(\Delta_n^{\frac{3-4H}{2}}+\sqrt{\Delta_n}\bigr).\tag{3.35}$$

By combining the bounds (3.15), (3.23), (3.28), (3.32), (3.34), and (3.35), together with the choice $\varepsilon=\Delta_n^{(1-H)/2}$, the proof of Theorem 1 is now complete.
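The quality of the normal approximation in Theorem 1 can also be illustrated by simulation. The sketch below (a rough Monte Carlo illustration, not part of the paper; the helpers fbm_increments and kolmogorov_distance are introduced only here, and, to avoid committing to the constant $\sigma$ of (1.4), the simulated values of $F_n-k^{2H-1}$ are standardized by their sample standard deviation before the Kolmogorov distance to $N(0,1)$ is evaluated) shows the distance shrinking as $\Delta_n\to 0$ for a fixed $H$.

```python
import numpy as np
from scipy.stats import norm

def fbm_increments(n, H, T, size, rng):
    """size paths of fractional Gaussian noise on a grid of n steps (Cholesky)."""
    t = np.arange(1, n + 1) * (T / n)
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
    paths = rng.standard_normal((size, n)) @ L.T       # B^H at t_1, ..., t_n
    return np.diff(np.concatenate([np.zeros((size, 1)), paths], axis=1), axis=1)

def kolmogorov_distance(sample):
    """sup_z |empirical cdf(z) - Phi(z)|, evaluated at the sample points."""
    s = np.sort(sample)
    hi = np.arange(1, len(s) + 1) / len(s)
    lo = np.arange(0, len(s)) / len(s)
    return max(np.max(np.abs(hi - norm.cdf(s))), np.max(np.abs(norm.cdf(s) - lo)))

H, k, T, reps = 0.6, 2, 1.0, 5000
rng = np.random.default_rng(2)
for n in (100, 400, 1600):
    dB = fbm_increments(n, H, T, reps, rng)
    path = np.concatenate([np.zeros((reps, 1)), np.cumsum(dB, axis=1)], axis=1)
    fine = np.sum(np.diff(path, axis=1) ** 2, axis=1)
    coarse = np.sum(np.diff(path[:, ::k], axis=1) ** 2, axis=1)
    stat = coarse / fine - k ** (2 * H - 1)            # F_n - k^(2H-1)
    stat = stat / stat.std()                           # empirical standardization
    print(n, round(kolmogorov_distance(stat), 3))      # tends to decrease with n
```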

Remark 1

We are not sure whether the upper bound obtained in Theorem 1 is optimal in the following sense: the bound $\varphi(F_n)$ is optimal for the sequence $\{F_n\}$ with respect to some distance $d$ if there exist constants $0<c<C<\infty$, independent of $n$, such that

$$c\,\varphi(F_n)\le d(F_n,Z)\le C\,\varphi(F_n),\qquad\text{for all } n\ge 1.$$

An optimal rate of convergence in the Kolmogorov distance will be derived in future studies; to this end, techniques for obtaining a lower bound need to be developed.

Acknowledgements

This work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2014-S1A5B6A02048942).

References
  1. A챦t-Sahalia, Y, and Jacod, J (2009). Testing for jumps in a discretely observed process. The Annals of Statistics. 37, 184-222.
    CrossRef
  2. Beran, J (1994). Statistics for Long-Memory Processes. London: Chapman and Hall
  3. Kim, YT, and Park, HS (2015). Estimation of Hurst parameter in the longitudinal data with long memory. Communications for Statistical Applications and Methods. 22, 295-304.
    CrossRef
  4. Kim, YT, and Park, HS (2016). Berry-Esseen Type bound of a sequence {} and its application. Journal of the Korean Statistical Society. 45, 544-556.
    CrossRef
  5. Kim, YT, and Park, HS (2017a). Optimal Berry-Esseen bound for statistical estimations and its application to SPDE. Journal of Multivariate Analysis. 155, 284-304.
    CrossRef
  6. Kim, YT, and Park, HS (2017b). Optimal Berry-Esseen bound for an estimator of parameter in the Ornstein-Uhlenbeck process. Journal of the Korean Statistical Society. 46, 413-425.
    CrossRef
  7. Michel, R, and Pfanzagl, J (1971). The accuracy of the normal approximation for minimum contrast estimates. Zeitschrift f체r Wahrscheinlichkeitstheorie und Verwandte Gebiete. 18, 73-84.
    CrossRef
  8. Nourdin, I (2013). Lectures on Gaussian approximations with Malliavin calculus, S챕minaire de Probabilit챕s. 45, 3-89.
  9. Nourdin, I, and Peccati, G (2009a). Stein셲 method on Wiener chaos. Probability Theory and Related Fields. 145, 75-118.
    CrossRef
  10. Nourdin, I, and Peccati, G (2009b). Stein셲 method and exact Berry-Esseen asymptotics for functionals for functionals of Gaussian fields. The Annals of Probability. 37, 2231-2261.
    CrossRef
  11. Nualart, D (2006). Malliavin Calculus and Related Topics. Berlin: Springer