Nonparametric regression is a statistical method for estimating the mean function when the true signal is masked by some level of noise. Many methods, such as kernel smoothing, splines, Fourier, and wavelet methods, have been developed rapidly over the last two decades. However, most of these methods focus on cases in which the true trend function is contaminated by uncorrelated or weakly correlated errors. In this paper, nonparametric regression is considered when the errors are strongly correlated in the sense of long memory. For related work, see Beran and Feng (2002), Hall and Hart (1990), Künsch (1986), Masry and Mielniczuk (1999), Opsomer
To make our discussion concrete, consider the following statistical model
where unobservable error {
and
with bandwidth
is asymptotically given by
for some positive constant
Estimation of the long memory parameter is also necessary for data-adaptive bandwidth selection methods, such as cross-validation or bootstrap methods. The usual leave-one-out cross-validation suffers from severe bias under strongly correlated errors. See Opsomer
where
This paper considers a more precise estimation of the long memory parameter in the presence of a smooth trend. Somewhat surprisingly, Robinson (1997) showed that the memory parameter can be estimated log^{1/2}(
However, an additional tuning parameter is also required in estimating the long memory parameter. For example, the number of frequencies used in the semiparametric estimation of long memory such as exact local Whittle estimation (ELW) plays a central role. One of the pioneering methods suggested by Henry (2001) is to minimize the mean squared error (MSE)
where
This paper starts with the observation that kernel bandwidth selection can be unified with a more precise estimation of the LRD parameter by minimizing the MSE of the long memory parameter. The available methods iteratively find kernel bandwidth
where
In this section, the tuning parameter selection method based on the MSE of the LRD parameter estimate is detailed. It is illustrated with the Nadaraya-Watson estimator and the ELW estimator because of their simplicity and superior performance in practice. However, the method can be applied to other types of kernel estimators, such as the local polynomial estimator, and/or to a variety of LRD parameter estimation methods, such as the log-periodogram estimator, also known as the GPH estimator (Geweke and Porter-Hudak, 1983; Robinson, 1995).
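As a point of reference, a minimal sketch of the Nadaraya-Watson estimator is given below. The Gaussian kernel is an illustrative assumption; the paper does not fix a particular kernel here.

```python
import numpy as np

def nadaraya_watson(x, y, grid, h):
    """Nadaraya-Watson kernel regression with a Gaussian kernel (assumed).

    x, y : observed design points and responses
    grid : points at which to evaluate the trend estimate
    h    : kernel bandwidth (one of the tuning parameters selected in this paper)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty(len(grid))
    for j, g in enumerate(grid):
        w = np.exp(-0.5 * ((x - g) / h) ** 2)  # Gaussian kernel weights
        out[j] = np.sum(w * y) / np.sum(w)     # locally weighted average
    return out
```

Since the estimator is a weighted average of the responses, a constant response is reproduced exactly for any bandwidth.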
For given observations
where
The LRD parameter is now estimated based on the residual series {
where Θ = [Δ_{1}, Δ_{2}] for −∞ < Δ_{1} < Δ_{2} < ∞ with Δ_{2} − Δ_{1} ≤ 9/2 and
Here
where the fractional differencing is given by
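The fractional differencing step and the resulting ELW criterion can be sketched as follows. This is an illustrative simplification: the differencing uses the standard binomial expansion of (1 − B)^d, the objective follows the form of Shimotsu and Phillips's (2005) ELW criterion, and the grid minimizer stands in for a proper numerical optimization over Θ.

```python
import numpy as np

def frac_diff(x, d):
    """Apply (1 - B)^d via the binomial expansion:
    pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    n = len(x)
    pi = np.empty(n)
    pi[0] = 1.0
    for k in range(1, n):
        pi[k] = pi[k - 1] * (k - 1 - d) / k
    # (1 - B)^d x_t = sum_{k=0}^{t} pi_k x_{t-k}
    return np.array([np.dot(pi[:t + 1], x[t::-1]) for t in range(n)])

def elw_objective(x, d, m):
    """ELW criterion R(d): log of the average periodogram of the
    d-differenced series over the first m Fourier frequencies,
    minus 2d times the average log frequency."""
    n = len(x)
    u = frac_diff(np.asarray(x, dtype=float), d)
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(u)[1:m + 1]
    I = np.abs(dft) ** 2 / (2.0 * np.pi * n)  # periodogram at low frequencies
    return np.log(np.mean(I)) - 2.0 * d * np.mean(np.log(lam))

def elw_estimate(x, m, grid=np.linspace(-0.4, 0.9, 131)):
    """Minimize R(d) over a grid of candidate d values (illustrative)."""
    vals = [elw_objective(x, d, m) for d in grid]
    return grid[int(np.argmin(vals))]
```

For d = 1 the recursion reduces to ordinary first differencing (with the first observation retained), which is a convenient sanity check.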
The ELW bandwidth refers to the number of low frequencies
Recall that the proposed bandwidth selector is given by
where
The block bootstrap sample of residual series {
and draw the starting point of the new block
where
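A minimal sketch of the block bootstrap for the residual series, assuming a moving-block scheme in which blocks of a given length are drawn with uniformly random starting points and concatenated:

```python
import numpy as np

def block_bootstrap(resid, block_len, rng=None):
    """Moving-block bootstrap resample of a residual series (sketch).

    Blocks of length block_len are drawn with replacement from all
    contiguous blocks, concatenated, and truncated to the original length
    so that the short-range dependence within blocks is preserved.
    """
    rng = np.random.default_rng() if rng is None else rng
    resid = np.asarray(resid)
    n = len(resid)
    n_blocks = int(np.ceil(n / block_len))
    # uniformly random starting points of the blocks
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    sample = np.concatenate([resid[s:s + block_len] for s in starts])
    return sample[:n]
```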
Finally, the bootstrap strategy for selecting optimal tuning parameters is described as follows:
Step 1. Obtain an initial estimate of
Step 2. Iterate the following updates until the relative change is within the error bound,
Update a kernel bandwidth from
where
Update the ELW number of frequencies by
where the ELW estimator
Update the ELW estimator
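The two steps above can be sketched as an alternating loop. The three update callables are placeholders for the components described in this section (bandwidth selection by bootstrap MSE, ELW frequency selection, and ELW re-estimation from the residuals); they are hypothetical names, not part of the paper.

```python
def select_tuning(d_init, update_h, update_m, update_d, tol=1e-3, max_iter=50):
    """Skeleton of the iterative tuning parameter selection.

    Step 1: start from an initial memory estimate d_init.
    Step 2: alternately update the kernel bandwidth h, the ELW number of
    frequencies m, and the ELW estimate d until the relative change in d
    is within the tolerance tol.
    """
    d = d_init
    h = m = None
    for _ in range(max_iter):
        h = update_h(d)         # bandwidth minimizing the bootstrap MSE given d
        m = update_m(d, h)      # number of frequencies minimizing the MSE
        d_new = update_d(h, m)  # ELW estimate from the updated residuals
        if abs(d_new - d) <= tol * max(abs(d), 1e-12):
            return h, m, d_new  # converged
        d = d_new
    return h, m, d              # return last iterate if not converged
```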
This section reports the finite sample performance of the proposed method through extensive Monte Carlo simulations. Four data generating processes (DGPs) are considered as follows:
(DGP1)
(DGP2)
(DGP3)
(DGP4)
with Gaussian FARIMA(0,
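A Gaussian FARIMA(0, d, 0) error series can be simulated by truncated fractional integration of white noise; a minimal sketch (the truncated MA(∞) representation is one standard approach, not necessarily the one used for the tables):

```python
import numpy as np

def farima0d0(n, d, rng=None):
    """Simulate a Gaussian FARIMA(0, d, 0) series of length n via the
    truncated MA(infinity) representation x_t = sum_{k=0}^{t} psi_k e_{t-k},
    with psi_0 = 1 and psi_k = psi_{k-1} * (k - 1 + d) / k."""
    rng = np.random.default_rng() if rng is None else rng
    e = rng.standard_normal(n)
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return np.array([np.dot(psi[:t + 1], e[t::-1]) for t in range(n)])
```

For d = 0 the coefficients collapse to psi = (1, 0, 0, ...), so the series reduces to the underlying white noise.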
DGP1 is a constant function, so the problem is essentially the same as estimating the LRD parameter directly. It is included to check whether the proposed method works well even without any obvious trend. DGP2 considers a cyclic trend, which makes long memory parameter estimation harder (e.g., Baek and Pipiras (2014)). DGP3 is used in Chu and Marron (1991) and DGP4 appears in Hurvich
The new tuning parameter selection rule based on the ELW criterion is compared to four other methods. The first method uses a bimodal kernel described in Kim
to estimate the LRD parameter. Then, the optimal ELW bandwidth is obtained by iterating
until convergence with
The second method is the MCV of Chu and Marron (1991) with the adaptive choice of block length. As detailed in Kim
where
The third method is the oracle bandwidth assuming that the true mean function is known. It is defined as
and the LRD parameter is estimated similarly as described above with block bootstrapping residuals
All results are based on
and the (empirical) average sum of squares (ASE)
Table 1 shows the MSE for the LRD parameter and the ASE(×1000) for all five methods under DGP1 and DGP2. For DGP1, note that our proposed ELW-based method performs well in all cases considered, for both MSE and ASE. The MCV method performs worst for both MSE and ASE, but this is consistent with the simulation results in Hall
When the cyclic trend is considered as in DGP2, the oracle method shows the smallest MSE and ASE, as expected. However, observe that our ELW method is closest to the oracle in terms of MSE for moderate to large LRD parameters; indeed, the proposed ELW method works well for estimating the LRD parameter in the presence of a smooth trend. The ASE, on the other hand, is closest to the oracle when the bimodal kernel method is used. Observe also that the ASE increases as the LRD parameter
Table 2 shows the results for DGP3 and DGP4. The overall interpretations are similar to DGP2, with an emphasis that the proposed ELW method is closest to the oracle as the LRD parameter
To illustrate our proposed method based on ELW estimation of the LRD parameter, we consider the volatility of the exchange rate of the Korean Won (KRW) against the US dollar (USD). The index is expressed in local currency, that is, the KRW price of 1 US dollar, and we consider the exchange rate from Jan 1, 2002 to Dec 31, 2013. It is widely recognized that volatility exhibits both non-stationarity and LRD properties, as is nicely documented in Stărică and Granger (2005).
We study the power-transformed absolute differences,
where
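Assuming the transform is of the form |x_t − x_{t−1}|^p (the paper's exact displayed formula is given above; the exponent p and any centering are part of that definition), a minimal sketch:

```python
import numpy as np

def power_abs_diff(x, p):
    """Power-transformed absolute first differences |x_t - x_{t-1}|^p,
    a common volatility proxy; the exponent p is an assumption here."""
    return np.abs(np.diff(np.asarray(x, dtype=float))) ** p
```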
Figure 3 represents observations {
We have applied our proposed method to estimate a smooth trend perturbed by strongly correlated errors. Figure 4 shows the estimated smooth trend and the SACF of the absolute residuals. The residuals still show slowly decaying autocorrelations, though weaker than those of the original series. This is also observed in the ELW estimator plot, which shows smaller LRD parameter values. The resulting ELW estimator is
A new tuning parameter selection rule is proposed for long memory parameter estimation in the presence of a smooth trend. Tuning parameters are selected by minimizing the single MSE of the long memory parameter computed from the residuals. A simulation study shows outstanding performance of the proposed method: it was closest to the oracle, and outperformed other methods as dependence and sample size increase. An interesting direction for future work is the extension to bivariate LRD series as studied in Baek
This work was supported by the Basic Science Research Program from the National Research Foundation of Korea (NRF-2017R1A1A1A05000831, NRF-2019R1F1A1057104).
Time plots of DGPs considered in the simulations.
Time plots of estimated
The volatility of KRW and USD during 2002–2007 with the SACFs and ELW LRD parameter estimates. SACF = sample autocorrelations; ELW = exact local Whittle estimation; LRD = long-range dependence.
Estimated smooth trend, correlograms on absolute residuals and ELW estimation from residuals. SACF = sample autocorrelations; ELW = exact local Whittle estimation.
MSE and ASE(×1000) for DGP1 and DGP2 with sample size
DGP1 | DGP2 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
ELW | Bimodal | MCV | Oracle | HLS | ELW | Bimodal | MCV | Oracle | HLS | ||
0.10 | MSE | 0.350 | 0.327 | 0.357 | 0.320 | 0.368 | 0.650 | 0.436 | 0.613 | 0.325 | 0.358 |
ASE | 0.520 | 0.992 | 2.616 | 0.673 | 15.271 | 7.067 | 7.109 | 5.281 | |||
0.20 | MSE | 0.215 | 0.225 | 0.353 | 0.210 | 0.462 | 0.379 | 0.461 | 1.040 | 0.206 | 0.443 |
ASE | 0.976 | 2.725 | 6.618 | 1.781 | 25.283 | 13.546 | 14.747 | 10.064 | |||
0.30 | MSE | 0.168 | 0.199 | 0.323 | 0.163 | 0.331 | 0.216 | 0.526 | 1.152 | 0.173 | 0.366 |
ASE | 3.202 | 7.256 | 14.563 | 4.078 | 34.038 | 24.792 | 27.558 | 17.759 | |||
0.35 | MSE | 0.152 | 0.183 | 0.346 | 0.156 | 0.335 | 0.215 | 0.577 | 1.710 | 0.176 | 0.317 |
ASE | 5.953 | 11.907 | 21.398 | 6.404 | 42.610 | 36.941 | 39.845 | 25.105 | |||
0.40 | MSE | 0.208 | 0.291 | 0.540 | 0.206 | 0.347 | 0.218 | 0.433 | 2.096 | 0.184 | 0.357 |
ASE | 10.293 | 19.291 | 30.338 | 9.638 | 57.916 | 48.812 | 51.644 | 32.486 | |||
0.45 | MSE | 0.167 | 0.249 | 0.796 | 0.157 | 0.248 | 0.187 | 0.462 | 2.864 | 0.166 | 0.265 |
ASE | 14.314 | 28.314 | 43.282 | 13.078 | 77.455 | 62.660 | 67.712 | 40.986 |
MSE = mean squared error; ASE = average sum of squares; DGP = data generating processes; ELW = exact local Whittle estimation; MCV = modified cross-validation; HLS = Hurvich
MSE and ASE(×1000) for DGP3 and DGP4 with sample size
DGP3 | DGP4 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
ELW | Bimodal | MCV | Oracle | HLS | ELW | Bimodal | MCV | Oracle | HLS | ||
0.10 | MSE | 0.548 | 0.422 | 0.674 | 0.362 | 0.381 | 0.733 | 0.633 | 0.906 | 0.691 | 0.334 |
ASE | 11.683 | 7.748 | 7.319 | 5.492 | 15.727 | 11.860 | 12.027 | 10.345 | |||
0.20 | MSE | 0.303 | 0.465 | 1.088 | 0.173 | 0.397 | 0.545 | 1.539 | 2.874 | 0.454 | 0.412 |
ASE | 23.227 | 14.242 | 14.847 | 10.529 | 34.601 | 20.970 | 23.510 | 18.306 | |||
0.30 | MSE | 0.156 | 0.534 | 0.489 | 0.141 | 0.366 | 0.190 | 1.583 | 4.603 | 0.171 | 0.378 |
ASE | 32.797 | 26.374 | 29.880 | 18.795 | 50.370 | 36.818 | 41.957 | 31.055 | |||
0.35 | MSE | 0.146 | 0.523 | 2.046 | 0.144 | 0.324 | 0.193 | 1.310 | 6.464 | 0.175 | 0.304 |
ASE | 43.288 | 38.894 | 42.613 | 26.879 | 61.549 | 49.826 | 56.411 | 40.531 | |||
0.40 | MSE | 0.185 | 0.567 | 2.879 | 0.178 | 0.313 | 0.190 | 1.016 | 7.661 | 0.181 | 0.309 |
ASE | 72.083 | 52.771 | 57.891 | 35.278 | 72.420 | 65.500 | 73.650 | 51.750 | |||
0.45 | MSE | 0.165 | 0.499 | 4.04 | 0.168 | 0.267 | 0.157 | 0.721 | 8.483 | 0.163 | 0.262 |
ASE | 121.465 | 67.802 | 75.012 | 46.236 | 87.726 | 84.667 | 93.999 | 67.795 |
MSE = mean squared error; ASE = average sum of squares; DGP = data generating processes; ELW = exact local Whittle estimation; MCV = modified cross-validation; HLS = Hurvich
MSE and ASE(×1000) for DGP4 with sample size
ELW | Bimodal | MCV | Oracle | HLS | ELW | Bimodal | MCV | Oracle | HLS | ||
---|---|---|---|---|---|---|---|---|---|---|---|
0.40 | MSE | 0.096 | 0.592 | 5.298 | 0.115 | 0.155 | 0.050 | 5.350 | 3.523 | 0.059 | 0.062 |
ASE | 58.497 | 67.460 | 72.098 | 46.592 | 51.439 | 63.874 | 62.942 | 37.671 | |||
0.45 | MSE | 0.077 | 0.506 | 7.416 | 0.102 | 0.149 | 0.046 | 3.375 | 6.721 | 0.060 | 0.065 |
ASE | 81.769 | 91.124 | 95.648 | 61.356 | 78.854 | 95.686 | 89.942 | 54.041 |
MSE = mean squared error; ASE = average sum of squares; DGP = data generating processes; ELW = exact local Whittle estimation; MCV = modified cross-validation; HLS = Hurvich