This paper highlights the computational explosion issues in the autoregressive moving average approach of frequency estimation of sinusoidal data with a large sample size. A new algorithm is proposed to circumvent the computational explosion difficulty in the conditional least-square estimation method. Notice that sinusoidal pattern can be generated by a non-invertible non-stationary autoregressive moving average (ARMA) model. The computational explosion is shown to be closely related to the non-invertibility of the equivalent ARMA model. Simulation studies illustrate the computational explosion phenomenon and show that the proposed algorithm can efficiently overcome computational explosion difficulty. Real data example of sunspot number is provided to illustrate the application of the proposed algorithm to the time series data exhibiting sinusoidal pattern.
Time series exhibiting sinusoidal pattern is ubiquitous in digital signal processing, econometrics, agriculture, and biomedicine. For example, Stroebel
The feasibility of analyzing sinusoidal data via ARMA model is studied in Platonod
Both stationarity condition and invertibility condition are violated in the sinusoidal data and make the situation more difficult. The maximum likelihood estimate cannot be obtained unless the initial values of the innovation terms are supplied by the user since the stationary distribution is not well-defined. The initial values are unknown, but is necessary as the starting point of the recursive procedure as described in Brockwell and Davis (1991). Without the invertibility condition, it is not guaranteed that the influence of such initial values on the maximum likelihood estimate is negligible.
In this paper, a new algorithm is proposed to estimate the ARMA model violating both stationarity and invertibility conditions. The important idea is that the univariate ARMA model always has multivariate vector state-space representation, see Chapter 12 of Brockwell and Davis (1991). By storing the normalized vector and the logarithm of the norm of the vector only, we are able to circumvent the “computational explosion” difficulties.
The paper is organized as follows. In Section 2, the relationship between sinusoidal data and non-invertible ARMA model is discussed. In Section 3, a procedure allowing estimation under noninvertibility is presented. The computational explosion difficulty is further dealt with in Section 4. Simulation studies are given in Section 5, followed by concluding remarks in Section 6.
Suppose that the model for the data is generated by
where
is not unique due to the periodicity of sine/cosine function.
Can the ARMA model be used to handle the sinusoidal pattern? The answer is positive, see Platonod
Here,
when the characteristic equation 1 −
where
If the true model is
One can consider the following model fitting procedure
(
Consider the model
where
Consider the special cases with
where
Let
However, if the invertibility condition is violated, the influence of the misspecified initial values
(
To illustrate the rationale underlying the procedure, notice that the model can be rewritten as
where
and
Equivalently,
One can consider this as a regression model with a left-hand-side response, the coefficients are the initial values (
Set
To avoid the difficulties related to the exploding
and
and
(
The equivalence between Procedures 2 and 3 can be verified easily by mathematical induction arguments.
The simulation study illustrates: (i) the computational explosion phenomenon, (ii) the necessity of introducing
Consider the frequency representations of the ARMA model like
, with the following three settings,
Setting 1:
Setting 2:
The time series are then standardized so that the standard deviations always equal to one. The coefficients
To illustrate the computational explosion phenomenon, the critical sample sizes for the stack overflow are investigated. The programs are implemented in CRAN R language. Procedure 2 requires the storage of the values of sums of squares
To see the necessity of introducing
Setting 1:
Setting 2:
Setting 3:
For each setting, the experiment is repeated for
Here,
Table 1 shows the results of the RMSE. It can be seen that ARMA models fitted by R function arima tend to have greater RMSE than those fitted by Procedure 3, especially for large
The same settings as in Section 5.2 are considered. To study the performance of the frequency estimation under ARMAeta model, 2
where the
Table 2 shows the results of RMSE of
Consider the hysteresis of the solar cycles, that can be measured by the number of sunspots, see Dmitriev
The ARMAeta is applied for the sunspot data and the optimal model is selected by minimizing the Bayesian information criterion (BIC). For ARMAeta(
where ln(
Table 3 shows the
The estimation problem of ARMA model is discussed under settings where the invertibility is not fulfilled. New algorithms are proposed to deal with non-invertibility difficulty and computational explosion difficulty. The methods developed are useful for the sinusoidal time series data where the frequency is unknown. This facilitates the application of the ARMA method to biomedicine and meteorology such as breathing motion data obtained during radiation therapy.
Chi Tim, Ng’s work is supported by National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2017R1C1B2011652).
The RMSE of coefficients (
ARMA | ARMAeta | ||||
---|---|---|---|---|---|
RMSE | RMSE | RMSE | RMSE | ||
4 | 100 | 1.2098 | 0.9529 | 0.0714 | 0.0572 |
300 | 1.2292 | 0.9467 | 0.0413 | 0.0350 | |
600 | 1.2227 | 0.9464 | 0.0418 | 0.0286 | |
1500 | 1.1796 | 0.9663 | 0.0390 | 0.0495 | |
6 | 100 | 1.0972 | 0.9697 | 0.0297 | 0.0038 |
300 | 1.1268 | 0.9651 | 0.0183 | 0.0027 | |
600 | 1.1357 | 0.9638 | 0.0342 | 0.0029 | |
1500 | 1.1204 | 0.9624 | 0.0609 | 0.0024 | |
8 | 100 | 1.0189 | 1.0308 | 0.0035 | 0.0019 |
300 | 1.0055 | 1.0227 | 0.0055 | 0.0014 | |
600 | 1.0614 | 0.9825 | 0.0090 | 0.0015 | |
1500 | 1.1555 | 0.9217 | 0.0095 | 0.0014 |
ARMA is obtained by arima function in R and ARMAeta is obtained by Procedure 3. RMSE = relative mean square error.
The RMSE of coefficients 2
ARMAeta | |||||
---|---|---|---|---|---|
RMSE | RMSE | RMSE | RMSE | ||
4 | 100 | 0.0079 | 0.0149 | ||
300 | 0.0096 | 0.0199 | |||
600 | 0.0639 | 0.0416 | |||
1500 | 0.1019 | 0.0613 | |||
6 | 100 | 0.0723 | 0.0688 | 0.0498 | |
300 | 0.1605 | 0.1197 | 0.1184 | ||
600 | 0.1687 | 0.1152 | 0.0850 | ||
1500 | 0.1836 | 0.1409 | 0.1101 | ||
8 | 100 | 0.1724 | 0.0836 | 0.8350 | 0.0656 |
300 | 0.2106 | 0.1068 | 0.1007 | 0.0737 | |
600 | 0.2078 | 0.1068 | 0.0989 | 0.0695 | |
1500 | 0.2104 | 0.1119 | 0.0969 | 0.0617 |
ARMAeta is fitted by Procedure 3 and 2
The BIC values of ARMAeta models for the sunspot data
Models | ARMAeta(2, 2) | ARMAeta(4, 4) | ARMAeta(6, 6) | ARMAeta(8, 8) |
---|---|---|---|---|
BIC | 55112.43 | 55498.78 | 69445.98 | 176840.6 |
BIC = Bayesian information criterion.
The coefficients (
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ||
---|---|---|---|---|---|---|---|---|---|
ARMAeta(2, 2) | −1.9967 | 0.9993 | |||||||
−1.9986 | 1.0019 | ||||||||
0.0511 | |||||||||
ARMAeta(4, 4) | −3.8922 | 5.7861 | −3.8929 | 1.0004 | |||||
−3.8927 | 5.7862 | −3.8929 | 1.0001 | ||||||
0.0565 | 0.3249 | ||||||||
ARMAeta(6, 6) | −5.6313 | 13.5533 | −17.8440 | 13.6633 | −5.6313 | 0.9990 | |||
−5.6313 | 13.5533 | −17.8439 | 13.5533 | −5.6312 | 0.9990 | ||||
2.051 × 10^{−16} | 0.3334 | 0.5162 | |||||||
ARMAeta(8, 8) | −7.0733 | 22.6740 | −43.0201 | 52.8390 | −43.0201 | 22.6739 | −7.0733 | 1 | |
−7.0733 | 22.6740 | −43.0201 | 52.8390 | −43.0201 | 22.6739 | −7.0733 | 1 | ||
0.0498 | 0.3247 | 0.5174 | 0.7655 |