Time series are used in many studies for model building and analysis. We must be careful to understand the kind of time series data used in the analysis. In this review article, we begin with some issues related to the use of aggregate and systematically sampled time series. Since several time series are often used in a study of the relationship of variables, we also consider vector time series modeling and analysis. Although the basic procedures of model building for univariate and vector time series are the same, there are some important phenomena which are unique to vector time series. Therefore, we also discuss some issues related to vector time series models. Understanding these issues is important whenever we use time series data in modeling and analysis, whether the series is univariate or multivariate.
Let z_{t} be a time series process. For a stationary process, its mean, E(z_{t}) = μ, and variance, Var(z_{t}) = σ², are constant over time, and its autocovariance, γ_{k} = Cov(z_{t}, z_{t+k}), depends only on the lag k.
Autoregressive process of order p (AR(p) model)
Moving average process of order q (MA(q) model)
Autoregressive moving average process (ARMA(p, q) model)
Autoregressive integrated moving average process (ARIMA(p, d, q) model)
The model is stationary if the roots of its associated AR polynomial are all outside the unit circle, and the important characteristics of stationary models can be summarized in the following table:
Model | ACF | PACF |
---|---|---|
AR(p) | Decreases exponentially | Cuts off at lag p |
MA(q) | Cuts off at lag q | Decreases exponentially |
ARMA(p, q) | Decreases exponentially | Decreases exponentially |
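These identification patterns can be illustrated by simulation. The sketch below, with an illustrative AR(1) coefficient of 0.7, computes the sample ACF of a simulated AR(1) series, which should decay roughly geometrically:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate an AR(1) process z_t = 0.7 z_{t-1} + a_t (phi = 0.7 is illustrative)
n, phi = 5000, 0.7
a = rng.standard_normal(n)
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi * z[t - 1] + a[t]

def sample_acf(x, nlags):
    """Sample autocorrelations at lags 1..nlags."""
    x = x - x.mean()
    c0 = x @ x
    return np.array([(x[:-k] @ x[k:]) / c0 for k in range(1, nlags + 1)])

# For an AR(1), the theoretical ACF is phi**k, so the sample ACF should
# decay roughly geometrically: about 0.7, 0.49, 0.34, ...
acf = sample_acf(z, 5)
print(np.round(acf, 2))
```

The same exercise with a simulated MA(q) series would instead show the sample ACF cutting off after lag q.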
Time series are used in many studies either for model building or inference. We must be careful about what kind of time series data are used in the analysis. Since many time series variables like rainfall, industrial production, and sales exist only in some aggregated forms, we begin with the issue of the effect of temporal aggregation on the model form. Given a time series z_{t}, let Z_{T} = (1 + B + · · · + B^{m−1})z_{mT}. For example, with m = 3, Z_{1} = (1 + B + B^{2})z_{3} = z_{1} + z_{2} + z_{3}, Z_{2} = (1 + B + B^{2})z_{6} = z_{4} + z_{5} + z_{6}, etc. We will call z_{t} the non-aggregate series and Z_{T} the aggregate series. For m = 3, if z_{t} is a monthly series, then Z_{T} will be a quarterly series. To make inferences, should we use the non-aggregate series z_{t} or the aggregate series Z_{T}? Do they make any difference? Are the time series models for z_{t} and for Z_{T} the same?
The first published papers on aggregation effects on ARIMA models were by Tiao (1972) and Amemiya and Wu (1972), and they led to many other studies on the topic, including my Ph.D. dissertation and a lifetime of research in the area.
To answer the above questions, we need to study the relationship of autocovariances between the non-aggregate series z_{t} and the aggregate series Z_{T}, or more generally between w_{t} = (1 − B)^{d}z_{t} and U_{T} = (1 − ℬ)^{d}Z_{T}, where ℬ is the backshift operator on the aggregate time unit such that ℬZ_{T} = Z_{T−1}. Define the m-period overlapping sum,
and note that Z_{T} = ζ_{mT}, (1 − ℬ)Z_{T} = Z_{T} − Z_{T−1} = ζ_{mT} − ζ_{m(T−1)} = (1 − B^{m})ζ_{mT},
Hence, we have
Now let us consider an MA(2) model for z_{t}, z_{t} = (1 − θ_{1}B − θ_{2}B^{2})a_{t}. If m = 3, what is the model for Z_{T}? For this MA(2) model, we have
and
Note that for d = 0 and m = 3, from (
where
Hence, Z_{T} is an MA(1) process, Z_{T} = (1 − Θℬ)A_{T}, where
and
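The MA(1) structure of the aggregates can also be checked numerically. In the sketch below the θ values are illustrative, not those of any fitted example; for an MA(2) series aggregated with m = 3, the sample autocorrelations of Z_{T} should be clearly nonzero at lag 1 and essentially zero at lags 2 and beyond:

```python
import numpy as np

rng = np.random.default_rng(1)

# MA(2): z_t = a_t - 0.5 a_{t-1} - 0.3 a_{t-2}  (theta values are illustrative)
n = 300_000
a = rng.standard_normal(n + 2)
z = a[2:] - 0.5 * a[1:-1] - 0.3 * a[:-2]

# Temporal aggregation: non-overlapping sums of m = 3 consecutive values
Z = z[: (len(z) // 3) * 3].reshape(-1, 3).sum(axis=1)

def acf(x, k):
    x = x - x.mean()
    return (x[:-k] @ x[k:]) / (x @ x)

# Lag-1 autocorrelation is nonzero; lags >= 2 vanish, as for an MA(1)
print([round(acf(Z, k), 3) for k in (1, 2, 3)])
```

This agrees with the general result: the a's entering Z_{T} and Z_{T−2} do not overlap, so all autocovariances beyond lag 1 are exactly zero.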
More generally, we have the following results from Stram and Wei (1986):
(1) Temporal aggregation of the AR(p) process
Suppose that the non-aggregate series z_{t} follows a stationary AR(p) process,
Let φ_{p}(B) = 1 − φ_{1}B − · · · − φ_{p}B^{p} and
where
(2) If z_{t} ~ ARMA(p, q) model, then Z_{T} ~ ARMA(M, N_{2}), where N_{2} = [p + 1 + (q − p − 1)/m] − (p − M).
(3) If z_{t} ~ ARIMA(p, d, q) model, then Z_{T} ~ ARIMA(M, d, N_{3}), where N_{3} = [p + d + 1 + (q − p − d − 1)/m] − (p − M).
The limiting behavior of aggregates was studied by Tiao (1972), who showed that given z_{t} ~ ARIMA(p, d, q), the limiting model for the aggregates Z_{T} exists, and as m → ∞, Z_{T} → IMA(d, d).
A related situation arises when a variable is a stock variable and we observe only every mth value of the variable, i.e., given z_{1}, z_{2}, z_{3}, . . ., we observe only Z_{T} = z_{mT}. For example, for m = 3, Z_{1} = z_{3}, Z_{2} = z_{6}, . . . . For this systematic sampling, we have the following interesting result from Wei (1981).
Given z_{t} ~ IMA(d, q): (1 − B)^{d}z_{t} = (1 − θ_{1}B − θ_{2}B^{2} − · · · − θ_{q}B^{q})a_{t}, let Z_{T} = z_{mT}. Then, as m → ∞, Z_{T} → IMA(d, d − 1). Theoretically, every ARIMA(p, d, q) process can be approximated by an IMA(d, q) process. Thus, if z_{t} is an ARIMA(p, 1, q), then as m → ∞, Z_{T} approaches an IMA(1, 0), which could very well explain why most daily stock prices follow a random walk model. In building an underlying time series model, we need to be aware of the effect of the use of aggregate series. In addition to the above cited references, we refer readers to some other useful references, including Brewer (1973), Wei (1982), Ansley and Kohn (1983), Weiss (1984), Wei and Stram (1988), Marcellino (1999), Shellman (2004), and Sbrana and Silvestrini (2013).
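A simulation sketch of this limiting result, with all parameter values illustrative: sampling every mth value of an ARIMA(1,1,0) series and then differencing the sampled series should yield nearly uncorrelated increments, i.e., approximately a random walk (IMA(1, 0)):

```python
import numpy as np

rng = np.random.default_rng(7)

# ARIMA(1,1,0): (1 - 0.8B)(1 - B) z_t = a_t  (phi = 0.8 is illustrative)
m, T = 50, 4000
n = m * T
a = rng.standard_normal(n)
w = np.zeros(n)                 # w_t = (1 - B) z_t follows an AR(1)
for t in range(1, n):
    w[t] = 0.8 * w[t - 1] + a[t]
z = np.cumsum(w)                # integrate the differences to get z_t

Z = z[m - 1::m]                 # systematic sampling: keep every m-th value
dZ = np.diff(Z)                 # differences of the sampled series

def acf1(x):
    x = x - x.mean()
    return (x[:-1] @ x[1:]) / (x @ x)

# As m grows, Z_T approaches IMA(1, 0), i.e. a random walk, so the
# lag-1 autocorrelation of the differenced sampled series should be small
print(round(acf1(dZ), 3))
```

Repeating this with larger m drives the residual lag-1 autocorrelation closer to zero, in line with Z_{T} → IMA(1, 0).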
Given the AR(1) model, z_{t} = φz_{t−1} + a_{t},
where the a_{t} are independent and identically distributed random variables with mean zero and constant variance,
Dickey and Fuller (1979) suggested using the following test statistic and showed that under H_{0}: φ = 1,
where W(t) is a Wiener process (also known as Brownian motion process). We reject H_{0} if the value of the test statistic is too small (negative).
In practice, aggregate data, Z_{T} = (1 + B + · · · + B^{m−1})z_{mT}, are often used. It has been shown by Teles et al. (2008) that under H_{0}: φ = 1, i.e., (1 − B)z_{t} = a_{t}, the corresponding model for the aggregate series is
where the E_{T} are independent and identically distributed random variables with zero mean and variance
(1) If m = 1, then Θ = 0;
(2) If m ≥ 2, then
As a result, the corresponding test statistic becomes
Comparing with (
In this example, we simulated a time series of 240 observations from the model z_{t} = 0.95z_{t−1} + a_{t}, where the a_{t} are i.i.d. N(0, 1). This series was then aggregated with m = 3.
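A sketch of this kind of experiment is given below, using the Dickey–Fuller statistic n(φ̂ − 1) from a no-intercept least squares fit; with a different seed the estimates will not match the values reported in this example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate z_t = 0.95 z_{t-1} + a_t with a_t ~ i.i.d. N(0, 1)
n, phi = 240, 0.95
a = rng.standard_normal(n)
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi * z[t - 1] + a[t]

# Temporal aggregation with m = 3 (non-overlapping sums of three)
Z = z.reshape(80, 3).sum(axis=1)

def df_rho_stat(x):
    """Dickey-Fuller rho statistic n*(phi_hat - 1) from a no-intercept fit."""
    y, lag = x[1:], x[:-1]
    phi_hat = (lag @ y) / (lag @ lag)
    return len(y) * (phi_hat - 1.0), phi_hat

stat_z, phi_z = df_rho_stat(z)
stat_Z, phi_Z = df_rho_stat(Z)
print(f"non-aggregate: phi_hat = {phi_z:.3f}, stat = {stat_z:.2f}")
print(f"aggregate    : phi_hat = {phi_Z:.3f}, stat = {stat_Z:.2f}")
```

The non-aggregate statistic is compared with the standard Dickey–Fuller critical values, while the aggregate statistic requires the adjusted critical values of Teles et al. (2008).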
(1) Test for a unit root based on the non-aggregate series z_{t} (240 observations). Based on the sample autocorrelation function and partial autocorrelation function, we identify an AR(1) model. Least squares estimation leads to the following result
To test the hypothesis of a unit root, the value of the test statistic is
At α = 5%, the critical point from Dickey and Fuller (1979) is between −8.0 and −7.9 for n = 240. Thus, the hypothesis of a unit root is rejected, and we conclude that the underlying model is stationary. This is consistent with the underlying simulated model.
(2) Test for a unit root based on the aggregate series Z_{T} with m = 3 (80 observations). The sample autocorrelation function and partial autocorrelation function suggest an AR(1) model (an ARMA(1,1) model was also considered, but its MA parameter was not significant). Least squares estimation leads to the following result
To test the hypothesis of a unit root, the value of the test statistic is
Again, at α = 5%, the critical point from Dickey and Fuller (1979) is between −7.9 and −7.7 for n = 80. Thus, the hypothesis of a unit root is not rejected, and we conclude that the underlying model is nonstationary. This is a wrong conclusion. However, if we use the adjusted critical value given in Teles et al. (2008), based on their adjusted test statistic, we reach the correct conclusion that the underlying model is stationary.
When aggregate time series are used in modeling and testing, we need to make sure to use a properly adjusted table for the test of significance.
Time series are often used in regression analysis, which is possibly the most commonly used statistical method, so we also consider the consequences of using aggregate series in a regression model. Let us consider the simple regression model,
which is a one-sided causal relationship. If x_{t−1} is also stochastic, for example, if it follows an MA(1) process, we can also write the joint system as
or
where a_{t} and e_{t} are independent
For m = 3, W_{1} = x_{0} + x_{1} + x_{2}, W_{2} = x_{3} + x_{4} + x_{5}, etc., which are not available, and the available data are X_{1} = x_{1} + x_{2} + x_{3}, X_{2} = x_{4} + x_{5} + x_{6}, etc. A natural way to estimate W_{T} is to consider its projection on X_{T}. Specifically, we let Z = [W_{T}, X_{T}]′ and compute its covariance matrix generating function
with
and hence
and
It follows that
It is interesting to note that
which implies that the estimate Ŵ_{T} is the weighted average of X_{T} and X_{T−1} with weights (m − 1)/m and 1/m, respectively. This is clearly reasonable. The aggregate model then becomes
where U_{T} = α(W_{T} − Ŵ_{T}) + E_{T} = αV_{T} + E_{T},
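The weighted-average form of the projection can be checked numerically. For simplicity, the sketch below assumes x_{t} is white noise (an illustrative special case), in which case the least squares projection of W_{T} on X_{T} and X_{T−1} recovers the weights (m − 1)/m and 1/m:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assume x_t is white noise (illustrative).  W_T = x_{mT-m}+...+x_{mT-1}
# is the unobservable sum; the observable aggregates are
# X_T = x_{mT-m+1}+...+x_{mT}.
m, T = 3, 200_000
x = rng.standard_normal(m * T + 1)              # x_0, x_1, ..., x_{mT}
W = x[: m * T].reshape(T, m).sum(axis=1)        # W_1 = x_0+x_1+x_2, ...
X = x[1 : m * T + 1].reshape(T, m).sum(axis=1)  # X_1 = x_1+x_2+x_3, ...

# Project W_T on X_T and X_{T-1}; the weights should be (m-1)/m and 1/m
A = np.column_stack([X[1:], X[:-1]])
beta, *_ = np.linalg.lstsq(A, W[1:], rcond=None)
print(np.round(beta, 3))   # approximately [0.667, 0.333]
```

Intuitively, W_{T} shares m − 1 of its m terms with X_{T} and one term with X_{T−1}, and X_{T} and X_{T−1} share no terms, which yields exactly these weights in the white noise case.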
In studying the relationship of variables, other than the regression model, we often consider vector time series models. Although the basic procedures of model building for univariate and vector time series are the same, there are some important phenomena which are unique to vector time series models. We now discuss some special issues of vector time series models.
First, let us review some results from univariate time series models. It is well known that we can always write a stationary process in an MA representation
or
such that
or
such that
Let Z_{t} = [Z_{1,t}, Z_{2,t}, . . . , Z_{m,t}]′ be an m-dimensional vector time series. Some commonly used vector time series models are the VAR(p), VMA(q), and VARMA(p, q) processes. Again, we can write these vector processes in a moving average representation
where Ż_{t} = Z_{t} − μ and a_{t} is an m-dimensional vector white noise process, N(0, Σ). We can also write it in an autoregressive representation
It follows that we can express a VAR(1) process,
in the following MA representation
where Φ_{0} = I. A natural question to ask is: is the VMA representation always of infinite order? People often think that the univariate model is a special case of the vector model with dimension equal to 1, and so the answer to the question is obviously yes. However, let us consider the following 3-dimensional VAR(1) model
or equivalently
where
Thus, the inverse of a non-degenerate VAR(1) matrix polynomial (i.e., Φ(B) ≠ I) will be of finite order if the determinant |Φ(B)| is independent of B. For a more detailed discussion, we refer readers to Tiao and Tsay (1989) and Shen and Wei (1995).
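A quick numerical illustration with a hypothetical coefficient matrix: if Φ is chosen so that |Φ(B)| = |I − ΦB| does not depend on B (here Φ is strictly upper triangular, so this determinant is 1), then Φ is nilpotent and the MA coefficients Ψ_{j} = Φ^{j} vanish after a finite number of lags:

```python
import numpy as np

# A hypothetical VAR(1) coefficient matrix chosen so that det(I - Phi*B)
# does not depend on B (Phi is strictly upper triangular, so det = 1)
Phi = np.array([[0.0, 0.4, 0.2],
                [0.0, 0.0, 0.5],
                [0.0, 0.0, 0.0]])

# MA representation: z_t = sum_j Phi^j a_{t-j}.  Phi is nilpotent with
# Phi^3 = 0, so the VMA representation is finite (a VMA(2)), not infinite.
for j in range(5):
    print(j, np.abs(np.linalg.matrix_power(Phi, j)).max())
```

The printed maxima drop to exactly zero from j = 3 onward, so this VAR(1) has an exact finite-order VMA(2) representation, unlike any stationary univariate AR(1).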
Given a univariate seasonal ARMA model
where
and a_{t} is a Gaussian white noise process with mean 0 and a constant variance
where
are matrix polynomials. The matrix I is the k-dimensional identity matrix, the Φs, φs, θs, and Θs are k × k parameter matrices, and a_{t} is a vector Gaussian white noise process with mean vector 0 and
Note that as expected, the vector model reduces to the univariate model when k = 1. Moreover, in such a case, the ARMA(P)_{s}(p)(q)(Q)_{s} model can also be written as the following ARMA(p)(P)_{s}(Q)_{s}(q) model
As a result, a multiplicative seasonal ARMA model is also traditionally written in the form of ARMA (P)_{s}(p)(q)(Q)_{s}. This traditional representation has been adopted by many researchers for both univariate and vector time series.
When k > 1 and x_{t} is a vector process, can we really extend the above operation and write the VARMA(P)_{s}(p)(q)(Q)_{s} model, Φ_{P}(B^{s})φ_{p}(B)x_{t} = θ_{q}(B)Θ_{Q}(B^{s})a_{t}, as the following VARMA(p)(P)_{s}(Q)_{s} (q) model?
Let us consider two simple seemingly equivalent bivariate VAR(1)_{4}(1) and VAR(1)(1)_{4} representations with the following parameters
and the associated noise a_{t} is a vector Gaussian white noise process with mean zero and covariance matrix Ω. For the VAR(1)_{4}(1) representation,
For the VAR(1)(1)_{4} representation,
The different implications of the two seemingly equivalent representations are clear and cannot be ignored, especially when a policy-related decision is to be made. Please see Yozgatligil and Wei (2009) for details.
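The root of the difference is that matrix products do not commute: in the scalar case the seasonal and regular AR polynomials can be multiplied in either order, but for k × k parameter matrices the two orderings generally produce different models. A minimal numerical check with hypothetical parameter matrices (not those of the cited example):

```python
import numpy as np

# Hypothetical parameter matrices (illustrative values only)
Phi_s = np.array([[0.5, 0.1],
                  [0.2, 0.4]])   # seasonal AR matrix, applied at lag 4
phi   = np.array([[0.3, 0.0],
                  [0.6, 0.2]])   # regular AR matrix, applied at lag 1

# In the scalar case Phi_s * phi = phi * Phi_s, so the two orderings of
# the multiplicative model coincide.  For matrices they generally do not:
print(Phi_s @ phi)
print(phi @ Phi_s)
print(np.allclose(Phi_s @ phi, phi @ Phi_s))
```

Since the cross-product coefficient matrices differ, the VARMA(P)_{s}(p)(q)(Q)_{s} and VARMA(p)(P)_{s}(Q)_{s}(q) representations are genuinely different models when k > 1.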
In addition to the temporal aggregation discussed in earlier sections, there is another commonly used form of aggregation, contemporal aggregation. For example, the total money supply is the aggregate of demand deposits and currency in circulation. Total housing starts are the aggregate of housing starts in the northeast, north central, south, and west regions, which in turn are the aggregates of housing starts in the individual states. The total sales of a company are the aggregate of the sales achieved by all of its branches throughout the country or countries.
Let z_{1,t}, z_{2,t}, . . . , z_{m,t} be the m component time series and Y_{t} = z_{1,t} + z_{2,t} + · · · + z_{m,t} be the aggregate series. To forecast the aggregate series, we can consider the following three methods.
Method 1: based on a model using the aggregate series Y_{t} and its l-step ahead forecast, Ŷ_{t}(l).
Method 2: based on individual component models using the non-aggregate series z_{i,t} and the sum of the forecasts from all component models, i.e.,
Method 3: based on a joint multiple time series model and the forecast from the joint multiple model,
Question: what are the relative efficiencies among the three methods in terms of the minimum mean square error forecast? The answers are given below.
The comparison between methods 1 and 2 depends on the model structure; there is no definite winner between methods 1 and 2.
The answer in (2) above could be surprising to some people because aggregation normally causes a loss of information. For the proof, we refer readers to Wei and Abraham (1981).
Next, let us consider m-dimensional models related to both time and space, which we write as
The commonly used model to describe (
where
and φ_{k,l} and θ_{k,l} are the autoregressive and moving average parameters at time lag k and space lag l, respectively; p is the autoregressive order, q is the moving average order, r_{k} is the spatial order of the kth autoregressive term, and τ_{k} is the spatial order of the kth moving average term. The STARMA(p, q) model reduces to a space-time autoregressive model, STAR(p), when q = 0, and to a space-time moving average model, STMA(q), when p = 0.
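A minimal simulation sketch of a STAR(1) model with spatial order 1, using a hypothetical row-normalized spatial weight matrix for four sites on a line (all parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical row-normalized first-order spatial weight matrix W for
# four sites on a line (each site weights its immediate neighbors)
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

phi_10, phi_11 = 0.4, 0.3   # time-lag-1 parameters at space lags 0 and 1

# STAR(1): z_t = phi_10 z_{t-1} + phi_11 W z_{t-1} + a_t
n = 500
z = np.zeros((n, 4))
for t in range(1, n):
    z[t] = phi_10 * z[t - 1] + phi_11 * W @ z[t - 1] + rng.standard_normal(4)

print(z.shape)
```

Each site's current value depends on its own past (space lag 0) and on a weighted average of its neighbors' past values (space lag 1); the chosen parameters keep the spectral radius of φ_{10}I + φ_{11}W below one, so the simulated process is stationary.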
For these STARMA(p, q) models, it would be interesting to study the effect of temporal aggregation, the effect of contemporal aggregation, and, more generally, the combined effects of both. Because of the time limitation of this presentation, we refer readers to Arbia et al. (2010), Giacomini and Granger (2004), and Hendry and Hubrich (2011), among others, on some of these issues.