Time series are used in many studies for model building and analysis, so we must understand exactly what kind of time series data an analysis rests on. In this review article, we begin with some issues related to the use of temporally aggregated and systematically sampled time series. Since several time series are often used together to study the relationship between variables, we also consider vector time series modeling and analysis. Although the basic model-building procedures for univariate and vector time series are the same, some important phenomena are unique to vector time series, and we discuss several issues related to vector time series models. Understanding these issues is important whenever time series data are used in modeling and analysis, whether the series is univariate or multivariate.
Let
Autoregressive process of order
Moving average process of order
Autoregressive moving average process (ARMA(
Autoregressive integrated moving average process (ARIMA(
The model is stationary if the roots of its associated AR polynomial all lie outside the unit circle. The important characteristics of stationary models, in terms of the autocorrelation function (ACF) and partial autocorrelation function (PACF), are summarized in the following table:
Process | ACF | PACF |
AR(p) | Decreases exponentially | Cuts off at lag p |
MA(q) | Cuts off at lag q | Decreases exponentially |
ARMA(p, q) | Decreases exponentially | Decreases exponentially |
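The cut-off and decay patterns summarized above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the AR(1) and MA(1) coefficients (0.7), the sample size, and the helper names `sample_acf` and `sample_pacf` are all illustrative assumptions. The partial autocorrelations are computed with the Durbin-Levinson recursion.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / n / c0
                     for k in range(max_lag + 1)])

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.array([r[1]])          # phi_{1,1}
    pacf[1] = r[1]
    for k in range(2, max_lag + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        # phi_{k,j} = phi_{k-1,j} - phi_{k,k} * phi_{k-1,k-j}
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

rng = np.random.default_rng(42)
n = 20000
a = rng.standard_normal(n + 1)           # white noise

# Assumed AR(1): z_t = 0.7 z_{t-1} + a_t
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t - 1] + a[t]

# Assumed MA(1): z_t = a_t - 0.7 a_{t-1}
ma = a[1:] - 0.7 * a[:-1]

ar_acf, ar_pacf = sample_acf(ar, 5), sample_pacf(ar, 5)
ma_acf, ma_pacf = sample_acf(ma, 5), sample_pacf(ma, 5)

print("AR(1) ACF :", np.round(ar_acf[1:], 3))   # decays gradually
print("AR(1) PACF:", np.round(ar_pacf[1:], 3))  # near zero after lag 1
print("MA(1) ACF :", np.round(ma_acf[1:], 3))   # near zero after lag 1
print("MA(1) PACF:", np.round(ma_pacf[1:], 3))  # decays gradually
```

With a long simulated series, the AR(1) PACF and the MA(1) ACF should be close to zero beyond lag 1, while the other two functions die out gradually.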
Time series are used in many studies for model building or inference, and we must be careful about what kind of time series data enters the analysis. Because many time series variables, such as rainfall, industrial production, and sales, exist only in aggregated form, we begin with the effect of temporal aggregation on the model form. Given a time series
The first published papers on aggregation effects on ARIMA models were by Tiao (1972) and Amemiya and Wu (1972), and they led to many other studies on the topic, including my Ph.D. dissertation and lifetime research in the area.
To answer the above questions, we need to study the relationship of autocovariances between the non-aggregate and aggregate series
and note that
Hence, we have
Now let us consider an MA(2) model for
and
Note that for
where
Hence,
and
More generally, we have the following results from Stram and Wei (1986):
Temporal aggregation of the AR(
Suppose that the non-aggregate series
Let
where
If
If
The limiting behavior of aggregates was studied by Tiao (1972) and he showed that given
When a variable is a stock variable and we observe only every mth value of the variable, i.e., given
Given
Given the AR(1) model
where
Dickey and Fuller (1979) suggested using the following test statistic and showed that under
where
In practice, aggregate data,
where the
If
If
As a result, the corresponding test statistic becomes
Comparing with (
In this example, we simulated a time series of 240 observations from the model
Test a unit root based on non-aggregate series,
To test the hypothesis of a unit root, the value of the test statistic is
At
Test a unit root with aggregate series,
To test the hypothesis of a unit root, the value of the test statistic is
Again, at
When aggregate time series are used in modeling and testing, we need to make sure to use a properly adjusted table of critical values for the significance test.
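The mechanics of testing for a unit root on non-aggregate and aggregate versions of the same series can be sketched as follows. This is only an illustration under stated assumptions: the simulated model here is a pure random walk, the order of aggregation is m = 3, and the statistic is the normalized bias n(φ̂ − 1) from a no-intercept AR(1) regression; none of these reproduce the specific model or numerical values of the example above. The function names `aggregate` and `normalized_bias` are hypothetical.

```python
import numpy as np

def aggregate(z, m):
    """Non-overlapping m-period sums (temporal aggregation of a flow variable)."""
    n = (len(z) // m) * m
    return z[:n].reshape(-1, m).sum(axis=1)

def normalized_bias(y):
    """Least-squares AR(1) coefficient (no intercept) and n * (phi_hat - 1)."""
    y_lag, y_cur = y[:-1], y[1:]
    phi_hat = np.dot(y_lag, y_cur) / np.dot(y_lag, y_lag)
    n = len(y_cur)
    return phi_hat, n * (phi_hat - 1.0)

rng = np.random.default_rng(0)

# Assumed data-generating process: random walk with 240 observations.
z = np.cumsum(rng.standard_normal(240))

# Assumed order of aggregation m = 3, giving 80 aggregate observations.
Z = aggregate(z, 3)

phi_z, stat_z = normalized_bias(z)
phi_Z, stat_Z = normalized_bias(Z)

print(f"non-aggregate: n = {len(z)}, phi_hat = {phi_z:.4f}, statistic = {stat_z:.3f}")
print(f"aggregate    : n = {len(Z)}, phi_hat = {phi_Z:.4f}, statistic = {stat_Z:.3f}")
```

The two statistics must not be compared against the same critical values: the standard Dickey-Fuller table applies to the non-aggregate series, whereas the aggregate statistic requires a table adjusted for aggregation.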
Time series are often used in regression analysis, which is possibly the most commonly used statistical method, so we also consider the consequences of using aggregate series in a regression model. Let us consider the simple regression model,
which is a one-sided causal relationship. If
or
where
For
with
and hence
and
It follows that
It is interesting to note that
which implies that the estimate of
where
In studying the relationship between variables, we often consider vector time series models in addition to regression models. Although the basic model-building procedures for univariate and vector time series are the same, some important phenomena are unique to vector time series models. We now discuss some special issues of vector time series models.
First, let us review some results from univariate time series models. It is well known that we can always write a stationary process as an MA representation
or
such that
or
such that
Let
where
It follows that we can express a VAR(1) process,
in the following MA representation
where
or equivalently
where
Thus, the inverse of a non-degenerate VAR(1) matrix polynomial (i.e.,
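A small numerical sketch may help make the VAR(1)-to-MA inversion concrete. The coefficient matrix below is hypothetical and deliberately chosen to be nilpotent (its square is zero), so that the MA weights, which are the powers of the VAR(1) coefficient matrix, vanish after lag 1. A univariate AR(1) with a nonzero coefficient can never behave this way, since its MA weights are powers of a nonzero scalar.

```python
import numpy as np

# Hypothetical nonzero VAR(1) coefficient matrix with Phi @ Phi = 0.
Phi = np.array([[0.0, 0.8],
                [0.0, 0.0]])
I2 = np.eye(2)

# MA weights of Z_t = Phi Z_{t-1} + a_t are Psi_j = Phi**j.
psi = [np.linalg.matrix_power(Phi, j) for j in range(5)]
for j, w in enumerate(psi):
    print(f"Psi_{j} =\n{w}")

# (I - Phi B)(I + Phi B) = I - Phi^2 B^2, so the inverse is the finite
# polynomial I + Phi B exactly when Phi^2 = 0.
coef_B1 = Phi - Phi            # coefficient of B in the product
coef_B2 = -Phi @ Phi           # coefficient of B^2 in the product
print("B coefficient  :", coef_B1.tolist())
print("B^2 coefficient:", coef_B2.tolist())
```

Here the process is simultaneously a VAR(1) and a finite-order vector MA(1), which illustrates why identification of vector models requires more care than in the univariate case.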
Given a univariate seasonal ARMA model
where
and
where
are matrix polynomials. The matrix
Note that as expected, the vector model reduces to the univariate model when
As a result, a multiplicative seasonal ARMA model is also traditionally written in the form of ARMA (
When
Let us consider two simple seemingly equivalent bivariate VAR(1)_{4}(1) and VAR(1)(1)_{4} representations with the following parameters
and the associated noise
For the VAR(1)(1)_{4} representation,
The different implications of the two seemingly equivalent representations are clear and cannot be ignored, especially when a policy-related decision is to be made. Please see Yozgatligil and Wei (2009) for details.
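The reason the ordering of the regular and seasonal factors matters is that matrix multiplication does not commute. The sketch below multiplies a regular VAR(1) polynomial and a seasonal VAR(1) polynomial of period 4 in both orders. The two coefficient matrices are hypothetical values chosen only for illustration; they are not the parameters used above, and `polymat_mul` is an assumed helper name.

```python
import numpy as np

def polymat_mul(a, b):
    """Multiply matrix polynomials in B; a[i] is the coefficient of B**i."""
    k = a.shape[1]
    out = np.zeros((a.shape[0] + b.shape[0] - 1, k, k))
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            out[i + j] += a[i] @ b[j]
    return out

I2 = np.eye(2)
phi = np.array([[0.5, 0.3],       # hypothetical regular AR matrix
                [0.0, 0.4]])
Phi_s = np.array([[0.6, 0.0],     # hypothetical seasonal AR matrix
                  [0.2, 0.3]])

regular = np.stack([I2, -phi])    # I - phi B
seasonal = np.zeros((5, 2, 2))    # I - Phi_s B^4
seasonal[0] = I2
seasonal[4] = -Phi_s

regular_first = polymat_mul(regular, seasonal)   # (I - phi B)(I - Phi_s B^4)
seasonal_first = polymat_mul(seasonal, regular)  # (I - Phi_s B^4)(I - phi B)

# The two products agree up to lag 4 and differ only at lag 5.
print("lag-5 coefficient, regular factor first:\n", regular_first[5])
print("lag-5 coefficient, seasonal factor first:\n", seasonal_first[5])
```

The lag-5 coefficient is the product of the two matrices in one order or the other, so the two representations coincide only when the regular and seasonal coefficient matrices commute, as they always do in the univariate case.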
In addition to the temporal aggregation discussed in earlier sections, there is another commonly used form of aggregation. For example, the total money supply is the aggregate of demand deposits and currency in circulation. Total housing starts are the aggregate of housing starts in the northeast, north central, south, and west regions, which in turn are the subaggregates of housing starts in different states. The total sales of a company are the aggregate of the sales achieved by all of its branches throughout the country or countries.
Let
Method 1: based on a model using the aggregate series
Method 2: based on individual component models using the non-aggregate series
Method 3: based on a joint multiple time series model and the forecast from the joint multiple model,
Question: what are the relative efficiencies among the three methods in terms of the minimum mean square error forecast? The answers are given below.
The comparison between methods 1 and 2 depends on the model structure; there is no definite winner between methods 1 and 2.
Answer (2) above may be surprising to some readers, because aggregation normally causes information loss. For the proof, we refer readers to Wei and Abraham (1981).
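The three forecasting methods can be compared empirically with a rough simulation. Everything in this sketch is an assumption made for illustration: the bivariate VAR(1) generating process and its coefficient matrix, the use of AR(5) least-squares fits as stand-ins for the univariate models in methods 1 and 2, the sample sizes, and the helper names. It compares only one-step-ahead forecasts and does not reproduce the theoretical results of Wei and Abraham (1981).

```python
import numpy as np

def simulate_var1(phi, n, rng, burn=200):
    """Simulate Z_t = phi Z_{t-1} + a_t with standard normal shocks."""
    k = phi.shape[0]
    z = np.zeros((n + burn, k))
    shocks = rng.standard_normal((n + burn, k))
    for t in range(1, n + burn):
        z[t] = phi @ z[t - 1] + shocks[t]
    return z[burn:]

def lagged_design(y, p):
    """Columns y_{t-1}, ..., y_{t-p}, aligned with targets y_t for t >= p."""
    n = len(y)
    X = np.column_stack([y[p - j - 1:n - j - 1] for j in range(p)])
    return X, y[p:]

def ar_forecasts(train, test, p):
    """Fit AR(p) by least squares on train; one-step forecasts for test[p:]."""
    X, y = lagged_design(train, p)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    X_test, _ = lagged_design(test, p)
    return X_test @ coef

rng = np.random.default_rng(1)
phi = np.array([[0.5, 0.4],      # hypothetical VAR(1) coefficient matrix
                [0.0, -0.3]])
p = 5                            # assumed AR order for the univariate fits
z = simulate_var1(phi, 8000, rng)
train, test = z[:4000], z[4000:]
actual = test.sum(axis=1)[p:]    # aggregate values to be forecast

# Method 1: univariate model fitted to the aggregate series.
f1 = ar_forecasts(train.sum(axis=1), test.sum(axis=1), p)

# Method 2: separate univariate models for each component, then summed.
f2 = sum(ar_forecasts(train[:, i], test[:, i], p) for i in range(2))

# Method 3: joint VAR(1) fitted by least squares, component forecasts summed.
B, *_ = np.linalg.lstsq(train[:-1], train[1:], rcond=None)
f3 = (test[:-1] @ B).sum(axis=1)[p - 1:]

mse1, mse2, mse3 = (np.mean((actual - f) ** 2) for f in (f1, f2, f3))
print(f"Method 1 MSE: {mse1:.3f}")
print(f"Method 2 MSE: {mse2:.3f}")
print(f"Method 3 MSE: {mse3:.3f}")
```

With unit-variance independent shocks, the one-step forecast error variance of the aggregate under the true joint model is 2, which gives a benchmark for the method 3 figure. Changing the coefficient matrix changes the ranking of methods 1 and 2, in line with the remark that their comparison depends on the model structure.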
Next, let us consider the
The commonly used model to describe (
where
and
For these STARMA(
AR( | Decreases exponentially | Cuts off at lag |
MA( | Cuts off at lag | Decreases exponentially |
ARMA( | Decreases exponentially | Decreases exponentially |