Issues Related to the Use of Time Series in Model Building and Analysis: Review Article
Communications for Statistical Applications and Methods 2015;22:209-222
Published online May 31, 2015
© 2015 Korean Statistical Society.

William W. S. Wei^a

^a Department of Statistics, Temple University, USA
Correspondence to: William W. S. Wei
Department of Statistics, Temple University, 1810 North 13th Street, 330 Speakman Hall (006-12), Philadelphia, PA 19122, USA. E-mail: wwei@temple.edu
Received December 14, 2014; Revised January 28, 2015; Accepted January 28, 2015.

Time series are used in many studies for model building and analysis. We must be very careful to understand the kind of time series data used in the analysis. In this review article, we begin with some issues related to the use of aggregate and systematic sampling time series. Since several time series are often used in a study of the relationship of variables, we also consider vector time series modeling and analysis. Although the basic procedures of model building for univariate time series and vector time series are the same, there are some important phenomena which are unique to vector time series. Therefore, we also discuss some issues related to vector time series models. Understanding these issues is important when we use time series data in modeling and analysis, regardless of whether it is a univariate or multivariate time series.

Keywords : temporal aggregation, systematic sampling, unit root test, causal relationship, vector time series, contemporal aggregation
1. Introduction

Let $z_t$ be a time series process. For a stationary process, its mean, $E(z_t) = \mu$, and variance, $\gamma_z(0) = E(z_t - \mu)^2 = \sigma_z^2$, are constant. In this case, the autocovariance function between $z_t$ and $z_{t+k}$, $\gamma_z(k) = E[(z_t - \mu)(z_{t+k} - \mu)] = E(\dot{z}_t \dot{z}_{t+k})$ with $\dot{z}_t = z_t - \mu$, and the autocorrelation function (ACF), $\rho_z(k) = \gamma_z(k)/\gamma_z(0)$, are functions of only the time difference $k$. The partial autocorrelation function (PACF) is defined as $\phi_{kk} = \mathrm{Corr}(z_t, z_{t+k} \mid z_{t+1}, \ldots, z_{t+k-1})$. Some commonly used time series processes or models are:

  • Autoregressive process of order p (AR(p) model)


  • Moving average process of order q (MA(q) model)


  • Autoregressive moving average process (ARMA(p, q) model)


  • Autoregressive integrated moving average process (ARIMA(p, d, q) model)
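
For reference, the four models above are commonly written as follows in Box-Jenkins operator notation. This is a sketch that uses the symbols appearing later in this article ($B$ for the backshift operator, $a_t$ for a white noise process with variance $\sigma_a^2$, and $\dot{z}_t = z_t - \mu$); the exact form, including the omission of a constant term in the ARIMA model, is an assumption here.

$$
\begin{aligned}
\text{AR}(p):&\quad (1 - \phi_1 B - \cdots - \phi_p B^p)\,\dot{z}_t = a_t,\\
\text{MA}(q):&\quad \dot{z}_t = (1 - \theta_1 B - \cdots - \theta_q B^q)\,a_t,\\
\text{ARMA}(p, q):&\quad (1 - \phi_1 B - \cdots - \phi_p B^p)\,\dot{z}_t = (1 - \theta_1 B - \cdots - \theta_q B^q)\,a_t,\\
\text{ARIMA}(p, d, q):&\quad (1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d z_t = (1 - \theta_1 B - \cdots - \theta_q B^q)\,a_t.
\end{aligned}
$$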


The model is stationary if the roots of its associated AR polynomial are all outside the unit circle, and the important characteristics of stationary models can be summarized in the following table:

Process       ACF                        PACF
AR(p)         Decreases exponentially    Cuts off at lag p
MA(q)         Cuts off at lag q          Decreases exponentially
ARMA(p, q)    Decreases exponentially    Decreases exponentially

2. Temporal Aggregation Effect on Model Form

Time series are used in many studies either for model building or inference. We must be careful when choosing what kind of time series data to use in the analysis. Since many time series variables such as rainfall, industrial production, and sales exist only in some aggregated form, we begin with the effect of temporal aggregation on the model form. Given a time series $z_t$, let $Z_T = (1 + B + \cdots + B^{m-1}) z_{mT}$, where $B$ is the backshift operator on the index $t$. For example, with $m = 3$, $Z_1 = (1 + B + B^2) z_3 = z_1 + z_2 + z_3$, $Z_2 = (1 + B + B^2) z_6 = z_4 + z_5 + z_6$, etc. We call $z_t$ the non-aggregate series and $Z_T$ the aggregate series. For $m = 3$, if $z_t$ is a monthly series, then $Z_T$ is a quarterly series. To make inferences, should we use the non-aggregate series $z_t$ or the aggregate series $Z_T$? Does the choice make any difference? Are the time series models for $z_t$ and for $Z_T$ the same?
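
The aggregation $Z_T = z_{m(T-1)+1} + \cdots + z_{mT}$ can be sketched in a few lines of Python. The function name `temporal_aggregate` is hypothetical, NumPy is assumed to be available, and dropping an incomplete trailing block is a choice made here for illustration.

```python
import numpy as np

def temporal_aggregate(z, m):
    """Return Z_T = z_{m(T-1)+1} + ... + z_{mT} for T = 1, ..., len(z) // m."""
    z = np.asarray(z, dtype=float)
    n_blocks = len(z) // m               # an incomplete trailing block is dropped
    return z[: n_blocks * m].reshape(n_blocks, m).sum(axis=1)

monthly = np.arange(1, 7)                # z_1, ..., z_6
quarterly = temporal_aggregate(monthly, 3)
print(quarterly)                         # Z_1 = 1 + 2 + 3, Z_2 = 4 + 5 + 6
```

With $m = 3$ this turns a monthly series into a quarterly one, as in the example above.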

The first published papers on aggregation effects on ARIMA models were by Tiao (1972) and Amemiya and Wu (1972), and they led to many other studies on the topic, including my Ph.D. dissertation and lifetime research in the area.

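To answer the above questions, we need to study the relationship between the autocovariances of the non-aggregate and aggregate series $z_t$ and $Z_T$, or more generally between those of $w_t = (1 - B)^d z_t$ and $U_T = (1 - \mathcal{B})^d Z_T$, where $\mathcal{B}$ is the backshift operator on the aggregate time index $T$. Define the $m$-period overlapping sum,

$$\zeta_t = (1 + B + \cdots + B^{m-1}) z_t,$$

and note that $Z_T = \zeta_{mT}$, $(1 - \mathcal{B}) Z_T = Z_T - Z_{T-1} = \zeta_{mT} - \zeta_{m(T-1)} = (1 - B^m)\zeta_{mT}$, so that

$$U_T = (1 - \mathcal{B})^d Z_T = (1 - B^m)^d \zeta_{mT} = (1 + B + \cdots + B^{m-1})^{d+1} w_{mT}.$$

Hence, we have

$$\gamma_U(k) = (1 + B + \cdots + B^{m-1})^{2(d+1)} \gamma_w(mk + (d+1)(m-1)), \qquad (2.1)$$

where $B$ now operates on the index of $\gamma_w(\cdot)$, i.e., $B\gamma_w(j) = \gamma_w(j-1)$.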
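Now let us consider an MA(2) model for $z_t$, $z_t = (1 - \theta_1 B - \theta_2 B^2) a_t$. If $m = 3$, what is the model for $Z_T$? For this MA(2) model, we have

$$\gamma_z(0) = (1 + \theta_1^2 + \theta_2^2)\sigma_a^2,$$

$$\gamma_z(1) = -\theta_1(1 - \theta_2)\sigma_a^2,$$

$$\gamma_z(2) = -\theta_2\sigma_a^2,$$

$$\gamma_z(j) = 0, \qquad j > 2.$$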

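Note that for $d = 0$ and $m = 3$, from (2.1), we have

$$\gamma_Z(0) = (1 + B + B^2)^2 \gamma_z(2) = 3\gamma_z(0) + 4\gamma_z(1) + 2\gamma_z(2),$$

$$\gamma_Z(1) = (1 + B + B^2)^2 \gamma_z(5) = \gamma_z(1) + 2\gamma_z(2),$$

$$\gamma_Z(k) = 0, \qquad k \geq 2.$$

Hence, $Z_T$ is an MA(1) process, $Z_T = (1 - \Theta\mathcal{B})A_T$, where $\Theta$ and the variance $\sigma_A^2$ of the white noise $A_T$ satisfy

$$\frac{-\Theta}{1 + \Theta^2} = \frac{\gamma_Z(1)}{\gamma_Z(0)}, \qquad \sigma_A^2 = \frac{\gamma_Z(0)}{1 + \Theta^2}, \qquad |\Theta| < 1.$$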
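
As a numerical illustration, the MA(1) parameters of the aggregate can be obtained by matching autocovariances. The sketch below assumes the standard MA(2) autocovariances, the $m = 3$ aggregate autocovariances $\gamma_Z(0) = 3\gamma_z(0) + 4\gamma_z(1) + 2\gamma_z(2)$ and $\gamma_Z(1) = \gamma_z(1) + 2\gamma_z(2)$, and the invertible solution $|\Theta| < 1$; the function name `aggregate_ma2_m3` is hypothetical.

```python
import math

def aggregate_ma2_m3(theta1, theta2, sigma2_a=1.0):
    """MA(1) parameters (Theta, sigma2_A) of Z_T when z_t is MA(2) and m = 3."""
    # Autocovariances of z_t = (1 - theta1*B - theta2*B^2) a_t
    g0 = (1 + theta1**2 + theta2**2) * sigma2_a
    g1 = -theta1 * (1 - theta2) * sigma2_a
    g2 = -theta2 * sigma2_a
    # Autocovariances of the aggregate Z_T for m = 3
    G0 = 3 * g0 + 4 * g1 + 2 * g2
    G1 = g1 + 2 * g2
    rho = G1 / G0
    if rho == 0:
        Theta = 0.0
    else:
        # Invertible root of rho*Theta^2 + Theta + rho = 0,
        # which is equivalent to -Theta / (1 + Theta^2) = rho
        Theta = (-1 + math.sqrt(max(0.0, 1 - 4 * rho**2))) / (2 * rho)
    sigma2_A = G0 / (1 + Theta**2)
    return Theta, sigma2_A

Theta, sigma2_A = aggregate_ma2_m3(theta1=0.5, theta2=0.0)
print(Theta, sigma2_A)
```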




More generally, we have the following results from Stram and Wei (1986):

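  • Temporal aggregation of the AR(p) process

    Suppose that the non-aggregate series $z_t$ follows a stationary AR($p$) process,

    $$(1 - \phi_1 B - \cdots - \phi_p B^p) z_t = a_t.$$

    Let $\phi_p(B) = (1 - \phi_1 B - \cdots - \phi_p B^p)$ and let $\delta_i^{-1}$, for $i = 1, \ldots, p^*$, be the distinct roots of $\phi_p(B)$, each with multiplicity $s_i$ such that $\sum_{i=1}^{p^*} s_i = p$. For any given value of $m$, let $b$ equal the number of distinct values $\delta_i^m$ for $i = 1, \ldots, p^*$. Furthermore, partition the numbers $s_i$, $i = 1, \ldots, p^*$, into $b$ distinct sets $A_i$ such that $s_k$ and $s_j \in A_i$ if and only if $\delta_k^m = \delta_j^m$. Then the $m$th order aggregate series $Z_T$ follows an ARMA($M$, $N_1$) model,

    $$(1 - \alpha_1\mathcal{B} - \cdots - \alpha_M\mathcal{B}^M) Z_T = (1 - \beta_1\mathcal{B} - \cdots - \beta_{N_1}\mathcal{B}^{N_1}) E_T,$$

    where $M = \sum_{i=1}^{b} \max A_i$, $\max A_i$ is the largest element in $A_i$, $N_1 = [p + 1 - (p+1)/m] - (p - M) = [M + 1 - (p+1)/m]$ with $[x]$ denoting the integer part of $x$, the $E_T$ are white noise with mean 0 and variance $\sigma_E^2$, and $\alpha_i$, $\beta_j$, and $\sigma_E^2$ are functions of the $\phi_k$'s and $\sigma_a^2$.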

  • If $z_t \sim$ ARMA($p, q$), then $Z_T \sim$ ARMA($M, N_2$), where $N_2 = [p + 1 + (q - p - 1)/m] - (p - M)$.

  • If $z_t \sim$ ARIMA($p, d, q$), then $Z_T \sim$ ARIMA($M, d, N_3$), where $N_3 = [p + d + 1 + (q - p - d - 1)/m] - (p - M)$.
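
The order formulas above are easy to evaluate mechanically. The following sketch reads $[\cdot]$ as the integer part and, when no value of $M$ is supplied, assumes that the $\delta_i^m$ are all distinct so that $M = p$; the function name `aggregate_ma_order` is hypothetical.

```python
def aggregate_ma_order(p, d, q, m, M=None):
    """MA order N_3 of the aggregate ARIMA(M, d, N_3) model.

    Uses N_3 = [p + d + 1 + (q - p - d - 1)/m] - (p - M), with [x] taken
    as the integer part of x. If M is None, M = p is assumed, i.e. no two
    distinct AR roots coincide after being raised to the m-th power.
    """
    if M is None:
        M = p
    # Integer arithmetic avoids floating-point rounding; the bracketed
    # quantity is non-negative, so floor division gives its integer part.
    integer_part = (m * (p + d + 1) + (q - p - d - 1)) // m
    return integer_part - (p - M)

print(aggregate_ma_order(p=0, d=0, q=2, m=3))   # MA(2) aggregated with m = 3
print(aggregate_ma_order(p=1, d=0, q=0, m=3))   # AR(1) aggregated with m = 3
```

The first call reproduces the MA(2) example with $m = 3$, for which the aggregate is an MA(1) process; with $d = 0$ and $q = 0$ the same function gives $N_1$, and with $d = 0$ it gives $N_2$.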

The limiting behavior of aggregates was studied by Tiao (1972) and he showed that given zt ~ ARIMA(p, d, q) model, the limiting model for the aggregates, ZT exists, and as m → ∞, ZT → IMA(d, d).

When a variable is a stock variable, we often observe only every $m$th value of it (systematic sampling); that is, given $z_1, z_2, z_3, \ldots$, we observe only $Z_T = z_{mT}$. For example, for $m = 3$, $Z_1 = z_3$, $Z_2 = z_6$, and so on. We have the following interesting result from Wei (1981).

Given $z_t \sim$ IMA($d, q$): $(1 - B)^d z_t = (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t$. Let $Z_T = z_{mT}$. Then, as $m \to \infty$, $Z_T \to$ IMA($d, d - 1$). Theoretically, every ARIMA($p, d, q$) process can be approximated by an IMA($d, q$) process. Thus, if $z_t$ is an ARIMA($p, 1, q$) process, then as $m \to \infty$, $Z_T$ approaches IMA(1, 0), which could very likely explain why most daily stock prices follow a random walk model. In building an underlying time series model, we need to be aware of the effect of using aggregate series. In addition to the above cited references, we refer readers to some other useful references including Brewer (1973), Wei (1982), Ansley and Kohn (1983), Weiss (1984), Wei and Stram (1988), Marcellino (1999), Shellman (2004), and Sbrana and Silvestrini (2013).
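
Systematic sampling, $Z_T = z_{mT}$, differs from temporal aggregation in that it keeps one observation per block rather than summing the block. A minimal sketch, assuming NumPy and a hypothetical function name `systematic_sample`:

```python
import numpy as np

def systematic_sample(z, m):
    """Return Z_T = z_{mT}, T = 1, ..., len(z) // m, where z[0] holds z_1."""
    z = np.asarray(z)
    return z[m - 1 :: m]                 # positions m, 2m, 3m, ... (1-based)

z = np.arange(1, 8)                      # z_1, ..., z_7
print(systematic_sample(z, 3))           # Z_1 = z_3, Z_2 = z_6
```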

3. Aggregation Effect on Testing for a Unit Root

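Given the AR(1) model

$$z_t = \phi z_{t-1} + a_t, \qquad (3.1)$$

where $a_t$ is a $N(0, \sigma_a^2)$ white noise process and $t = 1, 2, \ldots, n$, we wish to test for a unit root, $H_0: \phi = 1$ vs. $H_1: \phi < 1$. Based on the least squares estimator

$$\hat{\phi} = \frac{\sum_{t=1}^{n} z_{t-1} z_t}{\sum_{t=1}^{n} z_{t-1}^2},$$

Dickey and Fuller (1979) suggested using the following test statistic and showed that under $H_0: \phi = 1$,

$$n(\hat{\phi} - 1) \xrightarrow{D} \frac{\frac{1}{2}\{[W(1)]^2 - 1\}}{\int_0^1 [W(x)]^2\,dx}, \qquad (3.2)$$

where $W(t)$ is a Wiener process (also known as a Brownian motion process). We reject $H_0$ if the value of the test statistic is too small (too negative).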
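
A minimal sketch of the normalized-bias statistic $n(\hat{\phi} - 1)$ for an AR(1) model without a constant term follows. It assumes the input array holds $z_0, z_1, \ldots, z_n$, so that $n$ is the number of regression pairs; the function name `df_normalized_bias` and the simulation settings are hypothetical.

```python
import numpy as np

def df_normalized_bias(z):
    """Return (phi_hat, n * (phi_hat - 1)) for z = [z_0, z_1, ..., z_n]."""
    z = np.asarray(z, dtype=float)
    lagged, current = z[:-1], z[1:]
    # Least squares estimate of phi in z_t = phi * z_{t-1} + a_t (no intercept)
    phi_hat = np.dot(lagged, current) / np.dot(lagged, lagged)
    n = len(current)
    return phi_hat, n * (phi_hat - 1.0)

# Illustration on a simulated stationary AR(1) series
rng = np.random.default_rng(0)
z = np.zeros(241)
for t in range(1, 241):
    z[t] = 0.95 * z[t - 1] + rng.standard_normal()
phi_hat, stat = df_normalized_bias(z)
print(phi_hat, stat)                     # compare stat with the tabulated critical value
```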

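In practice, aggregate data, $Z_T = (1 + B + \cdots + B^{m-1}) z_{mT}$, are often used. It has been shown by Teles et al. (2008) that under $H_0: \phi = 1$, i.e., $(1 - B) z_t = a_t$, the corresponding model for the aggregate series is

$$Z_T = Z_{T-1} + E_T - \Theta E_{T-1}, \qquad (3.3)$$

where the $E_T$ are independent and identically distributed variables with zero mean and variance $\sigma_E^2$, and the parameters $\Theta$ and $\sigma_E^2$ are determined as follows:

  • If $m = 1$, then $\Theta = 0$ and $\sigma_E^2 = \sigma_a^2$;

  • If $m \geq 2$, then

    $$\Theta = \frac{-(2m^2 + 1) + m\sqrt{3(m^2 + 2)}}{m^2 - 1}, \qquad \sigma_E^2 = \frac{m(2m^2 + 1)}{3(1 + \Theta^2)}\,\sigma_a^2.$$

    As a result, the corresponding test statistic becomes

    $$n_A(\hat{\Phi} - 1) \xrightarrow{D} \frac{\frac{1}{2}\{[W(1)]^2 - 1\} + \frac{m^2 - 1}{6m^2}}{\int_0^1 [W(x)]^2\,dx}, \qquad (3.4)$$

    where $\hat{\Phi}$ is the least squares estimator based on the aggregate series and $n_A$ is the number of aggregate observations. Comparing with (3.2), we see that the limiting distribution of the test statistic for the aggregate time series depends on the order of aggregation $m$. For $m \geq 2$, the distribution of the test statistic is shifted to the right, and the shift increases with the order of aggregation. Aggregation leads to empirical significance levels lower than the nominal level and significantly reduces the power of the test.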
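
The dependence on $m$ can be made concrete with a short calculation. The sketch below is an assumption-laden illustration rather than the procedure of Teles et al. (2008): it uses the fact that, under a random walk for $z_t$, the differenced aggregate $Z_T - Z_{T-1}$ has variance $m(2m^2+1)\sigma_a^2/3$ and lag-1 autocovariance $m(m^2-1)\sigma_a^2/6$, matches these to an invertible MA(1), and reports the resulting numerator shift $(m^2-1)/(6m^2)$. The function name is hypothetical.

```python
import math

def aggregate_unit_root_params(m, sigma2_a=1.0):
    """Return (Theta, sigma2_E, shift) for the aggregate of a random walk."""
    if m == 1:
        return 0.0, sigma2_a, 0.0
    # Invertible MA(1) parameter matching the lag-1 autocorrelation
    # (m^2 - 1) / (2 * (2 m^2 + 1)) of the differenced aggregate
    theta = (-(2 * m * m + 1) + m * math.sqrt(3 * (m * m + 2))) / (m * m - 1)
    sigma2_E = m * (2 * m * m + 1) * sigma2_a / (3 * (1 + theta**2))
    shift = (m * m - 1) / (6 * m * m)    # added to the numerator of the limit
    return theta, sigma2_E, shift

for m in (1, 3, 12):
    print(m, aggregate_unit_root_params(m))
```

The shift is positive and increases with $m$ toward $1/6$, which is why critical values tabulated for non-aggregate data are too negative for aggregate data.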

Example 1

In this example, we simulated a time series of 240 observations from the model zt = 0.95zt−1 + at where the at are i.i.d. N(0, 1). This series was then aggregated with m = 3.

  • Test a unit root based on non-aggregate series, zt (240 observations). Based on the sample autocorrelation function and partial autocorrelation function, we have an AR(1) model. The least squares estimation leads to the following result


    To test the hypothesis of a unit root, the value of the test statistic is


    At α = 5%, the critical point from Dickey and Fuller (1979) is between −8.0 and −7.9 for n = 240. Thus, the hypothesis of a unit root is rejected, and we conclude that the underlying model is stationary. This is consistent with the underlying simulated model.

  • Test a unit root with aggregate series, ZT, with m = 3 (80 observations). The sample autocorrelation function and partial autocorrelation function suggest an AR(1) model (an ARMA(1,1) model was also considered but its MA parameter was not significant). The least squares estimation leads to the following result


    To test the hypothesis of a unit root, the value of the test statistic is


    Again, at α = 5%, the critical point from Dickey and Fuller (1979) is between −7.9 and −7.7 for n = 80. Thus, the hypothesis of a unit root is not rejected, and we would conclude that the underlying model is nonstationary, which is a wrong conclusion. However, if we use the adjusted critical value given in Teles et al. (2008) based on the adjusted test statistic of (3.4), which is between −5.45 and −5.40 for n = 80, we reject the null hypothesis of a unit root, which leads to a conclusion consistent with the underlying simulated model.

When aggregate time series are used in modeling and testing, we need to make sure to use a proper adjusted table for the test of its significance.

4. Aggregation Effect on a Dynamic Relation

Time series are often used in regression analysis, which is possibly the most commonly used statistical method. So we will also consider the consequence of using aggregate series in a regression model. Let us consider the simple regression model,

$$y_t = \alpha x_{t-1} + e_t, \qquad (4.1)$$

which is a one-sided causal relationship. If $x_{t-1}$ is also stochastic, for example, if $x_t$ follows an MA(1) process, we can also write the joint system as

$$y_t = \alpha x_{t-1} + e_t,$$

$$x_t = (1 - \theta B) a_t,$$

where $a_t$ and $e_t$ are independent $N(0, \sigma_a^2)$ and $N(0, \sigma_e^2)$, respectively. Let $Y_T = (\sum_{j=0}^{m-1} B^j) y_{mT}$, $X_T = (\sum_{j=0}^{m-1} B^j) x_{mT}$, $W_T = (\sum_{j=0}^{m-1} B^j) x_{mT-1}$, and $E_T = (\sum_{j=0}^{m-1} B^j) e_{mT}$. Equation (4.1) implies that

$$Y_T = \alpha W_T + E_T.$$
For m = 3, W1 = x0 + x1 + x2, W2 = x3 + x4 + x5, etc., which are not available, and the available data are X1 = x1 + x2 + x3, X2 = x4 + x5 + x6, etc. A natural way to estimate WT is to consider its projection on XT. Specifically, we let Z = [WT, XT]′ and compute its covariance matrix generating function



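$$\Gamma_0 = E\begin{bmatrix} W_T \\ X_T \end{bmatrix}\begin{bmatrix} W_T & X_T \end{bmatrix} = E\begin{bmatrix} W_T W_T & W_T X_T \\ X_T W_T & X_T X_T \end{bmatrix} = \sigma_a^2 \begin{bmatrix} m(1-\theta)^2 + 2\theta & (m-1)(1-\theta)^2 \\ (m-1)(1-\theta)^2 & m(1-\theta)^2 + 2\theta \end{bmatrix},$$

$$\Gamma_1 = E\begin{bmatrix} W_{T-1} \\ X_{T-1} \end{bmatrix}\begin{bmatrix} W_T & X_T \end{bmatrix} = E\begin{bmatrix} W_{T-1} W_T & W_{T-1} X_T \\ X_{T-1} W_T & X_{T-1} X_T \end{bmatrix} = \sigma_a^2 \begin{bmatrix} -\theta & 0 \\ (1-\theta)^2 & -\theta \end{bmatrix},$$

$$\Gamma_k = E\begin{bmatrix} W_{T-k} \\ X_{T-k} \end{bmatrix}\begin{bmatrix} W_T & X_T \end{bmatrix} = E\begin{bmatrix} W_{T-k} W_T & W_{T-k} X_T \\ X_{T-k} W_T & X_{T-k} X_T \end{bmatrix} = 0, \qquad k \geq 2,$$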
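
These covariance matrices can be checked by brute force: each entry is a double sum of MA(1) autocovariances over the two blocks of $m$ consecutive $x$ values. The sketch below assumes $x_t = (1 - \theta B)a_t$ as in the text; the function names and the shifted time origin (the block for $W_T$ starts at index 0) are choices made here for illustration.

```python
import numpy as np

def ma1_autocov(lag, theta, sigma2_a=1.0):
    """Autocovariance of x_t = (1 - theta*B) a_t at the given lag."""
    lag = abs(lag)
    if lag == 0:
        return (1.0 + theta**2) * sigma2_a
    if lag == 1:
        return -theta * sigma2_a
    return 0.0

def block_cov(start_u, start_v, m, theta, sigma2_a=1.0):
    """Covariance between two sums of m consecutive x values."""
    return sum(ma1_autocov(i - j, theta, sigma2_a)
               for i in range(start_u, start_u + m)
               for j in range(start_v, start_v + m))

def gamma_matrices(m, theta, sigma2_a=1.0):
    """Return (Gamma_0, Gamma_1) for the pair (W_T, X_T)."""
    now = [0, 1]             # first index of the blocks W_T and X_T
    prev = [-m, 1 - m]       # first index of the blocks W_{T-1} and X_{T-1}
    gamma0 = np.array([[block_cov(u, v, m, theta, sigma2_a) for v in now] for u in now])
    gamma1 = np.array([[block_cov(u, v, m, theta, sigma2_a) for v in now] for u in prev])
    return gamma0, gamma1

g0, g1 = gamma_matrices(m=3, theta=0.5)
print(g0)
print(g1)
```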

and hence




It follows that


It is interesting to note that Equation (4.6) can be rewritten as


which implies that the estimate of ŴT is the weighted average of XT and XT−1 with weights (m−1)/m and 1/m, respectively. This is clearly reasonable. The aggregate model then becomes


where $U_T = \alpha(W_T - \hat{W}_T) + E_T = \alpha V_T + E_T$, $G_U(B) = \alpha^2 G_V(B) + m\sigma_e^2$,

$$G_V(B) = G_{11}(B) - G_{12}(B)[G_{22}(B)]^{-1}G_{21}(B) = \sigma_a^2\left\{ m(1-\theta)^2 + 2\theta - \theta(B + F) - \frac{(1-\theta)^4[(m-1) + B][(m-1) + F]}{m(1-\theta)^2 + 2\theta - \theta(B + F)} \right\},$$

and $F = B^{-1}$. Thus, temporal aggregation turns a one-sided causal relationship into a two-sided feedback system. It is important to note that after proposing an underlying model for a study, one should use the same time unit in the hypothesis and in the data collection for modeling and testing. An improper use of time unit could lead to a very misleading conclusion. For a more detailed description, we refer readers to Tiao and Wei (1976). Other useful references include Wei (1978) and Lütkepohl (1987).

5. Issues Related to Vector Time Series Modeling

5.1. Representation of vector time series models

In studying the relationship of variables, other than the regression model, we often consider vector time series models. Although the basic procedures of model building for univariate time series and vector time series are the same, there are some important phenomena which are unique to vector time series models. We now discuss some special issues of vector time series models.

First, let us review some results from univariate time series models. It is well known that we can always write a stationary process in an MA representation

$$\dot{Z}_t = \psi(B) a_t = \sum_{j=0}^{\infty} \psi_j a_{t-j}, \qquad \psi_0 = 1,$$

such that $\sum_{j=0}^{\infty} |\psi_j| < \infty$. Similarly, we can write an invertible process in an AR representation

$$\pi(B)\dot{Z}_t = \sum_{j=0}^{\infty} \pi_j \dot{Z}_{t-j} = a_t, \qquad \pi_0 = 1,$$

such that $\sum_{j=0}^{\infty} |\pi_j| < \infty$. From these two representations, we have the well-known dual relationship between AR($p$) and MA($q$) models in univariate time series processes. That is, a finite order AR process corresponds to an infinite order MA process, and a finite order MA process corresponds to an infinite order AR process. For example, an AR(1) model, $(1 - \phi B)\dot{Z}_t = a_t$, corresponds to an infinite order MA process, $\dot{Z}_t = (1 - \phi B)^{-1} a_t = (1 + \phi B + \phi^2 B^2 + \cdots) a_t$, and an MA(1) model, $\dot{Z}_t = (1 - \theta B) a_t$, corresponds to an infinite order AR process, $(1 + \theta B + \theta^2 B^2 + \cdots)\dot{Z}_t = a_t$.

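Let $Z_t = [Z_{1,t}, Z_{2,t}, \ldots, Z_{m,t}]'$ be an $m$-dimensional vector time series. Some commonly used vector time series models are VAR($p$), VMA($q$), and VARMA($p, q$) processes. Again, we can write these vector processes in a moving average representation

$$\dot{Z}_t = \sum_{j=0}^{\infty} \Psi_j a_{t-j} = \Psi(B) a_t,$$

where $\dot{Z}_t = Z_t - \mu$, $\Psi_0 = I$, and $a_t$ is an $m$-dimensional vector white noise process $N(0, \Sigma)$. We can also write it in an autoregressive representation

$$\Pi(B)\dot{Z}_t = a_t.$$

It follows that we can express a VAR(1) process,

$$(I - \Phi B)\dot{Z}_t = a_t,$$

in the following MA representation

$$\dot{Z}_t = (I - \Phi B)^{-1} a_t = \sum_{j=0}^{\infty} \Phi^j a_{t-j},$$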
where $\Phi^0 = I$. A natural question to ask is: is the VMA representation always of infinite order? People often think that the univariate model is a special case of the vector model with dimension equal to 1, and so the answer to the question is obviously yes. However, let us consider the following 3-dimensional VAR(1) model

$$(I - \Phi B)\dot{Z}_t = a_t,$$

or equivalently

$$\dot{Z}_t = \Phi\dot{Z}_{t-1} + a_t,$$

where $\Phi$ is a $3 \times 3$ coefficient matrix whose only nonzero elements are 0.8 and 0.5. Since $\Phi^2 \neq 0$ and $\Phi^j = 0$ for $j > 2$, Equation (5.4) actually represents a VMA(2) model. In fact,

$$(I - \Phi B)^{-1} = I + \Phi B + \Phi^2 B^2.$$
Thus, the inverse of a non-degenerate VAR(1) matrix polynomial (i.e., Φ(B) ≠ I) will be of a finite order if the determinant |Φ(B)| is independent of B. For more detailed discussion, we refer readers to Tiao and Tsay (1989), and Shen and Wei (1995).
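
The finite-order phenomenon is easy to verify numerically. The sketch below uses a hypothetical nilpotent matrix with the values 0.8 and 0.5 placed on the superdiagonal; this placement is an assumption made for illustration, not necessarily the matrix of the example above. It checks that the VMA weights $\Phi^j$ vanish for $j > 2$ and that the determinant $|I - \Phi B|$ does not depend on $B$.

```python
import numpy as np

# Hypothetical nilpotent VAR(1) coefficient matrix (placement of entries assumed)
Phi = np.array([[0.0, 0.8, 0.0],
                [0.0, 0.0, 0.5],
                [0.0, 0.0, 0.0]])

def vma_weights(Phi, max_lag):
    """VMA weights Psi_j = Phi^j of the VAR(1) model, j = 0, ..., max_lag."""
    return [np.linalg.matrix_power(Phi, j) for j in range(max_lag + 1)]

weights = vma_weights(Phi, 5)
nonzero_lags = [j for j, w in enumerate(weights) if np.any(np.abs(w) > 1e-12)]
print("highest lag with a nonzero weight:", max(nonzero_lags))

# The determinant of I - Phi*x is the same for every value x substituted for B
dets = [np.linalg.det(np.eye(3) - Phi * x) for x in (-2.0, 0.5, 3.0)]
print(dets)
```

Because $I - \Phi B$ is unit upper triangular for this choice of $\Phi$, its determinant equals 1 for every $B$, so the inverse is a matrix polynomial of finite degree.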

5.2. Representation of multiplicative seasonal vector autoregressive moving average models

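Given a univariate seasonal ARMA model

$$\Phi_P(B^s)\phi_p(B) x_t = \theta_q(B)\Theta_Q(B^s) a_t,$$

where

$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps}, \qquad \phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p,$$

$$\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q, \qquad \Theta_Q(B^s) = 1 - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs},$$

and $a_t$ is a Gaussian white noise process with mean 0 and a constant variance $\sigma_a^2$. The model is often denoted as ARMA$(p, q) \times (P, Q)_s$. To facilitate our discussion, we will use the order of the polynomials appearing in the equation and denote it as ARMA$(P)_s(p)(q)(Q)_s$. When $x_t = [x_{1,t}, \ldots, x_{k,t}]'$ is a $k$-dimensional vector, the natural extension is the following multiplicative vector autoregressive moving average VARMA$(P)_s(p)(q)(Q)_s$ model,

$$\Phi_P(B^s)\phi_p(B) x_t = \theta_q(B)\Theta_Q(B^s) a_t,$$

where

$$\Phi_P(B^s) = I - \Phi_1 B^s - \cdots - \Phi_P B^{Ps}, \qquad \phi_p(B) = I - \phi_1 B - \cdots - \phi_p B^p,$$

$$\theta_q(B) = I - \theta_1 B - \cdots - \theta_q B^q, \qquad \Theta_Q(B^s) = I - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs}$$

are matrix polynomials. The matrix $I$ is the $k$-dimensional identity matrix, the $\Phi$s, $\phi$s, $\theta$s, and $\Theta$s are $k \times k$ parameter matrices, and $a_t$ is a vector Gaussian white noise process with mean vector 0 and $E(a_t a_t') = \Omega$.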

Note that, as expected, the vector model reduces to the univariate model when $k = 1$. Moreover, in such a case, because scalar polynomials commute, the ARMA$(P)_s(p)(q)(Q)_s$ model can also be written as the following ARMA$(p)(P)_s(Q)_s(q)$ model

$$\phi_p(B)\Phi_P(B^s) x_t = \Theta_Q(B^s)\theta_q(B) a_t.$$

As a result, a multiplicative seasonal ARMA model is also traditionally written in the form of ARMA$(P)_s(p)(q)(Q)_s$. This traditional representation has been adopted by many researchers for both univariate and vector time series.

When $k > 1$ and $x_t$ is a vector process, can we really extend the above operation and write the VARMA$(P)_s(p)(q)(Q)_s$ model, $\Phi_P(B^s)\phi_p(B) x_t = \theta_q(B)\Theta_Q(B^s) a_t$, as the following VARMA$(p)(P)_s(Q)_s(q)$ model?

$$\phi_p(B)\Phi_P(B^s) x_t = \Theta_Q(B^s)\theta_q(B) a_t.$$
Let us consider two simple, seemingly equivalent bivariate VAR$(1)_4(1)$ and VAR$(1)(1)_4$ representations with the following parameters


and the associated noise $a_t$ is a vector Gaussian white noise process with mean zero and covariance matrix $\Omega$. For the VAR$(1)_4(1)$ representation,


For the VAR$(1)(1)_4$ representation,


The different implications between the two seemingly equivalent representations are clear and cannot be ignored especially when a policy related decision is to be made. Please see Yozgatligil and Wei (2009) for details.
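
The source of the difference is that matrix polynomials do not commute in general. Expanding $(I - \Phi_1 B^4)(I - \phi_1 B)x_t = a_t$ gives $x_t = \phi_1 x_{t-1} + \Phi_1 x_{t-4} - \Phi_1\phi_1 x_{t-5} + a_t$, whereas $(I - \phi_1 B)(I - \Phi_1 B^4)x_t = a_t$ gives the lag-5 coefficient $-\phi_1\Phi_1$. The sketch below compares the two lag-5 coefficient matrices for hypothetical parameter values; these are not the parameters of the example above.

```python
import numpy as np

# Hypothetical bivariate parameter matrices, chosen only for illustration
phi1 = np.array([[0.5, 0.2],
                 [0.0, 0.3]])            # regular AR(1) matrix
Phi1 = np.array([[0.4, 0.0],
                 [0.3, 0.6]])            # seasonal AR(1) matrix, s = 4

# Coefficient of x_{t-5} when the seasonal operator is applied first or second
lag5_seasonal_first = -Phi1 @ phi1       # from (I - Phi1 B^4)(I - phi1 B)
lag5_regular_first = -phi1 @ Phi1        # from (I - phi1 B)(I - Phi1 B^4)

print(lag5_seasonal_first)
print(lag5_regular_first)
print("same model:", np.allclose(lag5_seasonal_first, lag5_regular_first))
```

The two representations coincide only when $\phi_1$ and $\Phi_1$ commute, which always holds for $k = 1$ but not for general $k \times k$ matrices.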

6. Contemporal Aggregation

In addition to temporal aggregation discussed in earlier sections, there is another commonly used aggregation. For example, the total money supply is the aggregate of demand deposits and currency in circulation. The total housing start is the aggregate of housing starts in the north east, north central, south, and west regions, which again are the subaggregates of housing starts in different states. The total sales of a company is the aggregate of the sales achieved by all of its branches throughout the country or countries.

Let $z_{1,t}, z_{2,t}, \ldots, z_{m,t}$ be the $m$ component time series and $Y_t = \sum_{i=1}^{m} z_{i,t}$ be the corresponding series of aggregates. Suppose we are interested in forecasting the future aggregate $Y_{t+l}$ for some lead time $l$, based on the available time series up to time $t$. Such forecasts can be obtained through the following three methods.

  • Method 1: based on a model using the aggregate series $Y_t$ and its $l$-step ahead forecast, $\hat{Y}_t(l)$.

  • Method 2: based on individual component models using the non-aggregate series $z_{i,t}$ and the sum of the forecasts from all component models, i.e., $\hat{\hat{Y}}_t(l) = \sum_{i=1}^{m} \hat{z}_{i,t}(l)$.

  • Method 3: based on a joint multiple time series model and the forecast from the joint multiple model, $\hat{\hat{\hat{Y}}}_t(l)$.

Question: what are the relative efficiencies among the three methods in terms of the minimum mean square error forecast? The answers are given below.

  • (1) $E[Y_{t+l} - \hat{\hat{\hat{Y}}}_t(l)]^2 \leq E[Y_{t+l} - \hat{Y}_t(l)]^2$; $E[Y_{t+l} - \hat{\hat{\hat{Y}}}_t(l)]^2 \leq E[Y_{t+l} - \hat{\hat{Y}}_t(l)]^2$; and equality holds when $z_{1,t}, z_{2,t}, \ldots, z_{m,t}$ are orthogonal to each other.

  • (2) The comparison between methods 1 and 2 depends on the model structure; there is no definite winner between methods 1 and 2.

Answer (2) above could be surprising to some people because aggregation normally causes information loss. For the proof, we refer readers to Wei and Abraham (1981).
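
As a small illustration of Method 2, the sketch below sums the $l$-step ahead forecasts of the components under the simplifying assumption that each component is a zero-mean AR(1) process with known coefficient, so that $\hat{z}_{i,t}(l) = \phi_i^l z_{i,t}$. The AR(1) assumption, the function name, and the input values are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def forecast_aggregate_from_components(last_values, phis, lead):
    """Method 2 forecast of Y_{t+lead} as the sum of component forecasts.

    Assumes each component is a zero-mean AR(1) process with known
    coefficient phis[i], so that its lead-step forecast is phi_i**lead * z_{i,t}.
    """
    last_values = np.asarray(last_values, dtype=float)
    phis = np.asarray(phis, dtype=float)
    component_forecasts = phis**lead * last_values
    return component_forecasts.sum()

print(forecast_aggregate_from_components([2.0, 4.0], [0.5, 0.25], lead=2))
```

Method 1 would instead fit a single model to $Y_t$, and Method 3 a joint vector model to all components; which of Methods 1 and 2 has the smaller mean square error depends on the model structure.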

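Next, let us consider the $m$-dimensional models related to both time and space, where we write the series as

$$Z_t = \begin{bmatrix} Z_{1,t} \\ Z_{2,t} \\ \vdots \\ Z_{m,t} \end{bmatrix} = \begin{bmatrix} \text{Time series for space 1} \\ \text{Time series for space 2} \\ \vdots \\ \text{Time series for space } m \end{bmatrix}. \qquad (6.1)$$

The commonly used model to describe (6.1) is the following space-time autoregressive moving average (STARMA($p, q$)) model,

$$Z_t = \sum_{k=1}^{p}\sum_{l=0}^{r_k} \phi_{k,l} W^{(l)} Z_{t-k} + a_t - \sum_{k=1}^{q}\sum_{l=0}^{\tau_k} \theta_{k,l} W^{(l)} a_{t-k},$$

where $W^{(l)} = [w_{i,j}^{(l)}]$ are the $m \times m$ spatial weight matrices with $\sum_{j=1}^{m} w_{i,j}^{(l)} = 1$,

$$w_{i,j}^{(l)} \begin{cases} \in (0, 1], & \text{if location } j \text{ is the } l\text{th order neighbor of } i, \\ = 0, & \text{otherwise}, \end{cases}$$

and $\phi_{k,l}$ and $\theta_{k,l}$ are the autoregressive and moving average parameters at time lag $k$ and space lag $l$, respectively. Here $p$ is the autoregressive order, $q$ is the moving average order, $r_k$ is the spatial order for the $k$th autoregressive term, and $\tau_k$ is the spatial order for the $k$th moving average term. The STARMA($p, q$) model becomes a space-time autoregressive (STAR($p$)) model when $q = 0$. It becomes a space-time moving average (STMA($q$)) model when $p = 0$.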
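
To make the role of the weight matrices concrete, the sketch below builds a row-normalized first-order weight matrix for $m$ locations arranged on a line and performs one step of a STAR(1) model with spatial order 1, $Z_t = \phi_{1,0} Z_{t-1} + \phi_{1,1} W^{(1)} Z_{t-1} + a_t$, taking $W^{(0)} = I$. The line layout, equal weighting of neighbors, function names, and parameter values are all assumptions made for illustration.

```python
import numpy as np

def first_order_weights_on_line(m):
    """Row-normalized first-order neighbor weights for m >= 2 locations on a line."""
    W = np.zeros((m, m))
    for i in range(m):
        for j in (i - 1, i + 1):         # immediate left and right neighbors
            if 0 <= j < m:
                W[i, j] = 1.0
    return W / W.sum(axis=1, keepdims=True)   # each row sums to 1

def star11_step(z_prev, a_t, phi10, phi11, W1):
    """One step of Z_t = phi10*Z_{t-1} + phi11*W1 @ Z_{t-1} + a_t."""
    z_prev = np.asarray(z_prev, dtype=float)
    return phi10 * z_prev + phi11 * (W1 @ z_prev) + np.asarray(a_t, dtype=float)

W1 = first_order_weights_on_line(4)
z_next = star11_step([1.0, 2.0, 3.0, 4.0], np.zeros(4), phi10=0.5, phi11=0.2, W1=W1)
print(W1)
print(z_next)
```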

For these STARMA($p, q$) models, it will be interesting to study the effect of temporal aggregation, the effect of contemporal aggregation, and more generally, the combined effects of both temporal and contemporal aggregation. Because of the time limitation of this presentation, we refer readers to Arbia et al. (2010), Giacomini and Granger (2004), and Hendry and Hubrich (2011), among others, on some of these issues.


References

  1. Amemiya T and Wu RY (1972). The effect of aggregation on prediction in the autoregressive model. Journal of the American Statistical Association, 67, 628-632.
  2. Ansley CF and Kohn R (1983). Exact likelihood of vector autoregressive-moving average process with missing or aggregated data. Biometrika, 70, 275-278.
  3. Arbia G, Bee M, and Espa G (2010). Aggregation of regional economic time series with different spatial correlation structures. Geographical Analysis, 43, 78-103.
  4. Brewer KW (1973). Some consequences of temporal aggregation and systematic sampling for ARMA and ARMAX models. Journal of Econometrics, 1, 133-154.
  5. Dickey D and Fuller WA (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74, 427-431.
  6. Giacomini R and Granger CWJ (2004). Aggregation of space-time processes. Journal of Econometrics, 118, 7-26.
  7. Hendry DF and Hubrich K (2011). Comment: The disaggregate forecasts or combining disaggregate information to forecast an aggregate. Journal of Business & Economic Statistics, 29, 216-227.
  8. Lütkepohl H (1987). Forecasting Aggregated Vector ARMA Processes, Berlin, Springer-Verlag.
  9. Marcellino M (1999). Some consequences of temporal aggregation in empirical analysis. Journal of Business and Economic Statistics, 17, 129-136.
  10. Sbrana G and Silvestrini A (2013). Aggregation of exponential smoothing process with an application to portfolio risk evaluation. Journal of Banking & Finance, 37, 1437-1450.
  11. Shellman SM (2004). Time series intervals and statistical inference: The effects of temporal aggregation on event data analysis. Political Analysis, 12, 97-104.
  12. Shen SY and Wei WWS (1995). A note on the representation of a vector ARMA model. Journal of Applied Statistical Science, 2, 311-318.
  13. Stram D and Wei WWS (1986). Temporal aggregation in the ARIMA process. Journal of Time Series Analysis, 7, 279-292.
  14. Teles P, Wei WWS, and Hodgess EM (2008). Testing a unit root based on aggregate time series. Communications in Statistics - Theory and Methods, 37, 565-590.
  15. Tiao GC (1972). Asymptotic behaviour of temporal aggregates of time series. Biometrika, 59, 525-531.
  16. Tiao GC and Box GEP (1981). Modelling multiple time series with applications. Journal of the American Statistical Association, 76, 802-816.
  17. Tiao GC and Tsay RS (1989). Model specification in multivariate time series. Journal of the Royal Statistical Society Series B, 51, 157-213.
  18. Tiao GC and Wei WWS (1976). Effect of temporal aggregation on the dynamic relationship of two time series variables. Biometrika, 63, 513-523.
  19. Wei WWS (1978). The effect of temporal aggregation on parameter estimation in distributed lag model. Journal of Econometrics, 8, 237-246.
  20. Wei WWS (1981). Effect of systematic sampling on ARIMA models. Communications in Statistics - Theory and Methods, 10, 2389-2398.
  21. Wei WWS and Abraham B (1981). Forecasting contemporal time series aggregates. Communications in Statistics - Theory and Methods, 10, 1335-1344.
  22. Wei WWS (1982). Comment: The effect of systematic sampling and temporal aggregation on causality A cautionary note. Journal of the American Statistical Association, 77, 316-319.
  23. Wei WWS and Stram D (1988). An eigenvalue approach to the limiting behavior of time series aggregates. Annals of the Institute of Statistical Mathematics, 40, 101-110.
  24. Weiss AA (1984). Systematic sampling and temporal aggregation in time series models. Journal of Econometrics, 26, 271-281.
  25. Yozgatligil C and Wei WWS (2009). Representation of multiplicative seasonal vector autoregressive moving average models. The American Statistician, 63, 328-334.