Common Feature Analysis of Economic Time Series: An Overview and Recent Developments
Communications for Statistical Applications and Methods 2015;22:415-434
Published online September 30, 2015
© 2015 Korean Statistical Society.

Marco Centoni^a and Gianluca Cubadda^{1,b}

^a LUMSA Università, Italy; ^b Dipartimento di Economia e Finanza, Università degli Studi di Roma Tor Vergata, Italy
Correspondence to: Gianluca Cubadda
Corresponding author: Gianluca Cubadda, Dipartimento di Economia e Finanza, Università degli Studi di Roma Tor Vergata, Via Columbia 2, 00133 Roma, Italy. E-mail: gianluca.cubadda@uniroma2.it
Received September 9, 2015; Revised September 12, 2015; Accepted September 12, 2015.

In this paper we overview the literature on common features analysis of economic time series. Starting from the seminal contributions by Engle and Kozicki (1993) and Vahid and Engle (1993), we present and discuss the various notions that have been proposed to detect and model common cyclical features in macroeconometrics. In particular, we analyze in detail the link between common cyclical features and the reduced-rank regression model. We also illustrate similarities and differences between the common features methodology and other popular types of multivariate time series modelling. Finally, we discuss some recent developments in this area, such as the implications of common features for univariate time series models and the analysis of common autocorrelation in medium-large dimensional systems.

Keywords : common features, common cycles, reduced-rank regression, canonical correlation analysis, vector autoregressive models, dynamic factor models, business cycles
1. Introduction

Economic time series could be characterized by several features such as trends, cycles, seasonality, serial correlation, and so on. When a set of series possesses the same type of feature, it could be the case that a linear combination of them does not necessarily possess the feature: this is the most interesting case, for which Engle and Kozicki (1993) provided the following definition: “A feature, which is present in each of a set of series, is said to be common to those series when there exists a nonzero linear combination of these series that does not have the feature”. A well known example of common features is cointegration (Engle and Granger, 1987; Johansen, 1988): a group of series that possesses stochastic trends is cointegrated when there are some linear combinations of the variables that are stationary, i.e. do not have stochastic trends. Nowadays, there is a huge collection of special cases of common features. A comprehensive, although still partial, list includes: codependence (Gourieroux and Peaucelle, 1988; Vahid and Engle, 1997) and the scalar component model (Tiao and Tsay, 1989), when a linear combination of variables possesses shorter memory than individual series; common serial correlation (Engle and Kozicki, 1993; Vahid and Engle, 1993), when a linear combination of serially correlated series is an innovation w.r.t. the past; cotrending (Chapman and Ogaki, 1993), when a linear combination of trend-stationary time series no longer displays a deterministic trend; common volatility (Engle and Kozicki, 1993; Engle and Susmel, 1993), when a linear combination of conditionally heteroskedastic time series eliminates conditional heteroskedasticity; seasonal cointegration (Hylleberg et al., 1990), when a linear combination of seasonally integrated series is nonseasonal; co-breaking (Hendry, 1999; Hendry and Massmann, 2007), when a set of series appears subject to structural breaks but a linear combination of them does not display the breaks; codependent cycles (Vahid and Engle, 1997), when a linear combination of a group of variables has shorter memory than the individual series; common nonlinearity (Anderson and Vahid, 1998), when the conditional expectation of each element of a vector time series is nonlinear w.r.t. the conditional vector but there exists a linear combination of them whose conditional expectation is linear w.r.t. the conditional vector; common seasonal cycles (Cubadda, 1999), when there exists a linear combination of seasonally differenced series which follows an MA process of low order; common panel structures (Hecq et al., 2000), when there is a linear combination of the variables in a panel data set which is white noise for all individuals of the panel; nonlinear cotrending (Bierens, 2000), when a linear combination of the components of a set of stationary time series around nonlinear deterministic time trends is stationary around a linear trend or a constant; polynomial common serial correlation (Cubadda and Hecq, 2001), when there exists a polynomial combination of serially correlated time series that is an innovation; long-run pure variance common feature (Engle and Marcucci, 2006), when the conditional variances of a collection of assets all depend upon a small number of variance factors; unpredictable polynomial combinations (Paruolo, 2006), when a polynomial linear combination of series integrated with different orders is an innovation; weak form of common serial correlation (Hecq et al., 2006), when a linear combination of serially correlated series adjusted for the equilibrium errors is an innovation; common periodic correlation (Haldrup et al., 2007), which extends the notion of common serial correlation to periodic autoregressive models. From a statistical point of view, common features imply a reduction to more parsimonious structures such as common factor representations (see, i.a., Cubadda, 2007), which can often be estimated by reduced-rank regression techniques (Anderson, 1984, 1999). Imposing the common feature restrictions on the statistical model generally leads to considerable gains in both forecasting and structural analysis (Vahid and Issler, 2002).

Common features among economic time series are often predicted by economic theory. For example, in King et al. (1988) the solution of their macro model implies that output, consumption and investment have a common trend and a common cycle. The common stochastic trend is generated by an integrated productivity shock, while the deviation of capital stock from its steady state value determines the transitional dynamics of output, consumption and investment. Another example is Campbell (1987), where the saving path implies that disposable income and consumption cointegrate (Issler and Vahid, 2001). Further, Vahid and Engle (1993) and Issler and Vahid (2001) show that in models of aggregate consumption, either the existence of “myopic” individuals (Campbell and Mankiw, 1990) or the excess sensitivity of consumption to current income (Flavin, 1993) implies that the growth rates of consumption and income share a common cycle. Note that in the consumption model by Hall (1978) consumption and income share only a common stochastic trend. As another example, in the real business cycle model for sectoral output à la Long and Plosser (1983), as in Engle and Issler (1995), common cycles depend on the propagation mechanisms through the restrictions on the production function, i.e. technological constraints.

The rest of the paper is organized as follows. In Section 2, after introducing the general notion of common features and linking it to the reduced-rank regression model, we focus on the various forms of common cyclical features and their implications in terms of common short-run components. We also illustrate similarities and differences between these approaches and other popular types of multivariate time series modelling. Section 3 takes into account the consequences of the presence of common features for the univariate representation of multiple time series. Section 4 deals with the estimation methods of the models implied by the various forms of common features, distinguishing between the cases of small systems and medium-large systems. Finally, Section 5 draws some conclusions.

2. Common Features in Economic Time Series

In this section we first present and discuss the general notion of common features that was originally proposed by Engle and Kozicki (1993). We stress the link between the presence of common features and the multivariate Reduced-Rank Regression model (RRR) that was introduced by Anderson (1951). A detailed survey of this modelling approach may be found in Reinsel and Velu (1998). Then we focus on the common autocorrelation feature and its interplay with the notion of common cycles in the multivariate Beveridge and Nelson (1981) decomposition. Starting from the seminal work by Vahid and Engle (1993), we illustrate the various forms of common cyclical features that have been proposed in the literature and their implications in terms of common unobserved components. Finally, we illustrate similarities and differences between the common serial correlation approach and other types of multivariate time series modelling that are popular in statistics and econometrics, such as the dynamic factor model (see, e.g., Stock and Watson (2011) and the references therein) and the multivariate autoregressive index model (Reinsel, 1983).

2.1. Common features and the reduced-rank regression model

Engle and Kozicki (1993) considered features that satisfy the following axioms:

  • The vector series Xt has (does not have) the feature if any non-singular linear transformation of Xt still has (does not have) such feature.

  • If two n-vector time series Y1t and Y2t do not have the feature then (Y1t + Y2t) does not have the feature.

  • If Yt does not have the feature but Xt has the feature then (Yt + Xt) has the feature.

Then, any dynamic property of the data could be viewed as a special case of a feature: for example, series with stochastic trends satisfy all axioms. As pointed out by Engle and Kozicki (1993), a linear combination of two series that both have the feature does not necessarily possess the feature. This is the most interesting case, which Engle and Kozicki formalized in the following definition:

Definition 1

A feature, which is present in each of a set of series, is said to be common to those series when there exists a nonzero linear combination of these series that does not have the feature.

From a statistical point of view, the existence of such linear combinations can be linked to a common factor representation. Indeed, let Yt be an n-vector time series such that

$$Y_t = B f_t + e_t, \qquad (2.1)$$

where B is an n × (n − s) loading matrix, the (n − s) common factors ft have the feature, and et does not have the feature. Consider an n-vector δ such that δ′B = 0; then δ′Yt = δ′et does not have the feature.

The main idea is that a small number of unobserved components possess a given feature and transmit it to a larger set of time series. It is then possible to combine such time series in order to cancel the influence of these unobserved components, thus removing the Common Feature (CF) from the data.
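As a minimal numerical sketch of this idea, assume the factor representation is linear, Y_t = B f_t + e_t, with an n × (n − s) loading matrix B. The dimensions, the random-walk factors and the helper name `left_null_space` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def left_null_space(B, tol=1e-10):
    """Return an orthonormal basis delta (n x s) such that delta' B = 0."""
    # Columns of U beyond rank(B) span the orthogonal complement of col(B).
    U, sing, _ = np.linalg.svd(B, full_matrices=True)
    rank = int(np.sum(sing > tol))
    return U[:, rank:]

rng = np.random.default_rng(0)
n, s, T = 4, 2, 500                       # hypothetical sizes
B = rng.standard_normal((n, n - s))       # loadings of the n - s common factors
f = np.cumsum(rng.standard_normal((T, n - s)), axis=0)  # factors carrying the feature
e = rng.standard_normal((T, n))           # component without the feature
Y = f @ B.T + e                           # rows are Y_t' = (B f_t + e_t)'

delta = left_null_space(B)                # cofeature vectors
combo = Y @ delta                         # rows are (delta' Y_t)'
```

By construction `combo` coincides with `e @ delta`, so the factors no longer enter the combined series.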

In order to illustrate the connection between CF’s and RRR, let us assume that Yt is an n-vector (weakly) stationary time series such that

$$Y_t = \beta' X_t + \Theta' Z_t + \varepsilon_t, \qquad (2.2)$$

where Xt and Zt are, respectively, vectors of k and m stationary time series, β ≠ 0 is a k × n coefficient matrix, Θ is an m × n coefficient matrix, and ɛt are i.i.d. innovations, with E(ɛt) = 0, E(ɛtɛt′) = Σɛɛ (positive definite) and finite fourth moments, that are independent of both Xt and Zt. Both Xt and Zt may contain (linear functions of) lags of Yt and, for the sake of simplicity, no deterministic terms are included.

Moreover, assume that:

  • Variables Xt possess the feature of interest whereas variables Zt do not.

  • There exists an n × s (s < min{n, k}) full-rank matrix δ such that δ′Yt do not possess the feature. Then, it is said that variables Yt have s CF’s.

In view of the above assumptions we get

$$\delta' \beta' = 0,$$

which is equivalent to

$$\beta' = \delta_\perp \psi',$$

where δ⊥ is an n × (n − s) full-rank matrix such that δ′δ⊥ = 0, and ψ is a k × (n − s) matrix. If n > k, the matrix β′ cannot have rank n, so there exist (n − k) “trivial” CF’s. Hence, we also assume that s < n − k, so the matrix ψ has full rank as well.

In view of Equations (2.2) and (2.3), it is clear that the existence of s CF’s is equivalent to postulating the following (partial) RRR model for series Yt

$$Y_t = \delta_\perp \psi' X_t + \Theta' Z_t + \varepsilon_t.$$
The cofeature matrix δ and the coefficient matrix ψ can be obtained as follows:

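  • Obtain the partial regression model

    $$y_t = \beta' x_t + \varepsilon_t,$$

    where $y_t = Y_t - \Sigma_{YZ}\Sigma_{ZZ}^{-1}Z_t$, $x_t = X_t - \Sigma_{XZ}\Sigma_{ZZ}^{-1}Z_t$, $\beta = \Sigma_{xx}^{-1}\Sigma_{xy}$, and

    $$\Sigma_{Ab} = E(A_t b_t')$$

    for two generic stationary time series At and bt.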

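  • Solve the following maximization problem

    $$\upsilon_1 = \arg\max_{\upsilon \in \mathbb{R}^n} \left\{ \frac{\upsilon' \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\upsilon}{\upsilon' \Sigma_{yy} \upsilon} \right\} \quad \text{s.t.} \quad \upsilon' \Sigma_{yy} \upsilon = 1,$$

    where $\upsilon' \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\upsilon = \upsilon' \beta' \Sigma_{xx} \beta \upsilon$.

    Then we get the solution $\upsilon_1 = \Sigma_{yy}^{-1/2}\nu_1$, where νj ( j = 1, . . . , n) is the eigenvector associated with the jth largest eigenvalue λj (λ1 ≥ λ2 ≥ · · · ≥ λn) of the symmetric, positive semi-definite matrix

    $$\Sigma_{yy}^{-1/2}\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1/2}.$$

    Note that R2(υ′yt | η′xt), where η = βυ, is maximized for υ = υ1 and $\eta = \eta_1 \equiv \Sigma_{xx}^{-1}\Sigma_{xy}\upsilon_1$ since

    $$R^2(\upsilon' y_t \mid \eta' x_t) = \frac{(\upsilon' \Sigma_{yx} \eta)^2}{(\upsilon' \Sigma_{yy} \upsilon)(\eta' \Sigma_{xx} \eta)} = \frac{\upsilon' \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\upsilon}{\upsilon' \Sigma_{yy} \upsilon}.$$

  • Solve the following maximization problem

    $$\upsilon_j = \arg\max_{\upsilon \in \mathbb{R}^n} \left\{ \frac{\upsilon' \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\upsilon}{\upsilon' \Sigma_{yy} \upsilon} \right\} \quad \text{s.t.} \quad \upsilon_j' \Sigma_{yy} \upsilon_j = 1, \quad \upsilon_i' \Sigma_{yy} \upsilon_j = 0,$$

    for i < j = 2, . . . , n.

    We get the solution $\upsilon_j = \Sigma_{yy}^{-1/2}\nu_j$. Note that $R^2(\upsilon_j' y_t \mid \eta_j' x_t) = \lambda_j$, where $\eta_j = \Sigma_{xx}^{-1}\Sigma_{xy}\upsilon_j$. Moreover, $R^2(\upsilon_i' y_t \mid \eta_j' x_t) = 0$ since

    $$\upsilon_i' \Sigma_{yx} \eta_j = \nu_i' \Sigma_{yy}^{-1/2}\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1/2} \nu_j = \lambda_j \nu_i' \nu_j = 0,$$

    and νi and νj are eigenvectors corresponding to different eigenvalues.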

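  • Since δ′β′ = 0, then λn−s+1 = · · · = λn = 0, and the cofeature matrix δ is obtained (up to an identification matrix) as follows

    $$\delta = [\upsilon_{n-s+1}, \ldots, \upsilon_n].$$

  • Since λ1 ≥ · · · ≥ λn−s > 0, the coefficient matrix ψ is obtained (up to an identification matrix) as follows

    $$\psi = [\eta_1, \ldots, \eta_{n-s}].$$

  • Finally, the loading matrix δ⊥ is given by the regression coefficients of yt on ψ′xt.

    It is worth remarking that the eigenvalue problem

    $$\Sigma_{yy}^{-1/2}\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1/2}\,\nu = \lambda \nu$$

    is equivalent to finding the roots of the equation:

    $$(\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy} - \lambda \Sigma_{yy})\,\upsilon = 0,$$

    with the normalization υ′Σyyυ = 1.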

One easily recognizes that the solution of the problem (2.5) coincides with the partial Canonical Correlation Analysis (CCA) between Yt and Xt conditional to Zt (see, e.g., Anderson, 1984).
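The steps above can be sketched numerically. The following Python code is only an illustration under stated assumptions: it works directly on (partialled-out) second-moment matrices, the function names `inv_sqrt` and `cofeature_cca` are hypothetical, and the example moments are built from an arbitrary reduced-rank coefficient matrix rather than from data.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def cofeature_cca(Syy, Syx, Sxx, s):
    """Eigen-solution of the CCA problem from the moments of (y_t, x_t)."""
    n = Syy.shape[0]
    R = inv_sqrt(Syy)
    Sxx_inv_Sxy = np.linalg.solve(Sxx, Syx.T)        # Sxx^{-1} Sxy
    M = R @ Syx @ Sxx_inv_Sxy @ R
    lam, nu = np.linalg.eigh((M + M.T) / 2)          # ascending eigenvalues
    lam, nu = lam[::-1], nu[:, ::-1]                 # reorder: largest first
    ups = R @ nu                                     # upsilon_j = Syy^{-1/2} nu_j
    eta = Sxx_inv_Sxy @ ups                          # eta_j = Sxx^{-1} Sxy upsilon_j
    delta = ups[:, n - s:]                           # s smallest eigenvalues
    psi = eta[:, :n - s]                             # n - s largest eigenvalues
    # Loadings: regression coefficients of y_t on psi' x_t.
    delta_perp = Syx @ psi @ np.linalg.inv(psi.T @ Sxx @ psi)
    return lam, delta, psi, delta_perp

# Hypothetical population moments with n = 3, k = 4 and s = 1 common feature.
rng = np.random.default_rng(0)
n, k, s = 3, 4, 1
beta_t = rng.standard_normal((n, n - s)) @ rng.standard_normal((n - s, k))  # beta'
G = rng.standard_normal((k, k))
Sxx = G @ G.T + k * np.eye(k)
Syx = beta_t @ Sxx
Syy = beta_t @ Sxx @ beta_t.T + np.eye(n)            # innovations with identity covariance

lam, delta, psi, delta_perp = cofeature_cca(Syy, Syx, Sxx, s)
```

In this constructed example the smallest eigenvalue is numerically zero, `delta.T @ beta_t` vanishes, and `delta_perp @ psi.T` reproduces the reduced-rank coefficient matrix.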

2.2. Common trends and common cyclical features

A very well known example of CF is cointegration (see, e.g., Johansen (1996) and the references therein), where linear combinations of series that have nonstationary stochastic trends are stationary.

Let us assume that the elements of an n-vector of time series Ut are integrated of order 1, denoted as Ut ~ I(1), and that they admit the following VAR(p) representation:

$$A(L) U_t = \varepsilon_t, \qquad (2.6)$$

where $A(L) = I_n - \sum_{i=1}^{p} A_i L^i$.

When variables Ut are cointegrated of order (1, 1), denoted as Ut ~ CI(1, 1), we can rewrite the model (2.6) in the Vector Error Correction Model (VECM) representation:

$$\Gamma(L) \Delta U_t = \alpha \gamma' U_{t-1} + \varepsilon_t, \qquad (2.7)$$

where $\Gamma(L) = I_n - \sum_{i=1}^{p-1} \Gamma_i L^i$, $\Gamma_i = -\sum_{j=i+1}^{p} A_j$, α and γ are full-rank n × r matrices (r = 1, . . . , n − 1) and γ′Ut ~ I(0).
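The mapping from VAR to VECM coefficients can be checked numerically. The sketch below assumes the usual convention that the error-correction term is Π U_{t−1} with Π = −A(1) = αγ′; the function name `var_to_vecm` and the bivariate VAR(2) coefficients are hypothetical.

```python
import numpy as np

def var_to_vecm(A):
    """Map VAR matrices [A_1, ..., A_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}])."""
    p, n = len(A), A[0].shape[0]
    Pi = -(np.eye(n) - sum(A))                      # Pi = -A(1)
    # A[i:] holds A_{i+1}, ..., A_p, so this is Gamma_i = -sum_{j=i+1}^p A_j.
    Gammas = [-sum(A[i:]) for i in range(1, p)]
    return Pi, Gammas

# Hypothetical VAR(2) coefficients.
A = [np.array([[0.6, 0.2], [0.1, 0.7]]),
     np.array([[0.2, -0.1], [0.0, 0.1]])]
Pi, Gammas = var_to_vecm(A)

# Both representations must imply the same innovation for any data triple.
rng = np.random.default_rng(1)
u0, u1, u2 = rng.standard_normal((3, 2))            # U_t, U_{t-1}, U_{t-2}
var_resid = u0 - A[0] @ u1 - A[1] @ u2
vecm_resid = (u0 - u1) - Pi @ u1 - Gammas[0] @ (u1 - u2)
```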

Since ΔUt is a stationary stochastic process, it admits the following Wold representation:

$$\Delta U_t = C(L) \varepsilon_t,$$

where $C(L) = I_n + \sum_{i=1}^{\infty} C_i L^i$ is such that $\sum_{j=1}^{\infty} j |C_j| < \infty$.

By expanding C(L) around 1 and integrating both sides of the above equation, we get the multivariate Beveridge-Nelson representation (BN; Beveridge and Nelson, 1981):

$$U_t = C(1) \sum_{i=1}^{t} \varepsilon_i + \tilde{C}(L) \varepsilon_t, \qquad (2.8)$$

where $\tilde{C}_i = -\sum_{j>i} C_j$ for all i.

Since we know from the Engle-Granger representation theorem (Engle and Granger, 1987; Johansen, 1996) that $C(1) = \gamma_\perp (\alpha_\perp' \Gamma(1) \gamma_\perp)^{-1} \alpha_\perp'$, series Ut share the following (n − r) common stochastic trends:

$$\tau_t = \alpha_\perp' \sum_{i=1}^{t} \varepsilon_i.$$

Note the analogy between (2.8) and the common factor representation (2.1); the last term in (2.8) does not have the feature, while τt has the feature, and the (n − r) common stochastic trends are removed by the cointegration vectors, since γ′C(1) = 0.
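A small numerical check of the long-run matrix may help here. The sketch assumes the standard Granger representation formula C(1) = γ⊥(α⊥′Γ(1)γ⊥)⁻¹α⊥′; the values of α, γ and Γ(1), as well as the helper names, are hypothetical.

```python
import numpy as np

def orth_complement(M, tol=1e-10):
    """Orthonormal basis of the orthogonal complement of col(M)."""
    U, sing, _ = np.linalg.svd(M, full_matrices=True)
    return U[:, int(np.sum(sing > tol)):]

def long_run_matrix(alpha, gamma, Gamma_at_1):
    """C(1) = gamma_perp (alpha_perp' Gamma(1) gamma_perp)^{-1} alpha_perp'."""
    a_perp, g_perp = orth_complement(alpha), orth_complement(gamma)
    middle = np.linalg.inv(a_perp.T @ Gamma_at_1 @ g_perp)
    return g_perp @ middle @ a_perp.T

# Hypothetical system with n = 3 series and r = 1 cointegrating vector.
alpha = np.array([[-0.3], [0.1], [0.0]])        # adjustment coefficients
gamma = np.array([[1.0], [-1.0], [0.5]])        # cointegrating vector
Gamma_at_1 = np.eye(3) - 0.2 * np.eye(3)        # Gamma(1) with Gamma_1 = 0.2 I_n
C1 = long_run_matrix(alpha, gamma, Gamma_at_1)
```

Here `gamma.T @ C1` is zero, so the cointegrating vector removes the n − r = 2 common trends, and `C1` has rank 2.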

However, detrended economic time series often display clear evidence of comovements (Lucas, 1977), which cannot be due to cointegration, thus suggesting the presence of common cycles. If this is the case, we expect that there exist linear combinations of cyclical series that are not cyclical. This Common Cyclical Feature (CCF) (Engle and Kozicki, 1993; Vahid and Engle, 1993) can be interpreted as a short-run equilibrium relationship, similarly to the interpretation of the cointegration relations as long-run equilibria.

From the multivariate BN decomposition (2.8), the stochastic cycles of series Ut are:

$$\kappa_t = \tilde{C}(L) \varepsilon_t,$$

where $\tilde{C}(L) = \sum_{i=0}^{\infty} \tilde{C}_i L^i$.

While the presence of cointegration implies reduced-rank restrictions on the VECM parameters that are responsible for the long-run behavior of series Ut, since γ′τt = 0, the analysis of CCF’s is instead concerned with reduced-rank restrictions on the short-run VECM parameters (the polynomial matrix Γ(L) and the adjustment matrix α) that have interesting implications on the cycles κt.

However, differently from cointegration, there is not a unique notion of common short-run components. Indeed, the degree of synchronicity of the common cycle also plays a role in the definitions. Alternative notions of CCF impose differing reduced-rank structures on the VAR. Let us briefly review the various forms of CCF starting with the seminal notion proposed by Engle and Kozicki (1993).

Series ΔUt have s (s < n) serial correlation common features (SCCF) iff there exists an n × s matrix δS with full column rank such that the VECM (2.7) can be rewritten into the following RRR model

$$\Delta U_t = \delta_{S\perp} \psi_S' [\Delta U_{t-1}', \ldots, \Delta U_{t-p+1}', U_{t-1}'\gamma]' + \varepsilon_t,$$

where δS⊥ is an n × (n − s) full-rank matrix such that δS′δS⊥ = 0, and ψS is an (np − n + r) × (n − s) matrix with full column rank (Engle and Kozicki, 1993). Since δS′ΔUt = δS′ɛt, there exist s linear combinations of series ΔUt that are unpredictable from the past, i.e. they are innovation processes. Since δS′κt = 0, there exist (n − s) common cycles in series Ut (Vahid and Engle, 1993) that are perfectly synchronized; as a result, the impulse response functions are exactly collinear. However, for technical reasons (e.g., seasonal adjustment) as well as economic reasons (e.g., adjustment costs, labor market rigidities), the hypothesis underlying SCCF is sometimes too strong, since SCCF is not able to detect the existence of non-contemporaneous cyclical comovements (Ericsson, 1993). Hence, some less restrictive variants of the SCCF have been introduced in the literature, which we review in turn.

Cubadda and Hecq (2001) propose the notion of polynomial serial correlation common features (PSCCF) as a measure of non-contemporaneous cyclical comovements. Non-synchronous common cycles arise, for example, in economic models of consumption with several types of consumer goods as, i.a., in Vahid and Engle (1997) and Schleicher (2007), where it is shown that the maximization problem of the representative agent implies an unsynchronized common cycle (codependent cycle) in the consumer goods vector.

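By definition, series ΔUt have s PSCCF’s iff there exists an n × s matrix δP with full column rank such that δP′Γ1 ≠ 0, and the VECM (2.7) can be rewritten into the following partial RRR model

$$\Delta U_t = \Gamma_1 \Delta U_{t-1} + \delta_{P\perp} \psi_P' [\Delta U_{t-2}', \ldots, \Delta U_{t-p+1}', U_{t-1}'\gamma]' + \varepsilon_t,$$

where δP⊥ is an n × (n − s) full-rank matrix such that δP′δP⊥ = 0, and ψP is an (np − 2n + r) × (n − s) matrix with full column rank (Cubadda and Hecq, 2001).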

In order to interpret the notion of PSCCF, Cubadda and Hecq (2001) show that there exists a first-order polynomial matrix $\delta(L) = \delta_P - \Gamma_1' \delta_P L$ such that

$$\delta(L)' \Delta U_t = \delta_P' \varepsilon_t.$$

Hence, PSCCF requires that there exists a first-order polynomial matrix δ(L) such that δ(L)′ΔUt is white noise. The presence of PSCCF has an interesting implication for the BN cycles of series Ut: indeed, since $\delta(L)' \kappa_t = -\delta_P' \Gamma_1 C(1) \varepsilon_t$, the same PSCCF relationships cancel the dependence from the past of both the first differences and cycles of series Ut.

In the above definitions of CCF, the number of SCCF’s or PSCCF’s, s, cannot exceed the number of common trends (n − r). In order to remove this restriction, Hecq et al. (2006) proposed the notion of weak form of SCCF (WF): series ΔUt have s WF’s iff there exists an n × s matrix δW with full column rank such that δW′α ≠ 0, and the VECM in (2.7) can be rewritten into the following partial RRR model

$$\Delta U_t = \alpha \gamma' U_{t-1} + \delta_{W\perp} \psi_W' [\Delta U_{t-1}', \ldots, \Delta U_{t-p+1}']' + \varepsilon_t,$$

where δW⊥ is an n × (n − s) full-rank matrix such that δW′δW⊥ = 0, and ψW is an (np − n) × (n − s) matrix with full column rank (Hecq et al., 2006).

In order to uncover interesting implications of the WF for the BN cycle, Cubadda (2007) shows that there exists a first-order polynomial matrix δW(L) ≡ δW − (γα′ + In)δWL such that

$$\delta_W(L)' U_t = \delta_W' \varepsilon_t.$$

As a consequence, since $\delta_W(L)' \kappa_t = \delta_W' (I_n - C(1)) \varepsilon_t$, the same WF relationships cancel the dependence from the past of both the cycles and linearly detrended levels of series Ut.

A limitation of the above methods for cyclical features analysis is that they cannot handle the possible coexistence of differing types of reduced-rank restrictions in the same vector. In order to overcome this limitation, Cubadda (2007) introduced the notion of weak form of PSCCF (WFP), which encompasses most of the existing formulations: series ΔUt have s WFP’s iff there exists an n × s matrix δF with full column rank such that δF′α ≠ 0, δF′Γ1 ≠ 0, and the VECM in (2.7) can be rewritten into the following partial RRR model

$$\Delta U_t = \alpha \gamma' U_{t-1} + \Gamma_1 \Delta U_{t-1} + \delta_{F\perp} \psi_F' [\Delta U_{t-2}', \ldots, \Delta U_{t-p+1}']' + \varepsilon_t,$$

where δF⊥ is an n × (n − s) full-rank matrix such that δF′δF⊥ = 0, and ψF is an (np − 2n) × (n − s) matrix with full column rank.

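The WFP requires the existence of a second-order polynomial matrix $\delta_F(L) \equiv \delta_F - (\gamma\alpha' + I_n + \Gamma_1')\delta_F L + \Gamma_1' \delta_F L^2$ such that

$$\delta_F(L)' U_t = \delta_F' \varepsilon_t.$$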
An important implication of the WFP is that the polynomial matrix δF(L) transforms the BN cycles κt into a process with shorter memory, since δF(L)′κt ~ VMA(1).
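The polynomial relation implied by the WFP can be verified by simulation. The following sketch assumes a VECM of order p = 3 in which only the coefficient on ΔU_{t−2} has the left null space spanned by δF, and it uses δF(L) = δF − (γα′ + In + Γ1′)δF L + Γ1′δF L²; all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 3, 60
alpha = np.array([[-0.2], [0.1], [0.0]])              # adjustment coefficients
gamma = np.array([[1.0], [-1.0], [0.0]])              # cointegrating vector
Gamma1 = 0.2 * np.eye(n)                              # unrestricted first lag
d_perp = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dF = np.array([[1.0], [1.0], [-1.0]])                 # dF' d_perp = 0
psi = 0.1 * np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
eps = rng.standard_normal((T, n))

U = np.zeros((T, n))
for t in range(3, T):
    dU1, dU2 = U[t - 1] - U[t - 2], U[t - 2] - U[t - 3]
    U[t] = (U[t - 1] + alpha @ gamma.T @ U[t - 1] + Gamma1 @ dU1
            + d_perp @ psi.T @ dU2 + eps[t])

# dF(L)' U_t = dF' U_t - dF'(alpha gamma' + I + Gamma1) U_{t-1} + dF' Gamma1 U_{t-2}
M = alpha @ gamma.T + np.eye(n) + Gamma1
lhs = (U[3:] - U[2:-1] @ M.T + U[1:-2] @ Gamma1.T) @ dF
rhs = eps[3:] @ dF                                     # dF' eps_t
```

The arrays `lhs` and `rhs` coincide, so the second-order polynomial combination of the levels is an innovation in this simulated system.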

The CCF analysis was also extended to the case of series having forms of non-stationarity other than I(1)-ness. In particular, Cubadda (1999, 2001) explored the presence of common cycles in seasonal time series that are also integrated at (a subset of) the seasonal frequencies, whereas Paruolo (2006) focused on the case of I(2) systems.

Franchi and Paruolo (2011) offered a comprehensive theoretical analysis of the conditions of existence of the various forms of CCF’s and of the characterization of the CCF relations in I(0), I(1) and I(2) systems.

2.3. Relations with alternative multivariate time series models

It is interesting to analyze similarities and differences of the CF approach with other popular multivariate time series models. For the sake of simplicity, we will refer within this subsection to the basic SCCF model, which can be formulated as

$$Y_t = \delta_\perp \psi' X_t + \varepsilon_t,$$

where $X_t = [Y_{t-1}', \ldots, Y_{t-p}']'$ and series Yt are assumed to be I(0).
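To make the SCCF property concrete, the sketch below simulates a stationary VAR(1) whose coefficient matrix is the product δ⊥ψ′ and checks that δ′Y_t equals δ′ε_t. The VAR(1) special case and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, T = 3, 1, 200
delta_perp = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # n x (n - s) loadings
delta = np.array([[1.0], [1.0], [-1.0]])                      # delta' delta_perp = 0
psi = 0.3 * np.array([[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]])   # n x (n - s)
A = delta_perp @ psi.T                                        # reduced-rank VAR(1) matrix

eps = rng.standard_normal((T, n))
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = A @ Y[t - 1] + eps[t]

combo = Y[1:] @ delta            # delta' Y_t
innov = eps[1:] @ delta          # delta' eps_t
```

The two arrays agree, so the combination δ′Y_t is unpredictable from the past even though each series is serially correlated.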

Ahn and Reinsel (1988) proposed a variant of the basic RRR model that is called the Nested Reduced-Rank AR model (NRRAR). The main assumption is that the VAR coefficient matrices have reduced ranks, which are nested in each other: Rank(Aj) ⊃ Rank(Aj + 1) for j = 1, . . . , p − 1. Then the NRRAR reads


The NRRAR is a very general statistical model since it is easy to see that both the SCCF and its polynomial extensions are particular cases of Equation (2.10). However, the interpretation of the NRRAR in terms of common short-run components is more involved.

A different modelling approach, which is also endowed with a reduced-rank structure, is the Multivariate Autoregressive Index model (MAI) as originally proposed by Reinsel (1983). The basic version of the MAI reads

$$Y_t = \sum_{j=1}^{p} \varphi_j \xi' Y_{t-j} + \varepsilon_t, \qquad (2.11)$$

where ξ is an n × q (q < n) full-rank matrix, and φj is an n × q matrix for j = 1, . . . , p. The linear combinations It = ξ′Yt are called the indexes.

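Notice that the regression coefficient matrix implied by the MAI has the following structure

$$\beta' = [\varphi_1 \xi', \ldots, \varphi_p \xi'],$$

which implies that each VAR coefficient matrix Aj = φjξ′ satisfies Ajξ⊥ = 0, where ξ⊥ is an n × (n − q) full-rank matrix such that ξ′ξ⊥ = 0. Hence, whereas SCCF imposes a common left null space on the VAR coefficient matrices, MAI imposes a common right null space on those matrices.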

Although both SCCF and MAI have a reduced-rank structure, the mathematical properties of these two modelling approaches are only partially similar. Indeed, although the interpretation of the canonical variates ψ′Xt and the index lags [ξ′Yt−1, . . . , ξ′Yt−p] is analogous, since both of them represent the relevant predictors of series Yt, the same statement cannot be applied to the linear combinations δ′Yt and ξ⊥′Yt. The former are in fact innovations w.r.t. the past, whereas the latter are not, as one can easily observe by premultiplying both sides of Equation (2.11) by the matrix ξ⊥′.

A specific property of the MAI is that the indexes themselves follow a VAR(p) process. Indeed, if we premultiply both sides of Equation (2.11) by the matrix ξ′ we get

$$\xi' Y_t = \sum_{j=1}^{p} \xi' \varphi_j \xi' Y_{t-j} + \xi' \varepsilon_t, \qquad (2.12)$$
whereas linear combinations of series generated by an unrestricted VAR(p) model generally follow a VARMA process, see e.g. Cubadda et al. (2009) and the references therein. It is easy to see that a similar implication does not hold for the canonical variates δYt.
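The VAR property of the indexes can be illustrated by simulation. The sketch assumes an MAI with p = 1, Y_t = φξ′Y_{t−1} + ε_t, and hypothetical values for ξ and φ; it then checks that I_t = ξ′Y_t obeys I_t = (ξ′φ)I_{t−1} + ξ′ε_t.

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, T = 4, 2, 300
xi = 0.5 * np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])    # n x q
phi = 0.3 * np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.5, -0.5]])   # n x q
A = phi @ xi.T                               # VAR(1) matrix implied by the MAI

eps = rng.standard_normal((T, n))
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = A @ Y[t - 1] + eps[t]

index = Y @ xi                               # rows are I_t' = (xi' Y_t)'
index_ar = xi.T @ phi                        # q x q autoregressive matrix of the indexes
predicted = index[:-1] @ index_ar.T + eps[1:] @ xi
```

Since `predicted` reproduces `index[1:]`, the q indexes form a closed VAR(1) system driven by the innovations ξ′ε_t.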

A different approach that gained large popularity is the Dynamic Factor Model (DFM), see e.g. Stock and Watson (2011) and the references therein. As shown by Stock and Watson (2005), the DFM can be represented in the VAR form as follows


where D(L) is a diagonal finite-order n × n polynomial matrix, Λ is an n × q matrix, Φj is an n × q matrix for j = 1, . . . , p, Ft is a q-vector of unobserved factors, and (ɛYt, ɛFt) are innovations w.r.t. the past of both Yt and Ft.

A key difference between the DFM and the previously considered approaches is that the former requires for inferential purposes that the number of series n is large compared to the sample size T. Indeed, as shown by Bai and Ng (2006), if the factors Ft are obtained through principal component methods, the estimated factors can be treated as observed for statistical inference on Equation (2.13) provided that n → ∞ and $T^{1/2}/n \to 0$ as T → ∞.

Apart from the different asymptotic frameworks, the DFM and the MAI have some degree of similarity in their mathematical formulations. Indeed, Equation (2.12) is entirely analogous to Equation (2.14), and Equation (2.13) can be seen as generalization of Equation (2.11) such that each series is endowed with an individual AR structure.

It is less obvious how to relate the DFM to the SCCF. One may notice that in the particular case that D(L) = In, series Yt in Equation (2.13) have indeed the SCCF with a matrix δ that is equal to Λ. However, lags of the individual series are typically included in empirical applications of the DFM.

3. Common Features and Univariate Time Series Models

It is well known that each series generated by a VAR process admits a univariate ARIMA representation, see e.g. Zellner and Palm (1974). However, the VAR models that are typically used in macroeconomic analysis would imply highly non parsimonious ARIMA models for individual time-series, whereas low order ARIMA models are empirically appropriate. This is the so-called “autoregressivity paradox”. Cubadda et al. (2008, 2009) argued that the presence of common cyclical features can provide a solution to this paradox.

Indeed, let us assume that the n series of interest are generated by the following VAR(p) model

$$A(L) Y_t = \varepsilon_t.$$

The so-called Final Equations (FEs) of series Yt can be obtained by premultiplying both sides of the VAR equation by $A(L)^{adj}$, the adjoint matrix associated with A(L):

$$\det[A(L)]\, Y_t = A(L)^{adj} \varepsilon_t,$$

where det[A(L)] is the determinant of the polynomial matrix A(L).

Since det[A(L)] is a polynomial of order np and $A(L)^{adj}$ is a matrix polynomial of order (n − 1)p, it follows that the elements of Yt should admit a univariate ARMA(np, (n − 1)p) representation. Hence, a typical VAR model with n = 5 and p = 4 would imply that each individual variable follows an ARMA(20, 16) model, which is at odds with the empirical evidence.

Following Cubadda et al. (2009), let us assume that n = 3 series are generated by the following VAR(1):


where Yt = (y1t, y2t, y3t).

For the above VAR, the FEs are:


such that individual series follow ARMA(3, 2) models.

However, if ω = 0, the VAR has reduced-rank structure


which produces the FEs:


This implies that the univariate representations are parsimonious ARMA(1, 1) models with the same autoregressive parameter and cross-correlated VMA errors having a factor structure.
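The reduction of the autoregressive order can be checked numerically for a VAR(1). The sketch below computes the degree in L of det(In − AL) for a full-rank and for a rank-one coefficient matrix; both matrices are hypothetical and are not the parametrization used by Cubadda et al. (2009).

```python
import numpy as np

def generic_final_equation_orders(n, p):
    """Maximal ARMA orders (np, (n - 1)p) implied by an unrestricted VAR(p)."""
    return n * p, (n - 1) * p

def ar_order_var1(A, tol=1e-10):
    """Degree in L of det(I_n - A L) for a VAR(1) with coefficient matrix A."""
    # np.poly(A) gives det(xI - A) = x^n + c_1 x^{n-1} + ... + c_n, so that
    # det(I - A L) = 1 + c_1 L + ... + c_n L^n and entry k multiplies L^k.
    coeffs = np.poly(A)
    return int(np.flatnonzero(np.abs(coeffs) > tol).max())

A_full = np.array([[0.5, 0.1, 0.0],
                   [0.2, 0.4, 0.1],
                   [0.0, 0.3, 0.3]])               # hypothetical full-rank VAR(1)
a = np.array([[1.0], [0.8], [0.6]])
b = np.array([[0.3], [0.2], [0.1]])
A_rank1 = a @ b.T                                  # hypothetical rank-one VAR(1)

order_full = ar_order_var1(A_full)
order_rank1 = ar_order_var1(A_rank1)
```

With these values the full-rank system yields an autoregressive order of 3, in line with the generic bound np, whereas the rank-one system yields order 1.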

More generally, Table 1 summarizes the reduction of the individual ARMA orders due to common features restrictions.

As one can see, the existence of CCF’s provides a possible, economically meaningful, solution of the autoregressivity paradox. Notice that Table 1 provides the maxima ARIMA orders under CCF’s restrictions. However, the orders can be even smaller due to additional restrictions on the VAR parameters, such as block-diagonal or block-triangular structures.

The presence of short-run comovements also has consequences for the VMA part of the FEs. Cubadda et al. (2009) show that in a stationary VAR(p), the existence of s SCCF relationships implies that in the FEs the VMA coefficient matrices associated with degrees strictly larger than (n − s − 1)p have a common right null space that is spanned by δ. Hence, it is possible to reduce the order of the VMA component to a degree of at most (n − s − 1)p instead of (n − s)p. In particular, when n − 1 = s, the FEs follow a model that is popular in the macro-panel literature: a homogeneous AR component and cross-correlated VMA errors having a factor structure.

Cubadda et al. (2009) illustrate this point with an empirical example. They consider the industrial production indexes of Canada and the US. Since no cointegration in log levels is found, a VAR(1) in first differences seems appropriate. The estimation by OLS delivers (standard errors in brackets)

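$$\begin{bmatrix} \widehat{\Delta \ln US}_t \\ \widehat{\Delta \ln CA}_t \end{bmatrix} = \begin{bmatrix} \underset{(0.088)}{0.333} & \underset{(0.079)}{0.273} \\ \underset{(0.102)}{0.265} & \underset{(0.092)}{0.360} \end{bmatrix} \begin{bmatrix} \Delta \ln US_{t-1} \\ \Delta \ln CA_{t-1} \end{bmatrix}.$$

The theoretical FE orders imply ARMA(2, 1) processes. However, the SCCF test statistic is in favor of s = 1 (p-value = 0.31), with the estimated SCCF relationship (Δ ln USt − 1.05 Δ ln CAt). Statistical identification of the univariate models provides the following ARIMA(1,1,0) structures

$$\widehat{\Delta \ln US}_t = \underset{(0.001)}{0.003} + \underset{(0.062)}{0.554}\, \Delta \ln US_{t-1}, \qquad \widehat{\Delta \ln CA}_t = \underset{(0.001)}{0.004} + \underset{(0.064)}{0.533}\, \Delta \ln CA_{t-1}.$$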

As expected, the AR coefficients are very similar. Moreover, since the estimated cofeature vector δ̂ ′ is close to (1 : −1), and the VAR residuals have similar variances and a correlation around 0.65, the factor structure of the VMA part of the FEs may explain why the MA(1) components are empirically negligible.
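As a rough plausibility check, one can inspect the eigenvalues of the estimated VAR(1) coefficient matrix reported above. The sketch assumes the four point estimates are 0.333, 0.273, 0.265 and 0.360; it is not the test procedure used in the paper, only an informal look at how close the matrix is to reduced rank.

```python
import numpy as np

# Point estimates of the VAR(1) for the US and Canadian production growth rates.
A_hat = np.array([[0.333, 0.273],
                  [0.265, 0.360]])

# Moduli of the eigenvalues, largest first.
eigvals = np.sort(np.abs(np.linalg.eigvals(A_hat)))[::-1]
ratio = eigvals[1] / eigvals[0]
```

The second eigenvalue is an order of magnitude smaller than the first, which is at least consistent with a coefficient matrix of nearly reduced rank and hence with one SCCF relationship.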

4. Statistical Analysis

In this section we first illustrate how to estimate the RRR models implied by the various forms of CCF’s when the system dimension is small. In this case, Maximum Likelihood (ML) estimation under the Gaussianity assumption is generally employed. However, ML methods break down when the number of regressors becomes large compared to the typical sample size in macroeconomic datasets (100 ≤ T ≤ 200). The problem is due to the need to invert large covariance matrices. Hence, we will review some recent proposals for estimating VAR models with reduced-rank restrictions when the number of series n exceeds 20.

4.1. Maximum likelihood inference of small-scale multivariate models

Under the assumptions that series Yt, Xt and Zt are I(0) and that the innovations ɛt are Gaussian, ML inference on model (2.2) is provided by partial CCA of series Yt and Xt conditional on Zt, denoted by CanCor{Yt, Xt | Zt}, see, i.a., Anderson (1999, 2002).

In particular, the LR test statistic on the existence of s CF’s is:

$$LR(s) = -T \sum_{i=n-s+1}^{n} \ln(1 - \hat{\lambda}_i), \qquad (4.1)$$

where λ̂i is the ith largest eigenvalue of the sample matrix $\hat{\Sigma}_{yy}^{-1/2}\hat{\Sigma}_{yx}\hat{\Sigma}_{xx}^{-1}\hat{\Sigma}_{xy}\hat{\Sigma}_{yy}^{-1/2}$. Under the null hypothesis that s CF’s are present, the test statistic (4.1) is asymptotically distributed as

$$LR(s) \xrightarrow{d} \chi^2(s(k - n + s)). \qquad (4.2)$$
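The ML estimator of the cofeature matrix δ is obtained (up to an identification matrix) as follows

$$\hat{\delta} = [\hat{\upsilon}_{n-s+1}, \ldots, \hat{\upsilon}_n],$$

where $\hat{\upsilon}_j = \hat{\Sigma}_{yy}^{-1/2}\hat{\nu}_j$ and ν̂j ( j = 1, . . . , n) is the eigenvector associated with the jth largest sample eigenvalue λ̂j.

The ML estimator of the coefficient matrix ψ is obtained (up to an identification matrix) as follows

$$\hat{\psi} = [\hat{\eta}_1, \ldots, \hat{\eta}_{n-s}],$$

where $\hat{\eta}_j \equiv \hat{\Sigma}_{xx}^{-1}\hat{\Sigma}_{xy}\hat{\upsilon}_j$.

The ML estimator of the loading matrix δ⊥ is finally obtained by the regression coefficients of yt on ψ̂′xt.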

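The degrees of freedom of the test statistic (4.1) are obtained as follows. Under the alternative hypothesis, the matrix β is composed of n × k elements. Under the null hypothesis, we have that

$$\beta' = \delta_\perp \psi' = \begin{bmatrix} \delta_{\perp,1} \\ \delta_{\perp,2} \end{bmatrix} \psi',$$

where δ⊥,1 is an (n − s) × (n − s) matrix and δ⊥,2 is an s × (n − s) matrix. Without loss of generality, we assume that δ⊥,1 has full rank. Then we can write

$$\beta' = \tilde{\delta}_\perp \tilde{\psi}',$$

where $\tilde{\delta}_\perp = (I_{n-s}, \tilde{\delta}_{\perp,2}')'$, with $\tilde{\delta}_{\perp,2} = \delta_{\perp,2}\delta_{\perp,1}^{-1}$ and $\tilde{\psi} = \psi \delta_{\perp,1}'$. Hence, the matrix β is composed of (n − s) × (s + k) elements. It follows that under the null hypothesis s(k − n + s) restrictions are imposed.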
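The test statistic and the count of restrictions are simple enough to compute directly. The sketch below assumes the LR statistic takes the form −T Σ ln(1 − λ̂i) over the s smallest squared canonical correlations; the function names and the eigenvalues in the example are hypothetical.

```python
import math

def lr_statistic(eigenvalues, T, s):
    """LR(s) = -T * sum of ln(1 - lambda_i) over the s smallest eigenvalues."""
    smallest = sorted(eigenvalues)[:s]
    return -T * sum(math.log(1.0 - lam) for lam in smallest)

def degrees_of_freedom(n, k, s):
    """Restrictions under the null: n*k free elements minus (n - s)*(s + k)."""
    return n * k - (n - s) * (s + k)

# Hypothetical squared canonical correlations for n = 3, k = 6, T = 150.
eigs = [0.60, 0.30, 0.01]
stat = lr_statistic(eigs, T=150, s=1)
dof = degrees_of_freedom(n=3, k=6, s=1)
```

The difference in the number of free elements reduces algebraically to s(k − n + s), which is the count of restrictions stated in the text.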

In order to conduct inference on the various forms of CCF’s, a two-step procedure is usually employed (see, i.a., Vahid and Engle (1993), Ahn (1997), Cubadda and Hecq (2001), Hecq et al. (2006), and Cubadda (2007)). First, the cointegrating vectors are estimated by ML on the VECM without imposing any CCF restrictions. This can be done by using the procedures suggested in Johansen (1988) or Ahn and Reinsel (1990). Second, the cointegration matrix γ is fixed to its ML estimate, and inference on CCF’s is carried out. Since the ML estimator of the cointegration matrix γ converges at rate T irrespective of the existence of constraints in the short-run polynomial matrix Γ(L) (see, e.g., Johansen (1996)), the asymptotic distribution of the test statistic (4.1) remains the same as in (4.2).

Hence, ML inference on the various forms of common features is obtained by solving CanCor{ΔUt, Xt | Zt} for proper choices of the variables Xt and Zt as detailed in Table 2.

For each of the models in Table 2, under the null that s common features of a given form exist, the test statistic (4.1) is asymptotically distributed as a χ2(d) as detailed in Table 3 (see, i.a., Velu et al. (1986), Anderson (2002)).

Moreover, optimal estimates of both the common features vectors and (partial) RRR coefficients are then obtained as described in Table 4.

Finally, the remaining parameters of the various RRR models are estimated by OLS after fixing the matrices ψ’s to their estimated values.

Notice that the two-step procedures illustrated above do not maximize the Gaussian likelihood, although the estimated parameters have the same asymptotic distribution as the optimal ones. Centoni et al. (2007) suggested an iterative algorithm for computing the ML estimates. The main idea is to switch between the estimation of the cointegration matrix γ having fixed the short-run parameters and the estimation of the short-run parameters having fixed the cointegration matrix.

The merits of imposing CCF restrictions in VAR models have been investigated in several Monte Carlo studies. Vahid and Issler (2002) documented the advantages of simultaneously choosing the order p of the VAR model and the number s of SCCF restrictions for both forecasting and structural analysis. Athanasopoulos et al. (2011) extended the analysis to cointegrated systems. They proposed to choose the cointegration rank r simultaneously with p and s. Using the switching algorithm in Centoni et al. (2007) for estimation and a novel criterion for model identification, they find significant gains in forecasting accuracy both in simulations and in empirical applications.

4.2. Methods for medium-large dimensional time series

The ML approach discussed so far is typically applied to small-scale multivariate models, i.e. in situations where the number of series n is not larger than five and the sample size T rarely exceeds 200. However, it is of obvious interest to apply the common features framework to medium-large VAR models, i.e. when n ranges from 10 to 40. Indeed, Koop (2013), i.a., shows that medium-large VAR’s outperform small-scale models in forecasting the key US macroeconomic variables. However, when n is large, ML inference suffers from the curse of dimensionality. Indeed, Cubadda and Hecq (2011) show by simulations that in a VAR(1) with n = 25 variables the LR test exhibits severe size distortions even with a sample size as large as T = 600.

The main reason why ML inference performs poorly in high-dimensional systems is that the inversion of large variance-covariance matrices is required. Hence, most of the methods that have been proposed to use RRR with large n try to “regularize” these covariance matrices in different ways.

The first attempt in this direction dates back to Vinod (1976), who proposed the canonical ridge model. When applied to SCCF modelling, this approach would require replacing the eigenvalues and eigenvectors that are used for CCA with those of the matrix

$$\big(\hat\Sigma_{yy}+\kappa_y I_n\big)^{-1/2}\,\hat\Sigma_{yx}\,\big(\hat\Sigma_{xx}+\kappa_x I_{np}\big)^{-1}\,\hat\Sigma_{xy}\,\big(\hat\Sigma_{yy}+\kappa_y I_n\big)^{-1/2}, \qquad (4.3)$$


where κy and κx are two positive scalars, Σ̂yy and Σ̂xx are respectively the sample variance-covariance matrices of series Yt and Xt in Equation (2.9), and Σ̂xy is the matrix of the covariances between the elements of Yt and Xt. The matrices that appear in parentheses in Equation (4.3) are invertible even when T < np, because the positive quantities κy and κx are respectively added to all the eigenvalues of Σ̂yy and Σ̂xx. Vinod (1976) proposed various criteria to choose the values for κy and κx in empirical applications. Notwithstanding its potential interest for high-dimensional time series analysis, to the best of our knowledge the canonical ridge approach has never been applied to either SCCF or other multivariate dynamic models.
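A minimal sketch of the canonical ridge idea follows. It assumes that the regularized matrix has the same symmetric form as the CCA matrix, with κy and κx added to the diagonals of Σ̂yy and Σ̂xx. The function name `canonical_ridge_eigen`, the values κy = κx = 0.1 and the random data are illustrative assumptions only, and no criterion for choosing κy and κx is implemented.

```python
import numpy as np

def canonical_ridge_eigen(Y, X, kappa_y, kappa_x):
    """Eigenvalues and eigenvectors of the ridge-regularized CCA matrix."""
    T, n = Y.shape
    k = X.shape[1]
    Y = Y - Y.mean(axis=0)
    X = X - X.mean(axis=0)
    Syy, Sxx, Syx = Y.T @ Y / T, X.T @ X / T, Y.T @ X / T
    Ry = Syy + kappa_y * np.eye(n)   # kappa_y is added to every eigenvalue of Syy
    Rx = Sxx + kappa_x * np.eye(k)   # kappa_x is added to every eigenvalue of Sxx
    w, V = np.linalg.eigh(Ry)
    Ry_mh = (V / np.sqrt(w)) @ V.T   # symmetric inverse square root of Ry
    M = Ry_mh @ Syx @ np.linalg.solve(Rx, Syx.T) @ Ry_mh
    lam, vec = np.linalg.eigh((M + M.T) / 2)
    return lam[::-1], vec[:, ::-1]   # sorted from largest to smallest

# Hypothetical setting with fewer observations than regressors (T < k),
# where the sample covariance matrix of X is singular.
rng = np.random.default_rng(1)
T, n, k = 20, 3, 30
Y = rng.standard_normal((T, n))
X = rng.standard_normal((T, k))
Sxx_rank = np.linalg.matrix_rank(np.cov(X, rowvar=False))
lam, vec = canonical_ridge_eigen(Y, X, kappa_y=0.1, kappa_x=0.1)
```

Here `Sxx_rank` is smaller than k, so plain CCA is not feasible, while the regularized eigenvalues remain well defined and strictly below one.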

Cubadda and Hecq (2011) suggested a Partial Least Squares (PLS) approach to test and impose the SCCF restriction on a VAR even when CCA is not feasible due to a lack of degrees of freedom. PLS is a family of multivariate techniques that aim at maximizing the covariance between linear combinations of two variable sets; see, e.g., Rosipal and Krämer (2006) for a detailed survey. The idea is to consistently estimate the SCCF matrix δ as the eigenvectors associated with the s smallest eigenvalues of the matrix

$$D_{yy}^{-1}\,\Sigma_{yx}\,D_{xx}^{-1}\,\Sigma_{xy},$$


where Dyy and Dxx are diagonal matrices having the diagonal elements of, respectively, Σyy and Σxx. Since PLS requires the inversion of diagonal matrices only, this method can provide estimates of δ (up to an identification matrix) that are less dispersed and more numerically stable than those coming from CCA when the dimension of Xt approaches the sample size T.
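The PLS-type estimator of δ can be illustrated with the sketch below. It assumes that the relevant matrix is Dyy⁻¹ΣyxDxx⁻¹Σxy, which is our reading by analogy with the companion matrix given for ψ, and it computes the eigenvectors through the similar symmetric matrix Dyy^(−1/2)ΣyxDxx⁻¹ΣxyDyy^(−1/2). The function name `pls_cofeatures` and the simulated VAR(1) are hypothetical.

```python
import numpy as np

def pls_cofeatures(Y, X, s):
    """Eigenvectors for the s smallest eigenvalues of Dyy^{-1} Syx Dxx^{-1} Sxy."""
    T = Y.shape[0]
    Y = Y - Y.mean(axis=0)
    X = X - X.mean(axis=0)
    Syx = Y.T @ X / T
    dyy = (Y ** 2).mean(axis=0)        # diagonal of the sample Syy
    dxx = (X ** 2).mean(axis=0)        # diagonal of the sample Sxx
    G = Syx / np.sqrt(dxx)             # Syx Dxx^{-1/2}
    G = G / np.sqrt(dyy)[:, None]      # Dyy^{-1/2} Syx Dxx^{-1/2}
    Ms = G @ G.T                       # symmetric, similar to the target matrix
    lam, U = np.linalg.eigh(Ms)        # ascending order
    # If Ms u = lam u, then Dyy^{-1/2} u is an eigenvector of the target matrix.
    delta_hat = U[:, :s] / np.sqrt(dyy)[:, None]
    return lam[:s], delta_hat

# Hypothetical VAR(1) with one cofeature vector proportional to (1, -1, 0)'.
rng = np.random.default_rng(3)
A = np.array([[0.5, 0.0, 0.2],
              [0.5, 0.0, 0.2],
              [0.1, 0.0, 0.4]])
T = 5000
Yfull = np.zeros((T + 1, 3))
for t in range(1, T + 1):
    Yfull[t] = A @ Yfull[t - 1] + rng.standard_normal(3)
small_lam, delta_pls = pls_cofeatures(Yfull[1:], Yfull[:-1], s=1)
```

Only element-wise divisions by the diagonal variances are needed, so the computation remains feasible when the number of columns of X approaches or exceeds T.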

In order to consistently estimate the factor weights ψ, it is necessary to make the additional assumption that the columns of the matrix ψS are equal to (n − s) distinct eigenvectors of the matrix $D_{xx}^{-1}\Sigma_{xx}$. Then we get


where Vψ is the diagonal matrix of the (n − s) eigenvalues of the matrix $D_{xx}^{-1}\Sigma_{xx}$ that are associated with the eigenvectors ψS, which implies that the matrix ψS lies in the space generated by the eigenvectors associated with the positive eigenvalues of the matrix $D_{xx}^{-1}\Sigma_{xy}D_{yy}^{-1}\Sigma_{yx}$.

In order to detect the presence of SCCF in large dimensional systems, Cubadda and Hecq (2011) proposed to replace the condition that a linear combination of variables must be orthogonal to the past, namely

$$E\big(\delta' Y_t X_{t-1}'\big) = 0$$

with the one of absence of autocorrelation, namely

$$E\big(\delta_i' Y_t\,[\,Y_{t-1}'\delta_i, \ldots, Y_{t-p}'\delta_i\,]\big) = 0, \qquad i = 1, 2, \ldots, s, \qquad (4.4)$$

where δ = [δ1, . . . , δs]. This drastically decreases the number of restrictions to be imposed under the null hypothesis, thus making a test for (an implication of) SCCF feasible even when CCA is not. Condition (4.4) can be verified by means of univariate tests for no serial correlation of each δiΔYt having fixed δ to its PLS estimate.
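The check of the no-autocorrelation condition can be sketched as follows. The text only speaks of univariate tests for no serial correlation; the Ljung-Box statistic used here is our choice for illustration, and the function names `ljung_box` and `sccf_autocorrelation_stats`, as well as the simulated series, are hypothetical.

```python
import numpy as np

def ljung_box(u, p):
    """Ljung-Box Q statistic based on the first p sample autocorrelations."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    T = u.size
    denom = u @ u
    q = 0.0
    for h in range(1, p + 1):
        r_h = (u[h:] @ u[:-h]) / denom   # lag-h sample autocorrelation
        q += r_h ** 2 / (T - h)
    return T * (T + 2) * q

def sccf_autocorrelation_stats(Y, delta_hat, p):
    """One Q statistic per candidate combination delta_i' Y_t (delta held fixed)."""
    combos = Y @ delta_hat               # shape (T, s)
    return [ljung_box(combos[:, i], p) for i in range(combos.shape[1])]

# Hypothetical illustration: white-noise combinations versus an AR(1) series.
rng = np.random.default_rng(2)
T = 500
noise = rng.standard_normal((T, 3))
delta_demo = np.eye(3)[:, :2]            # two trivial "cofeature" vectors
q_white = sccf_autocorrelation_stats(noise, delta_demo, p=4)
ar = np.zeros(T)
for t in range(1, T):
    ar[t] = 0.8 * ar[t - 1] + rng.standard_normal()
q_ar = ljung_box(ar, p=4)
```

Each statistic would be compared with a χ²(p) critical value. Controlling the overall size across the s combinations and across different values of s is a separate step that this sketch does not address.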

Having controlled for the overall size of the test when different values of s are involved, Cubadda and Hecq (2011) show by simulations that the proposed testing procedure has negligible size distortions and remarkable power with n = 9, 25 and a sample size as small as T = 50.

Carriero et al. (2011) proposed a Bayesian approach to estimate large VAR’s with reduced-rank restrictions. In particular, the Reduced Rank Posterior (RRP) method goes as follows. First, estimate an unrestricted VAR by implementing the Kadiyala and Karlsson (1997) version of the Minnesota prior (see, i.a., Litterman, 1986), where the estimates of the elements of β in (2.9) are shrunk towards zero. This approach is computationally convenient since the posterior mean of β can be obtained by OLS on a system augmented with proper sets of dummy variables. Second, compute the singular value decomposition of the posterior mean of β and obtain a reduced-rank approximation of it by retaining the (n − s) largest singular values and the associated vectors only.
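The second step of the RRP method, i.e. the rank truncation of a coefficient matrix through its singular value decomposition, can be sketched as follows. The first step (the Minnesota-prior posterior mean) is not implemented: a random matrix stands in for the posterior mean of β, and the function name `reduced_rank_approx` is our own.

```python
import numpy as np

def reduced_rank_approx(B, rank):
    """Best rank-`rank` approximation of B in the Frobenius norm (truncated SVD)."""
    U, sv, Vt = np.linalg.svd(B, full_matrices=False)
    B_rr = (U[:, :rank] * sv[:rank]) @ Vt[:rank]
    return B_rr, sv

# Hypothetical stand-in for the posterior mean of beta: k = 6 regressors, n = 4 series.
rng = np.random.default_rng(4)
B_post = rng.standard_normal((6, 4))
n, s = 4, 2
B_rr, sv = reduced_rank_approx(B_post, rank=n - s)
approx_error = np.linalg.norm(B_post - B_rr)   # Frobenius norm of the discarded part
```

By the Eckart-Young theorem, `approx_error` equals the square root of the sum of the squared discarded singular values.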

Carriero et al. (2011) compare the RRP with several alternatives, including a more elaborate Bayesian Reduced-Rank VAR (BRR) method, in a forecasting exercise where the reference model is a VAR(1) for n = 52 variables and a rolling window of 120 monthly observations is used for estimation. The empirical results show that BRR and RRP outperform the competitors for intermediate and long forecasting horizons, whereas univariate AR models are hard to beat for shorter horizons.

Bernardini and Cubadda (2015) proposed to regularize the estimate of the autocorrelation matrix prior to performing CCA. In particular, they suggest using, in place of the natural estimator, a proper shrinkage estimator of the covariance matrix of $w_t = [y_t', x_t']'$:


where ρ ∈ [0, 1]. Then a Regularized version of CCA (RCCA) can be obtained by solving the eigenvector equation


Notice that when ρ = 1 the full-rank regression case coincides with n univariate white noises, whereas when ρ = 0 one gets the usual CCA solution. Hence, RCCA can be seen as a frequentist analogue of the Kadiyala-Karlsson version of the Minnesota prior, as it shrinks the elements of β towards zero as well. In order to choose the value of ρ, Bernardini and Cubadda (2015) follow the approach of Ledoit and Wolf (2003), which requires minimizing a risk function based on the Frobenius norm of the difference between the shrinkage estimator Σ̃ww and the covariance matrix Σww. In particular, they provide a feasible estimator of the optimal ρ for the case in which the data are generated by a stationary vector process. Remarkably, this estimator converges to zero at rate T, hence RCCA is asymptotically equivalent to CCA.
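The two limiting cases ρ = 0 and ρ = 1 can be illustrated with the sketch below. The exact shrinkage estimator is not reproduced here, so the code assumes a convex combination of the sample covariance matrix of wt and its own diagonal, which is only one form consistent with the limiting cases described in the text. The function name `rcca_eigenvalues` and the random data are hypothetical, and the data-driven choice of ρ is not implemented.

```python
import numpy as np

def rcca_eigenvalues(Y, X, rho):
    """CCA eigenvalues computed from a shrunk covariance matrix of w_t = [y_t', x_t']'.

    Assumed shrinkage form: (1 - rho) * S + rho * diag(S).
    """
    T, n = Y.shape
    W = np.hstack([Y, X])
    W = W - W.mean(axis=0)
    S = W.T @ W / T
    S_shr = (1.0 - rho) * S + rho * np.diag(np.diag(S))
    Syy, Syx, Sxx = S_shr[:n, :n], S_shr[:n, n:], S_shr[n:, n:]
    w, V = np.linalg.eigh(Syy)
    Syy_mh = (V / np.sqrt(w)) @ V.T          # symmetric inverse square root
    M = Syy_mh @ Syx @ np.linalg.solve(Sxx, Syx.T) @ Syy_mh
    return np.linalg.eigvalsh((M + M.T) / 2)[::-1]   # largest first

# Hypothetical data with n = 3 and k = 6.
rng = np.random.default_rng(5)
T, n, k = 200, 3, 6
X = rng.standard_normal((T, k))
Y = X[:, :n] * 0.5 + rng.standard_normal((T, n))
lam_cca = rcca_eigenvalues(Y, X, rho=0.0)    # usual CCA solution
lam_mid = rcca_eigenvalues(Y, X, rho=0.3)    # intermediate shrinkage
lam_one = rcca_eigenvalues(Y, X, rho=1.0)    # all cross-covariances set to zero
```

With ρ = 1 every cross-covariance vanishes, so all eigenvalues are zero and no series is predictable from Xt, while ρ = 0 returns the ordinary squared canonical correlations.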

Bernardini and Cubadda (2015) document, both by simulations and empirical applications, that RCCA improves both forecasting and estimation of structural parameters over traditional medium-size (n = 20 and p = 2) macroeconometric methods.

5. Conclusions

The main goal of this survey was to create a “common thread” between various topics, related to each other by the idea of modelling various forms of comovements that are typically observed in economic time series and often predicted by economic theory. From a statistical point of view, common features imply a reduction to a more parsimonious structure such as a common factor representation: a small number of unobserved components possess a given feature and transmit it to a larger set of economic time series. As we have seen, RRR is often the solution to the inferential problem. Due to the large amount of literature on common features, we focused the discussion on common cyclical features. However, we have tried to take into account recent developments such as the implications of common features for univariate time series models and the statistical issues that arise when the number of variables is fairly large.

The main drawback of the methods discussed in this paper is that different features are evaluated separately; one important exception is the unifying framework for analyzing common cyclical features that we discussed in Subsection 2.2. Hence, a major goal for future research is to develop estimation and testing procedures that allow for the joint identification of several forms of common features using an integrated statistical approach.

Another challenge for future research is the extension of common cycle analysis to nonstationary medium-large dimensional systems. Indeed, all the methods that are presented in Subsection 4.2 require that the series are individually made stationary prior to the multivariate analysis. This is clearly a methodological limitation since possible cointegration relations are ignored. Hence, a statistical modelling framework that allows for the simultaneous analysis of common trends and cycles when the number of time series is not small is still missing.


Table 1

Maximum ARIMA orders of univariate series generated by an n-dimensional VAR(p) with s cofeature restrictions

Model          AR order                    I(d)   MA order
I(0) VAR       np                          0      (n − 1)p
PSCCF          (n − s)p + s                0      (n − s)p + (s − 1)
CI(1, 1) VAR   n(p − 1) + r                1      (n − 1)(p − 1) + r
SCCF           (n − s)(p − 1) + r          1      (n − s)(p − 1) + r
PSCCF          (n − s)(p − 1) + r + s      1      (n − s)(p − 1) + r + s − 1
WF             (n − s)(p − 1) + r          1      (n − s)(p − 1) + r

SCCF = serial correlation common feature; PSCCF = polynomial SCCF; WF = weak form.

Table 2

Canonical correlations and CCF’s


SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.

Table 3

Tests for common features

Model    Degrees of freedom d
SCCF     s × (n(p − 2) + r + s)
WF       s × (n(p − 2) + s)
PSCCF    s × (n(p − 3) + r + s)
WFP      s × (n(p − 3) + s)

SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.

Table 4

Estimators of the common features vectors and RRR coefficients

Model    (η̂1, . . . , η̂s)    (υ̂s+1, . . . , υ̂n)

SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.

  1. Ahn SK (1997). Inference of vector autoregressive models with cointegration and scalar components. Journal of the American Statistical Association, 92, 350-356.
  2. Ahn SK and Reinsel GC (1988). Nested reduced-rank autoregressive models for multiple time-series. Journal of the American Statistical Association, 83, 849-856.
  3. Ahn SK and Reinsel GC (1990). Estimation for partially nonstationary multivariate autoregressive models. Journal of the American Statistical Association, 85, 813-823.
  4. Anderson HM and Vahid F (1998). Testing multiple equation systems for common nonlinear components. Journal of Econometrics, 84, 1-36.
  5. Anderson TW (1951). Estimating linear restrictions on regression coefficients for multivariate normal distributions. Annals of Mathematical Statistics, 22, 327-351.
  6. Anderson TW (1984). An Introduction to Multivariate Statistical Analysis (2nd ed), New York, John Wiley & Sons.
  7. Anderson TW (1999). Asymptotic theory for canonical correlation analysis. Journal of Multivariate Analysis, 70, 1-29.
  8. Anderson TW (2002). Canonical correlation analysis and reduced rank regression in autoregressive models. Annals of Statistics, 30, 1134-1154.
  9. Athanasopoulos G, de Carvalho Guillén OT, Issler JV, and Vahid F (2011). Model selection, estimation and forecasting in VAR models with short-run and long-run restrictions. Journal of Econometrics, 164, 116-129.
  10. Bai J and Ng S (2006). Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions. Econometrica, 74, 1133-1150.
  11. Bernardini E and Cubadda G (2015). Macroeconomic forecasting and structural analysis through regularized reduced-rank regression. International Journal of Forecasting, 31, 682-691.
  12. Beveridge S and Nelson CR (1981). A new approach to decomposition of economic time series into permanent and transitory components with particular attention to measurement of the ‘business cycle’. Journal of Monetary Economics, 7, 151-174.
  13. Bierens HJ (2000). Nonparametric nonlinear cotrending analysis, with an application to interest and inflation in the United States. Journal of Business & Economic Statistics, 18, 323-337.
  14. Campbell JY (1987). Does saving anticipate declining labor income - An alternative test of the permanent income hypothesis. Econometrica, 55, 1249-1273.
  15. Campbell JY and Mankiw NG (1990). Permanent income, current income, and consumption. Journal of Business & Economic Statistics, 8, 265-279.
  16. Carriero A, Kapetanios G, and Marcellino M (2011). Forecasting large datasets with Bayesian reduced rank multivariate models. Journal of Applied Econometrics, 26, 735-761.
  17. Centoni M, Cubadda G, and Hecq A (2007). Common shocks, common dynamics, and the international business cycle. Economic Modelling, 24, 149-166.
  18. Chapman DA and Ogaki M (1993). Cotrending and the stationarity of the real interest rate. Economics Letters, 42, 133-138.
  19. Cubadda G (1999). Common cycles in seasonal non-stationary time series. Journal of Applied Econometrics, 14, 273-291.
  20. Cubadda G (2001). Common features in time series with both deterministic and stochastic seasonality. Econometric Reviews, 20, 201-216.
  21. Cubadda G (2007). A unifying framework for analyzing common cyclical features in cointegrated time series. Computational Statistics & Data Analysis, 52, 896-906.
  22. Cubadda G and Hecq A (2001). On non-contemporaneous short-run co-movements. Economics Letters, 73, 389-397.
  23. Cubadda G and Hecq A (2011). Testing for common autocorrelation in data-rich environments. Journal of Forecasting, 30, 325-335.
  24. Cubadda G, Hecq A, and Palm FC (2008). Macro-panels and reality. Economics Letters, 99, 537-540.
  25. Cubadda G, Hecq A, and Palm FC (2009). Studying co-movements in large multivariate data prior to multivariate modelling. Journal of Econometrics, 148, 25-35.
  26. Engle RF and Granger CW (1987). Co-integration and error correction: Representation, estimation and testing. Econometrica, 55, 251-276.
  27. Engle RF and Issler JV (1995). Estimating common sectoral cycles. Journal of Monetary Economics, 35, 83-113.
  28. Engle RF and Kozicki S (1993). Testing for common features. Journal of Business & Economic Statistics, 11, 369-380.
  29. Engle RF and Marcucci J (2006). A long-run pure variance common features model for the common volatilities of the Dow Jones. Journal of Econometrics, 132, 7-42.
  30. Engle RF and Susmel R (1993). Common volatility in international equity markets. Journal of Business & Economic Statistics, 11, 167-176.
  31. Ericsson NR (1993). Comment: Testing for common features. Journal of Business & Economic Statistics, 11, 380-383.
  32. Flavin M (1993). The excess smoothness of consumption - identification and interpretation. Review of Economic Studies, 60, 651-666.
  33. Franchi M and Paruolo P (2011). A characterization of vector autoregressive processes with common cyclical features. Journal of Econometrics, 163, 105-117.
  34. Gourieroux CS and Peaucelle I (1988). Detecting a long run relationship (with an application to the p.p.p. hypothesis), Paris, Centre d’études prospectives d’économie mathématique appliquées à la planification.
  35. Haldrup N, Hylleberg S, Pons G, and Sansó A (2007). Common periodic correlation features and the interaction of stocks and flows in daily airport data. Journal of Business & Economic Statistics, 25, 21-32.
  36. Hall RE (1978). Stochastic implications of the life cycle-permanent income hypothesis: Theory and evidence. Journal of Political Economy, 86, 971-987.
  37. Hecq A, Palm FC, and Urbain JP (2000). Testing for common cyclical features in nonstationary panel data models. Advances in Econometrics, 15, 131-160.
  38. Hecq A, Palm FC, and Urbain JP (2006). Common cyclical features analysis in VAR models with cointegration. Journal of Econometrics, 132, 117-141.
  39. Hendry DF (1999). Co-breaking, Clements MP and Hendry DF (Eds). Forecasting Nonstationary Economic Time Series, Cambridge, MA, MIT Press.
  40. Hendry DF and Massmann M (2007). Co-breaking: Recent advances and a synopsis of the literature. Journal of Business & Economic Statistics, 25, 33-51.
  41. Hylleberg S, Engle RF, Granger CW, and Yoo BS (1990). Seasonal integration and cointegration. Journal of Econometrics, 44, 215-238.
  42. Issler JV and Vahid F (2001). Common cycles and the importance of transitory shocks to macroeconomic aggregates. Journal of Monetary Economics, 47, 449-475.
  43. Johansen S (1988). Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control, 12, 231-254.
  44. Johansen S (1996). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford, UK, Oxford University Press.
  45. Kadiyala KR and Karlsson S (1997). Numerical methods for estimation and inference in Bayesian VAR-models. Journal of Applied Econometrics, 12, 99-132.
  46. King RG, Plosser CI, and Rebelo ST (1988). Production, growth and business cycles: II. New Directions. Journal of Monetary Economics, 21, 309-341.
  47. Koop GM (2013). Forecasting with medium and large Bayesian VARs. Journal of Applied Econometrics, 28, 177-203.
  48. Ledoit O and Wolf M (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. Journal of Empirical Finance, 10, 603-621.
  49. Litterman RB (1986). Forecasting with Bayesian vector autoregressions: Five years of experience. Journal of Business & Economic Statistics, 4, 25-38.
  50. Long JB and Plosser CI (1983). Real business cycles. The Journal of Political Economy, 91, 39-69.
  51. Lucas RE (1977). Understanding business cycles. Carnegie-Rochester Conference Series on Public Policy, 5, 7-29.
  52. Paruolo P (2006). Common trends and cycles in I(2) VAR systems. Journal of Econometrics, 132, 143-168.
  53. Reinsel G (1983). Some results on multivariate autoregressive index models. Biometrika, 70, 145-156.
  54. Reinsel GC and Velu RP (1998). Multivariate Reduced Rank Regression, New York, Springer.
  55. Rosipal R and Krämer N (2006). Overview and recent advances in partial least squares, Saunders C (Ed). Subspace, Latent Structure and Feature Selection, (pp. 34-51), Berlin, Springer.
  56. Schleicher C (2007). Codependence in cointegrated autoregressive models. Journal of Applied Econometrics, 22, 137-159.
  57. Stock JH and Watson MW (2005). Implications of dynamic factor models for VAR analysis (No. w11467), Cambridge, MA, National Bureau of Economic Research.
  58. Stock JH and Watson MW (2011). Dynamic factor models, Clements MP and Hendry DF (Eds). Oxford Handbook of Economic Forecasting, (pp. 35-59), New York, Oxford University Press.
  59. Tiao GC and Tsay RS (1989). Model specification in multivariate time series. Journal of the Royal Statistical Society. Series B (Methodological), 51, 157-213.
  60. Vahid F and Engle RF (1993). Common trends and common cycles. Journal of Applied Econometrics, 8, 341-360.
  61. Vahid F and Engle RF (1997). Codependent cycles. Journal of Econometrics, 80, 199-221.
  62. Vahid F and Issler JV (2002). The importance of common cyclical features in VAR analysis: A Monte-Carlo study. Journal of Econometrics, 109, 341-363.
  63. Velu RP, Reinsel GC, and Wichern DW (1986). Reduced rank models for multiple time series. Biometrika, 73, 105-118.
  64. Vinod HD (1976). Canonical ridge and econometrics of joint production. Journal of Econometrics, 4, 147-166.
  65. Zellner A and Palm F (1974). Time series analysis and simultaneous equation econometric models. Journal of Econometrics, 2, 17-54.