In this paper we review the literature on common features analysis of economic time series. Starting from the seminal contributions by Engle and Kozicki (1993) and Vahid and Engle (1993), we present and discuss the various notions that have been proposed to detect and model common cyclical features in macroeconometrics. In particular, we analyze in detail the link between common cyclical features and the reduced-rank regression model. We also illustrate similarities and differences between the common features methodology and other popular types of multivariate time series modelling. Finally, we discuss some recent developments in this area, such as the implications of common features for univariate time series models and the analysis of common autocorrelation in medium-large dimensional systems.
Economic time series can be characterized by several features, such as trends, cycles, seasonality, and serial correlation. When each series in a set possesses the same type of feature, a linear combination of them does not necessarily possess it. This is the most interesting case, for which Engle and Kozicki (1993) provided the following definition: “A feature, which is present in each of a set of series, is said to be common to those series when there exists a nonzero linear combination of these series that does not have the feature”. A well-known example of common features is cointegration (Engle and Granger, 1987; Johansen, 1988): a group of series that possess stochastic trends is cointegrated when some linear combinations of the variables are stationary, i.e. do not have stochastic trends.
Nowadays, there is a large collection of special cases of common features. A comprehensive, although still partial, list includes: codependence (Gourieroux and Peaucelle, 1988; Vahid and Engle, 1997) and the scalar component model (Tiao and Tsay, 1989), when a linear combination of variables possesses shorter memory than the individual series; common serial correlation (Engle and Kozicki, 1993; Vahid and Engle, 1993), when a linear combination of serially correlated series is an innovation w.r.t. the past; cotrending (Chapman and Ogaki, 1993), when a linear combination of trend-stationary time series no longer displays a deterministic trend; common volatility (Engle and Kozicki, 1993; Engle and Susmel, 1993), when a linear combination of conditionally heteroskedastic time series eliminates conditional heteroskedasticity; seasonal cointegration (Hylleberg et al., 1990), when a linear combination of seasonally integrated series is nonseasonal; co-breaking (Hendry, 1999; Hendry and Massmann, 2007), when a set of series appears subject to structural breaks but a linear combination of them does not display the breaks; codependent cycles (Vahid and Engle, 1997), when a linear combination of a group of variables has shorter memory than the individual series; common nonlinearity (Anderson and Vahid, 1998), when the conditional expectation of each element of a vector time series is nonlinear w.r.t. the conditioning vector but there exists a linear combination of them whose conditional expectation is linear w.r.t. the conditioning vector; common seasonal cycles (Cubadda, 1999), when there exists a linear combination of seasonally differenced series that follows an MA process of low order; common panel structures (Hecq et al., 2000), when there is a linear combination of the variables in a panel that is white noise for all individuals of the panel; nonlinear cotrending (Bierens, 2000), when a linear combination of the components of a set of time series that are stationary around nonlinear deterministic time trends is stationary around a linear trend or a constant; polynomial common serial correlation (Cubadda and Hecq, 2001), when there exists a polynomial combination of serially correlated time series that is an innovation; long-run pure variance common feature (Engle and Marcucci, 2006), when the conditional variances of a collection of assets all depend upon a small number of variance factors; unpredictable polynomial combinations (Paruolo, 2006), when a polynomial linear combination of series integrated of different orders is an innovation; weak form of common serial correlation (Hecq et al., 2006), when a linear combination of serially correlated series adjusted for the equilibrium errors is an innovation; common periodic correlation (Haldrup et al., 2007), which extends the notion of common serial correlation to periodic autoregressive models.
From a statistical point of view, common features imply a reduction to more parsimonious structures such as common factor representations (see, i.a., Cubadda, 2007), which can often be estimated by reduced-rank regression techniques (Anderson, 1984, 1999). Imposing the common feature restrictions on the statistical model generally leads to considerable gains in both forecasting and structural analysis (Vahid and Issler, 2002).
Common features among economic time series are often predicted by economic theory. For example, the solution of the macro model in King et al. (1988) implies that output, consumption and investment have a common trend and a common cycle. The common stochastic trend is generated by an integrated productivity shock, while the deviation of the capital stock from its steady-state value determines the transitional dynamics of output, consumption and investment. Another example is Campbell (1987), where the saving path implies that disposable income and consumption cointegrate (Issler and Vahid, 2001). Further, Vahid and Engle (1993) and Issler and Vahid (2001) show that, in models of aggregate consumption, either the existence of “myopic” individuals (Campbell and Mankiw, 1990) or the excess sensitivity of consumption to current income (Flavin, 1993) implies that the growth rates of consumption and income share a common cycle. Note that in the consumption model by Hall (1978) consumption and income share only a common stochastic trend. As another example, in the real business cycle model for sectoral output à la Long and Plosser (1983), as in Engle and Issler (1995), common cycles depend on the propagation mechanisms through the restrictions on the production function, i.e. technological constraints.
The rest of the paper is organized as follows. In Section 2, after introducing the general notion of common features and linking it to the reduced-rank regression model, we focus on the various forms of common cyclical features and their implications in terms of common short-run components. We also illustrate similarities and differences between these approaches and other popular types of multivariate time series modelling. Section 3 considers the consequences of the presence of common features for the univariate representation of multiple time series. Section 4 deals with the estimation methods for the models implied by the various forms of common features, distinguishing between the cases of small systems and medium-large systems. Finally, Section 5 draws some conclusions.
In this section we first present and discuss the general notion of common features that was originally proposed by Engle and Kozicki (1993). We stress the link between the presence of common features and the multivariate Reduced-Rank Regression (RRR) model that was introduced by Anderson (1951). A detailed survey of this modelling approach may be found in Reinsel and Velu (1998). Then we focus on the common autocorrelation feature and its interplay with the notion of common cycles in the multivariate Beveridge and Nelson (1981) decomposition. Starting from the seminal work by Vahid and Engle (1993), we illustrate the various forms of common cyclical features that have been proposed in the literature and their implications in terms of common unobserved components. Finally, we illustrate similarities and differences between the common serial correlation approach and other types of multivariate time series modelling that are popular in statistics and econometrics, such as the dynamic factor model (see, e.g., Stock and Watson, 2011, and the references therein) and the multivariate autoregressive index model (Reinsel, 1983).
Engle and Kozicki (1993) considered features that satisfy the following axioms:
The vector series X_{t} has (does not have) the feature if any non-singular linear transformation of X_{t} still has (does not have) such feature.
If two n-vector time series Y_{1t} and Y_{2t} do not have the feature, then (Y_{1t} + Y_{2t}) does not have the feature.
If Y_{t} does not have the feature but X_{t} has the feature then (Y_{t} + X_{t}) has the feature.
Then, any dynamic property of the data can be viewed as a special case of a feature: for example, series with stochastic trends satisfy all axioms. As pointed out by Engle and Kozicki (1993), a linear combination of two series that both have the feature does not necessarily possess the feature. This is the most interesting case, and Engle and Kozicki addressed it with the following definition:
A feature, which is present in each of a set of series, is said to be common to those series when there exists a nonzero linear combination of these series that does not have the feature.
From a statistical point of view, the existence of such linear combinations can be linked to a common factor representation. Indeed, let Y_{t} be an n-vector time series such that
Y_{t} = B f_{t} + e_{t},
where the (n − s) common factors f_{t} have the feature, B is an n × (n − s) loading matrix, and e_{t} does not have the feature. Consider an n-vector δ such that δ′B = 0; then δ′Y_{t} = δ′e_{t} does not have the feature.
The main idea is that a small number of unobserved components possess a given feature and transmit it to a larger set of time series. It is then possible to combine such time series in order to cancel the influence of these unobserved components, thus removing the Common Feature (CF) from the data.
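The mechanism can be illustrated numerically. The following is a minimal sketch, not taken from the literature reviewed here: it assumes that the feature is first-order serial correlation, and the loading matrix `B`, the AR coefficient 0.8 and the sample size are all hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, s = 5000, 3, 1                      # illustrative sizes (assumptions)

# n - s = 2 common factors that have the feature (here: AR(1) serial correlation)
f = np.zeros((T, n - s))
shocks = rng.standard_normal((T, n - s))
for t in range(1, T):
    f[t] = 0.8 * f[t - 1] + shocks[t]

B = rng.standard_normal((n, n - s))       # hypothetical loading matrix
e = rng.standard_normal((T, n))           # component without the feature
Y = f @ B.T + e                           # Y_t = B f_t + e_t

# delta spans the left null space of B, so that delta' B = 0
U, _, _ = np.linalg.svd(B)                # full SVD: U is n x n
delta = U[:, n - s:]                      # last s columns are orthogonal to col(B)

def acf1(x):
    """Sample first-order autocorrelation of a univariate series."""
    x = x - x.mean()
    return float(x[1:] @ x[:-1] / (x @ x))

combo = (Y @ delta).ravel()               # delta' Y_t = delta' e_t
print("lag-1 autocorrelation of the first series:", acf1(Y[:, 0]))
print("lag-1 autocorrelation of delta'Y:", acf1(combo))
```

In this sketch the individual series inherit the serial correlation of the factors, whereas the combination δ′Y_{t} should display an autocorrelation close to zero.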
In order to illustrate the connection between CF’s and RRR, let us assume that Y_{t} is an n-vector (weakly) stationary time series such that
where X_{t} and Z_{t} are, respectively, vectors of k and m stationary time series, β ≠ 0, and ε_{t} are i.i.d. innovations, with E(ε_{t}) = 0,
Moreover, assume that:
Variables X_{t} possess the feature of interest whereas variables Z_{t} do not.
There exists an n × s (s < min{n, k}) full-rank matrix δ such that δ′Y_{t} does not possess the feature. Then, variables Y_{t} are said to have s CF’s.
In view of the above assumptions we get
that is equivalent to
where δ_{⊥} is an n × (n − s) full-rank matrix such that δ′δ_{⊥} = 0, and ψ is a k × (n − s) matrix. If n > k, the matrix β′ does not have full column rank, and there exist (n − k) “trivial” CF’s. Hence, we also assume that s < n ≤ k, so that the matrix ψ has full rank as well.
In view of
The cofeature matrix δ and the coefficient matrix ψ can be obtained as follows:
Obtain the partial regression model
where
for two generic stationary time series A_{t} and b_{t}.
Solve the following maximization problem
Then we get the solution
Note that R^{2}(υ′y_{t}|η′x_{t}), where η = βυ, is maximized for υ = υ_{1} and
Solve the following maximization problem
for i ≤ j = 2, . . . , n.
We get the solution
and ν_{i} and ν_{j} are eigenvectors corresponding to different eigenvalues.
Since δ′β′ = 0, then λ_{n−s+1} = · · · = λ_{n} = 0, and the cofeatures matrix δ is obtained (up to an identification matrix) as follows
Since λ_{1} ≥ · · · ≥ λ_{n−s} > 0, the coefficient matrix ψ is obtained (up to an identification matrix) as follows
Finally, the loading matrix δ_{⊥} is given by the regression coefficients of y_{t} on ψ′x_{t}.
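The steps above can be sketched numerically. The code below is a hedged illustration rather than the authors' implementation: it omits the variables Z_{t}, it assumes that the relevant eigenvalue problem is the usual canonical correlation one for ∑_{yy}^{−1}∑_{yx}∑_{xx}^{−1}∑_{xy} with the normalization υ′∑_{yy}υ = 1, and it assumes that ψ is recovered as ∑_{xx}^{−1}∑_{xy} times the eigenvectors of the nonzero eigenvalues. The matrices `delta_perp` and `psi` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, k, s = 20000, 3, 4, 1                # illustrative sizes (assumptions)

# Hypothetical reduced-rank structure beta' = delta_perp psi'
delta_perp = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
psi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [0.5, 0.5]])
beta_T = delta_perp @ psi.T                # n x k matrix beta'

X = rng.standard_normal((T, k))
Y = X @ beta_T.T + rng.standard_normal((T, n))

# Sample second moments of the demeaned data
Yc, Xc = Y - Y.mean(0), X - X.mean(0)
S_yy, S_xx, S_yx = Yc.T @ Yc / T, Xc.T @ Xc / T, Yc.T @ Xc / T

# Symmetric form of the eigenvalue problem via the Cholesky factor of S_yy
L = np.linalg.cholesky(S_yy)
Linv = np.linalg.inv(L)
M = Linv @ S_yx @ np.linalg.solve(S_xx, S_yx.T) @ Linv.T
lam, W = np.linalg.eigh(M)                 # eigenvalues in ascending order
V = Linv.T @ W                             # eigenvectors with v' S_yy v = 1

delta_hat = V[:, :s]                       # s smallest eigenvalues: cofeature vectors
psi_hat = np.linalg.solve(S_xx, S_yx.T) @ V[:, s:]   # k x (n - s), up to identification

# Loadings: regression coefficients of y_t on psi' x_t
F = Xc @ psi_hat
delta_perp_hat = np.linalg.lstsq(F, Yc, rcond=None)[0].T

print("eigenvalues:", lam)
print("max |delta_hat' beta'|:", np.abs(delta_hat.T @ beta_T).max())
```

Only the product δ_{⊥}ψ′ is identified, so the sketch compares `delta_perp_hat @ psi_hat.T` with β′ rather than the two factors separately.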
It is worth remarking that the eigenvalue problem
is equivalent to finding the roots of the equation:
with the normalization υ′∑_{yy}υ = 1.
One easily recognizes that the solution of the problem (
A very well-known example of CF is cointegration (see, e.g., Johansen (1996) and the references therein), where linear combinations of series with nonstationary stochastic trends are stationary.
Let us assume that the elements of an n-vector of time series U_{t} are integrated of order 1, denoted as U_{t} ~ I(1), and that they admit the following VAR(p) representation:
where
When variables U_{t} are cointegrated of order (1, 1), denoted as U_{t} ~ CI(1, 1), we can rewrite the model (
where
Since ΔU_{t} is a stationary stochastic process, it admits the following Wold representation:
where
By expanding C(L) on 1 and integrating both sides of the above equation, we get the multivariate Beveridge-Nelson representation (BN; Beveridge and Nelson, 1981):
where C^{*}_{i} = −∑_{j>i} C_{j} for all i.
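The algebra behind this expansion can be checked numerically. The sketch below assumes the standard identity C(L) = C(1) + (1 − L)C^{*}(L), with C^{*}_{i} = −∑_{j>i} C_{j}, and uses a hypothetical finite-order VMA with randomly drawn coefficient matrices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 2, 3                                # hypothetical dimension and VMA order

# Hypothetical Wold coefficients C_0 = I, C_1, ..., C_q
C = [np.eye(n)] + [0.5 ** i * rng.standard_normal((n, n)) for i in range(1, q + 1)]

C1 = sum(C)                                # long-run matrix C(1)

# C*_i = - sum_{j > i} C_j  (an empty sum gives the zero matrix)
C_star = [-sum(C[i + 1:], np.zeros((n, n))) for i in range(q + 1)]

# Coefficients of C(1) + (1 - L) C*(L), power by power in L
rebuilt = [C1 + C_star[0]] + [C_star[i] - C_star[i - 1] for i in range(1, q + 1)]

ok = all(np.allclose(a, b) for a, b in zip(rebuilt, C))
print("C(L) = C(1) + (1 - L) C*(L) holds:", ok)
```

The first term, C(1), drives the stochastic trends, while C^{*}(L) applied to the innovations gives the transitory (cyclical) part.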
Since we know from the Engle-Granger representation theorem (Engle and Granger, 1987; Johansen, 1996) that
Note the analogy between (
However, detrended economic time series often display clear evidence of comovements (Lucas, 1977), which cannot be due to cointegration, thus suggesting the presence of common cycles. If this is the case, we expect that there exist linear combinations of cyclical series that are not cyclical. Such Common Cyclical Features (CCF’s) (Engle and Kozicki, 1993; Vahid and Engle, 1993) can be interpreted as short-run equilibrium relationships, similarly to the interpretation of the cointegration relations as long-run equilibria.
From the multivariate BN decomposition (
where
While the presence of cointegration implies reduced-rank restrictions on the VECM parameters that are responsible for the long-run behavior of series U_{t}, since γ′τ_{t} = 0, the analysis of CCF’s is instead concerned with reduced-rank restrictions on the short-run VECM parameters (the polynomial matrix Γ(L) and the adjustment matrix α) that have interesting implications on the cycles κ_{t}.
However, unlike cointegration, there is no unique notion of common short-run components. Indeed, the degree of synchronicity of the common cycle also plays a role in the definitions. Alternative notions of CCF impose differing reduced-rank structures on the VAR. Let us briefly review the various forms of CCF, starting with the seminal notion proposed by Engle and Kozicki (1993).
Series ΔU_{t} have s (s < n) serial correlation common features (SCCF) iff there exists an n × s matrix δ_{S} with full column rank such that the VECM (
where ψ_{S} is an (np − n + r) × (n − s) matrix with full column rank (Engle and Kozicki, 1993). Since
Cubadda and Hecq (2001) propose the notion of polynomial serial correlation common features (PSCCF) as a measure of non-contemporaneous cyclical comovements. Non-synchronous common cycles arise, for example, in economic models of consumption with several types of consumer goods as, i.a., in Vahid and Engle (1997) and Schleicher (2007), where it is shown that the maximization problem of the representative agent implies an unsynchronized common cycle (codependent cycle) in the consumer goods vector.
By definition, series ΔU_{t} have s PSCCF’s iff there exists an n × s matrix δ_{P} with full column rank such that
, and the VECM (
where ψ_{P} is an (np − 2n + r) × (n − s) matrix with full column rank (Cubadda and Hecq, 2001).
In order to interpret the notion of PSCCF, Cubadda and Hecq (2001) show that there exists a first-order polynomial matrix
Hence, PSCCF requires that there exists a first-order polynomial matrix δ(L) such that δ(L)′ΔU_{t} is white noise. The presence of PSCCF has an interesting implication for the BN cycles of series U_{t}: indeed, since
In the above definitions of CCF, the number of SCCF’s or PSCCF’s, s, cannot exceed the number of common trends (n − r). In order to remove this restriction, Hecq et al. (2006) proposed the notion of weak form of SCCF (WF): series ΔU_{t} have s WF’s iff there exists an n × s matrix δ_{W} with full column rank such that
where ψ_{W} is an (np − n) × (n − s) matrix with full column rank (Hecq et al., 2006).
In order to uncover interesting implications of the WF for the BN cycle, Cubadda (2007) shows that there exists a first-order polynomial matrix δ_{W}(L) ≡ δ_{W} − (γα′ + I_{n})δ_{W}L such that
As a consequence, since
A limitation of the above methods for cyclical features analysis is that they cannot handle the possible coexistence of differing types of reduced-rank restrictions in the same vector. In order to overcome this limitation, Cubadda (2007) introduced the notion of weak form of PSCCF (WFP), which encompasses most of the existing formulations: series ΔU_{t} have s WFP’s iff there exists an n × s matrix δ_{F} with full column rank such that
where ψ_{F} is an (np − 2n) × (n − s) matrix with full column rank.
The WFP requires the existence of a second-order polynomial matrix
An important implication of the WFP is that the polynomial matrix δ_{F}(L) transforms the BN cycles κ_{t} into a process with shorter memory, since δ_{F}(L)′κ_{t} ~ VMA(1).
The CCF analysis has also been extended to series displaying forms of nonstationarity other than I(1)-ness. In particular, Cubadda (1999, 2001) explored the presence of common cycles in seasonal time series that are also integrated at (a subset of) the seasonal frequencies, whereas Paruolo (2006) focused on the case of I(2) systems.
Franchi and Paruolo (2011) offered a comprehensive theoretical analysis of the conditions for the existence of the various forms of CCF’s and of the characterization of the CCF relations in I(0), I(1) and I(2) systems.
It is interesting to analyze similarities and differences between the CF approach and other popular multivariate time series models. For the sake of simplicity, we will refer within this subsection to the basic SCCF model, which can be formulated as
where series Y_{t} are assumed to be I(0).
Ahn and Reinsel (1988) proposed a variant of the basic RRR model that is called the Nested Reduced-Rank AR model (NRRAR). The main assumption is that the VAR coefficient matrices have reduced ranks and that their ranges are nested in each other: Range(A_{j}) ⊃ Range(A_{j+1}) for j = 1, . . . , p − 1. Then the NRRAR reads
The NRRAR is a very general statistical model since it is easy to see that both the SCCF and its polynomial extensions are particular cases of
A different modelling, which is also endowed with a reduced-rank structure, is the Multivariate Autoregressive Index model (MAI) as originally proposed by Reinsel (1983). The basic version of the MAI reads
where ξ is an n × q (q < n) full-rank matrix, and φ_{j} is an n × q matrix for j = 1, . . . , p. The linear combinations I_{t} = ξ′Y_{t} are called the indexes.
Notice that the regression coefficient matrix implied by the MAI has the following structure
which implies that β′ξ_{⊥} = 0. Hence, whereas SCCF imposes a common left null space to the VAR coefficient matrices, MAI imposes a common right null space to those matrices.
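The contrast between the two null spaces can be made concrete with a small numerical sketch. It assumes, in line with the definitions above, that the SCCF-type coefficient matrices take the form A_{j} = δ_{⊥}ψ_{j}′ and the MAI-type ones the form A_{j} = φ_{j}ξ′; all matrices are randomly drawn and purely hypothetical, and `null_space_left` is a helper introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, s, q = 4, 2, 1, 2                    # illustrative sizes (assumptions)

def null_space_left(M, tol=1e-10):
    """Orthonormal basis of the vectors v such that v' M = 0."""
    U, sv, _ = np.linalg.svd(M)            # full SVD: U is square
    rank = int((sv > tol).sum())
    return U[:, rank:]

# SCCF-type structure: every A_j = delta_perp psi_j' shares the same column space
delta_perp = rng.standard_normal((n, n - s))
A_sccf = [delta_perp @ rng.standard_normal((n - s, n)) for _ in range(p)]
delta = null_space_left(delta_perp)        # n x s, common LEFT null space

# MAI-type structure: every A_j = phi_j xi' shares the same row space
xi = rng.standard_normal((n, q))
A_mai = [rng.standard_normal((n, q)) @ xi.T for _ in range(p)]
xi_perp = null_space_left(xi)              # n x (n - q), common RIGHT null space

for j in range(p):
    print(f"lag {j + 1}: |delta' A_j| max = {np.abs(delta.T @ A_sccf[j]).max():.1e}, "
          f"|A_j xi_perp| max = {np.abs(A_mai[j] @ xi_perp).max():.1e}")
```

In the first case δ′Y_{t} is unpredictable from the past, whereas in the second case the past enters the system only through the q indexes ξ′Y_{t−j}.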
Although both SCCF and MAI have a reduced-rank structure, the mathematical properties of these two modelling approaches are only partially similar. Indeed, although the interpretation of the canonical variates ψ′X_{t} and the index lags [ξ′Y_{t−1}, . . . , ξ′Y_{t−p}] is analogous, since both of them represent the relevant predictors of series Y_{t}, the same statement cannot be applied to the linear combinations δ′Y_{t} and
A specific property of the MAI is that the indexes themselves follow a VAR(p) process. Indeed, if we premultiply both sides of
whereas linear combinations of series generated by an unrestricted VAR(p) model generally follow a VARMA process, see e.g. Cubadda et al. (2009) and the references therein. It is easy to see that a similar implication does not hold for the canonical variates
A different approach that gained large popularity is the Dynamic Factor Model (DFM), see e.g. Stock and Watson (2011) and the references therein. As shown by Stock and Watson (2005), the DFM can be represented in the VAR form as follows
where D(L) is a diagonal finite-order n × n polynomial matrix, Λ is an n × q matrix, Φ_{j} is an n × q matrix for j = 1, . . . , p, F_{t} is a q-vector of unobserved factors, and
A key difference between the DFM and the previously considered approaches is that the former requires for inferential purposes that the number of series n is large compared to the sample size T. Indeed, as shown by Bai and Ng (2006), if the factors F_{t} are obtained through principal component methods, the estimated factors can be treated as observed for statistical inference on
Apart from the different asymptotic frameworks, the DFM and the MAI have some degree of similarity in their mathematical formulations. Indeed,
It is less obvious how to relate the DFM to the SCCF. One may notice that in the particular case that D(L) = I_{n}, series Y_{t} in
It is well known that each series generated by a VAR process admits a univariate ARIMA representation, see e.g. Zellner and Palm (1974). However, the VAR models that are typically used in macroeconomic analysis would imply highly non parsimonious ARIMA models for individual time-series, whereas low order ARIMA models are empirically appropriate. This is the so-called “autoregressivity paradox”. Cubadda et al. (2008, 2009) argued that the presence of common cyclical features can provide a solution to this paradox.
Indeed, let us assume that the n series of interest are generated by the following VAR(p) model
A(L)Y_{t} = ε_{t}.
The so-called Final Equations (FEs) of series Y_{t} can be obtained by premultiplying both sides of the VAR equation by A(L)^{adj}, the adjoint matrix associated with A(L):
det[A(L)]Y_{t} = A(L)^{adj}ε_{t},
where det[A(L)] is the determinant of the polynomial matrix A(L).
Since det[A(L)] is a polynomial of order np and A(L)^{adj} is a matrix polynomial of order (n − 1)p, it follows that the elements of Y_{t} admit a univariate ARMA(np, (n − 1)p) representation. Hence, a typical VAR model with n = 5 and p = 4 would imply that each individual variable follows an ARMA(20, 16) model, which is at odds with the empirical evidence.
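The order arithmetic can be written as a one-line helper. The function name below is hypothetical, and the orders are the maximal ones implied by an unrestricted VAR, as stated in the text.

```python
def final_equation_orders(n: int, p: int) -> tuple[int, int]:
    """Maximal univariate ARMA orders implied by an unrestricted n-variable VAR(p)."""
    ar_order = n * p            # degree of det[A(L)]
    ma_order = (n - 1) * p      # degree of the adjoint of A(L)
    return ar_order, ma_order

print(final_equation_orders(5, 4))   # (20, 16)
print(final_equation_orders(3, 1))   # (3, 2)
```

The second call corresponds to a three-variable VAR(1), for which the unrestricted final equations are ARMA(3, 2).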
Following Cubadda et al. (2009), let us assume that n = 3 series are generated by the following VAR(1):
where Y_{t} = (y_{1}_{t}, y_{2}_{t}, y_{3}_{t})′.
For the above VAR, the FEs are:
such that individual series follow ARMA(3, 2) models.
However, if ω = 0, the VAR has reduced-rank structure
which produces the FEs:
This implies that the univariate representations are parsimonious ARMA(1, 1) models with the same autoregressive parameter and cross-correlated VMA errors having a factor structure.
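The reduction of the autoregressive order can be verified numerically. The sketch below does not use the parametrization of Cubadda et al. (2009); it takes two hypothetical 3 × 3 VAR(1) coefficient matrices, one of full rank and one of rank one, and computes the degree of det(I − Az), which is the AR order of the final equations. It illustrates the AR part only.

```python
import numpy as np

def det_poly(A):
    """Coefficients of det(I - A z) in increasing powers of z.

    np.poly(A) returns the characteristic polynomial det(xI - A) in decreasing
    powers of x; since det(I - A z) = z^n det(I/z - A), the same coefficients
    read in the same order are those of det(I - A z) in increasing powers of z.
    """
    return np.poly(A)

def degree(coefs, tol=1e-10):
    """Degree of a polynomial given its coefficients in increasing powers."""
    return int(np.flatnonzero(np.abs(coefs) > tol)[-1])

# Hypothetical full-rank VAR(1) coefficient matrix
A_full = np.array([[0.5, 0.1, 0.2],
                   [0.1, 0.4, 0.1],
                   [0.2, 0.0, 0.3]])

# Hypothetical rank-one VAR(1) coefficient matrix A = a b'
a = np.array([1.0, 0.5, -0.5])
b = np.array([0.4, 0.2, 0.1])
A_rr = np.outer(a, b)

print("AR order, full rank:", degree(det_poly(A_full)))
print("AR order, rank one :", degree(det_poly(A_rr)))
print("common AR coefficient under rank one:", -det_poly(A_rr)[1].real)
```

With a rank-one coefficient matrix the determinant collapses to 1 − (b′a)z, so that all series share the same first-order autoregressive polynomial.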
More generally, Table 1 summarizes the reduction of the individual ARMA orders due to common features restrictions.
As one can see, the existence of CCF’s provides a possible, economically meaningful, solution to the autoregressivity paradox. Notice that Table 1 provides the maximal ARIMA orders under CCF restrictions. However, the orders can be even smaller due to additional restrictions on the VAR parameters, such as block-diagonal or block-triangular structures.
The presence of short-run comovements also has consequences for the VMA part of the FEs. Cubadda et al. (2009) show that in a stationary VAR(p), the existence of s SCCF relationships implies that in the FEs the VMA coefficient matrices associated with degrees strictly larger than (n − s − 1)p have a common right null space that is spanned by δ_{⊥}. Hence, it is possible to reduce the order of the VMA component to a degree of at most (n − s − 1)p instead of (n − s)p. In particular, when n − 1 = s, the FEs follow a model that is popular in the macro-panel literature: a homogeneous AR component and cross-correlated VMA errors having a factor structure.
Cubadda et al. (2009) illustrate this point with an empirical example. They consider the industrial production indexes of Canada and the US. Since no cointegration in log levels is found, a VAR(1) in first differences seems appropriate. The estimation by OLS delivers (standard errors in brackets)
The theoretical FE orders imply ARMA(2, 1) processes. However, the SCCF test statistic favors s = 1 (p-value = 0.31), with the estimated SCCF relationship (Δ lnUS_{t} − 1.05Δ lnCA_{t}). Statistical identification of the univariate models provides the following ARIMA(1,1,0) structures
As expected, the AR coefficients are very similar. Moreover, since the estimated cofeature vector δ̂′ is close to (1 : −1), and the VAR residuals have similar variances and a correlation around 0.65, the factor structure of the VMA part of the FEs may explain why the MA(1) components are empirically negligible.
In this section we first illustrate how to estimate the RRR models implied by the various forms of CCF’s when the system dimension is small. In this case, Maximum Likelihood (ML) estimation under the Gaussianity assumption is generally employed. However, ML methods break down when the number of regressors becomes large compared to the typical sample size in macroeconomic datasets (100 ≤ T ≤ 200). The problem is due to the need to invert large covariance matrices. Hence, we will review some recent proposals for estimating VAR models with reduced-rank restrictions when the number of series n exceeds 20.
Under the assumptions that series Y_{t}, X_{t} and Z_{t} are I(0) and that the innovations ε_{t} are Gaussian, ML inference on model (
In particular, the LR test statistic on the existence of s CF’s is:
where λ̂_{i} is the i-th largest eigenvalue of the sample matrix
The ML estimator of the cofeatures matrix δ is obtained (up to an identification matrix) as follows
where
The ML estimator of the coefficient matrix ψ is obtained (up to an identification matrix) as follows
where
The ML estimator of the loading matrix δ_{⊥} is finally obtained by the regression coefficients of y_{t} on ψ̂′x_{t}.
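The computations can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the LR statistic is assumed to take the usual form −T ∑ ln(1 − λ̂_{i}) over the s smallest squared sample (partial) canonical correlations, the function names are hypothetical, and the data come from a hypothetical VAR(1) whose coefficient matrix has the cofeature vector (1, 1, −1)′.

```python
import numpy as np

def partial_out(A, Z=None):
    """OLS residuals of the columns of A on a constant and, if given, Z."""
    ones = np.ones((len(A), 1))
    Z1 = ones if Z is None else np.column_stack([ones, Z])
    coef = np.linalg.lstsq(Z1, A, rcond=None)[0]
    return A - Z1 @ coef

def squared_cancorr(Y, X, Z=None):
    """Squared sample partial canonical correlations, largest first."""
    y, x = partial_out(Y, Z), partial_out(X, Z)
    S_yy, S_xx, S_yx = y.T @ y, x.T @ x, y.T @ x
    M = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_yx.T)
    lam = np.sort(np.linalg.eigvals(M).real)[::-1]
    return np.clip(lam, 0.0, 1.0)

def lr_statistic(lam, s, T):
    """Assumed LR form: -T times the sum of log(1 - lambda) over the s smallest."""
    return float(-T * np.log1p(-lam[-s:]).sum())

# Hypothetical stationary VAR(1) with one SCCF: (1, 1, -1) A = 0
rng = np.random.default_rng(4)
T = 400
A = np.array([[0.5, 0.0, 0.0],
              [0.0, 0.4, 0.1],
              [0.5, 0.4, 0.1]])
data = np.zeros((T + 1, 3))
for t in range(1, T + 1):
    data[t] = A @ data[t - 1] + rng.standard_normal(3)

Y, X = data[1:], data[:-1]                 # X_t = Y_{t-1}; no Z_t in this example
lam = squared_cancorr(Y, X)
lr_s1 = lr_statistic(lam, 1, T)
lr_s3 = lr_statistic(lam, 3, T)
print("squared canonical correlations:", lam)
print("LR statistic for s = 1:", lr_s1)
print("LR statistic for s = 3:", lr_s3)
```

In this design the statistic for s = 1 should be small relative to the one for s = 3, since only one cofeature relationship exists by construction.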
The degrees of freedom of the test statistic (
Without loss of generality, we assume that δ_{⊥,1} has full-rank. Then we can write
where
In order to conduct inference on the various forms of CCF’s, a two-step procedure is usually employed (see, i.a., Vahid and Engle (1993), Ahn (1997), Cubadda and Hecq (2001), Hecq et al. (2006), and Cubadda (2007)). First, the cointegrating vectors are estimated by ML on the VECM without imposing any CCF restrictions. This can be done by using the procedures suggested in Johansen (1988) or Ahn and Reinsel (1990). Second, the cointegration matrix γ is fixed to its ML estimate, and inference on CCF’s is carried out. Since the ML estimator of the cointegration matrix γ converges at rate T irrespective of the existence of constraints in the short-run polynomial matrix Γ(L) (see, e.g., Johansen (1996)), the asymptotic distribution of the test statistics (
Hence, ML inference on the various forms of common features is obtained by solving CanCor{ΔU_{t}, X_{t} | Z_{t}} for proper choices of the variables X_{t} and Z_{t} as detailed in Table 2.
For each of the models in Table 2, under the null that s common features of a given form exist, the test statistic (
Moreover, optimal estimates of both the common features vectors and (partial) RRR coefficients are then obtained as described in Table 4.
Finally, the remaining parameters of the various RRR models are estimated by OLS after fixing the matrices ψ’s to their estimated values.
Notice that the two-step procedures previously illustrated do not maximize the Gaussian likelihood, although the estimated parameters have the same asymptotic distribution as the optimal ones. Centoni et al. (2007) suggested an iterative algorithm for computing the ML estimates. The main idea is to switch between the estimation of the cointegration matrix γ, having fixed the short-run parameters, and the estimation of the short-run parameters, having fixed the cointegration matrix.
The merits of imposing CCF restrictions in VAR models are investigated in several Monte Carlo studies. Vahid and Issler (2002) documented the advantages of simultaneously choosing the order p of the VAR model and the number s of SCCF restrictions for both forecasting and structural analysis. Athanasopoulos et al. (2011) extended the analysis to cointegrated systems. They proposed to simultaneously choose the cointegration rank r with p and s. Using the switching algorithm in Centoni et al. (2007) for estimation and a novel criterion for model identification, significant gains in forecasting accuracy are found both in simulations and empirical applications.
The ML approach discussed so far is typically applied to small-scale multivariate models, i.e. in situations where the number of series n is not larger than five and the sample size T rarely exceeds 200. However, it is of obvious interest to apply the common features framework to medium-large VAR models, i.e. when n ranges from 10 to 40. Indeed, Koop (2013), i.a., shows that medium-large VAR’s outperform small-scale models in forecasting the key US macroeconomic variables. However, it is clear that when n is large ML inference suffers from the curse of dimensionality. Indeed, Cubadda and Hecq (2011) show by simulations that in a VAR(1) with n = 25 variables the LR test exhibits severe size distortions even with a sample size as large as T = 600.
The main reason why ML inference performs poorly in high-dimensional systems is that the inversion of large variance-covariance matrices is required. Hence, most of the methods that have been proposed to use RRR with large n try to “regularize” these covariance matrices in different ways.
The first attempt in this direction can be dated back to Vinod (1976), who proposed the canonical ridge model. When applied to SCCF modelling, this approach would require substituting the eigenvalues and eigenvectors that are used for CCA with those of the matrix
where κ_{y} and κ_{x} are two positive scalars, ∑̂_{yy} and ∑̂_{xx} are respectively the sample variance-covariance matrices of series Y_{t} and X_{t} in
Cubadda and Hecq (2011) suggested a Partial Least Squares (PLS) approach to test and impose the SCCF restriction on a VAR even when CCA is not feasible due to a lack of degrees of freedom. PLS is a family of multivariate techniques that aim at maximizing the covariance between linear combinations of two variable sets; see, e.g., Rosipal and Krämer (2006) for a detailed survey. The idea is to consistently estimate the SCCF matrix δ as the eigenvectors associated with the s smallest eigenvalues of the matrix
where D_{yy} and D_{xx} are diagonal matrices having the diagonal elements of, respectively, ∑_{yy} and ∑_{xx}. Since PLS requires inverting diagonal matrices only, this method can provide estimates of δ (up to an identification matrix) that are less dispersed and more numerically stable than those coming from CCA when the dimension of X_{t} approaches the sample size T.
In order to consistently estimate the factor weights ψ, it is necessary to make the additional assumption that the columns of the matrix ψ_{S} are equal to (n − s) distinct eigenvectors of the matrix
where V_{ψ} is the diagonal matrix of the (n − s) eigenvalues of the matrix
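The idea of replacing full covariance matrices with their diagonals can be sketched as follows. The exact matrix used by Cubadda and Hecq (2011) is not reproduced here: the code assumes the symmetric form D_{yy}^{−1/2}∑_{yx}D_{xx}^{−1}∑_{xy}D_{yy}^{−1/2}, which is only one plausible reading, and it uses a hypothetical VAR(1) whose true cofeature vector is proportional to (1, 1, −1)′.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 20000

# Hypothetical stationary VAR(1) with one SCCF: (1, 1, -1) A = 0
A = np.array([[0.5, 0.0, 0.0],
              [0.0, 0.4, 0.1],
              [0.5, 0.4, 0.1]])
data = np.zeros((T + 1, 3))
for t in range(1, T + 1):
    data[t] = A @ data[t - 1] + rng.standard_normal(3)
Y, X = data[1:], data[:-1]

Yc, Xc = Y - Y.mean(0), X - X.mean(0)
S_yx = Yc.T @ Xc / T
d_y, d_x = Yc.var(axis=0), Xc.var(axis=0)  # diagonals of S_yy and S_xx

# Assumed PLS-type matrix: only diagonal matrices are inverted
N = S_yx / np.sqrt(d_y)[:, None] / np.sqrt(d_x)[None, :]
M = N @ N.T                                # D_yy^{-1/2} S_yx D_xx^{-1} S_xy D_yy^{-1/2}
lam, W = np.linalg.eigh(M)                 # eigenvalues in ascending order

# Map the eigenvector of the smallest eigenvalue back to the original scale
delta_hat = W[:, 0] / np.sqrt(d_y)
delta_hat /= np.linalg.norm(delta_hat)

delta_true = np.array([1.0, 1.0, -1.0]) / np.sqrt(3.0)
cos = abs(float(delta_true @ delta_hat))
print("eigenvalues:", lam)
print("|cosine| between estimated and true cofeature vector:", cos)
```

No inversion of a full covariance matrix is needed, which is the property that keeps this type of estimator usable when the dimension of X_{t} is close to T.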
In order to detect the presence of SCCF in large dimensional systems, Cubadda and Hecq (2011) proposed to replace the condition that a linear combination of variables must be orthogonal to the past, namely
with the one of absence of autocorrelation, namely
where δ = [δ_{1}, . . . , δ_{s}]. This drastically decreases the number of restrictions to be imposed under the null hypothesis, thus making a test for (an implication of) SCCF feasible even when CCA is not. Condition (
Having controlled for the overall size of the test when different values of s are involved, Cubadda and Hecq (2011) show by simulations that the proposed testing procedure has negligible size distortions and remarkable power with n = 9, 25 and small sample size as T = 50.
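The logic of testing for the absence of autocorrelation in a candidate combination δ′Y_{t} can be illustrated with a generic portmanteau statistic. The specific statistic of Cubadda and Hecq (2011) is not reproduced here; the sketch below uses the Ljung-Box statistic as a stand-in, and the two simulated series are hypothetical.

```python
import numpy as np

def ljung_box(x, max_lag):
    """Ljung-Box statistic Q = T (T + 2) * sum_h r_h^2 / (T - h), h = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = len(x)
    denom = x @ x
    q = 0.0
    for h in range(1, max_lag + 1):
        r_h = (x[h:] @ x[:-h]) / denom     # sample autocorrelation at lag h
        q += r_h ** 2 / (T - h)
    return T * (T + 2) * q

rng = np.random.default_rng(7)
T = 500
wn = rng.standard_normal(T)                # stands for a combination with no autocorrelation
ar = np.zeros(T)
for t in range(1, T):
    ar[t] = 0.8 * ar[t - 1] + wn[t]        # stands for a serially correlated series

q_wn = ljung_box(wn, 4)
q_ar = ljung_box(ar, 4)
print("Q for the white-noise series:", q_wn)
print("Q for the AR(1) series:", q_ar)
```

Under the null of no autocorrelation Q is approximately chi-squared with `max_lag` degrees of freedom, so only a handful of restrictions are tested for each combination, regardless of the number of regressors in the VAR.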
Carriero et al. (2011) proposed a Bayesian approach to estimate large VAR’s with reduced-rank restrictions. In particular, the Reduced Rank Posterior (RRP) method goes as follows. First, estimate an unrestricted VAR by implementing the Kadiyala and Karlsson (1997) version of the Minnesota prior (see, i.a., Litterman, 1986). where the estimates of elements of β in (
Carriero et al. (2011) compare the RRP with several alternatives, including a more elaborated Bayesian Reduced-Rank VAR (BRR) method, in a forecasting exercise where the reference model is a VAR(1) for n = 52 variables and a rolling window of 120 monthly observations is used for estimation. The empirical results show that BRR and RRP outperform the competitors for intermediate and long forecasting horizons, whereas univariate AR models are hard to beat for shorter horizons.
Bernardini and Cubadda (2015) proposed to regularize the estimate of the autocorrelation matrix prior to performing CCA. In particular, they suggest using, in place of the natural estimator, a proper shrinkage estimator of the covariance matrix of
where ρ ∈ [0, 1]. Then a regularized version of CCA (RCCA) can be obtained by solving the eigenvector equation
Notice that when ρ = 1 the full-rank regression case coincides with n univariate white noises, whereas when ρ = 0 one gets the usual CCA solution. Hence, RCCA can be seen as a frequentist analogue of the Kadiyala-Karlsson version of the Minnesota prior, as it shrinks the elements of β towards zero as well. In order to choose the value of ρ, Bernardini and Cubadda (2015) follow the approach of Ledoit and Wolf (2003), which requires minimizing a risk function based on the Frobenius norm of the difference between the shrinkage estimator \hat{Σ}_{ww} and the covariance matrix Σ_{ww}. In particular, they provide a feasible estimator of the optimal ρ for the case in which the data are generated by a stationary vector process. Remarkably, this estimator converges to zero at rate T, hence RCCA is asymptotically equivalent to CCA.
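The two limiting cases ρ = 0 and ρ = 1 can be checked with a minimal sketch. It rests on an assumption that should be stated clearly: the shrinkage estimator is taken to be a convex combination of the sample covariance matrix of the stacked vector w_{t} (the series followed by the regressors) and its diagonal part, which is consistent with the limiting cases described above but is not confirmed here as the exact estimator of Bernardini and Cubadda (2015). The function name `rcca_squared_correlations` is hypothetical, and ρ is supplied by the user rather than estimated.

```python
import numpy as np


def rcca_squared_correlations(Y, X, rho):
    """Squared canonical correlations between Y and X after shrinkage.

    Assumption (illustrative): the covariance matrix S of w = [y, x] is
    replaced by rho * diag(S) + (1 - rho) * S, with rho in [0, 1].
    """
    W = np.hstack([Y, X])
    W = W - W.mean(axis=0)
    S = W.T @ W / W.shape[0]                              # sample covariance of w
    S_rho = rho * np.diag(np.diag(S)) + (1.0 - rho) * S   # shrink towards the diagonal
    n = Y.shape[1]
    S_yy, S_yx, S_xx = S_rho[:n, :n], S_rho[:n, n:], S_rho[n:, n:]
    # Eigenvalues of S_yy^{-1} S_yx S_xx^{-1} S_xy are the squared canonical correlations
    M = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_yx.T)
    eig = np.linalg.eigvals(M).real
    return np.sort(np.clip(eig, 0.0, 1.0))[::-1]


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
Y = X[:, :2] @ np.array([[1.0, 0.5], [0.0, 1.0]]) + rng.standard_normal((200, 2))

cc_cca = rcca_squared_correlations(Y, X, rho=0.0)    # usual CCA solution
cc_half = rcca_squared_correlations(Y, X, rho=0.5)   # intermediate shrinkage
cc_full = rcca_squared_correlations(Y, X, rho=1.0)   # all cross-covariances set to zero
```

With ρ = 1 every cross-covariance is set to zero, so all canonical correlations vanish, while ρ = 0 reproduces the standard CCA eigenvalues.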
Bernardini and Cubadda (2015) document, both by simulations and empirical applications, that RCCA improves on traditional methods in both forecasting and the estimation of structural parameters in medium-size (n = 20 and p = 2) macroeconometric models.
The main goal of this survey was to create a “common thread” between various topics, related to each other by the idea of modelling the various forms of comovement that are typically observed in economic time series and often predicted by economic theory. From a statistical point of view, common features imply a reduction to a more parsimonious structure, such as a common factor representation: a small number of unobserved components possess a given feature and transmit it to a larger set of economic time series. As we have seen, RRR is often the solution to the inferential problem. Given the large literature on common features, we focused the discussion on common cyclical features. However, we have tried to take into account recent developments, such as the implications of common features for univariate time series models and the statistical issues that arise when the number of variables is fairly large.
The main drawback of the methods discussed in this paper is that different features are evaluated separately; one important exception is the unifying framework for analyzing common cyclical features that we discussed in Subsection 2.2. Hence, a major goal for future research is to develop estimation and testing procedures that allow for the joint identification of several forms of common features using an integrated statistical approach.
Another challenge for future research is the extension of common cycle analysis to nonstationary medium-large dimensional systems. Indeed, all the methods presented in Subsection 4.2 require that the series are individually made stationary prior to the multivariate analysis. This is clearly a methodological limitation, since possible cointegration relations are ignored. Hence, a statistical framework that allows for the simultaneous analysis of common trends and cycles when the number of time series is not small is still missing.
Maximum ARIMA orders of univariate series generated by an n-dimensional VAR(p) with s cofeature restrictions
Model | AR order | I(d) | MA order |
---|---|---|---|
I(0) VAR | np | 0 | (n − 1)p |
SCCF | (n − s)p | 0 | (n − s)p |
PSCCF | (n − s)p + s | 0 | (n − s)p + (s − 1) |
CI(1, 1) VAR | n(p − 1) + r | 1 | (n − 1)(p − 1) + r |
SCCF | (n − s)(p − 1) + r | 1 | (n − s)(p − 1) + r |
PSCCF | (n − s)(p − 1) + r + s | 1 | (n − s)(p − 1) + r + s − 1 |
WF | (n − s)(p − 1) + r | 1 | (n − s)(p − 1) + r |
SCCF = serial correlation common feature; PSCCF = polynomial SCCF; WF = weak form.
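The orders in this table can be evaluated for specific values of n, p, s and r with a small helper. The sketch below simply encodes the rows of the table; the function name `max_arima_orders` and the `cointegrated` flag are hypothetical conveniences, not notation from the literature.

```python
def max_arima_orders(model, n, p, s=0, r=0, cointegrated=False):
    """Maximum (AR order, d, MA order) of the implied univariate ARIMA models.

    model: "VAR", "SCCF" or "PSCCF" (plus "WF" in the cointegrated case).
    n: number of series, p: VAR order, s: number of cofeature vectors,
    r: cointegrating rank (used only when cointegrated=True).
    """
    if cointegrated:
        k = (n - s) * (p - 1) + r
        table = {
            "VAR": (n * (p - 1) + r, 1, (n - 1) * (p - 1) + r),
            "SCCF": (k, 1, k),
            "PSCCF": (k + s, 1, k + s - 1),
            "WF": (k, 1, k),
        }
    else:
        k = (n - s) * p
        table = {
            "VAR": (n * p, 0, (n - 1) * p),
            "SCCF": (k, 0, k),
            "PSCCF": (k + s, 0, k + s - 1),
        }
    if model not in table:
        raise ValueError(f"unknown model {model!r} for this case")
    return table[model]


if __name__ == "__main__":
    for name in ("VAR", "SCCF", "PSCCF"):
        print(name, max_arima_orders(name, n=3, p=2, s=1))
```

For instance, with n = 3, p = 2 and s = 1 in the stationary case, the maximum orders fall from ARMA(6, 4) for the unrestricted VAR to ARMA(4, 4) under SCCF.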
Canonical correlations and CCFs
Model | ||
---|---|---|
SCCF | ||
WF | ||
PSCCF | ||
WFP |
SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.
Tests for common features
Model | Degrees of freedom d |
---|---|
SCCF | s × (n(p − 2) + r + s) |
WF | s × (n(p − 2) + s) |
PSCCF | s × (n(p − 3) + r + s) |
WFP | s × (n(p − 3) + s) |
SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.
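As a worked example, the degrees of freedom in this table can be computed directly for given values of n, p, r and s. The helper below only encodes the four rows of the table; its name, `cofeature_test_df`, is hypothetical.

```python
def cofeature_test_df(model, n, p, s, r=0):
    """Degrees of freedom d for each common feature test, as tabulated above.

    n: number of series, p: VAR order, s: number of cofeature vectors,
    r: cointegrating rank (enters only the SCCF and PSCCF rows).
    """
    table = {
        "SCCF": s * (n * (p - 2) + r + s),
        "WF": s * (n * (p - 2) + s),
        "PSCCF": s * (n * (p - 3) + r + s),
        "WFP": s * (n * (p - 3) + s),
    }
    if model not in table:
        raise ValueError(f"unknown model {model!r}")
    return table[model]


if __name__ == "__main__":
    for name in ("SCCF", "WF", "PSCCF", "WFP"):
        print(name, cofeature_test_df(name, n=4, p=3, s=2, r=1))
```

With n = 4, p = 3, r = 1 and s = 2, this gives d = 14 for SCCF, 12 for WF, 6 for PSCCF and 4 for WFP.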
Estimators of the common features vectors and RRR coefficients
Model | (\hat{η}_{1}, . . . , \hat{η}_{s}) | (\hat{υ}_{s+1}, . . . , \hat{υ}_{n}) |
---|---|---|
SCCF | \hat{δ}_{S} | \hat{ψ}_{S} |
WF | \hat{δ}_{W} | \hat{ψ}_{W} |
PSCCF | \hat{δ}_{P} | \hat{ψ}_{P} |
WFP | \hat{δ}_{F} | \hat{ψ}_{F} |
SCCF = serial correlation common feature; WF = weak form; PSCCF = polynomial SCCF; WFP = weak form of PSCCF.