
Value at Risk of portfolios using copulas

Kiwoong Byun^a, Seongjoo Song^{1,b}

^a Department of Business and Administration, Korea University, Korea;
^b Department of Statistics, Korea University, Korea
Correspondence to: ^1 Department of Statistics, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Korea.
E-mail: sjsong@korea.ac.kr

This research was supported by a 2020 Korea University Grant (K2009541, S. Song).
Received August 29, 2020; Revised October 23, 2020; Accepted October 23, 2020.
Abstract
Value at Risk (VaR) is one of the most common risk management tools in finance. Since a portfolio of several assets, rather than one asset portfolio, is advantageous in the risk diversification for investment, VaR for a portfolio of two or more assets is often used. In such cases, multivariate distributions of asset returns are considered to calculate VaR of the corresponding portfolio. Copulas are one way of generating a multivariate distribution by identifying the dependence structure of asset returns while allowing many different marginal distributions. However, they are used mainly for bivariate distributions and are not widely used in modeling joint distributions for many variables in finance. In this study, we would like to examine the performance of various copulas for high dimensional data and several different dependence structures. This paper compares copulas such as elliptical, vine, and hierarchical copulas in computing the VaR of portfolios to find appropriate copula functions in various dependence structures among asset return distributions. In the simulation studies under various dependence structures and real data analysis, the hierarchical Clayton copula shows the best performance in the VaR calculation using four assets. For marginal distributions of single asset returns, normal inverse Gaussian distribution was used to model asset return distributions, which are generally high-peaked and heavy-tailed.
Keywords : Value at Risk, portfolio returns, normal inverse Gaussian distribution, copula, vine copula, hierarchical copula
1. Introduction

According to Jorion (2007), Value at Risk (VaR) is defined as the worst loss over a target horizon that will not be exceeded with a given level of confidence. VaR has been widely used for financial risk management since it is simple and intuitive in the sense that it summarizes the change in the value of a portfolio into a single number. Calculating VaR of a portfolio of assets is important in finance because it has been widely accepted that the investment in a portfolio of several assets has advantages over the investment in a single asset since Markowitz (1952). Different assets have different return distributions, and they are correlated in various ways. The key element of calculating VaR is to capture the relationship among assets in the portfolio appropriately.

In order to capture the relationship among assets and calculate VaR of the portfolio, multivariate distributions can be used. We may want to work with a univariate distribution from the history of returns from the same portfolio because the return of the portfolio itself is univariate. However, using a multivariate distribution for a collection of assets that make up the portfolio will give us more flexibility; for example, when we change the weight of the assets in a portfolio, we can handle the change more easily. Therefore, we would like to model the joint distribution of the asset returns with a proper multivariate distribution in this paper, especially with copulas.

Copulas are popular in modeling a joint distribution of several asset returns in finance. With copulas, we can construct multivariate distributions with different marginal distributions by separating the dependencies from marginal distributions. Also, there are many different copulas that can incorporate the proper dependence structure of the data. Embrechts et al. (2002) show that copulas are useful to identify the dependence structure among returns of assets. In this paper, we will use copulas to identify the dependence structure and to generate a multivariate distribution for returns of assets in a portfolio. The VaR of the portfolio will be computed using the resulting multivariate distribution.

Elliptical copulas include Gaussian (normal) copula and Student’s t copula, and they are commonly used in the field of finance. Archimedean copulas include Clayton, Gumbel, and Frank copulas, and they have a generator function that is useful in expressing various dependence structures. Wu et al. (2007) use exchangeable Archimedean copulas to calculate VaR. When there are more than two assets in a portfolio, elliptical and exchangeable Archimedean copulas are inadequate since they do not allow different dependence patterns between pairs of variables. In order to represent different dependence structures between pairs of variables, vine copulas or hierarchical copulas can be considered. Many studies deal with these copulas in the finance field. Low et al. (2013) use canonical vine (C-vine) copula, Kraus and Czado (2017) use D-vine copula, and Okhrin and Tetereva (2017) use hierarchical copula to estimate VaR.

In this paper, we would like to search for the best copula in calculating the VaR of a portfolio of many assets. The critical point of this paper is to find out a good copula for portfolios with various dependence structures among assets. The dependence structure can be quite complicated in high dimensional data; therefore, we conjecture that vine copulas and hierarchical copulas perform better than single copulas. By comparing VaRs, we suggest a copula that performs better than other copulas in the simulation studies and real data applications. This study is to choose a proper copula function to reflect various dependence structures in a portfolio, and the performance is compared by VaR of the portfolio.

As for the marginal distributions, we can use different distributions for each asset in a portfolio. It has been known that the normal distribution is inappropriate to model the return distribution of financial assets. The return distributions of financial assets are slightly skewed and fat-tailed. There has been extensive research on finding good alternatives to the normal distribution in literature. For example, Venkataraman (1997) uses a quasi-Bayesian maximum likelihood estimation procedure, and Hull and White (1998) use a transform to multivariate normal distributions, which is updating schemes such as GARCH. Also, Eberlein and Keller (1995) use Hyperbolic distribution, and Madan et al. (1998) use variance gamma (VG) distribution. Barndorff-Nielsen (1997) and Mabitsela et al. (2015) use normal inverse Gaussian (NIG) distribution. In this paper, we use NIG distribution as marginal distributions. NIG distribution is known to have a better return profile than VG distribution (Eriksson et al., 2009; Göncü and Yang, 2016), and the calculation of VaR using NIG distribution is also better than other models such as Gaussian GARCH or VG (Wilhelmsson, 2009; Kim and Song, 2011; Dorić and Dorić, 2011).

The remainder of the paper is organized as follows. Section 2 explains the model, VaR, NIG distribution, and copulas. Section 3 and Section 4 show the simulation results and real data applications, respectively. Finally, Section 5 concludes the paper.

2. Models and background

### 2.1. Models

Suppose we consider a portfolio of p assets. Let $x_j = (x_{1j}, x_{2j}, \ldots, x_{nj})$ be the n daily log returns of the jth asset, where j = 1, 2, . . . , p, and let $w = (w_1, w_2, \ldots, w_p)$ be the weights of the assets in the portfolio. Then the daily portfolio return is $R_i=\sum_{j=1}^{p} w_j x_{ij}$, where i = 1, 2, . . . , n. We are interested in finding a good joint distribution of daily log returns of p assets to compute a good estimate of VaR of the daily portfolio returns. In this paper, we would like to use copulas to capture the dependence structure of returns of different assets. When we use copulas for the joint distribution of returns of different assets, we can use any marginal distribution for the return of each asset. Since we are mainly interested in finding a good copula function for the joint distribution, we will use the same family of marginal distributions for each asset. The marginal distribution of $x_j$ is fitted by the normal inverse Gaussian (NIG) distribution, which is explained in Section 2.3. The NIG distribution is commonly used to model the return distribution of financial assets due to its capability of incorporating high kurtosis and nonzero skewness.
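As a small illustration of the portfolio return formula above, the following sketch computes the weighted sum of asset log returns for each day. The function name and the toy numbers are hypothetical and are not taken from the paper.

```python
def portfolio_returns(returns, weights):
    """Return R_i = sum_j w_j * x_ij for each day i.

    returns: list of rows, one row of p asset log returns per day.
    weights: list of p portfolio weights.
    """
    p = len(weights)
    for row in returns:
        if len(row) != p:
            raise ValueError("each row needs one return per asset")
    return [sum(w * x for w, x in zip(weights, row)) for row in returns]


# Toy data (hypothetical numbers): 3 days, 4 assets, equal weights.
toy_returns = [
    [0.01, -0.02, 0.00, 0.03],
    [-0.01, 0.01, 0.02, -0.02],
    [0.04, 0.00, -0.04, 0.00],
]
equal_weights = [0.25] * 4
toy_portfolio = portfolio_returns(toy_returns, equal_weights)
```

Changing the portfolio only requires passing a different weight vector; the asset-level returns stay untouched.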

### 2.2. Value at Risk

Value at Risk (VaR) is a common measure of financial risk, especially the market risk, indicating the worst loss over a target horizon that will not be exceeded with a given level of confidence. Although various levels and target periods can be set, 1% and 5% probabilities and one day and ten day horizons are common. When a level of 1 − α is given, the VaR of the portfolio is defined as:

$P(\text{Loss of the portfolio} \le \mathrm{VaR}) = 1-\alpha.$

Therefore, if we know the loss distribution of the given portfolio, VaR is easily obtained as the 100(1 − α)th percentile of that distribution. However, the loss distribution is generally not known, so the percentile has to be estimated, and many different methods exist for doing so. Here, we will find the loss distribution by fitting the joint distribution of the asset returns using copulas and the NIG distribution.
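When the loss distribution is represented by a sample (observed or simulated), the percentile in the definition can be approximated by an order statistic. The sketch below uses a nearest-rank convention, which is an assumption made here; the paper does not state which quantile estimator it uses, and the function name is hypothetical.

```python
import math


def empirical_var(losses, level):
    """VaR at confidence `level` as an empirical percentile of the losses.

    Nearest-rank rule: the smallest order statistic v such that the share
    of losses less than or equal to v is at least `level`.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    ordered = sorted(losses)
    # Small tolerance guards against floating-point noise in level * n.
    rank = math.ceil(level * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]
```

For example, with 100 losses and a level of 0.95, the function returns the 95th smallest loss.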

### 2.3. Normal inverse Gaussian distribution

The normal inverse Gaussian (NIG) distribution, introduced by Barndorff-Nielsen (1997), is commonly used for return distributions of financial assets; see, for instance, Aas et al. (2005), Kalemanova et al. (2007), and Mabitsela et al. (2015). When X follows the NIG distribution with parameters α, β, δ, and μ, X|Z = z follows N(μ + βz, z), where Z is a random variable that follows the inverse Gaussian distribution with parameters δ and $\sqrt{\alpha^2-\beta^2}$. The probability density function of the NIG distribution is given by

$f_{NIG}(x)=\frac{\alpha\delta K_1\left(\alpha\sqrt{\delta^2+(x-\mu)^2}\right)}{\pi\sqrt{\delta^2+(x-\mu)^2}}\,e^{\delta\sqrt{\alpha^2-\beta^2}+\beta(x-\mu)},$

where $K_\lambda$ denotes a modified Bessel function of the third kind with index λ, and its characteristic function is

$\varphi_{NIG}(z)=\exp\left(i\mu z-\delta\left(\sqrt{\alpha^2-(\beta+iz)^2}-\sqrt{\alpha^2-\beta^2}\right)\right).$

For more details, refer to Schoutens (2003) or Barndorff-Nielsen et al. (2001).
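The mixture representation above suggests a simple way to simulate NIG variates: draw Z from the inverse Gaussian distribution and then draw X from N(μ + βz, z). The following is a minimal sketch under two assumptions that are not stated in the paper: the inverse Gaussian law with parameters δ and γ = √(α² − β²) is taken to have mean δ/γ and shape δ², and Z is drawn with the Michael, Schucany and Haas transformation method. The function names are hypothetical.

```python
import math
import random


def sample_inverse_gaussian(rng, mean, shape):
    """One draw from the inverse Gaussian law with the given mean and shape."""
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4.0 * mean * shape * y + (mean * y) ** 2)
    x = mean + mean * mean * y / (2.0 * shape) - mean / (2.0 * shape) * root
    # Accept x with probability mean / (mean + x), otherwise use mean^2 / x.
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def sample_nig(rng, alpha, beta, delta, mu):
    """One NIG(alpha, beta, delta, mu) draw via X | Z = z ~ N(mu + beta*z, z)."""
    gamma = math.sqrt(alpha * alpha - beta * beta)
    z = sample_inverse_gaussian(rng, delta / gamma, delta * delta)
    return mu + beta * z + math.sqrt(z) * rng.gauss(0.0, 1.0)
```

Under this parametrization the mean of X is μ + δβ/γ and the variance is δα²/γ³, which gives a quick sanity check for simulated samples.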

### 2.4. Copula

A copula is a function that combines several marginal distributions to form a joint distribution. The mathematical definition was initially introduced in Sklar (1959). Sklar’s Theorem says that any p-dimensional distribution function F with marginal distribution functions, F1, . . . , Fp can be written as

$F(x_1,\ldots,x_p)=C(F_1(x_1),\ldots,F_p(x_p)),$

for some copula C, which is uniquely determined on $[0, 1]^p$ for the distribution F with absolutely continuous marginals.

The copula is useful to estimate parameters when the dimension is large since the parameters of a multivariate distribution are estimated separately by each marginal distribution and a copula function. Using copulas, we can construct multivariate distributions with different marginal distributions. There are several types of copulas, such as elliptical copulas and Archimedean copulas. Readers may refer to Trivedi and Zimmer (2007) or Cherubini et al. (2011).

2.4.1. Elliptical copula

The most commonly used elliptical copulas are the Gaussian (normal) copula and Student’s t copula. An elliptical copula is constructed by the inversion of Sklar’s theorem applied to elliptical distributions such as the normal distribution and the t distribution. The Gaussian copula is constructed using the cumulative distribution function (CDF) of normal distributions as follows.

$C_R^{Norm}(u_1,\ldots,u_p)=\Phi_R(\Phi^{-1}(u_1),\ldots,\Phi^{-1}(u_p)),$

where $u_1,\ldots,u_p$ are standard uniform variates related to the original variables through the inverse transformation $x_1=F_1^{-1}(u_1),\ldots,x_p=F_p^{-1}(u_p)$, $\Phi^{-1}$ is the inverse of the CDF of the standard normal distribution, and $\Phi_R$ is the CDF of a p-dimensional normal distribution with zero mean vector and covariance matrix equal to the correlation matrix R.

Similarly, t copula can be written as

$C_{\nu,R}^{T}(u_1,\ldots,u_p)=t_{\nu,R}(t_\nu^{-1}(u_1),\ldots,t_\nu^{-1}(u_p)),$

where $t_{\nu,R}$ is the CDF of a multivariate t distribution with ν degrees of freedom and correlation matrix R, and $t_\nu^{-1}$ is the inverse of the CDF of a univariate t distribution with ν degrees of freedom. When ν is small, the tail mass of the t copula increases, and when ν is large, the t copula is close to the Gaussian copula (Kole et al., 2007).

Elliptical copulas are commonly used in practice since their concept is simple and easy to understand.
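To make the construction concrete, here is a minimal sketch that samples from a bivariate Gaussian copula and then maps the uniforms to returns through inverse marginal CDFs. The function names are hypothetical, and the marginals in the usage note are normal stand-ins rather than the NIG marginals used in the paper, because the Python standard library has no NIG quantile function.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula_2d(rng, rho, n):
    """Draw n pairs (u1, u2) from the bivariate Gaussian copula with correlation rho."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second coordinate has correlation rho with the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def apply_marginals(pairs, marginals):
    """Map uniforms to returns via x_j = F_j^{-1}(u_j) for each marginal F_j."""
    return [tuple(m.inv_cdf(u) for m, u in zip(marginals, pair)) for pair in pairs]
```

A call such as `apply_marginals(sample_gaussian_copula_2d(random.Random(1), 0.7, 1000), [NormalDist(0, 2.11), NormalDist(0, 2.24)])` would give dependent returns with the chosen marginals; any object with an `inv_cdf` method could replace `NormalDist`.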

2.4.2. Archimedean copula

Archimedean copulas are defined by a strictly decreasing convex generator function φ as follows.

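$C(u_1,\ldots,u_p)=\varphi^{[-1]}(\varphi(u_1)+\cdots+\varphi(u_p)),$

where $\varphi : [0, 1] \to [0,\infty)$ is the generator function, which has a continuous derivative on (0, 1), satisfies $\varphi(1) = 0$, and is convex and decreasing for all 0 < t < 1. $\varphi^{[-1]}$ is the pseudo-inverse of the generator, defined as

$\varphi^{[-1]}(t)=\begin{cases}\varphi^{-1}(t), & 0\le t\le\varphi(0),\\ 0, & \varphi(0)\le t<\infty.\end{cases}$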
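The definition can be illustrated with the Clayton family. The sketch below assumes the commonly used Clayton generator φ(t) = (t^(−θ) − 1)/θ with θ > 0, for which φ(0) = ∞, so the pseudo-inverse coincides with the ordinary inverse. The function names are hypothetical and inputs are assumed to lie in (0, 1].

```python
def clayton_generator(t, theta):
    """Clayton generator phi(t) = (t**(-theta) - 1) / theta for theta > 0."""
    return (t ** (-theta) - 1.0) / theta


def clayton_generator_inverse(s, theta):
    """Inverse generator phi^{-1}(s) = (1 + theta * s) ** (-1 / theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)


def archimedean_clayton_cdf(us, theta):
    """C(u_1, ..., u_p) = phi^{-1}(phi(u_1) + ... + phi(u_p)) for the Clayton family."""
    total = sum(clayton_generator(u, theta) for u in us)
    return clayton_generator_inverse(total, theta)
```

Setting all but one argument to 1 returns the remaining argument, as required of a copula with uniform marginals.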

Archimedean copulas are useful because they can reflect many dependence structures that are non-linear and non-symmetric. Under such non-linear relationships, Kendall’s tau and Spearman’s rho are usually used as measures of dependence rather than the Pearson correlation coefficient. The relationship between the copula and Kendall’s tau or Spearman’s rho in bivariate cases is given below. Kendall’s tau can be written as

$\rho_\tau(X,Y)=4\int_0^1\int_0^1 C(u_1,u_2)\,dC(u_1,u_2)-1$

and Spearman’s rho can be written as

$\rho_S(X,Y)=12\int_0^1\int_0^1\{C(u_1,u_2)-u_1u_2\}\,du_1\,du_2.$
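As a numerical check of the link between a copula parameter and Kendall’s tau, the sketch below simulates a bivariate Clayton copula by conditional inversion and computes the empirical Kendall’s tau. It relies on the standard closed form τ = θ/(θ + 2) for the Clayton copula, which is a known result but is not derived in the paper; the function names are hypothetical.

```python
import random


def sample_clayton_2d(rng, theta, n):
    """Draw n pairs from the bivariate Clayton copula (theta > 0) by conditional inversion."""
    pairs = []
    for _ in range(n):
        u1 = 1.0 - rng.random()  # uniform on (0, 1]
        w = 1.0 - rng.random()
        inner = u1 ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0
        pairs.append((u1, inner ** (-1.0 / theta)))
    return pairs


def kendall_tau(pairs):
    """Empirical Kendall's tau: (concordant - discordant) / number of pairs, ties ignored."""
    n = len(pairs)
    score = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            prod = (xi - xj) * (yi - yj)
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)
```

With θ = 2, the empirical value from a moderately large sample should be close to 2/(2 + 2) = 0.5.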

This paper considers the Clayton, Gumbel, and Frank copulas among the many Archimedean copulas. Their generators and copula functions are given in Table 1.

The Archimedean copula is popular because it can model the dependence structure in any dimension with one parameter. However, a single parameter limits how well it can reflect the actual dependence structure in high dimensions. To resolve this problem, several methods have been developed, such as the vine copula and the hierarchical copula, which use two or more parameters to model the copula function. Figure 1 shows the various methods of using copulas.

2.4.3. Vine copula

In order to mitigate the limitations of using a single parameter for reflecting dependence structure, vine copula was proposed by Aas et al. (2009). In the vine copula, paired copula constructions decompose a multivariate distribution into bivariate distributions, and variables are connected in a tree structure that expresses inter-dependence among variables flexibly. The key point of the vine copula is that the multivariate density function can be factorized using conditional density function (Cherubini et al., 2011). Vine based models have an advantage that allows different copula functions for different pairs of variables. As an example, let us consider a joint distribution of 4 variables. Using Sklar’s theorem several times, the multivariate distribution can be written as

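$\begin{aligned} f(x_1,x_2,x_3,x_4) ={}& f_1(x_1)f_2(x_2)f_3(x_3)f_4(x_4) \\ &\times c_{12}(F_1(x_1),F_2(x_2))\,c_{13}(F_1(x_1),F_3(x_3))\,c_{14}(F_1(x_1),F_4(x_4)) \\ &\times c_{23|1}(F_{2|1}(x_2|x_1),F_{3|1}(x_3|x_1))\,c_{24|1}(F_{2|1}(x_2|x_1),F_{4|1}(x_4|x_1)) \\ &\times c_{34|12}(F_{3|12}(x_3|x_1,x_2),F_{4|12}(x_4|x_1,x_2)) \end{aligned}$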

or, alternatively, as

$\begin{aligned} f(x_1,x_2,x_3,x_4) ={}& f_1(x_1)f_2(x_2)f_3(x_3)f_4(x_4) \\ &\times c_{12}(F_1(x_1),F_2(x_2))\,c_{23}(F_2(x_2),F_3(x_3))\,c_{34}(F_3(x_3),F_4(x_4)) \\ &\times c_{13|2}(F_{1|2}(x_1|x_2),F_{3|2}(x_3|x_2))\,c_{24|3}(F_{2|3}(x_2|x_3),F_{4|3}(x_4|x_3)) \\ &\times c_{14|23}(F_{1|23}(x_1|x_2,x_3),F_{4|23}(x_4|x_2,x_3)), \end{aligned}$

where the $f_i$ are marginal densities, $F_{i|j}$ is the conditional distribution of variable i given variable j, $c_{ij}$ is the pair-copula density function for the pair $F_i(x_i)$, $F_j(x_j)$, and $c_{ij|k}$ is the pair-copula density function applied to $F_{i|k}(x_i|x_k)$, $F_{j|k}(x_j|x_k)$ (Aas et al., 2009; Cherubini et al., 2011).

C-vine copula and D-vine copula are different classes of vine copulas. A C-vine copula is defined as

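$\begin{aligned} C(u_1,u_2,u_3,u_4) ={}& C_{12}(u_1,u_2)\,C_{13}(u_1,u_3)\,C_{14}(u_1,u_4) \\ &\times C_{23|1}(u_{2|1},u_{3|1})\,C_{24|1}(u_{2|1},u_{4|1}) \\ &\times C_{34|12}(u_{3|12},u_{4|12}) \end{aligned}$

and a D-vine copula is defined as

$\begin{aligned} C(u_1,u_2,u_3,u_4) ={}& C_{12}(u_1,u_2)\,C_{23}(u_2,u_3)\,C_{34}(u_3,u_4) \\ &\times C_{13|2}(u_{1|2},u_{3|2})\,C_{24|3}(u_{2|3},u_{4|3}) \\ &\times C_{14|23}(u_{1|23},u_{4|23}). \end{aligned}$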

The number of parameters to reflect the dependence structure is p(p − 1)/2, where p is the number of variables. This number equals the number of all cases that can be created by combining two variables. Note that vine copula is affected by the order of combining marginal distributions and the families of copulas being used in each pair.
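The pair structure of the two vine classes and the count p(p − 1)/2 can be made explicit with a short enumeration. The sketch below assumes the variable ordering 1, 2, . . . , p used in the four-variable example above and one pair-copula per listed pair; the function names are hypothetical.

```python
def c_vine_pairs(p):
    """Pair-copulas of a C-vine on variables 1..p with root order 1, 2, ..., p-1.

    Each entry is (i, j, conditioning_variables).
    """
    pairs = []
    for root in range(1, p):
        conditioning = tuple(range(1, root))
        for j in range(root + 1, p + 1):
            pairs.append((root, j, conditioning))
    return pairs


def d_vine_pairs(p):
    """Pair-copulas of a D-vine on variables 1..p ordered along the path 1-2-...-p."""
    pairs = []
    for gap in range(1, p):
        for i in range(1, p - gap + 1):
            j = i + gap
            pairs.append((i, j, tuple(range(i + 1, j))))
    return pairs
```

For p = 4, both functions return six entries, matching the six pair-copulas in each of the decompositions above; reordering the variables changes which pairs are modeled directly and which only conditionally.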

2.4.4. Hierarchical copula

Hierarchical copula is another way to ease the restriction that uses a single parameter to reflect dependencies. It is constructed using the composition of the generator functions. Again as an example, we consider a 4-dimensional multivariate distribution. Then the hierarchical copula (or fully nested Archimedean copula) is defined as (Okhrin and Ristig, 2014)

$C(u_1,u_2,u_3,u_4)=C_3(C_2(C_1(u_1,u_2),u_3),u_4)=\varphi_3^{-1}\left(\varphi_3\circ\varphi_2^{-1}\left(\varphi_2\circ\varphi_1^{-1}\left(\varphi_1(u_1)+\varphi_1(u_2)\right)+\varphi_2(u_3)\right)+\varphi_3(u_4)\right),$

where $C_i$ is a bivariate copula, $\varphi_i$ is its generator function for i = 1, 2, 3, and $\circ$ is the composition operator.

The number of parameters in a hierarchical copula model with p variables is p − 1, which is smaller than the p(p − 1)/2 parameters of a vine copula model whenever p > 2.

There is a caveat for the hierarchical copula defined above. Let θ1, θ2, and θ3 be the parameters of C1, C2, and C3, respectively. They should satisfy the condition θ1 > θ2 > θ3 (Joe, 1997; Cherubini et al., 2011). This implies that variables in the same group have a higher level of dependence than variables in different groups. Therefore, it is reasonable to group the most highly dependent variables first.
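The nesting can be written as a simple loop over bivariate copulas. The sketch below evaluates a fully nested Clayton copula and enforces the ordering of the parameters; it assumes the standard bivariate Clayton form C(u, v) = (u^(−θ) + v^(−θ) − 1)^(−1/θ) with θ > 0 and inputs in (0, 1]. The function names are hypothetical.

```python
def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def nested_clayton_cdf(u, thetas):
    """Fully nested copula C_3(C_2(C_1(u1, u2), u3), u4) with thetas = (theta1, theta2, theta3).

    The innermost pair gets theta1, and the parameters must be strictly decreasing.
    """
    if len(thetas) != len(u) - 1:
        raise ValueError("need one parameter per nesting level")
    if any(a <= b for a, b in zip(thetas, thetas[1:])):
        raise ValueError("parameters must satisfy theta1 > theta2 > ...")
    value = u[0]
    for next_u, theta in zip(u[1:], thetas):
        value = clayton_cdf(value, next_u, theta)
    return value
```

Because the first pair enters at the innermost level with the largest parameter, the two most strongly dependent assets would be placed in positions 1 and 2.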

3. Simulation studies

In this section, we compare the performance of different copulas in calculating VaR through simulation experiments. We consider different combinations of correlation coefficients among log returns of assets to represent different dependence structures.

### 3.1. Settings

Random samples of four variables are generated, which represent daily log returns on four different assets. Since the return distributions of financial assets are commonly assumed to have heavier tails than the normal distribution, we generate random samples from a multivariate NIG distribution. Many research papers including Bølviken and Benth (2000) and Godin et al. (2012) use NIG distribution for modeling return distributions of financial assets. We generate random samples from a multivariate NIG distribution for 1,250 daily log returns. Table 2 shows the combinations of correlation coefficients used in simulation studies. The column under “Assets (i, j)” has Pearson correlation coefficients between ith and jth assets in three different cases.

In Case 1, the correlation coefficients of all pairs of assets are set to be similar. In Case 2, we suggest a hierarchical setup under which correlation coefficients of three pairs of assets are large, two pairs are medium, and one pair is relatively small. In Case 3, the correlation coefficients of all assets are random.

Weighted returns are calculated in order to make portfolio returns of four assets. The weights for each asset can have a variety of values, but they are set to be the same in this section. Although we fix the weights in the simulation studies and the real data applications, we can change them easily to construct different portfolios. To figure out the portfolio return distribution, we may fit the univariate distribution to the computed daily portfolio returns. One of the advantages of modeling the joint distribution of asset returns before computing the portfolio return is that changing the weights in the portfolio is easier.

The vine copula is affected by the order of combining marginal distributions and the families of the copula function. In this section, we order the marginal distributions according to the size of the correlation coefficients and choose, for each pair, the copula function that minimizes the Akaike information criterion (AIC).

All parameters are estimated by the maximum likelihood estimation (MLE), and the levels for VaR are set from 95% to 99.5% by 0.5%.

### 3.2. Description of the simulation

In order to find out the best model to calculate VaR, we go through the following steps in each case. The true underlying distribution is assumed to be a multivariate NIG distribution.

• Step 1: 10,000 random vectors of (x1,1, . . . , x1,4), . . . , (x10000,1, . . . , x10000,4) are generated from a 4-dimensional multivariate NIG distribution with (ᾱ, μ, ∑, γ) = (1, 0, ∑, 0). These vectors are considered as vectors of daily log returns of four different assets. In this setting, ∑ is the variance-covariance matrix which is calculated by correlation coefficients in Table 2 and standard deviation σ = (2.11, 2.24, 1.39, 1.85). There are several alternative parametrizations for multivariate NIG distribution (Weibel et al., 2020).

• Step 2: For each vector obtained in Step 1, we calculate the weighted average $R_i=\sum_{j=1}^{4} w_j x_{ij}$, where i = 1, 2, . . . , 10000, to be used as the portfolio return. Although we can use any weights, we assume that the weights are all equal to 1/4 in the simulation (i.e., $w_j \equiv 1/4$, j = 1, . . . , 4). Using the calculated portfolio returns, we calculate the VaR of a given level, and this VaR is treated as the true VaR.

• Step 3: We estimate parameters of a copula model with NIG marginal distributions using the 10,000 random vectors generated in the same way as Step 1. The 4-dimensional multivariate NIG distribution in Step 1 serves as a true underlying distribution, and the copula model fitted in this step serves as a fitted distribution. Elliptical, vine, and hierarchical copulas are used.

• Step 4: We create 10,000 random samples from the fitted copula model in Step 3. Then, we calculate the portfolio return with equal weights and the estimated VaR of a given level.

• Step 5: Repeat the above steps 1,000 times to obtain the mean squared error (MSE) of VaR.
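The five steps above can be sketched as a generic Monte Carlo loop in which the true model and the fitted model are passed in as functions. This is only an outline under several assumptions: losses are taken as negative portfolio returns (the convention of Section 3.3), VaR is a nearest-rank empirical quantile, and the two toy plug-ins use independent normal returns instead of the multivariate NIG distribution and the copula models of the paper. All names are hypothetical.

```python
import math
import random


def empirical_quantile(values, level):
    """Nearest-rank empirical quantile."""
    ordered = sorted(values)
    rank = math.ceil(level * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]


def var_mse(sample_true, fit_and_sample, weights, level, n_draws, n_rep, rng):
    """Monte Carlo MSE of the estimated VaR against the 'true' VaR.

    sample_true(rng, n)           -> n return vectors from the true model
    fit_and_sample(data, rng, n)  -> n return vectors from a model fitted to data
    """
    def portfolio_losses(rows):
        return [-sum(w * x for w, x in zip(weights, row)) for row in rows]

    squared_errors = []
    for _ in range(n_rep):                                   # Step 5: repetitions
        truth = sample_true(rng, n_draws)                    # Step 1
        true_var = empirical_quantile(portfolio_losses(truth), level)  # Step 2
        training = sample_true(rng, n_draws)                 # Step 3: data used for fitting
        simulated = fit_and_sample(training, rng, n_draws)   # Steps 3 and 4
        est_var = empirical_quantile(portfolio_losses(simulated), level)
        squared_errors.append((est_var - true_var) ** 2)
    return sum(squared_errors) / n_rep


def toy_true_sampler(rng, n):
    """Stand-in true model: independent normal returns (not multivariate NIG)."""
    sigmas = (2.11, 2.24, 1.39, 1.85)
    return [[rng.gauss(0.0, s) for s in sigmas] for _ in range(n)]


def toy_fit_and_sample(data, rng, n):
    """Stand-in fitted model: independent normals with estimated means and deviations."""
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1)) for c, m in zip(cols, means)]
    return [[rng.gauss(m, s) for m, s in zip(means, sds)] for _ in range(n)]
```

Replacing the two toy functions with a multivariate NIG sampler and a copula fitting routine would reproduce the structure of the experiment; the loop itself does not depend on the model family.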

The results for each case given in Table 2 are as follows. In Tables 3-5, the models are grouped as Elliptical, Vine, and Hierarchical; in Step 3 above, we used the Gaussian copula and the t copula as elliptical copulas, the C-vine and D-vine copulas as vine copulas, and the hierarchical Gumbel, hierarchical Clayton, and hierarchical Frank copulas as hierarchical copulas. Here, for example, hierarchical Gumbel means that we use the Gumbel copula for every pair in the hierarchical copula model. The Archimedean copulas with a single parameter reflecting the dependence structure were not included because they had similar results to the hierarchical copulas and were not as good as the hierarchical copulas.

First, in the case of elliptical copulas, the Gaussian copula produces desirable results in Case 1, but its accuracy drops in Case 2 and Case 3 (Figures 2-4). In contrast to the Gaussian copula, the t copula produces good results in all cases. Although they express the degree of dependence with a single parameter, they look good in terms of MSE. Widely used in finance, the t copula does especially well.

Second, vine copulas show very accurate results in all cases. It seems that vine copulas can capture different dependence structures because their number of parameters equals the number of pairs of variables. However, this does not directly imply that vine copulas will show good performance in prediction. To assess the predictive ability of the copula models, we examine their performance with test datasets (Tables 6-8 and Figures 5-7). Since the vine copula has many parameters, it may have lower predictive power relative to its good performance in modeling.

Finally, for the hierarchical copula, the Gumbel, Clayton, and Frank copulas provide different results. The hierarchical copula has 3 (= p − 1) parameters for describing the dependence of the model; therefore, we could expect it to perform better than elliptical copulas but not as well as vine copulas. In Figure 4 for Case 3, we obtain the expected result that the hierarchical Clayton copula generally performs better than the Gaussian copula and worse than vine copulas. However, for Cases 1 and 2, contrary to our expectation, hierarchical copulas do not show better results than either elliptical or vine copulas. Also, the results differ considerably depending on the copula type. The Clayton copula is known to reflect the left tail dependence well (Charpentier, 2003; Cherubini et al., 2011). Since the VaR reflects the loss, it is sensible that the Clayton copula gives a good result. Although not included in the paper, we checked that the Gumbel copula reflects the right tail dependence well, as is known.

Contrary to our expectations, we also found that the MSE does not grow as the dependence structure becomes more complex.

### 3.3. Backtesting results

In many cases of calculating VaR, we would like to predict the loss level of the near future. For this purpose, backtesting has been used to examine the accuracy of the VaR prediction. Backtesting divides data into two pieces: ex-ante and ex-post. Models are fitted by using ex-ante data, and VaRs are calculated from the fitted models. Then, the accuracy of the estimated VaRs is verified by using ex-post data. In order to improve the model accuracy, a rolling window is used: the ex-ante data and the ex-post data are shifted forward together, one observation at a time, once for each window. In this simulation, the number of windows is 1,000, and the size of each window is 250. Detailed steps of the simulation experiments are as follows in each case of the correlation structure given in Table 2.

• Step 1: 1,250 random vectors of (x1,1, . . . , x1,4), . . . , (x1250,1, . . . , x1250,4) are generated from a 4-dimensional multivariate NIG distribution with (ᾱ, μ, ∑, γ) = (1, 0, ∑, 0). These vectors are considered as vectors of daily log returns of four different assets. In this setting, ∑ is the variance-covariance matrix calculated by correlation coefficients in Table 2 and standard deviation σ = (2.11, 2.24, 1.39, 1.85).

• Step 2: For each vector obtained in Step 1, we calculate the weighted average $R_i=\sum_{j=1}^{4} w_j x_{ij}$, where i = 1, 2, . . . , 1250, to be used as the portfolio return. We again assume that the weights are all equal to 1/4. The portfolio loss is calculated by changing the sign of the portfolio return since the VaR is expressed as the maximum loss.

• Step 3: We estimate parameters of a copula model with NIG marginal distributions using the first 250 random samples obtained in Step 1. Several different copula models are used.

• Step 4: We create 10,000 random samples from the fitted copula model in Step 3. Then, we calculate the portfolio return with equal weights and VaR of a given level 1 − α.

• Step 5: Check whether the 251st portfolio loss in Step 2 exceeds the estimated VaR in Step 4.

• Step 6: Using the rolling-window method, we repeat Steps 3 to 5 1,000 times and compute the proportion of cases where the (t + 1)st portfolio loss in Step 2 is larger than the estimated VaR in Step 4 using 250 random samples up to tth.
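The rolling-window part of the procedure (Steps 3 to 6) can be sketched as follows. The VaR estimator is passed in as a function; the default used here is a plain empirical quantile of the losses in the window, which is only a stand-in for fitting a copula model with NIG marginals and simulating from it. The function names are hypothetical.

```python
import math


def historical_var(window_losses, level):
    """Stand-in VaR estimator: nearest-rank empirical quantile of the window losses."""
    ordered = sorted(window_losses)
    rank = math.ceil(level * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]


def violation_rate(losses, window, level, estimate_var=historical_var):
    """Share of days whose loss exceeds the VaR estimated from the previous `window` losses."""
    n_tests = len(losses) - window
    if n_tests <= 0:
        raise ValueError("need more observations than the window size")
    violations = 0
    for t in range(window, len(losses)):
        var_t = estimate_var(losses[t - window:t], level)  # fit on the ex-ante window
        if losses[t] > var_t:                              # compare with the ex-post loss
            violations += 1
    return violations / n_tests
```

With 1,250 losses and a window of 250, the loop performs exactly 1,000 out-of-sample comparisons, matching the setting described above.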

If the proportion is larger than α, the given copula model underestimates VaR, so this model may not be able to capture the extreme risk. If the proportion is smaller than α, the model overestimates VaR so that the model may miss some investment opportunities.

Thus, we want the violation rate, the proportion of portfolio losses in excess of VaR, to be as close to α as possible. The target violation rate is shown as the black line in Figures 5-7.

The results are shown in Tables 6-8 and Figures 5-7. Both elliptical copulas, the Gaussian copula and the t copula, underestimate VaR. Although the Gaussian copula at low confidence levels and the t copula at high confidence levels show slightly better results, they generally perform similarly. The C-vine copula performs slightly better than the D-vine copula across confidence levels, but the overall performance of vine copulas is not as good as in the previous results in Figures 2-4. As for hierarchical copulas, the hierarchical Clayton copula performs best among the copulas considered in all cases.

The results are different from what we have seen in Section 3.2. In Section 3.2, we dealt with in-sample data, so there can be an overfitting problem. For instance, the C-vine copula uses 6 (= p(p − 1)/2) parameters and the hierarchical Clayton copula uses 3 (= p − 1) parameters. Thus the C-vine copula shows an outstanding goodness-of-fit, but its predictive ability is low. In the case of hierarchical copulas, especially the hierarchical Clayton copula, the predictive ability is good across confidence levels. As previously mentioned, the Clayton copula has an advantage in estimating VaR since it is known to reflect the dependence in the left tail (loss). The simulation results tell us that the hierarchical Clayton copula performs best regardless of the dependence structure. The number of parameters in the hierarchical copula seems large enough to reflect the dependence structure, but not so large as to cause an overfitting problem.

4. Data analysis

### 4.1. Data description

In this section, we investigate the performance of copula models for real data. We consider a portfolio that consists of Facebook, Amazon, Netflix, and Google (FANG), the four highest performing technology companies in the market. The daily log returns from January 2nd, 2013 to December 31st, 2017 are obtained from Yahoo Finance. The movements in log returns of stock prices are shown in Figure 8.

All of the log returns move around zero, with a relatively small variation for Google and a relatively large variation for Netflix. Table 9 provides some descriptive statistics and p-values of Kolmogorov-Smirnov test for normality of daily log return data for FANG.

It can be seen that log returns have positive skewness and high kurtosis. Especially, kurtosis is much larger than that of a normal distribution. Kolmogorov-Smirnov (KS) test also shows that the log returns do not follow a normal distribution.

In order to check the dependence structure in the portfolio, the pairwise Kendall’s tau values among the daily log returns of the four assets are computed in Table 10. Pearson correlation coefficients are also shown in parentheses. Amazon and Google have a relatively large Kendall’s tau value of 0.4501 and Pearson’s correlation coefficient of 0.5389, while Facebook and Netflix have a small Kendall’s tau value of 0.2743 and a Pearson’s correlation coefficient of 0.2859.

### 4.2. NIG distribution as Marginal distribution

Normal distributions are not appropriate for marginal distributions in our copula model because of the fat tails of return distributions. Therefore, we instead use the normal inverse Gaussian (NIG) distribution, which is widely used for describing financial data with fat tails (Bølviken and Benth, 2000; Godin et al., 2012). Maximum likelihood estimates of parameters of the NIG distribution and the Kolmogorov-Smirnov goodness-of-fit test results are given in Table 11.

In Figure 9, we see the empirical densities of log returns for each of the four assets and the densities of the NIG distribution with parameters estimated through MLE. Although the fitted NIG densities for Amazon and Netflix tend to be more peaked than empirical densities, the NIG distribution is well fitted to the real log returns. Also, when we see the goodness-of-fit of NIG distribution to the real log returns using the Kolmogorov-Smirnov test, p-values are very large, which implies that the NIG distribution sufficiently accounts for log return data.

### 4.3. VaR with various copulas

In this section, we compute VaR with various copula models and see which copula model gives the best performance in terms of the violation rate. We consider Gaussian, t, C-vine, D-vine, hierarchical Gumbel, hierarchical Clayton, and hierarchical Frank copulas as before with marginal distributions set to be NIG distribution with maximum likelihood estimates of parameters. We also try the multivariate normal distribution and the multivariate NIG distribution to compare the copula models. The predictive powers of different multivariate models are compared through backtesting with the rolling-window method. The number of windows is again set to 1,000. Table 12 gives VaR values from the level of 0.95 to 0.995 by 0.005 and Figure 10 visually summarizes the numerical results.

The results in Table 12 and Figure 10 can be summarized as follows. First, the multivariate NIG distribution gives a better performance than the multivariate normal distribution, but their performances are not good. Gaussian copula and t copula as elliptical copulas do not seem to work very well. They tend to underestimate VaR, and the t copula has slightly better performance than the Gaussian copula. Vine copulas look better than multivariate distributions and elliptical copulas, especially at low confidence levels. Hierarchical copulas show very different violation rates depending on the copula functions used. The hierarchical Clayton copula shows the violation rate very close to the target. Furthermore, the hierarchical Clayton copula has the best performance among all models considered, which coincides with the result of the simulation studies in Section 3.3.

Table 13 shows the p-values of the Kupiec test. The Kupiec test, introduced in Kupiec (1995), is commonly used to assess the adequacy of a VaR calculation. Its null hypothesis is that the violation rate equals the target level α; for instance, for the 95% VaR, the proportion of portfolio losses in excess of VaR should be 5%. At the 98.5%, 99%, and 99.5% levels, the multivariate normal, the multivariate NIG, the Gaussian copula, and the t copula are mostly rejected. At 99.5%, the vine copulas are also rejected. Among the hierarchical copulas, the Gumbel and Frank copulas are rejected at most of the levels. The hierarchical Clayton copula has high p-values at all levels, so the null hypothesis is not rejected. According to the Kupiec test, the hierarchical Clayton copula again performs best among all models considered.
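As a rough illustration of the test, the sketch below computes the p-value of the Kupiec proportion-of-failures likelihood ratio, which is asymptotically chi-square with one degree of freedom under the null hypothesis. The function names are hypothetical, and it is an assumption that this is the exact variant used for Table 13, although it appears consistent with the table: with 7 violations in 1,000 windows and a target rate of 0.005 it gives a p-value of about 0.40.

```python
import math


def _xlogy(x: float, y: float) -> float:
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_pvalue(violations: int, windows: int, target_rate: float) -> float:
    """P-value of the Kupiec proportion-of-failures test.

    violations  : number of windows in which the loss exceeded VaR
    windows     : total number of backtesting windows
    target_rate : violation rate under the null (0.05 for the 95% VaR)
    """
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must lie in (0, 1)")
    if not 0 <= violations <= windows or windows == 0:
        raise ValueError("need 0 <= violations <= windows and windows > 0")
    observed = violations / windows
    non_violations = windows - violations
    # Binomial log-likelihood under the null and at the observed rate.
    log_l0 = _xlogy(non_violations, 1.0 - target_rate) + _xlogy(violations, target_rate)
    log_l1 = _xlogy(non_violations, 1.0 - observed) + _xlogy(violations, observed)
    lr = max(0.0, -2.0 * (log_l0 - log_l1))
    # Survival function of chi-square(1): P(X > lr) = erfc(sqrt(lr / 2)).
    return math.erfc(math.sqrt(lr / 2.0))


p_exact_hit = kupiec_pvalue(50, 1000, 0.05)    # observed rate equals the target
p_clayton_995 = kupiec_pvalue(7, 1000, 0.005)  # 0.007 violation rate at 99.5%
```

The chi-square tail is written with `math.erfc` so that the sketch needs only the standard library.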

One thing to note is that the multivariate NIG distribution takes less computation time to calculate VaR than the copula models. Also, its performance is no worse than that of the copula models other than the hierarchical Clayton copula. In that sense, the multivariate NIG distribution would be an alternative when a quick result is needed. If an accurate result is needed, the hierarchical Clayton copula would be more suitable.

Although not included in this paper, we checked the results when the weights attached to the four assets are changed from (0.25, 0.25, 0.25, 0.25) to (0.1, 0.2, 0.3, 0.4) and obtained results similar to those above. We also considered all possible portfolios with three assets among FB, AMZN, NFLX, and GOOG. Again, the vine copula was the best in the in-sample performance and the hierarchical Clayton copula was the best in the out-of-sample performance.
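To show where the weights enter, here is a minimal sketch of an empirical VaR computed from simulated return scenarios. It assumes that the portfolio return is approximated by the weighted sum of the asset log returns and that VaR is the empirical quantile of the portfolio loss; the function name `portfolio_var` and the four toy scenarios are hypothetical, and in practice the scenarios would be drawn from the fitted copula model.

```python
import math
from typing import Sequence


def portfolio_var(scenarios: Sequence[Sequence[float]],
                  weights: Sequence[float],
                  level: float) -> float:
    """Empirical VaR (as a positive loss) of a weighted portfolio.

    scenarios : one row of asset returns per simulated scenario
    weights   : portfolio weights, one per asset
    level     : confidence level in (0, 1), e.g. 0.95
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie in (0, 1)")
    if len(scenarios) == 0:
        raise ValueError("at least one scenario is required")
    # Portfolio loss in each scenario: minus the weighted sum of returns.
    losses = sorted(
        -sum(w * r for w, r in zip(weights, row)) for row in scenarios
    )
    # Smallest loss such that at least `level` of the scenarios do not exceed it.
    index = max(0, math.ceil(level * len(losses)) - 1)
    return losses[index]


# Toy scenarios (hypothetical numbers) for four assets.
scenarios = [
    [0.01, 0.02, -0.01, 0.00],
    [-0.04, -0.02, -0.06, -0.04],
    [0.00, 0.01, 0.02, 0.01],
    [-0.02, -0.01, -0.03, -0.02],
]
var_equal = portfolio_var(scenarios, [0.25, 0.25, 0.25, 0.25], 0.75)
var_tilted = portfolio_var(scenarios, [0.1, 0.2, 0.3, 0.4], 0.75)
```

Because the joint scenarios are simulated once, changing the weights only requires recomputing the weighted sums, not refitting the model.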

As another example, we also tried Microsoft, Amazon, Google, and Apple (MAGA) to construct the portfolio within the same period as before. With MAGA, we obtain almost the same results as shown in Tables 14, 15, and

## 5. Conclusion

Copulas are one way of fitting a joint distribution to data, modeling the marginal distributions and the dependence structure separately. There are several types of copulas, including elliptical, vine, and hierarchical copulas. In this paper, we investigated various copulas for modeling the multivariate distribution of the returns of many assets in the financial market. Investors are often interested in a portfolio of assets; therefore, the risk management of a portfolio is an important issue. If we find a multivariate distribution that describes the returns of many assets well, we can easily calculate risk measures of portfolios with varying weights. We used the accuracy of the VaR of the portfolio to see how each copula model performs. The VaR of portfolio returns was computed with various copula functions in simulation experiments and real data applications. As a marginal distribution, we used the normal inverse Gaussian (NIG) distribution, which is widely used to model asset returns in finance.

We expected that the performance of copula models might depend on the correlation structure of the underlying multivariate distribution. We therefore assumed three cases of the correlation structure in the simulation and tried to find the most appropriate copula model in each case. In the simulation experiments on in-sample performance, vine copulas showed the best performance in terms of the MSE of VaR regardless of the dependence structure. This was somewhat expected because the number of parameters in the vine copulas is larger than in the other copulas. However, when it comes to out-of-sample performance, the backtesting results of the vine copulas were no longer the best. We attribute this to overfitting caused by the model being too complex. Even though the t copula, commonly used in finance, has a single parameter to reflect the dependence structure, its in-sample performance was as good as that of the vine copulas.

The hierarchical Clayton copula was the best in prediction (out-of-sample performance) in both the simulation studies and the real data applications, regardless of the dependence structure of the underlying distribution. The number of parameters in the hierarchical copula is adequate to reflect the underlying dependence structure but not so large as to cause overfitting. Also, the Clayton copula seems to work well because it is known to capture lower (left) tail dependence.
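The lower tail dependence of the Clayton copula can be checked numerically. The sketch below, which is an illustration rather than part of the study, samples a bivariate Clayton copula with θ > 0 by conditional inversion and compares the empirical probability P(V ≤ q | U ≤ q) for a small q with the known limit 2^(−1/θ). The function names, θ = 2, q = 0.05, and the sample size are assumptions chosen for the example.

```python
import random
from typing import List, Tuple


def sample_clayton(theta: float, n: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Draw n pairs (u, v) from a bivariate Clayton copula with theta > 0."""
    if theta <= 0:
        raise ValueError("this sketch only covers theta > 0")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()
        # Invert the conditional distribution of V given U = u at level w.
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def lower_tail_estimate(pairs: List[Tuple[float, float]], q: float) -> float:
    """Empirical estimate of P(V <= q | U <= q)."""
    in_tail = [v for u, v in pairs if u <= q]
    if not in_tail:
        raise ValueError("no observations with u <= q")
    return sum(1 for v in in_tail if v <= q) / len(in_tail)


theta = 2.0
pairs = sample_clayton(theta, 200_000)
estimate = lower_tail_estimate(pairs, q=0.05)
limit = 2.0 ** (-1.0 / theta)  # lower tail dependence coefficient of Clayton
```

A large conditional probability in the lower tail means that joint large losses are comparatively likely, which is the feature that matters for portfolio VaR.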

In this paper, we analyzed four-dimensional portfolios. The larger the dimension, the harder it would be to establish an appropriate dependence structure and find a good copula model. In practice, diversified portfolios are commonly used to mitigate the unsystematic risk inherent in a specific company or industry. Therefore, we would like to examine cases with larger dimensions in future work.

FIGURES

Fig. 1. Copula structure.

Fig. 2. MSE with various copulas in Case 1 (Modeling).

Fig. 3. MSE with various copulas in Case 2 (Modeling).

Fig. 4. MSE with various copulas in Case 3 (Modeling).

Fig. 5. Violation rate with various copulas in Case 1 (Prediction).

Fig. 6. Violation rate with various copulas in Case 2 (Prediction).

Fig. 7. Violation rate with various copulas in Case 3 (Prediction).

Fig. 8. Log returns of four stocks over the five years.

Fig. 9. Empirical densities of 4 asset log returns and densities of NIG distribution fitted.

Fig. 10. Violation rate with various distributions in FANG.

Fig. 11. Violation rate with various distributions in MAGA.

TABLES

### Table 1

Common archimedean copulas

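| Copula | Generator $\varphi(t;\theta)$ | Range of $\theta$ | Bivariate copula $C(u,v;\theta)$ |
|---|---|---|---|
| Clayton | $\frac{1}{\theta}\left(t^{-\theta}-1\right)$ | $[-1,\infty)\setminus\{0\}$ | $\left[\max\{u^{-\theta}+v^{-\theta}-1,\,0\}\right]^{-1/\theta}$ |
| Gumbel | $(-\log t)^{\theta}$ | $[1,\infty)$ | $\exp\left[-\left((-\log u)^{\theta}+(-\log v)^{\theta}\right)^{1/\theta}\right]$ |
| Frank | $-\log\left(\frac{\exp(-\theta t)-1}{\exp(-\theta)-1}\right)$ | $(-\infty,\infty)\setminus\{0\}$ | $-\frac{1}{\theta}\log\left[1+\frac{(\exp(-\theta u)-1)(\exp(-\theta v)-1)}{\exp(-\theta)-1}\right]$ |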
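The three bivariate copulas of Table 1 can be written as short functions. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are taken to lie in (0, 1], and the Gumbel copula is written in its standard form with exponent θ on both logarithmic terms. Each function satisfies the boundary condition C(u, 1) = u, which serves as a quick sanity check.

```python
import math


def clayton_cdf(u: float, v: float, theta: float) -> float:
    """Bivariate Clayton copula for u, v in (0, 1]; theta >= -1, theta != 0."""
    base = max(u ** (-theta) + v ** (-theta) - 1.0, 0.0)
    if base == 0.0:
        return 0.0  # can only happen for negative theta
    return base ** (-1.0 / theta)


def gumbel_cdf(u: float, v: float, theta: float) -> float:
    """Bivariate Gumbel copula for u, v in (0, 1]; theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def frank_cdf(u: float, v: float, theta: float) -> float:
    """Bivariate Frank copula for u, v in (0, 1]; theta != 0."""
    numerator = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(numerator / math.expm1(-theta)) / theta


# Boundary check: fixing the second argument at 1 returns the first argument.
u = 0.3
boundary_values = [
    clayton_cdf(u, 1.0, 2.0),
    gumbel_cdf(u, 1.0, 2.0),
    frank_cdf(u, 1.0, 2.0),
]
```

With θ = 1 the Gumbel copula reduces to the independence copula uv, which gives a second easy check.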

### Table 2

Pearson correlation coefficients for three different cases

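| | Assets (1, 2) | Assets (1, 3) | Assets (1, 4) | Assets (2, 3) | Assets (2, 4) | Assets (3, 4) |
|---|---|---|---|---|---|---|
| Case 1 | 0.73 | 0.75 | 0.71 | 0.72 | 0.75 | 0.73 |
| Case 2 | 0.76 | 0.71 | 0.80 | 0.45 | 0.44 | 0.27 |
| Case 3 | 0.80 | 0.68 | 0.49 | 0.38 | 0.16 | 0.08 |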

### Table 3

MSE with various copulas in Case 1 (Modeling)

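| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.0085 | 0.0089 | 0.0093 | 0.0095 | 0.0210 | 0.0242 | 0.0079 |
| 95.5% | 0.0098 | 0.0105 | 0.0106 | 0.0109 | 0.0283 | 0.0285 | 0.0114 |
| 96% | 0.0117 | 0.0121 | 0.0122 | 0.0123 | 0.0371 | 0.0348 | 0.0190 |
| 96.5% | 0.0132 | 0.0137 | 0.0138 | 0.0137 | 0.0491 | 0.0420 | 0.0334 |
| 97% | 0.0165 | 0.0156 | 0.0158 | 0.0160 | 0.0658 | 0.0526 | 0.0598 |
| 97.5% | 0.0231 | 0.0204 | 0.0212 | 0.0210 | 0.0924 | 0.0680 | 0.1090 |
| 98% | 0.0312 | 0.0283 | 0.0293 | 0.0278 | 0.1333 | 0.0913 | 0.1956 |
| 98.5% | 0.0477 | 0.0444 | 0.0429 | 0.0417 | 0.2036 | 0.1255 | 0.3635 |
| 99% | 0.0825 | 0.0733 | 0.0717 | 0.0709 | 0.3474 | 0.1908 | 0.7159 |
| 99.5% | 0.1758 | 0.1533 | 0.1564 | 0.1622 | 0.6912 | 0.3821 | 1.6435 |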

### Table 4

MSE with various copulas in Case 2 (Modeling)

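| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.0079 | 0.0120 | 0.0083 | 0.0085 | 0.0328 | 0.0180 | 0.0098 |
| 95.5% | 0.0094 | 0.0130 | 0.0098 | 0.0100 | 0.0425 | 0.0217 | 0.0156 |
| 96% | 0.0112 | 0.0143 | 0.0106 | 0.0109 | 0.0549 | 0.0253 | 0.0251 |
| 96.5% | 0.0135 | 0.0149 | 0.0120 | 0.0122 | 0.0713 | 0.0306 | 0.0411 |
| 97% | 0.0190 | 0.0175 | 0.0148 | 0.0155 | 0.0955 | 0.0406 | 0.0687 |
| 97.5% | 0.0265 | 0.0218 | 0.0181 | 0.0202 | 0.1300 | 0.0529 | 0.1142 |
| 98% | 0.0396 | 0.0282 | 0.0242 | 0.0267 | 0.1860 | 0.0721 | 0.1948 |
| 98.5% | 0.0614 | 0.0379 | 0.0336 | 0.0358 | 0.2716 | 0.1061 | 0.3340 |
| 99% | 0.1086 | 0.0659 | 0.0582 | 0.0596 | 0.4356 | 0.1758 | 0.6209 |
| 99.5% | 0.2537 | 0.1521 | 0.1399 | 0.1394 | 0.8587 | 0.3695 | 1.4181 |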

### Table 5

MSE with various copulas in Case 3 (Modeling)

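| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.0069 | 0.0114 | 0.0072 | 0.0070 | 0.0260 | 0.0086 | 0.0103 |
| 95.5% | 0.0082 | 0.0127 | 0.0083 | 0.0083 | 0.0346 | 0.0099 | 0.0161 |
| 96% | 0.0103 | 0.0143 | 0.0097 | 0.0098 | 0.0460 | 0.0114 | 0.0252 |
| 96.5% | 0.0132 | 0.0152 | 0.0113 | 0.0111 | 0.0594 | 0.0143 | 0.0387 |
| 97% | 0.0179 | 0.0171 | 0.0129 | 0.0134 | 0.0789 | 0.0177 | 0.0607 |
| 97.5% | 0.0260 | 0.0199 | 0.0163 | 0.0168 | 0.1081 | 0.0228 | 0.0968 |
| 98% | 0.0417 | 0.0264 | 0.0214 | 0.0228 | 0.1574 | 0.0288 | 0.1625 |
| 98.5% | 0.0693 | 0.0350 | 0.0306 | 0.0316 | 0.2356 | 0.0406 | 0.2757 |
| 99% | 0.1206 | 0.0533 | 0.0502 | 0.0509 | 0.3683 | 0.0696 | 0.4928 |
| 99.5% | 0.2493 | 0.1365 | 0.1246 | 0.1228 | 0.6852 | 0.1732 | 1.0513 |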

### Table 6

Violation rate with various copulas in Case 1 (Prediction)

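| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.057 | 0.060 | 0.060 | 0.059 | 0.064 | 0.057 | 0.060 |
| 95.5% | 0.053 | 0.057 | 0.053 | 0.055 | 0.060 | 0.047 | 0.056 |
| 96% | 0.048 | 0.049 | 0.048 | 0.048 | 0.055 | 0.043 | 0.052 |
| 96.5% | 0.043 | 0.045 | 0.045 | 0.044 | 0.051 | 0.039 | 0.047 |
| 97% | 0.040 | 0.038 | 0.037 | 0.038 | 0.044 | 0.033 | 0.044 |
| 97.5% | 0.036 | 0.033 | 0.031 | 0.033 | 0.040 | 0.026 | 0.042 |
| 98% | 0.027 | 0.027 | 0.023 | 0.027 | 0.032 | 0.021 | 0.036 |
| 98.5% | 0.020 | 0.018 | 0.017 | 0.016 | 0.028 | 0.015 | 0.032 |
| 99% | 0.015 | 0.013 | 0.013 | 0.014 | 0.019 | 0.010 | 0.024 |
| 99.5% | 0.006 | 0.005 | 0.006 | 0.003 | 0.012 | 0.004 | 0.016 |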

### Table 7

Violation rate with various copulas in Case 2 (Prediction)

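| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.051 | 0.054 | 0.053 | 0.060 | 0.062 | 0.048 | 0.056 |
| 95.5% | 0.049 | 0.051 | 0.048 | 0.053 | 0.056 | 0.044 | 0.053 |
| 96% | 0.044 | 0.047 | 0.043 | 0.047 | 0.054 | 0.041 | 0.047 |
| 96.5% | 0.042 | 0.042 | 0.042 | 0.045 | 0.047 | 0.037 | 0.044 |
| 97% | 0.039 | 0.037 | 0.038 | 0.041 | 0.045 | 0.030 | 0.043 |
| 97.5% | 0.032 | 0.031 | 0.031 | 0.036 | 0.040 | 0.026 | 0.040 |
| 98% | 0.028 | 0.026 | 0.026 | 0.027 | 0.036 | 0.022 | 0.035 |
| 98.5% | 0.021 | 0.019 | 0.021 | 0.021 | 0.028 | 0.019 | 0.029 |
| 99% | 0.018 | 0.018 | 0.017 | 0.021 | 0.023 | 0.014 | 0.025 |
| 99.5% | 0.013 | 0.009 | 0.009 | 0.010 | 0.019 | 0.009 | 0.019 |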

### Table 8

Violation rate with various copulas in Case 3 (Prediction)

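| Level | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|
| 95% | 0.053 | 0.060 | 0.055 | 0.068 | 0.063 | 0.051 | 0.057 |
| 95.5% | 0.050 | 0.051 | 0.050 | 0.061 | 0.056 | 0.048 | 0.052 |
| 96% | 0.047 | 0.048 | 0.047 | 0.055 | 0.052 | 0.043 | 0.047 |
| 96.5% | 0.043 | 0.045 | 0.042 | 0.049 | 0.047 | 0.039 | 0.046 |
| 97% | 0.038 | 0.038 | 0.037 | 0.043 | 0.044 | 0.033 | 0.044 |
| 97.5% | 0.033 | 0.031 | 0.030 | 0.037 | 0.039 | 0.028 | 0.037 |
| 98% | 0.026 | 0.025 | 0.024 | 0.032 | 0.034 | 0.021 | 0.034 |
| 98.5% | 0.022 | 0.020 | 0.019 | 0.024 | 0.027 | 0.018 | 0.029 |
| 99% | 0.017 | 0.015 | 0.015 | 0.018 | 0.020 | 0.015 | 0.020 |
| 99.5% | 0.013 | 0.010 | 0.009 | 0.013 | 0.015 | 0.007 | 0.018 |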

### Table 9

Descriptive statistics of log returns and Kolmogorov-Smirnov test of normality

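| Asset | Mean | Sd | Skewness | Kurtosis | KS test (p-value) |
|---|---|---|---|---|---|
| FB | 0.0015 | 0.0197 | 2.1503 | 29.8957 | 0.0000 |
| AMZN | 0.0012 | 0.0182 | 0.3576 | 13.8721 | 0.0000 |
| NFLX | 0.0021 | 0.0290 | 1.9922 | 29.3304 | 0.0000 |
| GOOG | 0.0008 | 0.0137 | 1.7335 | 21.4751 | 0.0000 |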

### Table 10

Pairwise Kendall’s tau among log return data (Pearson’s correlation coefficients are in parentheses)

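| | FB | AMZN | NFLX | GOOG |
|---|---|---|---|---|
| FB | 1 | 0.3821 | 0.2743 | 0.3990 |
| AMZN | (0.4418) | 1 | 0.2990 | 0.4501 |
| NFLX | (0.2859) | (0.3320) | 1 | 0.2816 |
| GOOG | (0.4587) | (0.5389) | (0.3422) | 1 |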
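Kendall's tau links directly to the parameter of a bivariate Archimedean copula: for Clayton, τ = θ/(θ + 2), and for Gumbel, τ = 1 − 1/θ. The sketch below computes a simple (tie-free) Kendall's tau and inverts these relations. It is only an illustration of the link; the function names are hypothetical, and it is not claimed that the hierarchical copulas in this study were estimated by this inversion.

```python
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a from concordant and discordant pairs (O(n^2))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2.0 * score / (n * (n - 1))


def clayton_theta_from_tau(tau: float) -> float:
    """Invert tau = theta / (theta + 2) for the Clayton copula."""
    if not 0.0 < tau < 1.0:
        raise ValueError("this inversion assumes 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau: float) -> float:
    """Invert tau = 1 - 1 / theta for the Gumbel copula."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("this inversion assumes 0 <= tau < 1")
    return 1.0 / (1.0 - tau)


tau_amzn_goog = 0.4501  # Kendall's tau for AMZN and GOOG from Table 10
theta_clayton = clayton_theta_from_tau(tau_amzn_goog)
theta_gumbel = gumbel_theta_from_tau(tau_amzn_goog)
```

For the AMZN and GOOG pair, the inversion gives a Clayton parameter of roughly 1.64 and a Gumbel parameter of roughly 1.82.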

### Table 11

Maximum likelihood estimates of NIG distribution and p-value of Kolmogorov-Smirnov test for NIG distribution

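| Asset | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{\delta}$ | $\hat{\mu}$ | KS test (p-value) |
|---|---|---|---|---|---|
| FB | 40.7240 | 2.1113 | 0.0145 | 0.0007 | 0.8819 |
| AMZN | 43.9606 | 0.9009 | 0.0133 | 0.0009 | 0.5655 |
| NFLX | 23.2511 | 4.5023 | 0.0163 | −0.0011 | 0.8559 |
| GOOG | 64.0877 | 3.7463 | 0.0110 | 0.0002 | 0.9006 |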
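As a quick plausibility check on these estimates, the mean and standard deviation implied by NIG parameters can be compared with the sample statistics in Table 9. The sketch assumes the common (α, β, δ, μ) parametrization, for which the mean is μ + δβ/γ and the variance is δα²/γ³ with γ = √(α² − β²); the function name `nig_mean_sd` is hypothetical.

```python
import math
from typing import Tuple


def nig_mean_sd(alpha: float, beta: float, delta: float, mu: float) -> Tuple[float, float]:
    """Mean and standard deviation of NIG(alpha, beta, delta, mu)."""
    if not (alpha > abs(beta) and delta > 0):
        raise ValueError("need alpha > |beta| and delta > 0")
    gamma = math.sqrt(alpha ** 2 - beta ** 2)
    mean = mu + delta * beta / gamma
    variance = delta * alpha ** 2 / gamma ** 3
    return mean, math.sqrt(variance)


# FB estimates from Table 11.
fb_mean, fb_sd = nig_mean_sd(alpha=40.7240, beta=2.1113, delta=0.0145, mu=0.0007)
```

For FB this gives a mean of roughly 0.0015 and a standard deviation of roughly 0.019, close to the sample values 0.0015 and 0.0197 reported in Table 9.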

### Table 12

Violation rate with various distributions in FANG

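| Level | Normal (multivariate) | NIG (multivariate) | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|---|---|
| 95% | 0.053 | 0.052 | 0.053 | 0.053 | 0.053 | 0.052 | 0.064 | 0.050 | 0.055 |
| 95.5% | 0.049 | 0.049 | 0.051 | 0.051 | 0.046 | 0.048 | 0.060 | 0.045 | 0.053 |
| 96% | 0.043 | 0.044 | 0.045 | 0.045 | 0.043 | 0.045 | 0.054 | 0.042 | 0.052 |
| 96.5% | 0.041 | 0.038 | 0.042 | 0.041 | 0.038 | 0.039 | 0.053 | 0.033 | 0.048 |
| 97% | 0.036 | 0.037 | 0.035 | 0.035 | 0.035 | 0.033 | 0.048 | 0.028 | 0.045 |
| 97.5% | 0.030 | 0.029 | 0.033 | 0.031 | 0.029 | 0.028 | 0.044 | 0.026 | 0.040 |
| 98% | 0.028 | 0.023 | 0.028 | 0.026 | 0.025 | 0.024 | 0.038 | 0.022 | 0.035 |
| 98.5% | 0.025 | 0.022 | 0.023 | 0.023 | 0.020 | 0.020 | 0.032 | 0.015 | 0.028 |
| 99% | 0.021 | 0.017 | 0.020 | 0.016 | 0.015 | 0.015 | 0.026 | 0.013 | 0.025 |
| 99.5% | 0.018 | 0.011 | 0.013 | 0.012 | 0.012 | 0.012 | 0.019 | 0.007 | 0.022 |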

### Table 13

The p-value of Kupiec test with various distributions in FANG

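| Level | Normal (multivariate) | NIG (multivariate) | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|---|---|
| 95% | 0.67 | 0.77 | 0.67 | 0.67 | 0.67 | 0.77 | 0.05 | 1.00 | 0.47 |
| 95.5% | 0.55 | 0.55 | 0.37 | 0.37 | 0.88 | 0.65 | 0.03 | 1.00 | 0.23 |
| 96% | 0.63 | 0.53 | 0.43 | 0.43 | 0.63 | 0.43 | 0.03 | 0.75 | 0.06 |
| 96.5% | 0.31 | 0.61 | 0.24 | 0.31 | 0.61 | 0.50 | 0.00 | 0.73 | 0.03 |
| 97% | 0.28 | 0.21 | 0.37 | 0.37 | 0.37 | 0.58 | 0.00 | 0.71 | 0.01 |
| 97.5% | 0.33 | 0.43 | 0.12 | 0.24 | 0.43 | 0.55 | 0.00 | 0.84 | 0.01 |
| 98% | 0.09 | 0.51 | 0.09 | 0.19 | 0.28 | 0.38 | 0.00 | 0.66 | 0.00 |
| 98.5% | 0.02 | 0.09 | 0.05 | 0.05 | 0.22 | 0.22 | 0.00 | 1.00 | 0.00 |
| 99% | 0.00 | 0.04 | 0.01 | 0.08 | 0.14 | 0.14 | 0.00 | 0.36 | 0.00 |
| 99.5% | 0.00 | 0.02 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 0.40 | 0.00 |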

### Table 14

Violation rate with various distributions in MAGA

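| Level | Normal (multivariate) | NIG (multivariate) | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|---|---|
| 95% | 0.054 | 0.052 | 0.055 | 0.056 | 0.053 | 0.055 | 0.061 | 0.049 | 0.056 |
| 95.5% | 0.050 | 0.050 | 0.051 | 0.051 | 0.049 | 0.051 | 0.059 | 0.043 | 0.054 |
| 96% | 0.046 | 0.048 | 0.046 | 0.047 | 0.044 | 0.046 | 0.055 | 0.038 | 0.049 |
| 96.5% | 0.040 | 0.040 | 0.041 | 0.041 | 0.039 | 0.040 | 0.050 | 0.034 | 0.048 |
| 97% | 0.037 | 0.034 | 0.037 | 0.037 | 0.035 | 0.036 | 0.047 | 0.032 | 0.041 |
| 97.5% | 0.035 | 0.030 | 0.034 | 0.033 | 0.031 | 0.032 | 0.039 | 0.028 | 0.039 |
| 98% | 0.034 | 0.027 | 0.030 | 0.029 | 0.027 | 0.029 | 0.035 | 0.023 | 0.035 |
| 98.5% | 0.031 | 0.020 | 0.025 | 0.023 | 0.022 | 0.023 | 0.032 | 0.016 | 0.030 |
| 99% | 0.026 | 0.015 | 0.018 | 0.017 | 0.014 | 0.015 | 0.026 | 0.010 | 0.028 |
| 99.5% | 0.015 | 0.006 | 0.009 | 0.005 | 0.007 | 0.008 | 0.017 | 0.003 | 0.018 |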

### Table 15

The p-value of Kupiec test with various distributions in MAGA

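| Level | Normal (multivariate) | NIG (multivariate) | Gaussian (elliptical) | t (elliptical) | C-vine | D-vine | Gumbel (hierarchical) | Clayton (hierarchical) | Frank (hierarchical) |
|---|---|---|---|---|---|---|---|---|---|
| 95% | 0.57 | 0.77 | 0.47 | 0.39 | 0.67 | 0.47 | 0.12 | 0.88 | 0.39 |
| 95.5% | 0.45 | 0.45 | 0.37 | 0.37 | 0.55 | 0.37 | 0.04 | 0.76 | 0.18 |
| 96% | 0.34 | 0.21 | 0.34 | 0.27 | 0.53 | 0.34 | 0.02 | 0.74 | 0.16 |
| 96.5% | 0.40 | 0.40 | 0.31 | 0.31 | 0.50 | 0.40 | 0.02 | 0.86 | 0.03 |
| 97% | 0.21 | 0.47 | 0.21 | 0.21 | 0.37 | 0.28 | 0.00 | 0.71 | 0.05 |
| 97.5% | 0.06 | 0.33 | 0.08 | 0.12 | 0.24 | 0.17 | 0.01 | 0.55 | 0.01 |
| 98% | 0.00 | 0.13 | 0.04 | 0.06 | 0.13 | 0.06 | 0.00 | 0.51 | 0.00 |
| 98.5% | 0.00 | 0.22 | 0.02 | 0.05 | 0.09 | 0.05 | 0.00 | 0.80 | 0.00 |
| 99% | 0.00 | 0.14 | 0.02 | 0.04 | 0.23 | 0.14 | 0.00 | 1.00 | 0.00 |
| 99.5% | 0.00 | 0.66 | 0.11 | 1.00 | 0.40 | 0.22 | 0.00 | 0.33 | 0.00 |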

References
1. Aas K, Haff IH, and Dimakos XK (2005). Risk estimation using the multivariate normal inverse Gaussian distribution. The Journal of Risk, 8, 39-60.
2. Aas K, Czado C, Frigessi A, and Bakken H (2009). Pair-copula constructions of multiple dependence. Insurance: Mathematics and Economics, 44, 182-198.
3. Barndorff-Nielsen OE (1997). Normal inverse Gaussian distributions and stochastic volatility modelling. Scandinavian Journal of Statistics, 24, 1-13.
4. Barndorff-Nielsen OE, Mikosch T, and Resnick SI (2001). Lévy Processes: Theory and Applications, New York, Springer Science & Business Media.
5. Bølviken E and Benth FE (2000). Quantification of risk in Norwegian stocks via the normal inverse Gaussian distribution. Proceedings of the AFIR 2000 Colloquium, Tromsø, Norway, 87-98.
6. Charpentier A (2003). Tail distribution and dependence measures. Proceedings of the 34th ASTIN Conference, Berlin, Germany, 1-25.
7. Cherubini U, Mulinacci S, Gobbi F, and Romagnoli S (2011). Dynamic Copula Methods in Finance, New York, John Wiley & Sons.
8. Dorić D and Dorić EN (2011). Return distribution and value at risk estimation for BELEX15. Yugoslav Journal of Operations Research, 21, 103-118.
9. Eberlein E and Keller U (1995). Hyperbolic distributions in finance. Bernoulli, 1, 281-299.
10. Embrechts P, McNeil A, and Straumann D (2002). Correlation and dependence in risk management: properties and pitfalls. Risk Management: Value at Risk and Beyond, 1, 176-223.
11. Eriksson A, Ghysels E, and Wang F (2009). The normal inverse Gaussian distribution and the pricing of derivatives. The Journal of Derivatives, 16, 23-37.
12. Godin F, Mayoral S, and Morales M (2012). Contingent claim pricing using a normal inverse Gaussian probability distortion operator. Journal of Risk and Insurance, 79, 841-866.
13. Göncü A and Yang H (2016). Variance-Gamma and normal-inverse Gaussian models: goodness-of-fit to Chinese high-frequency index returns. The North American Journal of Economics and Finance, 36, 279-292.
14. Hull J and White A (1998). Value at risk when daily changes in market variables are not normally distributed. Journal of Derivatives, 5, 9-19.
15. Joe H (1997). Multivariate Models and Multivariate Dependence Concepts, Boca Raton, CRC Press.
16. Jorion P (2007). Value at Risk: The New Benchmark for Managing Financial Risk (3rd ed), New York, McGraw-Hill.
17. Kalemanova A, Schmid B, and Werner R (2007). The normal inverse Gaussian distribution for synthetic CDO pricing. The Journal of Derivatives, 14, 80-94.
18. Kim T and Song S (2011). Value-at-risk estimation using NIG and VG distribution. Journal of the Korean Data Analysis Society, 13, 1775-1788.
19. Kole E, Koedijk K, and Verbeek M (2007). Selecting copulas for risk management. Journal of Banking & Finance, 31, 2405-2423.
20. Kraus D and Czado C (2017). D-vine copula based quantile regression. Computational Statistics & Data Analysis, 110, 1-18.
21. Kupiec P (1995). Techniques for verifying the accuracy of risk measurement models. The Journal of Derivatives, 3, 73-84.
22. Low RKY, Alcock J, Faff R, and Brailsford T (2013). Canonical vine copulas in the context of modern portfolio management: Are they worth it?. Journal of Banking & Finance, 37, 3085-3099.
23. Mabitsela L, Maré E, and Kufakunesu R (2015). Quantification of VaR: A note on VaR valuation in the South African equity market. Journal of Risk and Financial Management, 8, 103-126.
24. Madan DB, Carr PP, and Chang EC (1998). The variance gamma process and option pricing. Review of Finance, 2, 79-105.
25. Markowitz H (1952). Portfolio selection. The Journal of Finance, 7, 77-91.
26. Okhrin O and Ristig A (2014). Hierarchical Archimedean copulae: the HAC package. Journal of Statistical Software, 58, 1-20.
27. Okhrin O and Tetereva A (2017). The realized hierarchical archimedean copula in risk Modelling. Econometrics, 5, 1-31.
28. Schoutens W (2003). Lévy Processes in Finance: Pricing Financial Derivatives, New York, John Wiley & Sons.
29. Sklar A (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut Statistique de l'Université de Paris, 8, 229-231.
30. Trivedi PK and Zimmer DM (2007). Copula modeling: an introduction for practitioners. Foundations and Trends in Econometrics, 1, 1-111.
31. Venkataraman S (1997). Value at risk for a mixture of normal distributions: the use of quasi-Bayesian estimation techniques. Economic Perspectives-Federal Reserve Bank of Chicago, 21, 2-13.
32. Weibel M, Breymann W, and Lüthi D (2020). ghyp: A package on generalized hyperbolic distributions. Manual for R Package ghyp.
33. Wilhelmsson A (2009). Value at Risk with time varying variance, skewness and kurtosis - the NIG-ACD model. The Econometrics Journal, 12, 82-104.
34. Wu F, Valdez E, and Sherris M (2007). Simulating from exchangeable Archimedean copulas. Communications in Statistics - Simulation and Computation, 36, 1019-1034.