Change point analysis in Bitcoin return series: a robust approach

Junmo Song a, Jiwon Kang1,b

aDepartment of Statistics, Kyungpook National University, Korea;
bDepartment of Computer Science and Statistics, Jeju National University, Korea
Correspondence to: 1 Department of Computer Science and Statistics, Jeju National University, 102, Jejudaehak-ro, Jeju-si, Jeju-do, Korea. E-mail: jwkang@jejunu.ac.kr
Received April 9, 2021; Revised May 18, 2021; Accepted May 21, 2021.
Abstract
Over the last decade, Bitcoin has attracted a great deal of public interest and the Bitcoin market has grown rapidly. One of the main characteristics of the market is that it often undergoes events or incidents that cause outlying observations. To obtain reliable results in the statistical analysis of Bitcoin data, these outlying observations need to be treated carefully. In this study, we are interested in change point analysis for Bitcoin return series containing such outlying observations. Since these observations can adversely affect change point analysis, we use a robust test for parameter change to locate change points. We report some significant change points that are not detected by the existing tests and demonstrate that the model allowing for parameter changes fits the data better. Finally, we show that the model with parameter change can improve the forecasting performance of Value-at-Risk.
Keywords: Bitcoin, GARCH model, change point analysis, parameter change, robust, outlying observations
1. Introduction

Over the last decade, Bitcoin has attracted a great deal of public interest and along with this, the Bitcoin market has grown rapidly. Its speculative price movements have also drawn the interest of many researchers as well as financial investors. Accordingly, numerous studies have been devoted to the analysis of Bitcoin, more exactly the volatility modelling of Bitcoin returns.

Bitcoin returns are highly volatile and exhibit volatility clustering. To capture these stylized characteristics, many studies have employed various GARCH-type models. See, e.g., Glaser et al. (2014), Chu et al. (2017), and Katsiampa (2017). As opposed to the conditional variance models, some authors such as Chaim and Laurini (2018) and Tiwari et al. (2019) considered stochastic volatility models. Long-range dependence of the Bitcoin return series has also been studied, e.g., in Bariviera et al. (2017), Jiang et al. (2018), and Caporale et al. (2018).

Many studies have compared and evaluated various models under the assumption that parameters do not change. However, it is widely recognized that time series often undergo structural or parameter changes in the underlying models. Owing to its importance in statistical inference and in practice, the change point problem has attracted much attention from researchers, particularly in time series analysis. For a general review, see Aue and Horváth (2013), Horváth and Rice (2014), and the references therein. In financial time series, parameter changes often occur due to, e.g., changes in monetary policy or critical events. Considering the various events in the Bitcoin market, parameter change needs to be taken into account in the analysis of Bitcoin. Indeed, several recent studies such as Thies and Molnár (2018), Canh et al. (2019), and Mensi et al. (2019) reported that parameter changes occur frequently. It is noteworthy that the change point problem has also been dealt with from the perspective of regime switching (RS). In this case, RS-GARCH models have mainly been applied. See, e.g., Ardia et al. (2019) and Caporale and Zekokh (2019).

Our interest also lies in examining whether change points exist in the Bitcoin return series. For this, we use a test for parameter change to detect change points, but we emphasize that the test needs to be robust against deviating observations. This is because another notable characteristic of Bitcoin returns is the existence of outlying observations, which are mainly caused by events or incidents such as the Mt. Gox incident. As is well known in the statistical literature, outlying observations can lead to serious bias in estimation and erroneous conclusions in testing. See, e.g., Tsay (1988), Chen and Liu (1993), and Franses and Ghijsels (1999). This problem is also present in change point analysis. When outlying observations are included in a data set suspected of having parameter changes, it is not easy to determine whether the test results are due to genuine changes or not. For more details on the parameter change test in the presence of outliers, see Fearnhead and Rigaill (2019) and Song and Kang (2021).

In this study, we fit GARCH models to Bitcoin returns and then detect change points. In order to reduce the undesirable influence of deviating observations, we conduct a robust parameter change test proposed by Song and Kang (2021) and report some change points that are not detected by the existing tests. We evaluate the models with and without parameter changes via AIC and compare value at risk (VaR) forecasting performance.

The remainder of this paper is organized as follows: Section 2 introduces data and the methodology, Section 3 presents the empirical results, and Section 4 concludes.

2. Data and methodology

The data analyzed in this study consist of daily closing prices of Bitcoin from January 2013 to December 2020, 2,922 observations in total, which are available at coinmetrics; the data from October 2020 onward are used in the out-of-sample analysis. The prices $\{S_t\}$ and the log return series $\{r_t\}$, where $S_t$ is the price of Bitcoin at time $t$ and $r_t = 100\log(S_t/S_{t-1})$, are presented in Figure 1. As can be seen, the price has risen dramatically in the last several years. In the right subfigure for the return series, we can see some deviating observations and even observe extremely deviating returns below −40%. As mentioned above, these observations are highly likely to have an undesirable effect on statistical inference.
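The return definition above can be written out as a minimal sketch. The function name `log_returns` and the example prices are hypothetical and serve only to illustrate the formula; they are not taken from the Bitcoin data.

```python
import math


def log_returns(prices):
    """Percentage log returns r_t = 100 * log(S_t / S_{t-1})."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its predecessor; n prices give n - 1 returns.
    return [100.0 * math.log(curr / prev)
            for prev, curr in zip(prices, prices[1:])]


# Hypothetical prices, for illustration only.
example_prices = [100.0, 110.0, 99.0]
example_returns = log_returns(example_prices)
```

A series of n prices yields n − 1 returns, which is why 2,922 daily prices correspond to 2,921 returns.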

We consider the standard GARCH (1, 1) model and use the robust test introduced by Song and Kang (2021) to detect change points. In order to model the increase in price over the observation period, a nonzero mean term μ is included as follows:

$r_t=\mu+\sigma_t\varepsilon_t,\qquad \sigma_t^2=\omega+\alpha_1(r_{t-1}-\mu)^2+\beta_1\sigma_{t-1}^2,\qquad (2.1)$

where $\{\varepsilon_t\}$ is a sequence of i.i.d. random variables from N(0, 1), ω > 0, α1 ≥ 0, and β1 ≥ 0. The test statistic of Song and Kang (2021) was originally developed for GARCH models without a mean term. However, it can be readily shown that their result also holds for the model with a nonzero mean, so here and in what follows we introduce and use a slightly modified version of the test procedure.

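The test statistic is constructed based on the so-called minimum density power divergence estimator (MDPDE), which is given, for α ≥ 0, by

$\hat\theta_{\alpha,n}=\operatorname*{argmin}_{\theta\in\Theta}\frac{1}{n}\sum_{t=1}^{n}\tilde l_\alpha(r_t;\theta),$

where $\theta=(\mu,\omega,\alpha_1,\beta_1)$, Θ denotes a parameter space, and

$\tilde l_\alpha(r_t;\theta)=\begin{cases}\left(\dfrac{1}{\sqrt{\tilde\sigma_t^2}}\right)^{\alpha}\left\{\dfrac{1}{\sqrt{1+\alpha}}-\left(1+\dfrac{1}{\alpha}\right)\exp\left(-\dfrac{\alpha}{2}\dfrac{(r_t-\mu)^2}{\tilde\sigma_t^2}\right)\right\}, & \alpha>0,\\ \dfrac{(r_t-\mu)^2}{\tilde\sigma_t^2}+\log\tilde\sigma_t^2, & \alpha=0.\end{cases}$

Here, $\{\tilde\sigma_t^2\mid 1\le t\le n\}$ is obtained recursively by

$\tilde\sigma_t^2:=\tilde\sigma_t^2(\theta)=\omega+\alpha_1(r_{t-1}-\mu)^2+\beta_1\tilde\sigma_{t-1}^2.\qquad (2.2)$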

The MDPDEs with α = 0 and α = 1 correspond to the MLE and the $L_2$ estimator for GARCH models, respectively. The tuning parameter α controls the trade-off between robustness and asymptotic efficiency of the estimator, and it is known that the MDPDE with small α is strongly robust with little loss in asymptotic efficiency relative to the MLE. For more details on the MDPDE for GARCH models, see Lee and Song (2009).
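The objective that the MDPDE minimizes can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the square-root form of the Gaussian density power divergence (the factor $(1/\sqrt{\tilde\sigma_t^2})^{\alpha}$ and the term $1/\sqrt{1+\alpha}$), and the starting value `initial_variance` for the recursion is a user-supplied assumption because the text does not specify one. The function names are hypothetical.

```python
import math


def garch_variances(returns, mu, omega, alpha1, beta1, initial_variance):
    """Recursion sigma2[t] = omega + alpha1*(r[t-1] - mu)^2 + beta1*sigma2[t-1]."""
    sigma2 = [initial_variance]  # assumed starting value for t = 1
    for prev_return in returns[:-1]:
        sigma2.append(omega + alpha1 * (prev_return - mu) ** 2
                      + beta1 * sigma2[-1])
    return sigma2


def mdpd_objective(returns, theta, alpha, initial_variance):
    """Average density power divergence loss for theta = (mu, omega, alpha1, beta1)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    mu, omega, alpha1, beta1 = theta
    sigma2 = garch_variances(returns, mu, omega, alpha1, beta1, initial_variance)
    total = 0.0
    for r, s2 in zip(returns, sigma2):
        z2 = (r - mu) ** 2 / s2
        if alpha == 0:
            # Gaussian quasi-likelihood case (MLE).
            total += z2 + math.log(s2)
        else:
            # Bounded in the return r, which is the source of robustness.
            total += s2 ** (-alpha / 2.0) * (
                1.0 / math.sqrt(1.0 + alpha)
                - (1.0 + 1.0 / alpha) * math.exp(-alpha * z2 / 2.0))
    return total / len(returns)


# Hypothetical data with one extreme return, constant unit variance.
theta_example = (0.0, 1.0, 0.0, 0.0)
loss_mle = mdpd_objective([0.0, -40.0], theta_example, 0.0, 1.0)
loss_robust = mdpd_objective([0.0, -40.0], theta_example, 0.22, 1.0)
```

In this toy example the α = 0 loss is dominated by the single extreme return, whereas the α = 0.22 loss stays bounded. Minimizing the objective over θ would be left to a generic numerical optimizer.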

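The test statistic employed for testing the null hypothesis of no change in the parameter θ is given by

$\tilde T_n^\alpha:=\max_{1\le k\le n}\frac{k^2}{n}\,\partial_{\theta'}\tilde H_{\alpha,k}(\hat\theta_{\alpha,n})\,\hat J_\alpha^{-1}\,\partial_\theta\tilde H_{\alpha,k}(\hat\theta_{\alpha,n}),$

where

$\tilde H_{\alpha,n}(\theta)=\frac{1}{n}\sum_{t=1}^{n}\tilde l_\alpha(r_t;\theta)\quad\text{and}\quad\hat J_\alpha=\frac{1}{n}\sum_{t=1}^{n}\partial_\theta\tilde l_\alpha(r_t;\hat\theta_{\alpha,n})\,\partial_{\theta'}\tilde l_\alpha(r_t;\hat\theta_{\alpha,n}).$

Under the null hypothesis, $\tilde T_n^\alpha$ converges in distribution to $\sup_{0\le s\le 1}\|B_4^o(s)\|_2^2$, where $\{B_4^o(s)\mid 0\le s\le 1\}$ is the 4-dimensional standard Brownian bridge; see Song and Kang (2021). When $\tilde T_n^\alpha$ is large, the null hypothesis is rejected and the change point is then located as

$\hat k_\alpha:=\operatorname*{argmax}_{1\le k\le n}\frac{k^2}{n}\,\partial_{\theta'}\tilde H_{\alpha,k}(\hat\theta_{\alpha,n})\,\hat J_\alpha^{-1}\,\partial_\theta\tilde H_{\alpha,k}(\hat\theta_{\alpha,n}).$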
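For implementation, it can help to rewrite the statistic in terms of partial sums of the per-observation scores. The following short derivation assumes that $\tilde H_{\alpha,k}$ is the average over the first $k$ observations, as the definition of $\tilde H_{\alpha,n}$ suggests; the symbol $S_k$ is introduced here only for illustration.

```latex
S_k := \sum_{t=1}^{k}\partial_\theta\tilde l_\alpha(r_t;\hat\theta_{\alpha,n})
\quad\Longrightarrow\quad
\partial_\theta\tilde H_{\alpha,k}(\hat\theta_{\alpha,n}) = \frac{1}{k}\,S_k ,
\\[2mm]
\frac{k^2}{n}\,\partial_{\theta'}\tilde H_{\alpha,k}(\hat\theta_{\alpha,n})\,
\hat J_\alpha^{-1}\,\partial_\theta\tilde H_{\alpha,k}(\hat\theta_{\alpha,n})
= \frac{1}{n}\,S_k'\,\hat J_\alpha^{-1}\,S_k ,
\qquad
\tilde T_n^\alpha = \max_{1\le k\le n}\frac{1}{n}\,S_k'\,\hat J_\alpha^{-1}\,S_k .
```

When $\hat\theta_{\alpha,n}$ is an interior minimizer, the first-order condition gives $S_n = 0$, so the partial-sum process is tied down at both ends, which is consistent with the Brownian bridge limit.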

### Remark 1

$\tilde T_n^\alpha$ with α = 0 is essentially equal to the MLE-based score test for parameter change introduced by Berkes et al. (2004).

One of the main characteristics of the test $\tilde T_n^\alpha$ is that, with α close to zero, it performs similarly to the score test. As α increases, the robustness of the test also increases. In practice, selecting an optimal α that balances efficiency and robustness is an important issue, but it is difficult in hypothesis testing settings. According to Song and Kang (2021), α does not need to be large and is usually recommended to be in the range between 0.1 and 0.3.

The score test for parameter change is based on the ML estimate. Since the MLE is sensitive to outliers, it is not easy to distinguish whether a rejection by the score test is due to outliers or to a genuine change in the parameters. On the other hand, the test $\tilde T_n^\alpha$ is based on the MDPDE, which is robust against outlying observations, so a rejection by the robust test can be understood as an indication of a parameter change. Song and Kang (2021) particularly emphasized that the score test is highly likely to miss significant changes when outlying observations are included and demonstrated that $\tilde T_n^\alpha$ with small α can effectively detect parameter changes in such situations. Hence, in the analysis of the Bitcoin return data with some deviating observations, a robust test like $\tilde T_n^\alpha$ is useful for detecting parameter changes. In what follows, the terms “change point”, “parameter change”, and “break” are used interchangeably.

### Remark 2

In order to find multiple changes, we use the binary segmentation procedure as follows: we first perform the test $\tilde T_n^\alpha$ on the whole series $\{r_1,\ldots,r_n\}$. If the null hypothesis is rejected, we split the series into two subseries $\{r_1,\ldots,r_{\hat k_\alpha}\}$ and $\{r_{\hat k_\alpha+1},\ldots,r_n\}$. Then, repeating the same procedure on each subseries until no change point is detected, one can locate multiple change points (cf. Aue and Horváth (2013)).
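The recursion in this remark can be sketched generically. In the sketch below, `binary_segmentation` and its `test` callback are hypothetical names; the callback stands in for any parameter change test and returns a p-value together with a change point estimate. To keep the example self-contained, the actual robust test is replaced by a lookup of the p-values and change points reported in Table 1, so the code illustrates only the splitting logic, not the test itself.

```python
def binary_segmentation(start, end, test, level=0.10, log=None):
    """Recursively split [start, end] while `test` rejects at `level`.

    `test(start, end)` must return (p_value, change_point); a change point k
    splits the segment into [start, k] and [k + 1, end].
    """
    if log is None:
        log = []
    p_value, change_point = test(start, end)
    rejected = p_value < level
    log.append((start, end, p_value, change_point if rejected else None))
    if not rejected:
        return [], log
    # Left subseries first, then right, which gives a depth-first order.
    left, _ = binary_segmentation(start, change_point, test, level, log)
    right, _ = binary_segmentation(change_point + 1, end, test, level, log)
    return left + [change_point] + right, log


# Stand-in for the robust test: p-values and change points from Table 1.
TABLE_1 = {
    (1, 2829): (0.017, 1801), (1, 1801): (0.044, 1389),
    (1, 1389): (0.021, 352), (1, 352): (0.160, None),
    (353, 1389): (0.040, 1045), (353, 1045): (0.187, None),
    (1046, 1389): (0.134, None), (1390, 1801): (0.014, 1449),
    (1390, 1449): (0.983, None), (1450, 1801): (0.308, None),
    (1802, 2829): (0.099, 2187), (1802, 2187): (0.389, None),
    (2188, 2829): (0.039, 2389), (2188, 2389): (0.001, 2309),
    (2188, 2309): (0.845, None), (2310, 2389): (0.772, None),
    (2390, 2829): (0.112, None),
}


def table_lookup_test(start, end):
    return TABLE_1[(start, end)]


change_points, iterations = binary_segmentation(1, 2829, table_lookup_test)
```

With m detected change points, the procedure performs 2m + 1 tests, since every rejection spawns two further segments. A practical implementation would also stop splitting once a segment becomes too short for reliable estimation; that safeguard is omitted here.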

### Remark 3

In the empirical analysis below, we use the AIC modified by Ninomiya (2015) to evaluate the adequacy of candidate models with and without change points. In the above GARCH (1, 1) model with m change points, the total number of parameters including change points is 4(m + 1) + m, so the penalty term of the naive AIC becomes 8(m + 1) + 2m. Ninomiya (2015), however, argued that an additional penalty of 4m, i.e., four times the number of change points, should be imposed on the model with change points. Following this, we calculate the penalty term of the AIC as 8(m + 1) + 6m.
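The penalty arithmetic in this remark is simple enough to check directly. The function names below are hypothetical; the formulas are the ones stated above.

```python
def naive_aic_penalty(m, params_per_segment=4):
    """Twice the parameter count: 2 * (4(m + 1) + m) for GARCH (1, 1)."""
    return 2 * (params_per_segment * (m + 1) + m)


def modified_aic_penalty(m, params_per_segment=4):
    """Naive penalty plus the additional 4m for the change points."""
    return naive_aic_penalty(m, params_per_segment) + 4 * m


penalty_no_break = modified_aic_penalty(0)      # 8
penalty_eight_breaks = modified_aic_penalty(8)  # 8 * 9 + 6 * 8 = 120
```

Each change point therefore costs 6 in the penalty rather than 2, on top of 8 for the four GARCH parameters of the extra segment.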

3. Empirical results

We conduct the test $\tilde T_n^\alpha$ with α > 0 at the significance level of 10% for the period from January 2013 to September 2020. For comparison purposes, we also perform the score test, i.e., $\tilde T_n^\alpha$ with α = 0 in Remark 1, and the following residual-based CUSUM test for parameter change:

$T_n^R:=\frac{1}{\sqrt{n}\,\hat\tau_n}\max_{1\le k\le n}\left|\sum_{t=1}^{k}\hat\varepsilon_t^2-\frac{k}{n}\sum_{t=1}^{n}\hat\varepsilon_t^2\right|,$

where the $\hat\varepsilon_t$ denote the residuals of the GARCH (1, 1) model above and $\hat\tau_n^2=\frac{1}{n}\sum_{t=1}^{n}\hat\varepsilon_t^4-\left(\frac{1}{n}\sum_{t=1}^{n}\hat\varepsilon_t^2\right)^2$. Under the null hypothesis of no parameter change, $T_n^R$ converges in distribution to $\sup_{0\le s\le 1}|B^o(s)|$, where $\{B^o(s)\mid 0\le s\le 1\}$ is the standard Brownian bridge. See, e.g., Kulperger and Yu (2005). The test is known to detect change points in GARCH models better than the score test (cf. Song and Kang (2018)).
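A minimal sketch of the residual-based CUSUM statistic is given below. The function name `residual_cusum_test` is hypothetical, and the residuals are assumed to be already computed from a fitted model. The p-value uses the classical series for the supremum of the absolute Brownian bridge, $P(\sup|B^o| > x) = 2\sum_{j\ge 1}(-1)^{j+1}e^{-2j^2x^2}$; the cutoff for very small statistics is a numerical convenience of this sketch.

```python
import math


def residual_cusum_test(residuals, terms=100):
    """Return (statistic, argmax k, asymptotic p-value) for squared residuals."""
    n = len(residuals)
    squares = [e * e for e in residuals]
    mean_sq = sum(squares) / n
    tau2 = sum(s * s for s in squares) / n - mean_sq ** 2
    if tau2 <= 0:
        raise ValueError("squared residuals must not be constant")
    partial, best, best_k = 0.0, -1.0, None
    for k, s in enumerate(squares, start=1):
        partial += s
        deviation = abs(partial - k * mean_sq)
        if deviation > best:
            best, best_k = deviation, k
    stat = best / (math.sqrt(n) * math.sqrt(tau2))
    if stat < 0.2:
        # The alternating series converges slowly here; the p-value is
        # equal to 1 up to many decimal places.
        return stat, best_k, 1.0
    p_value = 2.0 * sum((-1) ** (j + 1) * math.exp(-2.0 * j * j * stat * stat)
                        for j in range(1, terms + 1))
    return stat, best_k, min(1.0, max(0.0, p_value))


# Hypothetical residuals whose variance quadruples after the fourth point.
toy_residuals = [1.0, 1.0, -1.0, -1.0, 2.0, 2.0, -2.0, -2.0]
toy_stat, toy_k, toy_p = residual_cusum_test(toy_residuals)
```

For these toy residuals the maximum deviation occurs at k = 4, where the variance shifts, and the statistic equals $\sqrt{2}$.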

In the present analysis, we consider α values in {0.01, 0.02, . . ., 0.5}. The p-values of $\tilde T_n^\alpha$ are presented in Figure 2. One can see that $\tilde T_n^\alpha$ with α ≥ 0.04 produces a p-value of less than 0.1, indicating that a parameter change exists in the underlying GARCH model, whereas the p-values of the other two tests $T_n^R$ and $\tilde T_n^0$ are 0.730 and 0.873, respectively, and thus the null hypothesis is not rejected.

For each α ≥ 0.04, we apply the binary segmentation method in Remark 2 to find further changes and then calculate the AIC introduced in Remark 3 for each model with detected change points. The AICs are depicted in Figure 3, where the dashed red line is the AIC for the model without change point. The smallest AIC, 15545, is obtained at α = 0.22, where eight change points are detected, splitting the whole series into nine subperiods. The binary segmentation results for $\tilde T_n^\alpha$ with α = 0.22 are presented in Table 1, where one can see the p-values and detected change points over 17 iterations. The AIC of the model without break is 15636. In terms of the AIC, we therefore select the model (2.1) with the change points induced by α = 0.22 as the optimal model. Change points detected by some α values, including α = 0.22, are presented in Figure 4.

MDPD estimates and descriptive statistics for each subperiod divided by α = 0.22 are presented in Table 2, where ML estimates for the GARCH model without break and summary statistics for the whole series are also reported. From the table, we can clearly see that the parameter estimates differ across subperiods. For instance, in the 4th, 6th, and 7th subperiods, the estimate of the persistence parameter β1 is comparatively large, indicating that volatility was more persistent in these periods than in the other subperiods. For most subperiods, β1 is estimated to be larger than 0.789, the estimate from the model without any break. The relatively large estimates of the parameter α1 in the 1st, 2nd, 5th, and 8th subperiods indicate that the Bitcoin market was more sensitive to shocks during those subperiods. Recall that the MDPDE is less affected by outlying observations. This means that the MDPD estimates reflect the genuine dynamics of the underlying model in the presence of outlying observations such as those in the data presented here, and thus we can conclude that the estimation results are due to the existence of parameter changes. As discussed in Song and Kang (2021), we presume that the outlying observations hinder the score and residual-based tests from detecting changes in the parameters. From the descriptive statistics in Table 2 and the boxplots in Figure 5, one can also see apparent differences in the distribution of the Bitcoin returns, further supporting our findings.

We now calculate the one-step-ahead 95% VaR for the period from October 2020 to December 2020, 92 observations in total, and evaluate the out-of-sample forecasting performance. The following models are considered: the above GARCH (1, 1) model with parameter changes obtained by α = 0.22 (M1) and without break (M2), and additionally the power-transformed and threshold (PTT)–GARCH (1, 1) model with a nonzero mean and without break (M3). The conditional variance of the PTT–GARCH (1, 1) model is given by

$\sigma_t^{2\delta}=\omega+\alpha_1e_{t-1}^{2\delta}1_{(-\infty,0)}(e_{t-1})+\alpha_2e_{t-1}^{2\delta}1_{[0,\infty)}(e_{t-1})+\beta\sigma_{t-1}^{2\delta},$

where $e_t=r_t-\mu$ and $1_A$ denotes the indicator function. The PTT–GARCH model includes many GARCH-type models as special cases (cf. Hwang and Basawa (2004) and Pan et al. (2008)).

The one-step-ahead conditional variance of the models M1 and M2 to be used in calculating the VaR is given by

$\hat\sigma_{t+1\mid t}^2=\hat\omega_t+\hat\alpha_t(r_t-\hat\mu_t)^2+\hat\beta_t\tilde\sigma_t^2(\hat\theta_t),$

where $\hat\theta_t=(\hat\mu_t,\hat\omega_t,\hat\alpha_t,\hat\beta_t)$ is the estimate based on the data $\{r_{t_c+1},\ldots,r_t\}$ and $\tilde\sigma_t^2(\hat\theta_t)$ is obtained as in (2.2). For the model M1 with parameter change, $t_c$ is the last change point, that is, 18 July 2019, and $t_c$ is zero in the case of the model M2 without break. The one-step-ahead conditional variance for the model M3 is defined similarly, and $t_c$ is also set to zero. We use the MDPDE to estimate the model M1 since, in the data after the last change point, there is an outlying return of less than −40% that may lead to a bias in estimation. The models M2 and M3 are estimated using the MLE. Consequently, we compare the model incorporating parameter changes and a robust technique with the naive models without break estimated by the ordinary MLE, which is a common situation in empirical analysis.

The one-step-ahead 95% VaR at time t is then calculated as $\hat\mu_t+\Phi^{-1}(0.05)\,\hat\sigma_{t+1\mid t}$, where Φ is the cumulative distribution function of N(0, 1). The numbers of realized returns that violate the forecasted VaR are 3, 1, and 1 for the models M1, M2, and M3, respectively. The corresponding percentages of VaR violations are 3.26%, 1.09%, and 1.09%, so the model M1 produces a violation rate closer to the risk level of 5%. We also perform the backtest proposed by Kupiec (1995). P-values of 0.415 and 0.038 are obtained for the model M1 and for the two models M2 and M3, respectively, which also supports that the model M1 forecasts the VaR more accurately. The calculated 95% VaRs are depicted in Figure 6, where blue, red, and green points are the VaRs forecasted from the models M1, M2, and M3, respectively. We can see from the figure that the models M2 and M3 overestimate the risk.
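The VaR formula and the backtest can be sketched as follows. The sketch assumes that the Kupiec (1995) backtest is the proportion-of-failures likelihood ratio test with a chi-square limit with one degree of freedom; under that assumption the violation counts reported above give p-values that agree with the reported 0.415 and 0.038. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def value_at_risk(mu, sigma, level=0.05):
    """VaR forecast mu + Phi^{-1}(level) * sigma under conditional normality."""
    return mu + NormalDist().inv_cdf(level) * sigma


def kupiec_pof_pvalue(n_obs, n_violations, level=0.05):
    """P-value of the proportion-of-failures likelihood ratio test."""
    def log_lik(p):
        # Terms with a zero count contribute nothing (avoids log(0)).
        out = 0.0
        if n_violations > 0:
            out += n_violations * math.log(p)
        if n_obs - n_violations > 0:
            out += (n_obs - n_violations) * math.log(1.0 - p)
        return out

    lr = -2.0 * (log_lik(level) - log_lik(n_violations / n_obs))
    # Survival function of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(max(lr, 0.0) / 2.0))


# Violation counts reported in the text, out of 92 forecasts.
p_m1 = kupiec_pof_pvalue(92, 3)
p_m2_m3 = kupiec_pof_pvalue(92, 1)
rate_m1 = 100.0 * 3 / 92
rate_m2_m3 = 100.0 * 1 / 92
```

At the 5% significance level, the test rejects correct coverage for a single violation in 92 forecasts but not for three violations.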

4. Conclusion

In this study, we located some change points for the fitted GARCH models. In particular, considering the existence of some outlying observations in Bitcoin return series, we used the robust test for parameter change to detect change points. Based on the AIC that gives more penalty to the model with change points, we selected the model with optimal change points. The empirical results, including estimation results for each subperiod, showed that the whole period is meaningfully divided by the obtained change points, and the model with parameter changes provided a more accurate one-step-ahead 95% VaR compared to the VaRs from the naive models without any break. Our findings emphasize that the model allowing for parameter change can be better fitted to the Bitcoin data and thus can improve the accuracy of VaR forecasting.

Figures
Fig. 1. The plots of daily closing prices (left panel) and return series (right panel) of Bitcoin.
Fig. 2. P-values of the score test (α = 0) and $\tilde T_n^\alpha$ with α > 0. Dashed red line is the significance level of 10%.
Fig. 3. AICs of the GARCH (1, 1) model without break (dashed line) and with breaks induced by $\tilde T_n^\alpha$.
Fig. 4. The estimated change points and the optimal subperiods obtained by α = 0.22 (solid line).
Fig. 5. The boxplots of Bitcoin returns for each subperiod.
Fig. 6. One-step-ahead 95% VaRs obtained from the GARCH (1, 1) model with parameter change (blue), the GARCH (1, 1) model without break (red), and PTT–GARCH (1, 1) model without break (green).
TABLES

### Table 1

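The binary segmentation results of $\tilde T_n^\alpha$ with α = 0.22

| Iteration no. | Start pt. | End pt. | P-value | Chg. pt. ($\hat k_\alpha$) |
|---|---|---|---|---|
| 1 | 1 | 2829 | 0.017 | 1801 |
| 2 | 1 | 1801 | 0.044 | 1389 |
| 3 | 1 | 1389 | 0.021 | 352 |
| 4 | 1 | 352 | 0.160 | · |
| 5 | 353 | 1389 | 0.040 | 1045 |
| 6 | 353 | 1045 | 0.187 | · |
| 7 | 1046 | 1389 | 0.134 | · |
| 8 | 1390 | 1801 | 0.014 | 1449 |
| 9 | 1390 | 1449 | 0.983 | · |
| 10 | 1450 | 1801 | 0.308 | · |
| 11 | 1802 | 2829 | 0.099 | 2187 |
| 12 | 1802 | 2187 | 0.389 | · |
| 13 | 2188 | 2829 | 0.039 | 2389 |
| 14 | 2188 | 2389 | 0.001 | 2309 |
| 15 | 2188 | 2309 | 0.845 | · |
| 16 | 2310 | 2389 | 0.772 | · |
| 17 | 2390 | 2829 | 0.112 | · |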

### Table 2

MDPD estimates and descriptive statistics for each subperiod

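| No. | Period | μ̂ | ω̂ | α̂1 | β̂1 | N | Mean | Std | Skew. | Kurt. |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2013/01/01 – 2013/12/19 | 0.612 | 0.226 | 0.235 | 0.766 | 352 | 1.12 | 7.87 | −1.92 | 18.03 |
| 2 | 2013/12/20 – 2015/11/12 | −0.038 | 0.414 | 0.133 | 0.774 | 693 | −0.10 | 3.83 | −0.57 | 7.14 |
| 3 | 2015/11/13 – 2016/10/21 | 0.089 | 0.096 | 0.088 | 0.815 | 344 | 0.18 | 2.78 | −0.46 | 7.73 |
| 4 | 2016/10/22 – 2016/12/20 | 0.352 | 0.000 | 0.000 | 0.944 | 60 | 0.40 | 1.73 | −0.89 | 4.47 |
| 5 | 2016/12/21 – 2017/12/07 | 1.003 | 1.315 | 0.192 | 0.718 | 352 | 0.87 | 4.65 | 0.08 | 3.20 |
| 6 | 2017/12/08 – 2018/12/28 | −0.078 | 0.000 | 0.066 | 0.917 | 386 | −0.38 | 4.55 | −0.34 | 1.54 |
| 7 | 2018/12/29 – 2019/04/29 | 0.151 | 0.016 | 0.010 | 0.958 | 122 | 0.23 | 2.76 | 1.29 | 12.03 |
| 8 | 2019/04/30 – 2019/07/18 | 1.424 | 1.105 | 0.180 | 0.802 | 80 | 0.91 | 5.47 | −0.29 | 0.63 |
| 9 | 2019/07/19 – 2020/09/30 | −0.021 | 0.632 | 0.031 | 0.861 | 440 | 0.00 | 3.94 | −3.52 | 46.18 |
| Whole | 2013/01/01 – 2020/09/30 | 0.180 | 1.0727 | 0.171 | 0.789 | 2829 | 0.24 | 4.65 | −1.45 | 23.73 |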

References
1. Ardia D, Bluteau K, and Rüede M (2019). Regime changes in bitcoin GARCH volatility dynamics. Finance Research Letters, 29, 266-271.
2. Aue A and Horváth L (2013). Structural breaks in time series. Journal of Time Series Analysis, 34, 1-16.
3. Bariviera AF, Basgall MJ, Hasperue W, and Naiouf M (2017). Some stylized facts of the bitcoin market. Physica A: Statistical Mechanics and its Applications, 484, 82-90.
4. Berkes I, Horváth L, and Kokoszka P (2004). Testing for parameter constancy in GARCH(p,q) models. Statistics & Probability Letters, 70, 263-273.
5. Canh NP, Wongchoti U, Thanh SD, and Thong NT (2019). Systematic risk in cryptocurrency market: Evidence from DCC-MGARCH model. Finance Research Letters, 29, 90-100.
6. Caporale GM, Gil-Alana L, and Plastun A (2018). Persistence in the cryptocurrency market. Research in International Business and Finance, 46, 141-148.
7. Caporale GM and Zekokh T (2019). Modelling volatility of cryptocurrencies using Markov-Switching GARCH models. Research in International Business and Finance, 48, 143-155.
8. Chaim P and Laurini MP (2018). Volatility and return jumps in bitcoin. Economics Letters, 173, 158-163.
9. Chen C and Liu LM (1993). Joint estimation of model parameters and outlier effects in time series. Journal of the American Statistical Association, 88, 284-297.
10. Chu J, Chan S, Nadarajah S, and Osterrieder J (2017). GARCH modelling of cryptocurrencies. Journal of Risk and Financial Management, 10, 17.
11. Fearnhead P and Rigaill G (2019). Change point detection in the presence of outliers. Journal of the American Statistical Association, 114, 169-183.
12. Franses PH and Ghijsels H (1999). Additive outliers, GARCH and forecasting volatility. International Journal of Forecasting, 15, 1-9.
13. Glaser F, Zimmermann K, Haferkorn M, Weber MC, and Siering M (2014). Bitcoin-asset or currency? Revealing users’ hidden intentions. ECIS 2014 (Tel Aviv).
14. Horváth L and Rice G (2014). Extensions of some classical methods in change point analysis. TEST, 23, 219-255.
15. Hwang S and Basawa I (2004). Stationarity and moment structure for Box-Cox transformed threshold GARCH (1, 1) processes. Statistics & Probability Letters, 68, 209-220.
16. Jiang Y, Nie H, and Ruan W (2018). Time-varying long-term memory in bitcoin market. Finance Research Letters, 25, 280-284.
17. Katsiampa P (2017). Volatility estimation for bitcoin: A comparison of GARCH models. Economics Letters, 158, 3-6.
18. Kulperger R and Yu H (2005). High moment partial sum processes of residuals in GARCH models and their applications. The Annals of Statistics, 33, 2395-2422.
19. Kupiec P (1995). Techniques for verifying the accuracy of risk measurement models. The Journal of Derivatives, 3, 73-84.
20. Lee S and Song J (2009). Minimum density power divergence estimator for GARCH models. TEST, 18, 316-341.
21. Mensi W, Al-Yahyaee KH, and Kang SH (2019). Structural breaks and double long memory of cryptocurrency prices: A comparative analysis from bitcoin and ethereum. Finance Research Letters, 29, 222-230.
22. Ninomiya Y (2015). Change-point model selection via AIC. Annals of the Institute of Statistical Mathematics, 67, 943-961.
23. Pan J, Wang H, and Tong H (2008). Estimation and tests for power-transformed and threshold GARCH models. Journal of Econometrics, 142, 352-378.
24. Song J and Kang J (2018). Parameter change tests for ARMA-GARCH models. Computational Statistics & Data Analysis, 121, 41-56.
25. Song J and Kang J (2021). Test for parameter change in the presence of outliers: the density power divergence based approach. Journal of Statistical Computation and Simulation, 91, 1016-1039.
26. Thies S and Molnár P (2018). Bayesian change point analysis of bitcoin returns. Finance Research Letters, 27, 223-227.
27. Tiwari AK, Kumar S, and Pathak R (2019). Modelling the dynamics of bitcoin and litecoin: GARCH versus stochastic volatility models. Applied Economics, 51, 4073-4082.
28. Tsay RS (1988). Outliers, level shifts, and variance changes in time series. Journal of Forecasting, 7, 1-20.