Stochastic structures of world’s death counts after World War II
Communications for Statistical Applications and Methods 2022;29:353-371
Published online May 31, 2022
© 2022 Korean Statistical Society.

Jae J. Lee1,a

aSchool of Business, State University of New York, USA
Correspondence to: 1 School of Business, State University of New York, 1 Hawk Road, New Paltz, NY, USA. E-mail: leej@newpaltz.edu
Received November 4, 2021; Revised December 18, 2021; Accepted December 22, 2021.
 Abstract
This paper analyzes post-World War II death counts of several countries to identify and compare their stochastic structures. Three structural time series models are entertained: a local level with a random walk model, a fixed local linear trend model, and a local linear trend model. Structural time series models assume that a time series can be formulated directly in terms of unobserved components such as trend, slope, seasonal, cycle, and daily effect. The random effect of each unobserved component is characterized by its own stochastic structure and the distribution of its irregular component. Structural time series models use the Kalman filter to estimate the unknown parameters of a stochastic model, to predict future data, and to filter the data. This paper identifies the best-fitted stochastic model for three types of death counts (female, male, and total) of each country. Two diagnostic procedures are used to check the validity of the fitted models. Three criteria, AIC, BIC, and SSPE, are used to select the best-fitted valid stochastic model for each type of death count of each country.
Keywords : death counts, structural time series model, state space form, Kalman filter
1. Introduction

This paper analyzes yearly death counts after World War II for 8 countries in three regions (North America, Europe, and the Asia-Pacific region) to identify and compare the stochastic structures of death counts. The 8 countries are the United States and Canada in North America; the United Kingdom, France, Italy, and Spain in Europe; and Taiwan and Australia in the Asia-Pacific region. Death counts start from the year 1946 or 1970 (depending on availability) and are separated into female, male, and total counts to see whether gender influences the stochastic structures. Structural time series models (Harvey, 1981, 1989) assume that a time series can be formulated directly in terms of unobserved components such as trend, slope, seasonal, cycle, and daily effect. The random effect of each unobserved component is characterized by its stochastic structure and the distribution of its irregular component. The structural time series models entertained in this paper are a local level with a random walk model, a fixed local linear trend model, and a local linear trend model. These models are sensible choices based on a preliminary examination of the death counts data. Structural time series models use the Kalman filter (Kalman, 1960) to estimate the unknown parameters of the entertained model, to forecast future values of the time series, and to estimate the unobserved components of the stochastic model by filtering and smoothing. To apply the Kalman filter, a structural time series model needs to be converted to a state space form (Durbin and Koopman, 2012), which is the standardized form for the Kalman filter. To check validity, two diagnostic procedures are applied to each fitted model. The first checks the normality of the residuals using the Shapiro-Wilk test, with the normal QQ plot and density plot used to confirm its results. The second checks the independence of the residuals using the Run test.
To find the best-fitted model among the valid models, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the sum of squares of one-step-ahead prediction errors (SSPE) are used. The best-fitted valid models of female, male, and total death counts for each country are examined for differences and similarities among countries and regions. The paper is organized as follows. Section 2 presents the structural time series models and shows how to set up the state space form for each model entertained. Section 3 introduces the Kalman filter and identifies the R packages used to run it. Section 4 presents the results of analyzing female, male, and total death counts of the 8 countries. Finally, Section 5 concludes the paper.

2. Structural time series models and their state space forms

Structural time series models assume that a time series can be formulated directly in terms of unobserved components, each characterized by its own stochastic structure and an irregular term. By varying the stochastic structure and the distribution of the irregular term of each unobserved component, structural time series models can fit a variety of time series in many fields. The outcomes of a structural time series model are estimates of the unknown parameters, forecasts of future values of the time series, and estimates of the unobserved components in the model. The estimates of the unobserved components give insight into the stochastic structure of the time series of interest. There are many papers on structural time series models. For example, Harvey and Todd (1983) compared the structural time series model with Box and Jenkins' ARIMA model. Harvey and Peters (1990) presented a number of methods for computing the maximum likelihood estimator of the unknown parameters of the structural time series model.

For the death counts data, three structural time series models are entertained: the local level with a random walk model, the fixed local linear trend model, and the local linear trend model. These models were chosen after examining plots of female, male, and total death counts of the 8 countries. The plots for all 8 countries are presented in Section 4.

2.1. Local level with a random walk model

The first structural time series model entertained in this paper is the local level with a random walk (LLRW) model:

$$
\begin{cases}
y_t = \mu_t + \varepsilon_t, \\
\mu_t = \mu_{t-1} + \eta_t,
\end{cases}
\tag{2.1}
$$

where $y_t$ is the observed time series, $\mu_t$ is an unobserved trend component that represents the long-term movement in the time series, $\eta_t$ is an irregular component that drives the stochastic behavior of the trend, and $\varepsilon_t$ is an irregular component that captures the stochastic behavior other than the trend defined in the second equation of (2.1).

2.2. Local linear trend model

The second model entertained is the local linear trend (LT) model:

$$
\begin{cases}
y_t = \mu_t + \varepsilon_t, \\
\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t, \\
\beta_t = \beta_{t-1} + \zeta_t,
\end{cases}
\tag{2.2}
$$

where $y_t$ is the observed time series, $\mu_t$ is an unobserved trend component, $\beta_t$ is an unobserved slope component, and $\zeta_t$ is an irregular component that drives the stochastic behavior of the slope of the trend. $\eta_t$ and $\varepsilon_t$ are defined as in (2.1). In this model, not only the trend of the series but also the slope of the trend is stochastic, each with its own irregular component.

2.3. Fixed local linear trend model

The third model is the fixed local linear trend (FT) model, a variation of the LT model:

$$
\begin{cases}
y_t = \mu_t + \varepsilon_t, \\
\mu_t = \mu_{t-1} + \beta + \eta_t,
\end{cases}
\tag{2.3}
$$

where $y_t$ is the observed time series, $\beta$ is a deterministic slope of the trend, and $\mu_t$, $\eta_t$, and $\varepsilon_t$ are defined as in (2.1) and (2.2). This model assumes that the series has a deterministic slope rather than a stochastic one.
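To make the difference between the three models concrete, the following is a minimal simulation sketch of equations (2.1) to (2.3) in Python. The function name `simulate`, its argument names, and the use of standard deviations rather than variances are illustrative choices, not part of the paper.

```python
import random

def simulate(model, n, level0, slope0=0.0, sd_eta=0.0, sd_zeta=0.0, sd_eps=0.0, seed=0):
    """Simulate n observations from the LLRW, LT, or FT model.

    level0 and slope0 play the role of mu_0 and beta_0 (or the fixed beta for FT).
    """
    rng = random.Random(seed)
    mu, beta = level0, slope0
    y = []
    for _ in range(n):
        if model == "LLRW":
            # (2.1): the level is a pure random walk.
            mu = mu + rng.gauss(0.0, sd_eta)
        elif model in ("LT", "FT"):
            # (2.2)/(2.3): the level moves by the previous slope plus a shock.
            mu = mu + beta + rng.gauss(0.0, sd_eta)
            if model == "LT":
                # Only the LT model lets the slope itself drift.
                beta = beta + rng.gauss(0.0, sd_zeta)
        else:
            raise ValueError("model must be 'LLRW', 'LT' or 'FT'")
        # Observation equation: level plus irregular term.
        y.append(mu + rng.gauss(0.0, sd_eps))
    return y

# With all shocks switched off, the FT model is a straight line with slope 2.
print(simulate("FT", 3, 100.0, slope0=2.0))
```

Setting `sd_zeta` to zero in the LT model reproduces the FT model, which is the sense in which FT is a variation of LT.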

2.4. State space form

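A structural time series model uses the Kalman filter to estimate unknown parameters, to forecast future values, and to estimate unobserved components. To use the Kalman filter, a state space form (Durbin and Koopman, 2012) is required. The state space form is a standardized form of a stochastic model that serves as the input for the Kalman filter. The LLRW model (2.1) is already in state space form, with a 1 × 1 state vector, $\mu_t$, and two irregular components, $\eta_t$ and $\varepsilon_t$. The state space form of the LT model (2.2) is:

$$
\begin{cases}
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix} + \varepsilon_t, \\[4pt]
\begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_{t-1} \\ \beta_{t-1} \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \eta_t \\ \zeta_t \end{pmatrix}.
\end{cases}
\tag{2.4}
$$

The state space form (2.4) has a 2 × 1 state vector, $(\mu_t, \beta_t)^\top$, and three irregular components, $\eta_t$, $\zeta_t$, and $\varepsilon_t$. The state space form of the FT model (2.3) is:

$$
\begin{cases}
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix} + \varepsilon_t, \\[4pt]
\begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_{t-1} \\ \beta_{t-1} \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \eta_t.
\end{cases}
\tag{2.5}
$$

The state space form (2.5) has a 2 × 1 state vector, $(\mu_t, \beta_t)^\top$, and two irregular components, $\eta_t$ and $\varepsilon_t$. Because no irregular term enters the second row of the transition equation, $\beta_t = \beta_{t-1} = \beta$ for all $t$.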
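The system matrices of the three state space forms can be written down directly. The sketch below stores them as nested Python lists and applies one transition step; the dictionary layout and the helper names `matvec` and `transition` are hypothetical and chosen only for illustration.

```python
# Observation vector z, transition matrix T and selection matrix R per model.
MODELS = {
    "LLRW": {"z": [1.0], "T": [[1.0]], "R": [[1.0]]},
    "LT": {"z": [1.0, 0.0], "T": [[1.0, 1.0], [0.0, 1.0]], "R": [[1.0, 0.0], [0.0, 1.0]]},
    "FT": {"z": [1.0, 0.0], "T": [[1.0, 1.0], [0.0, 1.0]], "R": [[1.0], [0.0]]},
}

def matvec(matrix, vector):
    """Multiply a nested-list matrix by a list vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def transition(model, state, shocks):
    """Return T * state + R * shocks, i.e. one step of the transition equation."""
    m = MODELS[model]
    moved = matvec(m["T"], state)
    noise = matvec(m["R"], shocks)
    return [a + b for a, b in zip(moved, noise)]

# A noise-free LT step: level 100 with slope 2 moves to level 102, slope unchanged.
print(transition("LT", [100.0, 2.0], [0.0, 0.0]))
```

In the FT case the shock vector has a single element, and the zero in the second row of R is what keeps the slope fixed.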

3. Kalman filter

The Kalman filter has been applied in many fields since Kalman's first paper (Kalman, 1960) was published. Harrison and Stevens (1971) was the first paper to apply the Kalman filter to time series analysis. Since then, the Kalman filter has been used to analyze several time series models such as ARIMA models (Box and Jenkins, 1976), structural time series models (Harvey, 1989), and ARMAX models (Hannan and Deistler, 1988). Areas where it has been applied include disease control (Gove and Houston, 1996), actuarial claim reserve forecasting (Chukhrova and Johannssen, 2017), rainfall forecasting (Asemota et al., 2016; Zulfi et al., 2018), and machine learning (Nobrega and Oliveira, 2019).

The Kalman filter can be applied to either univariate or multivariate time series, and to either time-variant or time-invariant structures. Let $Y_t$, for $t = 1, 2, \ldots, T$, denote the observed values of the time series of interest; $Y_t$ may be univariate or multivariate. Let $\alpha_t$ denote the vector of unobserved components, called the state vector. For the Kalman filter, the observed data $Y_t$ and the unobserved state vector $\alpha_t$ are linearly related as

$$
Y_t = Z_t \alpha_t + \varepsilon_t, \qquad t = 1, 2, \ldots, T,
\tag{3.1}
$$

where $Y_t$ is N × 1 (N = 1 for the univariate case and N > 1 for the multivariate case), $Z_t$ is an N × m matrix, $\alpha_t$ is an m × 1 vector, and $\varepsilon_t$ is an N × 1 vector of irregular components. In equation (3.1), $Z_t$ is assumed to be a known quantity that describes the relationship between $Y_t$ and $\alpha_t$, and $\varepsilon_t$ is assumed to have a multivariate normal distribution with an N × 1 zero mean vector and an N × N covariance matrix, $h_t$. Equation (3.1) is called the observation equation. For a univariate $Y_t$, the observation equation (3.1) can be written as

$$
Y_t = z_t \alpha_t + \varepsilon_t, \qquad t = 1, 2, \ldots, T,
\tag{3.2}
$$

where $Y_t$ is a scalar, $z_t$ is a 1 × m vector, $\alpha_t$ is an m × 1 vector, and $\varepsilon_t$ is a scalar irregular component that is normally distributed with mean zero and scalar variance $h_t$. The other equation required by the Kalman filter is the transition equation, which describes the stochastic behavior of the state vector $\alpha_t$. The transition equation is also linear:

$$
\alpha_t = T_t \alpha_{t-1} + R_t \gamma_t, \qquad t = 1, 2, \ldots, T,
\tag{3.3}
$$

where $T_t$ is an m × m matrix, $R_t$ is an m × g matrix, and $\gamma_t$ is a g × 1 vector of irregular terms assumed to have a multivariate normal distribution with a g × 1 zero mean vector and a g × g covariance matrix, $Q_t$. The matrices $Z_t$, $T_t$, and $R_t$ and the covariance matrices $h_t$ and $Q_t$ may or may not change over time. If they do not change over time, the model is called the time-invariant Kalman filter. Otherwise, it is called the time-variant Kalman filter. Equations (3.1) and (3.3) together are called the multivariate state space form of the Kalman filter, and (3.2) and (3.3) together are called the univariate state space form of the Kalman filter. Comparing with (3.2) and (3.3), the state space form of the LLRW model (2.1) has the following identities:

$$
\begin{cases}
\alpha_t = \mu_t, \quad z_t = 1, \quad T_t = 1, \quad R_t = 1, \quad \varepsilon_t = \varepsilon_t, \quad \gamma_t = \eta_t, \\
h_t = \operatorname{Var}(\varepsilon_t) := \mathrm{VarE}, \quad Q_t = \operatorname{Var}(\eta_t) := \mathrm{VarN}.
\end{cases}
\tag{3.4}
$$

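The state space form of the LT model (2.4) has the following identities:

$$
\begin{cases}
\alpha_t = \begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix}, \quad z_t = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad T_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad R_t = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \varepsilon_t = \varepsilon_t, \quad \gamma_t = \begin{pmatrix} \eta_t \\ \zeta_t \end{pmatrix}, \\[4pt]
h_t = \operatorname{Var}(\varepsilon_t) := \mathrm{VarE}, \quad Q_t = \operatorname{Var}\begin{pmatrix} \eta_t \\ \zeta_t \end{pmatrix} := \begin{pmatrix} \mathrm{VarN} & 0 \\ 0 & \mathrm{VarK} \end{pmatrix}.
\end{cases}
\tag{3.5}
$$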

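The state space form of the FT model (2.5) has the following identities:

$$
\begin{cases}
\alpha_t = \begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix}, \quad z_t = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad T_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad R_t = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \varepsilon_t = \varepsilon_t, \quad \gamma_t = \eta_t, \\[4pt]
h_t = \operatorname{Var}(\varepsilon_t) := \mathrm{VarE}, \quad Q_t = \operatorname{Var}(\eta_t) := \mathrm{VarN}.
\end{cases}
\tag{3.6}
$$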

The Kalman filter rests on three assumptions: A1) the initial state vector $\alpha_0$ has mean $a_0$ and covariance matrix $P_0$; A2) the disturbances $\varepsilon_t$ and $\gamma_t$ are independent of each other (this assumption can be relaxed); A3) the disturbances $\varepsilon_t$ and $\gamma_t$ are independent of the initial state $\alpha_0$.

Given the information on the initial state vector $\alpha_0$, the Kalman filter starts its recursive algorithm. The recursion provides two kinds of estimates of the state vector $\alpha_t$. The first is the filtered estimate, which is the estimate of $\alpha_t$ given $Y_t = (y_t, \ldots, y_1)$. The second is the forecast estimate, which is the estimate of $\alpha_t$ for $t = T + 1, T + 2, \ldots$ given $Y_T = (y_T, \ldots, y_1)$. The filtered estimates of the state vector $\alpha_t$ provide the estimates of the unobserved components in the structural time series models. These estimates are minimum mean square error (MMSE) estimators.

The information required to start the Kalman filter is the mean and covariance matrix of the initial state vector $\alpha_0$, namely $a_0$ and $P_0$. If the state vector $\alpha_t$ is nonstationary, the distribution of $\alpha_0$ is specified as a diffuse prior whose mean is the first observation in the series and whose covariance matrix is $kI$, where $k$ is a very large scalar and $I$ is the m × m identity matrix, with m the dimension of the state vector $\alpha_t$. For the LLRW model, $\alpha_t = \mu_t$ is nonstationary and m = 1. Therefore, $a_0$ is set to $y_1$ and $P_0$ is set to 1.2e+10. For the LT model, $\alpha_t = (\mu_t, \beta_t)^\top$ is also nonstationary and m = 2. Therefore, $a_0$ is set to $(y_1, 0)^\top$ and $P_0$ is set to $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$, where k = 1.2e+10. For the FT model, $\alpha_t = (\mu_t, \beta_t)^\top$ is also nonstationary with m = 2, but the slope $\beta_t$ is deterministic. Therefore, $a_0$ is set to $(y_1, \beta)^\top$ and $P_0$ is set to $\begin{pmatrix} k & 0 \\ 0 & 0 \end{pmatrix}$, where k = 1.2e+10 and the value of $\beta$ is the slope estimate obtained from a simple linear regression of the yearly data on time.
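The recursion is easiest to see for the LLRW model, where every quantity is a scalar. The sketch below uses the standard textbook prediction and update steps, which are not spelled out in this paper, together with the diffuse start described above (mean equal to the first observation, variance 1.2e+10). The function name and return layout are illustrative assumptions.

```python
def local_level_filter(y, var_n, var_e, p0=1.2e10):
    """Run a scalar Kalman filter for the LLRW model.

    Returns (filtered levels, one-step-ahead prediction errors, their variances).
    Assumes var_n + var_e > 0 so that the prediction error variance stays positive.
    """
    a, p = y[0], p0  # diffuse start: a0 = y1, P0 = k
    filtered, innovations, variances = [], [], []
    for obs in y:
        # Prediction step: a random walk keeps its mean, its variance grows by VarN.
        a_pred = a
        p_pred = p + var_n
        # One-step-ahead prediction error and its variance.
        v = obs - a_pred
        f = p_pred + var_e
        # Update step with Kalman gain p_pred / f.
        a = a_pred + (p_pred / f) * v
        # Algebraically p_pred * (1 - gain); this form avoids cancellation when p_pred is huge.
        p = p_pred * var_e / f
        filtered.append(a)
        innovations.append(v)
        variances.append(f)
    return filtered, innovations, variances

# Toy numbers, not taken from the death counts data.
levels, errors, error_vars = local_level_filter([10.0, 12.0, 11.0], var_n=1.0, var_e=1.0)
print(levels)
```

The prediction errors and their variances returned here are the ingredients of both the likelihood and the standardized residuals used later for diagnostics.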

In the state space form given by (3.1) and (3.3) for a multivariate time series, or by (3.2) and (3.3) for a univariate time series, the matrices $Z_t$, $T_t$, $h_t$, and $Q_t$ may contain unknown parameters. These parameters must be estimated before the recursive algorithm is run. From (3.4), (3.5), and (3.6), the unknown parameters of the LLRW, LT, and FT models are $\psi_{\mathrm{LLRW}}$ = (VarN, VarE), $\psi_{\mathrm{LT}}$ = (VarN, VarK, VarE), and $\psi_{\mathrm{FT}}$ = (VarN, VarE), respectively. If the disturbances $\varepsilon_t$ and $\eta_t$ are normally distributed, the likelihood function of the observations can be obtained from the Kalman filter via the prediction error decomposition (Harvey and Peters, 1990). The unknown parameters are estimated by maximizing this likelihood function.
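For reference, the prediction error decomposition for a univariate series takes the following standard form. The symbols $v_t$ (one-step-ahead prediction error) and $F_t$ (its variance) are notation introduced here for illustration and do not appear elsewhere in this paper; $a_{t|t-1}$ denotes the predicted state given observations up to $t-1$.

```latex
\log L(\psi) \;=\; -\frac{T}{2}\log(2\pi)
  \;-\; \frac{1}{2}\sum_{t=1}^{T}\left(\log F_t + \frac{v_t^{2}}{F_t}\right),
\qquad
v_t = y_t - z_t\, a_{t|t-1},
\quad
F_t = \operatorname{Var}(v_t).
```

Under a diffuse prior, the first few terms of the sum are dominated by the arbitrarily large initial variance, so implementations commonly drop or adjust them; how the software used in this paper handles that detail is not stated here.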

To analyze our data with the Kalman filter, the R function StructTS in the R package stats is used. StructTS first estimates the unknown parameters in the matrices $Z_t$, $T_t$, $h_t$, and $Q_t$ and then, given these estimates, provides predictions and filtered estimates for a univariate time series based on the state space form of (3.2) and (3.3). To estimate the unknown parameters, StructTS calls the R function optim in the R package stats, with L-BFGS-B as the optimization method. L-BFGS-B, described by Byrd et al. (1995), uses a limited-memory modification of BFGS, which is a quasi-Newton method, and allows box constraints, meaning that each variable can be given a lower and/or upper bound. For our models, since the unknown parameters are variances of irregular terms, the lower bound is set to 0 and the upper bound is set to infinity. The initial values passed to optim can be any nonnegative numbers. The outputs of StructTS are the estimates of the unknown parameters, the log-likelihood, the standardized residuals, and the filtered estimates of the state vector $\alpha_t$. The standardized residuals are used for the diagnostics of the fitted models. From the filtered estimates of $\alpha_t$, the estimates of the unobserved components of the LLRW, LT, and FT models are obtained.
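The estimation step can be illustrated without R. The sketch below evaluates the Gaussian log-likelihood of the LLRW model from the filter's prediction errors and maximizes it over a coarse grid of positive variances. The grid search is a deliberately simple stand-in for L-BFGS-B, not what StructTS does, and skipping the first likelihood term is an assumption made here to handle the diffuse start. All names and the toy series are hypothetical.

```python
import math

def local_level_loglik(y, var_n, var_e, p0=1.2e10):
    """Log-likelihood of the LLRW model via the prediction error decomposition."""
    a, p = y[0], p0
    loglik = 0.0
    for t, obs in enumerate(y):
        p_pred = p + var_n
        v = obs - a              # one-step-ahead prediction error
        f = p_pred + var_e       # its variance
        if t > 0:                # assumption: drop the term dominated by the diffuse prior
            loglik -= 0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        a += (p_pred / f) * v
        p = p_pred * var_e / f
    return loglik

def grid_search(y, grid):
    """Return (loglik, var_n, var_e) with the highest log-likelihood on the grid."""
    best = None
    for var_n in grid:
        for var_e in grid:
            ll = local_level_loglik(y, var_n, var_e)
            if best is None or ll > best[0]:
                best = (ll, var_n, var_e)
    return best

# Toy series, not taken from the death counts data; variances restricted to be positive.
toy = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.0, 15.0]
grid = [10.0 ** k for k in range(-2, 3)]
best_ll, best_var_n, best_var_e = grid_search(toy, grid)
print(best_var_n, best_var_e)
```

A gradient-based optimizer with a lower bound of zero explores the same surface far more efficiently and can return an estimate exactly on the boundary, which a positive grid cannot.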

4. Analysis of data

4.1. Death counts data

The yearly death counts of the 8 countries analyzed in this paper are extracted from the Human Mortality Database (HMD). The HMD provides several kinds of data, such as death counts, census counts, birth counts, and population estimates, for the calculation of death rates and life tables. The main goal of the HMD is to document the longevity revolution of the modern era and to facilitate research into its causes and consequences. The HMD covers relatively wealthy and highly industrialized countries because it is limited by design to populations where death registration and census data are virtually complete. In this paper, death counts of 8 countries are analyzed: two countries in North America (U.S. and Canada), four countries in Europe (U.K., France, Italy, and Spain), and two countries in the Asia-Pacific region (Taiwan and Australia). From the HMD death counts, three series are generated for each country: female death counts, male death counts, and total death counts. The annual periods covered are U.S. (1946–2017), Canada (1946–2016), U.K. (1946–2016), France (1946–2017), Spain (1946–2018), Italy (1946–2017), Australia (1946–2018), and Taiwan (1970–2014). The HMD keeps death counts for Taiwan only from the year 1970. This paper uses data after World War II (1939–1945) because there are serious outliers during the war years, especially for the European countries.

4.2. Diagnostics and Best fitted model selection

All three models, LLRW, FT, and LT, are fitted to the three types of death counts (female, male, and total) of all 8 countries. For each fitted model, two diagnostic procedures based on residuals are applied to check the validity of the model. The residuals in the Kalman filter are the one-step-ahead prediction errors (also called innovations), which are obtained by running the Kalman filter with the estimated values of the unknown parameters of the fitted model. Standardized residuals are the one-step-ahead prediction errors divided by their standard deviations. If a fitted model is valid, the standardized residuals are independent and identically normally distributed (Harvey, 1989). Thus, the first diagnostic procedure checks the normality of the standardized residuals and the second checks their independence. The Shapiro-Wilk test is used to check normality, and normal QQ plots and density plots are used to confirm its conclusions. The Run test is used to check the independence of the standardized residuals. Models that pass both diagnostic procedures are treated as valid. Once the valid models are identified for each type of data, a best-fitted valid model is selected based on AIC (Akaike information criterion), BIC (Bayesian information criterion), and SSPE (sum of squares of one-step-ahead prediction errors). If these three criteria do not recommend a model unanimously, the model recommended by most criteria is selected as the best-fitted valid model.
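The quantities in this procedure are simple to compute once the prediction errors are available. The sketch below standardizes residuals, runs a two-sided runs test, and computes the three selection criteria. It assumes the Wald-Wolfowitz form of the runs test with a normal approximation and a cutoff of zero, and the usual definitions AIC = -2 log L + 2k and BIC = -2 log L + k log n; the paper does not state which variants were used, so these are assumptions, as are all function names.

```python
import math

def standardize(innovations, variances):
    """Divide each one-step-ahead prediction error by its standard deviation."""
    return [v / math.sqrt(f) for v, f in zip(innovations, variances)]

def runs_test_pvalue(residuals, cutoff=0.0):
    """Two-sided p-value of the Wald-Wolfowitz runs test (normal approximation)."""
    signs = [r > cutoff for r in residuals if r != cutoff]
    n1 = sum(signs)              # residuals above the cutoff
    n2 = len(signs) - n1         # residuals below the cutoff
    n = n1 + n2
    if n1 == 0 or n2 == 0 or n < 3:
        raise ValueError("need residuals on both sides of the cutoff")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (runs - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))

def sspe(innovations):
    """Sum of squares of one-step-ahead prediction errors."""
    return sum(v * v for v in innovations)

def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    return -2.0 * loglik + n_params * math.log(n_obs)

# A perfectly alternating residual series has too many runs to be independent.
print(runs_test_pvalue([1.0, -1.0] * 10) < 0.01)
```

A Shapiro-Wilk test is omitted because the Python standard library does not provide one.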

4.3. North American countries, U.S. and Canada

Figure 1 shows plots of female, male, and total death counts for the U.S. and Canada. Both the U.S. and Canadian data show a similar linear trend for female, male, and total death counts. In both countries, male death counts are larger than female death counts up to the year 2000, after which the two are close to each other. Since both countries' data show a clear linear trend, it is sensible to fit the LT and FT models. The LT model assumes that the slope of the linear trend is stochastic, and the FT model assumes that it is deterministic. Results for the LLRW model are also provided for comparison. Table 1 shows the p-values of the Shapiro-Wilk tests for normality of the standardized residuals of the three models fitted to the three types of death counts in the U.S. and Canada. For both countries, none of the three models is rejected for normality at the 0.01 level. Figures 2 and 3 show normal QQ plots for the U.S. and Canada, respectively. These plots confirm the results of the Shapiro-Wilk tests. Table 2 shows the p-values of the Run tests for independence of the standardized residuals. For both countries, none of the three models is rejected for independence at the 0.01 level. Thus, for both countries, all three models are valid. Table 3 presents the best model for each type of data based on AIC, BIC, and SSPE among the valid models of the U.S. and Canada. If a model is not selected unanimously by all three criteria, the most commonly recommended one is selected as the best. For example, for total deaths of the U.S., the FT model is selected as the best one. Thus, for the U.S., the FT model is the best for both female and total deaths and the LT model is the best for male deaths. For Canada, the LT model is the best for both female and total deaths and the FT model is the best for male deaths.
Table 4 presents the estimated values of the unknown parameters of the best model for each type of deaths in the U.S. and Canada. Recall from Section 3 that the unknown parameters of the LLRW, LT, and FT models are, respectively, $\psi_{\mathrm{LLRW}}$ = (VarN, VarE), $\psi_{\mathrm{LT}}$ = (VarN, VarK, VarE), and $\psi_{\mathrm{FT}}$ = (VarN, VarE). Thus, for the FT and LLRW models, the parameter estimate for VarK is NA.
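The selection rule amounts to a majority vote over the three criteria. A minimal sketch follows, using the recommendations listed in Table 3; the function name is hypothetical, and returning `None` when all three criteria disagree is an assumption, since the paper does not say how such a tie would be resolved.

```python
from collections import Counter

def majority_choice(recommendations):
    """Return the model recommended by most criteria, or None if there is a tie."""
    ranked = Counter(recommendations).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: no rule is given for a complete disagreement
    return ranked[0][0]

# Recommendations by (AIC, BIC, SSPE), copied from Table 3.
table_3 = {
    ("U.S.", "Female"): ["FT", "FT", "FT"],
    ("U.S.", "Male"): ["LT", "LT", "LT"],
    ("U.S.", "Total"): ["FT", "FT", "LT"],
    ("Canada", "Female"): ["LT", "LT", "LT"],
    ("Canada", "Male"): ["FT", "FT", "LT"],
    ("Canada", "Total"): ["LT", "FT", "LT"],
}
best = {key: majority_choice(votes) for key, votes in table_3.items()}
print(best[("U.S.", "Total")])
```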

4.4. European countries, U.K. and France

Figures 4 and 7 show plots of female, male, and total death counts for the U.K. and France, and for Italy and Spain, respectively. Deaths in the European countries do not show the clear linear trend seen in the North American countries. Table 5 shows the p-values of the Shapiro-Wilk tests for normality of the standardized residuals of the models fitted to the three types of death counts in the U.K. and France. For both countries, none of the three models is rejected for normality at the 0.01 level. Figures 5 and 6 show normal QQ plots for the U.K. and France, respectively. These plots confirm the results of the Shapiro-Wilk tests. Table 6 shows the p-values of the Run tests for independence of the standardized residuals. For both countries, none of the three models is rejected for independence at the 0.01 level. Thus, for both countries, all three models are valid. Table 7 presents the best model for each type of data based on AIC, BIC, and SSPE among the valid models for the U.K. and France. As for the North American countries, if a model is not selected unanimously, the most commonly recommended one is selected. For the U.K., the LT model is the best for all three types, and for France, the LLRW model is the best for all three types. Table 8 shows the estimates of the unknown parameters of the best model for each type of deaths in the U.K. and France. As in Table 4, the parameter estimates for VarK are NA for the LLRW models.

4.5. European countries, Italy and Spain

Table 9 shows the p-values of the Shapiro-Wilk tests for normality of the standardized residuals of the models fitted to the three types of death counts in Italy and Spain. For Italy, none of the models is rejected for normality at the 0.01 level. For Spain, however, the LLRW models for both male and total deaths are rejected for normality at the 0.01 level. Figures 8 and 9 show the normal QQ plots for Italy and Spain, respectively. For Spain, the normal QQ plots of the LLRW model for both male and total deaths clearly show non-normality (a left-skewed shape), which confirms the results of the Shapiro-Wilk tests. Figure 10 shows again the normal QQ plots of the male and total LLRW models of Spain, together with the corresponding density plots of the standardized residuals. The density plots clearly show the left-skewed shape. Table 10 shows the p-values of the Run tests for Italy and Spain. For both countries, none of the three models is rejected for independence at the 0.01 level. Thus, for the two countries, all models except the male LLRW and total LLRW models of Spain pass both diagnostic procedures and are valid. Table 11 presents the best model for each type of data based on AIC, BIC, and SSPE among the valid models for Italy and Spain. For Italy, the LT model is the best for female deaths and the LLRW model is the best for both male and total deaths. For Spain, the LT model is the best for all three types. Table 12 shows the estimates of the unknown parameters of the best-fitted model for each type of deaths in Italy and Spain.

4.6. Asia-Pacific countries, Taiwan and Australia

Figure 11 shows plots of female, male, and total death counts for Taiwan and Australia. Deaths in these two countries show a clear linear trend similar to that of the two North American countries discussed above. Table 13 shows the p-values of the Shapiro-Wilk tests for normality of the standardized residuals of the models fitted to the three types of death counts in Taiwan and Australia. None of the models for any of the three types of deaths in Taiwan and Australia is rejected for normality at the 0.01 level. Figures 12 and 13 show the normal QQ plots for Taiwan and Australia, respectively. These plots confirm the results of the Shapiro-Wilk tests. Table 14 shows the p-values of the Run tests for Taiwan and Australia. For both countries, none of the three models is rejected for independence at the 0.01 level. Thus, for both countries, all three models are valid. Table 15 presents the best model for each type of data based on AIC, BIC, and SSPE among the valid models for Taiwan and Australia. For Taiwan, the LT model is the best for both female and total deaths and the FT model is the best for male deaths. For Australia, the FT model is the best for all three types of deaths. Table 16 shows the estimates of the unknown parameters of the best-fitted model for each type of deaths in Taiwan and Australia.

4.7. Signal extraction for unobserved components

One advantage of fitting a structural time series model with the Kalman filter is that the unobserved components of the time series can be estimated. Both the LLRW and the FT models have an unobserved trend component, $\mu_t$. The LT model has two unobserved components, the trend $\mu_t$ and the slope $\beta_t$. The Kalman filter provides filtered estimates of these unobserved components. For U.S. deaths, Table 3 shows that the best-fitted models for female, male, and total deaths are the FT, the LT, and the FT model, respectively. Figure 14 shows the filtered estimates of $\mu_t$ for U.S. female deaths, the filtered estimates of $\mu_t$ and $\beta_t$ for U.S. male deaths, and the filtered estimates of $\mu_t$ for U.S. total deaths.

5. Conclusions

This paper analyzes female, male, and total death counts of 8 countries in several regions of the world using three structural time series models with the Kalman filter. The three structural models are a local level with a random walk (LLRW) model, a fixed local linear trend (FT) model, and a local linear trend (LT) model. The LLRW model implies that the level of the data moves stochastically as a random walk, so it suits data without a clear linear pattern. The FT model implies that the level of the data moves stochastically with a deterministic slope, so it suits data with a clear linear pattern. The LT model implies that the level of the data moves stochastically with a stochastic slope. Because both level and slope move stochastically, this model is the most flexible and suits data both with and without a clear linear pattern. Death counts of all three types in the two North American countries and the two Asia-Pacific countries show similar linear trends. The best-fitted stochastic models for both the North American and the Asia-Pacific countries are either the FT or the LT model. Table 17 shows the signal to noise ratios for trend (VarN/VarE) and for slope (VarK/VarE) of the best-fitted stochastic models for the two North American countries. For the U.S., female deaths show a higher signal to noise ratio for trend than male deaths. For Canada, however, female deaths show a lower signal to noise ratio for trend than male deaths. Signal to noise ratios for slope are small for both countries. Note that NA in the table refers to the FT model, which has no VarK. Table 18 shows the signal to noise ratios for the two Asia-Pacific countries. It shows that the two countries do not have high signal to noise ratios for trend. Male deaths of both countries have higher signal to noise ratios than female deaths. Taiwan has low signal to noise ratios for slope as well.
European countries do not show any clear linear trend. The best-fitted stochastic models for the European countries are either the LLRW or the LT model. Tables 19 and 20 show that the European countries have low signal to noise ratios for both trend and slope.
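The signal to noise ratios are simple quotients of the variance estimates. As an illustration, the sketch below recomputes them for the two North American countries from the estimates reported in Table 4; the dictionary layout and function name are hypothetical, and Table 17 itself is not reproduced here.

```python
# (VarN, VarK, VarE) of the best-fitted models, copied from Table 4; None marks NA.
TABLE_4 = {
    ("U.S.", "Female", "FT"): (1.922503e08, None, 3.115608e07),
    ("U.S.", "Male", "LT"): (5.625876e07, 1.459582e07, 7.079134e07),
    ("U.S.", "Total", "FT"): (7.584381e08, None, 1.080131e08),
    ("Canada", "Female", "LT"): (5.618963e05, 1.730798e04, 3.000166e05),
    ("Canada", "Male", "FT"): (1.437916e06, None, 4.166670e04),
    ("Canada", "Total", "LT"): (3.156514e06, 3.567977e04, 7.245420e05),
}

def signal_to_noise(var_n, var_k, var_e):
    """Return (VarN/VarE, VarK/VarE); the slope ratio is None when VarK is NA."""
    trend_ratio = var_n / var_e
    slope_ratio = None if var_k is None else var_k / var_e
    return trend_ratio, slope_ratio

ratios = {key: signal_to_noise(*values) for key, values in TABLE_4.items()}
for key, (trend_ratio, slope_ratio) in ratios.items():
    print(key, trend_ratio, slope_ratio)
```

The recomputed trend ratios reproduce the ordering described above: higher for female than for male deaths in the U.S., and lower for female than for male deaths in Canada.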

Tables 4, 8, 12, and 16 show the estimated variances of the irregular components, VarN, VarK, and VarE. These variances show how much each component in the structural model moves stochastically up and down. For example, VarN shows how much the level of the trend, $\mu_t$, moves stochastically over time, VarK shows how much the slope, $\beta_t$, moves stochastically over time, and VarE shows how much the observation, $y_t$, moves stochastically around the unobserved trend, $\mu_t$, over time. Thus, large variances do not imply that the trend and the slope have large magnitudes, but that their stochastic movements are large. Large values of VarN and VarK for our data therefore imply that the trend and the slope move a lot stochastically. To illustrate, for Australia the best model for both female and total deaths is the FT model, and Table 18 shows that their VarN/VarE ratios are similar to each other, but Table 16 shows that VarN of total deaths is much larger than that of female deaths. This implies that the level of the trend for total deaths moves stochastically more than that of female deaths. Figure 15 shows the Kalman filter estimates of the trend for both female and total deaths. The plots of the two estimated trends show that the trend of total deaths is more volatile than that of female deaths. The standard deviation of the trend of total deaths is 23064.75 and that of female deaths is 12681.7.

In Tables 18 through 20, the signal to noise ratio for trend (VarN/VarE) of the LT model is zero for the female model of Taiwan, the male and total models of the U.K., the female and total models of Italy, and all models of Spain. For these cases, the LT model (2.2) reduces to

$$
\begin{cases}
y_t = \mu_t + \varepsilon_t, \\
\mu_t = \mu_{t-1} + \beta_{t-1}, \\
\beta_t = \beta_{t-1} + \zeta_t.
\end{cases}
$$

In this adjusted model, the level of the trend at time $t$ is determined only by the level and slope at time $t - 1$, without an irregular component of its own. The stochastic movement of the level of the trend is then driven purely by the stochastic movement of the slope, whose variance is VarK.
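One way to see what the adjusted model implies is to difference the trend twice. The short derivation below uses only the three equations of the adjusted model; the difference operator $\Delta$ is notation introduced here.

```latex
\begin{aligned}
\Delta \mu_t &= \mu_t - \mu_{t-1} = \beta_{t-1}, \\
\Delta^{2} \mu_t &= \Delta \mu_t - \Delta \mu_{t-1}
  = \beta_{t-1} - \beta_{t-2} = \zeta_{t-1}.
\end{aligned}
```

The second difference of the trend is therefore a single irregular term with variance VarK, so the trend is an integrated random walk, which is typically smoother than the trend of the full LT model.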

Figures
Fig. 1. Death counts for female, male and total of U.S. and Canada.
Fig. 2. QQ plots of U.S.
Fig. 3. QQ plots of Canada.
Fig. 4. Death counts for female, male and total of U.K. and France.
Fig. 5. QQ plots of U.K.
Fig. 6. QQ plots of France.
Fig. 7. Death counts for female, male and total of Italy and Spain.
Fig. 8. QQ plots of Italy.
Fig. 9. QQ plots of Spain.
Fig. 10. QQ plots and Density plots for male and total of Spain.
Fig. 11. Death counts for female, male and total of Taiwan and Australia.
Fig. 12. QQ plots of Taiwan.
Fig. 13. QQ plots of Australia.
Fig. 14. Filtered estimates of unobserved components for U.S. female, male and total deaths.
Fig. 15. Estimates of trend component for female and total deaths of Australia.
TABLES

Table 1

p-values of Shapiro-Wilk tests for death counts of U.S. and Canada

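| | U.S. LLRW | U.S. FT | U.S. LT | Canada LLRW | Canada FT | Canada LT |
|---|---|---|---|---|---|---|
| Female | 0.936 | 0.666 | 0.717 | 0.724 | 0.610 | 0.723 |
| Male | 0.400 | 0.585 | 0.659 | 0.578 | 0.406 | 0.198 |
| Total | 0.930 | 0.956 | 0.993 | 0.867 | 0.916 | 0.603 |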

Table 2

p-values of Run tests for death counts of U.S. and Canada

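| | U.S. LLRW | U.S. FT | U.S. LT | Canada LLRW | Canada FT | Canada LT |
|---|---|---|---|---|---|---|
| Female | 0.042 | 0.549 | 0.631 | 0.809 | 0.336 | 0.543 |
| Male | 0.281 | 0.281 | 0.998 | 0.471 | 0.471 | 0.395 |
| Total | 0.281 | 0.904 | 0.998 | 0.809 | 0.809 | 0.543 |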

Table 3

Best models for death counts of U.S. and Canada

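| | U.S. AIC | U.S. BIC | U.S. SSPE | Canada AIC | Canada BIC | Canada SSPE |
|---|---|---|---|---|---|---|
| Female | FT | FT | FT | LT | LT | LT |
| Male | LT | LT | LT | FT | FT | LT |
| Total | FT | FT | LT | LT | FT | LT |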

Table 4

Estimates of parameters for death counts of U.S. and Canada

| U.S. | VarN | VarK | VarE | Canada | VarN | VarK | VarE |
|---|---|---|---|---|---|---|---|
| Female by FT | 1.922503E+08 | NA | 3.115608E+07 | Female by LT | 5.618963E+05 | 1.730798E+04 | 3.000166E+05 |
| Male by LT | 5.625876E+07 | 1.459582E+07 | 7.079134E+07 | Male by FT | 1.437916E+06 | NA | 4.166670E+04 |
| Total by FT | 7.584381E+08 | NA | 1.080131E+08 | Total by LT | 3.156514E+06 | 3.567977E+04 | 7.245420E+05 |

Table 5

p-values of Shapiro-Wilk tests for death counts of U.K. and France

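| | U.K. LLRW | U.K. FT | U.K. LT | France LLRW | France FT | France LT |
|---|---|---|---|---|---|---|
| Female | 0.999 | 0.995 | 0.869 | 0.819 | 0.553 | 0.489 |
| Male | 0.829 | 0.902 | 0.345 | 0.377 | 0.251 | 0.306 |
| Total | 0.996 | 0.969 | 0.807 | 0.802 | 0.448 | 0.314 |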

Table 6

p-values of Run tests for death counts of U.K. and France

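| | U.K. LLRW | U.K. FT | U.K. LT | France LLRW | France FT | France LT |
|---|---|---|---|---|---|---|
| Female | 0.471 | 0.471 | 0.902 | 0.719 | 0.719 | 0.631 |
| Male | 0.092 | 0.809 | 0.902 | 0.073 | 0.073 | 0.054 |
| Total | 0.809 | 0.809 | 0.543 | 0.401 | 0.188 | 0.631 |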

Table 7

Best models for death counts of U.K. and France

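| | U.K. AIC | U.K. BIC | U.K. SSPE | France AIC | France BIC | France SSPE |
|---|---|---|---|---|---|---|
| Female | LT | LT | LLRW | LLRW | LLRW | LLRW |
| Male | LT | LT | LLRW | LLRW | LLRW | LLRW |
| Total | LT | LT | LLRW | LLRW | LLRW | LLRW |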

Table 8

Estimates of parameters for death counts of U.K. and France

| U.K. | VarN | VarK | VarE | France | VarN | VarK | VarE |
|---|---|---|---|---|---|---|---|
| Female by LT | 6.732207E+06 | 2.012691E+05 | 4.868096E+07 | Female by LLRW | 1.462518E+07 | NA | 4.612672E+07 |
| Male by LT | 0.000000E+00 | 5.816742E+05 | 3.787161E+07 | Male by LLRW | 9.614802E+06 | NA | 2.995601E+07 |
| Total by LT | 0.000000E+00 | 2.140323E+06 | 1.745933E+08 | Total by LLRW | 4.473024E+07 | NA | 1.475258E+08 |

Table 9

p-values of Shapiro-Wilk tests for death counts of Italy and Spain

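| | Italy LLRW | Italy FT | Italy LT | Spain LLRW | Spain FT | Spain LT |
|---|---|---|---|---|---|---|
| Female | 0.689 | 0.303 | 0.168 | 0.019 | 0.124 | 0.503 |
| Male | 0.453 | 0.089 | 0.129 | 0.000* | 0.021 | 0.119 |
| Total | 0.519 | 0.215 | 0.098 | 0.002* | 0.066 | 0.278 |

*Significant at the 0.01 level.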


Table 10

p-values of Run tests for death counts of Italy and Spain

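| | Italy LLRW | Italy FT | Italy LT | Spain LLRW | Spain FT | Spain LT |
|---|---|---|---|---|---|---|
| Female | 0.402 | 0.188 | 0.336 | 0.812 | 0.476 | 0.719 |
| Male | 0.719 | 0.719 | 0.631 | 0.476 | 0.476 | 0.719 |
| Total | 0.402 | 0.188 | 0.336 | 0.476 | 0.235 | 0.719 |

*Significant at the 0.01 level.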


Table 11

Best models for death counts of Italy and Spain

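| | Italy AIC | Italy BIC | Italy SSPE | Spain AIC | Spain BIC | Spain SSPE |
|---|---|---|---|---|---|---|
| Female | LT | LT | LLRW | LT | LT | LLRW |
| Male | LLRW | LLRW | LLRW | LT | LT | LLRW |
| Total | LT | LLRW | LLRW | LT | LT | LLRW |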

Table 12

Estimates of parameters for death counts of Italy and Spain

| Italy | VarN | VarK | VarE | Spain | VarN | VarK | VarE |
|---|---|---|---|---|---|---|---|
| Female by LT | 0.000000E+00 | 7.834581E+05 | 4.880840E+07 | Female by LT | 0.000000E+00 | 2.607934E+05 | 1.998147E+07 |
| Male by LLRW | 3.068192E+07 | NA | 2.589466E+07 | Male by LT | 0.000000E+00 | 5.494487E+05 | 2.062101E+07 |
| Total by LLRW | 1.113686E+08 | NA | 1.315861E+08 | Total by LT | 0.000000E+00 | 1.511941E+06 | 7.905366E+07 |

Table 13

p-values of Shapiro-Wilk tests for death counts of Taiwan and Australia

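| | Taiwan LLRW | Taiwan FT | Taiwan LT | Australia LLRW | Australia FT | Australia LT |
|---|---|---|---|---|---|---|
| Female | 0.623 | 0.570 | 0.939 | 0.243 | 0.243 | 0.358 |
| Male | 0.623 | 0.862 | 0.916 | 0.265 | 0.187 | 0.483 |
| Total | 0.450 | 0.655 | 0.938 | 0.167 | 0.189 | 0.198 |

*Significant at the 0.01 level.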


Table 14

p-values of Run tests for death counts of Taiwan and Australia

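| | Taiwan LLRW | Taiwan FT | Taiwan LT | Australia LLRW | Australia FT | Australia LT |
|---|---|---|---|---|---|---|
| Female | 0.361 | 0.127 | 0.278 | 0.018 | 0.018 | 0.402 |
| Male | 0.127 | 0.761 | 0.442 | 0.342 | 0.998 | 0.719 |
| Total | 0.761 | 0.361 | 0.641 | 0.154 | 0.342 | 0.402 |

*Significant at the 0.01 level.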


Table 15

Best models for death counts of Taiwan and Australia

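| | Taiwan AIC | Taiwan BIC | Taiwan SSPE | Australia AIC | Australia BIC | Australia SSPE |
|---|---|---|---|---|---|---|
| Female | LT | LT | FT | FT | FT | LT |
| Male | LT | FT | FT | FT | FT | LLRW |
| Total | LT | LT | LT | FT | FT | LLRW |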

Table 16

Estimates of parameters for death counts of Taiwan and Australia

| Taiwan | VarN | VarK | VarE | Australia | VarN | VarK | VarE |
|---|---|---|---|---|---|---|---|
| Female by LT | 0.000000E+00 | 2.669466E+04 | 3.679238E+05 | Female by FT | 7.231256E+05 | NA | 7.907837E+05 |
| Male by FT | 1.282648E+06 | NA | 3.192780E+05 | Male by FT | 9.721123E+05 | NA | 7.088836E+05 |
| Total by LT | 1.765875E+05 | 1.316371E+05 | 2.106885E+06 | Total by FT | 3.093034E+06 | NA | 2.932334E+06 |

Table 17

Signal to Noise Ratio for U.S. and Canada

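| U.S. | VarN/VarE | VarK/VarE | Canada | VarN/VarE | VarK/VarE |
|---|---|---|---|---|---|
| Female by FT | 6.171 | NA | Female by LT | 1.873 | 0.058 |
| Male by LT | 0.795 | 0.206 | Male by FT | 34.510 | NA |
| Total by FT | 7.022 | NA | Total by LT | 4.357 | 0.049 |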

Table 18

Signal to Noise Ratio for Taiwan and Australia

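| Taiwan | VarN/VarE | VarK/VarE | Australia | VarN/VarE | VarK/VarE |
|---|---|---|---|---|---|
| Female by LT | 0.00 | 0.07 | Female by FT | 0.91 | NA |
| Male by FT | 4.02 | 0.04 | Male by FT | 1.37 | NA |
| Total by LT | 0.08 | 0.06 | Total by FT | 1.05 | NA |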

Table 19

Signal to Noise Ratio for U.K. and France

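| U.K. | VarN/VarE | VarK/VarE | France | VarN/VarE | VarK/VarE |
|---|---|---|---|---|---|
| Female by LT | 0.14 | 0.00 | Female by LLRW | 0.32 | NA |
| Male by LT | 0.00 | 0.02 | Male by LLRW | 0.32 | NA |
| Total by LT | 0.00 | 0.01 | Total by LLRW | 0.30 | NA |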

Table 20

Signal to Noise Ratio for Italy and Spain

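| Italy | VarN/VarE | VarK/VarE | Spain | VarN/VarE | VarK/VarE |
|---|---|---|---|---|---|
| Female by LT | 0.00 | 0.02 | Female by LT | 0.00 | 0.01 |
| Male by LLRW | 1.18 | NA | Male by LT | 0.00 | 0.03 |
| Total by LLRW | 0.85 | NA | Total by LT | 0.00 | 0.02 |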

References
  1. Asemota OJ, Bamanga MA, and Alaribe OJ (2016). Modelling seasonal behavior of rainfall in northeast Nigeria: A state space approach. International Journal of Statistics and Applications, 6, 203-222.
  2. Box GEP and Jenkins GM (1976). Time Series Analysis: Forecasting and Control, San Francisco, Holden-Day.
  3. Byrd RH, Lu P, Nocedal J, and Zhu C (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16, 1190-1208.
  4. Chukhrova N and Johannssen A (2017). State space models and the Kalman-filter in stochastic claims reserving: Forecasting, filtering and smoothing. Risks, 5, 30.
  5. Durbin J and Koopman SJ (2012). Time Series Analysis by State Space Methods (2nd Ed), Oxford, Oxford University Press.
  6. Gove JH and Houston DR (1996). Monitoring the growth of American beech affected by beech bark disease in Maine using Kalman filter. Environmental and Ecological Statistics, 3, 167-187.
  7. Harvey AC (1981). Time Series Models, Deddington, Oxford University Press, Oxford.
  8. Harvey AC and Todd PHJ (1983). Forecasting economic time series with structural and Box-Jenkins models, (with discussion). Journal of Business and Economic Statistics, 1, 299-315.
  9. Harvey AC (1989). Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge, Cambridge University Press.
  10. Harvey AC and Peters S (1990). Estimation procedures for structural time series models. Journal of Forecasting, 9, 89-108.
  11. Hannan EJ and Deistler M (1988). The Statistical Theory of Linear Systems, New York, Wiley.
  12. Harrison PJ and Stevens CF (1971). A Bayesian approach to short-term forecasting. Operational Research Quarterly, 22, 341-362.
  13. Kalman RE (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82, 34-45.
  14. Nobrega JP and Oliveira ALI (2019). A sequential learning with Kalman filter and extreme learning machine for regression and time series forecasting. Neurocomputing, 337, 235-250.
  15. Zulfi M, Hasan M, and Purnomo KD (2018). The development rainfall forecasting using Kalman filter. Journal of Physics: Conference Series, 1008, 012006.