N-point modified exponential model for household projections in Korea using multi-point register-based census data
Communications for Statistical Applications and Methods 2024;31:377-391
Published online July 31, 2024
© 2024 Korean Statistical Society.

Saebom Jeon (a), Tae Yeon Kwon (1, b)

(a) Department Marketing Bigdata, Mokwon University, Korea;
(b) Department of International Finance, Hankuk University of Foreign Studies, Korea
Correspondence to: 1 Department of International Finance, Hankuk University of Foreign Studies, 81 Mohyeon-myeon, Oedae-ro Cheoin-gu, Yongin-si, Gyeonggi-do 17035, Korea. Email: tykwon@hufs.ac.kr.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT)(NRF-2022R1F1A1065520 to Jeon, and NRF-2021R1F1A1059513 to Kwon) and this work was supported by Hankuk University of Foreign Studies Research Fund.
Received November 29, 2023; Revised December 24, 2023; Accepted December 24, 2023.
Abstract
Accurate household projections are essential for sectors such as housing supply and tax policy planning, given rapid social changes such as declining birthrates, an aging population, and a rise in single-person households, all of which affect household size and type. Korea introduced its first register-based census in 2015, transitioning from a five-year general survey-based approach to an annual administrative data-based census. This change allows rapid demographic shifts and trends to be captured more frequently and effectively. However, because of these rapid dynamics, it has also made future projection with the existing household projection model more difficult. This paper proposes a new household projection method, the N-point Modified Exponential Model (MEM), that accurately reflects register-based census data and mitigates the impact of rapid demographic changes, in three variants: the Weighted N-point MEM, the Regression-based N-point MEM, and the Rolling Weighted N+point MEM. Using register-based census data from 2016 to 2020 to forecast household headship rates by age, household size, and household type to 2051, the N-point modified exponential model outperformed the existing model in both long- and short-term forecast accuracy, suggesting its suitability as a future household projection model for Korea.
Keywords : two-point exponential model, register-based census, household projection, headship rate
1. Introduction

Household projection means forecasting future household sizes and types based on future population projections, while considering recent trends in household changes. Households serve as the smallest common living group in society and fulfill the function of population reproduction. Economically, they represent the basic unit of consumption and the foundational unit of supply and demand in the housing market and for durable goods. Therefore, information on both the quantitative and qualitative changes in households is essential for formulating social and economic government policies.

The pace of changes in household structures is accelerating across different times and countries (Hu and Peng, 2015; Jacobsen et al., 2012). This is also observed in Korea, where the rapid shift towards single-person and elderly households underscores the growing importance of accurate household projections (Kim et al., 2018). Different countries have varying approaches to these projections: Korea and Japan publish their future household projections every five years, coinciding with their census schedules (Kim et al., 2018; National Institute of Population and Social Security Research, Japan, 2018; Kajiwara et al., 2022). In contrast, Australia releases its projections irregularly, but also based on a five-yearly census (Australian Bureau of Statistics, 2019; Wilson, 2013). England, Scotland, and Wales, on the other hand, update their household projections every two or three years, based on a ten-yearly census (Nash, 2021; Taylor, 2020; Office for National Statistics, UK, 2020).

However, due to both internal and external challenges in the survey environment and increasing difficulty and costs associated with traditional censuses, several countries, including Denmark, Sweden, Australia, and Singapore, have transitioned to a register-based census. Unlike traditional censuses where surveyors personally visit households, a register-based census generates statistics using administrative data sources like resident registers, building registers, and immigration records (Park and Lee, 2017; Jun, 2020). Korea introduced its first register-based census in 2015, transitioning from a five-year general survey-based approach to an annual administrative data-based census.

The change in census not only shortens the data cycle but also provides more data points within the same time interval, allowing population trends and fluctuations to be captured over shorter intervals. While the register-based census effectively captures rapid shifts in Korea’s population and household structure, these quick changes can result in unrealistic projections when using the existing two-point modified exponential model for headship rate projection. In this paper, we introduce a new and robust household projection model that accurately reflects register-based census data and maintains stability in long-term projections, even in rapid and transient trend changes.

Household projections are generally categorized into two types: static methods, which rely on population and household distributions at a specific point in time, and dynamic methods, which account for changes in individuals and households over time. Depending on the data source, these methods can be macro, using census or registration data, or micro, employing samples or individual data (Wilson, 2013; van Imhoff et al., 2013; Bell et al., 1995). In major countries like the United States, Canada, and the United Kingdom, as well as in Korea, macro-static methods based on the census are commonly used. Since 2012, Korea has also implemented a semi-dynamic method, the household headship rate method, which considers changes in marital status. The headship rate is the proportion of individuals who become heads of households, differentiated by gender and age, and it is distinct from the concept of a family, referring to ‘one or more people who realistically live together’ (Bell and Cooper, 1990; Ironmonger and Lloyd-Smith, 1992; Kim et al., 2018). It is noted that the majority of countries using the headship rate method adopt a two-point exponential model for projection due to its simplicity and the suitability of its data requirements (Glick, 1957; Alias et al., 2018).

Given that the two-point exponential model relies on two censuses with extended time intervals, it is not ideally suited for register-based censuses that offer multiple data points over shorter intervals. Therefore, we aimed to develop a new household headship rate estimation model that accounts for household changes and uses all the accumulated time-series information, an advantage of the register-based census, in which data from multiple time points accumulate at short intervals. In this paper, we propose N-point modified exponential methods (MEM), which are designed to reflect the dynamics of the household headship rate at various times using registration data, and consider three modifications: the weighted N-point MEM, the regression-based N-point MEM, and the rolling weighted N+point MEM.

Using these methods, we project future household headship rates by number of household members and by household type based on the 2016 to 2020 resident registration-based census statistics in Korea. First, we conduct long-term projections up to 2051 and compare these results with those from the existing two-point exponential model. Long-term projections over 20 years using the traditional two-point exponential model tend to produce divergent fluctuations and unstable projections. In contrast, the MEM methods reduce this divergence and instability. In particular, the rolling weighted N+point MEM ensures stable projection results.

Second, we assess the accuracy of the short-term projection using the 2021 and 2022 data from the updated actual register-based censuses. The results show that the MEM methods generally outperform the two-point exponential model. When the rate of change in the headship rate is consistently increasing or decreasing, the regression-based N-point MEM shows better performance compared to both the weighted N-point MEM and the rolling weighted N+point MEM.

The paper is organized as follows: After the introduction in Section 1, Section 2 describes the two-point exponential model and discusses the limitations of its previous extension to three points. Section 3 introduces the multipoint exponential models proposed in this paper, including the weighted N-point MEM, regression-based N-point MEM, and rolling weighted N+point MEM. Section 4 summarizes the time series of the headship rate in Korea using register-based census data. Section 5 presents the results of household projections in Korea, comparing the performance of the MEM methods with the traditional two-point exponential model. The paper concludes with final remarks in Section 6.

2. Two-point exponential model and the limitations of previous extension

Many countries, including England, Scotland, France, Canada, Japan, and Korea, have implemented future household projections based on headship rates. There are two primary methodologies for projecting these rates: One extrapolates changes linearly or exponentially into the future using a mathematical formula; the other models the relationship between headship rates and variables such as socio-economic factors and government policies, developing scenarios for projection (Leiwen and O’Neill, 2004).

Many national statistical organizations favor the mathematical method because of its simplicity, intuitiveness, and convenience. The two-point exponential model is the most representative. This model uses headship rates observed at the two most recent censuses, segmented by age, to forecast future rates by age. England (Nash, 2021), Scotland (Taylor, 2020), Japan (National Institute of Population and Social Security Research, Japan, 2018), and Korea (Statistics Korea, 2019) employ this model, while France (INSEE France, 2024) and Canada (Statistics Canada, 2020) base their projections on scenario settings. Generally, when there is a considerable interval between censuses, as is often the case with traditional censuses, it is logical to use the changes observed in the two most recent censuses for future projections.

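The two-point (modified) exponential model is based on the headship rates $h_{x,t_1}$ and $h_{x,t_2}$ from the two most recent census times, $t_1$ and $t_2$, where $t_1 < t_2$. It projects the headship rate $h_{x,t^*}$ by age $x$ at some future time $t^*$, where $t^* > t_2$, as given in equation (1).

$$\tilde{h}_{x,t^*} = \delta + \alpha\beta^{\gamma_t}, \tag{1}$$

where

$$\delta = \begin{cases} 1 & \text{if } h_{x,t_2} > h_{x,t_1} \\ 0 & \text{if } h_{x,t_2} \le h_{x,t_1} \end{cases}, \qquad \alpha = h_{x,t_1} - \delta, \qquad \beta = \frac{h_{x,t_2} - \delta}{h_{x,t_1} - \delta}, \qquad \gamma_t = \frac{t^* - t_1}{t_2 - t_1}.$$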
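Equation (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `two_point_mem` is hypothetical, and the example input uses the aggregate single-person shares from Table 1 as proportions, whereas the paper applies the model to age-specific rates.

```python
def two_point_mem(h1, h2, t1, t2, t_star):
    """Project a headship rate to t_star from (t1, h1) and (t2, h2), as in equation (1)."""
    if not t1 < t2:
        raise ValueError("t1 must be earlier than t2")
    # delta is the limiting value as t_star grows (for rates strictly between
    # 0 and 1): 1 for a rising rate, 0 otherwise.
    delta = 1.0 if h2 > h1 else 0.0
    alpha = h1 - delta
    # Undefined when h1 == delta (e.g. h1 == 0 with a non-rising rate).
    beta = (h2 - delta) / (h1 - delta)
    gamma = (t_star - t1) / (t2 - t1)
    return delta + alpha * beta ** gamma


# Illustrative input only: single-person shares for 2019 and 2020 from Table 1,
# converted to proportions (an assumption made for this sketch).
for year in (2020, 2030, 2051):
    print(year, round(two_point_mem(0.2990, 0.3124, 2019, 2020, year), 4))
```

By construction the projection reproduces the observed rate at $t_2$, and every later value is driven entirely by the single change between the two observed points.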

The two-point exponential model projects future headship rates based on the trend observed in the two most recent census periods, assuming that the increase or decrease in these rates will continue. However, if the change over these two periods is too large, the predicted headship rate for the distant future could diverge to an unrealistic or illogical value. The key parameter in this model, $\gamma_t$, which determines the trend of change, is sometimes adjusted with a power function as the projection horizon lengthens. This adjustment, however, can be arbitrary and subjective due to the lack of clear criteria. Therefore, England and Scotland, which conduct censuses every 10 years, apply the two-point exponential model for projections spanning only up to a decade (Nash, 2021; Taylor, 2020).

To address this limitation, an extension to three points has been explored (Statistics Korea, 2019). In the three-point exponential model, which uses headship rates $h_{x,t_1}$, $h_{x,t_2}$, and $h_{x,t_3}$ to project the headship rate $h_{x,t^*}$ by age $x$ at some future point $t^* > t_3$, we define $\alpha$, $\beta$, and $\delta$ in equation (1) as follows:

$$\delta = h_{x,t_2} - \alpha, \qquad \alpha = \frac{h_{x,t_3} - h_{x,t_2}}{\beta - 1}, \qquad \text{and} \qquad \beta = \frac{h_{x,t_3} - h_{x,t_2}}{h_{x,t_2} - h_{x,t_1}}.$$

However, this method can yield unstable estimates if the headship rate does not consistently increase or decrease between $t_1$ and $t_3$. In the three-point model, an illogical situation can occur in which the sign of the term $\alpha\beta^{2}$ reverses, depending on whether the direction of change in the headship rate varies or remains consistent between the observed periods.

This means that a direct extension of the 2-point exponential model to three points has limited applicability; it is only effective when the direction of change in the headship rate remains consistent throughout the entire period. Given the rapid and varied changes currently observed in headship rates, this model is not a suitable alternative model for multi-point data.
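The instability can be seen numerically with a small sketch of the three-point parameters. The function name and the two input series are hypothetical, and equally spaced observation times are assumed.

```python
def three_point_params(h1, h2, h3):
    """Return (delta, alpha, beta) of the three-point model.

    Assumes three equally spaced observations. Undefined when h2 == h1
    or when the change is exactly linear (beta == 1).
    """
    beta = (h3 - h2) / (h2 - h1)
    alpha = (h3 - h2) / (beta - 1)
    delta = h2 - alpha
    return delta, alpha, beta


# Hypothetical series: one keeps rising, one rises and then falls.
steady = three_point_params(0.30, 0.32, 0.33)
turning = three_point_params(0.30, 0.32, 0.31)

print("steady  (delta, alpha, beta):", steady)
print("turning (delta, alpha, beta):", turning)
```

When the direction of change reverses, $\beta$ becomes negative, so successive integer powers of $\beta$ alternate in sign and fractional powers are not real-valued. This is one way to read the sign reversal described above.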

3. N-point modified exponential model

Register-based census data, which are observed more frequently, can capture the dynamics of household composition more accurately for household projection. This requires a more sophisticated model that incorporates headship rates at multiple time points. However, as previously discussed, the existing extended three-point model suffers from consistency and stability problems when trends change. Moreover, extending this form to four or more points, accumulating N points, necessitates alterations to the model structure, which complicates its application. Considering that the two-point exponential model is widely used by many statistical agencies for its modest data requirements and simplicity (Alias et al., 2018), an overly complex formula that grows with each accumulated census point may be impractical as a projection model.

To address this issue, we propose a new household projection model suitable for short-cycle registration censuses. This model extends the two-point exponential model with three key considerations: (1) The structure of model remains consistent, regardless of the length of the time series history. (2) It avoids unstable and illogical projections that may arise from trend changes. (3) It prevents divergence in long-term projections.

Focusing on these aspects, we suggest three modified N-point exponential models: the weighted N-point MEM, the regression-based N-point MEM, and the rolling weighted N+point MEM.

3.1. Weighted N-point modified exponential model

Suppose the headship rate $h_{x,t}$ by age $x$ is observed at $N$ time points $t_1, t_2, \ldots, t_{N-1}, t_N$. The Weighted N-point MEM projects the headship rate $h_{x,t^*}$ by age $x$ at future point $t^* > t_N$ as follows:

$$h_{x,t^*}^{(weighted)} = \delta_N + \alpha_N\beta_N^{\gamma_{t_N}},$$

where

$$\delta_N = \begin{cases} 1 & \text{if } h_{x,t_N} > \mu_{x,N-1} \\ 0 & \text{if } h_{x,t_N} \le \mu_{x,N-1} \end{cases}, \qquad \alpha_N = \mu_{x,N-1} - \delta_N, \qquad \beta_N = \frac{h_{x,t_N} - \delta_N}{\mu_{x,N-1} - \delta_N}, \qquad \gamma_{t_N} = \frac{t^* - t_1}{t_N - t_1},$$

and $\mu_{x,N-1}$ is the weighted mean of the observed headship rates $h_{x,t_1}, \ldots, h_{x,t_{N-1}}$ for $t_1, t_2, \ldots, t_{N-1}$.

The most recent headship rate $h_{x,t_N}$ is utilized directly, while the information from the remaining $N-1$ headship rates $h_{x,t_1}, \ldots, h_{x,t_{N-1}}$ is incorporated into the model in the form of their weighted average $\mu_{x,N-1}$.

By incorporating past census information through the weighted average $\mu_{x,N-1}$ of headship rates observed at more than two time points, this method effectively mitigates illogical projection issues. This remains true even amid fluctuations in the direction and speed of headship rate changes. Furthermore, as observational data continue to accumulate, the interval between the initial point $t_1$ and the most recent point $t_N$ extends. This increased duration allows for more stable projections compared to the two-point exponential model, which is dependent on information from only two points in time.
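A minimal Python sketch of the weighted N-point MEM follows. The text does not specify the weights, so equal weights are used by default here as an assumption; the function name and the `weights` parameter are hypothetical, and the example input again uses the aggregate single-person shares from Table 1 as proportions.

```python
def weighted_n_point_mem(times, rates, t_star, weights=None):
    """Project the headship rate at t_star from N observed (time, rate) pairs."""
    if len(times) != len(rates) or len(times) < 2:
        raise ValueError("need at least two (time, rate) pairs")
    past, h_last = rates[:-1], rates[-1]
    if weights is None:
        # Assumption: equal weights for the first N-1 observations.
        weights = [1.0] * len(past)
    mu = sum(w * h for w, h in zip(weights, past)) / sum(weights)
    delta = 1.0 if h_last > mu else 0.0
    alpha = mu - delta
    beta = (h_last - delta) / (mu - delta)
    # The exponent spans the whole observed window t_1 .. t_N.
    gamma = (t_star - times[0]) / (times[-1] - times[0])
    return delta + alpha * beta ** gamma


# Illustrative input only: single-person shares from Table 1 as proportions.
years = [2016, 2017, 2018, 2019, 2020]
shares = [0.2766, 0.2833, 0.2904, 0.2990, 0.3124]
for year in (2021, 2030, 2051):
    print(year, round(weighted_n_point_mem(years, shares, year), 4))
```

With only two observations the weighted mean is just the first rate, so the sketch reduces to the two-point model; with more observations the longer window $t_N - t_1$ shrinks the exponent and dampens the extrapolation.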

3.2. Regression-based N-point modified exponential model

Suppose the headship rate $h_{x,t}$ by age $x$ is observed at $N$ time points $t_1, t_2, \ldots, t_{N-1}, t_N$. The Regression-based N-point MEM projects $h_{x,t^*}$ by age $x$ at future point $t^* > t_N$ as follows:

$$h_{x,t^*}^{(reg)} = \delta_{reg} + \alpha_{reg}\beta_{reg}^{\gamma_{t_N}},$$

where

$$\delta_{reg} = \begin{cases} 1 & \text{if } h_{x,t_N} > \hat{h}_{x,t_1:1\ldots N-1} \\ 0 & \text{if } h_{x,t_N} \le \hat{h}_{x,t_1:1\ldots N-1} \end{cases}, \qquad \alpha_{reg} = \hat{h}_{x,t_1:1\ldots N-1} - \delta_{reg}, \qquad \beta_{reg} = \frac{h_{x,t_N} - \delta_{reg}}{\hat{h}_{x,t_1:1\ldots N-1} - \delta_{reg}}, \qquad \gamma_{t_N} = \frac{t^* - t_1}{t_N - t_1}.$$

Here, $\hat{h}_{x,t_1:1\ldots N-1}$ is the predicted headship rate at time $t_1$ from a time-series regression of the observed headship rates on time over $(t_1, t_2, \ldots, t_{N-1})$, i.e., $\hat{h}_{x,t} = \hat{\beta}_{0x} + \hat{\beta}_{1x}t$ for $t = t_1, t_2, \ldots, t_{N-1}$, where $\hat{\beta}_{0x}$ and $\hat{\beta}_{1x}$ are OLS estimates.

The most recent headship rate $h_{x,t_N}$ is utilized directly, while information from the remaining $N-1$ headship rates $h_{x,t_1}, \ldots, h_{x,t_{N-1}}$ is incorporated into the model as a regression-fitted value, $\hat{h}_{x,t_1:1\ldots N-1}$, that is, the fitted headship rate at the first census point obtained by regressing the $N-1$ headship rates on the corresponding times $t = t_1, t_2, \ldots, t_{N-1}$.

By reflecting past census information through the fitted value of the regression model $\hat{h}_{x,t_1:1\ldots N-1}$, the benefits of the previously described weighted N-point modified exponential model (MEM) can be similarly attained.
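The regression-based variant can be sketched in the same way, replacing the weighted mean by the OLS fitted value at $t_1$. The function name is hypothetical, the OLS fit is written out by hand to keep the sketch dependency-free, and the example input uses the aggregate single-person shares from Table 1 as proportions for illustration only.

```python
def regression_n_point_mem(times, rates, t_star):
    """Project the headship rate at t_star using the OLS fitted value at t_1."""
    if len(times) != len(rates) or len(times) < 3:
        raise ValueError("need at least three (time, rate) pairs")
    ts, hs = times[:-1], rates[:-1]          # first N-1 observations
    n = len(ts)
    t_bar = sum(ts) / n
    h_bar = sum(hs) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (h - h_bar) for t, h in zip(ts, hs))
    slope = sxy / sxx                         # OLS slope estimate
    # Fitted value at t_1, written in centred form for numerical stability.
    h_fit = h_bar + slope * (ts[0] - t_bar)
    h_last = rates[-1]
    delta = 1.0 if h_last > h_fit else 0.0
    alpha = h_fit - delta
    beta = (h_last - delta) / (h_fit - delta)
    gamma = (t_star - times[0]) / (times[-1] - times[0])
    return delta + alpha * beta ** gamma


# Illustrative input only: single-person shares from Table 1 as proportions.
years = [2016, 2017, 2018, 2019, 2020]
shares = [0.2766, 0.2833, 0.2904, 0.2990, 0.3124]
for year in (2021, 2030, 2051):
    print(year, round(regression_n_point_mem(years, shares, year), 4))
```

If the first $N-1$ rates lie exactly on a line, the fitted value equals $h_{x,t_1}$ and the sketch coincides with a two-point model anchored at $t_1$ and $t_N$.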

3.3. Rolling weighted N+point modified exponential model

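Suppose the headship rate $h_{x,t}$ by age $x$ is observed at $N$ time points $t_1, t_2, \ldots, t_{N-1}, t_N$. The rolling weighted N+point MEM projects the headship rate $h_{x,t^*}$ by age $x$ at future point $t^* > t_N$ as follows:

$$\hat{h}_{x,t^*}^{(rolling)} = \delta_{roll} + \alpha_{roll}\beta_{roll}^{\gamma_{t_N}},$$

where

$$\delta_{roll} = \begin{cases} 1 & \text{if } \hat{h}_{x,t^*-1}^{(rolling)} > \mu_{x,t^*-2} \\ 0 & \text{if } \hat{h}_{x,t^*-1}^{(rolling)} \le \mu_{x,t^*-2} \end{cases}, \qquad \alpha_{roll} = \mu_{x,t^*-2} - \delta_{roll}, \qquad \beta_{roll} = \frac{\hat{h}_{x,t^*-1}^{(rolling)} - \delta_{roll}}{\mu_{x,t^*-2} - \delta_{roll}}, \qquad \gamma_{t_N} = \frac{t^* - t_1}{(t^*-1) - t_1},$$

and $\mu_{x,t^*-2}$ is the weighted mean of $h_{x,t_1}, \ldots, h_{x,t_N}, \hat{h}_{x,t_N+1}^{(rolling)}, \ldots, \hat{h}_{x,t^*-1}^{(rolling)}$.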

For the initial projection point, the weighted N-point MEM is applied. For projections beyond this point, the projection employs both the actual observed headship rates and the data from previously projected headship rates. This information is continuously accumulated and updated in a rolling manner for future projections. In other words, the projection at the initial point, tN+1, is identical to that of the weighted N-point MEM. It is from the subsequent projection point, tN+2, onwards that the projection results differ.

To project $h_{x,N+1}$, we use the $N$ observed headship rates $h_{x,1}, \ldots, h_{x,N}$. For projecting $h_{x,N+2}$, we use the $N$ observed headship rates and one predicted headship rate, $h_{x,1}, \ldots, h_{x,N}, \hat{h}_{x,N+1}$. As we continue to project further into the future, the number of headship rates used in the projections increases. This approach extends the period between the earliest and the latest time points. Consequently, as the amount of information used in the forecast accumulates, the stability of the forecast improves.
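The rolling procedure can be sketched as a loop that appends each projection to the series before projecting the next year. Several choices here are assumptions rather than details confirmed by the text: annual integer time steps, equal weights, and a mean taken over every value except the latest one, which is chosen so that the first step coincides with the weighted N-point MEM as described above. The function name is hypothetical.

```python
def rolling_weighted_mem(times, rates, t_star):
    """Return rolling projections for years times[-1]+1 .. t_star (annual steps)."""
    if len(times) != len(rates) or len(times) < 2:
        raise ValueError("need at least two (time, rate) pairs")
    t1 = times[0]
    series = list(rates)        # observed rates, then projections as they accrue
    t_prev = times[-1]
    while t_prev < t_star:
        t_next = t_prev + 1
        past, latest = series[:-1], series[-1]
        # Assumption: equal-weight mean of everything before the latest value.
        mu = sum(past) / len(past)
        delta = 1.0 if latest > mu else 0.0
        alpha = mu - delta
        beta = (latest - delta) / (mu - delta)
        gamma = (t_next - t1) / (t_prev - t1)
        series.append(delta + alpha * beta ** gamma)
        t_prev = t_next
    return series[len(rates):]


# Illustrative input only: single-person shares from Table 1 as proportions.
years = [2016, 2017, 2018, 2019, 2020]
shares = [0.2766, 0.2833, 0.2904, 0.2990, 0.3124]
projections = rolling_weighted_mem(years, shares, 2051)
print("2021:", round(projections[0], 4), "2051:", round(projections[-1], 4))
```

Because the exponent $(t^* - t_1)/((t^*-1) - t_1)$ approaches 1 as the horizon grows, each additional step changes the rate by less, which is consistent with the stabilizing behaviour the text describes.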

4. Data: Headship rates in Korea

Headship rates from the annual register-based census data from 2016 to 2020 are used for projection. The projection of household headship rates is segmented by age, while also considering the size and type of the household (Statistics Korea, 2019; Bell and Cooper, 1990; Wilson, 2013; Kajiwara et al., 2022). The household sizes are divided into categories of 1, 2, 3, 4, and 5 or more members, and there are 12 distinct household types. The rates of household composition by headship are presented in Table 1 and Table 2.

Single- and two-person households continue to increase, while households with three or more members are decreasing. In 2020, single-person households were the most common type, whereas through 2019 ‘married-couple+children’ households were the most common.

For future household projections, the age-specific household headship rate is utilized. It is therefore essential to analyze the age distribution of the household headship rate, considering both the number of household members and the type of household. Household headship rates, categorized by the number of household members and by household type and segmented in five-year intervals, are depicted in Figure 1 and Figure 2, respectively.

As shown in Figure 1, the distribution of household headship rates by age differs according to the number of household members. Notably, in single-person households, younger individuals are more likely to be household heads, while among middle-aged and older groups, the rate of heading households with two or more persons is higher. This pattern likely arises from younger people initially living independently in single-person households due to factors like education and employment, and then transitioning to larger households as they age. Although the age distribution for such households generally remains consistent over time, there is a noticeable decline in households with five or more members across almost all age groups.

Recently, with the rise in the single-person household headship rate, there has been a decrease in the headship rate of 2–3 person households, particularly among individuals in their 20s. The diminishing headship rate for individuals in their 30s in 3–4 person households is seemingly linked to the declining birth rate. Households with four members, and those with five or more members, exhibit similar age-specific trends. A notable time-series anomaly in the headship rate of older households is observed in 2016.

Figure 2 presents the household headship rates by age, categorized by household type. Although the distribution of headship rates by age varies across different household types, certain types demonstrate similar patterns. Figures 2(e) and 2(f), representing the distributions for single fathers with children and single mothers with children respectively, exhibit similar age distribution patterns. Moreover, a time-series anomaly in the headship rate among the elderly is observed across nearly all household types in 2016.

5. Projection results

For projecting future households by age, it is necessary to have age-specific household headship rates at one-year intervals. For this purpose, Beers’ formula (Beers, 1945) is employed to interpolate headship rates from five-year age groups to annual rates (Park et al., 2009). Furthermore, to address time-series anomalies in headship rates among older and younger populations, a robust regression model (Rousseeuw and Leroy, 2005) and Greville’s graduation (Greville, 1948) are applied, creating a series of annual age-specific headship rates (Kim et al., 2018; Cho et al., 2018a, b). In instances where the age-specific headship rate is zero, such as for individuals under 19 years old, a value of $10^{-6}$ is used instead to facilitate model fitting for the predictions.
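One reading of the zero-replacement step is that it keeps the ratio $\beta$ defined: with $\delta = 0$ and a base rate of exactly zero, the denominator in $\beta$ vanishes. The short sketch below illustrates that reading; the function name and sample values are hypothetical.

```python
EPSILON = 1e-6  # replacement value stated in the text


def floor_zero_rates(rates, eps=EPSILON):
    """Replace exact zeros with a small positive value before model fitting."""
    return [eps if h == 0 else h for h in rates]


# Hypothetical age-specific rates for a very young age group.
raw = [0.0, 0.0, 0.0002]
adjusted = floor_zero_rates(raw)

# With delta = 0, beta = h_last / h_base is now well defined.
beta = adjusted[-1] / adjusted[0]
print(adjusted, beta)
```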

5.1. Long-term projection results

To address the unrealistic forecasts frequently encountered in long-term predictions, a challenge when projecting future households using short-cycle registration censuses, we first examine the results of these long-term projections. Figures 3 and 4 display the projected household headship rates from 2021 to 2051 for single-person and married-couple households, respectively. Given that the projection patterns for all household sizes and types are similar, we present these two representative cases to conserve space.

Figures 3(a) and 4(a) present the projection results using the two-point exponential model, indicating that long-term forecasts over 20 years are highly dependent on the variation between the two observed points. The projection results using the new multi-point modified exponential model (MEM) reveal that the regression-based N-point MEM does not reduce these fluctuations as much as anticipated. However, the results from the weighted N-point MEM demonstrate a considerable reduction in fluctuations compared to the two-point model. Additionally, the rolling weighted N+point MEM provides the most stable results for long-term projections. The effectiveness of the rolling weighted N+point MEM is displayed more prominently in Figure 4.

5.2. Short-term projection results

To evaluate the effectiveness of our new N-point MEM, we compare short-term prediction results using actual observed registration census data from 2021 to 2022. For this evaluation, we employ root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (SMAPE). These are standard metrics used to measure the goodness of fit and predictive accuracy of a model (Makridakis, 1993).

MSE and MAE are the averages of the squared differences $(\hat{h}_{x,t} - h_{x,t})^2$ and the absolute differences $|\hat{h}_{x,t} - h_{x,t}|$, respectively. The RMSE, the square root of the MSE, is the most intuitive and widely used metric. However, it has the drawback of being sensitive to outliers. MAPE is similar to MAE but calculates the absolute percentage error by dividing the absolute error by the actual value, $|\hat{h}_{x,t} - h_{x,t}| / h_{x,t}$, and averaging it. This expresses the error as a percentage of the actual value, making it easy to interpret. Nonetheless, a very small actual value $h_{x,t}$ can lead to a disproportionately large MAPE value. SMAPE addresses some of MAPE’s limitations. Instead of dividing the error by the actual value, as in MAPE, SMAPE divides the error by the average of the actual and predicted values, $|\hat{h}_{x,t} - h_{x,t}| / ((h_{x,t} + \hat{h}_{x,t})/2)$. It yields a percentage between 0 and 200%, facilitating straightforward interpretation. However, if both the actual and predicted values are very close to zero, SMAPE can yield a very large value.
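The four measures can be computed as in the following sketch. The function name and the sample values are hypothetical, and the averaging is taken over all supplied pairs; how the paper aggregates over ages and years is not specified here.

```python
import math


def forecast_errors(actual, predicted):
    """Return RMSE, MAE, MAPE (%) and SMAPE (%) for paired actual/predicted rates."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so actual rates must be non-zero.
    mape = 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
    # SMAPE divides by the mean of the actual and predicted values.
    smape = 100.0 * sum(
        abs(e) / ((a + p) / 2.0) for e, a, p in zip(errors, actual, predicted)
    ) / n
    return {"RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "SMAPE": smape}


# Hypothetical values, for illustration only.
print(forecast_errors([0.2, 0.4], [0.1, 0.5]))
```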

Table 3 shows the accuracy of short-term headship rate forecasts by the number of household members. The shaded cells in the table highlight the values with the minimum error in each case. Generally, the N-point modified exponential model (MEM) methods outperform the two-point exponential model. Considering that the dataset comprises only five annual data points from 2016 to 2020, the differences in RMSE and MAE are not substantial. However, it is apparent that the newly proposed N-point MEM methods provide more precise forecasts than the previous two-point exponential model.

The regression-based N-point MEM performs best for single-person households, which are currently increasing at an accelerating rate. Similarly, this method is most accurate for households with three or more individuals, which are declining, with the rate of decrease also accelerating. In contrast, for two-person households, which have been increasing but at a recently slowed rate, the weighted methods show the best performance.

Table 4 presents the accuracy of short-term predictions for the headship rate by household type. As in Table 3, the N-point MEMs demonstrate superior performance compared to the two-point method. However, caution is needed when interpreting the MAPE and SMAPE for the ‘married-couple+parents’ and ‘grandparents+grandchildren’ household types, as their shares are quite small (0.5%–0.7%). The weighted methods perform better in scenarios involving a change in growth rate, such as with ‘married couples’ (transitioning from the steepest to the softest growth) and ‘single mothers & children’ (also shifting from the steepest to the softest growth).

6. Conclusion

Since the introduction of the first register-based census in Korea in 2015, the census has shifted from a five-year general survey to an annually published administrative data-based census. This change facilitates the reflection of rapid changes in Korea’s household structure, such as the increase in single-person households and the decrease in household size due to low birth rates and an aging population. However, the register-based census not only shortens the data cycle but also introduces variations and trends over time into the population data.

These fluctuations in the data necessitate a reevaluation of existing household projection models and the development of new, more appropriate models. The two-point exponential model, a traditional projection model based on censuses with longer cycles, is not suitable for register-based censuses with multiple data points over short cycles. This paper proposes N-point modified exponential methods (MEM) that can capture the dynamics of household headship rates at multiple points using register-based censuses, and considers three modifications: Weighted N-point MEM, regression-based N-point MEM, and rolling weighted N+point MEM.

Using these methods, future households in Korea have been projected by household size and type based on register-based census statistics from 2016 to 2020. Long-term projection results up to 2051 show that the previous two-point exponential model leads to large fluctuations and unstable forecasts for projections spanning 20 years. In contrast, the N-point methods show a reduction in these fluctuations. In particular, the rolling weighted N+point MEM provides the most stable long-term projection results.

Further, an examination of the accuracy of short-term projection results based on the updated 2021 and 2022 register-based census data shows that the newly proposed N-point methods generally outperform the previous two-point model. If the rate of change in the headship rate is simply increasing or decreasing, the regression-based N-point MEM performs better; otherwise, the weighted N-point MEM or rolling weighted N+point MEM methods are more effective.

Despite limitations in performance evaluation due to the short period of register-based census data, the results show that the N-point MEMs are more suitable than the previous two-point model for household projections. The N-point MEMs effectively incorporate multi-point register-based census information in a timely manner without the need for additional data or added model complexity, which enhances their practical utility. As registration census data continue to accumulate over time, we expect them to enable the creation of more precise and scientifically robust national statistics, including population and household estimates.

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT)(NRF-2022R1F1A1065520 to Jeon, and NRF-2021R1F1A1059513 to Kwon). This work was supported by Hankuk University of Foreign Studies Research Fund.

Figures
Fig. 1. Household headship rate by age (agehrr) according to the number of household members in Korea.
Fig. 2. Household headship rate by age (agehrr) according to household type in Korea.
Fig. 3. Projection of the headship rate for single-person households by projection methods.
Fig. 4. Projection of the headship rate for married-couple households by various projection methods.
TABLES

Table 1

Headship rate by number of household members in Korea (%)

| Number of household members | 2016 | 2017 | 2018 | 2019 | 2020 |
|---|---|---|---|---|---|
| 1 | 27.66 | 28.33 | 29.04 | 29.90 | 31.24 |
| 2 | 26.15 | 26.55 | 27.08 | 27.65 | 27.96 |
| 3 | 21.44 | 21.31 | 21.11 | 20.84 | 20.29 |
| 4 | 18.48 | 17.88 | 17.22 | 16.48 | 15.83 |
| 5+ | 6.26 | 5.94 | 5.56 | 5.13 | 4.68 |

Table 2

Headship rate by household type in Korea (%)

| Generations | Household type | 2016 | 2017 | 2018 | 2019 | 2020 |
|---|---|---|---|---|---|---|
| Single-person | Single-person | 27.66 | 28.33 | 29.04 | 29.90 | 31.24 |
| 1 Generation | Married-couple | 15.51 | 15.74 | 16.13 | 16.52 | 16.76 |
| 1 Generation | Others | 1.75 | 1.74 | 1.76 | 1.78 | 1.79 |
| 2 Generations | Married-couple + Children | 32.03 | 31.53 | 30.87 | 30.08 | 29.32 |
| 2 Generations | Single father + Children | 2.72 | 2.65 | 2.60 | 2.55 | 2.45 |
| 2 Generations | Single mother + Children | 7.76 | 7.59 | 7.52 | 7.42 | 7.33 |
| 2 Generations | Married-couple + Parents | 0.75 | 0.72 | 0.69 | 0.67 | 0.62 |
| 2 Generations | Grandparents + Unmarried grandchildren | 0.57 | 0.57 | 0.57 | 0.56 | 0.56 |
| 2 Generations | Others | 4.66 | 4.64 | 4.54 | 4.40 | 4.11 |
| 3 or More Generations | Married-couple + Unmarried children + Parents | 2.98 | 2.81 | 2.60 | 2.38 | 2.10 |
| 3 or More Generations | Others | 2.31 | 2.16 | 2.03 | 1.90 | 1.72 |
| Others | Non-relatives | 1.30 | 1.51 | 1.66 | 1.84 | 1.98 |

Table 3

Performance of projection models by number of household members (2021–2022)

Number of household Error 2-point Weighted N-point Reg based N-point Rolling weighted N+point
1 RMSE 0.019 0.020 0.018 0.021
MAE 0.021 0.024 0.020 0.024
MAPE 4.974 6.082 4.898 6.224
SMAPE 1.267 1.553 1.247 1.590

2 RMSE 0.011 0.010 0.011 0.010
MAE 0.011 0.009 0.011 0.009
MAPE 6.036 5.324 6.051 5.283
SMAPE 1.569 1.372 1.572 1.358

3 RMSE 0.007 0.007 0.006 0.007
MAE 0.008 0.008 0.007 0.008
MAPE 8.963 8.296 8.485 8.447
SMAPE 2.281 2.088 2.158 2.122

4 RMSE 0.007 0.009 0.007 0.009
MAE 0.007 0.009 0.007 0.009
MAPE 16.067 17.834 15.938 18.183
SMAPE 3.985 4.412 3.926 4.489

5+ RMSE 0.003 0.004 0.003 0.004
MAE 0.003 0.004 0.003 0.005
MAPE 13.722 21.837 13.476 22.691
SMAPE 3.485 5.456 3.433 5.633

*Shading indicates the minimum value.
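The four error measures reported in Table 3 can be illustrated with a minimal Python sketch. It uses the common textbook definitions, which are an assumption here: this section does not state the exact formulas or scaling the authors used, and SMAPE in particular has several variants in the literature. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def smape(actual, forecast):
    """Symmetric MAPE, in percent.

    Assumed variant: the absolute error is divided by the mean of |actual|
    and |forecast|. Other variants use a different denominator or scaling.
    """
    n = len(actual)
    return 100.0 * sum(
        abs(a - f) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast)
    ) / n


# Hypothetical headship rates (as proportions) for two forecast years.
actual = [0.30, 0.32]
forecast = [0.33, 0.30]

for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("SMAPE", smape)]:
    print(f"{name}: {fn(actual, forecast):.4f}")
```

RMSE and MAE are in the units of the headship rate itself, whereas MAPE and SMAPE are relative measures, which is why small household categories can show large percentage errors despite small absolute errors.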


Table 4

Performance of projection models by household type (2021–2022)

Generation Household type Error measure 2-point Weighted N-point Regression-based N-point Rolling weighted N+point
One-generation Married-couple RMSE 0.008 0.008 0.008 0.008
MAE 0.007 0.007 0.007 0.007
MAPE 24.635 26.847 25.414 26.958
SMAPE 4.215 4.460 4.350 4.455

Others RMSE 0.004 0.003 0.004 0.003
MAE 0.002 0.002 0.002 0.002
MAPE 14.676 12.055 14.162 12.104
SMAPE 3.794 3.046 3.655 3.052

Two-generations Married-couple & Children RMSE 0.009 0.010 0.008 0.010
MAE 0.008 0.009 0.008 0.009
MAPE 9.355 8.262 8.966 8.435
SMAPE 2.363 2.057 2.263 2.097

Single father & Children RMSE 0.001 0.001 0.001 0.001
MAE 0.001 0.001 0.001 0.001
MAPE 13.335 10.478 11.903 10.460
SMAPE 3.773 2.807 3.318 2.788

Single mother & Children RMSE 0.003 0.003 0.003 0.003
MAE 0.003 0.003 0.003 0.003
MAPE 6.686 6.173 6.312 6.420
SMAPE 1.688 1.539 1.587 1.600

Married-couple & Parents RMSE 0.001 0.002 0.001 0.002
MAE 0.001 0.001 0.001 0.001
MAPE 25.936 30.121 26.286 30.718
SMAPE 4.786 5.417 4.811 5.523

Grandparents & grandchildren RMSE 0.002 0.002 0.002 0.002
MAE 0.001 0.001 0.001 0.001
MAPE 12.846 15.012 12.682 15.144
SMAPE 3.143 3.711 3.093 3.746

Others RMSE 0.005 0.005 0.005 0.005
MAE 0.005 0.005 0.005 0.005
MAPE 11.883 13.952 11.730 14.306
SMAPE 3.352 3.776 3.287 3.849

Three or more generations Married-couple & Unmarried children & Parents RMSE 0.001 0.002 0.001 0.002
MAE 0.001 0.002 0.001 0.003
MAPE 21.241 28.935 21.117 29.869
SMAPE 7.801 9.473 7.760 9.671

Others RMSE 0.002 0.002 0.001 0.002
MAE 0.002 0.002 0.001 0.002
MAPE 55.001 87.283 58.320 88.878
SMAPE 5.634 7.428 5.619 7.613

Non-relatives RMSE 0.004 0.003 0.004 0.003
MAE 0.004 0.003 0.004 0.003
MAPE 13.829 11.600 14.263 11.830
SMAPE 3.334 2.907 3.422 2.972

*Shading indicates the minimum value.

