
The smart grid is a key concept in optimizing energy efficiency. One of its integral facets is advanced metering infrastructure (AMI), which enables two-way communication with meters and makes electricity consumption data available at a higher frequency. By 2021, UK energy suppliers will have installed 50 million smart meters, and 22.5 million smart meter installations are planned in South Korea.
Therefore, various studies have analyzed AMI data in relation to energy consumption in households. Krishna
Various clustering methods have been applied to characterize electricity consumption data and interpret the load pattern. Kwac
In this paper, we develop a data-adaptive clustering method that can reflect this characteristic of the AMI data and provide a new perspective on electricity consumption compared to existing clustering methods. The proposed method is based on functional PCA (FPCA), which is commonly used for dimension reduction of functional data (Van Der Linde, 2008; Silverman and Ramsay, 1997). Although concepts and methods in functional data analysis may be robust to serial dependence, they were developed for independent observations. However, time series data, such as economic and energy consumption data, do not satisfy this assumption. To address this problem, Brillinger (2001) first suggested frequency domain PCA (FDPCA), which accounts for correlation over time. FDPCA is also called dynamic PCA (DPCA) and has been studied in various fields (Salvador
However, FDPCA is based on the spectral representation of a stationary signal. Stationarity, which means statistical characteristics such as mean and variance do not change over time, is a very important assumption in time series analysis. However, in real data analysis, the stationarity assumption is rarely satisfied (Hamilton, 1989; Azadeh
Figure 1, which depicts the load pattern from randomly selected households in South Korea from January, 2012 to October, 2014, illustrates the non-stationary characteristic of the data.
To alleviate the stationarity assumption, Ombao and Ho (2006) proposed a time-dependent FDPCA for the locally stationary data. They formed a small block around each time point, and therefore, the multi-channel signal was almost stationary in that time block, which defined the time-varying spectral density matrix. Their method extracted time-varying spectral features of a multi-channel signal. Thus, we expect that the time-dependent FDPCA is a proper starting point to cluster the non-stationary time series data.
In this paper, we propose a time-dependent FDPCA-based clustering method for AMI data, which are non-stationary time series. The clustering results are further applied to validate the reform of the progressive electricity tariff system in South Korea. The South Korean government has planned to apply tariff rates differentiated by season and hour, charging higher rates during peak seasons/hours and lower rates during non-peak seasons/hours; the reform was scheduled to start in July 2021. We apply the reformed system to each cluster group and analyze the effect of the reform on household energy usage.
The remainder of this paper is organized as follows. In Section 2, we describe the proposed time-dependent frequency domain principal component clustering method and provide a practical algorithm. Simulation results are presented in Section 3, and a real AMI data analysis is presented in Section 4. Finally, concluding remarks are given in Section 5.
Here, we first briefly review the FDPCA and time-dependent FDPCA, and then explain the proposed clustering method based on the time-dependent FDPCA.
Suppose that we have
where
where
We now intend to approximate
where
Then, we can reconstruct
where
To obtain
Then, the minimum solution is
where
As mentioned above, the stationarity assumption is crucial in the FDPCA. Ombao and Ho (2006) divided the time series into blocks to apply FDPCA to the locally stationary process. Therefore, the time series within the block appears stationary. Then, define
where
where
Ombao and Ho (2006) conducted the FDPCA in the neighborhood of a particular rescaled time
where
where
Finally, the time-varying eigenvalues and eigenvectors are obtained by applying eigenanalysis to the matrix
Compared to the eigenvalues,
Therefore, the dimensions of the eigenvalues and eigenvectors in the time-dependent FDPCA are higher than those of the FDPCA. The eigenvalue,
To cluster the data based on the time-varying FDPCA, we define weighted time-varying eigenvectors as follows. For each time point,
Now, we perform the conventional
Here, we present a practical algorithm of the proposed method.
Time-dependent frequency domain principal component clustering (TFDPC clustering)
1: | |
| |
Number of shift | |
Number of blocks | |
Number of observations in neighborhood | |
2: | |
| |
| |
3: | |
4: | Let |
5: | Calculate Fourier coefficients |
6: | Construct the smoothed periodogram matrix as follows. |
○ Calculate | |
○ | |
7: | Apply eigenanalysis to the matrix
8: | Obtain the first |
9: | |
10: | |
11: | |
12: | Compute the first |
13: | Perform |
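The block-wise spectral steps of the algorithm (Fourier coefficients, smoothed periodogram matrix, eigenanalysis) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `time_varying_spectral_pca`, the flat moving-average smoother, and the `half_width` parameter are assumptions made for illustration, and the weighting of the eigenvectors and the final clustering step are omitted because their exact definitions are not reproduced here. Columns of `X` are taken to be the channels of the multi-channel signal.

```python
import numpy as np

def time_varying_spectral_pca(X, block_len, shift, half_width=2):
    """Eigenanalysis of a smoothed periodogram matrix in overlapping blocks.

    X is a (T, P) array whose columns are channels. Returns the block start
    indices, eigenvalues of shape (blocks, freqs, P) in descending order, and
    the matching eigenvectors of shape (blocks, freqs, P, P).
    """
    X = np.asarray(X, dtype=float)
    T, P = X.shape
    starts = np.arange(0, T - block_len + 1, shift)
    n_freq = block_len // 2 + 1
    evals = np.empty((len(starts), n_freq, P))
    evecs = np.empty((len(starts), n_freq, P, P), dtype=complex)
    for b, s in enumerate(starts):
        seg = X[s:s + block_len]
        seg = seg - seg.mean(axis=0)              # remove the block mean
        # Fourier coefficients of each channel within the block
        d = np.fft.rfft(seg, axis=0) / np.sqrt(2 * np.pi * block_len)
        # Raw periodogram matrices: I[k] = d_k d_k^H (Hermitian, rank one)
        I = d[:, :, None] * d[:, None, :].conj()
        # Assumed smoother: flat average over neighbouring frequencies
        S = np.empty_like(I)
        for k in range(n_freq):
            lo, hi = max(0, k - half_width), min(n_freq, k + half_width + 1)
            S[k] = I[lo:hi].mean(axis=0)
        w, v = np.linalg.eigh(S)                  # ascending eigenvalues
        evals[b] = w[:, ::-1]                     # reorder to descending
        evecs[b] = v[:, :, ::-1]
    return starts, evals, evecs
```

Under this indexing, a series of length 345 (an assumed length, used only for illustration) with `block_len=93` and `shift=28` produces 10 blocks, since (345 - 93) / 28 + 1 = 10.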
Here, we conduct simulation studies to confirm the performance of the proposed method.
We generate 50 curves from
The simulation settings are similar to those of Fryzlewicz and Ombao (2009), with small modifications.
1. Case 1 – Nonstationary autoregressive processes with abruptly changing parameters
where
(a) Two groups (
(b) Two groups (
(c) Three groups (
2. Case 2 – Nonstationary autoregressive processes with sinusoidal waves
For
where
(a)
(b)
The sample curves generated from Case 1-(a) and Case 2-(a) are shown in Figure 2
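To give a feel for the kind of process used in Case 1, the following minimal sketch simulates an autoregressive series whose coefficient changes abruptly. It is only an illustration: the AR(1) form, the coefficients, the break point, and the function name `piecewise_ar1` are hypothetical and are not the settings used in the simulation study.

```python
import numpy as np

def piecewise_ar1(n, coefs, breaks, sigma=1.0, seed=None):
    """Simulate x[t] = phi[t] * x[t-1] + eps[t] with piecewise-constant phi.

    coefs  : one AR(1) coefficient per segment (hypothetical values)
    breaks : time indices at which the coefficient switches
    """
    rng = np.random.default_rng(seed)
    edges = [0, *breaks, n]
    phi = np.empty(n)
    for c, lo, hi in zip(coefs, edges[:-1], edges[1:]):
        phi[lo:hi] = c                      # coefficient active on [lo, hi)
    eps = rng.normal(scale=sigma, size=n)   # Gaussian innovations
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi[t] * x[t - 1] + eps[t]
    return x

# One curve with an abrupt change halfway through (illustrative values)
curve = piecewise_ar1(1024, coefs=[0.9, -0.9], breaks=[512], seed=1)
```

Each segment is stationary on its own, but the series as a whole is not, which is the situation the block-wise analysis is designed for.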
In each case, 100 Monte Carlo simulations are conducted, and the average performances are presented in the following tables. As an evaluation measure, the correct classification rate (CCR) and the adjusted Rand index (aRand) of Hubert and Arabie (1985) are considered. aRand measures the correspondence between two partitions on how object pairs are classified in the contingency table. A larger aRand value indicates a higher similarity between two partitions.
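The two evaluation measures can be sketched as follows. The adjusted Rand index follows the standard contingency-table formula of Hubert and Arabie (1985). The CCR definition used here is an assumption, since it is not spelled out above: it is taken as the largest fraction of matching labels over all relabelings of the predicted clusters, with both partitions assumed to use the same number of labels. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import comb

def adjusted_rand_index(a, b):
    """Adjusted Rand index of two labelings of the same n >= 2 objects."""
    n = len(a)
    # Pairs of objects placed together in both, in a, and in b
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)   # value expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:               # degenerate: trivial partitions
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def correct_classification_rate(truth, pred):
    """Best agreement rate over relabelings of pred (assumed definition)."""
    t_labels = sorted(set(truth))
    p_labels = sorted(set(pred))
    best = 0
    for perm in permutations(t_labels, len(p_labels)):
        mapping = dict(zip(p_labels, perm))
        hits = sum(mapping[p] == t for t, p in zip(truth, pred))
        best = max(best, hits)
    return best / len(truth)
```

Both measures ignore the arbitrary naming of clusters, so a partition that is correct up to a relabeling scores 1 on each.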
For comparison, we consider two conventional clustering methods along with the proposed time-dependent FDPC (TFDPC) clustering method. We apply the
From the results, we observe that the proposed TFDPC clustering method works best in Case 1. However, in Case 1-(c), all three methods work poorly, and there is no significant difference between them.
In Case 2 with small noise variance,
From the simulation results, we conclude that the conventional methods also work well in well-separated cases, but the proposed method works best for some non-stationary autoregressive processes.
The proposed time-varying clustering method is applied in order to assign the load patterns to different clusters. Hourly AMI data were measured for 668 residential customers in Seoul, South Korea from January 1, 2012 to October 31, 2014. The data was sourced from the Korea Electric Power Corporation. For fast computation, we converted hourly data into a three-day averaged time series. Therefore, the number of time points in each AMI dataset is
We first performed the augmented Dickey–Fuller test to check the data for non-stationarity (Said and Dickey, 1984). The null hypothesis of a unit root could not be rejected for 189 of the 668 customers, implying that some of the AMI time series are non-stationary.
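The logic of the unit-root test can be illustrated with a simplified sketch. It implements only the basic Dickey–Fuller regression with a constant and no lagged differences, so it is not the augmented test used in the analysis; the function name `dickey_fuller_t` is hypothetical. The statistic is compared with the Dickey–Fuller distribution rather than the usual t distribution; the large-sample 5% critical value for the constant-only case is approximately -2.86.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic of rho in: diff(y)[t] = alpha + rho * y[t-1] + error."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
noise = rng.normal(size=345)          # stationary white noise
walk = np.cumsum(noise)               # random walk, which has a unit root
stat_noise = dickey_fuller_t(noise)   # strongly negative: unit root rejected
stat_walk = dickey_fuller_t(walk)     # much closer to zero
```

A statistic above the critical value means the unit-root null hypothesis cannot be rejected, which is the outcome reported for 189 of the customers.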
Before applying the method, we first normalized each time series. For each AMI dataset, all its measurements are normalized in the range of [0, 1] by using its maximum value as the reference power (Xu
We now perform the TFDPC clustering method as follows. We first segment the data into 10 blocks of 93 observations each, with the shift between consecutive blocks set to 28. Figure 4 plots the first two eigenvalues,
Now, we compute the first weighted time-varying eigenvector,
Based on the clustering results in Figure 5, we plot the average AMI data for each group in Figure 6. The proposed method clearly classifies households according to their pattern of energy usage. Cluster 1 contains 201 households, and the average AMI time series in this group has relatively high usage during the winter season. By contrast, cluster 3, which contains 147 households, shows high usage during the summer season and relatively low usage during the winter season. Cluster 2 contains 320 households, whose energy usage is high during both summer and winter, although relatively higher during the summer season.
For comparison, we also apply conventional clustering methods to the normalized AMI data. We apply
The policy of differentiating tariff rates by season and hour involves charging higher rates during peak seasons/hours and lower rates during non-peak seasons/hours. Higher rates are applied in the summer and winter season, and during peak hours. Spring and autumn, as well as sub-peak and off-peak hours are subject to lower tariff rates. Table 3 presents the electricity price system before tariff reform, and Table 4 lists the electricity tariff rates after tariff reform.
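To make the comparison concrete, the following minimal sketch computes a household's charge under both schemes from the rates in Tables 3 and 4. The billing rules are assumptions rather than details given here: the pre-reform energy charge is applied bracket by bracket, the demand charge is set by the highest bracket reached, time bands are treated as half-open hourly intervals, and taxes and other surcharges are ignored. The function names are hypothetical.

```python
INF = float("inf")

# (upper bracket limit in kWh, demand charge in won, energy charge in won/kWh)
PRE_REFORM = {
    "summer": [(300, 730, 78.3), (450, 1260, 147.3), (INF, 6060, 215.6)],
    "other": [(200, 730, 78.3), (400, 1260, 147.3), (INF, 6060, 215.6)],
}

def pre_reform_bill(kwh, month):
    """Monthly charge (won) under the progressive tariff of Table 3."""
    season = "summer" if month in (7, 8) else "other"
    demand, energy, lower = 0.0, 0.0, 0.0
    for upper, demand_charge, rate in PRE_REFORM[season]:
        if kwh > lower:
            demand = demand_charge                  # highest bracket reached
            energy += (min(kwh, upper) - lower) * rate
        lower = upper
    return demand + energy

def tou_rate(month, hour):
    """Energy charge (won/kWh) under the seasonal/hourly tariff of Table 4."""
    off_peak = hour >= 23 or hour < 9
    if month in (6, 7, 8):                          # summer
        if off_peak:
            return 82
        return 188 if 13 <= hour < 17 else 155
    if month in (3, 4, 5, 9, 10):                   # spring/fall
        return 82 if off_peak else 109
    if off_peak:                                    # winter (Nov - Feb)
        return 95
    return 159 if 9 <= hour < 12 else 138

def post_reform_bill(hourly_usage):
    """Charge (won) for an iterable of (month, hour, kWh) records."""
    return sum(kwh * tou_rate(month, hour) for month, hour, kwh in hourly_usage)
```

Under these assumptions, 350 kWh consumed in April costs 1,260 + 200 × 78.3 + 150 × 147.3 = 39,015 won before the reform, whereas the post-reform charge depends on when during the day that energy is used.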
Figure 8 presents box plots of the electricity charges for each cluster group before the tariff reform, after the tariff reform, and their differences. As the charges depend on the season, we plot the results by season. We observe three outcomes:
1. In summer, the charge on cluster 1, which is the high usage group during the winter season, decreases after the tariff reform.
2. In winter, the mean charge on cluster 3, the lowest usage group during the winter, decreases after the tariff reform, while it increases for the other cluster groups.
3. In spring and fall, most households’ electricity bills increase.
Therefore, we expect that the proposed clustering results can be used to validate the reform of the progressive electricity tariff system in South Korea.
In this paper, we propose a new clustering method, the TFDPC clustering method. The proposed method is based on the time-varying FDPCA, and we apply the
From a practical viewpoint, we believe that the proposed method can be extended as follows:
1. The clustering results can be applied to characterize residential energy use if more detailed information on residential energy usage is provided. The electricity tariff system can then be customized more precisely.
2. Other clustering methods can also be applied to the time-varying eigenvector. Comparing these results can offer deeper insights.
Clustering results of Case 1. Bold face indicates the best performance
Setting | Method | CCR | aRand |
---|---|---|---|
(a) | TFDPC clustering | | |
(a) | | 0.661(0.046) | 0.107(0.051) |
(a) | kCFC | 0.651(0.077) | 0.107(0.092) |
(b) | TFDPC clustering | | |
(b) | | 0.661(0.039) | 0.104(0.048) |
(b) | kCFC | 0.604(0.085) | 0.063(0.087) |
(c) | TFDPC clustering | | |
(c) | | 0.523(0.054) | 0.103(0.038) |
(c) | kCFC | 0.469(0.061) | 0.072(0.051) |
Clustering results of Case 2. Bold face indicates the best performance
Setting | | Method | CCR | aRand |
---|---|---|---|---|
(a) | 1.52 | TFDPC clustering | 0.901(0.06) | 0.653(0.18) |
(a) | 1.52 | kCFC | 0.871(0.04) | 0.551(0.118) |
(a) | 2.52 | TFDPC clustering | | |
(a) | 2.52 | | 0.728(0.086) | 0.23(0.144) |
(a) | 2.52 | kCFC | 0.736(0.057) | 0.227(0.107) |
(b) | 1.52 | TFDPC clustering | 0.888(0.036) | 0.606(0.106) |
(b) | 1.52 | | 0.902(0.042) | 0.606(0.131) |
(b) | 1.52 | kCFC | | |
(b) | 2.52 | TFDPC clustering | | |
(b) | 2.52 | | 0.691(0.058) | 0.15(0.084) |
(b) | 2.52 | kCFC | 0.598(0.076) | 0.052(0.083) |
Electricity tariff rates for households by bracket before the tariff reform (High-voltage)
Season | Monthly consumption | Demand charge (won/household) | Energy charge (won/kWh) |
---|---|---|---|
Summer (July–Aug) | 1 – 300 kWh | 730 | 78.3 |
Summer (July–Aug) | 301 – 450 kWh | 1,260 | 147.3 |
Summer (July–Aug) | > 450 kWh | 6,060 | 215.6 |
Other seasons | 1 – 200 kWh | 730 | 78.3 |
Other seasons | 201 – 400 kWh | 1,260 | 147.3 |
Other seasons | > 400 kWh | 6,060 | 215.6 |
Electricity tariff rates for households by bracket after tariff reform (Normal Type)
Load type | Item | Summer (June – Aug) | Spring/Fall (Mar – May, Sep – Oct) | Winter (Nov – Feb) |
---|---|---|---|---|
Peak load | Time Zone | 13 – 17 | - | 9 – 12 |
Peak load | Charge (won/kWh) | 188 | - | 159 |
Mid load | Time | 9 – 13 & 17 – 23 | 9 – 23 | 12 – 23 |
Mid load | Charge (won/kWh) | 155 | 109 | 138 |
Off-peak load | Time | 23 – 9 | 23 – 9 | 23 – 9 |
Off-peak load | Charge (won/kWh) | 82 | 82 | 95 |