Paper Conference

Proceedings of Building Simulation 2021: 17th Conference of IBPSA


Comparison of different clustering approaches on different databases of smart meter data

Martina Ferrando 1,2, Debora Nozza 3, Tianzhen Hong 2, Francesco Causone 1
1 Politecnico di Milano, Italy
2 Lawrence Berkeley National Laboratory, California
3 Università Bocconi, Italy

Abstract: Various clustering methods have been applied to determine representative groups of buildings based on their energy use patterns. We reviewed and selected the most commonly used clustering methods, including k-means, k-medoids, Self-Organizing Map (SOM) coupled with k-means, hierarchical clustering, and our proposed deep clustering algorithm, for a comparative performance assessment using smart meter datasets. After data preparation (data cleaning, segmentation, and normalization), the clustering is run first with the number of clusters left free to be chosen by the optimization process, and then with the number of clusters forced to equal the number of primary building functions. Depending on the purpose of the clustering, e.g., to identify daily 24-hour load shapes or to identify the primary building use type (e.g., office, residential, school, retail), the optimal number of clusters can vary greatly. Thus, depending on the final aim, constraining the number of clusters is the most commonly followed and recommended approach for engineering purposes. The k-means, k-medoids, and hierarchical algorithms show the best results in all cases. Given the nature of the databases, the additional step of coupling a SOM with the k-means algorithm does not improve the evaluation metrics. The direct comparison of the different algorithms gives a clear overview of the main existing clustering approaches and their performance in capturing typical use patterns in typical smart meter databases. The resulting cluster centroids could be used to better understand and characterize the energy use patterns of different buildings and building typologies, with the final aims of benchmarking or customer segmentation.
Keywords: Smart Meter Data, Clustering, Machine learning, Classification, Deep learning
Pages: 1155 - 1162
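
The two-stage procedure described in the abstract (normalize the daily load profiles, let an evaluation metric choose the number of clusters, then force the number of clusters to a known value) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic "office-like" and "residential-like" profiles, the min-max normalization, the plain k-means routine, and the use of the silhouette score as the evaluation metric are all assumptions made for illustration.

```python
import math
import random

def normalize(profile):
    """Min-max scale one daily profile to [0, 1] (assumed normalization)."""
    lo, hi = min(profile), max(profile)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in profile]

def dist(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(data, k, iters=50, seed=0):
    """Plain Lloyd's k-means; initial centroids are k randomly sampled profiles."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    labels = [0] * len(data)
    for _ in range(iters):
        # Assign each profile to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in data]
        new = []
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels, centroids

def silhouette(data, labels):
    """Mean silhouette score; singleton clusters contribute 0."""
    scores = []
    for i, p in enumerate(data):
        by_cluster = {}
        for j, q in enumerate(data):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())    # separation
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)

def synthetic_profiles(seed=1):
    """Hypothetical data: 15 office-like and 15 residential-like 24-hour days."""
    rng = random.Random(seed)
    office = [1.0 if 8 <= h < 18 else 0.2 for h in range(24)]
    home = [1.0 if 18 <= h < 23 else 0.3 for h in range(24)]
    days = []
    for shape in (office, home):
        for _ in range(15):
            scale = rng.uniform(5, 50)  # buildings differ in magnitude
            days.append([scale * (v + rng.uniform(-0.05, 0.05)) for v in shape])
    return days

data = [normalize(p) for p in synthetic_profiles()]

# Stage 1: number of clusters left free, chosen by the evaluation metric.
scores = {k: silhouette(data, kmeans(data, k)[0]) for k in range(2, 6)}
best_k = max(scores, key=scores.get)

# Stage 2: number of clusters forced to the number of known building functions.
forced_labels, forced_centroids = kmeans(data, k=2)
```

Normalizing each day before clustering removes differences in magnitude between buildings, so the groups reflect load shape rather than consumption level. Each entry of `forced_centroids` is then a representative 24-hour profile of the kind the abstract suggests using for benchmarking or customer segmentation.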