Conference Paper

Proceedings of uSim Conference 2022: 3rd uSim Conference of IBPSA-Scotland



Sandhya Patidar, David Jenkins, Andrew Peacock, Ashkan Lotfipor

Abstract: Missing data are an inherent feature of large datasets and one of the first key challenges that must be resolved before conducting any reliable data-driven analytics or model development. Recently, several studies have investigated the potential of computational approaches for missing data imputation. With smart meter data now readily available, there is a growing demand for an efficient imputation algorithm that can capture the intrinsic patterns of the highly stochastic dynamics of electricity demand data, specifically across large gaps. However, because of this highly stochastic nature, most conventional approaches offer limited scope and applicability. This paper investigates the potential of a simple logical algorithm (developed by the authors) alongside the widely applied 'mice' algorithm (an R package) for infilling high-resolution (1-minute) electricity demand data simultaneously at multiple sites. To optimise the performance of the 'mice' algorithm, a hierarchical cluster analysis using absolute correlation as a distance measure is used. The paper investigates a two-month block of data for 121 sites (methodologically), extracted from a large database of 661 sites monitored for almost a year in the case-study community of Auroville, India. The performance of both algorithms is intensively assessed using key statistical indicators involving discrete percentile distribution