Authors: Al-Bazzaz H, Azam M, Amayri M, Bouguila N
Smart meter datasets have recently moved from monthly intervals to one-second granularity, yielding valuable insights for a range of metering functions. Clustering analysis, a fundamental data mining technique, is widely used to identify distinct energy consumption patterns. High-resolution smart meter data, however, poses substantial challenges: non-Gaussian data distributions, an unknown number of clusters, and varying feature importance in high-dimensional spaces. This article introduces a learning framework that integrates the expectation-maximization algorithm with the minimum message length criterion. The unified approach performs feature selection and model selection simultaneously and is tailored to the proposed bounded asymmetric generalized Gaussian mixture model with feature saliency. Our experiments aim to replicate an efficient smart meter data analysis scenario by incorporating three distinct feature extraction methods. We validate the clustering performance of the proposed algorithm against several state-of-the-art approaches, using multiple performance metrics on both synthetic and real smart meter datasets. The clusters we identify highlight variations in residential energy consumption, giving utility companies actionable insights for targeted demand reduction. We also demonstrate the method's robustness and real-world applicability by running it on Concordia's High-Performance Computing infrastructure, which enables efficient energy pattern characterization, particularly in smart meter environments that involve edge cloud computing. Finally, the proposed mixture model outperforms the three other models in this paper's comparative study, and it improves on its own non-bounded variant by 7.828% on average.
Keywords: asymmetric generalized Gaussian distribution; bounded mixture models; energy analytics; feature selection; probabilistic modelling
PubMed: https://pubmed.ncbi.nlm.nih.gov/37837127/
DOI: 10.3390/s23198296
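The abstract names a bounded asymmetric generalized Gaussian mixture model without giving its density. As a rough illustration only, the sketch below evaluates a single one-dimensional component under one common parameterization of the asymmetric generalized Gaussian (location mu, left and right scales sigma_l and sigma_r, shape beta) and then restricts it to an interval [low, high] by renormalizing. The parameterization, the function names agg_pdf and make_bounded_agg_pdf, and the trapezoid-rule normalization are all assumptions made for this example; they are not taken from the article and may differ from the authors' formulation.

```python
import math


def agg_pdf(x, mu, sigma_l, sigma_r, beta):
    """Unbounded asymmetric generalized Gaussian density (assumed parameterization).

    sigma_l scales the side x < mu, sigma_r scales the side x >= mu,
    and beta controls the tail shape (beta = 2 with equal scales is Gaussian).
    """
    ratio = math.gamma(3.0 / beta) / math.gamma(1.0 / beta)
    norm = beta * math.sqrt(ratio) / ((sigma_l + sigma_r) * math.gamma(1.0 / beta))
    scale = sigma_l if x < mu else sigma_r
    return norm * math.exp(-(ratio ** (beta / 2.0)) * (abs(x - mu) / scale) ** beta)


def make_bounded_agg_pdf(mu, sigma_l, sigma_r, beta, low, high, steps=4000):
    """Return a density restricted to [low, high] and renormalized on it.

    The mass inside the interval is approximated with the trapezoid rule;
    this is an illustrative choice, not the article's procedure.
    """
    h = (high - low) / steps
    mass = 0.5 * (agg_pdf(low, mu, sigma_l, sigma_r, beta)
                  + agg_pdf(high, mu, sigma_l, sigma_r, beta))
    for i in range(1, steps):
        mass += agg_pdf(low + i * h, mu, sigma_l, sigma_r, beta)
    mass *= h

    def pdf(x):
        # Zero outside the support, rescaled density inside it.
        if x < low or x > high:
            return 0.0
        return agg_pdf(x, mu, sigma_l, sigma_r, beta) / mass

    return pdf


# Hypothetical parameters: a right-skewed component bounded to [0, 6].
bounded = make_bounded_agg_pdf(mu=1.0, sigma_l=0.5, sigma_r=2.0, beta=1.5,
                               low=0.0, high=6.0)
print(bounded(1.0), bounded(-0.1))
```

Bounding matters for quantities such as energy readings that cannot be negative: the renormalization moves probability mass that the unbounded density would place outside the support back inside it.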