1 Introduction
Global warming and climate change remain unavoidable problems facing the 21st century, primarily due to energy consumption and carbon emissions from human activities (
McMichael et al., 2006). Building energy accounts for 20%–40% of the primary energy consumption (
Castleton et al., 2010) and even 45% in some developed regions (
Butler, 2008). However, current energy use in buildings is not efficient (
Molina-Solana et al., 2017). Nearly one-third of the energy consumption of buildings can be reduced through efficient energy management (
Li et al., 2019). Therefore, it is crucial to clearly define the scope of the management activities related to building energy management (BEM), that is, the boundary of BEM.
Numerous scholars have conducted research on BEM from different perspectives, including building energy demand and consumption prediction (
Dong et al., 2005;
Tso and Yau, 2007;
Neto and Fiorelli, 2008), building energy performance evaluation (
Scheuer et al., 2003;
Crawley et al., 2008;
Schlueter and Thesseling, 2009), energy operation diagnosis (
Jaggs and Palmer, 2000;
Ascione et al., 2013;
Magoulès et al., 2013;
Ficco et al., 2015), and energy system control and optimization (
Lozano et al., 2009;
Dagdougui et al., 2012;
Lu et al., 2015;
Wu et al., 2017). However, few studies have directly defined the concept of BEM, partly due to its broad knowledge scope and vague edge.
de Wilde et al. (2013) regarded BEM as monitoring and analyzing the energy use of buildings with the aim of controlling and reducing energy expenditure.
Wu (2009) divided BEM into macro and micro levels, with the former covering policies, regulations and standards, and the latter focusing on the daily operation and maintenance of buildings and occupants' energy-consuming behavior.
With the rise of artificial intelligence (AI) in recent years, AI techniques have already been adopted in several fields, including facial recognition for smartphone cameras (
Zhao et al., 2003), fraud detection in credit card payments (
Ngai et al., 2011), vehicle and pedestrian identification in self-driving cars (
Goodfellow et al., 2016), online product recommendations and advertisements (
Turban et al., 2018), etc., and BEM research is no exception. An increasing number of scholars apply the AI method to the study of BEM, among which machine learning is the most widely used. Machine learning (ML) is a popular AI domain because of its ability to extract and generate knowledge from raw data that can be used to solve problems intuitively, in a manner similar to that of human beings. The most well-known definition of ML was articulated by Tom Mitchell in 1997: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
As one of the recent technological developments, ML has been widely used to solve BEM-related problems, exerting a considerable impact on the BEM research method and therefore changing the boundaries of BEM. Many scholars have conducted literature reviews on the application of ML in the specific field of BEM.
Foucquier et al. (2013) reviewed the application of ML to building modeling and energy performance prediction, focusing on the resolution of equations simulating building thermal behavior and other mathematics-related equations.
Deb et al. (2017) focused on time series forecasting techniques for building energy consumption, presenting a comprehensive review of the existing ML techniques for forecasting time series energy consumption. There are also many reviews of energy use forecasting (
Kuster et al., 2017;
Amasyali and El-Gohary, 2018), specific ML technology applications, e.g., artificial neural network (ANN) applications (
Yildiz et al., 2017;
Rodrigues et al., 2018), and energy efficiency evaluations (
Goudarzi and Mostafaeipour, 2017).
Most of the current reviews in this field focus on the application of specific ML techniques to one specific aspect of BEM and lack both a clear depiction of the ML-BEM boundary and a comprehensive view of ML-BEM. With the increase in building energy demand, higher requirements are put forward for BEM (
Danish et al., 2019), calling for more systematic and efficient energy management (
Tronchin et al., 2018), which also highlights the need for a full-scale review. A good understanding of ML-BEM is also crucial for improving the efficiency of BEM and making comprehensive and effective policies. Moreover, the current review focuses more on the description of the status of the ML-BEM-related research, without interpreting the development process and current stage of the application of ML in BEM. Based on the hype cycle model, this paper analyzes the current development of ML-BEM and tries to predict its future development trend, which can make the management of building energy more forward-looking.
Therefore, the main objectives of this study are (1) to understand how ML techniques refine the boundary of BEM during the whole building life-cycle; (2) to develop an integrated framework for ML-BEM and analyze the holistic research status; and (3) to depict the knowledge evolution of ML-BEM and to predict its future development trend based on the hype cycle model. The rest of this paper is structured as follows. Section 2 elaborates the research method used in this study, and Section 3 states the results of the quantitative analysis. A comprehensive framework is established and discussed in Section 4, and Section 5 shows the ML-BEM knowledge evolution. Future directions are discussed in Section 6, and Section 7 concludes this study.
2 Methodology
The core research methodology is illustrated in Fig. 1. The first step is to identify the keywords. According to the topic of the review, the keywords to be searched can be divided into two categories: Related to ML or to BEM. The ML-related keywords consist of core paradigms of ML, including supervised learning, unsupervised learning and reinforcement learning (
Jordan and Mitchell, 2015). The BEM-related keywords are deduced from the definition of BEM, and after going through a number of highly cited publications in BEM (
Dounis and Caraiscos, 2009;
Guan et al., 2010;
Masoso and Grobler, 2010;
Palensky and Dietrich, 2011;
Menezes et al., 2012;
Ahmad et al., 2014;
Gu et al., 2014;
Shaikh et al., 2014;
de Wilde, 2014), in addition to “building energy management”, some frequent terms, including “building energy consumption”, “building energy performance”, “building energy monitor”, and “building energy diagnosis”, are identified. Overall, the key phrases selected are shown in Table 1.
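The two keyword categories can be combined into a single Boolean search string, as in the minimal sketch below. The generic AND/OR syntax and the helper names `or_group` and `build_query` are assumptions made for illustration; the exact search string and field tags used in the database are not given in the text, and Table 1 may list further phrases.

```python
# ML paradigms named in the text (Table 1 may contain additional phrases).
ML_TERMS = ["supervised learning", "unsupervised learning", "reinforcement learning"]

# BEM phrases quoted in the text.
BEM_TERMS = [
    "building energy management",
    "building energy consumption",
    "building energy performance",
    "building energy monitor",
    "building energy diagnosis",
]


def or_group(terms):
    """Join quoted phrases with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(ml_terms, bem_terms):
    """Require at least one ML phrase AND at least one BEM phrase."""
    return f"{or_group(ml_terms)} AND {or_group(bem_terms)}"


query = build_query(ML_TERMS, BEM_TERMS)
print(query)
```

Keeping the two categories in separate lists makes it straightforward to extend either side of the search without rewriting the query by hand.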
The next step is to search in the selected database. Regarding the database, Web of Science (WoS), Scopus and Google Scholar were considered. Taking record quality and coverage depth into account, WoS and Scopus are the main sources with the largest number of papers used for citation data (
Norris and Oppenheim, 2007;
Mongeon and Paul-Hus, 2016). In terms of coverage, WoS focuses on Science, Technology, Social Science, and Arts and Humanities, while Scopus concentrates more on Physical Sciences, Health Sciences and Life Sciences. In addition, WoS is designed to conduct citation analyses, providing more detailed citation analyses of high quality (
Falagas et al., 2008), which are more suitable for the analysis method in this paper. Considering the coverage field and citation analysis quality, WoS was selected as the database to search in.
After being put forward in 1956, AI went through a short period of enthusiasm followed by a long trough until the early 21st century, when it saw a significant rise in a variety of fields. Therefore, this study reviewed the literature of roughly the past two decades, with the time span set as 1998–2020. Only peer-reviewed English articles were included for further analysis to ensure the representativeness of the papers. Book reviews, editorials and proceedings papers were excluded to maintain the consistency of the analytical structure of the research aims and methods. To ensure that the topics are restricted to the managerial issues of life-cycle building energy, the search was further refined by limiting the research areas to “engineering”, “construction building technology”, “environmental sciences ecology”, “operations research management science” and “business economics”. As a result, 581 items were retrieved. Despite the specified criteria, some papers still fell outside the scope. Following the criteria that the approach must be ML and that the theme must be relevant to BEM, extraneous papers were screened out manually, and a total of 387 papers were selected for further analysis.
3 Quantitative analysis
3.1 Paper co-citation analysis
Co-citation occurs when two papers are cited together by a third paper. The paper co-citation network of ML-BEM generated by CiteSpace is shown in Fig. 2. The nodes represent cited references, and lines between two nodes represent co-citation relationships. The thicker the line is, the more frequent the co-citation is, and the larger the node is, the more critical the study is. Detailed information on the publications co-cited more than 10 times is given in Table 2. These highly co-cited papers can be divided into three categories: Literature reviews (
Zhao and Magoulès, 2012;
Foucquier et al., 2013;
Ahmad et al., 2014;
Robinson et al., 2017;
Amasyali and El-Gohary, 2018;
Wei et al., 2018), basic ML technology (
Pedregosa et al., 2011), and predictive models (
Li et al., 2009;
Edwards et al., 2012;
Tsanas and Xifara, 2012;
Fan et al., 2014;
2017;
Jain et al., 2014;
Chae et al., 2016). It is unsurprising that literature reviews are highly cited, given their comprehensive and systematic analysis. ML technology is a key tool for studying BEM problems, and energy prediction is one of the topics of greatest concern among scholars. More specifically, in energy prediction, scholars focus more on short-term forecasting (hourly, daily) than on long-term forecasting (weekly, monthly).
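To make the notion of co-citation concrete, the following minimal sketch counts how often each pair of references appears together in the reference list of the same citing paper. It is an illustration only, not the procedure implemented in CiteSpace; the function name `cocitation_counts` and the reference labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited together.

    reference_lists: one collection of cited references per citing paper.
    Returns a Counter keyed by alphabetically sorted (ref_a, ref_b) tuples.
    """
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so that every pair has one canonical key.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing papers and their reference lists (labels only).
citing = [
    ["RefA", "RefB", "RefC"],
    ["RefA", "RefB"],
    ["RefB", "RefC"],
]
counts = cocitation_counts(citing)
print(counts.most_common())
```

In a network drawn from such counts, the pair count would determine line thickness, and a reference's total citations would determine node size.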
3.2 Keyword co-occurrence analysis
When different terms are used in the same article, there is a relationship between the terms the authors adopted. The keywords of a paper are the reliable terms that the authors choose from the paper, reflecting the main content of the article (
Whittaker et al., 1989). The papers were analyzed in terms of keywords, and the years per slice was set to 2. The keywords “machine learning”, “machine learning method”, and “machine learning model” were defined as domain stop words because they describe the ML-BEM domain as a whole and therefore add no substantial value to the present analysis. In addition, their high share in the domain could distort the analysis of the other keywords. The keyword co-occurrence network is shown in Fig. 3, where each node represents a keyword occurring more than twice, and the size of the keyword label is proportional to its frequency. A line between two keywords indicates that they were used in the same paper.
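The construction of such a network can be sketched as follows: stop words are removed, keywords occurring more than twice are kept as nodes, and an edge is counted whenever two retained keywords appear in the same paper. This is a simplified illustration under stated assumptions, not the CiteSpace implementation; the function name `keyword_network` and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Domain stop words named in the text.
STOP_WORDS = {"machine learning", "machine learning method", "machine learning model"}


def keyword_network(papers, min_occurrences=3):
    """Build keyword frequencies, nodes and co-occurrence edges.

    papers: one list of author keywords per paper.
    min_occurrences=3 corresponds to "occurring more than twice".
    """
    cleaned = [{k.lower() for k in kws} - STOP_WORDS for kws in papers]
    freq = Counter()
    for kws in cleaned:
        freq.update(kws)
    nodes = {k for k, n in freq.items() if n >= min_occurrences}
    edges = Counter()
    for kws in cleaned:
        # Only pairs of retained keywords within one paper form an edge.
        for pair in combinations(sorted(kws & nodes), 2):
            edges[pair] += 1
    return freq, nodes, edges


# Hypothetical keyword lists for four papers.
papers = [
    ["machine learning", "artificial neural network", "consumption"],
    ["artificial neural network", "consumption", "performance"],
    ["Machine Learning", "artificial neural network", "consumption", "optimization"],
    ["performance", "consumption"],
]
freq, nodes, edges = keyword_network(papers)
print(sorted(nodes), dict(edges))
```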
The keywords with the most frequent co-occurrence are “artificial neural network”, “consumption” and “performance”. The ANN has the highest frequency, which illustrates that it is still the most commonly used ML technique in the ML-BEM domain, which is consistent with
Zhao and Magoulès (2012). In addition, building energy consumption and building energy performance are the most interesting issues in the ML-BEM domain. The top 5 ML techniques and BEM topics are demonstrated in Table 3. The topics “model”, “prediction”, and “optimization” correspond to the following BEM categories: (a) building energy modeling, (b) building energy use prediction, and (c) building energy optimization, respectively. The modeling of building energy is the basis of building performance estimation (
Eisenhower et al., 2012;
Kontokosta and Tull, 2017;
Touzani et al., 2018), and the prediction of energy use is one of the most crucial parts of building energy consumption. The purpose of building energy research is to reduce the environmental impact, and the optimization of building energy systems can obviously achieve this goal, which accounts for the attention attracted.
3.3 Cluster analysis
Keyword frequency alone cannot readily reveal the knowledge structure. As a mathematical and statistical method, cluster analysis is used to identify the latent semantic themes within the textual data (
Hossain et al., 2011). CiteSpace provides functions to decompose a network into clusters and automatically label the clusters (
Chen, 2006). In this paper, keyword cluster analysis was conducted to identify the main theme groupings in the ML-BEM field. Figure 4 shows the seven clusters identified. The larger the cluster size is, the larger the automatically generated label is. The minimum size is set to 10. The modularity
Q is 0.5737 (>0.3), indicating that the network is reasonably divided into loosely coupled clusters (
Newman, 2006), while the mean silhouette is 0.5956 (>0.5), representing the consistency of each cluster.
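For readers unfamiliar with the modularity measure, the sketch below computes Newman's modularity for an unweighted, undirected network as the sum over clusters of l_c/m − (d_c/2m)^2, where m is the total number of edges, l_c the number of edges inside cluster c, and d_c the sum of the degrees of its nodes. The toy network and the function name `modularity` are assumptions for illustration; CiteSpace may compute Q on a weighted network, so the values are not directly comparable.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its cluster label.
    """
    m = len(edges)
    intra = defaultdict(int)       # l_c: edges with both ends in cluster c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in cluster c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy network: two triangles joined by a single bridging edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, community)
print(round(q, 4))
```

For this toy partition, Q equals 5/14, which lies above the 0.3 threshold quoted in the text; dense clusters with few bridging edges push Q upward.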
The details of the seven clusters are shown in Table 4. The cluster size means the number of papers in this cluster, and each cluster is sorted in descending order of size. Automatic labeled clusters can lead to misinterpretation if their labels are taken out of context (
He et al., 2017). The terms of top frequency are shown to further clarify the content of each category.
Regarding further interpretation, Cluster #0, namely, support vector machine (SVM), has the largest size, including 20 papers. Its top terms show that SVM is often used when studying energy demand and consumption prediction, with more attention given to short-term (hourly) prediction and residential buildings. Cluster #1 illustrates that research on cooling load energy generally uses unsupervised ML methods in the context of smart grids, and that the failure mode is a common concern during the building maintenance stage. Data mining is an important auxiliary means of ML because it is driving the vital developments of ML in the 21st century. Cluster #2 shows that in the field of ML-BEM, data mining is intensively applied to occupancy and indoor environmental data, and the linear mixed effect model is often adopted. Cluster #3, urban sustainability, focuses on energy consumption identification, and with the assistance of an adaptive algorithm, novel platforms, such as cloud forecasting systems and home automation systems, are used to solve urban sustainability problems. Cluster #4 reveals that air conditioning and mechanical research commonly concentrates on air-conditioning and mechanical ventilation (ACMV) systems and that sparse swarm algorithms occur frequently. Cluster #5 illustrates that when studying the energy management of commercial buildings, especially office buildings, multiple buildings and passive commercial buildings, the surrogate model is introduced to reduce computation and to guide informed design (
Westermann and Evins, 2019). Cluster #6 refers to the deep learning method applied in ML-BEM, including random forest (RF), gradient boosting (GB) machine, and deep recurrent neural network methods, and fault diagnosis is the most common problem solved (
Zhao et al., 2019).
Previous research has reviewed several clusters, mainly concerning ML applications (Cluster #0), data-driven methods (Cluster #2), and deep learning methods (Cluster #6) in BEM, indicating that attention has been given to technology improvement and innovation. In addition, research on cooling loads (Cluster #1) and air-conditioning systems (Cluster #4) has also received much emphasis. From the results of the keyword co-occurrence and cluster analyses, the main research topics are shown to be consistent, and their complementarity enriches the study of ML-BEM.
4 Integrated framework of ML-BEM
An integrated framework is developed to describe the current research status of the ML applications in BEM, as shown in Fig. 5. Four layers and a series of driving factors are identified to form the main body of the ML-BEM knowledge domain.
4.1 Perception layer
The technology layers are composed of perception layer, data layer and algorithm layer. The bottom layer, the perception layer, is used to identify and collect information, which feeds basic information regarding various building and environment parameters.
The most crucial elements in the perception layer are various types of sensors located in the interior of building equipment or the immediate environment. A huge amount of data are generated by sensors, which can be exploited to increase energy efficiency (
Molina-Solana et al., 2017). These sensors collect different kinds of information about their surroundings. For example, thermistors, thermocouples and IC sensors can collect temperature information, capacitive sensors and resistive sensors can collect humidity information, and photodiodes and ultraviolet radiation sensors can collect light information. Sensors that collect occupant information deserve particular emphasis. On the one hand, occupant behaviors and occupancy can affect the efficiency and performance of building energy.
Dong and Lam (2014),
Liang et al. (2016),
Aftab et al. (2017) and
Wei et al. (2019b) all stressed occupant behaviors and occupancy information when studying BEM. On the other hand, research on occupant thermal comfort (
Zhai and Soh, 2017;
Ghahramani et al., 2018;
Du et al., 2019) is also of great concern in BEM. To collect occupant-related data, infrared sensor cameras are adopted to obtain the heat information of the human body to optimize the building energy performance, while the Radio Frequency Identification (RFID) device can gain occupancy information, helping determine the heating and cooling loads. In addition, a laser scanner and infrared scanner can collect building shape and appearance information.
4.2 Data layer
The second layer is the data layer, where the data already collected are preprocessed before entering the algorithm layer. The sources of the data are varied. Apart from different kinds of sensors and other automatic identification equipment, climate, weather records and energy bills are data sources. Environmental factors have been influencing the development of BEM. Increasing climate change and environmental degradation, as well as people’s requirements for environmental sustainability, continue to promote and influence research in BEM. Climate change (
Cenek et al., 2018), weather conditions (
Deb et al., 2017) and building environment conditions (
Yun and Won, 2012) all exert impacts on building energy-related problems, especially on building energy efficiency and building energy modeling. Various data sources lead to heterogeneous data with diverse structures. As a consequence, it is necessary to aggregate the data properly, according to the research purpose and algorithm requirements. Data cleaning is another crucial part of the data layer because it is common to face inconsistent data, missing-value data, duplicate data and data with invalid values.
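The cleaning issues listed above (duplicate records, missing values and invalid values) can be handled as in the following minimal sketch for a single temperature sensor. The valid range, the forward-fill strategy and the function name `clean_readings` are assumptions chosen for illustration; the appropriate treatment depends on the research purpose and the algorithm requirements.

```python
def clean_readings(readings, valid_range=(-40.0, 60.0)):
    """Clean (timestamp, value) sensor readings.

    Steps: sort by timestamp, drop duplicate timestamps (keep the first),
    treat out-of-range values as missing, then forward-fill missing values.
    Leading gaps that cannot be filled are dropped.
    """
    low, high = valid_range
    seen = set()
    rows = []
    for ts, value in sorted(readings, key=lambda r: r[0]):
        if ts in seen:
            continue  # duplicate record
        seen.add(ts)
        if value is not None and not (low <= value <= high):
            value = None  # invalid value, handled as missing
        rows.append([ts, value])

    last_valid = None
    for row in rows:
        if row[1] is None:
            row[1] = last_valid  # forward fill (an assumed strategy)
        else:
            last_valid = row[1]
    return [(ts, v) for ts, v in rows if v is not None]


# Hypothetical hourly temperature readings in degrees Celsius.
raw = [(0, 21.5), (1, None), (1, None), (2, 999.0), (3, 22.0), (3, 22.0)]
cleaned = clean_readings(raw)
print(cleaned)
```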
4.3 Algorithm layer
Before the introduction of ML into BEM, engineering and statistical methods were the most commonly used methods. Engineering methods are comprehensive but time-consuming, and statistical methods require a large amount of data. Through exploring many studies, it can be found that ML has been widely used in many fields, and BEM is no exception.
A large number of ML methods have been used to study BEM, including ANN (
de Wilde, 2014;
Benedetti et al., 2016;
Deb et al., 2016;
Zhang et al., 2016), SVM (
Eisenhower et al., 2012;
Jain et al., 2014;
Cui et al., 2016;
Grolinger et al., 2016;
Candanedo et al., 2017), regression (
Edwards et al., 2012;
Tsanas and Xifara, 2012;
Aftab et al., 2017;
Fan et al., 2017;
Kontokosta and Tull, 2017), and decision tree (DT) methods (
Zhao et al., 2014;
Basu et al., 2015;
Idowu et al., 2016;
Liang et al., 2016;
Ryu and Moon, 2016). As seen in Table 5, ANN, SVM, regression, especially linear regression (LR), random forest and decision tree methods are the most popular methods. Additionally, many articles use several ML methods to study BEM-related problems (
Edwards et al., 2012;
Dong and Lam, 2014;
Castelli et al., 2015;
Mocanu et al., 2016a;
Kim et al., 2018).
Edwards et al. (2012) reported on the evaluation of seven different ML algorithms applied to predict next hour residential building consumption and showed that the least squares SVM performs the best.
Dong and Lam (2014) illustrated a methodology for integrated building heating and cooling control to reduce energy consumption, employing an adaptive Gaussian process (GP) and hidden Markov model.
In the comprehensive framework, the most frequently used ML algorithms every year from 2003 to 2020 are shown. ANN is obviously the most commonly used ML method due to its strong robustness (
Liu et al., 2019), that is, its tolerance of faults and noise (
Tso and Yau, 2007).
Benedetti et al. (2016) compared three different structures of neural networks to create the most suitable energy consumption control tool with an automated mechanism.
Rahman et al. (2018) developed a recurrent neural network model to make medium- to long-term electricity consumption predictions for commercial and residential buildings with the purpose of supporting decision making concerning operation and demand response strategies. Typically, SVM, as a generalized classifier, can settle nonlinear and complicated problems effectively and is also frequently used in BEM.
Eisenhower et al. (2012) established an analytical meta-model to optimize building energy models without repeating time-consuming energy simulations.
Paudel et al. (2017) adopted the SVM method to predict the heating energy consumption of low-energy residential buildings, utilizing two kinds of modeling approaches, namely, “all data” and “relevant data”, where the “relevant data” approach has lower complexity and higher accuracy. In addition, there are many other ML methods adopted in BEM, including random forest, extreme learning machine (ELM), decision tree methods, etc.
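As a minimal illustration of the simplest of the methods named above, linear regression, the sketch below fits a one-variable least-squares line relating outdoor temperature to cooling energy use and then predicts the use at a new temperature. The data are synthetic and the function names `fit_line` and `predict` are hypothetical; the reviewed studies use far richer features and models.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x_new):
    """Evaluate the fitted line at a new input value."""
    slope, intercept = model
    return slope * x_new + intercept


# Synthetic data: outdoor temperature (deg C) and cooling energy use (kWh).
temperature = [20.0, 25.0, 30.0, 35.0]
energy_use = [50.0, 60.0, 70.0, 80.0]

model = fit_line(temperature, energy_use)
forecast = predict(model, 28.0)
print(model, forecast)
```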
4.4 Application layer
Scholars generally focus on specific research scenarios. The research scenarios can be divided into four dimensions: Building type, building life-cycle phase, temporal granularity and energy type. The characteristics of energy use and management of different building types vary greatly, and the focal points of the papers are also different. Among the papers reviewed, the following building types were studied: Residential buildings, commercial buildings, industrial buildings, laboratory rooms and others. For a larger scale, multibuilding (
Chen and Yang, 2017;
Xu et al., 2019) or urban city energy (
Kontokosta and Tull, 2017;
Nutkiewicz et al., 2018) are also studied. If a paper studies two or more types of buildings, the counts are done separately. As Table 6 shows, the vast majority of the studies focused on commercial buildings (
Liu and Henze, 2006;
Najafi et al., 2012;
Gao and Malkawi, 2014;
Mulumba et al., 2015;
Wang et al., 2018b), research on residential buildings takes second place (
Wen et al., 2015;
Henao et al., 2017;
Jin et al., 2017;
Paudel et al., 2017;
Papadopoulos and Kontokosta, 2019), and research on industrial buildings accounts for only 12.4% (
Eisenhower et al., 2012;
de Wilde et al., 2013;
Gong et al., 2017;
Gallagher et al., 2018). Some scholars have studied problems with building types other than those above.
Deb et al. (2016) focused on institutional buildings, employing the ANN method to forecast diurnal cooling load energy consumption, and through examination of three institutional buildings, the method was shown to be applicable to other institutional buildings.
In terms of the building life-cycle phase, most of the research focuses on the design phase and operation phase of buildings. While vast amounts of energy are consumed during the building operation phase, energy consumption during the construction phase, particularly in the production of construction materials, is also substantial. To estimate energy consumption during material production, an ANN model, taking into account the influence of the production and composition of asphalt, was developed by
Androjić and Dolacek-Alduk (2018) with the purpose of successfully forecasting natural gas consumption in the process of hot mix asphalt production.
In regard to temporal granularity, hourly, daily, weekly, monthly and even annual energy use are studied. The granularity of the collected data and the research topic together determine the temporal granularity selected. For instance, until now, most residential modeling research has had access only to monthly electrical consumption data. It is difficult to predict hourly or daily energy consumption based on monthly data alone, but it is possible to analyze coarser time units by accumulating energy consumption over a period of time. Meanwhile, the research issues also affect the choice of data granularity. For office building energy consumption prediction, for example, daily energy use data are required because of the difference between rest days and working days. When analyzing the fluctuation of household electricity consumption within one day, at least hourly data are necessary. Many scholars have concentrated on hourly energy consumption prediction (
Edwards et al., 2012;
Chen et al., 2017;
Wang et al., 2018b), while others have focused on daily energy consumption prediction (
Manjarres et al., 2017;
Wang et al., 2018a). In addition, many models have been developed for multiple granularities.
Li et al. (2018) established a deep belief network-based hybrid model for daily and weekly prediction, while
Nutkiewicz et al. (2018) proposed a data-driven urban energy simulation framework for hourly, daily and monthly use.
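The asymmetry between fine and coarse granularity can be shown with a short sketch: hourly readings can always be accumulated into daily totals, whereas daily or monthly totals cannot be split back into hourly values without additional assumptions. The data and the function name `aggregate_daily` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_daily(hourly):
    """Sum hourly energy readings into daily totals.

    hourly: list of (datetime, kWh) tuples.
    Returns a dict mapping each calendar date to its total kWh.
    """
    daily = defaultdict(float)
    for timestamp, kwh in hourly:
        daily[timestamp.date()] += kwh
    return dict(daily)


# Hypothetical meter data: 1.5 kWh per hour on day 1, 2.0 kWh per hour on day 2.
hourly = [(datetime(2020, 1, 1, h), 1.5) for h in range(24)]
hourly += [(datetime(2020, 1, 2, h), 2.0) for h in range(24)]

daily = aggregate_daily(hourly)
for day, total in sorted(daily.items()):
    print(day, total)
```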
With regard to energy type, topics ranging from different energy loads, including cooling, heating, and electricity, to various energy systems, such as heating, ventilation and air-conditioning (HVAC) systems or a single air-conditioning system, are fully discussed. Among various energy types, the most studied one is electricity.
Ahmad et al. (2019),
Dong and Lam (2014),
Lin et al. (2019) and
Ngo (2019) studied cooling and/or heating loads, and
Zhu et al. (2013),
Chen and Sun (2017),
Ghahramani et al. (2017), and
Tian et al. (2019) considered different energy systems. Different combinations of building type, building life-cycle phase, temporal granularity and energy type constitute the research scenarios of ML-BEM.
With respect to the management of building energy, scholars have discussed different aspects of ML-BEM, and five main topics can be identified: a) energy demand/use/consumption prediction; b) energy system analysis and design; c) energy performance and optimization; d) fault detection and diagnosis; and e) occupant behavior analysis. The topic of energy demand/use/consumption prediction plays a vital role in reducing energy consumption and improving building energy demand and supply management (
Amasyali and El-Gohary, 2018).
Mocanu et al. (2016b) investigated two reinforcement learning algorithms based on a deep belief network to model building energy consumption in a smart grid context. The topic, fault detection and diagnosis, is gaining increasing attention.
Li et al. (2016) proposed a fault detection and diagnosis method based on tree-structured learning for building cooling systems that improved the performance with higher-accuracy fault severity levels.
Karami and Wang (2018),
Guo et al. (2018c) and
Hu et al. (2019) also studied the fault detection and diagnosis of building energy. Occupant behavior analysis is another hot topic, playing an important role in energy consumption in buildings (
Zhu et al., 2017;
Razavi et al., 2019;
Wei et al., 2019a). Building energy modeling, in turn, is oriented toward solving building energy-related problems. Many scholars have developed building energy models for predicting (
Tooke et al., 2014;
Dong et al., 2016;
Candanedo et al., 2017), optimizing (
Eisenhower et al., 2012;
Zhai et al., 2017;
Smarra et al., 2018) and other purposes.
4.5 Driving factors
With the development of society, a number of new technologies, such as the Internet of Things (IoT) and big data, have gradually come into view. Four driving factors are identified, including cloud computing, big data, the IoT and automated measurement, which exert considerable impacts on all layers of the ML-BEM knowledge domain.
Big data are regarded as an effective tool for solving many social problems, providing new perspectives for many fields (
Boyd and Crawford, 2012), and the ML-BEM area is no exception. On the one hand, big data expand the data capacity of the data layer; on the other hand, they also promote the data-driven development of algorithms in the algorithm layer.
Edwards et al. (2017) presented a large-scale surrogate model using feed forward neural networks and Lasso regression in the context of big data.
Grolinger et al. (2016) used ANN and SVM to show the significance of big data, especially the influence that the temporal data granularity exerts on the accuracy of energy consumption forecasting in the background of an event-organizing venue.
Luo et al. (2019) developed an IoT-based big data platform based on the hybrids of
K-means clustering and ANN for the prediction of building heating and cooling demands and demonstrated that due to the involvement of IoT sensors and big data, the overall prediction accuracy was improved.
The IoT provides a new opportunity for improving the intelligence of BEM by integrating various sensors, smart meters and actuators in a cost-saving way (
Pan et al., 2015). For the perception layer, the IoT enables many new types of sensors to be installed and diversified sensor networks to be established.
Tushar et al. (2018) illustrated the process of gaining building occupancy data via IoT sensors with low price and explained how occupancy behaviors influenced building energy use. Considering a transfer learning-based technique and an unsupervised learning technique, this research can help to design more efficient energy conservation measures and reduce electricity costs.
Cloud computing and the IoT cannot be separated from each other.
Mengistu et al. (2019),
Yu et al. (2016),
Chou and Truong (2019), and
Gallagher et al. (2019) adopted ML methods, together with cloud-based technology and the IoT, to explore building energy.
Yu et al. (2016) proposed a building management system based on the IoT and cloud-based technology that helps reduce the energy consumption of buildings. However, although cloud computing reduces the computing cost and expands the computing power and flexibility, it also increases security risks and latency issues, which deserve attention.
A growing amount of automatic and intelligent equipment has put forward higher requirements for the management of building energy, and automated measurement has therefore gained much attention. Automated measurement increases the complexity of sensors for data acquisition in the perception layer because increasing automation leads to the need for a more complex intelligence program. Moreover, automated measurement enriches the data set and refines the data granularity. With the move toward smart buildings and smart cities, the smart grid has appeared as one important component of automated measurement.
Fenza et al. (2019) proposed a methodology for anomaly detection based on sensor data coming from a smart grid, monitoring the difference between the predicted and actual consumption to help identify energy consumption anomalies. Actually, the greatest challenge of automated measurement is determining which part of the data should be measured to support research and decision making.
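The idea of monitoring the gap between predicted and actual consumption can be sketched as follows. The relative-deviation rule, the 25% tolerance and the function name `flag_anomalies` are assumptions for illustration and do not reproduce the method of the cited study.

```python
def flag_anomalies(actual, predicted, rel_tol=0.25):
    """Return indices where actual use deviates strongly from the prediction.

    A reading is flagged when |actual - predicted| / predicted > rel_tol.
    Predictions that are zero or negative are skipped to avoid division issues.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    flagged = []
    for i, (a, p) in enumerate(zip(actual, predicted)):
        if p > 0 and abs(a - p) / p > rel_tol:
            flagged.append(i)
    return flagged


# Hypothetical hourly consumption in kWh.
predicted = [10.0, 12.0, 11.0, 10.0, 9.0]
actual = [10.5, 11.0, 18.0, 10.2, 4.0]

anomalies = flag_anomalies(actual, predicted)
print(anomalies)
```

In practice, the tolerance would be calibrated on a period of known normal operation so that ordinary forecast error is not reported as an anomaly.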
The four layers, i.e., perception layer, data layer, algorithm layer and application layer, together with a series of driving factors, constitute the main knowledge domains of ML-BEM. The evolutionary trend of knowledge is discussed as follows.
5 Knowledge evolution
ML technology has gone through several stages since it was first applied to BEM, and its knowledge evolution can be summarized using the hype cycle model introduced by Gartner nearly two decades ago. Predicting the development of a technology in a given field is difficult because numerous influencing factors add uncertainty. The hype cycle model is a useful tool for analyzing and predicting technology development and has gained substantial attention from many scholars in recent years (Jarvenpaa and Makinen, 2008;
van Lente et al., 2013;
Dedehayir and Steinert, 2016;
Khodayari and Aslani, 2018). The hype cycle model proposes that technologies go through several stages: First rising to a peak of expectations, then declining through disappointment, and finally recovering to realistic expectations and following a smooth upward trend. The model thus describes a general path that a technology follows over time. As shown in Fig. 6, the evolution of ML-BEM knowledge is divided into four phases: The ML-BEM concept formation phase, the booming phase, the concept extension and strategy phase, and the new technology and context phase. In light of the number of papers published per year, the four phases correspond to the ascent of the hype cycle curve before it reaches the peak.
From 2003 to 2013, basic conceptual problems were discussed. Frequent terms, such as “system”, “reinforcement learning”, “consumption”, “model”, and “thermal storage”, reflect the range of study: The system and its energy use were the initial research objects, design and operation were the main life-cycle stages, and modeling was the dominant tool. This can be considered the incipient phase of ML-BEM. Since 2014, the adoption of ML in BEM has grown explosively, and ML-BEM has entered a booming phase. A range of ML methods, including ANN, SVM, and regression, came into view and appear with high frequency. From 2016 to 2017, the research entered a period of accelerated development, during which the scope of ML-BEM widened. Instead of concerning solely the system or the building itself, the research scope gradually came to include the demand side, incorporating human behavior and comfort into the field. The keywords or terms “demand”, “behavior”, “thermal comfort”, and “consumer” reflect this tendency. From 2018 to 2020, new technologies and a new context led ML-BEM into a new phase. With the emergence of the IoT, cloud computing, big data, and smart sensors/grids/buildings/cities, ML-BEM faces more challenges and opportunities. In this phase, new technologies are introduced in conjunction with ML, generating various integrated and innovative approaches for every application scenario of BEM. In summary, ML-BEM is trending toward a more extensive research scope with more comprehensive research methods. Given the general pattern of technology development, bottlenecks and downturns may follow as ML-BEM flourishes.
6 Future directions
6.1 Behavioral impact on BEM
It is widely accepted that sustainability analysis consists of environmental, social and economic impact evaluations. In the field of ML-BEM, occupant presence and behavior in buildings have been shown to have large impacts on space heating, cooling and ventilation demand, energy consumption of lighting and space appliances, and building controls (
Page et al., 2008).
Pan et al. (2019) used a Gaussian distribution model to predict window behavior in office buildings, which significantly affects the indoor environment and energy consumption. The study also compared the prediction performance of the Gaussian distribution modeling approach with that of the logistic regression modeling approach, and the results showed that the Gaussian distribution models could provide higher prediction accuracy. With the development of ML and other intelligent technologies, data collection methods, data types, and data volumes are also changing. To forecast building energy use and performance, it is essential to study occupant-related factors in BEM. For instance, in the context of ML and big data, the number of people in a building can be acquired and processed through mobile terminals to enable fine-grained operation and maintenance management of the building. In addition to the impact of human behavior on building energy use, the balance between occupant comfort and energy consumption is also worth studying.
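The difference between the two modeling approaches compared by Pan et al. (2019) can be illustrated with a minimal sketch of their functional forms. The sketch assumes a single predictor, outdoor temperature, and uses hypothetical parameter values; it does not reproduce the fitted models or the variables of the cited study. A logistic curve changes monotonically with the predictor, whereas a Gaussian-shaped curve peaks at one value and falls off on both sides.

```python
import math


def logistic_open_probability(temp_c, intercept=-4.0, slope=0.2):
    """Logistic form: the probability rises monotonically with temperature.

    The intercept and slope are hypothetical, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * temp_c)))


def gaussian_open_probability(temp_c, peak=0.8, mean_c=22.0, std_c=6.0):
    """Gaussian-shaped form: the probability peaks at mean_c.

    The peak, mean_c and std_c are hypothetical, not fitted values.
    """
    return peak * math.exp(-((temp_c - mean_c) ** 2) / (2.0 * std_c ** 2))


# Compare the two forms at a mild and a hot outdoor temperature.
for temp_c in (22.0, 35.0):
    print(
        temp_c,
        round(logistic_open_probability(temp_c), 3),
        round(gaussian_open_probability(temp_c), 3),
    )
```

With these illustrative parameters, the logistic form assigns a higher window-opening probability at 35 °C than at 22 °C, while the Gaussian-shaped form assigns a lower one. Which form fits better is an empirical question that depends on the measured data.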
6.2 Integrated management of renewable energy
Having access to clean, affordable and reliable energy has been a cornerstone of the world’s increasing prosperity and economic growth since the beginning of the industrial revolution (
Chu and Majumdar, 2012). An increasing amount of renewable energy is used in buildings. Some scholars have studied BEM related to renewable energy, such as solar energy.
Mohajeri et al. (2018) used SVM to classify 10085 building roofs in relation to their received solar energy in the city of Geneva in Switzerland, providing basic information for designing clean buildings and efficient solar integration on the roofs of buildings. The intermittent nature of renewable energy also brings new problems to energy management. With the gradual promotion of renewable energy in buildings, the integration of renewable energy in the existing building energy system and the joint management of traditional energy and renewable energy are topics worthy of study.
6.3 Security concerns of ML-BEM
As big data and the IoT become more popular, an increasing number of sensors and IoT devices are installed in buildings during the initial design phase or a later retrofitting phase to collect many kinds of data. Both the transmission of data over wireless networks and the energy management database aggravate the security issues of BEM.
Rathinavel et al. (2017) analyzed the security threats arising from IoT devices in the context of a specific building automation system and proposed corresponding countermeasures. Many security topics in ML-BEM are worth studying to ensure security and confidentiality when data are transmitted, stored and read, such as developing more robust security technologies and designing more secure approaches for information transmission and storage. In addition, different security scenarios can be developed to test different models of energy management.
6.4 Extension to other building life-cycle phases
Most of the research on building energy demand and sustainability focuses on the design phase and operation phase of buildings.
Paudel et al. (2014) evaluated the transition- and time-dependent effects of operational power level characteristics of heating plant systems and employed neural networks to predict building heating energy demand over a short time horizon, which was essential for energy service companies to control the heat plant system dynamically. While the majority of energy consumption occurs in the operation stage, consumption during the construction phase should not be underestimated. Although a small number of scholars have begun to focus on building energy consumption during the construction phase, such research remains sparse compared with research on the design and operation phases. The construction phase therefore also needs to be considered.
6.5 Focus on automatic fault detection and diagnosis
Often, ML serves three purposes: To predict, to diagnose, or to summarize (
Jordan and Mitchell, 2015). Almost all studies on the application of ML in BEM are predictive studies, and relatively few studies are for diagnostic purposes.
Karami and Wang (2018) developed an adaptive Gaussian mixture modeling approach for automatic fault detection and diagnosis in nonlinear systems.
Najafi et al. (2012) proposed diagnostic algorithms based on a Bayesian network for air handling units that can solve the modeling limitations and measurement constraints. As more energy systems are used over longer periods of time, the likelihood of accidents inevitably increases. Combined with an image recognition algorithm, the running state of a building energy system can be analyzed and diagnosed based on infrared images of equipment. The fault detection and diagnosis of energy systems, especially early detection and diagnosis, has become more important and needs more attention.
7 Conclusions
Defining the scope of energy management in buildings can help improve energy management and thus improve energy efficiency in buildings (
Guan et al., 2010;
Li et al., 2019). The increasing use of ML technology brings about profound changes in BEM and redefines its boundary. This study analyzes publications related to ML-BEM to provide a comprehensive summary of ML-BEM knowledge and its development trends.
First, an integrated framework of ML-BEM was proposed. Four layers, namely, the perception layer, data layer, algorithm layer and application layer, composed of a series of research scenarios and main topics, together with four driving factors, constitute the knowledge domains. The main topics comprise five parts: a) energy consumption prediction; b) BEM analysis and design; c) energy performance and optimization; d) fault detection and diagnosis; and e) occupant behavior analysis. The framework proposed in the paper can be regarded as one of the best practices of ML-BEM. The continually renewed driving factors may bring breakthroughs to the application of ML in BEM, which can promote the further development of ML-BEM and alleviate the downward trend to some extent.
Based on the hype cycle model, the knowledge evolution of ML-BEM can be divided into four phases: The ML-BEM concept formation phase, the booming phase, the concept extension and strategy phase, and the new technology and context phase. This evolution shows that ML-BEM research is in full swing and developing toward more extensive applications. Attention should also be paid to possible downturns.
Then, based on the discussion above, five future research directions were proposed. (1) The behavioral impact on BEM, such as occupancy, occupant behaviors and occupant comfort, can be taken into account when examining the factors affecting building energy and performance. (2) With the wide use of renewable energy, research on the integrated management of renewable energy in BEM can be strengthened. (3) As more information is transferred and stored, more research is needed to explore the security of BEM and thus ensure its reliability. (4) Due to the vast amounts of energy consumed when buildings are constructed, more attention should be given to the construction phase. (5) As more energy systems are used over longer periods of time, future studies are recommended to focus on fault detection and diagnosis of building energy systems.
Although the use of ML in traditional areas such as facial recognition, fraud detection, and product recommendations is highly mature, research on the application of ML in BEM still needs to be developed. This study provides an integrated framework and the development trends of ML-BEM, contributing to the body of knowledge of ML-BEM. This paper reviews ML-BEM research based on the papers retrieved from the WoS database before December 2020. To ensure the representativeness of the papers and the consistency of the analytical structure of the research aims and methods, only peer-reviewed articles were included, which may miss some relevant conference proceedings papers. Despite the rapid development of ML, the most advanced developments remain at the theoretical stage and have not yet been applied in practice in the BEM area. Hence, although papers published after 2020 are not covered, the framework of this paper would not change. With the development of ML technology and the emergence of other new technologies, much exploration is required to further promote the development of ML-BEM research.