Transfer Learning For Multiobjective Non Intrusive Load Monitoring in Smart Building

Applied Energy 329 (2023) 120223
Transfer learning for multi-objective non-intrusive load monitoring in smart building
Dandan Li a,e, Jiangfeng Li b, Xin Zeng c, Vladimir Stankovic b, Lina Stankovic b, Changjiang Xiao d,f, Qingjiang Shi a,e,∗
a School of Software Engineering, Tongji University, Shanghai, China
b Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, Scotland, United Kingdom
c College of Electronic and Information Engineering, Tongji University, Shanghai, China
d College of Surveying and Geo-Information, Tongji University, Shanghai, China
e Shenzhen Research Institute of Big Data, Shenzhen, China
f Frontiers Science Center for Intelligent Autonomous Systems, Tongji University, Shanghai, China
ARTICLE INFO
Keywords: NILM; Energy disaggregation; Transfer learning; One-to-many model

ABSTRACT
Buildings represent 39% of global greenhouse gas emissions, thus reducing carbon emissions in buildings is of importance to greenhouse gas emissions reductions. This requires understanding how electricity is utilized in the buildings, then optimizing electricity management to seek conservation of energy. Non-intrusive load monitoring (NILM) is a technique that disaggregates a house's total load to estimate each appliance's electric power usage. Several strategies for estimating one appliance at a time (one-to-one model) have been presented and experimentally proven to be effective, with two mainstream trends: appliance transfer learning and cross-domain transfer learning. The former refers to the transfer between different types of appliances in the same data domain, while the latter refers to the transfer between different data domains for the same type of appliance. Different from the previous work, this paper explores the approach of adopting one model for all appliances (one-to-many model) and proposes a novel transfer learning scheme that incorporates appliance transfer learning and cross-domain transfer learning. Thus, a well-trained model can be transferred and utilized to effectively estimate the power consumption in another data set for all appliances, which demands fewer measurements and only one model. Three public data sets, REFIT, REDD, and UK-DALE, are used in our experiments. Further, a set of smart electricity meters was deployed in a practical non-residential building to validate the proposed method. The results demonstrate the accuracy and practicality of the proposed method compared to state-of-the-art one-to-one NILM transferred models.
1. Introduction

Climate change has been raised as one of the great threats and as such attracts global attention [1,2]. To mitigate the negative effects of climate change, it is urgently required to reduce greenhouse gas emissions and approach net-zero targets [3]. In reaching net-zero targets, buildings play a crucial role, as buildings represent 39% of global greenhouse gas emissions [4]. This requires understanding how electricity is utilized in the buildings, then optimizing electricity management to seek conservation of energy [5].
Non-intrusive load monitoring (NILM) is a cost-effective and promising approach that enables estimating individual building loads without a need for separate appliance metering. It aims at disaggregating the total energy consumption of a building down to individual appliances [6,7], improving appliance anomaly detection [8] and facilitating effective energy feedback [9]. In recent years, the research on NILM has been intensified due to practical demands (improving energy efficiency and meeting global climate change targets) and the availability of data via massive roll-out of smart meters [10].
From the perspective of data usage, NILM can be divided into high-frequency and low-frequency approaches according to the sampling frequency of the raw data [11]. High-frequency data has apparent advantages due to the large information content available for processing. As presented in [12], with high-frequency data, it is even possible to distinguish identical appliances. However, high-frequency sampling incurs high costs for dedicated hardware infrastructure, extra installation effort, and data storage and processing. In recent years, with the pervasive roll-out of low-frequency smart meters, NILM research tends to explore low-frequency approaches [13].
∗ Corresponding author at: School of Software Engineering, Tongji University, Shanghai, China.
E-mail address: shiqj@tongji.edu.cn (Q. Shi).
https://doi.org/10.1016/j.apenergy.2022.120223
Received 13 July 2022; Received in revised form 6 October 2022; Accepted 23 October 2022
Available online 8 November 2022
0306-2619/© 2022 Elsevier Ltd. All rights reserved.
Various NILM approaches established on low-frequency data can be roughly classified into event-based and non-event-based categories depending on whether or not there is a clear state change detection [14]. In the previous research, the event-based methods are the main trend [15]. The work in [16] proposed a multi-class classification model that uses current shapelets to extract features of each appliance. The work in [17] proposed a graph signal processing based method for NILM, based on which [18] proposed a training-less solution. With a large availability of data for training models, recently, deep learning-based methods have been popular for NILM as non-event-based methods. A Convolutional Neural Network (CNN) architecture for sequence-to-point NILM, disaggregating one appliance at a time, was first proposed in [19]. Furthermore, the work in [20] proposed a fully-convolutional denoising auto-encoder architecture for large non-residential buildings. In addition, the work in [21] employed a three-layer CNN architecture for NILM, which can estimate the electricity consumption of multiple appliances simultaneously.
Though a large number of network architectures have been proposed in the literature, there are still clear gaps between academic research and practical application: (1) Data Scarcity: The lack of labeled data is an open issue for NILM [22], since gathering sufficient training data takes a long time in practice. How to use less training data in practical applications while maintaining accuracy remains a challenge [13]. (2) Model Complexity: Complex approaches often require high computing power, which limits the large-scale application of models [23].
To address the aforementioned challenges, this paper proposes a one-to-many model transfer strategy, building upon initial results from our conference paper [24]. The first challenge is addressed via an architecture optimized for transfer learning (transferring the knowledge of the same appliances among different datasets). The second challenge is addressed by having a single network for all appliances, training and testing a single model for the entire household, instead of one model per appliance. The performance is verified on three public data sets (REFIT [25], REDD [26], and UK-DALE [27]). In particular, the proposed workflow contains three steps: pre-training using labeled source data, retraining of the last two dense layers utilizing the target data, and fine-tuning the CNN layers with the target data. Fair comparisons with the state-of-the-art transfer learning model [28] are further provided to demonstrate the applicability of the proposed approach.
To demonstrate the efficiency of the network for real-time NILM processing, we equipped a non-residential building with energy meters and disaggregate six appliances: washer, fridge, air conditioner, server, freezer, and a central air conditioner (CAC). The proposed model is trained with REFIT data, and then retrained and fine-tuned with the collected data. Promising results were obtained after the practical test and application.
The key contributions of this paper are three-fold:

1. A one-to-many NILM model is designed (one model for all appliances), which significantly reduces the model size and the complexity of NILM for multiple appliances.
2. A transfer one-to-many NILM model is first proposed. Thus, a well-trained model can be transferred to be used for another data set, effectively estimating power consumption for all appliances, with fewer measurements and less training.
3. A novel metric, Overall Disaggregation Proportion Error (ODPE), is defined to measure the disaggregation error of each type of electrical appliance from the perspective of load ratio.

We organized the paper as follows. Section 2 provides an elaborate discussion of multi-object learning and transfer learning for NILM. Section 3 starts by describing how meters and devices are deployed in buildings for NILM, and the formulation of the NILM problem. Section 4 introduces the proposed model to address the one-to-many NILM problem and transfer learning in the NILM problem. Section 5 follows up with a detailed experimental design, experimental dataset, metrics, and results. Finally, the conclusions and future work are presented in Section 6.

2. Related work

NILM has been extensively studied as a tool that can support energy management, smart home automation, and demand response measures [15,29]. Here we provide an elaborate discussion about the related work on multi-task learning and transfer learning for NILM.

2.1. Multi-object learning

For practical NILM applications, a multi-object classification model is often a desirable option because it requires fewer parameters to optimize, less training time, fewer computational resources and less storage. Consequently, a few studies have investigated multi-object classification for NILM [30]. The work in [31] provided a complete NILM architecture containing event type classification, load identification, event sample detection, and the multi-classification. The work in [23] applied Signal2Vec dimensionality reduction for NILM multi-object classification. The work in [32] used a combination of multi-object classification and convolutional transfer learning for NILM. See [33] for a review of existing multi-object classification algorithms. Overall, the state-of-the-art multi-object classification methods produce promising results, but some challenges remain.
Indeed, the above approaches require large amounts of labeled appliance power data [34]. To address this problem, transfer learning is used in the NILM model to reduce the amount of required training data. In the following, the data used to pre-train the model is referred to as source data, while the data to be disaggregated is referred to as target data. Transfer learning models, in general, learn and store knowledge from source data before applying it to target data.

2.2. Transfer learning

As reviewed in [13], transferring models offers a great opportunity to tackle the problem of data scarcity. Currently, the research on the application of transfer learning in NILM is limited. As the first application of transfer learning to the NILM task, the work in [35] designed two data-driven deep learning-based architectures (CNN and GRU) for NILM. Cross-domain transfer learning was applied to REFIT, UK-DALE, and REDD datasets. Across all three datasets, both networks predict state and consumption; a significant performance loss is observed when training and testing datasets are different. Another transfer learning scheme, appliance-domain transfer learning, was proposed in [28]. The work in [28] employed a one-to-one transferring model, which can predict one appliance at a time.
Further, the work in [36] proposed a transfer learning-based NILM model, which uses a long short-term memory neural network to extract features and then classify the appliance type with a probabilistic neural network. This model can infer the appliance type with limited measurements; however, it needs prior knowledge of electrical appliances, e.g., rated power.
Meta-learning is the second kind of transfer learning model, based on which [37] was proposed. Using the source data, this model pre-trained the parameters to identify more general initial parameters for a quick training with target data. Nonetheless, instead of fine-tuning, it focuses on training a new generic model, which is then used to estimate one appliance at a time.
Despite the above progress, NILM still faces the following challenges, which can be addressed by our proposed method:

1. Current multi-object models mainly focus on a few typical appliances rather than all the appliances in the circuit. It is not clear whether all the appliances in a data set can be disaggregated with an adaptive single model, i.e., using one model for all appliances (one-to-many model) instead of one model for each appliance or a few typical appliances.
2. It is unclear whether this one-to-many model can be made transferable and how it compares with the one-to-one transferability model. Additionally, very little attention has been devoted to transfer learning based on one-to-many models for NILM.

3. Problem formulation

Understanding how electricity is used in buildings at the appliance level is necessary for improving energy efficiency [38]. There are two main options for estimating the electric power consumption of each appliance: (a) installing a meter at each appliance and (b) using the NILM method [39]. Compared with the former method, NILM's main advantage is the requirement for only one single electric meter deployed to monitor the overall load, as illustrated in Fig. 1. The main meter is connected to the indoor electricity distribution bin, which is connected directly to each electrical appliance without electricity meters between them.

Fig. 1. Compared to intrusive load monitoring, the electricity meters between the electricity distribution bins and the appliances can be eliminated when NILM is used.

The main meter power measurement at time stamp $t$, $y_t$, can be expressed as:
$$y_t = \sum_{i=1}^{N} x^{(i)}_t + \epsilon, \tag{1}$$
where $N$ denotes the number of known appliances in the building, and $x^{(i)}_t$ is the electric consumption of appliance $i$ at time stamp $t$. $\epsilon$ represents the noise which includes measurement noise and all unknown load; it is usually modeled as a random variable with Gaussian distribution. The task of NILM is to estimate $x^{(i)}_t$ given $y_t$ for each time stamp $t$.
The conventional, one-model-for-one-appliance (one-to-one) model estimates $x^{(i)}_t$ for each $i$, independently from other $x^{(j)}_t$, $j \neq i$; that is, a separate model is built for each appliance and NILM classification is executed $N$ times. Some methods then perform a correction step to ensure that the sum of the estimated loads is below the aggregate $y_t$. On the other hand, the proposed one-to-many model estimates $x^{(i)}_t$ for all $i$ in one pass, without requiring separate models for each appliance.

4. Proposed approach

Following recent trends [13], and capitalizing on a large amount of open access electrical measurements data, in this paper, a novel deep learning-based NILM method is proposed. In contrast to existing methods, the proposed method offers high transferability of the model (reducing the labeling effort) and low complexity. In this section, first, we describe the classical sequence-to-point architecture and one-to-one model, then focus on the proposed one-to-many model and its design, and last introduce the adaptive domain transfer learning strategy.

4.1. Sequence-to-point (seq2point) architecture

In a seq2point architecture, the raw data is divided into short time windows (i.e., sequences), and the data from each window of the main electricity meter is used to estimate the power consumption of the targeted appliance at the window's midpoint using a neural network [40]. The model can be expressed as
$$\hat{x}^{(i)}_{t_0} = f^{(i)}(y_\tau) + \epsilon, \tag{2}$$
where $y_\tau$ ($\tau = 1, \ldots, T$) represents the $\tau$-th window of the input aggregate data, and $T$ is the number of windows. $\epsilon$ represents the model noise and $\hat{x}^{(i)}_{t_0}$ is the output of the model, i.e., the estimated power for appliance $i$ at $t_0$, which is the middle point of the window $\tau$. $f^{(i)}(\cdot)$ is a mapping function of $y_\tau$ to $\hat{x}^{(i)}_{t_0}$ for appliance $i$ and it is different for each appliance. For each appliance, the model's output is obtained by maximizing the model's a posteriori probability, which can be formulated as:
$$\max \prod_{\tau=1}^{T} p\left(\hat{x}^{(i)}_{t_0} \mid y_\tau, \theta_i\right), \tag{3}$$
where $\theta_i$ denotes the entire network parameters for appliance $i$, and each appliance requires such a model.
The one-to-one (one model for one appliance) model for solving Eq. (3) builds a separate model $\theta_i$ for each appliance. This one-to-one model-based method is widely adopted in most recent NILM architectures [18,28,41], where one model is trained for each appliance to increase the likelihood of correctly predicting a certain type of appliance.

4.2. One-to-many model

In this paper, we improve disaggregation efficiency by using a one-to-many (one model for many appliances) structure. The model takes into account all $N$ appliances at the same time, resulting in the following problem formulation:
$$\max \prod_{\tau=1}^{T} p\left(\hat{x}^{(1)}_{t_0}, \hat{x}^{(2)}_{t_0}, \ldots, \hat{x}^{(N)}_{t_0} \mid y_\tau, \theta\right), \tag{4}$$
where $p(\hat{x}^{(1)}_{t_0}, \hat{x}^{(2)}_{t_0}, \ldots, \hat{x}^{(N)}_{t_0} \mid y_\tau, \theta)$ is a joint conditional probability density function of all $N$ appliances for each window $\tau$. $\theta$ denotes the overall network parameters of the NILM model. Therefore, the one-to-many model requires only one model for all appliances, and requires less storage and computational resources.
In order to solve the NILM problem represented by Eq. (4), we provide an efficient CNN architecture that constructs the one-to-many model to estimate all appliances in the household in one run of the network using one model. The continuously recorded main power consumption is divided into $T$ time windows, each containing $2t_0 + 1$ samples. Thus the input data has the shape of $T \times (2t_0 + 1) \times 1$. The number of units in the output layer is $N$, which is set by the number of appliances to be disaggregated.
The designed CNN architecture, as shown in Fig. 2, consists of four convolution layers that capture time-dependent information in the receptive field. Fig. 2 shows the filter size, stride, and padding that we employed in all simulations.

4.3. Adaptive domain transfer learning

The model described in the previous subsection relies on the network's generalization ability, and is expected to work well if the test data distribution closely follows the training data distribution. This is a reasonable assumption if training and test datasets are generated from the same domain [22], which in the context of this paper, means the same building [28]. Cross-domain transfer learning therefore can be described as the model trained on one building and tested on another building.
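As an illustration of Eq. (1) and of the input and output shapes described in Section 4.2, the following minimal Python sketch builds a synthetic aggregate signal and cuts it into seq2point windows. It is not the authors' code: the data are random placeholders, the helper name `make_windows` is hypothetical, and a stride of one sample with a window of 2t0 + 1 = 599 samples is assumed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(aggregate, appliances, t0):
    """Cut an aggregate series into seq2point windows (hypothetical helper).

    aggregate:  shape (L,)   main-meter readings y_t
    appliances: shape (L, N) per-appliance readings x_t^(i)
    Returns inputs of shape (T, 2*t0 + 1, 1) and targets of shape (T, N),
    where each target row holds all N appliances at the window midpoint.
    """
    width = 2 * t0 + 1
    windows = sliding_window_view(aggregate, width)  # (L - width + 1, width)
    inputs = windows[..., np.newaxis]                # add a channel axis
    targets = appliances[t0:len(aggregate) - t0]     # midpoint of each window
    return inputs, targets


# Synthetic placeholder data, not taken from REFIT, REDD or UK-DALE.
rng = np.random.default_rng(0)
L, N, t0 = 1000, 5, 299
appliances = rng.uniform(0.0, 100.0, size=(L, N))
noise = rng.normal(0.0, 1.0, size=L)                 # Gaussian term of Eq. (1)
aggregate = appliances.sum(axis=1) + noise           # y_t = sum_i x_t^(i) + eps

inputs, targets = make_windows(aggregate, appliances, t0)
print(inputs.shape, targets.shape)
```

A one-to-many network would then map each row of `inputs` to the corresponding row of `targets` through an output layer with N units, whereas a one-to-one setup would train N separate networks, each on a single column of `targets`.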
Fig. 2. The structure of the proposed one-to-many NILM model.

Further, we consider whether the NILM model developed for one domain can be reused as the initial point for a model on another domain. The source domain refers to the collected training data and is usually a large data set. The target domain is the data to be disaggregated (i.e., test dataset). To design a model that generalizes well across different but similar domains, i.e., that enables transfer learning, we adapt the approach proposed in [28] as shown in Fig. 3.
The first four layers, which follow the architecture of Section 4.2, learn generic features from the source domain and are unchanged when the source and target domains are different. The few top layers learn more specialized features for each appliance and are fine-tuned to work with the target domain. In our method, the number of output appliances in the target model is adaptive because it is automatically set to the number of appliances to be disaggregated in the target data set.
Overall, the steps of the proposed one-to-many transfer learning method are as follows: (1) obtain the pre-trained one-to-many model (Section 4.2); (2) create a base model, which has the same structure as the pre-trained model; (3) train the dense layers using the labeled target data set while not changing the CNN layers; (4) finally, in contrast to [28], we further improve the model via fine-tuning of hyper-parameters.

Fig. 3. The process of the proposed transfer one-to-many NILM model.

5. Experiments

5.1. Datasets

We evaluate the proposed method using three open access data sets, widely used in NILM research, namely, REFIT, REDD, and UK-DALE, as well as our own recorded data that we refer to as SC-EDNRR. Table 1 shows the data sets used in this paper, the specific houses used, and the number of labeled appliances in each house. We always use the entire available dataset for each considered house.

Table 1
The description of the four data sets used in this work.
Data set    House index    Number of appliances
REFIT       11, 20         9, 9
REDD        1, 2, 5        20, 11, 26
UK-DALE     3, 4           53, 19
SC-EDNRR    1              6

(1) REFIT. The cleaned REFIT dataset contains electrical measurements from 20 households in the Loughborough area of the UK from 2013 to 2014 [42]. Nine appliances are monitored in each house, and all data are recorded at 8-second intervals. Five types of appliances, namely kettle, microwave, fridge, dishwasher, and washing machine, are the most commonly used baseline appliances in related NILM state-of-the-art literature. Since only two REFIT houses, 11 and 20, contain all five appliances, they are chosen for training and testing in this paper. We use data in the period from 30/Jul/2014 to 14/Aug/2014 to make up samples described in Table 2.
(2) REDD. Reference Energy Disaggregation Data set (REDD) [43] comprises power usage data from 6 US households. Each house has two main meters and 10 to 25 individual appliance monitoring meters. The data is sampled once every 3 s. Houses 1 and 2 are chosen for training and testing since they contain all the five baseline appliances. We use data in the period from 11/May/2011 to 23/May/2011 to make up samples described in Table 2.
(3) UK-DALE. The UK Domestic Appliance-Level Electricity (UK-DALE) [44] dataset contains readings from 5 UK houses. A main meter and several individual appliance meters are used in each house, and the electrical measurements are collected every 6 s. Houses 3 and 4 have relatively complete data and have been chosen as training and testing data. We use data in the period from 03/May/2013 to 13/May/2013 to make up samples described in Table 2.
(4) SC-EDNRR. For a more thorough examination of the proposed network and to verify its applicability to another country and real-time performance, a set of smart meters was installed in a room of a non-residential facility. The data set is dubbed ‘‘Energy Disaggregation data set in a Non-Residential Room in Shanghai, China’’ (SC-EDNRR), which consists of six appliances: freezer, fridge, washer, air conditioner,
central air conditioner (CAC), and a computer server. As in REFIT and REDD datasets, to provide ground truth, each appliance is monitored by an individual appliance energy meter. The overall power usage is provided by a single main meter. The data set covers a four-month period from 22/Sep/2021 to 13/Jan/2022 and is sampled once per second. The smart meters and equipment in the room are depicted in Fig. 4. The dataset is available on https://pureportal.strath.ac.uk/en/datasets/energy-disaggregation-data-set-in-a-non-residential-room-in-shang.
Since REFIT has 8-second intervals, all other data sets were resampled to 8 s as well.

Fig. 4. A diagram of the room's energy meters, which include a main power meter and six appliance electrical meters used to generate the SC-EDNRR dataset.

Table 2
The description of the four data sets used in the four experiments of this work. K: Thousand.
Case   Pre-training                     Fine-tuning                       Testing
       Dataset   House   Samples (K)    Dataset    House     Samples (K)  Dataset    House   Samples (K)
i      REFIT     11      199.6          REFIT      20        84.3         REFIT      20      63.2
ii     REFIT     11      199.6          REDD       1, 2, 5   35.1         REDD       1, 2    63.2
iii    REFIT     11      199.6          UK-DALE    3, 4      71.7         UK-DALE    3, 4    63.2
iv     REFIT     11      199.6          SC-EDNRR   1         39.5         SC-EDNRR   1       179.8

5.2. Metrics

To evaluate the performance, we use three metrics commonly used in recent NILM literature [28]: mean Normalized Disaggregation Error (NDE), normalized Signal Aggregate Error (SAE), and Energy per Day (EpD). NDE measures the normalized error of the squared difference, and is defined as:
$$NDE = \frac{\sum_{i,t} \left(x_{it} - \hat{x}_{it}\right)^2}{\sum_{i,t} x_{it}^2}, \tag{5}$$
where $i \in \{1, \ldots, N\}$ denotes appliance number, $N$ is the total number of all appliances, and $t$ indicates the time step. $x_{it}$ and $\hat{x}_{it}$ denote the real consumption and the estimated consumption of appliance $i$ at time $t$, respectively.
SAE is used to evaluate the difference between the total energy consumption and estimated energy consumption:
$$SAE = \frac{|r_i - \hat{r}_i|}{r_i}, \tag{6}$$
where $r_i = \sum_t x_{it}$ denotes the total energy consumption of appliance $i$ for all time steps $t$ in the test period, while $\hat{r}_i = \sum_t \hat{x}_{it}$ represents its estimated total energy consumption.
EpD is a metric to measure daily prediction error:
$$EpD = \frac{1}{D} \sum_{d=1}^{D} |e_i - \hat{e}_i|, \tag{7}$$
where $\hat{e}_i$ and $e_i$ represent the estimated and true energy consumption of appliance $i$ in a day, respectively. The difference between $e_i$ and $r_i$ is that the former denotes the energy consumption of appliance $i$ per day, while the latter represents the energy consumption of appliance $i$ in the whole test period. $D$ denotes the total number of days in the test data.
In addition to these three measures, we introduce a new one, and we name it the Overall Disaggregation Proportion Error (ODPE). It reflects the long-term disaggregation error of each type of electrical appliance from the perspective of load ratio, taking into account the proportion of electricity consumed by each appliance:
$$ODPE = \frac{|\hat{r}_i - r_i|}{\sum_{i=1}^{N} r_i} \times 100\%, \tag{8}$$
where $\sum_{i=1}^{N} r_i$ is the total metered energy consumption of all $N$ appliances in the testing period. ODPE focuses on the estimated error of the proportion of electricity consumed by each type of appliance relative to the total electricity consumed.

5.3. Settings for training neural networks

A fixed-length window, which is set as 599, is used for the input data and is shown in Fig. 2. Furthermore, each window's data is created by sliding a window one time step [45], that is, each subsequent window is shifted by one sample relative to the previous one. The procedure of a sliding window is depicted in Fig. 5, where $L$ denotes the number of samples of the data set. Samples are chosen at random from all the samples listed in Table 2 to make up the training, validation and testing sets with a 60%–20%–20% ratio.
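To make the four metrics of Section 5.2 concrete, the sketch below implements Eqs. (5) to (8) with NumPy on a tiny hand-made example. It is an illustrative reading of the formulas, not the authors' evaluation code: the function names, the `samples_per_day` parameter and the toy arrays are assumptions, and sums of power samples stand in for energy (at a fixed sampling interval they differ from energy only by a constant factor, which cancels in NDE, SAE and ODPE but scales EpD).

```python
import numpy as np


def nde(x_true, x_pred):
    """Eq. (5): squared error normalised by signal energy, over all i and t."""
    return np.sum((x_true - x_pred) ** 2) / np.sum(x_true ** 2)


def sae(x_true, x_pred):
    """Eq. (6): per-appliance relative error of the total consumption r_i."""
    r, r_hat = x_true.sum(axis=0), x_pred.sum(axis=0)
    return np.abs(r - r_hat) / r


def epd(x_true, x_pred, samples_per_day):
    """Eq. (7): per-appliance mean absolute error of the daily totals e_i."""
    n_days = x_true.shape[0] // samples_per_day
    n = n_days * samples_per_day                      # drop an incomplete last day
    e = x_true[:n].reshape(n_days, samples_per_day, -1).sum(axis=1)
    e_hat = x_pred[:n].reshape(n_days, samples_per_day, -1).sum(axis=1)
    return np.abs(e - e_hat).mean(axis=0)


def odpe(x_true, x_pred):
    """Eq. (8): per-appliance error in the share of total load, in percent."""
    r, r_hat = x_true.sum(axis=0), x_pred.sum(axis=0)
    return np.abs(r_hat - r) / r.sum() * 100.0


# Toy data: rows are time steps, columns are two appliances (made-up values).
x_true = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])
x_pred = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 3.0], [1.0, 5.0]])

print(nde(x_true, x_pred))                     # 12 / 20 = 0.6
print(sae(x_true, x_pred))                     # [0, 0.25]
print(epd(x_true, x_pred, samples_per_day=2))  # [0, 3]
print(odpe(x_true, x_pred))                    # [0, 2 / 12 * 100]
```

In the toy example the first appliance is estimated perfectly, while the second is over-estimated in total by 2 units out of an overall load of 12, which is what ODPE reports as a share of the total load.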
machine, and Dishwasher is obtained using the one-to-many model.

For EpD, it can be found that two appliances perform better when
using the one-to-one transfer model, and three appliances perform
better when using the one-to-many transfer strategy. In terms of NDE,
three appliances have lower NDE when using the one-to-one transfer
model, one appliances perform better when using the one-to-many
transfer strategy, and one appliance, Dishwasher, has the same NDE
when using the two models. While using the proposed one-to-many
transfer strategy has a higher NDE, the average values of SAE and EpD
Fig. 5. The process of data generation. are lower, demonstrating that although the variance of decomposition
results using the proposed one-to-many transfer strategy is slightly
Table 3 larger than that of the one-to-one transfer model, the average results of
Hyper-parameters for training. the whole are better. Fig. 6 is an example of the disaggregation curve
Pretrain for the Fridge using the proposed one-to-many model. It can be verified
Input window size (sequence length) 599 from the figure that the estimated trace accurately follows the ground
Number of maximal epochs 300 truth curve. The ODPE values are less than 2 for all appliances except
Batch size 1024
for the washing machine, which means the disaggregation ratio error of
Learning rate 1e−2
all appliances is less than 2%. The results indicate that the one-to-many
Retrain of the last two Dense layers
model is competitive and performs very well when trained and tested
Batch size 1024
on the same dataset.
Number of maximal epochs 50
Learning rate 1e−2
Tables 5 and 6 show the results of domain transfer learning, where
the models are pre-trained on the REFIT dataset, and tested on REDD
Fine-tune of the first four CNN layers
and UK-DALE, respectively. For the REDD dataset, the SAE values of
Batch size 1024
Fridge and Washing machine using the proposed one-to-many transfer
Number of maximal epochs 20
Learning rate 1e−4 model are lower than that of the one-to-one model, while the SAE
values for two other appliances are higher. Similarly, two appliances
and three appliances perform better in the metrics of EpD and NDE
respectively using the proposed one-to-many transfer model. The ODPE
The first four CNN layers are kept unchanged to enable transfer values of all the appliances using the proposed model are lower than
learning, and the last two dense layers are fine-tuned when applied 7, indicating that the disaggregation percentage errors of all appliances
to the target domain. There are five steps in the process of transfer are less than 7%. Fig. 10 shows the proportion of electricity consumed
learning. (1) Obtain the pre-trained model, and to be consistent with by each appliance (actual and estimated) in the REDD dataset. Figs. 7,
the state-of-the-art research, the REFIT data set was chosen as the 8, and 9 plot an example of the disaggregation curves for Microwave,
source domain as per [28]. (2) Create the base model with the same first WashingMachine, and Dishwasher in REDD dataset, respectively, using
four CNN layers as the pre-trained model. The output number is set as the proposed transferred one-to-many model. It can be seen from the
the total number of appliances in each target house in the experiments.
(3) Keep the CNN layers in the pre-trained model fixed, add two trainable dense layers, and train them using target domain data.
(5) Fine-tune the first four layers at a very low learning rate, which is set to 1e−4 in our experiment.
Four groups of experiments were conducted. In the first experiment (Case i), the performances of the traditional one-to-one model [28] and the one-to-many model proposed in this paper are compared on the REFIT dataset. In the second and third experiments (Case ii and Case iii), the performances of the one-to-one transferred model [28] and the proposed one-to-many transferred model are compared. In these two sets of experiments, the model was trained on the REFIT dataset and tested on REDD houses 2, 5 and UK-DALE houses 3, 4, respectively, to verify the generalization of the proposed model and compare it with the state-of-the-art method. In the fourth experiment (Case iv), the traditional one-to-one model [28] and the proposed one-to-many model are applied to our own recorded SC-EDNRR dataset to verify the accuracy of the proposed method.
In all simulations we use the data described in Table 2 and the network settings shown in Table 3.

5.4. Results

We compare our results with those achieved by the state-of-the-art one-to-one seq2point transfer learning method of [28], which were taken from [28] for Cases i–iii.
Table 4 shows the results for estimating five appliances with the one-to-one model of [28] and the proposed one-to-many model on the REFIT dataset, which is the most challenging dataset of the three considered. It can be seen that a lower SAE for Microwave and Kettle is obtained by the one-to-one model, while a lower SAE for Fridge, Washing machine, and Dishwasher is obtained using the one-to-many model.

figures that the disaggregation of each appliance estimates well the trend of the true electricity consumption.
For the UK-DALE dataset, the SAE values for the Fridge, Dishwasher, and Washing machine using the proposed one-to-many transfer model are lower than those of the one-to-one model. Similarly, four appliances and three appliances, the majority of all devices, perform better on the EpD and NDE metrics, respectively, using the proposed one-to-many transfer model. The ODPE values of all the appliances using the proposed model are lower than 4, indicating at most 4% of disaggregation error. Fig. 11 shows the proportion of electricity consumed by each appliance in the UK-DALE dataset, actual and estimated, showing that the proposed method can accurately estimate appliance consumption, providing competitive performance to the more complex state-of-the-art method of [28].
In summary, the results show that disaggregating all appliances at once using a simple transferred one-to-many NILM model improves effectiveness and practicality without sacrificing accuracy. The reason behind this is that we add fine-tuning to the first four CNN layers, which are used to learn low-level features, to overcome the difference between source and target domain data [46].
To verify the universality and practicality of the proposed method, we tested the model on the SC-EDNRR dataset, after pre-training the model on REFIT. Table 7 shows the results of the benchmark model and the proposed model in terms of various performance metrics including SAE, NDE, EpD and ODPE. It can be seen that the proposed one-to-many model performs better for most appliances. Indeed, the SAE, NDE, EpD, and ODPE values of Freezer, Fridge, Air c. (air conditioner) and Server are all lower with the proposed method. The explanation is that the load characteristics and the electricity usage of this dataset are somewhat different from those of the training dataset. Therefore, it is not enough to train the dense layer only; it is more important to do
Fig. 6. Disaggregation for Fridge using the one-to-many model trained on REFIT and tested on REFIT.
Fig. 7. Disaggregation for Microwave using the proposed transferred one-to-many model trained on REFIT and tested on REDD House 2.
Fig. 8. Disaggregation for Washing machine using the proposed transferred one-to-many model trained on REFIT and tested on REDD House 1.
Fig. 9. Disaggregation for Dishwasher using the proposed transferred one-to-many model trained on REFIT and tested on REDD House 1.
Table 4
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of one-to-one model and one-to-many model (Case i). Both models are trained on REFIT and tested on REFIT.

Method        Benchmark: one-to-one model     One-to-many model
Appliance     SAE      EpD       NDE          SAE      EpD       NDE      ODPE
Microwave     0.05     96.04     0.20         0.13     191.91    0.94     1.03
Fridge        0.44     270.48    0.45         0.07     133.40    0.31     0.80
Washing m.    2.61     319.11    0.54         0.62     151.56    0.83     3.67
Dishwasher    0.61     206.15    0.49         0.07     40.20     0.49     0.45
Kettle        0.05     96.04     0.20         0.13     189.12    0.51     1.24
Mean          0.75     197.56    0.37         0.17     141.23    0.61     1.43
Std           1.06     100.96    0.16         0.17     61.71     0.26     1.28
Table 5
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of transfer learning experiments, where the models are trained on REFIT and tested on REDD data set (Case ii).
Benchmark (one-to-one transfer model): CNN trained on REFIT, dense layers trained on REDD, tested on REDD.
The proposed model: pre-trained on REFIT, dense layers trained on REDD, CNN fine-tuned on REDD, tested on REDD House 2.

Method        Benchmark                       The proposed model
Appliance     SAE      EpD       NDE          SAE      EpD       NDE      ODPE
Microwave     0.023    247.57    0.68         0.241    200.33    0.54     3.98
Fridge        0.057    118.02    0.29         0.033    128.79    0.17     6.46
Dishwasher    0.007    274.62    0.35         0.729    78.06     0.69     3.20
Washing m.    0.128    181.31    0.19         0.025    1.87      0.09     0.07
Mean          0.054    205.38    0.37         0.257    102.26    0.37     3.43
Std           0.053    70.20     0.21         0.330    83.63     0.28     2.63
Table 6
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of transfer learning experiments, where the models are trained on REFIT and tested on UK-DALE data set (Case iii).
Benchmark (one-to-one transfer model): CNN trained on REFIT, dense layers trained on UK-DALE, tested on UK-DALE.
The proposed model: pre-trained on REFIT, dense layers trained on UK-DALE, CNN fine-tuned on UK-DALE, tested on UK-DALE.

Method        Benchmark                       The proposed model
Appliance     SAE      EpD       NDE          SAE      EpD       NDE      ODPE
Fridge        0.266    297.75    0.51         0.080    61.34     0.28     3.19
Dishwasher    0.610    206.15    0.49         0.066    40.20     0.49     0.45
Washing m.    0.899    329.83    1.19         0.580    55.74     0.79     2.47
Kettle        0.043    72.92     0.21         0.603    48.69     1.06     2.07
Mean          0.454    226.66    0.60         0.332    51.49     0.65     2.04
Std           0.377    115.11    0.41         0.299    9.13      0.34     1.15
Table 7
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of transfer learning experiments, where the models are trained on REFIT and tested on our practical SC-EDNRR dataset (Case iv).
Benchmark (one-to-one transfer model): pre-trained on REFIT, dense layers trained on SC-EDNRR, tested on SC-EDNRR.
The proposed model: pre-trained on REFIT, dense layers trained on SC-EDNRR, CNN fine-tuned on SC-EDNRR, tested on SC-EDNRR.

Method        Benchmark                                The proposed model
Appliance     SAE      EpD       NDE      ODPE         SAE      EpD      NDE       ODPE
Freezer       0.096    280.58    0.092    2.533        0.005    18.87    0.0001    0.30
Fridge        0.111    47.52     0.173    1.21         0.027    13.90    0.0840    0.30
Washing m.    0.003    0.38      0.001    0.003        0.223    27.95    0.1660    0.56
Air c.        0.932    136.39    0.985    3.79         0.639    77.82    0.4425    1.48
CAC           0.027    0.31      0.007    0.017        0.061    8.09     0.0050    0.18
Server        0.088    20.08     0.215    0.336        0.005    5.83     0.0005    0.05
Mean          0.209    80.87     0.245    1.314        0.160    25.41    0.1163    0.47
Std           0.325    100.61    0.339    1.413        0.248    26.87    0.1727    0.51
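The four metrics reported in Tables 4 to 7 can be illustrated with the toy example of Appendix B (four appliances, a 2-day period). The following is a minimal sketch rather than the authors' evaluation code: the function names are hypothetical, NDE is written in the squared-ratio form used in the Appendix B example, and EpD is assumed to be the absolute energy error divided by the number of days.

```python
from typing import Sequence


def sae(true_wh: float, est_wh: float) -> float:
    """Signal aggregate error: absolute energy error relative to the true energy."""
    return abs(est_wh - true_wh) / true_wh


def nde(true: Sequence[float], est: Sequence[float]) -> float:
    """Normalised disaggregation error: summed squared error over summed squared truth."""
    num = sum((t - e) ** 2 for t, e in zip(true, est))
    den = sum(t ** 2 for t in true)
    return num / den


def epd(true_wh: float, est_wh: float, days: float) -> float:
    """Energy per day (assumed form): absolute energy error averaged over the days, in Wh."""
    return abs(est_wh - true_wh) / days


def odpe(true_wh: float, est_wh: float, all_true_wh: Sequence[float]) -> float:
    """Overall disaggregation proportion error (%), relative to the summed true energy."""
    return abs(est_wh - true_wh) / sum(all_true_wh) * 100.0


# Toy numbers from Appendix B: TV, lighting, fridge, washing machine over 2 days.
TRUE_WH = [40.0, 50.0, 90.0, 120.0]
EST_WH = [32.0, 55.0, 109.0, 130.0]
DAYS = 2.0

tv_sae = sae(TRUE_WH[0], EST_WH[0])
tv_nde = nde([TRUE_WH[0]], [EST_WH[0]])
tv_epd = epd(TRUE_WH[0], EST_WH[0], DAYS)
tv_odpe = odpe(TRUE_WH[0], EST_WH[0], TRUE_WH)
odpe_all = [odpe(t, e, TRUE_WH) for t, e in zip(TRUE_WH, EST_WH)]

print(f"TV: SAE={tv_sae:.2f} NDE={tv_nde:.2f} EpD={tv_epd:.2f} Wh ODPE={tv_odpe:.2f}%")
```

Because ODPE divides every appliance's error by the same total, the per-appliance values in `odpe_all` are directly comparable, which is not true of SAE.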
Table A.1
The description of the data sets used in the additional experiments of this work. K: Thousand.

                Dataset    House    Samples (K)
Pre-training    REDD       2        252.9
Fine-tuning     REFIT      20       239.5
Testing         REFIT      20       59.9

the fine-tuning of the CNN layers, which are used for feature extraction, to further improve its performance. Fig. 12 depicts disaggregation pie charts for each of the six appliances. All the ODPE values are under 2, which means the disaggregation percentage error of all appliances is less than 2%. This chart shows that the model pre-trained on the public dataset can be used as a pre-trained model for our specific use, providing a reasonable strategy for disaggregation. In general, the results suggest that the proposed transfer one-to-many model is suitable for both public and self-collected datasets, and has good practical value.
Furthermore, the total number of learnable parameters for these networks is shown in Table 8. It can be seen that the proposed method has far fewer parameters to learn, which means the proposed method is easier to implement in practice and requires fewer computational resources to train the models.
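The parameter comparison can be made concrete with the figures reported in Table 8. The sketch below is only an illustration: it takes the table at face value (the proposed model's count is treated as independent of the number of appliances, while the benchmark count scales with it), and the constant and function names are hypothetical.

```python
# Parameter counts in thousands (K), copied from Table 8.
PROPOSED_TOTAL_K = 203.33
PROPOSED_FROZEN_K = 11.58
PROPOSED_TRAINING_K = 191.74
BENCHMARK_TOTAL_PER_APPLIANCE_K = 30708.25
BENCHMARK_FROZEN_PER_APPLIANCE_K = 38.43
BENCHMARK_TRAINING_PER_APPLIANCE_K = 30669.82


def benchmark_total_k(n_appliances: int) -> float:
    """Total benchmark parameters (K): one one-to-one model per appliance."""
    return BENCHMARK_TOTAL_PER_APPLIANCE_K * n_appliances


def reduction_factor(n_appliances: int) -> float:
    """How many times more parameters the benchmark needs than the single proposed model."""
    return benchmark_total_k(n_appliances) / PROPOSED_TOTAL_K


for n in (1, 4, 5):
    print(f"N={n}: benchmark {benchmark_total_k(n):.2f} K, "
          f"proposed {PROPOSED_TOTAL_K:.2f} K, ratio {reduction_factor(n):.1f}x")
```

In both rows of the table the frozen and trainable counts add up to the total (up to rounding), so during fine-tuning only the frozen share is excluded from the gradient updates.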
Fig. 10. The proportion of electricity consumed by each electrical appliance in the REDD dataset. A. The actual energy consumed; B. The NILM result using the proposed transferred one-to-many model. The CNN layers trained on the REFIT dataset were fixed, then the dense layers were retrained and CNN fine-tuned on the REDD dataset.
Fig. 11. The proportion of electricity consumed by each electrical appliance in the UK-DALE dataset. A. The actual energy consumed; B. The NILM result using the proposed transferred one-to-many model. The CNN layers trained on the REFIT data were fixed, then the dense layers were retrained and CNN fine-tuned on the UK-DALE dataset.
Fig. 12. The proportion of electricity consumed by each electrical appliance in our own collected data set. A. The actual energy consumed; B. The NILM result using the proposed transferred one-to-many model. The CNN layers trained on the REFIT data were fixed, then the dense layers were retrained and CNN fine-tuned on our own collected data set: SC-EDNRR.

Table 8
The number of parameters to be optimized in total with the proposed method in this work and the benchmark. All numbers are in thousand (K) and N denotes the number of appliances.

                  Pre-training     Fine-tuning
Method            Total (K)        Frozen (K)     Training (K)
Proposed          203.33           11.58          191.74
Benchmark [28]    30708.25 ∗ N     38.43 ∗ N      30669.82 ∗ N

6. Conclusion

In this paper, we propose a one-to-many model and a transfer one-to-many model for multi-target energy disaggregation to improve the effectiveness and practicality of NILM. The proposed transfer strategy for multi-target NILM can achieve promising results using a single model and limited measurements, according to experimental results on four public datasets and our practical collected data, using four different metrics. The intuition behind the proposed method is that adding a fine-tuning process for the target domain can help improve the model's performance. In future research, more attempts will be made to further reduce training data and avoid prior installation of sub-meters on electrical appliances.

CRediT authorship contribution statement

Dandan Li: Methodology, Data curation, Software, Writing – original draft. Jiangfeng Li: Writing – review & editing. Xin Zeng: Validation, Writing – review & editing. Vladimir Stankovic: Data curation, Reviewing and editing. Lina Stankovic: Data curation, Reviewing and editing. Changjiang Xiao: Conceptualization, Writing – review & editing. Qingjiang Shi: Writing – review & editing, Funding acquisition, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

This work was supported in part by the National Key Research and Development Program, China (No. 2017YFE0119300), the National Natural Science Foundation of China (NSFC) Program (Nos. 62231019 and 42001372), Shanghai Sailing Program (No. 19YF1451500), and China Postdoctoral Science Foundation (Nos. 2019M661621 and 2021T140513). It was also partially supported by the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 734331.
Table A.2
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of one-to-one model and one-to-many model, where the models are trained on REDD and tested on REDD.

Method        Benchmark: one-to-one model              One-to-many model
Appliance     SAE      EpD       NDE      ODPE         SAE      EpD       NDE      ODPE
Microwave     0.073    43.84     0.05     0.63         0.113    79.45     0.27     6.17
Fridge        0.086    178.47    0.12     3.29         0.202    378.95    0.16     5.74
Dishwasher    0.758    126.01    0.52     2.73         0.539    91.37     0.34     1.44
Washing m.    0.035    1.41      0.09     0.03         0.008    2.95      0.11     1.33
Mean          0.238    87.43     0.19     1.67         0.215    138.18    0.22     3.67
Std           0.301    69.05     0.18     1.37         0.198    143.08    0.09     2.29
Table A.3
Disaggregation performance comparison of SAE, EpD (Wh), NDE, and ODPE (%) evaluation metrics of transfer learning experiments, where the models are trained on REDD and tested on REFIT.
Benchmark (one-to-one transfer model): pre-trained on REDD, dense layers trained on REFIT, tested on REFIT.
The proposed model: pre-trained on REDD, dense layers trained on REFIT, CNN fine-tuned on REFIT, tested on REFIT.

Method        Benchmark                                The proposed model
Appliance     SAE      EpD       NDE      ODPE         SAE      EpD       NDE      ODPE
Microwave     0.188    120.08    0.91     0.83         0.544    201.90    1.03     4.61
Fridge        0.224    197.56    0.59     1.72         0.128    125.51    0.38     3.22
Dishwasher    0.079    31.10     0.62     0.25         0.236    32.47     0.52     0.06
Washing m.    0.43     157.76    0.93     1.57         0.271    139.08    0.91     0.27
Mean          0.23     127.12    0.76     1.09         0.294    124.74    0.71     2.04
Std           0.127    60.80     0.15     0.59         0.153    60.56     0.26     1.93
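Appendix A describes the forward computation of the one-to-many model through Eqs. (9) to (12): four convolution layers, a flattening step, a dense layer, and an output layer with one estimate per appliance, trained with a mean square error loss. As a rough NumPy sketch of those equations (not the authors' implementation), the code below follows them literally. The window length, kernel size, filter count, dense width and number of appliances are hypothetical toy values, and activation functions, which the equations do not spell out, are omitted.

```python
import numpy as np


def conv1d(y, w, b):
    """Eq. (9): out[t, f] = sum over k and input channels of y[t + k, ch] * w[k, ch, f], plus b[f]."""
    kernel = w.shape[0]
    out_len = y.shape[0] - kernel + 1
    windows = np.stack([y[t:t + kernel] for t in range(out_len)])  # (out_len, kernel, in_ch)
    return np.einsum("tkc,kcf->tf", windows, w) + b


def forward(y_tau, conv_params, dense_params, out_params):
    """Map one aggregate window y_tau (1-D array) to one estimate per appliance."""
    h = y_tau[:, None]                   # add a channel axis
    for w_c, b_c in conv_params:         # four convolution layers
        h = conv1d(h, w_c, b_c)
    w_d, b_d = dense_params
    y_d = w_d @ h.reshape(-1) + b_d      # flatten, then Eq. (10)
    w_o, b_o = out_params
    return w_o @ y_d + b_o               # Eq. (11)


def mse_loss(targets, preds):
    """Eq. (12): squared error averaged over T windows and N appliances."""
    return float(np.mean((targets - preds) ** 2))


# Hypothetical toy sizes, chosen only to keep the example small.
WINDOW, KERNEL, FILTERS, DENSE_UNITS, N_APPLIANCES, T = 19, 3, 4, 8, 3, 5
rng = np.random.default_rng(0)

conv_params = []
in_ch = 1
for _ in range(4):
    conv_params.append((rng.normal(size=(KERNEL, in_ch, FILTERS)) * 0.1, np.zeros(FILTERS)))
    in_ch = FILTERS
flat_len = (WINDOW - 4 * (KERNEL - 1)) * FILTERS
dense_params = (rng.normal(size=(DENSE_UNITS, flat_len)) * 0.1, np.zeros(DENSE_UNITS))
out_params = (rng.normal(size=(N_APPLIANCES, DENSE_UNITS)) * 0.1, np.zeros(N_APPLIANCES))

windows = rng.random((T, WINDOW))        # T aggregate input windows
targets = rng.random((T, N_APPLIANCES))  # true midpoint consumption per appliance
preds = np.stack([forward(y, conv_params, dense_params, out_params) for y in windows])
loss = mse_loss(targets, preds)
print(preds.shape, loss)
```

A single forward pass yields all appliance estimates at once, which is the practical difference from running one one-to-one network per appliance.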
Appendix A. Supplemental description for CNN layers of the model in Fig. 2

As described in Section 4.1, the $\tau$th input window can be represented as $y_\tau$. Four CNN layers are used in the proposed model. The output of the $c$th convolution layer at time stamp $t$ is:

$$ y_t^c = \sum_{\forall k} y_{t+k}\, w_c + b_c, \tag{9} $$

where $k$ is the length of the receptive field, $w_c$ and $b_c$ are the weights and bias, respectively, and $y_t^c$ denotes the output of the layer. Note that the dimensions of the vectors $y_t^c$, $w_c$ and $b_c$ differ from layer to layer. After performing the convolution operations, the output is flattened, i.e., reshaped into a column vector, and then a dense layer is applied:

$$ y_t^d = w_d\, y_t^c + b_d, \tag{10} $$

where $w_d$ and $b_d$ represent the weight and bias in the dense layer, respectively. Finally, the output layer receives the input from the dense layer and outputs:

$$ \hat{x}_t^{(i)} = w_o\, y_t^d + b_o, \tag{11} $$

where $w_o$ and $b_o$ represent the weight and bias, respectively. $\hat{x}_t^{(i)}$ is the output of the dense layer and, since this is a regression network, represents the estimated electric consumption of appliance $i$.

Note that all weights and bias vectors are parameters to be learned via backpropagation. To obtain these parameters, $\theta = [w_c, b_c, w_d, b_d, w_o, b_o]$, a mean square error loss function is set as follows:

$$ L(x \mid f(y, \theta)) = \frac{1}{T}\frac{1}{N}\sum_{\tau=1}^{T}\sum_{i=1}^{N}\left(x_{t_0}^{(i)} - f^{(i)}(y_\tau, \theta)\right)^2, \tag{12} $$

where $x_{t_0}^{(i)}$ represents the true consumption of appliance $i$ at time $t_0$, $f^{(i)}(y_\tau, \theta)$ the estimated consumption of appliance $i$ at time $t_0$, and $t_0$ the midpoint of each time window $\tau$.

Appendix B. More detail for the metric of ODPE

In the field of NILM, the aim is to disaggregate the electricity consumption of each appliance; thus, it is helpful to understand the proportion of electricity consumed by each appliance, and the estimated error of the proportion of electricity consumed by each type of appliance relative to the total electricity consumed. On this basis, we proposed the metric of Overall Disaggregation Proportion Error (ODPE),

$$ \mathrm{ODPE} = \frac{|\hat{r}_i - r_i|}{\sum_{i=1}^{N} r_i} \times 100\% $$

(i.e., Eq. (8) in Section 5.2). The SAE and ODPE have a similar form; however, they measure different aspects of energy estimation. The former measures the error of an appliance's consumption relative to its true value, while the latter measures the error relative to the summed true value of all the appliances. The latter calculates a percentage for each appliance, which can reflect the overall disaggregation proportion error.

For example, suppose there are 4 appliances in a room: TV, lighting, fridge and washing machine (WM). If the aggregated energy consumption in a certain period (2 days, which will be used for the calculation of EpD) is 300 Wh, the real energy consumptions are 40 Wh, 50 Wh, 90 Wh and 120 Wh for the four appliances, respectively. Noise is neglected in this toy example. Suppose the estimated energy consumptions of the four appliances are 32 Wh, 55 Wh, 109 Wh, and 130 Wh, respectively (the summation of the estimated consumption may not be equal to 300 Wh). Thus, it can be calculated that the NDE of TV is $NDE_{TV} = \frac{(40\,\mathrm{Wh} - 32\,\mathrm{Wh})^2}{(40\,\mathrm{Wh})^2} = 0.04$, the SAE of TV is $SAE_{TV} = \frac{|40\,\mathrm{Wh} - 32\,\mathrm{Wh}|}{40\,\mathrm{Wh}} = 0.20$, and the EpD of TV is $EpD_{TV} = \frac{1}{2}\,|40\,\mathrm{Wh} - 32\,\mathrm{Wh}| = 4.00$ Wh. It can be seen that although the disaggregation results have been evaluated from several perspectives for the TV, a metric that measures the percentage of electricity consumed by the TV relative to the whole is lacking. Such a metric may be useful in many scenarios, such as estimating the energy bill for each appliance, or estimating the proportion of electricity consumed by each appliance over a longer period of time. ODPE is proposed to address this issue. The ODPE of TV is $ODPE_{TV} = \frac{|40\,\mathrm{Wh} - 32\,\mathrm{Wh}|}{300\,\mathrm{Wh}} \times 100\% \approx 2.67\%$.

Appendix C. Additional experiments

To demonstrate cross-domain transfer learning, an additional experiment, where the model is trained on REDD data set and tested on both REDD and REFIT datasets, was conducted. The description of the data sets used in this experiment is shown in Table A.1. Table A.2 shows the
disaggregation performance comparison using SAE, EpD (Wh), NDE, and ODPE evaluation metrics between the one-to-one model and the proposed one-to-many model. One can see that the one-to-one model performs only slightly better than the one-to-many model, while in Table A.3, the one-to-many transfer model performs better on the disaggregation of Fridge, Dishwasher, and Washing machine. The results show that the proposed transfer learning strategy with fine-tuning of its feature extraction layers, i.e., the CNN layers, can perform as well as (and often better than) the one-to-one transfer model, which requires one model for each appliance and more parameters for each model, and is hence computationally more intensive.

References

[1] Mohsin Muhammad, Rasheed AK, Sun Huaping, Zhang Jijian, Iram Robina, Iqbal Nadeem, Abbas Qaiser. Developing low carbon economies: an aggregated composite index based on carbon emissions. Sustain Energy Technol Assess 2019;35:365–74.
[2] Li Dandan, Xiao Changjiang, Zeng Xin, Shi Qingjiang. Short-mid term electricity consumption prediction using non-intrusive attention-augmented deep learning model. Energy Reports 2022;8:10570–81.
[3] Rogelj Joeri, Geden Oliver, Cowie Annette, Reisinger Andy. Three ways to improve net-zero emissions targets. Nature 2021;591(7850):365–8.
[4] Council World Green Building. Global status report 2017. 2017, https://www.worldgbc.org/news-media/global-status-report-2017.
[5] Ahammed Md Tanvir, Hasan Md Mehedi, Arefin Md Shamsul, Islam Md Rafiqul, Rahman Md Aminur, Hossain Eklas, et al. Real-time non-intrusive electrical load classification over IoT using machine learning. IEEE Access 2021;9:115053–67.
[6] Hart George William. Nonintrusive appliance load monitoring. Proc IEEE 1992;80(12):1870–91.
[7] Liu Chao, Akintayo Adedotun, Jiang Zhanhong, Henze Gregor P, Sarkar Soumik. Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network. Appl Energy 2018;211:1106–22.
[8] Rashid Haroon, Singh Pushpendra, Stankovic Vladimir, Stankovic Lina. Can non-intrusive load monitoring be used for identifying an appliance's anomalous behaviour? Appl Energy 2019;238:796–805.
[9] Kong Weicong, Dong Zhao Yang, Wang Bo, Zhao Junhua, Huang Jie. A practical solution for non-intrusive type II load monitoring based on deep learning and post-processing. IEEE Trans Smart Grid 2019;11(1):148–60.
[10] Stankovic Lina, Stankovic Vladimir, Liao Jing, Wilson Clevo. Measuring the energy intensity of domestic activities from smart meter data. Appl Energy 2016;183:1565–80.
[11] Shi Xin, Ming Hao, Shakkottai Srinivas, Xie Le, Yao Jianguo. Nonintrusive load monitoring in residential households with low-resolution data. Appl Energy 2019;252:113283.
[12] Gupta Sidhant, Reynolds Matthew S, Patel Shwetak N. ElectriSense: single-point sensing using EMI for electrical event detection and classification in the home. In: Proceedings of the 12th ACM international conference on ubiquitous computing. 2010, p. 139–48.
[13] Huber Patrick, Calatroni Alberto, Rumsch Andreas, Paice Andrew. Review on deep neural networks applied to low-frequency nilm. Energies 2021;14(9):2390.
[14] Anderson Kyle D, Berges Mario E, Ocneanu Adrian, Benitez Diego, Moura Jose MF. Event detection for non intrusive load monitoring. In: IECON 2012-38th annual conference on IEEE industrial electronics society. IEEE; 2012, p. 3312–7.
[15] Angelis Georgios-Fotios, Timplalexis Christos, Krinidis Stelios, Ioannidis Dimosthenis, Tzovaras Dimitrios. NILM applications: Literature review of learning approaches, recent developments and challenges. Energy Build 2022;111951.
[16] Hasan Md Mehedi, Chowdhury Dhiman, Khan Md Ziaur Rahman. Non-intrusive load monitoring using current shapelets. Appl Sci 2019;9(24):5363.
[17] He Kanghang, Stankovic Lina, Liao Jing, Stankovic Vladimir. Non-intrusive load disaggregation using graph signal processing. IEEE Trans Smart Grid 2016;9(3):1739–47.
[18] Zhao Bochao, Stankovic Lina, Stankovic Vladimir. On a training-less solution for non-intrusive appliance load monitoring using graph signal processing. IEEE Access 2016;4:1784–99.
[19] Zhang Chaoyun, Zhong Mingjun, Wang Zongzuo, Goddard Nigel, Sutton Charles. Sequence-to-point learning with neural networks for non-intrusive load monitoring. In: Proceedings of the AAAI conference on artificial intelligence, Vol. 32. 2018.
[20] García-Pérez Diego, Pérez-López Daniel, Díaz-Blanco Ignacio, González-Muñiz Ana, Domínguez-González Manuel, Vega Abel Alberto Cuadrado. Fully-convolutional denoising auto-encoders for NILM in large non-residential buildings. IEEE Trans Smart Grid 2020;12(3):2722–31.
[21] Ciancetta Fabrizio, Bucci Giovanni, Fiorucci Edoardo, Mari Simone, Fioravanti Andrea. A new convolutional neural network-based system for nilm applications. IEEE Trans Instrum Meas 2020;70:1–12.
[22] Wang Jindong, Chen Yiqiang. Introduction to transfer learning, Vol. 4 of 10. third ed. The address: Publishing House of Electronic Industry; 2021.
[23] Nalmpantis Christoforos, Vrakas Dimitris. On time series representations for multi-label NILM. Neural Comput Appl 2020;32(23):17275–90.
[24] Li Dandan, Li Jiangfeng, Zeng Xin, Stankovic Vladimir, Stankovic Lina, Shi Qingjiang. Non-intrusive load monitoring for multi-objects in smart building. In: 2021 international balkan conference on communications and networking. IEEE; 2021, p. 117–21.
[25] Murray David, Stankovic Lina, Stankovic Vladimir. An electrical load measurements dataset of United Kingdom households from a two-year longitudinal study. Sci Data 2017;4(1):1–12.
[26] Kolter J Zico, Johnson Matthew J. REDD: A public data set for energy disaggregation research. In: Workshop on data mining applications in sustainability, Vol. 25. Citeseer; 2011, p. 59–62.
[27] Kelly Jack, Knottenbelt William. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Sci Data 2015;2(1):1–14.
[28] D'Incecco Michele, Squartini Stefano, Zhong Mingjun. Transfer learning for non-intrusive load monitoring. IEEE Trans Smart Grid 2019;11(2):1419–29.
[29] Wang Yi, Chen Qixin, Hong Tao, Kang Chongqing. Review of smart meter data analytics: Applications, methodologies, and challenges. IEEE Trans Smart Grid 2018;10(3):3125–48.
[30] Tabatabaei Seyed Mostafa, Dick Scott, Xu Wilsun. Toward non-intrusive load monitoring via multi-label classification. IEEE Trans Smart Grid 2016;8(1):26–40.
[31] da Silva Nolasco Lucas, Lazzaretti André Eugenio, Mulinari Bruna Machado. DeepDFML-NILM: A new CNN-based architecture for detection, feature extraction and multi-label classification in NILM signals. IEEE Sens J 2021;22(1):501–9.
[32] Verma Sagar, Singh Shikha, Majumdar Angshul. Multi label restricted boltzmann machine for non-intrusive load monitoring. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing. IEEE; 2019, p. 8345–9.
[33] Li Ding, Dick Scott. Non-intrusive load monitoring using multi-label classification methods. Electr Eng 2021;103(1):607–19.
[34] Harell Alon, Jones Richard, Makonin Stephen, Bajić Ivan V. TraceGAN: Synthesizing appliance power signatures using generative adversarial networks. IEEE Trans Smart Grid 2021.
[35] Murray David, Stankovic Lina, Stankovic Vladimir, Lulic Srdjan, Sladojevic Srdjan. Transferability of neural network approaches for low-rate energy disaggregation. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing. IEEE; 2019, p. 8330–4.
[36] Zhou Zejian, Xiang Yingmeng, Xu Hao, Yi Zhehan, Shi Di, Wang Zhiwei. A novel transfer learning-based intelligent nonintrusive load-monitoring with limited measurements. IEEE Trans Instrum Meas 2020;70:1–8.
[37] Wang Lingxiao, Mao Shiwen, Wilamowski Bogdan, Nelms RM. Pre-trained models for non-intrusive appliance load monitoring. IEEE Trans Green Commun Netw 2021.
[38] Ehrhardt-Martinez Karen, Donnelly Kat A, Laitner Skip, et al. Advanced metering initiatives and residential feedback programs: a meta-review for household electricity-saving opportunities. American Council for an Energy-Efficient Economy Washington, DC; 2010.
[39] Bernard Timo. Non-intrusive load monitoring (NILM): combining multiple distinct electrical features and unsupervised machine learning techniques (Ph.D. thesis), 2018.
[40] Yang Mingzhi, Li Xinchun, Liu Yue. Sequence to point learning based on an attention neural network for nonintrusive load decomposition. Electronics 2021;10(14):1657.
[41] Paradiso Francesca, Paganelli Federica, Giuli Dino, Capobianco Samuele. Context-based energy disaggregation in smart homes. Fut Internet 2016;8(1):4.
[42] Murray David, Stankovic Lina, Stankovic Vladimir. An electrical load measurements dataset of United Kingdom households from a two-year longitudinal study. Sci Data 2017;4(1):1–12.
[43] Kolter J Zico, Johnson Matthew J. REDD: A public data set for energy disaggregation research. In: Workshop on data mining applications in sustainability, Vol. 25. 2011, p. 59–62.
[44] Kelly Jack, Knottenbelt William. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Sci Data 2015;2(1):1–14.
[45] Xiao Changjiang, Tong Xiaohua, Li Dandan, Chen Xiaojian, Yang Qiquan, Xv Xiong, Lin Hui, Huang Min. Prediction of long lead monthly three-dimensional ocean temperature using time series gridded argo data and a deep learning method. Int J Appl Earth Obs Geoinf 2022;112:102971.
[46] Neyshabur Behnam, Sedghi Hanie, Zhang Chiyuan. What is being transferred in transfer learning? 2020, arXiv preprint arXiv:2008.11687.
