International Journal of Electrical Power and Energy Systems
ARTICLE INFO

Keywords: Electricity theft detection; Similarity measure; WGAN; Electricity consumption behavior analysis; DT-KSVM

ABSTRACT

The theft of electricity affects power supply quality and the safety of grid operation, and non-technical losses (NTL) have become the major cause of unfair power supply and economic losses for power companies. For more effective electricity theft inspection, this paper proposes an electricity theft detection method based on a similarity measure and a decision tree combined with K-Nearest Neighbor and support vector machine (DT-KSVM). Firstly, a condensed feature set is devised based on a feature selection strategy, and typical power consumption characteristic curves of users are obtained with the kernel fuzzy C-means algorithm (KFCM). Next, to address the lack of electricity theft data and make reasonable use of advanced metering infrastructure (AMI), a one-dimensional Wasserstein generative adversarial network (1D-WGAN) is used to generate more simulated theft data. Then the numerical and morphological features in the similarity measurement process are considered together to conduct a preliminary detection of NTL, and DT-KSVM is used to perform secondary detection and identify suspicious customers. Finally, simulation experiments verify the effectiveness of the proposed method.
1. Introduction

1.1. Motivation

There are two types of losses in transmission and distribution networks: technical losses and non-technical losses (NTL). Technical losses are caused by the heating of resistive elements in lines, transformers, and other equipment. NTL is mostly caused by electricity theft, meter failures, or billing errors [1]; it accounts for 20–40% of total losses, and most of it is related to energy theft [2]. According to the World Bank, electricity theft has caused supply losses exceeding 25% of supply in India, 16% in Brazil, and 6% in China [3]. The impact of NTL is also significant in developed countries: electricity theft is estimated at £173 million every year in the UK, and it may be worth up to $6 billion in the USA [4].

With the construction of the smart grid and the ubiquitous electric power Internet of Things, power systems have gradually achieved digitization and interaction. Power selling companies have continuously obtained more information about customers. The introduction of advanced metering infrastructure (AMI) to monitor electrical power consumption over fine-grained time intervals has made it easier for utility companies to monitor anomalies in the network [5]. The research goal of this paper is to analyze customers' power consumption behavior based on the collected big power data and then to accurately identify abnormal power consumption behavior using machine learning. The research results are applied by some power grid companies, and could not only reduce electricity theft but also provide a new solution for the detection of power meter faults. In this article, a customer detected as illegal is one with a high probability of electricity theft; manual investigation should then be used to determine whether the customer is in fact stealing electricity.

1.2. Literature review

The use of AMI metering data for electricity theft detection mainly includes data statistics-based methods and machine learning-based methods. Methods based on data statistics analyze the energy consumption relationship between the main smart meter (called the master smart meter) and the smart meter installed at each user (home or business) in the same time interval. Rengaraju [6] compared
* Corresponding author.
E-mail address: eekongxy@tju.edu.cn (X. Kong).
https://doi.org/10.1016/j.ijepes.2020.106544
Received 28 February 2020; Received in revised form 14 August 2020; Accepted 19 September 2020
Available online 1 October 2020
0142-0615/© 2020 Elsevier Ltd. All rights reserved.
X. Kong et al. International Journal of Electrical Power and Energy Systems 125 (2021) 106544
the difference between the master smart meter and the user smart meters to determine whether electricity pilferage occurs in the low-voltage station. Faria L [7] proposed a method to detect abnormal power consumption behavior on a line according to sudden changes in line loss. The statistical method is relatively simple to implement, but an important drawback is that it can only establish that electricity stealing has occurred in the low-voltage station; it cannot accurately locate the illegal consumers. To determine the suspected user, all the users in the area must be checked manually one by one, which is inefficient and places higher demands on the detection personnel.

With the increase of measurement data, the power industry has entered the era of big data. Massive measurement data also provide a broader platform for the application of machine learning and other artificial intelligence technologies [8]. The use of machine learning for NTL detection has become mainstream [9]. Angelos [10] proposed K-means to group customers with similar profiles and create a general pattern of power consumption; customers with large Euclidean distances to the cluster centers were considered potential fraudsters. Joaquim [11] proposed the use of fuzzy Gustafson-Kessel (GK) clustering to obtain consumption patterns. The farther the data of an analyzed consumer are from the regular prototypes, the more likely the consumer is stealing electricity. Unsupervised clustering algorithms have also been used to identify anomalies in consumers' demand patterns in AMI. While these methods are useful in identifying customers with similar load patterns, due to their unsupervised nature they typically result in high false-positive rates if used as the main algorithm for theft-detection applications [12].

As a data mining technique, SVM is used to classify user electricity consumption patterns or load profiles. Anish Jindal [13] proposed a scheme based on the combination of DT and SVM classifiers. It can be viewed as a two-level data processing and analysis approach, since the data processed by the DT are fed as input to the SVM classifier, which rigorously analyzes the gathered electricity consumption data to identify suspected users of theft. J. Nagi [14] used historical consumption data along with an SVM classifier to detect abnormal behaviors. The average daily consumption of customers was calculated, and the long-term trend in energy consumption was used to identify fraudulent customers. Jian et al. [15] proposed an electricity theft detection scheme based on the One-class SVM classification algorithm. By learning the user's historical power consumption data, a typical power consumption model was constructed to identify abnormal power consumption. SVM methods for electricity theft detection have the advantages of higher theft detection accuracy and of the characteristics learned during the process. But they face some problems: (1) when SVM is used for multi-class classification and the upper node fails to separate the samples correctly, the misclassified samples enter the lower node and the probability of error accumulation increases; (2) in the process of classification, points near the decision plane are not accurately classified by SVM.

Deep learning techniques for electricity theft detection are studied in [16], which provides a comparison between different deep learning architectures, such as convolutional neural network (CNN), long short-term memory (LSTM) recurrent neural network (RNN), and stacked autoencoders. Moreover, the authors in [17] proposed a deep neural network (DNN) based customer-specific detector that can efficiently thwart such cyber attacks. In [18], a wide and deep CNN model was developed and applied to analyze electricity theft in smart grids. Deep learning methods for electricity theft detection can yield higher accuracy, but when training complex neural networks the system easily falls into a local minimum and cannot escape. It is also difficult to determine the number of convolutional layers and the network hyperparameters for deep learning due to the lack of a theoretical foundation.

At present, there are two main ideas for processing unbalanced sample sets: random oversampling (ROS) and random undersampling (RUS) algorithms [...] generate data similar to the existing samples to balance the training set; this method avoids the loss of key information. The traditional ROS [19] reduces the imbalance of the training set by duplicating minority-class samples. Xiaolong proposed a synthetic minority oversampling technique (SMOTE) [20], which randomly creates artificial samples along the line joining a minority sample and one of its nearest neighbors. SMOTE has been modified to produce Random-SMOTE [21] and Kmeans-SMOTE [22]. However, the above oversampling algorithms do not take into account the overall distribution characteristics of the data, so the improvement in model classification performance is often limited [23].

Based on the above discussion, it can be summarised that the use of SVM has become an important research direction for detecting electricity theft. However, to obtain accurate analysis results, several key points need to be resolved:

(1) The numbers of normal and abnormal samples are not in the same range. Benign samples are easily available from historical data. Theft samples, on the other hand, rarely or never exist for a given customer.
(2) Supervised and unsupervised classification techniques need to be combined to detect a synthetic consumption pattern that results from theft, achieving the best results with SVM classification.
(3) When using SVM to detect electricity theft, it is necessary to improve the accuracy of detection near the decision plane.

Keeping the above constraints in mind, a comprehensive top-down scheme is put forth in this paper to identify electricity theft. This paper proposes electricity theft detection based on a similarity measure and a decision tree combined with K-Nearest Neighbor and support vector machine (DT-KSVM). The proposed electricity theft detection based on data augmentation involves three critical steps, shown in Fig. 1.

Step 1: Determine the suspected station and use the KFCM clustering algorithm to cluster the user's historical data, obtaining the user's power consumption characteristic curve.
Step 2: Based on similarity constraints and authenticity constraints, use 1D-WGAN to generate high-precision measurement data that match the characteristics of electricity theft.
Step 3: Considering both the numerical and the morphological characteristics of the test curve and the characteristic curve, obtain the suspected users and illegal customers. Then train the DT-KSVM with a balanced data set and use the trained DT-KSVM to identify the illegal consumers.

[Fig. 1 (flowchart). Step 1: data preprocessing, feature selection, determination of the suspected station, and KFCM clustering to obtain the characteristic curve in the station. Step 2: discriminator pre-training, alternate training of generator and discriminator, and generation of theft samples. Step 3: detection by similarity measures between the test curve and the characteristic curve to determine the suspected users; construction, training, and evaluation of the DT-KSVM classifier on theft samples balanced with normal samples.]
1.3. Contribution

This paper proposes an electricity theft detection method based on similarity measure and DT-KSVM, which combines unsupervised and supervised learning to detect electricity theft users. The main contributions of the paper are as follows.

[...] this station is located based on similarity measure and DT-KSVM.

The theft data obtained with the AMI system involve m users over time n, and the data are described by a matrix. The data of the same user at different periods are denoted x_j. For the data of different users at moment n, the description X_i is as follows.
members, etc. [26]. A feature library is formed with common features. The specific features and meanings are shown in Table 1.

The feature selection strategy in this paper is as follows:

(1) Set the number of features to 1 and determine the single feature with the highest clustering evaluation criterion;
(2) Set the number of features to 2: keeping the selected feature, add a new feature and determine the two features with the highest clustering evaluation criterion;
(3) With i features selected, add a new feature to the selected ones and determine the i + 1 features with the highest clustering evaluation criterion;
(4) Repeat the above steps until the accuracy with n + 1 features is lower than with n features, which determines the optimal number of features. The feature set with the highest clustering evaluation criterion is the selected one. More detailed criteria can be found in [27].

[...] performance. DT-SVM is applied as the classifier, which uses the features extracted by SUAE to output a judgment result.

3. Sample data generation based on 1D-WGAN

This paper uses one-dimensional Wasserstein generative adversarial networks (1D-WGAN) to generate electricity theft data. The objective function used in GAN image generation and integrating the advantages of Wasserstein [...]

The generator (G) is responsible for learning the regularities of the sample distribution and generating new samples. G is a neural network whose input is a random variable z drawn from a prior distribution $P_Z$ and whose output is $G(z)$. The distribution $P_g(z)$ of the generated data gradually fits the sample data distribution $p_{data}(x)$. The goal of the generator is to generate data that are as realistic as possible to confuse the discriminator, so its loss function can be defined as $\mathbb{E}_{z \sim P_Z}[-D(G(z))]$. The objective function of the generator is:

$$\min_{G} \; \mathbb{E}_{z \sim P_Z}\left[-D(G(z))\right] \tag{3}$$

The discriminator (D) is responsible for determining whether the input data are real. D is also a neural network, but its input is either actual data or data generated by the generator. The main task of the discriminator is to distinguish the two kinds of data, so its output is a scalar between 0 and 1, which is the probability of belonging to the actual data or the generated data. The loss function of D can be defined as $\mathbb{E}_{x \sim P_{data}}[D(x)] + \mathbb{E}_{z \sim P_Z}[-D(G(z))]$. The objective function is: [...]

The Wasserstein distance is used instead of the JS divergence. Training the GAN with the minimized Wasserstein distance as the target can alleviate the problem of vanishing gradients during training and effectively improves training stability [30]. The Wasserstein distance is defined as:

$$W(p_{data}, p_g) = \inf_{\gamma \in \Pi(p_{data}, p_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right] \tag{6}$$
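As a minimal illustration of the definition in Eq. (6), and not of the critic-based estimate used inside a WGAN, the sketch below computes the Wasserstein-1 distance between two equally sized one-dimensional samples. In one dimension the optimal coupling pairs the sorted values of the two samples. The function name `wasserstein_1d` and the sample values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1-D the optimal coupling matches the i-th smallest value of one
    # sample with the i-th smallest value of the other.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical readings (kWh): a real sample and a generated sample.
real = [0.2, 0.5, 0.9, 1.4]
generated = [0.3, 0.6, 1.0, 1.5]
print(wasserstein_1d(real, generated))
```

A generated sample that is a pure reordering of the real one has distance zero, which is why the distance compares distributions rather than individual curves.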
[...] similarity constraints must be met [31].

The authenticity constraint is used to ensure that the generated data are close to the real situation. The authenticity loss $L_r$ is defined as:

$$L_r = W\left(G\left(z; \theta^{(G)}\right); \theta^{(D)}\right) \tag{8}$$

where $G(z; \theta^{(G)})$ is the data generated by the generator, and $W(\cdot\,; \theta^{(D)})$ represents the Wasserstein distance between the generated data and the real sample.

The generated data should be as similar to the actual data as possible, so the similarity loss $L_s$ is defined as:

$$L_s = \left\lVert G\left(z; \theta^{(G)}\right) - I \right\rVert_2 \tag{9}$$

where $I$ is the actual data, and the 2-norm is used to measure the similarity of the two matrices.

Therefore, the ultimate optimization goal of data generation is: [...]

The user's characteristic curve reflects the user's electricity consumption characteristics and electricity consumption behavior. In the similarity measurement, the characteristic curve of the user is taken as the normal electricity consumption. In the process of selecting the characteristic curve, this paper uses the weighted-average method to obtain the characteristic curve of the user in normal power consumption mode [33].

The similarity of time series includes two aspects: value and morphology. Most research on the similarity of time series has failed to take both into account well. To account for both the values and the morphology of the curve, the Euclidean distance is used to measure numerical similarity, and dynamic time warping (DTW) is used to measure the similarity of morphological features.

To describe the morphological characteristics of the curve simply and accurately, such as rise, fall, and stability at various periods, the slope of the line is used to represent the morphological characteristics of each period. Therefore, a time series of length n is reduced to a morphological sequence of length n − 1:

$$x'_i = \frac{x_{i+1} - x_i}{\Delta t}, \quad i = 1, 2, \cdots, n - 1 \tag{12}$$

[...] numerical and morphological characteristics:

$$D_{whole}(X, Y) = \sqrt{\alpha D_2(X, Y) + \lambda\, \mathrm{DTW}(X', Y')} \tag{15}$$
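The combined measure of Eqs. (12) and (15) can be sketched as follows. This is only an illustration under stated assumptions: D2 is read as the Euclidean distance between the two curves, DTW uses the absolute difference as local cost, and the weights `alpha` and `lam`, the function names, and the sample curves are all hypothetical.

```python
import math


def slopes(x, dt=1.0):
    """Morphological sequence of Eq. (12): n points become n - 1 slopes."""
    return [(x[i + 1] - x[i]) / dt for i in range(len(x) - 1)]


def euclidean(x, y):
    """Euclidean distance between two curves of equal length."""
    if len(x) != len(y):
        raise ValueError("curves must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw(a, b):
    """Dynamic time warping cost with absolute-difference local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[len(a)][len(b)]


def d_whole(x, y, alpha=0.5, lam=0.5, dt=1.0):
    """Eq. (15): numerical term on the curves, DTW term on their slopes."""
    numerical = euclidean(x, y)
    morphological = dtw(slopes(x, dt), slopes(y, dt))
    return math.sqrt(alpha * numerical + lam * morphological)


# Hypothetical characteristic curve and test curve (kWh per interval).
characteristic = [0.4, 0.5, 0.9, 1.2, 0.8, 0.5]
test_curve = [0.2, 0.2, 0.4, 0.6, 0.4, 0.2]
print(d_whole(characteristic, test_curve))
```

Applying DTW to the slope sequences rather than to the raw curves means that a scaled-down copy of a curve still differs in the numerical term even when its shape is preserved.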
(1) SVM1 separates the electricity theft features of the first class from those of the 2nd, 3rd, …, Nth classes and the normal samples;
(2) SVMi separates the electricity theft features of the ith class from those of the (i + 1)th, (i + 2)th, …, Nth classes and the normal samples;
(3) SVMn separates the electricity theft features of the Nth class from the normal samples.

Finally, N classifiers are constructed according to the structure of the binary tree. When performing electricity theft detection, the SVM of each layer recognizes only one type of electricity theft. The remaining sample set, which is gradually reduced, is identified by the next level of SVM. The SVM of the last layer separates the last type of electricity theft from the normal samples, and the leaf nodes of the DT are the types of electricity theft.

The structure of DT-SVM is shown in Fig. 4. However, since the decision tree is constructed as a hierarchical classification model, the biggest problem is "error accumulation", which affects the accuracy of classification. If a biased binary tree is used for classification, a decision tree with low error accumulation and high classification accuracy needs to be constructed first. To reduce the effect of "error accumulation", this paper adopts the projection vector method [36] to measure the degree of separation between classes and constructs a biased binary decision tree on this basis.

When a sample is far from the hyperplane, the SVM algorithm classifies it accurately. But when the sample is close to the hyperplane, classification performance is poor and misclassification is prone to occur. Therefore, the information provided by samples near the interface is used to improve the accuracy of classification: SVM and KNN are combined into the classifier SVM-KNN (KSVM). When classifying a sample, the distance between the sample and the classification hyperplane is calculated. If the distance is greater than the given threshold a, SVM classification is applied directly. Otherwise, KNN is used for classification.

In KNN classification, the support vectors of each class are used as representatives, and the distance between the sample and each support vector is calculated in the feature space instead of the original space. The class of the sample is determined by this distance. The distance calculation formula is as follows [37]: [...]

We used the smart energy data from the Irish Smart Energy Trial [38] in our tests. The dataset was released by Electric Ireland and the Sustainable Energy Authority of Ireland (SEAI) in January 2012. It includes half-hourly electricity usage reports of over 5000 Irish homes and businesses during 2009 and 2010. Customers who participated in the trial had a smart meter installed in their homes and agreed to take part in the research. For each customer, there is a file containing half-hourly metering reports for 535 days. Since the customers volunteered for the trial, it is a reasonable assumption that all samples belong to honest users. The large number and variety of customers, the long measurement period, and the public availability make this dataset an excellent source for research.

According to the actual situation of electricity theft, this paper sets up six types of electricity theft. In the first type, all samples are multiplied by the same randomly chosen coefficient. The second type is an 'on-off' attack in which consumption is reported as zero during some intervals. The third type multiplies the consumption by a random factor that varies over time. The fourth type is a combination of the second and third types. In the fifth type, the samples in the peak period are multiplied by the same randomly chosen coefficient. The sixth type is an 'on-off' attack in a random period, but the duration is short and discontinuous, reducing the total electricity consumption. Compared with the second type, the sixth type is more difficult to detect because of the randomness of the selected time period. An example of the weekly consumption of a customer and the corresponding attack patterns is shown in Fig. 6.

5.1. Analysis of customer behavior

This article numbers the characteristics commonly used to represent users' electricity consumption behavior, including the valley electricity coefficient and peak-time electricity usage load rate. The common [...]

[Fig. 4. The structure of DT-SVM: the suspected users pass through SVM1, …, SVMi, …, SVMn, which output Class 1, …, Class i, …, Class n and the normal samples.]

[Fig. 5. Flowchart of electricity theft detection with KSVM: the SVM classifier is trained on the training samples to determine the decision planes and support vectors; the distance between each test sample and the decision plane is calculated; if the distance exceeds the threshold, the sample is classified with SVM, otherwise with KNN; the output is the number of the suspected user and the type of electricity theft.]
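The KSVM decision rule described in the text (SVM far from the hyperplane, KNN over support vectors near it) can be sketched as below. This is a simplified illustration, not the paper's implementation: it assumes a binary problem with labels +1 and −1, a linear kernel (so the feature space coincides with the input space), and an already trained hyperplane given by hypothetical weights `w` and bias `b`. The function name `ksvm_predict` and all numeric values are assumptions.

```python
import math
from collections import Counter


def ksvm_predict(x, w, b, support_vectors, threshold, k=3):
    """Classify x with the SVM sign if it is far from the hyperplane,
    otherwise by a k-nearest-neighbor vote among the support vectors.

    support_vectors is a list of (vector, label) pairs, labels +1 or -1.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    distance = abs(score) / math.sqrt(sum(wi * wi for wi in w))
    if distance > threshold:
        # Far from the decision plane: trust the SVM decision.
        return 1 if score > 0 else -1
    # Near the decision plane: vote among the k closest support vectors.
    nearest = sorted(support_vectors, key=lambda sv: math.dist(x, sv[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical trained hyperplane x1 = 0 and its support vectors.
w, b = [1.0, 0.0], 0.0
svs = [([1.0, 0.0], 1), ([1.0, 1.0], 1), ([-1.0, 0.0], -1), ([-1.0, 1.0], -1)]

print(ksvm_predict([3.0, 0.0], w, b, svs, threshold=0.5))   # decided by SVM
print(ksvm_predict([0.1, 0.5], w, b, svs, threshold=0.5))   # decided by KNN
```

With a nonlinear kernel the neighbor distances would have to be computed through the kernel function, which this sketch leaves out.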
[Fig. 6. The weekly consumption of a customer and the corresponding attack patterns. Panels (a)–(f) each plot the normal consumption and the attack pattern; x-axis: t/30 min; y-axis: Consumption/(kW·h).]
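The six theft types set up in the case study can be sketched as simple transformations of a load curve. The sketch below is illustrative only: the coefficient range (0.1 to 0.8), the length of the zeroed window, the number of scattered zero slots, and the peak slots are hypothetical choices that the paper does not specify, and `attack` is a hypothetical helper name.

```python
import random


def attack(load, kind, rng, peak=range(36, 44)):
    """Return a tampered copy of one day of half-hourly readings.

    Coefficient range, window lengths and peak slots are illustrative
    assumptions, not values taken from the paper.
    """
    n = len(load)
    if kind == 1:  # same random coefficient for all samples
        c = rng.uniform(0.1, 0.8)
        return [c * v for v in load]
    if kind == 2:  # on-off: one contiguous window reported as zero
        width = n // 4
        start = rng.randrange(n - width + 1)
        return [0.0 if start <= i < start + width else v
                for i, v in enumerate(load)]
    if kind == 3:  # random factor that varies over time
        return [rng.uniform(0.1, 0.8) * v for v in load]
    if kind == 4:  # combination of types 2 and 3
        return attack(attack(load, 3, rng), 2, rng)
    if kind == 5:  # same random coefficient, peak period only
        c = rng.uniform(0.1, 0.8)
        return [c * v if i in peak else v for i, v in enumerate(load)]
    if kind == 6:  # short, scattered zero readings at random slots
        off = set(rng.sample(range(n), max(1, n // 10)))
        return [0.0 if i in off else v for i, v in enumerate(load)]
    raise ValueError("kind must be between 1 and 6")


rng = random.Random(0)       # fixed seed so the sketch is repeatable
day = [1.0] * 48             # hypothetical flat day, 48 half-hour slots
for kind in range(1, 7):
    tampered = attack(day, kind, rng)
    print(kind, round(sum(tampered), 2))
```

Every type lowers the reported total, but types 5 and 6 change only a few slots, which is consistent with the text's remark that the sixth type is harder to detect than the second.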
It can be seen from Fig. 7 that, as the number of features increases, the clustering accuracy increases, but when the number of features exceeds 4, the clustering accuracy decreases. This article therefore settles on four features to represent electricity consumption behavior. The determined characteristic indicators are load rate, valley coefficient, peak-hour electricity consumption ratio, and percentage of consumption in the normal period.

[Fig. 7. The trend of the number of features and the accuracy. x-axis: the number of selected features.]

Table 2
The correspondence between features and numbers.

Number  Feature                                   Number  Feature
1       Daily maximum load                        6       Electricity consumption ratio in peak period
2       Percentage consumption in normal period   7       Ratio between peak and valley consumption
3       Valley coefficient                        8       Daily average load
4       Daily consumption                         9       Variance
5       Load rate                                 10      Coefficient of variation

It can be seen from Fig. 8 that KFCM needs fewer iterations and reaches a smaller objective function value, that is, the algebraic sum of the distances from each point to the cluster center is smallest, so KFCM effectively improves the classification effect and the iteration time of the algorithm compared with FCM.

Clustering is then performed on the selected features. The daily load characteristic curves, obtained with the weighted-average method, are shown in Fig. 9.
[Fig. 8. Comparison of the number of iterations of KFCM and FCM. x-axis: the number of iterations; y-axis: the value of the objective function.]

[Fig. 9. The daily load characteristic curve. Panels (a)–(d): the first, second, third, and fourth types of typical consumption mode; x-axis: Time/(30 min); y-axis: Consumption/(kW·h).]

[Fig. 10. Comparison between generated samples and real samples. (a) Samples generated after 192 pieces of training; (b) samples generated after 3840 pieces of training; (c) samples generated after 6400 pieces of training. x-axis: t/15 min; y-axis: Consumption/(kW·h).]

[...] pieces of training. It can be seen from Fig. 10(a) that the generator after 192 pieces of training has initially learned the distribution of real samples, but the distance from the real samples is still large. Fig. 10(b) shows the data generated by the generator after 3840 pieces of training; the gap between these generated samples and the real samples is tiny. It can be seen from Fig. 10(c) that after 6400 pieces of training the samples generated by the generator can already deceive the discriminator. The comparison between the generated samples and the real samples in Fig. 10 shows that the samples generated by 1D-WGAN are not exactly the same as the original samples: they follow the same fluctuation rules but differ at specific locations, which ensures the diversity of the generated electricity theft samples. In practice, the amount of data to generate should be decided according to the amount of available electricity theft data.

This article compares several common data generation algorithms for generating samples, and the classification accuracy results on the resulting sets are shown in Table 3. With or without noise, the samples generated by the 1D-WGAN model can effectively
improve the classification accuracy of the classifier. The results show that by learning the sample distribution, 1D-WGAN generates new samples that are similar to the original samples but not identical. The generated samples work well and can solve the problem of classifier overfitting. At the same time, the 1D-WGAN model reduces the interference of noise during adversarial learning and has strong robustness and generalization.

Table 4 shows the running time of the various methods at different sampling rates (SR). ROS has the shortest running time; the running time for WGAN to generate data at the same sampling rate is slightly higher than that of ROS, but much lower than that of the SMOTE and ADA-SYN methods. With the increase of the sampling rate, the running time of the ROS, SMOTE, and ADA-SYN methods does not change significantly, while that of the WGAN-based method increases significantly.

Table 4
Running time of various sampling methods.

Method    Running time (s)
          SR = 0.5    SR = 1
[...]

Table 5
The values determined by similarity measures.

Characteristic curve and test curve     Measurement value
The first type of electricity theft     1.223
The second type of electricity theft    0.8977
The third type of electricity theft     2.3681
The fourth type of electricity theft    3.3260
The fifth type of electricity theft     3.5981
The sixth type of electricity theft     0.7843

After several tests, the first threshold value is D2 = 3, and the second threshold value is D1 = 0.7. The detection threshold in this interval can [...]tion, the method used in this paper has the lowest omission and excessive detection ratios compared with other methods.

With the increase in the number of normal samples and power [...] similarity measure and SVM; the 3rd electricity theft detection method is defined as no generated data and similarity measure; the 4th electricity theft detection method is defined as generated data and SVM and combined similarity measure and DT-KSVM; the 5th electricity theft [...] measure and SVM; the method proposed in this paper is defined as method 6.

Fig. 13 shows the accuracy and omission ratio of the different detection methods. It can be seen from Fig. 13 that the method used in this paper has the highest accuracy and the lowest omission ratio. At the same time, comparing different methods, it [...]

[Fig. 13. Comparative analysis of different schemes with respect to accuracy and omission ratio. x-axis: detection method (1–6); y-axes: Accuracy/% and Omission ratio/%.]
[Fig. 14. Detection accuracy and omission ratio of various electricity theft methods. x-axis: the first type, the second type, the third type, the sixth type, whole detection; y-axes: Accuracy/% and Omission ratio/%.]

[Fig. 15. AUC performance of different classification models in different attack types. x-axis: algorithm (KNN, NN, RF, SVM, CNN, this paper); y-axis: AUC.]

[Fig. 16. The accuracy comparison with other algorithms. x-axis: algorithm (KNN, NN, RF, SVM, CNN, this paper); y-axes: Accuracy/% and Omission ratio/%.]

[...] network (CNN) structure and parameters for detecting electricity theft. Fig. 15 shows the AUC (Area Under Curve) performance of different classification models for different attack types, and Fig. 16 shows the comparison with different algorithms. It can be seen that the proposed method has accuracy advantages over other machine learning methods, but is slightly less effective than CNN.

Fig. 17 compares the performance and response time of CNN and DT-KSVM on data sets of different sizes. The lines with * indicate the accuracy of the two algorithms, and the lines with ◇ indicate their response time. When the amount of data is small (less than 3000), the algorithm used in this paper is more accurate than CNN, but when the amount of data is sufficient, CNN achieves better results. Apart from the above analysis, it is essential to compute the response time of the proposed scheme and of CNN for theft detection. When the amount of data is small, the response times of the two are similar, but when the amount of data is large, the learning time of CNN is longer. At the same time, with traditional machine learning it is simpler to adjust hyperparameters and change model designs because the underlying algorithms are more comprehensively understood. However,
[6] Rengaraju P, Pandian SR, Lung CH. Communication networks and non-technical energy loss control system for smart grid networks. IEEE Innov Smart Grid Technol 2014:418–23.