Electrical Power and Energy Systems: Sciencedirect

Electrical Power and Energy Systems 124 (2021) 106399
journal homepage: www.elsevier.com/locate/ijepes
Fault diagnosis of shipboard medium-voltage DC power system based on machine learning

Sheng Liu^a, Yue Sun^a,b,⁎, Lanyong Zhang^a, Peng Su^c

^a College of Automation, Harbin Engineering University, Harbin 150001, China
^b Faculty of Engineering, National University of Singapore, Singapore 119077, Singapore
^c China Ship Development and Design Center, Wuhan, 430064, China
ABSTRACT
This study proposes a fault diagnosis method for a shipboard medium-voltage DC (MVDC) power system based on Noise-Assisted Multivariate Empirical Mode Decomposition (NA-MEMD) and Multilevel Iterative-LightGBM (MI-LightGBM). It overcomes limitations of existing fault diagnosis methods, such as heavy reliance on relays and slow training. MI-LightGBM is proposed to solve the problem of unbalanced training samples caused by the difficulty of obtaining fault samples in practical engineering. First, NA-MEMD was adopted to pre-process the voltage signals, which were decomposed into a set of components called Intrinsic Mode Functions (IMFs) according to the local characteristic time scales of the original signals. The energy moment of each order IMF was then calculated as the fault feature vector to train the MI-LightGBM model, which yielded a high-precision fault classifier. A model of a shipboard MVDC power system was established using the AppSIM Real-Time Simulator. Simulations were performed for earth faults and short-circuit faults at the generator output and on the DC cable. Compared with existing fault diagnosis methods, the proposed method is simple to use and saves more than half of the training time while maintaining high diagnostic performance, which makes it more suitable for engineering applications.
1. Introduction

Shipboard integrated power systems, especially medium-voltage DC (MVDC) systems [1], are a main current trend in the development of high-tech shipboard power systems and have been gradually adopted as the main power systems for next-generation naval vessels [2]. A shipboard MVDC power system can be regarded as an independent controlled microgrid that is powered by distributed generation units and is more fragile and prone to failure than conventional power grids. A fault results in a rapid increase in fault current, which poses a great threat to system safety and thereby places a high safety requirement on the system. However, the fault diagnosis of an MVDC power system is still in its infancy, with many problems remaining to be solved [3].

The existing fault detection methods consist of traditional ones and new ones [4]. The traditional ones include AC/DC Side Overcurrent Protection, Directional Protection and Current Differential Protection, and the new ones include Short-time Fourier Transform [5], Wavelet Transforms (WT) [6], and Artificial Neural Networks. Baran et al. [7] developed a fault detection and fast isolation method for multiterminal DC systems that is based on the idea of overcurrent protection. This method, however, is prone to interference, and the setting of the overcurrent threshold is subjective. Monadi et al. [8] proposed a central protection strategy for medium-voltage DC microgrids that divides the microgrids into several regions. They designed centralized protection units for centralized control and used differential elements for fault detection. However, this method requires many sub-microgrid relays (SMR) to be set in each region and has a strong dependence on SMR, making the diagnostic accuracy subject to SMR. Chanda et al. [9] designed a fault classification and localization methodology for MVDC shipboard power systems, building a system model based on PSCAD and using an ANN-based approach to quickly and accurately detect faults on DC cables. However, this methodology relies on all data for diagnosis, which leads to slow training of the ANN-based classifier model. Li et al. [10,11] proposed a method based on wavelet transform multiresolution analysis and artificial neural networks for fault diagnosis of MVDC shipboard power systems. This method does not require interzone communication or external signal injection. It has high generalization capability for application to systems with different parameters. However, the training speed of ANN is slow, so it is not suitable for problems with a large amount of data. Eristi et al. [12] proposed a power system disturbance detection method based on WT and support vector machines (SVM). The method has high accuracy, strong robustness and a small random-access memory (RAM) footprint, but it is difficult to select the parameters of the wavelet and SVM. Given that the range of a shipboard MVDC power system is large, traditional fault detection methods require the system to be divided in detail and many relays to be installed, which results in a high financial burden. Moreover, damage to one relay would result in failure of the detection. For MVDC
⁎ Corresponding author.
E-mail address: 576194000@hrbeu.edu.cn (Y. Sun).
https://doi.org/10.1016/j.ijepes.2020.106399
Received 22 March 2020; Received in revised form 18 June 2020; Accepted 24 July 2020
Available online 07 August 2020
0142-0615/ © 2020 Elsevier Ltd. All rights reserved.
Table 1
Comparative studies of fault feature extraction methods.

Method: Fourier Transform [5]
  Advantages:
    1. Simple principle
    2. It can well describe the frequency characteristics of signals
  Disadvantages:
    1. Lack of extraction of time and frequency information
    2. Limited to non-stationary signals
    3. Poor sensitivity to abrupt signal

Method: WT [14]
  Advantages:
    1. Highlight frequency features through variable window size
    2. Suitable for non-stationary signals
  Disadvantages:
    1. Has to select proper basis wavelets and decomposition levels according to different signals
    2. Cannot be simultaneously applied to multi-channel signals

Method: EMD [15]
  Advantages:
    1. Without a need to pre-set any basis function
    2. Strong adaptability
  Disadvantages:
    1. Endpoint effect
    2. Modal aliasing
    3. Cannot be simultaneously applied to multi-channel signals

Method: MEMD [16]
  Advantages:
    1. It can process multi-channel signals simultaneously
    2. Multi-channel signals have waveform similarity on the same scale
  Disadvantages:
    1. Endpoint effect
    2. Modal aliasing

systems, a small fault would likely paralyze the entire system. The existing new detection methods not only suffer from a slow model training process but also require the model parameters to be carefully adjusted to meet the accuracy requirement for fault detection.

Many signal processing methods have been applied to the analysis of power systems. Table 1 shows the advantages and disadvantages of several fault feature extraction methods for fault diagnosis. It can be seen that many of them are limited in the simultaneous processing of multi-channel signals. The emergence of noise-assisted multivariate empirical mode decomposition (NA-MEMD) [13] solves the above problem. It performs signal decomposition based on the time-scale features of the data without a need to pre-set any basis functions. Its biggest advantage is that it can process multi-channel signals simultaneously to obtain intrinsic mode functions (IMFs) that have the same dimension and matched frequencies so as to facilitate later analysis. NA-MEMD eliminates the problems of the endpoint effect and modal aliasing in the EMD method by introducing a new independent channel of auxiliary white noise while ensuring that the original signals are not contaminated by the white-noise signal. The main frequency components of the original signal clearly exist in the IMFs, so NA-MEMD is very suitable for fault feature extraction.

LightGBM [17] is a machine learning algorithm based on the gradient boosting decision tree (GBDT) [18]. It mainly employs two new technologies, namely gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB), to ensure that the method is fast and accurate. It adopts a histogram-based algorithm to effectively exploit the sparsity characteristics of large-scale data, which allows for extremely fast training and diagnosis. It also has an increased robustness to noise, thereby making it very suitable for engineering applications [19]. In practical engineering, the system is usually in normal operation and failures occur only in rare cases, so the number of fault samples is much smaller than the number of normal samples. Among the training samples, the normal samples account for the majority and can be regarded as simple samples. A large number of simple samples guide the direction of the gradient update of the classifier, which affects the accuracy on fault samples. In the model training process, the training stops as soon as the overall accuracy reaches the training requirements. That is to say, if the classification accuracy of the model on the large number of normal samples is high, the training requirements can already be met. This makes the training on fault samples inadequate, and the accuracy of fault classification is lower than that of the normal state. To solve the above problems, this paper proposes a Multilevel Iterative-LightGBM (MI-LightGBM). This method weights the samples with a low classification confidence level, so that the model increases the training gradient and improves the classification accuracy of the samples with insufficient training. MI-LightGBM not only increases the diagnostic accuracy of the fault samples, but also weights the normal samples that are not well fitted, thus further improving the overall accuracy.

This paper proposes a fault diagnosis model combining NA-MEMD and MI-LightGBM that can be used for fault diagnosis of a shipboard MVDC power system with faults including the generator phase-to-phase short circuit, earth fault, and DC cable ground fault. The methodology involves the following steps. First, an MVDC power system model is built in the AppSIM Real Time Simulator to simulate various faults, with fault voltages pre-processed by NA-MEMD. NA-MEMD decomposes the signals into the sum of a set of IMFs according to the local characteristic time scales of the signals to describe the signal frequency components, which can effectively characterize important signal properties. Next, energy moments of the IMFs are calculated. These energy moments are used as fault feature vectors based on joint consideration of two aspects: the energy magnitude of each order IMF and the temporal distribution of IMF energy. Finally, MI-LightGBM is used as a fault classifier to accurately classify fault feature vectors. The proposed fault diagnosis method is not affected by electric parameters such as cable inductance and impedance of the generator and propulsion motor, and is not subject to external interference; furthermore, it does not need inter-region communication and has strong generalization ability. It solves the problem of classification error deviation caused by sample imbalance.

This paper is organized as follows. In Section 2, we introduce the MVDC power system model used in this paper. Feature engineering and fault classification are presented in Section 3. In Section 4, we introduce a detailed numerical simulation to validate the effectiveness of the proposed fault diagnosis method of the power system. Finally, the conclusions are drawn in Section 5.

2. Model of shipboard MVDC power system

The original conceptual model of a shipboard MVDC power system that was established by the Electric Ship Research and Development Consortium (ESRDC) [20,21] was simplified in this study to obtain an MVDC power system conceptual model (Fig. 1). This model consists of two power generation modules (PGMs): the main PGM and the auxiliary PGM. The main PGM contains a twin-shaft gas turbine as the prime mover (a single-shaft gas turbine is used in the auxiliary PGM) to drive the notional round rotor synchronous machine, and an IEEE AC8B exciter is used as the excitation system. The prime mover provides input power to the synchronous machine model, which, in turn, provides the shaft speed as feedback. A six-pulse diode rectifier is employed to convert alternating current into direct current, which is delivered through RL cables to the loads. The propulsion system is represented by an equivalent motor drive inverter and motor. The service loads are modelled as constant power loads (CPLs) and connected to the main grid via power electronic converters. The DC voltage of the system is 5 kV. The main parameters in the MVDC power system model are shown in Table 2.

The AC system is the main power supply source of the DC power grid. When the AC system fails, the fault current affects the DC side through the converter. The faults at the output of the synchronous machines in the main PGM and auxiliary PGM are simulated in this paper. These are the single-phase earth fault M-F1, two-phase earth fault M-F2, two-phase short-circuit fault
M-F11, and three-phase short-circuit fault M-F3 of the main PGM, and the single-phase earth fault A-F1, two-phase earth fault A-F2, two-phase short-circuit fault A-F11, and three-phase short-circuit fault A-F3 of the auxiliary PGM. Since the main and auxiliary power generation contribute differently to the power of the DC grid, the impacts of main and auxiliary power generation faults on the DC grid are different. Single-phase earth faults, two-phase earth faults, and two-phase short-circuit faults are asymmetric faults. When an asymmetric fault occurs on the AC side, the phase voltage at the fault point decreases instantaneously, while the current changes in the opposite direction. Faults on the AC side also affect the voltage and current waveforms on the DC side. An asymmetric fault on the AC side makes the voltage and current of the DC side fluctuate, but it does not cause DC over-voltage or over-current. The three-phase short-circuit fault is the most serious fault type on the AC side and has a serious impact on the DC power system. When a three-phase short-circuit fault occurs on the AC side, the current after the short-circuit point is cut off, so the voltage and current of the DC bus are greatly affected. The DC fault FDC was also studied in this paper; it has a fatal effect on the DC power system. After the DC fault occurs, the DC current increases sharply, and the voltage of the pole with the DC fault is 0. Therefore, the voltage on the DC bus can be used to judge both DC faults and AC system faults. The fault earth resistance is 0.01 Ω.

Fig. 1. MVDC power system model.

Table 2
Parameters of the MVDC power system model.

Component                      Type            Rating     Speed
Prime mover (main PGM)         Twin-shaft      36 MW      3600 rpm
Generator (main PGM)           Synchronous     47 MVA     3600 rpm
Prime mover (auxiliary PGM)    Single-shaft    4 MW       14400 rpm
Generator (auxiliary PGM)      Synchronous     5 MVA      14400 rpm
Propulsion motor               Synchronous     36.5 MW    120 rpm

Service loads: 10 MVA
Inductance of cable (per meter): 4.77e-7 H
Length of cable: 200 m

3. Fault diagnosis method

3.1. Feature engineering

Feature engineering is a particularly important step in fault diagnosis. If machine learning is directly applied to raw data to construct a fault diagnosis model, model training will be slow owing to large data dimensions. This is because the algorithm cannot automatically extract meaningful information from the original data. Therefore, it is necessary to pre-process raw data to obtain appropriate feature vectors so as to establish a more efficient fault diagnosis model. In other words, raw data is obtained first, from which meaningful features are extracted using data processing technology while removing interfering information; next, machine learning is performed on the features to generate the desired fault diagnosis model. This paper proposes using an NA-MEMD method to pre-process the fault data and highlight the useful information in the data for fault diagnosis.

3.1.1. NA-MEMD method

When using a traditional empirical mode decomposition method to analyze multi-channel vibration signals, solutions are sought separately for each signal. This results in a different number and scale of IMFs decomposed in each channel, which is not conducive to the correlation analysis between multi-channel signals at the same time, thereby degrading the accuracy of fault diagnosis. The NA-MEMD method can process multi-channel signals simultaneously and decompose them into IMFs with the same dimensions, which makes it easier to conduct subsequent component analysis. NA-MEMD processes mixed signals consisting of multi-channel signals and Gaussian white noise in independent channels. In dealing with white noise, the characteristics of MEMD binary filter banks are fully utilized to solve the modal aliasing phenomenon in MEMD. It can effectively separate the original signal and auxiliary white noise after decomposition. It has strong adaptability and robustness and can provide high decomposition accuracy in the presence of background noise. The NA-MEMD method decomposes the signal into IMF components with different characteristic time scales.

For a multivariate input signal, the detailed steps of the NA-MEMD algorithm are as follows.

Step 1: Generate a $p$-channel uncorrelated Gaussian white noise signal $s(t) = \{s_1(t), s_2(t), \ldots, s_p(t)\}$, the length $j$ of which is the same as that of the original signal $x(t)$.

Step 2: Combine the $p$-channel Gaussian white noise signal $s(t)$ with the $q$-channel original signal $x(t)$ to form the $g$-channel multivariate signal $z(t) = \{z_1(t), z_2(t), \ldots, z_g(t)\}$, where $g = q + p$.

Step 3: Use the Hammersley sequence sampling method to obtain a proper direction vector of the $n$-dimensional space on the $(n-1)$-dimensional spherical surface.

Step 4: Calculate the projection $p^{k}(t)$ of the signal $z(t)$ along the $k$-th direction vector $X^{k}$, where $l$ is the total number of direction vectors and $k = 1, 2, \ldots, l$.

Step 5: Find the time $t_i^{k}$ corresponding to the maximum and minimum values of the projection $p^{k}(t)$.

Step 6: Apply multivariable spline interpolation to the extreme points $[t_i^{k}, x(t_i^{k})]$, and obtain the multivariate envelope $E^{k}(t)$.

Step 7: For the $l$ direction vectors of the sphere space, calculate the mean $m(t)$ of the envelope curves as

$$m(t) = \frac{1}{l} \sum_{k=1}^{l} E^{k}(t) \tag{1}$$

Step 8: Use $D_i(t) = x(t) - m(t)$ to calculate the $i$-th order intrinsic mode function $D_i(t)$. If $D_i(t)$ satisfies the multivariate IMF criterion, $D_i(t)$ is the $i$-th multivariate IMF component, and the formula for calculating the residual function is $r_i(t) = x(t) - D_i(t)$. Then, $r_i(t)$ is taken as the new initial signal. The above steps are continued until the residual function $r_i(t)$ becomes a monotonic function; then, the sifting process is stopped. Otherwise, use $D_i(t)$ instead of the raw data to repeat
steps 4–7 until the stop criteria are met.

Step 9: Decompose the $(q + p)$-dimensional IMF component after the above steps are completed. Among them, the $p$-dimensional IMF components generated by the auxiliary noise channel decomposition are discarded, and finally the IMF components obtained by decomposing the $q$-dimensional channels of the original signal are retained.

Different signals have the same number of IMFs after being processed by NA-MEMD, and the same frequency components in the different signals exist in IMF components of the same order, which makes subsequent fault analysis less difficult.

3.1.2. Energy moment

To characterize the difference between different forms of faults in a more straightforward manner, it is necessary to obtain distinctive characteristic information through analysis and calculation. A diagnosis model established using characteristic information will have lower model complexity but higher sensitivity and accuracy for fault identification. When a short circuit or ground fault occurs, the frequency of the voltage and current changes, which in turn leads to changes in the energy distribution. Therefore, under different faults, the energy of the signal changes with the working condition of the system. We calculated the energy moment of the IMFs to form a fault feature vector [22]. This method considers the effect of the time scale on energy entropy and can represent failure information more comprehensively. The detailed calculation of the IMF energy moment is as follows.

Step 1: The $q$-channel signals are decomposed by NA-MEMD, and the IMF components $c_1, c_2, \ldots, c_N$ and the residual function $Res$ are obtained.

Step 2: The energy moments of $c_1, c_2, \ldots, c_N$ are calculated as shown in Eq. (2). The energy moment $E_R$ of the residual function $Res$ is calculated by Eq. (3).

$$E_n = \int t \, |c_n(t)|^2 \, dt \tag{2}$$

$$E_R = \int t \, |Res|^2 \, dt \tag{3}$$

where $n = 1, 2, \ldots, N$. Then, all the IMF energy moments form the eigenvector $T$.

$$T = [E_1, E_2, \ldots, E_N, E_R] \tag{4}$$

Step 3: Normalizing the energy moments of the IMFs results in

$$E = \left( \sum_{n=1}^{N} |E_n|^2 + |E_R|^2 \right)^{1/2} \tag{5}$$

$$T = \left[ \frac{E_1}{E}, \frac{E_2}{E}, \ldots, \frac{E_N}{E}, \frac{E_R}{E} \right] \tag{6}$$

Therefore, $T$ is the extracted fault feature vector.

3.2. Fault identification

The GBDT is a popular machine learning algorithm that has drawn extensive attention since its introduction. It can be implemented in various forms, such as XGBoost and pGBRT. XGBoost is a highly efficient implementation of the GBDT algorithm and contains many optimizations, improving the generalization capability of the model. However, it fails to give satisfactory operating speeds and results when handling extremely large data dimensions and magnitudes. This is because, when evaluating each feature, the model is required to calculate the information gain of each split based on all the original data. This practice is advantageous in that it considers all possibilities, but it is computationally intensive and time-consuming with a large memory footprint. To solve this problem, Microsoft proposed a method known as LightGBM in 2017, which contains many new methods that improve the model. It has many advantages compared with conventional algorithms, such as faster model training and higher accuracy, support for parallel learning, and the ability to process large-scale data with a smaller memory footprint.

3.3. LightGBM

To overcome the above shortcomings, two novel technologies are proposed in LightGBM, namely gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB).

a. GOSS is employed as a sampling algorithm to reduce the number of samples.

Traditional AdaBoost uses weights to measure the importance of each sample, assigning larger weights to samples that have given poor training results so that the training of the next base learner is based more on these samples. In other words, the training data can be selected according to the weights so as to reduce the training burden. LightGBM uses the GOSS technique to filter data instances when finding split values. Based on the above principle, LightGBM overcomes a problem that GBDT faces: GBDT uses all the data in each training round because it has no sample weights, which slows the computation. LightGBM reduces the amount of data to be analyzed while ensuring the accuracy of the learning model.

b. EFB is employed to pre-process the sparse data and reduce the number of features.

As its name suggests, exclusive feature bundling is centered on the principle that several features are combined to effectively reduce the number of features. Most high-dimension datasets are sparse. The sparsity of features indicates that many features are mutually exclusive. This method exploits this attribute to bind mutually exclusive features together, converting a sparse matrix into a dense matrix. This strategy leads to a smaller number of features while not affecting the model's performance, thereby speeding up the training. After placing features with different ranges into the same bundle, it is necessary to ensure that the values of the original features before bundling can be accurately identified in the bundle. To ensure that the values of different features do not conflict, the value range of the features in the merged bundle is rebuilt by adding an offset.

LightGBM adopts a histogram-based split search algorithm, which simplifies the computation. Regardless of whether a feature is discrete or continuous, the histogram algorithm processes its values as discrete values. It discretizes the feature values into bins and constructs a histogram. It then traverses the training data and accumulates the statistics of each discrete value in the histogram. When performing feature selection, only the discrete values of the histogram need to be traversed to find the optimal split point. The histogram algorithm does not need to store pre-sorting results and only saves the discretized feature values, which reduces the memory consumption of the algorithm.

In addition to the above two major technical improvements, LightGBM uses a leaf-wise strategy to generate trees to reduce unnecessary split costs, so that it has a lower memory footprint and higher computation speed with good accuracy. Although this may lead to overfitting, LightGBM adds a maximum depth limit to address the problem.

3.4. Multilevel Iterative LightGBM

Unbalanced training data is one of the biggest problems in machine learning. In practical engineering, it is very difficult to obtain fault samples. Therefore, the number of fault samples is less than the number of normal samples, and fault diagnosis often has to face the problem of sample imbalance. In this case, the update of the training gradient is biased towards the normal samples. As a result, although the overall classification accuracy of the classifier meets the training requirements, the diagnosis accuracy of fault samples is very low. To make LightGBM more suitable for application in the field of fault diagnosis and solve the above problems, this paper proposes MI-LightGBM. This method solves the above problems by
increasing the weight of samples with misclassification or low classification confidence. The detailed improvement methods are as follows.

The improved objective function is expressed by Eq. (7):

$$Obj^{(t)} = \sum_{i=1}^{n} \kappa_i \, l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + con \tag{7}$$

where the weight $\kappa_i$ is expressed as

$$\kappa_i = \begin{cases} c, & conf_k < tv \\ 1, & conf_k > tv \end{cases} \tag{8}$$

$$conf_k = \max prob_k - \frac{\sum_{j=1}^{n} prob_k(j) - \max prob_k}{n - 1} \tag{9}$$

In Eqs. (7)–(9), $\Omega(\cdot)$ is the regularization term, $c$ is a constant, and $conf_k$ denotes the classification confidence level of sample $k$, which measures the reliability of the classification result; $tv$ is a pre-set threshold, $n$ is the number of categories, and $prob_k(j)$ denotes the probability that sample $k$ falls in category $j$. The threshold $tv$ needs to be determined according to the application scenario. In this paper, $tv = 0.9$ to ensure the high reliability of the fault diagnosis model. After each
training, if the classification confidence level is lower than the
threshold, it indicates that the classification result has a low reliability,
and therefore, the weight of sample k will be increased in the set of
training samples to obtain a new sample set. Model training will be
conducted using the new sample set. This strategy allows the model to
preferentially fit the samples with low classification reliabilities to
improve model accuracy. However, the training accuracy will decrease
if the model over-fits the data. Therefore, as the number of training
iterations increases, the model accuracy will increase first and then
decrease. In this study, the number of training iterations that gives the
highest model accuracy will be experimentally determined, so that a
final classification model can be established.
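
The confidence measure of Eq. (9) and the weighting rule of Eq. (8) can be illustrated with a minimal Python sketch. The function names, the example probabilities, and the value c = 2.0 are assumptions made only for illustration (the text does not state the value of c), and the case conf_k = tv, which Eq. (8) leaves undefined, is treated here as weight 1. The sketch covers only the low-confidence branch, not the handling of misclassified samples.

```python
def classification_confidence(probs):
    """Eq. (9): largest class probability minus the mean of the remaining ones."""
    n = len(probs)                      # number of categories
    p_max = max(probs)
    return p_max - (sum(probs) - p_max) / (n - 1)


def sample_weight(probs, c=2.0, tv=0.9):
    """Eq. (8): weight c for low-confidence samples, 1 otherwise.

    c = 2.0 is a hypothetical value; tv = 0.9 is the threshold used in the paper.
    """
    return c if classification_confidence(probs) < tv else 1.0


# Hypothetical class probabilities for two samples and four categories.
predicted = [
    [0.97, 0.01, 0.01, 0.01],   # confident prediction
    [0.50, 0.30, 0.10, 0.10],   # uncertain prediction
]
confidences = [classification_confidence(p) for p in predicted]
weights = [sample_weight(p) for p in predicted]
```

In this example the first sample has confidence 0.96 and keeps weight 1, while the second has confidence of about 0.33 and receives weight c. One plausible way to use such weights, again an assumption rather than the authors' implementation, is to pass them as per-sample weights when refitting the classifier in the next iteration.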
4. Experiment
4.1. Experimental description
A numerical simulation based on the AppSIM real-time simulator was conducted to evaluate the performance of the proposed fault diagnosis method for the ship MVDC power system. The MVDC power system model introduced in Section 2 was implemented on an AppSIM real-time simulator so that the switching details of the power electronic converter could be simulated in real time. The configuration of the AppSIM real-time simulator is shown in Fig. 2. Single-phase earth fault, two-phase earth fault, two-phase short-circuit fault, and three-phase short-circuit fault were simulated on a synchronous generator, and a short-circuit fault on a DC bus was simulated on an RL cable with the MVDC power system model. Five signals were collected to analyze fault information simultaneously. These are the voltages M−Va and M−Vb at the output of the synchronous machine in the main PGM, the voltages A−Va and A−Vb at the output of the synchronous machine in the auxiliary PGM, and the DC voltage Vdc on the DC cable.

For fault waveforms, the simulation duration was set to 10 s, with the fault applied from 5 s. The fundamental sample time was 1e−5 s. The voltage waveforms on the DC cable in different states are shown in Fig. 3.

Fig. 2. Configuration of the AppSIM real time simulator.

Fig. 3. Fault waveforms on the DC cable.
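
The timing settings above fix the size of each recorded channel. The short Python sketch below works through that arithmetic; the variable and function names are illustrative, and only the three timing values and the five channel labels come from the text.

```python
# Timing values stated in the experimental description; names are illustrative.
SAMPLE_TIME_S = 1e-5     # fundamental sample time
DURATION_S = 10.0        # total simulated duration
FAULT_START_S = 5.0      # instant from which the fault is applied
CHANNELS = ("M-Va", "M-Vb", "A-Va", "A-Vb", "Vdc")


def to_samples(seconds, sample_time=SAMPLE_TIME_S):
    """Convert a duration in seconds to a whole number of samples."""
    # round() guards against floating-point error in the division.
    return round(seconds / sample_time)


total_samples = to_samples(DURATION_S)          # samples per channel
fault_index = to_samples(FAULT_START_S)         # first faulted sample
post_fault_samples = total_samples - fault_index
```

Under these settings each of the five channels holds one million samples, and the fault begins at sample index 500,000.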
4.2. Feature engineering

The voltage signals within 10 ms after the fault are used for fault detection [4]. The NA-MEMD method was adopted to simultaneously pre-process the signals M−Va, M−Vb, A−Va, A−Vb, and Vdc in each system state. In NA-MEMD, the amplitude and the number of channels of the added noise are selected according to experience [13]. Excessively low noise power cannot constitute a stable quasi-dyadic filter bank structure for the input data, resulting in endpoint effects and modal aliasing in the decomposition results. Although increasing the noise power enhances the quasi-dyadic filter bank structure, excessive noise power damages the data-driven ability of the algorithm. In this study, we set the number of auxiliary white-noise channels to 2, and the noise variance was 0.1. Different frequencies of voltage components lead to different characteristic time scales in different fault states of the system, which results in different final results of NA-MEMD. Signal pre-processing with NA-MEMD has strong robustness, with negligible interference from the I/O noise of AppSIM; that is, the process of signal denoising could be omitted. Moreover, the NA-MEMD method can simultaneously process signals in five channels and decompose them into the same number of IMFs, thereby avoiding loss of local features and maximizing preservation of the dynamic features of the signals. This makes feature analysis less challenging. Then, we calculated the energy moments of the IMFs. Since the voltage signals are used as the original signals, which contain only the frequency of the signal itself and the I/O noise of AppSIM, the energy moments of some orders of IMFs are very small or almost zero. These energy moments indicate that the correlation between the IMF and the original signal is very small, so the IMF is not a main component of the signal. Instead of ignoring these small energy moments, this study used them to form feature vectors for training the fault classification model. Because of the EFB technology, LightGBM performs very well in dealing with sparse data. The feature values close to 0 would not have adverse impacts on the diagnosis results. Instead, they would be beneficial for model training in MI-LightGBM and improve training accuracy.

4.3. Fault classification

MI-LightGBM, which was implemented in Python, was adopted as the fault classifier. By using the above-extracted fault feature vectors as model inputs in MI-LightGBM, a fault diagnosis model of the MVDC

Table 3
The parameters of MI-LightGBM.

Parameter            Value
Boosting_type        gbdt
Objective            multiclass
Num_class            10
Metric               multi_error
Num_leaves           80
Min_data_in_leaf     100
Min_gain_to_split    0.2
Learning_rate        0.06
Feature_fraction     0.8
Bagging_fraction     0.8
Bagging_freq         5
Lambda_l1            0.4
Lambda_l2            0.5
max_depth            7

training data, and the remaining 3000 sets were used as the test data. The parameters of LightGBM are optimized by GridSearchCV. The parameter settings are specified in Table 3.

4.3.1. Comparison with other intelligent fault diagnosis methods

The performance of the fault classifier based on MI-LightGBM was compared with those of other intelligent methods, and the experimental results are shown in Table 4. The testing time represents the time the classifier takes to process all 3000 test samples. As can be seen from Table 4, the MI-LightGBM method has excellent performance in each index. Although the training of k-Nearest Neighbor (KNN) and logistic regression (LR) is very fast, the other indexes of these two classifiers cannot meet the requirements of shipboard MVDC power system fault diagnosis. The classifier based on fuzzy NN can achieve good classification performance, but it needs too long a training time. In actual engineering, where the number of training samples is larger, this shortcoming will be more obvious. Therefore, among the methods with high diagnosis performance, the fault classification method proposed in this paper takes the shortest time. These results reveal that the MI-LightGBM-based fault classification method proposed in this study has great advantages in training on large datasets. It effectively solves the conventional problem of low computational efficiency by ensuring model accuracy while speeding up the operation of the classifier model. It can also be seen from Table 4 that, in addition to the advantages in training speed and accuracy, the method proposed in this paper still performs well in precision, sensitivity, and F1-score. An experiment with the fault classifier under different cable inductances and impedances of the generator and propulsion motor was also completed. The experimental results are very similar to those shown in Table 4, so
power system was established. The experimental process is shown in they are not listed. It shows that these parameters will not affect the
Fig. 4. In each system state, 1000 datasets were selected to form 10,000 fault diagnosis results. The fault diagnosis method proposed in this
sets of samples. From these sets, 7000 sets were randomly selected as paper can not be affected by these electric parameters.
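The weighted precision, sensitivity and F1-score compared in Table 4 can be illustrated with a small sketch. It assumes that "weighted" means averaging the per-class values with weights proportional to the number of samples in each class; the text does not spell out its weighting scheme. The function name `weighted_metrics` and the toy confusion matrix are hypothetical and are not taken from the experiments.

```python
def weighted_metrics(confusion):
    """Return accuracy and support-weighted precision, sensitivity and F1.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Support weighting is an assumption here.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total

    w_precision = w_sensitivity = w_f1 = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])  # actual samples of class c
        predicted = sum(confusion[r][c] for r in range(n_classes))
        precision = tp / predicted if predicted else 0.0
        sensitivity = tp / support if support else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        weight = support / total
        w_precision += weight * precision
        w_sensitivity += weight * sensitivity
        w_f1 += weight * f1
    return accuracy, w_precision, w_sensitivity, w_f1


# Toy two-class confusion matrix (made-up counts, for illustration only).
toy_confusion = [[90, 10], [5, 95]]
acc, w_prec, w_sens, w_f1 = weighted_metrics(toy_confusion)
print(f"accuracy={acc:.4f} precision={w_prec:.4f} "
      f"sensitivity={w_sens:.4f} F1={w_f1:.4f}")
```

Under support weighting, the weighted sensitivity is always equal to the overall accuracy, which is consistent with the rows of Table 4 in which these two columns coincide.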
The receiver operating characteristic (ROC) curve is used to compare the reliability of the seven classifiers. In the ROC diagram, the horizontal axis is the false positive rate (1 − specificity): the larger the false positive rate, the more actual negatives are predicted as positive. The vertical axis is the true positive rate (sensitivity): the larger the true positive rate, the more actual positives are correctly predicted as positive. Therefore, the larger the sensitivity and specificity (that is, the closer the ROC curve is to the point (0,1)), the better the performance of the classifier. The area under the curve (AUC) is the area under the ROC curve, and a larger AUC indicates better classification performance. It can be seen from Fig. 5 that XGBoost, LightGBM and MI-LightGBM perform well in all classes. Although the average AUC value of the other classifiers is high, their diagnosis of certain types of faults is not effective. This means that there is a high probability of a wrong judgment on such faults, which may lead to the paralysis of the whole shipboard MVDC power system.
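As a minimal sketch of how an ROC curve and its AUC are obtained, the following code sweeps the decision threshold over a set of classifier scores and integrates the resulting curve with the trapezoidal rule. It treats a single class against the rest (binary labels), which is an assumption about how the per-class curves would be drawn; the function names, labels and scores are hypothetical and do not come from the experiments.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    labels are 1 (positive) or 0 (negative); a sample is predicted
    positive when its score is at least the current threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold one distinct score at a time.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores for six samples (illustration only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print("AUC =", round(auc(curve), 4))
```

A classifier that ranks every positive sample above every negative one yields a curve through (0,1) and an AUC of 1.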
Fig. 4. Experimental process of proposed fault diagnosis.

Table 4
Indicators for different fault classifiers.
Methods                  Training time (s)  Testing time (s)  Average accuracy (%)  Weighted precision (%)  Weighted sensitivity (%)  Weighted F1-score (%)
KNN [23]                 2.652              0.851             94.267                95.635                  94.267                    93.959
Decision Tree C4.5 [24]  22.157             0.085             98.633                98.637                  98.633                    98.632
Fuzzy NN [25]            123.974            0.096             99.167                99.168                  99.167                    99.166
LR [26]                  2.481              1.547             99.233                99.238                  99.233                    99.233
XGBoost [27]             87.721             0.283             99.771                99.300                  99.300                    93.300
LightGBM [17]            11.77              0.065             99.767                99.767                  99.767                    99.767
MI-LightGBM              12.32              0.071             99.773                99.773                  99.773                    99.773

Fig. 5. ROC curve of classifiers.

4.3.2. Comparison between LightGBM and MI-LightGBM under different sample proportions

To verify the superiority of the proposed method, the performance of LightGBM and MI-LightGBM is evaluated at different sample proportions. When the sample proportions are unbalanced, the overall accuracy cannot accurately reflect the performance of the model. Therefore, the precision-recall curve (PR curve) is used as the evaluation standard in this paper. Precision and recall are calculated as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{11}$$

where the meanings of TP, FP and FN are given in Table 5. The area enclosed by the PR curve indicates the accuracy; the larger the area, the better the model performance.

Table 5
Meaning of TP, FP, FN and TN.
                    Actual positive  Actual negative
Predicted positive  TP               FP
Predicted negative  FN               TN

A total of 1000 groups of DC-bus short-circuit fault data and normal data are selected for comparison. The ratio of fault samples to normal samples is set to 1:1, 1:2, 1:5 and 1:10 in turn to verify the performance of the model under different conditions. The experimental results are shown in Fig. 6. Class 0 indicates the normal condition and Class 1 indicates the fault condition.

Fig. 6. Precision-Recall curve under different sample proportions (left: LightGBM, right: MI-LightGBM).

It can be seen that for sample ratios of 1:1 and 1:2, the accuracy difference between LightGBM and MI-LightGBM is very small. However, as the sample proportion becomes more unbalanced, MI-LightGBM shows significant advantages. The classification performance of LightGBM is seriously affected by the sample proportion, especially the classification of fault samples, which reaches only 43.6% in the case of 1:10. In contrast, MI-LightGBM still guarantees good performance when the sample proportion is very unbalanced. Therefore, the fault classifier based on MI-LightGBM proposed in this paper maintains a relatively ideal classification result under sample imbalance, which makes it more suitable for the sample imbalance caused by the difficulty of obtaining fault samples in practical engineering.
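The reason overall accuracy is a poor indicator under sample imbalance can be seen from a short worked example of Eqs. (10) and (11). The counts below are invented for illustration, assuming the fault class is taken as the positive class and a 1:10 fault-to-normal ratio; they are not results from the experiments, and the function names are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall as in Eqs. (10) and (11)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that are classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical 1:10 case: 100 fault samples and 1000 normal samples.
# The classifier finds 40 faults, misses 60, and raises 10 false alarms.
tp, fn, fp, tn = 40, 60, 10, 990
p, r = precision_recall(tp, fp, fn)
a = accuracy(tp, fp, fn, tn)
print(f"accuracy={a:.3f} precision={p:.3f} recall={r:.3f}")
```

In this made-up case the accuracy stays above 93% even though more than half of the faults are missed, which is why precision and recall on the fault class are the more informative measures.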
4.4. Comparison of fault feature extraction methods

A comparative experiment was carried out to study the classification accuracies obtained with different feature extraction methods. First, the data were pre-processed using the WT, EMD, EEMD and NA-MEMD methods, respectively. Feature extraction was then performed using variance, the distance evaluation technique, and energy moments. Variance and the distance evaluation technique are introduced below.

1) Variance:

$$\mathrm{var} = \frac{1}{N}\sum_{i=1}^{N} x_i^2 \tag{12}$$

where N is the number of sample points and x_i, i = 1, 2, …, N, is the input signal.

2) Distance evaluation technique:

For n categories, there is a feature set {f(j,k), j = 1, 2, …, n; k = 1, 2, …, N_j}, where f(j,k) is the k-th feature of the j-th category and N_j is the number of features in the j-th category.

a. Calculate the average distance of all features in the j-th category:

$$D_j = \frac{1}{2}\cdot\frac{1}{N_j}\sum_{m=1}^{N_j}\frac{1}{N_j-1}\sum_{k=1}^{N_j}\left|f(j,m)-f(j,k)\right| \tag{13}$$

Then the average of D_j over all categories is

$$D_w = \frac{1}{n}\sum_{j=1}^{n} D_j \tag{14}$$

b. Calculate the average distance between the n categories:

$$D_r = \frac{1}{n}\sum_{j=1}^{n}\left|d(j)-d\right| \tag{15}$$

$$d(j) = \frac{1}{N_j}\sum_{k=1}^{N_j} f(j,k) \tag{16}$$

$$d = \frac{1}{n}\sum_{j=1}^{n} d(j) \tag{17}$$

c. The distance evaluation criterion is calculated as

$$J = \frac{D_r}{D_w} \tag{18}$$

According to the above description, the best feature is the one with the largest distance evaluation criterion J.

The experimental results are shown in Fig. 7. It can be seen that the NA-MEMD + energy moment method used in this paper obtains the highest diagnostic accuracy. NA-MEMD is an adaptive signal processing method which, unlike WT, does not require the selection of a mother wavelet and decomposition levels. WT, EMD and EEMD can only process single-channel signals, which is a great limitation for multi-channel signals. NA-MEMD is a multi-channel signal processing method, which ensures that the number and frequency of the IMFs in each channel are matched. It has better adaptability and time–frequency localization ability, and it improves the accuracy of fault diagnosis. Using variance or the distance evaluation technique to represent fault features extracts only specific feature information, and most of the information is lost. This leads to incomplete feature information extraction and affects the accuracy of diagnosis.

Fig. 7. Accuracies of different feature extraction methods.

5. Conclusion

This paper proposed a fault diagnosis method based on NA-MEMD and MI-LightGBM for shipboard MVDC power systems that simplified complicated problems and overcame the shortcomings of existing methods, such as reliance on relays, slow training of the classifier model, and a cumbersome application process. In particular, MI-LightGBM was proposed to address the imbalance of training samples in actual engineering. In the new method, the first step was to simultaneously pre-process the voltage signals of the synchronous motor and DC bus by using NA-MEMD, which decomposed the multi-channel signals into IMFs that contained local features of the signals. Next, the energy moments of the IMFs were calculated as fault feature vectors. Finally, the fault feature vectors were used as input data to train the MI-LightGBM-based classifier and obtain the fault diagnosis model. In the experiment, a shipboard MVDC power system was established using the AppSIM Real-Time Simulator. Fault feature extraction and fault classifier design were implemented using Python, and the results were compared with those of other fault diagnosis methods. The experimental results revealed that the proposed method achieves fast training while maintaining classifier performance. It has strong generalization ability and is more suitable for engineering applications. The main contributions of this study are summarized as follows:

1) NA-MEMD was successfully employed to pre-process voltage signals. It is still a relatively new algorithm with few applications in the area of fault diagnosis. NA-MEMD is an adaptive signal decomposition method that does not involve the selection of basis functions. It improves the accuracy of fault feature extraction and facilitates the next step of feature analysis.

2) The existing problems in fault diagnosis of a shipboard MVDC power system were addressed in this study by adopting a state-of-the-art machine learning method. MI-LightGBM was used as the fault classifier, which achieves higher accuracy with shorter training time.

3) The MI-LightGBM method was proposed by improving the LightGBM objective function. This method maintains good classification performance when fault samples and normal samples are imbalanced. It addresses the difficulty of obtaining fault samples in actual engineering. Therefore, the fault diagnosis method proposed in this study is more suitable for engineering applications.

4) The fault diagnosis method proposed in this study has high accuracy and strong robustness. It does not need to rely on many relays and is not affected by the impedance of the motor or the cable inductances. It has a strong generalization capability and may be applied to other systems in the future.

CRediT authorship contribution statement

Sheng Liu: Investigation, Resources, Supervision, Funding acquisition. Yue Sun: Conceptualization, Methodology, Software, Validation, Writing - original draft, Writing - review & editing. Lanyong Zhang: Formal analysis, Project administration. Peng Su: Data curation, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China subsidization project (51579047), the Natural Science Foundation of Heilongjiang Province (QC2017048), the Natural Science Foundation of Harbin (2016RAQXJ077), and the fundamental research funds for the central universities (3072019CF407). Miss Yue Sun acknowledges the financial support from Chinese Scholarship Council (CSC) under grant 201906680084.
Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ijepes.2020.106399.

References

[1] Amy NDJ. DC voltage interface standards for naval applications. 2015 IEEE Electric Ship Technologies Symposium (ESTS); 2015.
[2] Zohrabi N, Shi J, Abdelwahed S. An overview of design specifications and requirements for the MVDC shipboard power system. Int J Electr Power Energy Syst 2019;104:680–93.
[3] Li H, Li W, Luo M, Monti A, Ponci F. Design of smart MVDC power grid protection. IEEE Trans Instrum Meas 2011;60(9):3035–46.
[4] Satpathi K, Ukil A, Pou J. Short-circuit fault management in DC electric ship propulsion system: protection requirements, review of existing technologies and future research trends. IEEE Trans Transp Electrif 2018;4(1):272–91.
[5] Lima ÉM, dos Santos Junqueira CM, Brito NSD, de Souza BA, de Almeida Coelho R, Meira Suassuna de Medeiros Gayoso. High impedance fault detection method based on the short-time Fourier transform. IET Gener Transm Distrib 2018;12(11):2577–84.
[6] Silva KM, Souza BA, Brito NSD. Fault detection and classification in transmission lines based on wavelet transform and ANN. IEEE Trans Power Delivery 2006;21(4):2058–63.
[7] Baran ME, Mahajan NR. Overcurrent protection on voltage-source-converter-based multiterminal DC distribution systems. IEEE Trans Power Delivery 2007;22(1):406–12.
[8] Monadi M, Gavriluta C, Luna A, Candela JI, Rodriguez P. Centralized protection strategy for medium voltage DC microgrids. IEEE Trans Power Delivery 2017;32(1):430–40.
[9] Fu NKCY. ANN-based fault classification and location in MVDC shipboard power systems. 2011 North American Power Symposium; 2011.
[10] Li W, Monti A, Ponci F. Fault detection and classification in medium voltage DC shipboard power systems with wavelets and artificial neural networks. IEEE Trans Instrum Meas 2014;63(11):2651–65.
[11] Li WL, Luo M, Monti A, Ponci F. Ieee, Wavelet based Method for Fault Detection in Medium Voltage DC Shipboard Power Systems. In: 2012 Ieee International Instrumentation and Measurement Technology Conference (IEEE Instrumentation and Measurement Technology Conference); 2012. p. 2155–160.
[12] Erişti H, Uçar A, Demir Y. Wavelet-based feature extraction and selection for classification of power system disturbances using support vector machines. Electr Power Syst Res 2010;80(7):743–52.
[13] Ur Rehman N, Park C, Huang NE, Mandic DP. Emd Via Memd: multivariate noise-aided computation of standard Emd. Adv Adapt Data Anal 2013;05(02):1350007.
[14] Telford RD, Galloway S, Stephen B, Elders I. Diagnosis of series DC arc faults—a machine learning approach. IEEE Trans Ind Inf 2017;13(4):1598–609.
[15] Mishra M, Rout PK. Detection and classification of micro-grid faults based on HHT and machine learning techniques. IET Gener Transm Distrib 2018;12(2):388–97.
[16] Lv Y, Yuan R, Song G. Multivariate empirical mode decomposition and its application to fault diagnosis of rolling bearing. Mech Syst Sig Process 2016;81:219–34.
[17] Guolin Ke QM, Finley Thomas, Wang Taifeng, Chen Wei, Ma Weidong, Ye Qiwei et al. Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inform Process Syst 2017;3147–3155.
[18] Rao H, et al. Feature selection based on artificial bee colony and gradient boosting decision tree. Appl Soft Comput 2019;74:634–42.
[19] Fei Li LZ, Chen Bin, Gao Dianzhu, Cheng Yijun, Zhang Xiaoyong, Yang Yingze et al. A light gradient boosting machine for remainning useful life estimation of aircraft engines. In: IEEE Conference on Intelligent Transportation Systems; 2018. p. 3562–3567.
[20] Andrus M BM, Crider J, Ouroua H, Santi E, Sudhoff S. Notional system models. Technical Report, Electric Ship Research and Development Consortium (ESRDC), December 2013.
[21] Andrus M BM, Crider J, Ouroua H, Santi E, Sudhoff S. Notional system report. Electric Ship Research and Development Consortium (ESRDC); June 2014.
[22] Bin GF, Gao JJ, Li XJ, Dhillon BS. Early fault diagnosis of rotating machinery based on wavelet packets-empirical mode decomposition feature extraction and neural network. Mech Syst Sig Process 2012;27:696–711.
[23] Ali MZ, Shabbir MNSK, Liang X, Zhang Y, Hu T. Machine learning-based fault diagnosis for single- and multi-faults in induction motors using measured stator currents and vibration signals. IEEE Trans Ind Appl 2019;55(3):2378–91.
[24] Gohari M, Eydi AM. Modelling of shaft unbalance: modelling a multi discs rotor using K-Nearest neighbor and decision tree algorithms. Measurement 2020;151:107253.
[25] Xu Xianzhen, Cao Dan, Zhou Yu, Gao Jun. Application of neural network algorithm in fault diagnosis of mechanical intelligence. Mech Syst Sig Process 2020;141:106625. https://doi.org/10.1016/j.ymssp.2020.106625.
[26] Chen HR, et al. Islanding detection method of distribution generation system based on logistic regression. J Eng-Joe 2019:2296–300. Article no. 16.
[27] Chen T, Carlos Guestrin. XGBoost: a scalable tree boosting system. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13-17-August-2016, pp. 785-794, 2016.