
Accepted Manuscript

Title: Premature Ventricular Contraction Detection Combining Deep Neural Networks and Rules Inference

Authors: Zhou Fei-yan, Jin Lin-peng, Dong Jun

PII: S0933-3657(16)30566-8
DOI: http://dx.doi.org/doi:10.1016/j.artmed.2017.06.004
Reference: ARTMED 1531

To appear in: ARTMED

Received date: 19-12-2016


Revised date: 3-6-2017
Accepted date: 7-6-2017

Please cite this article as: Zhou Fei-yan, Jin Lin-peng, Dong Jun. Premature Ventricular
Contraction Detection Combining Deep Neural Networks and Rules Inference. Artificial
Intelligence in Medicine http://dx.doi.org/10.1016/j.artmed.2017.06.004

This is a PDF file of an unedited manuscript that has been accepted for publication.
As a service to our customers we are providing this early version of the manuscript.
The manuscript will undergo copyediting, typesetting, and review of the resulting proof
before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that
apply to the journal pertain.
Premature Ventricular Contraction Detection
Combining Deep Neural Networks and Rules Inference
ZHOU Fei-yan (a, b), JIN Lin-peng (a, b), DONG Jun (a, *)
(a. Suzhou Institute of Nano-tech and Nano-bionics, Chinese Academy of Sciences, Suzhou, Jiangsu 215123, China;

b. University of Chinese Academy of Sciences, Beijing 100049, China)


* Corresponding author. E-mail: jdong2010@sinano.ac.cn

Highlights

We proposed a new approach combining deep neural networks and rule inference for PVC
detection.

We combined LCNN and LSTM for PVC detection.

We used rules inference for detecting the recordings or beats that were classified into non-PVC
class and PVC class by the method with the LCNN and LSTM.
The results obtained by our method represented a performance improvement with respect to
some published methods.

Abstract: Premature ventricular contraction (PVC), which is a common form of cardiac arrhythmia caused
by ectopic heartbeat, can lead to life-threatening cardiac conditions. Computer-aided PVC detection is of
considerable importance in medical centers or outpatient ECG rooms. In this paper, we proposed a new approach
that combined deep neural networks and rules inference for PVC detection. The detection performance and
generalization were studied using publicly available databases: the MIT-BIH arrhythmia database (MIT-BIH-AR)
and the Chinese Cardiovascular Disease Database (CCDD). The PVC detection accuracy on the MIT-BIH-AR
database was 99.41%, with a sensitivity and specificity of 97.59% and 99.54%, respectively, which were better
than the results from other existing methods. To test the generalization capability, the detection performance was
also evaluated on the CCDD. The effectiveness of the proposed method was confirmed by the accuracy (98.03%),
sensitivity (96.42%) and specificity (98.06%) obtained on a dataset of over 140,000 ECG recordings of the CCDD.

Index Terms: Detection, Premature Ventricular Contraction (PVC), Deep Neural Networks, Rules Inference

1 Introduction
An electrocardiogram (ECG) reflects the heart's activity and provides a large amount of
information about the state of the heart. Thus, an ECG is extremely useful as a diagnostic tool in
clinical practice. A typical heartbeat in an ECG signal contains four basic waveforms: the P-wave,
the QRS complex, the T-wave, and the U-wave[1]. Physicians can correctly diagnose heart diseases
based on some of the characteristic parameters, such as RR interval, ST segment, QRS complex
duration, and QRS complex morphology, and so on.
Premature ventricular contraction (PVC) is a relatively common form of cardiac arrhythmia
disease caused by the existence of ectopic centers in the ventricles that change the path of
propagation of the activation front and that lead to the generation of QRS complexes with wide
and bizarre waveforms[2]. An ECG recording of a patient who has PVC from the Chinese
Cardiovascular Disease Database[3] (CCDD, http://58.210.56.164:88/ccdd/) is shown in Fig. 1.
The presence of PVC has also been shown to be associated with an increased total mortality in
some patient subgroups, which suggests that a high frequency of PVCs is a marker of a more
severe disease process rather than the provocateur of a terminal electrical event[4]. Therefore,
accurate and rapid detection of PVC is of great clinical significance. Computer-aided ECG
analysis can greatly reduce the clinical workload of physicians and improve their diagnosis
efficiency and the quality of their medical services. In recent years, many methods that concern
computer-aided PVC detection have been presented by various scholars[5-9]. Based on a previously
developed automatic classifier and a clustering method, Llamedo et al.[5] used a sequential floating
feature selection (SFFS) method to enhance the Bayesian classifier for ECG heartbeat detection.
The method for the class ventricular ectopic beat (V) achieved a sensitivity of 82.94% and a
positive predictivity of 87.97% for 22 recordings of the MIT-BIH-AR[10] database, which is
an internationally recognized standard database provided by the Massachusetts Institute of
Technology. Li et al.[6] presented a low-complexity data-adaptive approach for PVC recognition
based on template matching and achieved a sensitivity of 93.12% and a positive predictivity of
81.44%. Zhang et al.[7] proposed a novel disease-specific feature selection method for heartbeat
classification with ECG data by introducing the one-versus-one (OvO) combination method using
a series of support vector machine (SVM) binary classifiers. Using 22 recordings of the
MIT-BIH-AR database, the results achieved by their method for class V were 85.48% of
sensitivity and 92.75% of positive predictivity. Angel et al.[8] used reservoir computing to classify
the heartbeats of the MIT-BIH-AR database into five classes. The method for class V obtained a
sensitivity of 96.06% and a positive predictivity of 99.49% for 22 recordings. Zarei et al.[9] found
that replacing a non-PVC beat with a PVC beat has a larger effect on the principal directions
than replacing it with another non-PVC beat, and they proposed an online PVC
detection method based on the variation in principal directions under this replacing strategy. Their classifier
was tested on 22 recordings of the MIT-BIH-AR database and obtained a sensitivity of 96.12%
and a positive predictivity of 86.48%.
There are several problems in the PVC detection that appear in the research studies cited above,
and some of these problems can be summarized as follows:
1) The presence of noise, such as baseline wandering, power line interference and muscle noise,
makes it difficult to extract some features, which can degrade the PVC
classification performance.
2) Most of these methods were evaluated on an ECG dataset with only a small number of ECG
recordings or used overlapping training and testing datasets. Their efficiency over a
large number of recordings is, in general, a difficult problem to address.
3) Because ECG commonly presents high inter- and intra-patient variability, both in
morphology and timing, achieving a simple classifier with high accuracy over a large
number of patients is a very difficult problem to address [2].
In this paper, an approach based on deep neural networks, such as lead convolutional neural
network (LCNN)[11] or long short-term memory (LSTM) network [12], and rules inference for PVC
detection was proposed, which avoided some of the above limitations. Unlike some other models,
LCNN and LSTM are both able to seamlessly learn features from the input signals in a supervised
fashion. Since both deep neural networks and ensemble learning have advantages in constructing
complicated nonlinear functions, a combination of the two can better handle difficult artificial
intelligence tasks[13]. Therefore, we also combined LCNN and LSTM for PVC detection in this
work. To further enhance the PVC detection performance, we took PVC disease characteristics
such as premature, ectopic, wide QRS complexes and compensatory pause into account and used
rules inference in this paper. We will use the MIT-BIH-AR database and the CCDD to test the
validity of our proposed method.
2 ECG Databases
2.1 MIT-BIH-AR database
The MIT-BIH-AR database is regarded as the benchmark database in arrhythmia detection and
classification and has been extensively used for method validation. In this paper, this baseline
database was also used for performance evaluation of our proposed method, which allowed a
comparison with other published results. This database contains 48 two-lead long-term ECG
recordings from 47 subjects, each of which is a 30-minute segment selected from 24-hour
recordings and sampled at 360 Hz.
2.2 CCDD
CCDD has been established by our group and, in addition, aims at supporting other research
groups that work in computer-aided ECG analysis[3]. CCDD consists of approximately 200,000
short-term ECG recordings, and each recording has its own diagnostic result. The recordings of
CCDD are all 12-lead ECGs with approximately 10~30 seconds in duration, each digitized at 500
Hz. There are 251 recordings in our database that have detailed annotation features, including
typical fiducial points (onset and offset of P and QRS, onset and offset of T waves), the
morphology feature of QRS complexes and beat diagnosis results, among others. Compared with
the study of computer-aided ECG detection, little attention has been paid to the development of a
standard ECG database, which can hinder further research, especially for practical application[3].
The primary difference between the CCDD and other databases is that its data content is dynamic;
in other words, the data is enhanced continuously on the basis of feedback from other parts of the
platform and thus provides more useful annotation features[14]. All of the recordings are collected
from the real clinical environment with high quality and without any artificial effort, and as a
result, it can well reflect the detection performance of the proposed methods in practical
applications.
3 Proposed framework
3.1 Method for PVC detection in CCDD
In clinical applications, short-term ECG signals with approximately 10~30 seconds of duration
are usually used in medical centers or outpatient ECG rooms. Many papers on ECG recognition
exist, but little of this research has been tested on more than a standard dataset (the MIT-BIH-AR
database, or in some cases, only part of it), or on a satisfactory real application[14]. To address
these problems, a method that combined LCNN, LSTM and rules inference was proposed for the
whole CCDD in this work. The framework of PVC detection on the CCDD is shown in Fig. 2.
A raw ECG recording of CCDD as an input signal was fed into the framework in this work.
First, the ECG recording was preprocessed by a bandpass filter with a passband from 0.5 to 40 Hz
to reduce the ECG noise. Then, we used LCNN and LSTM as base classifiers. The ECG recording
was classified in parallel using two sets of classifiers. From the M1 LCNN classifiers and the N2
LSTM classifiers, we selected m classifiers and n classifiers, respectively, using a selective ensemble learning method.
After the selective ensemble, we needed to combine the m detection results and the n detection
results in such a way that the classification of the ECG recording into a non-PVC or PVC class
was achieved. As shown in Fig. 2, the computer-aided diagnosis results of the entire recording
were obtained by the selective ensemble method, and the whole detection process was fully
automatic. After the recordings were classified into the non-PVC class or PVC class by the
ensemble learning method, the method with rules inference was used for detection in these
recordings to further enhance the PVC detection performance. In practical applications, it is
very important to assess the ECG signal quality before applying the PVC detection method based on rules
inference: although the bandpass filter can reduce most of the noise in the ECG signals,
it cannot remove artifacts such as lead-off or large baseline drift.
3.1.1 ECG preprocessing
The preprocessing of raw ECG signals is necessary because they are often contaminated by
different types of noise, such as power line interference, patient-electrode motion artifacts, and
baseline wandering. To improve the signal-to-noise ratio (SNR), which can be beneficial to the
subsequent fiducial point (e.g., the location of the R peak) detection and heartbeat classification,
the ECG signals were preprocessed by a bandpass filter in this paper. The reason is that most of
the energy of ECG signals is concentrated between 0.5 and 40 Hz, and the bandpass filter has
many advantages, such as small size and high reliability. Thus, we chose a bandpass filter with
a passband from 0.5 to 40 Hz to reduce the noise that was present in the signals in this study. Then,
the filtered ECG signals were fed into the next step for further processing.
3.1.2 LCNN
Traditional CNNs, which are specifically designed to handle the variability of two-dimensional (2D)
shapes, have been successfully applied to deep learning tasks such as object detection in
natural images, face recognition, and image detection, among others. A CNN commonly contains
three main types of layers: convolutional layers, pooling layers and fully connected layers. Jin et
al.[11] proposed an LCNN method for ECG classification. In fact, the structure of LCNN is a
one-dimensional (1D) CNN. There are four key ideas behind LCNN that take advantage of the
properties of natural signals: local connections, shared weights, pooling and the use of many
layers[16]. In this paper, LCNN is composed of 9 layers: an input layer, three convolutional layers,
three max-pooling layers, a fully connected layer and a softmax layer. Units in a convolutional
layer or a pooling layer are organized into feature maps, within which each unit is connected to
local patches in the feature maps of the previous layer[16]. The role of the convolutional layer is to
detect local conjunctions of features from the previous layer, and the role of the pooling layer is to
merge semantically similar features into one[16]. Finally, the features from the fully connected
layers are used for classification in the logistic regression layer.
3.1.3 LSTM
Recurrent neural network (RNN) is a powerful type of deep neural network that is designed to
handle sequence dependence[16]. LSTM is an improved RNN architecture in which the standard
recurrent hidden layer is replaced with a recurrent LSTM layer. A recurrent LSTM layer
consists of recurrently connected memory blocks, each of which contains one or more special
units known as memory cells along with three multiplicative gate units: the input, output, and
forget gates[17]. As shown in Fig. 3, one LSTM memory block contains one memory cell. The
gates of these cells are normal units and control the flow of information into and out of each
cell[18]. When the input gate is open, the central unit's value is replaced by the output activation of
the net input unit. When the output gate is open, information flows out into the network, and when
the forget gate is open, the cell's memory is reset to zero[18]. The cell input is multiplied by the
activation of the input gate, the cell output by that of the output gate, and the previous cell values
by the forget gate[17]. In this paper, the LSTM model is composed of an input layer, a recurrent
LSTM layer and an output layer.
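The gating described above can be illustrated with a minimal sketch of one time step of a single memory cell. It follows the widely used formulation with sigmoid gates and a tanh cell input, which may differ in detail from the variant used in this paper; the parameter names (`w_i`, `u_i`, `b_i`, and so on) are hypothetical. In this formulation, a forget-gate activation near 0 discards the previous cell value, and an activation near 1 keeps it.

```python
import math

# Hypothetical parameter names for a scalar, single-cell LSTM block.
PARAM_NAMES = ("w_i", "u_i", "b_i", "w_f", "u_f", "b_f",
               "w_o", "u_o", "b_o", "w_g", "u_g", "b_g")


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step: returns the new hidden output h and cell value c."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # cell input
    c = f * c_prev + i * g   # previous cell value scaled by the forget gate
    h = o * math.tanh(c)     # cell output scaled by the output gate
    return h, c
```

The cell update `c = f * c_prev + i * g` is additive, which is the property that lets gradients flow through the recurrent connection without being repeatedly squashed.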

Because the recurrent connections do not go through nonlinear transforms in the LSTM, the
gradients will be backpropagated smoothly through the recurrent connections using the stochastic
gradient descent[19]. Thus, LSTM can overcome the vanishing gradient problem of standard RNN
and enable learning from sequences. Unlike a traditional RNN, an LSTM network is well-suited to
learn from experience to classify, process and predict time series when there are long time lags of
unknown sizes between important events[20]. This capability is one of the main reasons why
LSTM outperforms other sequence learning methods in many applications, such as achieving the
best known performance in speech recognition[20]. We utilized the LSTM network as a base classifier in
this paper. We used Theano-0.6rc [21-22], a Python library, to implement the
LCNN and LSTM. When the LSTM model was implemented with Theano-0.6rc, we needed to set
a fixed length that was smaller than the length of the input sequence. The input
sequence was then split into segments of this fixed length. For example,
with a fixed length of 200 sampling points, the input sequence was split into
segments that started from the 1st, 201st, 401st, and 601st sampling points, and so on.
The LSTM model would learn the context of these segments, then fuse their information
and finally give the classification result of the input sequence. In this paper,
we denoted the fixed length as Len.
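The segmentation by Len can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the text does not state how a final segment shorter than Len is handled, so the sketch simply keeps it.

```python
def segment_starts(seq_len, fixed_len):
    """1-based start positions of consecutive, non-overlapping segments."""
    return [s + 1 for s in range(0, seq_len, fixed_len)]


def split_sequence(samples, fixed_len):
    """Split a sequence into consecutive segments of at most fixed_len samples.

    Assumption: a shorter trailing segment is kept as-is.
    """
    return [samples[s:s + fixed_len] for s in range(0, len(samples), fixed_len)]
```

With Len = 200, `segment_starts` reproduces the start positions 1, 201, 401 and 601 given in the example above.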
3.1.4 Selective ensemble
Ensemble detection refers to a collection of methods that learn a target function by training a
number of base classifiers. Base classifiers refer to individual classifiers that are used to construct
the ensemble classifiers[23]. Ensemble learning methods have often demonstrated significantly
better performance than single classifiers[24]. Zhou et al.[25] analyzed the relationship between the
ensemble and its component neural networks from the context of both regression and detection
and proved that ensembling many of the component networks could be better than ensembling all of them. In addition, they presented a new framework called
selective ensemble. Selective ensemble chooses a subset of base classifiers for the final prediction.
The advantages of selective ensembles over traditional ensembles lie in a smaller ensemble size
and potentially better generalization ability, so the selective ensemble is believed to
be much more effective than a single classifier and the traditional ensemble system[26]. The subset
of classifiers used in the selective ensemble for a specific dataset should have good prediction
performance and sufficient diversity between individual classifiers. In addition, the detection
error of each base classifier should be less than 0.5; otherwise, the error rate of the ensemble
results will increase[27]. Therefore, we selected appropriate classifiers whose detection errors
were less than 0.5 to achieve good ensemble performance.
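As an illustration of selecting a subset of base classifiers, the following sketch performs a greedy forward selection on a validation set. The selection criterion is an assumption made for this example (validation accuracy of the averaged PVC probabilities); the text does not spell out the exact criterion used by the authors, and all function names are hypothetical.

```python
def fuse(prob_lists):
    """Average the PVC probabilities of several classifiers, sample by sample."""
    n = len(prob_lists)
    return [sum(col) / n for col in zip(*prob_lists)]


def accuracy(probs, labels):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def greedy_select(candidates, labels, max_size):
    """candidates maps a classifier name to its P(PVC) on each validation sample."""
    # Keep only base classifiers whose individual error is below 0.5.
    pool = {k: v for k, v in candidates.items()
            if 1.0 - accuracy(v, labels) < 0.5}
    chosen, best = [], 0.0
    while pool and len(chosen) < max_size:
        # Score every remaining classifier by the accuracy of the enlarged ensemble.
        scored = [(accuracy(fuse([candidates[c] for c in chosen] + [v]), labels), k)
                  for k, v in pool.items()]
        score, name = max(scored, key=lambda s: s[0])
        if chosen and score <= best:
            break  # stop when adding a classifier no longer helps
        chosen.append(name)
        best = score
        del pool[name]
    return chosen, best
```

Two classifiers that make different mistakes can be selected together and outperform either one alone, which is the diversity argument made above.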
First, M1 LCNNs were used to classify the recordings of CCDD into two classes in parallel.
Then, we selected m classifiers from the M1 classifiers, where the m classifiers were a subset of
the M1 classifiers. Finally, the predictions of the m classifiers were fused with some rules. There
are a number of classifier combination schemes, such as the sum rule, product rule, max rule, and
so on. In this paper, the representation of an ECG recording was denoted by x. For the binary
classification task, the non-PVC and PVC classes were denoted as 0 and 1, respectively.
The final probability estimates $P(y = i \mid x)$, $i = 0, 1$, could be
calculated based on the individual probabilities $P_l(y = i \mid x)$, $l = 1, 2, \ldots, m$, which were
obtained from the corresponding individual classifiers. Using the sum rule as the fusion strategy,
we obtained the final probability estimates by LCNNs as follows:
$$ P_{LCNN}(y = i \mid x) = \frac{\sum_{l=1}^{m} P_l(y = i \mid x)}{\sum_{j=0}^{1} \sum_{l=1}^{m} P_l(y = j \mid x)} \tag{1} $$

where x was fed into the lth individual classifier. Therefore, the probability of the recording being
from the ith class was given as the joint probability of the m classifiers being from the ith class. In
this paper, the joint probability was normalized into the range of [0,1] to be a valid probability
estimate. Similarly, we also used the sum rule as the fusion strategy and obtained the final
probability estimates by LSTMs as follows:
$$ P_{LSTM}(y = i \mid x) = \frac{\sum_{k=1}^{n} P_k(y = i \mid x)}{\sum_{h=0}^{1} \sum_{k=1}^{n} P_k(y = h \mid x)} \tag{2} $$

With linear ensemble learning, we combined the two probability estimates mentioned above[28]:

$$ P(y = i \mid x) = 0.5 \cdot \big( P_{LCNN}(y = i \mid x) + P_{LSTM}(y = i \mid x) \big) \tag{3} $$

Thus, the winning class t (t = 0, 1) was the class with the higher final probability
estimate, given by
$$ t = \arg\max_{i} P(y = i \mid x) \tag{4} $$
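The fusion in formulas (1) to (4) can be traced with a small sketch. The function names and the example probabilities in the usage note are made up for illustration; each classifier is assumed to output a pair (P(y=0|x), P(y=1|x)).

```python
def sum_rule(prob_pairs):
    """Formulas (1)/(2): sum per-class probabilities and normalize to [0, 1]."""
    s0 = sum(p[0] for p in prob_pairs)
    s1 = sum(p[1] for p in prob_pairs)
    total = s0 + s1
    return (s0 / total, s1 / total)


def fuse_and_decide(lcnn_pairs, lstm_pairs):
    """Formula (3): average the two fused estimates; formula (4): pick the argmax."""
    p_lcnn = sum_rule(lcnn_pairs)
    p_lstm = sum_rule(lstm_pairs)
    p = tuple(0.5 * (a + b) for a, b in zip(p_lcnn, p_lstm))
    t = max((0, 1), key=lambda i: p[i])
    return p, t
```

For example, two LCNN outputs (0.8, 0.2) and (0.6, 0.4) fuse to (0.7, 0.3), two LSTM outputs (0.2, 0.8) and (0.3, 0.7) fuse to (0.25, 0.75), and the linear combination gives roughly (0.475, 0.525), so the winning class is 1 (PVC).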

3.1.5 ECG signal quality


Quality estimation of ECG signals is a challenging problem. A non-heart signal is called a
lead-off. First, we needed to detect the existence of a lead-off in the 12-lead ECG. An ECG signal
would be discarded when it was a lead-off. Otherwise, using the signal to noise ratio (SNR), we
judged whether there was a large amount of noise in the ECG signal. In this work, we used the
SNR evaluation method proposed in the literature [29] and the ECG feature extraction method
provided in the literature [30]. We judged whether there was lead-off or large baseline drift by the
signal quality judgment method described in the literature [31]. If the amplitudes of an ECG
recording were constant, then the ECG recording was lead-off. As shown in Fig. 4, an ECG
recording of CCDD was lead-off. The baseline drift judgement index was the maximum amplitude
of the onsets of several QRS complexes minus the minimum amplitude of these onsets. If the
baseline drift judgement index was larger than th1, then there was a large baseline drift in the ECG
recording. Here, th1 was an empirical threshold value, which we set to 1.5 mV in this paper.
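The two quality checks described above can be sketched as follows. This is a simplified illustration, not the method of [31]: the function names and the tolerance `tol` are hypothetical, amplitudes are assumed to be in millivolts, and the QRS onset indices are assumed to be supplied by a separate feature extractor.

```python
def is_lead_off(samples_mv, tol=0.0):
    """Treat a lead whose amplitude is constant (within tol) as lead-off."""
    return max(samples_mv) - min(samples_mv) <= tol


def baseline_drift_index(samples_mv, qrs_onsets):
    """Maximum minus minimum amplitude over the given QRS onset positions."""
    onset_amps = [samples_mv[i] for i in qrs_onsets]
    return max(onset_amps) - min(onset_amps)


def has_large_drift(samples_mv, qrs_onsets, th1=1.5):
    """Flag a large baseline drift when the index exceeds th1 (1.5 mV in the text)."""
    return baseline_drift_index(samples_mv, qrs_onsets) > th1
```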
3.1.6 Rules inference
PVC can occur as an isolated single extra cardiac beat or in sequence with another to cause
serious arrhythmias such as ventricular tachycardia (VT)[32]. Thus, if there are one or more PVC
beats in an ECG signal, the ECG signal is classified into the PVC class. As seen in Fig. 1, the QRS
complex of the PVC beat is bizarre in shape and is very different from other normally shaped QRS
complex, and the amplitude of the QRS complex is much larger or smaller than normal beats[33].
PVC starts with a shorter pulse, which means that the RR interval between the PVC and the
previous normal beat is shorter than the average beat-to-beat interval [34]. That circumstance is
labeled prematurity and is usually followed by a compensatory pause, which also entails a
fluctuation compared with the normal distance between consecutive beats (the distance between
the PVC and the next normal beat is greater than the mean RR interval)[34]. With these PVC
disease characteristics, RR intervals, width of QRS complex, QRS similarity and amplitude of
QRS complex were considered, and rules inference designed on the basis of these features was
used to identify PVC. In this work, we utilized RR interval to judge whether there was a
premature QRS complex in an ECG signal. Previous work in the literature [15] used a set of rules
that were provided by medical experts and were based on clinical procedures for detecting
arrhythmic events from the RR intervals. We used the condition of whether the RR interval of the
ECG signal of interest had a shorter duration compared to the average of the previous RR intervals
(RRm) or not. Then, while considering the ECG waveform morphological differences between
PVC and non-PVC beats, two simple beat-by-beat template-matching processes were employed.
In theory, the correlation coefficient with the template is very low (below 0.5) for a PVC
beat and high otherwise. Thus, we needed to check the QRS similarity. Afterward, we added a
condition on whether the width of the QRS complex of interest was greater than the average
width of normal QRS complexes. In Fig. 2, AWidth was the average width of the QRS
complexes of the most recent several non-PVC beats. In addition, the parameters w1, w2, w3 and
w4 were obtained according to experience. Finally, we checked whether the amplitude of the QRS
complex of interest was much larger or much smaller than the average of the most recent several
non-PVC beats. Here, features such as the R wave, onset and offset of QRS complex were given
by the method proposed in the literature [31].
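The four feature checks described above (prematurity, QRS similarity, QRS width and QRS amplitude) can be sketched as independent tests on one beat. This is a hedged illustration only: the way the checks are combined is defined in Fig. 2 and is not reproduced here, and the ratios `rr_ratio`, `width_ratio` and `amp_ratio` are hypothetical placeholders rather than the empirical parameters w1 to w4. Only the 0.5 correlation threshold comes from the text. Amplitudes are assumed to be positive magnitudes.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a flat segment carries no shape information
    return cov / math.sqrt(va * vb)


def beat_checks(rr, rr_mean, qrs, template, width, avg_width, amp, avg_amp,
                rr_ratio=0.85, corr_th=0.5, width_ratio=1.2, amp_ratio=1.5):
    """Return one boolean per PVC characteristic for the beat of interest."""
    return {
        "premature": rr < rr_ratio * rr_mean,
        "dissimilar": pearson(qrs, template) < corr_th,
        "wide": width > width_ratio * avg_width,
        "abnormal_amplitude": amp > amp_ratio * avg_amp or amp < avg_amp / amp_ratio,
    }
```

Returning the individual flags keeps the sketch neutral about the decision logic, so any combination rule can be applied on top of it.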
3.2 Method for PVC detection in the MIT-BIH-AR Database
According to the Association for the Advancement of Medical Instrumentation (AAMI) [35]
standard, the 4 recordings that contain paced beats were excluded in our experimental evaluation
process because these beats did not retain sufficient signal quality for reliable processing[9]. The
remaining 44 recordings were divided into two datasets (DS1 and DS2), with each dataset
containing ECG data from 22 recordings[9]. In this work, we also used the same dataset division
scheme used in [9] for comparison purposes, with DS1 for training and DS2 for testing.
Because the data of the MIT-BIH-AR database is a long-term ECG recording, the detection
process based on the MIT-BIH-AR database was slightly different from the detection process
based on the CCDD. The MIT-BIH-AR database contains only 48 ECG recordings from 47
subjects. Its complexity is less than that of the CCDD. We did not consider the ECG signal quality
processing and the width of QRS complex in this part. The primary difference was that the inputs
of LCNNs and LSTMs on the MIT-BIH-AR database were beats, and the ECG recordings must be
segmented into individual heartbeats based on the positions of the R peak after preprocessing. Fig.
5 described the stages of our proposed system for detection of PVC beats on the MIT-BIH-AR
database.
Similar to the method used on the CCDD, raw ECG recording of the MIT-BIH-AR database
was first preprocessed by a bandpass filter with a passband from 0.5 to 40 Hz. Before classifying
the heartbeats, the ECG recording must be segmented into individual heartbeats based on the
positions of the R peak. An R-peak detection method based on peaks of Shannon energy
envelope[36] was used to detect the location of the R peak. To perform a comparison with other
results in the literature, the ECG signal was also segmented as in [9]. Given the sampling rate of
360 Hz, each heartbeat segment consisted of 50 sampling points (138.89 ms) before the R
peak location and 99 sampling points (275 ms) after the R peak, i.e., a total of 150
sampling points including the R peak itself, which corresponded to 416.67 ms. Fig. 6 depicts a segment extracted
from patient recording 101.
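The beat window can be checked with a short sketch. The function names are hypothetical, and skipping beats whose window would run past the ends of the recording is an assumption made here, since the text does not say how such beats are handled.

```python
def extract_beat(signal, r_index, before=50, after=99):
    """Return the samples from r_index - before to r_index + after, inclusive.

    Assumption: beats too close to either end of the recording are skipped (None).
    """
    start, stop = r_index - before, r_index + after + 1
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]


def samples_to_ms(n_samples, fs=360):
    """Duration in milliseconds of n_samples at sampling rate fs (Hz)."""
    return 1000.0 * n_samples / fs
```

The window holds 50 + 1 + 99 = 150 samples, with the R peak at offset 50 inside the segment.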
LCNN and LSTM were also used as base classifiers of ensemble learning. The structures of
LCNN and LSTM were different from the structures used on the CCDD. As shown in Fig. 5, the
RR interval was also defined as the interval between the beat being processed and the previous beat.
RRa was determined by averaging several RR intervals surrounding
the beat being processed. The parameter w5 was also set empirically. In this part,
we also needed to check the QRS similarity by the correlation coefficients, and we checked
whether the amplitude of the processing QRS complex was much larger or much smaller than the
average amplitude of the several QRS complexes that surrounded the heartbeat.
4 Detection performance measures
The detection performance of the classifiers was measured using the four standard metrics
found in the literature [11]: accuracy (Acc), sensitivity (Se), specificity (Sp), and positive predictivity
(PPV). The confusion matrix of the classification is shown in Table 1.
These metrics were defined as follows:
$$ Acc = \frac{TP + TN}{TN + FP + TP + FN}, \quad Se = \frac{TP}{TP + FN} \tag{5} $$
$$ Sp = \frac{TN}{TN + FP}, \quad PPV = \frac{TP}{TP + FP} \tag{6} $$
It is easy to ask, but rather difficult to answer, which method should be chosen when one performs
better on one class and another performs better on the other class[38]. Here, we used Youden's
index γ to measure classification performance. The definition of γ is as follows[38]:
$$ \gamma = Se + Sp - 1 \tag{7} $$
where γ evaluates the method's ability to avoid failure, and a higher value of γ indicates a better
ability to avoid failure[38].
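The measures in this section can be computed directly from the four confusion-matrix counts. The sketch below uses a hypothetical function name, and the counts in the usage note are made up for illustration; they are not results from this paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, positive predictivity and Youden's index."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {
        "Acc": (tp + tn) / (tn + fp + tp + fn),
        "Se": se,
        "Sp": sp,
        "PPV": tp / (tp + fp),
        "Youden": se + sp - 1.0,
    }
```

With made-up counts TP = 90, FN = 10, TN = 880 and FP = 20, the accuracy is 0.97 and the sensitivity is 0.90, while the positive predictivity is only about 0.82 because the negatives outnumber the positives.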
5 Results and discussion
5.1 The detection results based on the CCDD

To evaluate the effectiveness of our proposed method in practical applications, our experiments
were conducted on the whole CCDD. The entire dataset was split into training and testing datasets.
There were a total of 5260 PVC recordings of the CCDD. The training dataset consisted of
non-PVC recordings and PVC recordings. We used 3112 PVC recordings as training samples.
Because the number of non-PVC recordings was much greater than the number of PVC recordings,
the quantity of non-PVC recordings was several times the quantity of PVC recordings on the
training dataset. Specifically, 35,840 recordings (including 3112 PVC recordings) were randomly
selected to constitute the training dataset. The remaining recordings (141,046 recordings) were
used as a testing dataset. The lengths of the ECG recordings of the CCDD were between 10 and
30 seconds. The LCNN models required a fixed input size. Thus, after reducing the ECG noise by
a bandpass filter with a passband from 0.5 to 40 Hz, the sampling rate of the ECG recordings was
downsampled to 200 Hz. Then, we used the same length with 9.5 seconds of each ECG recording
as the inputs of the LCNN models, and thus, the input size of the LCNN models was 1900 (i.e.,
9.5*200).

We set the total numbers of base classifiers (LCNN and LSTM) in the ensemble learning to
be M1=14 and N2=14. Then, we utilized the selective ensemble learning method based on a
greedy strategy to select the appropriate classifiers from these individual classifiers. The numbers
of these appropriate classifiers were 6 and 6, namely, m=6 and n=6. In addition, there was a large
variation in these appropriate classifiers. In this paper, the three convolutional kernel sizes were
1*18, 1*10 and 1*5 (1 denoted the size in the vertical direction, and 18 denoted the size in the
horizontal direction). The kernel size was 1*3 for all of the pooling layers. The number of feature
maps in the three convolutional layers was 15, 20 and 15, respectively. Each pooling layer had the
same number of feature maps as its previous convolutional layer. In addition, we set different
misclassification costs for each class in the cost function of the LCNN model. The misclassification
cost ratio of five of the six LCNN models was 3:1, i.e., the misclassification cost for the non-PVC
class was 3, and the misclassification cost for the PVC class was 1. At the same time, the initial
weight values in these five models were different from one another. The misclassification cost ratio of
the remaining LCNN model was 2:1. The numbers of memory cells in the recurrent LSTM layers
of the six LSTM models were 200, 300, 220, 128, 400 and 250. In addition, the corresponding
fixed lengths of their input signals (i.e., Len) were 150, 200, 180, 150, 300 and 180, respectively,
when the LSTM models were implemented with the Theano-0.6rc. LSTM can process input
signals of arbitrary length because of its special structure, and its input size was the length of the
ECG recordings.
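To make the LCNN layer sizes concrete, the sketch below propagates the input length of 1900 samples through the three convolution and pooling stages. It rests on assumptions that are not stated in the text: stride-1 convolution without padding, and non-overlapping pooling that drops any incomplete final window. Other border conventions would shift the numbers slightly. The function name is hypothetical.

```python
def lcnn_feature_lengths(input_len=1900, conv_kernels=(18, 10, 5), pool=3):
    """Length of each feature map after every convolution and pooling stage."""
    lengths = []
    n = input_len
    for k in conv_kernels:
        n = n - k + 1   # assumed: stride 1, no padding
        lengths.append(("conv", n))
        n = n // pool   # assumed: non-overlapping pooling, remainder dropped
        lengths.append(("pool", n))
    return lengths
```

Under these assumptions the lengths are 1883, 627, 618, 206, 202 and 67, so each feature map of the last pooling layer would hold 67 values.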

We adopted the sum rule to combine the independent results. Afterward, we combined the two
fusion results obtained by the selective ensemble learning method with formula (3). To enhance
the overall detection performance, we then applied rules inference to the recordings that the
ensemble learning method had classified as non-PVC class and PVC class. Table 2 shows the PVC
detection results of the four combination methods for 141,046 recordings of the CCDD. As seen
in Table 2, our proposed method, which combined LCNN, LSTM and rules inference, obtained a much
higher Youden's index than LCNN+LSTM alone (94.47% vs. 78.92%) and a slightly higher one than
LCNN + rules inference (93.30%) and LSTM + rules inference (94.39%). Since the problem of PVC
detection suffered from severe class imbalance, the accuracy alone did not provide sufficient
information to choose a method reliably. In this work, we therefore used Youden's index as the
measure, and its higher value indicated that our method was better at avoiding failure.
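
The measures reported in Table 2 can be recomputed from a confusion matrix laid out as in Table 1. The short sketch below does this for the LCNN + LSTM + rules inference row; the function name `binary_metrics` is ours, and Youden's index is taken in its standard form Se + Sp - 1, which matches the values in the tables.

```python
def binary_metrics(tn, fp, fn, tp):
    """Return Acc, Se, Sp, PPV and Youden's index as fractions in [0, 1]."""
    total = tn + fp + fn + tp
    se = tp / (tp + fn)   # sensitivity: PVC items detected as PVC
    sp = tn / (tn + fp)   # specificity: non-PVC items kept as non-PVC
    return {
        "Acc": (tp + tn) / total,
        "Se": se,
        "Sp": sp,
        "PPV": tp / (tp + fp),
        "Youden": se + sp - 1.0,
    }

# Counts from the LCNN + LSTM + rules inference row of Table 2.
m = binary_metrics(tn=136197, fp=2701, fn=77, tp=2071)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```
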
Finally, we compared our performance results with those of [39,40] (Table 3). Xu et al.[39]
presented a computer-aided PVC detection method. They created a dynamic-link library for their
method and provided it to us, so that the detection results for the recordings of the CCDD could
be obtained directly with this library. Our method achieved slightly higher performance than the
method of [39], which supported its validity. The computer-aided PVC detection
method in the literature [40] was our prior work, in which we combined LCNN and
diagnostic rules for PVC detection on the whole CCDD[40]. PVCs are characterized by the
premature occurrence of large bizarre-shaped QRS complexes, and the T wave is usually large and
opposite in direction to the major deflection of the QRS complexes. Thus, the method [40] based
on diagnostic rules used the premature QRS complexes, QRS width, T-wave direction and QRS
direction as feature parameters. Although most of the PVCs satisfied the abovementioned
characteristics, there were still many PVCs that were misclassified into the non-PVC class. From
Table 3, we can see that the proposed method in this paper achieved higher Acc, Sp and PPV, and
much higher Se and Youden's index, than the method in [40]. Compared to [40], the proposed method
was therefore better at avoiding failure. Despite the improved results presented in this work,
there was still room for improvement in the field, since the Sp, Se and PPV for the PVC
classification were 98.06%, 96.42% and 43.40%, respectively. These results suggest that other
features, classifiers or feature extraction strategies can be developed to improve the PVC
recognition performance.
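
The diagnostic features named above (premature QRS complex, QRS width, and T-wave direction relative to QRS direction) could be checked per beat roughly as follows. This is a hypothetical sketch: the thresholds `PREMATURITY_RATIO` and `WIDE_QRS_S`, the fields of `BeatFeatures` and the choice to require all three conditions at once are assumptions for illustration and are not taken from [40] or from the rules used in this paper.

```python
from dataclasses import dataclass

@dataclass
class BeatFeatures:
    rr_prev_s: float     # RR interval preceding this beat, in seconds
    rr_mean_s: float     # mean RR interval of the recording, in seconds
    qrs_width_s: float   # QRS duration, in seconds
    qrs_direction: int   # +1 if the major QRS deflection is upward, -1 if downward
    t_direction: int     # +1 if the T wave is upward, -1 if downward

# Illustrative thresholds only (assumed, not from the paper).
PREMATURITY_RATIO = 0.85
WIDE_QRS_S = 0.12

def looks_like_pvc(beat: BeatFeatures) -> bool:
    """Return True when the beat is premature, wide, and has a T wave opposite to the QRS."""
    premature = beat.rr_prev_s < PREMATURITY_RATIO * beat.rr_mean_s
    wide = beat.qrs_width_s >= WIDE_QRS_S
    discordant = beat.qrs_direction * beat.t_direction < 0
    return premature and wide and discordant

normal_beat = BeatFeatures(0.80, 0.80, 0.09, +1, +1)
suspect_beat = BeatFeatures(0.50, 0.80, 0.15, -1, +1)
print(looks_like_pvc(normal_beat), looks_like_pvc(suspect_beat))
```

Because each condition relies on delineated features (R wave, QRS onset and offset), errors in feature extraction propagate directly into such a rule.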

In this paper, we presented a PVC detection method that is suitable for a broad range of
scenarios, based on a combination of LCNN, LSTM and rules inference. LCNN and LSTM were
both trained end-to-end directly from input-output pairs, which avoided extracting handcrafted
features that could increase the training errors. We first combined LCNN and LSTM to
classify the ECG recordings into two classes: PVC class and non-PVC class. The main
disadvantage of LCNN was that it required a fixed input size, which could cause the loss of
disease information such as PVC. Moreover, the data in practical applications were
usually extremely imbalanced, which affected the detection performance of the LCNN and LSTM.
The method based on rules inference did not require a training process, and it detected
PVC directly according to certain rules. In this paper, rules inference was applied to the
recordings that the deep neural networks had assigned to the PVC class and the non-PVC class.
However, rules inference depended on handcrafted features such as the R wave or the onset and
offset of the QRS complex. The accuracy of extracting these features is easily affected by ECG
noise and has yet to be improved. To improve PVC detection effectively, we therefore took
advantage of LCNN, LSTM and rules inference together in this work. The experiments showed the
effectiveness of the proposed detection approach, which outperformed the methods of [39,40]
compared in Table 3. In addition, Jin et al.[41] integrated LCNN and rules inference for normal
vs. abnormal classification and achieved better performance than the method that used only LCNN
in the literature [11]. Tests conducted on more than 150,000 ECG recordings of the CCDD showed
that the method based on LCNN and rules inference had an accuracy of 86.22%, while the method
based on LCNN alone in the literature [11] had an accuracy of only 83.66% on the same dataset.
These experimental results all indicated that integrating deep neural networks and rules
inference can achieve better performance.
5.2 The detection results based on the MIT-BIH-AR database
The training set (DS1) contained 3,669 PVC beats and 47,218 non-PVC beats. After the training
process, the detection performance of our proposed method was evaluated on the testing dataset
(DS2), which contained 46,329 non-PVC beats and 3,194 PVC beats. First, DS2 was tested with a
selective ensemble learning method based on a greedy strategy. We set the total
number of base classifiers (LCNN and LSTM) in the ensemble learning to H1=20 and G2=20, and
used the greedy selective ensemble learning method to select the appropriate classifiers from
these individual classifiers. The numbers of selected LCNN and LSTM classifiers were 13 and 3,
respectively, i.e., h=13 and g=3. In this part, the three
convolutional kernel sizes were 1*13, 1*8 and 1*8, and the kernel size was 1*3 for all of the
pooling layers. The numbers of feature maps in the three convolutional layers were 10, 15 and 25,
respectively. Here, we again set different misclassification costs for each class in the cost
function of the LCNN model. The numbers of memory cells in the recurrent LSTM layers of the three
LSTM models were 50, 100 and 330. The inputs of both LCNN and LSTM were heartbeats, and their
input sizes were both 150. When implementing the LSTM models with Theano-0.6rc, we set Len (i.e.,
the fixed length of the input signals) to 150. The sum rule was adopted to
combine the independent results from the individual classifiers.
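
A minimal sketch of greedy classifier selection combined with sum-rule fusion is given below. The selection criterion (Youden's index on held-out labels), the stopping rule (stop when no candidate improves the score) and the toy probabilities are assumptions for illustration; this section does not specify those details, and the function names are ours.

```python
import numpy as np

def sum_rule(prob_list):
    """Sum the class-probability outputs of several classifiers and pick the top class."""
    return np.sum(prob_list, axis=0).argmax(axis=1)

def youden(y_true, y_pred):
    """Youden's index Se + Sp - 1 for binary labels (1 = PVC, 0 = non-PVC)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn) + tn / (tn + fp) - 1.0

def greedy_select(prob_list, y_true):
    """Forward greedy selection: keep adding the classifier that most improves the score."""
    selected, best = [], -np.inf
    remaining = list(range(len(prob_list)))
    while remaining:
        scores = [
            youden(y_true, sum_rule([prob_list[j] for j in selected + [i]]))
            for i in remaining
        ]
        k = int(np.argmax(scores))
        if scores[k] <= best:  # no candidate improves the fused result
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best

def two_class(p_pvc):
    """Turn a vector of PVC probabilities into an (n, 2) array [P(non-PVC), P(PVC)]."""
    p = np.asarray(p_pvc, dtype=float)
    return np.column_stack([1.0 - p, p])

# Toy outputs of three base classifiers on six held-out samples (assumed values).
y_true = np.array([0, 0, 0, 1, 1, 1])
prob_list = [
    two_class([0.1, 0.2, 0.3, 0.9, 0.8, 0.3]),
    two_class([0.2, 0.6, 0.1, 0.4, 0.6, 0.9]),
    two_class([0.7, 0.9, 0.6, 0.3, 0.15, 0.4]),
]
selected, best = greedy_select(prob_list, y_true)
print(selected, best)
```

In this toy case the first two classifiers complement each other under the sum rule, while the third one is left out because adding it does not improve the fused score.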
Training an LCNN or LSTM to achieve good detection performance requires a large dataset.
However, DS1 contained patient information from only 22 recordings. For this reason, the
detection performance of LCNN and LSTM was somewhat restricted. As seen from the results in
Table 4, the sensitivity of LCNN+LSTM was relatively low (87.23%), which meant that many PVC
beats were misclassified as non-PVC beats. Thus, we also applied rules inference to the beats
that had been classified into the non-PVC class and the PVC class by the method combining LCNN
and LSTM, with the objective of enhancing the overall detection performance. Table 4 shows
the PVC detection performance for the 22 recordings of DS2. Compared to the method that was based
on only LCNN and LSTM, the method that combined LCNN, LSTM and rules inference achieved
higher Acc, Se and Youden's index.

Table 5 compares the PVC detection performance of the proposed method with earlier published
results [5-9]. The ECG recordings used were those in common with [5-9], all methods were tested
on the testing dataset (DS2), and the performance results of our method were compared directly
with the corresponding reported results. Llamedo et al.[5] modified the AAMI guidelines: they
discarded AAMI class Q, arguing that it was marginally represented in the MIT-BIH-AR database.
Li et al.[6] proposed two beat-to-beat templates based on the morphological differences of the
PVC beats in the ventricular depolarization phase and repolarization phase. Zhang et al.[7]
utilized feature selection and SVM to build their classifiers. Angel et al.[8] used reservoir
computing for the detection of heartbeats on the MIT-BIH-AR database; the R-peak positions used
in [8] were taken from annotations. Zarei et al.[9] studied the variation in the principal
directions when one heartbeat in the data matrix is replaced with a new heartbeat, and
demonstrated that the variation in the principal directions caused by PVC beats can be used to
accurately identify PVC. However, the method in [9] required a selection of k1 normal beats from
the first 2 minutes of each recording to compute the dominant principal direction, and [9] did
not mention how these k1 normal beats were selected. When datasets are imbalanced, the overall
accuracy can be heavily biased in favor of the majority class, and relying on accuracy usually
produces sub-optimal models. Therefore, we also used Youden's index as a performance measure in
this part. In comparison with the published results [5-9] (Table 5), our PVC detection approach
showed some degree of improvement, especially in Se and Youden's index for PVC beats. Since
sensitivity measures the percentage of true PVC beats that are classified correctly, it is
especially significant for clinical use. One limitation of this study was that the Sp for PVC
beats (99.54%) was not as good as that of [8] (99.97%), although the Se and Youden's index of our
method were higher (97.59% vs. 96.06% and 97.13% vs. 96.03%). The higher Youden's index
indicated that our method was better at avoiding failure than the other methods [5-9]. LCNN and
LSTM are both deep learning models, and their architectures are more complex than shallow
architectures such as SVM. They have great advantages in both feature extraction and model
fitting, and they are very good at discovering increasingly abstract distributed feature
representations from the raw input data. The experimental results showed that our method
obtained high accuracy both on a large number of ECG recordings and on a small number of ECG
recordings. The performance of our classifier on DS2 was as follows: Acc 99.41%, Se 97.59%, PPV
93.55%, Sp 99.54% and Youden's index 97.13%. Based on these results, we conclude that our
method offered some improvement over earlier approaches [5-9]. The presented method focuses on
PVC detection only, and we did not experimentally compare it with other existing studies that
focus on multi-class heartbeat detection. The validity of the generalization capability of our
proposed method was somewhat restricted by the available data and should be corroborated in
future work by including new databases or other methods in the analysis. Despite this
limitation, the suggested method was expected to generalize better than methods evaluated on a
single database, since it was evaluated on both the CCDD and the MIT-BIH-AR database.
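
The effect of class imbalance on accuracy can be made concrete with the DS2 beat counts given above. The sketch below evaluates a hypothetical trivial detector that labels every beat as non-PVC; the detector is an assumption introduced only to illustrate why Youden's index was used alongside accuracy.

```python
# Beat counts of DS2 as stated in the text.
n_non_pvc, n_pvc = 46329, 3194

# Hypothetical trivial detector: every beat is labelled non-PVC.
tn, fp, fn, tp = n_non_pvc, 0, n_pvc, 0

acc = (tp + tn) / (tn + fp + fn + tp)
se = tp / (tp + fn)      # no PVC beat is detected
sp = tn / (tn + fp)      # every non-PVC beat is kept
youden = se + sp - 1.0

print(f"Acc = {100 * acc:.2f}%, Se = {100 * se:.2f}%, "
      f"Sp = {100 * sp:.2f}%, Youden = {100 * youden:.2f}%")
```

The accuracy of this useless detector is above 93% simply because non-PVC beats dominate DS2, whereas its Youden's index is zero.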
6 Conclusions
To relieve the heavy workloads of medical professionals, computer-aided PVC detection
techniques that are robust to variations in the signals are required. In this paper, we presented
a systematic method that combined LCNN, LSTM and rules inference for computer-aided PVC detection.
The proposed method was evaluated on the MIT-BIH-AR database and on the CCDD. The results
presented in this paper represented a performance improvement with respect to the published
methods in the field of computer-aided PVC detection. Although PVC detection is a simpler
problem than multi-class detection, there is still a need for further research to develop
approaches that can accurately identify PVC. In the future, it would be of interest to extract
more effective disease-specific features for accurate PVC detection and to explore new
techniques for PVC detection.

REFERENCES
[1] Hasan MA, Mamun M. Hardware approach of R-peak detection for the measurement of fetal and maternal
heart rates[J]. Journal of Applied Research and Technology, 2012, 10: 835-844.
[2] Talbi ML, Charef A. PVC discrimination using the QRS power spectrum and self-organizing maps[J].
Computer Methods and Programs in Biomedicine, 2009, 94:223-231.

[3] Zhang JW, Liu X, Dong J. CCDD: an enhanced standard ECG database with its management and annotation
tools[J]. International Journal on Artificial Intelligence Tools, 2012, 21(5):1-26.
[4] Sayadi O, Shamsollahi MB, Clifford GD. Robust detection of premature ventricular contractions using a
wave-based bayesian framework[J]. IEEE Transactions on Biomedical Engineering, 2010, 57(2):353-362.
[5] Llamedo M, Martinez JP. Heartbeat detection using feature selection driven by database generalization
criteria[J]. IEEE Transactions on Biomedical Engineering, 2011, 58(3):616-625.
[6] Li P, Liu CY, Wang XP, et al. A low-complexity data-adaptive approach for premature ventricular contraction
recognition[J]. Signal Image and Video Processing, 2014,8(1):111-120.
[7] Zhang ZC, Dong J, Luo XQ, et al. Heartbeat classification using disease-specific feature selection[J].
Computers in Biology and Medicine, 2014, 46:79-89.
[8] Escalona-Moran MA, Soriano MC, Fischer I, et al. Electrocardiogram detection using reservoir computing
with logistic regression[J]. IEEE Journal of Biomedical and Health Informatics, 2015, 19(3):892-898.

[9] Zarei R, He J, Huang GY, et al. Effective and efficient detection of premature ventricular contractions based on
variation of principal direction[J]. Digital Signal Processing, 2016, 5:93-102.
[10] MIT-BIH-AR arrhythmia database [EB/OL]. [2016-06-10]. http://www.physionet.org/physiobank/database/mitdb/.
[11] Jin LP, Dong J. Deep learning research on clinical electrocardiogram analysis [J]. Science China: Information
Sciences, 2015, 45(3): 398-416. ( in Chinese)
[12] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780.
[13] Jin LP, Dong J. Ensemble deep learning for biomedical time series detection[J]. Computational Intelligence
and Neuroscience, 2016, 2016(3):1-13.
[14] Dong J, Zhang JW, Zhu HH, et al. Wearable ECG monitors and its remote diagnosis service platform[J].
IEEE Intelligent Systems, 2012, 27(6): 36-43.
[15] Tsipouras MG, Fotiadis DI, Sideris D. An arrhythmia detection system based on the RR-interval signal[J].
Artificial Intelligence in Medicine, 2005, 33:237-250.
[16] LeCun Y, Bengio Y, Hinton GE. Deep learning[J]. Nature, 2015, 521: 436-444.
[17] Wollmer M, Blaschke C, Schindl T, et al. Online driver distraction detection using long short-term memory [J].
IEEE Transactions on Intelligent Transportation Systems, 2011, 12(2):574-582.
[18] Frinken V, Fischer A, Manmatha R, et al. A novel word spotting method based on recurrent neural
networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(2):211-224.
[19] Cai M, Liu J. Maxout neurons for deep convolutional and LSTM neural networks in speech recognition[J].
Speech Communication, 2016, 77:53-64.
[20] Lyu Q, Zhu J. Revisit long short-term memory: an optimization perspective[C]. In NIPS Workshop on Deep
Learning and Representation Learning(NIPS-DL 2014), Montreal, Canada, 2014.

[21] Rfou RA, Alain G, Almahairi A, et al. Theano: a python framework for fast computation of mathematical
expressions[J]. ArXiv:1605.02688v1, 2016.

[22] Theano[EB/OL]. [2016-9-22]. https://github.com/Theano/Theano.

[23] Rahman A, Tasnim S. Ensemble classifiers and their applications: a review[J]. International Journal of
Computer Trends and Technology, 2014, 10(1):31-35.
[24] Kittler J, Hatef M, Duin RPW, et al. On combining classifiers [J]. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 1998, 20(3):226-238.
[25] Zhou ZH, Wu JX, Tang W. Ensembling neural networks: many could be better than all [J]. Artificial
Intelligence, 2002, 137: 239-263.
[26] Guo YW, Jiao LC, Wang S, et al. A novel dynamic rough subspace based selective ensemble[J]. Pattern
Recognition, 2015, 48(5):1638-1652.
[27] Valentini G, Masulli F. Ensembles of learning machines [C]. Proceedings of the Italian Workshop on Neural
Nets-revised Papers, London: Springer Berlin Heidelberg, 2002, 2486: 3-20.
[28] Deng L, Platt JC. Ensemble deep learning for speech recognition[C]. Proceedings of the Annual Conference
of the International Speech Communication Association, Singapore, 2014.
[29] Sun ZH, Su JW, Xie CC, et al. Reducing ECG alarm fatigue based on SQI analysis[J]. Computing in
Cardiology, 2014, 41:345-348.
[30] Xu YJ, Zhong YQ, Zeng WB. ECG feature points detection based on electrocardial vectors [P]. China Patent:
201410068351.6, 2014-02-27.
[31] Ye WY, Zhang GL, Hong JB, et al. A method based on multi-lead simultaneous ECG signal processing
method and apparatus[P]. China Patent: CN 101467879A, 2009-07-01.
[32] Farrokhi F, Moradi MH, Miri R. Automatic detection of premature complexes in ECG using wavelet features
and fuzzy hybrid neural network[J]. Iranian Journal of Electrical and Computer Engineering, 2004,
3(2):132-137.
[33] Shen Z, Hu C, Li P, et al. Research on premature ventricular contraction real-time detection based support
vector machine[C]. IEEE International Conference on Information and Automation (ICIA 2011), Shenzhen,
China, 2011:864-869.
[34] Cuesta P, Lado MJ, Vila XA, et al. Detection of premature ventricular contractions using the RR-interval
signal: a simple method for mobile devices[J]. Technology and Health Care, 2014, 22:651-656.
[35] Association for the Advancement of Medical Instrumentation. ANSI/AAMI EC57:2012 Standard: Testing and
Reporting Performance Results of Cardiac Rhythm and ST Segment Measurement Methods [S]. USA: AAMI
Press, 2012.
[36] Zhu HH, Dong J. An R-peak detection method based on peaks of Shannon energy envelope[J]. Biomedical
Signal Processing and Control, 2013, 8(5):466-474.
[37] Hamzah NABA, Besar RB, Rahman NZBA. Premature ventricular contraction (PVC) classifications by
probabilistic neural network (PNN) using the optimal mother wavelets[C]. International Conference on
Graphic and Image Processing, 2011, 8285(23):170-177.
[38] Sokolova M, Japkowicz N, Szpakowicz S. Beyond accuracy, F-score and ROC: a family of discriminant
measures for performance evaluation[C]. 19th Australian Joint Conference on Artificial Intelligence. Hobart,
Australia, 2006:1015-1021.
[39] Xu YJ, Zhong YQ, Zeng WB. Nanjing dragon technology co., LTD[EB/OL].
[2016-9-22]. http://www.aecg.com.cn/.
[40] Zhou FY, Jin LP, Dong J. PVC recognition method based on ensemble learning[J]. Acta Electronica Sinica,
2017, 45(2):501-507. (in Chinese)
[41] Jin LP, Dong J. Classification of normal and abnormal ECG records using lead convolutional neural network
and rule inference[J]. Science China: Information Sciences, 2017, doi: 10.1007/s11432-016-9047-6.

Fig. 1. An ECG recording of the CCDD. Note: N = non-PVC class, V = PVC class.
Fig. 2. The computer-aided PVC detection procedure on the CCDD

Fig. 3. The architecture of a memory cell in a recurrent LSTM layer[18]

Fig. 4. An ECG recording of the CCDD with lead-off.


Fig. 5. The computer-aided PVC detection procedure on the MIT-BIH-AR database

Fig. 6. A segment extracted from the patient recording #101

Table 1 Confusion matrix of the classification

            Predicted N   Predicted V   Total
True N      TN            FP            TN+FP
True V      FN            TP            FN+TP
Total       TN+FN         FP+TP         TN+FP+FN+TP

Table 2 Results on the CCDD

Classifiers                     True   Predicted N   Predicted V   Acc (%)   Se (%)   Sp (%)   PPV (%)   Youden's index (%)
LCNN + LSTM                     N      135287        3611          97.16     81.52    97.40    32.66     78.92
                                V      393           1751
LCNN + rules inference          N      136318        2580          98.10     95.16    98.14    44.20     93.30
                                V      104           2044
LSTM + rules inference          N      136026        2872          97.91     96.46    97.93    41.91     94.39
                                V      76            2072
LCNN + LSTM + rules inference   N      136197        2701          98.03     96.42    98.06    43.40     94.47
                                V      77            2071

Table 3 Comparison between this study and [39,40]

Methods              Recordings                   Acc (%)   Se (%)   Sp (%)   PPV (%)   Youden's index (%)
Xu[39]               141,046 recordings of CCDD   97.28     95.89    97.32    35.65     93.21
Our prior work[40]   141,046 recordings of CCDD   97.87     87.94    98.02    40.75     85.96
Proposed method      141,046 recordings of CCDD   98.03     96.42    98.06    43.40     94.47

Table 4 Results on the DS2 (MIT-BIH-AR database)

Classifiers                     True   Predicted N   Predicted V   Acc (%)   Se (%)   Sp (%)   PPV (%)   Youden's index (%)
LCNN + LSTM                     N      46181         148           98.88     87.23    99.68    94.96     86.91
                                V      408           2786
LCNN + LSTM + rules inference   N      46114         215           99.41     97.59    99.54    93.55     97.13
                                V      77            3117

Table 5 Comparison between this study and published studies

Methods           Recordings              Acc (%)   Se (%)   Sp (%)   PPV (%)   Youden's index (%)
Llamedo[5]        All recordings of DS2   98.16     82.94    99.21    87.97     82.15
Li[6]             All recordings of DS2   98.18     93.12    98.53    81.44     91.65
Zhang[7]          All recordings of DS2   98.63     85.48    99.54    92.75     85.02
Angel[8]          All recordings of DS2   99.71     96.06    99.97    99.49     96.03
Zarei[9]          All recordings of DS2   98.77     96.12    98.96    86.48     95.08
Proposed method   All recordings of DS2   99.41     97.59    99.54    93.55     97.13
