
Computers in Biology and Medicine 150 (2022) 106199

Contents lists available at ScienceDirect

Computers in Biology and Medicine


journal homepage: www.elsevier.com/locate/compbiomed

MCA-net: A multi-task channel attention network for Myocardial infarction detection and location using 12-lead ECGs
Weibai Pan a,b,1, Ying An c,1, Yuxia Guan a,b, Jianxin Wang a,b,∗
a School of Computer Science and Engineering, Central South University, Changsha, 410083, China
b Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
c School of Big Data Institute, Central South University, Changsha Hunan, China

ARTICLE INFO

Keywords:
Electrocardiogram
Deep neural network
Multi-task learning
Myocardial infarction

ABSTRACT

Problem: Myocardial infarction (MI) is a classic cardiovascular disease (CVD) that requires prompt diagnosis. However, due to the complexity of its pathology, it is difficult for cardiologists to make an accurate diagnosis in a short period.
Aim: In clinical practice, MI can be detected and located by the morphological changes on a 12-lead electrocardiogram (ECG). Therefore, we need to develop an automatic, high-performance, and easily scalable algorithm for MI detection and location using 12-lead ECGs to effectively reduce the burden on cardiologists.
Methods: This paper proposes a multi-task channel attention network (MCA-net) for MI detection and location using 12-lead ECGs. It employs a channel attention network based on a residual structure to efficiently capture and integrate features from different leads. On top of this, a multi-task framework is used to introduce the shared and complementary information between the MI detection and location tasks and further enhance model performance.
Results: Our method is evaluated on two datasets (the PTB and PTBXL datasets). It achieved more than 90% accuracy for the MI detection task on both datasets. For the two MI location tasks, it achieved 68.90% and 49.18% accuracy on the PTB dataset, respectively, and more than 80% accuracy on the PTBXL dataset.
Conclusion: Numerous comparison experiments demonstrate that MCA-net outperforms the state-of-the-art methods and generalizes better. Therefore, it can effectively assist cardiologists in detecting and locating MI and has important implications for the early diagnosis of MI and patient prognosis.

1. Introduction

Cardiovascular disease (CVD) is a global health problem, and its incidence in developing countries continues to rise. Myocardial infarction (MI) is a typical CVD caused by coronary artery obstruction, which results in a severe decrease in blood flow to the heart muscle [1]. Since MI is difficult to detect and occurs rapidly, it has also been described as a silent heart attack. According to statistics from the American Heart Association (AHA), each year approximately 605,000 people have an episode of MI, and 200,000 MI patients have a relapse [2]. Moreover, the locations of MI are numerous and proximate to each other, involving different culprit vessels and varying degrees of criticality, which can seriously affect the patient's prognosis. As shown in Fig. 1, they can be broadly classified as anterior MI (AMI), anteroseptal MI (ASMI), anterolateral MI (ALMI), inferior MI (IMI), inferolateral MI (ILMI), and others [3]. Therefore, early detection and location of MI are essential for prolonging patients' life expectancy and improving their quality of life.

Continuous acquisition and recording of dynamic changes in the subject's ECG waveform can often provide critical information to the physician diagnosing MI in clinical practice [4,5]. Therefore, ECG is a necessary tool for the diagnosis of MI and should be obtained and interpreted promptly after the first medical contact. However, due to the complex and diverse ECG manifestations of MI, using 12-lead ECGs to detect and locate MI is extremely time-consuming and laborious for cardiologists, and even prone to misdiagnosis. Therefore, automatic ECG-based diagnosis of MI has become a research hotspot in the field of medical artificial intelligence.

The traditional methods for MI detection and location usually include three steps: feature extraction, feature selection, and classification [6]. Early feature extraction methods usually extracted relevant

∗ Corresponding author at: School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
E-mail addresses: pwb0531@csu.edu.cn (W. Pan), anying@csu.edu.cn (Y. An), guanyx1997@csu.edu.cn (Y. Guan), jxwang@csu.edu.cn (J. Wang).
1 Both authors contributed equally to this research.

https://doi.org/10.1016/j.compbiomed.2022.106199
Received 13 June 2022; Received in revised form 18 September 2022; Accepted 9 October 2022
Available online 13 October 2022
0010-4825/© 2022 Elsevier Ltd. All rights reserved.

Fig. 1. Description of the standard 12-lead system and its correspondence to the MI location.

ECG features such as ST-segment elevation and Q-wave abnormalities [7,8] by wavelet transform (WT) [9], principal component analysis (PCA) [10], or empirical mode decomposition (EMD) [11], etc. Then, traditional machine learning methods such as K-nearest neighbors (KNN) [12,13], random forests (RF) [14], or support vector machines (SVM) [15,16] are used for classification. However, the feature selection process of these methods often relies on human intervention and requires extensive medical expertise and experience. On top of that, the classification performance of these conventional MI detection and location methods can be greatly reduced if the morphological characteristics of the selected ECG waveforms are not accurate enough. Therefore, to solve the above problems, a more stable method for end-to-end automatic MI detection and location is needed.

As deep learning methods such as the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) achieve remarkable results in many fields, they also show their advantages in CVD classification based on ECG signals [17,18]. They are not only able to perform feature extraction and classification simultaneously, but also usually achieve higher accuracy than traditional methods. Therefore, researchers have proposed many deep learning-based methods to detect and locate MI, which have achieved promising results. For example, Strodthoff et al. [19] proposed an MI detection and interpretation algorithm based on a full CNN using eight-lead ECGs, which extracted features directly from the raw ECG data and achieved 93.3% sensitivity and 89.7% specificity. Acharya et al. [20] proposed a novel approach that used a CNN with eleven layers to automatically detect MI using single-lead ECGs. This approach needed no manual feature extraction or selection, so it can detect MI accurately even in the presence of noise. As a result, it obtained 93.53% and 95.22% accuracy in the presence and absence of noise, respectively. He et al. [21] developed a novel 12-lead ECG-based deep active semi-supervised learning framework for MI location, named multibranch densely connected convolutional network (MB-DenseNet). It first employed active learning (AL) to improve the classification performance of the initial model and then designed a new semi-supervised learning method named self-training with spatial matching (STSM) to update the model. MB-DenseNet obtained 99.87% and 96.09% accuracy under the intra-patient and patient-specific schemes, respectively.

The above-mentioned methods have brought some impressive improvements in the automatic detection and location of MI. However, there are still many challenges that need to be solved. First, MI episodes in different heart locations usually cause abnormal ECG waveforms in different combinations of leads. For example, we can see from Fig. 1 that AMI may cause ST-segment elevation in leads V3 and V4, whereas IMI may cause ST-segment elevation in different leads (II, III, and aVF). Therefore, incomplete lead information may lead to a lack of location features, thus affecting the accuracy of identification. Unlike most existing methods that use only single-lead or partial-lead ECGs, we use the full 12-lead ECGs to introduce more information and enhance the performance of our method. Second, although some studies have also considered 12-lead ECGs, they tend to treat the MI detection and location tasks as two independent tasks and do not consider the correlation between them [22–25], which may severely limit the performance and improvement space of the existing methods.

Therefore, to address the above problems, we propose a multi-task channel attention network (MCA-net) for MI detection and location tasks. The contributions of this paper can be summarized as follows:

1. We propose a novel channel attention network for feature learning from 12-lead ECGs, which employs multiple independent channel networks to extract features from individual leads. In each channel network, the features are efficiently extracted and integrated at different stages using the squeeze-and-excitation (SE) [26] attention mechanism and skip connections.
2. According to the relevance of the MI detection and location tasks, we introduce a multi-task framework in MCA-net. This multi-task framework can improve the performance and generality of MCA-net by exploiting the shared and complementary information among the tasks through joint loss optimization. Therefore, MCA-net can be easily extended to different tasks.
3. In the experimental part, we use two public datasets to fully validate the performance and generalizability of MCA-net. We use a multi-fold cross-validation strategy for a more rigorous evaluation of the model performance. We also conduct many ablation experiments to compare and analyze the impact of the multi-task framework and the role of each module in the channel attention network. The experimental results indicate that MCA-net is superior in both performance and generality compared to other studies.

The rest of this paper is organized as follows: The related works on MI detection and location methods are discussed in Section 2. Section 3 describes the datasets and preprocessing methods we used. The details of our proposed method are described in Section 4. In Section 5, we present the details of the experimental implementation and discuss the results. In Section 6, ablation experiments and further analysis of our model are presented. Finally, we conclude our work in Section 7.

2. Related works

Because of its easy accessibility and noninvasive nature, ECG is often used as a critical means for clinical MI detection and location.


However, the traditional manual analysis of ECG is extremely time-consuming and laborious and requires the professional knowledge and experience of doctors. Therefore, researchers have explored and developed many automatic MI diagnosis algorithms to reduce the burden on cardiologists.

Early automatic MI diagnosis algorithms focused on the MI detection task, which classifies MI and HC (Healthy Control) populations. Researchers typically used biosignal processing methods combined with classic machine learning methods. These methods usually required many tedious manual data processing steps before classification, such as denoising, normalization, feature extraction, and selection. Commonly used methods include WT, frequency domain analysis (FDA) [27], hidden Markov model (HMM) [28] and SVM, etc.

For example, to perform automatic detection of MI based on ECGs, Sun et al. [29] proposed a multi-instance learning (MIL) approach named latent topic MIL. This method used the unlabeled heartbeats in the training set to create a topic space and mapped the ECGs to this space. Then, the ECG-level topic vectors were fed into the SVM for the final classification. Nidhyananthan et al. [30] presented a wavelet-based method to detect MI and user identity. This model first used the Daubechies wavelet transform to decompose the multi-layer ECG signals and then employed an SVM classifier to classify the normal and abnormal cases in the signals. Benameur et al. [31] used three parametric imaging techniques for cardiac MRI to quantify areas of infarct-related systolic impairment and detect MI. The parametric images generated from the monogenic signal achieved the best performance. Pereira et al. [32] designed a method combining wavelet decomposition and eigenspace analysis to detect MI. The wavelet decomposition is used to extract and separate waveforms of different frequencies in the ECGs. The eigenspace analysis provides the eigenvalues calculated from covariance matrices. These features can then help to classify HC and MI patients. Liu et al. [33] used the tunable Q-factor wavelet transform (TQWT) to design a novel ECG signal processing method to extract high-quality ECGs and perform MI detection. In this method, the authors decomposed the 12-lead ECG signal into high and low Q-factor components, and then manually extracted a large number of features, which were finally fed into KNN to obtain better performance. Wah et al. [34] used SVM and KNN as classifiers to classify cardiac conditions in ECG signals, based on extracting relevant waveforms or features from the ECG signal and analyzing certain specific regions of the peaks and their temporal frequencies. Fatimah et al. [13] developed two models, called the primary and modified methods, to automatically detect MI based on single-lead ECGs. The primary algorithm applied the Fourier decomposition method (FDM) to the heartbeats obtained after R-peak-based segmentation, and a modified algorithm was used to calculate its efficiency. Features obtained by these two methods were fed into a variety of conventional machine learning classifiers and achieved the best performance with KNN. Han et al. [16] proposed a new method to detect MI by fusing energy entropy and morphological features based on 12-lead ECGs. In detail, the ECGs were first decomposed, the energy entropy was calculated from the decomposition coefficients, and the result was finally fused with local morphological features as global features for more accurate classification. Sahu et al. [35] designed a novel MI detection and location method using variational mode decomposition (VMD) and regularized neighborhood component analysis (RNCA). It constructed a feature set from statistical and non-linear features computed from the intrinsic mode functions of the VMD, ranked them using RNCA, and then fed them into KNN and AdaBoost classifiers for classification. Unfortunately, these traditional methods based on feature engineering rely excessively on professional medical experience and require a long feature extraction process and a rigorous feature selection process. As a result, they are limited in terms of convenience and stability and have difficulty meeting clinical standards.

However, as deep learning methods have shown great advantages in the fields of computer vision, biosignal analysis, and machine translation, their powerful automatic feature extraction ability has been demonstrated. Therefore, more and more deep learning-based methods for automatic MI detection have been proposed. For instance, Liu et al. [36] developed a novel algorithm based on a multi-lead CNN (ML-CNN) for MI detection via four-lead ECGs. In this method, lead asymmetric pooling (LAP) layers and sub-two-dimensional (2-D) convolutional layers were applied to learn multi-scale feature representations of different leads, achieving 96.00% accuracy. Li et al. [37] proposed an automatic MI detection model that synthesizes single-lead ECGs with high morphological similarity using generative adversarial networks (GANs). It automatically detected MI using CNNs for both the original ECG and the synthetic ECG from GANs and achieved 99.06% accuracy. Moreover, Liu et al. [23] presented a multiple-feature-branch convolutional bidirectional recurrent neural network (MFB-CBRNN), consisting of 12 multi-feature branches, to effectively perform automatic MI detection based on 12-lead ECGs. Specifically, MFB-CBRNN utilized multiple stacked multi-scale convolutional layers combined with pooling operations to fully extract the features within the ECG leads. Then, it used a bidirectional LSTM to fuse features from each multi-feature branch and achieved 99.54% and 86.29% accuracy under inter-patient and intra-patient paradigms, respectively.

As automatic MI detection methods continue to improve, researchers are gradually shifting their focus to the more challenging and clinically valuable task of MI location. For example, Liu et al. [38] proposed a new multiple-feature branch CNN (MFB-CNN) to automatically detect and locate MI based on 12-lead ECGs, which learned the features of each lead through a corresponding feature branch. This model obtained 98.79% and 94.82% mean accuracies on MI detection and location tasks, respectively. Cao et al. [39] proposed a multi-scale deep learning model based on a combination of residual network and attention mechanism for 12-lead ECG recordings. It mainly used the SENet model and the Grad-CAM algorithm to calculate the weights of each lead and detected and located MI based on the weighted features, achieving over 99% accuracy on both tasks. Han et al. [22] designed a multi-lead residual neural network (ML-ResNet) for MI detection and location tasks using 12-lead ECG records. It extracted local features from different leads to construct the global feature representation and achieved 95.49% and 55.74% accuracy under the inter-patient paradigm on MI detection and location tasks, respectively. Cao et al. [24] proposed a novel multichannel lightweight model (ML-Net) to provide algorithmic support for automatic MI detection and location on portable devices. In detail, ML-Net utilized a multi-lead framework enabling features within each lead to be learned independently, thus preserving the unique features of different angles represented by different leads, and finally achieved 96.00% and 66.85% classification accuracy in MI detection and location tasks, respectively. However, most methods are designed for specific tasks and are difficult to extend to other downstream tasks, limiting their use in clinical practice. Moreover, methods that consider both MI detection and location usually treat them as two separate tasks. They do not consider the correlation between the tasks and do not enable effective information sharing and complementarity between the tasks during training, which may limit their overall performance.

To tackle the limitations of existing methods, we design a multi-task channel attention network named MCA-net. In our model, we use a channel attention network to learn and fuse the features of each lead and employ a multi-task framework to efficiently combine shared and complementary information between MI detection and location tasks. Experiments show that MCA-net can automatically perform MI detection and location tasks and has a high application value.

3. Data description and preprocessing

The ECG data used in this paper are from the PTB [40] and PTBXL datasets [41], respectively.
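
As a rough illustration of the fixed-length segmentation applied to the PTB records in Section 3.3 (resampling from 1000 Hz to 500 Hz, then cutting non-overlapping 1.2 s windows of 600 samples and discarding any shorter remainder), a minimal Python sketch is given here. The function names, the synthetic input, and the naive keep-every-second-sample resampling are assumptions made for illustration only; the paper does not state which resampling routine was used, and a real pipeline would normally apply an anti-aliasing filter first.

```python
from typing import List, Sequence


def downsample_by_two(signal: Sequence[float]) -> List[float]:
    """Naive 1000 Hz -> 500 Hz step (assumption): keep every second sample."""
    return list(signal[::2])


def split_fixed_length(signal: Sequence[float], length: int = 600) -> List[List[float]]:
    """Cut a signal into consecutive, non-overlapping windows of `length`
    samples and drop a trailing window that is shorter than `length`."""
    n_full = len(signal) // length
    return [list(signal[i * length:(i + 1) * length]) for i in range(n_full)]


# Synthetic single-lead record: 3.5 s at 1000 Hz (values are just sample indices).
record_1000hz = [float(i) for i in range(3500)]
record_500hz = downsample_by_two(record_1000hz)    # 1750 samples at 500 Hz
segments = split_fixed_length(record_500hz, 600)   # two 1.2 s windows; 550 samples dropped
```

Because the windows do not overlap, each sample of a record contributes to at most one segment, and the trailing remainder of every record is discarded rather than padded.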


Fig. 2. The overview of MCA-net. 2D-C: The 2-D Convolution Layer. BN: The Batch Normalization Layer. LR: The LeakyReLU Layer. 2D-RB: The 2-D Residual Block. 1D-RB:
The 1-D Residual Block. GAP: The Global Average Pooling Layer. FC layer: The Fully Connected Layer. Detection: The two-category detection task. Location-5: The five-category
location task. Location-7: The seven-category location task.
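
The layout named in this caption, twelve per-lead channels whose pooled features are concatenated and passed to three task-specific fully connected heads, can be traced with a small sketch. It assumes 16 pooled features per lead (so 12 × 16 = 192 shared features, in line with the sizes listed in Table 3) and uses toy constant weights in place of trained layers; all function and variable names are hypothetical and this is not the authors' implementation.

```python
from typing import Dict, List

N_LEADS = 12
FEATURES_PER_LEAD = 16  # assumption: per-lead feature size after global average pooling
TASK_CLASSES: Dict[str, int] = {"Detection": 2, "Location-5": 5, "Location-7": 7}


def fuse(per_lead: List[List[float]]) -> List[float]:
    """Concatenate the per-lead feature vectors into one shared representation."""
    fused: List[float] = []
    for feats in per_lead:
        fused.extend(feats)
    return fused


def linear_head(x: List[float], n_classes: int) -> List[float]:
    """Toy fully connected head: class c uses the constant weight (c + 1) / len(x).
    A trained model would use learned weights, biases, and an output activation."""
    scale = 1.0 / len(x)
    return [sum((c + 1) * scale * v for v in x) for c in range(n_classes)]


# Placeholder per-lead features (all ones) standing in for the channel outputs.
per_lead_features = [[1.0] * FEATURES_PER_LEAD for _ in range(N_LEADS)]
shared = fuse(per_lead_features)

# Every task head reads the same shared representation.
outputs = {task: linear_head(shared, k) for task, k in TASK_CLASSES.items()}
```

The point of the sketch is only the data flow: one shared feature vector is computed once and reused by the Detection, Location-5, and Location-7 heads, which differ only in their number of output classes.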

3.1. The PTB dataset

The PTB dataset is commonly used to train and evaluate different MI detection and location methods. It contains 549 12-lead ECG records, of which 368 ECG records from 52 HC subjects and 148 MI patients are used in this paper. All ECG data in this dataset are available online. Each ECG record consists of 12 standard leads (as shown in Fig. 1) and 3 Frank leads (VX, VY, and VZ), sampled at 1000 Hz. Moreover, these ECG records are all annotated by cardiologists. A subject may provide more than one ECG record, but each ECG record has only one label. Most of these ECG records last approximately 2 min, and all last at least 30 s. The statistics of the 12-lead ECGs used from the PTB dataset are shown in Table 1.

Table 1
Summary of the PTB dataset used in the paper.
Class   Subjects   Records   Heartbeats
AMI     17         47        4369
ASMI    28         79        7596
ALMI    16         43        4152
IMI     30         89        7970
ILMI    23         56        5235
Other   13         32        2956
HC      52         80        7866
Total   179        426       40,144

3.2. The PTBXL dataset

The PTBXL dataset is a large publicly available ECG dataset that was published in April 2020 [41]. It has 18,885 subjects and provides approximately 21,837 ECG records. Among them, there are 5486 MI records and 9528 HC records. Similar to the PTB dataset, there are 12 standard leads in each record, and two sampling frequency versions of 100 Hz and 500 Hz are available. Each record lasts 10 s. Each subject in this dataset provides one or more annotated ECG records, and most records contain multiple annotations. The statistics for the 12-lead ECGs used from the PTBXL dataset are shown in Table 2.

Table 2
Summary of the PTBXL dataset used in the paper.
Class   Subjects   Records
AMI     348        354
ASMI    2037       2363
ALMI    245        290
IMI     2362       2685
ILMI    416        479
Other   278        302
HC      8903       9528
Total   13,460     15,013

3.3. Preprocessing

For each record in the PTB dataset, we first resample it to 500 Hz. Then, we split the resampled record into fixed-length heartbeats; heartbeats shorter than that length are discarded. There is no overlap between heartbeats. Since the duration of a complete ECG waveform is about 0.6 s, to ensure that each heartbeat contains at least one complete ECG waveform, the fixed length of each heartbeat is set to 1.2 s (600 sampling points) in this paper. As shown in Table 1, we end up with a total of 40,144 heartbeats.

For the PTBXL dataset, we directly select the 500 Hz sampling frequency version to maintain uniformity with the PTB dataset. Considering that each record in the PTBXL dataset may contain more than one annotation, we do not segment it, to ensure the integrity of the multi-label information.

Based on the above preprocessing, we obtain a single-labeled dataset with heartbeats as the basic unit and a multi-labeled dataset with ECG records as the basic unit, respectively, to fully validate the effectiveness of our method.

4. Method

As illustrated in Fig. 2, the overall structure of MCA-net is composed of a channel attention network based on a multi-task framework. In detail, we input the 12-lead ECGs to a channel attention network with 12 channels, where each channel is assigned to a lead. For each lead of the ECG signal, we first format the data into a 2-D matrix by adding an extra dimension. The 2-D matrix is then fed into a 2-D convolutional layer to capture important information in the early stages of feature extraction. Then, to extract multi-scale features, we employ 2-D residual blocks with different kernel sizes. Immediately afterward, we reshape the features into 1-D features before feeding them into three 1-D residual blocks to further extract finer-grained features. After the 1-D residual blocks, a global average pooling layer is used to extract key features and improve generalization. Finally, we concatenate the features from different channels, use dropout to avoid overfitting, and input the flattened global features into three fully connected layers. All convolutions in MCA-net (both 1-D and 2-D) are standard convolutions with "same" padding. The details of each part are shown in Table 3 and described below.

4.1. Channel attention network

The overview of our channel attention network structure is shown inside the dashed box in Fig. 2. Conventional 12-lead ECG data can reflect different views of cardiac activity and provide a large number of different informative representations. Especially in the MI location task, the features provided by different leads may point to different locations of infarction. Therefore, it is crucial to make full use of the key information in 12-lead ECGs. In the channel attention network, each


lead is represented by its independent channel to generate features. Each channel consists of five components that take full account of the different scales of features at different stages, as given in Table 3. Ultimately, the features learned by the 12-channel network are integrated into the final patient ECG representation and fed into the fully connected layer for classification. Details of the channel attention network are described in the following.

Table 3
Detailed information of the channel attention network.
Part   Layer         Kernels   Input size (PTB)   Input size (PTBXL)
I      Expand_dims   –         600 × 1            5000 × 1
       2D-C          2         600 × 1 × 1        5000 × 1 × 1
       BN+LR         –         –                  –
II     2D-RB         4         600 × 1 × 2        5000 × 1 × 2
III    Reshape       –         600 × 1 × 4        5000 × 1 × 4
       1D-RB         4         300 × 4            2500 × 4
       1D-RB         8         100 × 4            833 × 4
       1D-RB         16        33 × 8             277 × 8
IV     GAP           –         11 × 16            92 × 16
V      Concatenate   –         1 × 16             1 × 16
       Dropout       –         1 × 192            1 × 192
       Flatten       –         1 × 192            1 × 192

4.1.1. The 2-D convolutional layer

First, the raw 1-D signal of each lead is converted into a 2-D matrix (as shown in Table 3, Part I) and input to a 2-D convolutional layer for feature extraction. The 2-D convolutional layer can be described as:

$$\boldsymbol{x}_j^{l} = f\left(\sum_{i \in M_j} \boldsymbol{x}_j^{l-1} * \boldsymbol{w}_{ij}^{l} + b_j^{l}\right) \tag{1}$$

where $\boldsymbol{x}_j^{l-1}$ is the output of the $j$-th convolution kernel at the $(l-1)$-th layer, which is used as the input of the $l$-th layer. It is combined with the weight coefficients $\boldsymbol{w}_{ij}^{l}$ and the corresponding bias $b_j^{l}$ of this layer under the receptive field $M_j$, where $*$ refers to the 2-D convolution operation. The rectified linear unit (ReLU) is then applied as the activation function $f(\cdot)$ to output $\boldsymbol{x}_j^{l}$.

Then, a batch normalization layer and a LeakyReLU activation function are used to alleviate the vanishing gradient problem and accelerate convergence.

4.1.2. The 2-D residual block layer

After the 2-D convolutional layer, we input the learned features into the 2-D residual block layer. As illustrated in Fig. 3 and Table 4, this 2-D residual block layer mainly consists of three 2-D convolutional layers with different sizes of convolutional kernels. After these three stacked 2-D convolutional layers, we use the SE module to process and fuse the learned feature maps to obtain a more efficient feature representation. Then, we combine the features from the SE module and the skip connection (including a 2-D convolutional layer) to avoid the loss of important information. Finally, the features are fed into a LeakyReLU layer.

Table 4
Detailed information of the 2-D residual block.
Type              Kernel size   Kernels
2D-C              17            2/4/8/16
BN+LR             –             –
2D-C              11            2/4/8/16
BN+LR             –             –
2D-C              5             2/4/8/16
BN+LR             –             –
SE                –             –
Skip connection   1             –
LR                –             –

Fig. 3. The 2-D residual block structure.

4.1.3. The 1-D residual blocks layer

To further learn finer-grained features in each channel, we design a 1-D residual blocks layer that has three stacked 1-D residual blocks after the 2-D residual block layer. Before this, the feature representations extracted from the 2-D residual block layer are reshaped into 1-D tensors. The structure of the 1-D residual blocks is similar to that of the 2-D residual blocks in Fig. 3. The difference is that the 2-D convolutional layers in the 2-D residual block are replaced by 1-D convolutional layers, and an additional average pooling layer is added after the LeakyReLU activation function to reduce the dimensionality of the features and provide better generalization. The 1-D average pooling operates as:

$$\boldsymbol{x}_j^{l}(k) = \underset{a \in A}{\operatorname{Avg}}\left[\boldsymbol{x}_j^{l-1}(k \times s + a)\right] \tag{2}$$

where the feature representations $\boldsymbol{x}_j^{l}(k)$ use $k$ as the sampling point index. $s$ denotes the strides of the pooling operation, which is all set to 1 in this paper. $R$ represents the whole pooling region which is divided into $r$ parts. Finally, we use $\operatorname{Avg}(\cdot)$ to obtain the average values of the pooling windows.

4.1.4. The global average pooling layer

Before performing feature fusion of the multiple leads, we use a global average pooling layer to reduce the feature dimension and provide a more specific internal representation for each channel.

4.1.5. The feature fusion layer

Finally, we concatenate the features learned by each channel and feed them into a dropout layer, then use a flatten layer to get a feature representation. This feature representation is then fed into different fully connected layers to obtain the corresponding classification results of the MI detection and location tasks.

4.2. Multi-task framework

In the basic process of clinical diagnosis, cardiologists usually detect MI first and then locate MI after confirming the diagnosis. Meanwhile, the MI detection result can also be verified from the results of location. However, existing approaches are usually designed to implement only one specific task or multiple independent tasks, and do not take into account the relationship between these two tasks and the additional information they may bring. This limits the classification performance and improvement space of the existing methods. Therefore, for MI detection and location tasks, a model that can perform both of them at the same time is more reliable than several independent models. Moreover, this kind of model may be more effective in terms of generalizability and performance, since


MI detection and location tasks may have overlapping key features and additional information that can be complementary.

Multi-task frameworks have been shown to benefit similar tasks by learning shared and complementary information between multiple tasks to optimize the performance of all tasks [42–44]. We use the most classical multi-task framework, the shared-bottom model [42]. This model improves classification performance by sharing the bottom representation across multiple tasks and optimizing the losses of all tasks jointly. Thus, MCA-net based on this multi-task framework can more effectively combine the MI detection and location tasks. The framework can be expressed as:

$$y_n = F^n(\boldsymbol{C}(x)) \tag{3}$$

In this paper, we design three related tasks: (1) a two-category MI detection task (Detection), which aims to identify whether a subject is MI or HC; (2) a five-category MI location task (Location-5), whose goal is to distinguish the AMI, ASMI, ALMI, other, and HC populations; and (3) a seven-category MI location task (Location-7), whose objective is to distinguish the AMI, ASMI, ALMI, IMI, ILMI, other, and HC populations. We select the Location-5 task because a large proportion of patients with MI have Generalized Anterior MI (GAMI) [36], which consists of AMI, ASMI, and ALMI; we therefore first focus on differentiating the GAMI subtypes and group the remaining MI labels into the "other" category. We then consider the task settings used in most existing studies, their relevance to the Location-5 task, and the distribution of labels in the dataset. Ultimately, IMI and ILMI are additionally selected for classification on top of the Location-5 task to form the Location-7 task.

Specifically, the multi-task framework shares parameters and feature representations among the three tasks in the channel attention network, which is represented as a function $\boldsymbol{C}$. The index $n = 1, 2, 3$ denotes the three tasks, and each task output $y_n$ is obtained by passing the shared representation through the corresponding fully connected layer $F^n$.

The losses of these tasks in the multi-task framework are combined as a weighted sum for joint optimization, and the total loss can be expressed as:

$$\boldsymbol{L}_{total} = a\boldsymbol{L}_{d} + b\boldsymbol{L}_{l-5} + c\boldsymbol{L}_{l-7} \tag{4}$$

where $\boldsymbol{L}_{d}$, $\boldsymbol{L}_{l-5}$, and $\boldsymbol{L}_{l-7}$ are the losses of the Detection, Location-5, and Location-7 tasks, respectively, and $a$, $b$, $c$ are the weights of the three related tasks.

5. Experiments and results

In this section, we evaluate MCA-net on the PTB and PTBXL datasets and compare it with state-of-the-art methods for automatic MI detection and location. In addition, we perform a significance test on the experimental results to show the improvements of our approach.

5.1. Comparison methods

For a comprehensive assessment, we compare our method with three other state-of-the-art MI classification methods, i.e., MFB-CBRNN [23], ML-ResNet [22], and ML-Net [24], on both the PTB and PTBXL datasets.

• MFB-CBRNN: MFB-CBRNN used a multi-lead framework for MI detection based on the 12-lead ECGs. To fully utilize the ECG waveform information contained in all 12 leads, 12 multiple-feature branches, each mainly composed of convolutional blocks, were employed to extract features within each lead. For effective feature fusion, MFB-CBRNN also used a bidirectional LSTM to integrate the features output from each lead branch.
• ML-ResNet: ML-ResNet also employed a multi-lead framework for MI detection and location, but it used a residual-based network structure in designing the multi-feature branches. Each multi-feature branch mainly consisted of three stacked residual blocks, each of which included three convolutional layers and a skip connection. Thus, ML-ResNet can learn feature representations within the leads layer by layer and perform feature fusion.
• ML-Net: ML-Net also had a novel channel branch structure based on a multi-lead framework. The branch structure is designed as three parallel convolutional networks that use different convolutional kernel sizes to integrate multi-scale features within the leads. However, unlike the above two approaches, ML-Net only used four-lead ECG data to accommodate the lightweight needs of portable devices.

Unlike these methods, our approach combines 2-D and 1-D residual blocks to extract early significant features and finer-grained features. We also use the SE module and skip connections in the MCA-net to weight features and to prevent the loss of valid information. Finally, by introducing a multi-task framework, the inter-task correlation and complementarity are fully exploited to extend the MCA-net to individual tasks and improve the overall classification performance.

Because MFB-CBRNN, ML-ResNet, and ML-Net were all originally designed for different MI classification tasks, using different amounts of data and different means of data processing, we extend them to our tasks by unifying the data inputs and modifying their corresponding classification layers for a fair comparison.

5.2. Implementation details

All methods involved in this paper were developed with Keras and the TensorFlow framework [45] and were trained and tested using Python 3.6. The experiments were performed on a server with an Intel(R) Xeon(R) Gold 6230 CPU @ 2.10 GHz and 251 GB memory. We set the initial learning rate of the model to 0.001 and adjusted it during training using the callback function ReduceLROnPlateau. The number of epochs and the batch size were set to 100 and 64, respectively. Moreover, we use the Adam optimizer and early stopping to speed up the learning process and prevent overfitting.

5.3. Evaluation metrics

In this work, the model performances on the PTB dataset are evaluated by three metrics, namely, sensitivity (Sen), specificity (Spe), and accuracy (Acc). Sen, also called recall, is the percentage of positive samples correctly predicted by the model over all positive samples. Conversely, Spe is the percentage of negative samples correctly predicted by the model over all negative samples. Acc is the percentage of all correctly identified samples over the total samples. These three evaluation metrics can be calculated by the following equations:

$$\mathrm{Sen} = \frac{TP}{TP + FN} \tag{5}$$

$$\mathrm{Spe} = \frac{TN}{TN + FP} \tag{6}$$

$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

where TP (True Positive) and TN (True Negative) are the numbers of samples whose true and predicted categories are both positive or both negative, respectively. FP (False Positive) is the number of samples whose true category is negative but whose predicted category is positive, and FN (False Negative) is the number of samples whose true category is positive but whose predicted category is negative.

For the multi-label dataset PTBXL, we use the same metrics to evaluate the model performances on the MI detection task. But for the MI location task, we select the accuracy and the area under the ROC
curve (AUROC) suggested by the dataset publishers [41] to evaluate the model performances. AUROC is the probability that a uniformly drawn random positive sample is ranked ahead of a uniformly drawn random negative sample. In our work, we set each label in turn as the positive class and the remaining labels as the negative class, calculate the corresponding AUROCs, and average them.

All evaluation metrics are reported as mean ± variance over the cross validation experiments to measure the classification performance and stability of the involved methods.

5.4. Experimental results on the PTB dataset

Unlike the intra-patient schemes often used in previous studies, inter-patient schemes are of greater clinical significance for demonstrating model generalization.

Therefore, we perform 5-fold cross validation experiments on the PTB dataset under an inter-patient scheme to verify the effectiveness and generalizability of our approach. The mean sensitivity, specificity, and accuracy of our model are shown in Table 5, where the optimal solution is highlighted. We can see that MCA-net achieves the best performance for all tasks. In the detection task, the sensitivity, specificity, and accuracy of MCA-net are improved compared with the other comparative methods, reaching 98.01%, 89.01%, and 95.57%, respectively, and the improvement in specificity is especially pronounced. The classification performance of MCA-net is also improved in the Location-5 task, achieving 55.35%, 67.96%, and 68.90%, respectively. For the Location-7 task, MCA-net achieves 41.80%, 49.01%, and 49.18%, an improvement of more than 6% in accuracy.

Table 5
Performance comparison of MCA-net with three state-of-the-art methods on the PTB dataset under the inter-patient scheme.
| Model | Detection Sen(%) | Detection Spe(%) | Detection Acc(%) | Location-5 Sen(%) | Location-5 Spe(%) | Location-5 Acc(%) | Location-7 Sen(%) | Location-7 Spe(%) | Location-7 Acc(%) |
|---|---|---|---|---|---|---|---|---|---|
| MFB-CBRNN | 93.34 ± 0.11 | 73.94 ± 0.73 | 89.53 ± 0.16 | 54.41 ± 0.37 | 63.04 ± 0.18 | 63.79 ± 0.19 | 36.96 ± 0.33 | 41.59 ± 0.39 | 41.71 ± 0.39 |
| ML-ResNet | 96.07 ± 0.06 | 71.60 ± 1.45 | 91.28 ± 0.10 | 53.38 ± 0.48 | 64.04 ± 0.23 | 65.00 ± 0.21 | 36.61 ± 0.34 | 42.37 ± 0.51 | 42.51 ± 0.52 |
| ML-Net | 95.24 ± 0.03 | 77.69 ± 0.53 | 91.81 ± 0.04 | 54.23 ± 0.16 | 64.80 ± 0.12 | 65.62 ± 0.12 | 36.57 ± 0.30 | 41.90 ± 0.32 | 42.03 ± 0.32 |
| MCA-net | 98.01 ± 0.05^bc | 86.01 ± 0.18^c | 95.57 ± 0.06^c | 55.35 ± 0.10 | 67.96 ± 0.10 | 68.90 ± 0.10 | 41.80 ± 0.10^abc | 49.01 ± 0.09 | 49.18 ± 0.09^a |

Location-5: The five-category MI location task. Location-7: The seven-category MI location task.
The superscripts a, b, c indicate that MCA-net showed a significant improvement in this column compared to MFB-CBRNN, ML-ResNet, and ML-Net, respectively, in the T-test experiment at the 0.05 significance level.

We can also observe that MFB-CBRNN performs slightly worse in all tasks compared to the other methods. ML-ResNet achieves good classification accuracy on both detection and location tasks. ML-Net performs a little better than ML-ResNet on both the Detection and Location-5 tasks, but slightly worse on the Location-7 task. This indicates that they lack adaptability to different tasks. In contrast, our method has not only superior performance but also high generalizability. This is mainly because we use multiple convolutional layers with different convolutional kernel sizes and the SE module to learn important features at different granularities. It also benefits from the multi-task framework, which can effectively share underlying parameters and complementary information between different tasks.

In addition to the mean of each evaluation metric, we also calculate its variance to compare the stability of the methods. As can be seen in Table 5, the variance of our method is smaller than that of the comparison methods for most of the evaluation indicators, which indicates that our method ensures reliable stability while maintaining superior performance. Moreover, a T-test at a 0.05 significance level was used to analyze whether MCA-net was significantly better than the other comparative methods for each evaluation metric. As can be seen, our method has statistically significant advantages on the Detection and Location-7 tasks, especially in terms of sensitivity. In the Location-5 task, however, the improvements do not reach significance, possibly because the label imbalance and distribution place a higher demand on the model's ability to differentiate.

5.5. Experimental results on the PTBXL dataset

Different from the PTB dataset, the PTBXL dataset involves a larger number and wider range of patients, and each patient can provide one or more ECG records, allowing for richer sample information and longer ECG sequences. In addition, the PTBXL dataset is closer to the actual clinical situation than the PTB dataset because its ECG records usually contain multiple labels. We extract HC and MI records from the PTBXL dataset for MI detection and location experiments to validate the performance of the MCA-net. Longer duration ECG records are used as input units to validate the effectiveness of the MCA-net in learning long time series features. To more closely resemble the real-world data distribution, the PTBXL dataset is divided by its publisher into ten folds following the inter-patient paradigm [41], which means that no patient contributes ECG records to more than one fold. Therefore, we perform 10-fold cross validation based on these folds to obtain more reliable results. As shown in Table 6, our method still outperforms the other comparative methods, achieving a sensitivity of 88.59%, specificity of 96.55%, and accuracy of 93.65% on the detection task. Meanwhile, MCA-net also performs better than the other methods on the Location-5 and Location-7 tasks, achieving 84.03% and 81.47% accuracy as well as 78.76% and 74.05% AUROC, respectively. This demonstrates the effectiveness and generalizability of our method across the different datasets.

We can also see that, in contrast to the results on the PTB dataset, ML-Net performs the worst on the PTBXL dataset. A possible reason is that the records provided by the PTBXL dataset are much longer (typically more than several thousand sampling points), while the lightweight framework adopted by ML-Net cannot effectively learn long-term dependencies from such long ECG sequences. MFB-CBRNN performs relatively well on both datasets, which points to the stability of the convolutional layer. In addition, ML-ResNet performs better on average than the other two comparison methods, which points to the effectiveness of the residual module. Our approach not only combines the stability of the convolutional layer with the effectiveness of the residual module but is also better able to learn the long-term dependence of ECG sequences, thus achieving better performance.

Again, the stability of the MCA-net is further verified by the variance of the experimental results. In addition, we again use the T-test to verify the significance of the improvements of MCA-net on the PTBXL dataset. As can be seen in Table 6, MCA-net achieves significant improvements in most tasks relative to the comparison methods, especially in the Location-5 and Location-7 tasks.

6. Discussion

In this paper, we design a multi-task channel attention network named MCA-net for MI detection and location tasks. Our approach shows an improvement in overall performance compared to other methods. To further evaluate the effectiveness of the various components within the MCA-net, we first discuss the gains brought by the multi-task framework to the MCA-net. Then, ablation experiments are performed for the main modules of the channel attention network inside the MCA-net. Finally, a further analysis is carried out.
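The shared-bottom formulation of Eqs. (3) and (4), which the ablations in this section take apart, can be illustrated with a minimal sketch. It is plain Python and is not the authors' Keras implementation: a single ReLU layer stands in for the channel attention network, and the layer sizes, the weight values, and the task weights a, b, c (which the text does not report) are hypothetical values chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def shared_bottom_forward(x, shared, heads):
    """Eq. (3): every task head reads the same shared representation C(x).

    `shared` is a placeholder for the channel attention network (assumption:
    one ReLU layer); `heads` maps a task name to its fully connected layer.
    """
    hidden = [max(0.0, v) for v in linear(shared["W"], shared["b"], x)]
    return {name: softmax(linear(h["W"], h["b"], hidden))
            for name, h in heads.items()}


def total_loss(outputs, labels, task_weights):
    """Eq. (4): weighted sum of the per-task cross-entropy losses."""
    return sum(task_weights[t] * -math.log(outputs[t][labels[t]])
               for t in outputs)


def make_head(n_out, n_in, scale):
    """Hypothetical deterministic weights for an n_out-way task head."""
    return {"W": [[scale * (i - j) for j in range(n_in)] for i in range(n_out)],
            "b": [0.0] * n_out}


# Hypothetical toy configuration: 3 input features, 2 shared units.
shared = {"W": [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]], "b": [0.0, 0.1]}
heads = {"detection": make_head(2, 2, 0.1),   # MI vs. HC
         "location5": make_head(5, 2, 0.1),   # Location-5
         "location7": make_head(7, 2, 0.1)}   # Location-7
weights = {"detection": 1.0, "location5": 0.5, "location7": 0.5}  # a, b, c (assumed)
labels = {"detection": 1, "location5": 3, "location7": 6}

outputs = shared_bottom_forward([1.0, 2.0, 3.0], shared, heads)
loss = total_loss(outputs, labels, weights)
print({t: len(p) for t, p in outputs.items()}, loss)
```

Because all three heads read the same hidden vector, a gradient step on the combined loss would update the shared layer with information from every task, which is the mechanism the multi-task ablation in Section 6.1 probes.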


Table 6
Performance comparison of MCA-net with three state-of-the-art methods on the PTBXL dataset under the inter-patient scheme.
| Model | Detection Sen(%) | Detection Spe(%) | Detection Acc(%) | Location-5 Acc(%) | Location-5 AUROC(%) | Location-7 Acc(%) | Location-7 AUROC(%) |
|---|---|---|---|---|---|---|---|
| MFB-CBRNN | 83.12 ± 0.71 | 96.41 ± 0.02 | 91.55 ± 0.06 | 81.96 ± 0.02 | 74.51 ± 0.09 | 79.17 ± 0.024 | 69.21 ± 0.09 |
| ML-ResNet | 86.87 ± 0.08 | 96.93 ± 0.01 | 93.24 ± 0.01 | 83.25 ± 0.00 | 76.15 ± 0.01 | 80.23 ± 0.01 | 72.07 ± 0.03 |
| ML-Net | 73.32 ± 0.08 | 91.56 ± 0.02 | 84.89 ± 0.00 | 68.30 ± 0.01 | 68.11 ± 0.00 | 64.64 ± 0.00 | 62.98 ± 0.00 |
| MCA-net | 88.59 ± 0.06^abc | 96.55 ± 0.01^c | 93.65 ± 0.01^ac | 84.03 ± 0.005^abc | 78.76 ± 0.02^abc | 81.47 ± 0.01^abc | 74.05 ± 0.03^abc |

Location-5: The five-category MI location task. Location-7: The seven-category MI location task.
The superscripts a, b, c indicate that MCA-net showed a significant improvement in this column compared to MFB-CBRNN, ML-ResNet, and ML-Net, respectively, in the T-test experiment at the 0.05 significance level.
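The significance markers in the table come from a T-test at the 0.05 level. The text does not say which variant was used, so the sketch below is only one plausible reading: it assumes a one-sided paired t-test over per-fold scores, since both models are evaluated on the same folds. The fold accuracies are invented for illustration and are not results from the paper; the critical values are the standard one-sided Student's t table entries for 4 and 9 degrees of freedom (5 and 10 folds).

```python
import math
from statistics import mean, stdev

# One-sided critical values of Student's t at alpha = 0.05,
# keyed by degrees of freedom (folds - 1).
T_CRIT_ONE_SIDED_005 = {4: 2.132, 9: 1.833}


def paired_t_statistic(ours, baseline):
    """t statistic of the per-fold differences (ours - baseline)."""
    diffs = [a - b for a, b in zip(ours, baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def significantly_better(ours, baseline):
    """True if `ours` beats `baseline` at the 0.05 level (one-sided)."""
    df = len(ours) - 1
    return paired_t_statistic(ours, baseline) > T_CRIT_ONE_SIDED_005[df]


# Hypothetical per-fold accuracies (%) for a 10-fold experiment.
ours = [84.1, 83.9, 84.3, 84.0, 83.8, 84.2, 84.0, 84.1, 83.9, 84.0]
baseline = [83.2, 83.4, 83.1, 83.3, 83.0, 83.5, 83.2, 83.3, 83.1, 83.4]

print(paired_t_statistic(ours, baseline), significantly_better(ours, baseline))
```

If the folds were not shared between methods, an independent-samples test would be the appropriate substitute, and the resulting markers could differ.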

Table 7
Ablation experiments in Multi-Task Framework under inter-patient scheme.
| Model | PTB Acc-2(%) | PTB Acc-5(%) | PTB Acc-7(%) | PTBXL Acc-2(%) | PTBXL Acc-5(%) | PTBXL AUROC-5(%) | PTBXL Acc-7(%) | PTBXL AUROC-7(%) |
|---|---|---|---|---|---|---|---|---|
| Cnet | 94.52 ± 0.06 | 66.45 ± 0.5131 | 46.41 ± 0.47 | 93.57 ± 0.00 | 83.95 ± 0.00 | 76.99 ± 0.02 | 81.34 ± 0.01 | 73.43 ± 0.02 |
| MCA-net25 | 95.04 ± 0.06 | 67.06 ± 0.36 | – | 93.60 ± 0.01 | 83.74 ± 0.01 | 77.77 ± 0.03 | – | – |
| MCA-net27 | 95.21 ± 0.07 | – | 48.12 ± 0.03 | 93.65 ± 0.01 | – | – | 81.14 ± 0.01 | 73.48 ± 0.02 |
| MCA-net | 95.57 ± 0.06 | 68.90 ± 0.10 | 49.18 ± 0.09 | 93.65 ± 0.01 | 84.03 ± 0.00 | 78.76 ± 0.02 | 81.47 ± 0.01 | 74.05 ± 0.03 |

Acc-2: Accuracy of the MI detection task. Acc-5: Accuracy of the five-category MI location task. Acc-7: Accuracy of the seven-category MI location task. AUROC-5: AUROC of the five-category MI location task. AUROC-7: AUROC of the seven-category MI location task.
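The AUROC-5 and AUROC-7 columns follow the averaging described in Section 5.3: each label is treated in turn as the positive class, a binary AUROC is computed, and the per-label values are averaged. A minimal sketch of that procedure is given below. It uses the rank-probability definition directly (ties counted as one half) rather than any particular library, and the scores and multi-label targets are hypothetical.

```python
def binary_auroc(scores, is_positive):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(score_rows, label_rows):
    """Average of the one-vs-rest AUROCs over all labels."""
    n_labels = len(label_rows[0])
    per_label = [
        binary_auroc([row[k] for row in score_rows],
                     [bool(row[k]) for row in label_rows])
        for k in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical predictions for 4 records and 3 labels (multi-label targets).
scores = [[0.9, 0.2, 0.1],
          [0.8, 0.7, 0.3],
          [0.3, 0.6, 0.8],
          [0.1, 0.4, 0.2]]
labels = [[1, 0, 0],
          [1, 1, 0],
          [0, 0, 1],
          [0, 1, 0]]

print(macro_auroc(scores, labels))
```

The pairwise loop is quadratic in the number of records, which is fine for a sketch; a rank-based implementation would be the usual choice on a dataset of PTBXL's size.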

Table 8
Training time comparison of MCA-net with three state-of-the-art methods.
| Model | PTB Epoch(s) | PTB Fold(s) | PTBXL Epoch(s) | PTBXL Fold(s) |
|---|---|---|---|---|
| MFB-CBRNN | 93.78 | 37205.48 | 156.74 | 2743.85 |
| ML-ResNet | 115.11 | 4595.61 | 84.64 | 1847.15 |
| ML-Net | 98.55 | 3145.82 | 85.54 | 3413.86 |
| MCA-net | 77.23 | 2276.68 | 135.44 | 2682.60 |

6.1. The benefits of the multi-task framework

To validate the impact of the multi-task framework in MCA-net, we design a single-task model and two dual-task models to perform the same experiments as our model on the PTB and PTBXL datasets. The results of the ablation experiments can be seen in Table 7. Here, Cnet refers to the single-task model, MCA-net25 refers to the dual-task model for the Detection and Location-5 tasks, and MCA-net27 refers to the dual-task model for the Detection and Location-7 tasks.

Based on Table 7, the single-task model Cnet performs well on tasks of different granularity, especially the MI detection task. Meanwhile, the results generally improve as the number of tasks increases and the granularity of the tasks becomes finer. In particular, our model achieves the best results, not only on different datasets but also on different tasks. On the PTB dataset, the accuracy gains of MCA-net over Cnet for the Detection, Location-5, and Location-7 tasks are 1.05, 2.45, and 2.77 percentage points, respectively. On the PTBXL dataset, the detection accuracy improves by 0.08 percentage points and the Location-5 and Location-7 AUROCs improve by 1.77 and 0.62 percentage points, respectively, while the location accuracy remains high. This indicates that the multi-task learning framework we adopt can effectively capture common information and task-specific features related to different tasks, and enhance the final feature representation through information complementation between tasks. We can also observe that increasing the number of tasks enables more aggregation of features learned from different tasks and enhances the final learned representations. The increased number of classification categories also leads the model to learn more accurate features, which affects the final decision.

Combined with Tables 5 and 6, we can also observe that the single-task model Cnet still has higher classification accuracy than all the comparison methods, which further demonstrates that the channel attention network structure can capture the relevant features in the 12-lead ECGs more effectively.

6.2. The benefits of the channel attention network

In this subsection, to further understand the impact of each module on the final classification performance, we remove the different modules of the channel attention network one at a time and conduct experiments in the same configuration. We compare the classification accuracy between our model MCA-net and its variants, i.e., MCA-net without the 2-D residual block, MCA-net without the 1-D residual blocks, and MCA-net without the SE modules. The ablation experimental results on the PTB and PTBXL datasets are shown in Fig. 4.

• No 2-D Residual Block: We can see from Fig. 4 that the model without the 2-D residual block exhibits a decrease in accuracy compared to MCA-net. This demonstrates that the 2-D residual block can help the model learn the morphological features of the ECG waveforms inside the leads in the early feature extraction stage of the model. In addition, the 2-D residual block has a larger impact on model performance on the PTB dataset, which suggests that it may be more effective in extracting features from short ECG sequences.
• No 1-D Residual Blocks: The model without the stacked 1-D residual blocks also shows a degradation in performance. However, the impact of the 1-D residual blocks differs noticeably across the datasets. As shown in Fig. 4a, the 1-D residual blocks bring less than 2% improvement in accuracy per task on the PTB dataset, while on the PTBXL dataset (shown in Fig. 4b) they are more effective than the other modules, with gains of 3.77%, 5.36%, and 7.18%, respectively. This indicates that the stacked 1-D residual blocks can more effectively enhance the model's ability to capture fine-grained features from long ECG sequences, and are more suitable for processing relatively complex learning tasks.
• No SE: To explore the benefits brought by the SE modules, we remove the SE modules from the 2-D residual block and the 1-D residual blocks and then conduct experiments. We can see from Fig. 4 that the classification performance of MCA-net decreases after removing the SE modules. This is particularly evident on the PTB dataset, where the accuracy drops on the three classification tasks reach 2.83%, 4.62%, and 6.63%, respectively. This shows that the SE modules can strengthen the important features and weaken the non-important features on the ECG classification task, thus improving the classification performance.
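The feature re-weighting attributed to the SE modules can be illustrated with a generic squeeze-and-excitation step, assuming SE here denotes the standard squeeze-and-excitation block. The sketch below is plain Python and is not the MCA-net layer itself: the number of channels, the reduction to two hidden units, and all weight values are hypothetical. It pools each channel to one number (squeeze), maps the pooled vector through a small bottleneck to one gate per channel (excitation), and multiplies each channel by its gate.

```python
import math


def se_reweight(feature_map, w_reduce, w_expand):
    """Generic squeeze-and-excitation over a [channels][time] feature map.

    Returns the re-weighted feature map and the per-channel gates in (0, 1).
    """
    # Squeeze: global average pooling per channel.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid (one gate per channel).
    hidden = [max(0.0, sum(w * z for w, z in zip(row, pooled)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: strengthen or weaken each channel by its gate.
    scaled = [[g * v for v in channel]
              for g, channel in zip(gates, feature_map)]
    return scaled, gates


# Hypothetical 4-channel feature map with 3 time steps per channel.
features = [[1.0, 2.0, 3.0],
            [0.0, 0.0, 0.0],
            [4.0, 4.0, 4.0],
            [-1.0, 1.0, 0.0]]
w_reduce = [[0.5, 0.1, 0.2, 0.0],      # 4 channels -> 2 hidden units
            [-0.3, 0.4, 0.1, 0.2]]
w_expand = [[1.0, -1.0],               # 2 hidden units -> 4 gates
            [-0.5, 0.5],
            [0.8, 0.2],
            [0.0, 0.0]]

scaled, gates = se_reweight(features, w_reduce, w_expand)
print([round(g, 3) for g in gates])
```

In a trained network the two weight matrices are learned, so the gates reflect which channels the loss found useful; removing the block, as in the "No SE" ablation, is equivalent to fixing every gate at 1.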


Fig. 4. Ablation experiments in channel attention network.

Fig. 5. An example from the learning curve (fold 10 on the PTBXL dataset).

Fig. 6. 5-fold cross validation results on the PTB dataset.


Fig. 7. 10-fold cross validation results on the PTBXL dataset.

Table 9
Training resources comparison of MCA-net with three state-of-the-art methods.
| Model | Trainable params |
|---|---|
| MFB-CBRNN | 14,462 |
| ML-ResNet | 83,006 |
| ML-Net | 3,849,314 |
| MCA-net | 112,250 |

6.3. Further analysis

• Time and Resource: To demonstrate that our method is relatively economical in terms of time and resources compared to other methods, we first measure the average time each method needs to complete one epoch and one fold for each task. For the MCA-net multi-task model, all tasks complete at the same time. For the comparison methods, we execute the three tasks in parallel and take the training time of the last one to finish as the benchmark for comparison. Then, we further compare the resource overheads of the models by considering the sum of trainable params required for each task. From Tables 8 and 9, we can see that MFB-CBRNN requires the fewest trainable params, but has a relatively long training time, especially the time overhead of one fold on the PTB dataset. In contrast, ML-Net has a relatively low time overhead on both datasets but requires far more trainable params than the other methods. Both ML-ResNet and our method balance the time and resource overheads better, but our method achieves better classification performance on each task than ML-ResNet. In the future, we will further optimize our model to reduce its overhead while ensuring high classification accuracy.
• Convergence: We perform a convergence analysis of the MCA-net based on the learning curves on the PTB and PTBXL datasets. Due to space constraints, we only show a randomly chosen part of the learning curves of the PTBXL dataset, as shown in Fig. 5. It shows the dynamics of accuracy and loss on both the training and validation sets under the three tasks. We can easily observe that the performance of MCA-net gradually improves on each task and quickly stabilizes with increasing epochs. Full learning curves on the PTBXL and PTB datasets are shown in the supplementary material (Fig. S1–15).
• Qualitative results: We use 5-fold and 10-fold cross validation on the PTB and PTBXL datasets, respectively, to obtain more reliable results. In this case, each fold is used in turn as the test set and the remaining data as the training set. Therefore, for further qualitative analysis of the experimental results, we detail the performance results and the corresponding number of records and patients for each fold, as shown in Figs. 6 and 7. We can see that MCA-net achieves good overall performance on the MI detection task, with an accuracy of over 95% in most folds. On the PTB dataset, there is some variation in the results across folds, which may be influenced by patient specificity, as the number of patients involved in each fold is limited. In contrast, the results are relatively stable across folds on the PTBXL dataset. This demonstrates the good adaptability and stability of the MCA-net.
• Limitations: Although our method demonstrates some advantages for each task, it still has some limitations. Firstly, the model suffers from under-learning on some of the labels due to data imbalance, leading to a performance bottleneck. For example, as shown in Fig. 8 (since the PTBXL dataset is a multi-label dataset, we only give the confusion matrix for each task on the PTB dataset), the classification accuracy of ALMI is significantly lower than that of the other labels in the MI location task. Secondly, on the MI location task, ALMI and AMI, as well as IMI and ILMI, are easily confused. This may be because the proximity of infarct locations usually leads to similar ECG waveform abnormalities, making it more difficult for our model to distinguish them. This indicates that our method needs further improvement in its ability to identify the location of adjacent infarcts.
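The two limitations above are both read off a confusion matrix: a low per-label accuracy shows up as a low diagonal share in that label's row, and easily confused labels show up as large off-diagonal counts between a pair. A minimal sketch of those two readings follows, applying Eqs. (5) and (6) per class in a one-vs-rest manner. The counts are invented for illustration and are not the values shown in Fig. 8; the function names are hypothetical.

```python
def per_class_sensitivity(cm):
    """Eq. (5) per class: diagonal count over the row total (rows = true class)."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def per_class_specificity(cm):
    """Eq. (6) per class, treating each class in turn as positive."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[r][i] for r in range(n) if r != i)
        tn = total - tp - fn - fp
        result.append(tn / (tn + fp))
    return result


def most_confused_pair(cm, names):
    """Pair of classes with the largest number of mutual misclassifications."""
    n = len(cm)
    pairs = [(cm[i][j] + cm[j][i], names[i], names[j])
             for i in range(n) for j in range(i + 1, n)]
    count, first, second = max(pairs, key=lambda p: p[0])
    return first, second, count


# Hypothetical confusion matrix (rows: true label, columns: predicted label).
NAMES = ["AMI", "ALMI", "HC"]
CM = [[40, 8, 2],
      [12, 6, 2],
      [3, 1, 76]]

print(per_class_sensitivity(CM))       # per-label recall
print(most_confused_pair(CM, NAMES))   # pair that is mixed up most often
```

With these made-up counts the minority label has the lowest sensitivity and is mostly absorbed by its anatomical neighbour, which mirrors the pattern the text describes for ALMI and AMI.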


Fig. 8. Confusion matrices of MI detection and location tasks on the PTB dataset.

7. Conclusion

In clinical practice, cardiologists can make a preliminary diagnosis of MI by analyzing abnormal ECG signals, but it is very difficult to give a rapid and accurate diagnosis. Therefore, providing an automatic computer-aided method can effectively reduce the burden on healthcare professionals while providing reliable support for the prognostic assessment of patients.

In this paper, we design a multi-task channel attention network named MCA-net. It uses multiple independent channels to more comprehensively extract valid features from each lead. Each channel includes multiple stacked convolutional layers which contain SE modules to better assign feature weights. We also add a multi-task framework to additionally take into account the shared and complementary information between the MI detection and location tasks to further improve the MCA-net performance. The experimental results show that our model has state-of-the-art performance.

Although our method demonstrates some advantages for each task, it still has some limitations. Firstly, the model suffers from under-learning on some of the labels due to data imbalance. Secondly, MCA-net is prone to confusion when differentiating the relevant labels. To address the above issues, in the future, we will consider collecting more clinical data while further improving our MCA-net with the help of data augmentation such as contrastive learning. Moreover, considering the multi-view nature of the 12-lead ECGs, more effective lead fusion methods can be considered from a multi-view fusion perspective to obtain a richer feature representation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Key Research and Development Program of China (No. 2021YFF1201200), the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (No. U1909208), 111 Project (No. B18059) and the Science and Technology Major Project of Changsha (No. kh2202004). This work was carried out in part using computing resources at the High Performance Computing Center of Central South University.

Appendix A. Supplementary data

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.compbiomed.2022.106199.

References

[1] Javad Hassannataj Joloudari, Sanaz Mojrian, Issa Nodehi, Amir Mashmool, Zeynab Kiani Zadegan, Sahar Khanjani Shirkharkolaie, Tahereh Tamadon, Samiyeh Khosravi, Mitra Akbari, Edris Hassannataj, et al., A survey of applications of artificial intelligence for myocardial infarction disease diagnosis, 2021, arXiv preprint arXiv:2107.06179.
[2] Emelia J. Benjamin, Paul Muntner, Alvaro Alonso, Marcio S. Bittencourt, Clifton W. Callaway, April P. Carson, Alanna M. Chamberlain, Alexander R. Chang, Susan Cheng, Sandeep R. Das, et al., Heart disease and stroke statistics–2019 update: a report from the American Heart Association, Circulation 139 (10) (2019) e56–e528.
[3] R.M. Savage, G.S. Wagner, R.E. Ideker, S.A. Podolsky, D.B. Hackel, Correlation of postmortem anatomic findings with electrocardiographic changes in patients with myocardial infarction: retrospective study of patients with typical anterior and posterior infarcts, Circulation 55 (2) (1977) 279–285.
[4] Ammar Awad Mutlag, Mohd Khanapi Abd Ghani, Mazin Abed Mohammed, Abdullah Lakhan, Othman Mohd, Karrar Hameed Abdulkareem, Begonya Garcia-Zapirain, Multi-agent systems in fog–cloud computing for critical healthcare task management model (CHTM) used for ECG monitoring, Sensors 21 (20) (2021) 6923.
[5] Ammar Awad Mutlag, Mohd Khanapi Abd Ghani, Mazin Abed Mohammed, A healthcare resource management optimization framework for ECG biomedical sensors, in: Efficient Data Handling for Massive Internet of Medical Things, Springer, 2021, pp. 229–244.
[6] Xinwen Liu, Huan Wang, Zongjin Li, Lang Qin, Deep learning in ECG diagnosis: A review, Knowl.-Based Syst. 227 (2021) 107187.
[7] Patrick T. O'Gara, Frederick G. Kushner, Deborah D. Ascheim, Donald E. Casey, Mina K. Chung, James A. De Lemos, Steven M. Ettinger, James C. Fang, Francis M. Fesmire, Barry A. Franklin, et al., 2013 ACCF/AHA guideline for the management of ST-elevation myocardial infarction: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines, J. Am. Coll. Cardiol. 61 (4) (2013) e78–e140.
[8] Ph. Gabriel Steg, Stefan K. James, Dan Atar, Luigi P. Badano, Carina Blomstrom Lundqvist, Michael A. Borger, Carlo Di Mario, Kenneth Dickstein, Gregory Ducrocq, et al., ESC guidelines for the management of acute myocardial infarction in patients presenting with ST-segment elevation, Eur. Heart J. 33 (20) (2012) 2569–2619.
[9] L.D. Sharma, R.K. Sunkaria, Myocardial infarction detection and localization using optimal features based lead specific approach, Innov. Res. Biomed. En 41 (1) (2020) 58–70.
[10] Gong Zhang, Yujuan Si, Di Wang, Weiyi Yang, Yongjian Sun, Automated detection of myocardial infarction using a gramian angular field and principal component analysis network, IEEE Access 7 (2019) 171570–171583.
[11] Nahian Ibn Hasan, Arnab Bhattacharjee, Deep learning approach to cardiovascular disease classification employing modified ECG signal from empirical mode decomposition, Biomed. Signal Process. Control 52 (2019) 128–140.
[12] Chaitra Sridhar, Oh Shu Lih, V. Jahmunah, Joel E.W. Koh, Edward J. Ciaccio, Tan Ru San, N. Arunkumar, Seifedine Kadry, U. Rajendra Acharya, Accurate detection of myocardial infarction using non linear features with ECG signals, J. Ambient Intell. Hum. Comput. 12 (3) (2021) 3227–3244.
[13] Binish Fatimah, Pushpendra Singh, Amit Singhal, Dipro Pramanick, Pranav S., Ram Bilas Pachori, Efficient detection of myocardial infarction from single lead ECG signal, Biomed. Signal Process. Control 68 (2021) 102678.
[14] Dionisije Sopic, Amin Aminifar, Amir Aminifar, David Atienza, Real-time event-driven classification technique for early detection and prevention of myocardial infarction on wearable systems, IEEE Trans. Biomed. Circuits Syst. 12 (5) (2018) 982–992.
[15] Ashok Kumar Dohare, Vinod Kumar, Ritesh Kumar, Detection of myocardial infarction in 12 lead ECG using support vector machine, Appl. Soft Comput. 64 (2018) 138–147.


[16] Chuang Han, Li Shi, Automated interpretable detection of myocardial infarction [31] Narjes Benameur, M. Abed Mohammed, Ramzi Mahmoudi, Younes Arous, Be-
fusing energy entropy and morphological features, Comput. Methods Programs gonya Garcia-Zapirain, Karrar Hameed Abdulkareem, Mohamed Hedi Bedoui,
Biomed. 175 (2019) 9–23. Parametric methods for the regional assessment of cardiac wall motion
[17] Awni Y. Hannun, Pranav Rajpurkar, Masoumeh Haghpanahi, Geoffrey H. Tison, abnormalities: comparison study, Comput. Mater. Continua 69 (1) (2021)
Codie Bourn, Mintu P. Turakhia, Andrew Y. Ng, Cardiologist-level arrhythmia 1233–1252.
detection and classification in ambulatory electrocardiograms using a deep neural [32] Hope Pereira, Nivedita Daimiwal, Analysis of features for myocardial infarction
network, Nat. Med. 25 (1) (2019) 65–69. and healthy patients based on wavelet, in: 2016 Conference on Advances in
[18] Zahra Ebrahimi, Mohammad Loni, Masoud Daneshtalab, Arash Gharehbaghi, A Signal Processing, CASP, 2016, pp. 164–169.
review on deep learning methods for ECG arrhythmia classification, Expert Syst. [33] Jia Liu, Chi Zhang, Tapani Ristaniemi, Fengyu Cong, Detection of myocardial
Appl. X 7 (2020) 100033. infarction from multi-lead ECG using dual-q tunable Q-factor wavelet transform,
[19] Nils Strodthoff, Claas Strodthoff, Detecting and interpreting myocardial infarction in: 2019 41st Annual International Conference of the IEEE Engineering in
using fully convolutional neural networks, Physiol. Meas. 40 (1) (2019) 015001. Medicine and Biology Society, EMBC, 2019, pp. 1496–1499.
[20] U. Rajendra Acharya, Hamido Fujita, Shu Lih Oh, Yuki Hagiwara, Jen Hong [34] Teh Ying Wah, Mazin Abed Mohammed, Uzair Iqbal, Seifedine Kadry, Arnab
Tan, Muhammad Adam, Application of deep convolutional neural network for Majumdar, Orawit Thinnukool, et al., Novel DERMA fusion technique for ECG
automated detection of myocardial infarction using ECG signals, Inform. Sci. heartbeat classification, Life 12 (6) (2022) 842.
415–416 (2017) 190–198. [35] Garima Sahu, Kailash Chandra Ray, An efficient method for detection and
[21] Ziyang He, Shuaiying Yuan, Jianhui Zhao, Bo Du, Zhiyong Yuan, Adi Alhudhaif, localization of myocardial infarction, IEEE Trans. Instrum. Meas. 71 (2021) 1–12.
Fayadh Alenezi, Sara A. Althubiti, A novel myocardial infarction localiza- [36] Wenhan Liu, Mengxin Zhang, Yidan Zhang, Yuan Liao, Qijun Huang, Sheng
tion method using multi-branch DenseNet and spatial matching-based active Chang, Hao Wang, Jin He, Real-time multilead convolutional neural network
semi-supervised learning, Inform. Sci. 606 (2022) 649–668. for myocardial infarction detection, IEEE J. Biomed. Health. Inf. 22 (5) (2017)
[22] Chuang Han, Li Shi, ML-ResNet: A novel network to detect and locate myocardial 1434–1444.
infarction using 12 leads ECG, Comput. Methods Programs Biomed. 185 (2020) [37] Wenqiang Li, Yuk Ming Tang, Kai Ming Yu, Suet To, SLC-GAN: An automated
105138. myocardial infarction detection model based on generative adversarial networks
[23] Wenhan Liu, Fei Wang, Qijun Huang, Sheng Chang, Hao Wang, Jin He, MFB- and convolutional neural networks with single-lead electrocardiogram synthesis,
CBRNN: A hybrid network for MI detection using 12-lead ECGs, IEEE J. Biomed. Inform. Sci. 589 (2022) 738–750.
Health. Inf. 24 (2) (2020) 503–514. [38] Wenhan Liu, Qijun Huang, Sheng Chang, Hao Wang, Jin He, Multiple-feature-
[24] Yangjie Cao, Tingting Wei, Bo Zhang, Nan Lin, Joel J.P.C. Rodrigues, Jie Li, branch convolutional neural network for myocardial infarction diagnosis using
Di Zhang, ML-net: Multi-channel lightweight network for detecting myocardial electrocardiogram, Biomed. Signal Process. Control 45 (2018) 22–32.
infarction, IEEE J. Biomed. Health. Inf. 25 (10) (2021) 3721–3731. [39] Yang Cao, Wenyan Liu, Shuang Zhang, Lisheng Xu, Baofeng Zhu, Huiying Cui,
[25] Lidan Fu, Binchun Lu, Bo Nie, Zhiyun Peng, Hongying Liu, Xitian Pi, Hybrid Ning Geng, Hongguang Han, Stephen E. Greenwald, Detection and localization
network with attention mechanism for detection and location of myocardial of myocardial infarction based on multi-scale ResNet and attention mechanism,
infarction based on 12-lead electrocardiogram signals, Sensors 20 (4) (2020) Front. Physiol. 13 (2022) 24.
1020. [40] R. Bousseljot, D. Kreiseler, A. Schnabel, Nutzung der EKG-signaldatenbank
[26] Jie Hu, Li Shen, Gang Sun, Squeeze-and-excitation networks, in: Proceedings of CARDIODAT der PTB über das Internet, Biomed. Tech. 40 (1995) 317–318.
the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2018, [41] Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I.
pp. 7132–7141. Lunze, Wojciech Samek, Tobias Schaeffter, PTB-XL, a large publicly available
[27] K. Minami, H. Nakajima, T. Toyoshima, Real-time discrimination of ventricular electrocardiography dataset, Sci. Data 7 (1) (2020) 1–15.
tachyarrhythmia with Fourier-transform neural network, IEEE Trans. Biomed. [42] Rich Caruana, Multi-task learning, Mach. Learn. 28 (1) (1997) 41–75.
Eng. 46 (2) (1999) 179–185. [43] Jonathan Baxter, A model of inductive bias learning, J. Artificial Intelligence
[28] D.A. Coast, R.M. Stern, G.G. Cano, S.A. Briller, An approach to cardiac arrhyth- Res. 12 (2000) 149–198.
mia analysis using hidden Markov models, IEEE Trans. Biomed. Eng. 37 (9) [44] Sebastian Ruder, An overview of multi-task learning in deep neural networks,
(1990) 826–836. 2017, arXiv preprint arXiv:1706.05098.
[29] Li Sun, Yanping Lu, Kaitao Yang, Shaozi Li, ECG analysis using multiple instance [45] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey
learning for myocardial infarction detection, IEEE Trans. Biomed. Eng. 59 (12) Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et
(2012) 3348–3356. al., Tensorflow: A system for large-scale machine learning, in: 12th USENIX
[30] S. Selva Nidhyananthan, S. Saranya, R. Shantha Selva Kumari, Myocardial Symposium on Operating Systems Design and Implementation (OSDI 16), 2016,
infarction detection and heart patient identity verification, in: 2016 International pp. 265–283.
Conference on Wireless Communications, Signal Processing and Networking
(WiSPNET), 2016, pp. 1107–1111.
