
Feature selection may improve deep neural networks for the bioinformatics problems

Zheng Chen*, Meng Pang*, Zixin Zhao, Shuainan Li, Rui Miao, Yifan Zhang, Xiaoyue Feng,
Xin Feng, Yexian Zhang, Meiyu Duan, Lan Huang, Fengfeng Zhou#.

BioKnow Health Informatics Lab, College of Computer Science and Technology, and Key
Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education,
Jilin University, Changchun, Jilin, China, 130012.

* These authors contributed equally to this work.

# Correspondence may be addressed to Fengfeng Zhou: FengfengZhou@gmail.com or
ffzhou@jlu.edu.cn. Lab web site: http://www.healthinformaticslab.org/. Phone: +86-431-8516-6024. Fax: +86-431-8516-6024.
Supplementary Figure S1

Performance differences between FS+DNN and DNN. The performance measurements were
F1-score, Precision and Recall. DNN denotes one of the three deep neural network algorithms
CNN/DBN/RNN. “FS+DNN” is the model using the features selected by a feature selection
algorithm, while “DNN” is the model using the initial list of the 500 features with the maximal
variances. The number of features selected by the t-test based incremental feature selection
algorithm is labeled above each column, and the classification performance of each feature
subset was evaluated by one of the three DNNs, i.e., CNN/DBN/RNN.
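As a rough illustration, the t-test based incremental feature selection described above can be sketched as follows. This is a minimal sketch under assumptions: the function name is hypothetical, scipy/sklearn are used for convenience, and a logistic regression stands in for the CNN/DBN/RNN evaluators actually used in this study.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def ttest_incremental_fs(X, y, max_k=50):
    """Rank features by two-sample t-test p-value, then keep the
    top-k prefix that maximizes cross-validated F1 (incremental FS).
    A simple classifier stands in for the DNN evaluators here."""
    # Smaller p-value = stronger separation between the two classes.
    _, pvals = ttest_ind(X[y == 0], X[y == 1], axis=0)
    order = np.argsort(pvals)
    best_k, best_f1 = 1, -1.0
    for k in range(1, max_k + 1):
        clf = LogisticRegression(max_iter=1000)
        f1 = cross_val_score(clf, X[:, order[:k]], y, cv=5,
                             scoring="f1").mean()
        if f1 > best_f1:
            best_k, best_f1 = k, f1
    return order[:best_k], best_f1
```

The returned index list corresponds to the per-column feature counts labeled in the figure.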

[Figure: three bar panels of the F1-score, Precision and Recall differences for CNN/DBN/RNN over the datasets GSE103186, GSE53045, GSE66695, GSE74845 and GSE80970.]

Supplementary Figure S2

How Trank improved CNN models with different parameter values. Performance measurement
differences between FS+CNN and CNN on the five datasets are shown. FS+CNN is the CNN
model trained over the features selected by Trank, and CNN is the CNN model trained over the
initial list of 500 features. The three performance measurements F1-score/Precision/Recall were
evaluated using CNN with different (a) Epoch values, (b) Layer numbers, and (c) Learning
rates over the five datasets.

[Figure: F1-score, Precision and Recall difference panels for Epoch=100, Epoch=200 and Epoch=300 on the five GSE datasets.]

(a)

[Figure: F1-score, Precision and Recall difference panels for LayerNum=3, LayerNum=4 and LayerNum=5 on the five GSE datasets.]

(b)

[Figure: F1-score, Precision and Sp difference panels for LearningRate=0.001, LearningRate=0.01 and LearningRate=0.1 on the five GSE datasets.]

(c)

Supplementary Figure S3

How Trank improved DBN models with different parameter values. Performance measurement
differences between FS+DBN and DBN on the five datasets are shown. FS+DBN is the DBN
model trained over the features selected by Trank, and DBN is the DBN model trained over the
initial list of 500 features. The three performance measurements F1-score/Precision/Recall were
evaluated using DBN with different (a) Epoch values, (b) Layer numbers, and (c) Learning
rates over the five datasets.
[Figure: F1-score, Precision and Recall difference panels for Epoch=100, Epoch=200 and Epoch=300 on the five GSE datasets.]

(a)

[Figure: F1-score, Precision and Recall difference panels for LayerNum=3, LayerNum=4 and LayerNum=5 on the five GSE datasets.]

(b)

[Figure: F1-score, Precision and Recall difference panels for LearningRate=0.001, LearningRate=0.01 and LearningRate=0.1 on the five GSE datasets.]

(c)

Supplementary Figure S4

How Trank improved RNN models with different parameter values. Performance measurement
differences between FS+RNN and RNN on the five datasets are shown. FS+RNN is the RNN
model trained over the features selected by Trank, and RNN is the RNN model trained over the
initial list of 500 features. The three performance measurements F1-score/Precision/Recall were
evaluated using RNN with different (a) Epoch values, (b) Layer numbers, and (c) Learning
rates over the five datasets.

[Figure: F1-score, Precision and Recall difference panels for Epoch=100, Epoch=200 and Epoch=300 on the five GSE datasets.]

(a)

[Figure: F1-score, Precision and Recall difference panels for LayerNum=3, LayerNum=4 and LayerNum=5 on the five GSE datasets.]

(b)

[Figure: F1-score, Precision and Recall difference panels for LearningRate=0.001, LearningRate=0.01 and LearningRate=0.1 on the five GSE datasets.]

(c)

Supplementary Figure S5

How Trank improved the three DNN models with different numbers of top-ranked features.
Performance differences between FS+DNN and DNN on the five datasets are shown. FS+DNN
is the DNN model trained over the features selected by Trank, and DNN is the DNN model
trained over the initial list of the top-ranked features. Data labels on top of each column are the
numbers of features selected by Trank using DNN with different numbers of top-ranked features
over the five datasets. DNN could be (a) CNN, (b) DBN or (c) RNN. The performance
improvement of DNN by FS is illustrated by F1-score, Precision and Recall.

[Figure: CNN models. F1-score, Precision and Recall difference panels for Top1000, Top1500 and Top2000 features on the five GSE datasets.]

(a)

[Figure: DBN models. F1-score, Precision and Recall difference panels for Top1000, Top1500 and Top2000 features on the five GSE datasets.]

(b)

[Figure: RNN models. F1-score, Precision and Recall difference panels for Top1000, Top1500 and Top2000 features on the five GSE datasets.]

(c)

Supplementary Figure S6

How 11 feature selection algorithms improved the three DNN models on the five datasets.
Prediction performance differences between FS+DNN and DNN on the five datasets are shown.
FS+DNN is the DNN model trained over the features selected by each feature selection algorithm
denoted on the horizontal axis, and DNN is the DNN model trained over the initial list of 500
features. DNN could be (a) CNN, (b) DBN or (c) RNN. The three prediction performance
measurements were F1-score, Precision and Recall.
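The bar heights in these panels are plain per-dataset differences of the three measurements. A minimal sketch of that computation, assuming binary labels and sklearn metrics (the function name and argument names are illustrative, not the authors' code):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

def performance_differences(y_true, pred_fs_dnn, pred_dnn):
    """Differences (FS+DNN minus DNN) of the three measurements
    plotted in this figure; positive values mean FS helped."""
    metrics = {"F1-score": f1_score,
               "Precision": precision_score,
               "Recall": recall_score}
    return {name: fn(y_true, pred_fs_dnn) - fn(y_true, pred_dnn)
            for name, fn in metrics.items()}
```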

[Figure: CNN models improved by the 11 feature selection algorithms (Dtree, FCBF, Lasso, McOne, RF, Trank, lSVM, chi2, MetaTree, MI, SVM-RFE). F1-score, Precision and Recall difference panels on the five GSE datasets.]

(a)

[Figure: DBN models improved by the 11 feature selection algorithms. F1-score, Precision and Recall difference panels on the five GSE datasets.]

(b)

[Figure: RNN models improved by the 11 feature selection algorithms. F1-score, Precision and Recall difference panels on the five GSE datasets.]

(c)

Supplementary Figure S7

Averaged performance improvements of the 11 feature selection algorithms on the three DNNs.
Prediction performance differences between FS+DNN and DNN on the five datasets are shown
as the differences of (a) F1-score, (b) precision and (c) recall. The DNN model was trained over
the features selected by each feature selection algorithm denoted on the horizontal axis, and
DNN is the DNN model trained over the initial list of 500 features. The improvements were
averaged over the five methylomic datasets. The data series Avg3 gives the values averaged
over the other three curves in this figure.

[Figure: averaged F1-score improvements of the 11 feature selection algorithms for CNN, DBN, RNN and Avg3.]

(a)

[Figure: averaged precision improvements of the 11 feature selection algorithms for CNN, DBN, RNN and Avg3.]

(b)

[Figure: averaged recall improvements of the 11 feature selection algorithms for CNN, DBN, RNN and Avg3.]

(c)

Supplementary Figure S8

Prediction performances of SVM-RFE working with the three DNNs on the five methylomic
datasets. Shown are the three prediction performance measurements, i.e., (a) F1-score,
(b) precision and (c) recall of SVM-RFE+DNN. DNN could be CNN, DBN or RNN. The
horizontal axis gives the dataset, and the vertical axis gives the prediction performance of the
deep neural network using the features selected by SVM-RFE.

[Figure: F1-score of SVM-RFE+DNN for CNN, DBN and RNN on the five GSE datasets.]

(a)

[Figure: Precision of SVM-RFE+DNN for CNN, DBN and RNN on the five GSE datasets.]

(b)

[Figure: Recall of SVM-RFE+DNN for CNN, DBN and RNN on the five GSE datasets.]

(c)

Supplementary Figure S9

How SVM-RFE improved the three DNN models with different activation functions.
Performance measurement differences between FS+DNN and DNN on the five datasets are
shown. FS+DNN is the DNN model trained over the features selected by the SVM-RFE based
incremental feature selection algorithm, and DNN is the DNN model trained over the initial
list of 500 features. Prediction performance measurements were evaluated for the DNN models
with different activation functions using features selected by SVM-RFE. The evaluated
performance measurements were (a) F1-score, (b) Precision and (c) Recall.

[Figure: F1-score difference panels of the CNN, DBN and RNN models with relu, sigmoid and tanh activation functions on the five GSE datasets.]

(a)

[Figure: Precision difference panels of the CNN, DBN and RNN models with relu, sigmoid and tanh activation functions on the five GSE datasets.]

(b)

[Figure: Recall difference panels of the CNN, DBN and RNN models with relu, sigmoid and tanh activation functions on the five GSE datasets.]

(c)

Supplementary Figure S11

Three recent DNN models were improved by SVM-RFE on the five datasets. Overall
performance measurement differences between FS+DNN and DNN on the five datasets are
shown below, as (a) F1-scores, (b) precisions and (c) recalls of FS+DNN and DNN. FS+DNN
is the DNN model trained over the features selected by the SVM-RFE based incremental feature
selection algorithm, and DNN is the DNN model trained over the initial list of 500 features.
Data labels on top of each column are the numbers of features selected by SVM-RFE
using DNN. DNN could be MobilenetV2, ShufflenetV2 or Squeezenet.
[Figure: F1-score improvements for the recent deep neural networks MobilenetV2, ShufflenetV2 and Squeezenet on the five GSE datasets.]

(a)

[Figure: Precision improvements for MobilenetV2, ShufflenetV2 and Squeezenet on the five GSE datasets.]

(b)

[Figure: Recall improvements for MobilenetV2, ShufflenetV2 and Squeezenet on the five GSE datasets.]

(c)
Supplementary Figure S12

Prediction performance improvements of CNN/DBN/RNN on the 17 transcriptomic datasets
by feature selection. This figure demonstrates the improvements in (a) F1-score, (b) precision
and (c) recall of FS+DNN over DNN. FS+DNN is the DNN model trained over the features
selected by the SVM-RFE based incremental feature selection algorithm, and DNN is the DNN
model trained over the initial list of 500 features. Data labels on top of each column are the
numbers of features selected by SVM-RFE using DNN. DNN could be CNN, DBN or RNN.

[Figure: F1-score improvements of CNN, DBN and RNN on the 17 transcriptomic datasets ALL1, ALL2, ALL3, ALL4, Adeno, CNS, Colon, DLBCL, Gas1, Gas2, Gas3, Leuk, Lym, Mye, Pros, Stroke and T1D.]

(a)

[Figure: Precision improvements of CNN, DBN and RNN on the 17 transcriptomic datasets.]

(b)

[Figure: Recall improvements of CNN, DBN and RNN on the 17 transcriptomic datasets.]

(c)
Supplementary Figure S13

The CNN/DBN/RNN models improved by the SVM-RFE based IFS strategy using the top 500
Trank features. This figure demonstrates the improvements in (a) F1-score, (b) precision and
(c) recall of FS over each DNN. The DNN model was trained over the top 500 Trank features.
Data labels on top of each column are the numbers of features selected by SVM-RFE using
DNN. DNN could be CNN, DBN or RNN.

[Figure: F1-score improvements of CNN, DBN and RNN on the five GSE datasets.]

(a)

[Figure: Precision improvements of CNN, DBN and RNN on the five GSE datasets.]

(b)

[Figure: Recall improvements of CNN, DBN and RNN on the five GSE datasets.]

(c)

Supplementary Figure S14

Performance comparison between Variance and Trank over the five datasets. SVM-RFE was
used to select features from the top features ranked by Variance (Var+FS) and by Trank
(Trank+FS). The prediction performance measurements, averaged over the three algorithms
CNN/DBN/RNN, were compared for different numbers of top-ranked features, i.e., 500, 1000,
1500 and 2000. The evaluated prediction performance measurements were (a) F1-score,
(b) Precision and (c) Recall.

[Figure: averaged F1-score of Var+FS and Trank+FS on the five GSE datasets, for top=500, 1000, 1500 and 2000 features.]

(a)

[Figure: averaged precision of CNN, DBN and RNN for Var+FS and Trank+FS on the five GSE datasets, for top=500, 1000, 1500 and 2000 features.]

(b)

[Figure: averaged recall of CNN, DBN and RNN for Var+FS and Trank+FS on the five GSE datasets, for top=500, 1000, 1500 and 2000 features.]

(c)
