Brain Tumor Classification Via Genetic Algorithms 3 - Use
Article history:
Received 2 February 2018
Received in revised form 16 June 2018
Accepted 10 October 2018
Available online 18 October 2018

Keywords:
Brain tumor
Magnetic resonance imaging
Medical image classification
Convolutional neural networks
Genetic algorithms
Bagging ensemble algorithm

Gliomas are the most common type of primary brain tumors in adults and their early detection is of great importance. In this paper, a method based on convolutional neural networks (CNNs) and genetic algorithm (GA) is proposed in order to noninvasively classify different grades of Glioma using magnetic resonance imaging (MRI). In the proposed method, the architecture (structure) of the CNN is evolved using GA, unlike existing methods of selecting a deep neural network architecture which are usually based on trial and error or by adopting predefined common structures. Furthermore, to decrease the variance of prediction error, bagging as an ensemble algorithm is utilized on the best model evolved by the GA. To briefly mention the results, in one case study, 90.9 percent accuracy for classifying three Glioma grades was obtained. In another case study, Glioma, Meningioma, and Pituitary tumor types were classified with 94.2 percent accuracy. The results reveal the effectiveness of the proposed method in classifying brain tumor via MRI images. Due to the flexible nature of the method, it can be readily used in practice for assisting the doctor to diagnose brain tumors in an early stage.
© 2018 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish
Academy of Sciences. Published by Elsevier B.V. All rights reserved.
* Corresponding author at: School of Mechanical Engineering, College of Engineering, University of Tehran, P.O.B. 11155-4563, Tehran, Iran.
E-mail addresses: amin.kabir@ut.ac.ir (A. Kabir Anaraki), m.ayati@ut.ac.ir (M. Ayati), foadkazemi@gmail.com (F. Kazemi).
https://doi.org/10.1016/j.bbe.2018.10.004
0208-5216/© 2018 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier
B.V. All rights reserved.
64 biocybernetics and biomedical engineering 39 (2019) 63–74
presence of brain tumors in the center of the nervous system of the human body, even benign tumors may incapacitate the brain and cause irrecoverable effects.

Gliomas are considered the most common type of primary brain tumor in adults [1]. According to the World Health Organization grading system [2], gliomas are diagnosed in grades of severity from I to IV. Grade I tumors have cells that are benign and approximately normal in appearance. Grade II tumors have cells that appear slightly abnormal. Grade III tumors have cells that are malignant and clearly abnormal. The most severe brain tumors, which contain fast-spreading and abnormal cells, are considered Grade IV; glioblastoma multiforme (GBM) is the quintessential tumor of this type. Meningioma tumors arise from a layer of tissue called the meninges, which covers the brain and spinal cord and acts as a protector. They are mostly considered benign because they grow at a slow pace and are less likely to spread. Pituitary tumors develop in the pituitary gland and account for 14% of all primary intracerebral tumors; most of them are due to spontaneous mutation and some are due to inherited genetic defects [3]. These tumors are also benign and much less likely to spread. Although these tumors are considered benign, they can cause serious health problems due to their presence in sensitive areas of the brain [4].

Early detection plays a major role in treatment and recovery of the patient [5]. Diagnosing a brain tumor and its grade is usually a complicated and time-consuming process. Usually, the patient is referred for MRI when the brain tumor has grown sufficiently and several troubling symptoms have appeared. After the brain images are examined, if a tumor is suspected, a brain biopsy is performed. Unlike MR imaging, biopsy is an invasive procedure, and in some cases it may take up to a month to obtain a definite answer. MRI specialists use techniques such as perfusion to grade the tumor and biopsy to confirm. It should be noted that in recent years some novel methods have been introduced to grade brain tumors without biopsy. In particular, distinguishing high-grade and low-grade glioma using perfusion MR imaging has been able to resolve some biopsy drawbacks. For these reasons, a computer-aided detection system is helpful. An efficient automatic system for brain tumor classification assists doctors in interpreting medical images and supports specialists' decisions in the early stages of tumor growth. In this study, brain tumor grading takes much less time and, as is confirmed in this paper, achieves high accuracy. Furthermore, the whole process of classification is non-invasive.

Considerable attention has been paid to medical image analysis for diagnostic purposes. Recently, the emergence of modern machine learning algorithms and their proven efficiency in solving various problems in the field of artificial intelligence have also increased interest in health-related topics and algorithms [6]. Many studies on classifying various tumors using MRI, especially MR brain images, with artificial neural networks and evolutionary algorithms have been carried out [7–9], and various methods have been implemented as well. Previous studies indicate that normal and abnormal classes in brain MR images are easily distinguished using shallow machine learning algorithms such as support vector machine, neural network, hybrid intelligent techniques and probabilistic neural network [10–14].

In [15], classification schemes and their performance in classifying several types of brain tumors and grades of Gliomas are investigated. In their proposed method, the region of interest is first defined and then features such as the shape of the tumor are extracted from the MR images. To select the appropriate features, support vector machines (SVM) with recursive feature elimination were used. Their results show that the proposed method obtained high accuracies in binary classification, but the accuracy of the multi-class classification, according to the confusion matrix provided in that paper, is low. In a recent article [16], the classification performance of fully connected and convolutional neural networks (CNNs) on three tumor types is compared. As described in that article, various convolutional network structures were tested and, eventually, a relatively shallow network with two convolutional layers, two max-pooling layers, and two fully connected layers was used for classification. It is also mentioned that the use of Vanilla preprocessing was effective for classification accuracy. In another recent study [17], a CNN is used to classify healthy and unhealthy brain images as well as high-grade and low-grade Glioma tumors. A modified version of the well-known AlexNet was used as the network architecture. Despite the valuable work being done in this area, developing a robust and practical method to classify brain MR images still requires more effort.

Convolutional neural networks have had many remarkable successes in solving complex machine learning problems and are currently considered the most successful method for image processing [18]. Instead of matrix multiplication, convolution operators are used in most layers of these networks. This contributes to the superiority of convolutional networks in solving problems with high computational costs. This is very important since the datasets in MRI-based diagnosis include thousands of images with different qualities and types. Another advantage of this method compared to shallow machine learning methods is automatic feature extraction. In conventional methods, one method was usually proposed for extracting features and, to further reduce the dimensions, another was used to select the dominant features. Recently, CNNs have also been widely used in medical image processing tasks such as grade classification [19], segmentation [20,21] and skull stripping of brain tumor images [22].

In this article, a method based on CNN is proposed to classify three grades of Gliomas with MR images. Selecting an appropriate deep neural network architecture for a specific purpose is a challenging procedure that is usually done by trial and error or by employing a common architecture. Unlike conventional schemes, in the proposed method the architecture of the convolutional neural network is evolved using GA. Networks with different numbers of layers and parameters are investigated by GA, and the network with the best performance on the dataset is selected for further processing. Afterwards, a model averaging method called bagging is applied to the best model evolved by the GA. Bagging is an ensemble method and is employed to decrease the variance of the final diagnosis. The proposed method is used in two case studies. In the first case, three Glioma grades are classified with 90.9% accuracy. In the second case, to demonstrate the strength of the proposed method, three different types of tumors from another MRI database were used as the input to the CNN, and the final diagnostic accuracy was 94.2%. The results confirm that the proposed method is applicable to different brain MRI datasets in order to assist the specialist in early detection.

The remainder of the paper is organized as follows. In Section 2, a brief explanation of the datasets used as input to the networks is given. CNNs are discussed in detail in Section 3. In Section 4, the proposed method for selecting an appropriate architecture based on GA is presented. Experimental results are presented and discussed in Sections 5 and 6, respectively. Finally, Section 7 is dedicated to conclusions.

In the case of normal MRIs, on average six middle MR slices are selected at identical intervals for each subject. These images are used to distinguish between healthy and tumorous brains. Images of brain tumors that have been identified with infusion of contrast material are also used. Due to differences in the size and location of tumors, the number of slices varies from case to case.

After the classifier network's parameters are trained, healthy and tumor slices can be classified. In other words, by examining all brain MR slices, the classifier distinguishes between normal and abnormal slices. This not only helps to identify the existence of a tumor but also determines its approximate location in the brain. Also, the number of slices that contain a trace of tumor gives an indication of the approximate size and grade of the tumor.
Fig. 1 – An example of the six selected sections from the axial MR of a normal person.
Fig. 2 – Examples of three different grades of Gliomas axial brain images: (a) Glioma Grade IV; (b) Glioma Grade III; (c) Glioma
Grade II.
Fig. 3 – Examples of axial brain images from the public dataset provided by Cheng et al.: (a) Glioma; (b) Meningioma;
(c) Pituitary tumors.
In the proposed method, some manipulated images were added to the training set by applying random changes to the original data. Rotation of 10°, 20° or 30° clockwise or counterclockwise, translation of 15 pixels to the right or left, scaling to 0.75 of the original size, mirroring of tumor MR images, and combinations of these changes were performed, and the resulting images were added to the original datasets. The training and test datasets are randomly selected from this new dataset.

After selecting images and increasing the number of data, 8000 normal MR images and 8000 glioma MR images were provided for training and testing. More specifically, there are 4000 GBM (grade IV) images, 2000 grade III and 2000 grade II tumor images in the Glioma class. Note that 500 images are randomly excluded from the dataset of each class and are used for test purposes.

The Cheng et al. dataset consists of 989 axial images, and the same data augmentation process is applied to it. 1521 images are used for training and 115 images from each class are employed for testing.

3. Convolutional neural networks

Structure, layers, and parameters of convolutional neural networks are described in this section. Deep learning algorithms are a subset of machine learning algorithms in the world of artificial intelligence. Using simple concepts, deep learning enables the computer to create, characterize, and recognize complex concepts. In other words, it enables multilayer models to learn representations of data with multiple levels of abstraction [18].

Convolutional neural networks (CNNs) are among the most efficient supervised deep learning methods and have made remarkable improvements in the image processing field. Generally, convolutional, pooling, and fully connected layers are the three main layer types of a convolutional network. In convolutional layers, the network uses different kernels to convolve the input image to create various feature maps. Applying this layer significantly reduces the number of parameters of the network (weight sharing), and the network learns the correlation between neighboring pixels (local connectivity) [30].

There are two stages of training in every convolutional neural network. In the feedforward step, input images are fed to the network. In other words, the dot product of the input vector and the parameter vector of each neuron is computed and the convolution operator in each layer is applied. Afterwards, the output is computed. Using a loss function, the network output is compared with the desired output (correct answers) and the error is computed. Then, based on the error, the backpropagation stage begins. In this step the gradient of each parameter is calculated using the chain rule, and finally all the parameters are updated. This is repeated for an adequate number of iterations. More detailed explanations are given below.

It should be noted that in order to keep the size of the output unchanged, the input volume is padded with zeros.
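As a minimal sketch of the zero-padding idea mentioned above, the following pure-Python function slides a kernel over an image whose border has been padded with zeros, so that the output keeps the input's height and width. The function name `conv2d_same`, the stride of 1, and the restriction to odd-sized square kernels are assumptions made for illustration and are not taken from the paper.

```python
def conv2d_same(image, kernel):
    """Apply `kernel` to `image` (lists of lists) with zero padding.

    Assumes an odd-sized square kernel and stride 1, so the output has
    the same height and width as the input. As in most CNN libraries,
    the kernel is not flipped (cross-correlation).
    """
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    # Surround the image with `pad` rows and columns of zeros.
    padded = [[0.0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for r in range(h):
        for c in range(w):
            padded[r + pad][c + pad] = image[r][c]
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[r][c] = sum(
                kernel[i][j] * padded[r + i][c + j]
                for i in range(k)
                for j in range(k)
            )
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # sums each 3x3 neighbourhood
result = conv2d_same(image, box)  # still 3x3 thanks to the zero border
```

Without the padding step, a 3x3 kernel on a 3x3 image would produce a single output value; with it, border pixels simply see zeros where the image ends.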
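Fig. 4 – An example of applying a convolution layer with zero-padding method.

\[ \mathrm{elu}(x) = \begin{cases} x & \text{if } x \ge 0 \\ \alpha\,(e^{x} - 1) & \text{if } x < 0 \end{cases} \qquad (3) \]

\[ \mathrm{selu}(x) = \lambda \begin{cases} x & \text{if } x \ge 0 \\ \alpha e^{x} - \alpha & \text{if } x < 0 \end{cases} \qquad (4) \]

3.3. Pooling

Fig. 5 – An example of a max-pooling layer.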
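The ELU and SELU activations of Eqs. (3) and (4) can be sketched in a few lines of Python. The numerical values of the SELU constants are not given in the text; the ones below are those reported in the self-normalizing networks paper [33] and should be read as an assumption of this sketch, as should the default `alpha=1.0` for ELU.

```python
import math

# SELU constants as reported in [33]; not stated in the text above,
# so they are an assumption of this sketch.
SELU_ALPHA = 1.6732632423543772
SELU_LAMBDA = 1.0507009873554805


def elu(x, alpha=1.0):
    """Eq. (3): identity for x >= 0, alpha * (e^x - 1) otherwise."""
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)


def selu(x, alpha=SELU_ALPHA, lam=SELU_LAMBDA):
    """Eq. (4): lambda times (x for x >= 0, alpha * e^x - alpha otherwise)."""
    return lam * (x if x >= 0 else alpha * math.exp(x) - alpha)
```

Both functions are continuous at zero and saturate to a negative constant for large negative inputs, which is the property that distinguishes them from ReLU.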
In order to train a deep network, the loss function must be minimized by a gradient-based optimization algorithm. Stochastic gradient descent (SGD) is widely used as an optimizer in deep learning [18]. Recently, a method for stochastic optimization called adaptive moment estimation (Adam) was presented in [34]. It is demonstrated there that Adam works better than customary optimization algorithms. Furthermore, its computational efficiency on large datasets is an advantage of this method. The learning rate for updating the weights remains constant in the SGD algorithm, whereas the Adam algorithm computes adaptive learning rates by estimating the first moment (the mean) and the second moment (the uncentered variance) of the gradients. Other optimizers such as Adagrad, Adadelta, Adamax and Nadam were also examined in this study. In the Adagrad optimizer, the learning rate adapts to the parameters: larger updates are made for infrequent parameters than for frequent ones [35]. Unlike Adagrad, which accumulates all past squared gradients, Adadelta limits the size of the window of accumulated previous gradients [36]. Adamax is a variant of the Adam optimizer based on the infinity norm. Nadam combines the Adam and Nesterov accelerated gradient optimizers [37]. In Nadam, the parameters are updated with the momentum step before the gradient is computed, which makes it possible to take more precise steps in the gradient direction.

The values of all these optimizers' parameters, other than the learning rate, are selected based on the authors' recommendations.

4. Designing the network architecture

Typically, a desirable network architecture is found by testing various common network structures. This process requires a lot of trial and error and, of course, high computational cost. In this study, various CNN architectures for the task of MRI image classification are evolved using a genetic algorithm (GA) [38]. Instead of training and comparing more than one million different architectures, a suitable architecture was discovered by employing GA and comparing fewer than 500 architectures. Thus, the computational costs are decreased.

More details on selecting the network architecture are given below.

The mechanism of natural selection is simulated in GAs. After each generation the fittest individuals of the population survive and produce more offspring in the next generation. Sometimes a mutation occurs and an offspring with a new characteristic is created. After several generations, superior individuals are more likely to survive [38].

In this study, GA is implemented to evolve the best structure of the CNN by choosing proper parameters for the network. These parameters are the number of convolutional and max-pooling layers, the number and size of filters, the number of fully connected layers, the activation function, the dropout probability, the optimization method and the learning rate. The values associated with these parameters are specified in Table 1.

Considering the parameters of Table 1, over one million different architectures are possible for the CNN. Directly searching all of these architectures to find the best one is not feasible; GA eases this search. The flowchart of the genetic algorithm used in this research is illustrated in Fig. 6. Initially, 50 networks with random parameters are created as the initial population. Each network is trained on 80 percent of the data, and the remaining 20 percent is used as the validation dataset. Validation accuracy is the criterion for retaining or rejecting a network in the next generation. In the proposed method, GA is stopped if the early stopping criterion is satisfied or the number of generations exceeds the pre-specified maximum number of generations. The early stopping criterion is met when no improvement occurs in the validation accuracy (loss function) over three consecutive epochs. The early stopping criterion is implemented in order to reduce computational costs. In this paper, the maximum number of generations is 15.

According to the validation accuracy, the best 40 percent of networks (the elites) are retained and move to the next generation. Rejected networks also have a 10 percent chance of being transferred to the next generation. In other words, the top 20 networks enter the next generation directly, and each of the other 30 rejected networks may be retained and enter the next generation with probability 0.1. The other members of the next generation are created by applying selection and crossover operators. In addition, there is a 20 percent chance that a network structure is randomly mutated. This process is repeated until it is stopped according to the flowchart, and finally the network architecture that has the best performance is selected as the main network architecture for the classification.
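The generation loop described above can be sketched roughly as follows. This is not the authors' implementation: the genome key names, the uniform crossover, the single-gene mutation applied to children, and above all `toy_fitness` (a cheap stand-in for training a CNN and measuring validation accuracy) are assumptions for illustration. Early stopping is omitted. The population size, elite fraction, survival probability for rejected networks, mutation probability and number of generations follow the values quoted in the text.

```python
import random

# Search space following Table 1; the key names are hypothetical.
SEARCH_SPACE = {
    "conv_blocks": [2, 3, 4, 5, 6],
    "dense_blocks": [1, 2, 3],
    "filters": [16, 24, 32, 48, 64, 96, 128],
    "kernel_size": [2, 3, 4, 5, 6, 7],
    "dense_units": [128, 192, 256, 384, 512],
    "activation": ["relu", "leaky_relu", "elu", "selu"],
    "optimizer": ["sgd", "adam", "adamax", "nadam", "adagrad", "adadelta"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "dropout": [0.1, 0.2, 0.3, 0.4, 0.5],
}


def random_genome(rng):
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover (assumption): each gene comes from either parent.
    return {key: rng.choice([a[key], b[key]]) for key in SEARCH_SPACE}


def mutate(genome, rng):
    # Replace one randomly chosen gene with a random allowed value.
    key = rng.choice(list(SEARCH_SPACE))
    child = dict(genome)
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def evolve(fitness, pop_size=50, elite_frac=0.4, lucky_prob=0.1,
           mutation_prob=0.2, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        n_elite = int(elite_frac * pop_size)
        survivors = ranked[:n_elite]  # elites always survive
        # Each rejected network survives with a small probability.
        survivors += [g for g in ranked[n_elite:] if rng.random() < lucky_prob]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            child = crossover(mother, father, rng)
            if rng.random() < mutation_prob:
                child = mutate(child, rng)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


def toy_fitness(genome):
    # Hypothetical stand-in for validation accuracy; a real run would
    # build and train the CNN described by `genome`.
    return -abs(genome["conv_blocks"] - 4) - abs(genome["dropout"] - 0.3)


best = evolve(toy_fitness)
```

Because the elites are copied unchanged into the next generation, the best fitness in the population can never decrease from one generation to the next in this sketch.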
Table 1 – The parameters employed to evolve the best CNN structure and their associated values.
Parameter                                        Values
Number of convolutional + max pooling layers     2, 3, 4, 5, 6
Number of fully connected + dropout layers       1, 2, 3
Number of filters                                16, 24, 32, 48, 64, 96, 128
Kernel sizes                                     2, 3, 4, 5, 6, 7
Number of fully connected neurons                128, 192, 256, 384, 512
Activation functions                             ReLU, Leaky ReLU, ELU, SELU
Feedforward optimizers                           SGD, ADAM, ADAMAX, NADAM, ADAGRAD, ADADELTA
Learning rate                                    1e-4, 1e-3, 1e-2
Dropout rate                                     0.1, 0.2, 0.3, 0.4, 0.5
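One way to arrive at a figure of "over one million" architectures is to multiply the number of options in each row of Table 1. This counting assumes that every parameter takes a single value shared by all layers of a network; the text does not spell out how the figure was obtained, so the calculation below is only a plausible reading.

```python
from math import prod

# Number of allowed values per row of Table 1.
option_counts = {
    "conv + max pooling layers": 5,
    "fully connected + dropout layers": 3,
    "filters": 7,
    "kernel sizes": 6,
    "fully connected neurons": 5,
    "activation functions": 4,
    "optimizers": 6,
    "learning rates": 3,
    "dropout rates": 5,
}

total_architectures = prod(option_counts.values())
```

Under this assumption the product is 1,134,000, whereas 15 generations of 50 networks evaluate at most 750 candidates, and fewer when elites are carried over unchanged.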
4.2. Bagging

Bagging (or bootstrap aggregating) is an ensemble method that reduces the generalization error by combining multiple models or classifiers [39]. Bagging is a special case of a general machine learning approach called model averaging. Model averaging works because different models do not usually make the same errors on the test set. The idea is to train separate classifiers and then evaluate the output of each classifier on the test samples. A compound classifier is then created as the aggregation of the individual classifiers.

In order to decrease the variance of classification error, bagging is used on the best model evolved by the GA. First, all training and validation images are concatenated to form a combined set. This combined set is then permuted, and a new training set and validation set are randomly selected. 75 percent of the combined set is used as the training set and the remaining images are placed in the validation set. Using the new training set, all the variables are optimized. Here, the optimization is performed with 10,000 iterations and this process is repeated 5 times. It was observed that implementing this method improves the results.

5. Results

In this section, evaluation criteria are described and the results of the proposed method for the two case studies are presented (refer to Section 2.1). As previously mentioned, convolutional networks were created with random architectures, and over the 15 GA generations better networks were evolved. Classification accuracy on the validation dataset is the genetic algorithm's criterion for improving the networks' architecture. Accord-
Fig. 8 – Best CNN architectures provided by GA: (a) Case Study I; (b) Case Study II.
Fig. 9 – Loss and accuracy variations during 100 epochs for: (a, b) Case Study I; (c, d) Case Study II. Solid lines and dash dot
lines are related to the training dataset and validation dataset, respectively.
evaluate more precisely, various criteria have been investigated. In Case Study I, Normal images were classified exceptionally well. Glioblastoma multiforme tumors, which are the most common and most malignant brain tumors, were classified with an excellent sensitivity of 97.4% and a total accuracy of 96.1%. In addition, about 95% of grade II and grade III tumors were correctly classified. Multi-class classification accuracy is calculated by dividing the number of correct predictions by the number of all test data. Thus, for the 4 classes, the classification accuracy is 93.1%, and if only the classification of Glioma grades is considered, 90.9% accuracy is obtained.
Fig. 10 – Confusion matrices for (a) Case Study I and (b) Case Study II.
Class          TP     FP     TN     FN
Case Study I
Normal         499    6      1494   1
Grade II       442    32     1468   58
Grade III      434    34     1466   66
Grade IV       487    66     1434   13
Case Study II
Glioma         113    10     220    2
Meningioma     101    5      225    14
Pituitary      111    5      225    4

Class          Sensitivity  Specificity  Precision  NPV     FPR     FNR     FDR     Accuracy
Case Study II
Glioma         0.983        0.957        0.919      0.991   0.043   0.017   0.081   0.965
Meningioma     0.878        0.978        0.953      0.941   0.022   0.122   0.047   0.945
Pituitary      0.965        0.978        0.957      0.983   0.022   0.035   0.043   0.974
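The per-class rates for Case Study II can be recomputed from the TP, FP, TN and FN counts listed above using the standard one-versus-rest definitions. The sketch below is a check under that assumption; the function and variable names are illustrative and do not come from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard one-versus-rest rates from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "fdr": fp / (fp + tp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# (TP, FP, TN, FN) per class, copied from the counts for Case Study II.
case_study_2 = {
    "Glioma": (113, 10, 220, 2),
    "Meningioma": (101, 5, 225, 14),
    "Pituitary": (111, 5, 225, 4),
}

metrics = {name: binary_metrics(*counts) for name, counts in case_study_2.items()}

# Multi-class accuracy: correct predictions (sum of TP) over all test images.
total_images = sum(tp + fn for tp, _, _, fn in case_study_2.values())
overall_accuracy = sum(tp for tp, *_ in case_study_2.values()) / total_images
```

Rounded to three decimals, these values agree with the tabulated rates for each class, and the overall accuracy matches the 94.2% quoted for this case study.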
According to the results obtained for Case Study II, the overall accuracies for classifying Glioma, Meningioma and Pituitary tumors were 96.5, 94.5 and 97.4 percent, respectively. In this case, 94.2% of the network predictions were correct. A review of the classification results confirms the ability of the proposed method to classify different MR brain images.

The ability of CNNs, and on a larger scale deep learning algorithms, to process and classify images has been demonstrated in many studies. As is clear from the results of this study, it has also performed very well here. An important aspect of providing an efficient neural network for a specific problem is determining the proper architecture. In [40] it is demonstrated that using convolutional neural networks with complex structures does not guarantee a better result compared to simpler structures. In this article, a fully automated procedure for selecting the network structure and its parameters has been used. As previously stated, the genetic algorithm performed acceptably in finding a suitable network for our particular application, and the use of the bagging method was also beneficial. Table 3 compares the performance of the proposed method with similar tasks. As is seen, the proposed method is superior to existing methods for classifying similar MR brain tumor classes.

The binary classification scheme for tumor grades proposed by Zacharaki et al. [15] achieved 62.5% accuracy for classifying three glioma grades. As specified in Table 3, the present method is about 28 percentage points more accurate (90.9% versus 62.5%). Paul et al. [16] have proposed a CNN classifier using a dataset provided by Cheng et al. [28]. Using the same dataset, as shown in Table 3, the proposed method also achieved better accuracy.

Table 3 – Comparison of the proposed method with related works.
Approach                                                  Classification accuracy (a)
Case Study I – Glioma Grade II/Grade III/Grade IV
SVM + RFE (Zacharaki et al. [15])                         62.5%
Proposed method                                           90.9%
Case Study II – Glioma/Meningioma/Pituitary
Vanilla preprocessing + shallow CNN (Paul et al. [16])    91.43%
Proposed method                                           94.2%
(a) Classification accuracy = (correct predictions/number of all data), using test dataset.

7. Conclusions

In summary, in this study a CNN-based method for classifying Glioma brain tumor MR images is proposed. A genetic algorithm was utilized to search for a CNN structure that produces better results. The proposed method not only grades Glioma tumors with high precision, but has also been very successful in classifying images of various types of brain tumors.

In the proposed algorithm, classification of various grades of Glioma and two other widespread tumor types is carried out with high precision. There is no requirement to perform time-consuming processes such as skull stripping or segmentation, and the decision is made from the raw data of MR images alone. The proposed method can be used as a secondary option for early detection in a non-invasive procedure. In addition, the time required for the classification is very short in comparison to the time required for analyzing a biopsy. Consequently, the proper action can be taken at the proper time with respect to the severity of the tumor.

Most of the proposed schemes in this area consist of region-of-interest definition, manual feature extraction, feature selection and finally classification. In contrast, the proposed deep learning method extracts useful features automatically, and thus providing a feature extraction method is not required. Using a network architecture that performs well on some data may lead to poor results for another set of data. Therefore, it is necessary to compare different network architectures to reach the desired objective. Because it is impossible to evaluate all possible cases, in this research GA is used to specify an appropriate network architecture with far less computation. Applying the bagging algorithm to the best network suggested by GA was also effective and improved the accuracy of the classification according to Table 2. This table reflects the success of the proposed method in classifying brain tumor types via MR images.

Although it has been shown in this paper that the proposed method yields better performance compared to similar literature, larger datasets with several tumor types and other CNN structures and deep learning algorithms in future works may lead to better performance.

References

[1] Wen PY, Kesari S. Malignant gliomas in adults. N Engl J Med 2008;359(5):492–507.
[2] Louis DN, Perry A, Reifenberger G, Von Deimling A, Figarella-Branger D, Cavenee WK, et al. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol 2016;131(6):803–20.
[3] Laws ER, Ezzat S, Asa SL, Rio LM, Michel L, Knutzen R. Pituitary disorders: diagnosis and management. John Wiley & Sons; 2013.
[4] Black PM. Brain tumors. N Engl J Med 1991;324(22):1555–64.
[5] Kelly PJ. Gliomas: survival, origin and early detection. Surg Neurol Int 2010;1.
[6] Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017.
[7] Lo C-S, Wang C-M. Support vector machine for breast MR image classification. Comput Math Appl 2012;64(5):1153–62.
[8] Trigui R, Mitéran J, Walker PM, Sellami L, Hamida AB. Automatic classification and localization of prostate cancer using multi-parametric MRI/MRS. Biomed Signal Process Control 2017;31:189–98.
[9] Rasti R, Teshnehlab M, Phung SL. Breast cancer diagnosis in DCE-MRI using mixture ensemble of convolutional neural networks. Pattern Recogn 2017;72:381–90.
[10] Chaplot S, Patnaik L, Jagannathan N. Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network. Biomed Signal Process Control 2006;1(1):86–92.
[11] El-Dahshan E-SA, Hosny T, Salem A-BM. Hybrid intelligent techniques for MRI brain images classification. Digit Signal Process 2010;20(2):433–41.
[12] Zhang Y, Dong Z, Wu L, Wang S. A hybrid method for MRI brain image classification. Expert Syst Appl 2011;38(8):10049–53.
[13] Saritha M, Joseph KP, Mathew AT. Classification of MRI brain images using combined wavelet entropy based spider web plots and probabilistic neural network. Pattern Recogn Lett 2013;34(16):2151–6.
[14] Kalbkhani H, Shayesteh MG, Zali-Vargahan B. Robust algorithm for brain magnetic resonance image (MRI) classification based on GARCH variances series. Biomed Signal Process Control 2013;8(6):909–19.
[15] Zacharaki EI, Wang S, Chawla S, Soo Yoo D, Wolf R, Melhem ER, et al. Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn Reson Med 2009;62(6):1609–18.
[16] Paul JS, Plassard AJ, Landman BA, Fabbri D. Deep learning for brain tumor classification. Proc of SPIE. 2016. pp. 1013710–1.
[17] Khawaldeh S, Pervaiz U, Rafiq A, Alkhawaldeh RS. Noninvasive grading of glioma tumor using magnetic resonance imaging with convolutional neural networks. Appl Sci 2017;8(1):27.
[18] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
[19] Mohan G, Subashini MM. MRI based medical image analysis: survey on brain tumor grade classification. Biomed Signal Process Control 2018;39:139–61.
[20] Pereira S, Pinto A, Alves V, Silva CA. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans Med Imaging 2016;35(5):1240–51.
[21] Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, et al. Brain tumor segmentation with deep neural networks. Med Image Anal 2017;35:18–31.
[22] Kleesiek J, Urban G, Hubert A, Schwarz D, Maier-Hein K, Bendszus M, et al. Deep MRI brain extraction: a 3D convolutional neural network for skull stripping. NeuroImage 2016;129:460–9.
[23] IXI Dataset. Available from: http://brain-development.org/ixi-dataset/.
[24] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. J Digit Imaging 2013;26(6):1045–57.
[25] Scarpace L, Flanders AE, Jain R, Mikkelsen T, Andrews DW. Data from REMBRANDT. Cancer Imaging Archive 2015.
[26] Scarpace L, Mikkelsen T, Cha S, Rao S, Tekchandani S, Gutman D, et al. Radiology data from the cancer genome atlas glioblastoma multiforme [TCGA-GBM] collection. Cancer Imaging Archive 2016.
[27] Pedano N, Flanders AE, Scarpace L, Mikkelsen T, Eschbacher JM, Hermes B, et al. Radiology data from the cancer genome atlas low grade glioma [TCGA-LGG] collection. Cancer Imaging Archive 2016.
[28] Cheng J, Huang W, Cao S, Yang R, Yang W, Yun Z, et al. Enhanced performance of brain tumor classification via tumor region augmentation and partition. PLoS ONE 2015;10(10). e0140381.
[29] Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016.
[30] LeCun Y, Bengio Y. Convolutional networks for images, speech, and time series. Handb Brain Theory Neural Netw 1995;3361(10):1995.
[31] He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: surpassing human-level performance on imagenet classification. Proceedings of the IEEE International Conference on Computer Vision; 2015. p. 1026–34.
[32] Clevert D-A, Unterthiner T, Hochreiter S. Fast and accurate deep network learning by exponential linear units (elus); 2015, arXiv preprint arXiv:1511.07289.
[33] Klambauer G, Unterthiner T, Mayr A, Hochreiter S. Self-normalizing neural networks; 2017, arXiv preprint arXiv:1706.02515.
[34] Kingma D, Ba J. Adam: a method for stochastic optimization; 2014, arXiv preprint arXiv:1412.6980.
[35] Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 2011;12(July):2121–59.
[36] Zeiler MD. ADADELTA: an adaptive learning rate method; 2012, arXiv preprint arXiv:1212.5701.
[37] Sutskever I, Martens J, Dahl G, Hinton G. On the importance of initialization and momentum in deep learning. International Conference on Machine Learning. 2013. pp. 1139–47.
[38] Deepa SN. Introduction to genetic algorithms. Berlin Heidelberg: Springer-Verlag; 2008.
[39] Dietterich TG. Ensemble methods in machine learning. Multiple Classifier Syst 2000;1857:1–15.
[40] Pan Y, Huang W, Lin Z, Zhu W, Zhou J, Wong J, et al. Brain tumor grading based on neural networks and convolutional neural networks. Engineering in Medicine and Biology Society (EMBC), 37th Annual International Conference of the IEEE. 2015. pp. 699–702.