
Population-Based Metaheuristic Approaches for Feature Selection on Mammograms

Jinn-Yi Yeh1 and Siwa Chan2
1 Department of Management Information Systems, National Chiayi University, Taiwan, jyeh@mail.ncyu.edu.tw
2 Department of Medical Imaging, Taichung Tzu Chi Hospital, Taiwan, chan.siwa@gmail.com

Introduction

• Breast cancer is the most common cause of cancer death in women. Mammography is an effective imaging tool for detecting breast cancer at an early stage.
• Mammogram clues are subtle and vary in appearance, which makes diagnosis challenging. Numerous computer-aided diagnosis (CAD) systems have been developed for analyzing digitized mammograms.
• Studies have indicated that the feature space of regions of interest (ROIs) is very large and complex. Using too many features may degrade the performance of the classification methods and increase the complexity of the classifier. Many approximate algorithms have been developed to address this optimization problem.
• This work proposed an automated breast-mass classification system for assisting the radiologist in analyzing mammograms. All possible features reported in the literature were computed. Population-based metaheuristic approaches were used to select an optimal subset of features for mass classification. A support vector machine (SVM) was used to classify benign, malignant, and normal cases.

Research method

The study was approved by the Institutional Review Board of Taichung Veterans General Hospital (IRB#: SE15264A).

The block diagram of the proposed system is presented in Figure 1. First, a radiologist extracted the ROI from each mammogram by using prior information. Second, each ROI was used to compute 277 features for classification: 6 intensity features, 7 moment invariant features, 192 gray-level co-occurrence matrix (GLCM) features, 8 local binary pattern (LBP) features, 44 run length statistics (RLS) features, 6 shape features, 6 normalized radial length (NRL) features, 4 normalized chord length (NCL) features, 3 relative gradient orientation (RGO) features, and 1 Fourier descriptor feature. Third, the samples were randomly split into 80% for training and 20% for testing. Finally, four population-based metaheuristic algorithms were compared for feature selection along with the SVM.

Fig. 1: Block diagram of the proposed system. Blocks shown:
• Mammogram datasets: MIAS (69 benign, 54 malignant, 68 normal); DDSM (150 benign, 150 malignant)
• ROI extraction: Radiologist; Threshold method
• Feature extraction: Intensity (6), Moment (7), GLCM (192), LBP (8), RLS (44), Shape (6), NRL (6), NCL (4), RGO (3), Fourier Descriptor (1)
• Feature selection: Genetic Algorithm, Simulated Annealing, Ant Colony Optimization, Particle Swarm Optimization
• Split dataset: Training set 80%, Testing set 20%
• Support Vector Machine
• Performance evaluation

Data sets

• Mammographic Image Analysis Society (MIAS)
– MIAS consists of 161 cases with 322 digitized medio-lateral oblique (MLO) images.
– Of these files, 207 do not reveal tumor lesions and are presented as normal mammograms, and the remaining 115 present abnormal mammograms with space-occupying or calcification lesions.
– The work randomly selected 69 benign, 54 malignant, and 68 normal cases for classification.
• Digital Database for Screening Mammography (DDSM)
– DDSM contains 2620 cases.
– The work randomly selected 150 benign and 150 malignant cases for classification.

ROI Extraction

A radiologist used prior information retrieved from the data sets to obtain the ROIs, which were considered the gold standard. Figures 2, 3, and 4 present some ROI extraction results of mammograms for normal, benign, and malignant cases, respectively.

Fig. 2: Normal mammograms and their binary images: (a) mdb031, (b) mdb050, (c) mdb085, (d) mdb196
Fig. 3: Benign mammograms and their binary images: (a) mdb010, (b) mdb097, (c) mdb104, (d) mdb167
Fig. 4: Malignant mammograms and their binary images: (a) mdb111, (b) mdb117, (c) mdb155, (d) mdb271

Feature Extraction

After ROI extraction, a total of 277 features were calculated for each case. Table 1 lists some features of benign, malignant, and normal types of mammograms.

Table 1: Some features of benign, malignant, and normal types of mammograms

Case    F1       F2      F3     F4      F5     F6     …  F277   Class
mdb031  180.831  17.451  0.005  -0.005  0.017  6.012  …  0.799  normal
mdb050  176.403  8.317   0.001  0.005   0.036  5.031  …  0.819  normal
mdb085  178.167  16.099  0.004  0.020   0.020  5.907  …  0.818  normal
mdb196  138.244  5.766   0.001  0.000   0.048  4.566  …  0.819  normal
…       …        …       …      …       …      …      …  …      …
mdb010  185.648  9.887   0.002  -0.004  0.029  5.241  …  0.801  benign
mdb097  178.177  8.187   0.001  0.002   0.034  4.993  …  0.829  benign
mdb104  200.821  6.999   0.001  -0.005  0.046  4.702  …  0.864  benign
mdb167  147.871  6.548   0.001  0.003   0.046  4.678  …  0.846  benign
…       …        …       …      …       …      …      …  …      …
mdb111  202.115  11.747  0.002  -0.005  0.024  5.507  …  0.939  malignant
mdb117  180.206  12.854  0.003  0.007   0.022  5.652  …  0.894  malignant
mdb155  174.242  10.495  0.002  0.020   0.034  5.219  …  0.900  malignant
mdb271  200.808  11.798  0.002  -0.003  0.024  5.507  …  0.878  malignant
…       …        …       …      …       …      …      …  …      …

Population-based Metaheuristic Methods

• Genetic algorithm (GA)
– GAs are population-based metaheuristic algorithms inspired by Darwin’s theory of evolution.
• Simulated annealing (SA)
– This algorithm is based on the method proposed by Metropolis et al. in 1953 to simulate the annealing process in the physics of solids.
• Ant colony optimization (ACO)
– The ACO algorithm models the behavior of foraging ants and is useful for problems that aim to determine the shortest path.
• Particle swarm optimization (PSO)
– PSO optimizes a problem by iteratively improving a candidate solution in terms of a given measure of quality.

Parameter setting for feature selection

Table 2 lists the parameter settings of the four population-based metaheuristic methods for feature selection on mammograms. The objective function is the area under the receiver operating characteristic (ROC) curve (Az). The cost function is then defined as 1 − Az.

Table 2: Parameter settings of the four population-based metaheuristic methods

GA SA ACO PSO
Parameters Parameters Parameters Parameters
MaxIt=20 Nf=100 Nf=100 Nf=100
nPop=10 MaxIt=20 MaxIt=20 MaxIt=20
Pc=0.7 MaxSubIt=5 nAnt=10 nPop=20
Nc=8 T0=10 =1 w=1
Pm=0.3 =0.99 =1 Wdamp=0.9
Nm=3 =0.05 9
Mu=0.1   c1=2
c2=2
=8
=2.05

Performance of metaheuristic methods

The optimal costs of the GA, SA, ACO, and PSO were 0.084, 0.161, 0.100, and 0.130, respectively. The performance of the four population-based metaheuristic approaches was evaluated by testing 100 different sets, and the results are presented in Figure 5. The means of both the optimal Az and running time are listed in Table 3.

Fig. 5: Cost versus iteration of feature selection process: (a) GA, (b) SA, (c) ACO, (d) PSO

Table 3: Means of both the optimal Az and running time

         MIAS dataset        DDSM
Methods  Mean Az  Mean time  Mean Az  Mean time
GA       85.69%   79.14      99.91%   36.05
SA       81.27%   32.16      99.52%   14.21
ACO      84.30%   65.734     99.82%   30.13
PSO      86.70%   67.764     99.89%   29.92

Experimental results

• Figure 6 presents the Z-scores of the features selected by combining results from all methods. The top 20 features are F8, F126, F49, F229, F172, F156, F261, F23, F116, F7, F122, F178, F256, F4, F12, F28, F71, F5, F259, and F277.
• Figure 7 presents Az versus the number of features selected using different feature selection methods. The means (standard deviations) of Az for our proposed method, ReliefF, and Info-Gain were 0.891 (0.037), 0.857 (0.036), and 0.827 (0.027), respectively.

Fig. 6: Z-scores for the different numbers of features selected using all methods
Fig. 7: Az versus the numbers of features selected using different feature selection methods

Conclusion

• This study proposed an automated breast-mass classification system for assisting radiologists in analyzing digital mammograms.
• A very large set of the features available in the literature was computed. Four population-based metaheuristic approaches, GA, SA, ACO, and PSO, were used for selecting an optimal subset of features. An SVM was used for mass classification.
• Experimental results revealed that GA and PSO outperformed SA and ACO in feature selection. A combination of heuristic approaches outperformed other methods in selecting an optimal feature set.

Main contributions

• Almost all the features in the literature were evaluated.
• Almost all population-based heuristic approaches for feature selection were evaluated.
• Two practical data sets, mini-MIAS and DDSM, were used to develop the proposed approach.
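
The cost function from the parameter-setting section, 1 − Az, can be illustrated with a minimal sketch. The poster does not state how Az was computed from the SVM outputs, so the rank-based estimate, the two-class setting, and the names `roc_auc` and `selection_cost` below are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Estimate the area under the ROC curve (Az) by pairwise comparison.

    labels: 0/1 values (1 = positive class); scores: classifier outputs.
    Az equals the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def selection_cost(labels, scores):
    """Cost of a feature subset as defined on the poster: 1 - Az."""
    return 1.0 - roc_auc(labels, scores)


# Hypothetical scores for three positive and three negative test cases.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(selection_cost(labels, scores), 3))  # 0.111
```

In a wrapper-style selection loop, the scores would come from a classifier trained on the candidate feature subset, so a lower cost means a better-separating subset.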

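
The GA settings in Table 2 (MaxIt=20, nPop=10, Pc=0.7, Nc=8, Pm=0.3, Nm=3, Mu=0.1) can be read as a genetic algorithm over binary feature masks. The sketch below is one such reading, not the authors' implementation. It assumes that Nc is the number of crossover offspring, Nm the number of mutants, and Mu the per-bit mutation rate. The single-point crossover, the elitist truncation, and the hypothetical `toy_cost` function standing in for 1 − Az are also assumptions.

```python
import random

N_FEATURES = 277           # features computed per ROI (from the poster)
MAX_IT, N_POP = 20, 10     # MaxIt and nPop from Table 2
N_C, N_M, MU = 8, 3, 0.1   # Nc, Nm, Mu from Table 2 (interpretation assumed)

# Hypothetical "useful" feature indices, used only by the toy cost below.
INFORMATIVE = set(range(0, N_FEATURES, 14))


def toy_cost(mask):
    """Stand-in for 1 - Az: reward useful features, lightly penalize extras."""
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return 1.0 - hits / len(INFORMATIVE) + 0.001 * extras


def run_ga(cost, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(N_POP)]
    pop.sort(key=cost)
    history = [cost(pop[0])]              # best cost seen so far
    for _ in range(MAX_IT):
        children = []
        for _ in range(N_C // 2):         # single-point crossover, 2 children per pair
            p1, p2 = rng.sample(pop, 2)
            cut = rng.randrange(1, N_FEATURES)
            children.append(p1[:cut] + p2[cut:])
            children.append(p2[:cut] + p1[cut:])
        for _ in range(N_M):              # bit-flip mutation with rate MU
            parent = rng.choice(pop)
            children.append([1 - g if rng.random() < MU else g for g in parent])
        # Elitist truncation: keep the N_POP lowest-cost masks.
        pop = sorted(pop + children, key=cost)[:N_POP]
        history.append(cost(pop[0]))
    return pop[0], history


best_mask, cost_history = run_ga(toy_cost, seed=1)
print(sum(best_mask), "features selected")
```

Because the best masks always survive truncation, `cost_history` never increases, which is the same kind of cost-versus-iteration trace that Fig. 5 reports for each method.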

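
The feature counts and the 80%/20% split described in the research method can be checked with a short sketch. The group sizes and MIAS class counts are taken from the poster. The plain (unstratified) shuffle, the seed, the rounding of the split point, and the helper name `split_cases` are assumptions.

```python
import random

# Feature groups and sizes as listed in the research method.
FEATURE_GROUPS = {
    "intensity": 6, "moment invariant": 7, "GLCM": 192, "LBP": 8, "RLS": 44,
    "shape": 6, "NRL": 6, "NCL": 4, "RGO": 3, "Fourier descriptor": 1,
}
# MIAS cases selected for classification.
MIAS_CASES = {"benign": 69, "malignant": 54, "normal": 68}


def split_cases(n_cases, train_fraction=0.8, seed=0):
    """Randomly split case indices into training and testing subsets."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_cases)   # rounding rule is an assumption
    return indices[:n_train], indices[n_train:]


total_features = sum(FEATURE_GROUPS.values())
n_mias = sum(MIAS_CASES.values())
train_idx, test_idx = split_cases(n_mias)
print(total_features, n_mias, len(train_idx), len(test_idx))  # 277 191 153 38
```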