
Breast cancer prediction with evaluating performance of elastography image using deep learning

The implemented classification approach consists of three main consecutive stages. First, the dataset is extracted using image processing algorithms; then data preprocessing procedures are applied to the dataset. Finally, machine learning techniques are used for classification.
A. Dataset
The data used in this paper were collected with informed consent obtained from all of the included patients. The dataset is composed of combined ultrasound and ultrasound elastography DICOM images labeled by experienced radiologists and collected over the period from Feb. 2017 to Jun. 2018. A total of 82 images were taken for 34 different patients, where some patients have multiple lesions while others have only one. In addition, some ultrasound elastography images are obtained using two different matching materials (oil and gel) between the transducer of the ultrasound imaging system and the breast tissue of the patient. The lesions are labeled as 56 malignant lesions and 26 benign lesions. The images have different sizes, and the elastography image is superimposed on the B-mode image. Different levels of measured strain are represented in the elastography images using a color map, where blue represents the highest strain (softest), green represents average strain (intermediate), and red the lowest strain (hardest).
B. Image segmentation
There are two regions of interest (ROIs) that can be extracted from each image: the tumor in the B-mode image and the tumor in the elastography image. Therefore, segmentation of the ROIs from the images is performed to be able to extract and measure the features of the different tumors for diagnosis. The images are cropped to separate the B-mode images from the elastography images. Tumors in B-mode images are segmented using a proper gray-level threshold which separates the tumor from the background. Meanwhile, to extract the tumor in the elastography images, the B-mode image is subtracted from the elastography image, followed by masking the elastography image with the ROI obtained from the B-mode image. Thereafter, a proper color threshold is applied to extract the region with the lowest strain (hardest).
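
As a rough illustration of the two segmentation steps just described, the sketch below assumes OpenCV and NumPy, already-cropped B-mode and elastography panels of the same size, and that red encodes the lowest strain; the file names and threshold values are placeholders, not the paper's actual settings.

```python
# Illustrative ROI extraction: B-mode gray-level threshold, then elastography
# subtraction + masking + color threshold for the lowest-strain (hard) region.
import cv2
import numpy as np

bmode = cv2.imread("bmode_crop.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
elasto = cv2.imread("elasto_crop.png", cv2.IMREAD_COLOR)     # assumed aligned, same size

# 1) B-mode tumor ROI: gray-level threshold separating the (darker) tumor from background.
_, bmode_mask = cv2.threshold(bmode, 60, 255, cv2.THRESH_BINARY_INV)

# 2) Elastography tumor ROI: remove the underlying B-mode content, mask with the
#    B-mode ROI, then keep only the red-coded (lowest-strain) pixels.
color_only = cv2.subtract(elasto, cv2.cvtColor(bmode, cv2.COLOR_GRAY2BGR))
masked = cv2.bitwise_and(color_only, color_only, mask=bmode_mask)
hsv = cv2.cvtColor(masked, cv2.COLOR_BGR2HSV)
hard_mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255))    # red hue band (placeholder bounds)
```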

C. Feature Extraction
Features are extracted from the ROIs to help decide whether the lesion is malignant or benign. A total of 33 features are calculated for both the B-mode and elastography images. The extracted features are composed of the geometrical and texture characteristics of either ROI, their differences, or their relative values. Some features are related to the texture of the ROI, such as the mean and standard deviation; others to the geometry of the tumor, such as the area, perimeter, and width-to-height ratio; and others to the quality of the images, such as the contrast-to-noise ratio and signal-to-noise ratio.
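
The following sketch shows how a handful of the listed feature types (texture, geometry, and image quality) could be computed from a grayscale ROI image and its binary mask; it uses common textbook definitions with OpenCV/NumPy and is not necessarily the exact formulation used in the paper.

```python
# Illustrative feature computation from a segmented ROI.
import cv2
import numpy as np

def roi_features(roi_img, roi_mask):
    tumor = roi_img[roi_mask > 0].astype(float)        # pixels inside the ROI
    background = roi_img[roi_mask == 0].astype(float)  # pixels outside the ROI

    # Texture features
    mean_val, std_val = tumor.mean(), tumor.std()

    # Geometry features from the largest contour of the mask
    contours, _ = cv2.findContours(roi_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, True)
    x, y, w, h = cv2.boundingRect(c)
    width_to_height = w / h

    # Image-quality features (simple textbook definitions)
    snr = mean_val / (std_val + 1e-9)
    cnr = abs(mean_val - background.mean()) / (background.std() + 1e-9)

    return dict(mean=mean_val, std=std_val, area=area, perimeter=perimeter,
                width_to_height=width_to_height, snr=snr, cnr=cnr)
```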
D. Data Preprocessing
After obtaining the resulting dataset from feature extraction, the data are preprocessed using PCA for dimensionality reduction. This resulted in selecting 18 dimensions, which are considered the most informative dimensions. The dataset is shuffled and then divided into two main parts, where 80% is used as a training set (67 samples) and 20% (17 samples) is used as a test set. The training set is used for model fitting and hyperparameter tuning, while the test set is used as a held-out (unseen) test set to evaluate the performance of the model on unseen data.
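
A minimal sketch of this preprocessing step, assuming scikit-learn and that the 33-column feature matrix and labels have been saved to the hypothetical files features.npy and labels.npy; the standardization step before PCA and the random seed are assumptions, not stated in the paper.

```python
# PCA dimensionality reduction to 18 components, then a shuffled 80/20 split.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.load("features.npy")   # hypothetical feature matrix (one row per lesion, 33 columns)
y = np.load("labels.npy")     # assumed encoding: 1 = malignant, 0 = benign

X_scaled = StandardScaler().fit_transform(X)                # scaling before PCA (assumption)
X_reduced = PCA(n_components=18).fit_transform(X_scaled)    # keep 18 most informative dimensions

# Shuffle and split into 80% training / 20% held-out test data.
X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.2, shuffle=True, random_state=42)
```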
E. Machine Learning
Applying machine learning algorithms is considered a crucial step to classify between malignant cases and benign cases with high accuracy and reliability. For that purpose, different machine learning algorithms are applied, and the classifier that provides the highest possible separability is selected.
Support vector machine (SVM) [9] is selected as the classification algorithm. SVM is first applied on the training set for fitting and hyperparameter tuning. Afterwards, the resulting model with the hyperparameters selected in the training stage is tested against the held-out test set for performance evaluation. SVM with a radial basis function kernel showed the highest accuracy of 94.12%.
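
A sketch of how this fit-tune-evaluate sequence could look with scikit-learn, reusing the X_train/X_test split from the preprocessing sketch above; the grid-search ranges and cross-validation setup are illustrative assumptions rather than the paper's actual tuning procedure.

```python
# RBF-kernel SVM: tune hyperparameters on the training set, evaluate once on the held-out set.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, log_loss

param_grid = {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]}   # illustrative grid
search = GridSearchCV(SVC(kernel="rbf", probability=True), param_grid, cv=5)
search.fit(X_train, y_train)                       # fitting + hyperparameter tuning

best_model = search.best_estimator_
y_pred = best_model.predict(X_test)                # single evaluation on unseen data
print("accuracy:", accuracy_score(y_test, y_pred))
print("log loss:", log_loss(y_test, best_model.predict_proba(X_test)))
```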

Data analytics in deep learning
FORMS OF DATA ANALYSIS
Organizations with huge amounts of data can begin analytics. Before beginning, data scientists should
make sure that predictive analytics fulfills their business goals and is appropriate for the big data
environment.
Let’s take a quick look at the three types of analytics –
Descriptive analytics – It is the basic form of analytics which aggregates big data and provides
useful insights into the past.
Predictive analytics – The next step up in data reduction; it uses various statistical modelling and
machine learning techniques to analyze past data and predict future outcomes.
Prescriptive analytics – New form of analytics which uses a combination of business rules,
machine learning and computational modelling to recommend the best course of action for any pre-
specified outcome.
A neural network is an interconnected group of nodes. Each processing node has its own small
sphere of knowledge, including what it has seen and any rules it was originally programmed with or
developed for itself.

Algorithm and its work:
1. CNN
2. Back Propagation
3. SVM

1. CNN:
Convolutional neural network (CNN) models stand for one of the oldest deep neural network hierarchies; they contain hidden layers among the general layers that help the weights of the system learn more about the features found in the input image. A general CNN architecture consists of distinct types of layers. The convolutional layer applies an array of weights to all of the input sections from the image and creates the output feature map. The pooling layers simplify the information found in the output of the convolutional layer. The last layer is the fully connected layer, which maps the pooled feature maps to the final classification output.
Convolutional neural networks are widely adopted for solving problems in image classification. In this project, we aim to gain a better understanding of deep learning by exploring the misclassified cases in facial and emotion recognition. In particular, we propose a backtracking algorithm in order to track down the activated pixels in the last layer of feature maps. We are then able to visualize the facial features that lead to the misclassifications by applying the feature-tracking algorithm. A comparative analysis of the activated pixels reveals that, for facial recognition, the activations of the common pixels are decisive for the result of classification, whereas for emotion recognition, the activations of the unique pixels determine the result of classification.
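
To make the layer types above concrete, here is a minimal CNN sketch in Keras/TensorFlow with convolutional, pooling, and fully connected layers; the input size, layer widths, and single benign/malignant output are illustrative assumptions, not the architecture used in the paper.

```python
# A small illustrative CNN: convolution -> pooling -> fully connected layers.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(128, 128, 1)),               # single-channel image patch (assumed size)
    layers.Conv2D(16, (3, 3), activation="relu"),   # convolutional layer: produces feature maps
    layers.MaxPooling2D((2, 2)),                    # pooling layer: simplifies/downsamples the maps
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),            # fully connected layer
    layers.Dense(1, activation="sigmoid"),          # benign vs. malignant probability (assumed task)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```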

2. Back Propagation:

The backpropagation algorithm looks for the minimum of the error function in weight space using the method of gradient descent. The combination of weights which minimizes the error function is considered to be a solution of the learning problem. Since this method requires computation of the gradient of the error function at each iteration step, we must guarantee the continuity and differentiability of the error function. Obviously, we have to use a kind of activation function other than the step function used in perceptrons, because the composite function produced by interconnected perceptrons is discontinuous, and therefore so is the error function. One of the more popular activation functions for backpropagation networks is the sigmoid, a real function of the form s(x) = 1 / (1 + e^(-x)).

Many other kinds of activation functions have been proposed and the backpropagation
algorithm is applicable to all of them. A differentiable activation function makes the function computed
by a neural network differentiable (assuming that the integration function at each node is just the sum
of the inputs), since the network itself computes only function compositions. The error function also
becomes differentiable.

The sigmoid smooths out the steps of the error function. Since we want to follow the gradient direction to find the minimum of this function, it is important that no regions exist in which the error function is completely flat. As the sigmoid always has a positive derivative, the slope of the error function provides a greater or lesser descent direction which can be followed. We can think of our search algorithm as a physical process in which a small sphere is allowed to roll on the surface of the error function until it reaches the bottom.
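
The sketch below illustrates this gradient-descent view of backpropagation on a tiny fully connected network with sigmoid activations, written in plain NumPy; the toy XOR data, network size, and learning rate are arbitrary choices for illustration.

```python
# Minimal backpropagation with gradient descent on a 2-4-1 sigmoid network (toy example).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # smooth, differentiable activation

# Toy XOR-like dataset, purely for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # hidden-layer weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # output-layer weights
lr = 0.5                                             # learning rate (arbitrary)

for _ in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradient of squared error, using s'(x) = s(x) * (1 - s(x))
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates (follow the negative gradient)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 3))   # outputs should approach [0, 1, 1, 0] as the error descends
```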

3. SVM:

The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points. To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane that has the maximum margin, i.e. the maximum distance between the data points of both classes. Maximizing the margin distance provides some reinforcement so that future data points can be classified with more confidence.

Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side
of the hyperplane can be attributed to different classes. Also, the dimension of the hyperplane depends
upon the number of features. If the number of input features is 2, then the hyperplane is just a line. If
the number of input features is 3, then the hyperplane becomes a two-dimensional plane. It becomes
difficult to imagine when the number of features exceeds 3.
Support vectors are data points that are closer to the hyperplane and influence the position and
orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier.
Deleting the support vectors will change the position of the hyperplane. These are the points that help
us build our SVM.
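
The toy example below, using scikit-learn's linear SVC on synthetic 2-D data, prints the fitted hyperplane, the margin width 2/||w||, and the support vectors that determine them; all data here are synthetic and unrelated to the paper's dataset.

```python
# Hyperplane, margin, and support vectors of a linear SVM on toy 2-D data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2, size=(20, 2)),    # class 0 cluster
               rng.normal(loc=2, size=(20, 2))])    # class 1 cluster
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear").fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("hyperplane: %.2f*x1 + %.2f*x2 + %.2f = 0" % (w[0], w[1], b))
print("margin width:", 2 / np.linalg.norm(w))       # distance between the two margin boundaries
print("support vectors:\n", clf.support_vectors_)   # the points that define the hyperplane
```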

Conclusion:
In this paper, it is shown that SVM is very helpful for decision-making in breast cancer diagnosis, by building a generic and robust model that differentiates between benign and malignant cases. The model has an accuracy of 94.1%, which is high compared with the literature, and a logistic loss of 2.5, meaning that the average error produced by the model's decisions is 2.5 over every 17 samples.
