Musfequa Final Proposal
by
Musfequa Rahman
ID: 1704050
Chattogram-4349, Bangladesh.
September, 2022
Chittagong University of Engineering & Technology (CUET)
Department of Computer Science & Engineering
Chattogram-4349, Bangladesh.
Thesis Proposal
Application for the Approval of B.Sc. Engineering Thesis/Project
List of Figures
1 Introduction
2 Background and Present State
3 Specific Objectives and Possible Outcomes
4 Outline of Methodology
    4.1 Data Collection
    4.2 Split Dataset
    4.3 Data Preprocessing
    4.4 Deep CNN Model
    4.5 Classification
5 Required Resources
    5.1 Required Tools/Components
6 Cost Estimation
7 Time Management
List of Figures
1 Introduction
Cancer, which is characterized by abnormal cell growth that can invade or spread
to other parts of the body, is currently one of the leading causes of human
death. According to the World Health Organization (WHO) report in [1], cancer
caused 10 million deaths in 2020, and about 19 million new cases were diagnosed
in that year. Cancer can affect any part of the human body, including the liver,
lungs, breasts, brain, colon, rectum, stomach, skin and blood. Breast cancer is
one of the most common cancers among women and its mortality is also very high:
it accounts for 1 in 4 new cancer cases and 1 in 6 cancer-related deaths
worldwide [1]. In 2020, 2.26 million new breast cancer cases and 0.685 million
deaths were reported, and 27 million new cancer cases are expected to occur by
2030 [2].
Breast cancer develops when cells within the breast proliferate out of control
and invade nearby tissues via blood and lymphatic vessels. It can also arise in
the fatty tissue or the fibrous connective tissue of the breast. Common signs
include a lump or thickened tissue that feels different from the surrounding
tissue, breast pain, skin on the breast that is pitted, red or discolored, and
swelling of all or part of the breast. Tissue taken from a breast lump can be
examined to diagnose breast cancer. Despite recent advances in our understanding
of the molecular biology of breast cancer progression and the identification of
new related molecular markers, a biopsy is still the only diagnostic procedure
that can definitively determine whether a suspicious area is cancerous, because
histopathological analysis remains the most common technique for breast cancer
diagnosis. Pathologists make their diagnoses by visually examining histological
slides under a microscope, which is regarded as the gold standard for confirming
a diagnosis. Manual diagnosis, however, imposes a heavy workload on experienced
professionals, and pathologists who lack sufficient diagnostic experience are
more likely to make errors. Computer-aided diagnosis (CAD) that classifies
histopathology images automatically has been shown to improve diagnostic
efficiency and to give doctors more objective and precise results.
Despite more than 40 years of research [3], automated image processing for
cancer diagnosis remains a difficult problem because the images are complex to
analyze. Machine learning (ML) algorithms have been used to classify and predict
a variety of biomedical signal types. Deep learning, an emerging branch of
machine learning, is attracting wide attention because it enables machines to
interpret high-dimensional data such as images, multidimensional anatomical
images, and videos. This work will therefore examine deep learning methods for
classifying histological images of breast cancer. In addition to comparing
several convolutional neural network architectures, the study will evaluate
pre-trained transfer learning models with slight layer modifications. Transfer
learning provides good feature combinations in a short amount of time, even for
extremely complicated tasks.
2 Background and Present State
identify 96.5 percent of the type of breast cancer in a suspected patient using
the Wisconsin Breast Cancer Dataset (WBCD) of the FNA biopsy system in the
work in [5].
On image classification and object detection tasks, deep learning algorithms
have produced results on par with those of human experts [6]. The convolutional
neural network (CNN) is the most popular deep learning framework for learning
complex discriminative features between image classes. Numerous CNN
architectures exist, including VGG16, VGG19 and Inception-ResNet-v2. Over the
years, these architectures have delivered outstanding results on the large
ImageNet dataset.
The authors of [7] attempted this by building a simple 3-layer CNN, which
achieved a Balanced Accuracy (BAC) of 84 percent. Cruz’s method used a fully
learn-from-data approach and did not rely on manually created features.
However, because of computational limitations, the network structure was far
simpler than the intricate architectures of modern neural networks.
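Balanced accuracy, the metric reported for [7], is the mean of the per-class recall values, which makes it less sensitive to class imbalance than plain accuracy. The following is a minimal sketch in plain Python; the function name and the toy labels are illustrative assumptions and are not taken from [7].

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall (balanced accuracy)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels (illustrative): 1 = cancerous patch, 0 = non-cancerous patch.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]

# Recall is 0.5 for class 1 and 1.0 for class 0.
bac = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On this imbalanced toy example the plain accuracy is higher than the balanced accuracy, because the majority class dominates the plain score.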
To diagnose breast and prostate cancer automatically, the authors of [8]
developed a 5-layer multi-input CNN that took into account both RGB images and
phase shearlet coefficients. This CNN achieved an accuracy of 88 percent on a
different dataset. In [9], the architecture and functioning of each network
(MLP and CNN) are analyzed thoroughly, and the performance of each network is
then evaluated based on how accurately it diagnoses and classifies breast
cancer. CNN is considered to provide slightly higher precision than the
multi-layer perceptron (MLP) for the detection and identification of breast
cancer.
The study in [10] also found that, on the dataset used, the augmentation
strategy was successful in the automatic identification of this cancer. The
authors of [11] employed a deep learning (DL) technique to automatically
identify and examine invasive ductal carcinoma (IDC) tissue zones.
According to [12], the automated detection of IDC is still a difficult
diagnostic problem for breast cancer, although authors have created successful
approaches for it. Four CNN architectures were tested, not counting the base
CNN. All architectures were trained on a sizable dataset of roughly 275,000 RGB
image patches of 50x50 pixels. An accuracy of 89 percent was measured as the
global average over 10-fold cross-validation tests.
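The "global average of 10-fold cross-validation" can be read as follows: split the data into ten folds, hold each fold out once for testing, and average the ten accuracy scores. The sketch below shows only this bookkeeping in plain Python. The evaluator `dummy_evaluate` is a hypothetical stand-in for training and scoring a CNN, and nothing here is taken from the implementation in [12].

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate_fold, k=10, seed=0):
    """Average the score returned by evaluate_fold over k train/test splits."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the held-out one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate_fold(train_idx, test_idx))
    return sum(scores) / k


# Hypothetical evaluator: a real one would train a CNN on train_idx
# and return its accuracy on test_idx.
def dummy_evaluate(train_idx, test_idx):
    return 0.89


mean_accuracy = cross_validate(100, dummy_evaluate)
```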
The authors of [13] propose a method for classifying breast cancer
histopathology images by combining several compact Convolutional Neural Networks
(CNNs). First, a hybrid CNN architecture is created with a local model branch
and a global model branch; through local voting and the merging of information
from the two branches, the hybrid model gains a stronger capacity for
representation. Second, the proposed Squeeze-Excitation-Pruning (SEP) block is
incorporated into the hybrid model so that channel importance can be learned
and redundant channels eliminated. This channel pruning strategy yields 89
percent accuracy while lowering the risk of overfitting.
In this research, we will try to classify eight types of breast cancer
histopathology images (four cancerous and four non-cancerous).
4 Outline of Methodology
The main goal of this work is to create a model that can classify images of
breast cancer. The entire dataset is divided into three subsets: training,
validation, and testing. In the preprocessing stage, the data are transformed so
that features can be mapped from them. The features extracted by the deep CNN
model are converted into feature vectors, which are used as the input to the
classification stage. Finally, the histopathology images are classified using
the model’s output.
Figure 4.1: Proposed methodology for breast cancer classification.
The following subsections provide information about the system’s detailed pro-
cessing:
4.1 Data Collection
Data collection is the most crucial step before any machine learning or deep
learning strategy can be implemented. The goal of this study is to classify
breast cancer, so the model must be trained on a dataset made up of numerous
histopathology images. Because training a CNN model takes a lot of data, the
dataset needs to be large.
4.2 Split Dataset
The dataset is split into training, validation and test sets using a proper
partitioning strategy, as the split has a significant impact on the model’s
outcome.
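One common partitioning strategy is a stratified split, which keeps the class proportions the same across the three subsets. The sketch below assumes a 70/15/15 split and a fixed random seed; both are illustrative choices, since the proposal does not fix the ratios, and the toy labels are placeholders.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_pct=70, val_pct=15, seed=42):
    """Return (train, val, test) index lists with class proportions preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx, test_idx = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = len(indices) * train_pct // 100
        n_val = len(indices) * val_pct // 100
        train_idx += indices[:n_train]
        val_idx += indices[n_train:n_train + n_val]
        test_idx += indices[n_train + n_val:]  # remainder goes to the test set
    return train_idx, val_idx, test_idx


# Toy labels: 8 classes with 20 images each (class ids are placeholders).
labels = [c for c in range(8) for _ in range(20)]
train_idx, val_idx, test_idx = stratified_split(labels)
```

Integer percentages are used instead of floating-point fractions so that the subset sizes are computed without rounding surprises.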
4.3 Data Preprocessing
Operations cannot be executed directly on raw input data; the data must be
preprocessed so that features can be mapped from them. In this study, histogram
equalization is used to enhance image contrast. Normalization, based on the mean
and standard deviation of the data, keeps the values within specific limits. All
input images will first be enhanced and then resized to the same dimensions,
which makes training with a large amount of data easier.
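The two operations named above can be illustrated on a tiny grayscale image. The sketch below is a plain-Python illustration under stated assumptions (8-bit intensities, images stored as nested lists, resizing omitted), not the pipeline of this proposal; in practice OpenCV's `cv2.equalizeHist` performs the equalization step on 8-bit single-channel images.

```python
from statistics import mean, pstdev


def equalize_histogram(image, levels=256):
    """Spread grayscale intensities using the cumulative histogram (CDF)."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    denom = len(pixels) - cdf_min
    if denom == 0:  # constant image: nothing to equalize
        return [row[:] for row in image]
    return [[round((cdf[p] - cdf_min) / denom * (levels - 1)) for p in row]
            for row in image]


def standardize(image):
    """Rescale pixels to zero mean and unit standard deviation."""
    pixels = [p for row in image for p in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    sigma = sigma or 1.0  # avoid division by zero for constant images
    return [[(p - mu) / sigma for p in row] for row in image]


# Low-contrast toy image: intensities squeezed into the range 100..103.
image = [[100, 101],
         [102, 103]]
equalized = equalize_histogram(image)   # stretched over the full 0..255 range
normalized = standardize(equalized)     # zero mean, unit variance
```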
4.4 Deep CNN Model
The proposed model will make use of deep learning techniques. All neural network
types share some common building blocks, including neurons, weights, biases, and
activation functions. To train and evaluate deep CNN models, each input image
will be passed through a series of convolution layers with filters (kernels),
pooling layers, and fully connected (FC) layers. The pooling layer receives the
feature maps from the convolution layer and summarizes these features, and the
summarized features are passed to the subsequent operation. An activation
function will then be used to classify the image.
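To make the convolution and pooling steps concrete, the sketch below applies one filter and one max-pooling step to a toy image in plain Python. The image, the kernel values, and the function names are illustrative assumptions; a real model would learn its kernels and would be built with a framework such as Keras.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Set negative activations to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# Toy 5x5 image with a vertical edge, and a 2x2 vertical-edge kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1],
          [-1, 1]]
feature_map = relu(conv2d(image, kernel))  # 4x4 map, responds at the edge
pooled = max_pool2d(feature_map)           # 2x2 summary of the feature map
```

The pooled map keeps the strong response at the edge while quartering the number of values, which is the summarizing role of the pooling layer.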
4.5 Classification
In machine learning, classification is the predictive modeling technique that
predicts a class label for a given sample of input data. The feature maps are
flattened into a feature vector, which is passed to the fully connected layer.
The images are then categorized using an activation function. Softmax activation
is used in this work because it performs well on multi-class classification
problems.
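Softmax maps the raw scores (logits) of the final layer to class probabilities that sum to 1, and the class with the highest probability is taken as the prediction. A minimal plain-Python sketch follows; the eight logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Eight hypothetical logits, one per histopathology class.
logits = [2.0, 0.5, 0.1, -1.0, 0.0, 1.0, -0.5, 0.3]
probs = softmax(logits)
predicted_class = probs.index(max(probs))  # index of the most probable class
```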
5 Required Resources
A well-built desktop or laptop with a GPU will be required for this research,
because a large dataset will be used. A computer with at least a Core i5
processor and 8 GB of RAM is therefore required.
5.1 Required Tools/ Components
• Some Libraries:
– TensorFlow
– Keras
– OpenCV
– Matplotlib
– Pandas
6 Cost Estimation
The estimated costs of implementing the proposed method are given below:
a. Cost of Materials:
• A powerful PC: Tk. 110,000
• Software: Tk. 5,500
Total: Tk. 115,500
7 Time Management
The Gantt chart for the entire thesis timeline is given below:
CSE Undergraduate Studies (CUGS) Committee
References
[4] M. Malathi, P. Sinthia, F. Farzana and G. Aloy Anuja Mary, ‘Breast cancer
detection using active contour and classification by deep belief network,’
Materials Today: Proceedings, vol. 45, pp. 2721–2724, 2021, International
Conference on Advances in Materials Research - 2019, issn: 2214-7853. doi:
https://doi.org/10.1016/j.matpr.2020.11.551.
[5] S. Ara, A. Das and A. Dey, ‘Malignant and benign breast cancer
classification using machine learning algorithms,’ in 2021 International
Conference on Artificial Intelligence (ICAI), 2021, pp. 97–101. doi:
10.1109/ICAI52203.2021.9445249.
[9] M. Desai and M. Shah, ‘An anatomization on breast cancer detection and
diagnosis employing multi-layer perceptron neural network (MLP) and
convolutional neural network (CNN),’ Clinical eHealth, vol. 4, pp. 1–11, 2021.
doi: https://doi.org/10.1016/j.ceh.2020.11.002.