
DEVELOPMENT OF NEURAL NETWORK FOR

BREAST CANCER DIAGNOSIS

ABSTRACT

This project focused on the development of a self-learning expert system (ES) for breast cancer diagnosis using an artificial neural network (NN). The goal of the project is to develop an ES that can learn autonomously from external medical data using neural networks. Mammogram datasets were acquired from a web-based machine learning repository (https://archive.ics.uci.edu/ml/datasets/Breast+Cancer) and from oncologists. The collected datasets were cleaned and preprocessed before being used to train the NN, and the trained network was tested with inference datasets. The training set contains 226 mammograms, of which 193 are normal and 33 are malignant; the network classified all 33 malignant cases correctly, while 2 of the 193 normal cases were misclassified. The validation set comprises 48 samples (42 normal, 6 malignant), all of which the network predicted correctly. The test set consists of 48 samples (37 normal, 11 malignant); prediction accuracy is 100% for this dataset. This product is useful for distinguishing cancerous from non-malignant breast tumors. Future research can focus on the prediction of malignant tumors and on computer-aided treatment prescription.

TABLE OF CONTENTS

Title Page
Declaration
Certification
Dedication
Acknowledgement
Abstract
Table of Contents
Chapter One
  Background of Study
  Statement of the Problem
  Purpose of the Study
  Research Question
  Scope of Study
Chapter Two
  Introduction
  Symptoms of Breast Cancer
  Breast Cancer Diagnosis
  Expert Systems
  Neural Networks
Chapter Three
  Datasets
  Data Cleaning and Pre-processing
  Training Neural Network
  Trained Neural Network
  Output
Chapter Four
  Features Extraction Using GLCM
  Features Selection
  Classification
  Regression Analysis
  Neural Network Performance Evaluation
Chapter Five
  Summary
  Conclusion
  Recommendations
References

CHAPTER ONE

1.1 BACKGROUND INFORMATION

The integration of information technology into many areas of human life has led to the development of different innovative solutions to human problems, both at an individual level (personal finance, healthy lifestyle, education, communication, etc.) and at a global scale (climate change, insecurity, disaster management and disease control).

The advent of information technology aims to free humans from mental drudgery, just as the industrial revolution freed humans from physical drudgery.

Expert systems (ES) are computer applications or programs developed to solve complex problems at the level of extraordinary human intelligence and expertise. An ES reasons over a body of knowledge represented mainly as IF-THEN rules rather than through conventional procedural code. ES are among the first truly successful forms of artificial intelligence (AI) software. However, some experts point out that ES are not true AI, since they lack the ability to learn autonomously from external data.

Knowledge engineering is the discipline of building ES, and its practitioners are called knowledge engineers. The computer must have all the knowledge required to solve a problem, and that knowledge must be represented as symbol patterns in the memory of the computer. The computer must also use the knowledge efficiently, selecting from a handful of reasoning methods.

Figure 1: Components of an Expert System

An ES consists of three main parts: the knowledge base, the reasoning methods or inference engine, and the user interface.

Knowledge is required to exhibit intelligence or sound judgment. The success of any ES rests largely on the collection of highly accurate and precise knowledge. Data is a collection of facts; information is data organized as facts about the task domain. Data, information and past experience combined together are termed knowledge. The knowledge base contains factual and heuristic knowledge.

Factual knowledge is knowledge of the task domain that is widely shared, typically found in textbooks or journals, and commonly agreed upon by those knowledgeable in the particular field.

Heuristic knowledge is less rigorous, more experiential and more judgmental knowledge of performance. It is rarely discussed and largely individualistic. It is the knowledge of good practice, good judgment and plausible reasoning in the field; it is the knowledge that underlies the "art of good guessing."

A knowledge representation is the method used to organize and formalize the knowledge in the knowledge base, in this case in the form of IF-THEN-ELSE rules.

The knowledge base is built from the writings of various scholars, from domain experts and from the knowledge engineers themselves. A knowledge engineer is a person with empathy, quick learning and strong analytical skills. The knowledge engineer acquires information from subject experts by recording, interviewing and observing them at work, then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules to be used by the inference engine.

The inference engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution. For a rule-based expert system, it applies rules repeatedly to the facts, adds new knowledge to the knowledge base if required, and resolves rule conflicts when multiple rules are applicable to a particular case.

To recommend a solution, the inference engine uses forward chaining, a strategy used by an ES to answer the question "What can happen next?". The inference engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the facts and rules, and sorts them before concluding with a solution. This strategy is used for reasoning towards a conclusion, result or effect, for example predicting the share market status as an effect of changes in interest rates.

Figure 2: Forward Chaining

The inference engine also uses backward chaining, which answers the question "Why did this happen?". The inference engine tries to find out which conditions could have held in the past for this result to occur. This strategy is used for finding a cause or reason, for example diagnosing blood or breast cancer in humans.
Figure 3: Backward Chaining
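As an illustration of these two reasoning strategies, the minimal sketch below implements a tiny rule engine in Python. The rules, facts and function names are hypothetical examples chosen for illustration; they are not part of this project's implementation.

```python
# Minimal sketch of forward and backward chaining over IF-THEN rules.
# The rules and facts below are illustrative only (hypothetical example).

RULES = [
    ({"lump_in_breast", "abnormal_mammogram"}, "suspicious_mass"),
    ({"suspicious_mass", "positive_biopsy"}, "breast_cancer"),
]

def forward_chain(facts):
    """Apply rules repeatedly to the known facts until no new fact is derived
    ("what can happen next")."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Work backwards from a goal to the conditions that could explain it
    ("why did this happen")."""
    if goal in facts:
        return True
    for conditions, conclusion in RULES:
        if conclusion == goal and all(backward_chain(c, facts) for c in conditions):
            return True
    return False

observed = {"lump_in_breast", "abnormal_mammogram", "positive_biopsy"}
print(forward_chain(observed))                    # derives 'suspicious_mass' and 'breast_cancer'
print(backward_chain("breast_cancer", observed))  # True: the conditions can be traced back
```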

The user interface provides interaction between the user and the ES. It explains how the ES arrived at a particular recommendation. The explanation may appear as natural language displayed on screen, as verbal narration in natural language, or as a listing of rule numbers displayed on screen. The user interface makes it easy to trace the credibility of the ES's deductions.

Cancer is a group of diseases involving abnormal cell growth with the potential to spread to other body parts. Breast cancer is a cancer that develops from breast tissue. Common breast cancer signs and symptoms include a lump or swelling in the breast, upper chest or armpit; changes in skin texture; changes in breast colour; rash, crusting or changes to the nipple; and unusual discharge from either nipple.

Risk factors for developing breast cancer include being female, obesity, lack of physical exercise, drinking alcohol, and hormone replacement therapy during menopause. Breast cancer commonly develops in cells from the lining of the milk ducts and the lobules that supply the ducts with milk. Cancers developing from the ducts are known as "ductal carcinomas", while those developing from the lobules are known as "lobular carcinomas".

The earlier breast cancer is diagnosed, the better the chance of successful treatment. Breast cancer is diagnosed by a biopsy of the affected area of the breast, confirmed by X-ray mammography. The likelihood of a lump being cancerous can also be assessed by physical examination of the breast by a healthcare provider.

Nowadays, with the advent of technology, medical practice is becoming more effective. There are many applications of ES that have been used in the medical field; ES have been implemented for diagnosing diseases such as diabetes and skin disease. This breast cancer diagnosis ES consists of both structured questions and structured responses within the medical domain.

1.2 STATEMENT OF THE PROBLEM

Expert systems are among the first truly successful forms of artificial intelligence software. However, some experts point out that ES are not true artificial intelligence, since they lack the ability to learn autonomously from external data, which limits their diagnostic capacity.

The focus of this project is to develop an ES that can learn autonomously

from external medical data using “Neural Networks.”

1.3 PURPOSE OF STUDY

The aim of this project is to develop an Expert System that can learn

autonomously from external medical data using “Neural networks” for breast cancer

diagnosis.

The specific objectives of this project are to:

• detect cancerous cells in breast tissue early;

• accept inputs from end users and provide accurate and precise expert diagnostic reports;

• learn autonomously and improve its accuracy and precision continuously;

• aid human experts in the diagnosis of breast cancer;

• guide healthcare providers on what to do in an emergency when a human expert is not present.

1.4 RESEARCH QUESTION

1.5 METHODOLOGY

In implementing this project, a detailed and extensive review of relevant literature was carried out. An elaborate study of the principles and techniques governing the development of an expert system for solving complex diagnosis problems was carried out by learning its properties and concepts. Accurate and precise training datasets were acquired from web-based machine learning dataset websites and from cancer experts (oncologists). The collected datasets were used to train the ES learning algorithms and tested with inference datasets.

1.6 SCOPE OF THE STUDY

The scope of this project is limited to the development of an automated reasoning system (ES) for early breast cancer diagnosis only. Treatment and prevention of breast cancer are not considered, and other types of cancer are likewise outside the scope.

CHAPTER TWO

LITERATURE REVIEW

2.1 INTRODUCTION

The integration of information technology into many areas of human life has led to the development of different innovative solutions to human problems, both at an individual level (personal finance, healthy lifestyle, education, communication, etc.) and at a global scale (climate change, insecurity, disaster management and disease control).

The advent of information technology aims to free humans from mental drudgery, just as the industrial revolution freed humans from physical drudgery.

Cancer is a disease of the cells, which are the body’s basic building blocks. The

body constantly makes new cells to help us grow, replace worn-out tissue and heal injuries.

Normally, cells multiply and die in an orderly way. Sometimes cells don’t grow, divide and

die in the usual way. This may cause blood or lymph fluid in the body to become abnormal,

or form a lump called a tumour. A tumour can be benign or malignant.

Benign tumour – Cells are confined to one area and are not able to spread to other parts of

the body. This is not cancer.

Malignant tumour – This is made up of cancerous cells, which have the ability to spread by

travelling through the bloodstream or lymphatic system (lymph fluid).

Figure 4: How Cancer Starts

The cancer that first develops in a tissue or organ is called the primary cancer. A

malignant tumour is usually named after the organ or type of cell affected. A malignant

tumour that has not spread to other parts of the body is called localised cancer. A tumour

may invade deeper into surrounding tissue and can grow its own blood vessels in a process

called angiogenesis. If cancerous cells grow and form another tumour at a new site, it is

called a secondary cancer or metastasis. A metastasis keeps the name of the original cancer.

For example, breast cancer that has spread to the bones is called metastatic breast cancer,

even though the person may be experiencing symptoms caused by problems in the bones.

Figure 5: How Cancer Spreads

Women and men both have breast tissue. In women, breasts are made up of milk

glands. A milk gland consists of:

• Lobules – where milk is produced

• Ducts – tubes that carry milk to the nipples.

In men, the development of the lobules is suppressed at puberty by testosterone, the

primary male sex hormone. Both female and male breasts also contain supportive fibrous

and fatty tissue. Some breast tissue extends into the armpit (axilla). This is known as the

‘axillary tail’ of the breast.

Breast cancer and the lymphatic system

The lymphatic system is a key part of the immune system. It protects the body

against disease and infection. It is made up of a network of thin tubes called lymph vessels

that are found throughout the body. Lymph vessels connect to groups of small, bean-shaped

structures called lymph nodes or glands. Lymph nodes are found throughout the body,

including in the armpits, breastbone (sternum), neck, abdomen and groin. The lymph nodes

in the armpit are often the first place cancer cells spread to outside the breast. During

surgery for breast cancer (or, sometimes, in a separate operation), some or all of the lymph

nodes will be removed and examined for cancer cells.

Figure 6: The Breast

Breast cancer is the abnormal growth of the cells lining the breast lobules or ducts.

These cells grow uncontrollably and have the potential to spread to other parts of the body.

Both women and men can develop breast cancer, although breast cancer is rare in men.

Ductal carcinoma in situ (DCIS) – Abnormal cells are contained within the ducts of the breast. Having DCIS can increase the risk of developing invasive breast cancer.

Invasive (early) breast cancer – The cancer has spread from the breast ducts or lobules into surrounding breast tissue. It may also have spread to lymph nodes in the armpit. Most breast cancers are found when they are invasive. The most common types of early breast cancer are invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC). IDC accounts for about 80% of breast cancers, and ILC makes up about 10% of breast cancer cases.

Other types of invasive breast cancer include locally advanced breast cancer,

secondary breast cancer, inflammatory breast cancer and Paget’s disease of the nipple.

Lobular carcinoma in situ

Some women have abnormal cells that are contained within the lobules of the breast.

This is called lobular carcinoma in situ (LCIS). This is not cancer. LCIS is very rare in men.

While LCIS increases the risk of developing breast cancer, most women with this condition

will not develop breast cancer. LCIS is usually detected during tests for other breast

disorders. If you are diagnosed with LCIS, you will be monitored with regular screening

mammograms or other types of breast imaging.

2.2 SYMPTOMS OF BREAST CANCER

Some people have no symptoms and the cancer is found during a screening

mammogram (a low-dose x-ray of the breast) or a physical examination by a doctor. If you

do have symptoms, they could include:

• a lump, lumpiness or thickening, especially if it is in only one breast

• changes in the size or shape of the breast

• changes to the nipple, such as a change in shape, crusting, sores or ulcers, redness, a clear

or bloody discharge, or a nipple that turns in (inverted) when it used to stick out

• changes in the skin of the breast, such as dimpling or indentation, a rash, a scaly

appearance, unusual redness or other colour changes

• swelling or discomfort in the armpit

• persistent, unusual pain that is not related to your normal monthly menstrual cycle,

remains after your period and occurs in one breast only. Most breast changes aren’t caused

by cancer. However, if you have symptoms, see your doctor without delay.

In most people, the exact cause of breast cancer is unknown, but some factors can

increase the risk. Most people diagnosed with breast cancer have no known risk factors,

aside from getting older, which increases the risk in women and men. Having risk factors

does not necessarily mean that you will develop breast cancer. In women, risk factors

include:

• having several first-degree relatives, such as a mother, father, sister or daughter, diagnosed with breast cancer and/or a particular type of ovarian cancer (however, most women diagnosed with breast cancer do not have a family history)

• having a family member who has had genetic testing and has been found to carry a mutation in the BRCA1 or BRCA2 genes

• a previous diagnosis of breast cancer or ductal carcinoma in situ (DCIS)

• a past history of particular non-cancerous breast conditions, such as lobular carcinoma in situ (LCIS) or atypical ductal hyperplasia (abnormal cells in the lining of the milk ducts)

• long-term hormone replacement therapy (HRT) use.

In men, the risk is increased in those who have:

• several first-degree relatives (male or female) who have had breast cancer

• a relative diagnosed with breast cancer under the age of 40

• several relatives with ovarian or colon cancer

• a family member who has had genetic testing and has been found to carry a mutation in the BRCA1 or BRCA2 genes

• a rare genetic syndrome called Klinefelter syndrome, in which a man has three sex chromosomes (XXY) instead of the usual two (XY).

Some lifestyle factors, such as being overweight, smoking, drinking alcohol and a lack of physical activity, also slightly increase the risk of breast cancer in both women and men.

Inherited breast cancer gene

Most people diagnosed with breast cancer do not have a family history of the

disease. However, a small number of people have inherited a gene fault that increases their

breast cancer risk. Everyone inherits a set of genes from each parent, so they have two

copies of each gene. Sometimes there is a fault in one copy of a gene. This fault is called a

mutation. The two most common gene mutations that are linked to breast cancer are on the

BRCA1 and BRCA2 genes. Women in families with an inherited BRCA1 or BRCA2

change are at an increased risk of breast and ovarian cancers. Men in these families may be

at an increased risk of breast and prostate cancers. People with a strong family history of

breast cancer can attend a family cancer clinic for tests to see if they have inherited a gene

mutation.

2.3 BREAST CANCER DIAGNOSIS

If a patient has symptoms of breast cancer, the doctor will take a full medical history, including the family history. The doctor will also perform a physical examination, checking the breasts and the lymph nodes under the arms. The doctor may then refer the patient to a specialist for further tests to find out whether the breast change is due to cancer.

A mammogram is a low-dose x-ray of the breast tissue. This x-ray can find changes that are too small to be felt during a physical examination. Both breasts are checked during a mammogram. During the mammogram, the breast is pressed between two x-ray plates, which spread the breast tissue out so clear pictures can be taken. This can be uncomfortable, but it takes only about 20 seconds. If the lump the patient can feel does not show up on a mammogram, other tests will need to be done.

An ultrasound is a painless scan that uses sound waves to create a picture of the breast. A gel is spread on the breast, and a small device called a transducer is moved over the area. This sends out sound waves that echo when they meet something dense, like an organ or a tumour. A computer creates a picture from these echoes. The scan is painless and takes about 15–20 minutes.

A magnetic resonance imaging (MRI) scan uses a large magnet and radio waves to create pictures of the breast tissue on a computer. Breast MRI is commonly used to screen people who are at high risk of breast cancer, but it can also be used in people with very dense breast tissue. Before the scan, the patient is given an injection of a contrast dye to make any cancerous breast tissue easier to see. The patient lies face down on a table with cushioned openings for the breasts and with the arms above the head. The table slides into the machine, which is large and shaped like a cylinder. The scan is painless and takes 30–60 minutes.

During a biopsy, a small sample of cells or tissue is removed from the breast. A pathologist examines the sample and checks it for cancer cells under a microscope. The results of the biopsy and further tests will be outlined in a pathology report, which will include the size and location of the tumour, the grade of the cancer, whether there are cancer cells near the edge (margin) of the removed breast tissue, and whether there are cancer cells in the lymph nodes. The report helps the doctor decide what treatment is best for the patient. There are a few ways of taking a biopsy, and a patient may need more than one. The biopsy may be done in a specialist's rooms, at a radiology practice, in hospital or at a breast clinic.

Fine needle aspiration (FNA) – A thin needle is used to take cells from the breast

lump or abnormal area. Sometimes an ultrasound is used to help guide the needle. The test

can feel similar to having blood taken and may be a bit uncomfortable. A local anesthetic

may be used to numb the area where the needle will be inserted.

Core biopsy – A wider needle is used to remove a piece of tissue (a core) from the lump or abnormal area. It is usually done under local anesthetic, so the breast is numb, although the patient may feel some pain or discomfort when the anesthetic is given. During a core biopsy, a mammogram, ultrasound or MRI is used to guide the needle. There may be some bruising to the breast afterwards.

Vacuum-assisted stereotactic core biopsy – In this core biopsy, a number of small

tissue samples are removed through one small cut (incision) in the skin using a needle and a

suction-type instrument. It is done under a local anesthetic. A mammogram, ultrasound or

MRI may be used to guide the needle into place. The patient may feel some discomfort during the procedure.

Surgical biopsy – If the abnormal area is too small to be biopsied using other

methods or the biopsy result isn’t clear, a surgical biopsy is done. Before the biopsy, a guide

wire may be put into the breast to help the surgeon find the abnormal tissue. The patient will be given a local anesthetic, and the doctor may use a mammogram, ultrasound or MRI to guide

the wire into place. The biopsy is then done under a general anesthetic. The lump and a

small area of nearby breast tissue are removed, along with the wire. This is usually done as

day surgery, but some people stay in hospital overnight.

If the tests described above show that a patient has breast cancer, one or more further tests may be done to see whether the cancer has spread to other parts of the body. Blood samples may be taken to check the patient's general health and to look at bone and liver function for signs of cancer. The doctor may take an x-ray of the chest to check the lungs for signs of cancer.

A bone scan may be done to see whether the breast cancer has spread to the bones. A small amount of radioactive material is injected into a vein, usually in the arm. This material is attracted to areas of bone where there is cancer. After a few hours, the bones are viewed with a scanning machine, which sends pictures to a computer. This scan is painless and the radioactive material is not harmful. The patient should drink plenty of fluids on the day of the test and the day after.

A CT (computerized tomography) scan uses x-rays and a computer to create detailed, cross-sectional pictures of the inside of the body. The patient may have to fast (not eat or drink) for a period of time beforehand to make the scan pictures clearer and easier to read. Before the scan, the patient either drinks a liquid dye or is given an injection of dye into a vein in the arm. This dye is known as the contrast and it makes the pictures clearer. Patients who have the injection may feel hot all over for a few minutes. The patient lies flat on a table while the CT scanner, which is large and round like a doughnut, takes pictures. This painless test takes 30–40 minutes.

A PET (positron emission tomography) scan is a specialized test, which is rarely done for breast cancer. It is currently not funded by Medicare as a routine test for breast cancer. A PET scan uses low-dose radioactive glucose to measure cell activity in different parts of the body. If a patient does have a PET scan, a small amount of the glucose is injected into a vein, usually in the arm. The patient waits for about an hour for the fluid to move around the body, then lies on a table that moves through a scanning machine. The scan shows 'hot spots' where the fluid has accumulated; this happens where there are active cells, like cancer cells.

The tests described above show whether the cancer has spread to other parts of the

body. Working out how far the cancer has spread is called staging. Stages are numbered

from I to IV. The grade describes how active the cancer cells are and how fast the cancer is

likely to be growing.

Stage I – The tumour is less than 2 cm in diameter and has not spread to the lymph nodes in the armpit.

Stage IIA – The tumour is less than 2 cm in diameter and has spread to the lymph nodes in the armpit, or the tumour is 2–5 cm in diameter and has not spread to the lymph nodes in the armpit.

Stage IIB – The tumour is 2–5 cm in diameter and has spread to the lymph nodes in the armpit.

Stage III – Referred to as locally advanced breast cancer.

Stage IV – Refers to advanced breast cancer.

Grade 1 (low grade) Cancer cells look a little different from normal cells. They are

usually slow growing.

Grade 2 (intermediate grade) Cancer cells do not look like normal cells. They are

growing faster than grade 1 breast cancer, but not as fast as grade 3.

Grade 3 (high grade) Cancer cells look very different from normal cells. They are

fast growing.

Prognosis means the expected outcome of a disease. Patient may wish to discuss

his/her prognosis with his/her doctor, but it is not possible for any doctor to predict the

exact course of the disease. Survival rates for people with breast cancer have increased

significantly over time due to better diagnostic tests and scans, earlier detection, and

improvements in treatment methods. Most people with early breast cancer can be treated

successfully.

2.4 EXPERT SYSTEMS

ES are computer applications or programs developed to solve complex problems at the level of extraordinary human intelligence and expertise. An ES reasons over a body of knowledge represented mainly as IF-THEN rules rather than through conventional procedural code. ES are among the first truly successful forms of artificial intelligence (AI) software. However, some experts point out that ES are not true AI, since they lack the ability to learn autonomously from external data. An ES consists of three main parts: the knowledge base, the reasoning methods or inference engine, and the user interface.

Knowledge is required to exhibit intelligence or sound judgment. The success of any ES rests largely on the collection of highly accurate and precise knowledge. Data is a collection of facts; information is data organized as facts about the task domain. Data, information and past experience combined together are termed knowledge. The knowledge base contains factual and heuristic knowledge: factual knowledge is knowledge of the task domain that is widely shared, typically found in textbooks or journals, and commonly agreed upon by those knowledgeable in the particular field. Heuristic knowledge is less rigorous, more experiential and more judgmental knowledge of performance. It is rarely discussed and largely individualistic. It is the knowledge of good practice, good judgment and plausible reasoning in the field; it is the knowledge that underlies the "art of good guessing."

A knowledge representation is the method used to organize and formalize the knowledge in the knowledge base in the form of IF-THEN-ELSE rules. The knowledge base is built from the writings of various scholars, from domain experts and from the knowledge engineers themselves. A knowledge engineer is a person with empathy, quick learning and strong analytical skills, who acquires information from subject experts by recording, interviewing and observing them at work, then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules to be used by the inference engine.

The inference engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution. For a rule-based expert system, it applies rules repeatedly to the facts, adds new knowledge to the knowledge base if required, and resolves rule conflicts when multiple rules are applicable to a particular case. To recommend a solution, the inference engine uses the following strategies. Forward chaining is a strategy used by an ES to answer the question "What can happen next?": the inference engine follows the chain of conditions and derivations and finally deduces the outcome, considering all the facts and rules and sorting them before concluding with a solution. This strategy is used for reasoning towards a conclusion, result or effect, for example predicting the share market status as an effect of changes in interest rates. Backward chaining is used to answer the question "Why did this happen?": the inference engine tries to find out which conditions could have held in the past for this result to occur. This strategy is used for finding a cause or reason, for example diagnosing blood or breast cancer in humans.

The user interface provides interaction between the user and the ES. It explains how the ES arrived at a particular recommendation. The explanation may appear as natural language displayed on screen, as verbal narration in natural language, or as a listing of rule numbers displayed on screen. The user interface makes it easy to trace the credibility of the ES's deductions.

2.5 NEURAL NETWORKS

Artificial neural networks (NNs) are biologically inspired and mimic the human brain. They are composed of neurons that are connected to each other by links; each link carries a weight that multiplies the signal transmitted through the network. The output of each neuron is determined by an activation function such as the sigmoid or step function; usually nonlinear activation functions are used. NNs are trained by experience: when an unknown input is applied to the network, it can generalize from past experience and produce a new result (Bishop, 1996; Hanbay, Turkoglu, & Demir, 2007; Haykin, 1994). NN models have been used for pattern matching, nonlinear system modeling, communications, the electrical and electronics industry, energy production, the chemical industry, medical applications, data mining and control because of their parallel processing capabilities. When designing a NN model, a number of considerations must be taken into account: first the suitable structure of the model must be chosen, and then the activation function, the number of layers and the number of units in each layer. Generally the desired model consists of a number of layers, and the most general model assumes complete interconnection between layers.

Figure 7: General Model of Artificial Neural Network

The perceptron is the simplest and oldest model of the neuron as we know it: it takes some inputs, sums them up, applies an activation function and passes the result to the output layer.

Figure 8: Perceptron
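A minimal sketch of such a perceptron in Python follows; the input values, weights, bias and the step activation used here are illustrative assumptions, not values taken from this project.

```python
import numpy as np

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation function."""
    total = np.dot(inputs, weights) + bias
    return 1 if total > 0 else 0

# Illustrative values only: two input features, hand-picked weights and bias.
x = np.array([0.7, 0.2])
w = np.array([0.5, -0.4])
print(perceptron(x, w, bias=-0.1))  # 1 if the weighted sum is positive, else 0
```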

Feed-forward neural networks are also quite old — the approach originates from the 1950s. They generally follow these rules: all nodes are fully connected, activation flows from the input layer to the output layer without back loops, and there is one layer between input and output (the hidden layer). In most cases this type of network is trained using the back-propagation method.

Figure 9: Feed Forward Neural Networks

RBF neural networks are actually FF (feed-forward) NNs that use a radial basis function as the activation function instead of the logistic function. What makes the difference? The logistic function maps an arbitrary value to a 0…1 range, answering a "yes or no" question. It is good for classification and decision-making systems, but works poorly for continuous values. By contrast, radial basis functions answer the question "how far are we from the target?". This makes them well suited to function approximation and machine control (as a replacement for PID controllers, for example). In short, these are just FF networks with a different activation function and application.

Figure 10: RBF Neural Networks
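The difference between the two activation functions can be made concrete with a short sketch (a hypothetical illustration, not this project's code): the logistic function squashes any value into the 0…1 range, while a Gaussian radial basis function reports how close an input is to a chosen centre.

```python
import numpy as np

def logistic(x):
    """Maps any value into the 0..1 range - useful for yes/no decisions."""
    return 1.0 / (1.0 + np.exp(-x))

def gaussian_rbf(x, centre, width=1.0):
    """Close to 1 when x is near the centre, falling towards 0 with distance."""
    return np.exp(-((x - centre) ** 2) / (2 * width ** 2))

print(logistic(2.0))           # ~0.88: "probably yes"
print(gaussian_rbf(2.0, 2.5))  # ~0.88: "close to the target of 2.5"
```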

Deep feed-forward (DFF) neural networks opened the Pandora's box of deep learning in the early 1990s. These are just FF NNs, but with more than one hidden layer. So, what makes them so different? When training a traditional FF network, only a small amount of error is passed back to the previous layer; because of that, stacking more layers led to exponential growth of training times, making DFFs quite impractical. Only in the early 2000s were a number of approaches developed that allowed DFFs to be trained effectively; now they form the core of modern machine learning systems, covering the same purposes as FFs but with much better results.

Figure 11: DFF Neural Networks

Recurrent neural networks introduce a different type of cell — the recurrent cell. The first network of this type was the so-called Jordan network, in which each hidden cell receives its own output with a fixed delay of one or more iterations. Apart from that, it is like a common feed-forward network.

Figure 12: Recurrent Neural Networks

CHAPTER THREE

SYSTEM DESIGN

3.1 DATASETS

Raw datasets – breast cancer mammography images that need to be cleaned and preprocessed.

Training datasets – breast cancer mammography images used to train the neural network.

Input (inference) datasets – breast cancer mammography images used to obtain the required predictions once the neural network has been trained.

3.2 DATA CLEANING AND PREPROCESSING

Data needs to be cleaned and processed so that it is in a usable format for modeling. Exploration is required to identify important elements within the data and to identify any data quality issues. The importance of data pre-processing cannot be over-emphasized: a neural network is only as good as the input data used to train it. If important data inputs are missing, the neural network may not be able to achieve the desired level of accuracy, and if the data is not processed beforehand, it can hurt both the accuracy and the performance of the network down the line. Some data pre-processing techniques include the following.

Mean subtraction (zero-centering) – the process of subtracting the mean from each data point to make the data zero-centered. Consider a case where the inputs to a neuron (unit) are all positive or all negative: the gradient calculated during back-propagation will then be either all positive or all negative (the same sign as the inputs), so parameter updates are restricted to specific directions, which makes convergence inefficient.

Data normalization – normalizing the data so that it has the same scale across all dimensions. A common way to do this is to divide the data along each dimension by its standard deviation. However, this only makes sense if there is reason to believe that different input features have different scales but equal importance to the learning algorithm.

Regularization – one of the most common problems in training deep neural networks is over-fitting. Over-fitting is at play when the network performs exceptionally well on the training data but poorly on test data; it happens because the learning algorithm tries to fit every data point in the input, even those that represent randomly sampled noise. Regularization helps avoid over-fitting by penalizing the weights of the network.
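The short sketch below shows how zero-centering, standard-deviation normalization and an L2 weight penalty could look in Python with NumPy. The feature matrix, weight vector and penalty coefficient are hypothetical values chosen only to illustrate the steps described above.

```python
import numpy as np

# Hypothetical feature matrix: rows are samples, columns are features.
X = np.array([[120.0, 0.8],
              [135.0, 1.1],
              [128.0, 0.9]])

# Mean subtraction (zero-centering): remove the per-feature mean.
X_centred = X - X.mean(axis=0)

# Normalization: divide each feature by its standard deviation
# so that all dimensions are on a comparable scale.
X_normalised = X_centred / X_centred.std(axis=0)

# L2 regularization penalty on a hypothetical weight vector w,
# added to the training loss to discourage over-fitting.
w = np.array([0.5, -1.2])
l2_penalty = 0.01 * np.sum(w ** 2)

print(X_normalised)
print(l2_penalty)
```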

3.3 TRAINING NEURAL NETWORK

Once the data has been cleaned and pre-processed for breast cancer diagnosis, the network is ready to be trained. To start this process the initial weights are chosen randomly. Then the training, or learning, begins.

There are two approaches to training - supervised and unsupervised. Supervised

training involves a mechanism of providing the network with the desired output either by

manually "grading" the network's performance or by providing the desired outputs with the

inputs. Unsupervised training is where the network has to make sense of the inputs without

outside help. The vast bulk of networks utilize supervised training. Unsupervised training is

used to perform some initial characterization on inputs. In supervised training, both the inputs

and the outputs are provided. The network then processes the inputs and compares its

resulting outputs against the desired outputs. Errors are then propagated back through the

system, causing the system to adjust the weights which control the network. This process

occurs over and over as the weights are continually tweaked.
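A minimal sketch of this supervised loop for a single-layer network, written in Python/NumPy, is shown below; the random data, learning rate and number of epochs are illustrative assumptions rather than the settings used in this project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 6 feature vectors with known labels (0 = normal, 1 = malignant).
X = rng.normal(size=(6, 4))
y = np.array([0, 0, 1, 0, 1, 1], dtype=float)

# Initial weights are chosen randomly, then refined over many passes.
w = rng.normal(size=4)
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(500):
    pred = sigmoid(X @ w + b)                              # forward pass
    error = pred - y                                       # compare outputs with desired outputs
    grad_w = X.T @ (error * pred * (1 - pred)) / len(y)    # propagate the error back
    grad_b = np.mean(error * pred * (1 - pred))
    w -= lr * grad_w                                       # adjust the weights that control the network
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # outputs move towards the target labels
```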

The set of data which enables the training is called the "training set." During the

training of a network the same set of data is processed many times as the connection weights

are ever refined. The current commercial network development packages provide tools to

monitor how well an artificial neural network is converging on the ability to predict the right

answer. These tools allow the training process to go on for days, stopping only when the

system reaches some statistically desired point, or accuracy. If a network simply can't solve

the problem, the designer then has to review the input and outputs, the number of layers, the

number of elements per layer, the connections between the layers, the summation, transfer,

and training functions, and even the initial weights themselves. Those changes required to

create a successful network constitute a process wherein the "art" of neural networking

occurs. Another part of the designer's creativity governs the rules of training. There are many

laws (algorithms) used to implement the adaptive feedback required to adjust the weights

during training. The most common technique is backward-error propagation, more commonly

known as back-propagation. These various learning techniques are explored in greater depth

later in this report. When finally the system has been correctly trained, and no further

learning is needed, the weights can, if desired, be "frozen." In some systems this finalized

network is then turned into hardware so that it can be fast.

Other systems don't lock themselves in but continue to learn while in production use.

The other type of training is called unsupervised training. In unsupervised training, the

network is provided with inputs but not with desired outputs. The system itself must then

decide what features it will use to group the input data. This is often referred to as self-

organization or adaption. At the present time, unsupervised learning is not well understood.

This adaption to the environment is the promise which would enable science fiction types of

robots to continually learn on their own as they encounter new situations and new

environments. Life is filled with situations where exact training sets do not exist. Some of

these situations involve military action where new combat techniques and new weapons

might be encountered. Because of this unexpected aspect to life and the human desire to be

prepared, there continues to be research into, and hope for, this field. Yet, at the present time,

the vast bulk of neural network work is in systems with supervised learning. Supervised

learning is achieving results.

One of the leading researchers into unsupervised learning is Teuvo Kohonen, an electrical engineer at the Helsinki University of Technology. He has developed a self-organizing network, sometimes called an auto-associator, that learns without the benefit of

knowing the right answer. It is an unusual looking network in that it contains one single layer

with many connections. The weights for those connections have to be initialized and the

inputs have to be normalized. The neurons are set up to compete in a winner-take-all fashion.

Kohonen continues his research into networks that are structured differently than standard,

feedforward, back-propagation approaches. Kohonen's work deals with the grouping of

neurons into fields. Neurons within a field are "topologically ordered." Topology is a branch

of mathematics that studies how to map from one space to another without changing the

geometric configuration. The three-dimensional groupings often found in mammalian brains

are an example of topological ordering.

Figure 13: System Design (mammographic image → preprocessing → processed mammographic image → trained neural network → output/predictions)

Kohonen has pointed out that the lack of topology in neural network models makes today's neural networks just simple abstractions of the real neural networks within the brain. As this research continues, more powerful self-learning networks may become possible, but currently this field remains one that is still in the laboratory. This project's neural network was trained with a supervised approach.

3.4 TRAINED NEURAL NETWORK

When the neural network has finally been correctly trained and no further learning is needed, the weights can, if desired, be "frozen." In some systems this finalized network is then turned into hardware so that it can be fast. Other systems do not lock themselves in but continue to learn while in production use, as does the network employed for this project. After being trained, the network can accept a breast cancer patient's diagnostic reports as inputs, run the inputs through the network's nodes and layers, and then make predictions on the presence or absence of cancerous tumors while establishing its prediction accuracy.

3.5 OUTPUT

These are the predictions or inferences made by the neural network on the presence or absence of cancerous tumors in the input mammographic image, together with its prediction accuracy.

3.6 METHODOLOGY

In implementing this project, a detailed and extensive review of relevant literature was carried out. An elaborate study of the principles and techniques governing the development of an expert system for solving complex diagnosis problems was carried out by learning its properties and concepts. Accurate and precise training datasets were acquired from web-based machine learning dataset websites and from cancer experts (oncologists). The collected datasets were used to train the ES learning algorithms and tested with inference datasets.

CHAPTER FOUR

IMPLEMENTATION

The methodology adopted in this research project comprises four stages carried out after data cleaning and preprocessing (all datasets used for this project were already preprocessed and cleaned in the mini-MIAS database): first, acquisition of the images from the mini-MIAS database; second, extraction of features from the mammograms; third, selection of the most relevant features; and fourth, classification to identify the appropriate class of each mammogram. The suspicious parts were extracted from the mammograms using texture features. The database for this experiment is taken from mini-MIAS; it contains 322 mammograms, of which 270 are normal (non-cancerous) and 52 are malignant (cancerous). Every image in this database is 1024 × 1024 pixels, and the database can be accessed easily. Sample images of the normal and malignant classes are available in the mini-MIAS database.

Texture features are extracted using the GLCM along 0° for each mammogram. Features represent the image in a specific format that focuses on the relevant information. In the next stage, features are selected for training and testing; this stage is very important because classification accuracy depends mainly on careful selection of features. In the final step the mammograms are classified; for this research work a neural network is used as the classifier to distinguish the mammograms and classify them into the normal and malignant classes.

4.1 FEATURES EXTRACTION USING GLCM

Feature extraction plays a vital role in pattern classification. Gray Level Co-occurrence Matrix (GLCM) features are computed along 0° for all mammograms. In the proposed system, 10 texture features defined by Haralick et al. and listed in Table 1 are extracted from the texture feature sub-space based on the GLCM. The number of gray levels in an image determines the size of the GLCM. In the formulas for these features, n denotes the number of grey levels used, and the matrix element Q(i,j) is the relative frequency with which two pixels, separated by a given pixel distance, occur within a given neighbourhood with intensities i and j. The texture features derived from the GLCM are described below.

Variables                         Image 1   Image 2   Image 3   Image 4   Image 5
Contrast                            0.024     0.037     0.042     0.038     0.038
Correlation                         0.996     0.995     0.995     0.995     0.995
Entropy                             1.195     1.233     1.277     1.262     1.115
Sum of square variance              6.986     8.385     9.065     8.723     7.535
Sum average                         3.980     4.356     4.585     4.399     3.993
Sum variance                       20.170    24.691    26.565    25.733    22.786
Sum entropy                         1.180     1.211     1.256     1.244     1.098
Difference variance                 0.024     0.037     0.042     0.038     0.038
Difference entropy                  0.092     0.124     0.123     0.097     0.089
Info measure of correlation 1      -0.925    -0.905    -0.906    -0.926    -0.925

Table 1: Statistical values for sample images.

Contrast

It measures the grey-level difference between a reference pixel and its neighbour; the variance present in the mammogram is measured through it. Its value is high when Q(i,j) shows large variation across the matrix.

Correlation

Correlation shows the linear dependency of grey values. The value of correlation is high when the mammogram contains a considerable amount of linear structure. Its formula uses the means and variances of the marginal distributions Qx(i) and Qy(j).

Entropy

Entropy is a measure of randomness; it also describes the distribution variance in a region.

Sum of square variance

It describes the variation between two dependent variables. Variance puts relatively high weight on elements that differ from the average value of Q(i,j).

Sum average

This is the relation between the clear and dense areas in a mammogram.

Sum variance

It reveals spatial heterogeneity of an image.

Sum entropy

It is a measure of the sum of micro (local) differences in an image.

Difference entropy

This is a measure of the variability of micro differences.

Information measure of correlation

In this feature two derived arrays are used: the first array represents the summation of the rows and the second the summation of the columns of the GLCM.

Difference variance

Local variability can be measured through it.

The above ten features are calculated for all mammograms; their values for five sample mammograms are shown in Table 1.
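A sketch of how a GLCM along 0° and a few of these texture features could be computed in Python with scikit-image and NumPy is given below. The image is synthetic, only a subset of the ten Haralick features is shown, and the formulas in the comments follow the standard GLCM definitions rather than this project's exact implementation (older scikit-image versions name the functions greycomatrix/greycoprops).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Synthetic 8-bit "mammogram" patch (illustrative only).
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# GLCM along 0 degrees, pixel distance 1, normalised so its entries sum to 1.
glcm = graycomatrix(image, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

contrast = graycoprops(glcm, "contrast")[0, 0]        # sum over i,j of (i - j)^2 * Q(i, j)
correlation = graycoprops(glcm, "correlation")[0, 0]  # linear dependency of grey values

# Entropy computed directly from the normalised matrix: -sum Q(i,j) * log2 Q(i,j).
Q = glcm[:, :, 0, 0]
entropy = -np.sum(Q[Q > 0] * np.log2(Q[Q > 0]))

print(contrast, correlation, entropy)
```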

4.2 FEATURES SELECTION

Feature subset selection is used to reduce the feature space, which helps to reduce the computation time. This is achieved by removing noisy, redundant and irrelevant features, i.e. it selects the effective features needed to obtain the desired output. For this research work, a feature-ranking method is used to select the optimal features that contribute most towards the target output. This function rearranges the features from top to bottom according to their contribution. In this work the top six ranked features are selected for training the network. The list of selected features is shown in Table 2.

Rank   Optimal feature
F1     Sum variance
F2     Sum of square variance
F3     Correlation
F4     Sum entropy
F5     Entropy
F6     Difference variance

Table 2: Optimal features selected using the rank method.
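A comparable ranking step can be sketched in Python with scikit-learn, using an ANOVA F-score to rank features and keep the top six; this is an illustrative stand-in for the rank method described above, and the data used here is random rather than the project's GLCM features.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(2)

# Hypothetical data: 100 mammograms, 10 GLCM texture features, binary labels.
X = rng.normal(size=(100, 10))
y = rng.integers(0, 2, size=100)

# Rank features by their ANOVA F-score against the class label
# and keep the six that contribute most to the target output.
selector = SelectKBest(score_func=f_classif, k=6)
X_top6 = selector.fit_transform(X, y)

ranking = np.argsort(selector.scores_)[::-1]
print("features ranked from most to least informative:", ranking)
print("reduced feature matrix shape:", X_top6.shape)  # (100, 6)
```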

4.3 CLASSIFICATION

An artificial neural network (ANN) classifier is used in this work, as it is a commonly used classifier for breast cancer classification. A neural network is composed of simple elements, inspired by biological neurons, that operate in parallel. The neural network is trained to perform a specific function by adjusting the weights between the elements until the desired output is obtained: the network is adjusted based on a comparison between its output and the corresponding target until the network output matches the target. The ANN classifier is based on two steps, training and testing, and the classification accuracy depends on the training.

From the selected database, 70% of the data is used for training, 15% for testing and the remaining 15% for validation. The neural network contains three layers, namely the input layer, the hidden layer and the output layer. The parameters used for the artificial neural network are shown in the training tool's training window.

The Levenberg-Marquardt training function is used for training the network; it shows good results in training and classification. Other training functions, such as resilient back-propagation and conjugate gradient with Powell restarts, were also tried; from all these training functions, Levenberg-Marquardt was selected by comparing classification accuracy, training time to convergence and mean squared error. The optimized network architecture used in this study has 20 hidden neurons; this architecture was selected by observing the mean squared error (MSE) for different numbers of hidden neurons.
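For comparison, the sketch below builds a similar classifier in Python with scikit-learn: one hidden layer of 20 neurons and a 70/15/15 split. scikit-learn does not provide Levenberg-Marquardt training, so the L-BFGS solver is used as a stand-in; the data is synthetic, and the whole example is an illustration rather than a reproduction of this project's setup.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)

# Synthetic stand-in for the selected GLCM features: 322 samples, 6 features.
X = rng.normal(size=(322, 6))
y = rng.integers(0, 2, size=322)   # 0 = normal, 1 = malignant (illustrative labels)

# 70% training, then the remainder split evenly into validation and test (15% each).
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.70, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=0)

# One hidden layer with 20 neurons; L-BFGS as a stand-in for Levenberg-Marquardt.
clf = MLPClassifier(hidden_layer_sizes=(20,), solver="lbfgs", max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

print("validation accuracy:", clf.score(X_val, y_val))
print("test accuracy:", clf.score(X_test, y_test))
```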

4.4 REGRESSION ANALYSIS

Regression analysis is a statistical process for estimating the association among variables. In the regression plot, the outputs from the network are plotted against the target values. A perfect fit is indicated by the dotted line, while the solid line shows the output from the network; the solid line coincides with the dotted line only if the classifier predicts with 100% accuracy, and the difference between the two lines shows that some samples are not correctly predicted by the network. The data points are represented by circles. In the plot obtained for this network the value of R is 0.718; this value also reflects the accuracy of the results, and a value of R equal to 1 would indicate 100% prediction.
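The regression value R between network outputs and targets could be computed as sketched below in Python; the output and target vectors are hypothetical values used only for illustration.

```python
import numpy as np

# Hypothetical network outputs and corresponding targets (0 = normal, 1 = malignant).
outputs = np.array([0.1, 0.9, 0.2, 0.8, 0.4, 0.7])
targets = np.array([0, 1, 0, 1, 1, 1])

# R is the linear correlation coefficient between outputs and targets;
# R = 1 would mean the outputs follow the targets perfectly.
R = np.corrcoef(outputs, targets)[0, 1]
print(round(R, 3))
```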

4.5 NEURAL NETWORK PERFORMANCE EVALUATION

The problem under evaluation is binary classification; the parameters used for evaluation are accuracy, specificity and sensitivity. These parameters are defined as:

Sensitivity = TP / (TP + FN) × 100

Specificity = TN / (TN + FP) × 100

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100

Here TP is the number of true positives, TN the number of true negatives, and FP and FN the numbers of false positives and false negatives respectively. Sensitivity measures the percentage of correctly predicted cancer cases, specificity measures the percentage of correctly predicted benign/normal cases, and accuracy is the percentage of correctly predicted cancer and normal cases overall. The data is rotated five times and the best result of the five folds is reported below. The training set contains 226 mammograms, of which 193 are normal and 33 are malignant; the network classified all 33 malignant cases correctly, while 2 of the 193 normal cases were misclassified. The validation set comprises 48 samples (42 normal, 6 malignant), all of which were predicted correctly. The test set consists of 48 samples (37 normal, 11 malignant); prediction accuracy is 100% for this dataset.

Overall results using the mini-MIAS database

Database division   Specificity   Sensitivity   Accuracy
Training            99.0%         100%          99.1%
Validation          100%          100%          100%
Test                100%          100%          100%
Overall             99.3%         100%          99.4%

Table 3: Overall summary of the results.
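The sketch below computes these three measures in Python from confusion-matrix counts. The counts mirror the training-set figures quoted above (33 malignant cases all detected, 191 of 193 normal cases correct), assuming the two misclassified normal cases were flagged as malignant (false positives).

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy as percentages."""
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    return sensitivity, specificity, accuracy

# Training-set counts implied by the figures above:
# TP = 33 malignant detected, TN = 191 normals correct, FP = 2 normals misclassified, FN = 0.
print(evaluate(tp=33, tn=191, fp=2, fn=0))
# -> (100.0, ~98.96, ~99.12), i.e. the 100% / 99.0% / 99.1% reported for training in Table 3.
```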

CHAPTER FIVE

5.1 SUMMARY

Neural networks offer a different way to analyze data, and to recognize patterns

within that data, than traditional computing methods. However, they are not a solution for

all computing problems. Traditional computing methods work well for problems that can be

well characterized. Balancing checkbooks, keeping ledgers, and keeping tabs on inventory are well defined and do not require the special characteristics of neural networks.

Traditional computers are ideal for many applications. They can process data, track

inventories, network results, and protect equipment. These applications do not need the

special characteristics of neural networks.

Expert systems are an extension of traditional computing and are sometimes called the

fifth generation of computing. (First generation computing used switches and wires. The

second generation occurred because of the development of the transistor. The third generation

involved solid-state technology, the use of integrated circuits, and higher level languages like

COBOL, Fortran, and "C". End user tools, "code generators," are known as the fourth

generation.) The fifth generation involves artificial intelligence.

Typically, an expert system consists of two parts, an inference engine and a

knowledge base. The inference engine is generic. It handles the user interface, external files,

program access, and scheduling. The knowledge base contains the information that is specific

to a particular problem. This knowledge base allows an expert to define the rules which

govern a process. This expert does not have to understand traditional programming. That

person simply has to understand both what he wants a computer to do and how the

mechanism of the expert system shell works. It is this shell, part of the inference engine that

actually tells the computer how to implement the expert's desires. This implementation occurs

by the expert system generating the computer's programming itself; it does that through

"programming" of its own. This programming is needed to establish the rules for a particular

application. This method of establishing rules is also complex and does require a detail-oriented person.

Efforts to make expert systems general have run into a number of problems. As the

complexity of the system increases, the system simply demands too much computing

resources and becomes too slow. Expert systems have been found to be feasible only when

narrowly confined.

Artificial neural networks offer a completely different approach to problem solving

and they are sometimes called the sixth generation of computing. They try to provide a tool

that both programs itself and learns on its own. Neural networks are structured to provide the

capability to solve problems without the benefits of an expert and without the need of

programming. They can seek patterns in data that no one knows are there.

Expert systems have enjoyed significant successes. However, artificial intelligence

has encountered problems in areas such as vision, continuous speech recognition and

synthesis, and machine learning. Artificial intelligence also is hostage to the speed of the

processor that it runs on. Ultimately, it is restricted to the theoretical limit of a single

processor. Artificial intelligence is also burdened by the fact that experts don't always speak

in rules.

Yet, despite the advantages of neural networks over both expert systems and more

traditional computing in these specific areas, neural nets are not complete solutions. They do not offer an ironclad capability such as that of a debugged accounting system. They learn, and

as such, they do continue to make "mistakes." Furthermore, even when a network has been

developed, there is no way to ensure that the network is the optimal network.

Neural systems do exact their own demands. They do require their implementor to

meet a number of conditions. These conditions include:

• a data set which includes the information that can characterize the problem;

• an adequately sized data set to both train and test the network;

• an understanding of the basic nature of the problem to be solved, so that basic first-cut decisions on creating the network can be made (these decisions include the activation and transfer functions, and the learning methods);

• an understanding of the development tools;

• adequate processing power (some applications demand real-time processing that exceeds what is available in standard, sequential processing hardware; the development of hardware is the key to the future of neural networks).

Once these conditions are met, neural networks offer the opportunity of solving problems in an arena where traditional processors lack both the processing power and a step-by-step methodology. A number of very complicated problems cannot be solved in traditional computing environments. For example, speech is something that all people can easily parse and understand: a person can understand a southern drawl, a Bronx accent, and the slurred words of a baby. Without the massively parallel processing power of a neural network, this task is virtually impossible for a computer. Image recognition is another task that a human can do easily but which stymies even the biggest of computers; a person can recognize a plane as it turns, flies overhead, and disappears into a dot, whereas a traditional computer can only try to compare the changing images to a number of very different stored patterns.

This new way of computing requires skills beyond traditional computing, but it is a natural evolution. Initially, computing was only hardware, and engineers made it work. Then there were software specialists: programmers, systems engineers, database specialists, and designers. Now there are also neural architects. This new professional needs to be more skilled than his predecessors; for instance, he will need to know statistics in order to choose and evaluate training and testing situations (a brief evaluation sketch follows). This skill of making neural networks work is one that will challenge the purely logical thinking of current software engineers.
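One place where that statistical skill shows up is in evaluation: rather than trusting a single train/test figure, a network can be assessed with cross-validation. The following is a minimal sketch, again assuming scikit-learn and its bundled breast cancer data set rather than this project's data.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    # Scaling and the classifier are wrapped in one pipeline so that each fold
    # is scaled only on its own training portion.
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,),
                                        max_iter=2000, random_state=0))

    # Stratified 5-fold cross-validation: a mean accuracy and its spread,
    # not a single, possibly lucky, split.
    scores = cross_val_score(model, X, y, cv=StratifiedKFold(n_splits=5))
    print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")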

5.2 CONCLUSION

Neural networks offer a unique way to solve some problems while making their own demands. The biggest demand is that the process is not simply logic; it involves an empirical skill, an intuitive feel for how a network might be created. Current state-of-the-art artificial neural networks for general image analysis are able to detect cancer in mammograms with accuracy similar to that of radiologists, even in a screening-like cohort with low breast cancer prevalence.

5.3 RECOMMENDATION

To reduce the death rate due to breast cancer, it is essential that the cancer be identified at an early stage. Early detection of breast cancer can be enhanced with a well-trained unsupervised artificial neural network offering precise and accurate prediction and high performance metrics; this should be explored by future researchers. Future research work should focus on prediction of malignant tumors and computer-aided treatment prescription. In this research project, 10 texture features computed from the GLCM along the 0° direction were considered, and the feature space was then reduced to 6 features (a brief extraction sketch follows). In future work, more features can be considered, and other datasets can be used to increase the robustness of the system.
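As a pointer for such future work, the following is a minimal sketch of GLCM texture extraction along the 0° direction, assuming scikit-image (graycomatrix/graycoprops, available in skimage 0.19 and later) and a pre-segmented 8-bit mammogram region of interest; it computes a handful of standard GLCM properties, not necessarily the exact ten or six features used in this project.

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def glcm_features_0deg(roi, levels=256):
        """GLCM at distance 1 and angle 0 degrees, then standard texture properties."""
        glcm = graycomatrix(roi, distances=[1], angles=[0.0],
                            levels=levels, symmetric=True, normed=True)
        props = ("contrast", "dissimilarity", "homogeneity",
                 "energy", "correlation", "ASM")
        return {p: float(graycoprops(glcm, p)[0, 0]) for p in props}

    if __name__ == "__main__":
        # Hypothetical 8-bit region of interest; a real mammogram ROI goes here.
        rng = np.random.default_rng(0)
        roi = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
        print(glcm_features_0deg(roi))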


