Development of Neural Network For
ABSTRACT
This project focused on the development of a self-learning expert system (ES) for breast
cancer diagnosis using an artificial neural network (NN). The goal of the project is to develop an ES
that can learn autonomously from external medical data using neural networks.
Mammogram datasets were acquired from a web-based machine learning dataset website. The
datasets were cleaned and preprocessed before being used to train the NN, which was then tested with
inference datasets. The training set contains 226 mammograms, of which 193 are normal and 33 are
malignant; the network predicts all benign as benign, and of the 193 normal cases 2 samples are
misclassified. The validation set comprises 48 samples, of which 42 are normal and 6 are malignant;
the network predicts all normal and malignant cases correctly. The test set consists of 48 samples, 37
normal and 11 malignant; prediction accuracy is 100% for this dataset. This product is useful for the
diagnosis of cancerous and non-malignant breast tumors. Future research works can be
TABLE OF CONTENTS
Title Page
Declaration
Certification
Dedication
Acknowledgement
Abstract
Table of contents
Chapter One
Background of Study
Statement of the Problem
Purpose of the Study
Research Question
Scope of Study
Chapter Two
Introduction
Symptoms of Breast Cancer
Breast Cancer Diagnosis
Expert Systems
Neural Networks
Chapter Three
Datasets
Data cleaning and pre-processing
Training neural network
Trained neural network
Output
Chapter Four
Features Extraction Using GLCM
Features Selection
Classification
Regression and Analysis
Neural Network Performance Evaluation
Chapter Five
Summary
Conclusion
Recommendations
References
CHAPTER ONE
The integration of information technology into many areas of human life has
individual level such as: personal finance, healthy lifestyle, education, communication,
etc. and at global scale which includes: climate change, insecurity, disaster management
conventional procedural code. They are among the first truly successful forms of artificial
intelligence (AI) software. However, some experts point out that ES were not part of
true AI since they lack the ability to learn autonomously from external data.
knowledge engineers. The computer must have all the required knowledge needed to solve
a problem and the required knowledge must be represented as symbol patterns in the
memory of the computer. The computer must also use the knowledge efficiently by
selecting from a handful of reasoning methods.
An ES consists of three main parts: the knowledge base, the reasoning methods
success of any ES largely rests upon the collection of highly accurate and precise
knowledge. Data is a collection of facts; information is organized as data and facts about
the task domain. Data, information and past experience combined together are termed as
shared, typically found in textbooks or journals and commonly agreed upon by those
knowledge of good practice, good judgment and plausible reasoning in the field. It is the
The knowledge base is formed by reading from various scholars, experts and the
quick learning and strong analytical skills. He acquires information from subject experts
by recording, interviewing and observing them at work. He then categorizes and organizes
inference engine.
knowledge base to arrive at a particular solution. For a rule-based Expert System, it
applies rules repeatedly to the facts, adds new knowledge into the knowledge base if
required, and resolves rule conflicts when multiple rules are applicable to a particular case.
used by an ES to answer the question, “WHAT CAN HAPPEN NEXT”. The inference
engine follows the chain of conditions and derivations and finally deduces the outcome. It
considers all the facts and rules, and sorts them before concluding to a solution. This
strategy is followed for working on conclusion, result or effect. For example, predictions
The inference engine also uses backward chaining to answer the
question, “WHY THIS HAPPENED”. The inference engine tries to find out which
conditions could have happened in the past for this result. This strategy is followed for
finding out cause or reason. For example, diagnosis of blood or breast cancer in humans.
Figure 3: Backward Chaining
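The two inference strategies described above can be illustrated with a minimal Python sketch. The rule base and fact names below are hypothetical and chosen only for illustration; they are not clinical rules and are not taken from this project's knowledge base. The sketch also assumes that the rules contain no cycles.

```python
# Toy rule base: each rule is (set of conditions, conclusion).
# The rules and fact names are hypothetical, for illustration only.
RULES = [
    ({"lump_in_breast", "skin_dimpling"}, "suspicious_finding"),
    ({"suspicious_finding", "abnormal_mammogram"}, "refer_for_biopsy"),
]


def forward_chain(facts, rules):
    """Data-driven: keep applying rules until no new fact can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules):
    """Goal-driven: a goal holds if it is a known fact, or if every
    condition of some rule that concludes it can itself be proven."""
    if goal in facts:
        return True
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(c, facts, rules) for c in conditions
        ):
            return True
    return False


observed = {"lump_in_breast", "skin_dimpling", "abnormal_mammogram"}
derived = forward_chain(observed, RULES)
print(sorted(derived - observed))  # facts added by forward chaining
print(backward_chain("refer_for_biopsy", observed, RULES))
```

Forward chaining starts from the observed facts and works towards whatever can be concluded, while backward chaining starts from a goal and works back to the facts that would support it.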
User interface provides interactions between user of the ES and the ES itself.
appear in the form of: natural language displayed on screen, verbal narrations in natural
language and listings of rule numbers displayed on screen. The user interface makes it easy
spread to other body parts. Breast cancer is a cancer that develops from breast tissues.
Common breast cancer signs and symptoms include: a lump or swelling in the breast, upper
chest or armpit; changes in skin texture; changes in breast color; rash, crusting or changes to
Risk factors for developing breast cancer include: being female; obesity; lack of
physical exercise; drinking alcohol; hormone replacement therapy during menopause, etc.
Breast cancer commonly develops in cells from the lining of milk ducts and the lobules that
supply the ducts with milk. Cancers developing from the ducts are known as “ductal
carcinomas”, while those developing from lobules are known as “lobular carcinomas”.
The earlier breast cancer is diagnosed, the better the chance of successful treatment.
Breast cancer is diagnosed by biopsy of the affected area of the breast confirmed by X-ray
mammography. The likelihood of a lump being cancerous can also be detected by physical
Nowadays, with the advent of technology, medical fields are becoming more
effective. There are many applications of ES that have been used in the medical field. ES
have been implemented for diagnosing diseases such as diabetes, skin disease, etc.
This breast cancer diagnosis ES consists of both structured questions and structured
Expert Systems are among the first truly successful forms of artificial
intelligence software. However, some experts point out that ES were not part of true
artificial intelligence since they lack the ability to learn autonomously from external data
The aim of this project is to develop an Expert System that can learn
autonomously from external medical data using “Neural networks” for breast cancer
diagnosis.
continuously
Aid healthcare providers to what to do in the case of
time
1.5 METHODOLOGY
In implementing this project, a detailed and extensive review of relevant literature was
carried out. An elaborate study of the principles and techniques governing the development of
an Expert System for solving complex diagnosis problems was carried out by learning its
properties and concepts. Accurate and precise training datasets were acquired from web-based
machine learning dataset websites and cancer experts or oncologists. The collected
datasets were used to train the ES learning algorithms and tested with inference datasets.
reasoning system (ES) for early breast cancer diagnosis only. Treatment and prevention of
breast cancer are not considered, and other types of cancer are also not considered.
CHAPTER TWO
LITERATURE REVIEW
2.1 INTRODUCTION
The integration of information technology into many areas of human life has led to
level such as: personal finance, healthy lifestyle, education, communication, etc. and at
global scale which includes: climate change, insecurity, disaster management and disease
control.
The advent of information technology aims to free humans from mental drudgery as
Cancer is a disease of the cells, which are the body’s basic building blocks. The
body constantly makes new cells to help us grow, replace worn-out tissue and heal injuries.
Normally, cells multiply and die in an orderly way. Sometimes cells don’t grow, divide and
die in the usual way. This may cause blood or lymph fluid in the body to become abnormal,
Benign tumour – Cells are confined to one area and are not able to spread to other parts of
Malignant tumour – This is made up of cancerous cells, which have the ability to spread by
The cancer that first develops in a tissue or organ is called the primary cancer. A
malignant tumour is usually named after the organ or type of cell affected. A malignant
tumour that has not spread to other parts of the body is called localised cancer. A tumour
may invade deeper into surrounding tissue and can grow its own blood vessels in a process
called angiogenesis. If cancerous cells grow and form another tumour at a new site, it is
called a secondary cancer or metastasis. A metastasis keeps the name of the original cancer.
For example, breast cancer that has spread to the bones is called metastatic breast cancer,
even though the person may be experiencing symptoms caused by problems in the bones.
Women and men both have breast tissue. In women, breasts are made up of milk
primary male sex hormone. Both female and male breasts also contain supportive fibrous
and fatty tissue. Some breast tissue extends into the armpit (axilla). This is known as the
The lymphatic system is a key part of the immune system. It protects the body
against disease and infection. It is made up of a network of thin tubes called lymph vessels
that are found throughout the body. Lymph vessels connect to groups of small, bean-shaped
structures called lymph nodes or glands. Lymph nodes are found throughout the body,
including in the armpits, breastbone (sternum), neck, abdomen and groin. The lymph nodes
in the armpit are often the first place cancer cells spread to outside the breast. During
surgery for breast cancer (or, sometimes, in a separate operation), some or all of the lymph
Breast cancer is the abnormal growth of the cells lining the breast lobules or ducts.
These cells grow uncontrollably and have the potential to spread to other parts of the body.
Both women and men can develop breast cancer, although breast cancer is rare in men.
Ductal carcinoma in situ (DCIS) – Abnormal cells are contained within the ducts of
the breast. Having DCIS can increase the risk of developing invasive breast cancer.
Invasive breast cancer (early breast cancer) – The cancer has spread from the breast ducts or lobules
into surrounding breast tissue. It may also have spread to lymph nodes in the armpit. Most
breast cancers are found when they are invasive. The most common types of early breast
cancer are invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC). IDC
accounts for about 80% of breast cancers, and ILC makes up about 10% of breast cancer
cases.
Other types of invasive breast cancer include locally advanced breast cancer,
secondary breast cancer, inflammatory breast cancer and Paget’s disease of the nipple.
Some women have abnormal cells that are contained within the lobules of the breast.
This is called lobular carcinoma in situ (LCIS). This is not cancer. LCIS is very rare in men.
While LCIS increases the risk of developing breast cancer, most women with this condition
will not develop breast cancer. LCIS is usually detected during tests for other breast
disorders. If you are diagnosed with LCIS, you will be monitored with regular screening
Some people have no symptoms and the cancer is found during a screening
• changes to the nipple, such as a change in shape, crusting, sores or ulcers, redness, a clear
or bloody discharge, or a nipple that turns in (inverted) when it used to stick out
• changes in the skin of the breast, such as dimpling or indentation, a rash, a scaly
• persistent, unusual pain that is not related to your normal monthly menstrual cycle,
remains after your period and occurs in one breast only. Most breast changes aren’t caused
by cancer. However, if you have symptoms, see your doctor without delay.
In most people, the exact cause of breast cancer is unknown, but some factors can
increase the risk. Most people diagnosed with breast cancer have no known risk factors,
aside from getting older, which increases the risk in women and men. Having risk factors
does not necessarily mean that you will develop breast cancer. In women, risk factors
include:
diagnosed with breast cancer and/or a particular type of ovarian cancer. However,
most women diagnosed with breast cancer do not have a family history
having a family member who has had genetic testing and has been found to carry a
carcinoma in situ (LCIS) or atypical ductal hyperplasia (abnormal cells in the lining
several first-degree relatives (male or female) who have had breast cancer
a family member who has had genetic testing and has been found to carry a mutation
a rare genetic syndrome called Klinefelter syndrome. Men with this syndrome have
three sex chromosomes (XXY) instead of the usual two (XY). Some lifestyle
factors, such as being overweight, smoking, drinking alcohol and a lack of physical
activity, also slightly increase the risk of breast cancer in both women and men.
Most people diagnosed with breast cancer do not have a family history of the
disease. However, a small number of people have inherited a gene fault that increases their
breast cancer risk. Everyone inherits a set of genes from each parent, so they have two
copies of each gene. Sometimes there is a fault in one copy of a gene. This fault is called a
mutation. The two most common gene mutations that are linked to breast cancer are on the
BRCA1 and BRCA2 genes. Women in families with an inherited BRCA1 or BRCA2
change are at an increased risk of breast and ovarian cancers. Men in these families may be
at an increased risk of breast and prostate cancers. People with a strong family history of
breast cancer can attend a family cancer clinic for tests to see if they have inherited a gene
mutation.
If a patient has symptoms of breast cancer, his/her doctor will take a full medical
history, which will include the patient’s family history. The doctor will also perform a physical
examination, checking the patient’s breasts and the lymph nodes under the patient’s arms. The
doctor may refer the patient to a specialist for further tests to find out if his/her breast
A mammogram is a low-dose x-ray of the breast tissue. This x-ray can find changes
that are too small to be felt during a physical examination. Both breasts will be checked
during a mammogram. During the mammogram, the breast is pressed between two x-ray
plates, which spread the breast tissue out so clear pictures can be taken. This can be
uncomfortable, but it takes only about 20 seconds. If the lump that patient l does not show
An ultrasound is a painless scan that uses sound waves to create a picture of the patient’s
breast. A gel is spread on the breast, and a small device called a transducer is moved
over the area. This sends out sound waves that echo when they meet something dense, like
an organ or a tumour. A computer creates a picture from these echoes. The scan is painless
A magnetic resonance imaging (MRI) scan uses a large magnet and radio waves to
create pictures of the breast tissue on a computer. Breast MRI is commonly used to screen
people who are at high risk of breast cancer, but it can also be used in people with very
dense breast tissue. Before the scan, the patient will have an injection of a contrast dye to make
any cancerous breast tissue easier to see. The patient will lie face down on a table with
cushioned openings for the breasts, with his/her arms above the head. The table
slides into the machine, which is large and shaped like a cylinder. The scan is painless and
During a biopsy, a small sample of cells or tissue is removed from the patient’s breast. A
pathologist examines the sample and checks it for cancer cells under a microscope. The
results of the biopsy and further tests will be outlined in a pathology report, which will
include the size and location of the tumour, the grade of the cancer, whether there are cancer
cells near the edge (margin) of the removed breast tissue, and whether there are cancer cells
in the lymph nodes. The report will help the patient’s doctor decide what treatment is best
for him/her. There are a few ways of taking a biopsy, and the patient may need more than one.
breast clinic.
Fine needle aspiration (FNA) – A thin needle is used to take cells from the breast
lump or abnormal area. Sometimes an ultrasound is used to help guide the needle. The test
can feel similar to having blood taken and may be a bit uncomfortable. A local anesthetic
may be used to numb the area where the needle will be inserted.
Core biopsy – A wider needle is used to remove a piece of tissue (a core) from the
lump or abnormal area. It is usually done under local anesthetic, so the patient’s breast is numb,
although the patient may feel some pain or discomfort when the anesthetic is given. During a
core biopsy, a mammogram, ultrasound or MRI is used to guide the needle. The patient may
tissue samples are removed through one small cut (incision) in the skin using a needle and a
MRI may be used to guide the needle into place. The patient may feel some discomfort during the
procedure.
Surgical biopsy – If the abnormal area is too small to be biopsied using other
methods or the biopsy result isn’t clear, a surgical biopsy is done. Before the biopsy, a guide
wire may be put into the breast to help the surgeon find the abnormal tissue. The patient will be
given a local anesthetic, and the doctor may use a mammogram, ultrasound or MRI to guide
the wire into place. The biopsy is then done under a general anesthetic. The lump and a
small area of nearby breast tissue are removed, along with the wire. This is usually done as
If the tests described above show that a patient has breast cancer, one or more tests
may be done to see if the cancer has spread to other parts of the body. Blood samples may
be taken to check the patient’s general health and to look at bone and liver function for
signs of cancer. The doctor may take an x-ray of the patient’s chest to check the lungs for signs of
cancer.
A bone scan may be done to see if the breast cancer has spread to the patient’s bones.
A small amount of radioactive material is injected into a vein, usually in the arm. This
material is attracted to areas of bone where there is cancer. After a few hours, the bones are
viewed with a scanning machine, which sends pictures to a computer. This scan is painless
and the radioactive material is not harmful. Plenty of fluids should be drunk on the day of
detailed, cross-sectional pictures of the inside of the body. The patient may have to fast (not eat
or drink) for a period of time beforehand to make the scan pictures clearer and easier to
read. Before the scan, the patient will either drink a liquid dye or be given an injection of dye
into a vein in the arm. This dye is known as the contrast and it makes the pictures
clearer. If the patient has the injection, he/she may feel hot all over for a few minutes. The patient
will lie flat on a table while the CT scanner, which is large and round like a doughnut, takes
done for breast cancer. It is currently not funded by Medicare as a routine test for breast
cancer. A PET scan uses low-dose radioactive glucose to measure cell activity in different
parts of the body. If the patient does have a PET scan, a small amount of the glucose will be
injected into a vein, usually in the arm. The patient will need to wait for about an hour for
the fluid to move around his/her body, and then the patient will lie on a table that moves
through a scanning machine. The scan will show ‘hot spots’ where the fluid has
accumulated – this happens where there are active cells, like cancer cells.
The tests described above show whether the cancer has spread to other parts of the
body. Working out how far the cancer has spread is called staging. Stages are numbered
from I to IV. The grade describes how active the cancer cells are and how fast the cancer is
likely to be growing.
Stage I – The tumour is less than 2 cm in diameter and has not spread to the lymph
nodes in the armpit.
Stage IIA – The tumour is less than 2 cm in diameter and has spread to
the lymph nodes in the armpit. The tumour is 2–5 cm in diameter and has not spread to the
Stage IIB – The tumour is 2–5 cm in diameter and has spread to the lymph nodes in
the armpit.
Grade 1 (low grade) – Cancer cells look a little different from normal cells. They are
Grade 2 (intermediate grade) – Cancer cells do not look like normal cells. They are
growing faster than grade 1 breast cancer, but not as fast as grade 3.
Grade 3 (high grade) – Cancer cells look very different from normal cells. They are
fast growing.
Prognosis means the expected outcome of a disease. A patient may wish to discuss
his/her prognosis with his/her doctor, but it is not possible for any doctor to predict the
exact course of the disease. Survival rates for people with breast cancer have increased
significantly over time due to better diagnostic tests and scans, earlier detection, and
improvements in treatment methods. Most people with early breast cancer can be treated
successfully.
ES are computer applications or programs developed to solve complex problems at
the level of extraordinary human intelligence and expertise. ES reasoned through bodies of
procedural code. They are among the first truly successful forms of artificial intelligence
(AI) software. However, some experts point out that ES were not part of true AI since
they lack the ability to learn autonomously from external data. An ES consists of three main
parts: the knowledge base, the reasoning methods or inference engine, and the user interface.
ES majorly rests upon the collection of highly accurate and precise knowledge. Data is a
collection of facts; information is organized as data and facts about the task domain. Data,
information and past experience combined together are termed as knowledge. Knowledge
base contains factual and heuristic knowledge; factual knowledge is that knowledge of the
task domain that is widely shared, typically found in textbooks or journals and commonly
agreed upon by those knowledgeable in the particular field. Heuristic knowledge is less
discussed and largely individualistic. It is the knowledge of good practice, good judgment
and plausible reasoning in the field. It is the knowledge that underlies the “art of good
guessing.”
knowledge in the knowledge base in the form of IF-THEN-ELSE rules. Knowledge base
is formed by reading from various scholars, experts and the knowledge engineers.
A knowledge engineer is a person with the qualities of empathy, quick learning and strong
and observing them at work. He then categorizes and organizes the information in a
The inference engine acquires and manipulates the knowledge from the knowledge base
to arrive at a particular solution. For a rule-based Expert System, it applies rules
repeatedly to the facts, adds new knowledge into the knowledge base if required and
resolves rule conflicts when multiple rules are applicable to a particular case. To
recommend a solution, the inference engine uses the following strategies: forward
NEXT”. The inference engine follows the chain of conditions and derivations and finally
deduces the outcome. It considers all the facts and rules, and sorts them before concluding
to a solution. This strategy is followed for working on conclusion, result or effect. For
Backward chaining is used to answer the question, “WHY THIS HAPPENED”. The
inference engine tries to find out which conditions could have happened in the past for this
result. This strategy is followed for finding out cause or reason. For example, diagnosis of
The user interface provides interaction between the user of the ES and the ES itself. It
explains how the ES arrived at a particular recommendation. The explanation may appear
in the form of: natural language displayed on screen, verbal narrations in natural language
and listings of rule numbers displayed on screen. The user interface makes it easy to trace the
credibility of ES deductions.
Artificial neural networks (NNs) are biologically inspired and mimic the human
brain. They are made up of neurons. These neurons are connected to each other by connection
links. Each link has a weight, which is multiplied with the signal transmitted through the network. The
output of each neuron is determined by an activation function such as the sigmoid or step function.
Usually nonlinear activation functions are used. NNs are trained by experience; when
an unknown input is applied to the network, it can generalize from past experience and produce
a new result (Bishop, 1996; Hanbay, Turkoglu, & Demir, 2007; Haykin, 1994). NN models
have been used for pattern matching, nonlinear system modeling, communications, the electrical
and electronics industry, energy production, the chemical industry, medical applications, data
mining and control because of their parallel processing capabilities. When designing an NN
model, a number of considerations must be taken into account. First, a suitable
structure for the NN model must be chosen; after this, the activation function, the number of
layers and the number of units in each layer must be chosen. Generally, the desired model consists
The perceptron is the simplest and oldest model of a neuron. It takes some
inputs, sums them up, applies an activation function and passes the result to the output layer.
Figure 8: Perceptron
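The perceptron described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the weights and bias are hypothetical values, chosen so that the unit behaves like a logical AND gate, and are not taken from this project.

```python
import math


def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Sigmoid activation: squashes the weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(inputs, weights, bias, activation=step):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical weights that make the unit behave like a logical AND gate.
AND_WEIGHTS, AND_BIAS = [1.0, 1.0], -1.5
for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron([a, b], AND_WEIGHTS, AND_BIAS))
```

Swapping `step` for `sigmoid` gives a graded output between 0 and 1 instead of a hard decision.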
Feed forward neural networks are also quite old; the approach originates from the 1950s.
They generally follow these rules: all nodes are fully connected, activation flows from the
input layer to the output without back loops, and there is one layer between input and output
(the hidden layer). In most cases this type of network is trained using the back-propagation method.
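As a rough illustration of these rules, the sketch below builds a fully connected network with one hidden layer and trains it with back-propagation using only the Python standard library. The XOR toy data, the layer size, the learning rate and the number of epochs are all assumptions made for the example; they are not the settings used in this project.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Activation flows input -> hidden -> output with no back loops.
    The last weight in every row is the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        # Accumulate gradients over the whole training set (batch descent).
        g_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        for x, target in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Output error term, then propagate it back to each hidden unit.
            d_out = (out - target) * out * (1.0 - out)
            for j, h in enumerate(hidden + [1.0]):
                g_out[j] += d_out * h
            for j, h in enumerate(hidden):
                d_hid = d_out * w_out[j] * h * (1.0 - h)
                for i, v in enumerate(x + [1.0]):
                    g_hidden[j][i] += d_hid * v
        w_out = [w - lr * g / len(data) for w, g in zip(w_out, g_out)]
        w_hidden = [
            [w - lr * g / len(data) for w, g in zip(row, g_row)]
            for row, g_row in zip(w_hidden, g_hidden)
        ]
    return w_hidden, w_out


# Hypothetical toy data (XOR), used only to exercise the code.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out = train(XOR)
print(mean_squared_error(XOR, w_hidden, w_out))
```

The error term computed at the output is scaled by each connection weight and by the slope of the sigmoid before it reaches the hidden layer, which is the "propagation back" that gives the method its name.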
RBF neural networks are actually FF (feed forward) NNs that use a radial basis
function as the activation function instead of the logistic function. What makes the difference?
The logistic function maps an arbitrary value to the 0…1 range, answering a “yes or no” question.
It is good for classification and decision-making systems, but works poorly for continuous values.
In contrast, radial basis functions answer the question “how far are we from the target?” This is
well suited to function approximation and machine control (as a replacement for PID controllers,
for example).
In short, these are just FF networks with a different activation function and application.
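The difference between the two activation functions can be seen in a short Python sketch. The Gaussian form of the radial basis function, and the names `centre` and `width`, are assumptions made here for illustration; the text above does not specify which radial basis function is meant.

```python
import math


def logistic(z):
    """Maps any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gaussian_rbf(x, centre, width=1.0):
    """Gaussian radial basis function: 1 at the centre, falling towards 0
    as the distance between x and the centre grows."""
    distance_sq = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-distance_sq / (2.0 * width ** 2))


print(logistic(0.0))                         # midpoint of the 0..1 range
print(gaussian_rbf([1.0, 2.0], [1.0, 2.0]))  # input exactly on the centre
print(gaussian_rbf([4.0, 6.0], [1.0, 2.0]))  # input far from the centre
```

The logistic output depends on which side of a boundary the input falls, whereas the radial basis output depends only on the distance between the input and a chosen centre.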
Figure 10: Feed Forward Neural Networks
DFF neural networks opened the Pandora’s box of deep learning in the early 1990s. These are just
FF NNs, but with more than one hidden layer. So, what makes them so different?
When training a traditional FF, we pass only a small amount of error to the previous layer.
Because of that, stacking more layers led to exponential growth of training times, making DFFs
quite impractical. Only in the early 2000s were a number of approaches developed that allowed
DFFs to be trained effectively; now they form the core of modern machine learning systems, covering the
Recurrent neural networks introduce a different type of cell: the recurrent cell. The
first network of this type was the so-called Jordan network, in which each hidden cell received its
own output with a fixed delay of one or more iterations. Apart from that, it was like a common
FNN.
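The idea of a cell that receives its own delayed output can be sketched as follows. This is a minimal illustration of the description above, with a delay of one iteration; the tanh activation and the weight names `w_in` and `w_rec` are assumptions, not details from this project.

```python
import math


def run_recurrent_cell(inputs, w_in, w_rec, bias=0.0):
    """Feed a sequence through one recurrent cell. At each step the cell
    sees the current input plus its own output from the previous step."""
    previous = 0.0  # no history before the first step
    outputs = []
    for x in inputs:
        previous = math.tanh(w_in * x + w_rec * previous + bias)
        outputs.append(previous)
    return outputs


# The same input value produces different outputs at different steps,
# because the cell also depends on what it produced one step earlier.
print(run_recurrent_cell([1.0, 1.0, 1.0], w_in=1.0, w_rec=0.5))
```

Setting `w_rec` to zero removes the feedback, and the cell then behaves like an ordinary feed forward unit.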
Figure 12: Recurrent Neural Networks
CHAPTER THREE
SYSTEM DESIGN
3.1 DATASETS
Raw datasets: these are breast cancer mammograms (image data) that need to be
cleaned and preprocessed. Training datasets: these are breast cancer mammograms (image
data) used to train the neural network. Input or inference datasets: these are breast cancer
mammograms (image data) used to provide the required predictions once the neural network
Data needs to be cleaned and processed so that it is in a usable format for modeling.
Exploration is required to identify important elements within the data and to identify any data
quality issues. The importance of data pre-processing is underlined by the fact that
a neural network is only as good as the input data used to train it. If important data inputs
are missing, the neural network may not be able to achieve the desired level of accuracy. On the
other hand, if data is not processed beforehand, it could affect the accuracy as well as the
performance of the network down the line. Some data pre-processing techniques include:
Mean subtraction (zero centering): the process of subtracting the mean from each data
point to make the data zero-centered. Consider a case where the inputs to a neuron (unit) are all positive
or all negative. In that case the gradient calculated during back-propagation will be either
positive or negative (the same as the sign of the inputs). Hence parameter updates are restricted
to specific directions, which in turn makes convergence inefficient.
Data normalization: normalization refers to scaling the data so that it has the same scale across all dimensions.
A common way to do that is to divide the data in each dimension by its standard deviation.
However, it only makes sense if there is reason to believe that different input features
have different scales but equal importance to the learning algorithm.
Regularization: one of the most common problems in training deep neural networks is over-fitting.
Over-fitting is in play when the network performs exceptionally well on
the training data but poorly on test data. This happens when the learning algorithm tries to fit
every data point in the input, even if some represent randomly sampled noise.
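The first two techniques above, mean subtraction and normalization, can be sketched with the Python standard library. The feature values in the usage lines are hypothetical and serve only to show two inputs on very different scales; the use of the population standard deviation is also an assumption.

```python
import statistics


def zero_centre(column):
    """Mean subtraction: shift the values so that their mean is zero."""
    mean = statistics.fmean(column)
    return [v - mean for v in column]


def normalise(column):
    """Zero-centre the values, then divide by their standard deviation."""
    centred = zero_centre(column)
    std = statistics.pstdev(column)
    if std == 0:
        return centred  # constant feature: nothing to scale
    return [v / std for v in centred]


# Hypothetical features on very different scales.
feature_a = [2.0, 4.0, 6.0]
feature_b = [1000.0, 3000.0, 5000.0]
print(zero_centre(feature_a))
print(normalise(feature_a))
print(normalise(feature_b))
```

After normalization both features have zero mean and unit standard deviation, so neither dominates the weight updates simply because of its units.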
Once the data has been cleaned and pre-processed for breast cancer diagnosis, the
network is ready to be trained. To start this process the initial weights are chosen randomly.
training involves a mechanism of providing the network with the desired output either by
manually "grading" the network's performance or by providing the desired outputs with the
inputs. Unsupervised training is where the network has to make sense of the inputs without
outside help. The vast bulk of networks utilize supervised training. Unsupervised training is
used to perform some initial characterization on inputs. In supervised training, both the inputs
and the outputs are provided. The network then processes the inputs and compares its
resulting outputs against the desired outputs. Errors are then propagated back through the
system, causing the system to adjust the weights which control the network. This process
The set of data which enables the training is called the "training set." During the
training of a network the same set of data is processed many times as the connection weights
are ever refined. The current commercial network development packages provide tools to
monitor how well an artificial neural network is converging on the ability to predict the right
answer. These tools allow the training process to go on for days, stopping only when the
system reaches some statistically desired point, or accuracy. If a network simply can't solve
the problem, the designer then has to review the input and outputs, the number of layers, the
number of elements per layer, the connections between the layers, the summation, transfer,
and training functions, and even the initial weights themselves. Those changes required to
create a successful network constitute a process wherein the "art" of neural networking
occurs. Another part of the designer's creativity governs the rules of training. There are many
laws (algorithms) used to implement the adaptive feedback required to adjust the weights
during training. The most common technique is backward-error propagation, more commonly
known as back-propagation. These various learning techniques are explored in greater depth
later in this report. When finally the system has been correctly trained, and no further
learning is needed, the weights can, if desired, be "frozen." In some systems this finalized
Other systems don't lock themselves in but continue to learn while in production use.
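The supervised training cycle described above (random initial weights, repeated passes over the training set, stopping at a desired accuracy, then freezing the weights) can be sketched for a single sigmoid unit. The AND toy data, the target error, the learning rate and the epoch limit are assumptions made for this example only.

```python
import math
import random


def predict(weights, x):
    """Single sigmoid unit; the last weight is the bias."""
    z = sum(w * v for w, v in zip(weights, x + [1.0]))
    return 1.0 / (1.0 + math.exp(-z))


def train_until(data, target_error=0.01, max_epochs=10000, lr=0.5, seed=0):
    rng = random.Random(seed)
    # Initial weights are chosen randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(len(data[0][0]) + 1)]
    epoch, error = 0, float("inf")
    for epoch in range(1, max_epochs + 1):
        # The same training set is processed again on every epoch.
        for x, desired in data:
            out = predict(weights, x)
            # Compare the output with the desired output and adjust the weights.
            delta = (desired - out) * out * (1.0 - out)
            weights = [w + lr * delta * v for w, v in zip(weights, x + [1.0])]
        error = sum((d - predict(weights, x)) ** 2 for x, d in data) / len(data)
        if error <= target_error:
            break  # desired accuracy reached
    return tuple(weights), epoch, error  # a tuple, so the weights are "frozen"


# Hypothetical toy data (logical AND), used only to exercise the code.
AND = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
frozen, epochs_used, final_error = train_until(AND)
print(epochs_used, final_error)
print([round(predict(frozen, x)) for x, _ in AND])
```

Returning the weights as an immutable tuple mirrors the idea of freezing a trained network; a system that keeps learning in production would continue to call the update step instead.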
The other type of training is called unsupervised training. In unsupervised training, the
network is provided with inputs but not with desired outputs. The system itself must then
decide what features it will use to group the input data. This is often referred to as
self-organization or adaptation. At the present time, unsupervised learning is not well understood.
This adaptation to the environment is the promise which would enable science-fiction types of
robots to continually learn on their own as they encounter new situations and new
environments. Life is filled with situations where exact training sets do not exist. Some of
these situations involve military action where new combat techniques and new weapons
might be encountered. Because of this unexpected aspect to life and the human desire to be
prepared, there continues to be research into, and hope for, this field. Yet, at the present time,
the vast bulk of neural network work is in systems with supervised learning. Supervised
organizing network, sometimes called an auto-associator that learns without the benefit of
knowing the right answer. It is an unusual looking network in that it contains one single layer
with many connections. The weights for those connections have to be initialized and the
inputs have to be normalized. The neurons are set up to compete in a winner-take-all fashion.
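The winner-take-all competition just described can be illustrated with a short sketch. It is an assumption-laden toy and not Kohonen's full algorithm: there is no neighbourhood function and no topological ordering, only a single layer in which the neuron closest to the normalized input wins and moves its weights toward that input. The function names, the initial weights and the sample points are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (the input normalization step)."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector] if length > 0 else list(vector)


def winner(weights, inputs):
    """Index of the neuron whose weight vector is closest to the input."""
    distances = [sum((w - x) ** 2 for w, x in zip(neuron, inputs))
                 for neuron in weights]
    return distances.index(min(distances))


def train_competitive(samples, initial_weights, epochs=50, learning_rate=0.3):
    """Winner-take-all learning: only the winning neuron is updated."""
    weights = [list(neuron) for neuron in initial_weights]
    data = [normalize(sample) for sample in samples]
    for _ in range(epochs):
        for x in data:
            k = winner(weights, x)
            weights[k] = [w + learning_rate * (xi - w)
                          for w, xi in zip(weights[k], x)]
    return weights


# Two loose groups of points; no desired outputs are supplied.
samples = [[1.0, 0.1], [0.9, 0.05], [0.1, 1.0], [0.05, 0.9]]
trained = train_competitive(samples, initial_weights=[[0.8, 0.6], [0.6, 0.8]])
```

After training, each neuron answers for one group of inputs even though the network was never told which group an input belongs to.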
Kohonen continues his research into networks that are structured differently than standard,
neurons into fields. Neurons within a field are "topologically ordered." Topology is a branch
of mathematics that studies how to map from one space to another without changing the
Figure 13: System Design (blocks: mammographic image, preprocessing, processed mammographic image, trained neural network, output predictions)
Kohonen has pointed out that the lack of topology in neural network models makes
today's neural networks just simple abstractions of the real neural networks within the brain.
As this research continues, more powerful self-learning networks may become possible. But
currently, this field remains one that is still in the laboratory. This project’s neural network
When the neural network has finally been correctly trained, and no further learning
is needed, the weights can, if desired, be "frozen." In some systems this finalized network is
then turned into hardware so that it can be fast. Other systems don't lock themselves in but
continue to learn while in production use, as does the network employed for this project. Once
trained, the network accepts a breast cancer patient's diagnostic reports as inputs,
runs the inputs through its nodes and layers, and then makes predictions on the presence of
3.5 OUTPUT
These are the predictions, or inferences, that the neural network makes about whether
cancerous tumors are present in the input mammographic image, together with its
prediction accuracy.
3.6 METHODOLOGY
In implementing this project, a detailed and extensive review of relevant literature was
carried out. The principles and techniques governing the development of
an Expert System for solving complex diagnosis problems were studied in depth, including its
properties and concepts. Accurate and precise training datasets were acquired from web-based
machine learning dataset websites and from cancer experts (oncologists). The collected
datasets were used to train the ES learning algorithms, which were then tested with inference datasets.
CHAPTER FOUR
IMPLEMENTATION
stages after the data cleaning and preprocessing already done for the mini-MIAS database (that is, all
datasets used for this project come from the mini-MIAS database and have already been preprocessed
and cleaned): first, acquisition of images from the mini-MIAS database; second,
extraction of features from the mammograms; third, selection of the most useful features; and finally, a classifier to
identify the appropriate class of each mammogram. The suspicious parts were extracted from the
mammograms by using texture features. The database for this experiment is taken from mini-MIAS.
This data set contains 322 mammograms: 270 images are normal (non-cancerous) and
52 images are malignant (cancerous). Every image in this database is 1024 × 1024 pixels.
This database can be accessed easily. Figure 2 and Figure 3 are sample images for normal
Texture features are extracted using the GLCM along 0° for each mammogram.
Features represent an image in a specific format that focuses on the relevant information.
In the next stage, features are selected for training and testing; this stage is very important
because classification accuracy depends mainly on careful selection of features. In the final
step, the mammograms are classified; for this research work, a neural network is used as the
classifier to assign each mammogram to the normal or the malignant class.
4.1 FEATURES EXTRACTION USING GLCM
Feature extraction plays a vital role in pattern classification. Gray Level Co-occurrence
Matrix (GLCM) features are determined along 0° for all mammograms. In the
proposed system, 10 texture features defined by Haralick et al., shown in Table 1, are extracted
from the texture feature sub-space based on the GLCM. The number of grey levels in an image
determines the size of the GLCM. For each formula given in the equations, n denotes the
number of grey levels used. The matrix element Q(i,j) is the relative frequency with which two
pixels, separated by a given pixel distance, occur within a given neighbourhood with intensities i and j.
The texture features derived from the GLCM are given below.
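Before the individual features, the construction of Q(i,j) itself can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used in this project: pairs are counted in one direction only (each pixel with its right-hand neighbour, which is the 0° direction), the matrix is not made symmetric, and the function name `glcm_0_degrees`, the default distance of 1 and the tiny 4 × 4 example image are hypothetical.

```python
def glcm_0_degrees(image, levels, distance=1):
    """Return the normalized co-occurrence matrix Q along 0 degrees.

    image    : list of rows, each a list of integer grey levels
               in range(levels)
    levels   : number of grey levels n, so Q is an n x n matrix
    distance : horizontal pixel distance between the two pixels
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for col in range(len(row) - distance):
            i = row[col]             # grey level of the reference pixel
            j = row[col + distance]  # grey level of its right neighbour
            counts[i][j] += 1
    total = sum(sum(r) for r in counts)
    if total == 0:
        raise ValueError("image is too narrow for the chosen distance")
    # Divide by the number of pairs to obtain relative frequencies.
    return [[c / total for c in r] for r in counts]


# A 4 x 4 image with 4 grey levels (toy data, not a mammogram).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
Q = glcm_0_degrees(image, levels=4)
```

With 4 grey levels the matrix is 4 × 4, which illustrates why the number of grey levels determines the size of the GLCM. A real 1024 × 1024 mammogram would normally be quantized to a modest number of levels first.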
Contrast
Contrast measures the difference in grey level between a reference pixel and its neighbour, and so
captures the local variation present in the mammogram. Its value is high when Q(i,j) varies
widely across the matrix. It can be measured through the equation shown below
Correlation
Correlation shows the linear dependency of grey value. The value of correlation will be high
Entropy
Sum of squares
It describes the variation between two dependent variables. Variance puts relatively high
weights on the elements that differ from the average value of Q(i,j).
Sum average
Sum variance
Sum entropy
Difference entropy
Information measure of correlation
In this feature two derived arrays are used, first array represents the summation of
rows, while the second one represents the summation of columns in the GLCM.
Difference variance
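To make the feature list concrete, the sketch below computes three of the listed features (contrast, correlation and entropy) from a normalized matrix Q. It assumes the standard textbook forms of the Haralick definitions, a natural logarithm for entropy, and zero-based grey-level indices; the report's own equations may use a different logarithm base or indexing. The function name `glcm_features` is hypothetical. The arrays `p_x` and `p_y` are the two derived arrays mentioned above, holding the row sums and the column sums of the GLCM.

```python
import math


def glcm_features(Q):
    """Contrast, correlation and entropy of a normalized GLCM Q."""
    n = len(Q)
    cells = [(i, j) for i in range(n) for j in range(n)]

    # Derived arrays: row sums and column sums of Q.
    p_x = [sum(Q[i][j] for j in range(n)) for i in range(n)]
    p_y = [sum(Q[i][j] for i in range(n)) for j in range(n)]

    mean_x = sum(i * p_x[i] for i in range(n))
    mean_y = sum(j * p_y[j] for j in range(n))
    std_x = math.sqrt(sum((i - mean_x) ** 2 * p_x[i] for i in range(n)))
    std_y = math.sqrt(sum((j - mean_y) ** 2 * p_y[j] for j in range(n)))

    # Contrast: large when mass lies far from the main diagonal.
    contrast = sum((i - j) ** 2 * Q[i][j] for i, j in cells)

    # Correlation: linear dependency between the two grey levels.
    covariance = sum((i - mean_x) * (j - mean_y) * Q[i][j] for i, j in cells)
    if std_x > 0 and std_y > 0:
        correlation = covariance / (std_x * std_y)
    else:
        correlation = 0.0  # undefined for a constant image; 0 by convention here

    # Entropy: skip empty cells, since 0 * log(0) is taken as 0.
    entropy = -sum(q * math.log(q) for row in Q for q in row if q > 0)

    return {"contrast": contrast, "correlation": correlation, "entropy": entropy}


# Neighbouring pixels always equal: no contrast, perfect correlation.
smooth = glcm_features([[0.5, 0.0], [0.0, 0.5]])
# Neighbouring pixels always different: high contrast, negative correlation.
rough = glcm_features([[0.0, 0.5], [0.5, 0.0]])
```

The two toy matrices show the behaviour described in the text: contrast rises as Q(i,j) spreads away from the diagonal, while correlation is high when neighbouring grey values move together.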
Above ten features are calculated for all mammograms, values of features for five
Features subset selection is used to reduce feature space that helps to reduce the
computation time. This is achieved by removing noisy, redundant and irrelevant features i.e.,
For this research work, a feature-ranking method is used to select the optimal features that contribute
most toward the target output. This function orders the features from top to bottom according
to their contribution. In this work, the top six ranked features are selected for training the
Optimal Features
F1 Sum variance
F3 Correlation
F4 Sum entropy
F5 Entropy
F6 Difference variance
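The ranking step can be illustrated with a short sketch. The report does not state which ranking criterion was used, so the choice here is purely an assumption: each feature is scored by the absolute two-sample (Welch) t-statistic between the normal and malignant classes, and the highest-scoring features are kept. The function names and the toy values are hypothetical.

```python
import math
from statistics import mean, variance


def t_score(normal_values, malignant_values):
    """Absolute Welch t-statistic between the two classes for one feature."""
    difference = abs(mean(normal_values) - mean(malignant_values))
    standard_error = math.sqrt(
        variance(normal_values) / len(normal_values)
        + variance(malignant_values) / len(malignant_values)
    )
    if standard_error == 0:
        # No spread inside either class: perfectly separated or identical.
        return math.inf if difference > 0 else 0.0
    return difference / standard_error


def rank_features(samples, labels, names, top_k=6):
    """Return the names of the top_k features, best separated first.

    samples : list of feature vectors, one per mammogram
    labels  : 0 for normal, 1 for malignant
    names   : feature names, in the same order as the vector entries
    """
    scores = []
    for index, name in enumerate(names):
        normal = [s[index] for s, y in zip(samples, labels) if y == 0]
        malignant = [s[index] for s, y in zip(samples, labels) if y == 1]
        scores.append((t_score(normal, malignant), name))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scores[:top_k]]


# Toy data (not from mini-MIAS): feature "A" separates the classes, "B" does not.
toy_samples = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0],
               [3.0, 4.5], [3.1, 3.5], [2.9, 4.0]]
toy_labels = [0, 0, 0, 1, 1, 1]
best = rank_features(toy_samples, toy_labels, ["A", "B"], top_k=1)
```

With ten GLCM features per mammogram and `top_k=6`, the same call would return six feature names in ranked order, in the spirit of the table above.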
4.3 CLASSIFICATION
used classifier for breast cancer classification. A neural network is composed of simple elements,
inspired by biological neurons, that operate in parallel. We train a neural network to perform a
specific function by adjusting the weights between elements, so that it produces the
desired output. This situation is shown in Figure 4. The network is adjusted, based on a
comparison of its output with the corresponding target, until the network output matches the
target. The ANN classifier involves two steps, i.e. training and testing. Classification accuracy
depends on training.
From the selected database, 70% of the data is used for training, 15% for testing and the
remaining 15% for validation. The neural network contains three layers, namely an input
layer, a hidden layer and an output layer. Parameters used for the artificial neural network are shown
good results in training and classification. Other training functions, such as resilient back-propagation and
Conjugate Gradient with Powell etc., were also used; from all these training functions
converge and mean square error. The optimized network architecture used in this study has 20
neurons. The optimized network is selected by observing the mean square error (MSE) for different
In the regression plot, the outputs from the network are plotted against the targets, as shown in
Figure 6. A perfect fit is indicated by the dotted line, while the solid line
shows the output from the network. The solid line coincides with the dotted line only if the
classifier predicts with 100% accuracy. The difference between the two lines shows that there are some
samples which are not correctly predicted by the network. Each data point is represented by a circle. In the plot
shown below, the value of R is 0.718; this value also indicates the accuracy of the result. The value of R
4.5 NEURAL NETWORK PERFORMANCE EVALUATION
The problem under evaluation is binary classification; the parameters used for
Sensitivity = TP / (TP + FN) × 100
Specificity = TN / (TN + FP) × 100
Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100
Where TP is true positive, TN is true negative, and FP and FN are false positive and
false negative respectively. Sensitivity measures the percentage of cancer cases that are
correctly predicted, specificity measures the percentage of benign/normal cases that are
correctly predicted, and accuracy is the percentage of all cases, cancer and normal, that are
correctly predicted. The data is rotated five times
and the best result of the five folds is shown in the confusion matrix below. In the training set there
are 226 mammograms, 193 normal and 33 malignant; the network predicts all benign as
benign, and out of 193 normal cases 2 samples are misclassified. The validation set comprises 48
samples, 42 normal and 6 malignant; the network predicts all normal and malignant cases
correctly. The test set consists of 48 samples, 37 normal and 11 malignant; prediction is 100% for this dataset.
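The three measures can be computed directly from the confusion-matrix counts, as the short sketch below shows. The function name `evaluate` is hypothetical. The test-set counts are taken from the figures reported above (11 malignant and 37 normal, all predicted correctly). The training-set counts rest on an assumption about how to read the reported result, namely that all 33 malignant cases were predicted correctly and 2 of the 193 normal cases were predicted as malignant.

```python
def evaluate(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) in percent."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Test set: 11 malignant (positive) and 37 normal (negative), no errors.
test_metrics = evaluate(tp=11, tn=37, fp=0, fn=0)

# Training set, under the assumption stated in the lead-in:
# 33 malignant correct, 191 of 193 normal correct, 2 normal misclassified.
training_metrics = evaluate(tp=33, tn=191, fp=2, fn=0)
```

Under that reading, the training set gives a sensitivity of 100%, a specificity of about 98.96% (191 of 193) and an accuracy of about 99.12% (224 of 226).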
Database division
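The 70% / 15% / 15% division can be reproduced with a small sketch. This is a hypothetical illustration: the rounding rule (round the validation and test shares, give the remainder to training) and the seeded random shuffle are assumptions and not necessarily the procedure used in this project, although they do reproduce the reported 226 / 48 / 48 counts for 322 mammograms.

```python
import random


def split_indices(n_samples, validation_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle sample indices and divide them into three disjoint sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_validation = round(n_samples * validation_fraction)
    n_test = round(n_samples * test_fraction)
    n_training = n_samples - n_validation - n_test  # remainder goes to training

    training = indices[:n_training]
    validation = indices[n_training:n_training + n_validation]
    test = indices[n_training + n_validation:]
    return training, validation, test


training, validation, test = split_indices(322)
sizes = (len(training), len(validation), len(test))
```

For 322 images, 15% is 48.3, which rounds to 48 for each of the validation and test sets and leaves 226 images for training.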
CHAPTER FIVE
5.1 SUMMARY
Neural networks offer a different way to analyze data, and to recognize patterns
within that data, than traditional computing methods. However, they are not a solution for
all computing problems. Traditional computing methods work well for problems that can be
well characterized. Balancing checkbooks, keeping ledgers, and keeping tabs of inventory
are well defined and do not require the special characteristics of neural networks.
Traditional computers are ideal for many applications. They can process data, track
inventories, network results, and protect equipment. These applications do not need the
Expert systems are an extension of traditional computing and are sometimes called the
fifth generation of computing. (First generation computing used switches and wires. The
second generation occurred because of the development of the transistor. The third generation
involved solid-state technology, the use of integrated circuits, and higher level languages like
COBOL, Fortran, and "C". End user tools, "code generators," are known as the fourth
knowledge base. The inference engine is generic. It handles the user interface, external files,
program access, and scheduling. The knowledge base contains the information that is specific
to a particular problem. This knowledge base allows an expert to define the rules which
govern a process. This expert does not have to understand traditional programming. That
person simply has to understand both what he wants a computer to do and how the
mechanism of the expert system shell works. It is this shell, part of the inference engine, that
actually tells the computer how to implement the expert's desires. This implementation occurs
by the expert system generating the computer's programming itself; it does that through
"programming" of its own. This programming is needed to establish the rules for a particular
application. This method of establishing rules is also complex and does require a detail-oriented
person.
Efforts to make expert systems general have run into a number of problems. As the
complexity of the system increases, the system simply demands too many computing
resources and becomes too slow. Expert systems have been found to be feasible only when
narrowly confined.
and they are sometimes called the sixth generation of computing. They try to provide a tool
that both programs itself and learns on its own. Neural networks are structured to provide the
capability to solve problems without the benefits of an expert and without the need of
programming. They can seek patterns in data that no one knows are there.
has encountered problems in areas such as vision, continuous speech recognition and
synthesis, and machine learning. Artificial intelligence also is hostage to the speed of the
processor that it runs on. Ultimately, it is restricted to the theoretical limit of a single
processor. Artificial intelligence is also burdened by the fact that experts don't always speak
in rules.
Yet, despite the advantages of neural networks over both expert systems and more
traditional computing in these specific areas, neural nets are not complete solutions. They
offer a capability that is not ironclad in the way that a debugged accounting system is. They learn, and
as such, they do continue to make "mistakes." Furthermore, even when a network has been
developed, there is no way to ensure that the network is the optimal network.
Neural systems do exact their own demands. They do require their implementor to
a data set which includes the information which can characterize the problem.
an adequately sized data set to both train and test the network.
an understanding of the basic nature of the problem to be solved so that basic first-cut
decision on creating the network can be made. These decisions include the activation
Once these conditions are met, neural networks offer the opportunity of solving
problems in an arena where traditional processors lack both the processing power and a step-
traditional computing environments. For example, speech is something that all people can
easily parse and understand. A person can understand a southern drawl, a Bronx accent, and
the slurred words of a baby. Without the massively parallel processing power of a neural
network, this process is virtually impossible for a computer. Image recognition is another task
that a human can easily do but which stymies even the biggest of computers. A person can
recognize a plane as it turns, flies overhead, and disappears into a dot. A traditional computer
might try to compare the changing images to a number of very different stored patterns.
natural evolution. Initially, computing was only hardware and engineers made it work. Then,
there were software specialists - programmers, systems engineers, data base specialists, and
designers. Now, there are also neural architects. This new professional needs to be more
skilled than his predecessors. For instance, he will need to know statistics in order to choose
and evaluate training and testing situations. This skill of making neural networks work is one
5.2 CONCLUSION
Neural networks offer a unique way to solve some problems while making their own
demands. The biggest demand is that the process is not simply logic. It involves an empirical
skill, an intuitive feel as to how a network might be created. Current state-of-the-art artificial
neural networks for general image analysis are able to detect cancer in mammograms with
similar accuracy to radiologists, even in a screening-like cohort with low breast cancer
prevalence.
5.3 RECOMMENDATION
To reduce the death rate due to breast cancer, it is essential that cancer be
identified at an early stage. Early-stage detection of breast cancer can be enhanced with a well-trained
unsupervised artificial neural network that offers precise and accurate prediction and high
performance metrics. This should be explored by future researchers. Future research works
prescription. In this research project, 10 texture features from the GLCM, calculated along 0°,
are under consideration. The feature space is further reduced to 6 features. In future, more
features can be considered, and other datasets can be used to increase the robustness of the system.
REFERENCES
1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6354665/
(https://skymind.ai/wiki/neural-network)
University of Oslo 5: 5.
York: Wiley.
9. Aragones, M. J., Ruiz, A. G., Jimenez, R., Perez, M., & Conejo, E. A. (2003). A
combined neural network and decision trees model for prognosis of breast cancer
relapse.
Science, 1, 365-375.
12. Baker, C.L. (1979). Syntactic theory and the projection problem. Linguistic Inquiry,
10:533-581.
23–34.
14. Berbar MA (2017) Hybrid methods for feature extraction for breast masses
15. Bhardwaj A, Tiwari A (2015) Breast cancer diagnosis using genetically optimized
16. Bishop, C. M. (1996). Neural networks for pattern recognition. Oxford: Clarendon
17. Bowerman, M. (1987). The ‘no negative evidence’ problem: How do children avoid
18. Braine, M.D.S. (1971). On two types of models of the internalization of grammars. In
D.I. Slobin (Ed.), The ontogenesis of grammar: A theoretical perspective. New York:
Academic Press.
19. Brown, R., & Hanlon, C. (1970). Derivational complexity and order of acquisition in
child speech.
20. Chandana P, Rao PS, Satyanarayana CH, Srinivas Y, Latha AG (2017) An efficient
content-based image retrieval (CBIR) using GLCM for feature extraction. Advances in
21. Chechkina EG, Toner, Marin Z, Audit B, Roux SG (2016) Combining multifractal
23. Coleman C (2017) Early detection and screening for breast cancer. In Seminars in
Oncology Nursing.
24. D. Klahr and K. Kotovsky (Eds.), The 21st Carnegie-Mellon symposium on cognition:
Lawrence Erlbaum.
26. Dheeba J, Albert Singh N, Tamil Selvi S (2014) Computer-aided detection of breast
27. Dheeba J, Singh NA, Selvi ST (2014) Computer-aided detection of breast cancer on
processes.
cancer/early-detection/early-detection-factsheets/breast-cancer.html)3
30. Eibe Frank, Mark A. Hall, and Ian H. Witten (2016). The WEKA Workbench.
Online Appendix for "Data Mining: Practical Machine Learning Tools and
31. Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14:179-211.
32. Elman, J.L. (1991). Distributed representations, simple recurrent networks, and
33. Estes, W.K. (1986). Array models for category learning. Cognitive Psychology,
18:500-549.
34. Explorations in the microstructure of cognition (Vol. 1). Cambridge, MA: MIT Press.
35. Fahlman, S.E., & Lebiere, C. (1990). The Cascade-Correlation learning architecture.
In D.S.
classification using curvelet GLCM texture features and GIST features. Proceedings of
38. Gold, E.M. (1967). Language identification in the limit. Information and Control,
16:447-474.
39. Gonzalez, R.C., & Wintz, P. (1977). Digital image processing. Reading, MA:
Addison-Wesley.
40. Goulding NR, Marquez JD, Prewett EM, Claytor TN, Nadler BR (2008)
Ultrasonic imaging techniques for breast cancer detection. AIP Conference
Proceedings 975.
41. Haralick RM, Shanmugam K, Dinstein IK (1973) Textural features for image
42. Harris, C. (1991). Parallel distributed processing models and metaphors for language
43. http://biogps.org/dataset/tag/cancer/
44. http://htv.com.pk/health/breast-cancer-growing-at-alarming-rate-in-pakistan
45. http://www.eng.usf.edu/cvprg/
46. http://www.iccr-cancer.org/datasets
47. http://www.mammoimage.org/databases/
48. http://www.onlinemedicalimages.com/
49. http://www.who.int/cancer/prevention/diagnosis-screening/breast-cancer/en/
50. https://archive.ics.uci.edu/ml/datasets/Breast+Cancer
51. https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
52. https://archive.ics.uci.edu/ml/index.php
53. https://canceraustralia.gov.au/
54. https://data.world/datasets/cancer
55. https://elitedatascience.com/datasets
56. https://ethw.org/Teuvo_Kohonen
57. https://gallery.azure.ai/Experiment/Breast-cancer-dataset
58. https://github.com/datasets/breast-cancer
59. https://github.com/sfikas/medical-imaging-datasets
60. https://scikit-
learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
61. https://shiring.github.io/machine_learning/2017/01/15/rfe_ga_post
62. https://sites.google.com/site/aacruzr/image-datasets
63. https://ugc.futurelearn.com/uploads/files/6f/fe/6ffe8e0c-c7ef-40a3-8769-
9a4bb7164ffa/Transcript1-1.pdf
64. https://wiki.cancerimagingarchive.net/display/Public/QIN+Breast+DCE-MRI
65. https://www.analyticsvidhya.com/blog/2018/03/comprehensive-collection-deep-
learning-datasets/
66. https://www.breastcancer.org/
67. https://www.cancer.org.au/content/about_cancer/ebooks/cancertypes/Understanding_
Breast_Cancer_booklet_July_2016.pdf
68. https://www.cancer.org/cancer/breast-cancer.html
69. https://www.datasciencelearner.com/datasets-for-machine-learning-projects-data-
scientist/
70. https://www.dawn.com/news/1344915
71. https://www.expertsystem.com/machine-learning-definition/
72. https://www.google.com/imgres?imgurl=https%3A%2F%2Fi.udemycdn.com%2Fco
urse%2F750x422%2F1795952_e23e_2.jpg&imgrefurl=https%3A%2F%2Fwww.ude
my.com%2Fcourse%2Fthe-complete-neural-networks-bootcamp-theory-
applications%2F&docid=AYWM3WTt2PYwQM&tbnid=3kub-
tGj8Pdl5M%3A&vet=10ahUKEwj3revep6HkAhUBu3EKHfJlAQ0QMwiHASgQMB
A..i&w=750&h=422&bih=671&biw=1366&q=neural%20network&ved=0ahUKEwj3
revep6HkAhUBu3EKHfJlAQ0QMwiHASgQMBA&iact=mrc&uact=8
73. https://www.google.com/imgres?imgurl=https%3A%2F%2Fleonardoaraujosantos.gi
tbooks.io%2Fartificial-
inteligence%2Fcontent%2Fimage_folder_6%2Frecurrent.jpg&imgrefurl=https%3A%
2F%2Fleonardoaraujosantos.gitbooks.io%2Fartificial-
inteligence%2Fcontent%2Frecurrent_neural_networks.html&docid=lsr8X1595djQM
M&tbnid=2DkILitsq_NRiM%3A&vet=10ahUKEwiv2O_irKHkAhXfRBUIHUFXB
MMQMwhaKAEwAQ..i&w=947&h=410&bih=622&biw=1366&q=Recurrent%20Ne
ural%20Networks%20&ved=0ahUKEwiv2O_irKHkAhXfRBUIHUFXBMMQMwha
KAEwAQ&iact=mrc&uact=8
74. https://www.google.com/imgres?imgurl=https%3A%2F%2Fupload.wikimedia.org%
2Fwikipedia%2Fcommons%2Fthumb%2F7%2F7d%2FRadial_funktion_network.svg
%2F250px-
Radial_funktion_network.svg.png&imgrefurl=https%3A%2F%2Fen.wikipedia.org%2
Fwiki%2FRadial_basis_function_network&docid=h7t_1doMkyywTM&tbnid=BlbOR
TO9lptsNM%3A&vet=10ahUKEwjbw9yFq6HkAhX3XRUIHc0hCcIQMwhfKAYwB
g..i&w=250&h=210&bih=622&biw=1366&q=RBF%20neural%20networks%20&ved
=0ahUKEwjbw9yFq6HkAhX3XRUIHc0hCcIQMwhfKAYwBg&iact=mrc&uact=8
75. https://www.google.com/imgres?imgurl=https%3A%2F%2Fwww.octoparse.com%2
Fmedia%2F5154%2Fdeep-feed-
forward.png&imgrefurl=https%3A%2F%2Fwww.octoparse.com%2Fblog%2F27-
neutral-network-explained-in-graphics&docid=vkk508_bMzN6cM&tbnid=-
4gPh5vn_VaTvM%3A&vet=10ahUKEwj657vdq6HkAhVcShUIHVHqB_QQMwhN
KAEwAQ..i&w=608&h=538&bih=622&biw=1366&q=DFF%20neural%20networks
%20&ved=0ahUKEwj657vdq6HkAhVcShUIHVHqB_QQMwhNKAEwAQ&iact=mr
c&uact=8
76. https://www.kaggle.com/uciml/breast-cancer-wisconsin-data
77. https://www.mayoclinic.org/diseases-conditions/breast-cancer/diagnosis-
treatment/drc-20352475
78. https://www.medicinenet.com/breast_cancer_facts_stages/article.htm
79. https://www.ncbi.nlm.nih.gov/gds/?term=breast+cancer
80. https://www.programcreek.com/python/example/104690/sklearn.datasets.load_breast
_cancer
81. https://www.pyimagesearch.com/2019/02/18/breast-cancer-classification-with-keras-
and-deep-learning/
82. https://www.researchgate.net/post/How_do_I_solve_my_problem_with_the_mini-
MIAS_data_set
83. https://www.researchgate.net/post/How_to_download_MIAS_Dataset
84. https://www.researchgate.net/publication/311950799_Analysis_of_the_Wisconsin_Br
east_Cancer_Dataset_and_Machine_Learning_for_Breast_Cancer_Detection
85. https://www.sciencedirect.com/science/article/pii/S0957417414005594
86. https://www.webmd.com/breast-cancer/default.htm
87. In J.R. Hayes (Ed.), Cognition and the development of language. New York: Wiley.
89. Jordan, M. I. (1986). Serial order: A parallel distributed processing approach. Institute
91. Kail, R. (1984). The development of memory. New York: W.H. Freeman.
92. Kumar S, Chandra M (2017) Detection of microcalcification using the wavelet based
adaptive sigmoid function and neural network. J Infor Proc Sys 13: 703-715.
93. Lea, G. & Simon, H.A. (1979). Problem solving and rule induction. In H.A. Simon
94. MaktabdarOghaz M, Maarof MA, Rohani MF, Zainal A, Shaid SZM. An optimized
skin texture model using gray-level co-occurrence matrix. Neural Comput Appl 1-9.
95. Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann,
and Ian H. Witten (2009). The WEKA Data Mining Software: An Update. SIGKDD
96. McClelland, J.L. (in press). Parallel distributed processing: Implications for cognition
97. McKenzie, B.E., Tootell, H.E., & Day, R.H. (1980). Development of visual size
constancy during the first year of human infancy. Developmental Psychology, 16:163-
174.
98. Medin, D.L., & Schaffer, M.M. (1978). Context theory of classification learning.
99. Miller, G.A., & Chomsky, N. (1963). Finitary models of language users. In R.D.
Luce, R.R. Bush, & E. Galanter (Eds.), Handbook of Mathematical Psychology, Volume
mammograms using radial local ternary patterns. Comput Biol Med 72: 43-53.
101. Newport, E.L. (1988). Constraints on learning and their role in language acquisition:
Science, 14:11-28.
102. Nithya R, Santhi B (2011) Classification of normal and abnormal patterns in digital
mammograms for diagnosis of breast cancer. Int J Comp App 28: 21-25.
103. Nosofsky, R.M. (in press). Exemplars, prototypes, and similarity rules. In A. Healy, S.
Kosslyn, & R. Shiffrin (Eds.), From learning theory to connectionist theory: Essays in
honor of William
image mammogram for breast cancer diagnosis. AIP Conference Proceeding 1719.
105. Osherson, D.N., Stob, M., & Weinstein, S. (1986). Systems that learn: An
introduction to learning theory for cognitive and computer scientists. Cambridge, MA:
MIT Press.
106. Pereira DC, Ramos RP, Do Nascimento MZ (2014) Segmentation and detection of
107. Pinker, S. (1989). Learnability and cognition. Cambridge, MA: MIT Press.
108. Plunkett, K., & Marchman, V. (1990). From rote learning to system building. Center
109. Pollack, J.B. (1990). Language acquisition via strange automata. Proceedings of the
Twelfth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum.
110. Preetha K (2016) Breast cancer detection and classification using artificial neural
111. Rampun A, Morrow PJ, Scotney BW, Winder J (2017) Fully automated breast
112. Rasti R, Teshnehlab M, Phung SL (2017) Breast cancer diagnosis in DCE-MRI using
113. Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann
114. Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986). Learning internal
115. Servan-Schreiber, D., Cleeremans, A., & McClelland, J.L. (1986). Encoding
116. Shultz, T.R., & Schmidt, W.C. (1991). A Cascade-Correlation model of balance scale
117. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2017. CA: A Cancer Journal for
118. Singh AK, Gupta B (2015) A novel approach for breast cancer detection and
119. Suckling J, Parker J, Dance D, Astley S, Hutt I, et al. (1994) The mammographic
120. Sun W, Tseng TL (Bill), Zhang J, Qian W (2017) Enhancing deep convolutional
neural network scheme for breast cancer diagnosis with unlabeled data. Comput Med
122. Turkewitz, G., & Kenny, P.A. (1982). Limitations on input as a basis for neural
123. Vijayasarveswari V, Khatun S, Fakir MM, Jusoh M, Ali S (2017) UWB based low-
cost and non-invasive practical breast cancer early detection. AIP Conference
Proceeding 1808.
124. Wahab N, Khan A, Lee YS (2017) Two-phase deep convolutional neural network for
125. Wexler, K., & Cullicover, P. (1980) Formal principles of language acquisition.