
DEVELOPMENT OF NEURAL NETWORK FOR

BREAST CANCER DIAGNOSIS

ABSTRACT

This project focused on the development of a self-learning expert system (ES) for breast cancer diagnosis using an artificial neural network (NN). The goal of the project is to develop an ES that can learn autonomously from external medical data using neural networks. Mammogram datasets were acquired from a web-based machine learning repository (https://archive.ics.uci.edu/ml/datasets/Breast+Cancer) and from oncologists. The collected datasets were cleaned and preprocessed before being used to train the NN, and the trained network was tested with inference datasets. The training set contains 226 mammograms, of which 193 are normal and 33 are malignant; the network classified all 33 malignant cases correctly, while 2 of the 193 normal cases were misclassified. The validation set comprises 48 samples (42 normal, 6 malignant), all of which the network predicted correctly. The test set consists of 48 samples (37 normal, 11 malignant); prediction accuracy is 100% for this dataset. This product is useful for distinguishing cancerous from non-malignant breast tumors. Future research can focus on the prediction of malignant tumors and on computer-aided treatment prescription.

TABLE OF CONTENTS

Title Page
Declaration
Certification
Dedication
Acknowledgement
Abstract
Table of Contents
Chapter One
  Background of Study
  Statement of the Problem
  Purpose of the Study
  Research Question
  Scope of Study
Chapter Two
  Introduction
  Symptoms of Breast Cancer
  Breast Cancer Diagnosis
  Expert Systems
  Neural Networks
Chapter Three
  Datasets
  Data Cleaning and Pre-processing
  Training Neural Network
  Trained Neural Network
  Output
Chapter Four
  Features Extraction Using GLCM
  Features Selection
  Classification
  Regression Analysis
  Neural Network Performance Evaluation
Chapter Five
  Summary
  Conclusion
  Recommendations
References

CHAPTER ONE

1.1 BACKGROUND INFORMATION

The integration of information technology into many areas of human life has led to the development of different innovative solutions to human problems, both at an individual level (personal finance, healthy lifestyle, education, communication, etc.) and at a global scale (climate change, insecurity, disaster management and disease control).

The advent of information technology aims to free humans from mental drudgery, just as the industrial revolution freed humans from physical drudgery.

Expert systems (ES) are computer applications or programs developed to solve complex problems at the level of extraordinary human intelligence and expertise. An ES reasons over a body of knowledge represented mainly as IF-THEN rules rather than through conventional procedural code. ES are among the first truly successful forms of artificial intelligence (AI) software. However, some experts point out that ES are not true AI, since they lack the ability to learn autonomously from external data.

Knowledge engineering is the discipline of building ES, and its practitioners are called knowledge engineers. The computer must have all the knowledge required to solve a problem, and that knowledge must be represented as symbol patterns in the memory of the computer. The computer must also use the knowledge efficiently, selecting from a handful of reasoning methods.

Figure 1: Components of an Expert System

An ES consists of three main parts: the knowledge base, the reasoning methods or inference engine, and the user interface.

Knowledge is required to exhibit intelligence or sound judgment. The success of any ES rests largely on the collection of highly accurate and precise knowledge. Data is a collection of facts; information is data organized as facts about the task domain. Data, information and past experience combined together are termed knowledge. The knowledge base contains factual and heuristic knowledge.

Factual knowledge is knowledge of the task domain that is widely shared, typically found in textbooks or journals, and commonly agreed upon by those knowledgeable in the particular field.

Heuristic knowledge is less rigorous, more experiential and more judgmental knowledge of performance. It is rarely discussed and largely individualistic. It is the knowledge of good practice, good judgment and plausible reasoning in the field; it is the knowledge that underlies the "art of good guessing."

A knowledge representation is the method used to organize and formalize the knowledge in the knowledge base, in this case in the form of IF-THEN-ELSE rules.

The knowledge base is built from the writings of various scholars, from domain experts and from the knowledge engineers themselves. A knowledge engineer is a person with empathy, quick learning and strong analytical skills. The knowledge engineer acquires information from subject experts by recording, interviewing and observing them at work, then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules to be used by the inference engine.

The inference engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution. For a rule-based expert system, it applies rules repeatedly to the facts, adds new knowledge to the knowledge base if required, and resolves rule conflicts when multiple rules are applicable to a particular case.

To recommend a solution, the inference engine uses forward chaining, a strategy used by an ES to answer the question "What can happen next?". The inference engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the facts and rules, and sorts them before concluding with a solution. This strategy is used for reasoning towards a conclusion, result or effect, for example predicting the share market status as an effect of changes in interest rates.

Figure 2: Forward Chaining

The inference engine also uses backward chaining, which answers the question "Why did this happen?". The inference engine tries to find out which conditions could have held in the past for this result to occur. This strategy is used for finding a cause or reason, for example diagnosing blood or breast cancer in humans.
Figure 3: Backward Chaining
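As an illustration of these two reasoning strategies, the minimal sketch below implements a tiny rule engine in Python. The rules, facts and function names are hypothetical examples chosen for illustration; they are not part of this project's implementation.

```python
# Minimal sketch of forward and backward chaining over IF-THEN rules.
# The rules and facts below are illustrative only (hypothetical example).

RULES = [
    ({"lump_in_breast", "abnormal_mammogram"}, "suspicious_mass"),
    ({"suspicious_mass", "positive_biopsy"}, "breast_cancer"),
]

def forward_chain(facts):
    """Apply rules repeatedly to the known facts until no new fact is derived
    ("what can happen next")."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Work backwards from a goal to the conditions that could explain it
    ("why did this happen")."""
    if goal in facts:
        return True
    for conditions, conclusion in RULES:
        if conclusion == goal and all(backward_chain(c, facts) for c in conditions):
            return True
    return False

observed = {"lump_in_breast", "abnormal_mammogram", "positive_biopsy"}
print(forward_chain(observed))                    # derives 'suspicious_mass' and 'breast_cancer'
print(backward_chain("breast_cancer", observed))  # True: the conditions can be traced back
```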

The user interface provides interaction between the user and the ES. It explains how the ES arrived at a particular recommendation. The explanation may appear as natural language displayed on screen, as verbal narration in natural language, or as a listing of rule numbers displayed on screen. The user interface makes it easy to trace the credibility of the ES's deductions.

Cancer is a group of diseases involving abnormal cell growth with the potential to spread to other body parts. Breast cancer is a cancer that develops from breast tissue. Common breast cancer signs and symptoms include a lump or swelling in the breast, upper chest or armpit; changes in skin texture; changes in breast colour; rash, crusting or changes to the nipple; and unusual discharge from either nipple.

Risk factors for developing breast cancer include being female, obesity, lack of physical exercise, drinking alcohol, and hormone replacement therapy during menopause. Breast cancer commonly develops in cells from the lining of the milk ducts and the lobules that supply the ducts with milk. Cancers developing from the ducts are known as "ductal carcinomas", while those developing from the lobules are known as "lobular carcinomas".

The earlier breast cancer is diagnosed, the better the chance of successful treatment. Breast cancer is diagnosed by a biopsy of the affected area of the breast, confirmed by X-ray mammography. The likelihood of a lump being cancerous can also be assessed by physical examination of the breast by a healthcare provider.

Nowadays, with the advent of technology, medical practice is becoming more effective. There are many applications of ES that have been used in the medical field; ES have been implemented for diagnosing diseases such as diabetes and skin disease. This breast cancer diagnosis ES consists of both structured questions and structured responses within the medical domain.

1.2 STATEMENT OF THE PROBLEM

Expert systems are among the first truly successful forms of artificial intelligence software. However, some experts point out that ES are not true artificial intelligence, since they lack the ability to learn autonomously from external data, which limits their diagnostic capacity.

The focus of this project is to develop an ES that can learn autonomously

from external medical data using “Neural Networks.”

1.3 PURPOSE OF STUDY

The aim of this project is to develop an Expert System that can learn

autonomously from external medical data using “Neural networks” for breast cancer

diagnosis.

The specific objectives of this project are to:

• detect cancerous cells in breast tissue early;

• accept inputs from end users and provide accurate and precise expert diagnostic reports;

• learn autonomously and improve its accuracy and precision continuously;

• aid human experts in the diagnosis of breast cancer;

• guide healthcare providers on what to do in an emergency when a human expert is not present.

1.4 RESEARCH QUESTION

1.5 METHODOLOGY

In implementing this project, a detailed and extensive review of relevant literature was carried out. An elaborate study of the principles and techniques governing the development of an expert system for solving complex diagnosis problems was carried out by learning its properties and concepts. Accurate and precise training datasets were acquired from web-based machine learning dataset websites and from cancer experts (oncologists). The collected datasets were used to train the ES learning algorithms and tested with inference datasets.

1.6 SCOPE OF THE STUDY

The scope of this project is limited to the development of an automated reasoning system (ES) for early breast cancer diagnosis only. Treatment and prevention of breast cancer are not considered, and other types of cancer are likewise outside the scope.

CHAPTER TWO

LITERATURE REVIEW

2.1 INTRODUCTION

The integration of information technology into many areas of human life has led to the development of different innovative solutions to human problems, both at an individual level (personal finance, healthy lifestyle, education, communication, etc.) and at a global scale (climate change, insecurity, disaster management and disease control).

The advent of information technology aims to free humans from mental drudgery, just as the industrial revolution freed humans from physical drudgery.

Cancer is a disease of the cells, which are the body’s basic building blocks. The

body constantly makes new cells to help us grow, replace worn-out tissue and heal injuries.

Normally, cells multiply and die in an orderly way. Sometimes cells don’t grow, divide and

die in the usual way. This may cause blood or lymph fluid in the body to become abnormal,

or form a lump called a tumour. A tumour can be benign or malignant.

Benign tumour – Cells are confined to one area and are not able to spread to other parts of

the body. This is not cancer.

Malignant tumour – This is made up of cancerous cells, which have the ability to spread by

travelling through the bloodstream or lymphatic system (lymph fluid).

Figure 4: How Cancer Starts

The cancer that first develops in a tissue or organ is called the primary cancer. A

malignant tumour is usually named after the organ or type of cell affected. A malignant

tumour that has not spread to other parts of the body is called localised cancer. A tumour

may invade deeper into surrounding tissue and can grow its own blood vessels in a process

called angiogenesis. If cancerous cells grow and form another tumour at a new site, it is

called a secondary cancer or metastasis. A metastasis keeps the name of the original cancer.

For example, breast cancer that has spread to the bones is called metastatic breast cancer,

even though the person may be experiencing symptoms caused by problems in the bones.

Figure 5: How Cancer Spreads

Women and men both have breast tissue. In women, breasts are made up of milk

glands. A milk gland consists of:

• Lobules – where milk is produced

• Ducts – tubes that carry milk to the nipples.

In men, the development of the lobules is suppressed at puberty by testosterone, the

primary male sex hormone. Both female and male breasts also contain supportive fibrous

and fatty tissue. Some breast tissue extends into the armpit (axilla). This is known as the

‘axillary tail’ of the breast.

Breast cancer and the lymphatic system

The lymphatic system is a key part of the immune system. It protects the body

against disease and infection. It is made up of a network of thin tubes called lymph vessels

that are found throughout the body. Lymph vessels connect to groups of small, bean-shaped

structures called lymph nodes or glands. Lymph nodes are found throughout the body,

including in the armpits, breastbone (sternum), neck, abdomen and groin. The lymph nodes

in the armpit are often the first place cancer cells spread to outside the breast. During

surgery for breast cancer (or, sometimes, in a separate operation), some or all of the lymph

nodes will be removed and examined for cancer cells.

Figure 6: The Breast

Breast cancer is the abnormal growth of the cells lining the breast lobules or ducts.

These cells grow uncontrollably and have the potential to spread to other parts of the body.

Both women and men can develop breast cancer, although breast cancer is rare in men.

Ductal carcinoma in situ (DCIS) – Abnormal cells are contained within the ducts of the breast. Having DCIS can increase the risk of developing invasive breast cancer.

Invasive (early) breast cancer – The cancer has spread from the breast ducts or lobules into surrounding breast tissue. It may also have spread to lymph nodes in the armpit. Most breast cancers are found when they are invasive. The most common types of early breast cancer are invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC). IDC accounts for about 80% of breast cancers, and ILC makes up about 10% of breast cancer cases.

Other types of invasive breast cancer include locally advanced breast cancer,

secondary breast cancer, inflammatory breast cancer and Paget’s disease of the nipple.

Lobular carcinoma in situ

Some women have abnormal cells that are contained within the lobules of the breast.

This is called lobular carcinoma in situ (LCIS). This is not cancer. LCIS is very rare in men.

While LCIS increases the risk of developing breast cancer, most women with this condition

will not develop breast cancer. LCIS is usually detected during tests for other breast

disorders. If you are diagnosed with LCIS, you will be monitored with regular screening

mammograms or other types of breast imaging.

2.2 SYMPTOMS OF BREAST CANCER

Some people have no symptoms and the cancer is found during a screening

mammogram (a low-dose x-ray of the breast) or a physical examination by a doctor. If you

do have symptoms, they could include:

• a lump, lumpiness or thickening, especially if it is in only one breast

• changes in the size or shape of the breast

• changes to the nipple, such as a change in shape, crusting, sores or ulcers, redness, a clear

or bloody discharge, or a nipple that turns in (inverted) when it used to stick out

• changes in the skin of the breast, such as dimpling or indentation, a rash, a scaly

appearance, unusual redness or other colour changes

• swelling or discomfort in the armpit

• persistent, unusual pain that is not related to your normal monthly menstrual cycle,

remains after your period and occurs in one breast only. Most breast changes aren’t caused

by cancer. However, if you have symptoms, see your doctor without delay.

In most people, the exact cause of breast cancer is unknown, but some factors can

increase the risk. Most people diagnosed with breast cancer have no known risk factors,

aside from getting older, which increases the risk in women and men. Having risk factors

does not necessarily mean that you will develop breast cancer. In women, risk factors

include:

• having several first-degree relatives, such as a mother, father, sister or daughter, diagnosed with breast cancer and/or a particular type of ovarian cancer (however, most women diagnosed with breast cancer do not have a family history)

• having a family member who has had genetic testing and has been found to carry a mutation in the BRCA1 or BRCA2 genes

• a previous diagnosis of breast cancer or ductal carcinoma in situ (DCIS)

• a past history of particular non-cancerous breast conditions, such as lobular carcinoma in situ (LCIS) or atypical ductal hyperplasia (abnormal cells in the lining of the milk ducts)

• long-term hormone replacement therapy (HRT) use.

In men, the risk is increased in those who have:

• several first-degree relatives (male or female) who have had breast cancer

• a relative diagnosed with breast cancer under the age of 40

• several relatives with ovarian or colon cancer

• a family member who has had genetic testing and has been found to carry a mutation in the BRCA1 or BRCA2 genes

• a rare genetic syndrome called Klinefelter syndrome, in which a man has three sex chromosomes (XXY) instead of the usual two (XY).

Some lifestyle factors, such as being overweight, smoking, drinking alcohol and a lack of physical activity, also slightly increase the risk of breast cancer in both women and men.

Inherited breast cancer gene

Most people diagnosed with breast cancer do not have a family history of the

disease. However, a small number of people have inherited a gene fault that increases their

breast cancer risk. Everyone inherits a set of genes from each parent, so they have two

copies of each gene. Sometimes there is a fault in one copy of a gene. This fault is called a

mutation. The two most common gene mutations that are linked to breast cancer are on the

BRCA1 and BRCA2 genes. Women in families with an inherited BRCA1 or BRCA2

change are at an increased risk of breast and ovarian cancers. Men in these families may be

at an increased risk of breast and prostate cancers. People with a strong family history of

breast cancer can attend a family cancer clinic for tests to see if they have inherited a gene

mutation.

2.3 BREAST CANCER DIAGNOSIS

If a patient has symptoms of breast cancer, the doctor will take a full medical history, including the family history. The doctor will also perform a physical examination, checking the breasts and the lymph nodes under the arms. The doctor may then refer the patient to a specialist for further tests to find out whether the breast change is due to cancer.

A mammogram is a low-dose x-ray of the breast tissue. This x-ray can find changes that are too small to be felt during a physical examination. Both breasts are checked during a mammogram. During the mammogram, the breast is pressed between two x-ray plates, which spread the breast tissue out so clear pictures can be taken. This can be uncomfortable, but it takes only about 20 seconds. If the lump the patient can feel does not show up on a mammogram, other tests will need to be done.

An ultrasound is a painless scan that uses sound waves to create a picture of the breast. A gel is spread on the breast, and a small device called a transducer is moved over the area. This sends out sound waves that echo when they meet something dense, like an organ or a tumour. A computer creates a picture from these echoes. The scan is painless and takes about 15–20 minutes.

A magnetic resonance imaging (MRI) scan uses a large magnet and radio waves to create pictures of the breast tissue on a computer. Breast MRI is commonly used to screen people who are at high risk of breast cancer, but it can also be used in people with very dense breast tissue. Before the scan, the patient is given an injection of a contrast dye to make any cancerous breast tissue easier to see. The patient lies face down on a table with cushioned openings for the breasts and with the arms above the head. The table slides into the machine, which is large and shaped like a cylinder. The scan is painless and takes 30–60 minutes.

During a biopsy, a small sample of cells or tissue is removed from the breast. A pathologist examines the sample and checks it for cancer cells under a microscope. The results of the biopsy and further tests will be outlined in a pathology report, which will include the size and location of the tumour, the grade of the cancer, whether there are cancer cells near the edge (margin) of the removed breast tissue, and whether there are cancer cells in the lymph nodes. The report helps the doctor decide what treatment is best for the patient. There are a few ways of taking a biopsy, and a patient may need more than one. The biopsy may be done in a specialist's rooms, at a radiology practice, in hospital or at a breast clinic.

Fine needle aspiration (FNA) – A thin needle is used to take cells from the breast

lump or abnormal area. Sometimes an ultrasound is used to help guide the needle. The test

can feel similar to having blood taken and may be a bit uncomfortable. A local anesthetic

may be used to numb the area where the needle will be inserted.

Core biopsy – A wider needle is used to remove a piece of tissue (a core) from the lump or abnormal area. It is usually done under local anesthetic, so the breast is numb, although the patient may feel some pain or discomfort when the anesthetic is given. During a core biopsy, a mammogram, ultrasound or MRI is used to guide the needle. There may be some bruising to the breast afterwards.

Vacuum-assisted stereotactic core biopsy – In this core biopsy, a number of small

tissue samples are removed through one small cut (incision) in the skin using a needle and a

suction-type instrument. It is done under a local anesthetic. A mammogram, ultrasound or

MRI may be used to guide the needle into place. The patient may feel some discomfort during the procedure.

Surgical biopsy – If the abnormal area is too small to be biopsied using other

methods or the biopsy result isn’t clear, a surgical biopsy is done. Before the biopsy, a guide

wire may be put into the breast to help the surgeon find the abnormal tissue. The patient will be given a local anesthetic, and the doctor may use a mammogram, ultrasound or MRI to guide

the wire into place. The biopsy is then done under a general anesthetic. The lump and a

small area of nearby breast tissue are removed, along with the wire. This is usually done as

day surgery, but some people stay in hospital overnight.

If the tests described above show that a patient has breast cancer, one or more further tests may be done to see whether the cancer has spread to other parts of the body. Blood samples may be taken to check the patient's general health and to look at bone and liver function for signs of cancer. The doctor may take an x-ray of the chest to check the lungs for signs of cancer.

A bone scan may be done to see whether the breast cancer has spread to the bones. A small amount of radioactive material is injected into a vein, usually in the arm. This material is attracted to areas of bone where there is cancer. After a few hours, the bones are viewed with a scanning machine, which sends pictures to a computer. This scan is painless and the radioactive material is not harmful. The patient should drink plenty of fluids on the day of the test and the day after.

A CT (computerized tomography) scan uses x-rays and a computer to create detailed, cross-sectional pictures of the inside of the body. The patient may have to fast (not eat or drink) for a period of time beforehand to make the scan pictures clearer and easier to read. Before the scan, the patient either drinks a liquid dye or is given an injection of dye into a vein in the arm. This dye is known as the contrast and it makes the pictures clearer. Patients who have the injection may feel hot all over for a few minutes. The patient lies flat on a table while the CT scanner, which is large and round like a doughnut, takes pictures. This painless test takes 30–40 minutes.

A PET (positron emission tomography) scan is a specialized test, which is rarely done for breast cancer. It is currently not funded by Medicare as a routine test for breast cancer. A PET scan uses low-dose radioactive glucose to measure cell activity in different parts of the body. If a patient does have a PET scan, a small amount of the glucose is injected into a vein, usually in the arm. The patient waits for about an hour for the fluid to move around the body, then lies on a table that moves through a scanning machine. The scan shows 'hot spots' where the fluid has accumulated; this happens where there are active cells, like cancer cells.

The tests described above show whether the cancer has spread to other parts of the

body. Working out how far the cancer has spread is called staging. Stages are numbered

from I to IV. The grade describes how active the cancer cells are and how fast the cancer is

likely to be growing.

Stage I – The tumour is less than 2 cm in diameter and has not spread to the lymph nodes in the armpit.

Stage IIA – The tumour is less than 2 cm in diameter and has spread to the lymph nodes in the armpit, or the tumour is 2–5 cm in diameter and has not spread to the lymph nodes in the armpit.

Stage IIB – The tumour is 2–5 cm in diameter and has spread to the lymph nodes in the armpit.

Stage III – Referred to as locally advanced breast cancer.

Stage IV – Refers to advanced breast cancer.

Grade 1 (low grade) Cancer cells look a little different from normal cells. They are

usually slow growing.

Grade 2 (intermediate grade) Cancer cells do not look like normal cells. They are

growing faster than grade 1 breast cancer, but not as fast as grade 3.

Grade 3 (high grade) Cancer cells look very different from normal cells. They are

fast growing.

Prognosis means the expected outcome of a disease. Patient may wish to discuss

his/her prognosis with his/her doctor, but it is not possible for any doctor to predict the

exact course of the disease. Survival rates for people with breast cancer have increased

significantly over time due to better diagnostic tests and scans, earlier detection, and

improvements in treatment methods. Most people with early breast cancer can be treated

successfully.

2.4 EXPERT SYSTEMS

ES are computer applications or programs developed to solve complex problems at the level of extraordinary human intelligence and expertise. An ES reasons over a body of knowledge represented mainly as IF-THEN rules rather than through conventional procedural code. ES are among the first truly successful forms of artificial intelligence (AI) software. However, some experts point out that ES are not true AI, since they lack the ability to learn autonomously from external data. An ES consists of three main parts: the knowledge base, the reasoning methods or inference engine, and the user interface.

Knowledge is required to exhibit intelligence or sound judgment. The success of any ES rests largely on the collection of highly accurate and precise knowledge. Data is a collection of facts; information is data organized as facts about the task domain. Data, information and past experience combined together are termed knowledge. The knowledge base contains factual and heuristic knowledge: factual knowledge is knowledge of the task domain that is widely shared, typically found in textbooks or journals, and commonly agreed upon by those knowledgeable in the particular field. Heuristic knowledge is less rigorous, more experiential and more judgmental knowledge of performance. It is rarely discussed and largely individualistic. It is the knowledge of good practice, good judgment and plausible reasoning in the field; it is the knowledge that underlies the "art of good guessing."

A knowledge representation is the method used to organize and formalize the knowledge in the knowledge base in the form of IF-THEN-ELSE rules. The knowledge base is built from the writings of various scholars, from domain experts and from the knowledge engineers themselves. A knowledge engineer is a person with empathy, quick learning and strong analytical skills, who acquires information from subject experts by recording, interviewing and observing them at work, then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules to be used by the inference engine.

The inference engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution. For a rule-based expert system, it applies rules repeatedly to the facts, adds new knowledge to the knowledge base if required, and resolves rule conflicts when multiple rules are applicable to a particular case. To recommend a solution, the inference engine uses the following strategies. Forward chaining is a strategy used by an ES to answer the question "What can happen next?": the inference engine follows the chain of conditions and derivations and finally deduces the outcome, considering all the facts and rules and sorting them before concluding with a solution. This strategy is used for reasoning towards a conclusion, result or effect, for example predicting the share market status as an effect of changes in interest rates. Backward chaining is used to answer the question "Why did this happen?": the inference engine tries to find out which conditions could have held in the past for this result to occur. This strategy is used for finding a cause or reason, for example diagnosing blood or breast cancer in humans.

The user interface provides interaction between the user and the ES. It explains how the ES arrived at a particular recommendation. The explanation may appear as natural language displayed on screen, as verbal narration in natural language, or as a listing of rule numbers displayed on screen. The user interface makes it easy to trace the credibility of the ES's deductions.

2.5 NEURAL NETWORKS

Artificial neural networks (NNs) are biologically inspired and mimic the human brain. They are composed of neurons that are connected to each other by links; each link carries a weight that multiplies the signal transmitted through the network. The output of each neuron is determined by an activation function such as the sigmoid or step function; usually nonlinear activation functions are used. NNs are trained by experience: when an unknown input is applied to the network, it can generalize from past experience and produce a new result (Bishop, 1996; Hanbay, Turkoglu, & Demir, 2007; Haykin, 1994). NN models have been used for pattern matching, nonlinear system modeling, communications, the electrical and electronics industry, energy production, the chemical industry, medical applications, data mining and control because of their parallel processing capabilities. When designing a NN model, a number of considerations must be taken into account: first the suitable structure of the model must be chosen, and then the activation function, the number of layers and the number of units in each layer. Generally the desired model consists of a number of layers, and the most general model assumes complete interconnection between layers.

Figure 7: General Model of Artificial Neural Network

The perceptron is the simplest and oldest model of the neuron as we know it: it takes some inputs, sums them up, applies an activation function and passes the result to the output layer.

Figure 8: Perceptron
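A minimal sketch of such a perceptron in Python follows; the input values, weights, bias and the step activation used here are illustrative assumptions, not values taken from this project.

```python
import numpy as np

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation function."""
    total = np.dot(inputs, weights) + bias
    return 1 if total > 0 else 0

# Illustrative values only: two input features, hand-picked weights and bias.
x = np.array([0.7, 0.2])
w = np.array([0.5, -0.4])
print(perceptron(x, w, bias=-0.1))  # 1 if the weighted sum is positive, else 0
```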

Feed-forward neural networks are also quite old — the approach originates from the 1950s. They generally follow these rules: all nodes are fully connected, activation flows from the input layer to the output layer without back loops, and there is one layer between input and output (the hidden layer). In most cases this type of network is trained using the back-propagation method.

Figure 9: Feed Forward Neural Networks

RBF neural networks are actually FF (feed-forward) NNs that use a radial basis function as the activation function instead of the logistic function. What makes the difference? The logistic function maps an arbitrary value to a 0…1 range, answering a "yes or no" question. It is good for classification and decision-making systems, but works poorly for continuous values. By contrast, radial basis functions answer the question "how far are we from the target?". This makes them well suited to function approximation and machine control (as a replacement for PID controllers, for example). In short, these are just FF networks with a different activation function and application.

Figure 10: RBF Neural Networks
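The difference between the two activation functions can be made concrete with a short sketch (a hypothetical illustration, not this project's code): the logistic function squashes any value into the 0…1 range, while a Gaussian radial basis function reports how close an input is to a chosen centre.

```python
import numpy as np

def logistic(x):
    """Maps any value into the 0..1 range - useful for yes/no decisions."""
    return 1.0 / (1.0 + np.exp(-x))

def gaussian_rbf(x, centre, width=1.0):
    """Close to 1 when x is near the centre, falling towards 0 with distance."""
    return np.exp(-((x - centre) ** 2) / (2 * width ** 2))

print(logistic(2.0))           # ~0.88: "probably yes"
print(gaussian_rbf(2.0, 2.5))  # ~0.88: "close to the target of 2.5"
```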

Deep feed-forward (DFF) neural networks opened the Pandora's box of deep learning in the early 1990s. These are just FF NNs, but with more than one hidden layer. So, what makes them so different? When training a traditional FF network, only a small amount of error is passed back to the previous layer; because of that, stacking more layers led to exponential growth of training times, making DFFs quite impractical. Only in the early 2000s were a number of approaches developed that allowed DFFs to be trained effectively; now they form the core of modern machine learning systems, covering the same purposes as FFs but with much better results.

Figure 11: DFF Neural Networks

Recurrent neural networks introduce a different type of cell — the recurrent cell. The first network of this type was the so-called Jordan network, in which each hidden cell receives its own output with a fixed delay of one or more iterations. Apart from that, it is like a common feed-forward network.

Figure 12: Recurrent Neural Networks

CHAPTER THREE

SYSTEM DESIGN

3.1 DATASETS

Raw datasets – breast cancer mammography images that need to be cleaned and preprocessed.

Training datasets – breast cancer mammography images used to train the neural network.

Input (inference) datasets – breast cancer mammography images used to obtain the required predictions once the neural network has been trained.

3.2 DATA CLEANING AND PREPROCESSING

Data needs to be cleaned and processed so that it is in a usable format for modeling. Exploration is required to identify important elements within the data and to identify any data quality issues. The importance of data pre-processing cannot be over-emphasized: a neural network is only as good as the input data used to train it. If important data inputs are missing, the neural network may not be able to achieve the desired level of accuracy, and if the data is not processed beforehand, it can hurt both the accuracy and the performance of the network down the line. Some data pre-processing techniques include the following.

Mean subtraction (zero-centering) – the process of subtracting the mean from each data point to make the data zero-centered. Consider a case where the inputs to a neuron (unit) are all positive or all negative: the gradient calculated during back-propagation will then be either all positive or all negative (the same sign as the inputs), so parameter updates are restricted to specific directions, which makes convergence inefficient.

Data normalization – normalizing the data so that it has the same scale across all dimensions. A common way to do this is to divide the data along each dimension by its standard deviation. However, this only makes sense if there is reason to believe that different input features have different scales but equal importance to the learning algorithm.

Regularization – one of the most common problems in training deep neural networks is over-fitting. Over-fitting is at play when the network performs exceptionally well on the training data but poorly on test data; it happens because the learning algorithm tries to fit every data point in the input, even those that represent randomly sampled noise. Regularization helps avoid over-fitting by penalizing the weights of the network.
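The short sketch below shows how zero-centering, standard-deviation normalization and an L2 weight penalty could look in Python with NumPy. The feature matrix, weight vector and penalty coefficient are hypothetical values chosen only to illustrate the steps described above.

```python
import numpy as np

# Hypothetical feature matrix: rows are samples, columns are features.
X = np.array([[120.0, 0.8],
              [135.0, 1.1],
              [128.0, 0.9]])

# Mean subtraction (zero-centering): remove the per-feature mean.
X_centred = X - X.mean(axis=0)

# Normalization: divide each feature by its standard deviation
# so that all dimensions are on a comparable scale.
X_normalised = X_centred / X_centred.std(axis=0)

# L2 regularization penalty on a hypothetical weight vector w,
# added to the training loss to discourage over-fitting.
w = np.array([0.5, -1.2])
l2_penalty = 0.01 * np.sum(w ** 2)

print(X_normalised)
print(l2_penalty)
```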

3.3 TRAINING NEURAL NETWORK

Once the data has been cleaned and pre-processed for breast cancer diagnosis, the network is ready to be trained. To start this process the initial weights are chosen randomly. Then the training, or learning, begins.

There are two approaches to training - supervised and unsupervised. Supervised

training involves a mechanism of providing the network with the desired output either by

manually "grading" the network's performance or by providing the desired outputs with the

inputs. Unsupervised training is where the network has to make sense of the inputs without

outside help. The vast bulk of networks utilize supervised training. Unsupervised training is

used to perform some initial characterization on inputs. In supervised training, both the inputs

and the outputs are provided. The network then processes the inputs and compares its

resulting outputs against the desired outputs. Errors are then propagated back through the

system, causing the system to adjust the weights which control the network. This process

occurs over and over as the weights are continually tweaked.
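A minimal sketch of this supervised loop for a single-layer network, written in Python/NumPy, is shown below; the random data, learning rate and number of epochs are illustrative assumptions rather than the settings used in this project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 6 feature vectors with known labels (0 = normal, 1 = malignant).
X = rng.normal(size=(6, 4))
y = np.array([0, 0, 1, 0, 1, 1], dtype=float)

# Initial weights are chosen randomly, then refined over many passes.
w = rng.normal(size=4)
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(500):
    pred = sigmoid(X @ w + b)                              # forward pass
    error = pred - y                                       # compare outputs with desired outputs
    grad_w = X.T @ (error * pred * (1 - pred)) / len(y)    # propagate the error back
    grad_b = np.mean(error * pred * (1 - pred))
    w -= lr * grad_w                                       # adjust the weights that control the network
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # outputs move towards the target labels
```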

The set of data which enables the training is called the "training set." During the

training of a network the same set of data is processed many times as the connection weights

are ever refined. The current commercial network development packages provide tools to

monitor how well an artificial neural network is converging on the ability to predict the right

answer. These tools allow the training process to go on for days, stopping only when the

system reaches some statistically desired point, or accuracy. If a network simply can't solve

the problem, the designer then has to review the input and outputs, the number of layers, the

number of elements per layer, the connections between the layers, the summation, transfer,

and training functions, and even the initial weights themselves. Those changes required to

create a successful network constitute a process wherein the "art" of neural networking

occurs. Another part of the designer's creativity governs the rules of training. There are many

laws (algorithms) used to implement the adaptive feedback required to adjust the weights

during training. The most common technique is backward-error propagation, more commonly

known as back-propagation. These various learning techniques are explored in greater depth

later in this report. When finally the system has been correctly trained, and no further

learning is needed, the weights can, if desired, be "frozen." In some systems this finalized

network is then turned into hardware so that it can be fast.

Other systems don't lock themselves in but continue to learn while in production use.

The other type of training is called unsupervised training. In unsupervised training, the

network is provided with inputs but not with desired outputs. The system itself must then

decide what features it will use to group the input data. This is often referred to as self-

organization or adaption. At the present time, unsupervised learning is not well understood.

This adaption to the environment is the promise which would enable science fiction types of

robots to continually learn on their own as they encounter new situations and new

environments. Life is filled with situations where exact training sets do not exist. Some of

these situations involve military action where new combat techniques and new weapons

might be encountered. Because of this unexpected aspect to life and the human desire to be

prepared, there continues to be research into, and hope for, this field. Yet, at the present time,

the vast bulk of neural network work is in systems with supervised learning. Supervised

learning is achieving results.

One of the leading researchers into unsupervised learning is Teuvo Kohonen, an electrical engineer at the Helsinki University of Technology. He has developed a self-organizing network, sometimes called an auto-associator, that learns without the benefit of

knowing the right answer. It is an unusual looking network in that it contains one single layer

with many connections. The weights for those connections have to be initialized and the

inputs have to be normalized. The neurons are set up to compete in a winner-take-all fashion.

Kohonen continues his research into networks that are structured differently than standard,

feedforward, back-propagation approaches. Kohonen's work deals with the grouping of

neurons into fields. Neurons within a field are "topologically ordered." Topology is a branch

of mathematics that studies how to map from one space to another without changing the

geometric configuration. The three-dimensional groupings often found in mammalian brains

are an example of topological ordering.

Figure 13: System Design (mammographic image → preprocessing → processed mammographic image → trained neural network → output/predictions)

Kohonen has pointed out that the lack of topology in neural network models makes today's neural networks just simple abstractions of the real neural networks within the brain. As this research continues, more powerful self-learning networks may become possible, but currently this field remains one that is still in the laboratory. This project's neural network was trained with a supervised approach.

3.4 TRAINED NEURAL NETWORK

When the neural network has finally been correctly trained and no further learning is needed, the weights can, if desired, be "frozen." In some systems this finalized network is then turned into hardware so that it can be fast. Other systems do not lock themselves in but continue to learn while in production use, as does the network employed for this project. After being trained, the network can accept a breast cancer patient's diagnostic reports as inputs, run the inputs through the network's nodes and layers, and then make predictions on the presence or absence of cancerous tumors while establishing its prediction accuracy.

3.5 OUTPUT

These are the predictions or inferences made by the neural network on the presence or absence of cancerous tumors in the input mammographic image, together with its prediction accuracy.

3.6 METHODOLOGY

In implementing this project, a detailed and extensive review of relevant literature was carried out. An elaborate study of the principles and techniques governing the development of an expert system for solving complex diagnosis problems was carried out by learning its properties and concepts. Accurate and precise training datasets were acquired from web-based machine learning dataset websites and from cancer experts (oncologists). The collected datasets were used to train the ES learning algorithms and tested with inference datasets.

CHAPTER FOUR

IMPLEMENTATION

The methodology adopted in this research project comprises four stages carried out after data cleaning and preprocessing (all datasets used for this project were already preprocessed and cleaned in the mini-MIAS database): first, acquisition of the images from the mini-MIAS database; second, extraction of features from the mammograms; third, selection of the most relevant features; and fourth, classification to identify the appropriate class of each mammogram. The suspicious parts were extracted from the mammograms using texture features. The database for this experiment is taken from mini-MIAS; it contains 322 mammograms, of which 270 are normal (non-cancerous) and 52 are malignant (cancerous). Every image in this database is 1024 × 1024 pixels, and the database can be accessed easily. Sample images of the normal and malignant classes are available in the mini-MIAS database.

Texture features are extracted using the GLCM along 0° for each mammogram. Features represent the image in a specific format that focuses on the relevant information. In the next stage, features are selected for training and testing; this stage is very important because classification accuracy depends mainly on careful selection of features. In the final step the mammograms are classified; for this research work a neural network is used as the classifier to distinguish the mammograms and classify them into the normal and malignant classes.

4.1 FEATURES EXTRACTION USING GLCM

Feature extraction plays a vital role in pattern classification. Gray Level Co-occurrence Matrix (GLCM) features are computed along 0° for all mammograms. In the proposed system, 10 texture features defined by Haralick et al. and listed in Table 1 are extracted from the texture feature sub-space based on the GLCM. The number of gray levels in an image determines the size of the GLCM. In the formulas for these features, n denotes the number of grey levels used, and the matrix element Q(i,j) is the relative frequency with which two pixels, separated by a given pixel distance, occur within a given neighbourhood with intensities i and j. The texture features derived from the GLCM are described below.

Variables                         Image 1   Image 2   Image 3   Image 4   Image 5
Contrast                            0.024     0.037     0.042     0.038     0.038
Correlation                         0.996     0.995     0.995     0.995     0.995
Entropy                             1.195     1.233     1.277     1.262     1.115
Sum of square variance              6.986     8.385     9.065     8.723     7.535
Sum average                         3.980     4.356     4.585     4.399     3.993
Sum variance                       20.170    24.691    26.565    25.733    22.786
Sum entropy                         1.180     1.211     1.256     1.244     1.098
Difference variance                 0.024     0.037     0.042     0.038     0.038
Difference entropy                  0.092     0.124     0.123     0.097     0.089
Info measure of correlation 1      -0.925    -0.905    -0.906    -0.926    -0.925

Table 1: Statistical values for sample images.

Contrast

It measures the grey-level difference between a reference pixel and its neighbour; the variance present in the mammogram is measured through it. Its value is high when Q(i,j) shows large variation across the matrix.

Correlation

Correlation shows the linear dependency of grey values. The value of correlation is high when the mammogram contains a considerable amount of linear structure. Its formula uses the means and variances of the marginal distributions Qx(i) and Qy(j).

Entropy

Entropy is a measure of randomness; it also describes the distribution variance in a region.

Sum of square variance

It describes the variation between two dependent variables. Variance puts relatively high weight on elements that differ from the average value of Q(i,j).

Sum average

This is the relation between the clear and dense areas in a mammogram.

Sum variance

It reveals spatial heterogeneity of an image.

Sum entropy

It is a measure of the sum of micro (local) differences in an image.

Difference entropy

This is a measure of the variability of micro differences.

Information measure of correlation

In this feature two derived arrays are used: the first array represents the summation of the rows and the second the summation of the columns of the GLCM.

Difference variance

Local variability can be measured through it.

The above ten features are calculated for all mammograms; their values for five sample mammograms are shown in Table 1.
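A sketch of how a GLCM along 0° and a few of these texture features could be computed in Python with scikit-image and NumPy is given below. The image is synthetic, only a subset of the ten Haralick features is shown, and the formulas in the comments follow the standard GLCM definitions rather than this project's exact implementation (older scikit-image versions name the functions greycomatrix/greycoprops).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Synthetic 8-bit "mammogram" patch (illustrative only).
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# GLCM along 0 degrees, pixel distance 1, normalised so its entries sum to 1.
glcm = graycomatrix(image, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

contrast = graycoprops(glcm, "contrast")[0, 0]        # sum over i,j of (i - j)^2 * Q(i, j)
correlation = graycoprops(glcm, "correlation")[0, 0]  # linear dependency of grey values

# Entropy computed directly from the normalised matrix: -sum Q(i,j) * log2 Q(i,j).
Q = glcm[:, :, 0, 0]
entropy = -np.sum(Q[Q > 0] * np.log2(Q[Q > 0]))

print(contrast, correlation, entropy)
```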

4.2 FEATURES SELECTION

Feature subset selection is used to reduce the feature space, which helps to reduce the computation time. This is achieved by removing noisy, redundant and irrelevant features, i.e. it selects the effective features needed to obtain the desired output. For this research work, a feature-ranking method is used to select the optimal features that contribute most towards the target output. This function rearranges the features from top to bottom according to their contribution. In this work the top six ranked features are selected for training the network. The list of selected features is shown in Table 2.

Rank   Optimal feature
F1     Sum variance
F2     Sum of square variance
F3     Correlation
F4     Sum entropy
F5     Entropy
F6     Difference variance

Table 2: Optimal features selected using the rank method.
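A comparable ranking step can be sketched in Python with scikit-learn, using an ANOVA F-score to rank features and keep the top six; this is an illustrative stand-in for the rank method described above, and the data used here is random rather than the project's GLCM features.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(2)

# Hypothetical data: 100 mammograms, 10 GLCM texture features, binary labels.
X = rng.normal(size=(100, 10))
y = rng.integers(0, 2, size=100)

# Rank features by their ANOVA F-score against the class label
# and keep the six that contribute most to the target output.
selector = SelectKBest(score_func=f_classif, k=6)
X_top6 = selector.fit_transform(X, y)

ranking = np.argsort(selector.scores_)[::-1]
print("features ranked from most to least informative:", ranking)
print("reduced feature matrix shape:", X_top6.shape)  # (100, 6)
```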

4.3 CLASSIFICATION

An artificial neural network (ANN) classifier is used in this work, as it is a commonly used classifier for breast cancer classification. A neural network is composed of simple elements, inspired by biological neurons, that operate in parallel. The neural network is trained to perform a specific function by adjusting the weights between the elements until the desired output is obtained: the network is adjusted based on a comparison between its output and the corresponding target until the network output matches the target. The ANN classifier is based on two steps, training and testing, and the classification accuracy depends on the training.

From the selected database, 70% of the data is used for training, 15% for testing and the remaining 15% for validation. The neural network contains three layers, namely the input layer, the hidden layer and the output layer. The parameters used for the artificial neural network are shown in the training tool's training window.

The Levenberg-Marquardt training function is used for training the network; it shows good results in training and classification. Other training functions, such as resilient back-propagation and conjugate gradient with Powell restarts, were also tried; from all these training functions, Levenberg-Marquardt was selected by comparing classification accuracy, training time to convergence and mean squared error. The optimized network architecture used in this study has 20 hidden neurons; this architecture was selected by observing the mean squared error (MSE) for different numbers of hidden neurons.
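For comparison, the sketch below builds a similar classifier in Python with scikit-learn: one hidden layer of 20 neurons and a 70/15/15 split. scikit-learn does not provide Levenberg-Marquardt training, so the L-BFGS solver is used as a stand-in; the data is synthetic, and the whole example is an illustration rather than a reproduction of this project's setup.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)

# Synthetic stand-in for the selected GLCM features: 322 samples, 6 features.
X = rng.normal(size=(322, 6))
y = rng.integers(0, 2, size=322)   # 0 = normal, 1 = malignant (illustrative labels)

# 70% training, then the remainder split evenly into validation and test (15% each).
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.70, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=0)

# One hidden layer with 20 neurons; L-BFGS as a stand-in for Levenberg-Marquardt.
clf = MLPClassifier(hidden_layer_sizes=(20,), solver="lbfgs", max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

print("validation accuracy:", clf.score(X_val, y_val))
print("test accuracy:", clf.score(X_test, y_test))
```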

4.4 REGRESSION ANALYSIS

Regression analysis is a statistical process for estimating the association among variables. In the regression plot, the outputs from the network are plotted against the target values. A perfect fit is indicated by the dotted line, while the solid line shows the output from the network; the solid line coincides with the dotted line only if the classifier predicts with 100% accuracy, and the difference between the two lines shows that some samples are not correctly predicted by the network. The data points are represented by circles. In the plot obtained for this network the value of R is 0.718; this value also reflects the accuracy of the results, and a value of R equal to 1 would indicate 100% prediction.
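The regression value R between network outputs and targets could be computed as sketched below in Python; the output and target vectors are hypothetical values used only for illustration.

```python
import numpy as np

# Hypothetical network outputs and corresponding targets (0 = normal, 1 = malignant).
outputs = np.array([0.1, 0.9, 0.2, 0.8, 0.4, 0.7])
targets = np.array([0, 1, 0, 1, 1, 1])

# R is the linear correlation coefficient between outputs and targets;
# R = 1 would mean the outputs follow the targets perfectly.
R = np.corrcoef(outputs, targets)[0, 1]
print(round(R, 3))
```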

4.5 NEURAL NETWORK PERFORMANCE EVALUATION

The problem under evaluation is binary classification; the parameters used for evaluation are accuracy, specificity and sensitivity. These parameters are defined as:

Sensitivity = TP / (TP + FN) × 100

Specificity = TN / (TN + FP) × 100

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100

Here TP is the number of true positives, TN the number of true negatives, and FP and FN the numbers of false positives and false negatives respectively. Sensitivity measures the percentage of correctly predicted cancer cases, specificity measures the percentage of correctly predicted benign/normal cases, and accuracy is the percentage of correctly predicted cancer and normal cases overall. The data is rotated five times and the best result of the five folds is reported below. The training set contains 226 mammograms, of which 193 are normal and 33 are malignant; the network classified all 33 malignant cases correctly, while 2 of the 193 normal cases were misclassified. The validation set comprises 48 samples (42 normal, 6 malignant), all of which were predicted correctly. The test set consists of 48 samples (37 normal, 11 malignant); prediction accuracy is 100% for this dataset.

Overall results using the mini-MIAS database

Database division   Specificity   Sensitivity   Accuracy
Training            99.0%         100%          99.1%
Validation          100%          100%          100%
Test                100%          100%          100%
Overall             99.3%         100%          99.4%

Table 3: Overall summary of the results.
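The sketch below computes these three measures in Python from confusion-matrix counts. The counts mirror the training-set figures quoted above (33 malignant cases all detected, 191 of 193 normal cases correct), assuming the two misclassified normal cases were flagged as malignant (false positives).

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy as percentages."""
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    return sensitivity, specificity, accuracy

# Training-set counts implied by the figures above:
# TP = 33 malignant detected, TN = 191 normals correct, FP = 2 normals misclassified, FN = 0.
print(evaluate(tp=33, tn=191, fp=2, fn=0))
# -> (100.0, ~98.96, ~99.12), i.e. the 100% / 99.0% / 99.1% reported for training in Table 3.
```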

CHAPTER FIVE

5.1 SUMMARY

Neural networks offer a different way to analyze data, and to recognize patterns

within that data, than traditional computing methods. However, they are not a solution for

all computing problems. Traditional computing methods work well for problems that can be

well characterized. Balancing checkbooks, keeping ledgers, and keeping tabs on inventory are well defined and do not require the special characteristics of neural networks.

Traditional computers are ideal for many applications. They can process data, track

inventories, network results, and protect equipment. These applications do not need the

special characteristics of neural networks.

Expert systems are an extension of traditional computing and are sometimes called the

fifth generation of computing. (First generation computing used switches and wires. The

second generation occurred because of the development of the transistor. The third generation

involved solid-state technology, the use of integrated circuits, and higher level languages like

COBOL, Fortran, and "C". End user tools, "code generators," are known as the fourth

generation.) The fifth generation involves artificial intelligence.

Typically, an expert system consists of two parts, an inference engine and a

knowledge base. The inference engine is generic. It handles the user interface, external files,

program access, and scheduling. The knowledge base contains the information that is specific

to a particular problem. This knowledge base allows an expert to define the rules which

govern a process. This expert does not have to understand traditional programming. That

person simply has to understand both what he wants a computer to do and how the

mechanism of the expert system shell works. It is this shell, part of the inference engine that

actually tells the computer how to implement the expert's desires. This implementation occurs

by the expert system generating the computer's programming itself; it does that through

"programming" of its own. This programming is needed to establish the rules for a particular

application. This method of establishing rules is also complex and does require a detail-oriented person.

Efforts to make expert systems general have run into a number of problems. As the

complexity of the system increases, the system simply demands too much computing

resources and becomes too slow. Expert systems have been found to be feasible only when

narrowly confined.

Artificial neural networks offer a completely different approach to problem solving

and they are sometimes called the sixth generation of computing. They try to provide a tool

that both programs itself and learns on its own. Neural networks are structured to provide the

capability to solve problems without the benefits of an expert and without the need of

programming. They can seek patterns in data that no one knows are there.

Expert systems have enjoyed significant successes. However, artificial intelligence

has encountered problems in areas such as vision, continuous speech recognition and

synthesis, and machine learning. Artificial intelligence also is hostage to the speed of the

processor that it runs on. Ultimately, it is restricted to the theoretical limit of a single

processor. Artificial intelligence is also burdened by the fact that experts don't always speak

in rules.

Yet, despite the advantages of neural networks over both expert systems and more

traditional computing in these specific areas, neural nets are not complete solutions. They do not offer an ironclad capability such as that of a debugged accounting system. They learn, and

as such, they do continue to make "mistakes." Furthermore, even when a network has been

developed, there is no way to ensure that the network is the optimal network.

Neural systems do exact their own demands. They do require their implementor to

meet a number of conditions. These conditions include:

• a data set which includes the information that can characterize the problem;

• an adequately sized data set to both train and test the network;

• an understanding of the basic nature of the problem to be solved, so that basic first-cut decisions on creating the network can be made (these decisions include the activation and transfer functions, and the learning methods);

• an understanding of the development tools;

• adequate processing power (some applications demand real-time processing that exceeds what is available in standard, sequential processing hardware; the development of hardware is the key to the future of neural networks).

Once these conditions are met, neural networks offer the opportunity of solving problems in an arena where traditional processors lack both the processing power and a step-by-step methodology. A number of very complicated problems cannot be solved in traditional computing environments. For example, speech is something that all people can easily parse and understand: a person can understand a southern drawl, a Bronx accent, and the slurred words of a baby. Without the massively parallel processing power of a neural network, this task is virtually impossible for a computer. Image recognition is another task that a human can do easily but which stymies even the biggest of computers; a person can recognize a plane as it turns, flies overhead, and disappears into a dot, whereas a traditional computer can only try to compare the changing images to a number of very different stored patterns.

This new way of computing requires skills beyond traditional computing, but it is a natural evolution. Initially, computing was only hardware, and engineers made it work. Then there were software specialists: programmers, systems engineers, database specialists, and designers. Now there are also neural architects. This new professional needs to be more skilled than his predecessors; for instance, he will need to know statistics in order to choose and evaluate training and testing situations (a brief evaluation sketch follows). This skill of making neural networks work is one that will challenge the purely logical thinking of current software engineers.
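One place where that statistical skill shows up is in evaluation: rather than trusting a single train/test figure, a network can be assessed with cross-validation. The following is a minimal sketch, again assuming scikit-learn and its bundled breast cancer data set rather than this project's data.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    # Scaling and the classifier are wrapped in one pipeline so that each fold
    # is scaled only on its own training portion.
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,),
                                        max_iter=2000, random_state=0))

    # Stratified 5-fold cross-validation: a mean accuracy and its spread,
    # not a single, possibly lucky, split.
    scores = cross_val_score(model, X, y, cv=StratifiedKFold(n_splits=5))
    print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")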

5.2 CONCLUSION

Neural networks offer a unique way to solve some problems while making their own demands. The biggest demand is that the process is not simply logic; it involves an empirical skill, an intuitive feel for how a network might be created. Current state-of-the-art artificial neural networks for general image analysis are able to detect cancer in mammograms with accuracy similar to that of radiologists, even in a screening-like cohort with low breast cancer prevalence.

5.3 RECOMMENDATION

To reduce the death rate due to breast cancer, it is essential that the cancer be identified at an early stage. Early detection of breast cancer can be enhanced with a well-trained unsupervised artificial neural network offering precise and accurate prediction and high performance metrics; this should be explored by future researchers. Future research work should focus on prediction of malignant tumors and computer-aided treatment prescription. In this research project, 10 texture features computed from the GLCM along the 0° direction were considered, and the feature space was then reduced to 6 features (a brief extraction sketch follows). In future work, more features can be considered, and other datasets can be used to increase the robustness of the system.
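As a pointer for such future work, the following is a minimal sketch of GLCM texture extraction along the 0° direction, assuming scikit-image (graycomatrix/graycoprops, available in skimage 0.19 and later) and a pre-segmented 8-bit mammogram region of interest; it computes a handful of standard GLCM properties, not necessarily the exact ten or six features used in this project.

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def glcm_features_0deg(roi, levels=256):
        """GLCM at distance 1 and angle 0 degrees, then standard texture properties."""
        glcm = graycomatrix(roi, distances=[1], angles=[0.0],
                            levels=levels, symmetric=True, normed=True)
        props = ("contrast", "dissimilarity", "homogeneity",
                 "energy", "correlation", "ASM")
        return {p: float(graycoprops(glcm, p)[0, 0]) for p in props}

    if __name__ == "__main__":
        # Hypothetical 8-bit region of interest; a real mammogram ROI goes here.
        rng = np.random.default_rng(0)
        roi = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
        print(glcm_features_0deg(roi))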


