Download as pdf or txt
Download as pdf or txt
You are on page 1of 42

CLASSIFICATION OF LUNG CANCER NODULES TO

MONITOR PATIENT’S HEALTH USING NEURAL NETWORK


TOPOLOGY WITH SVM ALGORITHM & COMPARE WITH K-
MEANS ACCURACY

MINI PROJECT REPORT


Submitted by
THIPPARAPU SHARANYA (20RH1A04N4)
VORSU CHANDRIKA (20RH1A04P5)
SURA TEJASHWINI (21RH5A0420)
TOLICHUKKA SARALA DEVI (20RH1A04N7)

Under the Esteemed Guidance of


DR. L. MALLIGA
Professor
in partial fulfillment of the Academic Requirements for the Degree of

BACHELOR OF TECHNOLOGY
Electronics and Communication Engineering

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN


(Autonomous Institution-UGC, Govt. of India)
Accredited by NBA & NAAC with ‘A’ Grade
National Ranking by NIRF Innovation-Rank band (151-300), MHRD, Govt. of India
Maisammaguda, Dhulapally, Secunderabad – 500 100
2023-24
MALLA REDDY ENGINEERING COLLEGE FOR WOMEN
(Autonomous Institution-UGC, Govt. of India)
Accredited by NBA & NAAC with ‘A’ Grade
National Ranking by NIRF Innovation-Rank band (151-300), MHRD, Govt. of India
Maisammaguda, Dhulapally, Secunderabad – 500 100

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE

This is to certify that the Mini Project work entitled “Classification of Lung cancer

Nodules to monitor patients health using Neural network Topology with SVM algorithm &

compare with K-means accuracy” is carried out by T.SHARANYA (20RH1A04N4),

V.CHANDRIKA (20RH1A04P5) , S.TEJASHWINI (21RH5A0420) & T.SARALA DEVI

(20RH1A04N7)in partial fulfillment for the award of degree of BACHELOR OF

TECHNOLOGY in ELECTRONICS AND COMMUNICATION ENGINEERING,

Jawaharlal Nehru Technological University, Hyderabad during the academic year 2023-2024.

Supervisor’s Signature Head of the Department


DR. L. MALLIGA Dr. K. SUDHAKAR
Professor Professor

External Examiner
H.no:3rd floor,
Above Sbi Bank, Arunodaya Colony, Hi-
Tech Theatre Line,Opp: Metro Pillar No. C1758,
M a d h a p u r , Hyderabad,

PROJECT COMPLETION CERTIFICATE


This is to certify that the following students of “MALLA REDDY
ENGINEERING COLLEGE FOR WOMEN” have undergone project
training titled “Classification of Lung cancer nodules to monitor patients
health using neural network topology with SVM Algorithm & compare
with K-Means Accuracy” in our company.

1. T. Sharanya (20RH1A04N4)
2. V. Chandrika (20RH1A04P5)
3. S. Tejashwini (21RH5A0420)
4. T. Sarala Devi (20RH1A04N7)

They have successfully completed their project. This is based on bonafide


work carried out at Information Technology division, EDUGENE
TECHNOLOGIES PVT. LTD. under our guidance. Their performance
and conduct during to this project was satisfactory.

Warm regards, Managing


Director, Edugene
Technologies ,
Information Technology and Services.
ACKNOWLEDGEMENT

We feel ourselves honored and privileged to place our warm salutation to our college
Malla Reddy Engineering College for Women and Department of Electronics and
Communication Engineering which gave us the opportunity to have expertise in engineering
and profound technical knowledge.
We would like to deeply thank our Honorable Minister of Telangana State Sri.Ch.
Malla Reddy Garu, founder chairman MRGI, the largest cluster of institutions in the state of
Telangana for providing us with all the resources in the college to make our project success.
We wish to convey gratitude to our Principal Dr. Y. Madhavee Latha, for providing
us with the environment and mean to enrich our skills and motivating us in our endeavor and
helping us to realize our full potential.
We express our sincere gratitude to our Head of the Department Dr. K. Sudhakar, of
for inspiring us to take up a project on this subject and successfully guiding us towards its
completion.
We would also like to thank our Project coordinator Dr.S.Rajkumar, for his kind
encouragement and overall guidance in viewing this program a good asset with profound
gratitude.
We would like to thank our internal guide Dr.L.Malliga, and all the Faculty members
for their valuable guidance and encouragement towards the completion of our project work.

With Regards and Gratitude


T.SHARANYA (20RH1A04N4)
V.CHANDRIKA (20RH1A04P5)
S.TEJASWINI (21RH5A0420)
T.SARALA DEVI (20RH1A04N7)
DECLARATION

We hereby declare that our Mini Project entitled “Classification of Lung cancer

Nodules to monitor patients health using Neural network Topology with SVM algorithm

& compare with K-means accuracy” submitted to Malla Reddy Engineering College for

Women(Autonomous Institution), Permanently affiliated to Jawaharlal Nehru

Technological University, Hyderabad for the award of the Degree of Bachelor of

Technology in Electronics and Communication Engineering is a result of original research

work done by us.

It is declared that the project report or any part thereof has not been previously

submitted to any University or Institute for the award of Degree.

T.SHARANYA (20RH1A04N4)
V.CHANRIKA (20RH1A04P5)
S.TEJASHWINI (21RH5A0420)
T.SARALA DEVI (20RH1A04N7)
CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

ABSTRACT

Lung nodules are vital indicators for the presence of lung cancer. An early detection enhances the survival
rate of the patient by starting treatment at the right time. The detection and classification of malignancy in
Computed Tomography (CT) images is a very time-consuming and difficult task for radiologists which lead
the researchers to develop algorithms for Computer-Aided Diagnosis (CAD) systems to mitigate this burden.
The performance of CAD systems is continuously improving by using various deep learning techniques for
screening of lung cancer. In this paper, we proposed transferable texture Convolutional Neural Networks
(CNN) to improve the classification performance of pulmonary nodules in CT scans. An Energy Layer (EL)
is incorporated in our scheme, which extracts texture features from the convolutional layer. The inclusion of
EL reduces the number of learnable parameters of the network, which further reduces the memory
requirements and computational complexity. The proposed model has only three convolutional layers and
one EL, instead of pooling layer. Overall proposed CNN architecture comprises of nine layers for automatic
feature extraction and classification of pulmonary nodule candidates as malignant or benign. Furthermore,
the pre-trained model of proposed CNN is also used to handle the smaller dataset classification problem by
using transfer learning.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 1


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

INDEX
TITLE PAGENO
CHAPTER 1- INTRODUCTION 2
1.1 Introduction 2

CHAPTER 2- LITERATURE SURVEY 3

CHAPTER 3-SYSTEM ANALYSIS 5-8

3.1 Existing system 5

3.2 Proposed system 5

3.3 System architecture 6

3.4 System requirements 6-8

CHAPTER 4-SOFTWARE DESCRIPTION 9-28


4.1 System Study, Feasibility study 9
4.2 System Design 9-11

4.3 System Environment 11-15

4.4 Software Procedure 15-23

4.5 Project Code 23-28

CHAPTER 5- RESULT ANALYSIS 29-35

CONCLUSION 36

REFERENCES 37

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 2


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

1. INTRODUCTION
Lung cancer growth has turned out to be a standout amongst the most widely recognized reasons
for disease in the two people. Countless bite the dust each year because of lung malignancy. The
illness has diverse stages whereby it begins from the little tissue and spreads all through the
distinctive territories of the lungs by a procedure called metastasis. It is the uncontrolled
development of undesirable cells in the lungs. It is assessed that around 12,203 people had lung
disease in 2016, 7130 guys and 5073 females; passing from lung malignant growth in 2016 were
8839. . Biomedical image handling is the most recent rising apparatus in medicinal research
utilized for the early recognition of malignancies. Biomedical image handling strategies can be
utilized in the restorative field to analysis maladies at the beginning time. It utilizes biomedical
images, for example, X-beams, Computed innovation and MRIs. The principle commitment of
image handling in the restorative field is to analysis the malignant growth at the beginning time,
expanding survival rates. The time factor is basic for tumors of the mind, the lungs, and bosoms.
image handling can identify these malignant growths in the early periods of the maladies
encouraging an early treatment process. The image preparing procedure comprises of four essential
stages, pre-handling, division, including extraction and grouping. This paper presents image
preparing procedures whereby the CT examine image is utilized as information image, is handled
and beginning period lung disease is distinguished utilizing an SVM (bolster vector machine)
calculation as a classifier in the grouping stage to improve exactness, affectability, and
explicitness. First the image is pre-handled and divided. After that Features are removed from the
sectioned image lastly the image is delegated ordinary or destructive. Advanced image handling is
the utilization of PC calculations to perform image preparing on computerized images. As a
subfield of advanced flag preparing, computerized image handling has numerous points of interest
over simple image preparing.[19][2] It permits a lot more extensive scope of calculations to be
connected to the information data — the point of advanced image handling is to improve the image
information (Features) by stifling undesirable mutilations as well as upgrade of some vital image
includes with the goal that our AIComputer Vision models can profit by this improved information
to take a shot at. Feature extraction begins from an underlying arrangement of estimated
information and assembles determined qualities (Features) proposed to be useful and non-excess,
encouraging the resulting learning and speculation steps, and at times prompting better human
elucidations.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 3


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

2. LITERATURE SURVEY
2.1 Image retrieval system based on feature extraction and relevance feedback

AUTHORS: D. Harini, & D. Bhaskari

Abstract: The availability of huge multimedia databases and the development of information
highways have urged many researchers for developing effective methods of retrieval based on
their content. The traditional way of searching the available huge collections of multimedia data
was by keyword indexing or simply by browsing, where by the user's main interest lies in the
maximum retrieval of similar data. Digital image databases however, opened the way to content-
based searching and retrieval. A lot of research has been done in retrieving the content based on
image features like color, texture, and shape.

2.2 Tumour detection in non-stationary backgrounds

AUTHORS: R.N. Strickland

Abstract: The authors introduce two detectors which they use to locate simulated tumors of fixed
size in clinical gamma-ray images. The first method was conceived when it was observed that
small tumours possess an identifiable signature in curvature feature space, where "curvature" is
the local curvature of the image data when viewed as a relief map. Computed curvature values are
mapped to a normalized significance space using a windowed statistic. The resulting test statistic
is thresholder at a chosen level of significance to give a positive detection. Nonuniform anatomic
background activity is effectively suppressed. The second detector is an adaptive prewhitening
matched filter, which uses a form of preprocessing known as statistical scaling to adaptively
prewhiten the background.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 4


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

3. SYSTEM ANALYSIS

3.1 EXISTING SYSTEM


In existing paper, a picture handling procedure has been utilized to recognize beginning time lung
malignant growth in CT examine pictures. The CT filter picture is pre-prepared pursued by
division of the ROI of the lung. Discrete waveform Transform is connected for picture pressure
and highlights are extricated utilizing a GLCM. The outcomes are encouraged into a SVM
classifier to decide whether the lung picture is carcinogenic or not. The SVM classifier is assessed
dependent on a LIDC dataset.

Disadvantages:
1. The CT filter picture is pre-prepared pursued by division of the ROI of the lung.

2. Discrete waveform Transform is connected for picture pressure and highlights are extricated
utilizing a GLCM.

3.2 PROPOSED SYSTEM


The proposed model applies a range of algorithms to the different stages of image processing. In
this proposed model, first the CT scan image is pre-processed and the ROI (region of interest) is
separated in preparation for segmentation.[17] At the segmentation stage, Discrete Wavelet
Transform (DWT) is applied and the feature is extracted by using a GLCM (Gray level co-
occurrence matrix) such as correlation, entropy, variance, contrast, dissimilarity and energy. After
the feature extraction stage, classification is carried out by an SVM (support vector machine) for
classification of cancerous and non-cancerous nodules.

Advantages:
1. The classification is the major portion where the cancerous and non-cancerous is identified with
the pre trained model.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 5


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

3.3 SYSTEM ARCHITECTURE:

3.4 SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

• System : Pentium IV 2.4 GHz.


• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• Ram : 512 Mb.

SOFTWARE REQUIREMENTS:
• Operating System: Windows

• Coding Language: Python 3.7

DATA FLOW DIAGRAM:


1. The DFD is also called as bubble chart. It is a simple graphical formalism that can be used
to represent a system in terms of input data to the system, various processing carried out
on this data, and the output data is generated by this system.
2. The data flow diagram (DFD) is one of the most important modelling tools. It is used to
model the system components. These components are the system process, the data used by

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 6


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

the process, an external entity that interacts with the system and the information flows in
the system

User

Check Unauthorized
user

Upload Lung Cancer Dataset

Read &split Dataset to Train & Test

Execute SVM Algorithms

Execute K-Means Algorithm

Predict Lung Cancer

Accuracy Graph

End process

UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose
modeling language in the field of object-oriented software engineering. The standard is managed,
and was created by, the Object Management Group.
The goal is for UML to become a common language for creating models of object oriented
computer software. In its current form UML is comprised of two major components: a Meta-model

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 7


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

and a notation. In the future, some form of method or process may also be added to; or associated
with, UML.

USE CASE DIAGRAM:


A use case diagram in the Unified Modeling Language (UML) is a type of behavioral
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented as
use cases), and any dependencies between those use cases.

IMPLEMENTATION:
MODULES:
Upload Lung Cancer Dataset
Read &split Dataset To Train & Test
Execute SVM Algorithms
Execute K-Means Algorithm
Predict Lung Cancer
Accuracy Graph

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 8


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

4.SOFTWARE DESCRIPTION:

4.1 SYSTEM STUDY

FEASIBILITY STUDY

The feasibility of the project is analyzed in this phase and business proposal is
put forth with a very general plan for the project and some cost estimates. During system
analysis the feasibility study of the proposed system is to be carried out.

Three key considerations involved in the feasibility analysis are

 ECONOMICAL FEASIBILITY
 TECHNICAL FEASIBILITY
 SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY: This study is carried out to check the economic impact
that the system will have on the organization. The amount of fund that the company can pour into
the research and development of the system is limited.

TECHNICAL FEASIBILITY: This study is carried out to check the technical feasibility,
that is, the technical requirements of the system. Any system developed must not have a high
demand on the available technical resources. This will lead to high demands on the available
technical resources.
SOCIAL FEASIBILITY: The aspect of study is to check the level of acceptance of the
system by the user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, instead must accept it as a necessity.

4.2 SYSTEM DESIGN:

Guido Van Rossum published the first version of Python code (version 0.9.0) at alt. sources in
February 1991. This release included already exception handling, functions, and the core data
types of list, dict, str and others. It was also object oriented and had a module system.
Python version 1.0 was released in January 1994. The major new features included in this release

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 9


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

were the functional programming tools lambda, map, filter and reduce, which Guido Van
Rossum never liked. Six and a half years later in October 2000, Python 2.0 was introduced.

 Print is now a function


 Views and iterators instead of lists

Modules Used in Project:-

TensorFlow

TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library, and is also used for machine
learning applications such as neural networks. It is used for both research and production
at Google.

NumPy

NumPy is a general-purpose array-processing package. It provides a high-performance


multidimensional array object, and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. It contains various features
including these important ones:

 A powerful N-dimensional array object


 Sophisticated (broadcasting) functions
 Tools for integrating C/C++ and Fortran code
Pandas

Pandas is an open-source Python Library providing high-performance data manipulation and


analysis tool using its powerful data structures. Python was majorly used for data munging and
preparation. It had very little contribution towards data analysis.

Matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a


variety of hardcopy formats and interactive environments across platforms. Matplotlib can be

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 10


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web application
servers, and four graphical user interface toolkits.

Scikit – learn

Scikit-learn provides a range of supervised and unsupervised learning algorithms via a


consistent interface in Python. It is licensed under a permissive simplified BSD license and is
distributed under many Linux distributions, encouraging academic and commercial use.

Install Python Step-by-Step in Windows and Mac :

Python a versatile programming language doesn’t come pre-installed on your computer devices.
Python was first released in the year 1991 and until today it is a very popular high-level
programming language. Its style philosophy emphasizes code readability with its notable use of
great whitespace.

How to Install Python on Windows and Mac :


There have been several updates in the Python version over the years. The question is how to
install Python? It might be confusing for the beginner who is willing to start learning Python but
this tutorial will solve your query. The latest or the newest version of Python is version 3.7.4 or
in other words, it is Python 3.
Before you start with the installation process of Python. First, you need to know about
your System Requirements. Based on your system type i.e. operating system and based
processor, you must download the python version. My system type is a Windows 64-bit
operating system.

4.3 SYSTEM ENVIRONMENT:

HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Ram : 512 Mb.
SOFTWARE REQUIREMENTS:
• Operating system : - Windows.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 11


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

• Coding Language : python

What is Python :-
Python is currently the most widely used multi-purpose, high-level programming language.

Python allows programming in Object-Oriented and Procedural paradigms. Python programs


generally are smaller than other programming languages like Java.

The biggest strength of Python is huge collection of standard library which can be used
for the following –

 Machine Learning
 GUI Applications (like Kivy, Tkinter, PyQt etc. )
 Web frameworks like Django (used by YouTube, Instagram, Dropbox)

Advantages of Python :-

1. Extensive Libraries

Python downloads with an extensive library and it contain code for various purposes like
regular expressions, documentation-generation, unit-testing, web browsers, threading,
databases, CGI, email, image manipulation, and more. So, we don’t have to write the
complete code for that manually.
2. Object-Oriented
This language supports both the procedural and object-oriented programming paradigms.
While functions help us with code reusability, classes and objects let us model the real world.
A class allows the encapsulation of data and functions into one.

What is Machine Learning : -

Before we take a look at the details of various machine learning methods, let's start by looking
at what machine learning is, and what it isn't. Machine learning is often categorized as a
subfield of artificial intelligence, but I find that categorization can often be misleading at first
brush. The study of machine learning certainly arose from research in this context, but in the
data science application of machine learning methods, it's more helpful to think of machine
learning as a means of building models of data.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 12


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Need for Machine Learning

Human beings, at this moment, are the most intelligent and advanced species on earth because
they can think, evaluate and solve complex problems. On the other side, AI is still in its initial
stage and haven’t surpassed human intelligence in many aspects. Then the question is that
what is the need to make machine learn? The most suitable reason for doing this is, “to make
decisions, based on data, with efficiency and scale”.Challenges in Machines Learning :-

Applications of Machines Learning :-

Machine Learning is the most rapidly growing technology and according to researchers we are
in the golden year of AI and ML. Emotion analysis

 Sentiment analysis

 Error detection and prevention

How to start learning ML?

 This is a rough roadmap you can follow on your way to becoming an insanely talented Machine
Learning Engineer. Of course, you can always modify the steps according to your needs to
reach your desired end-goal!

Step 1 – Understand the Prerequisites

(a) Learn Linear Algebra and Multivariate Calculus

Both Linear Algebra and Multivariate Calculus are important in Machine Learning. However,
the extent to which you need them depends on your role as a data scientist. If you are more
focused on application heavy machine learning, then you will not be that heavily focused on
maths as there are many common libraries available.

(b) Learn Statistics


Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML expert
will be spent collecting and cleaning data. And statistics is a field that handles the collection,
analysis, and presentation of data.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 13


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

(c) Learn Python

Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn them
as they go along with trial and error. But the one thing that you absolutely cannot skip is Python!
While there are other languages you can use for Machine Learning like R, Scala, etc. Python is
currently the most popular language for ML.

Step 2 – Learn Various ML Concepts

Now that you are done with the prerequisites, you can move on to actually learning ML (Which
is the fun part!!!) It’s best to start with the basics and then move on to the more complicated
stuff. Some of the basic concepts in ML are:

(a) Terminologies of Machine Learning

 Model – A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called a hypothesis.
 Feature – A feature is an individual measurable property of the data. A set of numeric
features can be conveniently described by a feature vector. Feature vectors are fed as input to
the model. For example, in order to predict a fruit, there may be features like color, smell, taste,
etc.
 Target (Label) – A target variable or label is the value to be predicted by our model. For the
fruit example discussed in the feature section, the label with each set of input would be the name
of the fruit like apple, orange, banana, etc.

(b) Types of Machine Learning

 Supervised Learning – This involves learning from a training dataset with labeled data using
classification and regression models. This learning process continues until the required level of
performance is achieved.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 14


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

 Unsupervised Learning – This involves using unlabelled data and then finding the underlying
structure in the data in order to learn more and more about the data itself using factor and cluster
analysis models.
 Semi-supervised Learning – This involves using unlabelled data like Unsupervised Learning
with a small amount of labeled data. Using labeled data vastly increases the learning accuracy
and is also more cost-effective than Supervised Learning.
 Reinforcement Learning – This involves learning optimal actions through trial and error. So
the next action is decided by learning behaviors that are based on the current state and that will
maximize the reward in the future.
Advantages of Machine learning :-

1. Easily identifies trends and patterns: Machine Learning can review large volumes of
data and discover specific trends and patterns that would not be apparent to humans. For
instance, for an e-commerce website like Amazon, it serves to understand the browsing
behaviours and purchase histories of its users to help cater to the right products, deals, and
reminders relevant to them. It uses the results to reveal relevant advertisements to them.

2. No human intervention needed (automation): With ML, you don’t need to babysit
your project every step of the way. Since it means giving machines the ability to learn, it lets
them make predictions and also improve the algorithms on their own. A common example of
this is anti-virus softwares; they learn to filter new threats as they are recognized. ML is also
good at recognizing spam.

4.4 SYSTEM PROCEDURE:


Download the Correct version into the system
Step 1: Go to the official site to download and install python using Google Chrome or any other
web browser. OR Click on the following link: https://www.python.org

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 15


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Now, check for the latest and the correct version for your operating system.

Step 2: Click on the Download Tab.

Step 3: You can either select the Download Python for windows 3.7.4 button in Yellow Color or
you can scroll further down and click on download with respective to their version. Here, we are
downloading the most recent python version for windows 3.7.4

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 16


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Step 4: Scroll down the page until you find the Files option.
Step 5: Here you see a different version of python along with the operating system.

• To download Windows 32-bit python, you can select any one from the three options:
Windows x86 embeddable zip file, Windows x86 executable installer or Windows x86 web-
based installer.
• To download Windows 64-bit python, you can select any one from the three options: Windows
x86-64 embeddable zip file, Windows x86-64 executable installer or Windows x86-64 web-
based installer.
Here we will install Windows x86-64 web-based installer. Here your first part regarding which
version of python is to be downloaded is completed. Now we move ahead with the second part
in installing python i.e. Installation

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 17


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Installation of Python
Step 1: Go to Download and Open the downloaded python version to carry out the installation
process.

Step 2: Before you click on Install Now, Make sure to put a tick on Add Python 3.7 to PATH.

Step 3: Click on Install NOW After the installation is successful. Click on Close.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 18


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

With these above three steps on python installation, you have successfully and correctly installed
Python. Now is the time to verify the installation.
Note: The installation process might take a couple of minutes.
Verify the Python Installation
Step 1: Click on Start
Step 2: In the Windows Run Command, type “cmd”.

Step 3: Open the Command prompt option.


Step 4: Let us test whether the python is correctly installed. Type python –V and press Enter.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 19


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Step 5: You will get the answer as 3.7.4


Note: If you have any of the earlier versions of Python already installed. You must first uninstall
the earlier version and then install the new one.
Check how the Python IDLE works
Step 1: Click on Start
Step 2: In the Windows Run command, type “python idle”.

Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program
Step 4: To go ahead with working in IDLE you must first save the file. Click on File > Click on
Save

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 20


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Step 5: Name the file and save as type should be Python files. Click on SAVE. Here I have named
the files as Hey World.
Step 6: Now for e.g. enter print

4.5 SOFTWARE PROCEDURE:

SYSTEM TEST

The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub assemblies, assemblies and/or a finished product It is the process of exercising
software with the intent of ensuring that the Software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of test. Each test
type addresses a specific testing requirement.

TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly, and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application .it is done after the completion of an individual unit before integration. This is a
structural testing, that relies on knowledge of its construction and is invasive.

Integration testing
Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is more concerned with
the basic outcome of screens or fields.

Functional test
Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system documentation, and
user manuals.
Functional testing is centered on the following items:

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 21


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures : interfacing systems or procedures must be invoked.

System Test

System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results.

White Box Testing


White Box Testing is a testing in which in which the software tester has knowledge
of the inner workings, structure and language of the software, or at least its purpose. It is purpose.
It is used to test areas that cannot be reached from a black box level.

Black Box Testing


Black Box Testing is testing the software without any knowledge of the inner
workings, structure or language of the module being tested. Black box tests, as most other kinds
of tests, must be written from a definitive source document, such as specification or requirements
document, such as specification or requirements document.
Unit Testing

Unit testing is usually conducted as part of a combined code and unit test phase of
the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as
two distinct phases.

Integration Testing
Software integration testing is the incremental integration testing of two or more
integrated software components on a single platform to produce failures caused by interface
defects.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 22


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by
the end user. It also ensures that the system meets the functional requirements.

4.6 PROJECT CODE:


from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
import matplotlib.pyplot as plt
import numpy as np
from tkinter.filedialog import askopenfilename
import pandas as pd
import os
import cv2
import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

main = tkinter.Tk()

main.title("Classification of Lung Cancer Nodules to Monitor Patients Health using Neural


Network topology with SVM algorithm & Compare with K-Means Accuracy")
main.geometry("1300x1200")

global filename

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 23


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

global classifier
global svm_acc, kmeans_acc
global X, Y
global X_train, X_test, y_train, y_test
global pca

def uploadDataset():
global filename
filename = filedialog.askdirectory(initialdir=".")
text.delete('1.0', END)
text.insert(END,filename+" loaded\n");

def splitDataset():
global X, Y
global X_train, X_test, y_train, y_test
global pca
text.delete('1.0', END)

X = np.load('features/X.txt.npy')
Y = np.load('features/Y.txt.npy')
X = np.reshape(X, (X.shape[0],(X.shape[1]*X.shape[2]*X.shape[3])))

pca = PCA(n_components = 100)


X = pca.fit_transform(X)
print(X.shape)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)


text.insert(END,"Total CT Scan Images Found in dataset : "+str(len(X))+"\n")
text.insert(END,"Train split dataset to 80% : "+str(len(X_train))+"\n")
text.insert(END,"Test split dataset to 20% : "+str(len(X_test))+"\n")

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 24


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

def executeSVM():
global classifier
global svm_acc
text.delete('1.0', END)
cls = svm.SVC()
cls.fit(X_train, y_train)
predict = cls.predict(X_test)
svm_acc = accuracy_score(y_test,predict)*100
classifier = cls
text.insert(END,"SVM Accuracy : "+str(svm_acc)+"\n")

def executeKmeans():
global kmeans_acc
kmeans = KMeans(n_clusters=2, random_state=0)
kmeans.fit(X_train)
predict = kmeans.predict(X_test)
kmeans_acc = accuracy_score(y_test,predict)*100
print("KMeans Predicted Labels : "+str(kmeans.labels_))
centroids = kmeans.cluster_centers_
text.insert(END,"K-Means Accuracy : "+str(kmeans_acc)+"\n")

def predictCancer():
filename = filedialog.askopenfilename(initialdir="testSamples")
img = cv2.imread(filename)
img = cv2.resize(img, (64,64))
im2arr = np.array(img)
im2arr = im2arr.reshape(64,64,3)

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 25


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

im2arr = im2arr.astype('float32')
im2arr = im2arr/255
test = []
test.append(im2arr)
test = np.asarray(test)

test = np.reshape(test, (test.shape[0],(test.shape[1]*test.shape[2]*test.shape[3])))


test = pca.transform(test)
predict = classifier.predict(test)[0]
msg = ''
if predict == 0:
msg = "Uploaded CT Scan is Normal"
if predict == 1:
msg = "Uploaded CT Scan is Abnormal"
img = cv2.imread(filename)
img = cv2.resize(img, (400,400))
cv2.putText(img, msg, (10, 25), cv2.FONT_HERSHEY_SIMPLEX,0.7, (0, 255, 255), 2)

cv2.imshow(msg, img)
cv2.waitKey(0)

def graph():

height = [svm_acc, kmeans_acc]

bars = ('SVM Accuracy','KMeans Accuracy')


y_pos = np.arange(len(bars))
plt.bar(y_pos, height)
plt.xticks(y_pos, bars)
plt.show()

font = ('times', 14, 'bold')

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 26


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

title = Label(main, text='Classification of Lung Cancer Nodules to Monitor Patients Health using
Neural Network topology with SVM algorithm & Compare with K-Means Accuracy')

title.config(bg='deep sky blue', fg='white')


title.config(font=font)
title.config(height=3, width=120)
title.place(x=0,y=5)

font1 = ('times', 12, 'bold')


text=Text(main,height=20,width=150)
scroll=Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=50,y=120)
text.config(font=font1)

font1 = ('times', 13, 'bold')


uploadButton = Button(main, text="Upload Lung Cancer Dataset", command=uploadDataset)
uploadButton.place(x=50,y=550)
uploadButton.config(font=font1)

readButton = Button(main, text="Read & Split Dataset to Train & Test", command=splitDataset)
readButton.place(x=350,y=550)
readButton.config(font=font1)

svmButton = Button(main, text="Execute SVM Algorithms", command=executeSVM)


svmButton.place(x=50,y=600)
svmButton.config(font=font1)

kmeansButton = Button(main, text="Execute K-Means Algorithm", command=executeKmeans)

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 27


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

kmeansButton.place(x=350,y=600)
kmeansButton.config(font=font1)

predictButton = Button(main, text="Predict Lung Cancer", command=predictCancer)


predictButton.place(x=50,y=650)
predictButton.config(font=font1)

graphButton = Button(main, text="Accuracy Graph", command=graph)


graphButton.place(x=350,y=650)
graphButton.config(font=font1)

main.config(bg='LightSteelBlue3')
main.mainloop()

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 28


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

5.RESULT ANALYSIS

Classification of Lung Cancer Nodules to Monitor Patients Health using Neural Network
topology with SVM algorithm & Compare with K-Means Accuracy

In this project we are using CT Scan Lung Cancer Nodules dataset to predict patient health using
SVM and KMeans algorithm and then comparing prediction accuracy between them. To
implement this project we are using lung cancer images dataset and below screen showing
dataset details and this dataset saved inside ‘Dataset’ folder.

In above screen in dataset we have two types of images such as normal and abnormal and then
SVM and KMEANS will get train on above dataset and when we upload new image then SVM
will predict whether new image is normal or abnormal. To implement 4 titles we created 4
folders to separate algorithms for each title and you can run one by one

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 29


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

SCREENS SHOT

To run project double click on run.bat file from ‘Title1_SVM_KMeans’ folder to get below
screen

In above screen click on ‘Upload Lung Cancer Dataset’ button and then upload dataset folder

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 30


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

In above screen selecting and uploading ‘Dataset’ folder and then click on ‘Select Folder’ button
to load dataset and to get below screen

In above screen dataset loaded and now click on ‘Read & Split Dataset to Train & Test’ button to
split dataset into train and test parts and application split 80% dataset for training and 20%
dataset to test trained model

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 31


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

In above screen we can see dataset contains total 138 images and then application using 110
images for training and 28 images for testing and now data is ready and now click on ‘Execute
SVM Algorithm’ button to run SVM on loaded dataset and to get below accuracy

In above screen SVM accuracy is 60% and now click on “Execute K-Means Algorithm” button
to run KMEANS algorithm on loaded dataset and to get below screen

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 32


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

In above screen KMEANS got 50% accuracy and now click on ‘Predict Lung Cancer’ button to
upload new test image and then application will give prediction result

In above screen selecting and uploading ‘1.png’ file and then click on ‘Open’ button to get below
result

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 33


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

In above screen uploaded image predicted as Abnormal and now test with another image

In above screen uploading ‘5.png’ and below is the result

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 34


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

Above image predicted as Normal and similarly you can upload any image and perform
prediction and now click on ‘Accuracy Graph’ button to get below graph

In above screen x-axis represents algorithm name and y-axis represents accuracy of those
algorithms and from above graph we can conclude that SVM is better than KMEANS in
prediction.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 35


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

CONCLUSION

In the principal period of the venture the Region of Interest in a picture is distinguished. The
Identified district is situated in an item. The highlights in the picture are distinguished by utilizing
some picture handling system. In second period of the task the component removed information is
then used to arrange the picture is destructive or not utilizing a portion of the SVM – bolster vector
machine grouping. At that point some boosting calculation is utilized to expand the exactness of
the instrument.

In existing paper, a picture handling procedures has been utilized to recognize beginning time lung
malignant growth in CT examine pictures. The CT filter picture is pre-prepared pursued by
division of the ROI of the lung. Discrete waveform Transform is connected for picture pressure
and highlights are extricated utilizing a GLCM. The outcomes are encouraged into a SVM
classifier to decide whether the lung picture is carcinogenic or not. The SVM classifier is assessed
dependent on a LIDC dataset. In future the advanced level of algorithm is used to increase the
level of prediction while we are in process to include the Extreme gradient boosting Algorithm to
use the data set more effectively.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 36


CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH

REFERENCES

1. M. Debois, “TxNxM1 : the anatomy and clinics of metastatic cancer.Boston,” Kluwer Academic
Publishers, 2006.
2. J.M. Fitzpatrick, & M. Sonka, “Medical imaging 2003: image processing. USA: SPIE Society
of PhotoOptical Instrumentation Engineers,” 2003.
3. C. M. Haskell, & J.S. Berek, “ Cancer treatment. Philadelphia: W.B. Saunders, 2001
4.K.T.Manivannan, “Development of gray level co-occurrence matrix based support vector
machines for particulate matter characterization,”Retrieved
fromhttp://rave.ohiolink.edu/etdc/view?acc_num=toledo1341577486, 2012.
5. Sathish Kumar R, R. Logeswari, N. Anitha Devi, S. DivyaBharathy Efficient Clustering using
ECATCH Algorithm to Extend Network Lifetime in Wireless Sensor Networks. International
Journal of Engineering Trends and Technology. 45. 476-481. 10.14445/22315381/IJETT-
V45P290 – March 2017
6. C. Charalambous, Conjugate gradient algorithm for efficient training of artificial neural
networks, IEEE Proceedings 139 (3) (1992) 301–310.
7. B. H. Boyle, “Support vector machines : data analysis, machine learning, and applications,”
Retrievedhttp://public.eblib.com/choice/publicfullrecord.aspx?p=3021500, 2011.

8. A.A. Abdulla, S.M. Shaharum, Lung cancer cell classification method using artificial neural
network, Information Engineering Letters 2 (March) (2012) 50–58.
9. Sathish Kumar R , Nive tha M, Madhu mita G & Santhoshy P. Image Enhancement using NHSI
Model Employed in Color Retinal Images. International Journal of Engineering Trends and
Technology. 58. 14-19. 10.14445/22315381/IJETT-V58P203- April 2018.

10. D. Harini, & D. Bhaskari, “Image retrieval system based on feature extraction and relevance
feedback,” ACM, 2 Penn Plaza, Suite 701, New York,NY 10121-0701, 2012.

MALLA REDDY ENGINEERING COLLEGE FOR WOMEN(AUTONOMOUS) 37

You might also like