Diagnostic Modelling For Lung Cancer Detection and Classification From Computed Tomography Using Machine Learning

N.E.D University of Engineering & Technology

Department of Computer and Information Systems Engineering
Independent Study Project Proposal
Student Name: Humera Yakub

Supervisor Name: Dr. Syed Abbas Ali
Class Roll Number: CS-57/2018
Email ID: yakubhumera@gmail.com
Data Engineering and Info Sys Management
SUMMARY – Independent Study Project
Title: Diagnostic Modelling for Lung Cancer Detection and Classification from Computed
Tomography using Machine Learning
Supervisor: Dr. Syed Abbas Ali
Researcher: Humera Yakub, Roll # CS-57/2018
Research Summary: Lung cancer is one of the deadliest diseases that doctors around the world are
trying to tackle. With a five-year survival rate of only 19.4%, early diagnosis is essential for
effective treatment. Unfortunately, manual analysis of scans is a time-consuming process. With
the advent of image processing techniques, many researchers have developed models that analyze
medical images to detect lung cancer. However, the accuracy of these models is often
unsatisfactory. This research aims to achieve high accuracy in the detection of lung cancer with
the help of advanced machine learning algorithms. The objective of the study is to develop a
diagnostic model that uses image pre-processing techniques to obtain the regions of interest
present in computed tomography images. The nodules will then be segmented using filtering
techniques. Through feature extraction techniques, the optimal features will be selected, and
advanced machine learning algorithms will be used to develop a model that classifies the tumor
as malignant or benign.
Goals:
• To use efficient image processing techniques on computed tomography images in order to
process and segment the nodules present in the lungs.
• To study and develop the best feature extraction/selection techniques for a more accurate
model.
• To develop a diagnostic model using an advanced machine learning algorithm that will classify
the tumor as malignant or benign.
• To decrease the false positive rate and achieve the highest achievable accuracy.
• To complete the research in the given time and to submit the results and implementation in a
systematic way.
Table of Contents
Overview
Motivation and Need
Objectives
Targeted Issues
Methodology
  Image Acquisition
  Preprocessing
  Segmentation
  Feature Extraction
  Feature Selection
  Classification
References
Overview
Lung cancer is one of the deadliest and most common cancers worldwide; in the United States
alone it accounts for approximately 225,000 new cases and around 150,000 deaths yearly. The
survival rate of people diagnosed with lung cancer is staggeringly low: only about 19.4% of
patients survive five years after diagnosis. Current diagnostic methods include biopsies and
imaging, such as Computed Tomography (CT) scans. The chances of survival improve if a patient
is diagnosed early, but because early-stage disease produces few symptoms, it is difficult to
detect at that point. The aim of this research is to detect lung tumors in CT scans and classify
them as benign or malignant using machine learning.
Motivation and Need

Early detection of lung cancer is vital to its treatment, but doctors face many hindrances in
diagnosing this lethal disease. The first and foremost, as in any medical diagnosis system, is
accuracy. Only a few experienced radiologists have the expertise to accurately differentiate
between malignant and benign tumors. Moreover, it takes almost 1.5 to 2 hours for a panel of
radiologists to review a CT image and reach a conclusion, which places a serious time constraint
on diagnosis. The availability of resources is also an important issue in this regard. Hence, to
address all these issues, an efficient and accurate lung cancer Computer-Aided Diagnosis (CAD)
system is the need of the hour. Such a system will be able to provide a fast diagnosis with
accurate results, which will quicken the pace of treatment if the need arises.
Objectives
The specific objectives of this research and its implementation are:
• To develop an accurate and mature model for analyzing and classifying the tumor as malignant
or benign.
• To detect the malignancy of the tumor on the basis of efficient feature extraction techniques.
• To decrease the false positive rate by increasing the efficiency and accuracy of the
diagnostic procedure.
Targeted Issues
Lung nodules are potential manifestations of lung cancer, and their early detection facilitates
early treatment and improves the patient’s chances of survival. To assist radiologists and
doctors in detecting the cancer, numerous CAD systems have been implemented. These systems
mainly involve four steps to detect nodules: preprocessing, segmentation, feature extraction,
and classification. However, some systems do not achieve satisfactory detection accuracy and
need improvement to approach 100% accuracy. Image processing techniques and machine learning
algorithms are said to be the deciding factors for accuracy. This research will focus on feature
extraction and machine learning algorithms to obtain a significant increase in the accuracy of
the diagnosis.
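Because the targets here are stated in terms of accuracy and false positive rate, it may help to
pin down how these quantities are computed from a classifier's predictions. The following is a
minimal Python sketch and not part of the proposed system; the function names and the toy labels
are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred, positive="malignant"):
    """Count true/false positives and negatives, treating `positive` as the disease class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive="malignant"):
    """Return accuracy, sensitivity and false positive rate as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Made-up labels for illustration only (not study data).
truth = ["malignant", "malignant", "benign", "benign",
         "benign", "benign", "malignant", "benign"]
predicted = ["malignant", "benign", "benign", "malignant",
             "benign", "benign", "malignant", "benign"]
print(diagnostic_metrics(truth, predicted))
```

Reporting the false positive rate alongside accuracy matters because a model can reach high
accuracy on a mostly benign dataset while still flagging many healthy patients.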
Methodology
This study is divided into three phases:
• First Phase: Literature Review and Image Acquisition
• Second Phase: Image Pre-Processing and Segmentation
• Third Phase: Feature Extraction and Classification
The research will start with an understanding of the vast topic of lung cancer. This phase
requires a review of the literature and of the state-of-the-art models used for the detection of
lung cancer. It will also involve studying computed tomography images in order to understand the
functioning of the lung. Comprehending the available CT scans and learning about the different
types of lung cancer and their distinct symptoms will help to translate the medical features
into machine-based parameters.
The CT scan is the most widely used tool for screening and for treatment after diagnosis. These
images are generally stored and viewed in the Digital Imaging and Communications in Medicine
(DICOM) format. As a CT scan consists of a series of images, an image slicer will be used to
extract the relevant slices. After the extraction, image preprocessing will be done to enhance
the images. Image preprocessing is a way to improve the quality of an image, so that the
resulting image is better suited to analysis than the original one. The mean filter and the
median filter are the most common ways to remove noise. The preprocessed image will be used as
the input for image segmentation. Image segmentation is an essential step for most subsequent
image analysis tasks. Segmentation divides an image into its constituent regions or objects. The
goal of segmentation is to simplify the representation of an image into something that is more
meaningful and easier to analyze.
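The difference between the two noise filters mentioned above can be seen on a tiny example. The
sketch below is a plain-Python illustration under stated assumptions: the image is a small list
of lists, the window is 3x3, and border windows are simply clipped to the image. The helper
names are hypothetical, and a real pipeline would likely rely on an image processing library
instead.

```python
from statistics import mean, median


def filter_image(image, reducer, radius=1):
    """Slide a (2*radius+1)-wide square window over the image and reduce each window.

    Windows that overlap the border are clipped to the pixels that exist.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = reducer(window)
    return out


def mean_filter(image, radius=1):
    """Replace each pixel with the average of its neighbourhood."""
    return filter_image(image, mean, radius)


def median_filter(image, radius=1):
    """Replace each pixel with the median of its neighbourhood."""
    return filter_image(image, median, radius)


# A flat 5x5 patch with a single bright "salt" noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255

print("median at centre:", median_filter(noisy)[2][2])
print("mean at centre:  ", mean_filter(noisy)[2][2])
```

On this patch the median filter removes the isolated outlier entirely, while the mean filter
smears part of it into the neighbouring pixels, which is why the median filter is often
preferred for impulse noise.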
The extracted nodules will then be used for feature extraction. Feature extraction is one of the
most important steps in this system. A feature is a significant piece of information extracted
from an image that supports a more detailed analysis of the image. Features such as area,
perimeter, eccentricity, and other statistics-based parameters will be calculated. These
measurements are important for classifying the tumor, and efficient feature selection is the key
to higher accuracy. Feature selection will be carried out with the help of machine learning
tools, and the selected features will be used to train several different classifiers. The most
accurate classifier will be selected on the basis of detection accuracy. The aim of this
research is to achieve significant accuracy of the diagnostic model by optimizing the feature
extraction and classification models.
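As an illustration of the shape features named above, the sketch below computes area, perimeter,
equivalent diameter, and eccentricity from a binary nodule mask. It is a simplified sketch under
assumptions that are not taken from the proposal: the mask is a list of lists of 0/1 values, the
perimeter is counted as the number of exposed pixel edges, and eccentricity is derived from the
second central moments of the pixel coordinates. The function name is hypothetical.

```python
import math


def shape_features(mask):
    """Compute simple shape descriptors for the foreground (non-zero) pixels of a mask."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")

    # Perimeter: count pixel edges that face background (4-connectivity).
    filled = set(pixels)
    perimeter = sum(
        1
        for (r, c) in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (r + dr, c + dc) not in filled
    )

    # Second central moments of the pixel coordinates.
    mean_r = sum(r for r, _ in pixels) / area
    mean_c = sum(c for _, c in pixels) / area
    var_r = sum((r - mean_r) ** 2 for r, _ in pixels) / area
    var_c = sum((c - mean_c) ** 2 for _, c in pixels) / area
    cov = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area

    # Eigenvalues of the 2x2 covariance matrix give the squared axis lengths
    # (up to a constant) of the best-fitting ellipse.
    half_trace = (var_r + var_c) / 2
    spread = math.sqrt(((var_r - var_c) / 2) ** 2 + cov ** 2)
    lam_max = half_trace + spread
    lam_min = half_trace - spread
    if lam_max > 0:
        eccentricity = math.sqrt(max(0.0, 1 - lam_min / lam_max))
    else:
        eccentricity = 0.0  # a single pixel has no elongation

    return {
        "area": area,
        "perimeter": perimeter,
        "equivalent_diameter": math.sqrt(4 * area / math.pi),
        "eccentricity": eccentricity,
    }


square = [[1] * 3 for _ in range(3)]       # compact region
elongated = [[1] * 6 for _ in range(2)]    # stretched region
print(shape_features(square))
print(shape_features(elongated))
```

A compact region yields an eccentricity near 0 and an elongated one a value near 1, so this
single number already separates round from stretched shapes.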
A flow diagram of this study is given below:
Image Acquisition → Preprocessing → Segmentation → Feature Extraction → Feature Selection →
Classification
Image Acquisition
CT scan images will be sliced and transformed from the DICOM format.
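One common part of transforming a CT slice out of DICOM is converting stored pixel values to
Hounsfield units and then mapping a display window to 8-bit grey levels. The sketch below
assumes the raw values and the rescale slope and intercept (the DICOM Rescale Slope and Rescale
Intercept attributes) have already been read by a DICOM library. The default slope and
intercept, the lung window of centre -600 and width 1500, and the function names are all
illustrative assumptions rather than choices made in this proposal.

```python
def to_hounsfield(raw_slice, slope=1.0, intercept=-1024.0):
    """Convert stored pixel values to Hounsfield units: HU = raw * slope + intercept."""
    return [[value * slope + intercept for value in row] for row in raw_slice]


def apply_window(hu_slice, center=-600.0, width=1500.0):
    """Clip HU values to a display window and rescale them to integers in 0..255."""
    low = center - width / 2
    high = center + width / 2
    windowed = []
    for row in hu_slice:
        new_row = []
        for value in row:
            clipped = min(max(value, low), high)
            new_row.append(round(255 * (clipped - low) / width))
        windowed.append(new_row)
    return windowed


# Hypothetical raw values: air, water, and dense tissue with the assumed rescale.
raw = [[0, 1024, 2048]]
hu = to_hounsfield(raw)
print(hu)
print(apply_window(hu))
```

Windowing before further processing keeps the intensity range focused on lung tissue, so later
filtering and thresholding steps are not dominated by bone or by values outside the body.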
Preprocessing
Preprocessing involves noise reduction, transformation, removal of inconsistencies, filtering,
etc.
Segmentation
This process locates objects or boundaries which help in acquiring the region of interest. It will
segment the nodule from the CT scan image.
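A very simple way to locate candidate regions is to threshold the image and group the remaining
pixels into connected components. The sketch below shows that idea only; it is not the
segmentation method chosen for this study, and the threshold level, the 4-connectivity, the toy
image, and the function names are assumptions made for illustration.

```python
from collections import deque


def threshold(image, level):
    """Return a binary mask with 1 where the pixel intensity is at least `level`."""
    return [[1 if value >= level else 0 for value in row] for row in image]


def connected_components(mask):
    """Label 4-connected foreground regions with a breadth-first search.

    Returns (labels, regions): a label image (0 = background) and, for each
    label in order, the list of (row, col) pixels belonging to that region.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            label = len(regions) + 1
            labels[r][c] = label
            queue = deque([(r, c)])
            region = []
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = label
                        queue.append((nr, nc))
            regions.append(region)
    return labels, regions


# Toy 5x6 image with two bright blobs on a dark background (made-up values).
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10],
    [10, 205, 10, 10, 180, 10],
    [10, 10, 10, 10, 190, 10],
    [10, 10, 10, 10, 10, 10],
]
labels, regions = connected_components(threshold(image, 128))
print([len(region) for region in regions])
```

Each labelled region can then be passed on as a candidate nodule to the feature extraction
stage.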
Feature Extraction
At this stage, features of the nodules such as area, perimeter, diameter etc. will be extracted.
Feature Selection
This stage selects the best features which will contribute to the highest accuracy of the model.
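One simple filter-style way to rank features is the Fisher score, which compares the gap between
class means with the spread inside each class. The proposal does not name a selection method, so
the sketch below is only one possible illustration; the function names and the toy feature
values are made up and are not measurements from any dataset.

```python
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Score each feature as (difference of class means)^2 / (sum of class variances)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        first = [s[j] for s, y in zip(samples, labels) if y == classes[0]]
        second = [s[j] for s, y in zip(samples, labels) if y == classes[1]]
        gap = (mean(first) - mean(second)) ** 2
        spread = pvariance(first) + pvariance(second)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # No within-class spread: perfectly separating if the means differ.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def select_top_k(samples, labels, k):
    """Return the indices of the k highest-scoring features, in ascending index order."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy feature rows: [area, perimeter, uninformative value] (made-up numbers).
samples = [
    [30, 22, 5], [34, 24, 1], [32, 23, 4],   # benign
    [80, 40, 3], [90, 44, 5], [85, 42, 2],   # malignant
]
labels = ["benign"] * 3 + ["malignant"] * 3
print(fisher_scores(samples, labels))
print(select_top_k(samples, labels, k=2))
```

In this toy data the third column has the same mean in both classes, so it scores zero and is
dropped, while the two size-related columns are kept.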
Classification
This stage will classify the nodule as malignant or benign. A number of supervised or
unsupervised machine learning algorithms can be used to accomplish this task.
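To make the idea of comparing classifiers concrete, the sketch below evaluates two very simple
supervised classifiers, k-nearest neighbours and nearest centroid, with leave-one-out accuracy.
These two algorithms, the evaluation scheme, the function names, and the toy data are all
assumptions chosen for brevity; the proposal leaves the actual algorithms open.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Predict by majority vote among the k training samples closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train_x, train_y, x):
    """Predict the class whose mean feature vector is closest to x."""
    centroids = {}
    for label in set(train_y):
        members = [s for s, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(column) / len(members) for column in zip(*members)]
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def leave_one_out_accuracy(predict, samples, labels):
    """Hold out each sample in turn, train on the rest, and report the hit rate."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            hits += 1
    return hits / len(samples)


# Toy feature rows: [area, perimeter] (made-up numbers, not study data).
features = [[30, 22], [34, 24], [32, 23], [80, 40], [90, 44], [85, 42]]
classes = ["benign"] * 3 + ["malignant"] * 3

results = {
    "k-nearest neighbours": leave_one_out_accuracy(knn_predict, features, classes),
    "nearest centroid": leave_one_out_accuracy(centroid_predict, features, classes),
}
best = max(results, key=results.get)
print(results, "->", best)
```

With real data the features would normally be scaled first, since distance-based classifiers
are sensitive to features measured on different ranges.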
References
[1] http://www.lungcancernews.org/2016/11/01/epidemiology-of-lung-cancer-in-developing-countries/
[2] https://www.cancer.org/treatment/understanding-your-diagnosis/tests/ct-scan-for-cancer.html
[3] Avinash, S., Manjunath, K., and Senthilkumar, S., “Analysis and Comparison of Image
Enhancement Techniques for the Prediction of Lung Cancer”, 2nd IEEE International Conference on
Recent Trends in Electronics, Information & Communication Technology, pp. 1535-1539, 2017.
[4] Mekali, V., and Girijamma, H. A., “Automated Lung Nodules and Ground Glass Opacity Nodules
Detection and Classification from Computed Tomography Images”, Proceedings of the International
Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB), pp. 1347-1358,
2019. doi:10.1007/978-3-030-00665-5_127
[5] Chouhan, S. S., Kaul, A., and Singh, U. P., “Image Segmentation Using Computational
Intelligence Techniques”, Archives of Computational Methods in Engineering, pp. 1-64, February
2018.
[6] Makaju, S., Prasad, P. W. C., Alsadoon, A., Singh, A. K., and Elchouemi, A., “Lung Cancer
Detection using CT Scan Images”, Procedia Computer Science, vol. 125, pp. 107-114, 2018.
[7] Zia ur Rehman, M., Javaid, M., Shah, S., Gilani, S., Jamil, M., and Ikramullah, S., “An
appraisal of nodules detection techniques for lung cancer in CT images”, Biomedical Signal
Processing and Control, vol. 41, pp. 140-151, 2018. doi:10.1016/j.bspc.2017.11.017
