
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

Belagavi-590018

A
Project Work Phase - II Report
On
Lung Cancer Segmentation and Detection using Machine
Learning
SUBMITTED IN PARTIAL FULFILLMENT FOR THE AWARD OF DEGREE OF

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
SUBMITTED BY
Arfin Khan (1JB18CS022)
Arjun Dwivedi (1JB18CS023)
Ashish Anand (1JB18CS025)
Diksha Priya (1JB18CS042)

Under the Guidance of

Dr. Naveena C
Professor,
Dept. of CSE, SJBIT

Department of Computer Science and Engineering
SJB Institute of Technology
BGS Health and Education city,
Kengeri, Bengaluru-560060, Karnataka, India.

2021-2022
ACKNOWLEDGEMENT

We would like to express our profound gratitude to His Divine Soul Jagadguru Padmabhushan
Sri Sri Sri Dr. Balagangadharanatha Mahaswamiji Jagadguru Sri
Sri Sri Dr. Nirmalanandanatha Mahaswamiji

Revered Sri Sri Dr. Prakashnath


Swamiji

Dr. K. V. Mahendra Prashanth,

Dr. Krishan A N,

Dr. Naveena C,

Arfin Khan [1JB18CS022]
Arjun Dwivedi [1JB18CS023]
Ashish Anand [1JB18CS025]
Diksha Priya [1JB18CS042]
DECLARATION BY THE STUDENTS

We, Arfin Khan [1JB18CS022], Arjun Dwivedi [1JB18CS023], Ashish Anand [1JB18CS025] and
Diksha Priya [1JB18CS042], students of 8th semester Computer Science and Engineering,
SJB Institute of Technology, Bangalore, hereby declare that the project entitled
“LUNG CANCER DETECTION AND SEGMENTATION USING MACHINE LEARNING”, submitted to the
Visvesvaraya Technological University, Belagavi during the academic year 2021-22,
is a record of original work done by us under the guidance of Dr. Naveena C,
Professor, Department of Computer Science and Engineering, SJB Institute of
Technology, Bangalore. This project dissertation report is submitted in partial
fulfilment for the award of the degree of Bachelor of Engineering in Computer Science
and Engineering. The results embodied in this report have not been submitted to any
other University or Institute for the award of any degree.

Date:
Place: Bangalore

Arfin Khan [1JB18CS022]
Arjun Dwivedi [1JB18CS023]
Ashish Anand [1JB18CS025]
Diksha Priya [1JB18CS042]
ABSTRACT

The aim of this project is to detect lung cancer using machine learning.
Early detection of lung cancer is important for improving a patient's chances of survival.
Histopathological examination of tissue is a common procedure needed to obtain an early
diagnosis. Tissue analysis is usually done by a pathologist, but this process is
time-consuming and error-prone. Automatic detection of cancerous regions would greatly
accelerate the whole process and assist the pathologist. In this paper, we propose a fully
automatic method for detecting lung cancer in whole-slide images of lung tissue samples.
Classification is performed at the image-patch level using a convolutional neural network
(CNN). Two CNN architectures (VGG and ResNet) are trained and their performance is compared.
The results obtained show that the CNN-based method has the potential to assist pathologists
in lung cancer diagnosis.
LIST OF TABLES

Table .1 Test Case-UTCO1
Table .2 Test Case-UTCO2
Table .3 Test Case-UTCO3
Table .4 Test Case-UTCO4
Table .5 Test Case-UTCO5
Table .1 Model performance
Table .2 Model accuracy and loss



Medical diagnostics are a category of medical tests designed to detect infections, conditions,
and diseases. Nowadays, machine learning algorithms are used to analyze data and produce more
effective results. Medical clinics are now well equipped with fully automatic machines that
generate huge amounts of data, which are collected and shared with information systems or
with doctors so that the required steps can be taken. Machine learning techniques can be used
to analyze medical data and are very helpful in diagnostic systems for addressing specialized
diagnostic problems. Using machine learning, a system takes patient data such as symptoms,
laboratory data, and other important attributes as input and generates a diagnosis. Based on
the accuracy of the result, the system decides which data will be used as training data for
future reference. By applying machine learning classification algorithms to a specific
disease, we can improve the accuracy, speed, reliability, and performance of the existing
diagnostic system. Machine learning offers automatic learning techniques that extract common
patterns from real-world data and make sophisticated and accurate decisions based on the
learned behaviour. The major problem with medical data, however, is that most of it is
high-dimensional and changes frequently. In this paper, we try to resolve this issue of the
current system. We propose a new approach which can predict the chance of death, heart attack
and stroke. In addition, it can find the size and location of various tumors and infections.
It gives patients an effortless and timely way to analyze disease based on clinical and
laboratory symptoms and data, with more accurate results. It also helps to detect the disease
at an early stage.
Lung cancer is one of the leading causes of cancer deaths. It is difficult to detect because
symptoms often appear only at a late stage. However, the mortality rate can be reduced by
early detection and treatment of the disease. Variance of intensity in CT scan images and
misjudgment of anatomical structures by doctors and radiologists can make it difficult to
mark cancerous cells. The model should also help aspiring medical students, especially
radiology students, train themselves to distinguish lung nodules from normal tissue.

the ability of the algorithm to provide a proper lung segmentation in cases with severe
pathologies that are associated with inhomogeneities in the pathological lungs.

4) CADe system:
Designing an efficient CADe system for detecting lung nodules is still challenging.

5) Accuracy:
The most important aspect of segmentation and detection is accuracy. It is unclear who is
liable when a cancer or tumor is detected inaccurately. In an automated system, the software
is the main component and makes all the important decisions. While initial designs keep a
person in the loop, newer fully automated designs rely on the detector and segmenter alone.
Since no such system achieves 100% accuracy and sensitivity, a patient's recovery cannot
depend on the software by itself. Additionally, the users of an automated system will mostly
be in a relaxed state and may not pay close attention to its output. In situations where
their attention is needed, it may be too late to avert the problem by the time they act.

With an estimated 160,000 deaths in 2018, lung cancer is the most common cause of
cancer death in the United States. It is also one of the most prevalent cancers worldwide,
causing 1.76 million deaths per year. Clinical decision support systems have been developed to
enable early diagnosis of lung cancer from CT images.
However, most of these tools are limited to lung or nodule segmentation, leaving
classification of nodules to the radiologist. Early research shows that deep learning models
can support this task as well. Integrating these research efforts into clinical applications
is an active area of development, and some lung cancer detection models are currently under
review for FDA or CE approval. This project constitutes a design study of what a deep
learning-based lung cancer detection app could look like.

1. Build a multiclass classification model using a custom convolutional neural network: The
cardinal objective of this project is to develop a state-of-the-art Convolutional Neural
Network (CNN) model that classifies lung nodule images into their respective cancer types.
The model is trained and tested on the dataset made available by LUNA. The model can be used
to analyze a CT image and determine at an early stage whether a nodule is dangerous.

2. Create new technology in the field of lung cancer detection: The analysis procedure
involves designing search strategies to find and extract relevant studies on lung cancer
cells. The fundamental purpose is to summarize the current state-of-the-art techniques
for lung cancer detection in the context of CNN-based models.

3. Predict lung cancer at an early stage: The main idea is to detect lung cancer in its early
stages so that the doctor can diagnose it easily and start treatment without further delay.
This is also cost-efficient and time-efficient.
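
As a rough illustration of the building block behind the CNN named in objective 1, the sketch below applies one 2D convolution (computed as cross-correlation, without padding) followed by a ReLU to a tiny image patch. The helper names `conv2d_valid` and `relu`, the 2x2 sum kernel, and the toy patch are hypothetical and chosen only for illustration; the report does not specify the layers of the actual network.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` without padding ("valid" mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    if out_h < 1 or out_w < 1:
        raise ValueError("kernel is larger than image")
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the kernel-sized window anchored at (i, j)
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 4x4 patch with a bright 2x2 blob in the middle (illustrative data only)
patch = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Hypothetical 2x2 kernel that sums the pixels under it
kernel = [[1.0, 1.0], [1.0, 1.0]]
features = relu(conv2d_valid(patch, kernel))
```

The response is strongest where the kernel fully covers the blob. A trained CNN learns many such kernels and stacks them with pooling and fully connected layers before the final classification.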

Health diagnostics are very important for people's health, but people residing in rural
areas where no doctors are available face difficulties in travelling and paying large fees.
This system may relieve some of these difficulties.
In big cities, people are so busy with their work that they neglect themselves and start
facing health issues. The system may help them learn about a disease precisely and quickly
at an early stage.
The system identifies whether a given cell is cancerous or not, beginning with SCLC
(Small Cell Lung Cancer).


CHAPTER
SYSTEM REQUIREMENTS AND SPECIFICATIONS

.1 Hardware Requirements

.2 Software Requirements

.3 Functional Requirements


.4 Non-Functional Requirements

.5 Feasibility Study


Economic Feasibility

Technical Feasibility

Social Feasibility


LUNA

LUNA16 provides a dataset for automatic nodule detection algorithms built on the largest
publicly available reference database of chest CT scans, the LIDC-IDRI data set. In LUNA16,
participants develop their algorithm and upload their predictions on 888 CT scans in one of
two tracks: 1) the complete nodule detection track, where a complete CAD system should be
developed, or 2) the false positive reduction track, where a provided set of nodule
candidates should be classified.
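
For the false positive reduction track, each candidate has to be labelled as nodule or non-nodule. A minimal sketch of how such labels could be derived from annotations follows, assuming a candidate counts as a true nodule when it lies within the annotated nodule's radius of its centre. That rule, the `Nodule` record, and all coordinates are assumptions made for illustration and are not taken from this report.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class Nodule:
    """Hypothetical annotation record: centre in mm and diameter in mm."""
    center: Point
    diameter_mm: float


def is_hit(candidate: Point, nodule: Nodule) -> bool:
    """True if the candidate lies within the nodule radius of its centre."""
    return math.dist(candidate, nodule.center) <= nodule.diameter_mm / 2


def label_candidates(candidates: Sequence[Point],
                     nodules: Sequence[Nodule]) -> List[int]:
    """Return 1 for candidates that hit any annotated nodule, else 0."""
    return [int(any(is_hit(c, n) for n in nodules)) for c in candidates]


# Illustrative values only
nodules = [Nodule(center=(10.0, 20.0, 30.0), diameter_mm=8.0)]
candidates = [
    (11.0, 21.0, 31.0),  # about 1.7 mm from the centre
    (40.0, 40.0, 40.0),  # far away
    (10.0, 20.0, 34.0),  # exactly on the 4 mm radius
]
labels = label_candidates(candidates, nodules)
```

Labels produced this way could serve as targets for the classifier that separates true nodules from false positives.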

LUNA

import os
from pathlib import Path
from functools import reduce

import streamlit as st
import pandas as pd
import numpy as np
from PIL import Image
from matplotlib import cm

# Directory with scan_meta.csv, nodule_meta.csv and the per-patient .npy files
DATA_DIR = Path(__file__).absolute().parent / "data"

@st.cache
def load_ann_codes():
    # Mapping from malignancy rating (1-5) to its display label
    codes = {
        "Malignancy": {
            1: "Highly Unlikely",
            2: "Moderately Unlikely",
            3: "Indeterminate",
            4: "Moderately Suspicious",
            5: "Highly Suspicious",
        },
    }
    return codes


@st.cache
def load_meta():
    # Scan-level and nodule-level metadata tables
    scan_df = pd.read_csv(DATA_DIR / "scan_meta.csv")
    nod_df = pd.read_csv(DATA_DIR / "nodule_meta.csv")
    return scan_df, nod_df


@st.cache
def load_raw_img(pid):
    # Full CT volume for one patient
    img = np.load(DATA_DIR / pid / "scan.npy")
    return img


@st.cache
def load_mask(pid):
    # Merge all per-nodule masks of a patient into one boolean volume
    fnames = sorted((DATA_DIR / pid).glob('*_mask.npy'))
    masks = [np.load(fname) for fname in fnames]
    mask = reduce(np.logical_or, masks)
    return mask


@st.cache
def load_nodule_img(pid, nid):
    # Cropped volume around a single nodule
    img = np.load(DATA_DIR / pid / f"nodule_{nid:02d}_vol.npy")
    return img


@st.cache(allow_output_mutation=True)
def get_img_slice(img, z, window=(-600, 1500)):
    # clip pixel values to desired window
    level, width = window
    img = np.clip(img, level - (width / 2), level + (width / 2))
    # normalize pixel values to 0-1 range
    img_min = img.min()
    img_max = img.max()
    img = (img - img_min) / (img_max - img_min)
    # convert to Pillow image for display
    img_slice = img[:, :, z]
    pil_img = Image.fromarray(np.uint8(cm.gray(img_slice) * 255))
    return pil_img.convert('RGBA')


@st.cache(allow_output_mutation=True)
def get_nod_slice(img, window=(-600, 1500)):
    # clip pixel values to desired window
    level, width = window
    img = np.clip(img, level - (width / 2), level + (width / 2))
    # normalize pixel values to 0-1 range
    img_min = img.min()
    img_max = img.max()
    img = (img - img_min) / (img_max - img_min)
    # convert the middle slice to a Pillow image for display
    z = int(img.shape[2] / 2)
    img_slice = img[:, :, z]
    pil_img = Image.fromarray(np.uint8(cm.gray(img_slice) * 255))
    return pil_img.convert('RGBA')


@st.cache
def get_overlay():
    # Semi-transparent green RGBA image the size of one CT slice
    arr = np.zeros((512, 512, 4)).astype(np.uint8)
    arr[:, :, 1] = 128
    arr[:, :, 3] = 128
    overlay = Image.fromarray(arr, mode='RGBA')
    return overlay


@st.cache
def get_mask_slice(mask, z):
    # Scale the boolean mask slice to 0/96 and return it as an 8-bit grayscale image
    mask_slice = (mask[:, :, z] * 96).astype(np.uint8)
    mask_img = Image.fromarray(mask_slice, mode='L')
    return mask_img


scan_df, nod_df = load_meta()

# Use the first scan in the metadata table as the selected case
scan = scan_df.iloc[0]
pid = scan.PatientID

img_arr = load_raw_img(pid)
mask_arr = load_mask(pid)

st.header("Selected case for lung cancer detection application")

st.subheader("Patient information")

st.write("**Patient ID:**", scan.PatientID)

st.write("**Diagnosis:**", "Malignant, primary lung cancer")
st.write("**Diagnosis method:**", "Biopsy")

st.subheader("CT scan")

img_placeholder = st.empty()

# st.beta_columns was renamed to st.columns in later Streamlit releases
col1, col2 = st.beta_columns(2)

with col1:
    st.write("**Pixel spacing**")
    st.write(f"x: {scan.PixelSpacing:.2f} mm")
    st.write(f"y: {scan.PixelSpacing:.2f} mm")
    st.write(f"z: {scan.SliceSpacing:.2f} mm")
    st.write("**Device**")
    st.write(f"{scan.ManufacturerModelName} (by {scan.Manufacturer})")

with col2:
    overlay_nodules = st.checkbox("Show nodule overlay", value=True)
    z = st.slider("Slice:", min_value=1,
                  max_value=img_arr.shape[2], value=int(img_arr.shape[2] / 2))
    level = st.number_input("Window level:", value=-600)
    width = st.number_input("Window width:", value=1500)

# The slider is 1-based, array indexing is 0-based
img = get_img_slice(img_arr, z - 1, window=(level, width))
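
To make the window level and width controls concrete, the sketch below works through the same clipping arithmetic as `get_img_slice` on plain Python floats. It is a simplified variant under stated assumptions: it rescales against the window bounds rather than the array's own minimum and maximum, and `window_bounds` and `apply_window` are hypothetical helper names that do not appear in the listing.

```python
from typing import List, Sequence, Tuple


def window_bounds(level: float, width: float) -> Tuple[float, float]:
    """Lower and upper intensity limits for a given window level and width."""
    half = width / 2
    return level - half, level + half


def apply_window(values: Sequence[float],
                 level: float = -600, width: float = 1500) -> List[float]:
    """Clip intensities to the window and rescale them to the 0-1 range."""
    if width <= 0:
        raise ValueError("window width must be positive")
    lo, hi = window_bounds(level, width)
    clipped = [min(max(v, lo), hi) for v in values]
    return [(v - lo) / (hi - lo) for v in clipped]


# Default lung window from the listing: level -600, width 1500
lo, hi = window_bounds(-600, 1500)

# Three illustrative intensities: below, at the centre of, and above the window
scaled = apply_window([-2000, -600, 400])
```

With the default window, intensities below -1350 map to black and intensities above 150 map to white, so a narrower width increases contrast inside the lung range.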


CHAPTER

TESTING

.1 Unit testing

Test strategy and approach

Test objectives


Features to be tested

Unit
Testing
Table .1 Test Case-UTCO1
Test Case#

Test Name

Test Description

Input

Expected Output

Actual Output

Test Result

Table .2: Test Case-UTCO2


Test Case#

Test Name

Test Description

Input

Expected Output

Actual Output

Test Result


Table .3 Test Case-UTCO3


Test Case#

Test Name

Test Description

Input

Expected Output

Actual Output

Test Result

Table .4 Test Case-UTCO4


Test Case#

Test Name cancer

Test Description

Input

Expected Output

Actual Output

Test Result


Table .5 Test Case-UTCO5


Test Case#

Test Name

Test Description

Input

Expected Output

Actual Output

Test Result

.2 Integration testing

Test Results:


.3 Functional Testing

.4 System Testing

.5 White Box Testing


.6 Black Box Testing

.7 Acceptance Testing

Test Results:


CHAPTER

RESULTS

.1 Performance measure on training and validation


Table .1 Model performance

.2 Training and validation plot

Fig. .1 Model accuracy


Fig. .2. Model loss

Table .2 Model accuracy and loss

Epoch | Time taken | Loss value | Accuracy | Validation loss | Validation accuracy

Fig. Nodules with Malignancy Rating

In this paper, we have proposed a fully automated deep learning-based method to detect lung
cancer in whole-slide histopathology images. The CNN architectures VGG16 and ResNet50 were
compared, and the former showed higher AUC and patch classification accuracy.
The presented results suggest that a convolutional neural network has the potential to
diagnose lung cancer from whole-slide images, but further effort is needed to increase the
classification accuracy.
In future work, the next steps will be to increase the size of the training set and to add
image augmentation and stain normalization. In addition, we will try to train from scratch
instead of using weights pre-trained on ImageNet.
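
The comparison above rests on two metrics, AUC and patch classification accuracy. A minimal sketch of how they could be computed from per-patch scores follows. The labels and scores are made-up toy values, not results from this project, and `roc_auc` and `accuracy` are hypothetical helper names.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of patches whose thresholded score matches the label."""
    if not labels:
        raise ValueError("labels must not be empty")
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)


# Toy per-patch labels (1 = cancerous) and model scores, for illustration only
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

AUC is independent of the decision threshold, while accuracy depends on it, which is one reason to report both when comparing two architectures.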

||Jai Sri Gurudev||
Sri Adichunchanagiri Shikshana Trust (R)
SJB INSTITUTE OF TECHNOLOGY
(Affiliated to Visvesvaraya Technological University, Belagavi & Approved by AICTE, New Delhi)
No. 67, BGS Health & Education City, Dr. Vishnuvardhan Road, Kengeri, Bengaluru-560060.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Project Outcome Year 2021-22

Project Title: Lung Cancer Detection and Segmentation using Machine Learning

Project Domain: Machine Learning

Factors addressed
Sl No | Factors addressed through project* | Applicable PO’s and PSO’s | Justification
1. | Research | PO4 & PSO1 | Techniques to detect and segment SCLC.
2. | Skill | PO1 | Lung Cancer Segmentation
3. | Technology | PO5 | Jupyter Notebook
4. | Social relevance | PO6 | Addresses the problems faced by people suffering from cancer and ways to detect and counter it.
5. | Economy | PO11 | Economical and fast when compared to other manual segmenting methods
6. | Team work | PO9 | Worked effectively as an individual and as a member

Signature of Students Signature of Guide
