BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
(Accredited by NBA, New Delhi, validity up to 30.06.2026)
Submitted by:
VASIHA FATHIMA R
4JD19CS059
Certificate
This is to Certify that the seminar work entitled “A deep learning Convolutional
Neural Network in health care environment” is a bonafide work carried out by
Vasiha Fathima R (4JD19CS059), in partial fulfillment of the requirements for the
award of the degree of Bachelor of Engineering in Computer Science and
Engineering of Visvesvaraya Technological University, Belagavi, during the year
2022-2023. It is certified that all the corrections/suggestions indicated for internal
assessment have been incorporated in the report. This work has been approved as it
satisfies the academic requirements for Bachelor of Engineering Degree.
It is my proud privilege and duty to acknowledge the kind help and valuable guidance
received from several people in the preparation and presentation of this seminar.
My sincere thanks to Prof. Latharani T R, Seminar Coordinator for her support and
encouragement during the preparations of this seminar.
I would also like to express my gratitude to our management and beloved Principal
Dr. Ganesh D B for providing a pleasant working environment and the library and
laboratory facilities needed to prepare this report.
I wish to thank my parents for constantly encouraging and motivating me to learn. Their
personal sacrifice in providing this opportunity to pursue engineering is gratefully
acknowledged.
Finally, I would like to thank the teaching and non-teaching staff of the Department of Computer
Science and Engineering, and my friends, for their continuous support and cooperation.
ABSTRACT
Imaging techniques are used to capture anomalies of the human body. The captured
images must be understood for diagnosis, prognosis and treatment planning of the anomalies.
Medical image understanding is generally performed by skilled medical professionals. However,
the scarcity of human experts, together with fatigue and the rough estimation procedures
involved, limits the effectiveness of image understanding performed by these
professionals. Convolutional neural networks (CNNs) are effective tools for image
understanding. They have outperformed human experts in many image understanding tasks. This
article aims to provide a comprehensive survey of applications of CNNs in medical image
understanding. The underlying objective is to motivate medical image understanding researchers
to extensively apply CNNs in their research and diagnosis. A brief introduction to CNNs is
presented, followed by a discussion of CNNs and their various award-winning frameworks.
The major medical image understanding tasks, namely image classification,
segmentation, localization and detection, are introduced. Applications of CNNs in medical
image understanding of the ailments of the brain, breast, lung and other organs are surveyed
critically and comprehensively. A critical discussion of some of the challenges is also presented.
TABLE OF CONTENTS
Acknowledgement
Abstract
List of Figures
List of Tables
1. Introduction
2. Related Work
3. System Architecture
5. Applications/Advantages/Disadvantages
6. Conclusion
References
LIST OF FIGURES
LIST OF TABLES
A deep learning Convolutional Neural Network in health care environment 2022-2023
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
Most modern deep learning models are based on artificial neural networks, and CNNs are among
the most widely used. A convolutional neural network is a class of deep neural networks, most commonly
applied to analyzing visual imagery. CNNs use a mathematical operation called convolution in
place of general matrix multiplication in at least one of their layers. They are specifically
designed to process pixel data and are used in image recognition and processing.
They have applications in image and video recognition, recommender systems, image
classification, image segmentation, medical image analysis, natural language processing, brain-
computer interfaces, and financial time series.
CNN is very useful as it minimizes human effort by automatically detecting the features. For
example, for apples and mangoes, it would automatically detect the distinct features of each
class on its own.
The term “convolution” in CNN denotes the mathematical operation of convolution, which is a
special kind of linear operation wherein two functions are combined to produce a third function
that expresses how the shape of one function is modified by the other. In simple terms, a small
matrix called a filter is slid across the image, which is also represented as a matrix. At each
position the overlapping values are multiplied element by element and summed, and the resulting
output is used to extract features from the image.
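The definition above can be illustrated with a one-dimensional example. The following is a minimal sketch, assuming NumPy is available; the function name `convolve_1d` and the sample signal and kernel are hypothetical and chosen only for illustration.

```python
import numpy as np

def convolve_1d(signal, kernel):
    """Discrete convolution: flip the kernel, slide it along the signal,
    and sum the element-wise products at every position."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    flipped = kernel[::-1]
    k = len(kernel)
    # Zero-pad both ends so the kernel can overlap the borders ("full" output).
    padded = np.pad(signal, (k - 1, k - 1))
    out_len = len(signal) + k - 1
    return np.array([np.dot(padded[i:i + k], flipped) for i in range(out_len)])

signal = [0, 0, 1, 1, 1, 0, 0]   # a step-like function (illustrative values)
kernel = [1, -1]                 # a differencing kernel (illustrative values)
result = convolve_1d(signal, kernel)
print(result)
```

With this differencing kernel the output is non-zero only where the signal changes, which is one way to see how one function reshapes the other.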
A CNN is a type of neural network that is most often applied to image processing problems. The
fact that they are useful for these fast-growing areas is one of the main reasons they are so
important in Deep Learning, Artificial Intelligence and Machine Learning.
For example, let's first take a regular neural network, which has an input layer, a hidden layer and an
output layer.
With most algorithms that handle image processing, the filters are typically created by an
engineer based on heuristics. CNNs can learn what characteristics in the filters are the most
important. That saves a lot of time and trial and error work since we don't need as many
parameters.
A big difference between a CNN and a regular neural network is that CNNs use convolutions to
handle the math behind the scenes. A convolution is used instead of matrix multiplication in at
least one layer of the CNN. Convolutions take two functions and return a function.
The reduction in parameters doesn't seem like a huge saving until you are working with high-resolution
images that have millions of pixels. The convolutional neural network algorithm's main purpose is to get data
into forms that are easier to process without losing the features that are important for figuring out
what the data represents. This also makes them great candidates for handling huge datasets.
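To make the size of this saving concrete, the sketch below compares the number of trainable parameters in a fully connected layer and in a convolutional layer. The image size (1000 x 1000 RGB), the 64 output units and the 3 x 3 kernel are assumed values for illustration only, and the helper names are hypothetical.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus one bias per output for a fully connected layer."""
    return n_inputs * n_outputs + n_outputs

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias,
    and the same filter is reused at every image position."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters

# Assumed example: a 1000 x 1000 RGB image feeding 64 units or 64 filters.
height, width, channels = 1000, 1000, 3
dense = dense_params(height * width * channels, 64)
conv = conv_params(3, 3, channels, 64)

print(f"fully connected: {dense:,} parameters")
print(f"convolutional:   {conv:,} parameters")
```

The convolutional count does not depend on the image size at all, because the filter weights are shared across positions.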
CNNs work by applying filters to your input data. What makes them so special is that CNNs are
able to tune the filters as training happens. That way the results are fine-tuned in real time, even
when you have huge data sets, like with images.
Since the filters can be updated to train the CNN better, this removes the need for hand-created
filters. That gives us more flexibility in the number of filters we can apply to a data set and the
relevance of those filters. Using this algorithm, we can work on more sophisticated problems like
face recognition.
One of the things that prevents a lot of problems from using CNNs is a lack of data. While networks
can be trained with relatively few data points (roughly 10,000 or more), the more data there is available, the
better tuned the CNN will be.
Just keep in mind that these data points have to be clean and labeled in order for the CNN to
be able to use them. That's what makes them so expensive to work with.
A CNN reduces the amount of computation considerably. The small “filter/kernel” slides along the image,
working on small blocks at a time. The processing required across the image is quite similar from
one block to the next, and hence this approach works very well.
Convolutional neural networks are based on neuroscience findings. They are made of layers of
artificial neurons called nodes. These nodes are functions that calculate the weighted sum of the
inputs and return an activation map. This is the convolution part of the neural network.
Each node in a layer is defined by its weight values. When you give a layer some data, like an
image, it takes the pixel values and picks out some of the visual features.
When you're working with data in a CNN, each layer returns activation maps. These maps point
out important features in the data set. If you give the CNN an image, it will point out features based
on pixel values, like colors, and give you an activation map.
Usually with images, a CNN will initially find the edges of the picture. Then this slight definition
of the image will get passed to the next layer. Then that layer will start detecting things like
corners and color groups. Then that image definition will get passed to the next layer and the
cycle continues until a prediction is made.
Between these layers, a step called max pooling is applied. It keeps only the most relevant
features from the layer's activation map by retaining the largest value in each small region. This is
what gets passed to each successive layer until you get to the final layer.
The last layer of a CNN is the classification layer which determines the predicted value based on
the activation map. If you pass a handwriting sample to a CNN, the classification layer will tell
you what letter is in the image. This is what autonomous vehicles use to determine whether an
object is another car, a person, or some other obstacle.
Training a CNN is similar to training many other machine learning algorithms. You'll start with
some training data that is separate from your test data and you'll tune your weights based on the
accuracy of the predicted values. Just be careful that you don't overfit your model.
There are multiple kinds of CNNs you can use depending on your problem.
1D CNN: With these, the CNN kernel moves in one direction. 1D CNNs are usually used on
time-series data.
2D CNN: These kinds of CNN kernels move in two directions. You'll see these used with image
labelling and processing.
3D CNN: This kind of CNN has a kernel that moves in three directions. Researchers use this type
of CNN on 3D images such as CT and MRI scans.
In most cases, you'll see 2D CNNs because those are commonly associated with image data.
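The three kinds of kernel differ only in how many axes the kernel slides along. As a rough sketch, assuming no padding and a stride of 1, the output size along each axis is the input size minus the kernel size plus one. The function name and the example shapes below are hypothetical.

```python
def valid_output_shape(input_shape, kernel_shape):
    """Output shape of an unpadded, stride-1 convolution along every axis."""
    if len(input_shape) != len(kernel_shape):
        raise ValueError("input and kernel must have the same number of axes")
    return tuple(n - k + 1 for n, k in zip(input_shape, kernel_shape))

# 1D: a time series of 100 samples with a kernel of width 5.
shape_1d = valid_output_shape((100,), (5,))
# 2D: a 28 x 28 image with a 3 x 3 kernel.
shape_2d = valid_output_shape((28, 28), (3, 3))
# 3D: a 64 x 64 x 32 volume (for example a scan) with a 3 x 3 x 3 kernel.
shape_3d = valid_output_shape((64, 64, 32), (3, 3, 3))

print(shape_1d, shape_2d, shape_3d)
```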
Here are some of the applications that you might see CNNs used for.
The hidden layers are typically convolutional layers followed by activation layers, some of them
followed by pooling layers.
A simple convolutional neural network that aids understanding of the core design principles is
the early convolutional neural network LeNet-5, published by Yann LeCun in 1998. LeNet is
capable of recognizing handwritten characters.
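As an illustration of how such a network shrinks its input step by step, the sketch below traces the feature-map sizes through the layers commonly described for LeNet-5 (a 32 x 32 input, 5 x 5 convolutions and 2 x 2 subsampling). The helper names are hypothetical, and the code only computes shapes; it is not an implementation of the network.

```python
def conv_out(size, kernel, stride=1):
    """Side length after an unpadded convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Side length after subsampling (pooling)."""
    return (size - window) // stride + 1

size = 32                                   # input image is 32 x 32
shapes = {}
size = conv_out(size, 5); shapes["C1"] = (6, size, size)     # 6 feature maps
size = pool_out(size);    shapes["S2"] = (6, size, size)
size = conv_out(size, 5); shapes["C3"] = (16, size, size)    # 16 feature maps
size = pool_out(size);    shapes["S4"] = (16, size, size)
size = conv_out(size, 5); shapes["C5"] = (120, size, size)   # 120 maps of 1 x 1
shapes["F6"] = (84,)                        # fully connected layer
shapes["Output"] = (10,)                    # one score per digit class

for name, shape in shapes.items():
    print(name, shape)
```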
CHAPTER 2
RELATED WORK
2.1 RELATED WORK
CHAPTER 3
ARCHITECTURE
3.1 ARCHITECTURE
➢ A CNN is made up of three types of layers: convolutional layers, pooling layers, and
fully-connected (FC) layers.
➢ The convolutional part acts as a tool that separates and identifies the various features of the
image for analysis, in a process called feature extraction.
➢ An activation function adds non-linearity to the network. Commonly used activation
functions include the ReLU, softmax, tanh and sigmoid functions. Each of these
functions has a specific use.
➢ For a binary classification CNN model, the sigmoid and softmax functions are preferred, and
for multi-class classification, softmax is generally used. In simple terms, activation
functions in a CNN model determine whether a neuron should be activated or not.
➢ That is, an activation function decides, using mathematical operations, whether the input to a
neuron is important for the prediction.
➢ A fully connected layer uses the output of the convolution process and predicts
the class of the image based on the features extracted in the previous stages.
➢ Feature extraction in a CNN aims to reduce the number of features present in a
dataset. It creates new features that summarize the existing features contained in the
original set of features. A CNN contains many layers, as shown in the CNN
architecture diagram.
CONVOLUTIONAL LAYERS
Three types of layers make up a CNN: convolutional layers, pooling
layers, and fully-connected (FC) layers. When these layers are stacked, a CNN architecture is
formed. In addition to these three layers, there are two more important components, namely
the dropout layer and the activation function, which are described below.
1.Convolutional Layer
This layer is the first layer that is used to extract the various features from the input images. In
this layer, the mathematical operation of convolution is performed between the input image and a
filter of a particular size MxM. By sliding the filter over the input image, the dot product is taken
between the filter and the parts of the input image with respect to the size of the filter (MxM).
The output is termed as the Feature map which gives us information about the image such as the
corners and edges. Later, this feature map is fed to other layers to learn several other features of
the input image.
The convolution layer in a CNN passes its result to the next layer after applying the convolution
operation to the input. A key benefit of convolutional layers is that they keep the spatial
relationship between the pixels intact.
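The sliding dot product described in this section can be sketched in a few lines of NumPy. This is a minimal illustration and not an optimized layer: it uses no padding, a stride of 1, and does not flip the filter (which is the convention most deep learning libraries follow). The function name, the 4 x 4 image and the 2 x 2 filter are hypothetical examples.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide an M x M filter over the image and take the dot product
    between the filter and each image patch it covers."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    m_h, m_w = kernel.shape
    out_h = image.shape[0] - m_h + 1
    out_w = image.shape[1] - m_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + m_h, j:j + m_w]
            out[i, j] = np.sum(patch * kernel)
    return out

# Illustrative image: dark left half, bright right half (a vertical edge).
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]])
# Illustrative filter that responds to dark-to-bright transitions.
kernel = np.array([[-1, 1],
                   [-1, 1]])

edges = feature_map(image, kernel)
print(edges)
```

The feature map is non-zero only in the column where the edge lies, so the position of the feature in the image is preserved in the output.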
2.Pooling Layer
In most cases, a Convolutional Layer is followed by a Pooling Layer. The primary aim of this
layer is to decrease the size of the convolved feature map in order to reduce the computational cost. This
is achieved by decreasing the connections between layers, and the pooling operation works independently on each
feature map. Depending on the method used, there are several types of pooling operations, such as
max pooling and average pooling. Pooling basically summarizes the features generated by a convolution layer.
The pooling layer generalizes the features extracted by the convolution layer and helps the
network to recognize the features independently of their exact position. As a result, the computations in the
network are also reduced.
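A minimal sketch of two common pooling operations, max pooling and average pooling, is given below. It assumes non-overlapping square windows and NumPy; the function name `pool2d` and the sample feature map are hypothetical.

```python
import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Summarize each non-overlapping size x size window of a feature map."""
    fmap = np.asarray(fmap, dtype=float)
    h_out, w_out = fmap.shape[0] // size, fmap.shape[1] // size
    # Drop trailing rows/columns that do not fill a whole window,
    # then group the remaining values into (row block, row, col block, col).
    blocks = fmap[:h_out * size, :w_out * size].reshape(h_out, size, w_out, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 9, 5],
                 [1, 1, 3, 7]])

pooled_max = pool2d(fmap, size=2, mode="max")
pooled_avg = pool2d(fmap, size=2, mode="average")
print(pooled_max)
print(pooled_avg)
```

Both variants reduce the 4 x 4 map to 2 x 2, so the next layer has a quarter as many values to process.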
3.Fully Connected Layer
The Fully Connected (FC) layer consists of the weights and biases along with the neurons and is
used to connect the neurons between two different layers. These layers are usually placed before
the output layer and form the last few layers of a CNN architecture.
Here, the feature maps from the previous layers are flattened into a vector and fed to the FC layer. The
flattened vector then passes through a few more FC layers, where the mathematical operations
usually take place. In this stage, the classification process begins. Two FC layers are often
stacked because two fully connected layers tend to perform better than a single
one. These layers in a CNN reduce the need for human supervision.
The image above shows why we call these kinds of layers “fully connected” or sometimes
“densely connected.” All possible connections layer-to-layer are present, meaning every input of
the input vector influences every output of the output vector. The orange lines represent the first
neuron (or perceptron) of the layer. The weights of this neuron only affect output A, and do not
have an effect on outputs B, C or D.
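The flattening step and the "every input influences every output" property can be sketched as a matrix-vector product. The layer sizes (eight inputs, four outputs) and the random weights below are assumptions made purely for illustration.

```python
import numpy as np

def fully_connected(x, weights, bias):
    """Each output is a weighted sum of all inputs plus a bias."""
    return weights @ x + bias

# Assumed example: two 2 x 2 feature maps flattened into a vector of 8 values.
feature_maps = np.arange(8.0).reshape(2, 2, 2)
x = feature_maps.flatten()

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8))   # one row of weights per output neuron
bias = np.zeros(4)
outputs = fully_connected(x, weights, bias)

# Changing the weights of the first neuron changes only the first output.
altered = weights.copy()
altered[0] = 0.0
outputs_altered = fully_connected(x, altered, bias)

print(outputs)
print(outputs_altered)
```

Row 0 of the weight matrix plays the role of the first neuron: zeroing it changes the first output and leaves the others untouched.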
4.Dropout
When all the features are connected to the FC layer, the model can overfit the training data.
To overcome this problem, a dropout layer is used, wherein a few neurons are randomly dropped from
the neural network during the training process, which temporarily reduces the effective size of the model. With a
dropout rate of 0.3, 30% of the nodes are dropped at random from the neural network.
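A minimal sketch of dropout with a rate of 0.3 follows. It uses the common "inverted dropout" variant, in which the surviving activations are rescaled during training so that their expected value is unchanged; that rescaling is an assumption of this sketch and is not described in the text above. The function and variable names are hypothetical.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of the activations during training."""
    if not training or rate == 0.0:
        return activations          # dropout is switched off at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Rescale the kept values so the expected activation stays the same.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones(10_000)
dropped = dropout(activations, 0.3, rng)

print("fraction dropped:", np.mean(dropped == 0))
```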
5.Activation Functions
Finally, one of the most important components of the CNN model is the activation function. Activation
functions are used to learn and approximate any kind of continuous and complex relationship between
the variables of the network. In simple words, an activation function decides which information should be passed
forward through the network and which should not.
It adds non-linearity to the network. There are several commonly used activation functions, such
as the ReLU, softmax, tanh and sigmoid functions. Each of these functions has a specific
use. For a binary classification CNN model, the sigmoid and softmax functions are preferred, and
for multi-class classification, softmax is generally used. In simple terms, activation functions
in a CNN model determine whether a neuron should be activated or not. They decide, using mathematical
operations, whether the input to a neuron is important for the prediction.
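The four activation functions named above have short standard definitions, sketched here with NumPy. The sample scores are illustrative values only.

```python
import numpy as np

def relu(x):
    """Pass positive values through and set negative values to zero."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Squash any value into the range (0, 1); common for binary outputs."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squash any value into the range (-1, 1)."""
    return np.tanh(x)

def softmax(x):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)         # subtract the maximum for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])  # illustrative class scores
probs = softmax(scores)

print(relu(np.array([-1.0, 2.0])))
print(sigmoid(0.0))
print(probs)
```

In a multi-class classifier the class with the largest softmax probability is taken as the prediction.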
METHODOLOGY
3.2 METHODOLOGY
CHAPTER 4
APPLICATIONS
4.1 APPLICATIONS
Object detection: Convolutional neural networks are used in self-driving cars as well as facial
recognition systems for object detection.
Image captioning: Convolutional neural networks are used to caption and describe images,
making it easier for visually impaired people to understand what the images are trying to convey.
It is even used heavily by YouTube.
Voice synthesis: Google Assistant’s voice synthesizer uses DeepMind’s WaveNet ConvNet
model.
Astrophysics: CNNs are used to make sense of radio telescope data and to predict the probable
visual image that represents that data.
Convolutional neural networks have even found applications to some extent in population
genetic inference as well as in disease identification. They are also used for the purpose of
fraud detection.
ADVANTAGES
4.2 Benefits of employing CNNs :
The benefits of using CNNs over other traditional neural networks in the computer vision
environment are listed as follows:
1.The main reason to consider CNN is the weight sharing feature, which reduces the number of
trainable network parameters and in turn helps the network to enhance generalization and to
avoid overfitting.
2.Concurrently learning the feature extraction layers and the classification layer causes the model
output to be both highly organized and highly reliant on the extracted features.
3.Large-scale network implementation is much easier with CNN than with other
neural networks.
4.CNNs do not require human supervision for the task of identifying important features.
6.Convolutional neural networks also minimize computation in comparison with a regular neural
network.
7.CNNs make use of the same knowledge across all image locations.
8.CNNs are also robust to noise, which means that they can still recognize patterns in images
even if they are distorted or corrupted.
9. CNNs also support transfer learning, which means that they can be trained on one task and
then used to perform another task with little or no additional training.
10. CNNs automate the feature extraction process, which means that they can learn to recognize
patterns in images without the need for manual feature engineering.
DISADVANTAGES
4.3 Limitations of CNNs :
The limitations of using CNNs in the computer vision
environment are listed as follows:
5.In case the convolutional neural network is made up of multiple layers, the training process
could take a particularly long time if the computer does not have a good GPU.
6.Convolutional neural networks will recognize the image as clusters of pixels which are
arranged in distinct patterns. They don’t understand them as components present in the image.
8. CNNs are also vulnerable to adversarial attacks, which involve intentionally manipulating the
input data to fool the CNN into making incorrect decisions. This can be a serious problem in
applications like autonomous vehicles, where safety is a critical concern.
9. CNNs have a limited ability to generalize to new situations. This means that they may perform
poorly on images that are very different from those in the training dataset.
10. Another disadvantage of CNNs is their lack of interpretability. This means that it is difficult
to understand how the CNN makes its decisions.
CONCLUSION
Convolutional neural networks are multi-layer neural networks that are really good at
getting the features out of data. They work well with images and they don't need a lot of pre-
processing. Using convolutions and pooling to reduce an image to its basic features, you can
identify images correctly.
It's easier to train CNN models with fewer initial parameters than with other kinds of neural
networks. You won't need a huge number of hidden layers because the convolutions will be able
to handle a lot of the hidden layer discovery for you.
Today the CNN is considered a powerful tool within machine learning for many applications, such
as face detection, image and video recognition, and voice recognition. CNNs are among the most
effective artificial neural networks for modeling images, but they are not limited to image modeling
and are used in many other applications.
REFERENCES
1. Akkus Z, Galimzianova A, Hoogi A, Rubin DL, Erickson BJ (2017) Deep learning for brain
MRI segmentation: state of the art and future directions. J Digit Imaging 30(4):449–459.
4. Hasan MK, Alam MA, Elahi MTE, Roy S, Wahid SR (2020) CVR-Net: a deep convolutional neural network
for coronavirus recognition from chest radiography images.
6. Sun W, Tseng TB, Zhang J, Qian W (2017) Enhancing deep convolutional neural network
scheme for breast cancer diagnosis with unlabeled data. Comput Med Imaging Graph 57:4–9