
Group Number 29

REPORT ON
IMAGE SEGMENTATION

Image Segmentation
Image segmentation is a method in which a digital image is broken down
into subgroups called image segments. This reduces the complexity of
the image and makes further processing or analysis simpler.
In simple terms, segmentation assigns a label to every pixel: all picture
elements (pixels) belonging to the same category share a common label.
The different image segmentation techniques are as follows:
1. Threshold-based segmentation
2. Edge-based segmentation
3. Region-based segmentation
4. Clustering-based segmentation
5. Artificial Neural Network based segmentation

Research Paper Implementation


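The paper that we have implemented is Mask R-CNN
(https://arxiv.org/pdf/1703.06870.pdf).
The paper presents the Mask R-CNN model, which we place under
region-based segmentation because it segments objects inside proposed
regions of the image. Since it is a deep network, it also belongs to
neural network based segmentation.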

Link for the project:
https://github.com/Rohithkumargattu979/Instance-Segmentation


Brief summary about CNN, Faster R-CNN, Mask R-CNN :


● CNNs (Convolutional Neural Networks) are a class of deep neural
networks, most commonly applied to analysing visual imagery.
● R-CNN (Region-Based CNN) is a type of machine learning model
used for computer vision tasks, especially object detection.
● Faster R-CNN
Faster R-CNN builds on Fast R-CNN. Both classify Regions of
Interest (ROIs) taken from a shared feature map through an ROI
pooling layer. The basic difference between them is where the
region proposals come from: Fast R-CNN relies on an external
proposal method, while Faster R-CNN possesses an extra network
for generating the proposals, which we call the region proposal
network (RPN). The RPN takes the feature map as input and
outputs region proposals, and these proposals go to the ROI
pooling layer for classification and bounding-box refinement.
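● Mask R-CNN
Mask R-CNN extends Faster R-CNN with an additional branch that
predicts a segmentation mask for each region, in parallel with the
class and bounding-box outputs. It detects the objects in an image
and generates a high-quality segmentation mask for each instance.
The paper reported state-of-the-art instance segmentation results
at the time of its publication.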
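
The ROI pooling layer mentioned above turns a region of arbitrary size into a fixed-size grid. The following is a minimal single-channel NumPy sketch of that idea, assuming integer ROI coordinates and a region at least as large as the output grid. The function name roi_max_pool and the toy feature map are illustrative and are not taken from our implementation.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2-D feature map
    down to a fixed output_size grid (illustrative, single channel)."""
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = output_size
    # Split the rows and columns of the region into roughly equal bins.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.zeros(output_size, dtype=feature_map.dtype)
    for r in range(out_h):
        for c in range(out_w):
            cell = region[row_edges[r]:row_edges[r + 1],
                          col_edges[c]:col_edges[c + 1]]
            pooled[r, c] = cell.max()  # keep the strongest response per bin
    return pooled

# Toy 6x6 feature map whose value at (row, col) is 6 * row + col.
feature_map = np.arange(36, dtype=np.float32).reshape(6, 6)
pooled = roi_max_pool(feature_map, roi=(1, 1, 5, 5))
print(pooled)
```

Whatever the size of the proposal, the output is always a 2 x 2 grid here. According to the paper, Mask R-CNN replaces this quantised pooling with RoIAlign, which samples the feature map with bilinear interpolation instead of rounding to integer bins.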

The results of the model that we have developed are discussed below,
including snippets of the program, their meaning, and the outputs, shown
as the original image alongside the segmented image.

NOTE: The deep learning and machine learning frameworks and libraries
used include:

numpy, torch and torchvision (PyTorch), cv2 (OpenCV), matplotlib, TensorFlow, etc.

Snippets of our Implementation


for i in range(detection_count):
    box = boxes[0, 0, i]
    class_id = box[1]
    score = box[2]
    if score < 0.5:
        continue

    # Get box coordinates
    x = int(box[3] * width)
    y = int(box[4] * height)
    x2 = int(box[5] * width)
    y2 = int(box[6] * height)

    roi = black_image[y: y2, x: x2]
    roi_height, roi_width, _ = roi.shape

    # Get the mask
    mask = masks[i, int(class_id)]
    mask = cv2.resize(mask, (roi_width, roi_height))
    _, mask = cv2.threshold(mask, 0.5, 255, cv2.THRESH_BINARY)

    cv2.rectangle(img, (x, y), (x2, y2), (255, 0, 0), 3)

    # Get mask coordinates
    contours, _ = cv2.findContours(np.array(mask, np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    color = colors[int(class_id)]
    for cnt in contours:
        cv2.fillPoly(roi, [cnt], (int(color[0]), int(color[1]),
                                  int(color[2])))
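
The mask step of the loop above can be tried in isolation. The sketch below is a NumPy-only stand-in, not our implementation: it replaces cv2.resize with nearest-neighbour sampling and replaces the findContours / fillPoly pair with direct boolean indexing. The helper name paint_mask, the toy mask and the canvas are all hypothetical.

```python
import numpy as np

def paint_mask(roi, soft_mask, color, threshold=0.5):
    """Colour the pixels of roi (H x W x 3) whose mask value exceeds threshold.

    soft_mask is a small 2-D array of probabilities. It is stretched to the
    ROI size with nearest-neighbour sampling (a stand-in for cv2.resize).
    """
    roi_h, roi_w = roi.shape[:2]
    mask_h, mask_w = soft_mask.shape
    rows = np.arange(roi_h) * mask_h // roi_h
    cols = np.arange(roi_w) * mask_w // roi_w
    resized = soft_mask[rows[:, None], cols[None, :]]
    binary = resized > threshold      # same rule as cv2.THRESH_BINARY
    roi[binary] = color               # paint only the object pixels
    return binary

canvas = np.zeros((8, 8, 3), dtype=np.uint8)   # plays the role of black_image
x, y, x2, y2 = 2, 2, 6, 6                      # hypothetical box coordinates
roi = canvas[y:y2, x:x2]                       # a view: painting it edits canvas
soft_mask = np.array([[0.9, 0.2],
                      [0.7, 0.1]])
binary = paint_mask(roi, soft_mask, color=(0, 255, 0))
```

Because roi is a view into the canvas, colouring the ROI also colours the corresponding pixels of the full image. The original loop relies on the same property when it fills polygons in roi.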

RESULTS OF THE IMPLEMENTATION OF MASK R-CNN:


Original Image taken to test the model:

The results after running on the MASK R-CNN model:

These images are the results of the Mask R-CNN model.
● The first shows the detected objects only: the model detects all
the objects in the given image and masks out everything else.
● The second is the instance-segmented image: each detected object
is assigned to a class, and the colour for each class label is chosen
randomly.


Proposed Change in Algorithm to the Model:


The model that we have implemented, Mask R-CNN, belongs to region-based
segmentation, which is just one of the techniques in image segmentation.
As a change or innovation over the existing method, we have also
implemented a modified K-Means clustering algorithm.
K-Means Algorithm: This algorithm identifies the different clusters in
the given data based on the similarity present in the data.

While implementing the K-Means model, we found from the research papers
we studied that the usual approach colours each cluster with its computed
centroid. This is of little use to end users for image data that has few
distinguishable colours (for example, MRI image data). The output clusters
the given image but assigns colours from the same colour range as the
input, so the clusters remain hard to tell apart, which defeats the
purpose of the segmentation.
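
The baseline behaviour described above, where every pixel is replaced by the centroid of its cluster, can be sketched as follows. This is a plain NumPy version of Lloyd's k-means written for illustration only. The function name kmeans_segment, the iteration count and the toy image are assumptions, not part of our implementation.

```python
import numpy as np

def kmeans_segment(pixels, k, iterations=10, seed=0):
    """Plain Lloyd's k-means on an (N, C) array of pixel values.
    Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    pixels = pixels.astype(np.float64)
    # Initialise the centroids from k randomly chosen pixels.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iterations):
        # Assign each pixel to its nearest centroid.
        distances = np.linalg.norm(
            pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of the pixels assigned to it.
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Toy grayscale image: a dark group and a bright group of intensities.
image = np.array([[10, 12, 200, 202],
                  [11, 13, 201, 203]], dtype=np.uint8)
pixels = image.reshape(-1, 1)
labels, centroids = kmeans_segment(pixels, k=2)
# Baseline output: each pixel takes the value of its cluster centroid.
segmented = centroids[labels].reshape(image.shape)
```

The segmented image contains only centroid intensities, which come from the same range as the input. When the centroids are close to each other, the clusters are difficult to distinguish visually.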

Motivation behind the idea of innovation:


In recent years, machine learning and deep learning have penetrated
many different fields, and one impactful field is medical science. Using
MRI data to detect deadly tumours is one useful case. When the different
regions in an MRI scan need to be clearly distinguished, the K-Means
colouring discussed above becomes a hindrance because it does not depict
the differences clearly. This has been the main motivation behind
introducing contrasting colours into the same K-Means clustering method.


In the figure above, image (a) consists of only 3 colours, which are in
turn used as the colours for the k clusters. Here the value of k loses
its significance if k is greater than the number of colours, as the
clusters cannot be differentiated clearly.
Image (b) is the desired image after k-means clustering, in which the
clusters can be understood, differentiated, and used to draw
inferences.

The following code snippet shows our improvement as part of the
proposed technique:

for i in range(0, kmeans.labels_.size):
    if kmeans.labels_[i] == 0:
        img2[i] = np.array([0.0, 0.0, 255.0])
    elif kmeans.labels_[i] == 1:
        img2[i] = np.array([255.0, 255.0, 0.0])
    elif kmeans.labels_[i] == 2:
        img2[i] = np.array([0.0, 0.0, 0.0])
    elif kmeans.labels_[i] == 3:
        img2[i] = np.array([255.0, 0.0, 255.0])
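
The same recolouring can be written without a per-pixel loop by indexing a palette array with the label array. The sketch below assumes four clusters and reuses the four colours from the snippet above. The names PALETTE and recolor, and the toy label array, are hypothetical.

```python
import numpy as np

# One contrasting colour per cluster label, in the same order as the
# if / elif chain above (label 0 first).
PALETTE = np.array([
    [0.0, 0.0, 255.0],
    [255.0, 255.0, 0.0],
    [0.0, 0.0, 0.0],
    [255.0, 0.0, 255.0],
])

def recolor(labels, palette=PALETTE):
    """Map every cluster label to its palette colour in one indexing step."""
    labels = np.asarray(labels)
    if labels.max() >= len(palette):
        raise ValueError("more clusters than palette colours")
    return palette[labels]

# Toy label array standing in for kmeans.labels_.
labels = np.array([0, 1, 2, 3, 1, 0])
img2 = recolor(labels)
```

For a real image, the result would be reshaped back to the height, width and channel layout of the image. Supporting a larger k only requires adding rows to the palette.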


Experiments that have been conducted / carried out:


Apart from region-based and clustering-based segmentation, we have
also carried out some other experiments with the techniques listed
below:

1. Threshold Based Segmentation

In Threshold Based Segmentation, we define a minimum threshold
for the pixel values and classify each pixel according to whether
its value falls below this threshold or not.

Input image Output image
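
Threshold based segmentation reduces to a single comparison per pixel. The following is a minimal NumPy sketch, assuming a grayscale image and a hypothetical threshold of 127. Pixels at or above the threshold are labelled 1 and the rest 0.

```python
import numpy as np

def threshold_segment(gray, threshold):
    """Label pixels at or above threshold as 1 and all other pixels as 0."""
    return (gray >= threshold).astype(np.uint8)

# Toy 3x3 grayscale image.
gray = np.array([[ 12,  40, 200],
                 [ 90, 180, 250],
                 [  5, 130,  60]], dtype=np.uint8)
labels = threshold_segment(gray, threshold=127)
print(labels)
```

The choice of threshold decides the result completely, so in practice it has to be tuned to the image or estimated from its histogram.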


2. Edge Based Segmentation


In Edge Based Segmentation, we rely on the edges in an image to
separate the various objects. Here we use edge detection operators
such as the Prewitt edge operator. These operators detect changes
in the gray level of an image to find the edges and thereby the
objects in the image. For the given input, the Prewitt edge operator
detects the objects present in the image successfully, as is evident
from the output.

Input:

Output:
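
The Prewitt operator estimates the horizontal and vertical change in gray level with two 3x3 kernels and combines them into an edge magnitude. The sketch below implements it with plain NumPy slicing over the valid region only, with no padding. The helper names and the toy image are illustrative and are not taken from our implementation.

```python
import numpy as np

# Prewitt kernels for horizontal (x) and vertical (y) gray-level change.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=np.float64)
PREWITT_Y = PREWITT_X.T

def filter3x3(gray, kernel):
    """Correlate a 2-D image with a 3x3 kernel (valid region, no padding)."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return out

def prewitt_edges(gray):
    """Return the gradient magnitude computed with the Prewitt kernels."""
    gray = gray.astype(np.float64)
    gx = filter3x3(gray, PREWITT_X)
    gy = filter3x3(gray, PREWITT_Y)
    return np.hypot(gx, gy)

# Toy image: dark left half, bright right half, so one vertical edge.
gray = np.zeros((5, 6), dtype=np.uint8)
gray[:, 3:] = 100
magnitude = prewitt_edges(gray)
print(magnitude)
```

The magnitude is zero inside the flat regions and large only in the columns that straddle the boundary. Thresholding the magnitude then gives a binary edge map.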


3. Custom Training of the Dataset.

The pre-trained model that we used was trained on a large dataset and
can identify over 50 classes of objects.
We also tried to train the model with our own dataset of images. The
images were annotated manually using the makesense.ai website. The images
used for training show various screwdrivers. A COCO JSON file is
generated from the annotations and is later used to train the Mask R-CNN
model to identify the screwdrivers in the input image.
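
For orientation, a COCO-style annotation file has three main lists: images, categories and annotations. The sketch below builds a minimal example in Python. Every value in it (file name, image size, polygon, ids) is hypothetical and does not come from our dataset, and the helper check_references is only an illustrative sanity check.

```python
import json

# Minimal COCO-style structure with one image, one class and one object.
coco = {
    "images": [
        {"id": 1, "file_name": "screwdriver_001.jpg",
         "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "screwdriver"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,        # refers to an entry in "images"
            "category_id": 1,     # refers to an entry in "categories"
            # Polygon as a flat list of x, y coordinates.
            "segmentation": [[100, 200, 300, 200, 300, 240, 100, 240]],
            # Bounding box as [x, y, width, height].
            "bbox": [100, 200, 200, 40],
            "area": 8000,
            "iscrowd": 0,
        },
    ],
}

def check_references(data):
    """Return True if every annotation points at a known image and category."""
    image_ids = {image["id"] for image in data["images"]}
    category_ids = {category["id"] for category in data["categories"]}
    return all(a["image_id"] in image_ids and a["category_id"] in category_ids
               for a in data["annotations"])

serialized = json.dumps(coco, indent=2)
is_consistent = check_references(json.loads(serialized))
```

A check of this kind is useful before training, because an annotation that points at a missing image or category id is easy to overlook.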
The following images depict the classes defined in the annotation of
the images.


A random image is taken and run through the trained model. The
above image is the output for the given image.


Conclusions And Findings:


We applied a pre-trained Mask R-CNN model. Among the approaches
we tried, we found it to be the best model for instance
segmentation.
K-Means clustering was also employed, and after the modification
shown above we were able to detect the clusters present in a
grayscale image. In our experiments, it was the most effective
algorithm for separating a desired object from its background in
the image.
Since there is a vast number of resources in the growing field
of Deep Learning, there are many options from which to choose the
model that best suits our requirements. There is a lot of scope
to research and explore.


Contribution Of Team Members:

Name                   Contribution

Srikar Shashank        Mask R-CNN implementation and report writing
Pavan Shyamendra       Edge based implementation and appending colors for K-MEANS as the innovation
M S Narain Shriraam    Report writing and Edge based segmentation
Rohit Kumar Gattu      MASK R-CNN implementation and report writing
Kasina Satwik          Custom training the data set for Mask R-CNN as an experiment and report writing
Nithin Karupakula      Report writing and Threshold based segmentation
