Cell Counting Computer Vision


Faster RCNN Breakdown

[Figure: Faster RCNN pipeline. Input Image -> Region Proposal Network -> RoI pooling ->
Regressor (Bounding Box) and Classifier (Object or Background)]

Input Image:
Faster RCNN expects its training data in a specific form: the image, the class name of each object,
and the bounding-box coordinates as pairs of x and y values, (xmin, xmax) and (ymin, ymax). The image
first passes through a pretrained CNN (VGG16; Inception or ResNet can also be used) that extracts a
feature map. This feature map is sent to the Region Proposal Network.
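
As a rough illustration of the input described above, the sketch below stores one annotated object as a plain Python record and estimates the size of the backbone feature map. The field names, the file name `cells_001.png` and the class name `cell` are hypothetical. The stride of 16 is an assumption that holds for VGG16 when the feature map is taken after four 2 * 2 pooling layers.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One labelled object in an image (field names are illustrative)."""
    image_path: str
    class_name: str
    xmin: int
    xmax: int
    ymin: int
    ymax: int


def feature_map_size(height, width, stride=16):
    """Approximate spatial size of the backbone feature map.

    Assumes a total downsampling factor of `stride` (16 for VGG16 with four
    pooling layers) and rounding down at each pooling step.
    """
    return height // stride, width // stride


# Hypothetical example: one cell annotated in a 600 x 800 image.
example = Annotation("cells_001.png", "cell", xmin=34, xmax=90, ymin=120, ymax=170)
fm_h, fm_w = feature_map_size(600, 800)
print(example.class_name, fm_h, fm_w)
```

With a VGG16 backbone each feature-map location also carries 512 channels, which is the channel count quoted in the later stages.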

Region Proposal Network:


The Region Proposal Network (RPN) is tasked with finding the areas in the image that are likely to
contain an object. It labels areas with an object as foreground and areas without an object as
background.

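This task starts by placing anchor boxes of different sizes over the image so that objects of all
sizes can be captured. After the anchor boxes are generated, the Intersection over Union (IoU)
between each anchor box and the ground-truth bounding box is computed; this measures how much the
two boxes overlap. If the IoU is greater than or equal to 50%, the anchor is labelled as foreground,
and if it is less than 50%, it is labelled as background. (The original paper, Ren et al. 2015,
uses two thresholds instead: an IoU above 0.7 for foreground and below 0.3 for background.)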
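
The IoU computation and the labelling rule can be sketched as follows. This is a minimal illustration, not the reference implementation: it assumes boxes are given as (xmin, ymin, xmax, ymax), which differs from the (xmin, xmax) (ymin, ymax) ordering of the annotations, and it uses the single 50% threshold described in these notes. The function names are made up for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, threshold=0.5):
    """Label an anchor by its best IoU with any ground-truth box."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    return "foreground" if best >= threshold else "background"


print(label_anchor((0, 0, 10, 10), [(0, 0, 10, 8)]))
print(label_anchor((0, 0, 10, 10), [(20, 20, 30, 30)]))
```

The first anchor overlaps its ground-truth box with an IoU of 0.8 and is kept as foreground; the second does not overlap at all and becomes background.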

The foreground anchors move on to the next stage. (The output is a set of region proposals, i.e.
regions of the CNN feature map, with different sizes.)

Internally, the RPN applies a 3 * 3 convolution with 512 channels to the feature map, followed by
two separate 1 * 1 convolutions: one for the classification task (2 outputs * 9 anchors) and one for
the regression task (4 bounding-box values * 9 anchors).

The classification branch tells us whether there is an object in the area or not (binary
classification: object or no object).
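
To make the 9 anchors and the two output sizes concrete, here is a small sketch in plain Python. It assumes the anchor settings of the original paper (3 scales of 128, 256 and 512 pixels, and 3 aspect ratios of 1:2, 1:1 and 2:1); the function names and the 37 * 50 feature-map size are chosen only for illustration.

```python
def make_anchors(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return the (width, height) of the 9 anchors placed at one location.

    base_size * scale gives the side of a square anchor (128, 256, 512 px);
    each ratio (height / width) reshapes it while keeping the same area.
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:
            width = (area / ratio) ** 0.5
            height = width * ratio
            anchors.append((width, height))
    return anchors


def rpn_output_shapes(fm_h, fm_w, num_anchors=9):
    """Shapes (channels, height, width) of the two 1 * 1 convolution outputs."""
    cls_shape = (2 * num_anchors, fm_h, fm_w)  # object / no object per anchor
    reg_shape = (4 * num_anchors, fm_h, fm_w)  # 4 box values per anchor
    return cls_shape, reg_shape


anchors = make_anchors()
cls_shape, reg_shape = rpn_output_shapes(37, 50)
print(len(anchors), cls_shape, reg_shape)
print("anchors over the whole feature map:", 37 * 50 * len(anchors))
```

The classification output therefore has 18 channels and the regression output has 36 channels at every feature-map location.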

RoI pooling:
The RoI pooling layer receives the foreground region proposals and the feature map from the VGG.
The regions it crops from the feature map have different sizes, so RoI pooling reduces them all to
the same (fixed) size using max pooling (7 * 7 output * 512 channels). It sends the output through
two fully connected layers, after which the network splits into the Regressor and the Classifier.
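
The fixed-size pooling step can be sketched for a single channel as shown below; a real layer would apply the same operation to each of the 512 channels. The sketch assumes the region is already expressed in feature-map cells as (row_start, col_start, row_end, col_end) with exclusive end indices, and the way the region is split into bins (floor for the start, ceiling for the end) is one possible choice rather than the exact scheme of any particular library.

```python
def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one region of a 2-D feature map to out_size x out_size.

    feature_map: list of rows (a 2-D list of numbers)
    roi: (row_start, col_start, row_end, col_end), end indices exclusive
    """
    r0, c0, r1, c1 = roi
    height, width = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        # Row range of bin i; every bin covers at least one cell.
        rs = r0 + (i * height) // out_size
        re = max(rs + 1, r0 + ((i + 1) * height + out_size - 1) // out_size)
        row = []
        for j in range(out_size):
            # Column range of bin j, computed the same way.
            cs = c0 + (j * width) // out_size
            ce = max(cs + 1, c0 + ((j + 1) * width + out_size - 1) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# Toy 14 x 14 feature map whose values increase from left to right, top to bottom.
fmap = [[r * 14 + c for c in range(14)] for r in range(14)]
large = roi_max_pool(fmap, (0, 0, 14, 14))
small = roi_max_pool(fmap, (2, 3, 5, 13))
print(len(large), len(large[0]), len(small), len(small[0]))
```

Both regions, although 14 * 14 and 3 * 10 cells in size, come out as 7 * 7 grids, which is what lets the fully connected layers accept them.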

Regressor:
Predicts the coordinates of the bounding box around the object.
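
In the original paper the regressor does not output box corners directly; it predicts four offsets (tx, ty, tw, th) relative to a reference box (an anchor in the RPN, a proposal in the final stage). The sketch below decodes such offsets into a box. The function name and the (xmin, ymin, xmax, ymax) box layout are assumptions made for this example.

```python
import math


def decode_box(reference, deltas):
    """Apply predicted offsets to a reference box.

    reference: (xmin, ymin, xmax, ymax)
    deltas: (tx, ty, tw, th), where tx and ty shift the centre in units of the
            reference width and height, and tw and th scale the size on a log scale
    """
    xa_min, ya_min, xa_max, ya_max = reference
    wa, ha = xa_max - xa_min, ya_max - ya_min
    xa, ya = xa_min + 0.5 * wa, ya_min + 0.5 * ha  # reference centre

    tx, ty, tw, th = deltas
    x, y = xa + tx * wa, ya + ty * ha              # shifted centre
    w, h = wa * math.exp(tw), ha * math.exp(th)    # rescaled size
    return (x - 0.5 * w, y - 0.5 * h, x + 0.5 * w, y + 0.5 * h)


print(decode_box((0, 0, 10, 10), (0.0, 0.0, 0.0, 0.0)))
print(decode_box((0, 0, 10, 10), (0.1, 0.0, math.log(2.0), 0.0)))
```

Offsets of zero return the reference box unchanged; a positive tx moves the box to the right, and tw = log(2) doubles its width.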

Classifier:
Determines whether there is an object and returns the name of its class.
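
A minimal sketch of this last step, assuming the classifier produces one raw score per class and that the first class is reserved for background. The class names `background` and `cell` and the scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical scores for one region proposal.
name, prob = classify([0.2, 2.5], ["background", "cell"])
print(name, round(prob, 3))
```

A proposal whose best class is background is simply discarded, so only proposals classified as a real object are reported together with their bounding box.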
References
1. Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal
networks." Advances in Neural Information Processing Systems. 2015.

2. https://arxiv.org/pdf/1506.01497.pdf

3. https://debuggercafe.com/introduction-to-deep-learning-for-object-detection/

4. https://debuggercafe.com/evaluation-metrics-for-object-detection/

5. https://debuggercafe.com/faster-rcnn-object-detection-with-pytorch/

Appendix
