
The Code:

import cv2

config = 'ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt'
weights = 'frozen_inference_graph.pb'
model = cv2.dnn_DetectionModel(weights, config)

labels = []
filename = 'labels.txt'
with open(filename, 'rt') as fpt:  # 'r' opens for reading, 't' means text mode
    labels = fpt.read().rstrip('\n').split('\n')
print(labels)
print(len(labels))

model.setInputSize(320, 320)
# Sets the model's input size to 320 by 320 pixels; the input image is
# resized to this size before being fed into the model for processing.
model.setInputScale(1.0 / 127.5)
# 255 / 2 = 127.5; this scaling keeps the input data within a range
# suited to the model's activation functions.
model.setInputMean(127.5)
# Centers the data around zero and removes bias caused by lighting
# conditions or image-capture settings.
model.setInputSwapRB(True)
# Swaps the Red and Blue channels of the input image, to account for
# differences in channel-ordering conventions between image sources or
# libraries (OpenCV captures frames in BGR order).

cap = cv2.VideoCapture(0)

font_scale = 3
font = cv2.FONT_HERSHEY_PLAIN

while True:
    ret, frame = cap.read()

    class_ids, confs, boxes = model.detect(frame, confThreshold=0.55)

    print(class_ids)
    if len(class_ids) != 0:
        for class_id, conf, box in zip(class_ids.flatten(), confs.flatten(), boxes):
            if class_id <= 80:
                cv2.rectangle(frame, box, (150, 0, 170), 2)
                cv2.putText(frame, labels[class_id - 1], (box[0] + 10, box[1] + 40),
                            font, fontScale=font_scale, color=(255, 0, 0), thickness=1)

    cv2.imshow('Human Detection', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

How human detection will work:


Step 1: Choosing a library.
Python offers a variety of libraries for computer vision applications. Among these, OpenCV stands out
as a reliable and robust choice for human detection. OpenCV's human detection algorithms can detect
multiple individuals and filter out extraneous objects in the scene. However, to enhance its
capabilities, we plan to explore methods for detecting humans that are partially obscured by mud or
other environmental factors.

Step 2: Have a pre-trained model.


The trained model that we use is loaded through the OpenCV library. As you can see in the
code above, we import a text file called “labels.txt”. This file contains a list of the 80 object
classes that our model is capable of detecting, including human beings, vehicles, and other everyday
objects. However, this is just a small subset of the library's capabilities. The OpenCV library is
constantly updated and improved to increase its capabilities and is widely used in various fields.
After importing the list of detectable objects, we further refined our model's focus to prioritize the
detection of humans in images or videos captured by the camera. We then filtered out everything else
so that the program mainly reacts when the camera detects a human.

All of this comes pre-trained from OpenCV, and we have modified it to our liking.
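As a minimal sketch of how the label list maps onto the detector's output (assuming labels.txt holds one COCO class name per line, with “person” on the first line, i.e. class ID 1):

# Sketch: look up the 1-based class ID that model.detect() returns for a human.
# Assumes labels.txt lists one COCO class name per line, with "person" first.
with open('labels.txt', 'rt') as fpt:
    labels = fpt.read().rstrip('\n').split('\n')

person_id = labels.index('person') + 1  # detect() returns 1-based class IDs
print(f'"person" -> class ID {person_id} of {len(labels)} classes')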

Step 3: Access the web camera.


From Python with OpenCV we can access the camera using the command “cv2.VideoCapture()”,
passing 0, 1, 2, etc. to select which camera input we will use. For now, we are using the
computer’s camera, and when we attach the components, we will access the drone’s camera.
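A quick check that the chosen camera index works could look like this (index 0 for the built-in camera is an assumption; external devices usually enumerate as 1, 2, and so on):

import cv2

cap = cv2.VideoCapture(0)               # 0 = built-in camera (assumed)
if not cap.isOpened():
    raise RuntimeError('Could not open camera at index 0')
ret, frame = cap.read()                 # grab a single test frame
if ret:
    print('frame shape:', frame.shape)  # e.g. (480, 640, 3) for 640x480 BGR
cap.release()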

Step 4: Detect a human.


After completing all the steps, we can run the code and test it. In testing, we found that it
can detect multiple humans, while other objects can be detected and treated as unknowns.
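As a hedged sketch of that filtering step (assuming model and frame are set up as in the code above, and that class ID 1 is “person” in the COCO label list our model uses):

# Sketch: keep only "person" detections and mark every other detection as
# unknown. Assumes `model` and `frame` exist as in the main code above, and
# that COCO class ID 1 corresponds to "person".
class_ids, confs, boxes = model.detect(frame, confThreshold=0.55)
if len(class_ids) != 0:
    for class_id, conf, box in zip(class_ids.flatten(), confs.flatten(), boxes):
        name = 'human' if class_id == 1 else 'unknown'
        cv2.rectangle(frame, box, (150, 0, 170), 2)
        cv2.putText(frame, name, (box[0] + 10, box[1] + 40),
                    cv2.FONT_HERSHEY_PLAIN, 3, (255, 0, 0), 2)
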
Step 5: Detecting a human in a flood.
Because we want our drones to detect humans in a flood, a victim might be covered in mud or
surrounded by objects that disturb the detection procedure. Due to this challenge, we are planning
to improve our code and use more advanced pre-trained machine-learning models. Many pre-trained
models are available for detecting a human, such as YOLO (You Only Look Once) and SSD (Single
Shot MultiBox Detector). YOLO is an object detection method that uses an end-to-end neural
network to predict both bounding boxes and class probabilities in a single pass. Unlike previous
algorithms, which repurposed classifiers to perform detection, YOLO achieved state-of-the-art
results by taking a fundamentally different approach. It performs all of its predictions with the help
of a single fully connected layer, rather than using a Region Proposal Network to detect possible
regions of interest and performing recognition on those regions separately. YOLO requires only a
single pass over the image, whereas other methods require multiple passes for the same image.[1]
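As a hedged sketch of what a YOLO-based version might look like, using the third-party ultralytics package and its small pre-trained COCO model (the package, model name, and image path here are assumptions on our part, not part of our current code):

# Sketch only: requires `pip install ultralytics`; yolov8n.pt is a small
# pre-trained COCO model that Ultralytics downloads automatically.
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
# In the COCO ordering used by Ultralytics, class 0 is "person" (0-based,
# unlike the 1-based IDs returned by cv2.dnn_DetectionModel above).
results = model('flood_frame.jpg', classes=[0])  # hypothetical image path
for result in results:
    for box in result.boxes:
        print('person at', box.xyxy[0].tolist(), 'confidence', float(box.conf[0]))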

Moving on to SSD: it’s a fast and accurate object detection algorithm that uses a single feed-forward
convolutional neural network to detect objects in a single shot. It divides the input image into grids,
predicts the presence of objects, refines the bounding boxes, and predicts the class probabilities of the
detected objects. SSD has been widely adopted in various applications and has achieved
state-of-the-art results on various object detection benchmarks.[2]
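For reference, the model in our code above (ssd_mobilenet_v3_large_coco) is already an SSD variant loaded through OpenCV; a sketch of trying a heavier pre-trained SSD via torchvision instead (an assumption on our part, not our current code) might look like:

# Sketch only: requires `pip install torch torchvision`.
import torch
from torchvision.io import read_image
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

weights = SSD300_VGG16_Weights.DEFAULT
model = ssd300_vgg16(weights=weights).eval()

img = read_image('flood_frame.jpg')   # hypothetical image path
batch = [weights.transforms()(img)]   # preprocessing bundled with the weights
with torch.no_grad():
    pred = model(batch)[0]
# torchvision's COCO labels are 1-based, with "person" at index 1.
for label, score, box in zip(pred['labels'], pred['scores'], pred['boxes']):
    if label == 1 and score > 0.55:
        print('person at', box.tolist(), 'score', float(score))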

[1] YOLO Algorithm for Object Detection Explained [+Examples]. (n.d.). Retrieved May 1,
2023, from https://www.v7labs.com/blog/yolo-object-detection

[2] Liu, W. et al. (2016). SSD: Single Shot MultiBox Detector. In: Leibe, B., Matas, J., Sebe, N.,
Welling, M. (eds), Computer Vision – ECCV 2016, Lecture Notes in Computer Science,
vol. 9905. Springer, Cham. https://doi.org/10.1007/978-3-319-46448-0_2
