Week 05

Embedded Systems & IoT Applications
AI on Embedded Device
Woojin Jeong
Object Detection
2
Object Detection
How about it?
● ANN for MNIST : classify a handwritten image as a digit 0 ~ 9
● CNN for CIFAR-10 : classify an image as airplane, automobile, … truck
● How can I classify this image?
3
Object Detection
● Task of detecting instances of objects of a specific class within an image or video.
○ Locates objects in an image using a bounding box → Localization
○ Assigns the types or classes of the objects found → Classification
● Example : localization using a bounding box, classified as “dog”
4
Object Detection Methods
Object Detection in 20 Years: A Survey
5
Object Detection Methods
● Neural-network based method
○ Region-Based CNN (R-CNN, Fast R-CNN, etc.) : 2-stage detector
○ Single Shot Detector (SSD)
○ RetinaNet
○ You Only Look Once (YOLO)
6
YOLO
7
YOLO
● You Only Look Once
○ A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance.
○ 1-stage detector
● By Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
○ https://arxiv.org/abs/1506.02640
8
YOLO - Unified detection
9
● Assume
○ Divide the image into a 4x4 grid (S = 4)
○ Predict 2 bounding boxes (BB) for each grid cell (B = 2)
○ Classify 20 object classes (C = 20)
● Resize image → Divide to 4x4 grids → Predict 2 BB of a grid
10
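YOLO - Unified detection
● Predicted BB : x, y, w, h, Pc
○ (x, y) : center of the box
○ w/W, h/H : width and height of the box relative to the image width W and height H
○ Pc : confidence of the box
■ 1 : Object exists in the cell
■ 0 : No object
■ IOU : Intersection Over Union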
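The IOU named above can be computed directly from two boxes. The following is a minimal Python sketch, assuming each box is given as (x, y, w, h) with (x, y) the box center. The names center_to_corners and iou are illustrative and do not come from the slides.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def center_to_corners(x: float, y: float, w: float, h: float) -> Box:
    """Convert a center-format box (x, y, w, h) to corner format."""
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


def iou(a: Box, b: Box) -> float:
    """Intersection Over Union of two corner-format boxes."""
    # Width and height of the overlap; clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    # Union = area(a) + area(b) - intersection.
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

An IOU of 1.0 means the two boxes coincide, and 0.0 means they do not overlap at all.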
12
● Output for the cell
○ Predicted BB 1 : x, y, w, h, Pc
○ Predicted BB 2 : x, y, w, h, Pc
○ C1, C2, …, C20 : conditional class probabilities of the cell
15
● Output for the cell : 5*2 + 20 = 30 values (B=2, C=20)
● Output tensor : S x S x 30 = 4 x 4 x 30 (S=4)
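The size of the output follows from S, B and C. The sketch below assumes each cell stores its B boxes first (5 values each: x, y, w, h, Pc) and its C class probabilities last. The helper names yolo_output_shape and split_cell are hypothetical.

```python
from typing import List, Sequence, Tuple


def yolo_output_shape(S: int, B: int, C: int) -> Tuple[int, int, int]:
    """Shape of the output tensor: S x S cells, 5*B + C values per cell."""
    return (S, S, 5 * B + C)


def split_cell(cell: Sequence[float], B: int, C: int) -> Tuple[List[List[float]], List[float]]:
    """Split one cell vector into B boxes [x, y, w, h, Pc] and C class probabilities."""
    if len(cell) != 5 * B + C:
        raise ValueError("cell length must be 5*B + C")
    boxes = [list(cell[5 * i: 5 * i + 5]) for i in range(B)]
    class_probs = list(cell[5 * B:])
    return boxes, class_probs
```

With S=4, B=2, C=20 this gives (4, 4, 30); with S=7 the same formula gives (7, 7, 30).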
16
● Resize image → Divide to 4x4 grids → Predict 2 BB of a grid → Output tensor (4 x 4 x 30)
17
YOLO - Network design
18
● Input : 448x448 RGB image
● Output : 7x7x30 (S=7, B=2, C=20)
● YOLOv1 uses PASCAL VOC (20 classes)
19
● 24 convolutional layers, inspired by GoogLeNet : extract features from the image
● 2 fully connected layers : predict output probabilities & coordinates
20
● Pretrained on 1000-class ImageNet
● Fine-tuned on PASCAL VOC (20 classes)
21
YOLO - Inference
● Output tensor : each cell holds x, y, w, h, Pc for each of its 2 BBoxes, followed by C1, C2, …, C20
● Class confidence of each BBox : Pc x (C1, C2, …, C20)
● 4 x 4 cells x 2 BBoxes = 32 BBoxes (BBox1, BBox2, …, BBox31, BBox32)
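The class confidence of a BBox is its Pc multiplied by the conditional class probabilities of its cell. A minimal sketch follows. The threshold parameter, which sets low scores to 0, is an assumption for illustration, and the slides do not give a value for it.

```python
from typing import List, Sequence


def class_confidences(pc: float, class_probs: Sequence[float], threshold: float = 0.0) -> List[float]:
    """Class confidence of one BBox: Pc times each conditional class probability.

    Scores below `threshold` (a hypothetical parameter) are set to 0.0.
    """
    scores = [pc * p for p in class_probs]
    return [s if s >= threshold else 0.0 for s in scores]
```

Applying this to every BBox gives one score per class and per BBox, which is the input to the per-class ranking used at inference.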
YOLO - Inference
● Confidence for “Cat” class of each BBox.
○ BBox12 : 0.9, BBox13 : 0.87, BBox11 : 0.77, BBox1 : 0.0, BBox2 : 0.0
● BBox13 and BBox11 have high confidence values, but they have high IOU with BBox12, so only BBox12 is selected (Non-Maximum Suppression: NMS).
23
YOLO - Inference
● Confidence for “Cat” class of each BBox.
○ BBox11 : 0.9, BBox12 : 0.87, BBox10 : 0.85, BBox21 : 0.85, BBox1 : 0.0, BBox2 : 0.0
● BBox12 and BBox10 have high confidence values.
● BBox21 has low IOU with BBox11, so BBox21 is not removed.
26
YOLO - Inference
● Confidence for “Cat” class of each BBox.
○ 0.9 0.87 0.85 0.85 0.0 0.0
● Confidence for “Dog” class of each BBox.
○ 0.92 0.87 0.85 0.85 0.0 0.0
● Apply Non-Maximum Suppression separately to each object class.
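The NMS procedure described in these slides can be sketched in Python as a greedy loop run once per class. This is an illustrative version, not the implementation used by any YOLO release. The function names and the iou_threshold default of 0.5 are assumptions, and boxes are assumed to be in (x_min, y_min, x_max, y_max) format.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection Over Union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS for one class; returns the indices of the boxes that are kept."""
    # Visit boxes from the highest confidence to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if scores[i] <= 0.0:
            continue  # a zero confidence means no detection for this class
        # Keep the box only if it has low IOU with every box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def nms_per_class(boxes: Sequence[Box], class_scores: Dict[str, Sequence[float]],
                  iou_threshold: float = 0.5) -> Dict[str, List[int]]:
    """Run NMS independently for each class name in class_scores."""
    return {name: nms(boxes, scores, iou_threshold) for name, scores in class_scores.items()}
```

Because each class is processed on its own, a box suppressed for “Cat” can still be kept for “Dog”.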
27
YOLO - Demo
https://www.youtube.com/@yoloobjectdetection3460
29
YOLOv8
30
YOLOv1 ~ YOLOv8
● YOLOv1 ~ YOLOv3
○ by Joseph Redmon
● YOLOv4
○ by Alexey Bochkovskiy (Apr. 2020)
● YOLOv5
○ by Glenn Jocher from Ultralytics (Jun. 2020)
● YOLOv6
○ by Meituan Vision AI Department (Sep. 2022)
● YOLOv7
○ by Alexey Bochkovskiy (Jul. 2022)
● YOLOv8
○ by Glenn Jocher from Ultralytics (Jan. 2023)
https://www.youtube.com/watch?v=QOC6vgnWnYo
31
YOLOv8
32
Use YOLOv8
● Documents for YOLOv8
○ https://docs.ultralytics.com/
● Prepare environment
○ Install Anaconda : https://www.anaconda.com/download
○ Use PowerShell
○ Create Virtual Environment
○ Install packages for YOLOv8 (use pip in the virtual environment)
■ Ultralytics
■ PyTorch
33
Use YOLOv8
● Anaconda & PowerShell
34
Use YOLOv8
● Create Virtual Environment
○ conda create -n [env_name]
● Activate Virtual Environment
○ conda activate [env_name]
● To Deactivate Virtual Environment
○ conda deactivate
36
Use YOLOv8
● Install the YOLOv8 package (use pip in the virtual environment)
○ pip install ultralytics
● YOLOv8 Pre-trained models
○ Pretrained on the COCO dataset (https://cocodataset.org/#home)
○ Detects 80 classes
○ https://docs.ultralytics.com/tasks/detect/
37
Use YOLOv8
● Use pretrained model with CLI in PowerShell
○ conda activate [env_name]
○ yolo
● If the environment is set up correctly, yolo displays its help message.
● Run prediction with a pretrained model.
○ yolo predict model=yolov8n.pt source=https://ultralytics.com/images/zidane.jpg
○ yolo predict model=yolov8n.pt source='https://www.youtube.com/watch?v=sD9gTAFDq40' imgsz=320
● Add the show=True or save=True option to show the result in real time or to save it.
● Results are saved in the ./runs/detect directory.
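The same prediction can also be run from Python instead of the CLI. The sketch below assumes the ultralytics package is installed in the active environment. The helper names detect and summarize are hypothetical, and the import is placed inside detect so that the file still loads on a machine without the package.

```python
from typing import Dict, List, Sequence, Tuple


def summarize(names: Dict[int, str], class_ids: Sequence[float],
              confidences: Sequence[float]) -> List[Tuple[str, float]]:
    """Pair each detected class name with its confidence, rounded to 2 digits."""
    return [(names[int(c)], round(float(p), 2)) for c, p in zip(class_ids, confidences)]


def detect(source: str, weights: str = "yolov8n.pt", imgsz: int = 640) -> List[Tuple[str, float]]:
    """Run a pretrained YOLOv8 model on one image and list what it found."""
    from ultralytics import YOLO  # lazy import: requires `pip install ultralytics`

    model = YOLO(weights)  # downloads the weights on first use
    results = model.predict(source=source, imgsz=imgsz, save=True)
    first = results[0]  # one result object per input image
    return summarize(first.names, first.boxes.cls.tolist(), first.boxes.conf.tolist())
```

Calling detect("https://ultralytics.com/images/zidane.jpg") should return a list of (class name, confidence) pairs for that image. With save=True, the annotated image is written under ./runs/detect, as with the CLI.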
38
