
EMBEDDED VISION

2023/2024

Mohamed Ali HAMDI


Overview

Embedded vision is the integration of vision into machines that use algorithms to decode meaning from observing images or videos. Embedded vision systems use embedded boards, sensors, cameras, and algorithms to extract information. Application areas are many, and include automotive, medical, industrial, domestic, and security systems.

1- INTRODUCTION TO EMBEDDED VISION


Embedded vision refers to the practical use of computer vision in machines that understand their environment through visual means. Computer vision is the use of digital processing and intelligent algorithms to interpret meaning from images or video. Due to the emergence of very powerful, low-cost, and energy-efficient processors, it has become possible to incorporate practical computer vision capabilities into embedded systems, mobile devices, PCs, and the cloud.

Embedded vision block diagram

2- DESIGN OF AN EMBEDDED VISION SYSTEM

An embedded vision system consists, for example, of a camera, a so-called board-level camera, connected to a processing board. Processing boards take over the tasks of the PC in the classic machine vision setup. Because processing boards are much cheaper than classic industrial PCs, vision systems can become smaller and also more cost effective. The interfaces for embedded vision systems are primarily USB or LVDS (low-voltage differential signaling).

As with embedded systems in general, popular single board computers (SBCs) such as the Raspberry Pi are available on the market for embedded vision product development. The Raspberry Pi is a mini computer with established interfaces that offers a similar range of features as a classic PC or laptop. Embedded vision solutions can also be implemented with so-called systems on modules (SoMs) or computers on modules (CoMs). These modules represent a computing unit. To adapt the desired interfaces to the respective application, an individual carrier board is needed. This is connected to the SoM via specific connectors and can be designed and manufactured relatively simply. The SoMs or CoMs (or the entire system) are cost effective on the one hand since they are available off the shelf, while on the other hand they can also be individually customized through the carrier board. For large manufactured quantities, individual processing boards are a good choice.
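
As a concrete illustration, a minimal capture loop on such a single board computer could look as follows. This is a sketch only, assuming Python with the opencv-python package and a USB camera enumerated at device index 0; the exact stack depends on the board and camera chosen.

import cv2  # assumption: opencv-python is installed on the board

cap = cv2.VideoCapture(0)                  # open the first camera device
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)    # request a 1280x720 stream
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

while True:
    ok, frame = cap.read()                 # grab one BGR frame
    if not ok:
        break                              # camera unplugged or read error
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # ... hand "gray" to the vision algorithm here ...

cap.release()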

All modules, single board computers, and SoMs are based on a system on chip (SoC). This is a component that integrates the processor(s), controllers, memory modules, power management, and other components on a single chip. It is thanks to these efficient SoCs that embedded vision systems have only recently become available in such a small size and at low cost.

Embedded vision is the technology of choice for many applications. Accordingly, the design requirements are widely diversified. The two interface technologies typically offered for embedded vision systems are USB3 Vision for easy integration and LVDS for a lean system design. USB 3.0 is the right interface for a simple plug-and-play camera connection and is ideal for camera connections to single board computers. It allows stable data transfer with a bandwidth of up to 350 MB/s. An LVDS-based interface allows a direct camera connection to processing boards and thus also to on-board logic modules such as FPGAs (field programmable gate arrays) or comparable components. This allows a lean system design to be achieved, benefiting from a direct board-to-board connection and data transfer. The interface is therefore ideal for connecting to a SoM on a carrier/adapter board or to an individually developed processor unit. It allows stable, reliable data transfer with a bandwidth of up to 252 MB/s.
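
To make these bandwidth figures concrete, the following back-of-the-envelope check (a sketch; the resolution, frame rate, and pixel format are example assumptions) tests whether a given sensor configuration fits the quoted interface limits.

def data_rate_mb_s(width, height, fps, bytes_per_pixel):
    # Raw pixel data rate in megabytes per second, ignoring protocol overhead.
    return width * height * fps * bytes_per_pixel / 1e6

# Example: 1920x1080 at 60 fps, 8-bit monochrome (1 byte per pixel)
rate = data_rate_mb_s(1920, 1080, 60, 1)
print(f"{rate:.0f} MB/s")                # ~124 MB/s
print(rate <= 350)                       # True: fits USB 3.0 (~350 MB/s)
print(rate <= 252)                       # True: fits the LVDS link (~252 MB/s)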

Design of an embedded vision system

Embedded system boards

3- Characteristics of Embedded Vision System Boards versus Standard Vision System Boards

Most of the previously mentioned single board computers and SoMs do not use the x86 family processors common in standard PCs. Rather, the CPUs are often based on the ARM architecture. The open source Linux operating system is widely used in the world of ARM processors, and for Linux there are a large number of open source application programs, as well as numerous freely available program libraries. Increasingly, however, x86-based single board computers are also spreading. A consistently important criterion for the computer is the space available for the embedded system.

For the software developer, program development for an embedded system differs from that for a standard PC. As a rule, the target system does not provide a user interface suitable for programming. The software developer must connect to the embedded system via an appropriate interface, if available (e.g., a network interface), or develop the software on the standard PC and then transfer it to the target system. When developing the software, it should be noted that the hardware concept of the embedded system is oriented to a specific application and thus differs significantly from the universally usable PC. However, the boundary between embedded and desktop computer systems is sometimes difficult to define. Just think of the mobile phone, which on the one hand has many features of an embedded system (ARM-based, single-board construction), but on the other hand can cope with very different tasks and is therefore a universal computer. A typical cross-development loop is sketched below.
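
A minimal sketch of that loop, using only the Python standard library (the host name and paths are hypothetical, and the board is assumed to be reachable over SSH):

import subprocess

TARGET = "user@embedded-board.local"    # assumption: board reachable via SSH

# Copy the locally built program to the target, then run it remotely.
subprocess.run(["scp", "vision_app", f"{TARGET}:/home/user/"], check=True)
subprocess.run(["ssh", TARGET, "/home/user/vision_app"], check=True)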

4- Processors for Embedded Vision

This technology category includes any device that executes vision algorithms or
vision system control software. The applications represent distinctly different
types of processor architectures for embedded vision, and each has advantages and
trade-offs that depend on the workload. For this reason, many devices combine
multiple processor types into a heterogeneous computing environment, often
integrated into a single semiconductor component. In addition, a processor can be
accelerated by dedicated hardware that improves performance on computer vision
algorithms.

Vision algorithms typically require high compute performance. And, of course, embedded systems of all kinds are usually required to fit into tight cost and power consumption envelopes. In other digital signal processing application domains, such as digital wireless communications, chip designers achieve this challenging combination of high performance, low cost, and low power by using specialized coprocessors and accelerators to implement the most demanding processing tasks in the application. These coprocessors and accelerators are typically not programmable by the chip user, however. This trade-off is often acceptable in wireless applications, where standards mean that there is strong commonality among the algorithms used by different equipment designers.

In vision applications, however, there are no standards constraining the choice of algorithms. On the contrary, there are often many approaches to choose from to solve a particular vision problem. Therefore, vision algorithms are very diverse, and tend to change fairly rapidly over time. As a result, the use of nonprogrammable accelerators and coprocessors is less attractive for vision applications than for applications like digital wireless and compression-centric consumer video equipment. Achieving the combination of high performance, low cost, low power, and programmability is challenging. Special purpose hardware typically achieves high performance at low cost, but with little programmability. General purpose CPUs provide programmability, but with weak performance and poor cost or energy efficiency.

Demanding embedded vision applications most often use a combination of processing elements, which might include, for example (a software sketch of this division of labor follows the list):

A general purpose CPU for heuristics, complex decision making, network access, user interface, storage management, and overall control

A high-performance DSP-oriented processor for real-time, moderate-rate processing with moderately complex algorithms

One or more highly parallel engines for pixel-rate processing with simple algorithms
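
As an illustration only, that division of labor might look like this in software terms (a sketch; the stage boundaries and NumPy stand-ins are assumptions, not a real heterogeneous runtime):

import numpy as np

def pixel_stage(frame):
    # Simple, regular, per-pixel work: the kind mapped to a highly parallel engine.
    return (frame > 128).astype(np.uint8)        # threshold to a binary mask

def dsp_stage(mask):
    # Moderate-rate, moderately complex work, as on a DSP-oriented processor.
    return mask.sum(axis=0)                      # per-column foreground count

def cpu_stage(profile):
    # Heuristics and decision making: the general purpose CPU's job.
    return "object present" if profile.max() > 50 else "scene empty"

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
print(cpu_stage(dsp_stage(pixel_stage(frame))))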

While any processor can in theory be used for embedded vision, the most promising types today are:

High-performance embedded CPU

Application specific standard product (ASSP) in combination with a CPU

Graphics processing unit (GPU) with a CPU

DSP processor with accelerator(s) and a CPU

Field programmable gate array (FPGA) with a CPU

Mobile “application processor”

High Performance Embedded CPU

In many cases, embedded CPUs cannot provide enough performance, or cannot do so at acceptable price or power consumption levels, to implement demanding vision algorithms. Often, memory bandwidth is a key performance bottleneck, since vision algorithms typically use large amounts of memory bandwidth and don’t tend to repeatedly access the same data.

The memory systems of embedded CPUs are not designed for these kinds of data flows. However, like most types of processors, embedded CPUs become more powerful over time, and in some cases can provide adequate performance. There are some compelling reasons to run vision algorithms on a CPU when possible. First, most embedded systems need a CPU for a variety of functions. If the required vision functionality can be implemented using that CPU, then the complexity of the system is reduced relative to a multiprocessor solution.

In addition, most vision algorithms are initially developed on PCs using general
purpose CPUs and their associated software development tools.

Similarities between PC CPUs and embedded CPUs (and their associated tools)
mean that it is typically easier to create embedded implementations of vision
algorithms on embedded CPUs compared to other kinds of embedded vision
processors. In addition, embedded CPUs typically are the easiest to use compared
to other kinds of embedded vision processors, due to their relatively
straightforward architectures, sophisticated tools, and other application
development infrastructure, such as operating systems.

An example of an embedded CPU is the Intel Atom E660T.
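
One practical consequence is that the same script can be profiled on the desktop and then rerun unchanged on the target board. A sketch (assuming opencv-python and NumPy are installed on both machines):

import time
import numpy as np
import cv2

frame = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)

t0 = time.perf_counter()
for _ in range(100):
    blurred = cv2.GaussianBlur(frame, (5, 5), 0)   # a typical pixel-level kernel
t1 = time.perf_counter()

# Compare this number between the PC and the embedded CPU.
print(f"{(t1 - t0) / 100 * 1000:.2f} ms per frame")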

Application Specific Standard Product (ASSP) in Combination with a CPU


Application specific standard products (ASSPs) are specialized, highly integrated chips tailored for specific applications or application sets. ASSPs may incorporate a CPU, or use a separate CPU chip. By virtue of specialization, ASSPs typically deliver superior cost and energy efficiency compared with other types of processing solutions. Among other techniques, ASSPs deliver this efficiency through the use of specialized coprocessors and accelerators. Because ASSPs are by definition focused on a specific application, they are usually provided with extensive application software. The specialization that enables ASSPs to achieve strong efficiency, however, also leads to their key limitation: lack of flexibility. An ASSP designed for one application is typically not suitable for another application, even one closely related to the target application. ASSPs use unique architectures, and this can make programming them more difficult than other kinds of processors. Indeed, some ASSPs are not user programmable. Another consideration is risk. ASSPs are often delivered by small suppliers, and this may increase the risk of difficulty in supplying the chip, or in delivering successor products that enable system designers to upgrade their designs without starting from scratch. An example of a vision-oriented ASSP is the PrimeSense PS1080-A2, used in the Microsoft Kinect.

General Purpose CPUs


While computer vision algorithms can run on most general purpose CPUs, desktop processors may not meet the design constraints of some systems. However, x86 processors and system boards can leverage the PC infrastructure for low-cost hardware and broadly supported software development tools. Several vendors also offer devices that integrate a RISC CPU core. A general purpose CPU is best suited for heuristics, complex decision making, network access, user interface, storage management, and overall control. A general purpose CPU may be paired with a vision-specialized device for better performance on pixel-level processing.

Graphics Processing Units with CPU


High-performance GPUs deliver massive amounts of parallel computing potential, and graphics processors can be used to accelerate the portions of the computer vision pipeline that perform parallel processing on pixel data. While general purpose GPUs (GPGPUs) have primarily been used for high-performance computing (HPC), even mobile graphics processors and integrated graphics cores are gaining GPGPU capability, meeting the power constraints of a wider range of vision applications. In designs that require 3D processing in addition to embedded vision, a GPU will already be part of the system and can be used to assist a general purpose CPU with many computer vision algorithms. Many examples exist of x86-based embedded systems with discrete GPGPUs.

Graphics processing units (GPUs), intended mainly for 3D graphics, are increasingly capable of being used for other functions, including vision applications. The GPUs used in personal computers today are explicitly intended to be programmable to perform functions other than 3D graphics. Such GPUs are termed “general purpose GPUs” or “GPGPUs.” GPUs have massive parallel processing horsepower. They are ubiquitous in personal computers. GPU software development tools are readily and freely available, and getting started with GPGPU programming is not terribly complex. For these reasons, GPUs are often the parallel processing engines of first resort for computer vision algorithm developers who develop their algorithms on PCs and then need to accelerate execution for simulation or prototyping purposes.

GPUs are tightly integrated with general purpose CPUs, sometimes on the same chip. However, one limitation of GPU chips is the limited variety of CPUs with which they are currently integrated, and operating system support for such integration is also limited. Today there are low-cost, low-power GPUs designed for products like smart phones and tablets. However, these GPUs are generally not GPGPUs, and therefore using them for applications other than 3D graphics is very challenging. An example of a GPGPU used in personal computers is the NVIDIA GT240.
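
A small sketch of offloading a pixel-parallel operation to a GPGPU, here using PyTorch as the programming layer (an assumption; the text above does not prescribe a framework, and CUDA-class hardware is assumed where available):

import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

# A horizontal Sobel kernel applied to a batch of grayscale frames.
sobel = torch.tensor([[-1., 0., 1.],
                      [-2., 0., 2.],
                      [-1., 0., 1.]], device=device).view(1, 1, 3, 3)

frames = torch.rand(8, 1, 720, 1280, device=device)  # 8 frames resident on the GPU
edges = F.conv2d(frames, sobel, padding=1)           # parallel per-pixel work
print(edges.shape)                                   # torch.Size([8, 1, 720, 1280])
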
Digital Signal Processors with Accelerator(s) and a CPU
DSPs are very efficient at processing streaming data, since their bus and memory architectures are optimized to process high-speed data as it traverses the system. This architecture makes DSPs an excellent solution for processing image pixel data as it streams from a sensor source. Many DSPs for vision have been enhanced with coprocessors that are optimized for processing video inputs and accelerating computer vision algorithms. The specialized nature of DSPs makes these devices inefficient for processing general purpose software workloads, so DSPs are usually paired with a RISC processor to create a heterogeneous computing environment that offers the best of both worlds.

Digital signal processors (“DSP processors” or “DSPs”) are microprocessors specialized for signal processing algorithms and applications. This specialization typically makes DSPs more efficient than general purpose CPUs for the kinds of signal processing tasks that are at the heart of vision applications. In addition, DSPs are relatively mature and easy to use compared to other kinds of parallel processors. Unfortunately, while DSPs do deliver higher performance and efficiency than general purpose CPUs on vision algorithms, they often fail to deliver sufficient performance for demanding algorithms. For this reason, DSPs are often supplemented with one or more coprocessors. A typical DSP chip for vision applications therefore comprises a CPU, a DSP, and multiple coprocessors. This heterogeneous combination can yield excellent performance and efficiency, but can also be difficult to program. Indeed, DSP vendors typically do not enable users to program the coprocessors; rather, the coprocessors run software function libraries developed by the chip supplier. An example of a DSP targeting video applications is the Texas Instruments DM8168.
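
The streaming style that suits a DSP can be mimicked in software: process each scanline as it arrives instead of buffering whole frames. A conceptual sketch (Python generators standing in for the sensor interface and the DSP kernel):

import numpy as np

def scanlines(sensor_frame):
    # Yield one row at a time, mimicking pixels streaming from a sensor.
    for row in sensor_frame:
        yield row

def stream_filter(lines):
    # A 1D moving-average filter applied to each line as it arrives.
    kernel = np.ones(5) / 5.0
    for line in lines:
        yield np.convolve(line, kernel, mode="same")

frame = np.random.randint(0, 256, (480, 640)).astype(np.float32)
for filtered_line in stream_filter(scanlines(frame)):
    pass  # hand each processed line downstream without buffering the frame
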
Field Programmable Gate Arrays (FPGAs) with a CPU
Instead of incurring the high cost and long lead times of a custom ASIC to accelerate computer vision systems, designers can implement an FPGA as a reprogrammable solution for hardware acceleration. With millions of programmable gates, hundreds of I/O pins, and compute performance in the trillions of multiply-accumulates per second (tera-MACs), high-end FPGAs offer the potential for the highest performance in a vision system. Unlike a CPU, which has to time slice or multi-thread tasks as they compete for compute resources, an FPGA has the advantage of being able to simultaneously accelerate multiple portions of a computer vision pipeline. Since the parallel nature of FPGAs offers so much advantage for accelerating computer vision, many of the algorithms are available as optimized libraries from semiconductor vendors. These computer vision libraries also include preconfigured interface blocks for connecting to other vision devices, such as IP cameras.

Field programmable gate arrays (FPGAs) are flexible logic chips that can be reconfigured at the gate and block levels. This flexibility enables the user to craft computation structures tailored to the application at hand. It also allows selection of I/O interfaces and on-chip peripherals matched to the application requirements. The ability to customize compute structures, coupled with the massive amount of resources available in modern FPGAs, yields high performance coupled with good cost and energy efficiency. However, using FPGAs is essentially a hardware design function rather than a software development activity. FPGA design is typically performed using hardware description languages (Verilog or VHDL) at the register transfer level (RTL), a very low level of abstraction. This makes FPGA design time consuming and expensive compared to using the other types of processors discussed here.

However, using FPGAs is getting easier, due to several factors. First, so-called “IP block” libraries (libraries of reusable FPGA design components) are becoming increasingly capable. In some cases, these libraries directly address vision algorithms. In other cases, they enable supporting functionality, such as video I/O ports or line buffers. Second, FPGA suppliers and their partners increasingly offer reference designs: reusable system designs incorporating FPGAs and targeting specific applications. Third, high-level synthesis tools, which enable designers to implement vision and other algorithms in FPGAs using high-level languages, are increasingly effective. Relatively low-performance CPUs can be implemented by users in the FPGA, and in a few cases high-performance CPUs are integrated into FPGAs by the manufacturer. An example FPGA that can be used for vision applications is the Xilinx Spartan-6 LX150T.
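
The line buffers mentioned above are the heart of most FPGA vision pipelines: only a few scanlines are kept on chip to form an NxN window around each pixel. A Python illustration of that structure (illustration only; real designs express this in Verilog/VHDL or via high-level synthesis):

from collections import deque
import numpy as np

def windowed_3x3(rows, width):
    buf = deque(maxlen=3)              # three-line buffer, as held on chip
    for row in rows:
        buf.append(row)
        if len(buf) == 3:
            lines = np.stack(buf)      # the 3 x width slice currently buffered
            for x in range(1, width - 1):
                window = lines[:, x - 1:x + 2]   # the 3x3 neighborhood
                yield window.mean()              # e.g., one box-filter output

frame = np.random.rand(480, 640)
outputs = list(windowed_3x3(iter(frame), 640))   # streams without a full frame store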

Mobile “Application Processor”


A mobile “application processor” is a highly integrated system-on-chip, typically designed
primarily for smart phones but used for other applications.
Application processors typically comprise a high-performance CPU core and a constellation of
specialized coprocessors, which may include a DSP, a GPU, a video processing unit (VPU), a 2D
graphics processor, an image acquisition processor, and so on. These chips are specifically
designed for battery-powered applications, and therefore place a premium on energy efficiency.
In addition, because of the growing importance of and activity surrounding smart phone and
tablet applications, mobile application processors often have strong software development
infrastructure, including low-cost development boards, Linux and Android ports, and so on.
However, as with the DSP processors discussed in the previous section, the specialized
coprocessors found in application processors are usually not user programmable, which limits
their utility for vision applications. An example of a mobile application processor is the Freescale
i.MX53.

Cameras/Image Sensors for Embedded Vision
While analog cameras are still used in many vision systems, this section focuses on digital image sensors, usually either a CCD or CMOS sensor array that operates with visible light. However, this definition shouldn’t constrain the technology analysis, since many vision systems can also sense other types of energy (IR, sonar, etc.).

The camera housing has become the entire chassis for a vision system, leading to the emergence of “smart cameras” with all of the electronics integrated. By most definitions, a smart camera supports computer vision, since the camera is capable of extracting application-specific information. However, as both wired and wireless networks get faster and cheaper, there still may be reasons to transmit pixel data to a central location for storage or extra processing. A classic example is cloud computing using the camera on a smart phone. The smart phone could be considered a “smart camera” as well, but sending data to a cloud-based computer may reduce the processing performance required on the mobile device, lowering cost, power, weight, and so on. For a dedicated smart camera, some vendors have created chips that integrate all of the required features.

Until recently, many people would imagine a camera for computer vision as the typical outdoor security camera. There are countless vendors supplying these products, and many more supplying indoor cameras for industrial applications. There are simple USB cameras for PCs, and billions of cameras embedded in the mobile phones of the world. The speed and quality of these cameras has risen dramatically, with 10+ megapixel sensors supported by sophisticated image processing hardware. Another important factor for cameras is the rapid adoption of 3D imaging using stereo optics, time-of-flight, and structured light technologies. Trendsetting cell phones now offer this technology, as do the most recent generation of game consoles. Consider how much change is about to happen to computer vision markets as these new camera technologies become pervasive.

Charge coupled device (CCD) image sensors have some advantages over CMOS image sensors, mainly because the electronic shutter of CCDs traditionally offers better image quality with higher dynamic range and resolution. However, CMOS sensors now account for more than 90% of the market, heavily influenced by camera phones and driven by the technology’s lower cost, better integration, and speed.

Other Semiconductor Devices for Embedded Vision
Embedded vision applications involve more than just programmable devices and image sensors; they also require other components to create a complete system. Most applications require data communications of pixels and/or metadata, and many designs interface directly with the user. Some computer vision systems also connect to mechanical devices, such as robots or industrial control systems. The list of devices in this “other” category includes a wide range of standard products. In addition, some system designers may incorporate programmable logic devices or ASICs. In many vision systems, power, space, and cost constraints require high levels of integration with the programmable device, often into a system-on-a-chip (SoC) device. Sensors for external parameters or environmental measurements are discussed under separate chapter headings.

Memory
Processors can integrate megabytes’ worth of SRAM and DRAM, so many designs will not require off-chip memory. However, computer vision algorithms often require multiple frames of sensor data in order to track objects. Off-chip memory devices can store gigabytes of data, although accessing external memory can add hundreds of cycles of latency. Systems with a 3D graphics subsystem will usually already include substantial amounts of external memory to store the frame buffer, textures, Z buffer, and so on. Sometimes this graphics memory is kept in a dedicated, fast memory bank that uses specialized DRAMs. Some vision implementations store video data locally in order to reduce the amount of information that needs to be sent to a centralized system. For a solid state, nonvolatile storage system, the storage density is driven by the size of flash memory chips. The latest generation of NAND chip fabrication technologies allows extremely large, fast, and low-power storage in a vision system.
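
A bounded ring buffer is the usual software pattern for keeping the last few frames with fixed memory use, whether those frames live on chip or in external DRAM. A sketch (the frame size and buffer depth are example assumptions):

from collections import deque
import numpy as np

N_FRAMES = 4
history = deque(maxlen=N_FRAMES)       # the oldest frame is evicted automatically

def on_new_frame(frame):
    history.append(frame)
    if len(history) == N_FRAMES:
        # A simple motion cue: difference between the newest and oldest frames.
        motion = np.abs(frame.astype(np.int16) - history[0].astype(np.int16))
        return motion.mean()

for _ in range(10):
    on_new_frame(np.random.randint(0, 256, (480, 640), dtype=np.uint8))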

Networking and Bus Interfaces

Mainstream computer networking and bus technology has finally started to catch up to the needs of computer vision, supporting simultaneous digital video streams. With economies of scale, more vision systems will use standard buses like PCI and PCI Express. For networking, Gigabit Ethernet (GbE) and 10GbE interfaces offer sufficient bandwidth even for multiple high-definition video streams. However, the machine vision trade association (AIA) continues to promote Camera Link, and many camera and frame grabber manufacturers use this interface.
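
A rough capacity check shows why the faster links matter (a sketch; the pixel format is an example assumption, and compression and protocol overhead are ignored):

def streams_per_link(link_gbps, width, height, fps, bits_per_pixel):
    # How many uncompressed streams fit on the link, ignoring overhead.
    stream_bps = width * height * fps * bits_per_pixel
    return int(link_gbps * 1e9 // stream_bps)

# 1080p30 at 16 bits/pixel (e.g., YUV 4:2:2) is roughly 1.0 Gb/s per stream.
print(streams_per_link(1, 1920, 1080, 30, 16))    # GbE: 1 stream
print(streams_per_link(10, 1920, 1080, 30, 16))   # 10GbE: 10 streams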
