Design for Embedded Image Processing on FPGAs
Second Edition
Donald G. Bailey
Massey University, Palmerston North, New Zealand
This second edition first published 2024
© 2024 John Wiley & Sons, Ltd
Edition History
John Wiley & Sons (Asia) Pte Ltd (1e, 2011)
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain
permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Donald G. Bailey to be identified as the author of this work has been asserted in accordance with law.
Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard
print versions of this book may not be available in other formats.
Trademarks
Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United
States and other countries and may not be used without written permission. All other trademarks are the property of their respective
owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Acknowledgments
1 Image Processing
1.1 Basic Definitions
1.2 Image Formation
1.2.1 Optics
1.2.2 Colour
1.3 Image Processing Operations
1.4 Real-time Image Processing
1.5 Embedded Image Processing
1.6 Computer Architecture
1.7 Parallelism
1.7.1 Temporal or Task Parallelism
1.7.2 Spatial or Data Parallelism
1.7.3 Logical Parallelism
1.7.4 Stream Processing
1.8 Summary
References
3 Design Process
3.1 Problem Specification
3.2 Algorithm Development
3.2.1 Algorithm Development Process
3.2.2 Algorithm Structure
3.2.3 FPGA Development Issues
3.3 Architecture Selection
3.3.1 System Architecture
3.3.2 Partitioning Between Hardware and Software
3.3.3 Computational Architecture
3.4 System Implementation
3.4.1 Mapping to FPGA Resources
3.4.2 Algorithm Mapping Issues
3.5 Testing and Debugging
3.5.1 Design
3.5.2 Implementation
3.5.3 Common Implementation Bugs
3.5.4 Timing
3.5.5 System Debugging
3.5.6 Algorithm Tuning
3.5.7 In-field Diagnosis
3.6 Summary
References
4 Design Constraints
4.1 Timing Constraints
4.1.1 Low-level Pipelining
4.1.2 Process Synchronisation
4.1.3 Synchronising Between Clock Domains
4.1.4 I/O Constraints
4.2 Memory Bandwidth Constraints
4.2.1 Memory Architectures
4.2.2 Caching
4.2.3 Row Buffering
4.3 Resource Constraints
4.3.1 Bit-serial Computation
4.3.2 Resource Multiplexing
4.3.3 Arbitration
4.3.4 Resource Controllers
4.3.5 Reconfigurability
4.4 Power Constraints
4.5 Performance Metrics
4.5.1 Speed
4.5.2 Resources
4.5.3 Power
4.5.4 Cost
4.5.5 Application Metrics
4.6 Summary
References
6 Interfacing
6.1 Camera Input
6.1.1 Analogue Video
6.1.2 Direct Digital Interface
6.1.3 MIPI Camera Serial Interface
6.1.4 Camera Link
6.1.5 USB Cameras
6.1.6 GigE Vision
6.1.7 Camera Processing Pipeline
6.2 Display Output
6.2.1 Display Driver
6.2.2 Display Content
6.3 Serial Communication
6.3.1 RS-232
6.3.2 I²C
6.3.3 Serial Peripheral Interface (SPI)
6.3.4 Universal Serial Bus (USB)
Index
Preface
Image processing, and in particular embedded image processing, faces many challenges, from increasing resolution and frame rates to the need to operate at low power. These pose significant challenges for implementation on conventional software-based platforms. This leads naturally to considering field-programmable gate arrays (FPGAs) as an implementation platform for embedded imaging applications. Many image processing operations are inherently parallel, and FPGAs provide programmable hardware, which is also inherently parallel. Therefore, it should be as simple as mapping one onto the other, right? Well, yes … and no.
Image processing is traditionally thought of as a software domain task, whereas FPGA-based design is firmly in the hardware domain. Many tricks and techniques are required to create an efficient design. Perhaps the biggest hurdle to an efficient implementation is the need for a hardware mindset. To bridge the gap between software and hardware, it is necessary to think of algorithms not on their own but in terms of their underlying computational architecture. Implementing an image processing algorithm (or indeed any algorithm) on an FPGA therefore consists of determining the underlying architecture of an algorithm, mapping that architecture onto the resources available within an FPGA, and finally mapping the algorithm onto the hardware architecture. While the mechanics of this process are mostly automated by high-level synthesis tools, the underlying design is not; such tools can only take a low-quality design so far. It is still important to keep in mind the hardware that is being implied by the code and to design the algorithm for the underlying hardware.
Unfortunately, there is limited material available to help those new to the area to get started. While there
are many research papers published in conference proceedings and journals, there are only a few that focus
specifically on how to map image processing algorithms onto FPGAs. The research papers found in the literature
can be classified into several broad groups.
The first focuses on the FPGA architecture itself. Most of these provide an analysis of a range of techniques
relating to the structure and granularity of logic blocks, the routing networks, and embedded memories. As
well as the FPGA structure, a wide range of topics are covered, including underlying technology, power issues,
the effects of process variability, and dynamic reconfigurability. Many of these papers are purely proposals, or
relate to prototype FPGAs rather than commercially available chips. While they provide insights as to some of
the features which might be available in the next generation of devices, most of the topics within this group are
at too low a level.
A second group of papers investigates the topic of reconfigurable computing. Here, the focus is on how an
FPGA can be used to accelerate some computationally intensive task or range of tasks. While image processing
is one such task considered, most of the research relates more to high-performance computing rather than
low-power embedded systems. Topics within this group include hardware and software partitioning, hardware
and software co-design, dynamic reconfigurability, communications between an FPGA and central processing
unit (CPU), comparisons between the performance of FPGAs, graphics processing units (GPUs) and CPUs,
and the design of operating systems and specific platforms for both reconfigurable computing applications and
research. Important principles and techniques can be gleaned from many of these papers even though this may
not be their primary focus.
The next group of papers considers tools for programming FPGAs and applications, with a focus on improving the productivity of the development process. A wide range of hardware description languages have been proposed, with many modelled after software languages such as C, Java, and even Prolog. Many of these are developed as research tools, with very few making it out of the laboratory to commercial availability. There has also been considerable research on compilation techniques for mapping standard software languages to hardware (high-level synthesis). Techniques such as loop unrolling, strip mining, and pipelining to produce parallel hardware are important principles that can result in more efficient hardware designs.
The final group of papers focuses on a range of applications, including image processing and the implementation of both image processing operations and systems. Unfortunately, as a result of page limits and space constraints, many of these papers give the results of the implementation of various systems but present relatively few design details. This is especially so in the case of many papers that describe deep learning systems. Often the final product is described without describing many of the reasons or decisions that led to that design. Many of these designs cannot be recreated without acquiring the specific platform and tools that were used or inferring a lot of the missing details. While some of these details may appear obvious in hindsight, without this knowledge, many are far from obvious just from reading the papers. The better papers in this group tend to have a tighter focus, considering the implementation of a single image processing operation.
So, while there may be a reasonable amount of material available, it is quite diffuse. In many cases, it is
necessary to know exactly what you are looking for, or just be lucky to find it. The intention of this book,
therefore, is to bring together much of this diverse research (on both FPGA design and image processing) and
present it in a systematic way as a reference or guide.
Intended Audience
This book is written primarily for those who are familiar with the basics of image processing and want to
consider implementing image processing using FPGAs. Perhaps the biggest hurdle is switching from a software
mindset to a hardware way of thinking. When we program in software, a good compiler can map the algorithm in
the programming language onto the underlying computer architecture relatively efficiently. When programming
hardware though, it is not simply a matter of porting the software onto hardware. The underlying hardware
architecture needs to be designed as well. In particular, programming hardware usually requires transforming
the algorithm into an appropriate parallel architecture, often with significant changes to the algorithm itself. This
requires significant design, rather than just decomposition and mapping of the dataflow (as is accomplished by
a good high-level synthesis tool). This book addresses this issue not only by providing algorithms for image processing operations but also by discussing both the design process and the underlying architectures that can be used to implement the algorithms efficiently.
This book would also be useful to those with a hardware background, who are familiar with programming
and applying FPGAs to other problems, and are considering image processing applications. While many of the
techniques are relevant and applicable to a wide range of application areas, most of the focus and examples are
taken from image processing. Sufficient detail is given to make many of the algorithms and their implementation
clear. However, learning image processing is more than just collecting a set of algorithms, and there are any
number of excellent image processing textbooks that provide these.
It is the domain of embedded image processing where FPGAs come into their own. An efficient, low-power
design requires that the techniques of both the hardware engineer and the software engineer be integrated tightly
within the final solution.
Against this backdrop, there has been an increasing awareness of power and sustainability issues. As a low-power computing platform, FPGAs are well placed to address the power concerns in many applications.
The capabilities of FPGAs have improved significantly as technology improvements enable more to be packed onto them. Not only has there been an increase in the amount of programmable logic and on-chip memory blocks, but FPGAs are becoming more heterogeneous. Many FPGAs now incorporate significant hardened logic blocks, including moderately powerful reduced instruction set computing (RISC) processors, external memory interfacing, and a wide range of communication interfaces. Digital signal processing (DSP) blocks are also improving, with the move towards supporting floating-point in high-end devices. Technology improvements have seen significant reductions in the power required.
Even the FPGA market has changed, with the takeover of both Altera and Xilinx by Intel and AMD, respectively. This is an indication that FPGAs are seen as a serious contender for high-performance computing and acceleration. The competition has not stood still, with both CPUs and GPUs increasing in capability. In particular, a new generation of low-power GPUs has become available that are more viable for embedded image processing.
High-level synthesis tools are becoming more mature and address many of the development time issues
associated with conventional register transfer level design. The ability to compile to both software and hardware
enables more complex algorithms to be explored, with faster debugging. They also allow faster exploration of the
design space, enabling efficient designs to be developed more readily. However, the use of high-level synthesis
does not eliminate the need for careful algorithm design.
While the use of FPGAs for image processing has not become mainstream, there has been a lot of activity in this space as the capabilities of FPGAs have improved. The research literature on programming and applying FPGAs in the context of image processing has grown significantly. However, it is still quite diffuse, with most papers focusing on one specific aspect. As researchers have looked at more complex image processing operations, the descriptions of the implementation have become higher level, requiring a lot of reading between the lines, and additional design work to be able to replicate a design.
One significant area that has become mainstream in image processing is the use of deep learning models.
Deep learning was not around when the previous edition was written and only started becoming successful in
image processing tasks in the early 2010s. Their success has made them a driving application, not only for
FPGAs and FPGA architecture but also within computing in general. However, deep learning models impose a huge computational demand, especially for training but also for deployment. In an embedded
vision context, this has made FPGAs a target platform for their deployment. Deep learning is a big topic on its
own, so this book is unable to do much more than scratch the surface and concentrate on some of the issues
associated with FPGA-based implementation.
Traditional hardware description languages are compared with high-level synthesis, with the benefits and limitations of each outlined in the context of image processing.
The process of designing and implementing an image processing application on an FPGA is described in
detail in Chapter 3. Particular emphasis is given to the differences between designing for an FPGA-based
implementation and a standard software implementation. The critical initial step is to clearly define the image
processing problem that is being tackled. This must be in sufficient detail to provide a specification that may
be used to evaluate the solution. The procedure for developing the image processing algorithm is described in
detail, outlining the common stages within many image processing algorithms. The resulting algorithm must
then be used to define the system and computational architectures. The mapping from an algorithm is more
than simply porting the algorithm to a hardware description language. It is necessary to transform the algorithm
to make efficient use of the resources available on the FPGA. The final stage is to implement the algorithm
by mapping it onto the computational architecture. Several checklists provide a guide and hints for testing and
debugging an algorithm on an FPGA.
Four types of constraints on the mapping process are limited processing time, limited access to data, limited system resources, and limited system power. Chapter 4 describes several techniques for overcoming or alleviating these constraints. Timing explores low-level pipelining, process synchronisation, and working with multiple clock domains. A range of memory and caching architectures are presented for alleviating memory bandwidth. Resource sharing and associated arbitration issues are discussed, along with reconfigurability. The chapter finishes with a section introducing commonly used performance metrics in terms of both system and application performance.
Chapter 5 focuses on the computational aspects of image processing designs. These help to bridge the gap between a software and hardware implementation. Different number representations and number systems are described. Techniques for the computation of elementary functions are discussed, with a particular focus on
those that are hardware friendly. Many of these could be considered the hardware equivalent of software libraries
for efficiently implementing common functions. Possible FPGA implementations of a range of data structures
commonly found in computer vision algorithms are presented.
Any embedded application must interface with the real world. A range of common peripherals is described in Chapter 6, with suggestions on how they may be interfaced to an FPGA. Particular attention is given to interfacing cameras and video output devices. Interfacing with other devices is discussed, including serial communications, off-chip memory, and serial processors.
The next section of this book describes the implementation of many common image processing operations. Some of the design decisions and alternative ways of mapping the operations onto FPGAs are considered. While the coverage is reasonably comprehensive, particularly for low-level image-to-image transformations, it is impossible to cover every possible design. The examples discussed are intended to provide the foundation for many other related operations.
Chapter 7 considers point operations, where the output depends only on the corresponding input pixel in the
input image(s). Both direct computation and lookup table approaches are described. With multiple input images,
techniques such as image averaging and background modelling are discussed in detail. The final sections in this
chapter consider the processing of colour and hyperspectral images. Colour processing includes colour space
conversion, colour balancing, and colour segmentation.
The implementation of histograms and histogram-based processing is discussed in Chapter 8. The techniques of accumulating a histogram, and then extracting data from the histogram, are described in some detail. Particular tasks are histogram equalisation, threshold selection, and using histograms for image matching. The
concepts of standard 1-D histograms are extended to multi-dimensional histograms. The use of clustering for
colour segmentation and classification is discussed in some detail. The chapter concludes with the use of features
extracted from multi-dimensional histograms for texture analysis.
Chapter 9 considers a wide range of local filters, both linear and nonlinear. Particular emphasis is given to caching techniques for a stream-based implementation and methods for efficiently handling the processing around the image borders. Rank filters are described, and a selection of associated sorting network architectures is reviewed. Morphological filters are another important class of filters. State machine implementations of morphological filtering provide an alternative to the classic filter implementation. Separability and both serial and parallel decomposition techniques are described that enable more efficient implementations.
Image warping and related techniques are covered in Chapter 10. The forward and reverse mapping
approaches to geometric transformation are compared in some detail, with particular emphasis on techniques
for stream processing implementations. Interpolation is frequently associated with geometric transformation.
Hardware-based algorithms for bilinear, bicubic, and spline-based interpolation are described. Related
techniques of image registration are also described at the end of this chapter, including a discussion of feature
point detection, description, and matching.
Chapter 11 introduces linear transforms, with a particular focus on the fast Fourier transform (FFT), the
discrete cosine transform (DCT), and the wavelet transform. Both parallel and pipelined implementations of
the FFT and DCT are described. Filtering and inverse filtering in the frequency domain are discussed in some
detail. Lifting-based filtering is developed for the wavelet transform. This can reduce the logic requirements by
up to a factor of 4 over a direct finite impulse response implementation.
Image coding is important for image storage or transmission. Chapter 12 discusses the stages within image
and video coding and outlines some of the techniques that can be used at each stage. Several of the standards
for both still image and video coding are outlined, with an overview of the compression techniques used.
A selection of intermediate-level operations relating to region detection and labelling is presented in
Chapter 13. Standard software algorithms for chain coding and connected component labelling are adapted
to give efficient streamed implementations. These can significantly reduce both the latency and memory
requirements of an application. Hardware implementations of the distance transform, the watershed transform,
and the Hough transform are also presented, discussing some of the key design decisions for an efficient
implementation.
Machine learning techniques are commonly used within computer vision. Chapter 14 introduces the key techniques for regression and classification, with a particular focus on FPGA implementation. Deep learning techniques are increasingly being used in many computer vision applications. A range of deep network architectures is introduced, and some of the issues for realising these on FPGAs are discussed.
Finally, Chapter 15 presents a selection of case studies, showing how the material and techniques described
in the previous chapters can be integrated within a complete application. These applications briefly show the
design steps and illustrate the mapping process at the whole algorithm level rather than purely at the operation
level. Many gains can be made by combining operations together within a compatible overall architecture. The
applications described are coloured region tracking for a gesture-based user interface, calibrating and correcting
barrel distortion in lenses, development of a foveal image sensor inspired by some of the attributes of the human
visual system, a machine vision system for real-time produce grading, stereo imaging for depth estimation, and
face detection.
Conventions Used
The contents of this book are independent of any particular FPGA or FPGA vendor, or any particular hardware description language. The topic is already sufficiently specialised without narrowing the audience further! As a result, many of the functions and operations are represented in block schematic form. This enables a language-independent representation and places emphasis on a particular hardware implementation of the algorithm in a way that is portable. The basic elements of these schematics are illustrated in Figure P.1. I is generally used as the input of an image processing operation, with the output image represented by Q.
With some mathematical operations, such as subtraction and comparison, the order of the operands is important. In such cases, the first operand is indicated with a blob rather than an arrow, as shown on the bottom-left in Figure P.1.
Consider a recursive filter operating on streamed data:
\[
Q_n = \begin{cases} I_n, & |I_n - Q_{n-1}| < T, \\ Q_{n-1} + k(I_n - Q_{n-1}), & \text{otherwise,} \end{cases} \tag{P.1}
\]
where the subscript in this instance refers to the nth pixel in the streamed image. At a high level, this can be
considered as an image processing operation, and represented by a single block, as shown in the top-left of
Figure P.1. The low-level implementation is given in the middle-left panel. The input and output, I and Q, are
Figure P.1 Conventions used in this book. Top-left: representation of an image processing operation;
middle-left: a block schematic representation of the function given by Eq. (P.1); bottom-left: representation
of operators where the order of operands is important; right: symbols used for various blocks within block
schematics.
represented by registers (dark blocks, with optional register names in white); the subscripts have been dropped
because they are implicit with streamed operation. In some instances, additional control inputs may be shown, for example, CE for clock enable or RST for reset. Constants are represented as mid-grey blocks, and other function
blocks with light-grey background.
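As a software sketch of what this streamed operation computes (Python is used here purely for illustration; the book itself is deliberately language independent, and the treatment of the register before the first pixel arrives is an assumption of the example), the recursive filter of Eq. (P.1) can be written as a generator over the pixel stream, with a single variable playing the role of the register Q:

```python
def recursive_filter(pixels, k, T):
    """Streamed recursive filter of Eq. (P.1).

    When the input is close to the previous output (|I - Q| < T), the
    input passes straight through; otherwise, the output moves a
    fraction k of the way from the previous output towards the input.
    """
    q = None  # the register Q; assumed undefined before the first pixel
    for i in pixels:
        if q is None or abs(i - q) < T:
            q = i
        else:
            q = q + k * (i - q)
        yield q

# A step in the input larger than T is smoothed rather than followed:
print(list(recursive_filter([0, 1, 10], k=0.5, T=2)))  # [0, 1, 5.5]
```

Each loop iteration corresponds to one clock cycle of the streamed hardware, and the update of `q` at the end of the iteration corresponds to the register being clocked.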
When representing logic functions in equations, ∨ is used for logical OR, and ∧ for logical AND. This is to
avoid confusion with addition and multiplication.
Donald G. Bailey
Massey University
Palmerston North, New Zealand
Acknowledgments
I would like to acknowledge all those who have helped to get me to where I currently am in my understanding of field-programmable gate array (FPGA)-based design. In particular, I would like to thank my research students
(David Johnson, Kim Gribbon, Chris Johnston, Aaron Bishell, Andreas Buhler, Ni Ma, Anoop Ambikumar,
Tariq Khan, and Michael Klaiber) who helped to shape my thinking and approach to FPGA development as we
struggled together to work out efficient ways of implementing image processing algorithms. This book is as
much a reflection of their work as it is of mine.
Most of our early work used Handel-C and was tested on boards provided by Celoxica. I would like to
acknowledge the support provided by Roger Gook and his team, first with Celoxica and later with Agility
Design Solutions. Later work was on boards supplied by Terasic. I would like to acknowledge Sean Peng and
his team for their ongoing support and encouragement.
Massey University has provided a supportive environment and the freedom for me to explore this field. In
particular, Serge Demidenko gave me the encouragement and the push to begin playing with FPGAs; he has
been a source of both inspiration and challenging questions. Other colleagues who have been of particular
encouragement are Gourab Sen Gupta, Richard Harris, Amal Punchihewa, and Steven Le Moan. I would also
like to acknowledge Paul Lyons, who co-supervised several of my students.
Early versions of some of the material in this book were presented as half-day tutorials at the IEEE Region 10 Conference (TENCON) in 2005 in Melbourne, the IEEE International Conference on Image Processing (ICIP) in 2007 in San Antonio, Texas, and the 2010 Asian Conference on Computer Vision (ACCV) in Queenstown, New Zealand. The material was then adapted to three-day workshops, which were held in Australia, New Zealand, and Japan, and as a master's course in Germany. I would like to thank the attendees at these workshops and courses for providing valuable feedback and stimulating discussion.
During 2008, I spent a sabbatical with the Circuits and Systems Group at Imperial College London, where
I began writing the first edition. I would like to thank Peter Cheung, Christos Bouganis, Peter Sedcole, and
George Constantinides for discussions and opportunities to bounce ideas off.
My wife, Robyn, has had to put up with my absence many evenings and weekends while working on
the manuscripts for both the first and second editions. I am grateful for both her patience and her support.
This book is dedicated to her.
Donald G. Bailey
About the Companion Website
This book is accompanied by a companion website.
www.wiley.com/go/bailey/designforembeddedimageprocessc2e
This website includes PowerPoint slides.
1 Image Processing
Vision is arguably the most important human sense. The processing and recording of visual data therefore has significant importance. The earliest images are prehistoric drawings on cave walls or carvings on stone monuments commonly associated with burial tombs. (It is not so much the medium that is important here; anything else would not have survived to today.) Such images consist of a mixture of both pictorial and abstract representations. Improvements in technology enabled images to be recorded with more realism, such as paintings by the masters. Images recorded in this manner are indirect in the sense that the light intensity pattern is not used directly to produce the image. The development of chemical photography in the early 1800s enabled direct image recording. This trend has continued with electronic recording, first with analogue sensors and subsequently with digital sensors, which include analogue to digital (A/D) conversion on the sensor chip to directly produce digital images.
Imaging sensors have not been restricted to the portion of the electromagnetic spectrum visible to the human
eye. Sensors have been developed to cover much of the electromagnetic spectrum from radio waves through
to gamma rays. A wide variety of other imaging modalities have also been developed, based on ultrasound,
electrons, atomic force (Binnig et al., 1986), magnetic resonance, and so on. In principle, any quantity that can
be sensed can be used for imaging, even dust rays (Auer, 1982).
Since vision is such an important sense, the processing of images has become important too, to augment or enhance human vision. Images can be processed to enhance their subjective content or to extract useful information. While it is possible to process the optical signals associated with visual images using lenses and optical filters, this book focuses on digital image processing: the numerical processing of images by digital hardware.
One of the earliest applications of digital image processing was for transmitting digitised newspaper pictures across the Atlantic Ocean in the early 1920s (McFarlane, 1972). However, it was only with the advent of computers with sufficient memory and processing power that digital image processing became widespread. The earliest recorded computer-based image processing was from 1957, when a scanner was added to a computer at the US National Bureau of Standards (Kirsch, 1998). It was used for early research on edge enhancement and pattern recognition. In the 1960s, the need to process large numbers of large images obtained from satellites and space exploration stimulated image processing research at NASA's Jet Propulsion Laboratory (Castleman, 1979). At the same time, research in high-energy particle physics required detecting interesting events from large numbers of cloud chamber photographs (Duff, 2000). As computers grew in power and reduced in cost, the range of applications for digital image processing exploded, from industrial inspection to medical imaging. Image sensors are now ubiquitous in mobile phones, laptops, and video-based security and surveillance systems.
intensity pattern on an optical sensor; a radiograph, which is a representation of density formed through exposure
to X-rays transmitted through an object; a map, which is a spatial representation of physical or cultural features;
and a video, which is a sequence of two-dimensional images through time. More rigorously, an image is any
continuous function of two or more variables defined on some bounded region of space.
A digital image is an image in digital format, so that it is suitable for processing by computer. There are
two important characteristics of digital images. The first is spatial quantisation. Computers are unable to easily
represent arbitrary continuous functions, so the continuous function is sampled. The result is a series of discrete
picture elements (pixels, for 2-D images) or volume elements (voxels, for 3-D images). Sampling need not
be spatially uniform; point clouds from a LiDAR scanner are one example. Sampling can
represent a continuous image exactly (in the sense that the underlying continuous function may be recovered
exactly), given a band-limited image and a sufficiently high sample rate (Shannon, 1949). The second character-
istic of digital images is sample quantisation. This results in discrete values for each pixel, enabling an integer
representation. Common bit widths per pixel are 1 (binary images), 8 (greyscale images), and 24 (3 × 8 bits for
colour images). Modern high dynamic range sensors can provide 12–16 bits per pixel. Unlike sampling, value
quantisation will always result in an error between the representation and true value. In many circumstances,
however, this quantisation error or quantisation noise may be made smaller than the uncertainty in the true
value resulting from inevitable measurement noise.
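These two characteristics can be sketched numerically. The short sketch below is illustrative only; the test function, resolution, and function name are arbitrary choices, not taken from the text. It samples a continuous intensity function on a uniform grid and then quantises each sample to one of 256 grey levels:

```python
import numpy as np

# Sketch of the two quantisation steps: spatial sampling on a uniform
# grid, then quantising each sample to one of 256 grey levels (8 bits).
def sample_and_quantise(f, width, height):
    ys, xs = np.mgrid[0:height, 0:width]
    continuous = f(xs / width, ys / height)          # spatial sampling
    return np.clip((continuous * 256).astype(np.uint8), 0, 255)

# A smooth diagonal ramp stands in for the "continuous" scene; the
# quantisation error is bounded by one grey level.
img = sample_and_quantise(lambda x, y: 0.5 * (x + y), 64, 48)
print(img.shape, img.dtype)    # (48, 64) uint8
```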
In its basic form, a digital image is simply a two (or higher)-dimensional array of numbers (usually inte-
gers), which represents an object or scene. Once in this form, an image may be readily manipulated by a
digital computer. It does not matter what the numbers represent, whether light intensity, reflectance, attenu-
ation, distance to a point (range), temperature, population density, elevation, rainfall, or any other numerical
quantity.
Digital image processing can therefore be defined as subjecting such an image to a series of mathematical
operations in order to obtain a desired result. This may be an enhanced image; the detection of some critical
feature or event; a measurement of an object or key feature within the image; a classification or grading of
objects within the image into one of two or more categories; or a description of the scene.
Image processing techniques are used in a number of related fields. While the principal focus of the fields
often differs, many of the techniques remain the same at the fundamental level. Some of the distinctive charac-
teristics are briefly outlined here.
Image enhancement involves improving the subjective quality of an image or the detectability of objects within
the image (Haralick and Shapiro, 1991). The information that is enhanced is usually apparent in the original
image but may not be clear. Examples of image enhancement include noise reduction, contrast enhancement,
edge sharpening, and colour correction.
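As a small concrete example of one such operation, a linear contrast stretch remaps a narrow range of input grey levels onto the full 8-bit output range. The function name and sample values below are invented for illustration:

```python
import numpy as np

# Linear contrast stretch: map input grey levels in [lo, hi] onto the
# full 8-bit output range [0, 255], clipping anything outside.
def contrast_stretch(img, lo=None, hi=None):
    lo = int(img.min()) if lo is None else lo
    hi = int(img.max()) if hi is None else hi
    stretched = (img.astype(np.float32) - lo) * (255.0 / (hi - lo))
    return np.clip(stretched, 0, 255).astype(np.uint8)

# A low-contrast image occupying only grey levels 100-130.
dull = np.array([[100, 110], [120, 130]], dtype=np.uint8)
print(contrast_stretch(dull))    # output now spans the full range 0-255
```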
Image restoration goes one step further than image enhancement. It uses knowledge of how an
image was degraded to build a model of the degradation process. This model is then used to derive an inverse
process that is used to restore the image. In many cases, the information in the image has been degraded to
the extent of being unrecognisable, for example severe blurring.
Image reconstruction involves restructuring the data that is available into a more useful form. Examples are
image super-resolution (reconstructing a high-resolution image from a series of low-resolution images),
image fusion (combining images from multiple sources), and tomography (reconstructing a cross section
of an object from a series of projections).
Image analysis refers specifically to using computers to extract data from images. The result is usually
some form of measurement. In the past, this was almost exclusively 2-D imaging, although with the
advent of confocal microscopy and other advanced imaging techniques, this has extended to three
dimensions.
Pattern recognition is concerned with identifying objects based on patterns in the measurements (Haralick and
Shapiro, 1991). There is a strong focus on statistical approaches, although syntactic and structural methods
are also used.
Computer vision tends to use a model-based approach to image processing. Mathematical models of both the
scene and the imaging process are used to derive a 3-D representation based on one or more 2-D images of
a scene. The use of models implicitly provides an interpretation of the contents of the images obtained.
Machine vision is the use of image processing as part of the control system for a machine (Schaffer, 1984). Images
are captured and analysed, and the results are used directly for controlling the machine while performing a
specific task. Real-time processing is often critical.
Remote sensing usually refers to the use of image analysis for obtaining geographical information, either using
satellite images or aerial photography (including from drones).
Medical imaging encompasses a wide range of imaging modalities (X-ray, ultrasound, magnetic resonance,
positron emission, and others) concerned primarily with medical diagnosis and other medical applications. It
involves both image reconstruction to create meaningful images from the raw data gathered from the sensors
and image analysis to extract useful information from the images.
Image and video coding focuses on the compression of an image or video, so that it occupies less storage space
or takes less time to transmit from one location to another. Compression is possible because many images
contain significant redundant information. In the reverse step, image decoding, the full image or video is
reconstructed from the compressed data.
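A toy illustration of exploiting this redundancy (not one of the codecs used in practice) is run-length encoding: each run of identical pixel values is stored as a (value, count) pair, and decoding expands the pairs back losslessly:

```python
# Run-length encode a row of pixels as [value, count] pairs.
def rle_encode(pixels):
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1       # extend the current run
        else:
            runs.append([p, 1])    # start a new run
    return runs

# Decoding expands each run back into repeated pixel values.
def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]

row = [0] * 10 + [1] * 5 + [0] * 10      # 25 binary pixels, 3 runs
encoded = rle_encode(row)
print(encoded)                            # [[0, 10], [1, 5], [0, 10]]
assert rle_decode(encoded) == row         # lossless round trip
```

For binary images with long runs, the three pairs here replace 25 pixel values; for noisy images with short runs, run-length encoding can expand the data instead.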
1.2.1 Optics
A camera consists of a sensor combined with a lens. The lens focuses a parallel beam of light (effectively from
infinite distance) from a particular direction to a point on the sensor. For an object at a finite distance, as seen
[Figure 1.1: Optical image formation using a lens (geometric optics). An object point (x, y, z) is imaged through the lens to (xi, yi) on the sensor plane; do and di are the object and image distances, and f is the focal length.]
in Figure 1.1, geometric optics gives the relationship between the lens focal length, f, the object distance, do,
and the image distance, di, as
$$\frac{1}{d_i} + \frac{1}{d_o} = \frac{1}{f}. \qquad (1.1)$$
Since the ray through the centre of the lens is a straight line, this ray may be used to determine where an
object will be imaged to on the sensor. This pinhole model of the camera will map a 3-D point (x, y, z) in camera
coordinates (the origin is the pinhole location; the z-axis corresponds to the principal axis of the camera; and
the x- and y-axes are aligned with the sensor axes) to the point
$$x_i = \frac{x d_i}{z}, \qquad y_i = \frac{y d_i}{z} \qquad (1.2)$$
on the sensor plane. The image distance, di, is also called the effective focal distance of the lens. Note that
the origin on the sensor plane is at the centre of the image (where the principal axis intersects the sensor).
This projective mapping effectively
scales the object in inverse proportion to its distance from the camera (z = do ). From similar triangles, the optical
magnification is
$$m = \frac{d_i}{z} = \frac{d_i}{d_o}. \qquad (1.3)$$
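Equations (1.1)-(1.3) are easy to check numerically. In the sketch below the lens and object parameters are arbitrary illustrative values, not figures from the text:

```python
# Thin-lens and pinhole-model relations, Eqs. (1.1)-(1.3), evaluated
# for an arbitrary example: a 50 mm lens focused on an object at 2 m.
f = 0.05                             # focal length (m)
d_o = 2.0                            # object distance (m)
d_i = 1.0 / (1.0 / f - 1.0 / d_o)    # Eq. (1.1) solved for image distance

def project(x, y, z, d_i):
    """Pinhole projection, Eq. (1.2): camera coordinates to sensor plane."""
    return x * d_i / z, y * d_i / z

m = d_i / d_o                        # Eq. (1.3): optical magnification
x_i, y_i = project(0.1, 0.2, d_o, d_i)
print(round(d_i, 6), round(m, 6))    # 0.051282 0.025641
```

As expected from the projective scaling, the 2 m distant object is imaged at roughly 1/39 of its true size.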
The image is generally focused by adjusting the image distance, di , by moving the lens away from the sensor
(although in microscopy, di is fixed, and the image is focused by adjusting do). Focusing can either be manual
or, for an electronically controlled lens, automatic, with an autofocus algorithm adjusting di to maximise the
image sharpness (see Section 6.1.7.2).
Objects not at the optimal object distance, do , will become blurred, depending on both the offset in distance
from best focus, Δdo , and the diameter of the lens aperture, D. The size of the blurred spot is called the circle
of confusion and is given by
$$c = \frac{|\Delta d_o|}{d_o + \Delta d_o} \cdot \frac{D f}{d_o - f} = \frac{|\Delta d_o|}{d_o + \Delta d_o} \cdot \frac{f^2}{N (d_o - f)}, \qquad (1.4)$$
where N = f/D is the f-number of the lens aperture.
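Equation (1.4) is straightforward to evaluate; the values below are arbitrary illustrative choices (a 50 mm lens at f/2.8 focused at 2 m, with the object 0.5 m behind best focus), not values from the text:

```python
# Circle of confusion, Eq. (1.4), for a defocused object.
f = 0.05          # focal length (m)
N = 2.8           # f-number, N = f / D
d_o = 2.0         # best-focus object distance (m)
delta = 0.5       # object offset from best focus (m)

c = (abs(delta) / (d_o + delta)) * f ** 2 / (N * (d_o - f))
print(f"circle of confusion: {c * 1e6:.1f} um")   # tens of micrometres
```

A blur spot of this size spans many pixels on a typical sensor, so the object would appear visibly out of focus.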
Equations (1.4) and (1.5) may be combined to give the effective size of the blur as (Hansma, 1996)
$$d_e = \sqrt{c^2 + d^2}. \qquad (1.6)$$
Of course, any other lens aberrations or imperfections will further increase the blur.
1.2.2 Colour
Humans are able to see in colour. There are three types of colour receptors (cones) in the human eye that respond
differently to different wavelengths of light. If the wavelength dependence of a receptor is S_k(λ), and the light
falling on the receptor contains a mix of light of different wavelengths, C(λ), then the response of that receptor
will be given by the combination of the responses at all wavelengths:
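The response equation itself falls beyond the end of this excerpt, but it is the integral over wavelength of the product of the receptor sensitivity and the incident spectrum. A numerical sketch with invented spectra (a Gaussian "sensitivity" and a flat illuminant; these are stand-ins, not measured cone data):

```python
import numpy as np

# Numerical sketch of a receptor response: the integral over wavelength
# of S_k(lambda) * C(lambda), approximated by a Riemann sum.
wavelengths = np.linspace(400, 700, 301)        # nm, 1 nm spacing

def gaussian(lam, centre, width):
    return np.exp(-((lam - centre) / width) ** 2)

S_k = gaussian(wavelengths, 560, 50)   # hypothetical receptor sensitivity
C = np.ones_like(wavelengths)          # hypothetical flat light spectrum

d_lambda = wavelengths[1] - wavelengths[0]
response = float(np.sum(S_k * C) * d_lambda)
print(response)
```

Any spectrum C(λ) that yields the same three receptor responses is perceived as the same colour, which is why three numbers per pixel suffice for colour images.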