A Portable Vision System For Detecting and Counting Maggots: Lazar Lazarov
Lazar Lazarov
Abstract
As part of a school science outreach programme, the feeding preferences of larval
Drosophila (the common fruit fly) are studied. To accomplish this, it is necessary to
identify the larvae's locations within a petri-dish. However, most of the analysis of their
feeding behaviour is done by human observation, which is a slow and laborious
process.
To help with this problem, we have developed an automated system based on computer
vision. Running on a standard Android smart-phone, it employs thresholding and our
BFS-based contouring algorithm to segment and count maggots. Under ideal conditions
the system gives precise results, but changes in the environment and maggots clumping
together still cause problems that remain unresolved.
Table of Contents
1 Introduction
1.1 Project Aim
1.2 Report Road-map
1.3 Project Contributions
2 Background
2.1 Similar Works
2.1.1 C. elegans Detection, Segmentation, and Counting
2.1.2 Image Processing to Detect Worms
2.2 Image Processing
2.2.1 Thresholding
2.2.2 Endrov Image Library
2.2.3 OpenCV Object Segmentation
3 Product Design
3.1 Image Capture
3.2 Object Segmentation
3.3 Shape Classification
3.3.1 Object Invariance
3.3.2 Object Comparison
3.3.3 Maggot Counting
4 Implementation
4.1 Application Permissions
4.1.1 Camera
4.1.2 Storage
4.2 Device Compatibility
4.2.1 Camera Resolution
4.2.2 Touch-screen Calibration
4.3 Recognition Algorithms
4.3.1 Colour Blob Segmentation
4.3.2 Maggot Selection
4.3.3 Chain Code Contouring
4.3.4 Breadth-First-Search Colouring
4.4 Android Development
4.4.1 Code Translation
5 Evaluation
5.1 Edge Case Experiments
5.2 Application Usability
6 Discussion
6.1 Future Work
6.2 Conclusion
Appendices
A Instruction Manual
Chapter 1
Introduction
In a simple experiment, maggots are trained to associate an odour with sugar. The idea
of our project is to develop an app that allows students to use their mobile phones
to get an instant count of how many maggots are on each side of a petri-dish, which
reveals their odour preferences. The system should require minimal set-up, so that
more people can have access to it. It should be uncomplicated to use, work faster
than manual counting, and be almost as accurate. To accomplish these tasks we created
an intuitive user interface and optimised the system to work with minimal delay and
maximum accuracy.
• Chapter 2
In this chapter we will discuss two papers that set out to create vision
systems for recognising and counting worms. We will also explore different types
of thresholding and how to automatically calculate an ideal threshold level.
Finally, we will look at two image processing libraries and describe the
Freeman Chain Coding segmentation process in detail.
• Chapter 3
In the design chapter we will talk about the overall ideas that went into designing
the system. We will see how images can be processed, as well as how object
recognition can be achieved. We shall also explain how to create features that
are invariant and how we use those to differentiate between the objects we can
encounter in our use cases.
• Chapter 4
In the implementation chapter we will talk about the methods provided by OpenCV
and how we have used them in our project. We will also explain how we developed
our object detection processes, going into greater detail on how they work and
what their benefits are compared to the provided library functions. We will also
cover general Android development hurdles and how we overcame them, as well as
how we designed and implemented our own user interface.
• Chapter 5
In the evaluation chapter we will look at the different edge case experiments we
artificially created to test the extreme use cases of our app. We shall also
discuss its usability by comparing the speed and accuracy of manual maggot
detection with the application-assisted one.
• Chapter 6
In the discussion chapter we will look at the two main underlying problems dis-
covered by our experiments and describe ways to overcome them in the future.
We will also talk about how much of the project goal we have achieved and why
that is.
Chapter 2
Background
2.1 Similar Works
2.1.1 C. elegans Detection, Segmentation, and Counting
We will begin with a project by Kong et al. (2016) from the University of California,
Irvine, which tackles a problem similar to the one our application is trying to solve.
Their goal is to develop a robust, autonomous system for analysing high-resolution
images of the Caenorhabditis elegans roundworm, whose shape is quite similar to
that of the maggots in our project.
They use keypoint detection, bounding box regression, chain model detection, as well
as semantic and instance segmentation to separate and count worms of different
lengths. The overall pipeline is quite complex, and some of the steps are repeated
multiple times for optimal results. This creates significant performance demands, but
because they use a desktop computer, they have access to plenty of processing power.
There are multiple differences between the requirements of their project and ours. One
such difference is the environment: theirs uses a stationary high-resolution scanner to
capture the images, while ours are taken by a hand-held mobile device, which means we
need to focus more on differences in lighting conditions. They also face a much denser
population of worms, as can be seen in Figure 2.1, which requires greater accuracy than
our use case does.
The reason we were interested in their project in the first place was that it presents
a system with similar goals to ours, and we believed it might be possible to implement
it in a mobile context. However, because our project is on a much smaller scale and
we have less experience with the development of such systems, in the end we decided
against it.
(a) Their Use Case of Clumped Worms (b) Our Use Case of Sparse Maggots
Figure 2.1: Difference in Use Cases Between Our Project and Kong et al. (2016).
2.1.2 Image Processing to Detect Worms
Another similar work, by the PhD student Fernandez (2010) at Uppsala University,
discusses the same problem and gives a more in-depth look at a solution. It uses
methods that we have deployed in our own project, but also includes ones that go
beyond the scope of our goal. The dissertation goes into great detail on shape
recognition because, similarly to the previous work, they are dealing with a large
quantity of worms, so discerning clumped masses is of great importance.
Most of the other differences between our methods and the ones proposed there have
already been highlighted in the section above. Overall, despite the different use
cases, there is still a great deal of useful information that we gained thanks to
this dissertation.
2.2 Image Processing
We will describe approaches for converting an unprocessed, raw image into usable
information for an experiment, comparing different methods, some of which are used
by the papers above.
2.2.1 Thresholding
Thresholding is a process with many uses, not only in computer vision, and due to
its usefulness both works mentioned above employ it in some form. Threshold
representations are used so that a simplified version of the original image may be
obtained. This is done because computers do not process images the same way humans
do, so deconstructing images into meaningful features is the only way to automate
image data gathering. And when working with limited processing power, such
simplification becomes even more important.
There are multiple ways to calculate the ideal threshold value for a particular image as
can be seen in the works of Ismail and Marhaban (2009) and Abutaleb (1989), which
explore automatic bilevel thresholding.
The first explains a simple approach to finding the ideal threshold value by using
only the histogram of the image. The process begins by converting the image into two
dimensions. It then acquires the histogram of the 2D image and differentiates it to
better suit gradient analysis, which determines the ideal threshold. The second
gives ideas on how to use not only the grey-scale level but also spatial information
about the interaction between the image pixels. We shall not be using these ideas, as
we decided on a manual thresholding process because of the many variables affecting
the features of our images.
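Although we settled on manual thresholding, the histogram-based idea can be illustrated with a short sketch. The following is a textbook Otsu-style calculation in NumPy; it is an illustration of automatic bilevel thresholding in general, not the exact gradient-based method of Ismail and Marhaban (2009):

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold that maximises the between-class variance
    of an 8-bit grey-scale image. A textbook Otsu sketch for
    illustration, not code from our application."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()  # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0       # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

On a bimodal image (dark background, light maggots) the returned value lands between the two intensity clusters, which is exactly what a bilevel threshold needs.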
2.2.2 Endrov Image Library
The works of both of the previously discussed image processing projects were based on
an open-source Java plugin architecture, the Endrov Library (2017). It is mainly
used on desktop computers, which makes it not necessarily suitable for our use case.
That said, considering our application runs on the Android operating system, which
runs Java programs, there could have been a way to integrate it into our own project.
However, at the time of writing the software's website had been offline for months,
so we could not find a way to obtain it, nor was there documentation we could use to
familiarise ourselves with it.
2.2.3 OpenCV Object Segmentation
For our vision platform we decided on the open-source OpenCV library (2017). It
is well supported on multiple platforms and in many programming languages. It
is extensively documented and has been used for image processing in many different
fields, which makes troubleshooting problems much easier. Another reason we chose
OpenCV is that we have used it multiple times in past courses and projects; having
some experience with it is quite beneficial, especially in the early stages of
development.
OpenCV provides a native colour blob detection method that can be used to recognise
maggot-like shapes. In order to measure its strengths and weaknesses we had to look
at it in more detail and understand how it operates. Most of the library's functions
are focused on real-time image processing, so they are optimised to work quickly.
However, to achieve higher processing speeds some sacrifices must be made: in the
colour detector's case, the effective resolution of the input image is greatly
lowered, which results in fewer operations being performed.
In order to contour objects of a specific colour, the integrated method uses an
approximation technique called Freeman Chain Coding (2017), developed by Dr. Herbert
Freeman in the second half of the 20th century. The idea behind it is simple but very
useful, especially when dealing with limited processing power and memory.
The process starts by defining an 8-directional compass (Figure 2.4), which will be
used to create the chain code when contouring. The input image is then overlaid with
a grid of many smaller shapes, which are of three distinct types: squares, circles or
equilateral rhombi, as seen in Figure 2.5. The three shapes yield very different chain
codes even on the same image; their main difference is the number of diagonal
directions found in the final chain code.
Figure 2.5: Single Maggot Contoured Using Three Different Quantization Methods
For example, using squares produces no diagonal lines, since their sides are
perpendicular to one another and parallel to the pixel grid of the image. As can be
seen in 2.5a, this results in contours that have only right angles, which gives less
accurate measurements of the contour, especially when dealing with more complex
object shapes. The rhombus (2.5c) and circle (2.5b), on the other hand, produce
diagonal directions in different amounts, which makes them versatile and more
appropriate for general use. According to Dr. Freeman, to achieve good shape
approximation about half of all directions in the final chain code should be diagonal.
The size of the shape used determines the accuracy of the contouring: the bigger the
shape, the less precise the result, but the faster it is computed. Unfortunately
OpenCV does not provide a direct way to adjust the type of shape or its size, which
our project required, as we wanted pixel-level accuracy. Another, less significant
problem was that the newest version of OpenCV no longer returns the computed chain
code, probably to simplify the function's interface. While you can still get valuable
information about the object from the library method, having the source chain code
would have allowed more customisation.
Figure 2.6: A Visual Representation of the Steps Taken by the Freeman Chain Coding
Method
An overview of the steps in the chain coding method can be seen in Figure 2.6. The
method starts by saving the coordinate of the first shape (square, circle, etc.) it
finds that lies on the edge of an object. It then goes clockwise through the 8
directions on the compass, checking for the first neighbouring shape that contains
both part of the object and part of the background. The chosen direction is recorded
into a data structure and the algorithm moves to the newly found neighbour. This
process is repeated until the original starting position is reached, which indicates
that the whole outside of the object has been contoured.
If we want to extract the exact pixel locations of the object's perimeter, we can go
to the starting coordinate and follow the directions recorded by the algorithm. OpenCV
can then calculate certain characteristics of the object, such as its perimeter
length, its area and its centroid.
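The decoding step just described can be sketched in a few lines of Python. The compass numbering below is an assumption (0 = east, proceeding clockwise in image coordinates, where y grows downward); the report's Figure 2.4 defines the actual layout:

```python
import math

# Hypothetical 8-direction compass: 0 = east, clockwise with y pointing down.
MOVES = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def decode_chain(start, code):
    """Follow a Freeman chain code from `start` and return every
    visited (x, y) boundary coordinate."""
    x, y = start
    points = [(x, y)]
    for d in code:
        dx, dy = MOVES[d]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

def perimeter(code):
    """Approximate perimeter: axis-aligned moves count 1,
    diagonal moves count sqrt(2)."""
    return sum(1.0 if d % 2 == 0 else math.sqrt(2) for d in code)
```

A closed chain code brings the walk back to its starting coordinate, which is how the contouring loop knows the object's outline is complete.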
This approximation method is great for fast calculations, as it processes only the
outside of the object, lowering the number of shapes that need to be visited. In
addition, the relatively large size of the shapes used greatly helps with performance.
However, as stated above, approximating loses vital information about the image,
which we need since maggots are very small, usually less than 0.001% of the whole
image. By not using the full resolution of the image, maggots that are close to one
another are detected as a single object, which unnecessarily lowers accuracy. So
instead of using this method, we developed our own version of it, which is slower
because it works on a pixel-by-pixel basis, but is also more accurate.
Chapter 3
Product Design
3.1 Image Capture
In order to get an idea of the type of pictures the students would take with their
devices, we asked for and received a multitude of images taken by the target students
with minimal instructions on how they should be captured. This was done purposefully,
so that we could have a general representation of the type of pictures the system
might have to process. After looking through the provided images, we came to the
conclusion that instructions would be needed, as most of the images were taken from
very shallow angles, which made distortion a significant problem. A great number of
the pictures were also not in focus, which made recognising tiny objects almost
impossible, as they would be blurred and would blend into the background. During our
experimentation phase we created such an instruction manual, which can be found in
Appendix A.
3.2 Object Segmentation
Originally we were thinking of using a colour recognition algorithm for the maggot
detection, as the maggots are a light grey colour compared to the darker shade of the
background. However, we noticed that the outer wall of the petri-dish is a similar
shade, and reflections from the side of the dish would disrupt our vision system. The
final method we landed on still experiences some of these problems, but it compensates
with higher accuracy and more configurability, as we could implement it with all the
specific features our use case requires.
3.3 Shape Classification
3.3.1 Object Invariance
There are many ways to classify an object; however, not all methods are invariant to
changes in the way the object is positioned in the image. With maggots this can be
an issue: while they have similar sizes, their movements and positioning can vary
greatly. So the method we use to classify them needs to be rotation, position and
scale invariant.
The least computationally expensive way of accomplishing this is by using object
features that are already position and rotation invariant: the area and perimeter of
the object. While at first it might seem that they do not tell us a great deal about
the characteristics of the object, when dealing with non-complex shapes such as
maggots we can extract enough information to achieve accuracy similar to that of more
complex methods. The area of a maggot is equal to the number of pixels that the object
takes up on the screen. In order to make it scale invariant, we transform these
measurements from pixel values into a format correlated with a real-world object,
such as the petri-dish. Then, by changing only a single value, the size of the dish,
we can make the algorithm practically scale invariant. Using only these simple
procedures we can extract useful information that can be used to compare any object
in the image, no matter its position, size or rotation.
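The pixel-to-real-world conversion described above amounts to one scale factor derived from the dish. The function name and the 90 mm default below are illustrative, not taken from our code:

```python
def area_mm2(area_px, dish_diameter_px, dish_diameter_mm=90.0):
    """Convert an object's pixel area into square millimetres, using
    the petri-dish as the real-world reference. 90 mm is a common
    dish size, used here only as an illustrative default."""
    mm_per_px = dish_diameter_mm / dish_diameter_px
    return area_px * mm_per_px ** 2
```

Because only `dish_diameter_px` changes between photographs, two maggots of the same real size yield the same converted area regardless of how far away the picture was taken.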
3.3.2 Object Comparison
Now that we have invariant values for our larvae, we can begin comparing them to one
another and to pre-calculated values that describe how a maggot looks in terms of
these properties. By doing manual experiments with different configurations, we can
determine the average area of our test subjects and remove unwanted objects from our
calculations. We will talk more about this topic in the implementation chapter,
because in building the application we saw first hand the effects of these methods
and which changes were the most beneficial.
3.3.3 Maggot Counting
Once we have distinguished and classified our objects, we need to achieve the actual
goal of the project: counting the number of maggots on each side of the dish. This
process is quite straightforward, as the information needed has already been
extracted by the previous steps. In the larvae experiments, lines are drawn to
separate the petri-dish into three distinct areas, as seen in Figure 3.1.
The middle area is the smallest, as it is the starting position for all maggots.
Substances placed on the left and right sides of the dish draw the maggots towards
each side. By recreating the lines and the petri-dish within our app, we can then
determine how many maggots there are in each area.
To find which part of the dish a maggot is in, we check the location of its centroid
against the coordinates of the lines separating the areas of the petri-dish, and
depending on which part it is in, we increment that area's counter. We do have to
think about the edge case where a maggot is lying on the boundary between two areas.
However, this can be handled by the user of the app, as the camera frame and the
lines can be moved so as to place the maggot in the area that seems to fit best.
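Assuming vertical dividing lines given by their x coordinates (a simplification, since in the app the lines are positioned by the user), the centroid check might be sketched as:

```python
def count_regions(centroids, left_x, right_x):
    """Count maggot centroids in the left, middle and right areas of
    the dish, assuming vertical dividing lines at left_x and right_x.
    `centroids` is a list of (x, y) pairs."""
    counts = {"left": 0, "middle": 0, "right": 0}
    for x, y in centroids:
        if x < left_x:
            counts["left"] += 1
        elif x > right_x:
            counts["right"] += 1
        else:
            counts["middle"] += 1
    return counts
```

A centroid falling exactly on a line is counted as "middle" here; as the text notes, in practice the user resolves such boundary cases by nudging the frame or the lines.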
Creating the lines is a manual process that could be automated to a degree if there is
a correlation between the sizes of the areas and the size of the petri-dish itself;
however, we will not be doing this for the time being. Furthermore, finding the
petri-dish itself could be done automatically by looking for large circular shapes in
the image. We opted out of both options, as they would not only have complicated the
development process, but would also have taken control away from the user. This would
not have been beneficial, because it is crucial to correctly position the device in
order to minimise glare and distortion. We also believe that having finer manual
control creates an overall better user experience.
Chapter 4
Implementation
Here we will discuss the exact implementation steps of some of the ideas presented in
the previous chapter.
4.1 Application Permissions
4.1.1 Camera
In order to get the image from the phone's camera, we use the camera interface
provided by the native Android libraries. As of Android 6.0, whenever an application
requires a certain device module, it should ask for permission to use it. With the
help of the provided interface, the app can invoke a pop-up window that requests the
permission, which the user can accept or decline. If declined, the application exits
and the user is prompted with the same message on the next start of the app.
4.1.2 Storage
For easier transfer of the app's results into the actual experiment sheets, we
developed an option to save the result images onto the device's storage. And because
we need to request storage permissions to do so, we use the same process described
above to acquire them.
4.2 Device Compatibility
4.2.1 Camera Resolution
Most modern smart-phone cameras have an image capturing resolution between 5 and
12 mega-pixels; that is, the total number of pixels captured in an image is between
5 and 12 million. And because most camera sensors have a 4:3 aspect ratio, this gives
a minimum resolution of 2560x1920 and a maximum of 4000x3000. For most vision
applications these high resolutions are unnecessary, as other factors in image
capture, such as sensor noise, outweigh the benefits of higher image resolution.
Possibly because of this, OpenCV automatically converts the camera input stream into
its own matrix representation with 1920 columns and 1080 rows. This ensures easy
cross-device compatibility, as we do not have to think about different resolutions
and their effects on performance.
4.2.2 Touch-screen Calibration
When OpenCV was first developed for Android in 2011, almost all devices used standard
16:9 displays, with a resolution of 854x480 for low-end smart-phones and 1280x720 for
higher models. Since then, however, devices have drastically changed, as most
manufacturers now create their own non-standardised displays, with custom aspect
ratios and resolutions, to better fit their phones.
The problem of many different display resolutions is gracefully handled by the Android
OS: using the provided API, it is easy to get the coordinates of where the user has
touched the screen. A problem arises when trying to use these coordinates with OpenCV
because, as stated above, it converts the input image from the native camera
resolution to 1920x1080. The image shown on the screen by OpenCV can therefore have a
different resolution from that of the display, which means that when a user touches
the image at a particular point, the coordinates do not translate correctly.
To solve this, we calculate the ratio between the OpenCV image's width and the
screen's pixel width. Then, whenever the user touches the display, we multiply both
coordinates given by the Android OS by this ratio, which gives us translated image
coordinates. We also had to add edge case handling for coordinate values below zero
or beyond the full width and height of the image matrix, as those would raise an
out-of-bounds exception when accessing pixels at that point.
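The ratio-based translation with bounds clamping might look like the following sketch (the function name and default matrix size are illustrative; 1920x1080 matches the OpenCV matrix described above):

```python
def touch_to_image(touch_x, touch_y, screen_w, img_w=1920, img_h=1080):
    """Translate a screen touch into OpenCV matrix coordinates using a
    single width ratio, as described in the text, clamping both axes
    so that pixel access never goes out of bounds."""
    ratio = img_w / screen_w
    x = min(max(int(touch_x * ratio), 0), img_w - 1)
    y = min(max(int(touch_y * ratio), 0), img_h - 1)
    return x, y
```

Note that a single width ratio is only exact when the screen and the image share an aspect ratio; a more general version would use separate ratios per axis.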
4.3 Recognition Algorithms
4.3.1 Colour Blob Segmentation
Our first approach used a method provided by OpenCV, which detects coloured blobs on
the screen using the Freeman chain code method described in section 2.2.3. This
method takes an image and a colour as input and produces an array of contours of all
connected objects similar to the provided colour. A single contour contains
information about all the pixels of the object's outline, as well as other values
such as the area, perimeter and centroid of the shape. In most cases we do not need
to store all points, as that is inefficient in terms of memory; instead we can choose
a different chain approximation method. The simple chain approximation is one of
these, and its main function is to remove unnecessary coordinates: for example, if
one of the sides of our object is a straight line, we do not need to store all the
points along that line, just its start and end positions. This process is
automatically repeated by OpenCV along the contours of all objects in the image,
which leaves us with much more optimised shapes and minimal accuracy loss.
4.3.2 Maggot Selection
The colour contouring method described above gives us all objects of a specific
colour, but unfortunately not all of these are actual maggots, so we have to do
further processing to eliminate the unnecessary elements. We tackled this by taking
the array of contours and sorting it by area. Because maggots have similar sizes and
are the most common object detected, sorting the contours places them roughly in the
middle of the array, with the unsuitably small and large objects at its far ends.
The method we are about to explain can be seen in Figure 4.1. We start by saving the
value of the median; then, from its index, we iterate in both directions, left and
right. For the left side of the array, we check whether the value of every element is
greater than the middle area minus an error value, usually 10%-20%, since maggots
differ slightly in size:
areaAtLeftIndex > middleArea − middleArea ∗ error
For the right side we do the same, but invert the check:
areaAtRightIndex < middleArea + middleArea ∗ error
We do the left and right iterations independently; when a check fails, we save the
index at which this occurred, and once both the left and right checks have failed or
reached the end of the array, we stop iterating. We then take the left and right
indices saved in the previous step and remove all elements before and after them,
respectively. This leaves us with an array containing only objects of similar sizes.
The effects of this method can be seen in Figure 4.2.
Figure 4.2: Objects of Much Bigger or Smaller Sizes are Removed by the Method
After removing almost all unnecessary elements from the array, we call the
drawContours() library function with the array and our input image as arguments,
which takes all the points described in the elements of the array and draws them on
the image.
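The median-based selection described above can be sketched as follows, assuming the contour areas have already been sorted in ascending order (the 15% tolerance is illustrative, within the 10%-20% range mentioned in the text):

```python
def filter_by_median_area(areas, error=0.15):
    """Keep only contour areas within ±error of the median area,
    mirroring the left/right iteration described in the text.
    `areas` must be sorted in ascending order."""
    mid = len(areas) // 2
    middle = areas[mid]
    lo, hi = middle * (1 - error), middle * (1 + error)
    left = mid
    while left > 0 and areas[left - 1] > lo:   # walk left while check holds
        left -= 1
    right = mid
    while right < len(areas) - 1 and areas[right + 1] < hi:  # walk right
        right += 1
    return areas[left:right + 1]
```

Because the input is sorted, the first failure in each direction means every element beyond it also fails, so the two walks can stop immediately.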
4.3.3 Chain Code Contouring
In this section we will go through our own implementation of a chain code algorithm.
As stated before, we wanted to recreate this method so that we would have finer
control over the function and be able to better tailor it to the application's needs.
Our project needed to be written in Java, as it is the main language supported by the
Android OS; however, we started writing our method in the Python programming
language, as it is very natural to write in and we could quickly realise concepts
without getting bogged down in specific implementation details. OpenCV is just as
well supported in Python as it is in Java, so there should not have been any major
differences, which turned out not to be entirely the case.
Our Python code did not need to be efficient as it served as a draft and we were running
it on a personal computer, which had more than enough processing power and memory
compared to a smart-phone.
Figure 4.3: Visual Representation of the Steps Taken by Our Freeman Chain Coding
Method
Our full implementation can be seen in Appendix B, but we will explain it in such a
way that looking through the code is not necessary. For reference, one can look at
the algorithm overview in Figure 4.3 and the control flow graph in Figure 4.4.
The program begins by converting an input image into a grey-scale representation and
applying the OpenCV thresholding method with a predefined threshold value. This
creates a binary image, where the background elements are black and the foreground is
white. Using two nested “for” loops, all rows and columns of the input image are
visited and the value of every pixel is checked. Whenever one equals 255 (white), the
algorithm assumes an object has been detected and can begin contouring.
This results in the contourObject() function being called, which takes as input the
coordinates of the pixel chosen in the previous step and begins the chain code
contouring. Two stacks are used in the function to store information about the
object: the first stores the chain code directions and the second the coordinates of
every contoured pixel. Our function also initialises minimum and maximum values for
both the X and Y coordinates, which will later be used to find the centroid of the
object.
A “while” loop is initiated, which is broken once the starting position is reached
after the object has been contoured. Within it, a “for” loop goes from 0 to 7, the 8
directions on the compass seen in Figure 2.4. Because the algorithm loops through the
image from top to bottom and from left to right, it always starts at the top-left
corner of the object, which means it goes clockwise around the object while
contouring. To check the fewest directions, each next pixel uses the last direction
as the starting point of its own direction loop.
The XYfromDirection() function takes a starting coordinate and a direction number and
computes the corresponding X and Y values. The computed coordinates are then checked
to see whether they lie within the bounds of the image and whether they represent an
edge pixel, using the Boolean isEdgePixel() function. This method succeeds when the
given pixel has both white and black pixels as neighbours. If it evaluates to true,
the pixel and direction are added to their respective stacks. In addition, the
coordinates are compared to the values in the Min and Max variables: if greater, they
become the new Max values, and if smaller, the Min ones are replaced. After saving
the information in the data structures, the method returns to the start of the
“while” loop and starts working on the next pixel of the object. The visited pixels
are also coloured grey, so that the method does not try to contour the same object
multiple times, as only white pixels can start the contouring process.
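A simplified sketch of the two helper functions described above is given below; the compass table is an assumption (0 = east, clockwise with y growing downward), since the actual numbering is defined in Figure 2.4, and out-of-bounds neighbours are treated as background:

```python
# Hypothetical clockwise compass, y pointing down: 0 = east, 2 = south, ...
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def xy_from_direction(x, y, d):
    """Return the neighbour of (x, y) in compass direction d."""
    dx, dy = DIRECTIONS[d]
    return x + dx, y + dy

def is_edge_pixel(binary, x, y):
    """True when the pixel at (x, y) has both white (255) and black
    neighbours, as described in the text. Out-of-bounds neighbours
    count as background - an assumption of this sketch."""
    h, w = len(binary), len(binary[0])
    has_white = has_black = False
    for dx, dy in DIRECTIONS:
        nx, ny = x + dx, y + dy
        if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] == 255:
            has_white = True
        else:
            has_black = True
    return has_white and has_black
```

An interior pixel of a solid object fails the check because all eight of its neighbours are white, which is what keeps the contouring walk on the object's boundary.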
Figure 4.4: Control Flow Graph of Our Freeman Chain Coding Method
4.3. Recognition Algorithms 25
Once the object has been contoured, the method returns to the main image pixel loop and saves information about the object, such as its perimeter, its bounding box and both stacks. Looping through the image is then resumed, so that the remaining objects may be found.
Overall, this method worked well enough; however, there were a few caveats. As mentioned before, the area of the object cannot be calculated directly using only the contouring method. Instead, going from the top to the bottom of the object and summing the distances between every two points on the same row would be required. This can lower overall performance, as the algorithm would be making multiple passes through the object's pixels.
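The row-summing idea could look like the following sketch, assuming the contour is available as a list of (x, y) pairs like our xyStack. It is illustrative rather than the code we shipped, and it over-counts on rows where a concave shape has several separate runs.

```python
from collections import defaultdict

def area_from_contour(contour):
    """Estimate the enclosed area by scanning the contour row by row and
    summing the horizontal distance between the outermost contour pixels
    on each row."""
    rows = defaultdict(list)
    for x, y in contour:
        rows[y].append(x)
    return sum(max(xs) - min(xs) + 1 for xs in rows.values())

# The contour of a filled 4x4 square recovers the full 16-pixel area:
square = [(x, y) for y in range(4) for x in range(4)
          if x in (0, 3) or y in (0, 3)]
```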
More importantly, while testing with custom shapes, we noticed that whenever an object had a one-pixel-wide appendage that was more than two pixels long, our method would get stuck; an example can be seen in Figure 4.3. This is because the method is not allowed to revisit pixels that have already been visited, which means that once it reaches the end of the appendage, it is unable to go back.
A seemingly straightforward approach to solving this was to allow the algorithm to pass through visited pixels; however, this introduced the problem of getting stuck looping between the same two pixels. The method was also unusually slow, but after researching online we concluded that this was because we were using native Python “for” loops instead of the more optimised NumPy matrix operations. NumPy is a standard Python library for creating and operating on multi-dimensional arrays and matrices; it is implemented in the C programming language, which results in much faster array access compared to more traditional Python methods.
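The difference NumPy makes can be illustrated with a small, self-contained comparison (not taken from our code): both functions below count the white pixels of a binary image, but only the second stays inside NumPy's compiled routines, and on images of petri-dish size it is typically orders of magnitude faster.

```python
import numpy as np

img = np.zeros((500, 500), dtype=np.uint8)
img[100:400, 100:400] = 255  # a 300x300 white square

def count_white_native(image):
    # Pure-Python nested loops: one interpreter round trip per pixel.
    count = 0
    height, width = image.shape
    for y in range(height):
        for x in range(width):
            if image[y, x] == 255:
                count += 1
    return count

def count_white_numpy(image):
    # The comparison and the reduction both run in compiled C code.
    return int(np.count_nonzero(image == 255))

assert count_white_native(img) == count_white_numpy(img) == 300 * 300
```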
Because we could not think of a way to solve these issues in a reasonable amount of time, we decided to switch to an entirely different segmentation method, which would take care of these problems without any extra work.
Breadth-first search (2018) (BFS) is an algorithm for traversing tree and graph data
structures, but we will be using our own version of it to find the connected pixel objects
within our image.
Once again, the code for this method can be found in Appendix C, but the control flow graph in Figure 4.6 and the visual representation in Figure 4.5 should be sufficient to follow it.
Figure 4.5: A Visual Representation of the Steps Taken by Our BFS Algorithm
Figure 4.6: Control Flow Graph of Our BFS Method
Similarly to the previous implementation, two nested “for” loops are created to iterate through the pixels of the image. Just as before, whenever a white pixel is encountered, the contouring function is called with the coordinates of the pixel as input arguments. We call our function colourObject(), as it does not go around the object's contours, but instead spreads through it and “colours in” its pixels.
The algorithm starts by creating a NumPy matrix with the same width and height as the original image. This matrix will contain information only at the coordinates associated with the specific object currently being coloured in. The idea is that, after having processed all objects within the image, we will be able to layer the suitable object matrices on top of each other (Figure 4.7) to create the final representation containing only maggots. This approach is quite memory inefficient but, as mentioned before, this was just a draft of the final implementation, so these problems would later be ironed out in the Python to Java translation.
After the matrices have been created, two stacks are initialised - visitedPixels and reachablePixels. The first is self-explanatory; the second contains all pixels that are neighbours of those in the visitedPixels stack. The pixel coordinate passed as an argument is put on the visitedPixels stack and the perimeter of the object is set to 1. A “while” loop is then started, which breaks when the reachable pixels stack is empty. It should be noted that, because the stack starts without any elements in it, this check must only be made after looping at least once.
The same 8-iteration “for” loop from our previous method is also present here. However, instead of stopping when a suitable pixel is found, the BFS algorithm always completes all iterations and adds every white pixel it encounters to the reachable pixels stack. Every added pixel undergoes an image bounds check and a duplication check, to make sure it is not already present in the stack, as duplicate pixels can result in an infinitely growing stack, which would crash our program.
At the end of each “while” loop iteration, one element from the reachable pixels stack is moved to the top of the visited pixels stack, so that the process of colouring can continue. If this pixel is located on the edge of the object, the perimeter variable is incremented and the pixel is set to a different colour. If not, it is still coloured in, but in a different shade of grey. This gives a clear distinction between the edge and the interior of the object.
Figure 4.7: All Object Matrices Are Combined Into a Single Matrix
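The colouring procedure described above can be sketched in Python as follows. This is a simplified reconstruction, not the prototype itself: sets replace the prototype's list membership checks, and the per-object matrix and the actual grey/red recolouring are omitted.

```python
from collections import deque
import numpy as np

# 8-neighbour offsets, clockwise from "right" (same numbering as the
# chain-code helper in Appendix B).
OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def colour_object(img, start):
    """Flood-fill one connected white object starting at `start`.
    Returns the set of object pixels and the perimeter (the number of
    pixels with at least one black or out-of-bounds neighbour)."""
    h, w = img.shape[:2]
    visited = set()
    reachable = deque([start])
    seen = {start}            # duplicate check: queue each pixel only once
    perimeter = 0
    while reachable:
        x, y = reachable.popleft()
        visited.add((x, y))
        on_edge = False
        for dx, dy in OFFSETS:
            nx, ny = x + dx, y + dy
            if not (0 <= nx < w and 0 <= ny < h):
                on_edge = True                  # the image border is "outside"
                continue
            if img[ny, nx] == 255:
                if (nx, ny) not in seen:
                    seen.add((nx, ny))
                    reachable.append((nx, ny))
            else:
                on_edge = True
        if on_edge:
            perimeter += 1
    return visited, perimeter
```

Using a set for the duplicate check is the detail that keeps the reachable stack from growing without bound, which is exactly the crash scenario described above.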
To mitigate this problem, we convert the Mat object into a primitive Java array, which can then be accessed natively and efficiently. This presents a slight inconvenience: since we are converting a 2D matrix into a one-dimensional array, all non-sequential access needs an extra step of abstraction to make it more human readable. Our immediate solution was to implement a pair of functions - XYtoIdx() and IdxToXY() - that convert a 2D coordinate into a 1D index and vice versa. We could then use the same function implementations as in our Python code and simply convert the inputs and outputs into the correct formats using the newly defined functions. This resulted in a performance loss, as we were making twice as many function calls as required. It was resolved by rewriting all of our helper functions, such as isEdgePixel() and XYfromDirection(), into index-based equivalents.
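The conversion pair is simple enough to sketch in Python (the Java originals follow the same arithmetic): for an image of width w stored row-major in a flat array, a coordinate and its index are related by idx = y * w + x.

```python
def xy_to_idx(x, y, w):
    """2D coordinate to flat row-major index."""
    return y * w + x

def idx_to_xy(idx, w):
    """Flat row-major index back to a 2D coordinate."""
    return (idx % w, idx // w)
```

With the flat layout, moving one pixel becomes plain index arithmetic - “right” is idx + 1 and “down” is idx + w - which is what the index-based rewrites of isEdgePixel() and XYfromDirection() rely on to avoid the conversion round trips.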
4.4 Android Development
Unlike the practically unlimited memory of our Python testing phase, Android has a strict memory manager that does not allow programs to use more than a specified amount of RAM; this is done to keep the system in a stable state. Our previously mentioned inefficient method of storing every object as a matrix the same size as the original image quickly became impossible to sustain, and resulted in our application crashing when more than 10 objects were detected. Our solution was to keep only the actual pixels of the object and its properties - the visitedPixels stack, the area, the perimeter and the centroid. This dramatically decreased memory usage, and we no longer experienced memory-related crashes.
4.4.1.3 Streamlining
Originally, in order to show the user where the maggots were located in the picture,
we were looping through the objects and through their respective visitedPixels stacks
and colouring all pixels that were present there. This turned out to be very slow and
inefficient, since once again we were making function calls for every single pixel in
an object. Instead, we decided to just draw a dot at the location of the centroid of
each object. While this is not as accurate as drawing all pixels of a maggot, it serves a
similar purpose and is much faster to compute.
After making a close translation of our Python code into Java, with the few exceptions mentioned above, we noticed that performance was still not optimal, as images would take multiple seconds to process. To optimise our implementation, we decided to write our code in a less “clean” way, by reducing the number of loops and function calls we use. This resulted in bigger blocks of code, which did not look as neat as before, but performed considerably better.
The changes consisted of removing the 8-direction “for” loop, which called the IdxfromDirection() function on every iteration. Instead, we calculate all 8 neighbour indices of the current pixel directly in the contouring function and then check the image bounds only for the relevant directions. In the same “if” statements that look for valid pixel neighbours, we take note of whether there are both white and black neighbours; if there are, the pixel can be marked as being on the edge without having to call the isEdgePixel() function.
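The inlined neighbour scan can be rendered in Python roughly as below. This is a hypothetical sketch of the Java optimisation, not the actual app code: all eight neighbour indices are computed directly, bounds are checked per direction, and edge status is noted in the same pass, so no IdxfromDirection() or isEdgePixel() calls are needed.

```python
def scan_neighbours(pixels, w, h, idx):
    """Scan the 8 neighbours of flat index `idx` in a row-major binary
    image. Returns the white neighbour indices and whether the pixel is
    on an edge (has a black or out-of-bounds neighbour)."""
    x, y = idx % w, idx // w
    white, on_edge = [], False
    for dx, dy in ((-1, -1), (0, -1), (1, -1), (-1, 0),
                   (1, 0), (-1, 1), (0, 1), (1, 1)):
        nx, ny = x + dx, y + dy
        if not (0 <= nx < w and 0 <= ny < h):
            on_edge = True            # treat the image border as black
            continue
        n = idx + dy * w + dx         # neighbour index, no helper call
        if pixels[n] == 255:
            white.append(n)
        else:
            on_edge = True
    return white, on_edge
```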
Once we had implemented the main functionality of our app, we needed to make it
accessible by creating a practical user interface (UI), a representation of which can be
seen in Figure 4.8. Buttons are shown only when relevant, which helps with the ease
of use of the app.
The buttons are also colour-coded: green represents actions that advance the overall process, blue is for buttons that alter parameters, and red is for cancelling processes and returning to the starting screen.
We encountered a problem when trying to add native Android UI elements to our activity: whenever we overlaid an element over the OpenCV camera frame, the application would crash on start-up. As a temporary solution, we decided to draw the UI manually onto the OpenCV camera frame. We later realised that the problem arose because we were calling a UI function from a worker thread, which caused the crash. By then, however, we were far into the development process, so changing the approach would have required a great deal of work, and because our temporary solution was working well enough, we decided to keep it.
In Figure 4.8 we have shown and numbered the different elements of the UI, and below
we will briefly describe their functionalities:
1. Reset Button sets the position of the green lines to their default location.
2. Shutter Button captures a picture and tapping it a second time starts processing
the image.
3. Increase Distance Button moves the green lines outwards.
4. Decrease Distance Button moves the green lines over the petri-dish inwards by 1 pixel; this is used to achieve finer control.
5. Plus Button increases the level of the threshold by 10 with every tap.
6. Minus Button decreases it.
7. Back Button takes the user back to the image capturing screen.
8. Save Image Button stores the processed image to the device’s solid state storage.
The save image button was added after we had a more in-depth look at the steps of the actual experiment. We decided that being able to save the number of maggots counted and the image itself to the device, instead of having to copy them by hand, would be beneficial. The images are saved in a directory called ”MaggotAppResults” in the home folder of the device, and the individual file names are created according to the following format:
Where L, M and R correspond to the number of maggots detected in the left, middle and right side of the petri-dish, and T equals the total number of maggots. This process
could then be automated, and the information from the file titles could be saved into the Excel spreadsheets used to keep track of the experiment results. The buttons numbered 3, 4 and 7 were not part of the original design; they were added after a brief user trial, the feedback from which included recommendations on how to further improve the application. The red back button was assigned the same function as the native Android back button, but was added for clarity and easier navigation.
To make our code structure more legible, we created a helper functions class, which
contains the following 4 functions:
• inCircle()
This function takes a coordinate, the centre of a circle and the radius of that circle, and returns either True or False depending on whether the coordinate is within the bounding box of that circle. This method is not precise, but since we use it exclusively to check whether a user has pressed within the bounds of a button, it favours computational speed over accuracy.
• euclideanDistance()
For more accurate measurements, we use this function to calculate the Euclidean distance between two points. If, for example, we want to check whether a point is located within a circle, we can calculate the Euclidean distance from the centre of the circle to the point of interest; if that distance is less than the radius of the circle, the point is contained within it.
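The two checks can be sketched as follows; this is a hypothetical Python rendering of the Java helpers, with function names following the report.

```python
import math

def in_circle(point, centre, radius):
    """Fast approximate test: is the point inside the circle's bounding
    box? Good enough for deciding whether a tap landed on a button."""
    return (abs(point[0] - centre[0]) <= radius
            and abs(point[1] - centre[1]) <= radius)

def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def in_circle_exact(point, centre, radius):
    """Exact containment test via the Euclidean distance."""
    return euclidean_distance(point, centre) <= radius
```

Near a corner of the bounding box the two tests disagree: (7, 7) lies inside the box of a radius-8 circle at the origin, but its distance from the centre is about 9.9, so the exact test rejects it.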
• saveImage()
As explained above, we save a copy of the app's results onto the storage of the device. This function accomplishes that by using the provided OpenCV method matToBitmap(), which converts the Mat object into a bitmap representation that is then saved in the Portable Network Graphics (PNG) format. We chose PNG over JPEG because PNG is a lossless format, meaning no data is lost to compression.
• drawUI()
Finally, the drawUI() function takes a number of input arguments, which determine which parts of the UI should be drawn on which matrices of the processing pipeline. All UI elements are drawn using the provided OpenCV functions for drawing polygon shapes onto a matrix. The scale of the UI is based on the radius of the shutter button, as it carries the primary functionality of the app, which means that the whole UI can be scaled consistently.
Figure 4.8: The UI Elements on Each of the Three Screens of Our Application
Chapter 5
Evaluation
In order to test the edge cases of our application, we set up 10 experiments, almost all of which are extreme cases in which we know the app will fail. The results can be seen in Table 5.1 and the corresponding image for each experiment can be found in Figure 5.1. All tests were artificially created in Adobe Photoshop, using one of the images provided by the school as a base. The idea was that by creating simulated tests we would have finer control over the arrangement of the maggots and would be able to easily create extreme test cases.
Here we explain what the results of each experiment mean, where each item number refers to the test with the same number:
1. Uniformly distributed maggots, with some being clumped together. As we can
see from the table, the results for the left and middle parts of the petri-dish are
correct and two maggots are missing from the right side. This is due to three
maggots being clumped together, which get recognised as one object.
2. Maggots are grouped close to one another, but no two are touching. All are
counted correctly, which shows that the system works well for distinct objects.
It also shows how our custom algorithm handles small gaps between maggots
and does not clump them together, as would have happened with the default
OpenCV method.
3. Similar positioning as above, except that all maggots are touching. One pair is detected as a single maggot; all other groups are too large to be considered valid objects and are therefore not counted.
4. Uniformly distributed with some portion of maggots being close or touching the
walls of the petri-dish. Two are not detected because they are too close to the
wall and the system assumes they are part of the dish.
5. All maggots are located in the left side of the dish. One is on the edge and
is not detected, two groups are counted as one maggot each and there is one
non-maggot object miscounted. On the right side, three non-maggot objects are
counted, which is due to the different lighting of the petri-dish.
6. All are contained within the right side of the dish. Three groups are each counted as a single maggot and one maggot is missed. There are also two phantom objects that are miscounted.
7. Every maggot is located somewhere along the edge of the dish. 11 maggots out of 26 are properly detected; the rest are not counted.
8. Uniformly distributed, with some grouped and some on the edge. The large
clump is discarded and most of the maggots on the edge are properly counted.
9. Uniformly distributed all around the petri-dish, with most grouped and edge
maggots not being detected.
The above experiments show two underlying problems with our application. Firstly, clumped maggots are not counted properly; secondly, maggots on the edge of the petri-dish blend in with it and are not recognised. We will explore possible solutions to these problems in the next chapter.
5.2 Application Usability
In this section we look at experiments that target the usability of our app, comparing the accuracy and speed of 4 maggot-counting methods. All tests were performed by different people with a similar understanding of the problem, so as to give the most unbiased outcomes. A third party was assigned the job of timing each experiment's duration and announcing when to start counting; when done, each user indicated that they had finished counting. These results can be seen in Table 5.2 and the experiment images are shown in Figure 5.2.
We will explain what each set of values means in a clockwise fashion. The first set represents the baseline for all other methods, as it shows the correct count of maggots in all parts of the petri-dish. This was obtained by making 2-3 manual passes over each image, in order to make sure all maggots had been found, and was later confirmed by further examination. The second set shows maggots counted by a user utilising the app, without any outside corrections. The third is a fully manual count, but with a single pass, done in a minimal amount of time. The fourth is a combination of manual and automated work: a user counts the maggots using the app, but then manually checks for miscounted ones.
As can be seen from the above tables, the slowest method is the manual multi-pass one, which is to be expected, as it focuses on accuracy rather than speed. In the automatic application counts table, most experiments are processed with a single image capture using the app, which results in times between 4 and 5 seconds. Some took twice as long, owing to the user making two or three captures of the same experiment. The fastest times are achieved by the pure application count, unfortunately at the expense of accuracy. The manual single-pass count strikes a balance between speed and precision, as it gives predominantly correct results, but its processing times are almost twice as long as the automatic method's. Finally, the user-assisted application count exhibits speeds close to those of the single-pass manual method, but with much higher accuracy. Its single erroneous value is due to the application counting an extra object, which the user did not notice was not an actual maggot.
These experiments highlight the weak points of our application, as exhibited in the
previous testing phase. Nevertheless, by combining manual with automated work, the
application can increase the counting accuracy without sacrificing speed compared to
the fully manual methods.
Chapter 6
Discussion
A clear tendency can be seen in the first set of experiments shown above. The system exhibits problems with clumped objects, as well as with maggots located on the edge of the petri-dish; both result in maggots being miscounted.
We have previously discussed the first problem; we had a few ideas on how to solve it and were close to a working solution, but due to a lack of time we were unable to complete it. Based on an idea discussed earlier, one approach was to put the areas of the maggots in an array, sort it, and calculate the average area over some number of maggots in the middle of the array. We would then take the objects at the far ends of the array and divide their areas by this average; the resulting ratio would give the number of maggots located in that specific clump. This proved tricky to implement, since maggots vary in size and some single maggots were being counted twice. We also encountered general implementation bugs that we could not solve in time and had to scrap the idea.
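The area-ratio idea could be sketched as follows. The sample size and the rounding rule are our own assumptions, and as noted the approach is fragile when maggot sizes vary.

```python
def estimate_clump_counts(areas, sample=5):
    """For each detected object, estimate how many maggots it contains:
    average a few areas from the middle of the sorted list as the
    'typical' single-maggot area, then divide every object's area by it."""
    srt = sorted(areas)
    mid = len(srt) // 2
    half = sample // 2
    middle = srt[max(0, mid - half): mid + half + 1]
    typical = sum(middle) / len(middle)
    # Every detected object holds at least one maggot.
    return [max(1, round(a / typical)) for a in areas]
```

With four objects of area 100 and one clump of area 300, the clump is estimated to hold three maggots; a slightly oversized single maggot, however, can just as easily round up to two, which is exactly the double-counting problem we hit.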
Another solution would be to shrink the areas of the maggots, which would separate loosely connected ones into distinct objects that could then be counted correctly. This still does not solve the problem when a clump of maggots is tightly connected, and it might even introduce new issues, such as small maggots becoming too small to count.
The worm skeletonization and endpoint detection method implemented by Fernandez (2010) can be used to distinguish connected shapes and would have been useful for our project; however, due to the smaller scope of our project, we did not implement it. Still, looking at his results, we can conclude that he was also unable to fully automate the process of separating worms, which indicates that a solution may be difficult to achieve.
The second problem - maggots on the edge of the dish - proved very difficult, and in our case impossible, to solve. Because maggots are of a similar colour to the edge of the dish, whenever they are close enough to it the system assumes they are one object and discards them.
A solution to this problem has been developed in a paper by Khurana, Li, and Atkinson (2011), which tracks the trails of the same kind of maggots as ours. They focused specifically on achieving better recognition when the larvae and their trails are near the edges of a petri-dish, developing a method of Frame Averaging followed by Subtraction then Thresholding (FAST) that removes static objects from a video file. They also feed the maggots a substance that darkens their digestive tract, which creates greater contrast between them and the petri-dish. Their results show that reliably tracking the maggots requires both the dyeing substance and the FAST algorithm, as neither accomplishes the goal on its own. This solution would be infeasible for our app, as it requires changes to the experiment; furthermore, since we are using static images, implementing FAST would not be possible.
Our problems appear not to be trivial and would require extra time and multiple complex algorithms to solve. Still, it should be noted that in the general use case the ratios of maggots between the parts of the dish remain similar, so the system should still be useful even with imperfect results. And as noted in the testing above, since the user can see which objects are detected and which are not, the overall counting time is still reduced.
6.2 Conclusion
The larvae of the common fruit fly have been widely used in many different experiments for over a hundred years. To help automate the counting process in a school experiment, we developed a mobile application that students can use in their experiment. We set out to create a portable system that would be easy to set up and use; it needed to work faster than manual counting and to be relatively accurate.
We will now briefly discuss the goals of our project and our success in achieving them.
In the final implementation, setting up our application is no more complicated than for any other unofficial Android app. However, using it requires more preparation than originally intended. As seen in the instruction manual in Appendix A, the positioning of the user over the petri-dish is crucial to maximising the usability of the system. This makes the application more difficult to use at first, which is not in line with our original idea. Nonetheless, from personal experience we noticed that, after spending enough time with it, positioning oneself in an ideal location becomes natural.
From our own experiments shown above, we were able to compare the speed and
accuracy of our application to that of the manual counting. We intended for our system
to be faster than manual work, which in most tests was the case, by a factor of two.
Precision-wise, relying purely on the app is not the best course of action, as the two main problems - maggot clumps and maggots on the edge - still persist. Nonetheless, by combining manual and automated recognition using our system, a user can achieve faster processing times without having to sacrifice accuracy.
Bibliography
Appendices
Appendix A
Instruction Manual
Maggot App Instructions
Application Summary:
1. The first screen you’ll see is a direct feed from the camera on your phone. Position the phone
directly above and perpendicular to the petri-dish containing the maggots and align the blue
circle in the app with the walls of the dish.
2. By touching the screen move the green lines to the same position as the lines drawn in the dish.
If you wish to reset the lines to their original position, tap the blue reset button located to the
left.
3. Tap the green button to capture the picture. You'll see a black and white representation of the image you just captured. Tap either the plus or minus buttons until most maggots are visible; look below for examples. If a large portion of the maggots is not visible, go back to the previous screen by pressing the back button and take another picture at a different angle.
4. After having set the threshold, press the green button with a checkmark to start processing the
image.
5. You should see the processed image, and it will display the number of maggots detected in each
part of the dish.
6. Tap the back button on your phone to go back to the home screen of the application.
7. Write down the real number of maggots and the number that the application has found.
The app will only run on Android 5.1 Lollipop and above. If you are unsure of your Android version, try installing the app; if the installation fails, your Android version is not supported.
Since this is an alpha release it is not available through the Play Store and will require a manual
installation. Download the MaggotAppAlpha.apk from here: www.bit.ly/MaggotApp1 and open it.
You will be asked to enable “Unknown Sources” to proceed with the installation.
The application will ask for camera permission, which you should allow.
To achieve optimal accuracy, either lean over the petri-dish while capturing the image to avoid direct light, or try to minimise any variation in the background of the petri-dish, as shown below. The application needs a high and consistent contrast between the maggots and the background to work well.
Align the lines shown on screen with the ones in the real petri-dish as shown below, if the distance
between the lines is not the same, adjust it by tapping the screen.
Once you’ve lined up the on-screen lines with the petri-dish ones, tap the green button on the right to
capture a picture:
Then adjust the threshold of the black and white image using the blue plus and minus buttons on either side of the green button. Try to make as many of the maggots visible as possible, according to the images below. Then tap the green check button to start processing the image.
The numbers in each side of the petri-dish correspond to the number of maggots detected in that part
of the petri-dish. The number at the top is the sum of all detected maggots.
Write down the amount of maggots that the app has recognized next to the actual amount for each
side. This will be used to further improve the application in future.
Appendix B
Chain Code Python Implementation
33 return (x-1,y-1)
34 elif direct == 6: # up
35 return (x,y-1)
36 elif direct == 7: # up right
37 return (x+1,y-1)
38
39
40
79 nextIdx = XYfromDirection(xyStack[-1],lastDir)
80 print "nextIdx: " + str(nextIdx)
81 print
82 lx = nextIdx[0]
83 ly = nextIdx[1]
84 if (x,y) == (2,11):
85 print "lx,ly: " + str((lx,ly))
86 raw_input("")
87 if lx >= w: lx = w-1
88 if lx < 0: lx = 0
89 if ly >= h: ly = h-1
90 if ly < 0: ly = 0
91 if thresh1[ly][lx] > detectColor:
92 if isEdgePixel((lx,ly)): print "True"
93 if len(xyStack) > 1 and (lx,ly) == xyStack[0]:
94 breaking = True
95 break
96 if isEdgePixel((lx,ly)):
97 if lx > xMax: xMax = lx
98 if ly > yMax: yMax = ly
99 if lx < xMin: xMin = lx
100 if ly < yMin: yMin = ly
101 chainStack.append(lastDir)
102 xyStack.append((lx,ly))
103 break
104 # Remove last elem from stack
105 if i == 7:
106 if len(xyStack) == 1:
107 breaking = True
108 xMax = xyStack[0][0]
109 yMax = xyStack[0][1]
110 xMin = xyStack[0][0]
111 yMin = xyStack[0][1]
112 break
113 xyStack.pop()
114 chainStack.pop()
115 break
116 if breaking:
117 break
118 return xyStack,(xMin,yMin),(xMax,yMax)
119
Appendix C
BFS Python Implementation
33 else:
34 off = True
35 if on and off:
36 return True
37 else:
38 return False
39
79 if ((thresh1[ly][lx] == (255,255,255)).all()
80         and (lx,ly) not in visitedPixels
81         and (lx,ly) not in reachablePixels):
82
83 reachablePixels.append((lx,ly))
84 if not reachablePixels:
85 breaking = True
86 break
87 for elem in reachablePixels:
88 if elem not in visitedPixels:
89 if isEdgePixel((elem[0],elem[1])):
90 objMat[elem[1]][elem[0]] = (0,0,255)
91 perim += 1
92 else:
93 objMat[elem[1]][elem[0]] = (255,255,255)
94 thresh1[elem[1],elem[0]] = (0,255,0)
95 visitedPixels.append(elem)
96 reachablePixels.remove(elem)
97 break
98 return objMat, len(visitedPixels), perim
99