All content following this page was uploaded by S. Raviraja on 05 August 2015.
Abstract— Over the years, advances in computing technology and the reliability of computers, coupled with the
development of easy-to-use but nevertheless sophisticated software, have led to significant changes in the way that
data are collected and analyzed. Several new methods of malaria diagnosis have recently been developed; however, all
of these depend on clinical suspicion and, consequently, on an explicit clinical request. Although some methods lend
themselves to automation, no technique can yet be used for routine automated clinical screening. Detection of
birefringent haemozoin has been used to diagnose malaria since the turn of the 20th century. New generations of full
blood count analyzers, used widely in clinical laboratories, have the potential to detect haemozoin in white blood cells
and probably erythrocytes. This research introduces a blood image processing method for detecting malarial parasites in
images of Giemsa-stained blood slides, in order to evaluate the parasitaemia of the blood. Blood images are generally
made up of three different kinds of cells: red cells, white cells and platelets. These are distinguished by their
dimension, shape and colour. In malarial blood, the red corpuscles of vertebrates are infected by malarial parasites.
The aim of this research is to detect the red blood cells that are infected by malarial parasites using digital image
processing and an Artificial Neural Network implementation. Further evaluation of the size and shape of the nuclei of
the parasite is also considered. In this research work, Plasmodium-infected blood smears from the clinical data of a
total of 95 patients were diagnosed by a new hybridized model. We found that the model was able to predict
parasite-infected malaria disease with an accuracy of 60-96% based on selected dependent variables, validated using ROC
characteristics. Diagnostic models were developed to predict the parasite-infected digital images. This could be useful
in determining potential treatment methods and monitoring the progress of treatment for infected patients.
Keywords— Malaria Parasite, Plasmodium, RBC, Microscopic Image, Feature Extraction, Neural Network.
I. INTRODUCTION
Malaria is an infectious disease caused by a parasite. It is spread by the bite of an infected mosquito. People catch
malaria when the parasite enters the blood. The parasite causes a deadly infection which kills many people each year.
The parasite that causes malaria is a protozoan called Plasmodium. Protozoa are organisms with only one cell, but they
are not bacteria. Bacteria are smaller and simpler than protozoa.
People usually get malaria from Anopheles mosquitoes, which are the vectors of the disease.
The Plasmodium gets into people through the bites of mosquitoes. The Plasmodium is in the mosquito's saliva. When it
bites, the mosquito injects saliva containing an anticoagulant into the person to prevent their blood from clotting. The
person is then infected with Plasmodium as a by-product. This gives the person the disease called malaria.
Only the female mosquito gives people malaria, because only the female mosquito consumes blood. The male
mosquito lives on the nectar of flowers. The female uses blood as a source of protein for its eggs. Some people do not get
malaria from mosquitoes. A baby can get it while inside its mother. This is called maternal foetal transmission. People
can also get malaria from a blood transfusion. This is when someone gives blood to another person. Another way people
can catch malaria is by using a needle that someone with the disease used before them.
Malaria parasites are protozoa belonging to the subclass Coccidia. The disease is transmitted by the female
Anopheles mosquito and is caused by parasitic protozoa of the genus Plasmodium, which multiply in the human host,
usually first in the cells of the liver and then in the red cells. It is a serious disease caused by a blood
parasite, and most malaria cases appear in developing countries. An estimated 500 million or more cases occur each
year, with between 1 and 3 million deaths per year, and Africa loses US$12 billion every year due to malaria (1% of
GDP) [33]. Pregnant women, elderly people and young children are the most vulnerable victims. Together with HIV/AIDS
and TB, malaria is a leading cause of death worldwide [33].
B. Laboratory Investigation:
The microscopical identification of parasites in blood is the most certain way of confirming infection; currently,
serological techniques are mainly of value in epidemiological work. Microscopical laboratory techniques for
investigation include the examination of stained thick blood films to detect the parasites and to examine white cells for
the malaria pigment. The following flow chart assists in the preliminary identification of parasites in thick films. The
common diagnosis method for malaria infection is to search manually for parasites in blood sample slides through
a microscope.
C. Clinical diagnosis:
A clinical diagnosis is based on the signs and symptoms of a disease; it is a diagnosis made without medical testing. In
the case of malaria, one of the main symptoms that may lead to a clinical diagnosis is fever. Any clinical
diagnosis of malaria should be confirmed by a trained professional based upon laboratory results as soon as possible.
E. Malaria microscopy:
To diagnose whether patients have malaria, doctors may do a blood test. This test is called a Giemsa blood smear. Blood
is put on a slide, which is a thin piece of glass. The Giemsa stain is put on the slide. This stain helps doctors see the
malaria parasites. Then they look at the slide under a microscope. The Plasmodium is seen in the red blood cells.
Generally in blood analysis, doctors or medical practitioners look for three different kinds of objects: red
cells, white cells, and platelets or artefacts. These are distinguished by their dimensions and colours. In a peripheral
blood sample, visual detection and recognition of Plasmodium spp is possible and efficient via a chemical process called
Giemsa staining. This staining process slightly colourises the red blood cells but highlights the other objects, such as
Plasmodium spp parasites, white blood cells, and platelets or artefacts. The common diagnosis method for malaria
infection is to search manually for parasites in blood sample slides through a microscope. However, this
method is time-consuming and labour-intensive, because the operator has to inspect 500 - 2000 cells.
Hence, it is essential to produce a standard automated tool that is able to perform diagnosis of this type of disease
anywhere [33].
Malaria parasites are protozoa belonging to the subclass Coccidia. The disease is transmitted by the Anopheles
mosquito and is caused by minute parasitic protozoa of the genus Plasmodium, which infect human and insect hosts
alternately; in humans they infect first the cells of the liver and then the red cells. It probably originated in
Africa and accompanied human migration to the Mediterranean shores, India and South East Asia. In the past it used to be
common in the marshy areas around Rome, and the name is derived from the Italian mal-aria, or "bad air"; it was also
known as Roman fever. Detection techniques today include manual laboratory diagnosis by blood analysis and several new
methods.
Generally in laboratory analysis for a blood sample, lab technicians or medical practitioners/ pathologist look for three
different kinds of cells, red, white and blood platelets. Their dimensions and their colour distinguish them. In malarial
blood the red corpuscles of vertebrates are infected by malaria parasites. Plasmodium, the protozoan parasite that causes
malaria, exists in a variety of different forms, which have successfully adapted to different cellular environments, in both
the vertebrate host and the mosquito vector. The parasite develops in a highly regulated manner through distinct cycles
in the vertebrate host. In malarial blood we have to look for red cells (both mature cells and young cells, or
reticulocytes), white cells, and parasites in different stages of life: immature and mature trophozoites, schizonts and
gametocytes.
……..Eq 1.1
In Eq. 1.1 above, i denotes the image pixels, g is the edge detector, ii is the average luminance of the n × n
local neighbourhood vector bi (in column-wise format) centered at i, and γ is a regularization parameter estimated using
the median absolute deviation [13]. In the edge linking step, the resultant edge contours are linked together at their
terminal points to form closed boundaries around the RBCs [13]. The terminal points are recognised using 20 different 3
× 3 masks shown in Fig 1. Image pixels whose local neighborhoods match any one of these masks are
identified as terminal points. Neighboring boundary edge contours are linked together if their terminal points are in close
proximity and the curvature at the linkage is similar to that of RBCs [13]. In the clump splitting step, the concern is
that RBCs clumped together adversely affect the accuracy of the parasitemia estimate. Therefore, the researchers
introduce a clump splitting method, which separates clumps of two or more RBCs into the constituent cells of
interest. First, the deepest boundary pixels, i.e., the concavity pixels in a clump, are detected using a fast and accurate
scheme [13]. Next, concavity-based rules are applied to generate candidate split lines that join pairs of concavity
pixels [13]. The figure to be compared is then used to identify the best split line from the set of candidate lines. This
process is repeated on the split clumps until no more split lines can be found. Finally, in the morphological and
parametric detection method, the parasites are characterised as regions within the RBCs (excluding the RBC boundaries)
that have a large edge response magnitude. The regions within the RBCs are identified with a binary mask as follows.
First, a binary filling operation is performed on the closed boundary contours of the RBCs. Next,
erosion is applied to the filled regions using a disk-shaped structuring element of radius two, in order to obtain the
inner regions of the RBCs where the parasites are located. The reported difference between automated and manual malaria
counting is approximately 0.2 percent.
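The erosion step described above can be illustrated with a minimal sketch in plain Python. It assumes the filled RBC region is available as a list of rows of 0/1 values; the function names `disk_offsets` and `erode` and the 9 × 9 test image are hypothetical, and the disk is taken to be all offsets within Euclidean distance two, which may differ from the structuring element used in the cited work.

```python
def disk_offsets(radius):
    """Return the (row, col) offsets that lie inside a disk of the given radius."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def erode(mask, radius=2):
    """Binary erosion: a pixel survives only if the whole disk fits in the foreground."""
    rows, cols = len(mask), len(mask[0])
    offsets = disk_offsets(radius)
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr, dc in offsets))
    return out


# Hypothetical filled RBC region: a 7x7 block of foreground pixels in a 9x9 image.
filled = [[1 if 1 <= r <= 7 and 1 <= c <= 7 else 0 for c in range(9)]
          for r in range(9)]
inner = erode(filled, radius=2)  # keeps only pixels at least two steps from the edge
```

Erosion shrinks the region away from the cell boundary, so strong edge responses that remain inside `inner` are less likely to come from the RBC outline itself.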
Stanislaw Osowski et al. [29] present the application of a genetic algorithm (GA) and a support vector machine
(SVM) to the recognition of blood cells in images of the bone marrow aspirate. The GA is used for the selection of the
features for the recognition of the neighboring blood cells belonging to the same [...]
Ross et al. [25] describe an image processing technique that is used to identify erythrocytes and possible parasites
present on microscopic slides. The algorithm consists of pre-processing of the image, image analysis, segmentation,
feature generation and classification of erythrocytes as infected with malaria or not.
Gamini Wajyarathna et al. [30] investigate the possibility of rapid and accurate automated diagnosis of red blood cell
disorders and describe a method to detect malaria parasites and thalassaemia in blood sample images acquired from light
microscopes by hybridizing image processing, neural network training and an SVM classifier.
A classification accuracy of 86.54% was achieved with a 3-layer ANN in this study. An SVM classifier is used to seek
better accuracy.
Snehal Suryawanshi et al. [32] proposed an improved technique for the detection of malaria parasites within blood cell
images. Although there has been considerable progress, there is still a need to improve accuracy, speed, automation
level, and adaptability to new applications. The paper proposes a new technique of image segmentation by Poisson
distribution using minimum error thresholding. The methods used in the proposed system are RGB to gray conversion,
foreground extraction, Poisson distribution thresholding, Gabor filtering, and a Euclidean distance classifier. The
system is reported to be robust, accurate and easy to implement.
The section above reviews the detection of malaria using image processing concepts and many other techniques.
The discussion is based on shape, colour and size features. Many algorithms, such as watershed, are used.
Classification is carried out using ANN techniques. The drawbacks and disadvantages of previous research work
are discussed. Blood slides such as Giemsa-stained and Leishman-stained slides were used in previous research work. This
research work uses Giemsa-stained slides. In previous research works, the SVM is used as a classifier on texture and
geometry features extracted from the image. The whole discussion leads to research into the detection of malaria
parasite-infected blood smears using artificial intelligence techniques to achieve better results than previous research
work. This research work presents a model for the detection of malaria to assist or support doctors and lab technicians.
Features of parasites asexual cycle in humans: In their human host, malaria parasites have an asexual intracellular
cycle of development called schizogony. The parasites live and multiply, first in the cells of the liver and then in the red
cells. The forms of the parasite, which rupture from the red cells, infect new red cells. Several of these, instead of
repeating red cell schizogony, develop into gametocytes, which are the sexual forms of the parasite by which it is
transmitted to the mosquito to continue its life cycle.
Features of parasites sexual cycle in mosquito: In the mosquito, a sexual extracellular cycle of development occurs,
called sporogony. In this cycle, male and female gametes are formed, fertilization occurs, and sporozoites are produced,
which are infective to humans. Transmission occurs when an infected female Anopheles mosquito takes a blood meal.
Plasmodium Vivax and Plasmodium Falciparum cause the most malaria in people. Falciparum malaria is the worst
kind, and kills the most people.
When Plasmodium enters the blood, the parasites are called sporozoites. Sporozoites go to the liver, where they
multiply. Then they change into a different form of Plasmodium. This form is the merozoite. The
merozoites go into the red blood cells, and then they make many more merozoites. The merozoites break out of the red
blood cells again and again. When they do this, the person gets very sick, and shows symptoms of malaria. This happens
every few days, and is called a paroxysm.
Plasmodium vivax and Plasmodium ovale can live in the liver for a long time. A person can look well, but still have
the Plasmodium in the liver. This is called a dormant phase. Weeks or months later, the Plasmodium can leave the liver
to the blood, and the person will get sick again.
Plasmodium Falciparum causes the most dangerous type of malaria. It makes people sicker than those with other types of
malaria, because there are more parasites in the blood. Also, with Falciparum malaria, the red blood cells are sticky.
This makes the red blood cells block blood vessels. If blood vessels are blocked, this can harm whatever the blood vessel
supplies with blood, and can damage people's organs.
There are several species (kinds) of Plasmodium that cause malaria in humans:
Serious disease:
Plasmodium falciparum
Milder disease:
Plasmodium malariae
Plasmodium ovale
Plasmodium semiovale
Plasmodium vivax
Species which normally infect other primates:
Plasmodium knowlesi
Species of Malaria Parasites: There are four species of the genus plasmodium responsible for the malarial parasite
infections that commonly infect man, Plasmodium Falciparum, Plasmodium Vivax, Plasmodium Malariae and
Plasmodium Ovale. The most important of these is Plasmodium Falciparum because it can be rapidly fatal and is
responsible for the majority of malaria related deaths.
A. Plasmodium Falciparum
Malaria caused by Plasmodium Falciparum is referred to as Falciparum malaria, shown in Fig 10, formerly known as
subtertian (ST) or malignant tertian (MT) malaria. It is the most serious form of the disease and the most widespread.
Plasmodium Falciparum is found mainly in the hotter and more humid regions of the world; it is the main species found
in tropical and subtropical African countries and parts of Central America and South America.
Diagnostic Points or the characteristics of Plasmodium Falciparum
Red Cells are not enlarged; Rings appear fine and delicate and there may be several in one cell; Some rings may have
two chromatin dots; and Presence of marginal or appliqué forms. It is unusual to see developing forms in peripheral
blood films. Gametocytes have a characteristic crescent shape appearance. However, they do not usually appear in the
blood for the first four weeks of infection. Maurer's dots may be present.
B. Plasmodium Vivax
Malaria caused by Plasmodium Vivax is referred to as Vivax malaria shown in Fig 11. It has a wide distribution in
temperate and subtropical regions.
Diagnostic Points the characteristics of Plasmodium Vivax
Red cells containing parasites are usually enlarged; Schuffner's dots are frequently present in the red cells as shown
above; The mature ring forms tend to be large and coarse; and developing forms are frequently present.
C. Plasmodium Malariae
Plasmodium Malariae, shown in Fig 12, has a much lower prevalence than Plasmodium Falciparum or Plasmodium Vivax, and it
is able to persist in humans for many years. It is found in tropical and subtropical regions; in Africa, it accounts for
up to 25% of Plasmodium infections.
Diagnostic Points the characteristics of P. Malariae
Ring forms may have a squarish appearance; Band forms are a characteristic of this species; Mature schizonts may
have a typical daisy head appearance with up to ten merozoites; Red cells are not enlarged and Chromatin dot may be on
the inner surface of the ring.
D. Plasmodium Ovale
Malaria caused by Plasmodium Ovale is referred to as Ovale malaria, shown in Fig 13. Formerly known as Ovale tertian
malaria, it is a relapsing species and has a restricted distribution and low prevalence. It is found in West Africa, where
it accounts for up to 10% of malaria infections.
Diagnostic Points the characteristics of Plasmodium Ovale
Red cells enlarged; Comet forms common; Rings large and coarse; Schuffner's dots, when present, may be
prominent; and Mature schizonts similar to those of Plasmodium Malariae but larger and coarser.
© 2015, IJARCSSE All Rights Reserved Page | 872
Raviraja et al., International Journal of Advanced Research in Computer Science and Software Engineering 5(7),
July- 2015, pp. 863-886
Laboratory investigation: Malaria occurs in most tropical regions of the world with Plasmodium Falciparum
predominating in Africa, New Guinea and Haiti. Plasmodium Vivax is more common on the Indian sub-continent and
Central America with the prevalence of these two infections roughly equal in Asia, Oceania and South
America. Plasmodium Malariae is found in most endemic areas especially sub-Saharan Africa but much less
frequently. Plasmodium Ovale is relatively unusual outside Africa, although some cases are now being identified in other
regions (e.g. the southern states of India). It is also important to recognize that, with the relative ease and speed of
modern travel and migration, "imported" cases of malaria may present in any country. Additionally, so-called "airport
malaria" (see History section) has now been identified in a number of countries including the USA, UK, Belgium, and
Switzerland. Airport malaria is particularly dangerous since clinicians may have little reason to suspect it if the
patient has had no recent travel to areas where malaria is endemic. This may result in a delay before the correct
diagnosis is made, which may lead to death before appropriate treatment can be initiated. Small outbreaks of malaria may
occur in countries considered free of the disease; such outbreaks are most likely the result of an asymptomatic infected
person entering a country where suitable mosquito vectors are present [32].
Examination of a thick blood film should be the first step since this has the advantage of concentrating the parasites
by 20 fold in comparison to a thin film, although the parasites may appear distorted making species identification
difficult. If parasites are seen then the species should be confirmed by the examination of a thin film. Ideally blood
should be collected when the patient's temperature is rising.
With the spread of drug resistance, examination of blood for malaria parasites is becoming increasingly important to
confirm the diagnosis of malaria microscopically. The microscopical identification of parasites in blood is the most
certain way of confirming infection; currently, serological techniques are mainly of value in epidemiological work, as
shown in Fig 14.
The 2D continuous image a(x,y) is divided into N rows and M columns. The intersection of a row and a column is
termed a pixel. The value assigned to the integer coordinates [m,n] with {m=0,1,2,...,M-1} and {n=0,1,2,...,N-1} is
a[m,n]. In fact, in most cases a(x,y), which we might consider to be the physical signal that impinges on the face of a
2D sensor, is actually a function of many variables including depth (z), color (λ), and time (t). Unless otherwise
stated, we will consider the case of 2D, monochromatic, static images in this research.
Stained Image Processing: An image from a stained sample is prone to differ widely in foreground or
background color due to several conditions, such as differences in the light source or filters, cameras, or slide
preparation. In order to analyse images with constant color characteristics, the images are normalized. In this work,
gray level normalization is incorporated, through which a constant gray value of the image is maintained that does
not change under different conditions. In a diagonal model, an image of unknown illumination $I^u$ can be simply
transformed to the known illuminant space $I^k$ by multiplying pixel values with a diagonal matrix $M$:
$I^k_{rgb}(x) = M\, I^u_{RGB}(x)$, where $\mu_I^{RGB}$ are the means of the R, G and B channels. The constant grey value
for each channel was assumed to be 255 (the maximum possible value), which is similar to a colorless transparent pixel
color. The normalized image is then transformed to the LAB color space. This color space is chosen because the L layer
carries the image intensity as one of its components, which allows contrast enhancement and equalization to be performed
more efficiently than in other color spaces. The processed LAB image is converted back to the RGB color space. The
corrected RGB image is segmented using a histogram-based thresholding operation. This step removes noise and artifacts to
a major extent without missing the infected cells. Since the protocol is dominant towards other color components, the
threshold is applied on the green component of the RGB image. The regional maxima and minima were used as markers, and
thresholded images were reconstructed in order to avoid objects that are artefacts. The object detection process above
is applied to both the normalized image and the original input RGB image. Detection on the
normalized image outputs a binary image that reconstructs the cells of interest with very
few artefacts, which tend to appear as cells or arise from intensity factors. This image is used as the marker, and the
object detection result from the original input RGB image is used as the mask image. A general reconstruction is
performed between the mask and the morphologically eroded marker image (using a disk of constant radius). The
reconstructed image is added to the marker image in order to retain the original structure of the cells.
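The diagonal normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the entries of the diagonal matrix, so the sketch assumes each entry is the target grey value (255) divided by the corresponding channel mean, and it clips results to the 8-bit range. The function name `diagonal_normalise` and the sample pixels are hypothetical.

```python
def diagonal_normalise(pixels, target=255.0):
    """Scale each RGB channel by target / channel mean (a diagonal transform).

    `pixels` is a flat list of (R, G, B) tuples. Returns the normalised
    pixels and the three per-channel gains (the assumed diagonal of M).
    """
    n = len(pixels)
    means = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    gains = [target / m if m > 0 else 1.0 for m in means]
    # Clip to the valid 8-bit range after scaling.
    normalised = [tuple(min(255.0, p[ch] * gains[ch]) for ch in range(3))
                  for p in pixels]
    return normalised, gains


# Hypothetical pixels from a slide with a bluish cast.
pixels = [(200, 210, 240), (180, 190, 220), (100, 60, 150)]
normalised, gains = diagonal_normalise(pixels)
```

Because every pixel in a channel is multiplied by the same gain, the transform changes the overall color balance without altering the relative ordering of intensities within that channel.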
Image Pre-Processing: The goal of this step is to make the acquired images more suitable for subsequent processes,
mainly image segmentation and feature extraction. Basically, there are three main objectives for image pre-processing.
The first is to resize the image, either to magnify it through digital zooming or to reduce the
image size in order to speed up processing. The second is to reduce or eliminate noise
from the acquired image. The third is to enhance the image contrast for visual evaluation.
In this case, digital zooming and contrast enhancement are not necessary, since the task of image classification and
recognition is to be performed by a computer and not a human operator. However, image size normalization is essential in
order to standardize the spatial resolution of images from different sources. Image filtering is also necessary in order
to reduce or eliminate noise in images, which could have been introduced during sample preparation or image
acquisition.
Feature Extraction: A total of 60 samples were used for training. Each sample contained a number of normal and infected
cells along with artefacts. The objects extracted from these samples are parasites, RBCs and artefacts. In order to
classify the detected objects, twenty-three image features were extracted from them for training the system. The
features include intensity-based histogram features and shape measurement features. These features are extracted for
different channels of the color spaces, namely gray, hue, saturation and luminosity (standard deviation).
First Order Statistical Features / Histogram Features: The histogram counts are the pixel counts, and the bin locations
are the 256 bins. The first order features are defined by the following equations; the reconstructed images are labeled.
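The defining equations are not reproduced in this text, so the following is only a sketch of commonly used first-order histogram statistics (mean, variance, skewness, kurtosis, energy and entropy) computed from a 256-bin grey-level histogram. Which of these the authors actually used is an assumption; the function name `first_order_features` and the sample pixel values are hypothetical.

```python
import math


def first_order_features(gray_pixels, bins=256):
    """Compute first-order statistics from the grey-level histogram of an object."""
    counts = [0] * bins
    for v in gray_pixels:
        counts[v] += 1                      # histogram counts per bin
    total = len(gray_pixels)
    p = [c / total for c in counts]         # normalised histogram
    mean = sum(i * p[i] for i in range(bins))
    var = sum((i - mean) ** 2 * p[i] for i in range(bins))
    std = math.sqrt(var)
    skew = sum((i - mean) ** 3 * p[i] for i in range(bins)) / std ** 3 if std else 0.0
    kurt = sum((i - mean) ** 4 * p[i] for i in range(bins)) / std ** 4 if std else 0.0
    energy = sum(q * q for q in p)
    entropy = -sum(q * math.log2(q) for q in p if q > 0)
    return {"mean": mean, "variance": var, "skewness": skew,
            "kurtosis": kurt, "energy": energy, "entropy": entropy}


# Hypothetical grey values of the pixels inside one detected object.
features = first_order_features([10, 10, 20, 20])
```

The same function could be applied to each channel in turn (for example gray, hue and saturation), with the resulting values concatenated into one feature vector per object.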
Shape Measurement Features: Since these features are independent of color spaces, the following equations were
directly applied to the binary mask image. Shape measurements can detect the changes in the size. The advantage of
shape measurements is straightforward interpretation of the calculated feature values.
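As the shape equations are not reproduced here, the sketch below shows three simple measurements that can be taken directly from a binary mask: area, perimeter and circularity. These particular features, the function name `shape_features` and the test mask are illustrative assumptions rather than the authors' exact feature set. The perimeter is approximated by counting pixel edges that face the background.

```python
import math


def shape_features(mask):
    """Area, perimeter and circularity of the foreground in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count each pixel edge that borders background or the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or not mask[rr][cc]:
                    perimeter += 1
    circularity = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "circularity": circularity}


# Hypothetical object: a 4x4 block of foreground pixels in a 6x6 mask.
mask = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)]
        for r in range(6)]
features = shape_features(mask)
```

Circularity is 1 for an ideal disc and smaller for elongated or irregular shapes, which is one way such measurements can separate roughly round RBCs from irregular artefacts.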
This stage is about choosing suitable parameters which adequately describes the information of the image. These
parameters are grouped together in vector form and are referred to as feature vectors. Features can be obtained directly
from images e.g., raw image pixel values or they could be derived quantities such as average image intensity, image
histogram moments, shape signature and object area.
HSV Color Space Transformation: HSL, HSV, HSI, or related models are often used in computer vision and image
analysis for feature detection or image segmentation. The applications of such tools include object detection, for instance
in robot vision; object recognition, for instance of faces, text, or license plates; content-based image retrieval;
and analysis of medical images.
For the most part, computer vision algorithms used on color images are straightforward extensions to algorithms
designed for grayscale images, for instance k-means or fuzzy clustering of pixel colors, or canny edge detection. At the
simplest, each color component is separately passed through the same algorithm. It is important, therefore, that the
features of interest can be distinguished in the color dimensions used. Because the R, G, and B components of an object’s
color in a digital image are all correlated with the amount of light hitting the object, and therefore with each other, image
descriptions in terms of those components make object discrimination difficult. Descriptions in terms of
hue/lightness/chroma or hue/lightness/saturation are often more relevant.
Starting in the late 1970s, transformations like HSV or HSI were used as a compromise between effectiveness for
segmentation and computational complexity. They can be thought of as similar in approach and intent to the neural
processing used by human color vision, without agreeing in particulars: if the goal is object detection, roughly separating
hue, lightness, and chroma or saturation is effective, but there is no particular reason to strictly mimic human color
response.
The function call cmap = rgb2hsv(M) converts an RGB colormap M to an HSV colormap cmap. Both
colormaps are m-by-3 matrices. The elements of both colormaps are in the range 0 to 1.
The columns of the input matrix M represent intensities of red, green, and blue, respectively. The columns of the
output matrix cmap represent hue, saturation, and value, respectively.
The function call hsv_image = rgb2hsv(rgb_image) converts the RGB image to the equivalent HSV image. Here rgb_image is
an m-by-n-by-3 image array whose three planes contain the red, green, and blue components of the image, and hsv_image is
returned as an m-by-n-by-3 image array whose three planes contain the hue, saturation, and value components of the image.
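For readers without access to the rgb2hsv function, the same per-pixel conversion can be sketched with Python's standard `colorsys` module. This is an analogue, not the function used in the paper; the helper name `rgb_image_to_hsv` and the 2 × 2 test image are hypothetical, and 8-bit input values are assumed and scaled to the range 0 to 1 first.

```python
import colorsys


def rgb_image_to_hsv(rgb_image):
    """Convert rows of 8-bit (R, G, B) tuples to (H, S, V) tuples in the range 0 to 1."""
    return [[colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
             for (r, g, b) in row]
            for row in rgb_image]


# Hypothetical 2x2 image: pure red, pure green, pure blue and mid grey.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (128, 128, 128)]]
hsv = rgb_image_to_hsv(image)
```

The grey pixel comes out with zero saturation, which illustrates why hue and saturation can separate stained objects from a colorless background more directly than the raw R, G and B values.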
Binarisation and Region Detection Using ROI and BW Region: A binary image is a digital image that has only two
possible values for each pixel. Typically the two colors used for a binary image are black and white though any two
colors can be used. The color used for the object(s) in the image is the foreground color while the rest of the image is the
background color.[1] In the document-scanning industry this is often referred to as "bi-tonal".
Binary images are also called bi-level or two-level. This means that each pixel is stored as a single bit, i.e., a 0 or 1.
The names black-and-white, B&W, monochrome or monochromatic are often used for this concept, but may also
designate any images that have only one sample per pixel, such as grayscale images. In Photoshop parlance, a binary
image is the same as an image in "Bitmap" mode.
Binary images often arise in digital image processing as masks or as the result of certain operations such
as segmentation, thresholding, and dithering. Some input/output devices, such as laser printers, fax machines, and
bilevel computer displays, can only handle bilevel images.
A binary image can be stored in memory as a bitmap, a packed array of bits. A 640×480 image requires 37.5 KiB of
storage. Because of the small size of the image files, fax machine and document management solutions usually use this
format. Most binary images also compress well with simple run-length compression schemes.
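The storage figure quoted above follows from packing eight pixels into each byte: 640 × 480 / 8 = 38,400 bytes, or 37.5 KiB. The sketch below shows one way of packing a binary image into a bitmap and run-length encoding it. The function names `pack_bits` and `run_lengths` are hypothetical, and the row-major, most-significant-bit-first layout is an assumption, since real file formats differ.

```python
def pack_bits(rows):
    """Pack a binary image (rows of 0/1) into bytes, 8 pixels per byte, row-major."""
    flat = [bit for row in rows for bit in row]
    packed = bytearray((len(flat) + 7) // 8)
    for i, bit in enumerate(flat):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)  # most significant bit first
    return bytes(packed)


def run_lengths(rows):
    """Run-length encode the flattened image as (value, length) pairs."""
    flat = [bit for row in rows for bit in row]
    runs = []
    for bit in flat:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs


width, height = 640, 480
blank = [[0] * width for _ in range(height)]
packed = pack_bits(blank)
size_kib = len(packed) / 1024  # 38400 bytes / 1024
```

A blank image collapses to a single run, which is why images dominated by large uniform regions compress well under run-length schemes.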
Binary images can be interpreted as subsets of the two-dimensional integer lattice $\mathbb{Z}^2$; the field of
morphological image processing was largely inspired by this view.
Here is an example of a 5x5 Gaussian filter with σ = 1.4. (The asterisk denotes
a convolution operation.)
It is important to understand that the selection of the size of the Gaussian kernel will affect the performance of the
detector. The larger the size is, the lower the detector’s sensitivity to noise. Additionally, the localization error in
detecting the edge will slightly increase as the Gaussian filter kernel size increases. A 5x5 kernel is a good size for
most cases, but this will also vary depending on specific situations, as in Fig 17.
Fig 17: The image after a 5x5 Gaussian mask has been passed across each pixel.
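The smoothing step can be sketched in plain Python as follows. The kernel here is sampled from the continuous 2D Gaussian and normalised to sum to one, which is an assumption: published 5x5 kernels for σ = 1.4 are often integer approximations and will differ slightly. The function names `gaussian_kernel` and `convolve` are hypothetical, and border pixels are handled by replicating the nearest edge pixel.

```python
import math


def gaussian_kernel(size=5, sigma=1.4):
    """Sample a 2D Gaussian on a size x size grid and normalise it to sum to 1."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def convolve(image, kernel):
    """Same-size filtering with replicated borders.

    The kernel is symmetric, so correlation and convolution give the same result.
    """
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for kr, krow in enumerate(kernel):
                for kc, w in enumerate(krow):
                    rr = min(max(r + kr - half, 0), rows - 1)
                    cc = min(max(c + kc - half, 0), cols - 1)
                    acc += w * image[rr][cc]
            out[r][c] = acc
    return out


kernel = gaussian_kernel(5, 1.4)
# Hypothetical noisy image: a single bright spike on a dark background.
spike = [[100.0 if (r, c) == (3, 3) else 0.0 for c in range(7)] for r in range(7)]
smoothed = convolve(spike, kernel)
```

The isolated spike is spread over its neighbours and its peak is strongly reduced, which is the noise suppression that the later gradient step relies on.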
Finding the Intensity Gradient of the Image: An edge in an image may point in a variety of directions, so the Canny
algorithm uses four filters to detect horizontal, vertical and diagonal edges in the blurred image. The edge detection
operator (Roberts, Prewitt, Sobel for example) returns a value for the first derivative in the horizontal direction (Gx) and
the vertical direction (Gy). From this the edge gradient and direction can be determined:
$$G = \sqrt{G_x^2 + G_y^2}, \qquad \Theta = \operatorname{atan2}(G_y, G_x)$$
where G can be computed using the hypot function and atan2 is the arctangent function with two arguments. The edge
direction angle is rounded to one of four angles representing vertical, horizontal and the two diagonals (0, 45, 90 and
135 degrees, for example). An edge direction falling in each angular range is set to a specific angle value; for
example, an angle Θ lying in the range 0 to 22.5 degrees or 157.5 to 180 degrees is set to 0 degrees.
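The gradient computation and the rounding of the direction can be sketched as below. The function names are illustrative; folding the angle into the range 0 to 180 degrees reflects the fact that the sign of the direction is irrelevant, and the treatment of angles exactly on a boundary (for example 22.5 degrees) is an assumption.

```python
import math


def gradient(gx, gy):
    """Return (magnitude, angle in degrees folded into [0, 180)) from Gx and Gy."""
    magnitude = math.hypot(gx, gy)
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, angle


def quantize_direction(angle):
    """Round an angle in degrees to the nearest of 0, 45, 90 and 135 degrees."""
    return (int((angle + 22.5) // 45) * 45) % 180
```

For instance, `gradient(3, 4)` has magnitude 5, and an angle of 160 degrees is rounded to 0 because it lies between 157.5 and 180 degrees.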
Non-maximum suppression is an edge thinning technique, applied to "thin" the edge. After the gradient calculation,
the edge extracted from the gradient value is still quite blurred. With respect to criterion 3, there should be only
one accurate response to the edge. Non-maximum suppression therefore sets all gradient values to 0 except the local
maxima, which indicate the locations with the sharpest change of intensity value.
The algorithm for each pixel in the gradient image is:
Compare the edge strength of the current pixel with the edge strength of the pixels in the positive and negative
gradient directions.
If the edge strength of the current pixel is the largest compared to the other pixels in the mask with the same
direction (e.g., a pixel whose gradient points in the y direction is compared to the pixels above and below it on the
vertical axis), the value is preserved. Otherwise, the value is suppressed.
In some implementations, the algorithm categorizes the continuous gradient directions into a small set of discrete
directions, and then moves a 3x3 filter over the output of the previous step (that is, the edge strength and gradient
directions). At every pixel, it suppresses the edge strength of the center pixel (by setting its value to 0) if its magnitude is
not greater than the magnitude of the two neighbors in the gradient direction. For example,
If the rounded gradient angle is 0 degrees (i.e. the edge is in the north–south direction) the point will be considered
to be on the edge if its gradient magnitude is greater than the magnitudes at pixels in the east and west directions.
If the rounded gradient angle is 90 degrees (i.e. the edge is in the east–west direction) the point will be considered to
be on the edge if its gradient magnitude is greater than the magnitudes at pixels in the north and south directions.
If the rounded gradient angle is 135 degrees (i.e. the edge is in the north east–south west direction) the point will be
considered to be on the edge if its gradient magnitude is greater than the magnitudes at pixels in the north west and
south east directions.
If the rounded gradient angle is 45 degrees (i.e. the edge is in the north west–south east direction) the point will be
considered to be on the edge if its gradient magnitude is greater than the magnitudes at pixels in the north east and
south west directions.
In more accurate implementations, linear interpolation is used between the two neighboring pixels that straddle the
gradient direction. For example, if the gradient angle is between 45 degrees and 90 degrees, interpolation between
gradients at the north and north east pixels will give one interpolated value, and interpolation between the south
and south west pixels will give the other (using the conventions of last paragraph). The gradient magnitude at the central
pixel must be greater than both of these for it to be marked as an edge. Note that the sign of the direction is irrelevant,
that is north–south is the same as south–north and so on.
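The discrete-direction variant of non-maximum suppression described above can be sketched as follows. It follows the neighbour convention given in the text (for example, a rounded angle of 0 degrees compares the east and west pixels). The names are illustrative, border pixels are simply left at 0, and the interpolated variant is not shown.

```python
# Neighbour offsets (row, col) for each rounded gradient angle.
# Rows grow downward (south) and columns grow to the right (east).
NEIGHBORS = {
    0: ((0, -1), (0, 1)),      # west and east
    90: ((-1, 0), (1, 0)),     # north and south
    135: ((-1, -1), (1, 1)),   # north west and south east
    45: ((-1, 1), (1, -1)),    # north east and south west
}


def non_maximum_suppression(magnitude, direction):
    """Keep a pixel only if it is greater than both neighbours along its gradient.

    magnitude: 2-D list of edge strengths.
    direction: 2-D list of rounded angles (0, 45, 90 or 135).
    """
    h, w = len(magnitude), len(magnitude[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            (dr1, dc1), (dr2, dc2) = NEIGHBORS[direction[r][c]]
            m = magnitude[r][c]
            if m > magnitude[r + dr1][c + dc1] and m > magnitude[r + dr2][c + dc2]:
                out[r][c] = m
    return out
```

Applied to a blurred vertical ridge, only the centre column of the ridge survives, which is the "thinning" effect described above.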
After application of non-maximum suppression, the remaining edge pixels give a more accurate representation of the
real edges. However, some edge pixels caused by noise and color variation still remain. To remove these spurious
responses, it is essential to filter out the edge pixels with a weak gradient value and preserve those with a high
gradient value. Two threshold values are therefore set to distinguish the different types of edge pixels: a high
threshold and a low threshold. If an edge pixel's gradient value is higher than the high threshold, it is marked as a
strong edge pixel. If its gradient value is smaller than the high threshold and larger than the low threshold, it is
marked as a weak edge pixel. If its gradient value is smaller than the low threshold, it is suppressed. The two
threshold values are determined empirically and need to be defined for each set of images.
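A minimal sketch of the double-threshold labelling is given below. The label constants and the function name are illustrative, and the handling of values exactly equal to a threshold is an assumption, since the text does not specify it.

```python
STRONG, WEAK, SUPPRESSED = 2, 1, 0


def double_threshold(magnitude, low, high):
    """Label each gradient value as strong, weak or suppressed.

    Values equal to a threshold fall into the lower class (an assumption).
    """
    labels = []
    for row in magnitude:
        labels.append([STRONG if m > high else WEAK if m > low else SUPPRESSED
                       for m in row])
    return labels


labels = double_threshold([[0.1, 0.5, 0.9]], low=0.3, high=0.7)
```

Here the three values are labelled suppressed, weak and strong respectively; in practice `low` and `high` would be tuned empirically for the image set at hand.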
The Histogram of Oriented Gradients and wavelet method for feature extraction: The histogram of oriented
gradients (HOG) is a feature descriptor used in computer vision and image processing for the purpose of object detection.
The technique counts occurrences of gradient orientation in localized portions of an image. This method is similar to that
of edge orientation histograms, scale-invariant feature transform descriptors, and shape contexts, but differs in that it is
computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved
accuracy.
The essential thought behind the histogram of oriented gradients descriptor is that local object appearance and shape
within an image can be described by the distribution of intensity gradients or edge directions. The image is divided into
small connected regions called cells, and for the pixels within each cell, a histogram of gradient directions is compiled.
The descriptor is then the concatenation of these histograms. For improved accuracy, the local histograms can be
contrast-normalized by calculating a measure of the intensity across a larger region of the image, called a block, and then
using this value to normalize all cells within the block. This normalization results in better invariance to changes in
illumination and shadowing.
The HOG descriptor has a few key advantages over other descriptors. Since it operates on local cells, it is invariant to
geometric and photometric transformations, except for object orientation. Such changes would only appear in larger
spatial regions. Moreover, as Dalal and Triggs discovered, coarse spatial sampling, fine orientation sampling, and strong
local photometric normalization permit the individual body movement of pedestrians to be ignored so long as they
maintain a roughly upright position. The HOG descriptor is thus particularly suited for human detection in images.
The first step of calculation in many feature detectors is image pre-processing to ensure normalized color and
gamma values. As Dalal and Triggs point out, however, this step can be omitted in HOG descriptor computation, as the
ensuing descriptor normalization essentially achieves the same result. Image pre-processing thus has little impact on
performance. Instead, the first step of calculation is the computation of the gradient values. The most common method is
to apply the 1-D centered point discrete derivative mask in one or both of the horizontal and vertical directions.
Specifically, this method requires filtering the color or intensity data of the image with the following filter kernels:
$$[-1, 0, 1] \quad \text{and} \quad [-1, 0, 1]^{T}$$
Dalal and Triggs tested other, more complex masks, such as the 3x3 Sobel mask or diagonal masks, but these masks
generally performed poorer in detecting humans in images. They also experimented with Gaussian smoothing before
applying the derivative mask, but similarly found that omission of any smoothing performed better in practice.
Orientation binning: The second step of calculation is creating the cell histograms. Each pixel within the cell casts a
weighted vote for an orientation-based histogram channel based on the values found in the gradient computation. The
cells themselves can either be rectangular or radial in shape, and the histogram channels are evenly spread over 0 to 180
degrees or 0 to 360 degrees, depending on whether the gradient is “unsigned” or “signed”. Dalal and Triggs found that
unsigned gradients used in conjunction with 9 histogram channels performed best in their human detection experiments.
As for the vote weight, pixel contribution can either be the gradient magnitude itself, or some function of the magnitude.
In tests, the gradient magnitude itself generally produces the best results. Other options for the vote weight could include
the square root or square of the gradient magnitude, or some clipped version of the magnitude.
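The gradient computation and orientation binning for a single cell can be sketched as follows, assuming unsigned gradients, 9 channels, the gradient magnitude as vote weight, and the centered derivative mask [-1, 0, 1]. The function name is illustrative. As a simplification, each vote goes entirely to one channel (no interpolation between neighbouring channels) and the outermost pixels of the cell are skipped.

```python
import math


def cell_histogram(cell, bins=9):
    """Unsigned-orientation histogram of one cell given as a 2-D list of intensities."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]   # [-1, 0, 1] mask, horizontal
            gy = cell[r + 1][c] - cell[r - 1][c]   # [-1, 0, 1] mask, vertical
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The gradient magnitude is the vote weight for the pixel's channel.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

A cell containing a pure horizontal intensity ramp puts all of its votes into the first channel (0 to 20 degrees), while a vertical ramp votes for the channel that contains 90 degrees.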
Descriptor blocks: To account for changes in illumination and contrast, the gradient strengths must be locally
normalized, which requires grouping the cells together into larger, spatially connected blocks. The HOG descriptor is
then the concatenated vector of the components of the normalized cell histograms from all of the block regions. These
blocks typically overlap, meaning that each cell contributes more than once to the final descriptor. Two main block
geometries exist: rectangular R-HOG blocks and circular C-HOG blocks. R-HOG blocks are generally square grids,
represented by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels
per cell histogram. In the Dalal and Triggs human detection experiment, the optimal parameters were found to be 3x3
cell blocks of 6x6 pixel cells with 9 histogram channels. Moreover, they found that some minor improvement in
performance could be gained by applying a Gaussian spatial window within each block before tabulating histogram votes
in order to weight pixels around the edge of the blocks less. The R-HOG blocks appear quite similar to the scale-
invariant feature transform (SIFT) descriptors; however, despite their similar formation, R-HOG blocks are computed in
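dense grids at some single scale without orientation alignment, whereas SIFT descriptors are usually computed at sparse,
scale-invariant key image points and are rotated to align orientation.
Block normalization: Dalal and Triggs explored four methods for block normalization. Let $v$ be the non-normalized
vector containing all histograms in a given block, $\|v\|_k$ its $k$-norm for $k = 1, 2$, and $e$ a small constant.
The normalized vector $f$ can then be one of the following:
L2-norm: $f = v / \sqrt{\|v\|_2^2 + e^2}$
L2-hys: L2-norm followed by clipping (limiting the maximum values of v to 0.2) and renormalizing
L1-norm: $f = v / (\|v\|_1 + e)$
L1-sqrt: $f = \sqrt{v / (\|v\|_1 + e)}$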
In addition, the scheme L2-hys can be computed by first taking the L2-norm, clipping the result, and then
renormalizing. In their experiments, Dalal and Triggs found the L2-hys, L2-norm, and L1-sqrt schemes provide similar
performance, while the L1-norm provides slightly less reliable performance; however, all four methods showed very
significant improvement over the non-normalized data.
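The four normalization schemes can be sketched as below. The formulas used are the usual definitions attributed to Dalal and Triggs (for example, L2-norm divides v by the square root of its squared 2-norm plus e squared) and are stated here as an assumption; the value of the small constant e and the function names are illustrative.

```python
import math


def l2_norm(v, e=1e-6):
    """L2-norm: v / sqrt(||v||_2^2 + e^2)."""
    d = math.sqrt(sum(x * x for x in v) + e * e)
    return [x / d for x in v]


def l2_hys(v, e=1e-6, clip=0.2):
    """L2-hys: L2-norm, clip values at 0.2, then renormalize."""
    clipped = [min(x, clip) for x in l2_norm(v, e)]
    return l2_norm(clipped, e)


def l1_norm(v, e=1e-6):
    """L1-norm: v / (||v||_1 + e)."""
    d = sum(abs(x) for x in v) + e
    return [x / d for x in v]


def l1_sqrt(v, e=1e-6):
    """L1-sqrt: element-wise square root of the L1-normalized vector.

    Histogram entries are non-negative, so the square root is defined.
    """
    return [math.sqrt(x) for x in l1_norm(v, e)]
```

The clipping in L2-hys limits the influence of a single dominant gradient within a block, which is visible when one entry of v is much larger than the others.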
Histogram of oriented gradients (HOG) is a feature descriptor used to detect objects in computer vision and image
processing. The HOG descriptor technique counts occurrences of gradient orientation in localized portions of an image,
such as a detection window or region of interest (ROI).
Implementation of the HOG descriptor algorithm is as follows:
Step 1: Divide the image into small connected regions called cells, and for each cell compute a histogram of gradient
directions or edge orientations for the pixels within the cell.
Step 2: Discretize each cell into angular bins according to the gradient orientation.
Step 3: Each pixel in a cell contributes a weighted gradient to its corresponding angular bin.
Step 4: Groups of adjacent cells are considered as spatial regions called blocks. The grouping of cells into a block is
the basis for grouping and normalization of histograms.
Step 5: Normalized group of histograms represents the block histogram. The set of these block histograms represents
the descriptor.
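To give a feel for how Steps 1 to 5 determine the size of the resulting descriptor, the sketch below counts the values produced for a given window. The function name is illustrative, the block stride of one cell is an assumption, and pairing the 65 × 65 pixel image size with the Dalal and Triggs cell and block parameters is a hypothetical combination, not a configuration reported for the proposed system.

```python
def hog_descriptor_length(win_w, win_h, cell_size, block_cells, bins, stride_cells=1):
    """Number of values in a HOG descriptor with overlapping square blocks.

    cell_size: cell width and height in pixels.
    block_cells: block width and height in cells.
    stride_cells: block step in cells (assumed to be 1 by default).
    """
    cells_x = win_w // cell_size
    cells_y = win_h // cell_size
    blocks_x = (cells_x - block_cells) // stride_cells + 1
    blocks_y = (cells_y - block_cells) // stride_cells + 1
    return blocks_x * blocks_y * block_cells * block_cells * bins


# Hypothetical: 65 x 65 window, 6 x 6 pixel cells, 3 x 3 cell blocks, 9 channels.
length = hog_descriptor_length(65, 65, cell_size=6, block_cells=3, bins=9)
```

Because the blocks overlap, each interior cell is counted in several blocks, so the descriptor is considerably longer than the number of cells times the number of channels.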
The following Fig 18 demonstrates the algorithm implementation scheme:
Computation of the HOG descriptor requires the following basic configuration parameters:
The components of the proposed system are the stored or captured microscopic images in the root directory, a data
entry program to create, add, and delete the image files, and a diagnosis program which performs the matching process.
The programming files that make up the system are listed in Table 2.
The proposed system is initialized with the root directory, which holds a number of captured sample images in
BMP/JPEG format, at 8 bits per pixel and 65 × 65 pixels (height × width). The working model flow of the encoding
system is shown in Fig 20.
The research methodology section above began with the classification and features of malaria parasites, which are
important for detection and classification, followed by the image processing concepts. Under data collection, the
procedure for preparing slides was briefly described. Preprocessing includes HSV color space transformation,
binarisation and region detection using ROI and BW regions, followed by segmentation. After segmentation, a Gaussian
filter is applied to smooth the image. The HOG and wavelet methods are used to extract features, together with
eccentricity and the convex hull. Finally, classification is carried out using an ANN. The relevant formulas and
equations have been explained.
V. RESULT DISCUSSION
In addition, a new experimental diagnostic model design has been developed for the training, classification and
recognition of neural network models, and it is one of the contributions of this research. Where other research work
used the moving average method for the experimental data design, in this research the data are prepared using image
processing from the clinical dataset. With the clinical dataset, the neural network models succeed in recognizing
the Plasmodium in the clinical image and in deriving depth values and hidden points of the neural network model from a
given 2D image. The moving average method did not give satisfactory results; in other words, neural network models
are probably not suitable for recognizing that data pattern.
Fig 22a: Result of Plasmodium Falciparum Detection
Fig 22b: Result of Plasmodium Ovale Detection
Fig 22c: Result of Plasmodium Malariae 2nd type detection
Fig 22d: Result of Normal RBC (Not Infected or Malaria Negative)
Fig 22e: Result of Plasmodium Vivax Detection
Fig 22f: Result of Normal RBC (Not Infected or Malaria Negative)
It is observed that the matching program achieved 96.00% in detecting malaria-infected blood samples. In the
experiment, 15 samples were investigated by microscopy; the system matched 11 of them, and 4 did not match the
stored samples. The mismatches relate to undetermined parasite phases, and we found that the system is sensitive to
differences in parasite phase.
Receiver Operating Characteristics and Percent Accuracy: The predictive performance of the automated diagnosis of
parasite infected RBCs and hence its generalization capability was measured in terms of the area under the receiver-
operating characteristic. In medical prediction, the receiver operating characteristics (ROC) is commonly used to
determine the accuracy of predicted values as it can be used across different classification tools.
The ROC is a plot of sensitivity versus 1 - specificity for different test results [3].
A data sample which is infected and had a malaria “positive” test result is termed a True Positive.
A data sample which is infected and had a malaria “negative” test result is termed a False Negative.
A data sample which is not infected and had a malaria “positive” test result is termed a False Positive.
A data sample which is not infected and had a malaria “negative” test result is termed a True Negative.
A summary of the above is shown in Table 6.
Sensitivity, equation (1), is the number of true-positive test results (a) divided by the number of all infected
samples (a + c, where c is the number of false negatives). This is the probability that a data sample will be
classified as positive when it is infected.
Sensitivity = (a / (a + c)) ……. (1)
Specificity, equation (2), is the number of true-negative test results (d) divided by the number of all samples that
are not infected (b + d, where b is the number of false positives). This is the probability that a data sample will
be classified as negative when it is not infected. “1 - specificity” is the probability that a sample will be
classified as positive when it is not infected.
Specificity = (d / (b + d)) ……. (2)
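Equations (1) and (2) can be evaluated with a short sketch. It assumes the usual reading of the symbols implied by the two equations (a: true positives, b: false positives, c: false negatives, d: true negatives); the counts in the example are hypothetical and are not results from this study.

```python
def sensitivity(a, c):
    """Equation (1): true positives a over all infected samples (a + c)."""
    return a / (a + c)


def specificity(b, d):
    """Equation (2): true negatives d over all non-infected samples (b + d)."""
    return d / (b + d)


# Hypothetical counts, for illustration only.
a, b, c, d = 40, 5, 10, 45
sens = sensitivity(a, c)           # 40 / 50
spec = specificity(b, d)           # 45 / 50
false_positive_rate = 1.0 - spec   # the quantity plotted on the ROC x-axis
```

Each (false positive rate, sensitivity) pair obtained at a different decision threshold gives one point of the ROC curve.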
To generate the ROC curve it is first necessary to determine the sensitivity and specificity for each test result. The X-
axis ranges from 0 to 1, or 0% to 100% and is the false positive rate, that is 1-specificity. The Y-axis ranges from 0 to 1,
or 0% to 100% and is the true positive rate, that is the sensitivity. The curve starts at (0,0) and increases towards (1,1).
The endpoints of the curve will run to these points and an area of the resulting trapezoids can therefore be calculated as
shown in Fig 23. The larger the area under the curve the better is the prediction.
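The trapezoid calculation of the area under the ROC curve can be sketched as follows. The function name is illustrative and the sample points are hypothetical; the end points (0,0) and (1,1) are added automatically, as described above.

```python
def roc_auc(points):
    """Trapezoidal area under ROC points given as (1 - specificity, sensitivity) pairs."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighbouring points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


auc = roc_auc([(0.1, 0.8)])  # a single hypothetical operating point
```

With no points besides the end points the area is 0.5, which corresponds to a classifier no better than chance, and a perfect classifier passing through (0, 1) gives an area of 1.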
The accuracy of the predictions is measured by the number of correctly predicted cases divided by all the cases in the
study (Percent Accuracy). We found that the automated diagnostic detection was able to predict malaria parasite
infection with an accuracy of 60-96% based on selected dependent variables, validated using ROC characteristics.
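As a worked example of Percent Accuracy, the counts reported earlier in this section (11 of the 15 investigated samples matched by the system) can be evaluated directly. The function name is illustrative, and treating a matched sample as a correctly predicted case is an assumption.

```python
def percent_accuracy(correct, total):
    """Correctly predicted cases divided by all cases, as a percentage."""
    return 100.0 * correct / total


matched = round(percent_accuracy(11, 15), 2)
```

The value obtained, 73.33, agrees with the lower accuracy figure of 73.33% quoted in the conclusions.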
VI. CONCLUSIONS
This section presents the conclusions of the research and ideas for further research. The discussion starts with an
introduction to the research achievements, research framework summaries, research work summaries, and the
contribution of the research. The discussion ends with ideas for future work.
Everything that is newly invented has its merits and demerits. The assessment of the newly invented thing is, perhaps,
the most crucial phase in not just the invention process, but also in the decision of retaining it as useful, or in proposing
to bring in modifications to improve the performance or discarding it as unworthy. However, in fields where robust
and/or credible methodologies are already existent, a new approach introduced should not only be entirely assessed, but it
should also be thoroughly compared with the existing ones. This gives a clear picture of not only how good the new
approach is, but also if it is worthwhile pursuing improvements over the methodology, and also where the approach
stands against the state of the art. The newly proposed approach should closely follow the already existing robust ones,
and be preferably even better. This is the surest way of bringing in any kind of credibility to the approach right in its
inception. The following sections might be of use in assessing new approaches that we have taken up in our research
work.
To the best of our knowledge, there are no standard methods for the classification and recognition of Plasmodium
parasitaemia, or of similar conditions. We made an extensive study of the existing methods and used a few of them,
namely HSV color space transformation, binarisation and region detection using ROI and BW regions. Segmentation is
applied next, followed by a Gaussian filter to smooth the image. The HOG and wavelet methods are used to extract
features, together with eccentricity and the convex hull. Finally, classification is performed using an ANN.
In this research work we have presented a hybrid model, which combines digital image processing and artificial
intelligence techniques to detect parasite-infected red blood cells. The proposed method automatically identifies the
parasites using colour, shape and size information, extracted by digital image operations and AI techniques. We have
used the features of parasite infection in RBCs to detect the parasites, according to the connectivity of a
disk-shaped structuring element whose radius is the greatest size of the red blood cells.
Future Research Avenues: We found that the hybrid model is able to detect images of malaria parasite-infected blood
cells with an accuracy of 73.33%-96% based on selected variables. These results are significant: the approach and the
results obtained in this research could be useful in determining potential treatment methods and monitoring the
progress of treatment for malaria patients.
The classification algorithm is sensitive to phase-wise features of malaria and to factors such as the use of
different types of dye in malaria diagnosis (Giemsa, Field's and Leishman) and different dye concentrations. The
system performance can be further improved by considering larger image sizes, colour scaling and colour depth. In
this experiment, the images stored in the database were 65x65 pixels with 8-bit quantization, so there is still room
for research. Because the proposed system is sensitive to differences in parasite phase, it may be possible to extend
and explore the proposed techniques to diagnose other types of disease.
Using different types of dye at different concentrations generates high noise, so the dye concentration must be equal
in both the stored and the investigated samples. A further avenue, as a follow-up to our research, is to design and
develop methods or techniques to reduce this noise. Finally, we would like to incorporate artificial intelligence
techniques and expert systems into the proposed system, so that an improved and more efficient system can first
diagnose a patient's symptoms and then investigate the patient's clinical blood sample.
ACKNOWLEDGMENT
We have been fortunate in our collaborative efforts with clinical expert Mr. Bhavani Shankar, District Malaria Officer,
Malaria District Regional Office, Mandya, Karnataka state, India, who not only shared his vast experience of malaria
cases and acted as advisor, but also provided access to some data for our experiment. In particular, the authors
would like to express their appreciation to Physician Dr. Dore Swamy, practicing at Bangalore, India.
REFERENCES
[1] S. K. Lee, C. S. Lo, C. M. Wang and P.-C. Chung, “A Computer Aided Design Mammography Screening System
for Detection and Classification of Micro Calcification”, International Journal of Medical Informatics, Vol 60,
pp 29-57, 2000.
[2] C. Di Ruberto, A. Dempster, S. Khan and B. Jarra, “Analysis of Infected Blood Cell Images using
Morphological Operators”, Image and Vision Computing, Vol 20, pp 133-146, 2002.
[3] S. H. Park, J. M. Goo and C.-H. Jo, “Receiver Operating Characteristic (ROC) Curve: Practical Review
for Radiologists”, Korean J Radiol, Vol 5, No 1, pp 11-18, March 2004.
[4] C. Pan, X. Yan and Zheng, “Recognition of Blood and Bone Marrow Cells Using Kernel-based Image
Retrieval”, IJCSNS International Journal of Computer Science and Network Security, Vol 60, No 10, October
2006.
AUTHOR’S PROFILE
Dr. S. Raviraja, PhD, PDRF., Founder Chairman, Royal Research Foundation, Mysore, India. Former Employee, Dept. of
Artificial Intelligence, University of Malaya, Kuala Lumpur, Malaysia.
Dr. S. Raviraja is the founder Chairman of Royal Research Foundation, India, a research institute, and a former
employee of the Department of Artificial Intelligence, Faculty of Computer Science and IT, at University of Malaya
(UM), Malaysia. Previously he was attached to the Faculty of CS & IT at the University of Medical Sciences and
Technology in Sudan. He received his Bachelors in Computer Science and Masters in Computer Applications from
University of Mysore, India. He also has a PhD from University of Honolulu, US, in which he built a multilingual
script classification and recognition system for African language scripts, and he served as Post Doctoral Research
Fellow at University of Malaya. This contemporary research from several perspectives (such as image processing,
robotic/computer vision, and 3G mobile applications for controlling and monitoring) has met the necessity of
addressing many of the theoretical, practical and methodological issues surrounding research on AI.
Raviraja started his career with Motorola (India) as Software Engineer, and later worked as Software Analyst and
then as Project Lead in reputed software companies in India, such as Pentsoft Technologies Pvt Ltd and Raman InfoTech
Ltd. He then worked as a research scholar and later as Assistant Professor at a university in Ethiopia. A few years
later he continued his academic and research career with a university in Sudan, working on automated detection of
malaria. He also served as department Head and Dean of the school in previous institutions. Later, he was selected
by University of Malaya (ranked 130 by THES QS 2010) as a Post Doctoral Research Fellow. At present he is the
founder chairman of Royal Research Foundation, a research institute in India.
Raviraja has presented his research findings at several national and international conferences, journals and
workshops, and also has innovation awards to his credit. He has previously taught and examined students in India,
Ethiopia, Sudan, Middle East countries, and Malaysia. During his stay at UM he was also actively involved in teaching
and supervising undergraduate and postgraduate students. Formerly he served as Editorial Member of the Malaysian
Journal of Computer Science (ISI Indexed: ISSN 0127-9084) and the ICTACT Journal of Soft Computing (ISSN 0976-6561),
among others. He is also a member of the Institute of Engineers India, the Computer Society of India and several other
professional associations. His research interests include Medical & Document Image Analysis, DIP, AI & Robotics and
Software Engineering Methodologies.