Last Vergen Reports


Reports of research papers

Contents:
Road Sign Detection and Recognition
Real Time Road Sign Recognition System Using Artificial Neural Networks for Bengali Textual Information Box
A First Approach to Learning a Model of Traffic Signs using Connectionist and Syntactic Methods
Signfinder: Using Color to Detect, Localize and Identify Informational Signs
Road Traffic Sign Detection and Classification
Our Approach

Road Sign Detection and Recognition
Overview
The approach was proposed by Michael Shneier in June 2005. Road signs are detected
by means of rules that restrict color and shape and require signs to appear only in
limited regions. The candidate regions are identified by color and pruned using shape
criteria. They are then recognized using a template matching method and tracked
through a sequence of images over time. The method is fast and can easily be
modified to include new classes of signs.

Approach

The approach is composed of two parts:

1- Detecting the Signs:

1) The program accepts a video stream taken from a camera.


2) Signs are detected in a multistage process:

1. Starting with segmentation based on color.


 For warning signs, each pixel's red (r), green (g), and blue (b) values (RGB ratios) must conform to the following:
r/g > α_warn and r/b > β_warn and g/b > γ_warn
 For predominantly red signs (stop, yield, no entry), the requirement is
r/g > α_red and r/b > β_red and g/b > γ_red
where α, β, and γ are constants.
 These constants are determined by sampling the red, green, and blue values of images of typical signs.

2. The constraints are applied pixel by pixel to the image and result in a
binary image with 1’s where pixels are candidates for belonging to a sign.
3. Following the creation of the binary images, morphological erosion is
performed to get rid of single pixels and two or three dilations are done to
join parts of signs that may have become separated due to the presence of
writing or ideograms on the signs (Figure 1).
4. The next step is to find connected components in the binary images and
identify blobs that are likely to be signs.
1. Properties of each component are computed, including the centroid,
area, and bounding box.
2. They are used in a set of rules that accept or reject each blob as a sign
candidate:
a. The rules require the area of the blob to be greater than a
minimum and less than a maximum, the height to width ratio to
be in a specified range, and the centroid to be in a restricted
part of the image where signs can be expected to appear.

b. The ratio of the area of the blob to the area of the bounding box
is restricted to prevent blobs that are too thin from being
accepted.
3. Blobs that conform to the rules are considered to be candidate signs
and are tracked from image to image.
4. If a blob is seen in five successive frames, it is confirmed as a
candidate and goes on to the recognition phase of the algorithm (a minimal sketch of this detection stage follows).
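
A minimal sketch of this detection stage in Python with OpenCV and NumPy is given below. The ratio thresholds, the area and aspect-ratio limits, and the region of interest are illustrative assumptions, not the values used in the paper.

    # Sketch of the color-and-shape detection stage (constants are assumed).
    import cv2
    import numpy as np

    ALPHA_WARN, BETA_WARN, GAMMA_WARN = 1.2, 1.6, 1.3   # assumed ratio thresholds
    MIN_AREA, MAX_AREA = 200, 20000                      # assumed blob-area limits
    MIN_ASPECT, MAX_ASPECT = 0.7, 1.4                    # assumed height/width range
    MIN_FILL = 0.4                                       # assumed blob / bounding-box ratio

    def detect_sign_candidates(bgr):
        b, g, r = [c.astype(np.float32) + 1e-6 for c in cv2.split(bgr)]
        # Color segmentation: keep pixels whose channel ratios satisfy the rules.
        mask = (r / g > ALPHA_WARN) & (r / b > BETA_WARN) & (g / b > GAMMA_WARN)
        binary = mask.astype(np.uint8)
        # One erosion removes isolated pixels; a few dilations rejoin split sign parts.
        kernel = np.ones((3, 3), np.uint8)
        binary = cv2.erode(binary, kernel, iterations=1)
        binary = cv2.dilate(binary, kernel, iterations=3)
        # Connected components, then rule-based pruning of the blobs.
        n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
        candidates = []
        for i in range(1, n):                         # label 0 is the background
            x, y, w, h, area = stats[i]
            aspect = h / float(w)
            fill = area / float(w * h)
            cy = centroids[i][1]
            in_roi = cy < 0.6 * bgr.shape[0]          # assume signs sit in the upper image
            if (MIN_AREA < area < MAX_AREA and MIN_ASPECT < aspect < MAX_ASPECT
                    and fill > MIN_FILL and in_roi):
                candidates.append(((x, y, w, h), (labels == i).astype(np.uint8)))
        return candidates                             # bounding boxes plus per-blob masks

    # A candidate would additionally be tracked over five successive frames
    # before being passed to the recognition phase.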

2- Recognition:

 Recognition is achieved by template matching.

1) A preprocessing step is first applied to each candidate sign:


 It masks out the background surrounding the sign which would otherwise
interfere with the template matching.
 We make use of the results of the sign detection phase that already
constructed a mask for the sign.
 Using this mask results in good segmentation of the sign region from the
background (Figure 2).
 The masked candidate signs are scaled to a standard size (48 x 48 pixels).
2) The masked candidate signs are compared with stored signs of the same size.

 The stored sign templates are taken from video sequences similar to those being
recognized.
 Because there is a lot of variation in the signs, several stored templates may be
needed for each canonical sign (a minimal sketch of the matching step follows).
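
The matching step can be sketched as follows, assuming the per-blob mask from the detection phase and a dictionary of 48 x 48 grayscale templates; normalized cross-correlation stands in here for whichever similarity measure the paper actually used.

    # Sketch of recognition by masked template matching (assumed correlation score).
    import cv2
    import numpy as np

    SIZE = 48                                   # standard template size

    def recognize(bgr, box, blob_mask, templates):
        """templates: dict mapping sign name -> 48 x 48 uint8 grayscale template."""
        x, y, w, h = box
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        patch = gray[y:y + h, x:x + w] * blob_mask[y:y + h, x:x + w]  # mask background
        patch = cv2.resize(patch, (SIZE, SIZE))
        best_name, best_score = None, -1.0
        for name, template in templates.items():
            score = cv2.matchTemplate(patch, template, cv2.TM_CCOEFF_NORMED)[0, 0]
            if score > best_score:
                best_name, best_score = name, score
        return best_name, best_score            # caller may reject low-scoring matches

Because several templates may exist for each canonical sign, the dictionary could instead map each canonical name to a list of templates, with the best score taken over the list.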

Experiments & Results
 The algorithm was tested on video sequences recorded with a camera at normal
driving speeds.
 The results reported here are for a total of 23,637 frames containing 92 warning
and stop signs.
 The algorithm runs at over 20 frames per second on a 1.6 GHz Intel Pentium
Mobile.
 Table 1 shows the results for three individual runs and the combined totals, while
Table 2 shows the corresponding percentages.

Real Time Road Sign Recognition System Using Artificial
Neural Networks for Bengali Textual Information Box

Overview

The approach was proposed by Mohammad Osiur Rahman, Fouzia Asharf Mousumi,
Edgar Scavino, Aini Hussain, and Hassan Basri in 2009. The basic objective of the
proposed system is to recognize road signs inscribed in the Bengali language and to
deliver a corresponding audio stream to the user. The textual information on the road
signs is detected and extracted from the images, then recognized by a Bengali OCR
system implemented using a Multilayer Perceptron neural network.

Approach

The proposed system is structured as a sequence of processes:

1) Image acquisition and preprocessing:

1. Record several video sequences with a webcam.
2. Every two seconds a frame is collected and stored in JPG format.
3. A median filter is used to reduce impulsive noise in the captured images.
4. Images are normalized to a 320 x 240 pixel format.

2) Text detection and extraction:

1. Read the input image in .jpg format
2. Convert the color image into a grayscale image
3. Apply a 3 x 3 median filter mask to the grayscale image
4. Calculate edges by applying a Sobel convolution mask
5. Thicken the calculated edges by dilation
6. Apply a vertical Sobel projection filter to the dimmed image
7. Create a histogram by computing projection values
8. Find the threshold value of the image
9. Loop over the possible positive identifications based on the histogram values
10. Extract the plausible area containing the text
11. Apply Sobel horizontal edge emphasis to search for other possible text areas
12. Convert the detected text region into a binary image
13. Calculate the height and width of the detected text region
14. Crop the image
(A minimal sketch of steps 2 to 9 follows this list.)
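
A rough sketch of the core of these steps (grayscale conversion, median filtering, Sobel edges, dilation, projection histogram, and thresholding) is shown below; the threshold rule and the row-grouping logic are assumptions for illustration.

    # Sketch of text-band detection via Sobel edges and a projection histogram.
    import cv2
    import numpy as np

    def find_text_bands(bgr):
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        gray = cv2.medianBlur(gray, 3)                               # 3 x 3 median filter
        edges = np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3))   # vertical edges
        edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))         # thicken the edges
        profile = edges.sum(axis=1)                                  # row-wise projection
        threshold = profile.mean() + profile.std()                   # assumed threshold rule
        rows = profile > threshold
        # Group consecutive rows above the threshold into candidate text bands.
        bands, start = [], None
        for y, on in enumerate(rows):
            if on and start is None:
                start = y
            elif not on and start is not None:
                bands.append((start, y))
                start = None
        if start is not None:
            bands.append((start, len(rows)))
        return bands    # each band is then binarized, measured, and cropped for OCR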

3) Bangla Optical Character Recognition Using MLP:

An artificial neural network based approach is used for Bangla optical
character recognition (OCR) of the text in road signs.
 The Bangla OCR module has three sub-modules:
3.1 Character Segmentation:
The text is partitioned into its coherent parts. To obtain
individual characters, the text must pass through a sequence of
processes: line segmentation, word segmentation, and
character segmentation.
3.2 Feature Extraction:
This step extracts the feature column matrix, which is the key
input to the recognition phase.

3.3 Character Recognition by MLP Neural Network:


 The segmented characters are recognized using a
Multilayer Perceptron (MLP) neural network.
 For segmented Bangla character recognition, three-layer
feed-forward supervised neural networks are designed.
 In these three-layer MLP networks, log-sigmoid and
hyperbolic-sigmoid transfer functions are used as
activation functions.
 Two 3-layer neural networks are created:
o One for recognition of all Bangla characters,
including vowels, consonants, and conjunctives;
o Another for the modifiers of Bangla characters.
 Each MLP network has 400 neurons in the input layer,
matching the feature column matrix (see the sketch below).
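
The recognition sub-module can be pictured with the minimal NumPy sketch below: a feed-forward network with 400 inputs (the feature column matrix), a hyperbolic-sigmoid hidden layer, and a log-sigmoid output layer. The hidden and output sizes, the random weights, and the class mapping are assumptions, not the paper's values.

    # Sketch of a three-layer MLP (counting the input layer) for character recognition.
    import numpy as np

    rng = np.random.default_rng(0)
    N_IN, N_HIDDEN, N_OUT = 400, 80, 50       # 400 from the paper; 80 and 50 assumed

    W1, b1 = rng.normal(0, 0.1, (N_HIDDEN, N_IN)), np.zeros(N_HIDDEN)
    W2, b2 = rng.normal(0, 0.1, (N_OUT, N_HIDDEN)), np.zeros(N_OUT)

    def logsig(x):
        return 1.0 / (1.0 + np.exp(-x))

    def recognize_character(features):
        """features: length-400 vector extracted from one segmented character."""
        hidden = np.tanh(W1 @ features + b1)   # hyperbolic-sigmoid hidden layer
        output = logsig(W2 @ hidden + b2)      # log-sigmoid output layer
        return int(np.argmax(output))          # index of the recognized character class

In practice the weights would be learned by supervised training on labeled character samples rather than initialized randomly as here.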

4) Confirmation of textual road sign:

 The recognized text is matched against a pre-memorized Bangla road
sign text list to confirm whether it is a Bangladeshi road sign text
or not.
 If it is a member of the Bangla road signs' text list, it is sent to the
conversion phase to convert the text into Times New Roman text
format.

5) Speech Synthesis:

 The process of converting written text into spoken language (a digital
audio stream).

Experimental Results and Discussion

 The performance and accuracy rate of the proposed system are measured by
testing four major modules with a set of real-time video frames containing
different types of Bangla road signs.
 The modules of the proposed Bangla Road Sign Recognition System are
implemented using Visual Basic 6.0 and MATLAB 7.0.
 The following definitions are applied:
Success Rate = (Total Number of Successes / Total Number of Input Samples) x 100%
Failure Rate = (Total Number of Failures / Total Number of Input Samples) x 100%
Efficiency = (100 - Failure Rate)%
 In this system, the total computing time required for recognizing a road sign,
from capturing an image up to speech synthesis, is 0.98 s in a non-compiled
MATLAB environment.
 This means that before a new frame is captured, the previous frame has passed
completely through all phases of the system. As a result, the time complexity of
the system is quite low.
 After testing this system, the obtained accuracy rate was evaluated at 91.48%.

A First Approach to Learning a Model of Traffic Signs
using Connectionist and Syntactic Methods
Overview

The approach is proposed by Miguel Sainz and Alberto Sanfeliu. The main purpose of this
research is to develop a system for learning and recognizing traffic signs using neural
network and syntactic methods. The paper uses two levels of learning: first,
segmentation learning based on a neural network; second, model learning based on
grammatical inference. The recognition process uses the results of the learning process as
input for identifying the traffic signs. Recognition of traffic signs in a scene is done
in two steps: first, the sign is located in the scene using a connectionist
segmentation method; second, the sign is coded and analyzed to determine which
traffic sign it is. The system has been tested successfully only for the first step; the
second step is currently under development.

Approach
1. The Learning Process:

1.1 Segmentation Learning Based On Neural Network

1. The input to this step is a color image obtained from a TV camera on the car.
2. The human operator decides how many different labels the system will
consider (for the traffic sign recognition problem, the following five labels
were considered: road, road lines, sky, grass, traffic sign).
3. The operator then marks and labels some areas in a set of images
("segmentation areas").
4. The segmentation module consists of a three-layer neural net trained by
the back-propagation method (a minimal sketch follows this list).
5. Once the net is trained, a validation test is performed over a set of test
images to check the learning performance. If the efficiency is low or
unsatisfactory, the samples or the net parameters can be modified to
improve the learning of the segmentation module.
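
The segmentation learner can be sketched as below: a small network trained by back-propagation to map a pixel's color to one of the five labels. Using only the RGB values as features, the hidden-layer size, and the learning rate are assumptions for illustration.

    # Sketch of per-pixel segmentation learning with back-propagation (assumed setup).
    import numpy as np

    LABELS = ["road", "road lines", "sky", "grass", "traffic sign"]
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(0, 0.5, (16, 3)), np.zeros(16)
    W2, b2 = rng.normal(0, 0.5, (len(LABELS), 16)), np.zeros(len(LABELS))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_step(rgb, label_index, lr=0.1):
        """One back-propagation update for a single operator-labeled pixel."""
        global W1, b1, W2, b2
        h = sigmoid(W1 @ rgb + b1)
        o = sigmoid(W2 @ h + b2)
        t = np.eye(len(LABELS))[label_index]          # one-hot target
        delta_o = (o - t) * o * (1 - o)               # output-layer error term
        delta_h = (W2.T @ delta_o) * h * (1 - h)      # error propagated to hidden layer
        W2 -= lr * np.outer(delta_o, h); b2 -= lr * delta_o
        W1 -= lr * np.outer(delta_h, rgb); b1 -= lr * delta_h

    def classify_pixel(rgb):
        return LABELS[int(np.argmax(sigmoid(W2 @ sigmoid(W1 @ rgb + b1) + b2)))]

After training on the marked segmentation areas, classify_pixel would be applied to every pixel of a test image for the validation step.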

1.2 Model Learning
1. At this level the operator marks the areas of the scenes where the
model is located; these areas are called model areas.
2. It is necessary to preprocess the image before the learning process
starts. The preprocessing has three parts:
a. Optimization of the areas.
b. Normalization of the sizes (to get the same information from any of
the different image areas, the size of the model areas is normalized).
c. Coding the content of the sample areas into symbols: each pixel of the
model area is coded into one of the following four symbols: red (R),
white (W), black (B), and the remaining colors ($). The shape of the
traffic sign is improved (holes are removed and the contour is
smoothed) by applying a morphological process. (A minimal sketch of
this coding follows the list.)
3. The primitive chains are extracted by reading the primitives of the coded
sample.
4. A validation test is applied to evaluate how good the system is; if it is not
good, the operator may restart the model learning level.
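
The symbol-coding part of the preprocessing might look like the sketch below; the prototype colors and the distance threshold are assumed nearest-color rules chosen for illustration, not the paper's actual coding.

    # Sketch of coding a normalized model area into the symbols R, W, B, and $.
    import numpy as np

    def code_symbols(rgb_area, max_distance=80.0):
        """rgb_area: H x W x 3 uint8 array of a normalized model area."""
        names = np.array(["R", "W", "B"])
        prototypes = np.array([(200, 30, 30), (230, 230, 230), (20, 20, 20)], float)
        # Distance of every pixel to every prototype color.
        distances = np.linalg.norm(rgb_area[:, :, None, :].astype(float) - prototypes,
                                   axis=3)
        symbols = names[np.argmin(distances, axis=2)]         # nearest prototype color
        symbols[distances.min(axis=2) > max_distance] = "$"   # everything else
        return symbols

    def to_primitive_chain(symbols):
        """Read the coded sample row by row into a chain of primitives (a string)."""
        return "".join(symbols.flatten())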

2-The Recognition Process


The recognition process is divided into three steps, and the third step is divided into two phases:

2.1 Location of the traffic sign by using the segmentation model (see Figure 3).
2.2 A morphological process is applied to remove noise and fill up gaps.
2.3 The third step is the recognition of the traffic sign; it is divided into two phases:
2.3.1 Finding a distance measure between the extracted symbol chain and
each inferred grammar of the traffic sign models; an error-correcting
parser is used.
2.3.2 Analyzing the symbol inside the sign.

Results:

The figures show two road scenes (right side) and the two segmented road scenes; the
segmentation process gives very good results. They also show traffic signs taken from
the scenes and the results of coding them into grammar symbols.

Signfinder: Using Color to Detect, Localize and Identify Informational Signs

Overview:
The approach is proposed by A. L. Yuille, D. Snow, and M. Nitzberg. The main purpose of
this research paper is to describe an approach that is able to detect and locate certain
important classes of signs. The signs are then automatically transformed to a standard
(frontal) viewpoint. Most street signs obey the following assumptions: first, the signs have
stereotypical boundary shapes; second, the writing on the sign has one uniform color and
the rest of the sign has a second uniform color; third, the text is in a standard font set. The
method starts by selecting simple tests which can be run in parallel over the image. These
tests locate seed positions for hypotheses. The seeds initiate region growing to segment
two-color regions (e.g. signs). A specialized edge detector is used on the segmented
regions to determine the precise location of the sign boundaries and to confirm (or deny)
that the region is really a sign. From the boundaries an affine transformation can be
calculated to transform the sign to a standard (frontal) viewpoint.

Approach:

These steps have been applied to 100 red-and-white stop signs. The steps in general are as
follows:
1. Get a database of color images with signs in them in different situations of lighting,
shadow, viewpoint, and distortion.
2. In order to locate the position of a sign inside an image:

2.1 Identify the seed regions (where there are two color peaks). In order to determine
seeds, statistical analysis of the colors is used to learn adjacent sets of red and white
pixels with unknown illuminant (i.e. for stop signs), and then a multiplicative model is
applied to deduce the set of illuminant colors.

2.2 Apply the region-growing algorithm. The seed region is grown by integrating
neighboring pixels whose color properties are similar to the properties of the colors in
the seed region (a minimal sketch follows this list).

2.3 Test the hypothesis regions obtained by applying the growing algorithm in the
previous step. Straight-line edges are detected (to find the boundary, the paper uses
specially tailored edge detectors and a variant of the Hough transform to detect the
boundaries and corners of the sign) in order to obtain information on the geometric
shape of the sign, which gives the orientation of the sign. The sign is then normalized
to a fronto-parallel viewpoint at a fixed scale (this makes it directly readable).
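
A minimal sketch of the region-growing step is given below; the color-similarity test (distance to the seed's color) and the threshold are assumptions standing in for the paper's statistical color model.

    # Sketch of growing a hypothesis region outward from a seed pixel.
    import numpy as np
    from collections import deque

    def grow_region(rgb, seed, threshold=40.0):
        """rgb: H x W x 3 array; seed: (row, col) found by the seed-detection tests."""
        h, w, _ = rgb.shape
        seed_color = rgb[seed].astype(float)
        region = np.zeros((h, w), bool)
        region[seed] = True
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):     # 4-neighbourhood
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and not region[rr, cc]:
                    # Absorb the neighbour if its color is close to the seed color.
                    if np.linalg.norm(rgb[rr, cc].astype(float) - seed_color) < threshold:
                        region[rr, cc] = True
                        queue.append((rr, cc))
        return region    # boolean mask of the hypothesis region, then tested in step 2.3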

Results:
The algorithm worked at close to one hundred percent effectiveness on the dataset. This
section shows the final results on some of the more difficult images in the database. These
involve partial occlusion, heavy shadowing, and difficult illuminant colors and pose.
1. Sign with partial occlusion

a) The image containing the occluded sign. b) The final result of the algorithm.

2. Sign with large shadowing

a) The image containing the shadowed sign. b) The final result of the algorithm.

3. Sign at difficult viewpoint and illuminant

a) The image containing the sign at difficult viewpoint and illuminant. b) The final result of the
algorithm.

Road Traffic Sign Detection and Classification


Overview:
The approach was proposed by A. Escalera, L. E. Moreno, M. A. Salichs, and J. M.
Armingol. The algorithm presented here has two modules. The first localizes the
sign in the image based on color and shape. The second recognizes the sign through
a neural network.
A system capable of performing such a task would be very valuable and would have
different applications. It could be used as an assistant for drivers, alerting them about
the presence of a specific sign or a risky situation.

Approach:
The algorithm is divided into two modules, described as follows:

Module 1: TRAFFIC SIGN DETECTION:

Step 1: Color Thresholding

Thresholding is applied to the functions that give the red, green, and blue levels of each point of the image.

Step 2: The Optimal Corner Detector

There are two kinds of corner detectors:

1- Corner detectors that work on the codification
of the edge of an object:
o The image is divided into regions, the edges of these
regions are extracted, and then the edges are codified.
2- Corner detectors that work directly on the image:
o The corner is found by convolving the image with a
mask.

Step 3: Corner Extraction:


The algorithm steps to obtain a corner of an image are as follows (a minimal sketch follows this list):
 It obtains the convolution for every type of mask.
 It selects the points above a threshold. This threshold is obtained from an ideal result.
 It calculates the center of mass of the selected points. Although the detection mask is built
to obtain the maximum value of the convolution exactly at the corner, because the image is
never ideal and because of the threshold, a single isolated point will never appear labeled as a corner.
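
One way to picture this step is the sketch below, which convolves a thresholded binary image with a single corner mask, keeps the points near the ideal response, and returns their center of mass; the mask shape and the 90% factor are assumptions for illustration.

    # Sketch of corner extraction: mask response, threshold from the ideal value,
    # then the centre of mass of the points above the threshold.
    import cv2
    import numpy as np

    # Assumed mask responding to an upper-left 90-degree corner of a filled region.
    CORNER_MASK = np.array([[-1, -1, -1, -1],
                            [-1,  1,  1,  1],
                            [-1,  1,  1,  1],
                            [-1,  1,  1,  1]], dtype=np.float32)

    def extract_corner(binary, mask=CORNER_MASK, keep=0.9):
        """binary: 0/1 image from the color-thresholding step."""
        response = cv2.filter2D(binary.astype(np.float32), -1, mask)
        ideal = mask[mask > 0].sum()          # response of a perfect, noise-free corner
        ys, xs = np.nonzero(response >= keep * ideal)
        if len(xs) == 0:
            return None                       # no corner of this type in the image
        return float(xs.mean()), float(ys.mean())   # centre of mass of labeled points

Note that cv2.filter2D computes correlation rather than strict convolution; for this sketch the distinction only changes the orientation of the mask.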

Step 4: Example of different signs detection


 Triangular Signs Detection
By seeking in the image the three kinds of corners that form the triangle and by
proving that they form an equilateral triangle, following these steps:
1- Corner detection.
2- Study of the position of the corners (see the sketch after these examples).

 Rectangular Signs Detection

By seeking the four kinds of 90° corners that form the sign and that are located
so as to define a rectangle, following these steps:
1- Corner detection.
2- Study of the position of the corners.

 Circular Signs Detection

Masks to locate some portions of a circumference can be built, and the
circumference they belong to can be found from the convolution.
The masks built for the 90° corners are an approximation of small circumference arcs
located at the 45°, 135°, 225°, and 315° angles.
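
As a concrete illustration of the corner-position study for triangular signs, the following sketch checks whether three detected corner positions form an approximately equilateral triangle; the tolerance is an assumption.

    # Sketch of the equilateral-triangle test on three detected corner positions.
    import numpy as np

    def is_equilateral(p1, p2, p3, tolerance=0.15):
        points = [np.asarray(p, dtype=float) for p in (p1, p2, p3)]
        sides = [np.linalg.norm(points[i] - points[(i + 1) % 3]) for i in range(3)]
        mean_side = sum(sides) / 3.0
        # All three sides must be within the tolerance of the mean side length.
        return all(abs(s - mean_side) <= tolerance * mean_side for s in sides)

    # Example: is_equilateral((0, 0), (10, 0), (5, 8.66)) returns True.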

Module 2: TRAFFIC SIGN CLASSIFICATION

A multilayer perceptron NN is used: the size of the input layer corresponds to an
image of 30 x 30 pixels, and the output layer has ten neurons; the image is presented
as the input pattern. Two neural networks were trained because the detection algorithm
is different according to the form of the sign. Three NNs were studied, the number and
dimension of their hidden layers being different.

Step 1: Image Normalization


The image obtained by the detection module is normalized to 30 x 30 pixels: the relation
between the dimensions needed and the ones obtained is calculated, and pixels are repeated
or discarded depending on that relation (nearest-neighbor method; see the sketch below).
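
A minimal sketch of this nearest-neighbor normalization, assuming a NumPy image array:

    # Sketch of normalizing a detected sign image to 30 x 30 by repeating or
    # discarding pixels (nearest-neighbour resampling).
    import numpy as np

    def normalize_30x30(image):
        h, w = image.shape[:2]
        rows = np.arange(30) * h // 30     # source row feeding each output row
        cols = np.arange(30) * w // 30     # source column feeding each output column
        return image[rows][:, cols]

    # Equivalent in spirit to cv2.resize(image, (30, 30), interpolation=cv2.INTER_NEAREST).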

Step 2: Training Patterns


Nine ideal signs were chosen for the net training. The training patterns are obtained from
these signs through the following modifications:
1- The slope accepted for a sign is 6. From every one of the nine signs, another five were
obtained by covering that range.
2- Three Gaussian noise levels were added to each of the previous signs. This way, during the
training of the net, low weights are associated with the background pixels of the inner part
of the sign.
3- Four different thresholds were applied to the resulting images, so the system is adapted to
the various lighting conditions that real images will present.
4- After making a decision about the net dimensions, a new set of training patterns was made,
taking into account a displacement of three pixels to the left and to the right. From the
chosen ideal patterns, 1620 training patterns were then obtained.

Experiments & Results:

The dimensions of the three studied NNs are as follows: 30 x 30 is the input size of
the network, 10 is the output size, and between them are the hidden layers, which
differ between the three configurations:
1) 30x30 / 30 / 10;
2) 30x30 / 30 / 15 / 10;
3) 30x30 / 15 / 5 / 10.
The three NNs were trained with the patterns obtained from the first three conditions.
In order to compare the results, some test images were chosen, as shown in Fig. 9.
The best results corresponded to the third network and are shown in Table III (0 is the
minimum value, 100 the maximum).
The algorithm has been implemented on an ITI 150/40 in a PC486 33 MHz with Local
Bus. The speed of the detection phase is 220 ms for a 256 x 256 image.
The neural network runs on the PC CPU and takes 1.2 s. The implementation of the
neural network on a digital signal processor (DSP) is undergoing research, and the
expected speed is between 30 and 40 ms.

Fig.9 Ideal signs and test images.

Our Approach:
Module 1: Concept of a neural network
1- Definitions:
 A neural network is an interconnected group of neurons. The prime examples
are biological neural networks, especially the human brain.
 An artificial neural network is a mathematical or computational model for
information processing based on a connectionist approach to computation.
 The original inspiration for the technique was from examination of
bioelectrical networks in the brain formed by neurons and their synapses. In
a neural network model, simple nodes (or "neurons", or "units") are
connected together to form a network of nodes — hence the term "neural
network”.
 Composed of many “neurons” that co-operate to perform the desired
function.

2- Neural network in real life:


 In real-life applications, neural networks perform particularly well on the
following common tasks:
 Function approximation (also known as regression analysis)
 Time series prediction
 Classification
 Pattern recognition
 Problems with noisy data
 Prediction
 Data association
 Data conceptualization
 Filtering
 Planning
 Recognizing and matching complicated, vague, or incomplete patterns
 Prediction: learning from past experience
 Pick the best stocks in the market.

 Predict weather.
 Identify people with cancer risk.
 Extrapolation based on historical data.

 Classification
 Image processing.
 Predict bankruptcy for credit card companies.
 Risk assessment.
 Feature extraction, image matching.
 Recognition
 Pattern recognition: SNOOPE (bomb detector in U.S. airports).
 Character recognition.
 Handwriting: processing checks.
 Data association
 Not only identify the characters that were scanned but identify when
the scanner is not working properly.
 Data Conceptualization
 Infer grouping relationships e.g. extract from a database the names of
those most likely to buy a particular product.

 Data Filtering
 E.g. take the noise out of a telephone signal, signal smoothing.
 Planning
 Unknown environments.
 Sensor data is noisy.
 Fairly new approach to planning.
 Noise Reduction
 Recognize patterns in the inputs and produce noiseless outputs.

3- What is a neuron?

I. A simple neuron:
 An artificial neuron is a device with many inputs and one output.

 The neuron has two modes of operation:


1. The training mode:
• The neuron can be trained to fire (or not) for particular input
patterns.
2. The using mode:
• When a taught input pattern is detected at the input, its
associated output becomes the current output.

II. A more complicated neuron:


 The difference from the previous model is that the inputs are ‘weighted’: the
effect that each input has on the decision making depends on the weight of
that particular input.

22
 For example:
 The network has two inputs and one output (Fig. 1). All are binary.
The output is
o 1 if W0·I0 + W1·I1 + Wb > 0
o 0 if W0·I0 + W1·I1 + Wb ≤ 0
 We want it to learn a simple OR: output 1 if either I0 or I1 is 1 (see the sketch after Fig. 1).

Fig1. Explanation of how to calculate weight.
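
A minimal sketch of this two-input neuron learning OR with the classic perceptron update rule is shown below; the learning rate, the number of passes, and the zero initialization are assumptions for illustration.

    # Sketch of a two-input binary neuron learning OR with the perceptron rule.
    import numpy as np

    inputs  = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    targets = np.array([0, 1, 1, 1])          # OR truth table

    w = np.zeros(2)                           # W0, W1
    wb = 0.0                                  # bias weight Wb
    lr = 0.5                                  # assumed learning rate

    for _ in range(10):                       # training mode: a few passes suffice
        for (i0, i1), t in zip(inputs, targets):
            out = 1 if w[0] * i0 + w[1] * i1 + wb > 0 else 0
            w  += lr * (t - out) * np.array([i0, i1])
            wb += lr * (t - out)

    # Using mode: the learned weights now reproduce OR.
    print([1 if w @ x + wb > 0 else 0 for x in inputs])    # prints [0, 1, 1, 1]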

Module 2: Neural Network Architectures:


 Network architectures come in two forms, with a single layer or with multiple
layers:
 Single-layer:
The S-neuron, R-input, one-layer network (Fig. 2) has the following variables:
p is a vector of length R, W is an S x R matrix, and a and b are vectors of
length S. The layer includes the weight matrix, the summation and
multiplication operations, the bias vector b, the transfer function boxes, and
the output vector, so that a = f(Wp + b) (see the sketch after Fig. 2).

Fig.2 Neuron with R Inputs.
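
The single-layer computation can be written out directly, as in the short sketch below; the sizes and the log-sigmoid transfer function are chosen only for illustration.

    # Sketch of one layer with R inputs and S neurons: a = f(W p + b).
    import numpy as np

    R, S = 4, 3                               # R inputs, S neurons (assumed sizes)
    rng = np.random.default_rng(0)
    W = rng.normal(size=(S, R))               # S x R weight matrix
    b = rng.normal(size=S)                    # bias vector of length S
    p = rng.normal(size=R)                    # input vector of length R

    def logsig(x):                            # an example transfer function f
        return 1.0 / (1.0 + np.exp(-x))

    a = logsig(W @ p + b)                     # output vector of length S
    print(a.shape)                            # (3,)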

 Multiple layers:
Multilayer networks are more powerful than single-layer networks (Fig. 3). For
instance, a two-layer network having a sigmoid first layer and a linear second layer
can be trained to approximate most functions arbitrarily well. Single-layer
networks cannot do this.

Fig.3 Three-Layer Network.

 How to pick an Architecture?


Problem specifications help define the network in the following ways:
 Number of network inputs = number of problem inputs.
 Number of neurons in the output layer = number of problem outputs.
 The output layer transfer function choice is at least partly determined by
the problem specification of the outputs.

Module 3: Learning rules:


The purpose of the learning rule is to train the network to perform some task. There
are many types of neural network learning rules; they fall into three broad
categories.

 Supervised Learning:
The learning rule is provided with a set of examples (the training set) of proper
network behavior:
{p1, t1}, {p2, t2}, ..., {pQ, tQ}
where pq is an input to the network and tq is the corresponding correct (target) output.
As the inputs are applied to the network, the network outputs are compared to the
targets.

 reinforcement (or graded) learning


It is similar to supervised learning, except that, instead of being provided with the
correct output for each network input, the algorithm is only given a grade. The
grade (or score) is a measure of the network performance over some sequence
of inputs. This type of learning is currently much less common than supervised
learning.

 unsupervised learning

The weights and biases are modified in response to network inputs only. There
are no target outputs available. At first glance this might seem to be impractical.
How can you train a network if you don't know what it is supposed to do? Most
of these algorithms perform some kind of clustering operation. They learn to
categorize the input patterns into a finite number of classes. This is especially
useful in such applications as vector quantization.

Module 4: Perceptron Artificial Neural Network 


The perceptron is a type of artificial neural network invented in 1957 at the
Cornell Aeronautical Laboratory by Frank Rosenblatt. It consists of one or more
layers of artificial neurons; the inputs are fed directly to the outputs via a series
of weights.
 Perceptron Architecture:
 Single-layer Perceptron:
o Consists of a single layer of output nodes.
o The inputs are fed directly to the outputs via a series of
weights (feed-forward network).
o Single-unit perceptrons are only capable of learning linearly separable patterns.
 Multi-layer Perceptron:
o Consists of multiple layers of computational units, usually
interconnected in a feed-forward way.
o Each neuron in one layer has directed connections to the
neurons of the subsequent layer.
o In many applications the units of these networks apply a
sigmoid function as an activation function.
o Use a variety of learning techniques, the most popular
being back-propagation.
o In back-propagation the output values are compared with the
correct answer to compute the value of some predefined
error-function.
o The error is then fed back through the network.
o The algorithm adjusts the weights of each connection in order
to reduce the value of the error function by some small
amount. After repeating this process for a sufficiently large
number of training cycles, the network will usually converge to
some state where the error of the calculations is small. In this
case one says that the network has learned a certain target
function (see the sketch below).
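
The back-propagation cycle described above can be sketched end to end on the XOR problem (which a single-layer perceptron cannot learn); the layer sizes, learning rate, and number of training cycles are assumptions for illustration.

    # Sketch of a multilayer perceptron learning XOR with back-propagation:
    # forward pass, error computation, error fed back, small weight adjustments,
    # repeated until the error is small.
    import numpy as np

    rng = np.random.default_rng(1)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)      # XOR targets

    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)       # 2 inputs -> 4 hidden units
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)       # 4 hidden -> 1 output
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    for cycle in range(5000):
        H = sigmoid(X @ W1 + b1)                         # forward pass
        Y = sigmoid(H @ W2 + b2)
        error = Y - T                                    # compare with correct answers
        dY = error * Y * (1 - Y)                         # error at the output layer
        dH = (dY @ W2.T) * H * (1 - H)                   # error fed back through the net
        W2 -= 0.5 * H.T @ dY;  b2 -= 0.5 * dY.sum(axis=0)
        W1 -= 0.5 * X.T @ dH;  b1 -= 0.5 * dH.sum(axis=0)

    Y = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    print(np.round(Y.ravel(), 2))    # usually close to [0, 1, 1, 0] after training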

