Design Lane Detector System For Autonomous Vehicles Based On Hardware Using Xilinx System Generator
Abstract: The Hough Transform (HT) algorithm is a method for extracting straight lines from an edge image. In the HT, the parameters of edge pixels, i.e. points in an image with sharp intensity changes, are treated as "votes". The votes are accumulated in the Hough-space variables (ρ, θ) to find the cells with the maximum number of votes; in other words, pixels that have the same value of $G_{atan}$ and the same distance ρ lie on the same line. However, this algorithm requires a large amount of memory and has high computational complexity. In this paper, we propose an HT architecture that uses a Look Up Table (LUT) to store trigonometric values and uses the orientation θ calculated by the Sobel Edge Detection algorithm instead of sweeping through small angle steps as in the standard HT. This reduces the processing time for each image frame so that the architecture can be applied in real-time processing. Our design runs at 170MHz and the processing time per 1024x1024 frame is 6.17ms.
Index Terms: Hough Transform, Matlab Simulink, Xilinx System Generator, Look Up Table, PCIe.
I. INTRODUCTION
Lane detection is a particularly significant task in image processing and computer vision. It is applied in industry for vehicle indication and in Advanced Driver Assistance Systems (ADAS).
Although many alternative algorithms have been used in lane recognition, the Hough Transform method has found widespread use because it consistently handles issues such as non-contiguous lines, numerous broken lines, the impact of traffic, and interference caused by the environment. However, because of the numerous trigonometric calculations and the substantial memory requirements, this technique has a high computational cost and operates slowly in real-world situations. Implementing a lane detection system for driverless vehicles based on the Hough Transform method is thus still difficult.
In this paper, we propose a new Hough Transform architecture for straight lane detection that is better suited to FPGA implementation, in order to achieve a real-time lane detection system.
The paper is organized as follows. Section II reviews previously reported architectures. Sections III and IV discuss the proposed architecture and its FPGA implementation. Finally, Section V provides the testing and results of our architecture.
II. RELATED WORK
In previous studies, [2]–[3] built a hardware architecture for the parallel Hough Transform algorithm and used a LUT, which helped to reduce the computation time of the Hough Transform. Additionally, [2] proposed parallel computation and parallel voting to increase processing speed.
Building on [2], [3] further increases processing speed and reduces memory requirements by selecting the locations with the locally largest number of votes (local maxima).
In [4], the authors built a new algorithm for the Hough Parameter Space (HPS). This reduced the memory requirements compared to the standard Hough Transform while maintaining the accuracy of the detected lines and the overall pixel information.
Similar to [4], the authors of [5] modified the basic HT algorithm by using lane-specific properties, while also breaking the modified algorithm down into a collection of Hardware (HW) and Software (SW) components. Directly lowering the HT computation and memory usage significantly improves operational performance.
However, all reported solutions primarily optimized the HT and addressed the post-processing cost of Inverse Hough Transform (IHT) detection, which has not yet been implemented in real time. In this paper, we rebuild a Hough Transform architecture based on [4]–[5] to build a full system of image processing algorithms starting from the original video, while improving the execution speed and the processing resolution of the image frame. To this end, a lane detection system is designed and simulated in Matlab Simulink combined with the Xilinx System Generator [6] library to check the system's functionality. After testing the functionality, we create an HDL netlist for Vivado and encapsulate it in a Block Design so that it can be implemented with a PCIe interface.
III. PROPOSED ARCHITECTURE
As shown in Fig. 1, our proposed architecture has three parts:
• Matlab: includes Matlab scripts that load the image to be processed, set the parameters, perform pre-processing and post-processing, and display the results.
• Simulink and Xilinx System Generator: includes the modules designed with Matlab Simulink and the Xilinx System Generator tool, combined with the parameters declared in the Matlab part. All the modules that we design are described in subsections A to D.
• Hardware: the design is downloaded to a Virtex-7 FPGA, which uses a PCIe interface and executes in real time.
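As a software illustration of the voting scheme summarized in the abstract, the following Python sketch lets each edge pixel cast a single vote at its own gradient orientation, with sine and cosine values read from a precomputed table. It is a minimal sketch under stated assumptions, not the hardware design of this paper: the function names, the 180-step angle quantization, and the accumulator layout are all hypothetical choices made for illustration.

```python
import math

def build_trig_lut(n_theta=180):
    """Precompute cos/sin once for n_theta quantized angles in [0, pi)."""
    step = math.pi / n_theta
    return ([math.cos(i * step) for i in range(n_theta)],
            [math.sin(i * step) for i in range(n_theta)])

def hough_vote(edge_pixels, width, height, n_theta=180):
    """edge_pixels: iterable of (x, y, gradient_angle_in_radians).

    Assumption: the gradient angle is used directly as the line normal,
    so each pixel votes once instead of once per candidate angle.
    """
    cos_lut, sin_lut = build_trig_lut(n_theta)
    rho_max = math.ceil(math.hypot(width, height))
    # Rows index rho in [-rho_max, rho_max]; columns index quantized theta.
    acc = [[0] * n_theta for _ in range(2 * rho_max + 1)]
    for x, y, angle in edge_pixels:
        # Fold the gradient angle into [0, pi) and map it to a LUT index.
        t = min(int((angle % math.pi) / math.pi * n_theta), n_theta - 1)
        rho = round(x * cos_lut[t] + y * sin_lut[t])
        acc[rho + rho_max][t] += 1
    return acc, rho_max

def peak(acc, rho_max):
    """Return (votes, rho, theta_index) of the strongest accumulator cell."""
    return max((votes, r - rho_max, t)
               for r, row in enumerate(acc) for t, votes in enumerate(row))

# Ten pixels on the vertical line x = 5, all with a horizontal gradient.
pixels = [(5, y, 0.0) for y in range(10)]
acc, rho_max = hough_vote(pixels, width=16, height=16)
print(peak(acc, rho_max))  # (10, 5, 0)
```

Because every pixel contributes one vote rather than one per angle step, the number of accumulator updates in this sketch grows with the number of edge pixels only, which is the source of the speed-up described in the abstract.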
Fig. 3: Gray Scale block diagram
A. Masking module
The main responsibility of the Masking module is to tell subsequent modules which pixel regions of the image belong to lanes. This makes the follow-up processing simpler. According to our experiments, lane markings are usually white or yellow. To determine whether a pixel in the frame is white or yellow, the value of the pixel in each of the three color channels (Red, Green and Blue) is compared with an experimentally determined constant. If a pixel is likely to belong to a lane, its RGB value is converted to (255, 255, 255) via AND and Mux blocks that combine the three color channels. Fig. 2 shows the architecture of this Masking module.
[...] vertical direction of the template and then multiply the vertical direction by the horizontal direction of the template:
$$\text{vertical} = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}, \qquad \text{horizontal} = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$
• Cordic atan: calculates the gradient direction $G_{atan}$ and the gradient magnitude $G_{mag}$ according to Eq. (2) and Eq. (3) using the Xilinx CORDIC ATAN block.
$$G_{atan} = \operatorname{atan}(G_y / G_x) \quad (2)$$
$$G_{mag} = \sqrt{G_x^2 + G_y^2} \quad (3)$$
• Coordinate Counter: stores the coordinates (x, y) of the pixels.
• Compare: selects the appropriate Threshold (TH). If the new pixel value ≥ TH, the pixel is regarded as an image edge point. Additionally, we combine this with a Region of Interest (ROI) to find candidate pixels that are likely to belong to white or yellow lines in the image.
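The masking rule and the gradient steps above can be sketched in software as follows. This is a rough illustration under stated assumptions, not the Simulink/System Generator design: the color thresholds, the black output for non-lane pixels, the function names, and the pairing of the "vertical" template with $G_x$ and the "horizontal" template with $G_y$ are hypothetical choices, and `math.atan2` stands in for the CORDIC ATAN block. The ROI step is omitted.

```python
import math

# Templates as given in the text; using them for Gx and Gy is an assumption.
VERTICAL = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
HORIZONTAL = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def lane_mask(rgb, white_min=200, yellow_rg_min=180, yellow_b_max=120):
    """Map a likely white/yellow pixel to (255, 255, 255), anything else to black.

    All three thresholds are hypothetical placeholders for the
    experimentally chosen constants mentioned in the text.
    """
    r, g, b = rgb
    is_white = r >= white_min and g >= white_min and b >= white_min
    is_yellow = r >= yellow_rg_min and g >= yellow_rg_min and b <= yellow_b_max
    return (255, 255, 255) if (is_white or is_yellow) else (0, 0, 0)

def apply_template(gray, x, y, template):
    """Weighted sum of the 3x3 neighbourhood centred on (x, y)."""
    return sum(template[j][i] * gray[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))

def sobel_edges(gray, threshold):
    """Return (x, y, g_mag, g_atan) for interior pixels with g_mag >= threshold."""
    edges = []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            gx = apply_template(gray, x, y, VERTICAL)
            gy = apply_template(gray, x, y, HORIZONTAL)
            g_mag = math.hypot(gx, gy)    # Eq. (3)
            g_atan = math.atan2(gy, gx)   # Eq. (2), quadrant-aware variant
            if g_mag >= threshold:        # Compare step
                edges.append((x, y, g_mag, g_atan))
    return edges

# A 5x6 grayscale image with a vertical step edge between columns 2 and 3.
gray = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_edges(gray, threshold=100)
print(len(edges))                     # 6
print(sorted({e[0] for e in edges}))  # [2, 3]
```

In this sketch the (x, y) loop indices play the role of the Coordinate Counter, and the returned angle is what a voting stage would consume as θ.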
V. TESTING AND RESULTS
The hardware-based lane detection system for autonomous vehicles was created using the Xilinx System Generator and Matlab Simulink tools. We then generated the HDL code to run the Post Timing Implementation step. By synthesizing and implementing the Block Design with the Xilinx Vivado Design Suite for various image resolutions, we showed that it attains a clock frequency of 170MHz on a Virtex-7 VC707 board. The processing time for one pixel in our implementation is only 5.88ns. This result applies to a 1024x1024 pixel image. The size of the processed image was selected to achieve the best balance of HS accuracy (resolution), processing speed, and FPGA resource use. Table I lists the resources required to process video at a resolution of 1024x1024 pixels. The improved HPS memory needs only 33.06% of the FPGA's BRAM tiles. Furthermore, only 5.83% of the LUTs of the Virtex-7 VC707 are used, whereas 25% of the PCIe resources are used.
The model is then written through the PCIe link to the Virtex-7 VC707 FPGA board. Fig. 11 shows the Virtex-7 board connected to the host PC's PCIe connection.
TABLE I: FPGA resource requirements
Resource   Utilization   Available   Utilization %
LUT        17689         303600      5.83
LUTRAM     2820          130800      2.16
FF         19204         607200      3.16
BRAM       340.50        1030        33.06
DSP        2             2800        0.07
IO         4             700         0.57
GT         1             28          3.57
BUFG       7             32          21.88
MMCM       2             14          14.29
PCIe       1             4           25.00
The prototype system's three primary components make up its core circuit, which uses the FPGA to implement line detection in the input video sequence. The first component, implemented in Python, carries out the preparation for image edge extraction; it can read videos and change the size of the proposed architecture's IP. The second component is the FPGA, which receives the data and reads the results from the return table. Following post-processing, the system plots the detected coverage and shows the results on the screen. The proposed implementation was tested on numerous videos with different lighting and road scene conditions, including road type (urban street, highway), road condition, occlusion, poor line paint, day and night. The voting threshold was the same for all images. Some frames from videos under different conditions are shown in Fig. 12; by visual comparison, this figure shows that the implemented architecture successfully recognizes the straight lane lines.
The performance comparison of various HT implementations from the literature is shown in Table II. The frame rate is influenced by the architecture, the hardware platform being used, the size of the image and the number of symbols in it, as well as other factors. Since the reported works employ various resolutions, pre-processing techniques, and platforms, we adopt a normalized speed as the figure of merit. Studies [2] and [3] built processing systems for 1024x768 and 640x480 images with frequencies of 200MHz and 50MHz, respectively. In [4], the authors built a processing system for 640x480 images with an execution speed of 1.47ms/frame. In [5], the authors processed image frames of size 1024x1024 at a frequency of 145MHz, with a processing speed of 9.03ms/frame. The architecture we built handles the same input image size as [5]; the system achieves a frequency of 170MHz and a processing time of 6.17ms/frame.
TABLE II: Comparison of our work with different architectures
                              [2]                 [3]               [4]              [5]                 Our architecture
Image Resolution              1024x768            640x480           640x480          1024x1024           1024x1024
Fmax (MHz)                    200                 50                200              145                 170
Processing Speed (ms/frame)   5.4                 7.4               1.47             9.03                6.17
Normalized Speed (ns/pixel)   6.8                 24.08             4.78             8.61                5.88
Device                        Altera Stratix IV   Cyclone II FPGA   Virtex-5 ML505   Xilinx xc7z001-1    Virtex-7 VC707
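The normalized speed used as the figure of merit in Table II follows from dividing the frame time by the number of pixels per frame. The short Python sketch below reproduces the values for [5] and for our architecture, and compares the latter with the clock period at 170MHz. The function names are hypothetical, and reading the match with the clock period as roughly one pixel per clock cycle is our interpretation of the numbers rather than a statement taken from the measurements.

```python
def ns_per_pixel(ms_per_frame, width, height):
    """Normalized speed: frame time divided by the number of pixels."""
    return ms_per_frame * 1e6 / (width * height)

def clock_period_ns(f_mhz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e3 / f_mhz

print(f"{ns_per_pixel(6.17, 1024, 1024):.2f}")  # 5.88  (our architecture)
print(f"{ns_per_pixel(9.03, 1024, 1024):.2f}")  # 8.61  ([5])
print(f"{clock_period_ns(170):.2f}")            # 5.88  (one cycle at 170MHz)
```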
VI. CONCLUSIONS
In this paper, we presented a hardware architecture for
Matlab Simulink and Xilinx System Generator-based real-time