
Technical Report · July 2021 · DOI: 10.13140/RG.2.2.26045.18406


Autonomous Driving Software for FS-AI 2021
Ain Shams University, ASU Racing Team

Michael Hannalla, Haidy Sorial, Ahmed Tarek, Salma Ibrahim, Ahmed Hesham, Islam Ayman,
Laila Elgenedi, Abdelrahman Ayman, Ahmed Samy, Omar Fadl, Kareem Alsawah,
Aly Alkady, Nesma Walid, Hassan Amr, Moataz Elashery

Abstract—The field of self-driving cars is a very active area of research that impacts and benefits everyday life on roads. This document describes the phases our team went through to develop an autonomous vehicle to participate and compete in the 2021 Formula Student AI competition. We include the state-of-the-art schemes we studied and the diverse methods we used to approach our problem.

DISCLAIMER
This paper is written as an autonomous design report to be submitted to the FS-AI judges in the Formula Student competition. Its content may mention trademarks and manufacturers of sensors, computers, or software. These manufacturers do not supervise or endorse any of the content in this paper.

I. INTRODUCTION
The aim of this work is to provide state-of-the-art approaches to the problems facing autonomous driving, starting from perception that generalizes to most weather conditions and is robust to the scale of the image's semantic content, through to providing the car with conscious strategic maneuvers when required. The work is tested against standard benchmarks for measuring the performance of self-driving vehicles, in partnership with industrial pioneers in the field of autonomous vehicles.
II. SYSTEM OVERVIEW
This section provides an overview of the hardware and sensors used and of the high-level architecture of our autonomous system, from the highest level of perception down to the lowest level of control and the commands sent to the ADS-DV¹ via the CAN bus. Fig. 1 shows the high-level architecture of our software. We use a Velodyne VLP-16 LiDAR, a FLIR Blackfly S monocular camera, and a ZED stereo camera for perception, along with an IMU, a GPS, and feedback from wheel and steering encoders for state estimation. The perception sensors are used in different pipelines for cone detection and drivable space estimation. State estimates and cone detections are fed to the SLAM module, which fuses them to output an optimal state estimate and a map. The map and pose estimate are then taken by the planning module to output waypoints, and the navigation module sends control commands via the CAN bus to track and follow these waypoints.
All of our software is based on ROS running on Ubuntu. Our computing devices are the prefitted InCarPC and an NVIDIA Jetson TX2. The computers and sensors are connected over an Ethernet network so they can communicate with each other.

¹ ADS-DV is the shared formula vehicle that FS-AI DDT (Dynamic Driving Task) competitors use to deploy their autonomous systems on during the competition. This vehicle has a prefitted ZED camera and an InCarPC computer.

III. PERCEPTION
This section demonstrates the different cone detection pipelines and utilities used in our system, covering both laser-based and vision-based methods.

A. LiDAR Pointcloud Pre-processing
1) Field of View Trimming: Due to the LiDAR's placement on the vehicle and the wide surroundings, there are many irrelevant points that we need to discard. We start by applying pass-filters to trim the field of view: a longitudinal filter to remove the vehicle's own points and anything far away, and a lateral one to limit the side field.
2) Adaptive Ground Removal: After applying the FOV trimming, we need to filter out the ground points. We apply an adaptive algorithm that acclimates to changes in the inclination of the ground: a RANSAC algorithm segments the ground plane from the cone points, and the points belonging to the fitted plane are removed, leaving us with the non-ground points.
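A minimal sketch of these two pre-processing steps in Python with NumPy, assuming the scan is an N×3 array in the LiDAR frame; the field-of-view limits, iteration count, and inlier threshold are illustrative values rather than the ones tuned on the vehicle, and the plain RANSAC below stands in for the adaptive variant described above.

```python
import numpy as np

def trim_fov(cloud, x_range=(1.0, 20.0), y_range=(-6.0, 6.0)):
    """Longitudinal and lateral pass-filters (illustrative limits)."""
    keep = ((cloud[:, 0] > x_range[0]) & (cloud[:, 0] < x_range[1]) &
            (cloud[:, 1] > y_range[0]) & (cloud[:, 1] < y_range[1]))
    return cloud[keep]

def remove_ground_ransac(cloud, n_iters=100, dist_thresh=0.05, rng=None):
    """Fit the dominant plane with RANSAC and drop the points close to it."""
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(cloud), dtype=bool)
    for _ in range(n_iters):
        p0, p1, p2 = cloud[rng.choice(len(cloud), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                       # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((cloud - p0) @ normal)  # point-to-plane distances
        inliers = dist < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return cloud[~best_inliers]               # keep only non-ground points

# non_ground = remove_ground_ransac(trim_fov(raw_cloud))
```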
3) Clustering: For point cloud clustering, we chose the Euclidean distance clustering algorithm, using a K-D tree as the base spatial locator class for nearest-neighbor estimation. The approach subdivides the space into boxes of fixed widths or, in the more general case, into an octree data structure. This structure is very fast to build and gives us a useful representation of the data in every resultant 3D box.
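The sketch below shows one way to realize this step with SciPy's cKDTree, mirroring the region-growing behaviour of a Euclidean cluster extraction; the cluster tolerance and the size bounds are illustrative placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

def euclidean_clusters(points, tol=0.3, min_size=3, max_size=500):
    """Greedy Euclidean clustering: grow each cluster by K-D tree radius
    searches until no unvisited neighbour lies within `tol` metres."""
    tree = cKDTree(points)
    visited = np.zeros(len(points), dtype=bool)
    clusters = []
    for seed in range(len(points)):
        if visited[seed]:
            continue
        frontier, members = [seed], []
        visited[seed] = True
        while frontier:
            idx = frontier.pop()
            members.append(idx)
            for nb in tree.query_ball_point(points[idx], r=tol):
                if not visited[nb]:
                    visited[nb] = True
                    frontier.append(nb)
        if min_size <= len(members) <= max_size:
            clusters.append(points[members])
    return clusters  # list of (Mi, 3) arrays, one per candidate cone
```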
4) Cone Reconstruction: The next reasonable step is to reconstruct the obtained clusters, as carrying out the ground removal causes the loss of some cone points along with the ground points. Unfortunately, this dwindles the already modest number of points used in cone detection. This is solved by restoring, around each cluster, a cylinder-shaped surface of points with a diameter equal to the cone width, using points from the antecedent point cloud (before ground removal). This way, we can ensure that most of the mistakenly removed points are retrieved, so our detection and colour pattern estimation processes can proceed smoothly and more accurately.
Fig. 1. High-level software architecture

5) Filtration: After the cluster reconstruction, each cluster is passed through two filters to ascertain that all the clusters used for detection are truly cones: the Rule-Based Filter and the Z-Centroid Filter. The former depends on calculating the expected number of points in a cone according to its distance from the vehicle, the cone dimensions, and the LiDAR specifications, then comparing this value to the actual number of points in the cluster. If the difference between the two numbers does not exceed a certain threshold, the cluster passes on to the latter filter. Since all cones are of equal dimensions, they should all have the same centroid height. However, the variance in their distances from the vehicle results in a certain range for the centroid values instead of one specific distinct number, and this filter ensures that all cone cluster centroids lie within that range. Otherwise, the cluster is not regarded as a cone and does not undergo the colour estimation process.
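A sketch of how both filters could look; the expected-point-count formula below is a common geometric approximation based on the sensor's angular resolutions, and every constant (cone size, resolutions, thresholds, height band) is an illustrative assumption rather than the team's calibrated rule.

```python
import numpy as np

# Illustrative cone and sensor parameters (not calibrated values).
CONE_H, CONE_W = 0.325, 0.228       # cone height and width in metres
RES_V = np.radians(2.0)             # vertical angular resolution
RES_H = np.radians(0.2)             # horizontal angular resolution

def expected_points(dist):
    """Approximate number of LiDAR returns a cone should yield at range
    `dist`, from the count of vertical and horizontal beams hitting it."""
    rows = CONE_H / (2.0 * dist * np.tan(RES_V / 2.0))
    cols = CONE_W / (2.0 * dist * np.tan(RES_H / 2.0))
    return 0.5 * rows * cols        # 0.5 roughly accounts for the taper

def rule_based_filter(cluster, ratio_thresh=0.5):
    """Compare the cluster's point count with the expected count at range."""
    dist = np.linalg.norm(cluster.mean(axis=0)[:2])
    exp = expected_points(dist)
    return abs(len(cluster) - exp) <= ratio_thresh * exp

def z_centroid_filter(cluster, z_min=0.05, z_max=0.40):
    """Cone centroids should lie within a fixed height band above ground."""
    return z_min <= cluster[:, 2].mean() <= z_max
```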
B. Image Object Detection
As the tracks are delineated by blue, yellow, and different-sized orange cones, our goal is to localize the cones and differentiate between their colours with high accuracy and low latency for safe maneuvering through the track. We tackle the detection challenge using the state-of-the-art YOLOv3 architecture [1] [2]. To make the model fit our specific problem, we train it on an open-source racing-cones dataset, which we modified to also differentiate between the different cone colours.

C. Mono-LiDAR Cone Detection Pipeline
In this pipeline we fuse two sensors, a camera and a LiDAR. The LiDAR as a stand-alone sensor gives accurate depth and localization, while the camera as a stand-alone sensor gives accurate colors. By fusing the two, we leverage the measurements coming from each sensor individually. We mainly rely on projecting the LiDAR points onto the camera's image plane to get both the pose of the cones and their colors.

Let $^{L}C$ denote the incoming clusters from the LiDAR pre-processing step, stacked as a matrix of column vectors with each column $^{L}C_i$ being the $i$-th cluster; the clusters are given in the LiDAR's coordinate frame, and $N$ is the number of points in the point cloud. We then transform the points from the LiDAR frame to the camera frame using a homogeneous transformation:

$$^{C}C = {}^{C}T_{L}\,{}^{L}C$$

where $^{C}C$ are the clusters expressed in the camera's coordinate frame and $^{C}T_{L} \in SE(3)$ is the homogeneous transformation between the LiDAR frame and the camera frame.
We then project these points from the 3D camera frame onto the image plane as follows (subject to matrix broadcasting and vectorization):

$$\begin{bmatrix} U \\ V \end{bmatrix} = \frac{1}{Z}\begin{bmatrix} f_x & 0 \\ 0 & f_y \end{bmatrix}\begin{bmatrix} X \\ Y \end{bmatrix} + \begin{bmatrix} c_x \\ c_y \end{bmatrix}$$

where $U$ is a stacked row vector of the u-coordinates of the clusters projected on the image plane, $V$ is the corresponding stacked row vector of v-coordinates, $f_x$ and $f_y$ are the x-axis and y-axis focal lengths, and $c_x$ and $c_y$ are the u-coordinate and v-coordinate of the center pixel of the image plane.

Since we already have both the bounding box coordinates (from the object detector module) and the projected LiDAR clusters on the image plane, we use the bounding box coordinates to find which LiDAR cluster belongs to each bounding box. At this point, we have the 3D coordinates of the object defined inside each bounding box.
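A minimal sketch of the projection and the box-to-cluster matching, assuming a known extrinsic matrix `T_cam_lidar` and pinhole intrinsics; matching a box to the cluster whose projected centroid falls inside it is one simple realization of the association described above, not necessarily the exact rule used on the vehicle.

```python
import numpy as np

def project_to_image(points_lidar, T_cam_lidar, fx, fy, cx, cy):
    """Transform LiDAR points (N, 3) into the camera frame, then apply the
    pinhole model to obtain pixel coordinates (N, 2) and depths (N,)."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]      # X, Y, Z in camera frame
    z = pts_cam[:, 2]
    u = fx * pts_cam[:, 0] / z + cx
    v = fy * pts_cam[:, 1] / z + cy
    return np.stack([u, v], axis=1), z

def clusters_in_box(projected_centroids, box):
    """Indices of clusters whose projected centroid lies inside a detector
    bounding box given as (u_min, v_min, u_max, v_max)."""
    u, v = projected_centroids[:, 0], projected_centroids[:, 1]
    inside = (u >= box[0]) & (u <= box[2]) & (v >= box[1]) & (v <= box[3])
    return np.flatnonzero(inside)
```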
D. Stereo Cone Detection Pipeline
In order to localize cones in the world frame, this pipeline relies on two main components: the cones' bounding boxes and the disparity map. Since the bounding boxes, along with their colours, are already obtained from the object detection module, this section focuses on obtaining the 3D coordinates of the cones.
We start by rectifying the left and right images to remove any distortion. Disparity matching is then computed with reference to the left frame, with a confidence score for every disparity pixel. For every cone we obtain a disparity region by indexing the disparity map with the cone's bounding box coordinates. The disparity region may contain different values for the cone, so we choose the pixel with the highest confidence score as the most reliable one. Using the disparity value, the bounding box center coordinates, and the intrinsic camera parameters, we calculate the cone's 3D position using the stereopsis equations.
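The stereopsis step reduces to the standard depth-from-disparity relations for a rectified stereo pair; the sketch below assumes the left-camera intrinsics and the stereo baseline are known.

```python
def cone_position_from_disparity(u, v, disparity, fx, fy, cx, cy, baseline):
    """Back-project the bounding-box centre pixel (u, v) of the left image
    into 3D using the depth implied by the chosen disparity value.
    Returns (X, Y, Z) in the left camera frame; baseline is in metres."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    Z = fx * baseline / disparity    # depth along the optical axis
    X = (u - cx) * Z / fx            # lateral offset
    Y = (v - cy) * Z / fy            # vertical offset
    return X, Y, Z
```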
IV. STATE ESTIMATION

A. Motion Estimation
A standard extended Kalman filter is used to predict and correct the system states. The proposed state vector is $x = [P_x, P_y, \psi, v_x, v_y, \dot{\psi}]^T \in \mathbb{R}^6$, where $P_x$ and $P_y$ are the vehicle's position, $v_x$ and $v_y$ are the longitudinal and lateral velocities respectively, and $\psi$ is the vehicle's yaw angle.

In our proposed system, a constant acceleration process model is used, since the jerk is close to a zero-mean Gaussian distribution. Hence, the positions and velocities are propagated using the acceleration and yaw rate from the IMU, resulting in the following process model:

$$\begin{aligned}
[\dot{v}_x, \dot{v}_y]^T &= [a_x, a_y]^T + \dot{\psi}\,[v_y, -v_x]^T + n_v \\
\ddot{\psi} &= n_{\dot{\psi}} \\
\dot{a} &= n_a
\end{aligned} \tag{1}$$

For the correction step of the EKF, multiple sensors were used to correct the state estimates. GPS was used to correct the position as well as the local longitudinal and lateral velocities of the vehicle, with the following measurement model:

$$\begin{bmatrix} P_x^{gps} \\ P_y^{gps} \\ v_x^{gps} \\ v_y^{gps} \end{bmatrix} =
\begin{bmatrix} P_x \\ P_y \\ v_x\cos\psi - v_y\sin\psi \\ v_x\sin\psi + v_y\cos\psi \end{bmatrix} + n_{gps} \tag{2}$$

A magnetometer was used to correct the vehicle yaw estimate. Before correction, a calibration is performed for a variable duration to register the initial heading measurement by averaging all the readings within this duration; this average is used as a reference for later estimates, ensuring that the magnetometer heading is always zero at the starting point.

LiDAR scans were also used to obtain odometry through an Advanced LOAM implementation based on the paper [3] by J. Zhang and S. Singh. Additionally, visual odometry from the ZED stereo camera was used to correct the vehicle's position and yaw angle, providing overall a reliable and redundant system for estimating the vehicle states.

In order to achieve sensor redundancy and protect the autonomous system from sensor failure, wheel encoders were used to correct the local longitudinal and lateral vehicle velocities as well as the yaw rate, through an implementation of the tire slip model [4] with a constant-slip assumption.
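A minimal sketch of the mean propagation for the process model of Eq. (1) and the predicted GPS measurement of Eq. (2); covariance propagation, the Kalman gain, and the remaining measurement models are omitted, and the state layout follows the vector defined above.

```python
import numpy as np

def predict_mean(x, imu_accel, dt):
    """Propagate x = [Px, Py, psi, vx, vy, psi_dot] one step using the
    constant-acceleration model of Eq. (1); imu_accel = (ax, ay) is the
    body-frame acceleration measured by the IMU."""
    Px, Py, psi, vx, vy, psi_dot = x
    ax, ay = imu_accel
    Px_n  = Px + (vx * np.cos(psi) - vy * np.sin(psi)) * dt
    Py_n  = Py + (vx * np.sin(psi) + vy * np.cos(psi)) * dt
    psi_n = psi + psi_dot * dt
    vx_n  = vx + (ax + psi_dot * vy) * dt     # first row of Eq. (1)
    vy_n  = vy + (ay - psi_dot * vx) * dt
    return np.array([Px_n, Py_n, psi_n, vx_n, vy_n, psi_dot])

def gps_measurement(x):
    """Expected GPS measurement h(x) from Eq. (2)."""
    Px, Py, psi, vx, vy, _ = x
    return np.array([Px, Py,
                     vx * np.cos(psi) - vy * np.sin(psi),
                     vx * np.sin(psi) + vy * np.cos(psi)])
```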
B. SLAM
As our system has to navigate autonomously in an unseen environment, we do not know the shape or length of the race track in advance. This leads us to deploy SLAM in our system: it needs to map the cones of the track and estimate the vehicle position within these landmarks of interest (the cones). Our algorithm is based on the FastSLAM 2.0 algorithm proposed in [5], a probabilistic approach to SLAM based on the concept of Rao-Blackwellization, i.e., splitting the SLAM problem into a pose estimation problem and independent landmark estimation using N Kalman filters, one per landmark as its position state estimator, where N is the number of mapped landmarks (cones).

Due to the challenges imposed by FS-AI, we modified the base FastSLAM algorithm from the literature to suit our needs and requirements.

First, we made our landmark estimator estimate not only the landmark's position but also its color (for example, yellow cone or blue cone in our case) and account for perception uncertainties. The landmark's color matters in our system because
we then pass this map to the planning module, which decides which path to follow.

Also, since the main task of SLAM is to provide correct estimates of landmarks and poses, the vehicle should be able to decide whether its perception system is seeing a previously mapped landmark again. We use the Mahalanobis distance metric to decide whether a detection is a new landmark or a previously mapped landmark being observed again; we chose the Mahalanobis distance in order to account for measurement uncertainties.
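A short sketch of Mahalanobis-distance data association; the chi-square gate of 9.21 (the 99% bound for a 2-D measurement) is an illustrative choice, and each landmark is assumed to expose its predicted measurement and innovation covariance.

```python
import numpy as np

def mahalanobis_sq(z, z_hat, S):
    """Squared Mahalanobis distance between measurement z and the predicted
    landmark measurement z_hat with innovation covariance S."""
    innov = z - z_hat
    return float(innov @ np.linalg.solve(S, innov))

def associate(z, landmarks, gate=9.21):
    """Return the index of the best-matching landmark, or None when every
    candidate lies outside the gate (the detection is a new landmark).
    `landmarks` is a list of (z_hat, S) pairs."""
    best_idx, best_d = None, gate
    for i, (z_hat, S) in enumerate(landmarks):
        d = mahalanobis_sq(z, z_hat, S)
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx
```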
For loop closure, we exploit the prior knowledge that the race track starts and ends with a special type of cone (specified in the FS-AI rule book). Once we detect these cones and localize them, we know with full certainty that this is where the vehicle's zero position started. This triggers our loop closure algorithm to optimize and refine the map, which had been incrementally drifting before loop closure due to odometry and measurement inaccuracies.

As the vehicle moves along the track for the first time, we take the outputs of FastSLAM and construct a pose graph that gets optimized once the loop closure conditions are met. We use a pose graph optimization algorithm to optimize the map after loop closure. This optimized map is then used in subsequent race laps (in challenges that have two or more laps): the vehicle only localizes within it and does not perform map updates, thus shutting down the SLAM process and performing localization only.

The optimized map is then used by a Monte Carlo localization algorithm to determine the vehicle's location within this map. This gives us a faster pose estimate update rate, which in turn allows faster navigation.

V. PATH PLANNING
Our path planning problem is defined as finding suitable goal way-points within the track so that the navigation pipeline can move the vehicle correctly.
The path planning module has two inputs: the map (the cones' xy coordinates and colours in the global frame) and the vehicle pose (xy coordinates and heading angle), also in the global frame. In each time-step, the input map consists of all cones seen by the vehicle in this time-step and all cones seen in previous time-steps. Thus, preprocessing and filtering of this map is necessary.
A. Map Preprocessing
First, the map is transformed into the local frame of the vehicle. Then, only the cones within a specific distance and position from the vehicle are chosen, so the planning process is only done on cones that are near the vehicle and lie within a specific x-coordinate range in the vehicle's local frame. Those cones are then transformed back into the global frame and separated by colour (yellow, blue, etc.).
B. Delaunay Triangulation
The preprocessed cones are then given to a triangulation meshing technique called Delaunay triangulation, which is used to extract the midpoints between all of these cones. The Delaunay triangulation is also used to construct an edges matrix, a matrix consisting of the pair of cones that makes each midpoint. The midpoints matrix and the edges matrix are kept synchronized and are used in both the cost function and the constraints.
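A compact sketch of this step using scipy.spatial.Delaunay; the function name is hypothetical, and the input is assumed to be an (N, 2) array of cone positions in the frame chosen by the preprocessing step.

```python
import numpy as np
from scipy.spatial import Delaunay

def midpoints_and_edges(cones_xy):
    """Triangulate the cone positions and return the midpoint of every unique
    triangulation edge together with the cone-index pair that generated it,
    so midpoints[k] corresponds to edges[k]."""
    tri = Delaunay(cones_xy)
    edge_set = set()
    for a, b, c in tri.simplices:                 # each triangle gives 3 edges
        for i, j in ((a, b), (b, c), (a, c)):
            edge_set.add((min(i, j), max(i, j)))
    edges = np.array(sorted(edge_set))
    midpoints = 0.5 * (cones_xy[edges[:, 0]] + cones_xy[edges[:, 1]])
    return midpoints, edges
```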
or only cones from the same colour, the algorithm will not
B. Delaunay Triangulation function correctly because of its constraints and because of the
The preprocessed cones are then given to a triangulation meshing constraints. Thus, all of these edge cases are handled
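A sketch of how the weighted cost could be assembled and the best way-points selected; the weights, the number of returned midpoints, and the assumption that the Voronoi term arrives as a precomputed 0/1 vector are all illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))            # shifted for numerical stability
    return e / e.sum()

def rank_waypoints(midpoints, vehicle_xy, vehicle_yaw, voronoi_cost,
                   w_d=1.0, w_h=1.0, w_v=1.0, n_best=3):
    """Score every midpoint with the weighted cost of Section V-D and return
    the n_best lowest-cost midpoints and their costs."""
    rel = midpoints - vehicle_xy
    dist = np.linalg.norm(rel, axis=1)
    ang = np.arctan2(rel[:, 1], rel[:, 0]) - vehicle_yaw
    heading = np.abs((ang + np.pi) % (2 * np.pi) - np.pi)   # wrap to [0, pi]
    cost = w_d * softmax(dist) + w_h * softmax(heading) + w_v * voronoi_cost
    order = np.argsort(cost)
    return midpoints[order[:n_best]], cost[order[:n_best]]
```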
E. Constraints
Only one constraint is applied, using the edges matrix. As a result of the track layout, a way-point cannot lie between two blue cones or between two yellow cones: any midpoint between two cones of the same colour is either outside the track or on the track boundary. Thus, all such midpoints are discarded.

F. Edge Cases
If the preprocessing of the cones outputs fewer than three cones, or only cones of the same colour, the algorithm will not function correctly because of its constraints and because of the meshing constraints. All of these edge cases are therefore handled to make sure the planning process runs smoothly.
VI. NAVIGATION CONTROL
We implemented geometric control for the first lap, while mapping; after the first lap, the vehicle can speed up using model predictive control (dynamic control) to maximize speed and test its boundaries. We participate in the DDT category, and the ADS-DV has two control modes: speed control is used with the geometric controller, while torque control is used with the model predictive controller.

A. Geometric Control
We use PID control for longitudinal control [6] and pure pursuit or Stanley for lateral control [7].
1) Pure Pursuit: A simple lateral control algorithm whose origins date back to the pursuit of a target by a missile. It simply aligns the heading of the vehicle towards a target point at a set look-ahead distance on the path.
2) Stanley: We used a Stanley controller for lateral control, as it tries to minimize the cross-track error (the distance from the given path) and the heading error (the difference between the current heading and the heading of the path tangent).
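The two lateral laws in their textbook form, as a sketch rather than the team's exact implementation; the gain, softening constant, and steering limit are illustrative values.

```python
import numpy as np

def stanley_steering(heading_error, cross_track_error, v,
                     k=1.0, softening=1.0, max_steer=np.radians(25)):
    """Stanley law: correct the heading error plus a term that steers toward
    the path in proportion to the cross-track error and inversely to speed."""
    delta = heading_error + np.arctan2(k * cross_track_error, v + softening)
    return float(np.clip(delta, -max_steer, max_steer))

def pure_pursuit_steering(alpha, lookahead, wheelbase):
    """Pure pursuit law: alpha is the angle between the vehicle heading and
    the line to the target point at distance `lookahead` along the path."""
    return float(np.arctan2(2.0 * wheelbase * np.sin(alpha), lookahead))
```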
B. Model Predictive Control
1) Problem Formulation: We need to convert our optimal control problem (OCP) into a non-linear program (NLP) in order to solve it and obtain the minimum over our optimization variables, using multiple shooting with Runge-Kutta 4 discretization.

Cost Function
The running (stage) cost is the tracking of the reference path saved after the first lap:

$$\ell(x, u, \Delta u) = \|x_u - x_r\|^2_Q + \|u - u_r\|^2_R + \|\Delta u - \Delta u_r\|^2_{R_{rate}} \tag{3}$$

Constraints
We have a set of constraints limiting the states and controls, in addition to the equality constraints on the dynamics of the vehicle to avoid non-continuous states. We also added the track boundaries as an inequality constraint to avoid leaving the track; the steering angle limit, for example, represents the physical limit of the steering system [8].

$$\underset{u,x}{\text{minimize}}\;\; J_N(x_0, u) = \sum_{k=0}^{N-1} \ell\big(x_u(k), u(k), \Delta u(k)\big) \tag{4}$$

$$\text{subject to:}\quad
\begin{aligned}
& x_u(k+1) = f(x_u(k), u(k)) \\
& (X_k - X_{cen,k})^2 + (Y_k - Y_{cen,k})^2 \le R^2_{Track,k} \\
& x_u(0) = x_0 \\
& u(k) \in U, \;\forall k \in [0, N-1] \\
& x_u(k) \in X, \;\forall k \in [0, N]
\end{aligned} \tag{5}$$

2) Prediction Model: Three models were used: a kinematic model, a dynamic model, and a fusion of both models according to the vehicle's velocity $v_x$, to obtain more robust control actions at both low and high speeds.

$$\dot{\tilde{x}} = \lambda \tilde{f}_{dyn}(\tilde{x}, \tilde{u}) + (1 - \lambda)\,\tilde{f}_{kin}(\tilde{x}, \tilde{u}, \dot{\tilde{u}}) = \tilde{f}(\tilde{x}, \tilde{u}, \dot{\tilde{u}})$$
$$\lambda = \min\left(\max\left(\frac{v_x - v_{x,\mathrm{blend\,min}}}{v_{x,\mathrm{blend\,max}} - v_{x,\mathrm{blend\,min}}},\, 0\right),\, 1\right) \tag{6}$$

By using our fusion factor $\lambda$, we guarantee that we get the best performance of both the kinematic and dynamic models [9].
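A minimal sketch of the blending of Eq. (6); the blend speeds are illustrative placeholders, and the two model functions are assumed to return state derivatives as NumPy arrays.

```python
import numpy as np

def blend_factor(vx, v_blend_min=3.0, v_blend_max=5.0):
    """Fusion factor lambda of Eq. (6); the blend speeds are illustrative."""
    lam = (vx - v_blend_min) / (v_blend_max - v_blend_min)
    return float(np.clip(lam, 0.0, 1.0))

def blended_derivative(x, u, u_dot, f_dyn, f_kin, vx):
    """Blend the dynamic and kinematic model derivatives as in Eq. (6)."""
    lam = blend_factor(vx)
    return lam * f_dyn(x, u) + (1.0 - lam) * f_kin(x, u, u_dot)
```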
ACKNOWLEDGMENT
This project is supported by the Autotronics Research Lab, Ain Shams University, Egypt, and Embotech.

REFERENCES
[1] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[2] M. Bjelonic, "YOLO ROS: Real-time object detection for ROS," github.com/leggedrobotics/darknet_ros, 2016–2018.
[3] J. Zhang and S. Singh, "LOAM: Lidar odometry and mapping in real-time," in Robotics: Science and Systems Conference (RSS), pp. 109–111, 2014.
[4] V. Fors, Autonomous Vehicle Maneuvering at the Limit of Friction. Linköping University Electronic Press, 2020, vol. 2102.
[5] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges," in Proc. IJCAI Int. Joint Conf. Artif. Intell., 2003.
[6] K. H. Ang, G. Chong, and Y. Li, "PID control system analysis, design, and technology," IEEE Transactions on Control Systems Technology, vol. 13, no. 4, pp. 559–576, 2005.
[7] J. Snider, "Automatic steering methods for autonomous automobile path tracking," 2011.
[8] M. Dawood, M. Abdelaziz, M. Ghoneima, and S. Hammad, "A nonlinear model predictive controller for autonomous driving," 2020, pp. 151–157.
[9] J. Kabzan, M. I. Valls, V. J. Reijgwart, H. F. Hendrikx, C. Ehmke, M. Prajapat, A. Bühler, N. Gosala, M. Gupta, R. Sivanesan et al., "AMZ driverless: The full autonomous racing system," Journal of Field Robotics, vol. 37, no. 7, pp. 1267–1294, 2020.

