
A Simulated Autonomous Car

Iain David Graham Macdonald

Master of Science

School of Informatics

University of Edinburgh

2011
Abstract
This dissertation describes a simulated autonomous car capable of driving on urban-
style roads. The system is built around TORCS, an open source racing car simulator.
Two real-time solutions are implemented: a reactive prototype using a neural
network, and a more complex deliberative approach using a sense, plan, act
architecture. The deliberative system uses vision data fused with simulated laser
range data to reliably detect road markings. The detected road markings are then
used to plan a parabolic path and compute a safe speed for the vehicle. The vehicle
uses a simulated global positioning/inertial measurement sensor to guide it along the
desired path with the throttle, brakes, and steering being controlled using
proportional controllers. The vehicle is able to reliably navigate the test track,
maintaining a safe road position at speeds of up to 40km/h.
Acknowledgements

I would like to thank all of the lecturers who have taught me over the past year, each
of whom contributed to this thesis in some way. Particular thanks must go to my
supervisor, Prof. Barbara Webb, for agreeing to supervise this project and for her
advice and encouragement throughout, and to Prof. Bob Fisher for his many useful
suggestions.
Declaration

I declare that this thesis was composed by myself, that the work contained herein is
my own except where explicitly stated in the text, and that this work has not been
submitted for any other degree or professional qualification except as specified.

(Iain David Graham Macdonald)


Table of Contents

CHAPTER 1 INTRODUCTION
1.1 PURPOSE
1.2 MOTIVATION
1.3 OBJECTIVES
1.4 DISSERTATION OUTLINE

CHAPTER 2 BACKGROUND
2.1 INTRODUCTION
2.1.1 Motivation
2.1.2 A Brief History of Autonomous Vehicles
2.2 THE URBAN CHALLENGE
2.2.1 The Challenge
2.3 BOSS
2.3.1 Route Planning
2.3.2 Intersection Handling
2.4 JUNIOR
2.4.1 Localisation
2.4.2 Obstacle Detection
2.5 ODIN
2.5.1 Path Planning
2.5.2 Architecture
2.6 DISCUSSION
2.7 CONCLUSION

CHAPTER 3 SIMULATION SYSTEM
3.1 ARCHITECTURE
3.2 TRACK SELECTION
3.3 CAR SELECTION

CHAPTER 4 REACTIVE PROTOTYPE
4.1.1 Image Processing
4.1.2 Training
4.1.3 Results
4.1.4 Evaluation

CHAPTER 5 DELIBERATIVE APPROACH
5.1 SUMMARY
5.2 GROUND TRUTH DATA
5.3 SENSING
5.3.1 Some Initial Experiments
5.3.2 The MIT Approach
5.3.3 Road Geometry Modelling
5.3.4 Lane Marking Verification
5.3.5 Lane Marking Classification
5.4 PLANNING
5.4.1 Trajectory Calculation
5.4.2 Speed Selection
5.5 ACTING
5.5.1 Speed Control
5.5.2 Steering Control

CHAPTER 6 EVALUATION
6.1 LANE MARKING DETECTION AND CLASSIFICATION
6.2 TRAJECTORY PLANNING
6.2.1 Generation of Trajectory Points
6.2.2 Flat Ground Assumption
6.2.3 Non-continuous Path
6.2.4 Look-ahead Distance
6.3 PHYSICAL PERFORMANCE
6.3.1 Path Following
6.3.2 Speed in Bends
6.3.3 G Force Analysis
6.3.4 Maximum Speed
6.4 REAL-TIME PERFORMANCE

CHAPTER 7 CONCLUSION
7.1 SUMMARY
7.2 FUTURE WORK AND CONCLUSION

BIBLIOGRAPHY
Chapter 1 Introduction

"The weak point of the modern car is the squidgy organic bit behind the
wheel." Jeremy Clarkson

1.1 Purpose
This dissertation describes a simulated autonomous car capable of navigating
urban-style roads at variable speeds whilst staying in lane. The simulation uses
TORCS, an open source racing car simulator which is known for its accurate vehicle
dynamics [23]. Two solutions to the problem are provided: firstly, a reactive
approach using a neural network and secondly, a deliberative approach inspired by
the recent DARPA Urban Challenge with separate sensing, planning, and control
stages. In particular, the latter system fuses vision and simulated laser range data to
reliably detect road markings to guide the vehicle.

1.2 Motivation
The recent DARPA Grand Challenges and the work of teams such as Tartan Racing
[8] and Stanford Racing [9] have demonstrated that the goal of fully autonomous
vehicles may be within reach. The development of such vehicles has the potential to
save many of the thousands of lives that are lost in collisions every year [1].

Due to the need for autonomous vehicles to interact in an environment populated by
human drivers and pedestrians, there is a clear safety risk in the development process.
This risk can act as a barrier to development as any autonomous vehicle must prove a
sufficient level of safety before it is able to enter such an environment. One approach
to this problem, as seen in the DARPA challenges, is to create controlled
environments that are representative of the intended operating environment. Whilst
the benefits of this approach are clear, the logistics and the expense of organising
these events mean that they are likely to remain rare.

Another approach is to simulate the environment thereby eliminating the safety risk
and reducing costs. Simulations have the additional benefit of being able to test
performance under unusual circumstances and allow algorithms to be optimised and
improved. Such simulations were an essential part of the development process for the
entrants in the DARPA events.

However, the development of accurate simulations of complex systems and
environments is a non-trivial task and may be beyond the budgets of many
researchers. This project, therefore, explores the low-cost alternative of using an
open source racing simulator to develop a simulated autonomous car.

1.3 Objectives
The goal of this project was to develop an simulated autonomous vehicle capable of
driving on urban style roads. The vehicle must be capable of navigating around a test
track in a safe and controlled manner. Specifically, the vehicle must remain in the
correct lane and drive at an appropriate speed. Although the environment is
simulated, the intention is to approach the project as though the vehicle is real. With
that in mind, the vehicle should only make use of information that would be available
in the real-world, and the system must run in real-time. The project looks to the
recent DARPA Urban Challenge for inspiration.

1.4 Dissertation Outline


The remainder of this document is structured as follows: Chapter 2 provides a brief
history of autonomous vehicle research and discusses some of the techniques used by
state of the art vehicles. Chapter 3 describes the simulator and system architecture.
Chapter 4 describes a sub-project which was undertaken to establish the feasibility of
the main project. Here, a neural network is used to control a simulated autonomous
vehicle. Chapter 5 forms the main body of the dissertation and describes a
deliberative approach using image processing, data fusion, planning, and control
techniques to solve the problem. Chapter 6 provides the results of experiments
performed on the completed system as an evaluation. Finally, Chapter 7 offers a
summary of the work undertaken, conclusions, and suggestions for future work.

Chapter 2 Background

2.1 Introduction
This chapter examines the current state of the art in driverless cars. It focuses on the
2007 DARPA Urban Challenge, a competition held to promote research in the field
and the main inspiration behind this project. The motivation behind autonomous
vehicles is discussed, followed by a retrospective that places the Urban Challenge in
context. The challenges posed by the Urban Challenge are described and the vehicles
which finished in the top three positions (Boss, Junior, and Odin) are examined. As this
project aims to build a complete autonomous system, different aspects of each
vehicle are examined giving a broad overview of the field. Although some of the
techniques described do not relate directly to this project, they have helped shape it
and represent the aspirations of the project had more time been available.

2.1.1 Motivation
The benefits of autonomous vehicles can be split into three main categories: safety,
efficiency, and lifestyle.

The World Health Organisation estimates that 1.2 million people are killed on the
roads each year and a further 40 million are injured. Human errors, such as
distraction or inappropriate speed, are the primary cause of road accidents [1]. The
hope is that lives can be saved through the use of technology. Additionally, the US
government has mandated that 1/3 of US military ground vehicles must be unmanned
by 2015 [4].

There is increasing pressure on manufacturers to produce more economical vehicles,
and on governments to maintain an efficient road network. The number of vehicles
on our roads increases every year, resulting in further congestion. For human
drivers, the recommended safe inter-car gap is 2 seconds but autonomous vehicles
may be able to reduce this, increasing the capacity of the road network.

The car has been a significant force for social change, improving the mobility of the
population. Access to this mobility will increase the quality of life for certain groups,
such as the elderly and disabled, who cannot drive themselves. For others, being
released from time spent behind the wheel will simply allow that time to be put to
better use [1].

2.1.2 A Brief History of Autonomous Vehicles


This section provides a history of the development of autonomous vehicles from the
1980s to the present.

In the early 1980s, pioneer Ernst Dickmanns began developing what can be
considered the first real robot cars. He developed a vision system which used
saccadic camera movements to focus attention on the most relevant visual input.
Probabilistic techniques such as extended Kalman filters were used to improve
robustness in the presence of noise and uncertainty. By 1987 his vehicle was capable
of driving at high speeds, albeit on empty streets [5].

In the late 80s Dickmanns participated in the European Prometheus project
(PROgraMme for a European Traffic of Highest Efficiency and Unprecedented
Safety). With an estimated investment of 1 billion dollars in today’s money, the
Prometheus project laid the foundation for most subsequent work in the field. By the
mid-90s, the project produced vehicles capable of driving on highways at speeds of
80km/h in busy traffic [6]. Techniques such as tracking other vehicles, convoy
driving, and autonomous passing were developed.

Another pioneer in the field was Dean Pomerleau who developed ALVINN
(Autonomous Land Vehicle in a Neural Network) in the early 90s [7]. ALVINN was
notable for its ability to learn to drive on new road types with only a few minutes
training from a human driver.

After the successes of the 80s and early 90s, progress seems to have plateaued in the
late 90s. It was not until DARPA (Defense Advanced Research Projects Agency)
launched the first of its Grand Challenges that interest in the field was renewed. In
2004, DARPA offered a $1 million prize to the first autonomous vehicle capable of
negotiating a 150 mile course through the Mojave Desert. For the first time, the
vehicles were required to be fully autonomous with no humans allowed in the vehicle
during the competition. By this time, GPS systems were widely available,
significantly improving navigational abilities. Despite several high profile teams and
general advances in computing technology, the competition proved to be a
disappointment, with the most successful team reaching only 7 miles before stopping.

The following year, DARPA re-held the competition. This time the outcome was
very different, with five vehicles completing the 132 mile course and all but one of
the 23 finalists surpassing the seven miles achieved the previous year. The
competition was won by Stanley, the entry from the Stanford Racing Team headed
by Sebastian Thrun [12].

Buoyed by this success, DARPA announced a new challenge, to be held in 2007,
named the Urban Challenge. This would see the competition move from the desert to
an urban environment with the vehicles having to negotiate junctions and manoeuvre
in the presence of both autonomous and human-driven vehicles.


2.2 The Urban Challenge


This section describes the Urban Challenge and the main sub-challenges that were
set.

The competition took place in 2007 on a closed air force base in California. All the
autonomous vehicles and multiple human-driven vehicles were present on the course
at the same time. The environment can, therefore, be considered a good
approximation of a genuine urban environment even though the roads were not open
to the public.

2.2.1 The Challenge


Each vehicle was required to complete a mission specified by an ordered series of
checkpoints in a complex route network. Each vehicle was expected to be able to
negotiate all hazards including both static and dynamic obstacles, re-plan for
alternative routes, and obey California traffic laws at all times [4].

More specifically, each vehicle had to demonstrate the following abilities:

• Safe and correct check-and-go behaviour at junctions, when avoiding
  obstacles, and when performing manoeuvres
• Safe vehicle following at normal speeds and in slow moving queues
• Safe road following, only changing lane when safe and legal to do so
• GPS-free navigation (GPS may be used but is not reliable in urban
  environments)
• Manoeuvres such as parking and u-turns

Each vehicle is supplied with two files, the Route Network Definition File (RNDF),
and the Mission Definition File (MDF). The RNDF specifies the layout of the road
network and is common to all teams. It specifies accessible road segments, lane
widths, and stop sign locations. The MDF contains the specific mission that the
vehicle must accomplish, with each vehicle having a unique but equivalent mission.


2.3 Boss
Boss was developed by Carnegie Mellon University and finished in 1st place [8].
This section describes the route planning and intersection handling techniques used
by Boss. The intersection handling described below would have been of relevance to
the project had time been available to add overtaking functionality.

Figure 2-1 Boss at the Urban Challenge

2.3.1 Route Planning


The RNDF is converted to a connected graph with directional edges representing
drivable lanes. Each edge is assigned a weight that represents the cost of driving the
corresponding road segment. The cost is calculated using the length and speed limit
of the segment as well as a term that represents the complexity or difficulty of the
terrain for Boss to negotiate. Graph search techniques are then used to plan a path
from the current location to a goal location.

As Boss navigates the chosen path, new information may become available that
requires the costs of road segments to be modified. For example, Boss maintains a
map of obstacles it believes to be static. If a static obstacle is determined to entirely
block the road, it is necessary to find an alternative route. To do this, Boss
significantly increases the cost associated with the road segment and re-calculates a
new route to the goal. The increased cost is sufficient to cause an alternative route to
be selected. However, it is not desirable to permanently avoid the blocked road and
so the cost is exponentially reduced over time to its original value.
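For illustration only (the precise scheme used by Boss is not given here), such a
decay could take the form

    c(t) = c_normal + (c_blocked - c_normal) * e^(-lambda * t)

where t is the time since the blockage was observed and lambda controls how quickly
the road segment returns to its normal cost and is reconsidered for routing.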


2.3.2 Intersection Handling


A crucial requirement is that the vehicle is capable of negotiating intersections safely
and observing correct precedence. Precedence becomes important when the
intersection contains more than one stop line (4-way stops are common in the US).
The order of precedence is determined by the order in which the vehicles arrive at
their respective stop lines. Boss estimates precedence by defining a precedence
polygon that starts around three metres prior to the stop line. A vehicle is considered
to be in the polygon if its front bumper (or part of it) is within the polygon. The time
at which the vehicle is detected as being in the polygon is used to estimate
precedence. As vehicles with higher precedence leave their polygons, Boss moves up
the precedence order until it determines it has precedence.

Whilst this approach seems straightforward, care must be taken when determining
the size of the precedence polygon. Increasing the size of the polygon improves the
robustness of the algorithm but risks that two vehicles may be detected as one.

The idea of the precedence polygon is extended to apply to situations where Boss
must merge with moving traffic. In this case, yield polygons are calculated based on
the time it would take Boss to execute the manoeuvre and the safe inter-vehicle gap.
For example, if Boss wishes to cross a lane of traffic coming from the left to join
traffic coming from the right the following times would be considered:

• The time to cross the lane, T_action
• The time to accelerate to the appropriate speed for the desired lane, T_accelerate
• The minimum safe time gap between vehicles, T_spacing

These times are used to determine the size and location of the yield polygon for both
the crossed lane and the destination lane. Any vehicle within a polygon has its
velocity tracked and the time at which it will cross Boss’s desired path is estimated.
Using the estimates for each lane, Boss is able to determine if there is sufficient time
to perform the manoeuvre.


2.4 Junior
Junior was developed by Stanford University and finished in 2nd place. This section
describes Junior’s use of LIDAR and GPS/IMU. See sections 5.3 and 5.5 for how
these sensor types are used by this project.

Figure 2-2 Junior at the Urban Challenge and the Velodyne HDL64 LIDAR used by several teams

2.4.1 Localisation
As GPS signals are carried by microwaves, they are absorbed by water, leading to
reception problems in bad weather and under foliage. Tall buildings reduce the
visible area of the sky and, therefore, the choice of satellites, limiting accuracy.
Furthermore, GPS does not provide a means of directly determining the vehicle’s
orientation. For these reasons, the GPS unit is combined with an inertial
measurement unit (IMU) which uses gyroscopes and accelerometers to estimate the
velocity and acceleration of the vehicle [2].

Whilst combined GPS/IMU systems can provide sub-meter accuracy they are still
insufficient for safe road following. They provide a pose estimate that is the most
probable at the current time and are prone to position jumps. In addition, the data in
the RNDF is also GPS based and cannot be guaranteed to be accurate. It is, therefore,
necessary to use additional localisation techniques.

Junior uses kerb locations and road markings to localise accurately relative to the
RNDF. The kerb locations are described below. Front and side mounted lasers that
are angled down are used to measure the infra-red reflectivity of the road. Lane
markings can be extracted from this data and compared with lane data in the RNDF.


This fine-grained localisation is used to maintain an internal co-ordinate system that
is robust to position jumps.

2.4.2 Obstacle Detection


Five of the six teams that completed the Urban Challenge used a high-definition
LIDAR system as their primary sensor. As the name suggests, LIDAR is similar to
RADAR but pulses of laser light are used rather than radio waves. Both Boss and
Junior used a system manufactured by Velodyne Inc that was developed for the
original Grand Challenge. This roof-mounted system comprises a rotating unit
containing 64 separate lasers. Each of the lasers is fixed at a different pitch and
therefore scans a different portion of the environment. The result is a highly detailed
3-dimensional map of the environment that can be used to detect kerb-sized objects
at 100m [3].

The LIDAR produces a detailed map of the environment in the form of a point-cloud.
For obstacle detection, this data must be processed and features of interest extracted.
One method of doing this would be to identify points that are the same distance and
direction from the vehicle but have different heights. However, the Stanford team
found that whilst such a method was suitable for detecting objects with large heights
such as cars and pedestrians, it was not suitable for smaller objects such as kerbs.
The problem was setting a threshold that would allow kerbs to be detected without
producing a large number of false-positives.

To combat this, Junior uses a novel approach. If the vehicle is on flat ground, each of
the LIDAR lasers will scan a circle of known-radius around the vehicle. The scans,
therefore, generate a series of concentric circles with each circle a fixed distance
apart. On ground that is not flat, the distances between the circles are distorted much
like the contours on a map. By comparing the distances between the contours with
the expected value, small objects can be detected with greater sensitivity than using
vertical measurements.
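As a geometric illustration (not a detail given in the source), a laser mounted at
height h and pitched down at angle theta below the horizontal scans, on flat ground, a
circle of radius

    r = h / tan(theta)

so the expected spacing between adjacent rings follows directly from the fixed pitch
angles of the 64 lasers.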

One complication with this approach is that of vehicle roll. As the vehicle turns, it
has a tendency to tilt outwards thus reducing the distance between the contours on
one side of the vehicle and increasing them on the other. If not compensated for, this
effect could register an object where there is none. The effect can be cancelled by
making the expected inter-ring distance a function of the vehicle’s roll.

Junior combines range data from multiple laser sensors to generate a 2D polar map
containing the range to the nearest object in every direction. The nature of laser
sensors means that it is straightforward to produce such a map from multiple
different laser types. Once generated, this map can be used to locate moving objects.

Moving objects are detected by comparing two maps from slightly different times.
For each difference detected, a set of motion hypotheses is generated and represented
as particles. As more information becomes available, the particles are filtered
allowing the object to be tracked over time.


2.5 Odin
Odin was developed by Virginia Tech as part of team VictorTango and finished in
3rd place [10]. This section describes the path planning and architecture of Odin. See
section 5.4 for details of how this project performs path planning.

Figure 2-3 Odin at the Urban Challenge

2.5.1 Path Planning


The RNDF contains a series of waypoints which define the road network. The
distance between the waypoints may vary and it is, therefore, necessary to calculate a
smooth path from one point to the next. To do this, Odin uses cubic splines. The
same technique is used to generate paths through intersections and in unstructured
parking zones.
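The source does not give Odin’s exact spline formulation; as a sketch, a Catmull-Rom
cubic (one common interpolating spline) generates a smooth path through a sequence
of waypoints:

    // Hedged sketch: Catmull-Rom interpolation between waypoints p1 and p2,
    // using neighbours p0 and p3 for tangent information. t runs from 0 to 1.
    struct Waypoint { double x, y; };

    Waypoint catmullRom(const Waypoint& p0, const Waypoint& p1,
                        const Waypoint& p2, const Waypoint& p3, double t) {
        double t2 = t * t, t3 = t2 * t;
        auto interp = [&](double a, double b, double c, double d) {
            return 0.5 * ((2.0 * b) + (-a + c) * t
                        + (2.0 * a - 5.0 * b + 4.0 * c - d) * t2
                        + (-a + 3.0 * b - 3.0 * c + d) * t3);
        };
        return { interp(p0.x, p1.x, p2.x, p3.x), interp(p0.y, p1.y, p2.y, p3.y) };
    }

Sampling t at small increments yields densely spaced points that pass exactly through
each waypoint, which is the property needed for following the RNDF.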

Using splines guarantees a smooth path between points but does not guarantee that
the path accurately matches the road. To combat this problem the curvature of the
splines is manually adjusted using the aerial photographs supplied by DARPA.

2.5.2 Architecture
Odin implements a hybrid deliberative-reactive architecture [13]. Such architectures
combine the benefits of high-level deliberative planning with low-level reactive
simplicity. However, increases in computing power have allowed Odin to add a
further deliberative layer to handle low-level motion planning. The reactive driving
behaviours are, therefore, sandwiched in a deliberative-reactive-deliberative
progression [11].


2.5.2.1 Route Planning


The top-level deliberative component is responsible for route planning. It is invoked
on demand when a mission is first loaded or when an existing route is found to be
blocked. As with Boss, the road network is searched, using A* graph search, to find
the route with the shortest time. The time for a route is based on the speed
limits and distances with additional fixed penalties for manoeuvres such as u-turns.

2.5.2.2 Driving Behaviours


The reactive layer comprises a set of independent driving behaviours. Each is
dedicated to a specific driving task such as passing another vehicle or merging with
moving traffic. However, not all driving behaviours are applicable all of the time
and, therefore, a sub-set is selected based on the current driving context. For
example, on a normal section of road the route driver, passing driver, and the
blockage driver are applicable, whereas at a junction the precedence, merge, and
left-turn drivers are used. The driving context therefore acts as an arbiter that
activates multiple behaviours.

Driving Behaviour    Purpose
Route Driver         Assumes no other traffic
Passing Driver       Passes other vehicles
Blockage Driver      Reacts to blocked roads
Precedence Driver    Handles stop sign precedence
Merge Driver         Enters or crosses moving traffic
Left Turn Driver     Yields when turning left across traffic
Zone Driver          Re-routes when stuck

The arbiter and each of the driving behaviours are implemented as finite state
machines. These are arranged in a hierarchy with the arbiter as the root. The structure
of the hierarchy represents a top-down task decomposition rather than any idea of
behavioural priority.

As the arbiter is able to select multiple, potentially competing behaviours, an
additional mechanism is required to select which commands the vehicle. For this, a
form of command fusion is used which allows each behaviour to specify an urgency
parameter. This parameter indicates how strongly the behaviour feels that it should
be selected.

2.5.2.3 Low-level Planning and Vehicle Control


The bottom, low-level deliberative layer is concerned with motion planning. Its
purpose is to determine a speed and trajectory that will keep Odin in the desired lane
whilst avoiding obstacles, or to perform manoeuvres such as parking.

Once the desired path and speed are established, the vehicle needs to be commanded
appropriately. To do this, the vehicle’s dynamics are modelled using a bicycle
model. This simplifies modelling by compressing four wheels into two and has
proved to be sufficient for the low speeds experienced in the Urban Challenge [25].
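The source does not give Odin’s exact equations; a minimal kinematic form of the
bicycle model, for a wheelbase L, speed v, heading theta, and front steering angle
delta, is

    x'     = v * cos(theta)
    y'     = v * sin(theta)
    theta' = (v / L) * tan(delta)

which captures how steering input curves the vehicle’s path while ignoring tyre slip,
an acceptable simplification at low speeds.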

The base vehicle chosen for Odin was a hybrid-electric Ford Escape which has the
advantage of an existing built-in drive-by-wire system. Sending the appropriate
commands to this system allows the steering, throttle, and gear change to be easily
controlled. An additional advantage of hybrid vehicles is that they have sophisticated
power generation systems making it easy to power the computers and sensors.

2.6 Discussion
The Urban Challenge has stimulated huge interest in the field of autonomous
vehicles but how realistic a challenge did it represent? Having multiple autonomous
vehicles interacting with each other and human-driven vehicles on the scale seen in
the Urban Challenge certainly presents a degree of realism not seen before. However,
there is a clear gap between the competition format and reality. The challenge did not
require vehicles to perceive road signs or traffic lights. Nor were vulnerable road
users such as motorcycles or pedestrians encountered. These are active areas of
research but they were notable by their absence.

The RNDF together with aerial photography presented the teams with a rich
description of the environment. Despite this, manual modifications were required to
ensure correct operation. It is unrealistic to expect this level of detail to be available
or maintained on a global basis.


The vehicles must perform well at many different tasks in order to succeed. Some of
the problems can be considered solved whilst others show a trend towards a
particular solution. For example, high-level route planning, an important aspect of
the challenge, posed little problem, with standard techniques such as A* being used
successfully. Likewise, the low speeds encountered in urban driving pose little
problem in terms of vehicle control. An autonomous vehicle developed by Stanford,
named Shelley, recently competed in an off-road hill climb race demonstrating the
state-of-the-art in vehicle dynamics.

Perception is perhaps the area of most interest in the Urban Challenge. There is a
clear trend towards direct sensing technology such as LIDAR and away from vision.
This trend is likely to continue but many vision techniques will remain applicable to
images generated by laser sensors.

There is no doubt that the availability of combined GPS and IMU technology has
been crucial to the field but despite these advances, localisation still proves to be a
serious problem. In qualifying, Odin experienced a signal jump that caused it to
misjudge its position by 10m. A similar, though less severe problem occurred in the
final event. Another competitor, Knight Rider, failed to complete the challenge due to
a localisation failure [14]. Accurate localisation is crucial; even an error of 1m could
be catastrophic.

Figure 2-4 Shelley, an autonomous Audi TT developed by Stanford

2.7 Conclusion
The Urban Challenge and indeed, the preceding Grand Challenges have been a
powerful driving force in the development of autonomous vehicles. Together they
mark a significant milestone towards the goal. Progress has, in large part, been due to
advances in GPS and LIDAR technologies but limitations still remain. Further
improvements in these technologies are required. Solving the technical challenges
seems inevitable but other challenges such as questions of liability and how to
adequately prove safety lie ahead.

Chapter 3 Simulation System

3.1 Architecture
The Open Racing Car Simulator (TORCS) is a racing simulator that has a
reputation for having an accurate physics engine [23]. It was developed with the
artificial intelligence community in mind, being used as a platform for the
development of computer-controlled opponents in racing games. It has also been
used as the base platform for the annual WCCI racing challenge [15].

TORCS is implemented in C++ and has an API [29] that provides physical vehicle
parameters such as speed, acceleration, wheel rotation rates, and so on, that can be
used as sensors (Table 3-1). This information is provided in real-time and is updated
at 50Hz. The API also provides a means of commanding the vehicle via the variables
shown in Table 3-2. Commanding the vehicle via this interface is analogous to the
use of vehicles with built-in drive-by-wire interfaces such as Odin in the Urban
Challenge.

[Figure: TORCS renders to its output window; captured images and vehicle sensor
data are passed to the AI vehicle controller, which returns vehicle commands to
TORCS.]

Figure 3-1 Top-level system architecture

The objective of this project is to implement the artificial intelligence vehicle
controller (AIC) depicted in Figure 3-1. This controller must be capable of running in
real-time. Data is transferred between TORCS and the AIC using sockets allowing
the AIC to run on a separate PC should performance be an issue.


Rather than using a camera, input images must be captured directly from the
simulator’s output window. The Windows API provides a means of enumerating all
windows currently in use. From this it is possible to query the title of each window
and, therefore, locate the TORCS window. Once identified, a further Windows API
call can be used to perform a fast copy of image data from that window into local
process memory.
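As a sketch of this mechanism (the window title substring and buffer size are
assumptions, and error handling is omitted), the Win32 calls involved are:

    #include <windows.h>
    #include <cstring>

    static HWND g_torcsWnd = nullptr;

    // Callback invoked by EnumWindows for each top-level window.
    static BOOL CALLBACK findTorcs(HWND hwnd, LPARAM) {
        char title[256];
        GetWindowTextA(hwnd, title, sizeof(title));
        if (std::strstr(title, "TORCS") != nullptr) {  // assumed title substring
            g_torcsWnd = hwnd;
            return FALSE;   // stop enumerating
        }
        return TRUE;        // keep looking
    }

    // Copies the window contents into an off-screen device context.
    void captureTorcsWindow(HDC memDC, int width, int height) {
        EnumWindows(findTorcs, 0);
        if (g_torcsWnd == nullptr) return;
        HDC srcDC = GetDC(g_torcsWnd);
        BitBlt(memDC, 0, 0, width, height, srcDC, 0, 0, SRCCOPY);
        ReleaseDC(g_torcsWnd, srcDC);
    }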

Table 3-1 TORCS Data and Potential Application

Parameter                                  Potential Sensor Application
Wheel rotation rates                       Odometry
Vehicle’s position in the world (x,y,z)    Global positioning system (GPS)
Vehicle acceleration (x,y,z)               Inertial measurement unit (IMU)
Track geometry                             Light detection and ranging (LIDAR)

Table 3-2 Vehicle Command Parameters

Variable         Type / Range          Meaning
Steering angle   Float / -1.0 … 1.0    -1.0 indicates full left-lock, 0.0 straight
                                       ahead, and 1.0 full right-lock
Throttle         Float / 0.0 … 1.0     0.0 indicates no throttle, 1.0 indicates
                                       full throttle
Brake            Float / 0.0 … 1.0     0.0 indicates no brake force, 1.0 indicates
                                       full brake force
Gear             Integer / 0 … 5       0 indicates neutral, 1…5 indicates desired
                                       transmission gear
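A minimal sketch of this command interface (the struct layout and clamping helper
are illustrative; the project’s actual wire format over the socket is not shown here):

    // Ranges follow Table 3-2.
    struct VehicleCommand {
        float steer;    // -1.0 full left-lock .. 1.0 full right-lock
        float throttle; //  0.0 none .. 1.0 full
        float brake;    //  0.0 none .. 1.0 full
        int   gear;     //  0 neutral, 1..5 forward gears
    };

    static float clampf(float v, float lo, float hi) {
        return v < lo ? lo : (v > hi ? hi : v);
    }

    VehicleCommand makeCommand(float steer, float throttle, float brake, int gear) {
        VehicleCommand c;
        c.steer    = clampf(steer,   -1.0f, 1.0f);
        c.throttle = clampf(throttle, 0.0f, 1.0f);
        c.brake    = clampf(brake,    0.0f, 1.0f);
        c.gear     = gear < 0 ? 0 : (gear > 5 ? 5 : gear);
        return c;
    }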

3.2 Track Selection


As this project is aimed at simulating a low-speed urban driving environment as
opposed to a high speed motorway environment, it is important to select a track that
is representative of an urban environment. With that in mind, I selected a track that
contains bends of varying severity as well as lane markings typical of an urban
environment.

Figure 3-2 shows the track selected. It is 2.59km long and there are long gentle
bends, sharp hair-pin style bends, and tight bends in opposite directions close
together. The figure also shows the lane markings which consist (for the most part)
of a continuous white border line on the left and right marking the edge of the road
and a dashed white line separating the two driving lanes. In lane detection systems a
common problem is that of shadows and poor quality road markings. The simulator
image shown exhibits both these features to some extent.

Figure 3-2 Selected track (left) and an example screenshot from the camera (right).

3.3 Car Selection


As TORCS is a racing simulator, it provides a selection of cars to choose from, the
majority of which are dedicated track racing cars. Most of the competitors in the
DARPA Urban Challenge used SUV-style vehicles and the rules stated that the
vehicle must be road-legal and of proven safety record [4]. Of the cars provided by
TORCS, the one that best matched these requirements was a Peugeot 406. This car
was selected on the grounds that it is a typical saloon-style car common on the roads
and has good low-speed handling characteristics due to being front-wheel drive.
Figure 3-3 shows an image of the selected vehicle.

Figure 3-3 Peugeot 406

Chapter 4 Reactive Prototype

It was necessary to perform a feasibility study to ensure that it was possible to
capture the simulator output, process the data, and send control commands back to
the simulator and to assess whether a typical laptop had sufficient processing power.

For the purposes of the prototype, it was important to select a technique that was
direct, allowing the main parts of the system to be put together relatively quickly. I
chose to base the prototype on Dean Pomerleau’s ALVINN [7]. This uses a neural
network to directly convert an input image of the road into a steering angle for the
vehicle. Thus, the control of the vehicle is directly reactive to the current road scene.
The feed-forward network is organised as three layers comprising 800 input nodes
(conceptually an image grid), 4 hidden nodes, and 31 output nodes. The
output o_j of each node is a function of its weighted inputs x_i:

    o_j = σ( Σ_i w_ij · x_i )

(reconstructed here with σ as the logistic sigmoid, consistent with the 0.0 to 1.0
output range described below).

The network weights are trained using the back-propagation algorithm [24]. The
general training process is described in more detail below. Figure 4-1 illustrates the
network’s structure.

[Figure: three-layer feed-forward structure with 800 input nodes, 4 hidden nodes,
and 31 output nodes, the outputs spanning steering angles from sharp left to sharp
right.]

Figure 4-1 Neural network structure

The input image is at a lower resolution than typically used for modern
lane-tracking systems and certainly lower than the captured image.


Therefore, the image must be down-sampled in a process described in section 4.1.1
prior to being passed to the neural network.

Each of the 31 output nodes represents a specific steering angle, with sharp-left
corresponding to node 0, straight-ahead to node 15, and sharp-right to node 30. Each
output node returns a value between 0.0 and 1.0 indicating the degree to which the
network believes that to be the correct steering angle. The output, therefore,
represents a distribution of probable steering angles. This distribution is then
converted to a single floating point value by computing its centre of mass and
rescaling to the range -1.0…1.0 for compatibility with the TORCS API.
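A sketch of this conversion (function and variable names are illustrative):

    // Convert the 31 output activations to a single steering command.
    float outputsToSteer(const float o[31]) {
        float num = 0.0f, den = 0.0f;
        for (int i = 0; i < 31; ++i) {
            num += i * o[i];    // weight each node index by its activation
            den += o[i];
        }
        if (den <= 0.0f) return 0.0f;  // no activation: steer straight ahead
        float com = num / den;         // centre of mass in node units, 0..30
        return (com - 15.0f) / 15.0f;  // rescale to -1.0 .. 1.0 for TORCS
    }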

4.1.1 Image Processing


The captured image goes through the following steps to convert it into a format
suitable for input to the neural network. The effect of each step is illustrated in
Figure 4-2.

Cropping: The captured image is in RGB format. The horizon lies approximately
half-way down the image and, for the purposes of this project, is assumed to be
fixed. The image is, therefore, cropped to discard the upper half.

Intensity conversion: The cropped image is converted from RGB format to grey-
scale using the standard conversion formula:

    I = 0.299·R + 0.587·G + 0.114·B

Smoothing: The grey-scale image is smoothed to remove fine edges. The filter used
is a standard Gaussian.

Edge detection: The smoothed image is convolved with horizontal and vertical
Sobel filters to highlight intensity changes.

Binarisation: A manually selected threshold is used to convert the image from grey
to binary. The result is an image that has the road markings highlighted against the
black background of the road. Edge features to the sides of the road are also
highlighted but this does not pose a problem.


Down-sampling: At this point, the image resolution is slightly reduced due to
shrinkage during filtering, but is still too high to use as input to the neural network.
The next step is, therefore, to reduce the resolution by simply averaging the
intensities over blocks of pixels.
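As a sketch of the whole chain (the dissertation does not name an image processing
library; OpenCV is used here for illustration, and the kernel size and threshold are
assumptions):

    #include <opencv2/imgproc.hpp>

    cv::Mat preprocess(const cv::Mat& rgb, int outW, int outH, double binThresh) {
        // Crop: keep the lower half (horizon assumed fixed at mid-image)
        cv::Mat lower = rgb(cv::Rect(0, rgb.rows / 2, rgb.cols, rgb.rows / 2));
        cv::Mat grey, smooth, gx, gy, agx, agy, edges, binary, small;
        cv::cvtColor(lower, grey, cv::COLOR_RGB2GRAY);      // intensity conversion
        cv::GaussianBlur(grey, smooth, cv::Size(5, 5), 0);  // smoothing
        cv::Sobel(smooth, gx, CV_16S, 1, 0);                // horizontal gradients
        cv::Sobel(smooth, gy, CV_16S, 0, 1);                // vertical gradients
        cv::convertScaleAbs(gx, agx);
        cv::convertScaleAbs(gy, agy);
        cv::addWeighted(agx, 0.5, agy, 0.5, 0, edges);      // combine edge maps
        cv::threshold(edges, binary, binThresh, 255, cv::THRESH_BINARY); // binarise
        // INTER_AREA averages over pixel blocks, matching the text
        cv::resize(binary, small, cv::Size(outW, outH), 0, 0, cv::INTER_AREA);
        return small;
    }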

Figure 4-2 Image Processing Steps. Top left, cropped RGB image. Top right, grey scale image. Middle
left, smoothed image. Middle right, edge enhancement. Bottom left, binarisation. Bottom right,
resolution reduction.

4.1.2 Training
In order for the neural network to operate, it must first be trained. The training data
required consists of a set of tuples, each containing an input image and the
corresponding desired output steering angle. ALVINN relied upon a human driver to
train the network over a period of a few minutes driving on any new road type. I
chose to use a computer controlled ‘expert’ driver to train the network.

This expert driver is used to determine the correct steering angle to be associated
with a given input image as follows: The TORCS API makes it easy to determine the
exact position of the vehicle relative to the centre-line of the road. This information
can be used to make the vehicle follow a given lane using a proportional steering
control method. Thus, the steering angle can be captured at the same time as the
image, forming a training pair.
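A sketch of such a proportional controller (the gain value is an assumption):

    // Steer back towards the lane centre in proportion to the lateral offset.
    float expertSteer(float lateralOffsetMetres, float halfLaneWidthMetres) {
        const float kP = 0.8f;                    // assumed proportional gain
        float error = lateralOffsetMetres / halfLaneWidthMetres;
        float steer = -kP * error;                // oppose the offset
        if (steer > 1.0f) steer = 1.0f;           // clamp to the TORCS range
        if (steer < -1.0f) steer = -1.0f;
        return steer;
    }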


The speed of the expert driver was fixed at 15km/h, slow enough that the most
severe bends can be safely negotiated.

It would be possible to generate a batch of training data by periodically capturing
pairs as the expert driver navigates the track. Once captured, this data could be
partitioned into separate training and test sets. However, I chose to train the network
in an online manner.

As the vehicle travels around the track, an image is captured and the network
generates what it believes to be the correct output steering angle. This output is
compared with the correct steering angle as determined by the expert driver. If the
two steering angles differ by more than a specified threshold, the expert driver wins
the right to control the vehicle. When this happens, the captured image and the
correct steering angle are combined into a training pair and added to the current set
of training pairs. Conversely, if the steering angles are in reasonable agreement then
the AIC retains control. Thus, training data is generated only in situations where it is
required; this constitutes a supervised learning approach.

The threshold used to determine which driver is in control is initially set to a very
small value so that any deviation between the two steering angles results in the
expert driver gaining control. The threshold is relaxed gradually over time. This
allows tight control at the outset but also allows more ‘wiggle room’ as the network
becomes more competent allowing for slight deviations from the desired path to go
unchecked.
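A sketch of this arbitration (the initial threshold and relaxation rate are assumptions):

    #include <cmath>

    struct ControlArbiter {
        float threshold = 0.01f;   // assumed initial tight threshold
        float relaxRate = 1.001f;  // assumed multiplicative relaxation per cycle

        // Returns true if the expert should take control this cycle, in which
        // case the (image, expert steering) pair joins the training set.
        bool expertTakesControl(float netSteer, float expertSteer) {
            bool expert = std::fabs(netSteer - expertSteer) > threshold;
            threshold *= relaxRate;  // gradually allow more 'wiggle room'
            return expert;
        }
    };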

During each image processing cycle, 25ms is allocated to training the network
incrementally using all of the training data accumulated so far. The network,
therefore, continually improves over time, with training taking place whether the
expert or the network is in control.

When using the expert driver, we must convert from its exact steering angle to an
output distribution that is compatible with the desired network output. To do this, a
Gaussian distribution is created with its mean at the steering angle and a variance of
0.07 as illustrated in Figure 4-3.
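A sketch of this conversion (whether the stated variance is in steering units or node
units is not specified; steering units in the range -1.0…1.0 are assumed here):

    #include <cmath>

    // Build the 31-node target distribution around the expert's steering angle.
    void gaussianTarget(float expertSteer /* -1.0 .. 1.0 */, float target[31]) {
        const float variance = 0.07f;              // from the text
        for (int i = 0; i < 31; ++i) {
            float nodeAngle = (i - 15) / 15.0f;    // this node's steering angle
            float d = nodeAngle - expertSteer;
            target[i] = std::exp(-(d * d) / (2.0f * variance));
        }
    }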


Figure 4-3 Illustration of the relationship between the input image and output steering
distribution (not to scale).

4.1.3 Results
This online supervised learning approach is highly effective. Within only a few tens
of metres, the network is able to take control on the initial straight section. As the
first lap progresses, the network is able to maintain control through the more gentle
bends. The more severe bends and bends in opposite directions in quick succession
are the last to be mastered by the network. Often, by the end of the 2nd lap the
network is fully trained and the 3rd lap is completed under full autonomous control.
Table 4-1 gives the percentage of autonomous control over three test runs of 4 laps
each and Figure 4-4 shows how the capability of the network improves over the 3
laps.

Figure 4-4: Expert versus autonomous control over 3 laps. Red dots indicate areas where the expert
driver was in control. The final lap (right) is completely autonomous.

The vehicle starts at the blue circle and travels in a clockwise direction. The red
markers indicate where the expert driver has control. The left image shows the 1st
lap, with the vehicle starting out under expert control. The neural network quickly
takes control and only needs occasional assistance during the first straight. Heavy
assistance is required during the bends on the first lap. The middle image shows the
2nd lap – the expert driver is only required in four locations. An interesting point is
that during the 2nd lap, the network requires more assistance to straighten up when
exiting a bend than it does on entry to the bend. The right image shows the 3rd lap
which is completed under full autonomous control.

Table 4-1 Percentage of Autonomous Control

          Lap 1          Lap 2          Lap 3          Lap 4
Run    Auto   Pairs   Auto   Pairs   Auto   Pairs   Auto   Pairs
1      96%    199     99%    308     100%   422     100%   537
2      97%    204     99%    392     99%    634     100%   856
3      96%    179     99%    312     99%    442     99%    573

Whilst training, the system achieves a frame rate of around 11Hz and around 15Hz is
achieved once training is complete. During development it was observed that a frame
rate of less than 10Hz was insufficient to control the vehicle. The system was run on
a standard Windows laptop with an AMD Turion64 processor – a five year old
system at the time of writing. The combined processor load of running TORCS and
the AI controller was 100%.

An important point about this system is the direct coupling between the frame rate
and the steering command rate. Each steering command only applies for the instant
at which it was generated. If the image processing were interrupted for any reason,
the vehicle would immediately lose control.

4.1.4 Evaluation
The development of this system was, in itself, a substantial amount of work
(approximately 6 weeks) but was necessary to demonstrate that TORCS could be
integrated successfully with an independent vision and control system. However, the
basic architecture with regards to image capture, image processing, and inter-process
communication would be re-usable. Indeed, the rest of this project would not have
been achievable in the available time had this prototype not been developed. Despite
the prototype being a success, it highlighted the limitations of the laptop used and, as
a result, a new high-performance laptop was used for the remainder of the project.

Chapter 5 Deliberative Approach

Whilst the neural network prototype successfully controls the vehicle, it operates in a
reactive way; the steering angle is a direct function of the input image. Furthermore,
this function is essentially hidden and does not lend itself to analysis. What features,
for example, is the network responding to? In order to deal with more complex
driving situations, the entrants to the Urban Challenge required a higher level of
scene understanding.

The main body of this project is, therefore, concerned with controlling the vehicle in
a deliberative manner. Starting with a captured image, the road markings are
explicitly detected and modelled, prior knowledge of the road width is used to
classify road markings, a trajectory for the vehicle is computed, and the vehicle
controls both its speed and position to follow the desired path. This project,
therefore, takes a sense, plan, act approach to vehicle control.

5.1 Summary
This chapter forms the main body of the dissertation. It starts with a description of
how some ground truth data was generated for test purposes in Section 5.2. Section
5.3 covers the sensing aspects of the system. It provides a short description of some
initial investigations that, whilst useful, were not taken further, before describing the
main image processing steps and LIDAR simulation. Section 5.4 describes the
techniques used to convert the perceived environment into a path for the vehicle to
follow. Finally, section 5.5 describes how the vehicle is controlled. Figure 5-1 gives
an overview of the main steps in the system.

[Figure: processing pipeline. The TORCS image is captured, converted to grey scale,
and binarised via matched filters; feature detection on the binary image is fused with
simulated LIDAR data; lane markings are detected, verified, and classified; trajectory
points are generated, a parabola is fitted, and a speed selected; throttle, brake, and
steering controllers then act on the vehicle.]

Figure 5-1: Main processing steps of the system. Sensing steps are shown in blue, planning steps in
red, and control steps in purple.

5.2 Ground Truth Data


In order to perform experiments and evaluate different approaches, it was necessary
to generate a test set of image pairs comprising an original captured image and the
corresponding ‘ground truth’. To do this, a set of 17 images was captured from
various points along the track. These images were chosen to be representative of the
track and, therefore, included straight sections and bends of various degrees.

Once the images were captured, they were converted to binary images using a
manually selected threshold such that the lane markings were fully present – it is not
desirable to lose any of the lane-marking information. The resulting images
contained a substantial amount of noise which was removed manually.


The result is a set of image pairs; the original captured image along with a binary
image containing only the lane markings. Figure 5-2 shows an example of such a
ground truth pair.

Figure 5-2 Example of a captured image and the corresponding ground truth.

5.3 Sensing
This section describes the development of the sensing system and culminates with
the fusing of vision and LIDAR data into a virtual lane marking sensor.

5.3.1 Some Initial Experiments


At the project outset, I had no specific technique in mind for detecting the road
markings. Therefore, I performed some experiments to explore different options.

5.3.1.1 Simple Thresholding


It is important to try the simple approaches before looking for more sophisticated
techniques. Although I did not expect simple thresholding to be a reliable means of
distinguishing the lane markings from the background, I decided to start with this
approach. A side-effect of this is that it provides a baseline for evaluating other
methods.

Using the ground-truth test set, the original image is converted to a grey-scale image.
This is then thresholded and the resulting binary image compared with the ground-
truth. By doing this, it is possible to obtain a measure of the signal to noise ratio of
the binary image. The signal to noise ratio is defined as:

    SNR = N_signal / N_noise

where N_signal is the number of set pixels that coincide with a marking pixel in the
ground truth and N_noise is the number that do not.
Thus, for each pixel set in the binary image we determine if it corresponds to a
genuine road marking in the ground-truth or whether it is a false positive (noise). By
repeating the process, the threshold with the highest SNR can be determined.
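A sketch of this sweep (OpenCV is assumed for the thresholding; names are
illustrative):

    #include <opencv2/imgproc.hpp>

    // Score one threshold against the ground truth by signal-to-noise ratio.
    double snrForThreshold(const cv::Mat& grey, const cv::Mat& truth, int t) {
        cv::Mat binary;
        cv::threshold(grey, binary, t, 255, cv::THRESH_BINARY);
        int signal = 0, noise = 0;
        for (int y = 0; y < binary.rows; ++y)
            for (int x = 0; x < binary.cols; ++x)
                if (binary.at<uchar>(y, x)) {
                    if (truth.at<uchar>(y, x)) ++signal;  // genuine marking pixel
                    else ++noise;                         // false positive
                }
        return noise == 0 ? signal : (double)signal / noise;
    }

    // Try every threshold and keep the one with the highest SNR.
    int bestThreshold(const cv::Mat& grey, const cv::Mat& truth) {
        int best = 0;
        double bestSnr = -1.0;
        for (int t = 0; t < 256; ++t) {
            double snr = snrForThreshold(grey, truth, t);
            if (snr > bestSnr) { bestSnr = snr; best = t; }
        }
        return best;
    }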

Figure 5-3 Effect of different thresholds on SNR. Top left, captured image. Right, SNR against
threshold. Bottom left, binary image with highest SNR.

Figure 5-3 shows the SNR obtained using different thresholds for a single image.
There is a clear peak at a threshold of 150 (for this particular image). Comparing
against the ground truth image in Figure 5-2, we can see that although the signal to
noise ratio has been maximised, there are significant sections of the markings
missing. This is, in part, caused by the shadows cast over the road. This problem is
typical in road marking detection systems and vision systems in general.

It is clear that using a simple threshold is not appropriate given that substantial
portions of the lane markings are absent even when we are in a position to choose the
best threshold.

The experiment was repeated using each pair in the test set. The peak SNR value
occurred at an average threshold of 136. This threshold was used to generate the
SNR values shown in Figure 5-17.

5.3.1.2 Inverse Perspective Mapping


A common approach to lane marking detection is to perform an inverse perspective
mapping (IPM) to remove the foreshortening effect due to perspective [16][17]. The
result of IPM is an image of the road as though viewed directly from above. The
technique works by projecting the perspective image onto the ground-plane which is


assumed to be both flat and horizontal. A description of the technique can be found
in [16]. In order to apply IPM, characteristics of the camera such as height and field
of view must be known. Typically, this information is obtained using a semi-
automated calibration process which involves placing a chessboard pattern of known
dimensions in front of the camera. Many vision software libraries include routines to
facilitate this process. However, as this project uses a simulated camera, the
calibration approach is not applicable. Instead I obtained an approximation of the
camera characteristics by working through the simulator source code. It would have
been possible to obtain precise information as this is necessarily encoded within the
simulator but I was reluctant to spend too much time on this in the initial stages of
the project. My initial evaluation of this approach involved applying IPM to the
ground-truth images and simply evaluating the results by eye.
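
The project derived the camera model from the simulator source, but a common shortcut, sketched below in Python with OpenCV, is to pick four point correspondences between the road plane and the desired bird's-eye view; all coordinate values here are hypothetical:

    import cv2
    import numpy as np

    # Four points on the road in the perspective image, and where they
    # should land in the bird's-eye view (coordinates are illustrative).
    src = np.float32([[260, 240], [380, 240], [640, 470], [0, 470]])
    dst = np.float32([[240, 0], [400, 0], [400, 480], [240, 480]])

    H = cv2.getPerspectiveTransform(src, dst)
    perspective = cv2.imread("captured.png", cv2.IMREAD_GRAYSCALE)
    birds_eye = cv2.warpPerspective(perspective, H, (640, 480))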

Figure 5-4: Effect of applying IPM to ground truth images.

Examples of applying IPM to ground-truth images are shown in Figure 5-4. The
benefits of IPM are clear, with the edges of the road now appearing parallel. Using
this approach would facilitate applying constraints when searching for the lane
markings – particularly searching for sets of parallel lines rather than individual
markings. However, despite the clear visual benefits of IPM, it is not without its
problems. As IPM relies on the flat ground-plane assumption, the image can become
distorted when the assumption does not hold. Furthermore, pixels in the perspective
image that are distant from the camera are mapped to multiple pixels in the IPM
image. This produces a block-like effect that becomes more apparent the further the


pixel is from the camera [16]. Figure 5-5 shows the effect of applying IPM to more
severe bends where the markings in the perspective image are thin.

Figure 5-5: Left image shows non-parallel lines. Right image illustrates block effect for distant
pixels.

In this figure, both of the IPM images show distortion of the road geometry as the
distance from the camera increases. In particular, the lane markings cease to be
parallel and the block effect of mapping a single pixel in the perspective image to
multiple pixels in the IPM image can be seen (despite anti-aliasing being used).

5.3.1.3 RANSAC Curve Fitting


TORCS represents bends as circular arc segments although more complex curves can
be made by joining segments of different radii together. As this project is concerned
with modelling road geometry, I experimented with fitting circular arcs to the IPM
images.

Figure 5-6: Result of using RANSAC to fit circles to the IPM image.


Figure 5-6 shows the result of fitting circular arcs to an IPM image using the
RANSAC algorithm [21]. The thickness of the lane markings causes many circles to
pass the acceptance test. This suggests that the approach is unlikely to be reliable at
determining the radius of a given bend. There were two further problems with this
approach. Firstly, no circles were matched to the centre-line and secondly, small
modifications to the RANSAC parameters seemed to make the difference between
many circles being detected and none being detected. Given these problems, I
decided that this approach was unlikely to succeed and did not investigate it further.
However, with hindsight there are several things that could have improved the
situation. For example, thinning the markings prior to the application of RANSAC
and using a parabolic model rather than the somewhat restrictive circular approach.
Nonetheless, the time spent understanding these techniques would prove useful
elsewhere in the project; both inverse perspective mapping and curve fitting are used
in section 5.3.3 on road geometry modelling.

5.3.2 The MIT Approach


Whilst many lane marking detection systems employ the inverse perspective
mapping approach as the first processing step, not all do so. In particular, the
approach taken by the MIT team and described in Albert Huang’s PhD thesis [19]
works around the foreshortening effect by applying filters of different sizes directly
to the perspective image.

This approach was of particular interest to me for two reasons. Firstly, their
technique proved to be successful in the competitive environment of the Urban
Challenge and secondly, they describe a way in which data from LIDAR sensors can
be fused with camera data.

5.3.2.1 Image Capture & Pre-processing


In contrast to the neural network approach, where the input image is used to directly
determine the steering angle, the approach taken here involves separating the tasks of
scene understanding and vehicle control. More specifically, the lane markings are
extracted and used to form a model of the road geometry and subsequently plan a
path for the vehicle to follow. As we are concerned with detecting specific features
and their location in the distance, it makes good sense to increase the resolution of


the input image. However, as resolution increases so does the cost of processing the
data. I chose to set the simulator output to a resolution of . The image is
captured and cropped to again assuming that the horizon is fixed halfway
down.

The image is then converted from RGB to grey-scale in the normal manner. In
addition to this, a separate binary image is created which is used for a verification
step described in section 5.3.4. As this binary image is not used directly for feature
detection, the choice of threshold need not be finely tuned and is selected to provide
a reasonable separation of the lane-markings from the road surface.

5.3.2.2 Matched Filters


Huang observes that as lane markings are typically of a standard width and the rate
of foreshortening in a perspective image can be determined [19], it is possible to
locate the lane markings by searching for features of a size dependent on the distance
from the camera. As with inverse perspective mapping, this relies on the flat ground-
plane assumption. The technique, therefore, searches for features of a size that is a
function of the marking width and the scanline being searched.

However, this makes the further assumption that a single horizontal scanline
represents a line in the world that is a constant distance from the camera. This is not
the case and it is possible that extending the function to include the position within
the scanline may improve the algorithm. However, this was not investigated.

In order to detect a feature of a specified size the filter shown in Figure 5-7 is scaled
such that the portion above zero is the same length (in pixels) as the feature to be
detected.

Figure 5-7 Feature detection filter template. The filter is scaled to match the desired feature size.


Figure 5-8 illustrates the principle behind matched filters. A single scanline
containing two different sized features is convolved with a filter whose size matches
one feature but not the other. When the filter exactly matches the feature size, the
result is a clear local maximum. When the match is not exact, the result is a truncated
peak. We can, therefore, locate features by searching for definite local maxima.

Figure 5-8 Principle behind matched filters. Left, single scanline with two different sized features.
Right, the result of filtering. The filter does not match the first feature but matches the second
exactly.
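
The following Python sketch illustrates the idea on a synthetic scanline; the kernel shape is one plausible realisation of the template in Figure 5-7, not the exact filter used:

    import numpy as np

    def matched_filter(width):
        # Positive centre `width` pixels wide, flanked by negative lobes of
        # half that width, so a uniform background gives a zero response.
        side = max(width // 2, 1)
        return np.concatenate([-np.ones(side), np.ones(width), -np.ones(side)])

    scanline = np.zeros(200)
    scanline[60:64] = 1.0      # a 4-pixel-wide feature (matches the filter)
    scanline[120:130] = 1.0    # a 10-pixel-wide feature (does not match)

    response = np.convolve(scanline, matched_filter(4), mode="same")

    # Definite local maxima mark exact matches; in the full pipeline the
    # response would also be thresholded before this search.
    peaks = [i for i in range(1, len(response) - 1)
             if response[i] > response[i - 1] and response[i] > response[i + 1]]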


5.3.2.2.1 Experimental Determination of Filter Sizes


The ground-truth test set can be used to determine the typical width of the lane
markings on any given scanline. Figure 5-9 shows an analysis of the lane marking
widths for a single ground truth image. The image is coloured to indicate the
individual marking from which the data originates.

Figure 5-9 Analysis of lane marking width in a single ground-truth image

This shows that the left (red) and right (blue) lane markings increase in width at the
same rate despite the left marking being further away than the right marking in any
given scanline. In contrast to this, the centre marking’s width shows a significantly
different gradient. This result was not expected and is simply due to the centre
marking being narrower than the left and right markings, though the difference is not
obvious to the naked eye in the perspective image. An additional point to note is the
sharp drop-offs in width as the scanline increases. This is simply due to the sections
boxed in black where the width of the marking is cropped at the edges of the image.

The problem of having different marking widths was not discussed by Huang. My
initial approach was to repeat the above experiment using all of the ground-truth
images, combining the results and computing the best-fit line using the least-squares
method. Whilst this was sufficient to make progress with the project, it was ultimately
unsatisfactory with neither marking width being well detected.
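
The fit itself is ordinary least squares; a minimal sketch (with hypothetical pooled measurements) might look as follows:

    import numpy as np

    # Pooled width measurements from the ground-truth set (values invented
    # purely for illustration).
    rows = np.array([130, 150, 170, 190, 210, 230])    # scanline index
    widths = np.array([2.0, 3.1, 4.2, 5.0, 6.1, 7.2])  # marking width (px)

    slope, intercept = np.polyfit(rows, widths, 1)     # least-squares line

    def filter_width(row):
        # Expected marking width, in pixels, on a given scanline.
        return max(int(round(slope * row + intercept)), 1)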

5.3.2.2.2 Twin Filters


To properly address the problem, two separate filters must be used, one for each
feature width. The data from the previous experiment was re-examined to produce


two line equations; one to calculate the width of the left and right markings and one
for the centre marking. However, applying these filters to a single scanline results in
a new problem as illustrated in Figure 5-10. Here, two filters are applied with each
matching one feature but not the other. Despite being a perfect match to the left
feature, the blue peak is lower than the response from the non-matching filter in red.

Figure 5-10 Applying twin filters. Blue matches the left feature, red matches the right feature.

The solution is simply to normalise each filter such that the area under the
filter is 1 after it has been scaled to the desired size. This allows the results
of the two filters to be added together prior to searching for the peaks. Figure
5-11 illustrates the result of normalising the filters and adding the results. The peaks
of the matching filters are now greater than their non-matching counterparts. The
combined results show two clear peaks making the features easy to detect.

Figure 5-11 Combining normalised filter responses
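
Reusing the matched_filter sketch from above, the normalisation and combination steps can be expressed as follows; the two widths are illustrative:

    def normalised(kernel):
        # Scale so the area under the positive portion is 1, putting the
        # responses of differently sized filters on an equal footing.
        return kernel / kernel[kernel > 0].sum()

    f_border = normalised(matched_filter(6))   # left/right marking width
    f_centre = normalised(matched_filter(4))   # narrower centre marking

    combined = (np.convolve(scanline, f_border, mode="same") +
                np.convolve(scanline, f_centre, mode="same"))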


The description above used a binary scanline as input to illustrate the filtering
technique. Figure 5-12 shows the same process applied to an actual scanline obtained
from the simulation. There are three points worth noting from the results. Firstly, the
technique has worked on genuine data, correctly producing strong peaks
corresponding to the features of interest. Secondly, the intensities in areas where no
features exist have been smoothed and are close to zero thus facilitating their reliable
removal using simple thresholding. Thirdly, there are two additional, smaller peaks
at the very edges of the scan line. These artefacts are caused by the truncation of the
filter at the edge of the image and are easily removed as they have a fixed location.

Figure 5-12 Normalised filters applied to actual scanline data. Left, input scanline intensities. Right,
filtering result showing sharp peaks at features of interest.

The filtering process is applied to each scanline in the image and the results are
thresholded using a value of 10. The result is then searched for local maxima,
producing a new image where a pixel that is set corresponds to a detected feature.

Figure 5-13 shows the results of applying the process to a complete image. The road
markings are very clearly identified by narrow lines of pixels even in the areas of the
image that were obscured by shadows. However, in addition to the lane markings,
several other features have also been detected. In particular the corners of the wall on
the left hand side have given a strong response to the filtering. This particular
problem was also experienced by Huang et al., who describe a method for the
removal of such artefacts by fusing LIDAR data with the image [20]. Their technique
is described in section 5.3.2.3 of this document. To some extent, experiencing this
problem confirms that TORCS is sufficiently realistic for the purposes of the project.


Figure 5-13 Result of applying matched filters to a complete image.

5.3.2.3 Simulated LIDAR


All of the teams that completed the Urban Challenge used multiple LIDAR sensors,
clearly identifying this as a state of the art approach. As TORCS necessarily has an
internal representation of the 3-dimensional structure of the track, it would be possible to
simulate the type of data generated by LIDAR. This data could then be used for
obstacle detection in a manner similar to Junior (section 2.4). However, sensors such
as the Velodyne produce vast amounts of data, typically 1 million data points per
second. Given this level of data and the processing limitations experienced during the
development of the neural network prototype, there was a very real concern about
whether simulating LIDAR would be practical.

The question remains of how to utilise simulated LIDAR data. As stated earlier, the
use of matched filters performed well at detecting the lane markings but also detected
the walls and barriers at the roadside. This problem was experienced and solved by
Huang [19]. Huang’s approach involves fusing LIDAR data with the image data.
Vertical features detected by the LIDAR are converted into image co-ordinates and
used to directly mask areas of the image. Thus, the false positives are deleted from
the image with the effect of improving the overall signal-to-noise ratio.

Fusing the LIDAR and the camera data in the real world requires calibration
between the two sensors to ensure that their relative positions are accurately
established. The same problem exists in principle in the simulated world though to a
far lesser degree. Simulation provides two advantages in this regard; firstly the
relative positions of the sensors can be determined with 100% accuracy, and
secondly there is no risk of this changing over time due to environmental factors
such as vibration.


In fact, I chose to place the laser sensor at exactly the same position as the camera
though clearly this is not something that can be done in the real-world. As TORCS
simulates a camera by projecting 3-dimensional points from the world-frame onto a
2-dimensional camera plane, we can simply apply the same projection matrix to the
LIDAR data to convert them to image co-ordinates. Whilst this is simple in principle,
determining the exact projection matrix used by TORCS was somewhat challenging.
In the real-world, a camera’s characteristics can be determined by using a standard
calibration library and a checkerboard image. This is, of course, not 100% accurate
but it is a relatively straightforward process. In contrast, determining the
characteristics of the simulated camera involved several days of reverse engineering
the source code and multiple tests to ensure data could be accurately aligned as the
vehicle pitches and rolls. This perhaps serves as an example of a task that was
trickier than its real-world counterpart.

Having established that the LIDAR data should be used to mask vertical features in
the image we can turn our attention to how this data can be simulated. In order to
simulate the data generated by a vehicle-mounted laser sensor accurately it is
necessary to take a ray-tracing approach. As the vehicle moves through the world,
individual rays are cast from the sensor each at a unique heading and pitch. The point
at which each ray intersects a surface in the world can then be computed. These
intersection points then need to be converted from the world-frame into the vehicle-
frame so that the co-ordinates are relative to the vehicle. As the sensor moves with
the vehicle through the world, the intersection points need to be recomputed in real-
time.

However, determining where each ray intersects with the world is a complex and
computationally expensive operation. Factors such as vehicle pitch and roll as it
accelerates and corners must be taken into account. In addition to this, the number of
data points required and how to put them to practical use was not clear. The ray-
tracing approach was, therefore, deemed to be too risky given the short timescale of
the project.

The alternative to generating real-time data is to pre-calculate it. As the track is
unchanging and there are no other road-users, it is possible to pre-generate a set of
points representing the 3-dimensional structure of the track and barriers. The key
difference here is that each point is now fixed to the track and acts as a marker. As
the vehicle moves through the world, the distance and direction from the sensor to
each marker can be computed easily. The sub-set of markers that lie within the
sensor’s range can then be transformed into vehicle-frame co-ordinates and
transmitted as the simulated point cloud. This approach has very little performance
overhead and although the operation of the sensor is not accurately simulated, the 3-
dimensional information content of the data is adequately reproduced.
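
A minimal sketch of this per-frame step is given below; it considers yaw only (the pitch and roll handling described later is omitted) and the interfaces are assumptions rather than the project's actual code:

    import numpy as np

    def visible_groups(groups, pos, heading, max_range=50.0):
        # groups: list of (N, 3) arrays of world-frame marker points.
        # pos: vehicle position (x, y, z); heading: yaw in radians.
        pos = np.asarray(pos, dtype=float)
        c, s = np.cos(heading), np.sin(heading)
        # World-to-vehicle rotation is the inverse (transpose) of the yaw.
        rot = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
        out = []
        for g in groups:
            # Keep the slice if its centroid lies within sensor range.
            if np.linalg.norm(g[:, :2].mean(axis=0) - pos[:2]) <= max_range:
                out.append((g - pos) @ rot.T)
        return out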

Using pre-calculated data is, by its very nature, not truly representative of accurately
generated real-time data. Thus, a compromise in data generation necessarily requires
a compromise in how that data is ultimately used. It was, therefore, important to find a
balance between the ease of data generation and generating data that requires
processing in a way that remains realistic. Furthermore, being in a position to select
both how the data is generated and how it will be used runs the risk of unreasonable
assumptions being made.

My first approach to pre-generating the markers was to randomly generate points on
each road surface and barrier surface within the world. The RANSAC algorithm was
then used to detect the near-vertical planes identifying the walls and barriers. This
approach seemed promising but was, however, unable to detect curved vertical
features such as a wall around a bend in the road. The problem is illustrated in Figure
5-14. It would have perhaps been possible to resolve this problem by using
RANSAC to remove the horizontal planes corresponding to the road surface and
assume what remains are the vertical surfaces. However, this approach would likely
struggle with sections of road that are either on a hill or on banked bends (both of
which exist in the selected track).


Figure 5-14 Example of 3-dimensional point data after RANSAC processing. The image shows the
parallel barriers of a section of road containing a 90 degree bend. The straight barrier sections are
identified but the curved sections are completely absent.

In view of this setback, I decided to take a more structured approach. The Velodyne
is a rotating sensor with 64 separate lasers, each at a different pitch. Thus, with the
sensor facing at a fixed heading, a reading from the sensor will consist of 64 values
that lie on a single vertical plane. I chose to group the markers into a series of
vertical slices through the track at 1m intervals (Figure 5-15). Grouping markers
together in this way adds structure to the data in a way that is comparable to that of the
Velodyne.

Figure 5-15 Vertical feature identification using LIDAR. On the left, the full point cloud with a single
group in green. On the right, the extracted vertical features.

As the vehicle progresses around the track, groups that lie within 50m are converted
from world-frame co-ordinates to vehicle frame co-ordinates and transmitted from
TORCS to the AI controller. This gives a data rate of approximately 35,000 points
per second compared with the 1 million per second for the Velodyne. The group
structure of the data is preserved on transmission, allowing the AI controller to process
each separately.


Detecting vertical features within a group is not quite as simple as searching it for
two points with the same location but different height. As the co-ordinates are in the
vehicle-frame and the vehicle is subject to pitching and rolling as it accelerates and
corners, two vertically aligned points in the world-frame are not necessarily so in the
vehicle-frame. However, this effect is small and is easily dealt with by using simple
tolerances.
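
A brute-force sketch of this check over a single slice is shown below; the tolerances are illustrative values, not the ones used in the project:

    def vertical_features(group, xy_tol=0.2, min_height=0.5):
        # group: list of (x, y, z) points in one vertical slice.
        # A pair counts as vertical if the points share roughly the same
        # ground position but differ markedly in height; the tolerance
        # absorbs the small pitch and roll effects described above.
        pairs = []
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                p, q = group[i], group[j]
                if (abs(p[0] - q[0]) <= xy_tol and abs(p[1] - q[1]) <= xy_tol
                        and abs(p[2] - q[2]) >= min_height):
                    pairs.append((i, j))
        return pairs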

Once identified, the co-ordinates of the vertical features in each group are stored.
Features in adjacent groups are then joined to form 3-dimensional polygons. These
polygons are then projected into the image plane, overwriting the underlying image.
Specifically, they are projected onto the output of the matched filter step, removing
false positives from the walls and barriers. Figure 5-16 shows an example of
polygons masking the barriers in the filter output.

Figure 5-16 Masking using LIDAR. Masked areas shown in red – in practice white (zeros) would be
used.

The result is an image with the lane markings clearly visible and with very little
noise. Figure 5-17 shows the signal-to-noise ratio before and after masking the
barriers. For most of the test images, masking produces a significant improvement in
the signal-to-noise ratio.


Figure 5-17 Comparison of SNR values. Simple thresholding (black), matched filters pre-masking
(red), and post-masking (blue).

5.3.3 Road Geometry Modelling


Having detected the road markings and removed as much of the noise as possible
using the LIDAR data it is now necessary to apply a model to the markings – in other
words, to group together the pixels that belong to the same lane marking.

The first step in doing this is to move away from the image processing format. The
image that we have at this point contains black pixels that are of interest against a
white background. The image is scanned for the black pixels and their co-ordinates
added to a list of points of interest (POI). Typically each image will have 200-400
such points of interest. The next step is to process the POI list by fitting some curve
model to the data.

5.3.3.1 Parabola Fitting


One approach is to fit quadratic curves to the road markings. Although the nature of
quadratic curves restricts their expressiveness, they have the benefit of being
computationally simpler than more complex curves such as splines. I performed a
brief investigation to determine if this approach would work with the image data.
Figure 5-18 shows the results of fitting parabolas to the lane markings.


Figure 5-18 Fitting parabolas to the lane markings

The left hand images show the input, and the right hand images show the result of
fitting parabolas to the markings. The technique works well, particularly on lines of
lower curvature as seen in the top row. Of particular interest is the ability to traverse
the gaps between the markings in the centre-line. However, the bottom row shows a
clear problem with high curvature bends, particularly when the curve is asymmetric.
It is clear that this technique has potential, having the benefit of generating smooth
continuous curves. Despite the potential, I chose not to use the approach as I felt that
a simpler approach inspired by Huang would be both quicker to implement and
execute. However, the time taken to investigate parabola fitting would prove to be
time well spent as the technique is used to good effect for path planning in section
5.4.1.

5.3.3.2 Line Segment Fitting


Huang’s approach is to fit splines to the road markings. This is done by first
representing the markings using connected line segments. This results in a set of
control points that are used as the basis for generating the splines.

Huang’s method of fitting line segments starts with selecting a set of random seed
points. For each of these seed points, a search is made through the POI list to find
those that are 55 ±5 pixels from the seed. These new points are evaluated using a
distance transform [19][18] which favours points that are linked to the seed point by
intermediate points. The point with the best score is selected and becomes the new
seed point. Thus, the line segments grow along the curve representing the road


marking. The process is repeated, generating a set of points that lie on the marking,
which can then be used as control points for spline fitting.

Rather than fitting splines, I chose to adapt Huang’s method of generating the control
points to fit line segments to the curve. My approach differs from Huang’s in several
respects. Firstly, I select seed points based on their proximity to hard-coded regions
of interest within the image. Secondly, I fit line segments of variable length rather than
Huang's (essentially) fixed-length segments. Thirdly, I do not fit splines to the
resulting control points, instead projecting the points directly onto the ground plane.

I define three fixed regions of interest (ROI) that are used to select seed points for
tracing the markings. Seeds are selected by searching the point of interest list for the
point closest to each ROI. The locations of the regions of interest are based on the
observation that when the vehicle is correctly positioned in-lane on a straight section
of road, the lane markings intersect the edge of the image at fixed points (Figure
5-19). Of course the points of intersection change significantly as the vehicle
navigates a bend but the technique works reliably for two main reasons. Firstly, the
low level of noise in the image gives a high probability that the chosen seed belongs
to a lane marking and secondly, once a lane marking has been traced, all the points
that are associated with that marking are removed from the POI list preventing them
being selected as seeds for the remaining lane markings.

Figure 5-19 Location of regions of interest (red) used to select seed points

It is worth noting that selection of these points reflects the desire to drive in the left-
lane but the locations could easily be changed if, for example, an overtake
manoeuvre is required.

Once a seed point has been selected, the next step is to fit a series of line segments to
the curve. Huang used a fixed length of 55 ± 5 pixels for each segment. However, a
constant length in a perspective image does not correspond to a constant distance on


the ground plane. Thus, long line segments may work well for markings close to the
vehicle but do not provide a good fit for more distant markings. For this reason, I
chose to make the length of the line segment (in pixels) variable such that it
corresponds to a constant distance (in metres) in the ground plane. The desired line
segment length is, therefore, computed as a function of the (x,y) coordinate of the
seed point. The ground-plane distance chosen was 1 metre and, therefore, each
additional line segment corresponds to tracing the lane marking approximately 1m
further.

In some instances the POI list can be sparse resulting in large gaps between pixels.
This is a particular problem with the centre markings where the gaps (although real)
need to be bridged in order to trace the lane marking. To combat this, the desired line
length is allowed to be very flexible with any POI that lies between 0.5m and 2.0m
of the seed point being considered as a candidate.
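
Working directly in ground-plane coordinates for brevity, the candidate window can be sketched as:

    import math

    def candidate_points(poi_ground, seed_ground, lo=0.5, hi=2.0):
        # poi_ground: list of (x, y) points of interest projected onto the
        # ground plane; the generous 0.5-2.0m window helps bridge the gaps
        # in the dashed centre marking.
        sx, sy = seed_ground
        return [(px, py) for px, py in poi_ground
                if lo <= math.hypot(px - sx, py - sy) <= hi]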

5.3.3.2.1 Distance Transform


Given a set of candidate points for adding a segment, each is evaluated using a
distance transform. The distance transform is an image where the intensity of a pixel
is computed as the distance from that pixel to the nearest pixel in the POI list (Figure
5-20). The naïve approach to generating the distance transform would involve
calculating the distance from every pixel in the image to every pixel in the POI list.
This approach is far too costly to be considered for a real-time system. Fortunately,
the distance transform is a common technique within the field of computer vision and
has been well studied. This project calculates the distance transform using a linear-
time algorithm described by Felzenszwalb and Huttenlocher [18].

Figure 5-20 Example of a distance transform image. The brighter the pixel, the shorter the distance
to the nearest point of interest.

Once the set of candidate points has been generated, the distance transform is used to
evaluate each in the following way. An imaginary line is drawn [22] over the


distance transform image from the seed point to the candidate point under test
(Figure 5-21). The intensities of each pixel that this line crosses are averaged and
kept as the score of the candidate point. The distance transform favours lines that lie
close to points of interest in the image as opposed to lines that bridge large open areas.
The process is repeated for each candidate point and the one with the best score is
selected.
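
A small Python sketch of the scoring step is given below; SciPy's exact Euclidean distance transform stands in for the linear-time algorithm of [18], and lower scores are better here because the image holds raw distances rather than the inverted intensities of Figure 5-20:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    # Toy POI mask: True where a point of interest lies (a dashed line).
    poi_mask = np.zeros((240, 320), dtype=bool)
    poi_mask[120, 40:280:6] = True

    dist = distance_transform_edt(~poi_mask)   # distance to nearest POI

    def line_score(p0, p1, samples=25):
        # Average the distances along the line p0 -> p1 (row, col); lines
        # that hug the points of interest score close to zero.
        rows = np.linspace(p0[0], p1[0], samples).round().astype(int)
        cols = np.linspace(p0[1], p1[1], samples).round().astype(int)
        return dist[rows, cols].mean()

    # A candidate on the marking beats one that bridges open space.
    on_marking = line_score((120, 40), (120, 100))
    off_marking = line_score((120, 40), (60, 100))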

The process of adding line segments is repeated until the trace contains 20 segments
or until no further candidate points can be found. The set of line segments is then
stored as a potential lane marking awaiting verification.

Figure 5-21 Application of the distance transform. Two candidate lines (red) are evaluated using
the distance transform.

Using the distance transform does not guarantee that the selected point lies on the
marking. For example, tracking may jump across from the left lane marking to the
centre markings. This typically occurs if the data is sparse and there is noise in the
image making it more difficult to distinguish between good and bad points using the
distance transform. However, when such mistakes occur, they typically result in
sharp changes in direction of the connected line segments. I found that adding a
constraint that a new segment must be within 30 degrees of being collinear with the
previous segment greatly improved the situation and also prevents backward jumps
along portions that have already been traced.

Figure 5-22 Illustration of variable line segments fitted to the lane marking. The red dots indicate
the separate segments.


5.3.4 Lane Marking Verification


At this point we have traced a single lane marking. Despite the constraints on tracing
the marking, it is still possible to have a false trace. Figure 5-23 shows an example of
a false trace that has been caused by poor selection of the initial seed point. There are
a variety of additional constraints that may be applied to remove such occurrences.
For example, one approach would be to check that all traces are parallel to each other
when projected onto the ground-plane. The approach I took was to add a verification
step that uses the binary image created at the start of the image processing pipeline.

The trace is overlaid onto the binary image and the percentage of underlying pixels
that are set is computed. Traces with less than a 40% hit rate are discarded as being
false positives. A much higher threshold may be used and indeed this works well for
the left and right lane markings but in order to avoid discarding good centre-line
traces the threshold must be less than 50% due to the line being dashed. The value of
40% was found to successfully eliminate almost all false positive traces.
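
The verification test reduces to a simple hit-rate calculation, sketched here in Python:

    def verified(trace_pixels, binary, min_hit_rate=0.4):
        # trace_pixels: (row, col) pixels covered by the trace; binary is
        # the thresholded image from the start of the pipeline. The low
        # 40% bar tolerates the gaps in the dashed centre line.
        hits = sum(1 for r, c in trace_pixels if binary[r, c])
        return hits / len(trace_pixels) >= min_hit_rate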

I had originally intended to use this technique to aid in the classification of a trace
as being dashed or continuous; it is likely that this would work well but it was found
to be unnecessary.

Figure 5-23 Verification process. Left, a trace that passes. Right, a trace that fails.

Once a trace has been verified as being a true lane marking, the POI list is searched
for all points that lie within 5 pixels of any point on the trace. These points are
removed from the list and a new seed point is selected using the next region of
interest.

In total, the process is repeated three times, once for each region of interest. The
process could be repeated more times than this to help with situations such as when
the gaps in the centre-line have not been successfully jumped, however, this was not
found to be necessary.


5.3.5 Lane Marking Classification


The next step is to classify each trace as being the left, right, or centre marking. The
first step in doing this is to project the detected markings onto the ground plane. This
results in the traces being expressed in co-ordinates relative to the position of the
vehicle (which lies at the origin). It is now possible, to some extent, to classify the
markings using their positions relative to each other. This step is trivial in the case
where three separate markings have been detected on a straight road (Figure 5-24). In
such a case we can easily determine, for example, that one trace lies entirely to the
left of the other traces and classify it accordingly.

Figure 5-24 Projecting the detected lane markings onto the ground plane. The vehicle is at the
origin facing directly along the y-axis.

There are, however, situations where this classification is not trivial. For example, as
the vehicle navigates a sharp bend it is common for the lane marking at the inside of
the bend to be completely off-screen. There are, therefore, situations where we do
not have three good traces to classify and, as we explicitly look for three lane
markings, it is possible for two traces to lie on the same marking. This is particularly
true for the centre-line as the initial trace may fail to bridge a gap which leaves
enough of the marking free to be detected by another trace.


Figure 5-25 Example of the left lane marking not being visible. As a result the centre marking has
been detected twice.

Figure 5-25 shows a situation where the left marking is not currently visible. As a
result, two separate traces have been detected on the centre-line. The image on the
right shows the traces projected onto the ground plane. If we classify the markings
based simply on their positions relative to the vehicle, one centre-line would be
classified as being to the left and the other to the right.

Another problem, again as a result of bends, is that the start point of a single trace
may lie to one side of the vehicle but, due to the curvature of the road, the end may lie
to the other side of the vehicle.

Figure 5-26 Example of only a single lane marking being visible.

Figure 5-26 shows an example of a sharp bend where there is only one detected lane
marking. When projected onto the ground plane, the trace starts to the left of the
vehicle and finishes to the right, making classification difficult. It is, therefore, not
sufficient to simply classify a trace based on its start and end positions relative to the
vehicle and other traces.


The solution to this problem lies in a loose assumption of parallelism of not only the
markings but also the planned trajectory of the vehicle. Rather than classifying a
trace using its position relative to the car, it is classified using its position relative to
the planned trajectory of the vehicle.

Therefore, to classify a trace, I make use of the fact that all traces start ahead of the
current vehicle position. The start point of the trace is compared with each point on
the current planned vehicle trajectory in order to find the closest point. The trace is
then classified based on the relative distance between its start point and the closest
point on the vehicle’s trajectory.

The classification is made using prior knowledge of the road width, 10m for the test
track. As the vehicle is assumed to be driving in the left lane, a marking found
approximately 2.5m to the left of the planned trajectory is classified as being the left
marking. Likewise, 2.5m to the right of the trajectory identifies the centre marking
and 7.5m to the right identifies the right marking.
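
A simplified sketch of this rule is given below; it measures the lateral offset crudely as the difference in x (reasonable while the trajectory runs roughly along the vehicle's y-axis) and the tolerance value is an assumption:

    import math

    def classify_trace(start, trajectory, tol=1.25):
        # start: (x, y) start point of the trace in the vehicle frame
        # (x lateral, positive to the right; y ahead).
        nearest = min(trajectory,
                      key=lambda t: math.hypot(t[0] - start[0],
                                               t[1] - start[1]))
        offset = start[0] - nearest[0]
        # Expected offsets follow the 10m road width and the assumption
        # that the vehicle drives in the left lane.
        for label, expected in (("left", -2.5), ("centre", 2.5),
                                ("right", 7.5)):
            if abs(offset - expected) <= tol:
                return label
        return "unclassified"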

This approach immediately leads to the problem of initialisation – we need to
classify the lines in order to plan a trajectory but we need the trajectory to classify
the markings. This is solved by providing an initial ‘straight-ahead’ trajectory,
implicitly making the assumption that the vehicle starts on a straight section of road.
Whilst it is clear that this is not ideal, no adverse impact has been observed as a
result of this assumption even when starting the vehicle in the middle of a bend.


Figure 5-27 Examples of lane marking classification. Red indicates left marking, green is the centre,
and blue is the right marking. The vehicle trajectory is shown in black.

5.4 Planning
This section describes the process by which the desired trajectory and speed of the
vehicle are determined.

5.4.1 Trajectory Calculation


After a trace has been classified, it can be used to contribute towards the planning of
the vehicle’s trajectory. The classification from the previous step is used to generate
a set of trajectory points that define a path that is parallel to the trace. This is done
by taking each segment of the trace on the ground-plane and calculating a point at
the appropriate distance perpendicular to the segment (Figure 5-28). The distance
from the segment to the trajectory point is based directly on the result of the marking
classification step.


Figure 5-28 Generation of trajectory points (blue dots) by casting lines perpendicular to the left
lane marking (red) exploiting the lane marking classification and prior knowledge of the road
width.

This process is repeated for each trace to produce a set of trajectory points which
should all lie on (or at least close to) the desired path of the vehicle (Figure 5-29).
Obviously, due to errors and inaccuracies in the image processing pipeline, these
points only approximate the desired trajectory of the vehicle. Indeed, there may be
points that are a significant distance from the true desired path. It is necessary to
generate a smooth path that is robust to errors and outliers. The method I employed is
to fit a parabola to the set of trajectory points. The vehicle’s current position (0,0) is
also included as a trajectory point in order to encourage a smooth transition from one
trajectory to another.

I used the GNU Scientific Library’s regression functions to compute the quadratic
coefficients using the least-squares error method [28]. Once the best-fitting parabola
is computed, the trajectory points are discarded and a new set of path points
generated at 1m intervals from the vehicle’s current position. These path points are
sent to a separate process that is responsible for generating the steering, throttle, and
brake commands required to control the vehicle (section 5.5).
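
The shape of this computation is sketched below using NumPy's polynomial fit in place of the GSL routines; the input points are invented for illustration:

    import numpy as np

    # Trajectory points in the vehicle frame (x lateral, y ahead), with
    # the vehicle's own position (0, 0) appended as described above.
    traj = np.array([[0.2, 2.0], [0.3, 5.0], [0.7, 9.0], [1.4, 14.0]])
    pts = np.vstack([traj, [0.0, 0.0]])

    # Least-squares parabola x = a*y^2 + b*y + c.
    a, b, c = np.polyfit(pts[:, 1], pts[:, 0], 2)

    # Discard the trajectory points and resample the parabola at 1m
    # intervals ahead of the vehicle.
    ys = np.arange(1.0, 21.0)
    path = np.column_stack([a * ys**2 + b * ys + c, ys])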


Figure 5-29 Parabolic trajectory generation (black) using the set of trajectory points (blue).

5.4.2 Speed Selection


Using a parabola to generate the path has the side-effect of producing a simple
measure of the curvature of the trajectory. The curve has the form x = ay² + by + c
(in the vehicle frame, with y along the heading)
and the magnitude of the coefficient a gives a (simplistic) measure of the curvature
of the path. As each trajectory is calculated, this coefficient is stored and a rolling
RMS value is determined over the previous 10 trajectories (roughly the last 0.5s).
Using an RMS value smooths out fluctuations and removes the effects of sign
changes in the coefficient. The RMS value is used in conjunction with a simple state
transition system to compute the desired speed of the vehicle. Currently, only three
speeds can be commanded by the AI controller: 20km/h, 30km/h, and 40km/h. The
lowest speed is slow enough to ensure even the tightest of bends can be safely
navigated. If the RMS curvature exceeds a threshold, then the speed is reduced to
20km/h. Once the curvature has dropped below this threshold for a period of 3s, the
speed is increased to 30km/h, after a further 2s the speed is increased to 40km/h. This
system, although basic, is sufficient to allow the speed to be reduced prior to
entering a bend and allow the vehicle to accelerate to a more practical speed on the
straights and gentler bends.
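
A compact Python sketch of this state machine follows; the curvature threshold is a placeholder, as the value used in the project is not reproduced here:

    import collections
    import math

    class SpeedSelector:
        def __init__(self, threshold=0.004, window=10):
            self.coeffs = collections.deque(maxlen=window)
            self.threshold = threshold
            self.calm_since = None
            self.speed = 20          # km/h

        def update(self, a, now):
            # Rolling RMS of the parabola's leading coefficient smooths
            # fluctuations and removes the effect of sign changes.
            self.coeffs.append(a)
            rms = math.sqrt(sum(c * c for c in self.coeffs)
                            / len(self.coeffs))
            if rms > self.threshold:
                self.speed, self.calm_since = 20, None
            else:
                if self.calm_since is None:
                    self.calm_since = now
                elif now - self.calm_since > 5.0 and self.speed == 30:
                    self.speed = 40   # a further 2s after reaching 30km/h
                elif now - self.calm_since > 3.0 and self.speed == 20:
                    self.speed = 30   # 3s below the threshold
            return self.speed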

5.5 Acting
This section describes how the vehicle’s throttle, brakes, and steering are controlled
in order to follow the desired trajectory at the desired speed.


The trajectory that was generated by the previous stage was in vehicle-frame
coordinates i.e. the vehicle is fixed at the origin facing directly along the y-axis.
Since the vehicle must travel along the desired trajectory, the coordinates must be
converted to the world-frame with the origin at a fixed location independent of the
vehicle. The vehicle then moves through this space and has a position and heading
(x,y,θ) at any given time.

TORCS provides the global position and heading of the vehicle. This information is
derived by integrating the forces acting on the vehicle over time. In this respect, the
position information can be considered to be equivalent to a highly accurate
GPS/IMU sensor of the type used by entrants to the Urban Challenge. The
information from this sensor is used to convert the trajectory into world-frame
coordinates, fixing it in space. This allows the vehicle to move relative to the
trajectory.

The vehicle control system runs in a separate process and is, therefore, decoupled
from the sensing and planning system. This has several benefits. Firstly, it allows
vehicle control to operate at a higher frequency (50Hz) than the sensing and planning
systems (~25Hz). Secondly, the vehicle will follow the current trajectory until a new
one is available. This leaves the possibility of incorporating a fail-safe mechanism
with the vehicle remaining under control for sufficient time to perform an emergency
stop should the sensing/planning systems fail. Thirdly, the sensing and planning
systems can be modified and improved without the need to modify the control system.

5.5.1 Speed Control


The physical speed of the vehicle is matched to the commanded speed using two
simple PI controllers. One controller is connected to the throttle and allows the
vehicle to maintain a constant speed through throttle control alone and to accelerate
should the commanded speed increase. As with a real vehicle, throttle modulation is
not sufficient to reduce the speed significantly, therefore, a second controller is
connected to the brake pedal. This only becomes active when the vehicle’s speed is
more than 2km/h above the commanded speed.


The two controllers are mutually exclusive with only one being active at any given
time. Each controller generates a command value in the range 0.0…1.0 for
compatibility with the TORCS API. The controller constants Kp and Ki were
determined using the Ziegler-Nichols method [26].
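
The controller pair can be sketched as follows; the gains here are arbitrary placeholders rather than the Ziegler-Nichols values:

    class PI:
        def __init__(self, kp, ki):
            self.kp, self.ki, self.integral = kp, ki, 0.0

        def step(self, error, dt):
            self.integral += error * dt
            return self.kp * error + self.ki * self.integral

    throttle_pi = PI(kp=0.1, ki=0.02)
    brake_pi = PI(kp=0.1, ki=0.02)

    def speed_control(target, actual, dt):
        # Returns (throttle, brake), each clamped to 0.0..1.0 for the
        # TORCS API; the two controllers are mutually exclusive.
        error = target - actual          # km/h
        if error < -2.0:                 # more than 2km/h too fast: brake
            throttle_pi.integral = 0.0
            return 0.0, min(max(brake_pi.step(-error, dt), 0.0), 1.0)
        brake_pi.integral = 0.0
        return min(max(throttle_pi.step(error, dt), 0.0), 1.0), 0.0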

5.5.2 Steering Control


Steering control is more complicated than speed control. Here we are concerned with
navigating the vehicle along a specific trajectory. This not only involves ensuring
that the vehicle is in the correct position but also that it has the correct heading. We
therefore have two concerns: first, to ensure the vehicle is facing in the correct
direction and, second, to minimise the distance between the current position and the
desired position.

The first step is to search the desired trajectory for the point that is closest to the
vehicle’s current position. From this we obtain the desired position and heading. The
steering angle is set to the difference between the current heading and the desired
heading (subject to the physical limitations of the car). This is essentially a
proportional controller with unity gain. This has the effect of ensuring the vehicle
navigates a path that is parallel to the desired path. In order to minimise the lateral
offset between the vehicle and the desired path, the distance to the closest point on
the trajectory is calculated. The previously calculated steering angle is then modified
proportionally to steer towards the desired position.
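
Putting the two corrections together gives something like the sketch below; the offset gain and steering limit are illustrative, and angles follow a left-positive convention:

    import math

    def steer(pose, path, k_offset=0.05, max_steer=0.35):
        # pose: (x, y, theta) of the vehicle in the world frame.
        # path: list of (x, y, theta) points along the desired trajectory.
        x, y, theta = pose
        px, py, pt = min(path,
                         key=lambda p: (p[0] - x)**2 + (p[1] - y)**2)
        # Heading error, wrapped to (-pi, pi].
        angle = math.atan2(math.sin(pt - theta), math.cos(pt - theta))
        # Signed lateral offset: positive if the path lies to the left.
        offset = math.cos(theta) * (py - y) - math.sin(theta) * (px - x)
        angle += k_offset * offset
        return max(-max_steer, min(max_steer, angle))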

As the steering and speed control are separate from the vision system, the vehicle is
able to follow the desired trajectory independently either until it reaches the end of
the planned trajectory or a new updated trajectory is available. In practice, the
average trajectory is around 20m giving the vehicle a path sufficient for 1.8 seconds
at 40km/h.

Chapter 6 Evaluation

This chapter describes the experiments and analysis used to evaluate the system and
the results obtained. As the system contains several sub-systems, an attempt has been
made to evaluate each separately.

6.1 Lane Marking Detection and Classification


In order to evaluate the lane marking detection and classification algorithms, a set of
images was captured as the vehicle progressed around the track. Each image
comprises the original input image with the detected lane markings overlaid and
colour-coded by their classification. Images were captured at 1s intervals over a
single lap of the track, resulting in 241 images. Two example images are shown in
Figure 6-1. The images were then manually assessed and the following parameters
recorded:

• The number of lane markings visible in the image (the dashed centre-line counts as a single lane marking)
• The number of lane markings correctly detected (true positives)
• The number of lane markings incorrectly detected (false positives)
• The number of lane markings correctly classified as left, right, or centre

The image set contained a total of 690 visible lane markings of which 646 were
correctly detected giving a detection rate of 93.6%. There were no false positive lane
marking detections. A total of 635 lane markings were correctly classified meaning
that 98.1% of detected lane markings were correctly classified. Expressed as a
percentage of the total number of visible lane markings, 91.9% were correctly
identified and classified. In addition to this, the data reveal that every image in the set
had at least one correctly detected and classified lane marking, allowing a path to be
computed.

Whilst these results are impressive, it is worth noting that the data set is biased. Lane
marking detection in bends is more difficult than on straights and, as less time is


spent in bends, the difficult sections are under-represented. To combat this, each
image was classified as either a bend or a straight section of road.

Of the 241 images, 169 were of straight or nearly straight sections. These contained a
total of 507 visible lane markings of which 95.7% were detected. Of those markings
detected, 100.0% were correctly classified. Thus, in the straight dataset, 95.7% of
visible lane markings were correctly detected and classified.

The remaining 72 images were classified as being of bend sections. There were a
total of 183 visible lane markings of which 88.0% were correctly detected. Of those
detected, 92.5% were correctly classified. Therefore, in the bend dataset, 81.4% of
visible lane markings were correctly detected and classified.

Figure 6-1 Marking detection and classification results.

6.2 Trajectory Planning


This section provides an assessment of the path planning algorithm.

6.2.1 Generation of Trajectory Points


Using line segments to model the curvature of the lane markings has the benefit of
simplicity but often produces a model that is not smooth. Figure 6-2 illustrates the
effect and its consequences. Some of the line segments are short and are not
tangential to the curve and can, therefore, generate trajectory points that are some
distance from the lane centre. The effect is multiplied in the case where the lane
marking being modelled is the right border. Due to its greater distance from the path
of the vehicle, the computed trajectory points are deflected further.

As the system is running in real-time, it is difficult to assess what impact this effect
has on the vehicle path. However, this type of noise in the data was anticipated and is
mitigated by fitting a parabola to the data to generate the vehicle path.


Figure 6-2 Line segments do not produce a smooth model. The left image shows an example of
trajectory points (blue dots) being generated from a jagged left border marking. The right image
shows trajectory points generated from the right border marking. As the lateral distance is greater,
the generated points are deflected more.

6.2.2 Flat Ground Assumption


Whilst observing the vehicle navigate the track, I noticed that there are occasions
where the vehicle visibly oscillates within its lane. This happens consistently at
points on the track where the vehicle is approaching a change of gradient. When
approaching the base or brow of a hill, the flat ground-plane assumption no longer
holds. Under these circumstances, the lane markings continue to be correctly
identified but the act of projecting them onto a flat ground-plane causes them to be
distorted. The lane markings appear to diverge on approach to the base of a hill and
converge on approach to the brow of a hill. Figure 6-3 illustrates this distortion.

One solution to this would be to extract the road gradient from the 3-dimensional
LIDAR data. The detected lane markings could then be projected onto an accurate
model of the road surface removing the need for the flat ground-plane assumption.


Figure 6-3 Distortion of lane geometry when the flat ground-plane assumption does not hold. The
left trace is taken at the base of the hill causing divergence and an exaggerated distance calculation.
The right image is taken at the brow of a hill causing convergence.

6.2.3 Non-continuous Path


The result of each image processing cycle is the generation of a new parabolic path
for the vehicle to follow. Each cycle is independent and, as such, there is no
guarantee that consecutive paths are continuous. The vehicle will follow the latest
path that it has received and, therefore, any non-continuity will lead to jerky
movements. This phenomenon is not noticeable on straight sections where continuity
is more likely but is noticeable as the vehicle navigates a bend. Figure 6-4 illustrates
the problem.

An attempt was made to solve the problem by using the previous n sets of trajectory
points to generate a parabolic path rather than using just the current set. The idea
was that including multiple sets of points would provide a smoother transition with
the latest data. This is not trivial to do as the parabolas can only be generated in the
vehicle’s co-ordinate frame (as we require the parabola’s directrix to be parallel to
the vehicle’s heading). However, as the vehicle moves by a finite amount during
each cycle we cannot directly merge consecutive sets of trajectory points. In order to
merge the data, a common co-ordinate frame must be used. Therefore, as each
trajectory set is generated, it is converted to world-frame co-ordinates and stored. In
this frame, the previous n sets can be merged directly. The merged data is then
converted back into the current vehicle-frame for parabola fitting. Finally, the
generated parabola is converted back to the world-frame for path following.


Whilst the process of merging the trajectory-sets operated correctly, the resulting
vehicle motion was significantly worse than before. With n set to between two and
ten the vehicle oscillated wildly and control was quickly lost. No investigation was
made as to why the technique failed but this is possibly due to the inaccuracy of
trajectory points far in the distance. By using historic data in this manner, inaccurate
points persist and are allowed to influence the trajectory more than they ought to.
One resolution to this would be to weight the trajectory sets with older data
providing less influence.

A better approach, however, would be to use connected splines rather than parabolas
for path generation. Doing so would guarantee that consecutive paths are continuous.

Figure 6-4 Three consecutive parabolic paths that are not continuous. The red circles indicate the
start point of each path.

6.2.4 Look-ahead Distance


The look-ahead distance is a measure of how far into the distance the system can see
and is equivalent to the length of the path generated for the vehicle to follow. To
measure this, every path generated during the course of a single lap was recorded.
Figure 6-5 shows a histogram of the individual path lengths. The distribution is
multimodal, with peaks around the mean (20.70m), the 9.0m mark, the 14.0m mark,
and the 40.0m mark. The peaks at 9.0m and 14.0m are likely to represent the paths
through bends due to the reduced look-ahead distance. The peak at 40.0m is
explained by the software truncating any longer paths. Although it is difficult to
make direct comparisons, the figure of 20.7m compares well with the 10-15m look-
ahead distance of Odin [10].


If we assume that the average value of 20.70m is representative of the path length on
a straight section of road, we can compute the maximum safe speed of the vehicle.
To do this, we apply the constraint that the vehicle must be able to stop within the
distance for which it has a valid path. Taking 20.70m as a maximum stopping
distance gives an approximate maximum speed of 63km/h in good conditions [27].
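
As a rough check on that figure: with braking-only deceleration a and stopping distance d, the maximum speed is v = √(2ad). Assuming a ≈ 7.4m/s² (about 0.75g, a plausible dry-road value), v = √(2 × 7.4 × 20.7) ≈ 17.5m/s ≈ 63km/h, consistent with the quoted speed.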

Figure 6-5 Histogram of path lengths over a single lap.

6.3 Physical Performance


This section evaluates the system’s ability to act according to the instructions issued
by the AI controller.

6.3.1 Path Following


As the width of the lanes is constant and the desired path aims to be along the
centre of the lane, the centre of the lane can be used as a baseline for the evaluation
of the actual path taken by the vehicle.

As the vehicle progresses around the track the lateral offset from the centre of the
lane is recorded. Figure 6-6 shows the results for a single lap. The maximum lateral
offset experienced during the lap was 118cm. Given that the width of the vehicle is
200cm and the width of the lane is 470cm we can conclude that at no time does the
vehicle cross the centre line. The system, therefore, meets the project objective of
maintaining a safe position on the road.


Comparing the performance of the autonomous vehicle with the expert driver when
driving at a constant 20km/h we can see that the expert driver strays less from the
correct path. As both use the same proportional steering control method to follow
their respective paths, we can conclude that the large lateral offsets experienced by
the autonomous vehicle are the result of poor path planning rather than poor path
following.

By way of comparison, the figure shows lateral offset corrections experienced by
Junior during the Urban Challenge [9]. Note that these offsets are what Junior would
have experienced had it been relying solely on its GPS/IMU system rather than its
probabilistic localisation system. Performance, therefore, seems broadly comparable
with Junior’s GPS/IMU based system.

Figure 6-6 Lateral offsets of the vehicle. Top left, autonomous vehicle travelling at variable speed.
Top right, autonomous vehicle travelling at constant 20km/h. Bottom left, expert vehicle travelling
at constant 20km/h. Bottom right, lateral offset corrections experienced by Junior [9].

Interestingly, the autonomous driver has peaks around 20cm for both constant and
variable speed. The fact that the histogram does not peak at zero suggests that there
may be some bias in the calculation of the vehicle’s path. A small error in the
estimation of distance between the lane markings will result in the trajectory points
not forming a single line. In such a case, the lane marking that contributes the most


points is able to cause bias in the calculated path. This point was not investigated
further.¹

6.3.2 Speed in Bends


The following figure shows the commanded speed of the vehicle as it progresses
around the track. The commanded speed is the speed that the autonomous driver
wishes the vehicle to maintain and can take the values 20, 30, or 40km/h. Upon
receiving a new speed command, the vehicle’s speed controller will accelerate or
decelerate as required.
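The selection logic amounts to thresholding the curvature of the planned path (a
minimal sketch; the curvature thresholds shown are illustrative assumptions, not the
values used by the system):

    // Sketch: select a commanded speed from the curvature of the planned
    // parabolic path. Threshold values are illustrative assumptions.
    double commandedSpeedKmh(double maxPathCurvature)   // curvature in 1/m
    {
        if (maxPathCurvature > 0.05) return 20.0;       // tight bend
        if (maxPathCurvature > 0.02) return 30.0;       // gentle bend
        return 40.0;                                    // straight road
    }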

Figure 6-7 Commanded vehicle speeds around the track. Red = 20km/h, green = 30km/h, and blue
= 40km/h. The left image shows the whole track (travelling clockwise), the right image shows
detail of a complex set of bends (travelling top to bottom).

Figure 6-7 illustrates the system's ability to correctly select the appropriate speed
based on the curvature of the planned trajectory. In many cases, the vehicle is
commanded to reduce speed for a bend prior to entering the bend. The system,
therefore, exhibits predictive behaviour and successfully meets the project objective
of maintaining a speed appropriate to the current road geometry.

6.3.3 G Force Analysis


As mentioned earlier, the vehicle's motion can be jerky whilst navigating bends. This
prompted a comparison of the lateral g-forces experienced by the expert driver with
those of the autonomous driver. Figure 6-8 shows the expert driver in red and the
autonomous driver in blue; both were driving at 20km/h. The difference between the
two is striking.
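For reference, the lateral g-force plotted here can be recovered from the vehicle's
speed and yaw rate using the steady-state turning relation a_lat = vω (a sketch; the
simulator may also expose lateral acceleration directly):

    // Sketch: lateral acceleration in units of g from speed and yaw rate,
    // using the steady-state turning relation a_lat = v * omega.
    double lateralG(double speedMs, double yawRateRadS)
    {
        const double g = 9.81;              // m/s^2
        return speedMs * yawRateRadS / g;
    }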


Looking at the expert driver, the flat sections correspond to straight sections of track,
the positive areas correspond to right-hand bends, and the negative areas to left-hand
bends. Close-up views of the behaviour in a straight section and in a bend are also
provided.

On the straight sections, the expert driver experiences virtually no lateral force with
only slight oscillations around zero. In contrast, the autonomous driver experiences
significantly higher forces. The peak g-force experienced by the expert driver during
the straight section shown is 0.02g (std dev 0.005) compared to 0.29g (std dev 0.068)
for the autonomous driver; a factor of almost 15.

The typical behaviour as the expert driver enters a bend is for the g-force to transition
from near zero to an almost constant value for the duration of the bend before
returning to zero on exit. In contrast, the autonomous driver continues to oscillate,
with both positive and negative g-forces being experienced throughout the bend. The
peak force during the bend shown is 0.23g (std dev 0.07) for the expert driver and
0.97g (std dev 0.17) for the autonomous driver. The likely cause of this is the non-
continuous path problem described in section 6.2.3.


Figure 6-8 Comparison of lateral g-force between expert (red) and autonomous (blue) drivers. Top,
complete lap. Bottom left, straight road. Bottom right, road bends to the right.

6.3.4 Maximum Speed


One of the benefits of using simulations is the ability to test a system to the point of
failure without suffering the consequences of such a failure in the real world. It is,
therefore, possible to determine the maximum speed at which the vehicle remains
controllable by increasing the speed until control is lost.

For this experiment, the vehicle was driven, under autonomous control, along an
almost straight section of track. As the vehicle progressed, its lateral position relative
to the centre of the lane was recorded. The process was then repeated at increasing
speeds until the vehicle was unable to stay within the lane boundary.

Figure 6-9 shows the results from runs at 20km/h, 100km/h and 110km/h. At 20km/h
the vehicle maintains a steady path along the track. At 100km/h the vehicle oscillates
severely along the track but, with a peak lateral offset of only 39cm, it remains well
within the boundaries of the lane. At 110km/h the vehicle becomes unstable with the
oscillations increasing in amplitude until control is lost and the vehicle leaves the
road. In this case, the vehicle drove onto a grassy surface with low friction causing it
to skid many metres from the track.

It is, therefore, possible to state that the maximum straight-line speed at which the
vehicle is controllable is around 100km/h. This easily exceeds the 48km/h limit set
during the Urban Challenge. However, in practice, the maximum safe speed would
be lower than this and the maximum comfortable speed even lower still.

Figure 6-9 Maximum controllable speed. Blue = 20km/h, black = 100km/h, and red = 110km/h.

6.4 Real-time Performance


Table 6-1 gives typical processing times for the main stages of the processing
pipeline. The system runs on a standard Windows laptop with an Intel i7 quad-core
processor. The total cycle time is around 42ms, giving a frame rate of 24Hz. TORCS
and the AI controller each consume ~12% of total processor capacity, giving a
combined load of ~25%. There is, therefore, ample capacity for more computationally
intensive algorithms to be incorporated.


Table 6-1 Pipeline performance

Stage                                    Time (ms)
Pre-processing                           ~1
  - Grey-scale conversion
  - Binary image for verification
Matched filters                          24
Point cloud processing                   ~3
  - Vertical feature identification
  - Data fusion (image masking)
Line tracking                            14
  - Line segments
  - Distance transform
  - Classification
  - Parabola fitting
Total                                    42
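Per-stage timings of this kind can be gathered by bracketing each stage with a
monotonic clock, as in the following sketch (runMatchedFilters is a hypothetical
stand-in for any stage function):

    #include <chrono>
    #include <cstdio>

    void runMatchedFilters();   // hypothetical stage function, for illustration

    // Sketch: time one pipeline stage with a monotonic clock.
    void timeStage()
    {
        using Clock = std::chrono::steady_clock;
        const auto start = Clock::now();
        runMatchedFilters();
        const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
            Clock::now() - start).count();
        std::printf("Matched filters: %lld ms\n", static_cast<long long>(ms));
    }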

Chapter 7 Conclusion

7.1 Summary

This dissertation has examined the impact of the recent DARPA challenges,
particularly the Urban Challenge, on the field of autonomous vehicles. Advances in
sensor technologies such as LIDAR and GPS, along with ever-increasing processing
power, have brought the goal of autonomous vehicles closer.

Two approaches to autonomous vehicle control have been simulated using TORCS.
The first, a reactive approach drawing on the work of the pioneer Dean Pomerleau,
served as a feasibility study for the project. The second, a deliberative approach with
separate sensing, planning, and acting stages, forms the main body of the dissertation.

The deliberative approach was heavily influenced by the work of the MIT racing
team and, in particular, the vision system described by Huang. The system extends
Huang's approach by using twin matched filters to extract road markings from a
forward-facing camera. Simulated LIDAR data is fused with the image data in order
to remove noise caused by roadside barriers. In contrast to Huang's approach, road
markings are modelled using variable-length line segments and a simple verification
process is used to remove false-positive detections. The verified lane markings are
then projected onto the ground plane and classified as left, right, or centre markings
according to their position relative to the vehicle's planned trajectory. Prior
knowledge of the road width is used to convert the lane markings into a parabolic
path for the vehicle. The curvature of this path is used to compute an appropriate
speed for the vehicle. A separate process using simulated GPS/IMU data guides the
vehicle along the desired path at the desired speed. The vehicle's steering, throttle,
and brakes are controlled using PI controllers.
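For reference, fitting the parabolic path y = ax² + bx + c to the trajectory points is a
linear least-squares problem. The following self-contained sketch solves the 3x3
normal equations directly (illustrative only; the project's own numerical routines may
differ):

    #include <cmath>
    #include <cstddef>

    // Sketch: least-squares fit of y = a*x^2 + b*x + c to n trajectory
    // points via the 3x3 normal equations, solved with Cramer's rule.
    // Returns false if the system is degenerate (e.g. fewer than three
    // distinct x values).
    bool fitParabola(const double* x, const double* y, std::size_t n,
                     double& a, double& b, double& c)
    {
        double s0 = n, s1 = 0, s2 = 0, s3 = 0, s4 = 0;
        double t0 = 0, t1 = 0, t2 = 0;
        for (std::size_t i = 0; i < n; ++i) {
            double xi = x[i], xi2 = xi * xi;
            s1 += xi;  s2 += xi2;  s3 += xi2 * xi;  s4 += xi2 * xi2;
            t0 += y[i];  t1 += xi * y[i];  t2 += xi2 * y[i];
        }
        // Normal equations: [s4 s3 s2; s3 s2 s1; s2 s1 s0][a b c]' = [t2 t1 t0]'
        double det = s4*(s2*s0 - s1*s1) - s3*(s3*s0 - s1*s2) + s2*(s3*s1 - s2*s2);
        if (std::fabs(det) < 1e-12) return false;
        a = (t2*(s2*s0 - s1*s1) - s3*(t1*s0 - t0*s1) + s2*(t1*s1 - t0*s2)) / det;
        b = (s4*(t1*s0 - t0*s1) - t2*(s3*s0 - s1*s2) + s2*(s3*t0 - s2*t1)) / det;
        c = (s4*(s2*t0 - s1*t1) - s3*(s3*t0 - s2*t1) + t2*(s3*s1 - s2*s2)) / det;
        return true;
    }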

Analysis of the system’s performance shows that the lane detection and classification
system operates very well, with 91.9% of all lane markings being correctly detected
and classified. The system is able to generate a path for the vehicle such that the
vehicle maintains a safe position and an appropriate speed at all times. Furthermore,
the vehicle is able to navigate the test track reliably having completed well over 100

69
Chapter 7 Conclusion

autonomous laps at speeds of up to 40km/h without failure. The vehicle remains


controllable at speeds of up to 100km/h on straight roads.

However, analysis has also shown that the vehicle experiences considerably higher
lateral g-forces than the benchmark expert driver. The primary cause of this problem
is that consecutive paths generated by the system are not continuous. The result is a
very slight but definite change in direction whenever a new path is generated, which
is particularly apparent in bends. The vehicle also suffers from oscillations on the
approach to changes in road gradient, a result of the flat ground-plane assumption
not holding in these circumstances.

7.2 Future Work and Conclusion


The system successfully meets the project objective of maintaining a safe road
position and speed. Furthermore, the broader goal of using techniques that were seen
in the Urban Challenge has been met by combining vision, LIDAR, and GPS/IMU
data. This project implemented a camera-based vision system despite traditional
vision systems playing a secondary role in the Urban Challenge. However, it is worth
noting that although vehicles such as Junior use lasers to detect road markings, the
process still makes use of intensity images. Therefore, the image processing
techniques described in this thesis remain relevant.

The vision system implemented here operates at around 25Hz, which is faster than
necessary. This is demonstrated by the neural network's vision/control cycle
operating at as low as 11Hz, and by the state-of-the-art road detection system
employed by Boss operating at around 10Hz [8]. Furthermore, the average path
generated by the system is sufficient for ~1.8s of driving (20.7m at 40km/h, i.e.
11.1m/s), suggesting that frame rates below 10Hz may still be capable of maintaining
vehicle control.

Despite the overall success, each part of the system could be improved. Indeed, as it
stands, the system represents the bare minimum functionality expected of an
autonomous vehicle. The most important improvement would be to remedy the non-
continuous path problem to reduce the lateral g-forces. Using a more sophisticated
curve model such as cubic splines would be a vast improvement.
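For illustration, spline-based paths achieve the required continuity by matching
position and tangent at the joins between segments. The following sketch evaluates a
single cubic Hermite segment, one common spline form, per coordinate (this is an
illustration, not the project's implementation):

    // Sketch: one cubic Hermite segment. Consecutive segments that share
    // an endpoint position and tangent join with C1 continuity, avoiding
    // the direction changes of independently fitted parabolas.
    // p0, p1: endpoint values; m0, m1: endpoint tangents; t in [0, 1].
    double hermite(double p0, double m0, double p1, double m1, double t)
    {
        double t2 = t * t, t3 = t2 * t;
        return (2*t3 - 3*t2 + 1) * p0 + (t3 - 2*t2 + t) * m0
             + (-2*t3 + 3*t2) * p1 + (t3 - t2) * m1;
    }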


The project attempted to simulate LIDAR data and, to some extent, succeeded.
However, the approach taken was a compromise between finding a useful data
representation and a practical means of generating it. Initial fears regarding
insufficient processing power proved unfounded and, given a little more time, a ray-
tracing approach could have been implemented. Such an approach would allow the
sensor to detect dynamic objects in the environment rather than only static features,
paving the way for research into interactions with other vehicles, something of
particular importance in an urban environment.

The use of simple PI controllers to steer the vehicle also has its drawbacks. The
technique is only capable of responding to deviations once they occur, i.e. the
controller steers to minimise an existing error. A better approach, commonly used in
the Urban Challenge, would be to implement a kinematic bicycle model (see section
2.5.2.3). Such a model allows the motion of the vehicle to be predicted, enabling it to
steer to prevent deviations from the desired path.
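For reference, the kinematic bicycle model referred to above can be integrated
forward as follows (a minimal sketch; the Euler step and parameter names are
illustrative):

    #include <cmath>

    // Sketch: one Euler step of the kinematic bicycle model.
    // x, y: rear-axle position (m); heading: yaw angle (rad);
    // speed: m/s; steerAngle: front-wheel angle (rad);
    // L: wheelbase (m); dt: time step (s).
    void bicycleStep(double& x, double& y, double& heading,
                     double speed, double steerAngle, double L, double dt)
    {
        x       += speed * std::cos(heading) * dt;
        y       += speed * std::sin(heading) * dt;
        heading += (speed / L) * std::tan(steerAngle) * dt;
    }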

The use of a simulator was, of course, essential to this project. Given the low driving
speeds used in the project, the dynamic capabilities of TORCS easily provided
adequate realism in this respect. The fact that the roadside barriers caused false-
positive marking detections in exactly the manner described by Huang confirms that
TORCS provides good visual realism. However, as the project focussed on a single
track, the style and quality of the road markings were highly consistent. This lack of
variety does not reflect reality and is a weakness of the project. It could be addressed
by using multiple tracks and, indeed, some informal testing was done on other tracks.
The results were very encouraging, though problems arose where the lane markings
differed significantly from those of the test track.

Although it was not the purpose of this project to compare the reactive approach with
the deliberative approach, it is worth finishing by commenting on the difference
between the two. Superficially, the two systems are equivalent in terms of
functionality, with each being able to navigate the test track. The key difference is
that the reactive approach operates as a black-box with the output being a direct
function of the input whereas the deliberative approach decouples perception and
control. This allows the control system to operate at higher frequencies than the
perception system and also facilitates improvements in functionality. The deliberative
approach is aware of the road layout and of the vehicle's position relative to the road.
Consequently, functionality such as driving on the other side of the road is an almost
trivial modification, simply requiring the trajectory points to be placed accordingly.
By contrast, the neural network would require retraining to perform this task.
However, that is not to suggest that the deliberative approach is inherently better than
the reactive one. Indeed, as described in Chapter 2, state-of-the-art vehicles such as
Odin combine the approaches in novel ways.

To conclude, the project successfully met its objective of producing a real-time
simulation of an autonomous car and has demonstrated that low-cost simulators such
as TORCS can be useful for research and development in the field.

Bibliography

[1] A. Broggi, A. Zelinsky, M. Parent, C. Thorpe: Intelligent Vehicles. In: Springer
Handbook of Robotics, Springer, Chapter 51.

[2] G. Dudek, M. Jenkin: Inertial Sensors, GPS, and Odometry. In: Springer
Handbook of Robotics, Springer, Chapter 20.

[3] B. Schwarz: LIDAR: Mapping the World in 3D. In: Nature Photonics, Volume 4,
July 2010.

[4] DARPA: Urban Challenge Rules,
http://www.darpa.mil/grandchallenge/rules.asp, October 27, 2007.

[5] E. Dickmanns, A. Zapp: Autonomous High Speed Road Vehicle Guidance by
Computer Vision. In: Triennial World Congress of the International Federation of
Automatic Control, Volume IV, July 1987.

[6] E. Dickmanns, R. Behringer, D. Dickmanns, T. Hildebrandt, M. Maurer, J.
Schiehlen, F. Thomanek: The Seeing Passenger Car VaMoRs-P. In: Proceedings of
the International Symposium on Intelligent Vehicles, Paris, France, 1994.

[7] D. Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. In:
Advances in Neural Information Processing Systems, Morgan Kaufmann, San
Mateo, CA, 1989.

[8] C. Urmson, J. Anhalt, D. Bagnell et al: Autonomous Driving in Urban
Environments: Boss and the Urban Challenge, Springer Tracts in Advanced
Robotics, Springer, Volume 56, 2009.

[9] S. Thrun, M. Montemerlo, J. Becker et al: Junior: The Stanford Entry in the
Urban Challenge, Springer Tracts in Advanced Robotics, Springer, Volume 56,
2009.

[10] C. Reinholtz, D. Hong, A. Wicks et al: Odin: Team VictorTango's Entry in the
DARPA Urban Challenge, Springer Tracts in Advanced Robotics, Springer, Volume
56, 2009.

[11] P. Currier, D. Hong, A. Wicks: VictorTango Architecture for Autonomous
Navigation in the DARPA Urban Challenge, Journal of Aerospace Computing,
Information, and Communication, Volume 5, 2008.

[12] S. Thrun, M. Montemerlo, H. Dahlkamp et al: Stanley, the Robot That Won the
DARPA Grand Challenge. Journal of Field Robotics 23(9), 661–692 (2006).

[13] R. R. Murphy: Introduction to AI Robotics, The MIT Press, 2000.

[14] B. J. Patz, Y. Papelis, R. Pillat, G. Stein, D. Harper: A Practical Approach to
Robotic Design for the DARPA Urban Challenge, Springer Tracts in Advanced
Robotics, Springer, Volume 56, 2009.

[15] D. Loiacono, J. Togelius, P. L. Lanzi, L. Kinnaird-Heether, S. M. Lucas, M.
Simmerson, D. Perez, R. G. Reynolds, Y. Saez: The WCCI 2008 Simulated Car
Racing Competition, IEEE Symposium on Computational Intelligence and Games,
pp. 119-126, December 2008.

[16] A. M. Muad, A. Hussain, S. A. Samad, M. M. Mustaffa, B. Y. Majlis:
Implementation of Inverse Perspective Mapping Algorithm for the Development of
an Automatic Lane Tracking System, TENCON 2004, IEEE Region 10 Conference,
Vol. A, pp. 207-210, November 2004.

[17] M. Aly: Real Time Detection of Lane Markers in Urban Streets, IEEE
Intelligent Vehicles Symposium, pp. 7-12, June 2008.

[18] P. F. Felzenszwalb, D. P. Huttenlocher: Distance Transforms of Sampled
Functions, Cornell Computing and Information Science TR2004-1963, 2004.

[19] Albert S. Huang, Lane Estimation for Autonomous Vehicles using Vision and
LIDAR, PhD Thesis, Massachusetts Institute of Technology, 2009

[20] J. Leonard, J. How, S. Teller, M. Berger, S. Campbell, G. Fiore, L. Fletcher, E.
Frazzoli, A. Huang, S. Karaman, O. Koch, Y. Kuwata, D. Moore, E. Olson, S. Peters,
J. Teo, R. Truax, M. Walter, D. Barrett, A. Epstein, K. Maheloni, K. Moyer, T.
Jones, R. Buckley, M. Antone, R. Galejs, S. Krishnamurthy, and J. Williams, A
perception driven autonomous urban vehicle, J. Field Robot., vol. 25, no. 10, pp.
727–774, 2008.

[21] M. A. Fischler, R. C. Bolles: Random Sample Consensus: A Paradigm for
Model Fitting with Applications to Image Analysis and Automated Cartography,
Communications of the ACM 24(6), pp. 381-395, June 1981.

[22] J. E. Bresenham: Algorithm for Computer Control of a Digital Plotter, IBM
Systems Journal, Vol. 4, No. 1, pp. 25-30, 1965.

[23] E. Onieva, D. A. Pelta, J. Alonso, V. Milanes, J. Perez: A Modular Parametric
Architecture for the TORCS Racing Engine, IEEE Symposium on Computational
Intelligence and Games (CIG 2009), pp. 256-262, September 2009.

[24] S. Russell, P. Norvig: Artificial Intelligence: A Modern Approach, Prentice
Hall, 1995, p. 578.

[25] P. Currier: Development of an Automotive Ground Vehicle Platform for
Autonomous Urban Operations, Master's thesis, Virginia Polytechnic Institute and
State University, Blacksburg, VA, 2008.

[26] Wikipedia: Ziegler-Nichols Method,
http://en.wikipedia.org/wiki/Ziegler-Nichols_method, 14 August 2011.

[27] Vehicle Stopping Distance Calculator,
http://www.csgnetwork.com/stopdistcalc.html, 14 August 2011.

[28] GNU Scientific Library, http://www.gnu.org/s/gsl/, 14 August 2011

[29] TORCS Source Code, http://sourceforge.net/projects/torcs/, 14 August 2011
