Preprint · March 2021


DOI: 10.13140/RG.2.2.30725.06887



An Open Database Generation with Monte Carlo Based Lane
Marker Detection and Critical Analysis of Vehicle Trajectory -
High-Granularity Highway Simulation (HIGH-SIM)
Dongfang Zhao, Xiaopeng Li*, Xiaowei Shi, Handong Yao

Department of Civil and Environmental Engineering, University of South Florida, 33620 Florida, U.S.

Rachel James
U.S. Department of Transportation, 20590 Washington, DC, U.S.

David K. Hale, Amir Ghiasi


Leidos, Inc, 20024 Washington, DC, U.S.

Abstract
High-granularity vehicle trajectory data can help researchers design traffic simulation models and
develop traffic safety measures for understanding and managing highway traffic. We propose a
trajectory extraction method that extracts long vehicle trajectories from aerial videos. The proposed
method includes video calibration, vehicle detection and tracking, lane identification, and vehicle
position calibration. In the numerical example, the proposed method is applied to several
high-resolution aerial videos, yielding the newly collected High-Granularity Highway Simulation
(HIGH-SIM) vehicle trajectory dataset. In addition, we apply several trajectory data analysis
methods to assess the accuracy and consistency of a trajectory dataset. The quality of the extracted
HIGH-SIM dataset is compared with the NGSIM US-101 data. The results show that HIGH-SIM has
more reasonable speed and acceleration distributions than the NGSIM 101 dataset. Moreover, the
internal consistency and platoon consistency of the extracted HIGH-SIM dataset yield lower errors
compared with the NGSIM 101 dataset. The HIGH-SIM dataset is published through the
data-sharing portal of the Federal Highway Administration, U.S. Department of Transportation, for
public use.
Keywords: Traffic flow, Driver behavior, Microsimulation, Traffic flow theory, Highway traffic,
NGSIM, Empirical data
1. Introduction
Real-world vehicle trajectories play a significant role in studying various traffic phenomena, such
as car-following (Pei et al., 2016) and lane-changing behaviors (Wang et al., 2019; Li et al., 2021;
Soleimaniamiri et al., 2020), traffic oscillation propagation (Li et al., 2012), and traffic capacity
drops (Shi and Li, 2020), and thus enormous efforts have been made to collect real-world vehicle
trajectory datasets. Based on Zhao and Li (2019)'s categorization, existing vehicle trajectory
datasets can be classified into four categories: lidar-based trajectory datasets (Coifman et al., 2016;
Zhao et al., 2017), radar-based trajectory datasets (Victor, 2014), GPS-based trajectory datasets
(Shi and Li, 2020), and aerial-video-based trajectory datasets (Babinec et al., 2014; Kim et al.,
2019; Xu et al., 2017). Because emerging unmanned aerial vehicle (UAV) technology makes the
collection of aerial-video-based trajectory datasets flexible, economical, and unbiased (Kim and
Cao, 2010), such datasets have attracted wide attention from researchers in both industry and
academia.
Existing aerial-video-based trajectory datasets include the Next Generation Simulation (NGSIM)
dataset (https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm) and the Highway Drone (HighD)
dataset (https://www.highd-dataset.com). The NGSIM dataset was collected by the Federal
Highway Administration in 2007. The vehicle trajectory data in the dataset are extracted from
videos taken by multiple digital video cameras installed at different places near the freeway
segments of interest (Kim and Malik, 2003). Vehicles in these videos are detected by a feature-
based vehicle detection algorithm and tracked with a zero-mean cross-correlation matching
algorithm (Kim et al., 2005). Lane markings are identified manually in order to determine the lane
numbers of vehicles. However, based on the results, more than 10 percent of vehicles are not
detected successfully, and tracks can be lost for several consecutive frames. Because of these
detection errors, after accounting for vehicle length, trajectories in the dataset often overrun their
leaders, seemingly resulting in "collisions of trajectories", and the acceleration often exhibits
unrealistically large magnitudes (Punzo et al., 2011; Montanino and Punzo, 2015). Because lost
tracks force trajectories to be interpolated between two points in space observed many seconds
apart, the vehicles' speed exhibits unrealistic piecewise-constant behavior (Coifman and Li, 2017).
These errors in the NGSIM dataset are particularly troublesome for testing and validating
car-following models (Coifman and Li, 2017) or conducting traffic safety assessments.
In comparison with the NGSIM dataset, the HighD dataset (Krajewski et al., 2018), collected by
the Ika team at RWTH Aachen University, provides more accurate trajectory data. The authors
adapt U-Net (Çiçek et al., 2016), a common neural network architecture, to detect and track
vehicles from aerial videos. The aerial videos are recorded by drones that hover next to German
highways and capture the traffic on road sections from a bird's-eye view. U-Net is then applied to
each frame of the videos to detect and track vehicles. Lane markings are annotated manually.
The results show higher detection accuracy than the NGSIM dataset because of the advanced
detection algorithm and high video resolution. However, the trajectory extraction method used for
the HighD data cannot extract trajectories from aerial videos when the camera rotates or shifts
while capturing. Moreover, manually identifying lane structures is nearly impossible when the lane
structures in a study area are complicated. Although some trajectory clustering methods have been
proposed in previous work (Nawaz et al., 2014), their accuracy is below 90 percent. Thus,
accurate lane identification is needed to cluster trajectories.
To extract longer and more accurate vehicle trajectory data from aerial videos, we propose an
advanced vehicle trajectory extraction system. The proposed method uses the efficient object
detection neural network YOLOv3 to detect vehicles and a feature-based tracking method to track
vehicles according to their features and kinematics. Furthermore, we correct camera rotation and
shifting with a multi-point, feature-matching-based video calibration algorithm and identify lanes
with a novel lane identification algorithm based on linear regression and feature matching. As an
effective and robust detection algorithm, YOLOv3 provides high detection accuracy and a high
detection rate. In addition, we propose a Monte Carlo-based lane marker detection method that
detects lane markers accurately and efficiently. Combining the video calibration method with the
detected lane markers, the positions of vehicles are calculated accurately, and the local and GPS
locations of vehicles are obtained through linear regression over the lane markers. By combining
vehicles' positions across multiple videos, the proposed method obtains long vehicle trajectories
from aerial videos.

With the proposed vehicle trajectory extraction method, we extracted a 2-hour, 30 fps vehicle
trajectory dataset with 8,000 ft of coverage: the HIGH-SIM dataset. We then analyze the accuracy,
consistency, and driver behavior of the extracted dataset. Compared with the NGSIM data, the
results show that the HIGH-SIM data have advantages in all aspects discussed.
The remainder of this paper is organized as follows. Section 2 reviews relevant literature and
identifies the unique contributions of this study. Section 3 describes the proposed vehicle trajectory
extraction methods, the Monte Carlo-based lane marker detection algorithm, and the trajectory
analysis methods. Section 4 provides the numerical experiment, in which we extract the HIGH-SIM
data and analyze and compare its quality with the NGSIM data. Section 5 concludes the proposed
method and discusses the advantages of the extracted HIGH-SIM dataset and the corresponding
findings.
2. Literature review
Several studies in the literature report the extraction of aerial-video-based trajectory datasets.
For example, Azevedo et al. (2014) extracted trajectory data from a traffic video recorded at the
A44 motorway, an urban motorway in the southern region of Porto, Portugal; the average length of
the extracted trajectories is around 500 meters. To validate their proposed learning-based trajectory
extraction methods, Kim et al. (2019) extracted trajectory data from two test videos collected at
Korea Expressway No. 1 and No. 120, respectively; the average lengths of the extracted trajectories
for the two videos are 188 and 137 meters. Babinec et al. (2014) extracted trajectory data from a
traffic video captured at a ring road in Bohunice, Brno, Czech Republic; the ring road is about 300
meters long. Xu et al. (2017) proposed an enhanced Viola-Jones vehicle detection method, with an
average extracted trajectory length of around 160 meters. Although various trajectory extraction
methods proposed in the recent literature have significantly enhanced vehicle detection accuracy
and trajectory quality, the aforementioned short-length problem has not yet been resolved. To the
best of the authors' knowledge, the NGSIM dataset contains the longest trajectories, which are still
only about 640 meters.
Existing publicly available aerial-video-based trajectory datasets include the Next
Generation Simulation (NGSIM) dataset (https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm)
and the Highway Drone (HighD) dataset (https://www.highd-dataset.com/#). The NGSIM dataset
was collected by the Federal Highway Administration in 2007. The vehicle trajectory data in the
dataset are extracted from videos taken by multiple digital video cameras installed at different
places near the freeway segments of interest. The longest recorded trajectory in the NGSIM dataset
is around 640 meters (US Highway 101 dataset). Compared with the NGSIM dataset, the HighD
dataset, collected by the Ika team at RWTH Aachen University, records video data with drones
that hover next to German highways and capture the traffic on road sections from a bird's-eye
view. The recorded trajectories in the HighD dataset are around 420 meters long.
To extract vehicle trajectory data accurately, lane detection is critical, since a vehicle's lane
number is calculated from the locations of lanes. Most existing lane detection algorithms are
designed for cameras mounted on (autonomous) vehicles (Kreucher et al., 1998; Chen and Wang,
2006; Zhao et al., 2017; Lee and Moon, 2018). However, these methods are not suitable for
aerial videos in the trajectory extraction system. Behrendt and Witt (2017) proposed a deep
learning-based lane marker detection method with a 97.84% detection rate; however, any detection
errors introduce additional position errors or lane number errors into a vehicle trajectory.
Similarly, Wu et al. (2018) proposed a VH-HFCN-based lane marking segmentation method with
a 96.25% detection rate, and Lee and Yi (2018) proposed a lane tracking method with a Kalman
filter and obtained a 95% detection rate. Detection errors are not eliminated by these methods.
In this paper, we propose an innovative trajectory extraction system with a Monte Carlo-based
lane marker detection method that can extract high-quality vehicle trajectories. The detection rate
of the proposed Monte Carlo method is 100% in the test cases. Combined with a trajectory post-
processing method (see Shi et al. (2021) for more details), the proposed method can help
researchers collect the required long vehicle trajectory datasets from aerial videos. We apply the
proposed trajectory extraction methods to a number of aerial videos and extract the HIGH-SIM
data, which are recorded at 30 fps, cover an 8,000 ft segment, and capture full cycles of highway
congestion. The quality of the extracted HIGH-SIM data is analyzed and compared with the
NGSIM data.
3. Methodology
 Trajectory extraction system

Figure 1. Vehicle Trajectory Extraction System


As shown in Figure 1, there are seven steps in the proposed vehicle trajectory extraction system.
With the given videos and camera parameters (camera height, camera resolution, and camera angle)
in step (a), the videos are decoded and calibrated into frames. In step (b), a number of areas of
interest in the background are selected for feature matching. For each pair of consecutive frames
f_a, f_b, templates of the areas of interest in frame f_a are extracted and tracked in frame f_b
following the template-matching procedure in the OpenCV library
(https://docs.opencv.org/3.4/de/da9/tutorial_template_matching.html) in step (c). From the
locations of the matched templates, a perspective transformation matrix M_ab can be calculated.
With the matrix M_ab, we are able to transform each frame so that the background is relatively
static.
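The stabilization step can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: in practice OpenCV routines (template matching plus `cv2.getPerspectiveTransform`/`cv2.warpPerspective`) would be used. Here the perspective matrix M_ab is estimated from matched point pairs with a direct linear transform (DLT); all function names are ours.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 perspective matrix M_ab mapping src -> dst
    (>= 4 matched point pairs) with the direct linear transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(A, dtype=float)
    # The homography is the null vector of A: the smallest right singular vector.
    _, _, vt = np.linalg.svd(A)
    M = vt[-1].reshape(3, 3)
    return M / M[2, 2]          # normalize so M[2, 2] == 1

def warp_points(M, pts):
    """Apply the perspective transform M to an Nx2 array of points."""
    pts = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts @ M.T
    return out[:, :2] / out[:, 2:3]
```

Applying `warp_points` with the estimated M_ab to every pixel position (or warping the whole frame) yields frames with a shared, static background coordinate system.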
To identify and track vehicles in each frame efficiently and correctly in step (d), we locally train
YOLOv3 (Redmon and Farhadi, 2018) and apply the model to detect vehicles in each frame of the
aerial videos. To train YOLOv3 locally, we generate a training dataset from the aerial video using
a background extraction algorithm. We first apply a Gaussian mixture-based background/foreground
segmentation algorithm (KaewTraKulPong and Bowden, 2002) to extract the background and
foreground of each frame. Since the vehicles move across frames, the algorithm extracts the
contours of the vehicles, along with noise. Therefore, we match the contours in two consecutive
frames by their distances and sizes, and discard a contour if no feasible match is found. In this way,
a portion of the vehicles in the video is detected. We extract the contours of these detected vehicles
and create training data from them. Afterward, we train YOLOv3 with the training data. With the
locally trained YOLOv3, we are able to detect vehicles in the video accurately.
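The background/foreground split that seeds the training data can be illustrated with a simplified stand-in for the Gaussian mixture model: a per-pixel median background. This sketch is illustrative only; the paper uses the adaptive mixture model of KaewTraKulPong and Bowden, and the function name and threshold below are our own assumptions.

```python
import numpy as np

def foreground_masks(frames, threshold=25.0):
    """Simplified background subtraction: use the per-pixel median over
    the clip as the background model, then flag pixels deviating from it
    by more than `threshold` as foreground (candidate moving vehicles)."""
    stack = np.asarray(frames, dtype=float)
    background = np.median(stack, axis=0)   # the static scene survives the median
    return np.abs(stack - background) > threshold
```

Contours extracted from these boolean masks (e.g., with a connected-component pass) would then be matched across consecutive frames by distance and size, as described above.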

The proposed Monte Carlo-based lane marker detection and tracking algorithm is shown in
Algorithm 1. The algorithm takes advantage of the distributional properties of lane markers: lane
markers on two lanes are parallel, and the distance between any two consecutive lane markers on
the same lane is consistent. In addition, the algorithm assumes that the locations of lane markers in
one image are a perspective transformation of their locations in any other image. We first detect
lane marker locations loc_i with traditional template matching, a feature-based method, or a
convolutional neural network in each frame f_i, i ∈ [0, N]. We then select a random subset of
indices r of the detected lane marker locations. With initial lane marker locations loc* distributed
according to the properties above and a perspective transformation matrix calculation function T,
we calculate the perspective transformation matrix M_i = T(loc*[r], loc_i[r]) from the subset r.
The matrix M_i is then used to calculate the expected locations of all lane markers in frame f_i:
loc'_i = M_i loc*. The sum of Euclidean distances S = Σ_{j∈J} ||loc'_{i,j} − loc_{i,j}|| between
the expected locations loc'_i and the detected locations loc_i is calculated to evaluate the
correctness of the perspective transformation matrix. The random selection of the index subset r is
repeated several times to find the optimal perspective transformation matrix M_i*. With the
optimal matrix, the expected lane marker locations are calculated as loc_i* = M_i* loc*.
With the lane markers detected by the proposed Monte Carlo-based lane marker detection and
tracking method, the locations and lane numbers of vehicles are obtained by applying a linear
regression function, fitted to the lane markers, to the vehicles' locations. The locations of the same
vehicle are connected with a car-following-based vehicle trajectory post-processing method, which
fills missing detections and corrects wrong detections.
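The Monte Carlo search over random marker subsets can be sketched as follows. For brevity this illustration fits an affine transform by least squares in place of the full perspective transform T, and all names are our own assumptions rather than the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_affine(src, dst):
    """Least-squares affine transform: dst ≈ [x, y, 1] @ A."""
    X = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A

def monte_carlo_match(loc_star, loc_i, n_trials=50, sample_size=3):
    """Monte Carlo search over random index subsets r: fit a transform on
    loc_star[r] -> loc_i[r], score it by the summed Euclidean distance S
    between predicted and detected markers, and keep the best transform."""
    best_err, best_pred = np.inf, None
    for _ in range(n_trials):
        r = rng.choice(len(loc_star), size=sample_size, replace=False)
        A = fit_affine(loc_star[r], loc_i[r])
        pred = np.hstack([loc_star, np.ones((len(loc_star), 1))]) @ A
        err = np.linalg.norm(pred - loc_i, axis=1).sum()   # the score S
        if err < best_err:
            best_err, best_pred = err, pred
    return best_pred, best_err
```

Because only a small subset is used per trial, trials that happen to sample mis-detected markers score poorly and are discarded, which is what makes the random-subset search robust to detection errors.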

 Trajectory analysis method


To test the performance of the proposed vehicle extraction method, we develop a tool named the
Video-Based Intelligent Road Traffic Universal Analysis Tool (VIRTUAL), which extracts vehicle
trajectory data based on the proposed method. To analyze the quality of the vehicle trajectory data
extracted by VIRTUAL from aerial videos, we apply the trajectory accuracy analysis method
proposed by Punzo et al. (2011).
We analyze the trajectories based on the following factors:
(1) Jerking factor
The jerking factor, ε_t^Z, represents the variation of acceleration over time and is calculated as
the discrete derivative of the acceleration:

ε_t^Z = (â_t − â_{t−1})/Δt,

where t is time, â_t is the observed acceleration of a vehicle at time t, and Δt is the sampling
interval.
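As a concrete reading of this definition, the jerking factor can be computed from a sampled acceleration series as a first difference (a minimal sketch; the function name is ours):

```python
import numpy as np

def jerking_factor(accel, dt):
    """Discrete jerk: first difference of the observed acceleration
    series a_t, divided by the sampling interval dt."""
    accel = np.asarray(accel, dtype=float)
    return np.diff(accel) / dt
```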
(2) Internal consistency
The internal consistency analysis checks whether a vehicle's trajectory is consistent with its speed
and acceleration (Punzo et al., 2005). The internal consistency of space, ε_t^S, is calculated as:

ε_t^S = ŝ_t − (ŝ_0 + ∫_0^t v̂_τ dτ),

where ŝ_t is the observed location of a vehicle and v̂_t is the observed speed of the vehicle at
time t. Similarly, the internal consistency of speed, ε_t^V, is calculated as:

ε_t^V = v̂_t − (v̂_0 + ∫_0^t â_τ dτ).
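Both internal-consistency biases can be computed from sampled trajectories with cumulative trapezoidal integration, comparing observed positions against integrated speed and observed speeds against the initial speed plus integrated acceleration. This is a sketch under our own naming, not the authors' code:

```python
import numpy as np

def internal_consistency(s, v, a, dt):
    """Internal-consistency biases: eps_s compares observed positions with
    the integral of speed; eps_v compares observed speeds with the integral
    of acceleration (cumulative trapezoidal integration, fixed step dt)."""
    s, v, a = (np.asarray(x, dtype=float) for x in (s, v, a))
    int_v = np.concatenate([[0.0], np.cumsum((v[1:] + v[:-1]) / 2 * dt)])
    int_a = np.concatenate([[0.0], np.cumsum((a[1:] + a[:-1]) / 2 * dt)])
    eps_s = s - (s[0] + int_v)
    eps_v = v - (v[0] + int_a)
    return eps_s, eps_v
```

For a perfectly consistent trajectory both bias series are identically zero; measurement and extraction errors show up as nonzero bias.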

(3) Platoon consistency
The platoon consistency estimates the consistency of the trajectories of vehicle pairs in the data.
The platoon consistency of space is calculated as:

ε_{npt}^{PS} := (ŝ_{n0} − ŝ_{p0}) + (∫_0^t v̂_{nτ} dτ − ∫_0^t v̂_{pτ} dτ),

where n and p denote the IDs of the subject vehicle and its following vehicle, respectively.
Similarly, the platoon consistency of speed is calculated as:

ε_{npt}^{PV} := (v̂_{n0} − v̂_{p0}) + (∫_0^t â_{nτ} dτ − ∫_0^t â_{pτ} dτ).
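The space platoon-consistency term, for instance, can be evaluated per vehicle pair in the same way (again a hedged sketch with our own names):

```python
import numpy as np

def platoon_space_consistency(s_n0, s_p0, v_n, v_p, dt):
    """eps^PS for a subject vehicle n and follower p: initial spacing plus
    the difference of the trapezoidally integrated speed series."""
    v_n, v_p = np.asarray(v_n, float), np.asarray(v_p, float)
    int_n = np.concatenate([[0.0], np.cumsum((v_n[1:] + v_n[:-1]) / 2 * dt)])
    int_p = np.concatenate([[0.0], np.cumsum((v_p[1:] + v_p[:-1]) / 2 * dt)])
    return (s_n0 - s_p0) + (int_n - int_p)
```

A negative value of this reconstructed spacing signals an unphysical overlap between the pair, which is one of the error symptoms counted in the tables below.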

The bias measures (ε ∈ {ε_t^Z, ε_t^S, ε_t^V, ε_{npt}^P}) over a vehicle trajectory dataset are
summarized as follows:

 the minimum bias: min(ε);
 the maximum bias: max(ε);
 the mean bias: mean(ε) = Σε/N;
 the root mean square error of the bias: RMSE(ε) = √(Σε²/N).
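These four summary statistics are straightforward to compute for any of the bias series (a sketch with our own naming):

```python
import numpy as np

def summarize_bias(eps):
    """Summary statistics reported in the tables: min, max, mean, RMSE."""
    eps = np.asarray(eps, dtype=float)
    return {"min": eps.min(), "max": eps.max(), "mean": eps.mean(),
            "rmse": float(np.sqrt(np.mean(eps ** 2)))}
```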

4. Numerical Experiment

Figure 2. The study area of the HIGH-SIM dataset.


In the numerical experiment, we apply VIRTUAL to aerial videos and extract a new and
comprehensive dataset, named HIGH-SIM. As shown in Figure 2, the aerial video data were
collected by three 8K cameras on a helicopter from 4:15 to 6:15 pm on Tuesday, May 14, 2019,
over an 8,000 ft segment of the I-75 freeway in Florida, USA. A sample of detected vehicles is
shown in Figure 2, boxed in red. The dataset contains vehicle trajectory data for three lanes and one
off-ramp. The frequency of the video data and of the extracted trajectories, the HIGH-SIM, is
30 Hz. The format of the HIGH-SIM dataset is listed in Table 1. As shown in the table, the format
of the HIGH-SIM dataset is kept consistent with the NGSIM dataset for the convenience of further
trajectory analysis and future public use. The trajectories are plotted in Figure 3. As shown in
Figure 3, congestion exists on lane 0 and the ramp; in comparison, the traffic on lanes 1 and 2
flows much more freely.

Table 1 Format of HIGH-SIM dataset

Column Name Explanation


Vehicle ID ID number for each vehicle
Global Time Time in seconds from 12:00:00 AM of the day
Frame ID Frame number in the corresponding video
Local X (ft) Position in the direction perpendicular to the road
Local Y (ft) Position in the direction along the road
Global X (Longitude) Vehicle’s GPS longitude location
Global Y (Latitude) Vehicle’s GPS latitude location
Width (ft) Vehicle width
Length (ft) Vehicle length
Class (1 motor; 2 auto; 3 truck) Vehicle class
Speed (ft/s) Vehicle speed
Acceleration (ft/s2) Vehicle acceleration
Lane Num Lane number
Space Headway (ft) Distance from this vehicle's front bumper to its
following vehicle's front bumper
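Given this NGSIM-compatible column layout, per-vehicle trajectories can be assembled with a few lines of standard-library code. The sample rows and the exact header strings below are hypothetical, assuming a CSV export with the Table 1 column names:

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV export using a subset of the Table 1 columns.
SAMPLE = """Vehicle ID,Frame ID,Local Y (ft),Speed (ft/s),Lane Num
1,1,0.0,100.0,1
1,2,3.3,100.2,1
2,1,50.0,25.0,0
"""

def trajectories_by_vehicle(text):
    """Group rows of a HIGH-SIM-style file into per-vehicle trajectories
    of (frame, longitudinal position) pairs, ordered by frame."""
    traj = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        traj[row["Vehicle ID"]].append(
            (int(row["Frame ID"]), float(row["Local Y (ft)"])))
    return {vid: sorted(points) for vid, points in traj.items()}
```

Because the layout mirrors NGSIM, analysis code written for NGSIM columns should transfer with little change beyond the 30 Hz (vs. 10 Hz) frame rate.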

Figure 3. Examples of vehicle trajectories in the HIGH-SIM dataset: (a) Lane 0; (b) Lane 1;
(c) Lane 2; (d) Ramp.
To compare against the accuracy of the HIGH-SIM dataset, we also analyzed the accuracy of the
NGSIM US-101 dataset, which was collected on southbound US 101 in Los Angeles, CA. The study
area includes five mainline lanes and an auxiliary lane between the on-ramp at Ventura Boulevard
and the off-ramp at Cahuenga Boulevard. The 45 minutes of data are divided into three 15-minute
periods: 7:50 to 8:05 am, 8:05 to 8:20 am, and 8:20 to 8:35 am. The data recording frequency is
10 Hz.
(1) Distribution of speed and acceleration
We first compare the distributions of speed and acceleration in the two datasets. As shown in
Figure 4, the distribution of speed in the HIGH-SIM data is smoother than in the NGSIM 101 data,
with mainly three peaks. This is consistent with the trajectories plotted in Figure 3: the trajectories
on lanes 1 and 2 show free traffic flow, consistent with the peak at 100 ft/s, while the trajectories
on lane 0 and the ramp are consistent with the peaks at 25 ft/s and 0 ft/s. In comparison, there is
only one peak in the NGSIM 101 data, at 35 ft/s, indicating that the traffic flow is in a congested
condition only. The distributions of acceleration in the two datasets are shown in Figure 5. The
acceleration in the HIGH-SIM data lies in the range [−20 ft/s², 20 ft/s²], and the acceleration in
the NGSIM 101 data lies in the range [−12.5 ft/s², 12.5 ft/s²].

Figure 4. Distribution of speed of HIGH-SIM and NGSIM 101: (a) HIGH-SIM; (b) NGSIM 101.

Figure 5. Distribution of acceleration of HIGH-SIM and NGSIM 101: (a) HIGH-SIM; (b) NGSIM 101.

(2) Internal consistency of speed and acceleration

We compare the two datasets in terms of internal and platoon consistency based on the method
proposed by Punzo et al. (2011). Table 2 shows the results of the space/speed internal consistency
analysis, and Table 3 shows the speed/acceleration consistency of the two datasets. As shown in
the two tables, the internal consistency is largely unsatisfactory for the NGSIM 101 dataset, with
large minimum and maximum biases and a high percentage of bias values greater than 1 (1 ft for
space/speed consistency and 1 ft/s for speed/acceleration consistency). The high RMSE and RMSPE
of the internal consistency in the NGSIM dataset indicate that the integral of acceleration is likely
not consistent with the change of speed, and the integral of speed is likely not consistent with the
change of position. In comparison, the biases of the space/speed and speed/acceleration
consistency of the HIGH-SIM data are much smaller than those of the NGSIM 101 dataset,
indicating much higher space/speed and speed/acceleration consistency of the HIGH-SIM data
over the NGSIM 101 dataset.

Table 2. Internal space/speed consistency of HIGH-SIM and NGSIM 101


Dataset HIGH-SIM NGSIM US-101
Maximum bias max(ε^S) 0.059 6.63
Minimum bias min(ε^S) -0.056 -28.22
Mean bias mean(ε^S) 5.76e-5 -0.098
Percentage of bias greater than 1 ft (P_{ε^S > 1}, %) 0.0 29.2
RMSE of bias (RMSE(ε^S)) 0.0018 0.36
RMSPE (RMSPE(ε^S)) 0.0082 0.4
Percentage of percentage bias greater than 10% (P_{ε^S > 0.1s}) 6.49e-5 0.1

Table 3. Internal speed/acceleration consistency of HIGH-SIM and NGSIM 101


Dataset HIGH-SIM NGSIM US-101
Maximum bias max(ε^V) 0.010 56.56
Minimum bias min(ε^V) -0.001 -63.75
Mean bias mean(ε^V) 7.68e-6 0.30
Percentage of bias greater than 1 ft/s (P_{ε^V > 1}, %) 0.0 65.2
RMSE of bias (RMSE(ε^V)) 0.0018 3.12
RMSPE (RMSPE(ε^V)) 0.061 24
Percentage of percentage bias greater than 10% (P_{ε^V > 0.1v}) 6.49e-5 36.1

In addition, the platoon consistency of space/speed and of speed/acceleration for HIGH-SIM and
NGSIM 101 is shown in Table 4 and Table 5, respectively. As shown in the two tables, the
maximum, minimum, and mean biases of the HIGH-SIM data are much closer to zero than those
of the NGSIM 101 dataset. In particular, the percentage of bias greater than 1 in the HIGH-SIM
data is 0 for space/speed consistency (ft) and nearly 0 for speed/acceleration consistency (ft/s),
showing much higher platoon consistency than the NGSIM 101 data, which has 9.93% of
space/speed biases greater than 1 ft and 65.88% of speed/acceleration biases greater than 1 ft/s.
The HIGH-SIM data also have a smaller number and percentage of vehicle pairs with negative,
unphysical spacing than the NGSIM 101 data.
Therefore, based on the comparison of the internal and platoon consistency of the HIGH-SIM and
NGSIM 101 datasets, the HIGH-SIM dataset is expected to have higher data quality in both
aspects.
Table 4. Platoon space/speed consistency of HIGH-SIM and NGSIM 101
Dataset HIGH-SIM NGSIM US-101
Maximum bias max(ε^PS) 0.014 11.25
Minimum bias min(ε^PS) -0.056 -4.40
Mean bias mean(ε^PS) 1.19e-6 -0.033
Percentage of bias greater than 1 ft (P_{ε^PS > 1}, %) 0.0 9.93
RMSE of bias (RMSE(ε^PS)) 7.96e-4 0.23
RMSPE (RMSPE(ε^PS)) 3.16e-5 0.5
Percentage of percentage bias greater than 10% (P_{ε^PS > 0.1}) 0.0 0.4
Total number of vehicle pairs 7057678 985552
Number of vehicle pairs with negative inter-vehicle spacing 1232 1438
% of vehicle pairs with negative inter-vehicle spacing 1.74e-4 0.0014

Table 5. Platoon speed/acceleration consistency of HIGH-SIM and NGSIM 101


Dataset HIGH-SIM NGSIM US-101
Maximum bias max(ε^PV) 1.33 88.28
Minimum bias min(ε^PV) -0.87 -88.80
Mean bias mean(ε^PV) -2.39e-4 0.035
Percentage of bias greater than 1 ft/s (P_{ε^PV > 1}, %) 1.45e-4 65.88
RMSE of bias (RMSE(ε^PV)) 0.11 4.51
RMSPE (RMSPE(ε^PV)) 0.0028 0.170
Percentage of percentage bias greater than 10% (P_{ε^PV > 0.1}) 1.28e-6 0.113

5. Conclusion
In this paper, we proposed an advanced vehicle trajectory extraction system to extract longer and
more accurate vehicle trajectory data from aerial videos. The proposed system applies YOLOv3
for vehicle detection and a feature matching method for vehicle tracking. In addition, a novel
Monte Carlo-based lane marker tracking method is proposed to track lane markers across frames.
By implementing the proposed system, we developed a vehicle trajectory tool called VIRTUAL
and extracted a new and long trajectory dataset, HIGH-SIM. We analyzed the quality of HIGH-SIM
and compared the internal and platoon consistency of the HIGH-SIM and NGSIM US-101 datasets.
The results show that the new HIGH-SIM dataset provides much higher-quality trajectory data,
which not only indicates the efficacy of the proposed vehicle extraction method but also provides
a high-quality trajectory dataset for trajectory-related research activities. The HIGH-SIM dataset
can support the theoretical development of new behavioral models, the estimation of model
parameters, and their validation.
The proposed trajectory extraction method can be applied generally to extract vehicle trajectories
from aerial videos. To extend the method to other objects and achieve higher accuracy, it can be
revised to obtain the camera's 3D motion data and calibrate the camera movement, which will help
build a more accurate map of the areas of interest. In addition, the accuracy of object detection and
location extraction can be improved by using an RGBD camera to record the aerial video: with a
depth dimension for each detected object, the detection and location errors caused by shadows can
be eliminated.

Acknowledgement
We thank Dr. Zhenyu Wang (University of South Florida) for suggesting the YOLOv3 object
detection network for the object detection part of the proposed trajectory extraction system.
Reference
Azevedo, C.L., Cardoso, J.L., Ben-Akiva, M., Costeira, J.P., Marques, M., 2014. Automatic
Vehicle Trajectory Extraction by Aerial Remote Sensing. Procedia - Soc. Behav. Sci. 111,
849–858. https://doi.org/10.1016/j.sbspro.2014.01.119
Babinec, A., Herman, D., Cecha, S., 2014. Automatic Vehicle Trajectory Extraction For Traffic
Analysis From Aerial Video Data.
Behrendt, K. and Witt, J., 2017, September. Deep learning lane marker segmentation from
automatically generated labels. In 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS) (pp. 777-782). IEEE.
Chen, Q. and Wang, H., 2006, June. A real-time lane detection algorithm based on a hyperbola-
pair model. In 2006 IEEE Intelligent Vehicles Symposium (pp. 510-515). IEEE.
Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T. and Ronneberger, O., 2016, October. 3D U-
Net: learning dense volumetric segmentation from sparse annotation. In International
conference on medical image computing and computer-assisted intervention (pp. 424-432).
Springer, Cham.
Coifman, B. and Li, L., 2017. A critical evaluation of the Next Generation Simulation (NGSIM)
vehicle trajectory dataset. Transportation Research Part B: Methodological, 105, pp.362-
377.
Coifman, B., Wu, M., Redmill, K., Thornton, D.A., 2016. Collecting ambient vehicle trajectories
from an instrumented probe vehicle: High quality data for microscopic traffic flow studies.
Transp. Res. Part C Emerg. Technol. 72, 254–271. https://doi.org/10.1016/j.trc.2016.09.001
FHWA, 2008. The Next Generation Simulation (NGSIM) [Online]. Available:
<http://www.ngsim.fhwa.dot.gov/>.
KaewTraKulPong, P. and Bowden, R., 2002. An improved adaptive background mixture model
for real-time tracking with shadow detection. In Video-based surveillance systems (pp. 135-
144). Springer, Boston, MA.

Kim, Z., Gomes, G., Hranac, R. and Skabardonis, A., 2005, November. A machine vision system
for generating vehicle trajectories over extended freeway segments. In 12th World Congress
on Intelligent Transportation Systems.
Kim, Z. and Malik, J., 2003, October. Fast vehicle detection with probabilistic feature grouping
and its application to vehicle tracking. In Proceedings of the IEEE International Conference
on Computer Vision (p. 524). IEEE.
Kim, E.J., Park, H.C., Ham, S.W., Kho, S.Y., Kim, D.K., Hassan, Y., 2019. Extracting Vehicle
Trajectories Using Unmanned Aerial Vehicles in Congested Traffic Conditions. J. Adv.
Transp. 2019. https://doi.org/10.1155/2019/9060797
Kim, Z.W., Cao, M., 2010. Evaluation of feature-based vehicle trajectory extraction algorithms.
IEEE Conf. Intell. Transp. Syst. Proceedings, ITSC 99–104.
https://doi.org/10.1109/ITSC.2010.5625278
Krajewski, R., Bock, J., Kloeker, L. and Eckstein, L., 2018, November. The highd dataset: A
drone dataset of naturalistic vehicle trajectories on german highways for validation of highly
automated driving systems. In 2018 21st International Conference on Intelligent
Transportation Systems (ITSC) (pp. 2118-2125). IEEE.
Kreucher, C., Lakshmanan, S. and Kluge, K., 1998, October. A driver warning system based on
the LOIS lane detection algorithm. In Proceedings of IEEE international conference on
intelligent vehicles (Vol. 1, pp. 17-22). Stuttgart, Germany.
Lee, C. and Moon, J.H., 2018. Robust lane detection and tracking for real-time applications. IEEE
Transactions on Intelligent Transportation Systems, 19(12), pp.4043-4048.
Lee, J. and Yi, K., 2018. A Method of Lane Marker Detection Robust to Environmental Variation
Using Lane Tracking. Journal of Korea Multimedia Society, 21(12), pp.1396-1406.
Li, Q., Li, X. and Mannering, F., 2021. Assessment of Discretionary Lane-Changing Decisions
using a Random Parameters Approach with Heterogeneity in Means and Variances.
Transportation Research Record, p.0361198121992364.
Li, X., Wang, X., Ouyang, Y., 2012. Prediction and field validation of traffic oscillation
propagation under nonlinear car-following laws. Transp. Res. Part B Methodol. 46, 409–
423. https://doi.org/10.1016/j.trb.2011.11.003
Montanino, M. and Punzo, V., 2015. Trajectory data reconstruction and simulation-based
validation against macroscopic traffic patterns. Transportation Research Part B:
Methodological, 80, pp.82-106.
Nawaz, T., Cavallaro, A. and Rinner, B., 2014, October. Trajectory clustering for motion pattern
extraction in aerial videos. In 2014 IEEE International Conference on Image Processing
(ICIP) (pp. 1016-1020). IEEE.
Pei, X., Pan, Y., Wang, H., Wong, S.C., Choi, K., 2016. Empirical evidence and stability analysis
of the linear car-following model with gamma-distributed memory effect. Phys. A Stat.
Mech. its Appl. 449, 311–323. https://doi.org/10.1016/j.physa.2015.12.104
Punzo, V., Borzacchiello, M.T. and Ciuffo, B., 2011. On the assessment of vehicle trajectory data
accuracy and application to the Next Generation SIMulation (NGSIM) program data.
Transportation Research Part C: Emerging Technologies, 19(6), pp.1243-1262.
Punzo, V., Formisano, D.J., Torrieri, V., 2005. Nonstationary Kalman filter for estimation of
accurate and consistent car-following data. Transportation Research Record: Journal of the
Transportation Research Board 1934, 3–12.
Redmon, J. and Farhadi, A., 2018. Yolov3: An incremental improvement. arXiv preprint
arXiv:1804.02767.
Soleimaniamiri, S., Shi, X., Li, X.S. and Hu, Y., 2020. Incorporating Mixed Automated Vehicle
Traffic in Capacity Analysis and System Planning Decisions.
Victor, T., 2014. Analysis of Naturalistic Driving Study Data: Safer Glances, Driver Inattention,
and Crash Risk, Analysis of Naturalistic Driving Study Data: Safer Glances, Driver
Inattention, and Crash Risk. Transportation Research Board, Washington, DC
https://doi.org/10.17226/22297
Wang, Z., Shi, X., Li, X., 2019. Review of Lane-Changing Maneuvers of Connected and
Automated Vehicles: Models, Algorithms and Traffic Impact Analyses. J. Indian Inst. Sci.
https://doi.org/10.1007/s41745-019-00127-7
Wu, Y., Yang, T., Zhao, J., Guan, L. and Jiang, W., 2018, June. VH-HFCN based parking slot
and lane markings segmentation on panoramic surround view. In 2018 IEEE Intelligent
Vehicles Symposium (IV) (pp. 1767-1772). IEEE.
Xu, Y., Yu, G., Wu, X., Wang, Y., Ma, Y., 2017. An Enhanced Viola-Jones Vehicle Detection
Method from Unmanned Aerial Vehicles Imagery. IEEE Trans. Intell. Transp. Syst. 18,
1845–1856. https://doi.org/10.1109/TITS.2016.2617202
Yim, Y.U. and Oh, S.Y., 2003. Three-feature based automatic lane detection algorithm
(TFALDA) for autonomous driving. IEEE Transactions on Intelligent Transportation
Systems, 4(4), pp.219-225.
Zhao, D., Li, X., 2019. Real-World Trajectory Extraction from Aerial Videos - A Comprehensive
and Effective Solution. 2019 IEEE Intell. Transp. Syst. Conf. ITSC 2019 2854–2859.
https://doi.org/10.1109/ITSC.2019.8917175
Zhao, H., Wang, C., Lin, Y., Guillemard, F., Geronimi, S., Aioun, F., 2017. On-Road Vehicle
Trajectory Collection and Scene-Based Lane Change Analysis: Part I. IEEE Trans. Intell.
Transp. Syst. 18, 192–205. https://doi.org/10.1109/TITS.2016.2571726