Sustainability 15 05618
Article
GPS Data Analytics for the Assessment of Public City Bus
Transportation Service Quality in Bangkok
Rathachai Chawuthai 1 , Agachai Sumalee 2 and Thanunchai Threepak 1, *
1 School of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand;
rathachai.ch@kmitl.ac.th
2 School of Integrated Innovation, Chulalongkorn University, Bangkok 10330, Thailand; agachai.s@chula.ac.th
* Correspondence: thanunchai.th@kmitl.ac.th; Tel.: +66-2329-8341 (ext. 114)
Abstract: Evaluation of the quality of service (QoS) of public city buses is generally performed using
surveys that assess attributes such as accessibility, availability, comfort, convenience, reliability,
safety, security, etc. Each survey attribute is assessed from the subjective viewpoint of the service
users. This is reliable and straightforward because the consumer is the one who accesses the bus
service. However, in addition to summarizing personal feedback from humans, using data analytics
has become another useful method for assessing the QoS of bus transportation. This work aims to use
global positioning system (GPS) data to measure the reliability, accessibility, and availability of bus
transportation services. There are three QoS scoring functions for tracking complete trips, on-path
driving, and on-schedule operation. In the analytical process, GPS coordinates rounding is adopted
and applied for detecting trips on each route path. After assessing the three QoS scores, it has been
found that most bus routes have good operations with high scores, while some bus routes show room
for improvement. Future work could use our data to create recommendations for policy makers in
terms of how to improve a city’s smart mobility.
Keywords: bus transportation; GPS data analytics; quality of service; smart city; smart mobility;
urban informatics
Citation: Chawuthai, R.; Sumalee, A.; Threepak, T. GPS Data Analytics for the Assessment of Public City Bus Transportation Service Quality in Bangkok. Sustainability 2023, 15, 5618. https://doi.org/10.3390/su15075618
Academic Editor: Juneyoung Park
Received: 30 January 2023
Revised: 18 March 2023
Accepted: 20 March 2023
Published: 23 March 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
City bus transportation is a public transportation option that is commonly used in many countries as it supports the growing transportation demand and takes into account affordability for passengers [1]. Thus, having qualified bus services becomes a key factor for smart life in a city. In this case, before enhancing the service quality, we need to understand the current quality of service (QoS) of bus transportation, then improve it point by point. The QoS of city bus transportation is generally measured by user surveys: e.g., Wethyavivorn and Sukwattanakorn [2], Ueasangkomsate [3], Chan et al. [4], Page and Yue [5], and Goyal et al. [6]. These studies found that the common issues are accessibility, availability, reliability, security, and comfortability. As to research from Thailand, the authors of [2,3] stated that passengers in particular areas of Bangkok had serious concerns about the physical facilities and service reliability. The results of [3] were reported to the government to help it plan policies for enhancing the efficiency of public buses. The relevant works are reviewed in Section 2 and summarized in Table 1.
As can be seen, survey results help a city to explore issues from the viewpoints of users in order to improve bus services. It is well known that survey results depend on the individual. This means that obtaining feedback from a large number of people can reflect most of the problems and needs of citizens. However, in the age of data technology, using data to measure the quality of service of city bus transportation has become another way to understand the issues. Thus, this work aims to contribute data for measuring the QoS of bus transportation by focusing on the aspects of accessibility, availability, and reliability, which can benefit directly from data analytics.
Table 1. Summary of literature studies on the uses of GPS technology for transportation and the
quality of service of bus transportation.
Our approach defines three scoring levels, QoS-1, QoS-2, and QoS-3, to describe all objectives. Taking a closer look at the situation of the management of public city bus transportation in Bangkok, there are four challenges that our work faces. First, there is no wireless sensor detecting a bus at a bus stop; as some works have mentioned [15,16], the analytics of GPS transactions with route polylines are adopted to detect trips of buses. Second, a bus route in Bangkok could take several different courses depending on the demand from passengers and the strategies of the bus operators. There must be main routes on a bus route, but it is also possible to have subpaths, which are shorter versions of the main path, and split paths, which diverge from the main path to go to other destinations. Third, a bus can choose any paths in a day following the schedule conditions from a bus route provider, so we need to use data analytics to detect the path that a bus drove through. Last, there is no executable timetable to show the departure time. In fact, schedule conditions only provide the number of trips in any time period, while bus providers manage the departure time by themselves.
Due to these issues, data analytics on GPS data and other datasets is mainly employed to determine the QoS scores. In this case, our method provides four phases, input, preprocessing, scoring, and output, as depicted in Figure 1. Input data are the GPS transactions of the buses, the polyline of every bus route, and the schedule conditions of all bus routes. To work with GPS data, the technique of GPS coordinates rounding is adopted at the preprocessing phase. Then, bus trips and metadata are calculated in order to measure three QoS scoring functions. Our work resulted in the QoS score of each bus route for the three months of the last quarter of 2021, and found that there was room for improvement in the sustainability of bus transportation services.
[Figure 1. Diagram of the four phases of the method. Labels surviving from the diagram: Bus GPS Transactions (3.2.1); Trajectory-Route Matching (3.3.2); Bus-Trip Detecting (3.4.1); Complete Bus Trips Tracking (3.5); QoS-1 Score; Bus Schedule Conditions (3.2.3); Bus Schedule Tracking (3.7); QoS-3 Score.]
This manuscript contains five sections. The first provides an overall introduction to our work. Second, we review the uses of GPS in transportation, the quality of service of bus transportation, and the technical methods of GPS data processing. The third section explains the data and proposed methods for calculating the three QoS scoring functions. The fourth section demonstrates the results of our analytical methods in the form of tables and charts, together with a discussion. In the last section, a summary and recommended future work based on our approach are provided.
2. Literature Review
This section studies the uses of GPS technology for transportation and the QoS of
bus transportation in several works, which are summarized in Table 1. In addition, the
technique of GPS coordinates, which is used to analyze spatial data, is reviewed.
extra trips, number of curtailed trips, total number of employees, number of routes, and
route distance.
Other works from Thailand [2,3] surveyed the QoS of public transportation based on
five dimensions: tangibility, reliability, responsiveness, assurance, and access. The authors
analyzed the results and concluded that the perceived quality of service in the Bangkok
metropolitan area and the East region was similarly poor and improvement is required
on some attributes, such as the number of buses, availability, precise bus schedules, buses’
current locations, safety, driver ability, interconnection of the transport system, etc.
Third, bus trips are analyzed in order to input data for calculating the three QoS scores.
The scores are for complete trip tracking, bus-driving route tracking, and bus schedule
tracking. This is discussed in Sections 3.4–3.7.
QoS-1, QoS-2, and QoS-3 scores are the output of the three steps.
3.1. Definitions
Our method introduces various terms, defined as follows:
- p (e.g., p1): An original coordinate point, that is, a pair of latitude
and longitude.
- p with a dot (e.g., p1.1): An inner point between original coordinate points.
- p* (e.g., p*1, p*1.1): A rounding box of a coordinate point p.
- p*(x,y) (e.g., p*1(+1,+2) ): A neighbor of a p*. For example, if the 2-decimal rounding
box p*1 is (13.00, 100.00), the neighbor p*1(+1,+2) is (13.00 + 1 × 0.01, 100.00 + 2 × 0.01)
being (13.01, 100.02).
- P (e.g., P1): A path that is a sequence set of p.
- P* (e.g., P1*): A path P whose points are rounded.
- P** (e.g., P1**): A path that contains all neighbors of all coordinate points from P*.
- POR (b*, P**): A function to detect a point of bus (b*) on a path (P**).
Table 2. Example GPS data of a bus on route R7234. In this table, the bid is a bus identifier, route is a
route number, ts is a timestamp, lat is a latitude, lon is a longitude, and speed is a speed in kilometers
per hour.
Figure 2. Behaviors of bus routes and paths in Thailand. (1) A loop path. (2) A two-direction path. (3) A main path and subpath. (4) A main path and split path.
Due to the details of routes and paths described in the previous paragraph, an example of a bus route polylines dataset is presented in Table 3, with route, path_id, path_type, direction, and polyline. Each entry in this table is a single path, where one route can have many paths due to the type and direction of the path. In addition, one route must have a main path with only direction, go or back, but may have many split paths and subpaths.
- route: a route number.
- path_id: a unique identifier of a path.
- path_type: the type of path, which can be main, split, or sub.
- direction: the bus direction of a path, which can be go or back.
- begin_point: the begin point of the polyline.
- end_point: the ending point of the polyline.
- polyline: the sequence set (array) of coordinates.
The updated dataset of bus route polyline data from 2021 for Bangkok and its metropolitan area has 1085 entries, including 454 routes, as shown in Figure 3; each route has 2.4 paths, 0.7 split paths, and 0.2 subpaths on average.
The value of the field param is dependent on the con_type. First, each path must have
one condition, with con_type being “all trips” in order to check the minimum number of
trips. As in the first entry (con_id = 1), the path_id R7234.00 must have 50 trips. Second, if
the con_type is “count,” the parameter (param) is the number of buses. If the con_type is
“headway,” the parameter is the bus headway in minutes. In this case, the second condition
(con_id = C0002) states that the number of bus trips on the path “R7234.00” of the
route “R7234” between 05:00 and 21:00 must be at least 50. Last, the third condition
(con_id = C0003) states that, between 06:00 and 09:00, the interval between the start times
of consecutive trips (the headway) must be no more than 10 min. Conditions C0013, C0014,
and C0015 are set to be example cases in the next section.
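As an illustration of how such conditions might be evaluated, the following is a minimal sketch and not the scoring function used in this work. The names con_type and param come from the dataset description above; the function names, the "HH:MM" string format, the inclusive time window, and the reading of "headway" as the gap between consecutive trip start times are all assumptions made for this example.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def meets_condition(trip_starts, con_type, start, end, param):
    """Check one hypothetical schedule condition against trip start times.

    trip_starts: list of 'HH:MM' strings (assumed format).
    con_type:    'count' (at least `param` trips in the window) or
                 'headway' (every gap between consecutive starts <= `param` min).
    start, end:  window bounds as 'HH:MM', treated as inclusive (assumption).
    """
    low, high = to_minutes(start), to_minutes(end)
    in_window = sorted(
        t for t in (to_minutes(s) for s in trip_starts) if low <= t <= high
    )
    if con_type == "count":
        return len(in_window) >= param
    if con_type == "headway":
        gaps = [later - earlier for earlier, later in zip(in_window, in_window[1:])]
        # With fewer than two trips there is no gap to test, so this passes.
        return all(gap <= param for gap in gaps)
    raise ValueError(f"unsupported con_type: {con_type!r}")


starts = ["06:00", "06:08", "06:17", "06:30"]
print(meets_condition(starts, "count", "06:00", "09:00", 3))
print(meets_condition(starts, "headway", "06:00", "09:00", 10))
```

Note that a headway check on its own says nothing when only one trip falls in the window, so in practice it would likely be paired with a count condition.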
Figure 4. Steps to construct GPS rounding boxes. (1) An original polyline. (2) Inner points between corner points. (3) The construction of a rounding box grid. (4) Mapping a point into its rounding box. (5) The representation of rounding box of each point with a star symbol. (6) A guideline for creating the first-layer neighbors of a given rounding box. (7) The neighbors of the first rounding box. (8) All neighbors of all rounding boxes.
Step 1, Figure 4(1): P represents a bus path that is a set of sequence points p from the begin point to the ending point. For example,
P = {p1, p2, p3}. (1)
Step 2, Figure 4(2): Since most points on polylines are corner points, a distance between adjacent points might be far in case of a long straight line. Thus, we need to find inner points between corner points. The distance of nearby inner points can be adjusted depending on developers, such as 10 m. For example, as with path P in step (1), the inner points between p1 and p2 might be p1.1 and p1.2. Thus, P can be written as follows:
P = {p1, p1.1, p1.2, p2, p2.1, p3}. (2)
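One possible way to generate the inner points of Step 2 is sketched below. It is only an illustration: the function names, the use of the haversine distance, the mean Earth radius constant, and linear interpolation in latitude and longitude are assumptions of this example, not details taken from the method above.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def add_inner_points(path, spacing_m=10.0):
    """Insert evenly spaced inner points so adjacent points are at most
    roughly `spacing_m` apart; the original corner points are kept."""
    if not path:
        return []
    dense = [path[0]]
    for start, end in zip(path, path[1:]):
        pieces = max(1, math.ceil(haversine_m(start, end) / spacing_m))
        for k in range(1, pieces):
            t = k / pieces
            # Linear interpolation is adequate for segments of a few hundred meters.
            dense.append((start[0] + t * (end[0] - start[0]),
                          start[1] + t * (end[1] - start[1])))
        dense.append(end)
    return dense


corners = [(13.0, 100.0), (13.0, 100.001), (13.001, 100.001)]
print(len(add_inner_points(corners, spacing_m=10.0)))
```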
Step 3, Figure 4(3–5): All points of P are rounded into rounding boxes. The rounding digit is customizable by developers. In an area close to the equator such as Thailand, the size of 0, 1, 2, 3, 4, and 5-digit rounding boxes is approximately 100 km, 10 km, 1 km, 100 m, 10 m, and 1 m, respectively. For example, if the coordinates of a point p are (13.13243, 100.47386), the 3-digit rounding box of p will be p* = (13.132, 100.474). According to step (2), the rounding boxes of the path P are P* in the following line:
P* = {p*1, p*1.1, p*1.2, p*2, p*2.1, p*3}. (3)
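The rounding of Step 3 can be sketched in a few lines. The function names are hypothetical, and collapsing consecutive points that fall into the same box is a choice made for this example rather than something prescribed above.

```python
def rounding_box(point, digits=3):
    """Round a (lat, lon) pair to `digits` decimal places."""
    lat, lon = point
    return (round(lat, digits), round(lon, digits))


def rounded_path(path, digits=3):
    """Map every point of a path to its rounding box, skipping a box
    that merely repeats the previous one (an assumption of this sketch)."""
    boxes = []
    for point in path:
        box = rounding_box(point, digits)
        if not boxes or boxes[-1] != box:
            boxes.append(box)
    return boxes


# The example point from Step 3.
print(rounding_box((13.13243, 100.47386)))  # (13.132, 100.474)
```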
Step 4, Figure 4(6–8): The rounding boxes of P* in the previous steps cannot create a continuous route path. In our work, we have to create neighbors of a rounding box in order to connect all rounding boxes and expand the area of a path. The neighbors are created around a box in all directions. A neighbor is defined by p*(x,y), where subscripts x and y are the shifting direction of the current p*. For example, if the three-digit rounding box of p is p* = (13.132, 100.474), the p*(–1,–1) is (13.132–0.001, 100.474–0.001), which becomes (13.131, 100.473). In this case, the original p* is represented by p*(0,0). It means that one-layer neighbors are nine boxes, including the original one. If a developer chooses two-layer neighbors, there will be 25 boxes. Thus, the number of neighbors including the original one is (2n + 1)^2, where n is the number of layers surrounded.
As demonstrated in Figure 4(6,7), the neighbors of the point p*1, including itself, can be p*1(–1,–1), p*1(0,–1), p*1(1,–1), p*1(–1,0), p*1(0,0), p*1(1,0), p*1(–1,1), p*1(0,1), and p*1(1,1). Thus, P**, which is a set of neighbors of elements of P*, as shown in Figure 4(8), can be as follows:
P** = {p*1(–1,–1), p*1(0,–1), p*1(1,–1), p*1(–1,0), p*1(0,0), p*1(1,0), p*1(–1,1), …, p*3(0,1), p*3(1,1)}. (4)
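The neighbor construction of Step 4 might be sketched as follows. To avoid floating-point drift when shifting by 0.001, this example stores each box as a pair of integers (the coordinate multiplied by 10^digits); that representation and all function names are assumptions of the sketch.

```python
def box_index(point, digits=3):
    """Integer grid index of a (lat, lon) point: the coordinates scaled by
    10**digits and rounded, e.g. (13.13243, 100.47386) -> (13132, 100474)."""
    scale = 10 ** digits
    return (round(point[0] * scale), round(point[1] * scale))


def neighbors(box, layers=1):
    """All boxes within `layers` steps of `box`, including `box` itself.
    The result has (2 * layers + 1) ** 2 elements."""
    i, j = box
    span = range(-layers, layers + 1)
    return {(i + dx, j + dy) for dx in span for dy in span}


def expanded_path(points, digits=3, layers=1):
    """P**: the union of the neighbors of every rounding box on the path."""
    boxes = set()
    for point in points:
        boxes |= neighbors(box_index(point, digits), layers)
    return boxes


p_star = box_index((13.13243, 100.47386))
print(len(neighbors(p_star, layers=1)), len(neighbors(p_star, layers=2)))  # 9 25
```

Storing P** as a set makes the later membership test a constant-time lookup on average.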
An example of the P** of a route is demonstrated in Figure 5(1), where Figure 5(2) shows rounding points in a zoom-in of the selected rectangle area in Figure 5(1).
Figure 5. Example rounding boxes of a bus route path: (1) a route path with a selected area; (2) rounding boxes of the selected area in (1).
Thus, the begin point, end point, and polyline of each path in Table 3 are calculated via the rounding boxes and presented in Table 5. In this table, rounding boxes’ data are presented by variables. For clarity, the begin point and the end point refer to the path_id with the suffixes “.B” and “.E.” For example, in the first entry, R7234.00.B**, R7234.00.E**, and R7234.00** are the sets of rounding boxes of the begin point, the end point, and the polyline, respectively.
Table 5. Example of bus route polyline data with rounding boxes (a point name ending with two-star symbols).
route   path_id    path_type   direction   begin_point (Rounding Boxes)   end_point (Rounding Boxes)   polyline (Rounding Boxes)
R7234   R7234.00   main        go          R7234.00.B**                   R7234.00.E**                 R7234.00**
R7234   R7234.01   split       go          R7234.01.B**                   R7234.01.E**                 R7234.01**
R7234   R7234.02   split       back        R7234.02.B**                   R7234.02.E**                 R7234.02**
R7234   R7234.03   sub         go          R7234.03.B**                   R7234.03.E**                 R7234.03**
R7234   R7234.04   sub         back        R7234.04.B**                   R7234.04.E**                 R7234.04**
R8190   R8190.00   main        go          R8190.00.B**                   R8190.00.E**                 R8190.00**
R8190   R8190.01   main        back        R8190.01.B**                   R8190.01.E**                 R8190.01**
...     ...        ...         ...         ...                            ...                          ...
3.3.2. Trajectory Route Matching
The trajectory route matching is a method to check whether a GPS point is on a path. Since it is unlikely that a coordinate point will be exactly on a path, the distance from the point to the perpendicular line on the path surface is generally considered, as shown in Figure 6(1,2). For this vector technique, a maximum distance should be defined, and it consumes calculation time that is not appropriate with a large amount of data. Thus, we decided to use the rounding boxes of a path for the trajectory route matching. In this figure, b1 is a coordinate of a bus, where a path is a bus route path. Figure 6(3) shows that b1 is rounded into b*1. This location is on a path P if b*1 is an element of P**. The function to detect a point on a route path (POR) is defined in the following equation, where b* is any point and P** is a set of rounding boxes in any path.
\[
\mathrm{POR}(b^{*}, P^{**}) :=
\begin{cases}
1, & b^{*} \in P^{**} \\
0, & \text{otherwise}
\end{cases}
\tag{5}
\]
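Equation (5) reduces to a set-membership test, which the following sketch illustrates. The integer grid representation of a rounding box, the function names, and the toy path are assumptions of this example.

```python
def box_index(point, digits=3):
    """Integer grid index of a (lat, lon) point at `digits` decimal places."""
    scale = 10 ** digits
    return (round(point[0] * scale), round(point[1] * scale))


def por(bus_point, path_boxes, digits=3):
    """POR(b*, P**): 1 if the rounding box of the bus location is in the
    set of path boxes, otherwise 0."""
    return 1 if box_index(bus_point, digits) in path_boxes else 0


# A toy P**: the one-layer neighborhood of a single rounding box.
center = box_index((13.13243, 100.47386))
path_boxes = {(center[0] + dx, center[1] + dy)
              for dx in (-1, 0, 1) for dy in (-1, 0, 1)}

print(por((13.1324, 100.4741), path_boxes))  # 1: inside the neighborhood
print(por((13.2000, 100.5000), path_boxes))  # 0: far from the path
```

Because the lookup does not depend on the length of the polyline, this check avoids the per-segment distance computation of the perpendicular-distance approach.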
Figure 6. Steps of bus-route matching using GPS rounding boxes. (1) A location of a bus b1 close to a polyline of a bus route. (2) The distance between the bus b1 and the polyline. (3) The representation of the rounding box of b1, which is b*1, on the neighbors of the rounding boxes of the polyline.
In addition, to detect a bus driving on a bus route path, we need to verify that most of the GPS coordinates of a bus belong to the route path. The concept of trajectory route matching is a key component for finding QoS scores in the next sections.
3.4. Bus Trip Calculating
Once the rounding boxes of all paths are constructed, the next step is to detect bus trips and on-path driving. These concepts are described in the following subsections.
3.4.1. Bus Trip Detection
The concept is to detect when an individual bus transits from the begin point to the end point. The size of the rounding boxes area of a point is about 100 × 100 m, as shown in Figure 7(1). The begin point and end point are defined as follows:
- The begin point is detected when a bus starts moving out of the rounding boxes area of the begin point, as shown in Figure 7(2). At timestamp t1, a bus is inside the rounding boxes area, while it moves out of the area at the timestamp t2. In this case, t1 is stamped as the time of a bus at the begin point R8190.00.B.
- The end point is detected when a bus starts moving into the rounding boxes area of the end point, as shown in Figure 7(3). At timestamp t9, a bus is entering the rounding boxes area, and it is inside the area at timestamp t10. In this case, t10 is stamped as the time of a bus at the end point R8190.00.E.
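The two rules above can be sketched as a scan over consecutive GPS fixes. This is an illustrative reading, not the implementation used in this work: the function names, the track format of (timestamp, box) pairs, and the interpretation that t9 is the last fix outside the end area and t10 the first fix inside it are assumptions.

```python
def departures(track, begin_area):
    """Timestamps of the last fix inside `begin_area` before the bus leaves it.

    track: time-ordered list of (timestamp, box) pairs, where `box` is the
    rounding box of the bus location (any hashable value).
    """
    stamps = []
    for (t_prev, box_prev), (_, box_next) in zip(track, track[1:]):
        if box_prev in begin_area and box_next not in begin_area:
            stamps.append(t_prev)
    return stamps


def arrivals(track, end_area):
    """Timestamps of the first fix inside `end_area` after a fix outside it."""
    stamps = []
    for (_, box_prev), (t_next, box_next) in zip(track, track[1:]):
        if box_prev not in end_area and box_next in end_area:
            stamps.append(t_next)
    return stamps


begin_area = {(0, 0)}
end_area = {(9, 9)}
track = [("t1", (0, 0)), ("t2", (1, 1)), ("t3", (2, 2)),
         ("t9", (8, 8)), ("t10", (9, 9))]
print(departures(track, begin_area))  # ['t1']
print(arrivals(track, end_area))      # ['t10']
```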
Figure 7. A method to detect a bus at a begin point and an end point. (1) The rounding boxes of a beginning point and an end point of a bus route path. (2) A timestamp t1 when a bus starts moving out of a beginning rounding boxes area, which is represented by two-star symbols. (3) A timestamp t10 when a bus enters an end rounding boxes area.
If the sequence of the begin and end points of a bus, as shown in Figure 8(1), is [R8190.00.B, R8190.00.E, R8190.00.B, R8190.00.B, R8190.00.E], the trips become [(R8190.00.B, R8190.00.E), (R8190.00.B, ?), (R8190.00.B, R8190.00.E)]. The first pair and the last pair contain the begin point and the end point of path R8190.00, so they are considered full trips. However, the case (R8190.00.B, ?), which does not have an end point, is not considered a full trip.
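The pairing of begin and end events on a single path can be sketched as below. The function name is hypothetical, the missing end point "?" is represented by None, and ignoring an end event that has no open begin event is an assumption of this example.

```python
def pair_trips(events):
    """Pair begin ('.B') and end ('.E') events of one path into trips.

    events: time-ordered labels such as 'R8190.00.B' or 'R8190.00.E'.
    Returns a list of (begin, end) pairs; `end` is None when a begin event
    is followed by another begin event or by nothing (not a full trip).
    """
    trips = []
    open_begin = None
    for label in events:
        if label.endswith(".B"):
            if open_begin is not None:
                trips.append((open_begin, None))  # previous begin never ended
            open_begin = label
        elif label.endswith(".E") and open_begin is not None:
            trips.append((open_begin, label))
            open_begin = None
    if open_begin is not None:
        trips.append((open_begin, None))
    return trips


sequence = ["R8190.00.B", "R8190.00.E", "R8190.00.B", "R8190.00.B", "R8190.00.E"]
trips = pair_trips(sequence)
print(trips)
print([1 if end is not None else 0 for _, end in trips])  # [1, 0, 1]
```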
In a case where a route has main paths, split paths, and subpaths, the main path is
considered the highest priority, while the split path and the subpath are in descending
order of importance. As shown in Figure 8(2), P.0, P.1, and P.2 are a main path, a split
path, and a subpath, and the sequence of a bus is [P.0.B, P.2.B, P.2.E, P.0.E, P.2.B, P.2.E,
P.1.B, P.1.E]. The trip is considered [(P.0.B, (P.2.B, P.2.E), P.0.E), (P.2.B, P.2.E), (P.1.B,
P.1.E)], where the first subpath trip (P.2.B, P.2.E) is inside the main path trip, so it is
ignored due to the main path having higher priority than the subpath. In this case, there
are three trips, (P.0.B, P.0.E), (P.2.B, P.2.E), and (P.1.B, P.1.E).
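One way to express the priority rule is to pair events per path first and then discard any trip nested inside a higher-priority trip. The sketch below is an assumption about how this could be done, not the published algorithm: `detect_trips` and the `priority` mapping (a lower number means a higher priority) are hypothetical, and only complete trips are returned.

```python
from typing import Dict, List, Tuple

# (path_id, index of begin event, index of end event)
IndexedTrip = Tuple[str, int, int]


def detect_trips(events: List[str], priority: Dict[str, int]) -> List[IndexedTrip]:
    """Return complete trips, dropping trips nested in a higher-priority trip."""
    # Step 1: pair begin/end events separately for each path.
    open_begin: Dict[str, int] = {}
    candidates: List[IndexedTrip] = []
    for i, event in enumerate(events):
        path_id, _, kind = event.rpartition(".")
        if kind == "B":
            open_begin[path_id] = i
        elif kind == "E" and path_id in open_begin:
            candidates.append((path_id, open_begin.pop(path_id), i))

    # Step 2: accept candidates from the highest priority downwards and skip
    # any candidate lying strictly inside an accepted higher-priority trip.
    accepted: List[IndexedTrip] = []
    for path_id, begin, end in sorted(candidates, key=lambda c: (priority[c[0]], c[1])):
        nested = any(
            a_begin < begin and end < a_end and priority[a_path] < priority[path_id]
            for a_path, a_begin, a_end in accepted
        )
        if not nested:
            accepted.append((path_id, begin, end))
    return sorted(accepted, key=lambda t: t[1])


# Assumed priorities: main path P.0, then split path P.1, then subpath P.2.
priority = {"P.0": 0, "P.1": 1, "P.2": 2}
events = ["P.0.B", "P.2.B", "P.2.E", "P.0.E", "P.2.B", "P.2.E", "P.1.B", "P.1.E"]
result = detect_trips(events, priority)
```

With the sequence from Figure 8(2), the first subpath trip (event indices 1 to 2) falls inside the main path trip (indices 0 to 3) and is dropped, leaving three trips.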
The trip calculation results are given in Table 6. In the table, the columns are as follows:
- index: an index, which is a running number, of each entry.
- bid: a bus identifier.
- path_id: a path identifier.
- begin_ts: a begin timestamp when a bus starts moving out from a begin point’s
rounding boxes area.
- end_ts: an end timestamp when a bus starts moving into an end point’s rounding
boxes area.
- is_full_trip: to check if a trip is a full trip, where 1 is a full trip, otherwise 0.
- on_path: a measurement of a bus driving on a route path. It uses a Jaccard index,
which will be described in the next subsection.
The first row in the table indicates that the trip was made by bus “4d43e028” on path
R8190.00, which is the main path of route R8190, between 10:10 and 12:12 on 1 October
2022, and was a full trip. In addition, some trips, such as 3, 6, and 11, were considered
failed trips, because they did not pass through the end points of their paths.
3.4.2. On-Path Driving Detection
When a trip is detected, an on-path driving detection is also calculated. The calculation
needs to follow the GPS data of each trip point by point to check the distance on a
route path and the distance outside of the route path. To do this, a true-positive,
false-positive, and false-negative are verified, as demonstrated in Figure 9, and the Jaccard
index is determined.
- True-positive (TP): the distance of a bus driving on a route path.
- False-positive (FP): the distance of a bus driving outside of a route path.
- False-negative (FN): the distance of a route path without a bus driving on it.
Figure 9. Example GPS tracks of a bus on a bus route path where A–D are points of its polyline.
After that, the Jaccard index is calculated as in the following equation. As shown in
Figure 9, TP is 10 (from 5 + 5), FP is 8, and FN is 5, so the Jaccard calculated by 10/(10 + 8 + 5)
is 0.43 or 43%. The maximum is 1 and the minimum is 0. An example result of Jaccard
calculation is shown in the column on_path of Table 5.
$$\mathrm{Jaccard} = \frac{TP}{TP + FP + FN} \qquad (6)$$
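As a quick check of the arithmetic for Figure 9, the Jaccard index can be computed directly from the three distances. The sketch below is illustrative only: `jaccard` follows the equation above, while `jaccard_from_cells` is a hypothetical variant that counts grid cells instead of kilometres, which is an assumption about how rounding boxes might be used here rather than something the text states.

```python
def jaccard(tp: float, fp: float, fn: float) -> float:
    """Jaccard index TP / (TP + FP + FN); returns 0.0 when all inputs are zero."""
    total = tp + fp + fn
    return tp / total if total > 0 else 0.0


def jaccard_from_cells(trip_cells: set, route_cells: set) -> float:
    """Hypothetical cell-count variant: cells stand in for distances."""
    tp = len(trip_cells & route_cells)   # cells driven on the route path
    fp = len(trip_cells - route_cells)   # cells driven outside the route path
    fn = len(route_cells - trip_cells)   # route cells the bus never visited
    return jaccard(tp, fp, fn)


# Figure 9: TP = 5 + 5 km, FP = 8 km, FN = 5 km.
score = jaccard(tp=5 + 5, fp=8, fn=5)
```

Rounded to two decimals, `score` matches the 0.43 quoted in the text.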
This step is also used to support the data validation. The attributes on_path and travel
time, which is the difference between end_ts and begin_ts calculated from Table 6, are
used to define outlier data. A small on_path value, such as a number lower than 0.3, is
taken to mean that the bus was not performing its normal duties, so that trip is eliminated
from the evaluation of QoS. In addition, outliers of the travel time are detected using
the interquartile range (IQR) method [21,22]. Thus, any trip whose travel time differs from
the normal travel time of a given route path is also excluded from the
assessment of QoS.
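The two validation rules can be sketched as a single filter. The code below rests on several assumptions that are not in the text: timestamps are plain numbers in minutes, the IQR fences use the conventional multiplier of 1.5, quartiles come from Python's `statistics.quantiles`, and the function name `filter_trips` is hypothetical. Only the 0.3 threshold for on_path is taken from the example in the text.

```python
import statistics
from typing import Dict, List


def filter_trips(trips: List[Dict], min_on_path: float = 0.3, k: float = 1.5) -> List[Dict]:
    """Drop trips with a low on_path score or an outlying travel time."""
    # Rule 1: remove trips whose on_path (Jaccard) score is below the threshold.
    on_route = [t for t in trips if t["on_path"] >= min_on_path]
    if len(on_route) < 4:
        return on_route  # too few trips to estimate quartiles sensibly

    # Rule 2: remove trips whose travel time lies outside the IQR fences.
    times = [t["end_ts"] - t["begin_ts"] for t in on_route]
    q1, _, q3 = statistics.quantiles(times, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t, d in zip(on_route, times) if low <= d <= high]


# Hypothetical trips on one path; travel times are in minutes.
durations = [100, 105, 110, 115, 120, 122, 125, 400]
sample = [{"begin_ts": 0, "end_ts": d, "on_path": 0.9} for d in durations]
sample.append({"begin_ts": 0, "end_ts": 112, "on_path": 0.2})  # mostly off route
kept = filter_trips(sample)
```

In this made-up sample, the 400-minute trip and the trip with an on_path score of 0.2 are both removed, and the remaining seven trips go on to the scoring step.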
After all paths are calculated, the QoS-1 scores of each route are the weighted average
of all paths of that route. For example, the QoS-1 of the route R8190 on 1 October 2021 is
shown in Table 7.
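A route-level score of this kind can be sketched as a plain weighted average. The text does not say here what the weights are, so the sketch assumes that each path is weighted by its number of trips. The path scores and weights below are hypothetical values chosen for illustration, not figures from the dataset.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of path scores; returns 0.0 when the weights sum to zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical example: two paths of one route, weighted by trip counts.
path_scores = [0.9, 0.5]
path_trip_counts = [10, 2]
route_score = weighted_average(path_scores, path_trip_counts)
```

With these inputs the busier path dominates, so the route score is closer to 0.9 than to 0.5.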
Table 7. Example of three QoS scores of the route R8190 on 1 October 2021.
$$\text{QoS-2 Score} = \frac{\min(\mathit{num\_on\_path\_trips},\ \mathit{all\_trips})}{\mathit{all\_trips}} \qquad (8)$$
In this case, the QoS-2 score of R8190.00 from the example data in Tables 4 and 5 is
min(10, 12)/12, or 0.83. This score of a given day is recorded in Table 7.
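The QoS-2 ratio is simple enough to verify in code. In the sketch below, the function name `qos2_score` is hypothetical, and the cap is written with `min`, which is the form that reproduces the worked value of 10 on-path trips out of 12, or about 0.83.

```python
def qos2_score(num_on_path_trips: int, all_trips: int) -> float:
    """Share of trips driven on path, capped at 1.0."""
    if all_trips <= 0:
        return 0.0  # no trips to evaluate
    return min(num_on_path_trips, all_trips) / all_trips


# Worked example from the text: 10 on-path trips out of 12 trips.
score = qos2_score(10, 12)
```

The cap keeps the score within 0 to 1 even if the on-path count were ever to exceed the trip count.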
Figure 10. Flowchart for calculating the QoS-3 score.
In case of a condition type being “count,” a ratio between min(n, N) and N is
calculated, where n is the number of full trips, and N is the number of possible trips
satisfying the condition. According to condition C0014 in Table 4, five trips are needed
between 11:00 and 12:00, so N is 5. To apply this condition, indices 3–6 of Table 6 are
selected, and the number of trips is 4, so n is 4. Thus, the score of the condition C0014 is
4/5, or 0.8.
In addition, when the condition type is “headway,” a ratio score is calculated the
same as for the previous condition. However, n is the number of trips satisfying the
headway condition. According to condition C0015 in Table 4, the headway between 16:00 and
18:00 is 30 min, so the first trip must be at 16:00 and the next trips take 30 min each, until
18:00. This means that this condition requires five trips, so N is 5. In this case, a developer
can add some error such as ±5 min. Based on the time of this condition, indices 10–13 of
Table 6 are selected.
- At index 10, the begin_time is 16:05, which satisfies the condition including error
times. Thus, n is 1, and ex_begin_time is 16:05.
- At index 11, the begin_time is 16:35. It differs from the ex_begin_time about 30 min,
so n becomes 2, and ex_begin_time becomes 16:35.
- At index 12, the begin_time is 17:20. It differs from the ex_begin_time about 45 min,
so this trip is failed. In this case, n is still 2, and ex_begin_time changes into 17:20.
- At index 13, the begin_time is 17:50, and it differs from the previous one about 30 min.
Thus, n becomes 3.
Since n is 3 and N is 5, the score of this condition is 3/5 or 0.6. At the end, the average
score of all conditions, C0014 and C0015, is 0.7. Thus, the QoS-3 score of 0.7 is as recorded
in Table 7.
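The walk-through above can be reproduced with a short sketch. Everything named here is hypothetical (`to_minutes`, `ratio_score`, `headway_score`), times are handled as "HH:MM" strings for brevity, and the ratio is capped with `min` so that the worked values 4/5 and 3/5 come out. The ±5 min tolerance is the example error margin mentioned in the text.

```python
from typing import List, Optional


def to_minutes(hhmm: str) -> int:
    """Convert 'HH:MM' into minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def ratio_score(n: int, N: int) -> float:
    """Score of one condition: achieved trips over required trips, capped at 1."""
    return min(n, N) / N if N > 0 else 0.0


def headway_score(begin_times: List[str], start: str, end: str,
                  headway: int, tolerance: int = 5) -> float:
    """Score a 'headway' condition by comparing each trip with the previous one."""
    # Required trips: one at the start, then one per headway until the end.
    N = (to_minutes(end) - to_minutes(start)) // headway + 1
    n = 0
    ex_begin: Optional[int] = None  # begin time of the previously examined trip
    for t in map(to_minutes, begin_times):
        if ex_begin is None:
            ok = abs(t - to_minutes(start)) <= tolerance
        else:
            ok = abs((t - ex_begin) - headway) <= tolerance
        if ok:
            n += 1
        ex_begin = t  # updated even when the trip fails, as in the walk-through
    return ratio_score(n, N)


count_score = ratio_score(4, 5)  # condition C0014
hw_score = headway_score(["16:05", "16:35", "17:20", "17:50"], "16:00", "18:00", 30)  # C0015
qos3 = (count_score + hw_score) / 2
```

Running the sketch on the four begin times from indices 10 to 13 counts three satisfying trips out of five required, and averaging with the count condition gives a QoS-3 score of about 0.7.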
4. Results
4.1. Result of Bus QoS scores
The GPS transaction dataset of buses between 1 October 2021 and 31 December 2021
was analyzed. There were 709,182,747 transactions in total, including 454 bus routes and
4418 buses. The route numbers were masked due to privacy constraints, for example,
R7234, R7731, R8196, and R8630. After calculating with our approach from the previous
section, the daily results of QoS-1, QoS-2, and QoS-3 were as given in Table 8. The table
demonstrates examples of 12 entries from the actual 92 entries of route R7234. After that,
the QoS scores of each route were grouped by month and reported in Table 9. In addition,
the report from Table 9 can be visualized into charts as in Figure 11. There are three charts
reporting QoS-1, 2, and 3, and each is grouped by a bus route, where every group displays
a QoS score ordered by month.
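Grouping daily scores by month can be sketched with the standard library alone. The aggregation function is not stated in the text, so the sketch assumes a simple mean of the daily scores. The name `monthly_scores` and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def monthly_scores(daily: List[Tuple[str, float]]) -> Dict[str, float]:
    """Average daily scores per month; dates are 'YYYY-MM-DD' strings."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for date, score in daily:
        groups[date[:7]].append(score)  # 'YYYY-MM' is the month key
    return {month: mean(values) for month, values in sorted(groups.items())}


# Hypothetical daily scores of one route.
daily = [("2021-10-01", 0.8), ("2021-10-02", 1.0), ("2021-11-01", 0.5)]
by_month = monthly_scores(daily)
```

The resulting dictionary is keyed by month in chronological order, which matches the layout of a per-route chart ordered by month.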
Table 8. Daily QoS scores of the route R8155 in the 4th quarter of 2021.
Table 9. Monthly QoS scores of various routes for the 4th quarter of 2021.
Figure 11. Charts of monthly QoS scores. [Panels: QoS-1 Score, QoS-2 Score, and QoS-3 Score, each grouped by routes R7234, R7731, R8196, and R8630 over 2021-10, 2021-11, and 2021-12.]
In addition, histograms have been generated to summarize QoS scores in detail, as
depicted in Figure 12. The x axis is QoS scores from 0 to 100, and the y axis is the number of
city bus routes having a particular score. As in the figure, most bus routes have scores
close to 100, while a small number of routes have lower scores. In order to make the data
more understandable, we graded each route by level: high, medium, low, and lower, as
reported in Table 10. The table contains the rating labels, rating range, and number of city
bus routes with three QoS scores for each rate.
Figure 12. Histograms of QoS scores. Each column is the QoS score; the first row shows histograms
of all scores, and the second row displays histograms of scores below 80.
Table 10. Number of city bus routes having each rating level of QoS scores.
4.2. Discussion
The measurement of QoS of public city bus transportation is an early step in the
improvement of smart mobility since it helps one to understand the current situation. There
are many factors involved in the assessment, such as accessibility, availability, comfort,
customer satisfaction, reliability, safety, security, etc. [2–5]. These metrics are generally
evaluated by the user survey method [2–4], because users are the direct service consumers
and this method can reflect user expectations in a straightforward way. As we are in the
era of data utilization, data analytics supports the analysis of certain factors, in addition
to the survey method [6,8]. Some studies have attempted to use GPS data analytics for
transportation, e.g., for assessing the travel time, travel time variability, waiting time, or
transfer time of buses [7,9]. This is advantageous evidence of the use of data for determining
the QoS of transportation, especially bus services. Since several studies have addressed the
transportation-related issues mentioned above, this study is an extension of the analysis
of GPS data to measure the efficiency of bus services in terms of accessibility, availability,
and reliability. Thus, we aimed to measure the QoS of public city bus transportation in
Bangkok by analyzing the GPS data of buses, route data, and schedule conditions. We used
three QoS scoring functions to determine complete trips, on-path driving, and on-schedule
operations, tracking the conditions of each bus route. The results are reported in Section 4.1;
we found that most of the bus routes received high scores. In this discussion, we organize
our contribution into two parts: our approach, and smart city management.
First, the contribution of the proposed approach is to derive the quality of service
of bus transportation by data analytics. As mentioned in the introduction, it would be
convenient if there were data from wireless sensors at each bus stop to detect the bus arrival
time [15,16]. However, without wireless sensor data, it was necessary to use GPS and spatial
data. For the datasets that we have, we found four challenging issues: there were no arrival
data at any bus stops, one bus route had many paths, a bus could choose any path under
the same route, and there was no exact departure time in timetables. Therefore, the GPS
coordinates rounding box was adopted for path matching [17–20]. It rasterizes a vector of a
polyline into a set of grids, which are indices of a path. Although this technique requires
some memory, it involves little computational processing, and is capable of working with a
large amount of data, such as voluminous GPS transaction coordinates. To match a path, it
finds a trip of a bus with a path type and a direction, so we could detect incomplete trips,
as demonstrated in Figures 7 and 8. Another advantage of using rounding boxes is that it
is simple to detect a bus driving along a route, as shown in Figure 9. Moreover, working
with a condition table and the algorithm in Figure 10, we could correct the frequency
and headway of each bus route path. For all of these steps, the rounding box technique
is a key player that preprocesses the raw data into bus trips and serves all QoS scoring
functions. The results of our work demonstrate the use of data analytics to monitor QoS,
in addition to surveys, as other works have demonstrated. There are more criteria that
data analytics can support, such as driving safety, travel time, bus stop proximity, other
mode connections, etc.; however, this requires much more data, such as bus stop locations
and the coordinates of other modes, which are useful for future research. In addition, the
survey method from [2,3,5,6] is still needed because some qualitative results, such as user
satisfaction, on-board safety, appropriate fare, driver’s ability, and ticket availability are
difficult to measure by data analytics.
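The rounding-box idea discussed above, rasterizing a polyline into grid cells that act as indices of a path, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: cells are obtained by rounding coordinates to three decimal places, segments are sampled roughly every half cell, and the names `to_cell` and `rasterize` are hypothetical.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees
Cell = Tuple[int, int]       # rounded coordinates scaled to integers


def to_cell(point: Point, ndigits: int = 3) -> Cell:
    """Round a coordinate to ndigits decimals and use it as a grid-cell key."""
    scale = 10 ** ndigits
    return (round(point[0] * scale), round(point[1] * scale))


def rasterize(polyline: List[Point], ndigits: int = 3) -> Set[Cell]:
    """Convert a route polyline into the set of grid cells it passes through."""
    step = 0.5 / 10 ** ndigits  # sample roughly every half cell
    cells: Set[Cell] = set()
    for (lat0, lon0), (lat1, lon1) in zip(polyline, polyline[1:]):
        span = max(abs(lat1 - lat0), abs(lon1 - lon0))
        steps = max(1, math.ceil(span / step))
        for i in range(steps + 1):
            t = i / steps
            point = (lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0))
            cells.add(to_cell(point, ndigits))
    return cells


# Hypothetical route segment and two GPS fixes.
route_cells = rasterize([(13.750, 100.500), (13.750, 100.510)])
on_route = to_cell((13.7502, 100.5051)) in route_cells
off_route = to_cell((13.760, 100.505)) in route_cells
```

Once a path is stored as a set of cells, matching a GPS fix is a single set lookup, which is why the technique trades some memory for very little computation on large volumes of GPS transactions.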
Second, our contribution to smart city management was to use data to improve the
QoS. Our work focused on public city bus transportation because buses are commonly
used in any city, such as Bangkok, Thailand. Our data analytics contributes to the research
on transport quality in terms of reliability, accessibility, and availability.
Reliability. The reliability is one aspect contributing to user satisfaction [23]. This
factor can refer to an ability to carry passengers from a starting point to an end point [24].
The reliability assessed in this work is the ability of buses to perform their intended trip
from an origin to a destination along a route path under specified conditions for a given
period without failure. This factor is measured by QoS-1, which is for complete trip tracking.
This metric will ensure that bus providers provide enough buses to offer the number of
complete trips that they have committed to. A low score means that the bus operator
cannot provide enough buses to complete the agreed number of trips, so the operator must
prepare more vehicles; otherwise, it may negatively affect the use of this bus route in the
future. The results in Table 10 show that more than 300 bus routes achieved a high rating,
while about 130 needed significant improvement.
Accessibility. The term “accessibility” generally refers to the ability to transfer people
from an origin to a destination [25]. This measurement approach is primarily from the
perspective of user demand and can be viewed as the coverage of transportation system
against the needs of people and user satisfaction [26]. The evaluation in a user-centric mode
is possible by the user survey method [2–4], and by data analytics from individual trip data
such as inferring the mobility of people from their bus smart card payment transactions to
evaluate the supply of public bus transport. In our work, there are data from the supply
side only. The information contains the routes that operators take as concessions from
the government authority and conditions for running buses on each route path that the
operators have committed to. In this work, we excluded how the route meets the user
demand; nevertheless, we were able to evaluate how buses drive along the promised route
paths. Since QoS-1 measures complete trips, a bus may go off route to achieve the fastest
trip between a begin point and an end point in order to increase the QoS-1 score. This
results in a bus not stopping at every location on the route, and is considered a violation of
the regulations of the city bus transportation. Thus, QoS-2, for bus on-path driving tracking,
was introduced to confirm that a bus driver follows the whole route path. A high score
means that a trip had less off-route time and covered the whole path. As per our analysis,
there were about 300 bus routes rating highly, whereas for about 100 the operator must
enforce stricter guidelines with the drivers in order to increase the QoS-2.
Availability. The availability of public transportation refers to the ability to provide
services covering the travel demands of passengers. Having a bus service that operates
in accordance with the schedule can be viewed as a part of availability [27–29]. In
this work, availability is interpreted in terms of the regularity of bus operation, measured by
QoS-3, which is for bus on-schedule operation tracking. Even if a bus line has completed
the number of trips specified and did not go off route, it cannot be guaranteed that all
buses will operate regularly. According to the frequency and headway of the bus operation
agreed upon by the operator, each bus line must operate as promised. A failed condition
leads to a lower QoS-3 score. A high score allows users the confidence to use the bus
according to their demands. The results in Table 10 indicate that most bus routes were
reliable in terms of on-schedule operation. Compared to the previous QoS scores, not many
bus routes needed improvement in QoS-3. If we take a closer look at the analytical results,
we see that many bus routes operated more trips than promised. This situation is beneficial
for users, and causes a higher QoS-3 score as a by-product. However, this metric can be
enhanced to evaluate the waiting time at each bus stop. In this case, an individual timetable
is required for every bus stop.
Our proposed method for scoring the QoS of bus transportation is evidence in support
of having policies to enhance smart mobility. Policy makers need to consider the data
carefully, because policies that benefit some service consumers may adversely affect other
groups of people [10]. We have primarily presented the analysis of GPS data from the
supply side, without taking demand-side data into consideration. In the future, when there
are data on people’s need for trips in Bangkok, not just acquired through the survey method,
such as transactions from all-in-one smart cards for public transportation [9], location data
from smartphones [25], etc., we may be able to glean more insights from both the demand
side and the supply side to optimize bus route networks [30] and schedules [31]. In this
event, policies about smart card and privacy data must be put into place.
To this end, our work demonstrates the power of having quality GPS data and spatial
data that enable policy makers to bring about positive changes in a city. We can say that our
contribution encourages the sustainability of public city bus transportation and, as such,
can be a part of better living in the future.
5. Conclusions
This work introduces an approach to the measurement of the quality of service (QoS) of
public city bus transportation in Bangkok in terms of reliability, accessibility, and availability,
using global positioning system (GPS) data analytics. There were three QoS scoring
functions: QoS-1 for complete trip tracking, QoS-2 for bus on-path driving tracking, and
QoS-3 for bus on-schedule operation tracking. The analytical process had four phases:
input, preprocessing, scoring, and output. Input data were GPS transactions of buses from
the last quarter of 2021; route data containing polylines of all route paths of city buses
in Bangkok and its metropolitan area; and schedule conditions of each route path. The
challenges involved in this study were no bus arrival timestamp at each bus stop, one
route having many paths, no fixed path of buses on the same route, and no departure time
being given in the schedule. Thus, we had to detect the trips on each route by analyzing
GPS trajectory data and path polylines. In this case, GPS coordinates rounding became
an important technique of the preprocessing phase. In the next phase, scoring, when trips
and their metadata were detected, the three QoS scoring functions were executed and gave
results as scores in the output phase. The analytical results of all routes showed that most
bus routes have high scores; however, some bus routes need to be improved due to low
scores. Thus, the contribution of our work was to demonstrate the feasibility of using data
analytics to measure the QoS of bus transportation, in addition to using a survey method.
This is one of the tasks that can contribute to the sustainability of smart cities.
Because this work focuses on the analytics of bus tracking data from the supply side,
future work needs more data, such as individual payment transactions for public
transportation and individual journey data from smartphones, to extend the QoS methods
to the demand side.
Author Contributions: Conceptualization, R.C., A.S. and T.T.; Methodology, R.C. and T.T.; Formal
analysis, R.C.; Resources, A.S.; Data curation, T.T.; Writing—original draft, R.C.; Writing—review &
editing, T.T.; Visualization, R.C.; Supervision, A.S.; Project administration, A.S. All authors have read
and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Hansson, J.; Pettersson, F.; Svensson, H.; Wretstrand, A. Preferences in regional public transport: A literature review. Eur. Transp.
Res. Rev. 2019, 11, 1–16.
2. Wethyavivorn, P.; Sukwattanakorn, N. Problems and barriers affecting sustainable commuting: Case study of people’s daily
commute to Kasetsart University, Bangkok, Thailand. IOP Conf. Ser. Earth Environ. Sci. 2019, 329, 012011.
3. Ueasangkomsate, P. Service quality of public road passenger transport in Thailand. Kasetsart J. Soc. Sci. 2019, 40, 74–81.
4. Chan, W.; Ibrahim, W.W.; Lo, M.; Suaidi, M.; Ha, S. Sustainability of public transportation: An examination of user behavior to
real-time GPS tracking application. Sustainability 2020, 12, 9541.
5. Page, S.; Yue, G.G. Transportation and tourism: A symbiotic relationship? In The SAGE Handbook of Tourism Studies; Sage
Publications: Thousand Oaks, CA, USA, 2009; pp. 371–395.
6. Goyal, S.; Agarwal, S.; Singh, N.S.S.; Mathur, T.; Mathur, N. Analysis of Hybrid MCDM Methods for the Performance Assessment
and Ranking Public Transport Sector: A Case Study. Sustainability 2022, 14, 15110.
7. Mazloumi, E.; Currie, G.; Rose, G. Using GPS data to gain insight into public transport travel time variability. J. Transp. Eng. 2010,
136, 623–631.
8. Shen, L.; Stopher, P.R. Review of GPS travel survey and GPS data-processing methods. Transp. Rev. 2014, 34, 316–334.
9. Gschwender, A.; Munizaga, M.; Simonetti, C. Using smart card and GPS data for policy and planning: The case of Transantiago.
Res. Transp. Econ. 2016, 59, 242–249.
10. Liu, Q.; Liu, Z.; Kang, T.; Zhu, L.; Zhao, P. Transport inequities through the lens of environmental racism: Rural-urban migrants
under Covid-19. Transp. Policy 2022, 122, 26–38.
11. Chawuthai, R.; Pruekwangkhao, K.; Threepak, T. Spatial-Temporal Traffic Speed Prediction on Thailand Roads. In Proceedings of
the 7th International Conference on Engineering, Applied Sciences and Technology, Pattaya, Thailand, 1–3 April 2021; pp. 58–62.
12. Chawuthai, R.; Chankaew, N.; Threepak, T. A Hybrid Method for Predicting a Potential Next Rest Stop of Commercial Vehicles.
Transp. Res. Procedia 2018, 34, 36–43.
13. Chawuthai, R.; Ainthong, N.; Intarawart, S.; Boonyanaet, N.; Sumalee, A. Travel Time Prediction on Long-Distance Road
Segments in Thailand. Appl. Sci. 2022, 12, 5681.
14. Chawuthai, R. Monitoring roadway lights and pavement defects for nighttime street safety assessment by sensor data analysis
and visualization. Sens. Mater. 2018, 30, 2267–2279.
15. SL, A.H.; Samsudeen, S.N. Real time bus tracking and scheduling system using wireless sensor and mobile technology. J. Inf. Syst.
Inf. Technol. 2016, 1, 18–23.
16. Kamble, P.A.; Vatti, R.A. Bus tracking and monitoring using RFID. In Proceedings of the 2017 Fourth International Conference on
Image Information Processing, Shimla, India, 21–23 December 2017; pp. 1–6.
17. Huang, S.-H.; Lin, C.-S. Rapid Route Comparison Based on GPS Coordinates and Bounding Boxes. J. Traffic Logist. Eng. 2019, 7,
5–9.
18. Elevelt, A.; Bernasco, W.; Lugtig, P.; Ruiter, S.; Toepoel, V.; Ruiter, B.M.S. Where you at? Using GPS locations in an electronic time
use diary study to derive functional locations. Soc. Sci. Comput. Rev. 2021, 39, 509–526.
19. Ciociola, A.; Cocca, M.; Giordano, D.; Vassio, L.; Mellia, M. E-scooter sharing: Leveraging open data for system design. In
Proceedings of the 2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications
(DS-RT), Prague, Czech Republic, 14–16 September 2020; pp. 1–8.
20. Payyanadan, R.P.; Sanchez, F.A.L.; Lee, J.D. Assessing route choice to mitigate older driver risk. IEEE Trans. Intell. Transp. Syst.
2016, 18, 527–536.
21. Yang, J.; Rahardja, S.; Fränti, P. Outlier detection: How to threshold outlier scores? In Proceedings of the International Conference
on Artificial Intelligence, Information Processing and Cloud Computing, Sanya, China, 19–21 December 2019; pp. 1–6.
22. Rilett, L.R.; Tufuor, E.; Murphy, S. Arterial roadway travel time reliability and the COVID-19 pandemic. J. Transp. Eng. Part A Syst.
2021, 147, 04021034.
23. Soza-Parra, J.; Raveau, S.; Muñoz, J.C.; Cats, O. The underlying effect of public transport reliability on users’ satisfaction. Transp.
Res. Part A Policy Pract. 2019, 126, 83–93.
24. Xiaoliang, Z.; Limin, J. Analysis of Bus Line Operation Reliability Based on Copula Function. Sustainability 2021, 13, 8419.
25. Liu, Q.; An, Z.; Liu, Y.; Ying, W.; Zhao, P. Smartphone-based services, perceived accessibility, and transport inequity during the
COVID-19 pandemic: A cross-lagged panel study. Transp. Res. Part D Transp. Environ. 2021, 97, 102941.
26. Curl, A.; Nelson, J.D.; Anable, J. Does accessibility planning address what matters? A review of current practice and practitioner
perspectives. Res. Transp. Bus. Manag. 2011, 2, 3–11.
27. Leng, N.; Corman, F. The role of information availability to passengers in public transport disruptions: An agent-based simulation
approach. Transp. Res. Part A Policy Pract. 2020, 133, 214–236.
28. Vdovychenko, V.; Ivanov, I.; Pidlubnyi, S. Assessment of the impact of traffic conditions on the availability of transport services
of the city bus route. Technol. Audit. Prod. Reserves 2022, 3, 45–50.
29. L’upták, V.; Droździel, P.; Stopka, O.; Stopková, M.; Rybicka, I. Approach methodology for comprehensive assessing the public
passenger transport timetable performances at a regional scale. Sustainability 2019, 11, 3532.
30. Zhang, H.; Cui, H.; Shi, B. A data-driven analysis for operational vehicle performance of public transport network. IEEE Access
2019, 7, 96404–96413.
31. Zhu, H.; Wu, Y.; Wang, Y. Algorithm for Headway of Fixed Route Buses in Bus Stations Based on Bus Big Data. In Proceedings of
the 6th International Conference on Transportation Information and Safety, Wuhan, China, 22–24 October 2021; pp. 28–33.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.