Journal of Transport Geography 86 (2020) 102769


How does ridesourcing substitute for public transit? A geospatial perspective in Chengdu, China

Hui Kong a,b, Xiaohu Zhang b,*, Jinhua Zhao b
a Singapore-MIT Alliance for Research and Technology, 1 Create Way, Singapore
b Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA 02139, United States


Keywords: DiDi; Ridesourcing; Public transit

Abstract: The explosive growth of ridesourcing services has stimulated a debate on whether they represent a net substitute for or a complement to public transit. Existing empirical evidence supports discussion of the net effect at the city level, but analysis at the disaggregated level from a geospatial perspective is lacking. In addition, the spatiotemporal pattern of ridesourcing's effect on public transit and the factors that shape this effect remain unexplored. Using DiDi Chuxing data in Chengdu, China, this paper develops a three-level structure to recognize the potential substitution or complementary effects of ridesourcing on public transit. Furthermore, this paper investigates the effects through exploratory spatiotemporal data analysis, and examines the factors influencing the degree of substitution via linear, spatial autoregressive, and zero-inflated beta regression models. The results show that 33.1% of DiDi trips have the potential to substitute for public transit. The substitution rate is higher during the day (8:00–18:00), and the trend follows changes in public transit coverage. The substitution effect is more pronounced in the city center and the areas covered by the subway, while the complementary effect is more pronounced in suburban areas, where public transit has poor coverage. Further examination of the factors impacting the relationship indicates that housing price is positively associated with the substitution rate, and distance to the nearest subway station has a negative association with it, while the effects of most built environment factors become insignificant in zero-inflated beta regression. Based on these findings, policy implications are drawn regarding the partnership between transit agencies and ridesourcing companies, spatially differentiated policies in the central and suburban areas, and the potential problems in providing ridesourcing service to the economically disadvantaged population.

* Corresponding author at: Senseable City Lab, Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA 02139, United States.
E-mail address: (X. Zhang).
Received 4 June 2019; Received in revised form 13 May 2020; Accepted 5 June 2020
Available online 22 June 2020
0966-6923/ © 2020 Elsevier Ltd. All rights reserved.
H. Kong, et al. Journal of Transport Geography 86 (2020) 102769

1. Introduction

The explosive growth of app-based, on-demand mobility service providers (e.g., Uber, Lyft, Grab, DiDi Chuxing), also known as transportation network companies (TNCs), has put ridesourcing in the spotlight (Hughes and MacKenzie, 2016; Meyer and Shaheen, 2017; Yu and Peng, 2019). Ridesourcing refers to on-demand services that use mobile devices and applications to connect drivers (people who drive private cars instead of commercial vehicles) with passengers. The term “ridesourcing” is commonly used by transportation researchers, while practitioners describe themselves as “TNCs” or “mobility service providers” (MSPs), and the media normally use the terms “ride-hailing” or “ride-sharing” (Shaheen et al., 2016). Examples of such services include UberX, UberXL, Lyft, JustGrab, DiDi Express, and DiDi Premier.

The substitution effect of ridesourcing for taxis has been confirmed by abundant research and media coverage (Anderson, 2014; Bialik et al., 2015; Glöss et al., 2016; Rayle et al., 2016). Besides competing with the taxi industry, ridesourcing also pulls people away from public transit. Existing research suggests that ridesourcing both substitutes for and complements public transit (Jin et al., 2018), but it is not clear how large these effects are, how they evolve over time and space, and what factors influence them.

This study uses DiDi Chuxing data from Chengdu, China, to investigate the substitution and complementary relationship between ridesourcing and public transit. We focus on answering three questions: (1) To what extent does DiDi substitute for public transit? (2) How does the degree of substitution change over time and space? In other words, what are the diurnal and spatial patterns of the relationship? (3) What are the factors that impact the substitution effect of DiDi on transit? The answers to these questions are significant for both transportation

operation and urban planning.

The rest of the paper is organized as follows. The next section defines substitution and complementary effects and reviews current research on ridesourcing and its relationship with public transit. Section 3 explains the data and methodology used in this study, followed by a section that presents and discusses the empirical results. Lastly, we highlight the findings and policy implications and point out the limitations of the current study and directions for future research.

2. Substitution vs. complementarity: research background

Ridesourcing can be a substitute for public transit as it provides an alternative transport option. When taking ridesourcing, a rider does not have to walk to a transit stop and wait for the bus/subway but can instead request a service and be transported by a vehicle directly to the final destination. Also, the in-vehicle environment is often more comfortable than that of public transit. Therefore, although ridesourcing fares are typically higher, there is the potential for it to substitute for public transit. On the other hand, ridesourcing can complement public transit, as it increases the reach and flexibility of public transit's fixed-route, fixed-schedule service by providing on-demand services (Hall et al., 2017). In microeconomics, substitutes and complements are distinguished by the cross elasticity of demand $E_{P_A,Q_B}$, which is formulated as:

$$E_{P_A,Q_B} = \frac{\partial Q_B}{\partial P_A} \cdot \frac{P_A}{Q_B} \tag{1}$$

where $P_A$ is the price of good/service A and $Q_B$ is the demand quantity of good/service B. If $E_{P_A,Q_B} > 0$, meaning a price increase of A leads to a demand increase of B, A and B are substitutes; on the other hand, if $E_{P_A,Q_B} < 0$, they complement each other; and if $E_{P_A,Q_B} = 0$, A and B are independent.

Some research discusses the net substitution or complementary effect of ridesourcing at the city level. However, just as the effect varies across cities, it also varies spatially within a city, and few studies have explored the disaggregated level. This paper thus proposes a methodology for investigating the substitution or complementary effect in the geospatial context. For each origin-destination (OD) pair, the effect of ridesourcing on public transit is either a substitution or a complementary effect. We define ridesourcing as a potential substitute for public transit if riders choose ridesourcing when public transit is accessible (within a comfortable walking distance to the transit station and at comparable travel time cost). On the other hand, ridesourcing complements public transit if it provides services when public transit is not accessible (transit stations cannot be reached within an easy walking distance or the trip duration by public transit is significantly longer). The local effect in an area is determined by aggregating the effects of all ridesourcing trips.

The limited literature that touches on the relationship between ridesourcing and public transit suggests that ridesourcing both substitutes for and complements public transit, yet insufficient quantitative evidence has been provided to substantiate this observation. The arguments for substitution were established based on surveys asking riders questions such as “If ridesourcing is not available, what other transportation modes would you use?” Previous surveys reached different substitution rates of ridesourcing for public transit, owing to their different study areas, sampling strategies, and questionnaire design. Rayle et al. (2016) conducted surveys in San Francisco on 380 riders who had either just completed a ridesourcing trip or who had used ridesourcing within the last two weeks, concluding that 33% of ridesourcing trips replace public transit, and that users save about 10 min on average by choosing ridesourcing over public transit. Henao (2017) conducted a survey of 311 Uber/Lyft passengers in Denver while doing participant observation as a ridesourcing driver himself and found that 22.2% of passengers would have used public transit for their trip if ridesourcing was not an option. Gehrke et al.'s (2018) in-vehicle survey of 1000 riders in Metro Boston showed that 42% of respondents would shift to public transit for their trips if ridesourcing was not available. Clewlow and Mishra (2017) differentiated various types of public transit and discovered that TNCs pull people away from the public bus and light rail but enhance the ridership of heavy rail. The arguments on complementarity come from both surveys and quantitative analysis. The survey by the American Public Transportation Association (Murphy, 2016) concluded that ridesourcing is more likely to substitute for a trip by private car than for a trip by public transit, and it is more popular when public transportation operates less frequently (e.g., 8 PM – 4 AM). Smith (2016) from the Pew Research Center revealed that public transit usage is highly correlated with Uber usage (56% of regular ridesourcing users also take public transit regularly) based on a survey of 4787 American adults.

Very few quantitative studies have examined the relationship between ridesourcing and public transit by studying the changes in their ridership when a special case occurs or by conducting simulations of virtual situations. Hoffmann et al. (2016) suggested that ridesourcing usage increases over 30% when there is a subway service disruption. Hall et al. (2017) applied statistical approaches to compare how transit ridership changes in cities after Uber enters the market, and found that Uber is complementary to transit agencies, as it increased transit ridership by 5% within two years of its entry. Jin et al. (2018) compared Uber pick-ups in NYC with public transit stops and found that Uber substitutes for public transit most of the time, except when the transit coverage is low. Through simulation (Basu et al., 2018) or matching algorithms (Stiglic et al., 2018) of virtual systems, studies concluded that public transit is irreplaceable, and the overall system efficiency improves if ridesourcing serves as a feeder of the transit system.

Existing studies have revealed both substitution and complementary relationships between ridesourcing and public transit. However, we still lack empirical evidence to quantitatively measure the effects in the geospatial context. The very few quantitative measurements of the relationships only examine the overall substitution/complementary effect, without investigating whether each ridesourcing trip has the potential to substitute for or complement public transit. In addition, the spatiotemporal variation of such relationships and how factors impact the relationship remain unknown. This paper fills in these gaps by: (1) analyzing how each ridesourcing trip may substitute for/complement public transit via a three-level structure; (2) revealing how the substitution varies over time and space; (3) examining how the substitution is impacted by different factors through regression models.

3. Data and methodology

3.1. Study area

This study examines how DiDi Chuxing substitutes for public transit in Chengdu, China. Chengdu is the capital of Sichuan province in southwestern China. It consists of 11 districts, five county-level cities, and four counties. The 11 districts constitute what is referred to as ‘Central Urban Areas’, including five in the city center (Jinjiang, Qingyang, Jinniu, Wuhou, Chenghua) and six in the suburban areas (Pidu, Xindu, Qingbaijiang, Longquanyi, Shuangliu, Wenjiang). The remaining five county-level cities (Pengzhou, Dujiangyan, Chongzhou, Qionglai, Jianyang) and four counties (Jintang, Dayi, Pujiang, Xinjin) constitute ‘Suburban New Cities’.

Chengdu is one of the most important commercial, cultural, and transportation centers in Western China. By the end of 2010, over 200 of the Fortune 500 companies had set up branches in Chengdu. In 2016, there were more than 400 bus lines in Chengdu with nearly 12,000 buses in total, four subway lines were in operation, and more than 20 lines are planned to be implemented.

DiDi Chuxing is selected as a representative of ridesourcing services. It is the largest transportation network company in China, and acquired Uber China in 2016. Studies have focused on its impacts


on the taxi industry (Nie, 2017), the impacts of ride-splitting services (Chen et al., 2018), traffic emissions (Sun et al., 2018; Xue et al., 2018), and the impacts of restriction policies on demand (Sun and Ding, 2019). Most of the DiDi trip data that we have access to are concentrated in the 11 districts in the Central Urban Areas, so this study focuses on these districts instead of the whole city region (Fig. 1).

Fig. 1. Chengdu city and the study area.

3.2. Data

We collected six categories of data from various sources, explained as follows:

(1) DiDi trip data.¹ We used the DiDi trip data of November 1, 2016, which includes 181,172 DiDi (DiDi Express + DiDi Premier) trips. The data contain timestamps and coordinates of both the pick-up and drop-off locations. The majority of trips were located in counties near the central urban areas of Chengdu: City Center, Pidu, Xindu, Qingbaijiang, Longquanyi, Shuangliu, and Wenjiang, so we focus on analyzing 181,068 trips that fall inside these counties.
(2) Public transit stops and the operational period. Data of public transit stops and the operational hours of each stop were obtained from Baidu Map API, including 20,923 stops (112 subway stations + 20,811 bus stops).
(3) Travel time of public transit. The travel time of public transit was estimated using the Google Distance Matrix API. For each DiDi trip, we estimate the travel time of the same trip by public transit. Departure time is defined as the pick-up time of the DiDi trip. The API returns the public transit duration as the sum of in-vehicle time, walking time to and from the public transit stop, and, if there are transfers, the transfer time. The in-vehicle time is given by considering both historical traffic conditions and live traffic. The public transit travel time does not include the waiting time for the first trip segment, as travelers are assumed to be able to plan their trips according to the public transit schedule to minimize waiting time. This omission is consistent with the DiDi data, since the DiDi travel time in our data also ignores the waiting time. The relative travel time is then obtained by calculating the transit travel time for all DiDi trips using the Google Maps API.
(4) Housing price. The housing price data in July 2018 of 9266 communities were obtained from Lianjia, a major real estate trading platform in many Chinese cities.
(5) Point of Interest (POI). POI data were obtained from the 58-POI website (pois.html). There are, in total, 124,449 POIs in the study area.
(6) GIS layers. GIS layers were extracted from OpenStreetMap, including administrative boundaries and road networks.

¹ Data source: Didi Chuxing GAIA Initiative.

3.3. Methods

3.3.1. Quantifying the degree of substitution

Identifying the substitution/complementary effect of individual trips. Based on our definition in the geospatial context, whether a ridesourcing trip has the potential to substitute for or complement public transit is examined by how accessible the trip is on public transit: a ridesourcing trip has the potential of substituting for public transit if the rider uses ridesourcing when public transit is accessible; otherwise, it has the potential of complementing public transit, as it provides an alternative in areas with poorer public transit services. We consider three criteria to measure public transit accessibility: transit coverage, travel time difference, and service quality difference, upon which we propose a three-level structure (Fig. 2) to examine the substitution/complementary effect of ridesourcing on public transit. This section proposes the measurement of all three levels. However, due to data limitations, this study only conducts analysis to the depth of level 1 and level 2.

Level 1: transit coverage. At this level, a ridesourcing trip has the potential of substituting for public transit if both pick-up and drop-off locations are within the transit coverage. This criterion is


formulated as

$$S_1 = \{\, t \in S \mid o_t \in b_n(p) \ \&\ d_t \in b_n(p) \,\} \tag{2}$$

where $S_1$ is the set of all ridesourcing trips identified as potential substitution, $t$ is an individual ridesourcing trip, $S$ is the collection of all ridesourcing trips, $o_t$ and $d_t$ are the pick-up and drop-off locations, and $b_n(p)$ is the $n$-meter buffer of transit stops $p$.

Fig. 2. Three-level structure to differentiate the substitution/complementary effect between ridesourcing and public transit.

Transit coverage can be measured using buffer analysis, and areas inside the buffer are considered “covered” by the transit system (Hawas et al., 2016). This study measures the transit coverage in terms of both ‘spatial coverage’ (where the transit services are available) and ‘temporal coverage’ (when the services are available).

For ‘spatial coverage’, a 400 m buffer is used to measure transit coverage in this paper, as it has been recognized as a comfortable walking distance in previous studies (Demetsky and Bin-Mau Lin, 1982; Murray et al., 1998; Wu and Murray, 2005; Hawas et al., 2016). Alternatively, some studies use 400 m for subways and buses and 800 m for suburban railways, as people might be willing to walk a longer distance to take the suburban railway (Smith and Taylor, 1994; Jin et al., 2019). In this paper, as the public transit in Chengdu only contains bus and subway, and all the subway lines are in the central areas, we use 400 m as a threshold distance for all the transit stations.

Regarding ‘temporal coverage’, since public transit service changes throughout the day (different public transit stops have different operating hours), we construct the transit coverage for every 10 min, given that public transit riders' typical wait time is 8 to 10 min (Watkins et al., 2011). For example, for a DiDi trip that picks up passengers at 9:05 AM and drops off passengers at 9:25 AM, the trip is classified into $S_1$ only if its pick-up location is covered by public transit stations operating during the 9:00–9:10 AM period, and its drop-off location is covered by public transit stations operating during the 9:20–9:30 AM period.

Level 2: travel time difference between public transit and ridesourcing. The dichotomy based on transit coverage may overestimate the substitution: the ridesourcing trips that have either the pick-up or the drop-off outside the transit coverage are definitively categorized as complementary in level 1; however, the trips within transit coverage should also be deemed complementary to public transit if the travel time saved by ridesourcing is significant. Therefore, this level of analysis includes travel time difference as an additional criterion to identify potential substitution. For ridesourcing trips defined as potential substitution in level 1 ($t \in S_1$), we estimate the travel time of the same trip by public transit. The ridesourcing trip is considered a potential substitution for public transit only if the travel time difference between public transit and ridesourcing is shorter than a pre-defined threshold, since it indicates that the rider chooses ridesourcing when public transit is not significantly time-consuming, or, in other words, accessible. Formula (3) depicts this criterion as:

$$S_2 = \{\, t \in S_1 \mid T_t^{PT} - T_t^{RS} \le \tau_T \,\} \tag{3}$$

where $S_2$ is the set of all ridesourcing trips identified as potential substitution at level 2, $T_t^{RS}$ is the travel time of the ridesourcing trip, $T_t^{PT}$ is the travel time of the same trip should public transit be used, and $\tau_T$ is the threshold of travel time difference.

Level 3: service quality. Level 3 uses service quality to further differentiate the ridesourcing service: if ridesourcing greatly improves service quality, the trip can be considered complementary. Service quality is determined by many factors, e.g., the crowdedness of the vehicle, the in-vehicle environment, privacy, safety, and fare. For example, if the bus/subway is too crowded, using ridesourcing greatly improves the comfort of users, thus the trip can be considered complementary. Fare is another important factor to consider. If the fare difference between ridesourcing and public transit is too large, the ridesourcing trip could be considered complementary. Formula (4) depicts this criterion as:

$$S_3 = \{\, t \in S_2 \mid Q_t^{RS} - Q_t^{PT} \le \tau_Q \,\} \tag{4}$$

where $S_3$ is the set of all ridesourcing trips identified as potential substitution at level 3, $Q_t^{RS}$ is the service quality of the ridesourcing trip, $Q_t^{PT}$ is the service quality of the same trip should public transit be used, and $\tau_Q$ is the threshold of service quality difference. The Level 3 analysis is not included in this paper, as the data used in this study do not support the measurement of service quality.

Quantifying the substitution effect at the aggregate level. The degree of substitution at the aggregate level could be calculated by:

$$R_i = \frac{|S_i^2|}{|S_i|} \tag{5}$$

where $R_i$ is the substitution rate in area $i$, $S_i^2$ is the set of ridesourcing trips identified as substitution in area $i$, and $S_i$ is the complete set of ridesourcing trips in area $i$.

3.3.2. Modeling impacts on the substitution effect

We divide the study area into regular grid cells to facilitate the measurement of factors influencing the substitution and complementary effects of ridesourcing and public transit. Since 400 m is adopted as a threshold distance in the analysis of transit coverage, we use 400 × 400 m grid cells to be consistent with the previous analysis.

Three regression models are used in this paper. The Ordinary Least Squares (OLS) model is the base model, the spatial autoregression model is used to accommodate spatial variation, and the zero-or-one inflated beta regression model (ZOIB) is used to examine impacting factors, considering the effect of the 0 and 1 values of the dependent variable. The OLS model is constructed as follows:

$$R_i = e_i + b_i + \delta_i + \varepsilon \tag{6}$$

where $R_i$ is the substitution rate of all ridesourcing trips originating in cell $i$; $e_i$ are socio-economic factors measured by average housing price at cell $i$, which is used as a proxy of average wealth level; $b_i$ are built environment factors; $\delta_i$ represents control variables, including the count of ridesourcing trips in cell $i$, average travel time of all ridesourcing trips in the cell, and count of bus stops in the cell; and $\varepsilon$ is the unobserved error term.

The built environment has been considered to influence travel behavior (Cervero and Kockelman, 1997; Ewing and Cervero, 2001, 2010). In this study, the built environment consists of four measures:


POI density, land use diversity, road density, and distance to subway station. POI density is measured as the number of POIs per cell; road density is measured by the total length of roads per km² in cell $i$; and distance to subway station is measured by the distance of the cell centroid to the nearest subway station. The diversity of land use is measured by the mixture level of different POI categories. We reclassify the POIs into eight categories (see Table A1 in Appendix A) and calculate the Shannon entropy index to represent the level of diversity (Shannon, 1948):

$$H = -\sum_{j=1}^{n} p_j \log_n p_j \tag{7}$$

where $H$ is the entropy (ranging from 0 to 1), $p_j$ is the percentage of the $j$th category of POI, and $n$ is the number of categories. An entropy of 1 indicates extreme diversity of land use, whereas a value of 0 means the least diversity (only one category of POIs in the cell).

We further adopt spatial autoregressive models to accommodate spatial variation in the relationship being modeled. If any spatial dependency is present, the ‘uncorrelated error terms’ and ‘independent observations’ assumptions in the OLS model are violated, raising the necessity of using either spatial lag or spatial error models. If there are omitted spatially correlated covariates influencing the relationship over space, a spatial error model should be used to capture the correlation of the error terms across different spatial units:

$$R_i = e_i + b_i + \delta_i + \lambda W \varepsilon_i + \mu_i \tag{8}$$

where $\lambda$ represents the spatial error coefficient and $\mu_i$ is the unobserved error term. $W$ is the spatial weights matrix, defined by Queen contiguity neighborhoods.

If a spatial lag exists, suggesting that the dependent variable in a cell is also affected by the values of the dependent variable in neighboring cells, a spatial lag model should be used to incorporate the additional effect of neighboring attribute values:

$$R_i = e_i + b_i + \delta_i + \rho W R_i + \mu_i \tag{9}$$

where $\rho$ represents the spatial lag coefficient and $\mu_i$ is the unobserved error term. $W$ is the spatial weights matrix, defined by Queen contiguity neighborhoods.

Considering that the dependent variable is measured in percentage and has a high concentration of zeros and ones, it is likely that the estimated impacts of the factors are driven by the 0 and 1 values. To exclude the effects of 0 and 1 values, we also applied the zero-or-one inflated beta regression model (ZOIB) to examine the impacting factors without considering the extreme samples (Ferrari and Cribari-Neto, 2004; Ospina and Ferrari, 2012).

4. Results and discussion

4.1. A graphical representation of DiDi and public transit services

Fig. 3 shows the distribution of DiDi pick-ups (Fig. 3a) and public transit stops (Fig. 3b). Both modes are highly concentrated in the city center.

To explore the diurnal variation of the two modes, we plot the transit coverage ratio and count of DiDi trips based on their pick-up time (Fig. 4). Transit coverage is measured by the 400 m buffer of operational transit stops, and the transit coverage ratio is calculated as the sum of transit coverage areas over the total study area. Due to data limitations, we did not eliminate areas where transit will not occur, such as parks, forests, and water bodies. In general, the coverage ratio is below 50% in Chengdu, and it declines significantly in the evening (after 6:00 PM), while the DiDi pick-ups, although showing a downward trend in the evening, do not decline as dramatically as the transit coverage ratio. Also, most public transit lines do not operate from midnight to the early morning hours (around 12:00 AM to 6:00 AM), while DiDi remains active. These observations imply poor public transit service during late-night and early morning hours.

4.2. Relationship between DiDi and public transit

As specified in Section 3.3, we differentiate the ridesourcing trips that potentially substitute for or complement public transit at levels 1 and 2 (based on transit coverage and travel time differences). For the analysis at level 1, we identified the trips with pick-up and drop-off locations inside the transit coverage area. As discussed in Section 3.3.1, a 400-m buffer is used to measure the ‘spatial coverage’ as it has been recognized as a comfortable walking distance in previous studies. We also conduct a sensitivity analysis that calculates the percentage of trips recognized as potential substitution (overall substitution rate) when different buffer radii are used (Fig. A1). When the threshold is smaller than 400 m, the overall substitution rate increases with the buffer radius, but the substitution rate remains relatively flat when the threshold is greater than 400 m. For ‘temporal coverage’, transit coverage is created for each 10-min time slot, and the DiDi trips that may substitute for or complement public transit are classified based on whether both pick-up and drop-off locations are within the transit coverage. Fig. 5(a) plots the overall substitution rate and the public transit coverage of the whole study area. Not surprisingly, at midnight and during the early morning hours (12:00 AM to 6:00 AM), most DiDi trips have both pick-up and drop-off locations outside the transit coverage. From 7:00 to 20:00, the substitution effect of DiDi for public transit is evident: around 90% of trips have both pick-ups and drop-offs inside the transit coverage, while after 20:00, the complementary effect increases and becomes dominant again (Fig. 5a). Regarding the relationship in different spatial regions, the central area exhibits the highest transit coverage ratio (82.13%), and 79.75% of trips are identified as potential substitutions; while in the Qingbaijiang county, whose transit coverage ratio (27.51%) is the lowest among all counties, more than half of the trips are considered complementary (Fig. 5b).

Further, we move on to level-2 analysis and differentiate potential substitution and complementary trips based on the travel time difference. For all DiDi trips whose pick-ups and drop-offs are both inside the transit coverage, we estimate the travel time if they were taken by public transit (Appendix B). If the travel time by public transit is significantly longer, we consider the DiDi trip as complementary instead of substitution.

Based on transit coverage and travel time difference, we identify the DiDi trips that may substitute for public transit by formula (3). The substitution rate (percentage of DiDi trips that have the potential to substitute for public transit) is calculated using formula (5) contingent on the threshold ($\tau_T$) of the travel time difference (Fig. 6). If we set $\tau_T$ as 10 min, which means a DiDi trip is considered as potential substitution if the travel time difference is less than 10 min, then only around 5% of the total DiDi trips have the potential to substitute for public transit. As the threshold of travel time difference increases, the substitution effect increases. When $\tau_T$ reaches approximately 43 min, the substitution rate exceeds 50%, indicating that the substitution effect surpasses the complementary effect. We also conduct a sensitivity analysis considering the change of buffer radius threshold and travel time difference threshold (Fig. A2). Based on the fact that ridesourcing saves about 20 min of travel time compared with public transit (Schwieterman, 2019), and that the Google Distance Matrix API tends to overestimate the travel time on public transit, we choose $\tau_T = 30$ min to further examine the spatial and temporal patterns of the relationship. Under this condition, we identify that 33.1% of DiDi trips have the potential to substitute for public transit.

4.3. Spatiotemporal variation of the relationship

To depict the spatiotemporality of the relationship between DiDi


Fig. 3. Spatial distribution of (a) DiDi pick-ups; (b) public transit coverage.

and public transit, we plot the substitution rate over time (Fig. 7) and transit stops that end operation from 18:00 to 20:00 are mainly located
over space (Fig. 8) with τT = 30 min. in the periphery, while the stops at the city center continue operating.
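
The classification and aggregation behind these plots can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' code: it assumes each trip already carries two precomputed flags (whether the pick-up and the drop-off fall inside the 400 m buffer of stops operating in the relevant 10-min slot) plus the two travel times in minutes. The `Trip` record, its field names, and the sample values are hypothetical; only the level-1 and level-2 rules of formulas (2) and (3), the rate of formula (5), and the threshold τT = 30 min come from the text. Pick-up hour is used here as the grouping unit in place of area i.

```python
from collections import defaultdict
from dataclasses import dataclass

TAU_T_MIN = 30.0  # travel-time difference threshold tau_T chosen in the text


@dataclass(frozen=True)
class Trip:
    """Hypothetical trip record; the field names are illustrative assumptions."""
    pickup_hour: int         # hour of day of the pick-up, 0-23
    pickup_covered: bool     # pick-up inside the buffer of operating stops
    dropoff_covered: bool    # drop-off inside the buffer of operating stops
    transit_minutes: float   # estimated travel time by public transit
    ride_minutes: float      # observed ridesourcing travel time


def is_level1_substitute(trip: Trip) -> bool:
    """Formula (2): both trip ends lie within transit coverage."""
    return trip.pickup_covered and trip.dropoff_covered


def is_level2_substitute(trip: Trip, tau: float = TAU_T_MIN) -> bool:
    """Formula (3): level-1 substitute and transit is at most tau minutes slower."""
    time_difference = trip.transit_minutes - trip.ride_minutes
    return is_level1_substitute(trip) and time_difference <= tau


def substitution_rate_by_hour(trips, tau: float = TAU_T_MIN) -> dict:
    """Formula (5), grouped by pick-up hour: level-2 substitutes over all trips."""
    totals = defaultdict(int)
    substitutes = defaultdict(int)
    for trip in trips:
        totals[trip.pickup_hour] += 1
        if is_level2_substitute(trip, tau):
            substitutes[trip.pickup_hour] += 1
    return {hour: substitutes[hour] / totals[hour] for hour in sorted(totals)}


# Made-up sample, for illustration only.
sample_trips = [
    Trip(9, True, True, 40.0, 20.0),     # covered, 20 min slower: substitute
    Trip(9, True, True, 75.0, 25.0),     # covered, 50 min slower: complement
    Trip(9, True, False, 30.0, 15.0),    # drop-off not covered: complement
    Trip(9, True, True, 35.0, 30.0),     # covered, 5 min slower: substitute
    Trip(23, False, False, 60.0, 20.0),  # no coverage at night: complement
]
rates = substitution_rate_by_hour(sample_trips)
print(rates)  # {9: 0.5, 23: 0.0}
```

Raising the threshold admits more trips into the substitute set, which mirrors the sensitivity of the substitution rate to τT described in Section 4.2; for instance, `substitution_rate_by_hour(sample_trips, tau=60.0)` counts the second sample trip as a substitute as well.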
Identifying the substitution trips only by transit coverage results in a As most of the DiDi trips occur in the city center, they still highly
very high substitution rate during the day, but the substitution becomes overlap with the transit coverage. After 20:00, the operation of most
less after considering the travel time difference (Fig. 7). Before 6:00, the transit stops in the city center ends; thus, the complementary effect of
complementary effect is dominant, but the substitution effect increases DiDi becomes significant. The lag effect is less significant in the sub-
afterward as most buses and subways start operating. From 8:00 to stitution rate identified from both transit coverage and travel time
18:00, the substitution effect fluctuates around 40% and decreases difference, as the end of operation of transit stops in the periphery still
gradually after 18:00. The percentage of DiDi trips that has the po- affects the overall travel time difference.
tential of substituting for public transit reaches two small peaks from To explore the spatial pattern, we plot the pick-ups of all DiDi trips
8:00 to 9:00 and around 18:00, which is in accordance with the smaller (Fig. 8). Most of the substitution trips aggregate in the city center, while
travel time difference as shown in Fig. B1. the peripheral areas are mainly occupied by complementary trips. This
Compared to the public transit coverage, the overall trend of the high substitution rate in the city center could be attributed to the higher
substitution rate follows changes in transit coverage. However, a tem- public transit coverage and relatively shorter trips, as people are more
poral lag is found from 18:00 to 20:00 in the substitution rate identified likely to use ridesourcing for short-distance rides than for long-distance
from transit coverage, as the public transit coverage drops from around ones. In addition, we discover a higher substitution rate in the north-
45% to 25%, while the substitution rate remains high. This lag effect west-southeast direction and from the city center to the southern per-
could be explained by the spatial distribution of transit stops that end iphery. These are two major development belts of Chengdu with hea-
operation during this time period (see Fig. A3 in Appendix A). The vier traffic and are overlapped with the two oldest subway lines. All the

Fig. 4. Public transit coverage ratio and count of DiDi trips, November 1, 2016.

H. Kong, et al. Journal of Transport Geography 86 (2020) 102769

Fig. 5. The relationship identified from transit coverage: (a) the change of overall substation rate in the study area over time; (b) the percentage of potential
substitution and complementary trips over space.

Fig. 6. Substitution rate with the change of travel time difference threshold (τT).

H. Kong, et al. Journal of Transport Geography 86 (2020) 102769

Fig. 7. The change of overall substitution rate in the study area over time.

Fig. 8. Spatial distribution of DiDi trips that have the potential to (a) substitute or (b) complement public transit (τT = 30 min).

evidence shows that the relationship between DiDi and public transit is
correlated with both socio-economic factors and the built environment,
and the substitution effect is more significant as it gets closer to the city
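
The hourly substitution rate discussed in this section (Fig. 7) could be tabulated roughly as in the sketch below. It is an illustration under simplifying assumptions, not the implementation used in this paper: a trip is treated as a potential substitution when it is covered by transit and the transit travel time exceeds the DiDi travel time by no more than τT, and the service quality level of the three-level structure is omitted. The names `Trip`, `covered_by_transit`, `is_potential_substitution`, and `hourly_substitution_rate` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

TAU_T_MIN = 30.0  # travel time difference threshold (minutes) chosen in the text


@dataclass
class Trip:
    pickup_hour: int          # hour of day, 0-23
    covered_by_transit: bool  # assumption: trip ends fall within 400 m of operating stops
    didi_minutes: float       # observed ridesourcing travel time
    transit_minutes: float    # estimated public transit travel time


def is_potential_substitution(trip, tau_t=TAU_T_MIN):
    """Assumed rule: covered by transit AND transit is at most tau_t minutes slower."""
    if not trip.covered_by_transit:
        return False
    return (trip.transit_minutes - trip.didi_minutes) <= tau_t


def hourly_substitution_rate(trips, tau_t=TAU_T_MIN):
    """Return {hour: share of trips in that hour flagged as potential substitution}."""
    totals = defaultdict(int)
    substitutions = defaultdict(int)
    for trip in trips:
        totals[trip.pickup_hour] += 1
        if is_potential_substitution(trip, tau_t):
            substitutions[trip.pickup_hour] += 1
    return {hour: substitutions[hour] / totals[hour] for hour in sorted(totals)}


if __name__ == "__main__":
    sample = [
        Trip(9, True, 20.0, 40.0),    # covered, 20 min slower by transit
        Trip(9, True, 20.0, 60.0),    # covered, 40 min slower by transit
        Trip(9, False, 15.0, 30.0),   # not covered
        Trip(22, False, 10.0, 90.0),  # not covered, late evening
    ]
    print(hourly_substitution_rate(sample))
```

Raising or lowering `tau_t` in this sketch corresponds to the sensitivity check on the travel time difference threshold shown in Fig. 6.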

4.4. Impacting factors of the relationship

We employ regression analysis to explore the impact of different factors on the substitution and complementary effects. To be consistent with the 400 m buffer we used to measure transit coverage, we divide the study area into 400 m × 400 m grid cells and aggregate the DiDi trips to cells based on the pick-up locations. Cells that do not have DiDi trips or are not covered by public transit are removed. The substitution rate (Fig. 9) and factors are calculated for all the cells, and housing prices for cells without housing price data are interpolated using the ordinary Kriging method. According to Shaheen et al. (2017), ridesourcing has different potential applications in different urban built environments. Therefore, we assume that the variations in the percentage of substitution trips per cell could be explained by socioeconomic factors, the built environment, and other spatial factors, and, based on our data availability, select several variables for analysis. Descriptive statistics of all variables are presented in Table 1.
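
The grid aggregation described above can be sketched as follows. This is a simplified illustration rather than the processing pipeline used in this paper; it assumes pick-up coordinates already projected to meters, a precomputed substitution flag per trip, and a hypothetical set `covered_cells` listing the cells covered by public transit. The function names `cell_index` and `substitution_rate_by_cell` are likewise hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 400.0  # cell edge length, matching the 400 m transit buffer


def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map projected coordinates (meters) to an integer (column, row) grid cell."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def substitution_rate_by_cell(pickups, covered_cells, cell_size=CELL_SIZE_M):
    """Aggregate trips to cells by pick-up location.

    pickups: iterable of (x_m, y_m, is_substitution) tuples.
    covered_cells: set of cell indices covered by public transit (assumed input).
    Cells without trips never appear; cells not covered by transit are dropped.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [trip count, substitution count]
    for x_m, y_m, is_substitution in pickups:
        cell = cell_index(x_m, y_m, cell_size)
        if cell not in covered_cells:
            continue
        counts[cell][0] += 1
        counts[cell][1] += int(is_substitution)
    return {cell: subs / trips for cell, (trips, subs) in counts.items()}


if __name__ == "__main__":
    pickups = [
        (10.0, 10.0, True),
        (390.0, 399.0, False),
        (401.0, 10.0, True),
        (-5.0, 10.0, True),  # falls in a cell that is not covered by transit
    ]
    print(substitution_rate_by_cell(pickups, covered_cells={(0, 0), (1, 0)}))
```

The resulting per-cell rate plays the role of the dependent variable in the regressions; the explanatory variables in Table 1 would be joined to the same cell indices.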
The results are shown in Table 2. Firstly, based on the OLS model, all factors show significant associations with the substitution rate. However, in the diagnostics for spatial dependence, the high value of the Moran's I index suggests a positive global spatial autocorrelation, indicating that a spatial autoregressive model should be applied to take into account the spatial heterogeneity.
The (Robust) Lagrange Multiplier (LM) tests (Table 2) show that the spatial lag model fits better in our case. The goodness-of-fit (i.e., pseudo R2) is 0.6780, indicating that the spatial lag model fits better than the OLS model (R2 = 0.3815). In addition, the spatial pseudo R2 of the spatial lag model is 0.4357, which, although a relatively less optimistic assessment of the model fit (Anselin, 2014), still indicates a slightly better fit than the OLS model (adjusted R2 = 0.3803).

Fig. 9. Substitution rate at 400 × 400 m grid cells for regression.

Table 1
Descriptive statistics.

Variable                                      Mean   Std. dev.  Minimum  Maximum
Dependent variable
  Substitution rate                           0.12   0.19       0        1
Independent variables
  Housing price (×10,000)                     1.28   0.35       0.55     2.58
  POI density (×100 POIs per km2)             1.45   1.82       0        27.76
  Land-use diversity (Shannon entropy)        0.55   0.28       0        0.95
  Road density (in km/km2)                    8.85   5.67       0        43.12
  Count of bus stops                          3.08   4.21       0        46.00
  Distance to nearest subway station (in km)  3.80   3.85       0.04     26.40
Control variables
  Count of DiDi trips (×10)                   4.34   10.66      0.10     196.9
  Average travel time (in hours)              0.57   0.25       0.09     3.43
Note: Std. dev. = Standard Deviation.

Table 2
Modeling results.

Variable                                      OLS         Spatial lag  ZOIB
Constant                                      −0.098⁎⁎⁎   −0.059⁎⁎⁎    −1.270⁎⁎⁎
SES factors
  Housing price (×10,000)                     0.128⁎⁎⁎    0.045⁎⁎⁎     4.154E-05⁎⁎⁎
Built environment
  POI density (×100 POIs per km2)             0.009⁎⁎⁎    0.003⁎⁎      1.526E-04
  Land use diversity (Shannon entropy)        0.045⁎⁎⁎    0.015⁎⁎      −0.121
  Road density (in km/km2)                    0.002⁎⁎⁎    0.001⁎⁎      0.003
  Count of bus stops                          0.006⁎⁎⁎    0.002⁎⁎⁎     0.011⁎⁎
  Distance to nearest subway station (in km)  −0.006⁎⁎⁎   −0.002⁎⁎⁎    −0.268⁎⁎⁎
Control variables
  Trip counts (×10)                           0.005⁎⁎⁎    0.002        0.001⁎⁎⁎
  Average travel time (in hours)              −0.045⁎⁎⁎   0.021⁎⁎      −0.061
Spatial lag coefficient                       NA          0.696⁎⁎⁎     NA
Summary of statistics
  Number of observations                      4123        4123         4123
  (pseudo) R2                                 0.3815      0.6780       0.1028
  Adjusted (pseudo) R2                        0.3803      NA           NA
  Spatial pseudo R2                           NA          0.4357       NA
Diagnostics for spatial dependence
  Moran's I                                   0.4656⁎⁎⁎   NA           NA
  Lagrange multiplier (lag)                   2543.03⁎⁎⁎  NA           NA
  Robust Lagrange multiplier (lag)            524.86⁎⁎⁎   NA           NA
  Lagrange multiplier (error)                 2033.81⁎⁎⁎  NA           NA
  Robust Lagrange multiplier (error)          15.64⁎⁎⁎    NA           NA
Note: NA = not applicable.
⁎⁎⁎ p-value < .001.
⁎⁎ p-value < .01.

Regarding the results of spatial lag regression, the coefficients of housing price and built environment factors are all statistically significant. Housing price is positively correlated with the substitution rate. An increase of 10,000 Yuan in the housing price is associated with an increase of about 4.5 percentage points in the substitution rate, holding all other variables constant. Housing price is an indicator of the average wealth level locally, so our results indicate that wealthier people are more likely to use DiDi. Considering the built environment, POI density, land use diversity, and road density are positively correlated with the dependent variable and are significant at the 99% confidence level. The coefficient of POI density is 0.003, indicating an increase of 0.3 percentage points in the substitution rate when POI density increases by 100 POIs per km2. This reflects the profit-seeking tendency of ridesourcing services: they tend to provide service and substitute for public transit in areas with denser activities. The positive coefficient of diversity means that the mix of land use is correlated with a higher substitution rate. Road density has a positive correlation with the substitution rate, which could be explained by the higher public transit coverage in areas with denser roads. The count of bus stops is positively correlated with the substitution rate at the 99.9% confidence level. The coefficient of 'distance to the nearest subway station' is significantly negative, indicating that the substitution effect decreases (and the complementary effect increases) with distance from subway stations. This indicates that DiDi services overlap with the public transit coverage, as DiDi drivers are more likely to choose areas with higher density (e.g., downtown, around subway stations) in pursuit of more potential passengers. The spatial lag coefficient is positive and significant, indicating the agglomeration effect of the relationship, which is in accordance with our observations in Section 4.3 that substitution concentrates in the city center. Sensitivity analysis has been conducted for the spatial lag modeling for different travel time thresholds (τT = 20 min, τT = 40 min), and the findings do not change significantly, indicating the robustness of the results (Table A2).
In the ZOIB model, after mitigating the impacts of the zeros and ones, the effects of most built environment measures become insignificant, indicating that the impacts of the built environment on substitution are mainly inflated by the effects of the 0 (perfectly complementary) and 1 (perfectly substitutive) values. However, housing price, count of bus stops, and distance to the nearest subway station are still significantly correlated with the substitution rate. The magnitude of the impact of the distance to a subway station becomes greater in the ZOIB model. This is in accordance with what we found in Fig. 9: there are more substitutions between DiDi and transit in areas covered by subway lines, and this effect becomes more significant after we ignore the zeros and ones in the ZOIB model. Again, sensitivity analysis has been conducted for the ZOIB modeling for different travel time thresholds (τT = 20 min, τT = 40 min), and the results are also robust (Table A3).

5. Conclusions and future research

To examine the substitution and complementary effects of ridesourcing on public transit at the disaggregated level from a geospatial perspective, this paper develops a three-level structure to categorize every ridesourcing trip as potentially substitutive or complementary based on transit coverage, travel time difference, and service quality. This methodology is applied to DiDi Chuxing in Chengdu, which provides empirical data to the debate over whether ridesourcing and public transit are substitutes or complements.
Our results corroborate that DiDi both substitutes for and complements public transit. In total, 33.1% of DiDi trips are identified as potential substitutions for public transit when the travel time difference threshold is set to 30 min. Based on this result, we gain a good understanding of the relationship at fine spatial and temporal scales. During the day, around 40% of DiDi trips have the potential to substitute for public transit, and this substitution rate decreases in the evening when the supply of transit decreases. In the spatial dimension, the substitution effect is more significant in the city center and in the more developed areas covered by subway lines, while the peripheral areas are dominated by complementary trips. This indicates that the substitution effect of DiDi is highly concentrated and is correlated with socio-economic and built environment factors. To examine how different factors impact the relationship between DiDi and public transit, we further apply the spatial lag model and ZOIB model to study how the substitution rate is influenced by housing price, built environment, and spatial lag factors. The results indicate that housing price, distance to the nearest subway station, and spatial lag effects have significant effects on the relationship, while the influence of the built environment is less significant.
Our results provide implications for policymaking. First, transit agencies are encouraged to collaborate with TNCs to improve the first/last mile connection. However, ridesourcing not only complements but also substitutes for public transit, and public agencies should take precautions prior to partnering with TNCs to serve areas with poor transportation connectivity while not causing disruptive changes in public transit. It is also essential for public agencies to improve the integration of infrastructure, information, and fare, with the goal to
encourage the integration of public transit and TNCs (Shaheen et al., 2016).
In addition, both the spatiotemporal analytics of the relationship and the examination of factors indicate spatial heterogeneity of the substitution effect. Therefore, transportation planners may consider adopting different strategies in different areas. In urban peripheries, where public transit has poor coverage, the complementary effect of DiDi indicates that there are travel demands the current transit system does not satisfy. It should be noted that the complementary effect does not always suggest planning more transit routes because travel demand may not be high enough to support transit operation. Instead, leveraging ridesourcing services to connect these areas with the transit system would improve the overall efficiency of the system. However, re-examining areas with high complementary effects is necessary because some travel demands may point to potential transit routes. On the other hand, in the city center, there is high transit coverage, and the substitution effect of DiDi on transit is significant. In this case, DiDi is pulling people away from transit-dependence to more car-dependence. A challenging question for policymakers, then, is whether we should rely on TNCs to improve urban mobility or continue to invest in and

Henao, 2017), to figure out the actual substitution of ridesourcing on the other modes.

Acknowledgment

The research is supported by the National Research Foundation (NRF), the CREATE Programme from the Singapore Prime Minister's Office, and the Singapore-MIT Alliance for Research and Technology (SMART) Centre, Future Urban Mobility (FM) Interdisciplinary Research Group. The DiDi trip data is shared by DiDi Chuxing GAIA Initiative ( The authors would like to thank Baichuan Mo for helping with the public transit travel time estimation.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://
