Exploring Causal Effects of Neighbourhood Type On Walking Behaviour Using Stratification On The Propensity Score
Abstract
The causality issue has become one of the key questions in the debate over the relationship
between the built environment and travel behavior. To ascertain whether changes to the built
environment are a cost-effective way to change travel behavior, it is necessary to determine the
magnitude of the effect. Further, it is important to understand if the observed influence of the
built environment on travel behavior diminishes substantially once we control for self-selection.
Using 1,553 residents living in four traditional and four suburban neighborhoods in Northern
California, this study explores the causal effect of neighborhood type on walking behavior and
the relationship between this effect and the observed influence of neighborhood type on walking
behavior. Specifically, this study applied propensity score stratification, which has been widely
used to reduce selection bias. The results showed that, on average, the causal influences of
neighborhood type are likely to be overstated by 64% for utilitarian walking frequency and 16%
for recreational walking frequency, if residential self-selection is not controlled for. However,
neighborhood type still plays a more important role in affecting walking behavior than self-
selection. This study also offers a basic tutorial for the propensity score stratification approach
and discusses its strengths and weaknesses for applications in the field of land use and travel
behavior.
Key words: causality, land use, smart growth, transportation, treatment effect
1. INTRODUCTION
Suburban development has been widely criticized for its contribution to auto dependence and its
consequences: air pollution, global climate change, and oil dependence. Numerous studies have
investigated the relationships between the built environment and travel behavior since the 1990s
(Crane, 2000; Ewing and Cervero, 2001; Frank and Engelke, 2001; Handy, 1996). These studies
found that many attributes of traditional neighborhoods (such as high density, high accessibility,
and mixed land use) have a positive association with walking and/or a negative relationship with
driving. These results have encouraged the use of land use and transportation policies to
reduce auto dependence and its negative impacts. Most recently, decision-makers at the state
and local levels have been considering land use policies as a way to reduce vehicle-miles
traveled (VMT) and thus greenhouse gas emissions. The recent report Growing Cooler (Ewing
et al., 2008) concluded that “it is realistic to assume a 30 percent cut in VMT [for people in areas
of] compact development” (p. 9). However, association does not necessarily mean causality. It
is possible that residential self-selection is at work - individuals who prefer walking may
The goal of research regarding self-selection is to establish whether there is a causal relationship
between the built environment and travel behavior, and ultimately to determine the magnitude of
this relationship. Such evidence provides a basis for the adoption of policies that aim to change
travel behavior by changing the built environment. The existence of self-selection doesn't mean
that the built environment is irrelevant. For the sake of increasing active travel, self-selection is
studies contended those neighborhoods are undersupplied, Cao (Cao, 2008) offered a critique of
the studies and his empirical results did not support the argument of unmet demand. Further, the
individuals may desire walkability-related attributes (such as stores within walking distance),
they may also prefer contradictory qualities (such as large lots and free parking) (Walker and Li,
2007). Therefore, to the extent self-selection exists but is not controlled for, we are likely to
misestimate the influence of built environment when we use land use policies to try to reduce
travel, fuel consumption, and emissions. For example, if those who have an automobile-oriented
lifestyle end up living in dense and diverse neighborhoods despite their preferences (e.g. because
of undersupply of the neighborhoods), their travel behavior will probably not match that of those
Recent studies have investigated the causal relationships between the built environment and
travel behavior. Among the 38 studies reviewed by Cao et al. (Cao et al., 2009), many found
evidence of residential self-selection, and virtually every study found a statistically significant
influence of the built environment on travel behavior, controlling for self-selection (Boarnet et
al., 2005; Boarnet and Sarmiento, 1998; Frank et al., 2007; Khattak and Rodriguez, 2005;
Krizek, 2003; Vance and Hedel, 2007). It is arguable that the magnitude of an effect is at least as
by sample size (Ziliak and McCloskey, 2004). Therefore, to ascertain whether changes to the
built environment are a cost-effective way to change travel behavior, given the opportunity costs
of spending resources another way, it is necessary to determine the magnitude of the effect, not
just whether one occurs or not. Further, it is known that the observed influence of the built
environment on travel behavior (without a correction for self-selection) constitutes the influence
of the built environment itself (the causal influence of the built environment) and the influence of
self-selection. This intrigues planners and makes them interested in knowing if the observed
However, few studies have shed light on the proportion of the causal influence of the built
environment in the observed influence of the built environment on travel behavior. Using travel
diary data from the Regional Travel – Household Interview Survey, Salon (Salon, 2006)
estimated a three-tiered nested logit model of residential choice, auto ownership, and walking
level. She concluded the effect of the built environment itself accounted for 1/2 to 2/3 of the
effect of a change in population density on walking level in most areas of New York City. Using
the 1998-1999 Austin Travel Survey, Zhou and Kockelman (Zhou and Kockelman, 2008)
employed a sample selection model to investigate the causal influence of residential location on
VMT. They found that the causal effect of the built environment accounted for 58-90% of the
“total” (not observed but derived from models) influence of residential location on VMT,
In addition, a few studies indicated which effect on travel behavior is stronger, that of the built
environment itself or that of self-selection, but they did not show how much stronger (although the
causal effects may be calculated using parameter estimates). The results of these studies were
mixed, however. For example, using a 2003-2004 survey in the San Diego and San Francisco
metropolitan areas, Chatman (Chatman, 2009) employed a negative binomial model to explore
the impact of the built environment on trip frequencies by different modes, controlling for
preferences for mode choice and socio-demographics. He concluded the built environment
impacts travel behavior, and residential self-selection bias is modest. Schwanen and Mokhtarian
(Schwanen and Mokhtarian, 2005) compared the mode choice of consonant residents (those
whose residential choices match their travel/residential preferences) and dissonant residents.
They found that urban-oriented suburbanites (dissonant suburbanites) commuted by car at rates
almost as high as other suburbanites. Therefore, for suburbanites, the built environment has a
relatively stronger influence on mode choice than attitudes. However, in the urban
neighborhood, they found that the built environment has an influence on mode choice similar in
magnitude to travel preferences. Kitamura et al. (Kitamura et al., 1997) evaluated the relative
contributions of built environment variables and attitudes by gradually including different groups
of variables in their model specifications. They found that attitudes explain travel behavior
better than neighborhood characteristics. Using the 1995 National Personal Transportation
Survey, Boer et al. (Boer et al., 2007) found that a number of built environment variables were
significantly associated with the choice of walking or not. However, after propensity score
matching (with demographics being independent variables in the propensity score model), many
previously significant built environment variables became insignificant, although a few remained
significant. They concluded that self-selection played an important role in walking choice.
Given limited research and mixed results, the extent to which the built environment itself
contributes to the observed influence of the built environment on travel behavior is inconclusive.
Using the 2003 data collected from Northern California, this study applied a propensity score
stratification approach to identify the respective effects of neighborhood type and self-selection
on walking behavior. Handy et al. (Handy et al., 2006) used the same dataset and examined
whether the association between the built environment and walking behavior reflects causality or mere correlation.
They adopted negative binomial models and did not quantify the effect of the built environment
and the effect of self-selection on walking behavior. This study moves beyond Handy et al. and
aims to answer the following questions: (1) How large is the causal influence of neighborhood
type on walking behavior? (2) To what extent do neighborhood type itself and self-selection
contribute to the observed influence of neighborhood type on walking behavior? Further, this
paper is one of few applications of the propensity score approach in the field of land use and
transportation. It offers a tutorial for the method and discusses its strengths and weaknesses.
The organization of this paper is as follows. Section 2 reviews the conceptual connection
between residential self-selection and misestimation. Section 3 describes the propensity score
approach. The next section presents the data and variables. Section 5 discusses modeling
results. The last section presents the limitations and summarizes major findings.
is not taken into account, we are likely to overestimate the causal influence of the built
environment (Mokhtarian and Cao, 2008; Pinjari et al., 2007). However, is it possible that we
First assume that the relationship between walking behavior and neighborhood type is
confounded by only walking preference. Let ATE (average treatment effect) denote the causal
influence of neighborhood type on walking behavior, across the entire population. Similar to a
natural experiment, suppose we could (though impossible) randomly assign half of a sample to a
walkable neighborhood and the other half to a non-walkable neighborhood. Then the ATE is the
neighborhoods, because the influence of walking preference on walking behavior will be
cancelled out due to random assignment (Figure 1). Now, assume that all people can self-select
the neighborhoods that match their walking preference. For people selectively living in the
the effect of neighborhood type, and vice versa. Therefore, the observed difference in walking
behavior between the two neighborhoods (ATE1) will be larger than the ATE. By contrast, if all
people are mismatched (that is, people who prefer walking live in the non-walkable
neighborhood, and vice versa), the observed difference in walking behavior between the
neighborhoods (ATE2) will be smaller than the ATE. The ATE1 and ATE2 are the upper and
lower bounds of the observed influence of neighborhood type on walking behavior, respectively.
[Figure 1]
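The bounding argument can be illustrated with a small numerical sketch. The additive outcome model and all numbers below (a causal effect of 2 walks per month and a preference effect of 1.5) are hypothetical values chosen for illustration; they are not estimates from this study.

```python
# Hypothetical additive model (an assumption for illustration only):
#   walks = BASE + EFFECT * walkable + PREF * prefers_walking
BASE, EFFECT, PREF = 3.0, 2.0, 1.5


def mean_walks(walkable, share_pref):
    """Mean walking frequency where share_pref of residents prefer walking."""
    return BASE + EFFECT * walkable + PREF * share_pref


def observed_difference(share_pref_walkable, share_pref_nonwalkable):
    """Observed walkable-minus-non-walkable difference in mean walking."""
    return (mean_walks(1, share_pref_walkable)
            - mean_walks(0, share_pref_nonwalkable))


ate = observed_difference(0.5, 0.5)      # random assignment: preferences cancel
ate1 = observed_difference(1.0, 0.0)     # everyone matched: upper bound
ate2 = observed_difference(0.0, 1.0)     # everyone mismatched: lower bound
ate3 = observed_difference(0.75, 0.25)   # a quarter mismatched in each area

print(ate, ate1, ate2, ate3)
```

Under these assumed values the observed difference with a quarter of residents mismatched falls between the ATE and the fully matched case, which mirrors the ordering argued in the text.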
In reality, not all people can find the neighborhoods that match their walking preference and not
all people are mismatched. For such a sample, the observed influence of neighborhood type on
walking behavior (ATE3) will lie somewhere between the ATE1 and the ATE2. Therefore,
walking behavior. Some studies found that up to about a quarter of residents were mismatched
(Cao, 2008; Schwanen and Mokhtarian, 2004). This evidence seems to suggest that the ATE3 is
larger than the ATE but smaller than the ATE1. However, the misestimation remains uncertain
because (1) the influence of travel preferences on travel behavior does not appear to be
symmetric for residents living in different types of neighborhoods (Schwanen and Mokhtarian,
2003; Schwanen and Mokhtarian, 2005); (2) individuals’ travel behavior is influenced by many
3. METHODOLOGY
Stratification is an effective way to control for selection bias (Rosenbaum and Rubin, 1984). In
observational studies, observations in the treatment group often differ systematically from those
in the control group. In this context, treatment consists of residents living in traditional
neighborhoods and control includes suburbanites. To reduce the bias, we can classify residents
into several strata based on their characteristics, and then compare travel behavior between
residents living in traditional and suburban neighborhoods that were grouped into the same
stratum (Rosenbaum and Rubin, 1984). If the assignment of a treatment is confounded by only a
single variable, treatment and control units in a given stratum tend to carry similar values (within
a prespecified range) of the variable (i.e., to be balanced). Within the stratum, this sub-
treatment group and a control group. That is, stratification roughly resembles a true random
experiment. It was concluded that a sub-classification up to five strata can reduce more than
90% of the bias resulting from one continuous variable (Cochran, 1968). Through two pair-wise
found that five strata (quintiles) reduced selection bias by 73% and 90%, respectively.
scalar function of those variables (Rosenbaum and Rubin, 1984). Self-selection bias can result
from a large number of variables. However, stratification is convenient for only a small number
of variables because the number of strata grows at an exponential rate as the number of variables
increases. If, for example, a variable is sub-classified into five strata, a stratification of k
variables will produce 5^k groups. Excess stratification may lead to some empty strata or few
observations in some strata, and hence make a direct comparison of outcomes impossible. The
limitation calls for a scalar that carries the information required to balance all variables.
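A short sketch makes the growth in the number of strata concrete. The helper name stratification_cells is hypothetical; the sample size of 1,553 is the figure reported in the abstract, and the even spread of respondents across cells is a simplifying assumption.

```python
def stratification_cells(k, strata_per_variable=5):
    """Number of cells when each of k variables is split into equal strata."""
    return strata_per_variable ** k


SAMPLE_SIZE = 1553  # residents in the study sample, as reported in the abstract

for k in range(1, 6):
    cells = stratification_cells(k)
    # Average cell size if respondents were spread evenly (an idealization).
    print(f"{k} variables -> {cells} cells, "
          f"about {SAMPLE_SIZE / cells:.1f} respondents per cell")
```

With five variables the number of cells (3,125) already exceeds the sample size, so some cells must be empty.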
The propensity score is a scalar function that can be used to balance multiple variables. Using
large and small sample theory, Rosenbaum and Rubin (Rosenbaum and Rubin, 1983) have
proved that “adjustment for the scalar propensity score is sufficient to remove bias due to all
observed covariates” (p.41). According to their definition, the propensity score in this context is
treatment) given her observed characteristics. It can be estimated using binary choice models
(Rosenbaum and Rubin, 1983). It is worth noting that the propensity score model can also be
applied to the case with one control and multiple treatments, although such applications are few. Due to
limitations of land use data in this study, we chose a binary classification: one control and one
treatment.
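As a minimal sketch of estimating a propensity score with a binary logit model, the code below fits the model by gradient ascent on a single hypothetical covariate. The toy data, the function names (fit_logit, propensity), and the fitting routine are assumptions made for illustration; the study itself estimated its model in SPSS with many more covariates (Section 5).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(rows, treated, steps=5000, rate=0.1):
    """Fit a logit model by gradient ascent on the mean log-likelihood.

    rows: list of covariate lists; treated: list of 0/1 indicators.
    Returns the coefficients with the intercept first.
    """
    n, p = len(rows), len(rows[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for x, t in zip(rows, treated):
            fitted = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            resid = t - fitted
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta


def propensity(beta, x):
    """Predicted probability of living in a traditional neighborhood."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


# Hypothetical toy data: one standardized pro-walk attitude score per person.
attitude = [[-1.5], [-1.0], [-0.5], [-0.2], [0.0],
            [0.3], [0.6], [1.0], [1.4], [2.0]]
traditional = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

beta = fit_logit(attitude, traditional)
scores = [propensity(beta, x) for x in attitude]
print([round(s, 2) for s in scores])
```

In practice a statistical package would be used for estimation; the hand-written optimizer here only keeps the example self-contained.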
Here, the goal of propensity score stratification is to estimate the causal effect of neighborhood
type on travel behavior. Our interest is the ATE, which, ideally, represents the average increase
traditional one (Mokhtarian and Cao, 2008). Before estimating the ATE, propensity score sub-
classification and balance assessment are in order (as discussed in Section 5). Then, for each
stratum, we calculate the difference in travel behavior between residents in traditional and
suburban neighborhoods. The ATE is a weighted average of the differences of all strata.
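The stratified estimate described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, strata are weighted by their share of the sample, strata lacking either group are skipped, and the toy data are constructed so that the true effect is exactly 2 walks.

```python
def quintile_strata(scores):
    """Assign each unit to one of five strata (0-4) by propensity score rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = min(4, rank * 5 // len(scores))
    return strata


def stratified_ate(outcome, treated, strata):
    """Weighted average of within-stratum treated-minus-control differences.

    Weights are stratum sizes; a stratum lacking either group is skipped,
    which is a simplification made for this sketch.
    """
    total, weight_sum = 0.0, 0
    for s in sorted(set(strata)):
        t = [y for y, d, g in zip(outcome, treated, strata) if g == s and d == 1]
        c = [y for y, d, g in zip(outcome, treated, strata) if g == s and d == 0]
        if not t or not c:
            continue
        n_s = len(t) + len(c)
        total += n_s * (sum(t) / len(t) - sum(c) / len(c))
        weight_sum += n_s
    return total / weight_sum


# Hypothetical toy sample: the outcome rises with the propensity score
# (confounding) and a traditional neighborhood adds exactly 2 walks.
treated_counts = [1, 1, 2, 3, 3]  # treated units among 4 per score band
scores, treated, outcome = [], [], []
for band, n_treated in enumerate(treated_counts):
    for j in range(4):
        d = 1 if j < n_treated else 0
        scores.append(0.1 + 0.2 * band + 0.01 * j)
        treated.append(d)
        outcome.append(band + 2.0 * d)

strata = quintile_strata(scores)
n_t = sum(treated)
naive = (sum(y for y, d in zip(outcome, treated) if d) / n_t
         - sum(y for y, d in zip(outcome, treated) if not d) / (len(treated) - n_t))
ate = stratified_ate(outcome, treated, strata)
print(round(naive, 2), round(ate, 2))
```

In this constructed example the unadjusted difference in means is 3.2, whereas the stratified estimate recovers the built-in effect of 2.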
The propensity score method is desirable for its ability to estimate the causal influence of the
built environment on travel behavior. It is different from other approaches for causal effects.
First, it is distinct from the statistical control method, which explicitly measures attitudes and
incorporates them in the behavior equation (Mokhtarian and Cao, 2008). Conceptually, the
propensity score approach controls for the observed characteristics that affect whether an
imbalance in variables between traditional and suburban neighborhoods. The latter identifies the
determinants of travel behavior through incorporating confounding factors directly into the
behavior equation, so that we can eliminate all differences between traditional and suburban
neighborhoods that affect the behavior. The attention is directed to the behavior outcome (Oakes
and Johnson, 2006; Winship and Morgan, 1999). Empirically, the model used to estimate a
statistical significance of independent variables; interaction and polynomial terms are always
encouraged for propensity score estimation (Oakes and Johnson, 2006). However,
multicollinearity and statistical significance are important for a model aiming to explain travel
The sample selection model is essentially a generalized propensity score approach, although the
application of the former is earlier than that of the latter (Winship and Morgan, 1999). The
sample selection approach first estimates individuals’ prior selection into different types of
residential locations, and then models travel behavior as conditional on that prior selection
(Mokhtarian and Cao, 2008). The difference between the two approaches is that the sample
selection model requires a strong normality assumption and inserts a lambda (selection
correction factor) in the behavior equation whereas the model using propensity score as a
regressor inserts the estimated propensity score (predicted probability) in the behavior equation
on neighborhood type, size of the metropolitan area, and region of the state. Neighborhood type
was differentiated as “traditional” for areas built mostly in the pre-World War II era, and
“suburban” for areas built more recently. This distinction reflects a significant change in design
characteristics for residential neighborhoods as the suburban boom took place following World
War II. Using the US Census, we screened potential neighborhoods to ensure that average
income and other characteristics were near the average for the region. The traditional
(Junior College area), and Modesto (Central). The suburban neighborhoods were Sunnyvale (I-
280 area), Sacramento (Natomas area), Santa Rosa (Rincon Valley area), and Modesto (suburban
area). The four traditional neighborhoods differ in visible ways from the four suburban
neighborhoods – the layout of the street network, the age and style of the houses, and the
location and design of commercial centers (Figure 2). A selection of the objective accessibility
measures reveals distinct differences between traditional and suburban neighborhoods (Table 1).
Residents of traditional neighborhoods on average have two to four times more businesses within
400m and 1600m from home. In addition, the average distance to the nearest establishment of
any type for residents of traditional neighborhoods (247m) is less than half the distance for
suburban residents (557m), and residents of traditional neighborhoods are closer to every type of
establishment on average than suburban residents. Further, there are some differences in
accessibility among the four traditional (suburban) neighborhoods. However, in general, the
differences between traditional and suburban neighborhoods are much larger than the differences
The original database consisted of 6,746 valid addresses (out of 8,000 addresses). A total of 1,682 surveys
were returned, for a response rate of about 25%. This response rate is considered quite good
for a survey of 14 pages, since the response rate for a survey administered to the general
characteristics to population characteristics, based on the 2000 U.S. Census (Table 2), shows that
survey respondents tend to be older than residents of their neighborhood as a whole, and that the
percent of households with children is lower for the sample for most neighborhoods. In addition,
median household income for survey respondents was higher than the census median for all but
one neighborhood, a typical result for voluntary self-administered surveys. However, since the
focus of our study is on explaining travel behavior as a function of other variables rather than on
describing the simple univariate distribution of the behavior per se, these differences are not
expected to materially affect the results (Babbie, 2007). This study also applies the propensity
[Table 2]
The dependent variables are walking to store frequency and strolling frequency. In the survey,
respondents were asked to report the frequency in the last 30 days they walked from their
residence to a local store or shopping area, and the frequency in the last 30 days they took a walk
The independent variables are classified into three groups: residential preferences, travel
attributes regarding their residence and neighborhood when/if they were looking for a new place
factor analysis reduced these items to six factors: accessibility, physical activity options, safety,
socializing, attractiveness, and outdoor spaciousness (Table 3). To measure attitudes regarding
travel, the survey asked respondents whether they agreed or disagreed with a series of 32
analysis reduced these 32 items to six underlying dimensions: pro-bike/walk, pro-transit, pro-
travel, travel minimizing, car dependent, and safety of car (Table 3). Refer to Handy et al.
(Handy et al., 2006; Handy et al., 2004) for detailed discussion on both factor analyses. Finally,
the survey contained a list of socio-demographic variables including gender, age, employment
status, educational background, household income, household size, the number of children in the
[Table 3]
5. RESULTS
Binary logistic regression in SPSS 15.0 was used to estimate the propensity score. The
inclusion or exclusion of a variable in the propensity score model is based on its relevance to
residential choice rather than its statistical significance (Rubin and Thomas, 1996). In fact,
scholars strongly oppose using statistical significance as a criterion (Luellen et al., 2005) (p.536).
residential preferences, and travel attitudes. Hence, they are potential independent variables for
the model.
The procedure for developing the propensity score model is as follows. All socio-demographics,
residential preferences, and travel attitudes were allowed to enter the model. Based on quintiles
of the propensity score (as recommended by Rosenbaum and Rubin (Rosenbaum & Rubin,
1984)), respondents were classified into five strata. The next step is to examine if residents in
traditional neighborhoods do not differ (in terms of their characteristics) from suburbanites in the
same stratum. A two-way (2 treatments x 5 strata) ANOVA was adopted (Rosenbaum & Rubin,
1984). When the interaction effect and main effect of the treatment are insignificant at the 0.05
level, the variables are considered to be balanced. Otherwise, we need to adjust the propensity
score using a different model specification. Specifically, the unbalanced variable, its higher-order
forms (such as polynomial terms), and its interactions with other variables can enter the model
until the balance of all variables is achieved (Oakes & Johnson, 2006; Rosenbaum & Rubin,
1984). Note that some variables in these data contain missing values. Including all independent
variables in the model inevitably reduces effective sample size. So variables that are not
significantly different between traditional and suburban neighborhoods before (and after) the
Table 4 presents the final propensity score model. Pseudo R-square of the model is 0.178, which
is considered typical for a model with balanced market shares and using disaggregate data with a
large sample size. Renters, the number of children under 18 years old, and number of adults are
significant in the model, with expected signs. Those with high incomes are more likely to live in
traditional neighborhoods, which is consistent with the observed bivariate relationship shown in
Table 5. Residential preferences for socializing and attractiveness and the pro-bike/walk attitude
are positively associated with the choice of traditional neighborhoods, but those preferring
neighborhood safety and valuing the safety of car tend to live in suburban neighborhoods. When
evaluating whether variables were balanced with the original model, I found that residential
preference for outdoor spaciousness showed systematic differences. Several different model
specifications were tried and the inclusion of the preference for outdoor spaciousness, its
quadratic term, and its interaction with household size in the model balances the variable.
[Tables 4 and 5]
Table 5 compares variables between traditional and suburban neighborhoods before and after
propensity score stratification. Before the adjustment, the majority of socio-demographics differ
significantly, even at the 0.01 and 0.001 levels. Two residential preferences (safety and
spaciousness) and four travel attitudes (pro-bike/walk, pro-transit, safety of car, and car
dependent) are also different. After the adjustment, none of them are significant at the 0.05
level, as indicated by the small F-statistics of the interaction effect and main effect of the
treatment. Therefore, the propensity score stratification successfully balances the quintiles
simultaneously on these variables. The quintiles are the final five strata for stratification. Note that,
although the strata were classified based only on the propensity score, the influences of socio-
demographics and attitudes have been incorporated into the propensity score (Table 4).
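A balance check within strata can be sketched in a few lines. Note that the code below uses standardized mean differences, a simpler stand-in for the two-way ANOVA applied in this study, and that the covariate values, stratum labels, and function names are all hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def std_diff(treated_vals, control_vals):
    """Standardized mean difference: (mean_t - mean_c) / pooled std. dev."""
    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    pooled = math.sqrt((var(treated_vals) + var(control_vals)) / 2)
    return (mean(treated_vals) - mean(control_vals)) / pooled


def balance_by_stratum(covariate, treated, strata):
    """Return {stratum: standardized difference} for one covariate."""
    result = {}
    for s in sorted(set(strata)):
        t = [x for x, d, g in zip(covariate, treated, strata) if g == s and d == 1]
        c = [x for x, d, g in zip(covariate, treated, strata) if g == s and d == 0]
        result[s] = std_diff(t, c)
    return result


# Hypothetical covariate (e.g. a preference score) for ten respondents,
# already grouped into two propensity score strata for brevity.
covariate = [1.0, 2.0, 3.0, 1.5, 2.5, 4.5, 5.5, 4.0, 5.0, 6.0]
treated = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
strata = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

raw = std_diff([x for x, d in zip(covariate, treated) if d == 1],
               [x for x, d in zip(covariate, treated) if d == 0])
within = balance_by_stratum(covariate, treated, strata)
print(round(raw, 3), within)
```

The toy data are built so that the two groups differ before stratification but have equal means within each stratum, which is the pattern a successful adjustment is expected to show.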
To estimate the ATE with a propensity score adjustment, we first calculate the treatment effect for
each of the five quintiles and then take a weighted average of these treatment effects (Rosenbaum
and Rubin, 1984). Table 6 presents the ATEs of neighborhood type on walking behavior.
Overall, the results suggest that residents living in traditional neighborhoods tend to walk more
than suburbanites. In particular, the ATE of neighborhood type on walking to store (utilitarian
walking) frequency is 1.86 times per month, which accounts for 61% (=1.86/3.05) of the
observed difference between residents in traditional and suburban neighborhoods. The causal
influence of neighborhood type on strolling (recreational walking) frequency is 2.05 trips per
month, which accounts for 86% (=2.05/2.38) of the observed difference. The difference in the
percentages shows that residential self-selection tends to have a stronger influence on utilitarian
walking than recreational walking. This makes sense since access to destinations is one of more
conducted propensity score matching using a caliper approach and a kernel approach in Limdep
9.0. As shown in the last two columns of Table 6, the differences among different approaches
are less than 10%. Therefore, propensity score stratification is considered to be reliable.
[Table 6]
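The shares quoted above follow directly from the reported ATEs and observed differences. The short sketch below reproduces the arithmetic, together with the implied overstatement (observed/ATE - 1) reported in the abstract; the function names are hypothetical, and the inputs are the figures given in the text.

```python
def causal_share(ate, observed):
    """Share of the observed difference attributable to neighborhood type."""
    return ate / observed


def overstatement(ate, observed):
    """Relative overstatement of the causal effect if selection is ignored."""
    return observed / ate - 1


# (ATE, observed difference) in walks per month, as reported in the text.
effects = {
    "walking to store": (1.86, 3.05),
    "strolling": (2.05, 2.38),
}

for name, (ate, observed) in effects.items():
    share = round(100 * causal_share(ate, observed))
    over = round(100 * overstatement(ate, observed))
    print(f"{name}: share {share}%, overstatement {over}%")
```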
As discussed earlier, there are some differences in accessibility measures among the four
traditional (and suburban) neighborhoods, although not substantial. A sensitivity analysis was
conducted to examine how pooling the four traditional (and suburban) neighborhoods influences
travel behavior outcomes. In particular, I re-ran the models eight times with each time leaving
out one neighborhood. Overall, for walking to store frequency, the results are fairly stable
(Table 7). With all neighborhoods in the model, the ATE accounts for 61% of the observed
influence of neighborhood type on walking frequency; for the remaining eight models, this share
ranges from 55% to 62%. With all neighborhoods in the model, the ATE accounts for 54% of
the mean frequency for the whole sample; for the eight models, this proportion ranges from 47% to
59%. For strolling frequency, the results show similar patterns although the range is somewhat
larger than that for walking frequency. Therefore, the results based on all neighborhoods have
[Table 7]
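The leave-one-neighborhood-out procedure can be sketched as a simple loop. For brevity the sketch re-estimates an unadjusted difference in means rather than the propensity score adjusted ATE used in the study, and the neighborhood labels (T1, T2, S1, S2) and frequencies are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def walking_gap(records):
    """Mean walking frequency, traditional minus suburban.

    records: list of (neighborhood, is_traditional, frequency) tuples.
    """
    trad = [f for _, t, f in records if t]
    sub = [f for _, t, f in records if not t]
    return mean(trad) - mean(sub)


def leave_one_out(records):
    """Re-estimate the gap with each neighborhood dropped in turn."""
    names = sorted({n for n, _, _ in records})
    return {n: walking_gap([r for r in records if r[0] != n]) for n in names}


# Hypothetical data: two traditional (T1, T2) and two suburban (S1, S2) areas.
records = [
    ("T1", True, 6.0), ("T1", True, 8.0),
    ("T2", True, 5.0), ("T2", True, 7.0),
    ("S1", False, 3.0), ("S1", False, 5.0),
    ("S2", False, 2.0), ("S2", False, 4.0),
]

full = walking_gap(records)
dropped = leave_one_out(records)
print(full, dropped)
```

A narrow spread of the re-estimated values around the full-sample value indicates that no single neighborhood drives the result.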
6. CONCLUSIONS
This study applies propensity score stratification to determine the causal effect of neighborhood
type on walking behavior and its share in the observed influence of neighborhood type.
Although the approach can be used to estimate treatment effects, it is not a panacea for
addressing selection bias. First, although propensity score stratification can reduce 90% of
selection bias, it cannot fully eliminate the influence of residential self-selection on travel
behavior. Second, because the propensity score model assumes that all variables affecting
outcomes and treatment assignments are measured through observed characteristics, hidden bias
can be a potential concern (Rosenbaum and Rubin, 1983). If unmeasured characteristics (for
example, attitudes are not measured in most travel diaries) are a source of self-selection, this
approach cannot compensate for that. In this study, we have measured attitudinal factors and
presumably hidden bias is not a major problem. Further, it is desirable to have two or more
treatments modeled. The overwhelming majority of propensity score applications involve only a
single treatment and one control. Recently, there have been a few applications of multinomial logit
propensity score models and ordered propensity score models (Imai and van Dyk, 2004). In this
study, we chose a binary neighborhood type, which is a coarse measurement of the built
environment. People living in the same type of neighborhood (and hence presumably receiving
the same treatment) are often exposed to different levels of treatments. Therefore, it is ideal to
use a composite measure, derived from various dimensions of the built environment, to classify
the environment. The pedestrian environment factor in Portland, OR and the transit
serviceability index in Montgomery County, MD can be potentially good measures although not
widely available.
Nevertheless, this study provides insightful evidence to understand the causal influence of the
built environment on walking behavior. First, the results show that if residential self-selection is
not controlled for, we are likely to overestimate (not to underestimate) the causal influence of the
built environment. In particular, the causal influence of neighborhood type on walking to store
frequency and strolling frequency will be overstated by 64% (=3.05/1.86-1) and 16%,
respectively. These findings weaken the observed connections between neighborhood type and walking
behavior, especially utilitarian walking. However, although both the built environment and self-
selection influence walking behavior, the former tends to play a more important role. For
walking to store frequency, the ATE of neighborhood type is 1.86 trips per month, which
accounts for 54% of the mean frequency for the whole sample. This considerable influence
provides supportive evidence for the ability of changes in the built environment to stimulate
meaningful changes in walking behavior. However, given our cross-sectional sample, we only
tested a single direction of causality – attitudes influencing behavior – and not the converse.
Accordingly (in concert with other such studies), by not allowing for the possibility that both
travel choices and the chosen built environment are changing attitudes over time, we may be
overestimating the influence of attitudes (self-selection) on the built environment and travel
behavior, and hence underestimating the influence of the built environment on travel behavior.
ACKNOWLEDGEMENTS
The data collection was funded by the UC Davis-Caltrans Air Quality Project, the Robert Wood
Johnson Foundation, and the University of California Transportation Center. The survey was
designed by Susan Handy and Patricia Mokhtarian. Thanks go to Michael Oakes for his help on
technical concepts. Comments from three anonymous referees have greatly improved the paper.
TABLE 1. Accessibility of Residents in Traditional vs. Suburban Neighborhoods
                      Traditional                                                 Suburban
                      All         Mountain    SR Junior   MD          SC          All         Sunnyvale   SR Rincon   MD          SC          p-value
                                  View        College     Central     Midtown                             Valley      Suburban    Natomas
No. of business types w/in…
  400m                2.6         2.5         2.1         1.2         4.1         0.8         1.1         0.8         0.8         0.6         0.00
  1600m               13.0        13.5        13.4        10.4        14.1        9.6         9.1         8.7         10.9        9.4         0.00
Minimum distance in meters to…
  Any business        247         284         235         298         192         557         462         581         502         704         0.00
  Institutional       377         417         381         427         305         760         574         727         683         1087        0.00
  Maintenance         380         351         408         478         317         819         873         851         663         898         0.00
  Eat-out             526         587         438         816         349         789         794         955         696         740         0.00
  Leisure             508         547         618         654         293         814         692         932         799         869         0.00
N                     882         220         208         183         271         741         209         155         197         180
SR = Santa Rosa, MD = Modesto, SC = Sacramento; Mountain View and Sunnyvale are in Silicon Valley. "All" pools the four neighborhoods of each type.
Note: accessibility was estimated for each respondent, based on distance along the street network from home to a variety of destinations classified
as institutional (bank, church, library, and post office), maintenance (grocery store and pharmacy), eating-out (bakery, pizza, ice cream, fast food,
and take-out), and leisure (health club, bookstore, bar, theater, and video rental). Commercial establishments were identified using on-line yellow
pages, and ArcGIS was used to calculate network distances between addresses for survey respondents and commercial establishments.
Table 2. Sample vs. Population Characteristics
                          Traditional                                     Suburban
                          Mountain    SR Junior   MD          SC          Sunnyvale   SR Rincon   MD          SC
                          View        College     Central     Midtown                 Valley      Suburban    Natomas
Sample Characteristics
  Number                  228         215         184         271         217         165         220         182
  Percent of females      47.3        54.3        56.3        58.2        46.9        50.9        50.9        54.9
  Average auto ownership  1.80        1.63        1.59        1.50        1.79        1.66        1.88        1.68
  Age                     43.3        47.0        51.3        43.4        47.1        54.7        53.2        45.6
  Average HH size         2.08        2.03        2.13        1.78        2.58        2.19        2.41        2.35
  Percent of HHs w/kids   21.1        18.6        21.7        8.9         42.4        24.8        25.5        31.9
  Percent of home owners  51.1        57.8        75.6        47.0        61.1        68.7        81.0        82.4
  Median HH income (k$)   98.7        55.5        45.5        64.2        95.0        49.5        55.5        55.3
Population Characteristics
  Age                     36.1        36.3        36.5        42.7        35.9        38.3        38.1        31.7
  Average HH size         2.08        2.21        2.46        1.79        2.66        2.48        2.51        2.57
  Percent of HHs w/kids   19.3        20.3        32.9        12.4        35.3        35.4        34.2        41.7
  Percent of home owners  34.3        31.2        58.8        34.3        53.2        63.5        61.4        55.2
  Median HH income (k$)   74.3        40.2        42.5        43.8        88.4        49.6        40.2        46.2
Notes: SR = Santa Rosa, MD = Modesto, SC = Sacramento, HH = household
Table 3. Key Variables Loading on Residential Preference and Travel Attitude Factors
Factor: Statements (loadings)
Residential Preferences
Accessibility: Easy access to a regional shopping mall (0.854); easy access to downtown (0.830); other amenities such as a pool or a community center available nearby (0.667); shopping areas within walking distance (0.652); easy access to the freeway (0.528); good public transit service (bus or rail) (0.437)
Physical activity options: Good bicycle routes beyond the neighborhood (0.882); sidewalks throughout the neighborhood (0.707); parks and open spaces nearby (0.637); good public transit service (bus or rail) (0.353)
Safety: Quiet neighborhood (0.780); low crime rate within neighborhood (0.759); low level of car traffic on neighborhood streets (0.752); safe neighborhood for walking (0.741); safe neighborhood for kids to play outdoors (0.634); good street lighting (0.751)
Socializing: Diverse neighbors in terms of ethnicity, race, and age (0.789); lots of people out and about within the neighborhood (0.785); lots of interaction among neighbors (0.614); economic level of neighbors similar to my level (0.476)
Attractiveness: Attractive appearance of neighborhood (0.780); high level of upkeep in neighborhood (0.723); variety in housing styles (0.680); big street trees (0.451)
Outdoor spaciousness: Large back yards (0.876); large front yards (0.858); lots of off-street parking (garages or driveways) (0.562); big street trees (0.404)
Travel Attitudes
Pro-bike/walk: I like riding a bike (0.880); I prefer to bike rather than drive whenever possible (0.865); biking can sometimes be easier for me than driving (0.818); I prefer to walk rather than drive whenever possible (0.461); I like walking (0.400); walking can sometimes be easier for me than driving (0.339)
Pro-transit: I like taking transit (0.778); I prefer to take transit rather than drive whenever possible (0.771); public transit can sometimes be easier for me than driving (0.757); I like walking (0.363); walking can sometimes be easier for me than driving (0.344); traveling by car is safer overall than riding a bicycle (0.338)
Pro-travel: The trip to/from work is a useful transition between home and work (0.683); Travel time is generally wasted time (-0.681); I use my trip to/from work productively (0.616); The only good thing about traveling is arriving at your destination (-0.563); I like driving (0.479)
Travel minimizing: Fuel efficiency is an important factor for me in choosing a vehicle (0.679); I prefer to organize my errands so that I make as few trips as possible (0.671); I often use the telephone or the Internet to avoid having to travel somewhere (0.514); The price of gasoline affects the choices I make about my daily travel (0.513); I try to limit my driving to help improve air quality (0.458); Vehicles should be taxed on the basis of the amount of pollution they produce (0.426); When I need to buy something, I usually prefer to get it at the closest store possible (0.332)
Safety of car: Traveling by car is safer overall than riding a bicycle (0.489); traveling by car is safer overall than walking (0.753); traveling by car is safer overall than taking transit (0.633); the region needs to build more highways to reduce traffic congestion (0.444); the price of gasoline affects the choices I make about my daily travel (0.357)
Car dependent: I need a car to do many of the things I like to do (0.612); getting to work without a car is a hassle (0.524); we could manage pretty well with one fewer car than we have (or with no car) (-0.418); traveling by car is safer overall than riding a bicycle (0.402); I like driving (0.356)
Note: The numbers in parentheses are the pattern matrix loadings for the obliquely rotated factors.
TABLE 4. Binary Logit Model for Propensity Score
Coefficients p-value
Constant 0.726 0.036
Socio-demographics
Renter 0.822 0.000
Income (k$) 0.006 0.002
Age -0.003 0.555
# adults in the household -0.412 0.000
# children (<18) in the household -0.404 0.000
Female 0.209 0.095
Neighborhood Preferences
Spaciousness -0.107 0.440
Spaciousness-square -0.136 0.010
Spaciousness x household size 0.071 0.200
Accessibility 0.061 0.455
Physical activity options -0.088 0.288
Safety -0.527 0.000
Socializing 0.239 0.001
Attractiveness 0.338 0.000
Travel Attitudes
Pro-bike/walk 0.314 0.000
Pro-travel -0.072 0.232
Travel minimizing -0.043 0.481
Pro-transit 0.091 0.175
Safety of car -0.456 0.000
Car dependent -0.090 0.153
N 1553
Log-likelihood at zero -1076.46
Log-likelihood at constant -1070.93
Log-likelihood at convergence -884.93
McFadden R-square 0.178
Suburban neighborhood is the reference category.
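As a rough illustration of how the coefficients in Table 4 translate into a propensity score, the sketch below applies the standard binary logit formula, p = 1 / (1 + exp(-xb)), to one made-up respondent. The coefficients and log-likelihoods are copied from the table; the dictionary keys, the function name, the separate `hh_size` input, and the example respondent are all assumptions introduced for illustration, and the variable coding (for example, income in thousands of dollars) is assumed to follow the table labels.

```python
import math

# Coefficients as reported in Table 4 (traditional = 1, suburban = 0 as reference).
COEF = {
    "constant": 0.726, "renter": 0.822, "income_k": 0.006, "age": -0.003,
    "n_adults": -0.412, "n_children": -0.404, "female": 0.209,
    "spaciousness": -0.107, "spaciousness_sq": -0.136,
    "spaciousness_x_hhsize": 0.071, "accessibility": 0.061,
    "physical_activity": -0.088, "safety": -0.527, "socializing": 0.239,
    "attractiveness": 0.338, "pro_bike_walk": 0.314, "pro_travel": -0.072,
    "travel_minimizing": -0.043, "pro_transit": 0.091,
    "safety_of_car": -0.456, "car_dependent": -0.090,
}

def propensity_score(x: dict) -> float:
    """Predicted probability of living in a traditional neighborhood."""
    s = x.get("spaciousness", 0.0)
    # hh_size is a hypothetical helper input, used only for the interaction term.
    terms = {k: v for k, v in x.items() if k != "hh_size"}
    terms["constant"] = 1.0
    terms["spaciousness_sq"] = s * s
    terms["spaciousness_x_hhsize"] = s * x.get("hh_size", 0.0)
    utility = sum(COEF[k] * v for k, v in terms.items())
    return 1.0 / (1.0 + math.exp(-utility))

# Hypothetical respondent: a 40-year-old female renter living alone, income 60 k$,
# average on every factor score except a pro-bike/walk score of 1.
example = {"renter": 1, "income_k": 60, "age": 40, "n_adults": 1, "n_children": 0,
           "female": 1, "hh_size": 1, "pro_bike_walk": 1.0}
p_example = propensity_score(example)

# McFadden R-square relative to the zero-coefficient model, from the reported log-likelihoods
mcfadden_r2 = 1.0 - (-884.93) / (-1076.46)

print(f"propensity score: {p_example:.3f}")
print(f"McFadden R-square: {mcfadden_r2:.3f}")
```

Inputs that are omitted default to zero, which for the standardized factor scores corresponds to an average respondent.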
TABLE 5. Comparison of Covariates between Traditional and Suburban Neighborhoods before and after Stratification
Variables  Traditional neighborhood  Suburban neighborhood  Treatment t-statistic before stratification a  Treatment F-statistic after stratification b  Interaction F-statistic after stratification c
Socio-demographics
Education background 4.280 (0.0450, 841)d 4.080 (0.0500, 709) 2.91** 0.02 0.06
Household income ($) 71508 (1259, 842) 68176 (1327, 711) 1.82 0.06 1.51
# cars 1.660 (0.0280, 842) 1.820 (0.0320, 711) -3.93*** 0.15 0.15
Age 45.0 (0.519, 842) 49.1 (0.558, 711) -5.36*** 0.03 0.95
Household size 2.020 (0.0360, 842) 2.470 (0.0500, 711) -7.37*** 1.88 2.35
# adult 1.725 (0.0233, 842) 1.900 (0.0297, 711) -7.84*** 1.36 1.07
# children (≤5) 0.120 (0.0140, 842) 0.180 (0.0190, 711) -2.87** 0.33 1.88
# children (≤12) 0.200 (0.0190, 842) 0.350 (0.0280, 711) -4.37*** 0.01 1.95
# children (<18) 0.290 (0.0240, 842) 0.570 (0.0350, 711) -6.49*** 0.74 1.76
Renter (dummy) 0.440 (0.0170, 842) 0.260 (0.0170, 711) 7.21*** 0.08 1.38
Female (dummy) 0.530 (0.0170, 842) 0.500 (0.0190, 711) 1.49 0.10 0.37
Worker (dummy) 0.840 (0.0130, 837) 0.790 (0.0150, 705) 2.45 * 0.35 0.83
Neighborhood Preferences
Accessibility -0.357 (0.0304, 842) -0.409 (0.0372, 711) 1.08 0.34 0.42
Physical activity options -0.306 (0.0357, 842) -0.329 (0.0382, 711) 0.45 0.05 0.95
Safety 0.215 (0.0297, 842) 0.609 (0.0270, 711) -9.84*** 0.57 0.41
Socializing -0.199 (0.0368, 842) -0.294 (0.0412, 711) 1.71 0.40 0.43
Spaciousness -0.121 (0.0333, 842) 0.006 (0.0376, 711) -2.53* 0.20 2.14
Attractiveness 0.080 (0.0318, 842) 0.013 (0.0311, 711) 1.51 1.03 0.71
Travel Attitudes
Pro-bike/walk 0.216 (0.0359, 842) -0.219 (0.0334, 711) 8.86*** 0.08 0.47
Pro-travel -0.027 (0.0353, 842) 0.021 (0.0366, 711) -0.95 0.27 0.49
Travel minimizing 0.0166 (0.0343, 842) -0.018 (0.0377, 711) 0.68 0.00 1.24
Pro-transit 0.146 (0.0353, 842) -0.171 (0.0355, 711) 6.33*** 0.01 0.15
Safety of car -0.276 (0.0346, 842) 0.288 (0.0332, 711) -11.75*** 2.52 2.04
Car dependent -0.059 (0.0358, 842) 0.076 (0.0351, 711) -2.70** 0.08 1.13
* 0.01<p<0.05 ** 0.001<p<0.01 *** p<0.001
a. t-statistic = independent sample t-statistic with Levene’s test for equality of variances.
b. F-statistic for main effect of neighborhood type after adjusting for propensity score quintiles.
c. F-statistic for interaction effect between neighborhood type and propensity score quintile.
d. mean (standard error of mean, number of observations).
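The pre-stratification t-statistics can be approximately recovered from the printed means and standard errors. The sketch below assumes the unequal-variance form of the two-sample statistic, t = (m1 - m2) / sqrt(se1^2 + se2^2); footnote a does not state which variance assumption was used for each row, so this is an illustration and not a reproduction of the original computation. The function name is hypothetical, and because the inputs are rounded the results match the table only to within about 0.02.

```python
import math

def t_from_summary(mean_1: float, se_1: float, mean_2: float, se_2: float) -> float:
    """Unequal-variance two-sample t-statistic from group means and standard errors."""
    return (mean_1 - mean_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)

# (traditional mean, SE, suburban mean, SE) as printed in Table 5
rows = {
    "Household income ($)": (71508, 1259, 68176, 1327),  # reported t = 1.82
    "Age": (45.0, 0.519, 49.1, 0.558),                    # reported t = -5.36
    "Pro-bike/walk": (0.216, 0.0359, -0.219, 0.0334),     # reported t = 8.86
}

t_values = {name: t_from_summary(*vals) for name, vals in rows.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```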
Table 6. Average Treatment Effects on Travel Behavior
                     Q1       Q2       Q3       Q4       Q5       ATE      Mean a   Caliper  Kernel
Walk    Suburban     1.25     1.71     2        2.36     3.79              1.81
        N            219      213      155      87       34                708
        Traditional  1.48     3.71     3.12     4.39     7.68              4.86
        N            88       96       154      222      275               835
        ATE          0.23     2        1.12     2.03     3.89     1.86     3.05     1.98     1.95
Stroll  Suburban     7.27     7.24     8.37     7.7      6.88              7.53
        N            218      214      155      87       34                708
        Traditional  8.35     9.65     8.83     9.2      11.7              9.91
        N            88       95       155      222      274               834
        ATE          1.08     2.41     0.46     1.5      4.82     2.05     2.38     2.20     2.23
a. Mean outcomes without a propensity score adjustment
Note: Q1 is the stratum of observations whose propensity scores for living in a traditional neighborhood do not exceed the 20th percentile of the
propensity scores (people with the lowest propensity to live in a traditional neighborhood); Q2 is the stratum of observations whose propensity scores
are larger than the 20th percentile but do not exceed the 40th percentile of the propensity scores; similarly for Q3, Q4, and Q5. Q5 includes people
with the highest propensity to live in a traditional neighborhood.
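The ATE column of Table 6 can be reproduced from the per-quintile means and sample sizes. The sketch below assumes that each stratum's difference in means is weighted by that stratum's share of the total sample; this weighting is an assumption made here, although it does return the reported 1.86 and 2.05. The data structure and function name are hypothetical.

```python
# Per-quintile means and sample sizes (Q1..Q5) as printed in Table 6.
walk = {
    "suburban":    {"mean": [1.25, 1.71, 2.00, 2.36, 3.79], "n": [219, 213, 155, 87, 34]},
    "traditional": {"mean": [1.48, 3.71, 3.12, 4.39, 7.68], "n": [88, 96, 154, 222, 275]},
}
stroll = {
    "suburban":    {"mean": [7.27, 7.24, 8.37, 7.70, 6.88], "n": [218, 214, 155, 87, 34]},
    "traditional": {"mean": [8.35, 9.65, 8.83, 9.20, 11.70], "n": [88, 95, 155, 222, 274]},
}

def stratified_ate(table: dict) -> float:
    """Average the within-stratum mean differences, weighted by stratum size."""
    sub, trad = table["suburban"], table["traditional"]
    stratum_n = [a + b for a, b in zip(sub["n"], trad["n"])]
    stratum_effect = [t - s for t, s in zip(trad["mean"], sub["mean"])]
    total_n = sum(stratum_n)
    return sum(e * n for e, n in zip(stratum_effect, stratum_n)) / total_n

ate_walk = stratified_ate(walk)
ate_stroll = stratified_ate(stroll)
print(f"walking to store: ATE = {ate_walk:.2f} trips per month")
print(f"strolling:        ATE = {ate_stroll:.2f} trips per month")
```

Because the quintiles are defined on the pooled sample, the stratum sizes are nearly equal, so a simple unweighted average of the five stratum effects gives almost the same answer.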
Figure 1. The Relationship between Self-Selection and Misestimation
[Figure: panels for the non-walkable (μ1) and walkable (μ2) neighborhoods; random case: ATE = μ2 – μ1]
μ1, μ1’, and μ1” are observed mean walking behavior of people living in the non-walkable neighborhood;
μ2, μ2’, and μ2” are observed mean walking behavior of people living in the walkable neighborhood.
Figure 2. Comparison of Traditional and Suburban Neighborhoods (Sacramento)
[Figure: panels Sacramento – Traditional and Sacramento – Suburban; rows: street network, houses, commercial centers]
REFERENCES
Babbie E R, 2007 The practice of social research (Thomson Wadsworth, Belmont, CA)
Boarnet M G, Day K, Anderson C, McMillan T, Alfonzo M, 2005, "California's safe routes to school
program - Impacts on walking, bicycling, and pedestrian safety" Journal of the American Planning
Association 71 301-317
Boarnet M G, Sarmiento S, 1998, "Can land-use policy really affect travel behaviour? A study of the link
between non-work travel and land-use characteristics" Urban Studies 35 1155-1169
Boer R, Zheng Y, Overton A, Ridgeway G K, Cohen D A, 2007, "Neighborhood Design and Walking
Trips in Ten U.S. Metropolitan Areas" American Journal of Preventive Medicine 32 298-304
Cao X, 2008, "Is Alternative Development Undersupplied? Examination of Residential Preferences and
Choices of Northern California Movers" Transportation Research Record: Journal of the Transportation
Research Board 2077 97-105
Cao X, Mokhtarian P L, Handy S L, 2009, "Examining the Impacts of Residential Self-Selection on
Travel Behaviour: A Focus on Empirical Findings" Transport Reviews 29 359-395
Chatman D G, 2009, "Residential choice, the built environment, and nonwork travel: evidence using new
data and methods" Environment and Planning A 41 1072-1089
Cochran W G, 1968, "The Effectiveness of Adjustment by Subclassification in Removing Bias in
Observational Studies" Biometrics 24 295-313
Crane R, 2000, "The Influence of Urban Form on Travel: An Interpretive Review" Journal of Planning
Literature 15 3-23
Ewing R, Bartholomew K, Winkelman S, Walters J, Chen D, 2008, "Growing Cooler: The Evidence on
Urban Development and Climate Change", (Urban Land Institute, Washington, DC)
Ewing R, Cervero R, 2001, "Travel and the Built Environment: A Synthesis" Transportation Research
Record: Journal of the Transportation Research Board 1780 87-114
Frank L D, Engelke P O, 2001, "The Built Environment and Human Activity Patterns: Exploring the
Impacts of Urban Form on Public Health" Journal of Planning Literature 16 202-218
Frank L D, Saelens B E, Powell K E, Chapman J E, 2007, "Stepping towards causation: Do built
environments or neighborhood and travel preferences explain physical activity, driving, and obesity?"
Social Science & Medicine 65 1898-1914
Handy S, 1996, "Methodologies for exploring the link between urban form and travel behavior"
Transportation Research Part D: Transport and Environment 1 151-165
Handy S, Cao X, Mokhtarian P L, 2006, "Self-Selection in the Relationship between the Built
Environment and Walking: Empirical Evidence from Northern California" Journal of the American
Planning Association 72 55-74
Handy S, Mokhtarian P, Buehler T J, Cao X, 2004, "Residential Location Choice and Travel Behavior:
Implications for Air Quality", (University of California, Davis)
Imai K, van Dyk D A, 2004, "Causal inference with general treatment regimes: Generalizing the
propensity score" Journal of the American Statistical Association 99 854-866
Khattak A J, Rodriguez D, 2005, "Travel behavior in neo-traditional neighborhood developments: A case
study in USA" Transportation Research Part A: Policy and Practice 39 481-500
Kitamura R, Mokhtarian P L, Daidet L, 1997, "A micro-analysis of land use and travel in five
neighborhoods in the San Francisco Bay Area" Transportation 24 125-158
Krizek K J, 2003, "Residential Relocation and Changes in Urban Travel: Does Neighborhood-Scale
Urban Form Matter?" Journal of the American Planning Association 69 265-281
Luellen J K, Shadish W R, Clark M H, 2005, "Propensity Scores: An Introduction and Experimental Test"
Evaluation Review 29 530-558
Mokhtarian P L, Cao X, 2008, "Examining the impacts of residential self-selection on travel behavior: A
focus on methodologies" Transportation Research Part B: Methodological 42 204-228
Oakes M J, Johnson P J, 2006, "Propensity score matching for social epidemiology", in Methods in
epidemiology Eds M J Oakes, J S Kaufman (John Wiley & Sons, Inc., New York)
Pinjari A, Pendyala R, Bhat C, Waddell P, 2007, "Modeling residential sorting effects to understand the
impact of the built environment on commute mode choice" Transportation 34 557-573
Rosenbaum P R, Rubin D B, 1983, "The Central Role of the Propensity Score in Observational Studies
for Causal Effects" Biometrika 70 41-55
Rosenbaum P R, Rubin D B, 1984, "Reducing Bias in Observational Studies Using Subclassification on
the Propensity Score" Journal of the American Statistical Association 79 516-524
Rubin D B, Thomas N, 1996, "Matching Using Estimated Propensity Scores: Relating Theory to
Practice" Biometrics 52 249-264
Salon D, 2006 Cars and the city: An investigation of transportation and residential location choices in
New York City, Agricultural and Resource Economics, University of California, Davis
Schwanen T, Mokhtarian P L, 2003, "Does dissonance between desired and current neighborhood type
affect individual travel behaviour? An empirical assessment from the San Francisco Bay Area", in
Proceedings of the European Transport Conference (ETC), Strasbourg, France
Schwanen T, Mokhtarian P L, 2004, "The extent and determinants of dissonance between actual and
preferred residential neighborhood type" Environment and Planning B: Planning and Design 31 759-784
Schwanen T, Mokhtarian P L, 2005, "What affects commute mode choice: neighborhood physical
structure or preferences toward neighborhoods?" Journal of Transport Geography 13 83-99
Sommer B B, Sommer R, 1997 A practical guide to behavioral research: tools and techniques (Oxford
University Press, New York)
Vance C, Hedel R, 2007, "The impact of urban form on automobile travel: disentangling causation from
correlation" Transportation 34 575-588
Walker J, Li J, 2007, "Latent lifestyle preferences and household location decisions" Journal of
Geographical Systems 9 77-101
Winship C, Morgan S L, 1999, "The estimation of causal effects from observational data" Annual Review
of Sociology 25 659-706
Zhou B, Kockelman K, 2008, "Self-Selection in Home Choice: Use of Treatment Effects in Evaluating
Relationship Between Built Environment and Travel Behavior" Transportation Research Record: Journal
of the Transportation Research Board 2077 54-61
Ziliak S T, McCloskey D N, 2004, "Size matters: the standard error of regressions in the American
Economic Review" Journal of Socio-Economics 33 527-546