Download as pdf or txt
Download as pdf or txt
You are on page 1of 21

sustainability

Article
A Machine Learning and Computer Vision Study of the
Environmental Characteristics of Streetscapes That Affect
Pedestrian Satisfaction
Jiyun Lee 1 , Donghyun Kim 1 and Jina Park 2, *

1 Department of Urban Planning, Hanyang University, Seoul 04763, Korea; 2asyun@gmail.com (J.L.);
hyunurban@hanyang.ac.kr (D.K.)
2 Department of Urban Planning and Engineering, Hanyang University, Seoul 04763, Korea
* Correspondence: paran42@hanyang.ac.kr; Tel.: +82-2-2220-0332

Abstract: Pedestrian-friendly cities are a recent global trend due to the various urbanization problems.
Since humans are greatly influenced by sight while walking, this study identified the physical and
visual characteristics of the street environment that affect pedestrian satisfaction. In this study, vast
amounts of visual data were collected and analyzed using computer vision techniques. Furthermore,
these data were analyzed through a machine learning prediction model and SHAP algorithm. As
a result, every visual feature of the streetscape, for example, the visible area and urban design
quality, had a greater effect on pedestrian satisfaction than any physical features. Therefore, to
build a street with high pedestrian satisfaction, the perspective of pedestrians must be considered,
and wide sidewalks, fewer lanes, and the proper arrangement of street furniture are required. In
conclusion, visually, low enclosure, adequate complexity, and large green areas combine to create
a highly satisfying pedestrian walkway. Through this study, we could suggest an approach from
Citation: Lee, J.; Kim, D.; Park, J. A
Machine Learning and Computer
a visual perspective for the pedestrian environment of the street and see the possibility of using
Vision Study of the Environmental computer vision techniques.
Characteristics of Streetscapes That
Affect Pedestrian Satisfaction. Keywords: street environment; streetscapes; walking satisfaction; computer vision; machine learning;
Sustainability 2022, 14, 5730. explainable AI; SHAP
https://doi.org/10.3390/su14095730

Academic Editors: Elisa Conticelli,


Paulo Ribeiro, George
N. Papageorgiou and
1. Introduction
Fernando Fonseca South Korea has experienced rapid economic growth and achieved rapid industrial-
ization and urbanization. In that process, to quickly secure the competitiveness of the city,
Received: 24 March 2022
a car-oriented transportation system was established. However, the automobile-centered
Accepted: 6 May 2022
lifestyle has created many problems, including traffic congestion and environmental pol-
Published: 9 May 2022
lution. Walking is the most basic means of transport for humans; it is eco-friendly and
Publisher’s Note: MDPI stays neutral promotes good health without producing any pollutants. Therefore, walking is one way
with regard to jurisdictional claims in to solve the urban problems caused by automobiles, and it has become the basic back-
published maps and institutional affil- ground for Compact City and Transit-Oriented Development. In fact, the transition from
iations. automobile-centric cities to pedestrian-friendly cities is gaining attention worldwide.
Walking mainly takes place on the road, and people perceive the streetscape mainly
through sight [1]. Visual cognition, along with various other factors such as the physical
environment of the street, produces a response to the environment in which that cognition
Copyright: © 2022 by the authors.
occurs that can eventually be experienced as satisfaction [2]. Therefore, to allow residents
Licensee MDPI, Basel, Switzerland.
to feel satisfied with walking, the landscape plan should consider what pedestrians see.
This article is an open access article
distributed under the terms and
Prior studies examined the relationship between pedestrians and the walking envi-
conditions of the Creative Commons
ronment. However, few studies have looked at the relationship between walking as an
Attribution (CC BY) license (https:// activity and the walking environment because that project is labor-intensive, and the scope
creativecommons.org/licenses/by/ of analysis has been limited to field investigations or surveys. Therefore, to do the research
4.0/). needed to create a truly pedestrian-friendly city, researchers need to exceed the limitations

Sustainability 2022, 14, 5730. https://doi.org/10.3390/su14095730 https://www.mdpi.com/journal/sustainability


Sustainability 2022, 14, 5730 2 of 21

of the prior studies. For example, the pedestrian environment will need to be addressed in
a broad way to make it possible to draw generalizable conclusions.
In recent years, newly developed technology has expanded the methodological possi-
bilities for this kind of research [3]. For example, Google and other platforms now provide
street view services that allow anyone to freely navigate vast amounts of street data through
online map services. Accordingly, methods for batch processing of visual image data using
computer vision techniques are emerging [4–7]. Therefore, studies beyond the limits of
previous methods can now be undertaken using advanced technologies.

2. Literature Review
2.1. Streetscapes and Walkability
Streets are the most essential and representative element of a city, forming the city’s
structure and characteristics. The visually perceived streetscape represents the landscape
image of a city or region [8].
Streetscapes are not created by the individual landscape elements that make up
the street; instead, they are created by the mutual connections of multiple elements [9].
Nonetheless, the physical environmental elements that make up the street play an impor-
tant role in establishing a street’s identity. They greatly affect human visual perception, the
physical environment of the street, and human visual perception plays an important role in
determining the image of a place [10].
Walking is the most basic human means of transportation. Humans experience the
urban environment comprehensively by walking [11], so pedestrian activity makes a great
contribution to the formation of local communities, as well as vitalizing the urban economy
by creating natural contacts among urban residents. Therefore, walking is an essential
factor in revitalizing local and urban economies. A great street environment improves
the frequency of outdoor activities and affects the behavior of street users, including
pedestrians. For example, trees and green spaces, low building layouts, aesthetic facades
of streets and buildings, and open spaces between them amplify the attractiveness of the
walking experience and attract pedestrians [12]. Because pedestrians continue to walk
voluntarily in an environment that continuously offers walking motivation, the meaning
and attributes of the walking environment must be considered in combination to create a
good walking experience. Therefore, walking satisfaction has a high correlation with the
perception of the walking environment [13].
Various authors argued that people’s perceptions could be objectively evaluated by
dealing with the walking environment of a street from a visual point of view. They defined
five factors: imageability, enclosure, human scale, openness, and complexity [6,14,15]. They
then argued that those factors influence people’s perceptions of the pedestrian environment
in a measurable way, enabling researchers to concretely express the relationship between
the physical characteristics of a street environment and pedestrians. Ernawati et al. [16]
classified pedestrian environmental elements into spatial and visual factors. They con-
sidered urban design characteristics, which are the perceptual characteristics of the street
landscape, to be important for promoting pedestrian activities.

2.2. Machine-Learning Model and Explainable AI


Machine learning is a field of artificial intelligence (AI) in which an algorithm predicts
results by applying a mathematical technique that learns patterns based on data to enhance
statistical reliability and minimize prediction errors [17]. In other words, machine learning
predicts the output from new input data by learning from old data and automatically
sensing data trends [18].
Machine learning can be flexibly applied to various situations because it focuses
on understanding the major relationships between input variables and evaluating and
predicting patterns in the input data through data learning [19]. Machine learning can
analyze data that are difficult to analyze with traditional statistics because it can collect
data without a specific intention or data that have various forms [20], and it can analyze
Sustainability 2022, 14, 5730 3 of 21

complex interactions between nonlinear variables in the models [21]. Therefore, in today’s
era of big data, especially unstructured data, many scholars are using predictive models
based on machine learning algorithms to interpret data and make accurate predictions.
Machine-learning models are called black-box models because the internal algorithms
used are complex, making it difficult to explain the reasoning behind the resulting predic-
tions. However, there are thousands or millions of hyperparameters inside the model that
are irrelevant to the input value but affect the predictive power [22]. So many studies have
been conducted to interpret these prediction processes and validate the reliability of the
results. Accordingly, these days, machine learning is no longer called a black-box model,
and there are many different ways to interpret it.
Explainable artificial intelligence (XAI) is a technique for interpreting a machine-
learning model [7]. Unlike other forms of AI, which cannot provide evidence, reliability
statistics, or error correction methods for the derived results, XAI can explain the process,
reasoning, and cause of errors throughout a machine-learning model [23]. XAI can identify
the reasons behind the results derived by AI and explain the causes of any errors. We
tried to improve upon existing research by providing an explanation interface that humans
can understand and interpret immediately, rather than simply showing the calculated
result values.
When a machine-learning model is analyzed through XAI, the following effects can be
obtained [24]. First, analyzing the characteristics of the input variables makes it possible
to identify factors that degrade the performance of a machine-learning model and derive
the optimal hyperparameters for improving the model performance. Second, XAI can
offer stability to the problem-solving process. In machine-learning models, it is important
to design a fluid model that can operate under various conditions. Third, analyzing the
structure of the model makes it possible to identify and cope with the causes of any errors
that occur in the model and makes it possible to secure the reliability of the model by
explaining the result-derivation process. Some examples of XAI techniques are Explain
Like I’m 5 (ELI5), Skater, Local Interpretable Model-Agnostic Explanation (LIME), and
Shapley Additive Explanations (SHAP).
Among them, the SHAP algorithm is one of the XAI techniques to interpret machine
learning by evaluating the feature importance of each variable in the prediction aspect of
machine learning and follows a game-theoretic approach to interpret the output of the
machine learning model [25–27]. Therefore, SHAP is widely used in terms of interpreting
machine learning models reliably.
Ding et al. [28] studied the importance of the characteristics of the construction en-
vironment for driving distance using the gradient boosting model, which is one of the
machine learning models. Moreover, researchers have used SHAP to validate the predictive
performance of machine learning–derived land-use change predictions [29], home price
prediction through a street view analysis [30], and the prediction of traffic accidents [31].
Thus, research is expanding not only by using machine learning to analyze data but also by
using XAI techniques to interpret the results from machine-learning models.

2.3. Limitation of Previous Studies and Differences in This Study


Many studies have analyzed the effects of street design factors on pedestrian satisfac-
tion. However, most preceding studies have used limited research targets due to limitations
in data collection or measurement methods, and they viewed the street environment fac-
tors fragmentarily, such as judging the physical elements of a street environment only by
their existence.
Some studies about pedestrian environments have focused on only one or a few streets
and lacked consideration for visual cognitive characteristics [9,32,33]. Likewise, research
that classifies walking space using attributes carries the inherent limitation of difficulty in
generalizing the results because the criteria for selecting emotional elements or dividing
the street space are different for each researcher [32,34].
Sustainability 2022, 14, x FOR PEER REVIEW 4 of 22

search that classifies walking space using attributes carries the inherent limitation of dif-
Sustainability 2022, 14, 5730 ficulty in generalizing the results because the criteria for selecting emotional elements 4orof 21
dividing the street space are different for each researcher [32,34].
In this study, we synthetically considered the physical environmental factors in a
streetscape that affect pedestrians’ walking satisfaction. First, we measured the compo-
In this study, we synthetically considered the physical environmental factors in a
nents of the streetscape from the visual perspective to determine which characteristics
streetscape that affect pedestrians’ walking satisfaction. First, we measured the components
influence pedestrian satisfaction. And then, we used computer vision techniques to quan-
of the streetscape from the visual perspective to determine which characteristics influence
tify and measure the visual perceptions of a streetscape. Using these data, we built ma-
pedestrian satisfaction. And then, we used computer vision techniques to quantify and
chine learning models, compared the performances, chose the best model, and interpreted
measure the visual perceptions of a streetscape. Using these data, we built machine
it using SHAP, which is one of the XAI techniques. By using computer vision and machine
learning models, compared the performances, chose the best model, and interpreted it
learning
using techniques,
SHAP, which weiscould
one ofanalyze
the XAIvisual data to By
techniques. present
usingan urban design
computer visionperspective
and machine
forlearning
improving the pedestrian
techniques, environment.
we could Additionally,
analyze visual we discussed
data to present an urbanthe possibility
design of
perspective
using computer vision as basic data to improve the quality of streetscapes in the future.
for improving the pedestrian environment. Additionally, we discussed the possibility of
using computer vision as basic data to improve the quality of streetscapes in the future.
3. Materials and Methods
3.1.3.Materials
Materials and Methods
3.1. Materials
Seoul, the capital and largest city of South Korea, is already carrying out projects to
create aSeoul,
pedestrian-friendly
the capital andenvironment, and
largest city of studies
South areisbeing
Korea, conducted
already carryingtoout
determine
projects to
thecreate
best way to create a highly satisfactory
a pedestrian-friendly environment,pedestrian environment
and studies are beingthat encourages
conducted urban
to determine
residents
the besttowaywalk.to create a highly satisfactory pedestrian environment that encourages urban
This study
residents is based on data from the Seoul Floating Population Report (Table 1),
to walk.
which is a large-scale
This study survey
is based conducted
on data from theinSeoul
Seoul. The Seoul
Floating Metropolitan
Population Report Government
(Table 1), which
surveyed 20,000 pedestrians at 1000 points per year from 2009 to 2015. Seoul’s floating
is a large-scale survey conducted in Seoul. The Seoul Metropolitan Government surveyed
population survey data
20,000 pedestrians at are
1000divided
points into observation
per year from 2009 data, attribute
to 2015. data,floating
Seoul’s and point data.
population
Among
survey them,
datawe areinvestigated
divided into the attributedata,
observation data attribute
by dividing the
data, overall
and pointsatisfaction
data. Among with
them,
the walking environment into a 5-point Likert scale from ‘very unsatisfied’ to ‘very satis-
we investigated the attribute data by dividing the overall satisfaction with the walking
environment
fied.’ These datainto werea collected
5-point Likert
from scale from ‘very
individual unsatisfied’
surveys to ‘very
of pedestrians satisfied.’
passing These
through
data were collected from individual surveys of pedestrians passing through
representative points, and they contain the visitor’s gender, age, occupation, residence, representative
points,frequency
purpose, and they contain theoverall
of visits, visitor’s gender, age,
satisfaction andoccupation, residence,
dissatisfaction scorespurpose, frequency
for the walking
of visits, overall satisfaction and dissatisfaction scores for the walking
environment, and the reasons for those scores. Additionally, the report provides infor- environment, and the
reasons
mation aboutforthe
those scores.environment
physical Additionally,ofthe report
each provides
street, including information
the locationabout thesurvey
of the physical
environment of each street, including the location of the survey point,
point, the sidewalk status at each point, and so on. Therefore, because the Seoul Floating the sidewalk status
at each point, and so on. Therefore, because the Seoul Floating Population
Population Report contains information about both the physical environment at each sur- Report contains
information about both the physical environment at each survey point and pedestrian
vey point and pedestrian reports of walking satisfaction, we used it as the basic data for
reports of walking satisfaction, we used it as the basic data for this study.
this study.
Table 1. Contents of the Seoul Floating Population Report.
Table 1. Contents of the Seoul Floating Population Report.
Survey Point 1000 Representative Places in Seoul Survey Locations in Seoul
Survey Point 1000 representative places in Seoul Survey Locations in Seoul
Survey Target Pedestrians passing through the representative points
Survey Target Pedestrians passing through the representative points
Survey Period Every Friday and Saturday during October 2015
Survey Range
Survey Period
10 people per point per day
Every Friday and Saturday during October 2015
Survey Method Individual interviews using questionnaires
Survey Range 10 people per point per day
Survey Method Individual interviews using questionnaires
Gender, age, occupation, residence, purpose, frequency of
Survey Contents
visits, overall
Gender, satisfactionresidence,
age, occupation, with thepurpose,
walkingfrequency
environment
of
Survey Contents
visits, overall satisfaction with the walking environment

3.1.1. Study Area


The spatial background of this study is Seoul, which is the capital and largest city
in South Korea. First, the content of this study is limited to the visual perceptions of
pedestrians, defined as the physical environmental elements that a pedestrian can visually
perceive while walking on the street. The temporal range is 2015, which is the time of the
most recent Seoul Floating Population Report. The spatial range is the entire area of Seoul,
specifically the survey points of the Floating Population Report. However, because the
walking satisfaction survey was not conducted at all the survey sites, this study targeted
The spatial
3.1.2. Streetscapes Images background
and theofNaver
this study is Seoul,
Street ViewwhichAPI
is the capital and largest c
South Korea. First, the content of this study is limited to the visual perceptions of p
The streetscape
trians, definedimages wereenvironmental
as the physical obtained using elementsthe that Naver Street
a pedestrian View
can visually
ceive while walking on the street. The temporal
Seoul Floating Population Report’s point data. Naver is the most popula range is 2015, which is the time o
Sustainability 2022, 14, 5730 most recent Seoul Floating Population Report. The spatial range is the entire 5 of 21 area of S
Korea, andspecifically
it supports the map
surveyservices and
points of the offersPopulation
Floating street view services
Report. However, similar
becaus
quiring thewalkingstreetscape images
satisfaction surveyusing
was notanconducted
internetatmap all theservice allowed
survey sites, us to
this study targ
images
only only
quickly
the points the points
andwalking
for which for which walking
easilysatisfaction
and enabled satisfaction
scores us scores
existtoingather exist in
images
the attribute the attribute data.
from the year
data. Therefore, There
978 from
sample points fromsample
the 1000points
total sample points are used, excluding the points
addition, using
978 sample points
missing the images extracted through the street view service
the
data.
1000 total are used, excluding the points with make
missing data.
measure subjective perceptions of the built environment contained in the i
In this3.1.2.
3.1.2. Streetscapesstudy,Streetscapes
Images we theImages
andused 360°
Naver and theView
Naver
panoramic
Street APIStreet View API
images from the Naver Street
The streetscape imagesusing
weretheobtained using View
the Naver Street
the View API wit
ure 1) because pedestrians perceive the overall image API
The streetscape images were obtained Naver Street of the street,
with Seoulnot a sp
Floating Population Report’s point data. Naver is the most popular portal site in Korea, portal s
Seoul Floating Population Report’s point data. Naver is the most popular
used
and Python
it supports mapto services
Korea, extract theoffers
and it supports
and landscape
map services
street view images
and offers
services from
streetthe
similar view Naver
services
to Google. Street
similar to
Acquiring View
GoogleA
quiring using
the streetscape images the streetscape
an internetimages using anallowed
map service internetus map service allowed
to acquire us to acquire se
several images
images
quickly and easily and quickly
enabledandus toeasily
gatherand enabled
images us the
from to gather
year weimages
wanted.fromInthe year we wante
addition,
using the imagesaddition,
extractedusing the images
through extracted
the street view through the street
service makes view service
it possible makes it possib
to measure
measureofsubjective
subjective perceptions perceptions ofcontained
the built environment the built environment contained in the image [35–
in the image [35–38].
In this
In this study, we study,
used 360◦we used 360° images
panoramic panoramicfromimages from the
the Naver Naver
Street Street
View View API
API
ure 1) because pedestrians perceive the overall image of the street,
(Figure 1) because pedestrians perceive the overall image of the street, not a specific view.not a specific view
used Python to extract the landscape images from the
We used Python to extract the landscape images from the Naver Street View API. Naver Street View API.

Figure 1. Example extracted Naver Street View image.

3.2. Methods
We measured streetscape data from NSV images in terms of physica
Figure 1. Example extracted Naver Street View image.
Figure 1. Example extracted Naver Street View image.
urban design quality. We used physical features from the Seoul Floating P
3.2. Methods 3.2. Methods
port, which contains the width of the sidewalk, the number of lanes, road
ence or absenceWe ofmeasured streetscape
We measured streetscape data from NSVdata frominNSV
images termsimages in terms
of physical of physical
features and features
a centerline, street furniture, slope, braille block, fence,
urban design quality. We used physical features from the Seoul Floating Populatio
urban design quality. We used physical features from the Seoul Floating Population Report,
way contains
which station, theand
port,widthcrosswalk.
which contains
of theAdditionally,
the sidewalk, width
theofnumber ofwe
the sidewalk, measured
theroad
lanes, number
type,ofurban
lanes,
the design
road
presence qu
type, the
mantic
or absencesegmentation
of aence or absence
centerline, and
street aedge
of furniture, detection,
centerline, street
slope, which
furniture,
braille block, are
slope, computer
braille
fence, vision
block,subway
bus stop, fence, bustechn
stop,
way station, and crosswalk. Additionally,
urbanwe measured urban
usingdesign quality usin
semantic segmentation,
station, and crosswalk.
we measured
Additionally, we measured
enclosure,design
openness,
quality
greenery,
semantic
and t
segmentation and mantic
edgesegmentation
detection, which andare
edge detection,
computer which
vision are computer
technologies. vision
Using technologies. U
semantic
ratio, and using
segmentation, semantic
we edge
measured detection,
segmentation,
enclosure, we we measured
measured
openness, enclosure,
greenery, the
and thecomplexity.
openness,
featuregreenery, Then,
and
area ratio, andthewe an
feature
terpreted them
using edge ratio, with
detection, we theedge
and using Machine
measured the learning
detection, we measured
complexity. model
Then, we and SHAP
the complexity.
analyzed and algorithm.
Then, we analyzed
interpreted Allan
terpretedlearning
them with the Machine learning model Allandprocedures
SHAP algorithm. All procedure
shown in Figure 2.
them with the Machine model and SHAP algorithm. are shown
in Figure 2. shown in Figure 2.

Figure 2. Research procedure.

Figure
Figure 2. Research
2. Research procedure.
procedure.

3.2.1. Semantic Segmentation


Deep learning, a machine-learning technique, is rapidly developing in areas such
as data mining, image recognition, speech recognition, and natural language processing.
The semantic segmentation technique (Figure 3) is a kind of computer vision that divides
the objects that appear in images into meaningful unit classes through deep learning. It
3.2.1. Semantic Segmentation
Deep learning, a machine-learning technique, is rapidly developing in areas su
Sustainability 2022, 14, 5730 data mining, image recognition, speech recognition, and natural language processing
6 of 21
semantic segmentation technique (Figure 3) is a kind of computer vision that divide
Figure 3. Area extracted by semantic segmentation from the image in Figure 1.
objects that appear in images into meaningful unit classes through deep learning. It
ics the
mimics the process
3.2.2. ofprocess
Edge of recognizing
recognizing
Detection images by images
performingby performing several several steps simultaneously, ju
steps simultaneously,
a person visually recognizes an image. That
just as a person visually recognizes an image. That means that it segments the imagemeans that it segments theinto
image into p
Edge detection (Figure 4) derives edges by detecting the contours of an image a
pixels and then and thenthe
predicts predicts thewhich
class to class to
eachwhich
pixeleach belongs. pixel Semantic
belongs. Semantic segmentation segmentation
thus thus
used for object extraction and identification [8]. The detected edges reflect the objects
bles thethat
enables the elements elements that the
constitute constitute
landscapethe landscape
to be extracted to be extracted
at the pixel at the pixel
level. Welevel. We
stituting the image. Sobel Edge and Canny Edge are representative edge detection m
the High-Resolution
used the High-Resolution NetworksNetworks model
model jointly jointly developed
developed by the University by the University
of Illinoisof Illinoi
ods. Canny Edge is provided by the OpenCV function, so it is easy to use and ha
Google
and Google in 2019 [39].in 2019 [39].
advantage of measuring each edge exactly once. Canny Edge detection detects edg
white lines on a black background.
Applying edge detection to our streetscape images allowed us to represent the n
ber of street elements. Previous studies used edge detection to quantify the visual
plexity of the landscape and concluded that the measured complexity coincided wit
complexity visually perceived by humans [40–43]. In that way, the perceived compl
can be expressed numerically [44].
The image complexity measurement taken by edge detection follows the theo
Figure 3.byArea extracted by semantic segmentation from the image in Figure 1.
entropy,
Figure 3. Area extracted asemantic
physicalsegmentation
quantity thatfrommeasures
the image the degree
in Figure of1.disorder or degree of irregul
The number of edges extracted from the entire image was used to calculate the ent
3.2.2. Edge Detection
3.2.2. Edge Detection
value using the formula given in Equation (1):
Edge detectionEdge(Figure detection (Figure
4) derives 4) derives
edges by detecting edges the by detecting
contours the contours
of an image of andan image a
𝑛𝑛
used
is used for object for objectand
extraction extraction and identification
identification [8]. The detected [8]. Theedges detected edges
reflect thereflect
objects the objects
constituting thestituting
image. the Sobel Edge and Canny 𝐻𝐻
Edge =
are − � 𝑝𝑝𝑖𝑖
representative log
image. Sobel Edge and Canny Edge are representative edge detection m
𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓 2 𝑝𝑝𝑖𝑖 edge detection
𝑖𝑖=1
methods. Cannyods. EdgeCanny Edge isby
is provided provided
the OpenCVby the OpenCV
function, so function,
it is easy to so use
it isand
easyhastothe
use and ha
where
advantage of measuring H
advantageeach is the entropy
of measuring
edge exactlyvalue
eachonce.and
edgeCanny 𝑝𝑝
exactly
𝑖𝑖 is the
once.
Edge probability
Canny Edge
detection of random
detection
detects edges object
as occurr
detects edg
white lines on aamong
white all objects.
blacklines on a black background.
background.
Applying edge detection to our streetscape images allowed us to represent the
ber of street elements. Previous studies used edge detection to quantify the visual
plexity of the landscape and concluded that the measured complexity coincided wit
complexity visually perceived by humans [40–43]. In that way, the perceived compl
can be expressed numerically [44].
The image complexity measurement taken by edge detection follows the theo
entropy, a physical quantity that measures the degree of disorder or degree of irregul
Thefrom
Figure 4. The image number
Figure 4. Figure of1 edges
The image from
with extracted
Figure
each from
1 with
object’s edgeseachthe entire edges
object’s
extracted image
by edge was
extracted used
detection. to calculate
by edge detection.the en
value using the formula given in Equation (1):
Applying edge detection to our streetscape images allowed𝑛𝑛 us to represent the number
of street elements. Previous studies used edge detection 𝐻𝐻𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓𝑓to=quantify
− � 𝑝𝑝𝑖𝑖 the log 2visual
𝑝𝑝𝑖𝑖 complexity of
the landscape and concluded that the measured complexity coincided 𝑖𝑖=1
with the complexity
visually perceived by humans [40–43]. In that way, the perceived complexity can be
where H is the entropy value and 𝑝𝑝𝑖𝑖 is the probability of random object occur
expressed numerically [44].
among all objects.
The image complexity measurement taken by edge detection follows the theory of
entropy, a physical quantity that measures the degree of disorder or degree of irregularity.
The number of edges extracted from the entire image was used to calculate the entropy
value using the formula given in Equation (1):
n
H f actor = − ∑ pi log2 pi (1)
i =1

Figure 4.value
where H is the entropy The image
and from Figure
pi is the 1 with each
probability ofobject’s
random edges extracted
object by edge
occurrence detection.
among
all objects.

3.2.3. Analytic Frame


This study uses objective indicators to represent the degree of influence that the
elements of the street have on walking satisfaction. To that end, we built a model that
defined the evaluation factors in the physical environment of pedestrians using people’s
perceptions (Figure 5).
Complexity describes the visual abundance at a particular point as the number
forms perceived by humans [46]. In terms of the complexity perceived visuall
[47], pedestrians need a significant amount of environmental complexity to fe
Sustainability 2022, 14, 5730 ing is attractive [16]. The area ratios of buildings, roads, sidewalks, and str
7 of 21
measure the proportions of each streetscape image that each element occupi

Figure 5.The
Figure 5. The framework
framework of thisof this study
study.

We used the information provided by the floating population research as the physical
4.features
Variables
of the streetscape. For the visual features, we used the measured area ratio
together with enclosure, openness, and complexity, which are urban design qualities
This study is based on data from the floating population research cond
used in previous studies [6,14,15]. We used a semantic segmentation technique to extract
Seoul
visual city government (2015). The purpose of this study is to identify the ch
features.
of a satisfactory
Enclosure walking
describes environment.
the degree of surrounded Therefore,
space in termsweofconstructed
the D:H ratio machine
[45]. l
In other words, enclosure describes the degree to which streetscape elements visually
sification models to predict walking satisfaction in streetscapes. For this, we
define the three elements of a ceiling, wall, and floor through buildings, walls, trees, and
the walking
other satisfaction
vertical elements [14,15].scores of space,
In urban normal, satisfied,
the enclosure or very
is mainly satisfied
formed into a du
by rows
ble of ‘satisfied
of buildings with
or trees. the walking
Openness environment.’
describes the feeling of visualSimilarly, scores ofthe
visibility, determining unsatisf
amount of perceived lightness, which has an impact on visual perception and pleasant-
unsatisfied were converted into ‘unsatisfied with the pedestrian environmen
ness [15]. Complexity describes the visual abundance at a particular point as the number of
Furthermore,
repetitive we by
forms perceived divided
humans personal characteristics
[46]. In terms of the complexityinto gender,
perceived age grou
visually
of passage, the purpose of visit, and job. Physical features were divided
by humans [47], pedestrians need a significant amount of environmental complexity to into
feel that walking is attractive [16]. The area ratios of buildings, roads, sidewalks, and street
width of the sidewalk, the number of lanes, road type, the presence or abse
furniture measure the proportions of each streetscape image that each element occupies.
terline, street furniture, slope, braille block, fence, bus stop, subway station
walk. All of the personal characteristics and the presence or absence of the stre
4. Variables

features were converted into 0 and 1. Table 2 shows the variables in this stud
This study is based on data from the floating population research conducted by the
Seoul city government (2015). The purpose of this study is to identify the characteristics
of a satisfactory walking environment. Therefore, we constructed machine learning classi-
fication models to predict walking satisfaction in streetscapes. For this, we transformed
the walking satisfaction scores of normal, satisfied, or very satisfied into a dummy vari-
able of ‘satisfied with the walking environment.’ Similarly, scores of unsatisfied and very
unsatisfied were converted into ‘unsatisfied with the pedestrian environment.’
Furthermore, we divided personal characteristics into gender, age group, frequency
of passage, the purpose of visit, and job. Physical features were divided into land use,
the width of the sidewalk, the number of lanes, road type, the presence or absence of
a centerline, street furniture, slope, braille block, fence, bus stop, subway station, and
crosswalk. All of the personal characteristics and the presence or absence of the street’s
physical features were converted into 0 and 1. Table 2 shows the variables in this study.
Sustainability 2022, 14, 5730 8 of 21

Table 2. Variable definitions and basic statistics (n = 17,960).

Factor Element Item Mean/Proportion S.D. Min. Max.

1. Very dissatisfied
2. Slightly dissatisfied
3. Normal 2.73 0.966 1 5
Dependent 4. Slightly satisfied
Pedestrian satisfaction
variable 5. Very satisfied

Not satisfied (ref.) 35% - - -


Satisfied 65% - - -
Men (ref.) 46% - - -
Gender Women 54% - - -
15–19 5% - - -
20–29 19% - - -
30–39 18% - - -
Age
40–49 17% - - -
50–59 23% - - -
60+ 18% - - -
First time 4% - - -
Every day 30% - - -
Frequency of passage 1–2 times a week 19% - - -
3–5 times a week 33% - - -
Personal Less than twice a month 14% - - -
Characteristics
Commute 22% - - -
Personal 37% - - -
Purpose of visit Work or study 19% - - -
Leisure 14% - - -
Passage 8% - - -
Student 14% - - -
Housewife 22% - - -
White-collar 27% - - -
Job Blue-collar 7% - - -
Sales and service 11% - - -
Self-employment 11% - - -
Other 8% - - -
Width of the sidewalk Width of the sidewalk (m) 4.12 2.312 1 24
The number of lanes The number of lanes 3.93 2.637 1 18
No (ref.) 30% - - -
Bus road Bus road
Yes 70% - - -
No (ref.) 29% - - -
Centerline Centerline
Yes 71% - - -

Street No (ref.) 8% - - -
Street furniture
furniture Yes 92% - - -
Car only 7% - - -
Road type Pedestrian only 70% - - -
Physical
features Mixed-use 23% - - -
No (ref.) 64% - - -
Braille block Braille block
Yes 36% - - -
No (ref.) 77% - - -
Slope Slope
Yes 23% - - -
No (ref.) 79% - - -
Fence Pedestrian safety fence
Yes 21% - - -
No (ref.) 64% - - -
Bus stop Bus stop
Yes 36% - - -
No (ref.) 77% - -
Slope Slope
Yes 23% - -
Pedestrian safety No (ref.) 79% - -
Fence
fence Yes 21% - -
Sustainability 2022, 14, 5730 9 of 21
No (ref.) 64% - -
Bus stop Bus stop
Yes 36% - -
Subway No (ref.) 65% - -
Table 2. Cont. Subway station
station Yes 35% - -
Factor Element Item No (ref.)Mean/Proportion
39% S.D. -Min. Max.-
Crosswalk Crosswalk
No (ref.) Yes 65% 61% - -- - -
Subway Subway station
station Commercial area
Yes 35% 18% - -- - -
Green area No (ref.) 39% 3% - -- - -
Crosswalk Semi-residential area
61% 4% -- - -
Crosswalk
Yes -
Land use Semi-industrial area
Commercial area 18%
5% -
-- -
-
Class Ⅰ
Green area
residential area 3%
6% -
-- -
-
Class Ⅱ residential area 30% - -
Semi-residential area 4% - - -
Class Ⅲ residential area 33% - -
Land use Semi-industrial area 5% - - -
Sum of the
Class I residential area 6% - - -
area
Class II residential area 30% - - -
Enclosure ratios of 0.549 0.153 0.145 0.9
Class III residential area 33% - - -
buildings and
Sum of street
the areatrees
ratios of
Enclosure The and
buildings area ratio 0.549 0.153 0.145 0.912
Urban design Openness street trees 0.208 0.096 0 0.5
of the sky
qualities
Openness Sum
The area of plant- 0.208
ratio
0.096 0 0.539
of the sky
Visual
Urban design qualities Greenery ing 0.208 0.096 0 0.5
features area
Sum of
ratios
Visual Greenery planting 0.208 0.096 0 0.539
features Amount of
area ratios
Complexity Amountvisual of 1.283 0.238 0.516 2.2
Complexity information 1.283
visual 0.238 0.516 2.249
information
The proportion of buildings 0.394 0.212 0.002 0.8
The proportion of buildings 0.394 0.212 0.002 0.895
The proportion of the road 0.144 0.060 0 0.2
The proportion of the road 0.144 0.060 0 0.273
Area ratio Area ratio The proportion of the sidewalk 0.038 0.031 0 0.2
The proportion of the sidewalk
The proportion of the street furni- 0.038 0.031 0 0.287

ture
The proportion of the street furniture 0.013 0.013 0.018 0.018
0 0.1600 0.1

4.1. Enclosure 4.1. Enclosure


Enclosure
Enclosure (Figure (Figure
6) describes the6)streetscape
describes the streetscape
frame formed frame
by theformed
verticalby the vertical elem
elements
that make up thethat make up the
landscape. landscape.the
It represents It represents the degree
degree to which to whichare
streetscapes streetscapes
defined byare define
buildings,
buildings, walls, walls,
street trees, andstreet
othertrees, andelements
vertical other vertical
[14]. elements
The proper[14]. Theofproper
ratio ratio of ve
vertical
and horizontal and horizontal
factors factors
on a street can on
givea street can give
pedestrians pedestrians feeling
a comfortable a comfortable
[48,49].feeling [48,49].

Figure
Figure 6. An example 6. Anof
image example image of enclosure.
enclosure.

4.2. Openness
Openness (Figure 7) describes the ratio of the visible sky in the street environment.
Openness determines the amount of light a person perceives and affects both visual per-
ception and comfort [15].
Sustainability 2022, 14, x FOR PEER REVIEW 10

4.2. Openness
4.2. Openness 4.2. Openness
Openness
Openness (Figure (Figure
7) describes
Openness (Figure 7)ratio
the7) describes the
of thethe
describes ratiosky
visible
ratio of the
of the visible
in the sky
street
visible sky in the
the street
street environm
environment.
in environm
Sustainability 2022, 14, 5730 10 of 21
Openness
Openness determines
Openness determines
thedetermines the amount
amount of the
lightamount
a personof perceives
of light aa person
light person perceives
and affects both
perceives and affects
visual
and both visual
per-
affects both visua
ception
ception and comfort
ception and comfort
[15].
and comfort [15].
[15].

Figure7.7.An
Figure Figure
Anexample
example
Figure 7. An
Anof
image
7.
image example
of image of
openness.
example image
openness. of openness.
openness.

4.3.Greenery
4.3. 4.3. Greenery
Greenery 4.3. Greenery
The amount
The amount of The
ofThe amount
greenery
amount
greenery of greenery
of
(Figuregreenery
(Figure 8) is
8) is an (Figure
an(Figure
essential
essential 8) is
8) is an essential
factor
an
factor essential
in walking
in walking factor in walking
walking
satisfaction.
factor in
satisfaction. In satisfactio
Insatisfactio
this study, we this
this study, we this study,
calculate
calculate all we
study,allwe
thecalculate
thecalculate
green inall
green inall thethe
thethe green
images, in the
green including
images, images,
in the images,
including including
bothincluding
both both
trees and both
trees and trees
othertrees
other and
plants. other pl
and other
plants. p
Greenery is the Greenery
Greenery is theGreenery
most is
most effectivethe
effective most
is the factor effective
in improving
most effective
factor in improvingfactor
factorthein improving
quality of athe
in improving
the quality of athe quality
street
street of a
environment
quality street environment
of a street[50],
environment environment
[50],
withthe
with with the
theattractiveness
with theand
attractiveness attractiveness
and
attractiveness
aesthetic and aesthetic
aesthetic value
value
and aesthetic
of
of aa street
streetvalue of aa street
increasing
value of
increasing street
along
along increasing
with
withthe
increasing along with the
proportion
thealong with
proportion the propo
propo
ofgreenery.
of greenery. of of greenery.
greenery.

Figure8.8.An
Figure Figure
Anexample
example
Figure 8. An
Anof
image
8.
image example
of image of
greenery.
example image
greenery. of greenery.
greenery.

4.4.Complexity
4.4. 4.4. Complexity
Complexity4.4. Complexity
Complexity (Figure
Complexity Complexity
(Figure
Complexity
9) (Figure
9) describes
describes
(Figure 9)visual
9)
the describes
describes the visual
abundance
the
abundance visual
of aabundance
streetscape.
of abundance of People
of
streetscape. aaPeople
streetscape.
are
streetscape. Peopl
are Peopl
morecurious
more curiousand more
and
more curious
satisfied
curious
satisfied and
with
and
with satisfied
complex
satisfied
complex with complex
scenery
with
scenery complex
than
than withscenery
with than with
monotonous
scenery than
monotonous with monotonous
scenery.
monotonous
scenery. scenery. H
How-scenery.
However, H
ifever,
the amount ever,
if the amount
ofever,
visualif
of the amount
visual
information of
informationvisual
that must information
that must
be be
processed that must
processed
at one atbe processed
one
time istime is at one time
excessive,
if the amount of visual information that must be processed at one time is exces
excessive, people is exce
people
feel cognitive peopleconfusion
feel cognitive
people
confusion feel
feel
andcognitive
and
cognitive
soon confusion
soon
confusion
lose interest.and
and soonTherefore,
lose interest.
soon
Therefore, lose
lose interest.
interest.
the highest Therefore,
the Therefore,
highest
satisfaction thescores
the highest satisfa
satisfaction
highest satisfa
scores are found
are found when scores
when
scores are
are
complexity found
complexitywhen
found when
has the complexity
hascomplexity
the medianhas
median value has the
value
[46]. median
the [46]. value
median value [46]. [46].

Figureimage
Figure 9. An example
Figure 9. An
9. Anof
example image of
complexity.
example image of complexity.
complexity.
Figure 9. An example image of complexity.
4.5.Area
AreaRatio 4.5. Area
Ratio 4.5. Area Ratio
Ratio
4.5.
The area
area ratio The area
area ratio
describes the describes the
the proportion offeature.
each physical
physical feature. It includes the
the
The ratio The theproportion
describesratio describes
proportion of of
each physical
proportion
each of each
physical It includes
feature. the It
Itfeature.
includes pro-
includes
the
Sustainability 2022, portions of
14, x FOR PEER portions
buildings
portions
REVIEW of buildings
of buildings
(Figure 10), (Figure
roads 10),11),
(Figure
(Figure 10), roads (Figure 11),
sidewalks
roads (Figure 11), sidewalks
sidewalks
(Figure 12), and (Figure
street
(Figure 12), and
fur-
12), and stree
stree
11
proportions of buildings (Figure 10), roads (Figure 11), sidewalks (Figure 12), and street
niture (Figure niture (Figure
13).
niture (Figure 13).
13).
furniture (Figure 13).

Figure
Figure 10. Example 10. showing
image Example the
image showingofthe
proportion proportion of buildings.
buildings.

Figure 11. Example image showing the proportion of the road.


Sustainability 2022, 14, 5730 11 of 21

Figure 10. Example image showing the proportion of buildings.


Figure 10. Example image showing the proportion of buildings.
Figure 10. Example image showing the proportion of buildings.

Figure 11.showing
Example image showingofthe
theproportion of the road.
Figure
Figure 11. Example 11. Examplethe
image image showing the
proportion proportion
road. of the road.
Figure 11. Example image showing the proportion of the road.

Figure 12. Example image showing the proportion of the sidewalk.


Figure 12. Examplethe
image showing the proportion of the sidewalk.
Figure
Figure 12. Example 12. showing
image Example image showingofthe
proportion proportion
the sidewalk.of the sidewalk.

Figure 13. Example image showing the proportion of street furniture.


Figure 13. Example image showing the proportion of street furniture.
Figure 13. showing
Figure 13. Example image Example the
image showingofthe
proportion proportion
street of street furniture.
furniture.
5. Analysis
5. Analysis
5. Analysis 5. Analysis
5.1. Machine-Learning Analysis Methods
5.1. Machine-Learning Analysis Methods
5.1. Machine-Learning
5.1. Machine-Learning
In this study, weAnalysis
Analysis Methods Methods
constructed a walking satisfaction prediction model using lo
In this study, we constructed a walking satisfaction prediction model using lo
In this study, weIn this random
regression, study, we
constructed a constructed
walking
forest, a walking
andsatisfaction
XGBoost satisfaction
prediction
algorithm modelprediction model
using logistic
machine-learning using log
classification m
regression, random forest, and XGBoost algorithm machine-learning classification m
regression,
regression, random forest,we
els. Then random
andused forest,
XGBoost
the and XGBoost
algorithm
best model algorithm
tomachine-learning
examine machine-learning
classification
the factors in the street classification
environmentm
models.
els.best
Thenmodelwe used the best model to examine the factors in the that
streetaffect
environment
Then we used the els. Then
affect we used
walking the best the
to examine
satisfaction. model to examine
factors the factors
in the street in the street
environment environment
affect walking satisfaction.
affect walking satisfaction.
walking satisfaction.
5.1.1. Logistic Regression Classification
5.1.1. Logistic
5.1.1. Logistic Regression Regression Classification
Classification
5.1.1.Logistic
Logistic RegressionisClassification
regression a machine-learning algorithm that applies a near regressi
Logistic
Logistic regression regression is a machine-learning
is a regression
machine-learning algorithm
algorithm thatalgorithm that
applies a that applies
nearapplies
regressiona near regressi
to regressi
Logistic is
classification. Logistic regression a machine-learning
predicts the probability of a near
binary classification usin
classification.
classification. Logistic Logistic
regression regression
predicts the predicts the of
probability probability
binary of binary classification
classification using usin
classification.
ear regression.Logistic regression
It is often used aspredicts the probability
a basic model of binaryofclassification
binary classification
because usin
it is
earItregression.
linear regression. is often usedIt is
as often
a used
basic as aof
model basic model
binary of binary because
classification classification
it is because it is
light
ear
andregression.
fast and has It excellent
is often used as a basic
predictive model of [17].
performance binary classification because it is
and fast and has and fast and has excellent
excellent predictive performance [17].
and fast andpredictive
has excellent performance
predictive[17].
performance [17].
5.1.2. Random Forest Classifier
5.1.2. Random
5.1.2. Random Forest Classifier Forest Classifier
5.1.2.The
Random
random Forest Classifier
The random forest a set of forest
Theisrandom forest
randomlyis a set
is a set
of randomly
of randomly
generated generated
trees. It decision
generated
decision decision
works trees.
andIt
trees.
well works wel
Ithas
works wel
has The random
excellent forest is awithout
performance set of randomly
requiring generated
parameter decision
tuning trees.
or data Itscaling
worksbecau
wel
has excellent
excellent performance without performance without requiring
requiring parameter tuning orparameter tuning
data scaling or datait scaling
because can becau
has
can excellent
compensate performance
for data without requiring
shortcomings while parameter
taking tuningof
advantage orthe
data scaling becau
single-tree mod
compensate forcan datacompensate
shortcomings for data
whileshortcomings while taking
taking advantage of the advantage
single-treeof the single-tree
model. It is mod
can compensate
is currently the for data
most shortcomings
popular while taking
machine-learning advantage
algorithm of theof
because single-tree mod
its fast speed
is currently the most popular machine-learning algorithm because of its fast speed
currently the most popular machine-learning algorithm because of its fast speed and high
is currently
high predictivethe most popular[17]. machine-learning algorithm because of its fast speed
predictive performance [17]. performance
high predictive performance [17].
high predictive performance [17].
5.1.3. XGBoost Classifier
XGBoost is a gradient boosting model (GBM) that sequentially trains and predicts
simple models (such as shallow decision trees) and improves errors. X in XGBoost means
the eXtreme Gradient Boosting, a specific framework from the GBM. Unlike the random
forest, XGBoost has no randomness. The trees are sequentially created to compensate for
errors in a previous tree, so it generally shows better classification performance than other
machine-learning models [17,51].
XGBoost solves the problems of slow speed and overfitting, which are the disadvan-
tages of GBM, and can independently derive the importance of characteristics. In addition,
Sustainability 2022, 14, 5730 12 of 21

it provides a variety of custom optimization options due to its high flexibility, and it links
well with other algorithms [52].

5.2. Evaluating the Machine-Learning Techniques


The predictive performance indicators for machine-learning models are accuracy, pre-
cision, recall, F1 score, and AUC score. In binary classification, other evaluation indicators
are often more important than model accuracy [17].
Accuracy is the most basic performance evaluation index, and it is the ratio of correctly
classified data to all data. Precision is the proportion of pedestrians who actually answered
that they were satisfied among the pedestrians predicted by the classification model to
be satisfied. Recall is the proportion that the model correctly predicted for pedestrians
who answered they were actually satisfied. Precision and recall are thus complementary
to each other, and it is important to achieve high numbers for both. The F1 score is a
combination of precision and recall, and it has a high value when precision and recall are
not biased to either side. The ROC curve indicates how recall changes when the model
misclassifies unsatisfied pedestrians as satisfied. It thus evaluates the overall performance
of the classification model. The area under the ROC curve is the AUC score.

5.3. Interpretable Machine-Learning Techniques


We used the SHAP algorithm to interpret the results of the machine-learning models.
The SHAP algorithm is an XAI technique for interpreting machine learning by evaluating
the importance of each variable’s features in the machine-learning predictions [25–27,53].
The Shapley values of feature j are calculated as follows [25,54,55]:

| S |!( p − | S | − 1)!
∅j = ∑ p!
(val (S ∪ x j ) − val (S)) (2)
S⊆ x1 ,...,x p /x j

∅ j : Feature contribution
S: Subset of features used in the model
χ: Vector of the feature values of the observations
p: Number of features
SHAP reliably interprets machine-learning models by calculating Shapley values for
each input variable. SHAP then uses those Shapley values to graphically display the
importance of each feature in determining the predictive performance, allowing users to
visually check the Shapley values of any specific input variable [54,55].

6. Results
6.1. Machine-Learning Models
We used 80% of the total data for the training and 20% for the test data. The results
for each machine-learning model are shown in Table 3. It was conducted using the test
data, and as a result of comparing the performance evaluation indicators for each model
(logistic regression analysis, random forest, XGBoost), the XGBoost model was selected for
this study because of its excellent performance for all the indicators.
Sustainability 2022, 14, 5730 13 of 21

Sustainability
Sustainability
Sustainability 2022,
2022,
2022, 14,
14,
14, xFOR
FOR
x xFOR PEER
PEER
PEER REVIEW
REVIEW
REVIEW 13
1313ofofof 22
2222

Table 3. Analysis results for each machine-learning model.


Table
Table
Table3.3.3. Analysis
Analysis
Analysis results
results for
results for
for each
each
each machine-learning
machine-learning
machine-learning model.
model.
model.
Model Logistic Regression Random Forest XGBoost
Model
Model
Model Logistic
Logistic
Logistic Regression
Regression
Regression Random
Random
Random Forest
Forest
Forest XGBoost
XGBoost
XGBoost
Accuracy Accuracy
Accuracy
Accuracy 0.650.65
0.65
0.65 0.72
0.72
0.72
0.72 0.82
0.82
0.82
0.82
Precision Precision
Precision
Precision 0.600.60
0.60
0.60 0.76
0.76
0.76
0.76 0.83
0.83
0.83
0.83
Recall Recall
Recall
Recall 0.510.51
0.51
0.51 0.81
0.81
0.81
0.81 0.92
0.92
0.92
0.92
F1 score F1F1
F1 score
score
score 0.530.53
0.53
0.53 0.79
0.79
0.79
0.79 0.87
0.87
0.87
0.87
AUC score AUC
AUC
AUC score
score
score 0.560.56
0.56
0.56 0.76
0.76
0.76
0.76 0.90
0.90
0.90
0.90

ROC
ROC
ROC
ROC curve
curve
curve
curve

6.2.
6.2.
6.2. Interpretation
Interpretation
Interpretationofofof
thethe
the Machine-Learning
Machine-Learning
Machine-Learning Model
Model
Model
6.2. Interpretation6.2.1.
of6.2.1.
the
6.2.1.
Machine-Learning
Analysis
Analysis
Analysis of
ofof the
the
the Street
Street
Street
Model Characteristics
Environment
Environment
Environment Characteristics
Characteristics That
That
That Affect
Affect
Affect Pedestrian
Pedestrian
Pedestrian Satisfac-
Satisfac-
Satisfac-
tion
6.2.1. Analysis oftion
tion
the Street Environment Characteristics That Affect Pedestrian Satisfaction
The results of the TheThe
The results
results
results
SHAP ofofthe
of theSHAP
the
analysis SHAP
SHAP are analysis
analysis
analysis
shown areare
are in shown
shown
shown
Figure ininin Figure
Figure
Figure
14 and141414andandTable
and
Table Table 4.4.The
4. 4.The
Table The
The x-axis
x-axis
x-axis
x-axis of
ofof
of
thethe SHAP
SHAP value
value plot
plot indicates
indicates thethe importance
importance of each variable as calculated by its Shapley
the SHAP value plot the SHAP
indicates value plot
the indicates
importance the
of each of
importance of each
each
variable variable
variable asas
as calculated calculated
calculated
byby by its
its its Shapley
Shapley
Shapley
value,such
value,
value, such
such thatthe
that
that thegreater
the greaterthe
greater theabsolute
the absolutevalue,
absolute value,the
value, thegreater
the greaterthe
greater thedegree
the degreeofof
degree ofcontribution
contribution
contribution to
toto
value, such that the
thethe
the greater
prediction
prediction
prediction the
[55].absolute
[55].
[55]. InInIn Figure
Figure
Figure value,
13,13,
13, ared
aared the
red bar
bar greater
bar indicates
indicates
indicates the degree
apositive
positive
aapositive of contribution
effect
effect
effect (+)(+)
(+) on on
on walking
walking
walking to the
satis-
satis-
satis-
faction,
prediction [55]. Infaction,
Figure
faction, andand
and13, aablue
aablueblue
redbar bar
bar bar indicates
indicates
indicates
indicates anegative
negative
aanegative a positive effect
effect
effect (−)(−)
(−) on
effect
onon walking
walking satisfaction.
(+) onsatisfaction.
walking walking
satisfaction. satisfaction,
and a blue bar indicates TheThe
The SHAP
SHAP
aSHAP
negative value
value
value plot
plot
plot
effect ranks
ranks (−the
ranks the
)the
on importance
importance
walkingofof
importance ofthethe
the entire
entire
entire
satisfaction. datadata
data setset
settototo
thethe
the performance
performance
performance
of of
ofthethe
the machine-learning
machine-learning
machine-learning model.
model.
model. However,
However,
However,
The SHAP value plot ranks the importance of the entire data set to the performance ititit
doesdoes
does not not
not show
show
show howhow
how each
each
each variable
variable
variable affects
affects
affects the
the
the
modelperformance,
model
model performance,so
performance, sosowe
weweuse usethe
use theSHAP
the SHAPsummary
SHAP summaryplot
summary plottoto
plot toexpress
expressthat
express thatinformation
that informationinin
information in
of the machine-learning
terms
terms
terms of
ofof
model.
direction
direction
direction and
However,
and
and size.
size.
size. TheThe
The
itSHAP
SHAP
does
SHAP
not show
summary
summary
summary plotplot
plot
how
isisis
eachinvariable
arranged
arranged
arranged in
in descending
descending
descending
affects order
order
order
theof
ofof
model performance, Shapley
Shapley
Shapley sovalues,
we
values,
values, use thetoSHAP
similar
similar
similar toto
thethe
the SHAP
SHAP
SHAP summary value
value
value plot.
plot. plot
plot. BasedBased
Based toon express
onon thethe
the 0.0 0.0
0.0 that
in
ininthethe
the information
plot,
plot,
plot, the the
the right
right
right side in
side
side
terms of direction contributes
contributes
contributes
and size. positively,
positively,
positively,
The SHAP andand
and thethe
the leftleft
left
summary side
side
side contributes
contributes
contributes
plot is arranged negatively
negatively
negatively in to
toto thethe
the model
model
model
descending performance.
performance.
performance.
order of
In In
In addition,
addition,
addition, redred
red indicates
indicates
indicates high
high
high feature
feature
feature
Shapley values, similar to the SHAP value plot. Based on the 0.0 in the plot, the right value,
value,
value, and and
and blue
blue
blue indicates
indicates
indicates lowlow
low feature
feature
feature value.
value.
value. ForFor
For the
the
the
side
dummyvariables,
dummy
dummy variables,111(yes)
variables, (yes)isisisrepresented
(yes) representedinin
represented inred,red,and
red, and000(no)
and (no)isisisrepresented
(no) representedinin
represented inblue.
blue.Each
blue. Each
Each
contributes positively, and the left side contributes negatively to the model performance.
point
point
point constituting
constituting
constituting the the
the plot
plot
plot represents
represents
represents thethe
the response
response
response ofofof one
one
one person.
person.
person. When
When
When several
several
several points
points
points are
are
are
In addition, red indicates
placed
placed
placed in
ininthehigh
the
the same
same
same feature
placeplace
place ononvalue,
onthethe
the x-axis, andthey
x-axis,
x-axis, blue
they
they appearindicates
appear
appear densely
densely
densely low
stacked.feature
stacked.
stacked. value.
Longtails
Longtails
Longtails of
ofof For
individ- the
individ-
individ-
dummy variables, ualual
ual dots
1dots
dots
(yes) gathered
gathered
gathered to
toto the
is represented the
the right
right
right or
ororleft
in left
left indicate
indicate
indicate
red, and thatthat
that extreme
0 extreme
extreme
(no) measurements
is measurements
measurements
represented cancan
can
in bebebe important
important
blue. important
Each
point constituting to
toto
the aparticular
particular
aaparticular individual
individual
individual
plot represents [54,55].
[54,55].
[54,55].
the response of one person. When several points are
Insummary,
InIn summary,the
summary, theSHAP
the SHAPvalue
SHAP valueplot plotranksranksthe theimpact
impactthat thateach eachvariable
variablehas hason onthe
the
placed in the same place on the x-axis, theyvalue appear plot ranks
densely the impact
stacked. that each
Longtails variable has
of individual on the
performance
performance
performance of
ofofthethe
the model
model
model andand
and whether
whether
whether its its
its effects
effects
effects areare
are approximately
approximately
approximately positive
positive
positive ororor negative.
negative.
negative.
dots gathered to The
theThe
The
right
SHAP
SHAP
SHAP
or left indicate
summary
summary
summary plotplot
plot shows
shows
that
shows how
how
extreme
how and
and
and which
which
measurements
which values
values
values specifically
specifically
specifically
canaffect
be important
affect
affect eacheach
each variable
variable
variable
to a
in
inin
particular individual terms
terms
terms [54,55].
of
ofof direction
direction
direction and and
and size.
size.
size. Because
Because
Because ourour
our focus
focus
focus isisis to
toto determine
determine
determine the the
the physical
physical
physical characteristics
characteristics
characteristics
In summary, of
ofofaaasatisfying
the satisfying
SHAPwalking
satisfying walkingenvironment,
walking
value environment,
ranks we
environment,
plot weanalyzed
we
the analyzedthe
analyzed
impact theSHAP
the
that SHAP
SHAPeach summary
summary
summary
variable plotplot
plot toto
has tofind
findthe
find
on the
the
the
positive
positive
positive Shapley
Shapley
Shapley values,
values,
values, which
which
which are are
are onon
on thethe
the
performance of the model and whether its effects are approximately positive or negative. right
right
right sideside
side of of
ofthe the
the plot.
plot.
plot.
The SHAP summary plot shows how and which values specifically affect each variable in
terms of direction and size. Because our focus is to determine the physical characteristics
of a satisfying walking environment, we analyzed the SHAP summary plot to find the
positive Shapley values, which are on the right side of the plot.
Our analysis showed that all the visual features (the urban design qualities and the area
ratios) significantly affected the model’s predictive performance more than any physical
features. Among the visual features, the proportions of the road and street furniture,
enclosure, complexity, and openness had a negative effect on walking satisfaction, whereas
the proportions of the sidewalk, buildings, and greenery had a positive effect. For a better
understanding of the results, we attached the example image of the street with the high or
low pedestrian satisfaction score in Figure 15.
Sustainability 2022, 14, 5730 14 of 21
Sustainability 2022, 14, x FOR PEER REVIEW 14 of 22

(a)

(b)
Figure
Figure 14. Result of 14.
SHAPResult of SHAP
analysis: (a) analysis:
SHAP value(a) SHAP value
plot; (b) plot;summary
SHAP (b) SHAPplot.
summary plot.
Sustainability 2022, 14, 5730 15 of 21

Table 4. Shapley values for each of the top 20 values.

SHAP SHAP
Variable Variable
Value Value
The proportion of the road 0.1551 Semi industrial area 0.0493
The proportion of the sidewalk 0.1475 Class II residential area 0.0409
The proportion of the street furniture 0.1412 Purpose of passage 0.0356
Enclosure 0.1399 Purpose of commute 0.0316
Complexity 0.1311 Visit every day 0.0315
Greenery 0.1252 Pedestrian-only road 0.0276
Proportion of buildings 0.1157 Slope 0.0266
Openness 0.0949 Crosswalk 0.0247
Sustainability 2022, 14, x FOR PEER
Width ofREVIEW
the sidewalk 0.0792 Student 0.0236 16 o
The number of lanes 0.0636 Visit 3–5 times a week 0.0255

(a)

(b)

Figure 15. Results Figure 15.analysis:


of visual Results of(a)
visual analysis:
Example image(a) Example image
with a high with asatisfaction
walking high walking satisfaction
score (4.05); score (4
(b) Example image with a poor walking satisfaction score (1.35).
(b) Example image with a poor walking satisfaction score (1.35).

6.2.2. Analysis
In the proportion of thethe
of the road, Relationship between Walking
red dots dominantly appearSatisfaction
on the leftand
sideVisual
of theFeatures
Becausethat
SHAP value plot, indicating the aprevious
wide roadanalysis couldaffects
negatively not clearly reveal
walking the detailed
satisfaction. Onandthenonlinear
contrary, for the lationships
proportion between streetthe
of sidewalks, environment characteristics
red dots dominantly and walking
appear on the rightsatisfaction,
side we u
a SHAP
of the plot and create dependence
a long plot (Table
tail, indicating that5) to investigate
a large sidewalkthe effects
area has of visual
a very factors on walk
positive
satisfaction.
effect on pedestrian The For
satisfaction. SHAP dependence
complexity andplot shows the
openness, theinteraction
high and low effect of each
values arevariable.
x-axis represents the numerical value
mixed, but they seem to have a negative effect overall. of the variable, and the y-axis represents the Shap
value. In addition, because negative numbers indicate dissatisfaction
For feature importance, the physical elements immediately following the visual el- with the walk
environment, and positive numbers indicate satisfaction with
ements are the width of the sidewalk and the number of lanes, with a high width of the the walking environm
the value of the street environment factor for a satisfying walking environment can
calculated. The vertically wide distribution of Shapley values for each value in the p
indicates that walking satisfaction varies from person to person [55].
In this analysis, most of the visual features showed a nonlinear relationship w
walking satisfaction. In the proportion of the road, the Shapley values mostly have p
Sustainability 2022, 14, 5730 16 of 21

sidewalk and a small number of lanes positively affecting walking satisfaction. In the
land-use variables, the rankings of semi-industrial areas and class II residential areas were
relatively high and negatively affected walking satisfaction. People particularly disliked
walking in semi-industrial areas. In addition, it can be seen a satisfying walking environ-
ment should have crosswalks and no slope. Among the road types, pedestrian-only lanes
appear to positively affect walking satisfaction, and mixed-use lanes and car-only lanes do
not seem to have much influence on the predictive power of the model.
The most important individual characteristic variables were, in order, the purpose
of the passage, the purpose of commute, daily visitor, and 3–5 times a week visitor. The
purpose of passage positively affected walking satisfaction, whereas the purpose of com-
mute, daily visitor, and 3–5 times a week visitor negatively affected walking satisfaction.
Excluding the two types of visitors, individual characteristics seemed to have little effect on
the model’s performance. Among the pedestrian occupations, only the student was among
the top 20 factors that influence walking satisfaction.

6.2.2. Analysis of the Relationship between Walking Satisfaction and Visual Features
Because the previous analysis could not clearly reveal the detailed and nonlinear
relationships between street environment characteristics and walking satisfaction, we used
a SHAP dependence plot (Table 5) to investigate the effects of visual factors on walking
satisfaction. The SHAP dependence plot shows the interaction effect of each variable. The
x-axis represents the numerical value of the variable, and the y-axis represents the Shapley
value. In addition, because negative numbers indicate dissatisfaction with the walking
environment, and positive numbers indicate satisfaction with the walking environment,
the value of the street environment factor for a satisfying walking environment can be
calculated. The vertically wide distribution of Shapley values for each value in the plot
Sustainability 2022, 14,2022,
Sustainability
Sustainability
Sustainability 2022, x14,
FOR
2022,
x14,
FORx14,
PEER
FORx REVIEW
PEERFOR PEER
PEER REVIEW
REVIEW
REVIEW 17 of
17 22 17 22
17 22
of of of 22
Sustainability
Sustainability
Sustainability 14,2022,
Sustainability
2022,
2022, x14,
FORx14,
2022, x14,
PEER
FOR FOR indicates
PEER
xREVIEW
PEERFOR REVIEW
PEER
REVIEW that walking satisfaction varies from person to person [55].
REVIEW 17 of
17 221722
of of
1722
of 22

Table 5. SHAP dependence plot.


Table 5. SHAP
Table
Table 5. 5. dependence
Table
SHAP 5. dependence
SHAPSHAP plot.
dependence
dependence
plot.plot.plot.
Table 5.Table
Table SHAP 5. dependence
5.Table
SHAPSHAP dependence
plot.plot.plot.plot.
5.dependence
SHAP dependence
The Proportion of TheThe
Proportion
The The of of of of
Proportion
Proportion
The Proportion
Proportion
TheThe
Proportion
TheThe of the
Proportion the
Proportion
Proportion Road
Road
of
of the of
the the
Road Road
RoadThe The Proportion
Proportion
The
of theThe of
Proportion
Sidewalk of of of
Proportion
Complexity
Complexity
Complexity
Complexity
Complexity Greenery Greenery
Greenery
Greenery
Greenery
TheTheThe Proportion
Proportion
The of the
Proportion
Proportion of the
Road
of the Road
of the
Road the Sidewalk
Road thethe the Sidewalk
Sidewalk
Sidewalk Complexity
Complexity
Complexity
Complexity Greenery
Greenery Greenery
Greenery
thethe the
SidewalkSidewalk
the Sidewalk
Sidewalk

TheThe
Proportion
TheThe of of of of
Proportion
Proportion
Proportion TheThe
Proportion
TheThe of of of of
Proportion
Proportion
Proportion
The
The Proportion
The
Proportion
The The of
Proportion
of
Proportion of of of
Proportion Enclosure
Enclosure
Enclosure
Enclosure TheTheThe
The Proportion
Proportion
Proportion
The of of
of of of
Proportion
Proportion Openness
Openness
Openness
Openness
Street Furniture
Street
Street
Street
Street Street
Furniture Furniture
Furniture
Furniture Enclosure
Enclosure
EnclosureEnclosure
Enclosure BuildingsBuildings
Buildings
Buildings
Buildings
Openness
Openness
Openness
Openness
Openness
Street
Street Furniture
Furniture
Street Furniture
Furniture Buildings
BuildingsBuildings
Buildings

TheThe Shapley
The The values
Shapley
Shapley
Shapley values forvalues
values thefor
for proportion
the for
the the of street
proportion
proportion
proportion of furniture
of furniture
street
street
of street areare
furniture mostly
furniture are positive,
are
mostly
mostly mostly and
positive,
positive, they
positive,
and and and
they they they
have The
have The
ahave
positive
a The
Shapley
havea Shapley
The
Shapley
a values
effectShapley
positive
positive
positive values
effecton values
for
walking
effecteffect
on for
on
walkingfor
thewalking
values the
on the
proportion
for proportion
the
proportion
satisfaction.
walking of street
proportion offurniture
of street
However,
satisfaction.
satisfaction.
satisfaction. street if
However,
However, furniture
offurniture
street
the
However,if are
area
theif if
the are
mostly
furniture
are ratio
area the
area mostly
mostly positive,
are
is
area
ratio mostly
more
ratio
is positive,
positive,
ratio
is and
than
more is
more about
more
than and
they
positive,
and
than than
about they
and
they
about they
about
In this analysis, most of the visual features showed a nonlinear relationship with
have a
have
0.020.02
andhave
positive
a
0.02
and have
less
0.02a
and positive
positive effect
a
than
and
less positive
less
thaneffect
0.06,
less effect
on
than walking
on
the
than
0.06,0.06, on
effect
walking
Shapley
0.06,
the walking
on
the the valuessatisfaction.
satisfaction.
walking
satisfaction.
Shapley
Shapley
Shapley However,
satisfaction.
arevalues
values
values However,
However,
negative,
are are are if the
However,
if
indicating
negative,
negative,
negative, if
area
the the
area
indicating
indicating area
ratio
if the is
ratio
thatthat
indicating ratio
areamore
is
that is
ratio
more
pedestrians
that more
than
isthan than
about
more
do do
pedestrians
pedestrians about
notnot
pedestrians about
than
do do about
notnot
walking
0.02 and0.02less satisfaction.
and
0.02 less
than
and than
0.06,
less the
than In
0.06, the
the
Shapley
0.06, proportion
Shapley
the values
Shapley values
are of the
are
negative,
values road,
negative,
are the Shapley
indicating values
that mostly
pedestrians dohave
not pos-
0.02
prefer and
that
prefer
prefer thatless
that than
proportion.
prefer that 0.06,
proportion.
proportion. When the
proportion.When Shapley
theWhen
When the the values
proportion
the are
of
proportion
proportion negative,
street
proportion
of of ofindicating
negative,
furniture
street
street indicating
street was that
furniture
furniture
furniture was10% pedestrians
indicating
that
was
10%was that
orpedestrians
more
10%
or10%
moreanddo
or
more not
orpedestrians
do
was
more
and not
and and
was dowas
was not
itive
prefer values
prefer
that below
that 16%,
proportion.
proportion.
prefer but
When when the those
proportion values are
ofsatisfaction
street exceeded,
furniture they
was 10% haveorand negative
more and values,
waswas
prefer
close to
close the
close
tothatto
the tothat
proportion.
maximum
close the the
maximum
maximum When
proportion.
in
maximum When
the
in the
input
in
the in
theproportion
When
the
input data,
the
input the
proportion
data,the
inputdata, of the
street
proportion
of the
highest
data,
the furniture
streetof street
furniture
highest
highest
highest was
was 10%
furniture
was
satisfaction
satisfaction
satisfaction 10%
shown.
was wasor
was
was
shown.more
or10%
more
shown.
shown. orandwas
more wasand
toclose
indicating
close
close the
Enclosure tothat
the the
maximum
toclose to maximum
a wide
the
maximum
showed
Enclosure
Enclosure
Enclosure showedshowed
road
inashowed
maximumthe
in ina the
input
a the
negative input input
negatively
data,
ina the
negative
negative data, data,
the
input
correlation
negative the the
affects
highest
data, highest
the
highest
until
correlation
correlation
correlation
walking
satisfaction
highest
about
until satisfaction
satisfaction
untiluntil
about 0.4
about
satisfaction.
and
about
0.4 was
satisfaction
0.4
and was
then
0.4
and was
shown.
thenshown.
showed
and
then shown.
was
then shown.
ashowed
showed
showed positive a positive
a positive
a positive
Enclosure
correlation Enclosure
Enclosure showed
Enclosure
thereafter.
correlation
correlation
correlation showedshowed
Ina showed
thereafter.
thereafter. Ina addition,
negative
addition,
thereafter.
In negative
a addition,
negative
In all correlation
correlation
aaddition,
negative
correlation
Shapley
all all alluntil
correlation
values
Shapley
Shapley
Shapley until
about
until about
with about
0.4
until
values
values
values withmore
withand
about
0.4
more 0.4
andthen
than
withmore and
0.4
then
than70%
more then
showed
and
than
70% showed
then
showed
enclosure
than
70% 70% a were
aenclosure
positive
showed positive
aenclosure
enclosure a were
positive
were positive
were
positive,correlation
correlation
correlation thereafter.
correlation
indicating
positive,
positive,
positive, thereafter.
thereafter. In
thatthat
indicating
indicating
indicating In
theythat In
addition,
thereafter. they addition,
addition,In
positively
that
they all
they all all
Shapley
addition, Shapley
all
Shapley
affected
positively
positively
positively values
Shapley
walkingvalues
valueswith
affected
affected
affected with
more
values
with more
satisfaction.
walking
walking
walking more
than
with than than
70%
more
The70%
satisfaction.
satisfaction.
satisfaction. The 70%
enclosure
than enclosure
70%
enclosure
enclosure
The The were
enclosure
enclosure
enclosure were
is the
enclosure
is were
is
the iswere
the the
sum sumofpositive,
positive,
positive,
vertical
sum sumof ofindicating
ofindicating
positive,
indicating
elementsthat
indicating
vertical
vertical
vertical they
that that
that they
elements
elements
elements they
positively
that
make
that that they
make positively
positively
up
that
make affected
positively
the
make
up affected
affected walking
affected
streetscape,
up
the up
the the walking
streetscape,
streetscape, people
streetscape,
so so satisfaction.
sosatisfaction.
walking walking
satisfaction.
people prefer
so
people The
people streets
prefer
prefer The
enclosure
satisfaction.
The prefer enclosure
The
enclosure
with
streets
streets iswith
the
enclosure
many
streets
with many is the
iswith
the
many is the
many
sum ofsum
sum
things sumofsee.
vertical
ofthings
to
things
things vertical
of
vertical
see.
to to to
see. see.elements
elements
vertical
elements that
elements make
that that
make make
up
that the
make
up up
the the streetscape,
streetscape,
up the
streetscape, so people
streetscape, so people
so people prefer
so people prefer
streets
prefer streets
preferwith
streets with
many
streets
with many
with
many many
things things
thingsto
ForFor see.
things
to Forto
see.
complexity,
Forsee.
to see.
complexity,
complexity, values
complexity, values below
valuesvalues
below 1.4below
below had
1.4 1.4
hadmostly
1.4
had had
mostly positive
mostly
mostly Shapley
positive
positive
positive values,
Shapley
Shapley
Shapley and
values,
values, values
values,
and and and values
values
values
above For
above For
1.4
above For
complexity,
show
above
1.4 1.4
showcomplexity,
For
complexity,
ashow
1.4mixturevalues
complexity,
ashow values
aof
a mixture
mixture values
below
ofvalues
below
positive
mixture below
1.4
of and
of positive
positive had
below
1.4and1.4
had had
mostly
1.4
negative
positive and and mostly
had
mostly positive
mostly
Shapley
negative
negative
negative positive
positive Shapley
positive
values.
Shapley
Shapley
Shapley values. Shapley
Shapley values,
Shapley
Therefore,
values.
values. values,
values,and and
people
Therefore,
Therefore,
Therefore, and
values
values,
peoplefeel values
and
values
people
peoplefeel values
feel feel
Sustainability 2022, 14, 5730 17 of 21

The Shapley values for the proportion of the sidewalk showed a positive correla-
tion with walking satisfaction overall. In addition, on streets where the proportion of
the sidewalk was 7% or more, almost all respondents were satisfied with the walking
environment. This indicates that a wide sidewalk is necessary to create a highly satisfying
walking environment.
The Shapley values for the proportion of street furniture are mostly positive, and they
have a positive effect on walking satisfaction. However, if the area ratio is more than about
0.02 and less than 0.06, the Shapley values are negative, indicating that pedestrians do not
prefer that proportion. When the proportion of street furniture was 10% or more and was
close to the maximum in the input data, the highest satisfaction was shown.
Enclosure showed a negative correlation until about 0.4 and then showed a positive
correlation thereafter. In addition, all Shapley values with more than 70% enclosure were
positive, indicating that they positively affected walking satisfaction. The enclosure is the
sum of vertical elements that make up the streetscape, so people prefer streets with many
things to see.
For complexity, values below 1.4 had mostly positive Shapley values, and values
above 1.4 show a mixture of positive and negative Shapley values. Therefore, people feel
unsatisfied on streets where the complexity is too high. However, with complexity values
above 1.5, the points on the plot appear vertically with both positive and negative numbers.
Thus, even in streets with the same complexity, satisfaction and dissatisfaction with the
walking environment vary from person to person.
Greenery showed a positive correlation overall, indicating that walking satisfaction
increases as more greenery is visible. Streets without any greenery had a very negative
effect on walking satisfaction. Particularly because most of the negative Shapley values are
from streets with less than 1% greenery, it is clear that green space must be present for a
street to satisfy pedestrians.
The proportion of buildings had the lowest Shapley values when no buildings were
visible in the street image. Furthermore, most Shapley values were positive, so it had a
positive effect on walking satisfaction. However, at about 90%, which is the maximum in
the input data, the Shapley values fell from −0.3 to −0.5. Thus, pedestrians do not prefer
streets with no buildings or with buildings that occupy about 90% of the visual landscape.
Similar to complexity, openness showed a pattern in which vertically long Shapley
values appear at the same values, indicating that the satisfaction felt at the same openness
varied from person to person. In particular, when the area of the sky was less than about
0.15, walking satisfaction varied widely by person. With about 0.35 or more openness, all
Shapley values became negative. Compared with other values, openness had very low
Shapley values, so when pedestrians saw too much sky, walking satisfaction was quite low.
Due to the extremely low Shapley values in this section, it can be presumed that in the
previous SHAP value plot, openness was expressed as a blue bar that negatively affects
walking satisfaction. However, at values between 0.15 and 0.33, most Shapley values were
positive, indicating that people do not prefer too much or too little sky; they prefer streets
with a moderate amount of sky. In other words, as previously stated, people prefer streets
with suitable sights.

7. Discussion
In this study, we used a machine-learning model and an XAI technique to investigate
which physical characteristics of the streetscape influence walking satisfaction. We found
that the importance of all the visual features (the urban design quality factors and the area
ratios of the street environment) were more important than any of the physical features.
Therefore, pedestrians are greatly influenced by visual input when they feel satisfaction
while walking on the street.
In particular, the proportions of the road, sidewalk, and street furniture had the
greatest influence on model performance, followed by enclosure, complexity, greenery,
the proportion of buildings, and openness. Among the physical features, the width of the
Sustainability 2022, 14, 5730 18 of 21

sidewalk had the largest positive effect, followed by the number of lanes, semi-industrial
area, and Class II residential area, which all negatively affected the model performance.
After that, the personal characteristics of the purpose of the passage, commute, and daily
visitor followed in importance. Among them, the purpose of commute and daily visitors
negatively affected walking satisfaction, and the purpose of passage had a positive effect.
In the SHAP rankings for their degree of impact on the performance of the machine-
learning model, the proportions of road, sidewalk, and street furniture were the most
important. In other words, the biggest influence on the satisfaction of pedestrians is the
floor surface of the streetscape, which is directly related to walking [34].
The streetscape is defined and formed by vertical elements that obstruct people’s
gaze [14], and a proper ratio of vertical and horizontal elements provides a comfortable
feeling for pedestrians [49]. The fact that enclosure and complexity appeared immediately
after the floor surface indicates that the vertical elements and visual diversity of the
landscape are important for walking satisfaction.
Most of the visual elements in the streetscape had a nonlinear relationship with
walking satisfaction. People generally preferred streets with a low proportion of road,
and the walking satisfaction score rose with the proportion of sidewalks. In other words,
to create a satisfying walking environment, wider sidewalks are better. The fact that
a low road area and high sidewalk area both positively affected walking satisfaction
indicates that a large number of vehicles on the road increases the difficulty of creating
a comfortable walking environment [56]. In addition, people preferred pedestrian-only
streets, so road-oriented urban planning should be avoided, and safe and comfortable
walking environments should be created instead.
For the proportion of the street furniture, once it exists, most of the Shapley values
were positive, indicating that the presence of street facilities positively affects pedestrian
satisfaction, but streets with a street furniture area ratio of 3 to 5% were not preferred.
For the enclosure, it had a negative effect on walking satisfaction, but the streets with
an enclosure of 0.7 or higher were found to be satisfactory for walking.
In terms of complexity and openness, satisfaction and dissatisfaction tended to vary
widely from person to person at the same street points. In general, people felt dissatisfied
in streets with a complexity of 1.5 or higher, and overall, low complexity had a positive
effect on walking satisfaction. In addition, people do not prefer streets where they can see
too much or too little sky, with most pedestrians reporting satisfaction on streets with a
middle value of openness. Therefore, people prefer streets with a moderate number of
things to see.
The percentage of green space was a major factor in improving satisfaction with the
pedestrian environment. Greenery is the most effective factor in improving the quality of the
street environment [50], with a larger area of green space correlating with better aesthetics
in the landscape. The green area ratio showed a positive correlation with satisfaction as a
whole, but negative Shapley values were found when the green area ratio was less than
about 1%. Thus, any amount of green space positively affects walking satisfaction, with
walking satisfaction increasing along with the percentage of green space.
For the proportion of buildings, the most negative effect was seen when buildings were
completely missing from the view. In general, a higher area ratio of buildings correlated
with a positive influence on walking satisfaction, but the maximum area ratio in the input
data, 90%, had a negative effect. The fact that most high building area ratios positively
affected pedestrian satisfaction is consistent with the results of a previous study showing
that the amount of walking increased as the number of buildings and the building coverage
ratio increased [33].
Looking at the results of our study, it is clear that when dealing with the streetscape and
pedestrian environments, not only the physical features that constitute the streetscape but
also the visual features should be considered. To create a satisfying walking environment, it
is necessary to have a low ratio of road, enclosure, and lanes. Moreover, a large amount of
greenery, proportion of buildings, and width of the sidewalk are needed. For the proportion
Sustainability 2022, 14, 5730 19 of 21

of street furniture, complexity, and openness, it is necessary to estimate the value that shows
optimal walking satisfaction. Pedestrians did not prefer walking in semi-industrial areas or
class II residential areas. Among the types of road, pedestrians preferred pedestrian-only
lanes and streets with no slope and a crosswalk. Among the personal characteristics, the
walking satisfaction of pedestrians who visited each day as part of their commute was
negative, whereas the purpose of passage affected walking satisfaction positively.

8. Conclusions
This study measured the constituent elements of the streetscape from the viewpoint of
visual cognitive characteristics and examined their relationships with walking satisfaction.
In the process, visual data were constructed using computer vision techniques. Then, we
used machine-learning prediction models and the SHAP algorithm, an XAI technique, to
analyze the complex interactions among the variables with nonlinear relationships.
We found that all the visual features in the streetscape (the urban design quality factors
and area ratios of the street environmental factors) had greater effects on the machine-
learning model performance than any of the physical features. Thus, for pedestrians to feel
satisfied with their walking environment, the composition of the visible elements is more
important than the physical elements on the street. To build a street for optimal pedestrian
satisfaction, it is necessary to view the street environment as an overall image and manage
the composition ratio of each element.
This study has shown the potential for using computer vision techniques in urban
design. In particular, the results of this study suggest that a satisfying pedestrian environ-
ment can be created by designing the streetscape. Therefore, the urban design could be
used to improve pedestrian environments, and our results can be used as basic data to
improve the quality of street landscapes.
However, this study has several limitations. First, because the street view image is a
360◦ panoramic image taken by the vehicle in the middle of the lane, so the viewpoints of
the pedestrian and the street view image do not match. Second, due to the characteristics
of the 360◦ panoramic image, distortion occurs, and the area could be measured lower than
the actual ratio. Nevertheless, the fact that the proportion of the sidewalk has the second
greatest influence on pedestrian satisfaction means that the sidewalk is just as important.
Third, we borrowed some of the models from Ewing and Handy [14] but failed to use
all of the urban design quality elements they claimed. Therefore, as further research, it
is necessary to examine which factors of the streetscapes affect pedestrian satisfaction by
individual characteristics and to find the various ways to create a pedestrian environment
with high satisfaction.

Author Contributions: Conceptualization, J.L. and J.P.; methodology, J.L. and D.K.; software, J.L.;
validation, J.P.; formal analysis, J.L. and J.P.; investigation, J.L. and J.P.; resources, D.K. and J.P.; data
curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, D.K. and J.P.;
visualization, J.L. and D.K.; supervision, J.P.; project administration, J.P.; funding acquisition, J.P. All
authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the National Research Foundation of Korea (NRF) grant
funded by the Korea government (MSIT) (No. NRF-2019R1A2C1088467).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Conflicts of Interest: The authors declare no conflict of interest.
Sustainability 2022, 14, 5730 20 of 21

References
1. Suh, J.; Choi, Y. Research on the Visual Cognitivity of Urban Plaza—Focused on preference and complexity. Korea Soc. Des. Trend
2012, 34, 197–206.
2. Lim, S. Environmental Psychology and Human Behavior: Human-Friendly Environment Design; Bomoondang: Seoul, Korea, 2007.
3. Kandt, J.; Batty, M. Smart cities, big data and urban policy: Towards urban analytics for the long run. Cities 2021, 109, 102992.
[CrossRef]
4. Gong, Z.; Ma, Q.; Kan, C.; Qi, Q. Classifying Street Spaces with Street View Images for a Spatial Indicator of Urban Functions.
Sustainability 2019, 11, 6424. [CrossRef]
5. Ibrahim, M.R.; Haworth, J.; Cheng, T. Understanding cities with machine eyes: A review of deep computer vision in urban
analytics. Cities 2020, 96, 102481. [CrossRef]
6. Ma, X.; Ma, C.; Wu, C.; Xi, Y.; Yang, R.; Peng, N.; Zhang, C.; Ren, F. Measuring human perceptions of streetscapes to better inform
urban renewal: A perspective of scene semantic parsing. Cities 2021, 110, 103086. [CrossRef]
7. Meng, L.; Wen, K.-H.; Zeng, Z.; Brewin, R.; Fan, X.; Wu, Q. The Impact of Street Space Perception Factors on Elderly Health
in High-Density Cities in Macau—Analysis Based on Street View Images and Deep Learning Technology. Sustainability 2020,
12, 1799. [CrossRef]
8. Choi, I.J.; Jo, H.D. A Study on the Form-Element of Buildings Affecting in Street Spaces. J. Korean Assoc. Geogr. Inf. Stud. 2010, 13,
16–27.
9. Kim, J.G. A Study on Analysis of Recognition and Preference in Urban Landscape—A Quantatative Experimental Analysis for
Subjected Streetscapes. J. Korean Soc. Civ. Eng. 2005, D25, 305–309.
10. Proshansky, H.M. The city and self-identity. Environ. Behav. 1978, 10, 147–169. [CrossRef]
11. Lee, I. Development of Pedestrian Path-Choice Model in Urban Residential Area—Comparison of Importance, Satisfaction, and
Environmental Tradeoff Models. J. Urban Des. Inst. Korea Urban Des. 2000, 1, 63–78.
12. Hahm, Y.; Yoon, H.; Choi, Y. The effect of built environments on the walking and shopping behaviors of pedestrians; a study with
GPS experiment in Sinchon retail district in Seoul, South Korea. Cities 2019, 89, 1–13. [CrossRef]
13. Lee, S.; Han, M.; Rhee, K.; Bae, B. Identification of Factors Affecting Pedestrian Satisfaction toward Land Use and Street Type.
Sustainability 2021, 13, 10725. [CrossRef]
14. Ewing, R.; Handy, S. Measuring the Unmeasurable: Urban Design Qualities Related to Walkability. J. Urban Des. 2009, 14, 65–84.
[CrossRef]
15. Tang, J.; Long, Y. Measuring visual quality of street space and its temporal variation: Methodology and its application in the
Hutong area in Beijing. Landsc. Urban Plan. 2019, 191, 103436. [CrossRef]
16. Ernawati, J.; Adhitama, M.S.; Sudarmo, B.S. Urban Design Qualities Related Walkability in a Commercial Neighbourhood.
Environ. Behav. Proc. J. 2016, 1, 242–250. [CrossRef]
17. Müller, A.C.; Guido, S. Introduction to Machine Learning with Python: A Guide for Data Scientists, 1st ed.; O’Reilly Media, Inc.:
Sebastopol, CA, USA, 2016.
18. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science &
Business Media: New York, NY, USA, 2009.
19. Bzdok, D.; Altman, N.; Krzywinski, M. Statistics versus machine learning. Nat. Methods 2018, 15, 233–234. [CrossRef]
20. Yoo, J.-E. Predictor exploration via group lasso: Focusing on middle school students’ life satisfaction. Stud. Korean Youth 2017, 28,
127–149.
21. Yoo, J.-E.; Rho, M. Predictive Modeling of Students Creativity via Elastic Net. SNU J. Educ. Res. 2018, 27, 185–205.
22. Suman, R.R.; Mall, R.; Sukumaran, S.; Satpathy, M. Extracting State Models for Black-Box Software Components. J. Object Technol.
2010, 9, 79–103. [CrossRef]
23. Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019,
4, eaay7120. [CrossRef]
24. Lee, Y.-G.; Oh, J.-Y.; Kim, G. Interpretation of Load Forecasting Using Explainable Artificial Intelligence Techniques. Trans. Korean
Inst. Electr. Eng. 2020, 69, 480–485. [CrossRef]
25. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. arXiv 2017, arXiv:1705.07874.
26. Ribeiro, M.T.; Singh, S.; Guestrin, C. Why Should I Trust You? Explaining the predictions of any classifier. In Proceedings of the
22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August
2016; pp. 1135–1144.
27. Shapley, L.S. A Value for n-Person Games. In Contributions to the Theory of Games (AM-28); Kuhn, H.W., Tucker, A.W., Eds.;
Princeton University Press: Princeton, NJ, USA, 1953; Volume II, pp. 307–318.
28. Ding, C.; Cao, X.; Næss, P. Applying gradient boosting decision trees to examine non-linear effects of the built environment on
driving distance in Oslo. Transp. Res. Part A Policy Pract. 2018, 110, 107–117. [CrossRef]
29. Wieland, R.; Lakes, T.; Nendel, C. Using SHAP to interpret XGBoost predictions of grassland degradation in Xilingol, China.
Geosci. Model Dev. 2020, 1–28. [CrossRef]
30. Chen, L.; Yao, X.; Liu, Y.; Zhu, Y.; Chen, W.; Zhao, X.; Chi, T. Measuring Impacts of Urban Environmental Elements on Housing
Prices Based on Multisource Data—A Case Study of Shanghai, China. ISPRS Int. J. Geo-Inf. 2020, 9, 106. [CrossRef]
Sustainability 2022, 14, 5730 21 of 21

31. Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A. Toward safer highways, application of XGBoost and
SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405. [CrossRef]
32. Kum, K.-J.; Jang, H.-Y.; Son, S.-N.; Hwang, Y.-J. Analysis of Pedestrian-Streetscape Image in Commercial District Using Structural
Equation Model. J. Korea Plan. Assoc. 2010, 45, 97–109.
33. Yun, N.Y.; Choi, C.G. Relationship between Pedestrian Volume and Pedestrian Environmental Factors on the Commercial Streets
in Seoul. J. Korea Plan. Assoc. 2013, 48, 135–150.
34. Kim, K.-R.; Lee, J.-S. Pedestrian Cognition and Satisfaction on the Physical Elements in Pedestrian Space. J. Urban Des. Inst. Korea
Urban Des. 2016, 17, 89–103.
35. Hu, L.; He, S.; Han, Z.; Xiao, H.; Su, S.; Weng, M.; Cai, Z. Monitoring housing rental prices based on social media: An integrated
approach of machine-learning algorithms and hedonic modeling to inform equitable housing policies. Land Use Policy 2019, 82,
657–673. [CrossRef]
36. Lu, Y.; Sarkar, C.; Xiao, Y. The effect of street-level greenery on walking behavior: Evidence from Hong Kong. Soc. Sci. Med. 2018,
208, 41–49. [CrossRef] [PubMed]
37. Yin, L.; Wang, Z. Measuring visual enclosure for street walkability: Using machine learning algorithms and Google Street View
imagery. Appl. Geogr. 2016, 76, 147–153. [CrossRef]
38. Zeng, L.; Lu, J.; Li, W.; Li, Y. A fast approach for large-scale Sky View Factor estimation using street view images. Build. Environ.
2018, 135, 74–84. [CrossRef]
39. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep high-resolution
representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364. [CrossRef]
40. Gunawardena, G.M.W.L. Evaluation of Streetscape Complexity Created by Streestscape Signage Using Different Objective
Analysis Techniques. In Proceedings of the 5th International Conference on Arts and Humanities, Copenhagen, Denmark, 24–27
June 2019; Volume 5, pp. 50–61.
41. Machado, P.; Romero, J.; Nadal, M.; Santos, A.; Correia, J.; Carballal, A. Computerized measures of visual complexity. Acta
Psychol. 2015, 160, 43–57. [CrossRef]
42. Tucker, C.; Ostwald, M.J.; Chalup, S.K. A method for the visual analysis of streetscape character using digital image processing.
In Proceedings of the 38th Annual Conference of the Architectural Science Association ANZAScA and the International Building
Performance Simulation Association, Contexts of Architecture, Launceston, Australia, 10–12 November 2004; Australia and New
Zealand Architectural Science Association: Sydney, Australia; Auckland, New Zealand, 2004; pp. 134–140.
43. Wang, H.; Duan, J.; Han, X.; Xiao, B. Research on image complexity evaluation method based on color information. In LIDAR
Imaging Detection and Target Recognition; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; p. 106051Q.
44. Lee, J.-Y.; Park, J. The Analysis of the Effect of Visual Information Volume on the Preference of Commercial Streetscape—Using
the Computer Vision Techniques. J. Urban Des. Inst. Korea Urban Des. 2020, 21, 75–86. [CrossRef]
45. Jacobs, J. The Death and Life of Great American Cities; Random House: New York, NY, USA, 1961.
46. Berlyne, D.E. Studies in the New Experimental Aesthetics: Steps toward an Objective Psychology of Aesthetic Appreciation; Hemisphere:
New York, NY, USA, 1974.
47. Rapoport, A. History and Precedent in Environmental Design; Kluwer Academic Publishers/Plenum Press: New York, NY,
USA, 1990.
48. Ewing, R.; Clemente, O. Measuring Urban Design: Metrics for Livable Places; Island Press: Washington, DC, USA, 2013.
49. Jacobs, A.; Appleyard, D. Toward an Urban Design Manifesto. J. Am. Plan. Assoc. 1987, 53, 112–120. [CrossRef]
50. Wolf, K.L. Business district streetscapes, trees, and consumer response. J. For. 2005, 103, 396–400.
51. Cao, K.; Guo, H.; Zhang, Y. Comparison of Approaches for Urban Functional Zones Classification Based on Multi-Source
Geospatial Data: A Case Study in Yuzhong District, Chongqing, China. Sustainability 2019, 11, 660. [CrossRef]
52. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
53. Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst.
2014, 41, 647–665. [CrossRef]
54. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S. From local
explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67. [CrossRef] [PubMed]
55. Lundberg, S.M.; Erion, G.; Lee, S. Consistent individualized feature attribution for tree ensembles. arXiv 2018, arXiv:1802.03888.
56. Ji, W.S.; Koo, Y.S.; Jwa, S.H. A Study on Satisfaction for Pedestrian Environment; Gyeonggi Research Institute: Suwon, Korea, 2008;
pp. 3–4.

You might also like