
Atmospheric Environment 43 (2009) 5075–5084

Contents lists available at ScienceDirect

Atmospheric Environment
journal homepage: www.elsevier.com/locate/atmosenv

A prediction-based approach to modelling temporal and spatial variability of traffic-related air pollution in Montreal, Canada
Dan L. Crouse (a, *), Mark S. Goldberg (b), Nancy A. Ross (a)
(a) Department of Geography, McGill University, 805 Sherbrooke St. West, Burnside Hall, Room 705, Montreal, Quebec H3A 2K6, Canada
(b) Department of Medicine, McGill University, Canada

Article info

Article history:
Received 7 August 2008
Received in revised form 15 June 2009
Accepted 24 June 2009

Keywords:
Nitrogen dioxide
Land use regression
Geographic information systems

Abstract

Concentrations of traffic-related air pollution can be highly variable at the local scale and can have substantial seasonal variability. This study was designed to provide estimates of intra-urban concentrations of ambient nitrogen dioxide (NO2) in Montreal, Canada, that would be used subsequently in health studies of chronic diseases and long-term exposures to traffic-related air pollution. We measured concentrations of NO2 at 133 locations in Montreal with passive diffusion samplers in three seasons during 2005 and 2006. We then used land use regression, a proven statistical prediction method for describing spatial patterns of air pollution, to develop separate estimates of spatial variability across the city by regressing NO2 against available land-use variables in each of these three periods. We also developed a "pooled" model across these sampling periods to provide an estimate of an annual average. Our modelling strategy was to develop a predictive model that maximized the model R². This strategy is different from other strategies whose goal is to identify causal relationships between predictors and concentrations of NO2.
Observed concentrations of NO2 ranged from 2.6 ppb to 31.5 ppb, with mean values of 12.6 ppb in
December 2005, 14.0 ppb in May 2006, and 8.9 ppb in August 2006. The greatest variability was observed
during May. Concentrations of NO2 were highest downtown and near major highways, and they were
lowest in the western part of the city. Our pooled model explained approximately 80% of the variability
in concentrations of NO2. Although there were differences in concentrations of NO2 between the three
sampling periods, we found that the spatial variability did not vary significantly across the three
sampling periods and that the pooled model was representative of mean annual spatial patterns.
© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

One of the most pressing problems in the investigation of the effects on health of ambient air pollution is the lack of high quality data on personal exposures (Briggs, 2007). In earlier studies, investigators estimated variability in exposure to ambient air pollution between cities with data collected by government agencies (Dockery et al., 1993; Pope et al., 2002). More recently there has been growing interest in assessing exposure at the finer scale of neighbourhoods within the same metropolitan area (Jerrett et al., 2005a; Marshall et al., 2008).

1.1. Spatial and seasonal variability of ambient air pollution

Concentrations of ambient air pollution can be highly variable at the local scale. For example, Hewitt (1991) observed, in Lancaster, UK, differences in average annual concentrations of ambient nitrogen dioxide (NO2) of more than 50 µg/m³ between sampling locations less than 50 m apart from each other. Indeed, studies have shown that spatial variability of ambient air pollution can be greater within cities than between cities (Briggs, 2000; Zhu et al., 2002; Jerrett et al., 2005a). Intra-urban variability in ambient air pollution may be caused by a variety of factors, including the mixing of pollutants, local wind patterns (Seaman, 2000), patterns of traffic, and land use.

In addition to high spatial variability, there is usually substantial intra-urban seasonal variability in concentrations of air pollution (Ackerman and Knox, 2003). Seasonal variability may occur in cities that experience seasonal differences in patterns of urban heating and volume of traffic that may also be related to changing weather conditions (Andreescu and Frost, 1998; Environment Canada, 2004); hours of sunlight, temperature, wind speed and direction, and amount and type of precipitation all influence the diffusion and dispersion of ambient pollutants (McGregor, 1999). The presence of heavy clouds, for example, reduces the amount of incoming ultraviolet radiation, thus limiting photochemical reactions that produce secondary pollutants, such as ozone (Jacobson, 2002).

* Corresponding author. Tel.: +1 514 398 1592; fax: +1 514 398 7437.
E-mail address: dan.crouse@mail.mcgill.ca (D.L. Crouse).

1352-2310/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.atmosenv.2009.06.040
In Montreal, Quebec, over the last five years, data from fixed-site air pollution monitoring stations located across the city showed that mean concentrations of NO2 varied between 12 ppb in summer months to over 23 ppb in winter months (Environment Canada, National Air Pollution Surveillance (NAPS) data, available: www.etc-cte.ec.gc.ca/napsstations/Default.aspx). Wheeler et al. (2008) and Jerrett et al. (2009) also observed nearly two-fold differences in concentrations of NO2 between seasons in Windsor and Toronto, Ontario. Thus, it seems clear that, in the context of estimating ambient air pollution, especially for the purposes of estimating chronic health effects, both spatial and seasonal variability must be considered.

1.2. Techniques for modelling intra-urban air pollution patterns

A number of techniques have been developed in the last decade to assess intra-urban exposure to air pollution, including, among others, dispersion models (Bellander et al., 2001), proximity-based assessments (Venn et al., 2000), geostatistical interpolation, such as kriging and inverse distance weighting (Jerrett et al., 2001; Marshall et al., 2008), and land use regression (Briggs et al., 2000) (see Jerrett et al., 2005b for a review of the techniques). Land use regression is a statistical prediction method that estimates, in two-dimensional space, concentrations of pollution from measurements taken at specific locations within an urban area. It has proved to be more effective for describing spatial variability than dispersion models and methods of interpolation (Briggs et al., 1997, 2000; Lebret et al., 2000; Hoek et al., 2001).

In land use regression, a spatially dense measurement campaign of concentrations of pollutants is conducted within a well-defined geographic area. The prediction model then incorporates land use, road and population densities, and characteristics of vehicular traffic. Land use regression models have been shown to explain between 50% and 80% of the spatial variability in fine particulate matter (particles with aerodynamic diameters under 2.5 µm; PM2.5) and NO2 in several European cities (Briggs et al., 1997, 2000; Brauer et al., 2003; Rosenlund et al., 2008), in American cities (Ross et al., 2006, 2007; Moore et al., 2007), and in Canadian cities (Gilbert et al., 2005; Sahsuvaroglu et al., 2006; Henderson et al., 2007; Jerrett et al., 2007; Wheeler et al., 2008). Despite the development of land use regression models for a number of different cities using very similar data sources and methods, it has been argued that city-specific models are not readily transferable to other cities, given the inherent differences in meteorology, local topography, land use, and patterns of traffic between places (Briggs, 2007; Poplawski et al., 2008).

Data related to traffic patterns and street networks are key components of land use regression given that automobiles and trucks are major contributors to air pollution through direct emissions of nitrogen oxides (NOx), carbon monoxide, carbon dioxide, sulphur dioxide, volatile organic compounds (VOCs), polycyclic aromatic hydrocarbons, and particulates. In metropolitan Montreal, for example, 85% of NOx emissions and 43% of VOCs have been attributed to transportation (King et al., 2005). Despite this broad mixture of pollutants originating from road traffic, NO2 is recognized as a good indicator of traffic-related pollution due to its demonstrated co-locational association with other pollutants (Nieuwenhuijsen, 2000; Brunekreef and Holgate, 2002; Beckerman et al., 2008; Wheeler et al., 2008).

1.3. Objectives

The objective of this study was to develop a land use regression model in Montreal for describing intra-urban spatial patterns of NO2 across seasons. The model that we developed is intended to be used for long-term exposure assessment in studies of chronic disease, which require annual estimates of NO2, as opposed to assessments of short-term exposures and acute health effects.

Important local variations in ambient pollution were found in all of the previous studies that used land use regression. With the exception of a study by Wheeler et al. (2008), however, all previous models were based on measurements of pollution at one point in time. As we mentioned, concentrations of ambient NO2 in urban areas usually vary across seasons (Ackerman and Knox, 2003). What remains unclear, however, is to what extent the spatial patterns of ambient NO2 remain consistent across seasons. As we conducted measurements in different seasons, we were able to explore these temporal patterns and identify whether one model created using an annual average can describe adequately patterns of NO2 in all parts of the city and at all times of the year. Our modelling strategy was predictive rather than explanatory, in that our primary goal was to create a model that could reliably predict concentrations of NO2, rather than to identify specific causal relationships between individual predictor variables and concentrations of ambient NO2.

2. Materials and methods

Montreal is the second largest metropolitan area in Canada (population 3.6 million people (Statistics Canada, 2006)) and ambient air pollution has been shown to vary spatially (Gilbert et al., 2005). Montreal has generally lower concentrations of air pollution than Canada's largest city, Toronto, and other large cities in the United States, including Chicago, New York, and Philadelphia (Ontario Ministry of the Environment, 2006). The city has a temperate climate, with mean daily temperatures during 1971–2000 in January and July of approximately −10 °C (minimum of −35 °C) and 22 °C (maximum of 35 °C), respectively (Environment Canada, Online Climate data: www.climate.weatheroffice.ec.gc.ca/climate_normals/index_e.html).

2.1. Monitoring of ambient air pollutants

We developed the land use regression model by first conducting a series of sampling campaigns throughout the study area to estimate integrated two-week concentrations of NO2 at individual points in space. We measured concentrations of NO2 using two-sided Ogawa passive samplers (Ogawa and Co., USA). We deployed the samplers in three periods: November/December 2005 (to capture concentrations in cold weather), April/May 2006 ("temperate" weather), and August 2006 ("hot" weather). The Ogawa samplers have not been tested in conditions below −10 °C and there are no published studies of their functionality in colder temperatures. We thus deployed the samplers in December, rather than in January or February, before Montreal's temperatures became too cold for normal operation of the devices.

The samplers were installed at a height of 2.5 m above ground and were attached to street light poles, hydro-electric poles, or parking signs, usually near the sidewalk of the closest road. The geographic coordinates of each sampling location were recorded with a Garmin eTrex Legend Cx global positioning system (accurate to between 5 m and 15 m).

2.1.1. Location and frequency of sampling

NO2 has high spatial variability and so a relatively dense sampling network was used. The locations of the samplers were selected with a population-weighted location-allocation model that placed samplers in areas likely to have high spatial variability in traffic-related pollution, and with high population densities (Kanaroglou et al., 2005). Approximately 20 samplers were added to capture concentrations in residential areas that appeared to be under-represented by the initial allocation of sampling sites. Results from previous studies suggested that the precision of land use regression models
depends more on the variability of the land use characteristics that is captured by the sampling network rather than the total number of sampling sites (Ryan and LeMasters, 2007). Thus, we located a total of 133 samplers across the island of Montreal, including in residential areas, industrial areas, parks, near and away from the shoreline, next to major roads and highways, and in areas with relatively high and low population densities (Fig. 1). The minimum distance between any two neighbouring samplers was approximately 100 m and the maximum distance was just over three kilometres. The number and density of samplers used here is slightly larger than in other studies by Lebret et al. (2000) in Amsterdam, Sahsuvaroglu et al. (2006) in Hamilton, and Jerrett et al. (2007) in Toronto.

Fig. 1. Locations of Ogawa passive samplers in Montreal, Quebec, 2005–2006.

2.1.2. Analysis of samplers

The Ogawa passive samplers use triethanolamine-impregnated filters as an absorbent and diffusion draws air into the sampler where a reagent absorbs the NO2. Sampling begins when the sampler is exposed to air and ends when the sampler is placed in a closed, impermeable container. The samplers were analyzed at an Environment Canada laboratory using ion chromatography (Gilbert et al., 2003).

2.2. Land-use regression models and spatial modelling

We developed the land use regression model in three steps. First, we developed a set of spatial variables that described characteristics of land use and road densities within buffers of various radii surrounding each sampler location. Second, we developed regression models to determine the associations between these variables and the observed concentrations of NO2, and subsequently to create a prediction equation to estimate concentrations of NO2 at locations in which measurements were not made. Third, we computed a spatial map to show visually, and numerically, the predicted concentrations of NO2.

2.2.1. Step 1: creation of spatial variables

The spatial variables were created in ArcGIS 9.2 (Redlands, CA) with the land use and traffic data in the 2006 DMTI CanMap Streetfiles (Markham, ON). This dataset contains a street centreline road network with six road classifications and topographic information, including land-use classifications and building footprints. The accuracy of these data ranges from 10 m to less than one metre. There are two kinds of GIS data, namely vector, which consist of points, lines, polygons, and their associated attributes, and raster, which consist of grid-like, continuous surfaces in which individual pixels each represent a single attribute. The first step in this analysis was to convert the source vector-format data, which describe land uses and roads as polygons and lines respectively, to binary raster surfaces at a resolution of 5 m × 5 m cells. For example, one data layer would include cells assigned as "industrial" or "not industrial", while another data layer would include cells assigned either as "highway" or "not highway". This was repeated for two road categories and six land use categories. We created similar surfaces in which each cell was assigned either the traffic count of the underlying segment of road or a value of "non-road".

Next, we summed the total area (i.e., count of 5 m × 5 m cells) within buffers of multiple radii around every other cell in the entire study area. In this way, we were able to select any 5 m × 5 m location and immediately identify the total area of each land use (and total length of major roads or total traffic count on all major roads) within various buffer distances of that cell. This process was repeated with buffers of 100 m, 300 m, 500 m, and 750 m.

We also used data from the 2001 Canadian census to describe patterns of population density. We converted census tract boundaries to population-weighted centroids and then used kernel density estimation (Baily and Gatrell, 1995) to create a density surface. Additionally, we used data from the 2005 National Pollutant Release Inventory (NPRI), which is Canada's legislated inventory of pollutants released and disposed of by industrial, institutional, and
commercial facilities. This database includes emissions and address information from all facilities that manufacture or process any of the NPRI-listed substances (which includes NO2) (see www.ec.gc.ca/pdb/npri/npri_home_e.cfm for more specific information about reporting requirements).

We used traffic count data generated from a region-wide transportation model developed in TransCAD, which is a comprehensive, GIS-based, urban transportation modelling software. The travel demand data were based on the 1998 origin destination survey undertaken by the Agence Métropolitaine de Transport de Montréal. The street network data with link travel times were acquired through DMTI Spatial Inc. (Markham, ON) and contained approximately 135,000 bi-directional links. These data present a detailed representation of peak morning automobile flow (6 am to 9 am) on a link-by-link basis for both primary provincial highways and major urban streets.

Lastly, we also calculated for each cell the straight-line distance to the shoreline, to the nearest highway, and to the known point sources of NO2, namely the 33 facilities in the NPRI. Overall, we produced 47 different variables.

2.2.2. Step 2: statistical modelling

We created four separate models using as the dependent variables the three sets of two-week integrated concentrations of NO2 from the three sampling periods (i.e., December, May, August) as well as the mean of these three periods. This mean value was meant to represent an approximate annual estimate at each location. We developed our regression models using the natural logarithm of NO2 because the data were distributed lognormally. Our approach to modelling was to identify the model that explained the most variability (as estimated from the uncorrected R²) of the natural logarithm of NO2. To meet this objective, we included all variables measured except those that were perfectly collinear. This modelling strategy is different from that of other investigators (Henderson et al., 2007; Jerrett et al., 2007; Ross et al., 2007) who generally attempted to develop parsimonious models by using various model selection procedures (e.g., forward stepwise). Our strategy is justified by the objective of the study, namely to develop a model that maximized prediction. Moreover, we were not overly concerned with the interpretation of the individual regression coefficients, only the resulting linear predictor. We thus included variables measuring the same construct but with different buffer sizes. In addition, statistical significance and expected sign (+ or −) of individual coefficients were not criteria for removal from the model, as excluding non-significant variables only reduced the predictive power of the model. Another advantage of our procedure is that it is transparent and reproducible and used the same set of variables across the three seasons so that comparison with the average, or "pooled" model, is simplified. We applied standard regression diagnostics to identify possible outliers and to ensure that the models conformed with the assumptions of linear regression. Additionally, we conducted sensitivity analyses to assess how the models performed after randomly removing 15% of the observations.

2.2.3. Step 3: spatial modelling

The four multivariable models were processed into individual spatial surfaces by calculating the linear predictor of NO2 as $\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_i x_i)$, where $\beta_i$ are the estimated regression coefficients for the independent predicting variables, $x_i$. These predictions were computed in GIS using the 47 spatial data layers and each predicted pollution map was computed at a 5 m resolution. Functionally, this translated into calculating the predicted concentration of NO2 at every 5 m × 5 m location in our study area based on the measured physical characteristics of its surrounding areas.

2.3. Comparisons of the models across the sampling periods

To assess whether the mean of NO2 across our three sampling sessions could serve as an adequate proxy for a mean annual concentration, we estimated Pearson correlation coefficients of the observed concentrations at the 129 sampler locations between the three seasons. We also intersected 5000 randomly generated points with each of the four predicted surfaces and estimated the Pearson correlation coefficients between these. This step enabled us to assess whether locations that had relatively high (or low) concentrations of NO2 in one season had similarly ranked concentrations in other seasons.

3. Results

3.1. Environmental sampling

We obtained valid observations from all three sampling periods at 130 locations. Samplers were stolen or damaged on at least one occasion at three locations and so these locations were excluded from the analysis. Additionally, data from one sampler were discarded due to atypical circumstances. This sampler had been placed near an intersection of two single-lane streets in a residential block. In May 2006, construction activity led to the redirection of traffic from a major artery onto this side-street, which increased traffic and idling by vehicles that normally would not have used that street. That sampler measured unusually high concentrations of NO2, including the highest observed value (36 ppb) of any location over all three sampling sessions. Observed concentrations of NO2 from the remaining 129 locations ranged from approximately 2.6 ppb to 31.5 ppb, with mean values of 12.6 ppb in the December 2005 sampling session, 14.0 ppb in May 2006, and 8.9 ppb in August 2006 (Table 1). The greatest variability was observed during the May 2006 sampling period.

Table 1
Results of the two-week sampling of NO2 across three sampling periods at 129 sites in Montreal, 2005–2006.

                  December   May    August   Average of the three seasons
Mean              12.6       14.0   8.9      11.9
Median            12.7       13.8   8.8      11.8
Std. Deviation    2.6        4.3    3.1      3.0
Minimum           6.7        6.1    2.6      5.4
Maximum           20.1       31.5   16.9     19.0

3.2. Model selection and mapping

The magnitude and direction (+/−) of the coefficients in each multivariable model vary slightly by season (Table 2). This is not unexpected, because weather affects relationships between different land uses and topographic features and local concentrations of pollution. For example, differences in concentrations of residential heating between winter and summer alter the size of the regression coefficient of the residential land use variables without altering the overall predicted spatial patterns of NO2.

Of the four multivariable models, the pooled model created with the mean of the three observation periods had the highest R² (0.80) whereas the model for May had the lowest (0.72). The R² for August and December were 0.72 and 0.77, respectively. These results suggest that the land use and traffic predictors included in our models collectively explain approximately 70–80% of the variability of NO2 in Montreal.

3.2.1. Diagnostics and model validation

We examined several different diagnostic statistics and graphical plots to ensure validity of each of the four models. Specifically, we
tested for homogeneity of variance in the residuals, normality of the residuals, and autocorrelation in the residuals. We found no heteroscedasticity or autocorrelation and we also found that the residuals had a reasonably normal distribution, all of which demonstrate that the models did not violate any of the assumptions of multiple regression. We also sought to identify influential or outlier cases by examining Cook's D and by inspecting plots of the observed versus the predicted values. Each model produced consistent predictions with almost no important outliers (Fig. 2). We also produced a plot of the predicted values versus observed mean annual values (mean daily values during our total sampling period) at the nine fixed-site NAPS stations for which data were available (Fig. 3). This plot, too, showed good agreement between our predictions and the observations at the fixed sites across Montreal.

Table 2
Results of multivariable regression models, Montreal, 2005–2006 (*NPRI = National Pollutant Release Inventory Facility; N/A = variable excluded due to extreme multicollinearity). Coef. = regression coefficient; SE = standard error.

                                                             December 2005 May 2006      August 2006   Average of the three periods
Variable                Units              Buffer radius (m) Coef.  SE     Coef.  SE     Coef.  SE     Coef.  SE
Commercial land use     ha                 100               0.061  0.09   0.148  0.14   0.106  0.18   0.017  0.10
                                           300               0.015  0.02   0.033  0.04   0.012  0.04   0.010  0.03
                                           500               0.002  0.01   0.023  0.02   0.028  0.03   0.016  0.02
                                           750               0.004  0.01   0.007  0.01   0.013  0.01   0.008  0.01

Industrial land use     ha                 100               0.041  0.05   0.239  0.09   0.159  0.11   0.156  0.06
                                           300               0.001  0.01   0.025  0.02   0.023  0.03   0.015  0.02
                                           500               0.010  0.01   0.005  0.01   0.007  0.02   0.005  0.01
                                           750               0.002  0.00   0.003  0.00   0.005  0.01   0.001  0.00

Parks                   ha                 100               0.059  0.07   0.112  0.11   0.069  0.13   0.011  0.08
                                           300               0.017  0.02   0.045  0.03   0.036  0.04   0.002  0.02
                                           500               0.003  0.01   0.025  0.02   0.009  0.02   0.007  0.01
                                           750               0.007  0.00   0.006  0.01   0.005  0.01   0.006  0.00

Residential land use    ha                 100               0.010  0.04   0.037  0.07   0.063  0.08   0.002  0.05
                                           300               0.012  0.01   0.019  0.02   0.028  0.02   0.002  0.01
                                           500               0.006  0.01   0.008  0.01   0.004  0.01   0.000  0.01
                                           750               0.001  0.00   0.002  0.00   0.004  0.00   0.002  0.00

Density of buildings    Buildings/ha       100               0.017  0.09   0.070  0.14   0.125  0.17   0.007  0.10
                                           300               0.011  0.02   0.055  0.03   0.036  0.04   0.016  0.02
                                           500               0.006  0.01   0.006  0.01   0.024  0.02   0.006  0.01

Population density      People (000s)/km²  500               0.000  0.00   0.000  0.00   0.003  0.00   0.001  0.00
                                           1500              0.000  0.02   0.000  0.00   0.026  0.00   0.002  0.00
                                           2500              0.000  0.02   0.000  0.00   0.004  0.00   0.014  0.00

Length of primary hwy   m                  100               0.147  0.26   0.005  0.43   0.136  0.53   0.075  0.31
                                           300               0.021  0.09   0.287  0.14   0.075  0.17   0.087  0.10
                                           500               N/A           N/A           N/A           N/A
                                           750               0.016  0.03   0.042  0.05   0.051  0.06   0.035  0.04
                                           1000              0.009  0.01   0.038  0.02   0.041  0.03   0.028  0.02
                                           50-200 (annulus)  0.096  0.10   0.214  0.16   0.000  0.20   0.118  0.12
                                           50-500 (annulus)  0.032  0.05   0.043  0.08   0.019  0.09   0.000  0.06

Length of major road    m                  100               0.047  0.04   0.092  0.07   0.012  0.09   0.053  0.05
                                           300               0.011  0.01   0.032  0.02   0.017  0.03   0.004  0.02
                                           500               0.000  0.01   0.020  0.01   0.002  0.02   0.007  0.01
                                           750               0.006  0.01   0.008  0.01   0.008  0.01   0.001  0.01
                                           1000              0.001  0.00   0.005  0.00   0.008  0.01   0.000  0.00

Open water              ha                 100               0.083  0.19   0.040  0.31   0.134  0.38   0.023  0.23
                                           300               0.065  0.03   0.001  0.05   0.045  0.06   0.039  0.03
                                           500               0.028  0.01   0.007  0.02   0.001  0.02   0.010  0.01
                                           750               0.003  0.00   0.003  0.01   0.008  0.01   0.001  0.00

Traffic on primary hwy  Count (00,000s)    100               0.037  0.00   0.011  0.00   0.039  0.00   0.027  0.00
                                           300               0.001  0.00   0.014  0.00   0.000  0.00   0.006  0.00
                                           500               0.000  0.00   0.006  0.00   0.001  0.00   0.003  0.00

Traffic on major road                      100               0.048  0.00   0.058  0.00   0.058  0.00   0.029  0.00
                                           300               0.009  0.00   0.012  0.00   0.017  0.00   0.003  0.00
                                           500               0.004  0.00   0.012  0.00   0.003  0.00   0.003  0.00

Distance to shoreline   m (00s)            N/A               0.003  0.00   0.005  0.00   0.008  0.00   0.005  0.00
Distance to NPRI*                          N/A               0.001  0.00   0.002  0.00   0.004  0.00   0.002  0.00
Distance to primary hwy                    N/A               0.004  0.00   0.005  0.00   0.007  0.00   0.005  0.00

Multicollinearity is a problem that limits the ability to separate and assess the partial effects of correlated independent variables, but it does not hinder the ability to assess their joint effects. As such, we did not test for multicollinearity in our process of model evaluation (other than removing variables identified as being perfectly collinear, as explained above). Although high correlation between two or more
predictor variables may affect the sign and magnitude of the individual regression coefficients and their standard errors, it does not violate any of the assumptions of multiple regression and it does not affect overall prediction (Mason and Perreault, 1991; Allison, 1999).

[Figure 2: four scatter plots of observed NO2 (ppb) against predicted NO2 (ppb), both axes from 5.0 to 30.0. Linear R² by panel: a) December 2005, 0.768; b) May 2006, 0.709; c) August 2006, 0.707; d) Average of the three sampling periods, 0.782.]

Fig. 2. Comparison of observed (Ogawa) and predicted concentrations of NO2 (n = 129) across three sampling periods, Montreal, 2005–2006. a) December 2005, b) May 2006, c) August 2006, d) Average of the three sampling periods.

As a further test of validity, we recomputed each model with a random selection of 85% of the observations and were able to produce comparable results to those achieved from the full dataset. Additionally, we created a surface map for the model of NO2 based on the average from the three sampling periods and compared the predicted estimates at the 15% of locations that had been excluded from the model with our observed concentrations at those locations. Here we found Pearson correlation coefficients of ~0.9 between the observed and predicted values.

Surface maps of predicted concentrations of NO2 from each model show very similar spatial patterns when categorized into groups of relatively high and low values, despite differences in mean concentrations between the sampling periods (Fig. 4). Moreover, the surface maps have strong face-validity, with the highest predicted concentrations of NO2 appearing along highway corridors and in the downtown of the city (observed more easily in Fig. 5). As well, the areas with the highest predicted concentrations appear to be among the most densely-populated areas (see Fig. 1) compared to the less densely-populated east and west ends of Montreal.

3.3. Comparison of seasonal and spatial variability

We explored the variability of spatial patterns of both observed and predicted concentrations of NO2 across seasons. In both cases, we found strong positive Pearson correlation coefficients (i.e., 0.73–0.81) between the values in each season (Table 3). These correlations demonstrated that the locations characterised by relatively high and low concentrations of NO2 remained consistent across the sampling periods, so that the spatial variability did not vary appreciably according to sampling period.
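The seasonal comparison of Section 3.3 rests on pairwise Pearson correlation coefficients between the concentrations measured (or predicted) at the same locations in each sampling period. The following is a minimal sketch of that calculation in Python, not the authors' own code. The six-site values are hypothetical and chosen only for illustration, and the function names `pearson` and `pairwise_correlations` are assumptions of this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical NO2 values (ppb) at the same six sites in each sampling period.
# These are illustrative numbers, not data from the study.
seasons = {
    "December": [9.8, 12.1, 14.9, 11.0, 16.2, 13.4],
    "May": [8.1, 13.9, 19.6, 11.7, 22.3, 14.8],
    "August": [4.9, 8.6, 12.8, 6.3, 13.1, 9.9],
}


def pairwise_correlations(data):
    """Correlation for every unordered pair of sampling periods."""
    names = list(data)
    return {
        (a, b): pearson(data[a], data[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


correlations = pairwise_correlations(seasons)
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

A high positive coefficient for every pair indicates that sites ranking high in one period also rank high in the others, even when the mean concentration differs between periods, which is the property the pooled model relies on.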
Table 3
Pearson correlation coefficients of observed and predicted concentrations of NO2 between sampling periods in Montreal, 2005–2006.

                                                December   May    August
Observed concentrations at 129 sites
  December                                      1.00
  May                                           0.73       1.00
  August                                        0.76       0.73   1.00
Predicted concentration at 5000 random points
  December                                      1.00
  May                                           0.75       1.00
  August                                        0.81       0.80   1.00

Fig. 3. Comparison of observed (NAPS) and predicted concentrations of NO2 (n = 9) across three sampling periods, Montreal, 2005–2006.

Fig. 4. Surface maps of predicted concentrations of NO2 across sampling periods and their average, Montreal, 2005–2006.

Fig. 5. Surface map of predicted concentrations of NO2 based on the average of the three sampling periods, Montreal, 2005–2006.

4. Discussion and conclusions

4.1. Principal findings

We showed that there were considerable differences in mean, integrated two-week concentrations of ambient NO2 between the three sampling periods. Additionally, we found that there was significantly more variability in concentrations of NO2 in temperate weather as compared to "cold" and "hot" weather. We observed also that the spatial variability in ambient NO2 did not vary by sampling period, suggesting that one land use regression model based on the average of the three sampling periods may be used as a reasonable proxy for an annual estimate.

The spatial variability of concentrations of NO2 was related directly to characteristics of local geography, namely population density and patterns of land use, vegetation, open space, and roads, and traffic. In this context, some of the coefficients in our models are different (in size and direction) from those reported by researchers working in other cities. Many other coefficients, however, (e.g., population density, industrial land use within 750 m, distance to shoreline, traffic density within ~500 m) are comparable to those obtained in models by other authors (Gilbert et al., 2005; Ross et al., 2006; Sahsuvaroglu et al., 2006; Henderson et al., 2007; Jerrett et al., 2007). The magnitude of coefficients from land use regression models is related to the set of variables included in the models, as well as to the absence of those variables that have been excluded from the models. Some modelling strategies, such as forward selection, will remove variables that are competing for the same variability through p-value selection criteria, whereas our procedure was based on retaining all variables that explained variability. Our models produced some of the highest predictions in the literature and use of forward variable selection procedures with our data yielded concentrations of prediction similar to those found in the literature (data not shown).

4.2. Strengths and weaknesses

Our model of estimated mean annual concentrations of NO2 across Montreal reflects progress over the model described by Gilbert et al. (2005). In that study, fewer factors for predicting concentrations of NO2 were used and the model was based on samples of NO2 from approximately half as many locations (67) as were used here (129). Five of the seven variables included in that earlier study were related to roads and traffic, whereas the present model is based on numerous characteristics of the landscape, including area of commercial and industrial space, building density, distance to the shoreline, and proximity of known point sources of NO2, in addition to density of roads and traffic. It could be argued, then, that the present model may perform more reliably than the earlier model in areas of the city characterised by more varied land uses and at increased distances from major highways and expressways.

We believe that our variable selection procedure is optimal for the purpose of producing the most highly-predictive model, is reproducible, and is well-suited for the task of assigning estimates of ambient exposures at the intra-urban scale. If, however, the goal of our study had been to inform policy for mitigating air pollution, or for identifying specific patterns and densities of land use that tend to be associated with higher concentrations of pollution, or for determining independent predictors of air quality, then another model selection procedure would have been used.

Gilbert et al. (2005) recommended incorporating data on industrial point sources to improve model performance, which was achieved here with the NPRI data. It should be acknowledged, however, that the NPRI dataset that we used to identify point sources of pollution is likely to be incomplete. Although it is a legal requirement and mandatory under Canadian Law for industrial facilities to report to the NPRI, coverage may not be complete because smaller industrial facilities may not be aware or able to provide these data. In addition, the data provided to NPRI are not based usually on actual measurements but rather are engineering calculations. We also did not consider wind direction or speed, both of which may influence the extent and direction of dispersion of pollution from these facilities. Nevertheless, the associations between NPRI point sources and NO2 suggest that it is important to consider the specific industrial landscape and presence of point source emitters when modelling ambient pollution.

The use of estimates of traffic counts is a strength of this study. We found that the predictive power of our models (as measured by the R2) improved by approximately 2–4% after adding the traffic data. Although these traffic data might be expected to necessarily improve model performance, in practice, their value-added to prediction in land use regression has been mixed. For example, Ross et al. (2006) and Sahsuvaroglu et al. (2006) both found that traffic estimates in San Diego County, CA, and Hamilton, ON, respectively, contributed significantly to the improvement of the models. On the other hand, Henderson et al. (2007) found that variables describing the length of roads proved to be as effective as variables describing vehicle density for predicting pollutant concentrations in Vancouver. Rosenlund et al. (2008) also found that variables based on estimates of traffic counts in Rome did not improve their model significantly. It is possible that the inconsistent findings related to the importance of traffic counts in previous land use regression models may have been due to competition of correlated variables for explaining variance. Estimates of traffic counts can be difficult or
expensive to collect, and may be unavailable in some cities, but our findings suggest that these data will improve the predictive power of the model.

4.3. Implications

The results of this study showed that although concentrations of ambient NO2 in Montreal vary throughout the year (due, perhaps, to seasonal changes in traffic volume, urban heating, and weather), the spatial patterns of these concentrations do not. It has already been shown that the principal source of ambient NO2 in Montreal is exhaust from vehicular traffic (King et al., 2005). The high correlations between both the seasonal observations and the seasonal estimates of NO2, as shown in Table 3, suggest that patterns of vehicular traffic in Montreal likely remain relatively consistent across seasons. The key implication of this finding is that we can conclude that use of the mean of the observations from the three sampling periods does not mask underlying spatial patterns. This lack of spatial variation between seasons suggests also that the spatial patterns visible in the final land use regression surface map are generally representative of the spatial patterns throughout the year, despite variations in actual concentrations across seasons.

5. Source of financial support

This study was supported financially through the Canadian Institutes for Health Research (CIHR). Dr. Goldberg gratefully acknowledges receipt of an Investigator Award from the CIHR, Nancy Ross gratefully acknowledges receipt of a New Investigator Award from the CIHR (2003–2008), and Dan Crouse gratefully acknowledges receipt of a Canada Graduate Scholarship from the CIHR.

Acknowledgements

We thank Michael Jerrett (University of California, Berkeley) for advice and suggestions throughout the development of this paper. We also thank Ian Haase for help with data collection, and Murtaza Haider and Timothy Spurr (Department of Civil Engineering and Applied Mechanics, McGill University) for providing data from their traffic estimation model. We are grateful to Dr. Jeffrey Brook and Sandy Benetti of Environment Canada for conducting the analysis of the Ogawa samplers for concentrations of NO2.

References

Ackerman, S.T., Knox, J.A., 2003. Meteorology: Understanding the Atmosphere. Thompson Learning, Toronto.
Allison, D., 1999. Multiple Regression: A Primer. Pine Forge Press, Thousand Oaks.
Andreescu, M.-P., Frost, D.B., 1998. Weather and traffic accidents in Montreal, Canada. Clim. Res. 9, 225–230.
Baily, T.C., Gatrell, A.C., 1995. Interactive Spatial Data and Analysis. Pearson Education Ltd, Essex.
Beckerman, B., Jerrett, J., Brook, J.R., Verma, D.K., Arain, M.A., Finkelstein, M., 2008. Correlation of nitrogen dioxide with other traffic pollutants near a major expressway. Atmos. Environ. 42, 275–290.
Bellander, T., Berglind, N., Gustavsson, P., Jonson, T., Nyberg, F., Pershagen, G., et al., 2001. Using geographic information systems to assess individual historical exposure to air pollution from traffic and house heating in Stockholm. Environ. Health Perspect. 109 (6), 633–639.
Brauer, M., Hoek, G., van Vliet, P., Meliefste, K., Fischer, P., Gehring, U., et al., 2003. Estimating long-term average particulate air pollution levels: application of traffic indicators and geographic information systems. Epidemiology 14, 228–239.
Briggs, D., 2007. The use of GIS to evaluate traffic-related pollution. Occup. Environ. Med. 64, 1–2.
Briggs, D., 2000. Exposure assessment. In: Elliott, P., Wakefield, J., Best, N., Briggs, D. (Eds.), Spatial Epidemiology: Methods and Applications. Oxford University Press, Oxford.
Briggs, D., de Hoogh, C., Gulliver, J., Wills, J., Elliott, P., Kingham, S., et al., 2000. A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. Sci. Total Environ. 253, 151–167.
Briggs, D., Collins, S., Elliott, P., Fischer, P., Kingham, S., Lebret, E., et al., 1997. Mapping urban pollution using GIS: a regression-based approach. Int. J. Geogr. Inf. Sci. 11, 699–718.
Brunekreef, B., Holgate, S.T., 2002. Air pollution and health. Lancet 360, 1233–1242.
Dockery, D.W., Pope, C.A., Xu, X., Spengler, J.D., Ware, J., Fay, M.E., et al., 1993. An association between air pollution and mortality in six U.S. Cities. N. Engl. J. Med. 329, 1753–1759.
Environment Canada, 2004. Residential Wood Heating: Summary results from 1999 to 2002. Environment Canada, Ministère de l’Environnement du Québec, Ville de Montréal. Cat. No: EN154-27/2004-1E, ISBN: 0-662-38374-5.
Gilbert, N.L., Goldberg, M.S., Beckerman, B., Brook, J.R., Jerrett, M., 2005. Assessing spatial variability of ambient nitrogen dioxide in Montreal, Canada, with a land-use regression model. J. Air Waste Manag. Assoc. 55, 1059–1063.
Gilbert, N.L., Woodhouse, S., Stieb, D.M., Brook, J.R., 2003. Ambient nitrogen dioxide and distance from a major highway. Sci. Total Environ. 312, 43–46.
Henderson, S.B., Beckerman, B., Jerrett, M., Brauer, M., 2007. Application of land use regression to estimate long-term levels of traffic-related nitrogen oxides and fine particulate matter. Environ. Sci. Technol. 41 (7), 2422–2428.
Hewitt, C.N., 1991. Spatial variations in nitrogen dioxide levels in an urban area. Atmos. Environ. 25B (3), 429–434.
Hoek, G., Fischer, P., Van Den Brandt, P., Goldbohm, S., Brunekreef, B., 2001. Estimation of long term average exposure to outdoor air pollution for a cohort study on mortality. J. Expos. Anal. Environ. Epidemiol. 11 (6), 459–469.
Jacobson, M.Z., 2002. Atmospheric Pollution: History, Science, and Regulation. Cambridge University Press, Cambridge.
Jerrett, M., Burnett, R.T., Kanaroglou, P., Eyles, J., Finkelstein, N., Giovis, C., et al., 2001. A GIS – environmental justice analysis of particulate air pollution in Hamilton, Canada. Environ. Plan. A 33 (6), 955–973.
Jerrett, M., Arain, A., Kanaroglou, P., Beckerman, B., Crouse, D.L., Gilbert, N.L., et al., 2007. Modelling the intra-urban variability of ambient traffic pollution in Toronto, Canada. J. Toxicol. Environ. Health A 70, 200–212.
Jerrett, M., Burnett, R.T., Ma, R., Pope III, C.A., Krewski, D., Newbold, K.B., et al., 2005a. Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology 16 (6), 727–736.
Jerrett, M., Arain, A., Kanaroglou, P., Beckerman, B., Potoglou, D., Sahsuvaroglu, T., et al., 2005b. A review and evaluation of intraurban air pollution exposure models. J. Expos. Anal. Environ. Epidemiol. 15, 185–204.
Jerrett, M., Finkelstein, M.M., Brook, J.R., Arain, M.A., Kanaroglou, P., Stieb, D.M., et al., 2009. A cohort study of traffic-related air pollution and mortality in Toronto, Ontario, Canada. Environ. Health Perspect. 117, 772–777.
Kanaroglou, P.S., Jerrett, M., Morrison, J., Beckerman, B., Arain, M.A., Gilbert, N.L., et al., 2005. Establishing an air pollution monitoring network for intra-urban population exposure assessments: a location-allocation approach. Atmos. Environ. 39, 2399–2409.
King, N., Morency, P., Lapierre, L., 2005. Direction de santé publique de Montréal. Synth. Rep. Ser. 8, 3.
Lebret, E., Briggs, D., Van Reeuwijk, H., Fischer, P., Smallbone, K., Harssema, H., 2000. Small area variations in ambient NO2 levels in four European areas. Atmos. Environ. 34, 177–185.
Mason, C.H., Perreault Jr., W.D., 1991. Collinearity, power, and interpretation of multiple regression analysis. J. Market. Res. 28 (3), 268–280.
Marshall, J.D., Nethery, E., Brauer, M., 2008. Within-urban variability in ambient air pollution: comparison of estimation methods. Atmos. Environ. 42 (6), 1359–1369.
McGregor, G.R., 1999. Basic meteorology. In: Holgate, S.T., Samet, J.M., Koren, H.S., Maynard, R. (Eds.), Air Pollution and Health. Academic Press, London.
Moore, D.K., Jerrett, M., Mack, W.J., Kunzli, N., 2007. A land use regression model for predicting ambient fine particulate matter across Los Angeles, CA. J. Environ. Monit. 9, 246–252.
Nieuwenhuijsen, M.J., 2000. Personal exposure monitoring in environmental epidemiology. In: Elliott, P., Wakefield, J., Best, N., Briggs, D. (Eds.), Spatial Epidemiology: Methods and Applications. Oxford University Press, Oxford.
Ontario Ministry of the Environment, Environmental Monitoring and Reporting Branch. Air Quality in Ontario, 2005 Report. Queen’s Printer for Ontario, 2006.
Pope III, C.A., Burnett, R.T., Thun, M.J., Calle, E.E., Krewski, D., Ito, K., et al., 2002. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA 287, 1132–1141.
Poplawski, K., Gould, T., Setton, E., Allen, R., Su, J., Larson, T., et al., 2008. Intercity transferability of land use regression models for estimating ambient concentrations of nitrogen dioxide. J. Exposure Sci. Environ. Epidemiol. (advance online publication).
Rosenlund, M., Forastiere, F., Stafoggia, M., Porta, D., Perucci, M., Ranzi, A., et al., 2008. Comparison of regression models with land-use and emissions data to predict the spatial distribution of traffic-related air pollution in Rome. J. Exposure Sci. Environ. Epidemiol. 18 (2), 192–199.
Ross, Z., English, P.B., Scalf, R., Gunier, R., Smorodinsky, S., Wall, S., et al., 2006. Nitrogen dioxide prediction in Southern California using land use regression modelling: potential for environmental health analyses. J. Exposure Sci. Environ. Epidemiol. 16, 106–114.
Ross, Z., Jerrett, M., Ito, K., Tempalski, B., Thurston, G., 2007. A land use regression model for predicting fine particulate matter levels in the New York City region. Atmos. Environ. 41, 2255–2269.
Ryan, P.H., LeMasters, G.K., 2007. A review of land use regression models for characterizing intraurban air pollution exposure. Inhalation Toxicol. 19 (Suppl. 1), 127–133.
Sahsuvaroglu, T., Arain, A., Kanaroglou, P., Finkelstein, N., Newbold, B., Jerrett, M., et al., 2006. A land use regression model for prediction ambient levels of nitrogen dioxide in Hamilton, Ontario, Canada. J. Air Waste Manag. Assoc. 56, 1059–1069.
Seaman, N.L., 2000. Meteorological modelling for air quality assessments. Atmos. Environ. 34, 2231–2259.
Statistics Canada, 2006. Census Data.
Venn, A., Lewis, S., Cooper, M., Hubbard, R., Hill, I., Boddy, R., et al., 2000. Local road traffic activity and the prevalence, severity, and persistence of wheeze in school children: combined cross sectional and longitudinal study. Occup. Environ. Med. 57, 152–158.
Wheeler, A.J., Smith-Doiron, M., Xu, X., Gilbert, N.L., Brook, J.R., 2008. Intra-urban variability of air pollution in Windsor, Ontario – measurement and modelling for human exposure assessment. Environ. Res. 106, 7–16.
Zhu, Y., Hinds, W., Kim, S., Shen, S., Sioutas, C., 2002. Study of ultrafine particles near a major highway with heavy-duty diesel traffic. Atmos. Environ. 36, 4323–4335.
