Professional Documents
Culture Documents
Dwes 3 107 2010
Dwes 3 107 2010
Dwes 3 107 2010
Earth System
Open Access
Monitoring water distribution systems: Science
understanding and managing sensor networks
Data
D. D. Ediriweera and I. W. Marshall
Lancaster Environment Centre, Lancaster University, Lancaster, LA1 4YQ, UK
Received: 5 February 2010 – Published in Drink. Water Eng. Sci. Discuss.: 12 April 2010
Revised: 23 August 2010 – Accepted: 2 September 2010 – Published: 27 September 2010
Abstract. Sensor networks are currently being trialed by the water distribution industry for monitoring com-
plex distribution infrastructure. The paper presents an investigation in to the architecture and performance of
a sensor system deployed for monitoring such a distribution network. The study reveals lapses in systems
design and management, resulting in a fifth of the data being either missing or erroneous. Findings identify the
importance of undertaking in-depth consideration of all aspects of a large sensor system with access to either
expertise on every detail, or to reference manuals capable of transferring the knowledge to non-specialists.
First steps towards defining a set of such guidelines are presented here, with supporting evidence.
into several non-obvious design pitfalls observed within the Based on the literature it becomes evident that sensing
investigation are outlined to help future designers avoid the within water distribution systems is useful for many applica-
same. tions ranging from burst and leakage detection, infrastructure
security, water quality monitoring etc. However at the same
time only few guidelines appear to be available on “how”
2 Related work such sensing should be achieved. Available models often ap-
pear to address design time issues and rarely discuss illusive
Work published by (Stoianov et al., 2007) provides a useful pitfalls which often become apparent at deployment and op-
insight into the emerging role of sensing within water dis- eration time. Guidelines presented here are therefore aimed
tributions systems. Work presented discusses a solution for at assisting designers avoid issues which are illusive and sub-
monitoring large diameter bulk-water transmission pipelines tle at design time, but impactful during live operation. These
through wireless sensor networks for burst and leakage de- guidelines are derived by analyzing the performance of a live
tection. Increased special and temporal resolution of hy- sensor network in retrospect, and hence should effectively
draulic and acoustic/vibration data is claimed to provide sig- capture runtime limitations and their relationship to design
nificant proactive and reactive maintenance capability allow- time choices.
ing for potentially large financial savings. Field results indi-
cate key observations including the necessity for robust sens-
ing hardware, decoupling of data collection from communi- 3 Survey details
cation, and improved time synchronisation.
3.1 Information used
A different application of sensing with water distribution
infrastructure is presented by (Qian et al., 2007). The paper A multi-faceted analysis has been undertaken where a vari-
highlights the importance of security in urban water distri- ety of variables and potential contributing factors have been
bution infrastructure, and discusses a sensor solution for im- examined to understand “how” some may have affected sys-
proving its safety and protection via an optimisation between tem performance. The main analysis has been performed
off-line and online monitoring. Other work such as environ- based on the original dataset collected by the sensor network
mental sensing for monitoring drinking water quality (Ail- over the studied period. Following additional information
amaki et al., 2003), and project SmartCoast for monitoring and knowledge has also been made us of in the analysis.
water quality in fresh water catchments (B O’Flynn et al.,
2007) do also exists which provide further evidence of the – Discussions with the sensor network operator, and anal-
growing application of sensor networks within the drinking ysis of o the water distribution system.
water industry.
– Analysis of the sensor locations and their distance to
A specialized communications model for sensor based
communication towers.
monitoring of water distribution networks is presented by
(Lin et al., 2008). The work specifically addressed chal- – Analysis of local weather conditions.
lenges encountered in underground to aboveground radio
propagation when sensors are placed within concrete and – Experience from other sensor networks including mon-
cast iron chambers; a typical placement of sensor nodes com- itoring systems in telecommunications.
mon within many water distribution networks. The work dis-
cusses different antenna placements, antenna sizes, frequen-
3.2 System configuration
cies and strategies for establishing the necessary uplinks, and
provides useful insights in to practical communication chal- The examined system is a wide area sensor network deployed
lenges which need consideration in real world sensor deploy- for monitoring of a water distribution and supply system.
ments. The sensor net consists of 520 plus sensor nodes monitor-
Common sensor network architecture for equipment fail- ing flow and pressure of a pipe system. These nodes are dis-
ure prediction through vibration monitoring is discussed by tributed within roughly 50 km2 . The majority of the sensor
(Krishnamurthy et al., 2005). The architecture is trialed nodes are deployed underground within manholes located on
within two industrial surroundings: a semiconductor man- public roads. All nodes including their sensors, radios and
ufacturing plant; and a North Sea oil tanker. While the work loggers are commercially manufactured and are designed to
may not directly relate to water distribution, findings are rele- be water proof. Electronics within these were custom made
vant as they are presented in terms of general design insights. with the possibility for minor imperfections. Each sensor
Important findings include relationships observed between node is equipped with a GSM modem capable of GPRS data
different sensor hardware configurations and their power effi- connectivity. Data collected by individual sensor nodes are
ciency. It is claimed that hardware with grater RAM and I/O relayed via a public GSM/GPRS network. Sensor nodes are
bandwidth were more power efficient as they required less powered using a battery pack with an estimated lifetime of
software intelligence for resource management. approximately twenty four months.
Figure
Figure 1. The 1. The outlines
diagram diagram outlines the configuration
the configuration of theofsensor
the sensor network
network being
being investigated.An
investigated. Anabstracted
abstracted view
viewofofthe
theprocess
processis cap-
is captured to
tured to maintain clarity. Circled numbers indicate possible failure points within the infrastructure which are relevant to the analysis.
maintain clarity. Circled numbers indicate possible failure points within the infrastructure which are relevant to the analysis.
Each net
Table 1. Sensor sensor node is designed
performance to support
summarised by year. up to a max- LFP identifies the percentage of loggers which at
imum of 4 data channels, with each channel support- least failed once during a given year. Over the stu-
ing a single sensor. 68% of the nodes deployed were died period LFP appears to be fairly stable indicating
Year a single
using Total Days Total
connected channel Loggers Loggers
while the remain- Logger
that Missing such
overall conditions Total Records Lost
as manufacturing per
quality
ing 32%Operational Failings
were using both. SensorOperational
nodes with twoFailed
ac- Failure
of % Dataand
the hardware, % long Records Logger per Day
term environmental condi-
tive channels were monitoring both flow and pres- (LFP)
tions (MDP) stationary
were generally Lost during the (RL/L/D)
considered
sure, while nodes with a single active channel were period. MDP identifies the volume of data missing
2006 364
measuring either flow or822
pressure. 479 321
in 0.670
a given year 8.03
as a 1percentage
196 956
of the6.86 total data
2007 364 1242 462 401 0.867 7.45 1 389
which should have been recorded during 133 8.26 the same
2008 path configuration
Data 365 2527
of the system 430is highlighted
349 0.811 MDP 32.8
period. is therefore 4 465an913important 28.45
indicator of
2009 71 137 387
in figure 1. Sensor nodes are configured to record 289 0.746 system0.32
overall performance. 21 987 MDP for year 0.80 2006 and
readings every 15 minutes. Pressure readings are 2007 are stable at around 6%-8%, but dramatically
recorded as snapshots while flow readings are aver- increases to 32% in 2008. This indicates a much
aged node
Each sensor over is
several measurements
designed to support within
up to a amaximum
15 minute higher level
defined of failures
as a gap of greater during
than 15 thismin
year. MDP two
between for consec-
period. The loggers report back to the central data 2009 indicates a big improvement at 0.8%, but may
of 4 data channels, with each channel supporting a single utive measurements recorded for a specific channel. This is a
collector every 30 minutes. The dataset used for the not be indicative as the figures are only for a short
sensor. 68% of the includes
analysis nodes deployed
in excess wereof using a singlerecords
76 million con- conservative
period of themeasurement
year. RL/L/D as it does not
identifies on take
averagein tothe
considera-
nected channel while
over the the remaining
3+ year period from 32%2006were using49
to 2009. both.
mil- tion
numberany ofrecords
records which continued
missing per loggerto bepermissing
operational at the end of
Sensor nodes
lionwith two active
of these readingschannels
were were monitoring
of pressure, whileboth27 the
day.studied
RL/L/Dperiod.
appearsKey indicators
to be consistentto be
withnoted
MDP. from It Table 1
million were
flow and pressure, whileflow.
nodesThewithdataahas beenactive
single recorded from
channel peaks
are: in 2008
Logger and then
Failure falls off (LFP);
Percentage in 2009. In summary
Missing Data Percent-
529 distinct sensor nodes
were measuring either flow or pressure. with 697 active channels. these(MDP);
age findings
andindicate
RecordssystemLost perperformance
Logger per Day to be (RL/L/D).
Data path configuration of the system is highlighted in generally poor incurring significant loss of data dur-
ingLFPeachidentifies the percentage
year excluding 2009 (were ofonly
loggers
a shortwhichpe- at least
Fig. 1. Sensor
4 THEnodes are configured to record readings every
ANALYSIS failed
riod ofonce during
the year is aanalysed).
given year. 2008Overrecordsthe anstudied
un- period
15 min. Pressure readings are recorded as snapshots while LFP
usuallyappears to be
high loss ratefairly
at 32%.stable indicating that overall con-
4.1 are
flow readings Missing Dataover
averaged Overview
several measurements within ditions such as manufacturing quality of the hardware, and
Table 1The
a 15 min period. provides
loggers a summary
report back of toyearly performance
the central data long term environmental
4.2 Missing Data by Dateconditions were generally station-
of the30
collector every system. All failures
min. The dataset in thefor
used table
therefer to events
analysis in- ary during the considered period. MDP identifies the volume
of missing
cludes in excess of 76data within
million the dataset.
records over theA3+missing event
year period Figure 2 illustrates the distribution of the short term
of data
(less thanmissing
6 hours) in amissing
given yeardata as a percentage
events over the 3+ of the total
from 2006intothis instance is defined as a gap of greater than 15
2009. 49 million of these readings were of pres- data
minutes between two consecutive measurements years analysed. The plot indicates short-term failuressame pe-
which should have been recorded during the
sure, whilerecorded
27 million
forwere flow. The
a specific data has
channel. Thisbeen
is a recorded
conserva- riod.
to be MDP
fairlyisclustered
thereforewith an important
some periods indicator of overall sys-
illustrating
from 529 distinct sensor nodes
tive measurement as itwith
does697
notactive
take inchannels.
to considera- higher
tem numbers ofMDP
performance. failureforthan
yearothers.
2006 andPeak2007 periods
are stable at
tion any records which continued to be missing at can be 6%–8%,
around identifiedbut as dramatically
Jan-07 to May-07,increases Mar-08
to 32% to in 2008.
the end of the studied period. Key indicators to be Aug-08,
This and Oct-08
indicates a much to Dec-08.
higher levelThe calmer
of failuresperiods
during this
4 The analysis
noted from Table 1 are: Logger Failure Percentage with fewer failures occur between Jan-06 to Nov-06,
year. MDP for 2009 indicates a big improvement at 0.8%,
(LFP); Missing Data Percentage (MDP); and and May-07 to Jan-08. No lull periods appears to
4.1 Missing dataLost
Records overview
per Logger per Day (RL/L/D). but may not beJan-08
have occurred indicative as the figures are only for a short
onwards.
period of the year. RL/L/D identifies on average the number
Table 1 provides a summary of yearly performance of the of records missing per logger per operational day. RL/L/D
system. All failures in the table refer to events of missing appears to be consistent with MDP. It peaks in 2008 and then
data within the dataset. A missing event in this instance is falls off in 2009. In summary these findings indicate system
Figure 2. Short-term (<6 h) failures vs. date. Figure 4. Long-term (>6 h) failures vs. date.
Figure 3. Rainfall vs. date. Figure 5. Sensor network maintenance vs. date.
performance to be generally poor incurring significant loss ures either. Based on this evidence it is therefore appropriate
of data during each year excluding 2009 (were only a short to exclude rainfall as a major contributor of sensor failures.
period of the year is analysed). 2008 records an unusually Figure 5 plots maintenance work carried out on the sen-
high loss rate at 32%. sor network between Jan-06 to Jan-09. This is of interest
as maintenance may cause system disruptions. It is in fact
found that short term failures during Jul-08 and in Sep-08
4.2 Missing data by date closely correlate with such maintenance. This strongly backs
the possibility of these failures being caused as the result of
Figure 2 illustrates the distribution of the short term (less engineers performing maintenance on sensor devices.
than 6 h) missing data events over the 3+ years analysed. Figure 4 plots long-term errors lasting for greater than 6 h.
The plot indicates short-term failures to be fairly clustered Two clusters of long-term errors appear from Jan-07 to May-
with some periods illustrating higher numbers of failure than 07 and from Mar-08 to Jul-08. Based on operator feedback
others. Peak periods can be identified as Jan-07 to May-07, these periods were found to correlate with battery replace-
Mar-08 to Aug-08, and Oct-08 to Dec-08. The calmer peri- ment schedule and could therefore indicate devices switch-
ods with fewer failures occur between Jan-06 to Nov-06, and ing off due to battery depletion. Specific battery replacement
May-07 to Jan-08. No lull periods appears to have occurred records were not available for verifying this. Long-term er-
Jan-08 onwards. rors which do not fall within these periods were potentially
Figure 3 plots rainfall recorded during the period from Jan- caused due to out-of-synch battery depletion or sensor hard-
06 to Jan-09. This has been calculated as the average from ware failures.
6 rain gauges in the general area. Rainfall is of interest due
to the effects which it has on radio wave propagation and
4.3 Missing data by duration and event size
the possibility of it causing flooding of underground sensor
nodes. Upon comparing Figs. 2 and 3, none of the peak rain- Figure 6 plots failure events grouped by event duration and
fall events appear to significantly correlate with failure events event size. Two distinct distributions are visible: short-term
over the 3+ year period. A large rainfall event in Jul-07 errors “<24 h”, and long-term errors “>24 h”. Long-term er-
which is recorded to have caused widespread flooding in the rors peak at “>30 days” and tail off at “2 to 3 days”. It is
area does not appear to have trigged significant sensor fail- interesting to note that 99.5% of these long-term errors affect
fundamental management information such as the error logs. the hour. Using such non-standard times increases the prob-
Resulting from this lack of information the operators of the ability of avoiding traffic being generated by other systems.
sensor network were unaware of the scale of the performance
lapses within the system. To overcome such limitations it is 6 Conclusions
vital that system designers envisage how a system can be
managed when it is deployed, and use this insight to in- The study has revealed up to a fifth of the data collected by
tegrated necessary tools to support such management. At the evaluated sensor system to be either missing or erroneous.
a minimum such support should include the possibility of Contributing factors included lack of expertise in system de-
monitoring sensor nodes continuously and red-flagging pos- sign leading to poor design choices, lack of management
sible issues as they occur. Error logs should be available for support built-in to the system, weaknesses in the specifica-
further investigation of any issues. Facilities should also be tion given to the equipment vendor by the system owners,
introduced within sensor management systems for automated back end IT system failures, and failing transducer hardware.
monitoring of battery performance and scheduling of battery Findings in retrospect reveal important aspects which require
replacements. This would help minimise downtime of sensor considerable attention when deploying large sensor systems.
nodes due to battery failure. It was surprising to discover that These include the recruitment of necessary expertise, design
batteries within the current system were hardwired to sensor of integrated management tools, careful evaluation and se-
nodes, making their replacement a complex process. Issues lection of communication mediums, identification of suitable
such as hardwired batteries and unavailability of sensor error communication strategies, and undertaking of management
logs suggest significant lapses in requirement specifications as a planned activity. Future work on developing these guide-
by sensor network operators to the equipment vendors. lines to formulate a more comprehensive rule set is currently
The selection of a suitable communication medium and planned.
appropriate communication protocols are other aspect of sen-
sor network design which requires significant consideration. Acknowledgements. This work is/has been funded by project
Current system operators have opted for GSM/GPRS solu- Neptune (EPSRC Reference: EP/E003192/1).
tion while the industry norm has largely remained focused
towards GSM/SMS. Although such forward thinking is valu- Edited by: J. Vreeburg
able, extreme caution is necessary to fully understand their
implications. As an example, the effects of peak GSM/GPRS References
network traffic discussed earlier are potentially a hidden phe-
nomenon with considerable adverse side-effects. The exis- Ailamaki, A., Faloutos, C., Fischbeck, P., Small, M., and VanBrie-
tence of such an elusive problem suggests the importance of sen, J.: An environmental sensor network to deter-mine drink-
ing water quality and security, SPECIAL ISSUE: On sensor net-
undertaking performance evaluations such traffic profiling,
work technology and sensor data management, ACM SIGMOD
and quality of service checks on any third party infrastruc-
Record, 32 , 4, 47–52, 2003.
ture which is planned to be used. Krishnamurthy, L., Adler, R., Buonadonna, P., Chhabra, J., Flani-
The selection of a proper communication strategy can it- gan, M., Kushalnagar, N., Nachman, L., and Yarvis, M.: De-
self be as vital as selecting the communication medium itself. sign and deployment of industrial sensor networks: experiences
Within the current system all sensor nodes are polled every from a semiconductor plant and the north sea, Proceedings of
30 min starting from the top of the hour. This approach of si- the 3rd international conference on Embedded networked sensor
multaneously polling all sensor nodes may itself be sufficient systems, 64–75, 2005.
to cause low-capacity GSM/GPRS cells to be congested, cre- Lin, M., Wu, Y., and Wassell, I.: Wireless sensor network: Water
ating traffic errors, and connection issues. More importantly, distribution monitoring system, Radio and Wireless Symposium,
this approach is most likely to create bottle necks at the data 2008 IEEE, 775–778, 2008.
O’Flyrm, B., Martinez, R., Cleary, J., Slater, C., Regan, F., Dia-
collector which would need to handle incoming connection
mond, D., and Murphy, H.: 32nd IEEE Conference on Local
requests from all sensor nodes simultaneously, leading to
Computer Networks 2007, 815–816, 2007.
potential data mishandling and other communication errors Qian, Z., Gao, J., Wu, W., Hou, X.Q., Wu, Y., Tao, L., and Ran, R.:
within the backend system. The simplest approach to over- Harbin Water Supply Network Monitoring and Evaluation, IEEE
come this would be to stagger the polling times such that International Conference on Networking, Sensing and Control
sensor nodes report back at different individual times during 2007, 345–349, 2007.
each 30 min block. To compliment such an approach it is Stoianov, I., Nachman, L., Madden, S., and Tokmouline, T.:
also useful to avoid the use of standard times for reporting PIPENET a wireless sensor network for pipeline monitoring,
back data, especially when third party infrastructure is used. Proceedings of the 6th international conference on Information
The rationale being, the probability of standard times such as processing in sensor networks, 264–273, 2007.
the top of the hour, fifteen past, or midnight being used by Tierney, B., Crowley, B., Gunter, D., Lee, J., and Thompson, M.: A
Monitoring Sensor Management System for Grid Environments,
other systems on same network is much higher compared to
Cluster Comput., 4, 1, 19–28, 2001.
non-standard times such as for example three minutes past