
Design and Implementation of a Real Time Video Surveillance System with Wireless Sensor Networks

Wen-Tsuen Chen§, Po-Yu Chen†, Wei-Shun Lee§, and Chi-Fu Huang

§Department of Computer Science, National Tsing Hua University, Hsin-Chu, Taiwan
Email: wtchen@cs.nthu.edu.tw, 934384@mnet.cs.nthu.edu.tw

†Institute of Communication Engineering, National Tsing Hua University, Hsin-Chu, Taiwan
Email: jaa@com.nthu.edu.tw

Department of Computer Science, National Chiao Tung University, Hsin-Chu, Taiwan
Email: cfhuang@csie.nctu.edu.tw

Abstract—One important goal of surveillance systems is to collect information about the behavior and position of targets of interest in the sensing environment. Such systems can serve many applications, such as fire emergency response, security surveillance, and smart homes. Recently, surveillance systems that combine wireless sensor networks with video cameras have become increasingly popular. In traditional video surveillance systems, both performance and cost are proportional to the number of deployed video cameras. In this paper, we propose a real-time video surveillance system consisting of many low-cost sensors and a few wireless video cameras. The system allows a group of cooperating sensor devices to detect and track mobile objects and to report their positions to the sink node of the wireless sensor network. The sink node then uses IP cameras deployed in the sensing area to record these events and display the current situation. We also propose a camera control scheme that initializes the coverage distribution of the cameras and supports inter-task handoff operations between cameras. We have implemented the proposed system with 16 sensor nodes and two IP cameras and evaluated its performance. The results show that our surveillance system adapts to varied environments and provides real-time information about the monitored environment.

Keywords- real time, surveillance, sensor networks, video

I. INTRODUCTION

Recent advances in micro-electromechanical systems, embedded computing, and low-power radio communication technology have sparked the advent of massively distributed wireless sensor networks (WSNs). WSNs consist of a large number of small, low-cost, low-power sensor nodes that collect and disseminate environmental data. They can serve many applications, including surveillance, guiding systems, biological detection, and habitat, agriculture, and health monitoring [2, 3]. The key advantage of WSNs is their ability to bridge the gap between the physical and logical worlds by collecting useful information and sending it to powerful devices that can process it. Appropriately applied to dangerous tasks, WSNs can greatly decrease risk, or even remove the need for human presence in certain tasks [4]. Integrating the context-aware ability of WSNs into surveillance systems is an attractive direction that has led to much research [5-9].

Traditional video surveillance systems allocate a large number of video cameras to monitor a whole environment and collect a large volume of audio/video information, which requires great computation and manpower to analyze. Generally, someone must constantly watch all the monitor screens to check whether events have occurred in the monitored environment. These numerous recordings always contain some hints, but a person's limited ability to detect the motion of objects is the main disadvantage. For other possible disasters, such as a gas leak or the heat of a fire, one isolated system is not enough; instead, several separate systems must be installed to detect and prevent them. A sensor, by contrast, can sense many parameters of the environment, such as temperature, humidity, and object motion, and the sensed data can be quickly reported to a processing center that can make a decision. However, a WSN can only sense that something is irregular and report numerical information; it cannot understand what happened. The characteristics of WSNs and video surveillance systems are so complementary that intelligent WSNs offer an opportunity to significantly improve the quality and robustness of surveillance systems, making them more powerful and providing more information and services for environmental observation.

In this paper, we propose a video surveillance system integrated with a WSN. It can act as a traditional video surveillance system while using many low-cost sensors and only a few wireless video cameras. The WSN is deployed to sense events and report their information back to the sink, and the sink then controls the cameras to monitor the events and send video streams back for data analysis. Since the video streams contain the event information, we save a large amount of manpower for data analysis and achieve high monitoring quality. Our system suits many surveillance applications, such as home security and disaster avoidance in buildings. For simplicity and convenience of system design and protocol description, we make some assumptions: (1) the sensors allocated in the monitored region are uniformly distributed; (2) each sensor is aware of its position; (3) the detected objects either emit some signal, such as sound or light, that the sensors can detect, or the objects themselves are observable phenomena. Since a few

978-1-4244-1645-5/08/$25.00 ©2008 IEEE 218


cameras cannot cover the whole sensing region, many events of interest may happen simultaneously. We propose a heuristic method to solve the problems of monitoring-region allocation and inter-task handoff, so that the system monitors the maximum number of events in the region. We have implemented a prototype and evaluated the performance of the proposed system. Our design integrates a WSN with video cameras to extend the reach of the surveillance system while reducing the cost of deploying a large number of cameras.

The organization of this paper is as follows. Section II reviews related work. Section III introduces the proposed system. Section IV presents the experimental results. Section V describes the implementation of the system. Finally, Section VI presents the conclusions.

II. RELATED WORKS

In this section, we review some surveillance systems. Several works use mobile robots with WSNs to explore unknown or dangerous areas [8], [9]. Such a system consists of a WSN and a mobile robot equipped with an IP camera and a guiding sensor; the WSN guides the robot to a location, and the robot takes images back to the sink. These surveillance systems [5-8] rely on the cooperation of WSNs and cameras because of the large amount of information and the high-level media that cameras provide. The advantages of covering a large surveillance area, robustness, and relatively low cost make them very suitable for visual-security applications in many environments, such as banks, shopping malls, subway stations, and parking lots. The WSN is also used to detect and localize objects of interest. However, these works often focus on the detection of human behavior and limit the function of the WSN. The object-tracking problem in WSNs has also been studied intensively. Most works assume that intruding objects emit some signal, such as noise or light, or that the objects themselves are phenomena, such as diffused gas or chemical liquid [10]. In [11], a sound-source localizer and a motion-detector system are implemented on a robot called ISAC. The system uses audition instead of vision as the localizer because sound reception is omnidirectional and requires little computation compared with image processing.

III. THE PROPOSED SYSTEM

Fig. 1 gives an overview of our surveillance system. Our design can be separated into two parts: one runs on the sensor side and the other on the sink side. In our system, the sensors track objects and report their locations to the sink through a routing tree. The sink processes the reported data and then triggers the cameras to send video streams back to the graphical user interface (GUI) on the sink through an IEEE 802.11 radio interface. In the following, we discuss our design in detail.

Figure 1. System Overview.

A. Network Initialization

Because the sink has a fixed location, our system adopts a spanning tree rooted at the sink as the routing tree for transmitting information from the sensors to the sink. First, the sink Ssink broadcasts an initial <Ssink, hop> packet with two fields, Sender_ID and Hop_count, where the hop count is initially set to zero. When a sensor Si receives an initial <Sj, hopj> packet from its neighbor Sj, Si adds a new entry <Sj, hopj> to its neighbor table and rebroadcasts the initial packet as <Si, hopj + 1>. If multiple initial packets with the same hop value are received, the one with the strongest signal is recorded. After the routing tree is built, each sensor in the WSN knows its own shortest path to the sink and keeps its neighbors' information in its neighbor table.

B. Event Position and Report

When events are detected by sensors, the sensors start to collect data for a specific period. A simple way to track events is to let every sensor that detects an event report the event information and its position back to the sink, which then processes all the information to localize where the events occurred. The advantage of this method is that the complicated data is processed by a powerful device, so the power-limited sensors consume less energy and the network lifetime is extended. The main disadvantage is that, since the sensors are uniformly distributed, multiple sensors detect the same event, and the large number of report packets brings high traffic load and a high collision probability. As a result, we perform the event-positioning task on the sensors. On detecting an event, sensor Si broadcasts an event-position <Si, Ei, Ampi> packet with three fields, Sender_ID, Event_ID, and sensed data, to inform its neighbors that it has detected an event. The sensed data Ampi is the amplitude of the sound emitted by the hand-held device; a higher value means that the sensor is closer to the event. On receiving an event-position <Sj, Ej, Ampj> packet from its neighbor Sj, Si checks whether its neighbors have detected the same event Ej and compares the received packets to determine which sensor is closest to the event. The sensor with the highest sensed value around the event reports its ID to the sink, and the sink uses this node ID as the event position. Thus, whether the event is static or mobile, we can track it hop by hop without loss. Since our system uses sound to trigger events, we assume that events are triggered at different times, or at a proper distance from one another, so that they do not interfere with the sensors' data collection.
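The closest-sensor election described above can be sketched as a small, centrally simulated Python program. This is only an illustration, not the authors' distributed NesC code: node IDs, topology, and amplitudes are made up, and breaking amplitude ties by the lower sensor ID is our assumption.

```python
# Sketch of the distributed event-positioning election: every sensor that
# hears an event broadcasts (sender_id, event_id, amplitude); a sensor
# elects itself as the event position only if no neighbor that detected
# the same event reported a higher amplitude.

def elect_reporters(readings, neighbors):
    """readings: {sensor_id: {event_id: amplitude}} for detecting sensors.
    neighbors: {sensor_id: set of one-hop neighbor ids}.
    Returns {event_id: sensor_id}, the node ID the sink uses as the
    event position (ties broken by the lower sensor ID -- an assumption)."""
    winners = {}
    for sid, events in readings.items():
        for eid, amp in events.items():
            # Compare against neighbors' broadcasts for the same event.
            louder_neighbor = any(
                readings.get(n, {}).get(eid, -1.0) > amp
                or (readings.get(n, {}).get(eid) == amp and n < sid)
                for n in neighbors[sid]
            )
            if not louder_neighbor:
                best = winners.get(eid)
                # This sensor reports its ID to the sink.
                if best is None or readings[best][eid] < amp:
                    winners[eid] = sid
    return winners

# Hypothetical 4-node line topology; event 1 is loudest at node 2.
nbrs = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
amps = {1: {1: 0.2}, 2: {1: 0.9}, 3: {1: 0.5}}
print(elect_reporters(amps, nbrs))  # -> {1: 2}
```

Because each sensor needs only its neighbors' packets, the election runs locally and, as a mobile event moves, the winning node changes hop by hop, which is exactly how the sink tracks the event.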

These parameters can be adjusted according to the application specification.

C. Monitoring Region Allocation

Our goal is to use fewer cameras while achieving the same surveillance quality as traditional surveillance systems. The challenge is how to assign monitoring regions to the cameras: every camera should monitor a proper sub-region of its area so that the total number of monitored events is maximized. We propose a heuristic method to decide the best mapping between cameras and monitoring regions. As in Fig. 2(a), the area covered by the circular regions represents the area that a camera can monitor. A circular region is formed by one sensor and its one-hop neighbors, and a red node represents an event. The number of circular regions of a camera equals the number of sensor nodes in its covered area. When an event occurs in a circular region, the sink creates an event set belonging to that region, so each camera has a collection of event sets. The sink periodically collects the event information and classifies the events by their positions, and the cameras monitor and count the number of events in each set.

Figure 2. (a) Circular Monitor Region. (b) Camera and Event Sets Mapping.

For example, in Fig. 2, camera A has 10 event sets, and the corresponding numbers of events in the sets are {2, 2, 3, 1, 3, 4, 3, 2, 3, 1}. The other camera, B, also has its own event sets, as shown in Fig. 2(b). First, the sink assigns the event set with the maximum number of events to camera A as its monitoring region; here that is set 6, with 4 events. The sink then deletes all events in the assigned set from the event sets belonging to the other cameras, and repeats the process until every camera has been assigned a monitoring region. This heuristic yields a mapping of cameras to event sets under which the system always monitors as many events as possible. In a traditional surveillance system, providing full coverage of a large square area without any loss requires at least four cameras. Our system may miss some events, but it uses fewer cameras; this is the trade-off between coverage and cost.

D. Inter-task Handoff Operation

The key point of the handoff operation is the inter-task exchange between cameras while a mobile object is moving around. A mobile object may move from the covered area of one camera into the covered regions of several cameras. The handoff operation transfers the monitoring task for the mobile object from the current camera to a predicted camera when the object enters another camera's coverage area. The handoff operation works as follows.

Thanks to the event reports, the position and route that a mobile object has passed through are recorded in the sink. A prediction of the object's next direction and position is then made from the recorded history and its current position. When an event M is moving in the region belonging to camera X:

i. If the next predicted position is still in the region of the current monitoring camera X, nothing happens, and the system follows the original monitoring-region allocation scheme to decide the cameras' next operations.

ii. If the object enters the predicted coverage region of other cameras Y and Z, the system adds a virtual event to Y's and Z's event lists at the new position. A virtual event is treated as a real event and taken into account in the region-allocation process.

iii. If the prediction fails, the system can still quickly catch the moving object with the original detection method; the sink deletes the incorrect virtual events and reassigns the right camera to monitor the object.

IV. PERFORMANCE EVALUATION

The experiments evaluate the performance of the proposed system in the real world. To study the performance of our system, we conduct experiments on three different scenarios, as shown in Fig. 3. We obtained the experimental results through an actual deployment of MICAz motes [1] in the operational field, using the design method described in Section III. In this section, we analyze the system delay, the Quality of Surveillance (QoS), and the inter-task handoff operation.

A. Evaluation of System Delay

The system delay is a critical factor in the system analysis. In our system, the delay consists of several portions: data collection by the sensors, event positioning, information reporting, the monitoring-region allocation process, and the rotation time of the cameras.
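One of these portions, the monitoring-region allocation, is the greedy assignment of Section III-C. The following Python sketch illustrates it with hypothetical event sets loosely modeled on the Fig. 2 example; it is not the authors' implementation, and it assumes every camera has at least one event set.

```python
# Greedy camera-to-event-set assignment from Section III-C (sketch):
# repeatedly give a camera its event set with the most events, then
# remove those events from every other camera's sets.

def allocate_regions(camera_sets):
    """camera_sets: {camera: {set_id: set of event ids}}.
    Returns {camera: set_id}, each camera's chosen monitoring region."""
    remaining = {cam: {sid: set(ev) for sid, ev in sets.items()}
                 for cam, sets in camera_sets.items()}
    assignment = {}
    for _ in range(len(remaining)):
        # Pick the (camera, set) pair currently covering the most events
        # among the cameras that are still unassigned.
        cam, sid = max(
            ((c, s) for c, sets in remaining.items() if c not in assignment
             for s in sets),
            key=lambda cs: len(remaining[cs[0]][cs[1]]),
        )
        assignment[cam] = sid
        covered = remaining[cam][sid]
        # Delete the covered events from all other cameras' sets.
        for other, sets in remaining.items():
            if other != cam:
                for s in sets.values():
                    s -= covered
    return assignment

# Hypothetical overlap: event 9 is visible to both cameras, so once
# camera A takes set 6, event 9 no longer counts for camera B.
sets = {
    "A": {6: {1, 2, 3, 9}, 2: {4, 5}},  # set 6 has the most events
    "B": {1: {9, 10}, 2: {11}},
}
print(allocate_regions(sets))  # -> {'A': 6, 'B': 1}
```

Removing already-covered events before the next pick is what prevents two cameras from wasting their coverage on the same events.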

Figure 3. Three test scenarios: (a) 4 x 4 grid, (b) L shape, (c) U shape.

Although the absolute values may vary in different environments, we can still draw some general observations from the MICAz-based platform. We measure the system delay starting from an event occurrence and ending with the completion of the cameras' actions:

i. When an event occurs, the sensors collect event information for a fixed span, which we set to one second.

ii. At every fixed span, a sensor must determine whether it is the sensor nearest to the event. In our tests, a sensor needs at least 0.3 second to wait for the other event-position packets and confirm the correctness of an event position.

iii. Information reporting and sink processing take only a few milliseconds and can be ignored.

iv. The last part is the time spent on camera rotation. A single command moves our camera through a fixed arc, and each movement takes 0.11 second per rotation. In our system, a camera rotates 3.3135 times on average, so a camera operation takes about 0.36 second on average.

A complete operation, from an event occurrence to the event image being displayed on the GUI, takes 1.67 seconds in total. The system can track events at a relative speed of 2.2 km/hr, and the extreme speed is about 4.0 km/hr.

B. Quality of Surveillance (QoS)

In the second experiment, we test the quality of the surveillance system with randomly occurring events in the three scenarios shown in Fig. 3. The red point in each scenario is the position of the sink. In each scenario, the cameras are positioned to provide full coverage of the observed field, and some important regions, such as corners or the front gate, can be monitored by more than one camera to improve the monitoring quality.

To demonstrate the effectiveness of the monitoring-region allocation scheme, we gradually increase the number of randomly occurring events during the test. Fig. 4 shows that our system keeps a high monitored-event ratio in each scenario throughout the whole test. As the number of events increases, the system performance noticeably rises and then stays consistent. This also shows that our system performs better than the traditional system and is independent of the environment.

Figure 4. Average monitored event ratio vs. the number of events.

C. Evaluation of Handoff Operations

In this experiment, we evaluate the success ratio of the inter-task handoff in the scenario of Fig. 3(a). There are N events in the test field: one mobile object moving around and N-1 static events. If the WSN can sense the static events and the movement of the mobile object simultaneously, we call the handoff operation successful. We test two cases, one with handoff prediction and one without. Since the events are triggered by a 4 kHz sound, this experiment also tests the stability and correctness of event positioning in our system. Fig. 5 shows the average success ratio versus the number of events. As the number of events increases, the success ratio decreases but remains high. When the system predicts the future destination and informs the other cameras to prepare themselves in advance, the success ratio is higher than without handoff prediction.

Figure 5. Inter-task handoff success ratio (%) vs. the number of events N, with and without prediction.
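As a back-of-envelope check, the delay components reported in Section IV-A can be combined to reproduce the tracking-speed figure. The link between the one-meter positioning granularity (Section V) and the 2.2 km/hr bound is our reading of the paper, not something it spells out.

```python
# Reconstruction of the Section IV-A delay budget (sketch).

collect_s  = 1.0             # fixed data-collection span per event
confirm_s  = 0.3             # waiting for neighbors' event-position packets
rotation_s = 3.3135 * 0.11   # avg. rotations per camera * 0.11 s/rotation
total_s    = 1.67            # reported total, incl. reporting/GUI overhead

# The three measured components nearly account for the reported total;
# the remainder is the few-millisecond reporting/processing and display.
components = collect_s + confirm_s + rotation_s
print(round(components, 2))  # ~1.66 s

# Assuming an object must stay within one ~1 m position cell for one
# complete operation, the sustainable tracking speed is bounded by:
speed_kmh = 1.0 / total_s * 3.6
print(round(speed_kmh, 1))   # ~2.2 km/hr, matching the reported figure
```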

V. IMPLEMENTATION

The architecture described in Section III is implemented on the TinyOS platform. TinyOS is an event-driven system written in the NesC language, which is used on the MICA mote platform. The MICAz mote is an IEEE 802.15.4-compliant module that supports a 250 kbps data rate and has 128 kB of program memory and 4 kB of data memory. The network camera, a DCS-5300G by D-Link Inc., is a powerful surveillance device that connects to the network through an 802.11b/g interface. The DCS-5300G has pan and tilt functions that expand the viewing area to a wide 270-degree angle side-to-side and a 90-degree angle up and down.

The hand-held device that transmits the 4 kHz sound is a multi-function device [12], as shown in Fig. 7. Using audition as the localizer has some advantages. First, a microphone can receive sound emitted from all directions. Second, sound can be treated as a one-dimensional signal whose processing requires significantly fewer computational resources than image-based systems. We therefore chose a 4 kHz sound to trigger events in our system. The sensor board of each mote has a microphone and a tone detector that filters out the 4 kHz sound and generates a 1-bit digital output. We use this output, together with the sound strength, to position an event within a range of one meter.

An experimental prototype is built on a 4 x 4 grid sensing field, and the GUI on the sink is shown in Fig. 6. Two cameras are located near nodes No. 4 and No. 13, and a red node represents a detected event. The prototype has been tested at National Tsing Hua University.

Figure 6. GUI and Real Environment.

VI. CONCLUSIONS

The main contributions of this paper are the idea of integrating WSNs with video camera technology to provide better surveillance services, and the implementation of such a system. We also propose a method for monitoring-region assignment. With the inter-task handoff operation, our system can seamlessly monitor mobile objects. A WSN can sense many environmental events of interest, which enlarges the monitoring range of traditional surveillance systems. By using video cameras, the information returned by the WSN is no longer only vague environmental data; it can be analyzed together with real-time images, so that the computation effort and manpower needed are greatly decreased. The performance evaluation shows that the proposed system can be applied to different environments. Because it achieves the same function as a traditional surveillance system with fewer cameras and performs handoff operations smoothly among the cameras, the system can be very useful in many surveillance applications.

Figure 7. The Hand-held Device.

ACKNOWLEDGMENT

This work is co-sponsored by the Taiwan MoE ATU Program, by NSC grants 94-2752-E-007-003-PA, 95-2219-E-009-007, and 94-2219-E-007-009, and by MOEA under grant number 94-EC-17-A-04-S1-044.

REFERENCES

[1] Crossbow, MOTE-KIT2400, http://www.xbow.com.
[2] M. Demirbas, K. Y. Chow, and C. S. Wan, "Internet-Sensor Integration for Habitat Monitoring," Proceedings of the International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2006.
[3] J. Burrell, T. Brooke, and R. Beckwith, "Vineyard Computing: Sensor Networks in Agricultural Production," IEEE Pervasive Computing, 2004.
[4] X. Wang, W. Gu, S. Chellappan, K. Schosek, and D. Xuan, "Lifetime Optimization of Sensor Networks under Physical Attacks," IEEE International Conference on Communications (ICC), May 2005.
[5] V. A. Petrushin, G. Wei, O. Shakil, D. Roqueiro, and V. Gershman, "Multiple-Sensor Indoor Surveillance System," the 3rd Canadian Conference on Computer and Robot Vision, June 2006.
[6] L. Snidaro and G. L. Foresti, "A Multi-camera Approach to Sensor Evaluation in Video Surveillance," IEEE International Conference on Image Processing (ICIP), Volume 1, Sep. 2005.
[7] C. Micheloni, E. Salvador, F. Bigaran, and G. L. Foresti, "An Integrated Surveillance System for Outdoor Security," Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, Sep. 2005.
[8] V. Kumar, D. Rus, and S. Singh, "Robot and Sensor Networks for First Responders," IEEE Pervasive Computing, Volume 3, Issue 4, pp. 24-33, Oct.-Dec. 2004.
[9] Y.-C. Tseng, Y.-C. Wang, and K.-Y. Cheng, "An Integrated Mobile Surveillance and Wireless Sensor (iMouse) System and its Detection Delay Analysis," Proceedings of the 8th ACM International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), 2005.
[10] X. Ji, H. Zha, J. J. Metzner, and G. Kesidis, "Dynamic Cluster Structure for Object Detection and Tracking in Wireless Ad-hoc Sensor Networks," IEEE International Conference on Communications (ICC), 2004.
[11] A. S. Sekmen, M. Wilkes, and K. Kawamura, "An Application of Passive Human-Robot Interaction: Human Tracking Based on Attention Distraction," IEEE Transactions on Systems, Man, and Cybernetics, March 2002.
[12] P.-Y. Chen, W. T. Chen, C.-H. Wu, Y.-C. Tseng, and C.-F. Huang, "A Group Tour Guide System with RFIDs and Wireless Sensor Networks," Information Processing in Sensor Networks (IPSN), Boston, USA, April 2007.

