
2013 Seventh International Conference on Next Generation Mobile Apps, Services and Technologies

Road Hazard Detection and Sharing with Multimodal Sensor Analysis on Smartphones

Fatih Orhan, P. Erhan Eren


METU Informatics Institute, Information Systems Dept.,
METU Campus, Ankara, Turkey
e112903@metu.edu.tr, ereren@metu.edu.tr

Abstract— The sensing, computing and communicating capabilities of smart phones bring new possibilities for creating smart applications, including in-car mobile applications for smart cities. However, due to the dynamic nature of vehicles, many requirements such as sensor management, signal and image processing, or information sharing arise when developing a smart sensor-based in-car mobile application. Moreover, most in-car applications employ single-modal sensor analysis, which yields limited results. Using the advanced capabilities of smart phones, this study proposes a framework with built-in multimodal sensor analysis capability, enabling easy and rapid development of signal and image processing-based smart mobile applications. Within this framework, an abstraction for fast access to synchronized sensor readings, a plugin-based multimodal analysis interface for signal and image processing applications, and a toolset to connect to other users or servers for sharing the results are provided built-in. As part of this study, a sample mobile application is also developed to demonstrate the applicability of the framework. This application is used for detecting defects on the road, such as potholes and speed bumps, and it automatically extracts the video section and the image of the corresponding road segment containing the defect. Upon such critical hazard detection, the application instantly informs nearby users about the incident. A good detection rate for speed bumps is obtained in the performed tests, while the advantage of automatic image extraction based on the multimodal approach is also demonstrated.

Keywords-multimodal sensor analysis; mobile sensor; mobile GIS applications; signal processing; pervasive computing

I. INTRODUCTION

The convergence in mobile technologies has enabled smart phones to provide various uses due to their broad and advanced sensing, computing and communication capabilities. There are usually more than a dozen sensors on a regular smartphone, among which location, motion and camera sensors (as well as others) are commonly utilized to develop functional pervasive mobile applications. One of the emerging areas in smart applications is in-car mobile applications for smart cities, where systems assist drivers in order to increase their safety, comfort and economy, while also providing new entertainment options.

Regarding studies on road hazard detection, the "Pothole Patrol" [1] study by Eriksson et al., which aims to detect potholes on roads, is considered the first complete and valid study of sensor usage in mobile in-car applications. As part of a specially designed hardware setup, a sequence of filters is applied, including a speed filter using the location and speed values from a GPS sensor, and specifically a "Z-Peak" algorithm on the accelerometer values. Accordingly, potholes are detected as a result of these filters. Mohan et al. use sensors such as the accelerometer, GPS, GSM and microphone in their Nericell [2] study. They focus on the detection of potholes, speed bumps, abrupt decelerations, and the sound of a horn. For the pothole and speed bump detection, they apply the "z-sus" algorithm and obtain successful results. A similar study is conducted by Mednis et al. [3], where a smart phone application is developed for detecting potholes and extracting the profile of the road in real time. The accelerometer values collected from the mobile device are used within a time window in different methods, such as comparing against a threshold value (Z-thresh), taking the difference (Z-Diff) and taking the standard deviation (STDev(Z)). The results show that the Z-Diff algorithm yields the best performance. Astarita et al. develop a system to detect potholes and speed bumps on the road in their UNIquALroad [4] study. Their aim is to assess the quality of the road, and they utilize a threshold-based filter on the acceleration values, obtaining 90% successful bump detection and 65% successful pothole detection rates.

Besides the accelerometer sensor, cameras are also widely utilized by in-car mobile applications for the detection of obstacles on the road. Kyutoku et al. [5] develop a system to detect obstacles (such as a pedestrian, vehicle, pylon, box or ball) on the road by comparing the frames captured by the camera against frames from past recordings where there are no obstacles. First, the current frame of the camera is mapped to one of the past frames that were captured at the nearest position; then the road is registered in both frames and the differences between the two frames are extracted in order to detect the obstacles. The results show that the obstacles are successfully detected at a distance of less than 31 meters. Similarly, Naito et al. [6] also try to detect obstacles on the road by using the optical flow method. Koch et al. [7] develop a system which detects cracks and potholes on the road by using computer vision techniques. The system operates in offline mode on previously recorded videos, and performs the detection in two steps. First, a set of predefined shapes is searched for in video frames and the system decides whether there is an irregularity or not. Then, in the second step, the irregularity type, the location, and the shape are extracted from the image.

In brief, in-car mobile sensor applications generally make use of accelerometer, magnetometer, GPS, and/or camera sensors. However, most studies rely on a single sensor; hence only single-modal analysis is performed.
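The windowed accelerometer tests compared by Mednis et al. [3] are simple enough to sketch. The following Python fragment is our illustrative rendering; the function names and example thresholds are ours, not taken from [3]:

```python
import statistics

def z_thresh(window, thresh):
    """Flag an event if any vertical-axis sample exceeds a fixed threshold."""
    return any(abs(z) > thresh for z in window)

def z_diff(window, thresh):
    """Flag an event if the difference between consecutive samples exceeds a threshold."""
    return any(abs(b - a) > thresh for a, b in zip(window, window[1:]))

def z_stdev(window, thresh):
    """Flag an event if the standard deviation of the window exceeds a threshold."""
    return statistics.pstdev(window) > thresh
```

All three flag an event from a short window of vertical-axis samples; [3] reports Z-Diff as the best performer of the three.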

978-0-7695-5090-9/13 $26.00 © 2013 IEEE
DOI 10.1109/NGMAST.2013.19
The main reason to use a single sensor lies with the challenges related to the development of multimodal sensor analysis applications. These challenges may be listed as the selection of the hardware platform and its sensor drivers, the proper collection and synchronization of different sensors' values, the integration of diverse libraries for additional processing, and also tools that provide connectivity for sharing the results with other infrastructures and social networks.

Our study focuses on these challenges and aims to construct a framework which enables easy, fast and flexible implementation of functional smart in-car applications with high-level complex analysis, and provides various outputs, including visual outputs. The goal is to provide capabilities for real-time sensing, signal and multimedia processing, and sharing of information with other users and server applications. Hence, the proposed framework provides an abstraction for easy access to synchronized sensor readings, a plugin-based analysis interface, integration of signal and image processing libraries, and a toolset for information sharing.

With this framework, it is possible to implement complex algorithms and develop analysis methods on the collected and synchronized sensor values. Moreover, many problems which cannot be easily solved by the single-modal approach may also be addressed with multimodal analysis within this framework. In this sense, the framework supports the output of one sensor analysis being used as an input to another sensor analysis.

A sample mobile application is also developed in this study, in order to demonstrate the capabilities and the feasibility of the developed framework. This application is a multimodal sensor analyzer which detects obstacles on the road, or critical road surface anomalies (i.e., potholes and speed bumps), extracts the video segment or photo of the road section containing the hazard ("Fig. 1") and sends the hazard information to a central repository which distributes the event to nearby users, in real time.

Figure 1. Sample application use-case

The system operates in two steps: first, the hazardous road segment (pothole, speed bump) is detected by analyzing the accelerometer and the vehicle's velocity (from the GPS sensor) values. In the second step, the output of the first step is utilized, and the visual imagery of the scene is extracted as a video segment as well as an image, and prepared for further image processing analysis (the image processing is not performed in the scope of this study). Once a hazard is detected, the system has the capability to inform nearby drivers (as a geographic position) in order to prevent accidents or undesirable events, by providing the hazardous event detail as well as a screenshot image. Experiments are conducted within the study to measure the performance of the developed application, and the results are exhibited.

Hence, the contributions of this study may be listed as: 1) a multimodal sensor framework, and 2) a sample application demonstrating the benefits of multimodality and the serviceability of information sharing within a connected world.

Section II includes the developed framework. In Sections III and IV, the sample implementation utilizing the framework and the experiment results are presented, respectively. Finally, the conclusion is given in Section V.

II. PROPOSED FRAMEWORK

The goal of this study is to develop a mobile multimodal sensor analysis framework which provides real-time sensing, multimedia processing and communication capabilities. Although various smart mobile applications may be developed with this framework, the core features to be provided are selected as real-time synchronized sensing, plugin-based analysis interfaces, signal/image processing capabilities and information sharing. The framework is developed on the Android platform (http://developer.android.com/index.html) due to its flexible and open architecture. The framework contains three main components: "Sensing Component", "Analysis Component" and "Sharing Component" ("Fig. 2").

Figure 2. Multimodal sensor analysis framework architecture

A. Sensing Component

The sensing component provides an abstraction on sensor devices, and thus facilitates access to raw sensor values. Using this component, the developed application is freed from dealing with the details of sensor device registration, device malfunctions, or erroneous data. Most importantly, this component enriches the sensor values by synchronizing them, and also provides the developer the raw, aggregated and/or sampled values of the sensors via simple interfaces, hence leading to easy and accurate access to different sensor values when required as part of the multimodal analysis. Another feature of the sensing component is to reorient accelerometer values prior to dispatching them (if the developer selects it), in order to get sensor readings already oriented to the world coordinate system when needed.

To accomplish all of these, each sensor reading is performed in a separate process, and all the sensors are
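As an illustration of what the sensing component's synchronization enables, the sketch below (plain Python, not the framework's actual Android API) keeps time-stamped readings per sensor and returns the value of any sensor at a queried instant:

```python
import bisect

class SensorBuffer:
    """Per-sensor store of globally time-stamped readings that can answer
    'what was sensor X reading at time t?' queries."""

    def __init__(self):
        self._series = {}  # sensor name -> ([timestamps], [values])

    def push(self, sensor, timestamp, value):
        ts, vals = self._series.setdefault(sensor, ([], []))
        ts.append(timestamp)  # readings arrive from the OS in time order
        vals.append(value)

    def value_at(self, sensor, t):
        """Latest reading of `sensor` at or before time t."""
        ts, vals = self._series[sensor]
        i = bisect.bisect_right(ts, t) - 1
        return vals[max(i, 0)]
```

A multimodal analyzer can then ask, for example, for the GPS speed at the exact timestamp of an accelerometer spike.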

first globally time-stamped as they are read from the underlying operating system. Thus, it is possible to extract the exact values of different sensors at any specific time when needed. Within the separate threads, the aggregation, sampling and reorientation operations are also performed. Moreover, a logger interface is provided within the framework in order to easily log sensor values to a file. Thus, the sensing component includes an interface defined as "ISensorListener" to accomplish all of these tasks, and as the developer implements this interface, the sensor values are dispatched to the implementation.

For the camera sensor, the framework provides the interface "ICameraPreviewListener". With this interface, it is possible to start/stop video recording, get the video file after recording, and split the file into segments when start and end times are specified. As an alternative to providing the full video, the framework is also able to provide the camera input frame by frame, thus enabling processing of images in real time. Frames may be provided as byte arrays or bitmap images.

B. Analysis Component

This component mainly provides interfaces that are compatible with the output of the Sensing Component. Thus, the values collected from the sensors are utilized through these interfaces, and the analysis modules are easily developed.

The signal processing part of the analysis component provides an abstract interface, "ASensorAnalyzer", which implements the "ISensorListener" interface and provides easy and effective analysis of collected sensor values. The interfaces are flexible and easy to implement. Moreover, the analysis component incorporates the use of image processing libraries such as OpenCV (http://opencv.willowgarage.com).

The sensing component provides the frames captured from the camera, and the developer may directly use the OpenCV methods to apply image processing techniques on these captured frames within the implementation of the abstract class. Moreover, the infrastructure is integrated with the FFmpeg library (http://ffmpeg.org/), a software library for manipulating multimedia resources, and the developer may also utilize its functionality directly, without any further integration.

C. Sharing Component

The applications developed with the framework may easily be connected to a central application or communicate directly with social networks. Thus, it is possible to save all the records in a central repository for further processing, and at the same time inform other users about the outcomes of the sensor analysis, as well as get information from other users. This component is based on the client-server architecture and provides the following features:
• A web service interface from the server application
• An advanced upload service
• A push-based messaging service
• Social networks integration

Due to the mobile and dynamic environment, the use of a push mechanism is strongly recommended for mobile applications. The framework provides this mechanism as a reusable component. Using this mechanism, the server may efficiently communicate with the mobile devices, using minimum resources.

III. SAMPLE APPLICATION OF THE FRAMEWORK

The main goal of the sample implementation is to develop a multimodal sensor analysis mobile application in order to demonstrate the applicability of the framework and its advantages. The first sensor analysis is conducted on the accelerometer and GPS data, and the second analysis is conducted, based on the result of the first analysis, on the video file captured by the camera. This way, instead of performing image processing on each frame, the application uses the result of the first analysis to decide which part of the sensor values is to be utilized in the second analysis.

The mobile application is designed as an in-car smart application, to be used while on the road by attaching the mobile device to the front windshield. The main goal is to automatically detect hazardous events on the road while driving, by analyzing the variations in accelerometer values and capturing the imagery of the scene upon the detection of a hazard. The application also gets warnings from the system (via other drivers) and informs nearby drivers about newly captured events ("Fig. 3"). This information includes the exact location, the severity of the hazard, the detected obstacle type (where possible), and an image or video recording of the incident encountered.

Figure 3. Mobile Application Usage Scenario

A. Mobile Application

Mounting the mobile device correctly in the vehicle is a very important step in order to obtain valid sensor values. The device should be tightly fixed to the vehicle in order to accurately record the motion of the vehicle. For this reason, it is a good idea to use a docking station or a mobile device holder (as in "Fig. 4(a)") attached to the front window of the vehicle, with the mobile device fastened tightly to the holder. This is necessary because otherwise the rattling of the mobile device produces incoherent accelerometer values generated by the motion of the device itself, leading to erroneous results.

However, fixing the mobile device is not adequate to directly process sensor values, since the mobile device's coordinate system will not usually be aligned with the
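The plugin-based chaining described above, where one analyzer's output drives processing in another modality, can be sketched as follows. The class and method names here are illustrative stand-ins for the framework's "ASensorAnalyzer"-style interfaces, not the real API:

```python
class SensorAnalyzer:
    """Stand-in for the framework's plugin analyzer interface:
    receives sensor values, emits results to registered listeners."""
    def __init__(self):
        self.listeners = []

    def emit(self, result):
        for listener in self.listeners:
            listener.on_result(result)

    def on_sensor_values(self, values):
        raise NotImplementedError

class BumpDetector(SensorAnalyzer):
    """First modality: flags large vertical accelerations."""
    def __init__(self, thresh):
        super().__init__()
        self.thresh = thresh

    def on_sensor_values(self, values):
        t, z = values
        if abs(z) > self.thresh:
            self.emit({"time": t, "type": "bump"})

class VideoClipper:
    """Second modality: only touches the video around detections,
    instead of processing every frame."""
    def __init__(self):
        self.clips = []

    def on_result(self, event):
        # keep the 2 seconds of video leading up to the detection
        self.clips.append((event["time"] - 2.0, event["time"]))

detector = BumpDetector(thresh=2.0)
clipper = VideoClipper()
detector.listeners.append(clipper)
for t, z in [(0.0, 0.1), (1.0, 3.2), (2.0, 0.2)]:
    detector.on_sensor_values((t, z))
```

Only the spike at t = 1.0 produces a clip request, which is the efficiency argument made in Section III.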

vehicle's coordinate system. For this reason, it is necessary to reorient the mobile device's coordinate system. The reorientation operation is performed by the framework itself, by applying Euler angles. The XYZ rotation sequence is applied to the accelerometer values, as explained in [4].

Figure 4. (a) Mobile application usage in car (b) Re-oriented accelerometer axes

This way, the Z axis always shows the motion relative to the gravity direction (Fig. 4(b)), the Y axis the forward and backward motion, and the X axis the side motions, according to the world coordinate system. For all incident detection algorithms, reoriented accelerometer values are utilized.

B. Road Obstacle Detection

Manual analysis of the accelerometer values shows that each type of event has its own unique characteristic in terms of how the accelerometer values change over time, as shown in "Fig. 5". This study focuses on two types of analysis: pothole and bump detection. Different sensor analyses are implemented on the accelerometer values for each event type.

Figure 5. Accelerometer values for the car's different motions: (a) smooth road (b) pothole (c) speed bump (d) abrupt deceleration

The pothole and speed bump detection is performed within a time window of sensor values. First, the velocity is acquired from the GPS sensor and filtered for values between 10 km/h (vmin) and 60 km/h (vmax). Then the time window is calculated (as inversely correlated) according to the velocity, in equation (1). C is a coefficient that is defined based on the vehicle characteristics, and Wmax is the maximum upper bound of the window. Wmax is a pre-calculated value found to be "36" (as a result of the calculation for the minimum speed of 10 km/h of a vehicle passing an average 2 m long speed bump with an accelerometer collection frequency of 40 Hz).

(1)

The time window usage is necessary because the same hazard produces a different number of samples at different crossing speeds. Thus, for correct detection, only the samples belonging to the hazard should be analyzed.

The accelerometer values are collected within a time window (W) which is defined according to the instant velocity (v) of the vehicle, and the largest difference (z-diff) along the Z axis values is calculated within that time window. For pothole detection, if the z-diff value exceeds a predefined threshold (Tb), the detection is considered successful (equation (2)).

(2)

For speed bump detection, the sensor value changes in "Fig. 5(c)" are analyzed. According to this pattern, the sensor values exceed several thresholds sequentially, with a minimum interval between sequences. As depicted in Fig. 6, first exceeding a maximum threshold (z-max1), then a minimum threshold (z-min1), then a second maximum threshold (z-max2), and lastly another minimum threshold (z-min2) indicates a crossing of the vehicle over a speed bump.

Figure 6. Minimum duration calculation in speed bump detection

The thresholds z-max and z-min are calculated according to equations (3), (4), and (5). First, z-avg is calculated as the average of the z values within the time window. Then a deviation value is computed according to the velocity (v) of the vehicle multiplied by a coefficient defined according to the vehicle properties. Then z-max and z-min are calculated according to equations (4) and (5), respectively.
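Since the equation images are not reproduced in this text, the following Python sketch shows one plausible reading of the detection logic: a velocity-dependent window (equation (1) is assumed here to have the form W = min(C/v, Wmax)), the z-diff pothole test of equation (2), and the four-threshold bump sequence. The coefficients and thresholds are placeholders, not the published values:

```python
def window_size(v_kmh, C=400.0, w_max=36):
    """Velocity-dependent sample window, inversely correlated with speed.
    W = min(C/v, Wmax) is an assumed form of equation (1); C and Wmax
    are vehicle-specific placeholders."""
    return int(min(C / v_kmh, w_max))

def detect_pothole(z_window, t_b):
    """Equation (2): largest consecutive difference along Z vs threshold Tb."""
    z_diff = max(abs(b - a) for a, b in zip(z_window, z_window[1:]))
    return z_diff > t_b

def detect_bump(samples, z_max_t, z_min_t, min_gap):
    """State machine over the z-max1, z-min1, z-max2, z-min2 sequence,
    requiring a minimum time gap between the two maxima."""
    state, t_max1 = 0, None
    for t, z in samples:  # samples: list of (time, reoriented z)
        if state == 0 and z > z_max_t:
            state, t_max1 = 1, t          # z-max1 exceeded
        elif state == 1 and z < z_min_t:
            state = 2                     # z-min1 exceeded
        elif state == 2 and z > z_max_t and t - t_max1 >= min_gap:
            state = 3                     # z-max2, far enough from z-max1
        elif state == 3 and z < z_min_t:
            return True                   # z-min2: full bump pattern seen
    return False
```

The `min_gap` argument plays the role of the interval check between z-max1 and z-max2 described in the text.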

(3)

(4)

(5)

The interval (∆t) between z-max1 and z-max2 is very important for the detection of a speed bump, since a small interval value is an indication of other kinds of irregularities (such as a pothole, or some other sudden change of the road surface), and a too-large interval value is a probable indication of two consecutive potholes. Thus, the time interval (∆t) is compared against the threshold value T, and the bump is detected according to equation (6).

(6)

An important point in bump detection is the fact that an increase in the vehicle's velocity generates large fluctuations in the values of the accelerometer sensor. For this reason, the threshold values Tmax and Tmin used in the detection of hazards are adjusted according to the velocity of the vehicle. Thus the sequence of operations performed in speed bump detection is as follows: 1) discard sensor values at too low or too high velocities, 2) determine the thresholds based on time and the velocity of the vehicle, 3) detect, twice consecutively, the exceeding of the thresholds Tmax and Tmin, and 4) ensure that a period of time passes between z-max1 and z-max2.

C. Image Extraction

A distinctive feature of this study is the implementation of the multimodal analysis approach. The study enables the analysis of sensor values in one modality based on the results of sensor values of a different modality. With this approach, the video, which is recorded simultaneously while detecting the hazards, is analyzed, and the "important" parts of the video, which contain the hazard scenes, are extracted by using the detection results of the analysis on the accelerometer sensor values.

Since image processing is a time- and resource-consuming operation, instead of analyzing the whole video from start to end, only the video parts which have to be focused on, and which are likely to contain a hazard, are extracted through the multimodal analysis. The video parts and frames are extracted with this method and prepared for further processing (image processing on the extracted frames is not conducted as a part of this study).

For the video and frame extraction, the following method is applied: for a detected road hazard, the video that has been recorded is analyzed, and a video section starting from 10 meters ahead of the hazard until the hazard, as well as an image from 5 meters ahead of the hazard, is extracted automatically ("Fig. 7"). In order to achieve this, the exact video location which matches the required distances before the hazardous road segment is identified.

For the image extraction, we have selected the best distance for a good image before the speed bump as 5 meters. Assuming X0 = 5 m, the number of sensor values (n) that should be considered backwards to span that distance is found in equation (7). After finding the number of sensor values to be read backwards, the total time passed (∆Td) in reading n sensor values is calculated in equation (8).

(7)

(8)

Since the sensor values and the video are synchronized, the correct location of tshot for the image to be extracted from the video is calculated according to equation (9).

(9)

Figure 7. Image of nine different detected speed bumps, extracted automatically

D. Sharing with Other Drivers

Using the Sharing Component of the framework, the application is able to warn nearby users about a hazard on the road. Hence, upon the detection of a road hazard, the application immediately creates an event which consists of the time, location, hazard type and hazard severity data. This event is sent to a central server (if the device has a connection). The server then runs a quick geo-location search to locate the active drivers near the event and sends a notification about the event to these drivers, using the push mechanism.

Once the event is received by a vehicle, the application checks whether that vehicle's motion is headed towards the event or not. Here, we simply check the direction between the event location and the vehicle's motion direction. If the direction is towards the event, the driver is warned about it when the distance is less than a threshold (again calculated according to the velocity of the vehicle). The warning may be presented by visual or auditory alerts, based on the user preferences.

IV. EXPERIMENT RESULTS

In order to evaluate the performance of the mobile application, a route on our university campus is selected and several driving tests are conducted on this route.
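The per-drive figures reported in Table I follow directly from the detection counts; as a small sketch (noting that the ground-truth count equals true positives plus false negatives):

```python
def recall_precision(ground_truth, true_pos, false_pos):
    """Recall = TP / (TP + FN) = TP / ground truth;
    precision = TP / (TP + FP). Rounded to whole percent."""
    recall = 100.0 * true_pos / ground_truth
    precision = 100.0 * true_pos / (true_pos + false_pos)
    return round(recall), round(precision)

# Row 1 of Table I: 11 ground truth, 8 TP, 3 FP -> 73% recall, 73% precision
```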

In order to build the ground truth for potholes and speed bumps, the correct locations are marked manually before testing, by travelling the entire route. The results are collected from the test runs and compared with the ground truth values, and they are evaluated in terms of recall and precision.

TABLE I. EVALUATION RESULTS
Speed Bump Counts and Recall and Precision Results

Test No   Ground Truth   True Pos.   False Pos.   False Neg.   Recall (%)   Precision (%)
1         11             8           3            3            73           73
2         12             10          3            2            83           77
3         10             9           1            1            90           90
4         11             7           0            4            64           100
5         11             9           2            2            82           82
6         11             8           0            3            73           100
7         9              7           1            2            78           88
8         4              3           0            1            75           100
9         11             11          4            0            100          73
10        11             10          2            1            91           83
11        10             8           1            2            80           89
12        11             9           0            2            82           100
13        11             9           0            2            82           100
14        11             10          0            1            91           100
15        9              7           0            2            78           100
16        10             9           0            1            90           100
17        12             10          0            2            83           100
18        11             10          0            1            91           100
19        11             9           0            2            82           100
20        13             11          0            2            85           100
21        16             13          2            3            81           87
22        10             7           0            3            70           100
23        12             9           0            3            75           100
24        12             11          1            1            92           92
25        11             9           0            2            82           100
AVG       10.84          8.92        0.80         1.92         82           93

Test drive durations are 20 minutes on average, and a total of 25 drives are conducted within the test. The results show a recall value of 82% and a precision of 93% (Table I) for successful bump detection. A detailed manual analysis of the results is also conducted, and two speed bumps are identified which the algorithms mostly failed to detect, resulting in false negatives. Upon manual inspection of these two speed bumps, it has been observed that they are physically different in shape compared to the others, such that one side of the speed bump slope was lower than that of a regular bump; thus they produce smaller variations in accelerometer values than expected. For this reason, the algorithms fail to detect the first threshold exceeding, and hence fail to detect the speed bump at all. This shows that the algorithms are too sensitive and need to be adapted to different conditions and different shapes of obstacles.

V. CONCLUSION

In this study, a multimodal sensor analysis framework is developed for in-car mobile applications, for performing analysis on real-time synchronized sensor values. The mobile application developed within this framework detects the potholes and speed bumps on the road, and successfully extracts the image and video section corresponding to this road segment automatically. This extracted section is prepared for further image processing. The detection is shared over a central application with other drivers, which provides quick awareness of traffic events.

With the methods employed in the study, the advantages of multimodal analysis over single-modal analysis are exhibited. As future work, it is planned to increase the robustness of the algorithms, to focus on more complex analysis using different sensors (such as the microphone, magnetometer, or proximity sensor), and to explore social aspects of utilizing the outcomes of the sensor analysis for smart city applications.

REFERENCES

[1] J. Eriksson, L. Girod, B. Hull, R. Newton, S. Madden, and H. Balakrishnan, "The Pothole Patrol: using a mobile sensor network for road surface monitoring," in Proc. MobiSys '08, pp. 29-39, 2008.
[2] P. Mohan, V. N. Padmanabhan, and R. Ramjee, "Nericell: using mobile smartphones for rich monitoring of road and traffic conditions," in Proc. 6th ACM Conf. on Embedded Network Sensor Systems (SenSys '08), New York, NY, USA: ACM, pp. 357-358, 2008.
[3] A. Mednis, G. Strazdins, R. Zviedris, G. Kanonirs, and L. Selavo, "Real time pothole detection using Android smartphones with accelerometers," in Proc. Int. Conf. on Distributed Computing in Sensor Systems and Workshops, 2011.
[4] V. Astarita, M. V. Caruso, G. Danieli, D. C. Festa, V. P. Giofrè, T. Iuele, and R. Vaiana, "A mobile application for road surface quality control: UNIquALroad," Procedia - Social and Behavioral Sciences, vol. 54, 2012.
[5] H. Kyutoku, D. Deguchi, T. Takahashi, Y. Mekada, I. Ide, and H. Murase, "On-road obstacle detection by comparing present and past in-vehicle camera images," in Proc. 12th IAPR Conf. on Machine Vision Applications, pp. 357-360, 2011.
[6] T. Naito, "The obstacle detection method using optical flow estimation at the edge image," in Proc. IEEE Intelligent Vehicles Symposium, pp. 817-822, 2007.
[7] C. Koch and I. Brilakis, "Pothole detection in asphalt pavement images," Advanced Engineering Informatics, vol. 25, no. 3, pp. 507-515, 2011.

