Professional Documents
Culture Documents
Intelligent Pothole Detection and Road Condition A
Intelligent Pothole Detection and Road Condition A
Intelligent Pothole Detection and Road Condition A
net/publication/320296987
CITATIONS READS
8 2,209
4 authors, including:
Umang Bhatt
Carnegie Mellon University
12 PUBLICATIONS 10 CITATIONS
SEE PROFILE
All content following this page was uploaded by Umang Bhatt on 28 November 2017.
J. Zico Kolter
Carnegie Mellon University
Pittsburgh, PA
arXiv:1710.02595v2 [cs.CY] 10 Oct 2017
zkolter@cs.cmu.edu
2. METHODOLOGY
Before classifying potholes and road conditions, we had to
collect a considerable amount of training data. We built a
system for collecting and labeling this data via two separate
iPhone apps. Then, we applied various transformations on
the raw sensor data to get a better signal for classification.
2.1 Specifications
All of the data was collected on a 2007 Toyota Prius with
approximately 100,000 miles. Both smartphones used for
data collection were iPhone 6Ss. One iPhone was used for
collecting sensor data while the other for recording potholes.
An iPhone suction-cup mount was used to place the iPhone
collecting sensor data on the center of the windshield.
For road condition classification, we decided to use an inter- 4. RESULTS AND ANALYSIS
val of 25 data points (5 seconds). We believed that 5 seconds After combining data from all of our data collection trips
was ample time to assess a small stretch of a road and clas- and generating intervals and aggregate metrics, we trained
sify it as good or bad. After creating the 5-second intervals several classifiers for both of the classification tasks.
and aggregate metrics, we attached the corresponding labels
of good road (0) and bad road(1). 4.1 Road Condition Classification
The road condition dataset was used to train and evaluate
For pothole classification, we used an interval of 10 data several classifiers, including support vector machine (SVM),
points (2 seconds). Since potholes are sudden events, we logistic regression, random forest, and gradient boosting.
hypothesized that a shorter interval would be able to cap- The best results of each classifier can be found in the Ap-
ture them more accurately. For each interval, we attached pendix: Table 1. We tuned some of the parameters and
Figure 4: Accelerometer readings (centered) for Figure 5: Standard Deviation of accelerometer read-
good vs. bad road ings for good vs. bad roads
hyperparameters for each classifier to get the best test set Like in road conditions classification, the best performing
accuracy. classifiers were SVM and gradient boosting, with accuracies
of 92.9% and 92.02% respectively. These accuracies were
Overall, an SVM with a radial basis function (RBF) kernel somewhat better than the base rate of 89.8% from the base-
and gradient boosting both achieved the highest test accu- line classifier, so at least the classifiers were useful.
racy of 93.4%. A baseline model that predicted ”good road”
for all instances would have achieved 82% accuracy. SVM The SVM model performed the best in terms of accuracy,
and gradient boosting beat the base rate in this problem but we wanted to improve its precision-recall tradeoff. The
precision-recall curve in Figure 9 illustrates all the combi-
Figure 8 illustrates the selection of the regularization pa- nations of precision and recall values for different thresholds
rameter C for the SVM classifier. Note that the training on the SVM decision function. The red point in the figure
and test error are fairly close to each other at the chosen represents the threshold we chose which gives us a precision
parameter value C=250, indicating that the model is per- of 0.78 and a recall of 0.42.
forming well. However, more data and useful features could
be helpful in further lowering this gap between the training This choice is a good tradeoff between correctly flagging pot-
and test error. holes (high precision) and detecting all true potholes (high
recall). In this context, a precision of 0.78 means that when
our model classifies an interval as having potholes, 78% of
those intervals actually have potholes. A recall of 0.42 means
4.2 Pothole Classification that our model correctly classifies 42% of the true pothole
Since potholes are rare events, there was a large class imbal- intervals. Notably, the accuracy of the SVM model stayed at
ance in our dataset. Even a naive model that always pre- 92.9% even though we changed the classification threshold
dicted ”non-pothole” for a new interval would achieve 89.8% to improve the precision-recall tradeoff.
accuracy on the classification task. So, in this problem, it
was more important to optimize the precision-recall tradeoff
than to focus on accuracy alone. 5. DISCUSSION
Figure 6: PCA of features colored by road condi- Figure 9: Precision-recall curve for the SVM clas-
tions sifier. The red point is the precision-recall tradeoff
we chose.
6. FUTURE WORK
There are many extension points from this initial data explo-
ration and classification project. Below are a few examples
of possible future work.
Figure 11: Road condition map of Pittsburgh, PA • Expanding the route to collect training data on addi-
showing classification results from the app tional roads could only help by decreasing the variance
of the models.
5.4 Failures
While performing this data exploration and analysis, we ran • Building a device to solely capture accelerometer and
into many bumps (no pun intended) and had to pivot our gyroscopic data (though this does violate the afore-
approach and methodologies. Below is a collection of the mentioned invasiveness) would allow for less variance
failures we had to overcome in order to produce safe and due to confounding variables.
sound results for this project.
• Working to control other confounding variables, like
sudden changes in acceleration or mere braking, would
• From the inception of this project, we intended on make the features of the classifiers more robust.
• Calculating road condition scores (from 1-10, per se)
would help extend this project beyond binary classifi-
cation. These scores can then be mapped onto a given
city/route to denote the conditions of roads compara-
tive to other roads featured on the given map.
7. ACKNOWLEDGMENTS
In addition to continual guidance and encouragement from
Professor Zico Kolter of Carnegie Mellon University’s School
of Computer Science, we are also especially grateful to Pro-
fessor Christoph Mertz for his insight to use gyroscope and
accelerometer data to classify potholes. We would also like
to thank Professor Roy Maxion, Professor Max G’sell, and
Professor David O’Hallaron for their patient and thoughtful
advice.
8. REFERENCES
[1] R. Z. G. K. A. Mednis, G. Strazdins and L. Selavo.
Real time pothole detection using android smartphones
with accelerometers. International Conference on
Distributed Computing in Sensor Systems and
Workshops (DCOSS), pages 1–6, 2011.
[2] Economist. The hole story. Economist, 2016.
[3] Eriksson and Jakob. The pothole patrol: using a mobile
sensor network for road surface monitoring.
nternational conference on Mobile systems,
applications, and services, 6, 2008.
[4] A. Halsey. Bad highway design, conditions contribute
to half of fatal auto crashes in u.s. The Washington
Post, 2009.
[5] V. N. P. Mohan, Prashanth and R. Ramjee. Nericell:
rich monitoring of road and traffic conditions using
mobile smartphones. ACM conference on embedded
network sensor systems, 6, 2008.