Machine Learning For Crack Detection Review An
Abstract: With the advancement of machine learning (ML) and deep learning (DL), there is a great opportunity to enhance the development
of automatic crack detection algorithms. In this paper, the authors organize and provide up-to-date information on ML-based crack detection
algorithms for researchers to more efficiently seek potential focus and direction. The authors first reviewed 68 ML-based crack
detection methods to identify the current trend of development, pixel-level crack segmentation. The authors then conducted a performance
evaluation on 8 ML-based crack segmentation models using consistent evaluation metrics and three-dimensional (3D) pavement images with
diverse conditions to identify remaining challenges and potential directions for future development. Based on the comparison results, deeper
backbone networks in FCN models and skip connections in U-Net both improved the performance. Within different categories of pavement
images, except for the Other Distress category, FCN and U-Net scored over 90 on the enhanced Hausdorff distance metric. Results showed
that solving the false-positive problem is an important step in further improving ML-based crack detection models. DOI: 10.1061/(ASCE)
CP.1943-5487.0000918. © 2020 American Society of Civil Engineers.
Traditional ML Methods
Traditional ML methods use techniques other than deep learning. To deal with large quantities of nonstructured image data, a predefined feature extraction stage is required before training the models. Beginning almost 30 years ago (Kaseko and Ritchie 1993), research in this category has been continuously conducted.
The two most popular techniques in traditional ML-based methods are support vector machine (Li et al. 2009; Gavilán et al. 2011; Moussa and Hussain 2011; O’Byrne et al. 2013; Daniel and Preeja 2014; Fujita et al. 2017; Wang et al. 2017b; Chen et al. 2017) and artificial neural network (Kaseko and Ritchie 1993; Chou et al. 1994; Cheng et al. 2001; Liu et al. 2002; Lee and Lee 2004; Moon and Kim 2011; Hoang 2018; Wang et al. 2018). Other traditional ML techniques, such as random forest (Shi et al. 2016), are also used. These techniques are mainly involved in two tasks: crack detection and crack type classification, and ing of optimal parameters for non-ML crack detection [e.g., threshold values (Cheng et al. 2001)]. Image-processing steps are required to perform predefined feature extraction, and various features have been used in different research efforts, such as statistical values of images, vertical and horizontal projections of feature maps, and properties of defined crack objects.
The main problem with traditional ML methods is that they contain only shallow learning techniques. Without learning higher-level features, those techniques are not able to deal with the complex information contained in the images. For example, the background of pavement images is largely affected by illumination and the environment.
DL Methods
DL methods use multilayer neural networks, which can automatically learn features from the data. In computer vision tasks, convolutional neural networks (CNNs), a specific DL model with convolution layers, have been widely used. DL methods appeared
Classification
The goal of classification is to determine whether an image/image patch contains cracks and, if so, to classify them. DL is in an early stage of use in crack classification, having been developed in 2016 (Zhang et al. 2016; Schmugge et al. 2016), but it became the most popular DL method in 2017. CNN models with fully connected (FC) layers are used, and lightweight models with fewer than 10 layers are usually used because the task is considered simple binary classification.
A few researchers have performed image-level classification (Ma et al. 2017; Gopalakrishnan et al. 2017, 2018; Xu et al. 2019) in which the whole image from the data set is classified. On the other hand, patch-level classification (Schmugge et al. 2016; Zhang et al. 2016; Cha et al. 2017; Chen and Jahanshahi 2017; Eisenbach et al. 2017; Feng et al. 2017; Pauly et al. 2017; Wang et al. 2017a; Wang and Hu 2017; Yokoyama and Matsumoto 2017; Yusof et al. 2018; Dorafshan et al. 2018; Kim and Cho 2018; Li et al. 2020; Nguyen et al. 2018; Zhang et al. 2018b; Chen et al. 2019; Li and Zhao 2019; Park et al. 2019), which separates each image into small image patches, is usually performed because it has two advantages. First, more data can be generated by dividing images into small patches. Second, by classifying each small patch of the original image, localization information of cracks in the original image can be obtained [Fig. 1(b)]. Thus, the results of patch-level classification can be further used in crack type classification.
For research and engineering purposes, important features about cracks, such as length, width, and branches, are needed. Although patch-level classification can generate localization information, the results are coarse and blocky and so cannot be used to estimate crack features. Therefore, more recent studies have focused on pixel-wise prediction.
Object Detection
Object detection tasks (Cha et al. 2017; Carr et al. 2018; Cheng and Wang 2018; Maeda et al. 2018; Mandal et al. 2018; Nie and Wang 2018; Xue and Li 2018) generate bounding boxes around areas that
Fig. 1. (a) Input image of crack detection models; (b) classification: image patches classified as crack or noncrack; (c) object detection: bounding
boxes generated around areas that contain cracks; and (d) segmentation: pixels classified as crack or noncrack.
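The patch-level classification shown in Fig. 1(b) can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed papers: the patch size, the function names, and the `toy_classifier` rule (a stand-in for a trained CNN) are all hypothetical and chosen only to show how per-patch labels produce a coarse, blocky localization map.

```python
def split_into_patches(image, size):
    """Yield (row, col, patch) for non-overlapping size x size patches.

    Edge remainders that do not fill a whole patch are dropped (an assumption).
    """
    height, width = len(image), len(image[0])
    for r in range(0, height - size + 1, size):
        for c in range(0, width - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def toy_classifier(patch, dark=50):
    """Hypothetical stand-in for a CNN: flag a patch if it has any dark pixel."""
    return any(value < dark for row in patch for value in row)


def coarse_crack_mask(image, size, classifier=toy_classifier):
    """Paint every pixel of a crack-labeled patch as 1, all others as 0."""
    mask = [[0] * len(image[0]) for _ in image]
    for r, c, patch in split_into_patches(image, size):
        if classifier(patch):
            for rr in range(r, r + size):
                for cc in range(c, c + size):
                    mask[rr][cc] = 1
    return mask


# A 4 x 4 grayscale image with one dark (crack-like) pixel in the top-right patch.
image = [
    [200, 200, 200, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
mask = coarse_crack_mask(image, size=2)
```

The whole 2 x 2 patch containing the dark pixel is marked as crack, which shows why patch-level results are too coarse to estimate crack width.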
Segmentation generates a pixel-wise prediction of cracks in the images; in other words, each pixel is classified as crack or noncrack [Fig. 1(d)]. The precise crack location and structure generated by crack segmentation can be used both to classify crack type and to obtain important crack features. Also, because of advances in sensing technologies, 2D and 3D data with higher resolution can be obtained. For these reasons, crack segmentation has become the current trend in DL crack detection.
DL crack segmentation can be performed via five different approaches. The most popular is to use and modify encoder-decoder-based models (Schmugge et al. 2017; Huang et al. 2018; Jenkins et al. 2018; Ji et al. 2018; Yang et al. 2018; Zou et al. 2018; Bang et al. 2019; Dung et al. 2019; Liu et al. 2019; Tabernik et al. 2020; Yang et al. 2019), such as FCN (Long et al. 2015) and SegNet (Badrinarayanan et al. 2017). For general semantic segmentation
Machine Learning Based Crack Detection
sarial networks (GANs) have been used to perform self-supervised structure learning to achieve crack segmentation (Zhang et al. 2020).
Overall Trend in ML Crack Detection
Fig. 2 shows the distribution of published papers each year. As shown in Fig. 2(a), traditional ML methods date back to 1993. Research on traditional ML methods has been continuously conducted, but publications have increased slightly in the past decade. DL-based methods appeared in 2016, and the number of publications on them has grown rapidly. This shows that DL-based methods have proven effective in crack detection.
Fig. 2(b) is a closer look at the number of papers in each DL-based category. Two main categories can be seen in the figure: patch-level classification and segmentation. With the ability to provide crack localization in the images using only lightweight CNN models, patch-level classification received most of the attention in 2016 and 2017. Then segmentation quickly replaced classification because it provides higher-resolution data and more precise crack localization and structure prediction. The amount of research on crack segmentation is still increasing, which shows that the pixel-wise prediction of cracks, although still in the development stage, is the current trend in ML crack detection.
Methodology
Liu et al. (2019) | 537 RGB images with cracks in multiple scales and scenes | Segmentation
Kolektor (Tabernik et al. 2020) | 399 images of microscopic fractions or cracks on the surface of the plastic embedding in electrical commutators | Segmentation
CrackIT (Oliveira and Correia 2014) | 84 grayscale pavement surface images, acquired by an optical device | Segmentation
CrackTree (Zou et al. 2012) | 206 pavement images with various types of cracks | Segmentation
CFD (Shi et al. 2016) | 118 road crack images acquired from a smartphone | Segmentation
Yang et al. (2018) | Around 800 images of pavement cracks and cracks on concrete walls | Segmentation
(Tsai et al. 2013), and concrete joint faulting (Tsai et al. 2011); for automated raveling detection and classification (Tsai and Wang 2015); for automated pothole detection (Tsai and Chatterjee 2018), and for a new area-based faulting measurement with enhanced accuracy (Geary et al. 2018). Three-dimensional images have also been used in research on ML crack detection (Wang et al. 2017a; Zhang et al. 2017; Li et al. 2020; Zhang et al. 2018a; Fei et al. 2019; Zhang et al. 2019). A survey reported in 2017 shows that 18 US states use 3D automated data collection and 17 have a plan to use it within two years (Zimmerman 2017). Therefore, 3D pavement images were used in this study.
The Georgia Tech Sensing Vehicle (GTSV) was used to collect 3D pavement surface images. Two 3D line laser sensors were mounted at the rear of the GTSV to collect 4-m full-lane-width 3D pavement images at a maximum speed of 60 mi/h. For each scanner, one frame of 3D pavement point cloud data is composed of 1,000 (longitudinal) by 2,080 (transverse) points, with a 5-mm interval in the longitudinal direction and a 1-mm interval in the transverse direction. Points in the longitudinal direction are interpolated to also have a 1-mm resolution by commercial software. Height value, which is the distance between the pavement surface and the laser scanner, is stored in each cloud point with a resolution of 0.5 mm. Raw point clouds are transformed into 3D pavement images, or range images, through compression and rectification by the software. Compression rescales the height value of the point cloud to between 0 and 255 to generate a grayscale image. The pixel values are then inverted so the deeper surface appears darker on the range image. Rectification applies a Gaussian high-pass filter to remove the gradual change in range values caused by rutting and cross-slope on the pavement surface. In this study, the 3D pavement images were down-sampled by a factor of 4 to reduce the size of the data set for training DL algorithms, which resulted in a size of 520 × 1,250 pixels. In total, 1,152 images collected from US Route 80 (US 80) in the state of Georgia were used as the training data. To avoid properties similar to those of the training data, 200 images from Georgia State Route 275 (SR 275) northbound were used as the validation data and 210 images from SR 275 southbound were used as the testing data. To achieve a consistent comparison, all models in this study were trained and evaluated using the same data set.
Design and performance of DL models rely heavily on the data set used. However, there are currently no large-scale public data sets with crack segmentation ground truth to serve as a benchmark. The main reason is that manually annotating pixel-level segmentation ground truth is very time-consuming and labor-intensive. As shown in Table 1, the available crack segmentation data sets were smaller than the data sets for crack classification and object detection, for which the ground truths are much easier to annotate. To address this problem, the authors proposed a less time-consuming and labor-intensive semiautomatic method for generating crack segmentation ground truth.
The semiautomatic method consists of three parts. First, a crack-labeling tool, CrackDigitizer, is used for manually labeling cracks in range images with a one-pixel-wide annotation called “crack curve segmentation.” CrackDigitizer outputs XML files containing coordinates of the annotated crack curve data points [Fig. 3(b)]. To make the labeling process simpler, a minimal path algorithm (Chatterjee and Tsai 2018; Jiang 2015) is used. By manually labeling two points of the crack, the intermediate points are automatically generated using the algorithm. In the second part, the coordinates of the crack in each XML file are read to generate a crack curve binary image in which the crack coordinates appear white on a black background [Fig. 3(c)]. Finally, both the range images and the crack curve binary images are used to generate the crack area binary images that contain pixel-wise labeling of cracks called “crack area segmentation” [Fig. 3(d)]. This process is based on image processing that performs thresholding and connected components analysis on range images, followed by data points matching on crack curves and a final refinement stage. The crack area binary images are used as the ground truth.
Testing Data Categorization
To gain insight into how the implemented crack segmentation models would perform, the authors categorized the testing images based on pavement crack type. Fig. 4 shows the six categories of the testing images; the distribution of each category is provided in Table 2. The Other Distress category indicates other pavement distresses, such as scratches, in the image.
Evaluation Metrics
Given a predicted and a ground-truth crack segmentation image, the following components can be obtained by looking at each pixel at the same location:
• True positive (TP): An actual crack pixel is predicted correctly.
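The pixel-by-pixel comparison that produces these components can be sketched in Python as follows. Only the TP definition appears in the text above; the FP, FN, and TN cases and the precision, recall, and F1 formulas used here are the standard definitions and are assumed, not quoted from the paper. The function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN by comparing two binary images pixel by pixel.

    pred and truth are equally sized 2D lists; nonzero means crack.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # actual crack pixel predicted as crack
            elif p:
                fp += 1      # noncrack pixel predicted as crack
            elif t:
                fn += 1      # actual crack pixel missed
            else:
                tn += 1      # noncrack pixel predicted as noncrack
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Standard definitions; a zero denominator yields 0.0 (an assumption)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


pred = [[1, 1, 0],
        [0, 0, 0]]
truth = [[1, 0, 0],
         [0, 1, 0]]
counts = confusion_counts(pred, truth)
metrics = precision_recall_f1(*counts[:3])
```

Because this comparison is strictly per pixel, a prediction shifted by a single pixel is counted as both an FP and an FN, which is the motivation for distance-based metrics with a buffer.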
Fig. 3. (a) Range image with pavement crack; (b) manually labeled points of crack overlaid on the range image; (c) binary image of crack curves
segmentation; and (d) binary image of crack area segmentation. Notice the difference in crack width between the crack curve and crack area binary
images.
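The range images shown in Fig. 3(a) are produced by the compression, inversion, and down-sampling steps described in the data collection paragraph. A minimal Python sketch of those steps is given below. The paper does not state the exact rescaling rule or the down-sampling method, so min-max rescaling and simple striding are assumptions, the function names are hypothetical, and the Gaussian high-pass rectification step is omitted.

```python
def compress_and_invert(heights):
    """Rescale height values to 0-255, then invert so deeper points are darker.

    heights holds sensor-to-surface distances, so a larger value is a deeper
    point. Min-max rescaling is an assumption; the source only says 0 to 255.
    """
    lo = min(min(row) for row in heights)
    hi = max(max(row) for row in heights)
    span = hi - lo
    if span == 0:
        # Flat surface: no depth variation, so return a uniform bright image.
        return [[255 for _ in row] for row in heights]
    return [[255 - round((h - lo) * 255 / span) for h in row] for row in heights]


def downsample(image, factor=4):
    """Keep every factor-th pixel in both directions (striding is an assumption)."""
    return [row[::factor] for row in image[::factor]]


# Deepest point (largest distance, 5) becomes black; shallowest (0) becomes white.
range_image = compress_and_invert([[0, 1], [3, 5]])

# An 8 x 8 image shrinks to 2 x 2 with a factor of 4.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
small = downsample(grid, 4)
```

With the same factor, a frame interpolated to 5,000 by 2,080 points at 1-mm spacing reduces to 1,250 by 520 pixels, which agrees with the image size reported for this study.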
the shortest distances from each pixel to the other set of pixels when calculating the penalty. Also, an upper limit u is used to ensure that no further penalty is added after it is determined that a predicted crack is not an actual crack. The penalty for the buffered Hausdorff distance metric is calculated by
\[ h_{BHD}(A, B) = \frac{1}{|A|} \sum_{a \in A} \operatorname{sat}_u \left( \min_{b \in B} \lVert a - b \rVert \right) \tag{7} \]
To consider the subjectivity of ground-truth labeling, Tsai and Chatterjee (2017) incorporated a penalty-free buffer around the ground truth and predicted crack when calculating the penalty,
\[ h_{EHD}(A, B) = \frac{1}{|A|} \sum_{a \in A} f(\lVert a - b \rVert), \quad \text{where } f(x) = \begin{cases} u - l & \text{for } u < x \\ x - l & \text{for } l \le x \le u \\ 0 & \text{for } x \le l \end{cases} \tag{8} \]
where l = lower limit of the Euclidean distance after which a penalty is applied. The lower limit provides a penalty-free region around the crack to address the subjectivity issue of ground-truth labeling. On the other hand, u is the upper limit of the Euclidean distance after which the penalty does not increase. In this study, the lower limit was set to 15 mm; the upper limit, to 30 mm. Thus, l was set to 4 pixels and u to 8. The set of data points A represents the predicted crack pixels, and the set of data points B represents the ground-truth crack pixels. With Eq. (8), the EHD score is given by
\[ \text{score}(A, B) = 100 - \frac{\max\left(h_{EHD}(A, B),\, h_{EHD}(B, A)\right)}{u - l} \times 100 \tag{9} \]
Also, the FP and FN penalties are given by
\[ \text{FP penalty} = \frac{h_{EHD}(A, B)}{u - l} \tag{10} \]
mation on objects that are captured in the initial layers and required for constructing image segmentation in the decoder. The U-Net model, developed by Jenkins et al. (2018), was implemented in this study.
DeepCrack
DeepCrack (Liu et al. 2019) is similar to FCN, which applies convolutional layers to output spatial maps. It is different in that it generates multiscale predictions from different layers of the model. These predictions are then fused to generate the final segmentation result. In this study, DeepCrack with the VGG16 backbone was implemented.
CrackNetII
CrackNet (Zhang et al. 2017) generates a more precise segmentation of cracks. By removing the pooling layers and preserving the spatial resolution of the input images throughout the hidden layers, it avoids the loss of localization information caused by down-sampling, which appears in most segmentation models. Based on CrackNet, CrackNetII (Zhang et al. 2018a) has several improvements, although the main idea of preserving the spatial resolution throughout the model is retained.
Image Translation Model: Pix2Pix
Image-to-image translation changes a given image to another image in a controlled way. It can be used to translate the range image to the corresponding crack segmentation. Pix2Pix (Isola et al. 2017) is a conditional generative adversarial network (Mirza and Osindero 2014) designed for image-to-image translation. It consists of a generator and a discriminator. The generator, which generates a crack segmentation from a range image, is trained to fool the discriminator, which is trained to classify images as real (ground truth) or false (generated) segmentation conditioned on the range image. After training, only the generator is used when evaluating the model on testing images. Pix2Pix models with U-Net- and ResNet-based generators were implemented in this study.
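The EHD score of Eq. (9) and the FP penalty of Eq. (10), built on the piecewise penalty of Eq. (8), can be sketched in plain Python as follows. This is an illustrative brute-force version, not the authors' implementation. It assumes that b in Eq. (8) is the point of B nearest to a, as in Eq. (7); that the FN penalty, whose equation is not shown here, is the symmetric counterpart h_EHD(B, A) / (u - l); and that an empty set is handled as noted in the comments. All function names are hypothetical.

```python
import math


def f_penalty(x, l, u):
    """Piecewise penalty f(x) of Eq. (8): zero inside the buffer, capped at u - l."""
    if x <= l:
        return 0.0
    if x <= u:
        return float(x - l)
    return float(u - l)


def h_ehd(A, B, l, u):
    """Directed penalty h_EHD(A, B): mean of f over nearest distances from A to B."""
    if not A:
        return 0.0           # assumption: no points in A means nothing to penalize
    if not B:
        return float(u - l)  # assumption: with no target, every point saturates
    total = 0.0
    for a in A:
        nearest = min(math.dist(a, b) for b in B)
        total += f_penalty(nearest, l, u)
    return total / len(A)


def ehd_score(pred, truth, l=4, u=8):
    """Return (score, FP penalty, FN penalty) with l and u given in pixels."""
    fp_penalty = h_ehd(pred, truth, l, u) / (u - l)   # Eq. (10)
    fn_penalty = h_ehd(truth, pred, l, u) / (u - l)   # assumed symmetric form
    score = 100 - max(fp_penalty, fn_penalty) * 100   # Eq. (9)
    return score, fp_penalty, fn_penalty


# One of two predicted pixels lies 20 pixels from the ground truth (beyond u).
result = ehd_score(pred=[(0, 0), (0, 20)], truth=[(0, 0)])
```

The defaults l = 4 and u = 8 are the pixel values stated for this study. The nested loop costs O(|A| |B|) per direction, so a distance transform would be a more practical choice for full-size images.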
By comparing FCN models with different backbone networks, the authors observed that a deeper backbone leads to better performance. This indicates that even with 3D pavement data, which removes much noise in 2D data, pavement crack segmentation still requires deeper networks to extract more complex features. With a deeper backbone, models can not only segment cracks more accurately, especially for hard cases such as thin cracks, but also better distinguish cracks and similar patterns such as pavement boundaries (Fig. 5). In the images of the crack segmentation results, both FP and FN pixels are also shown. Also, the penalty-free buffer was used when showing the result images so the TP pixels would be wider than the ground truth.
Table 3 indicates that DeepCrack with the VGG16 backbone performed better than FCN with the VGG16 and VGG19 backbones. This shows the usefulness of multiscale predictions and fusion in DeepCrack for crack segmentation. However, FCN with a ResNet backbone still performed better, which again shows the benefit of a deeper backbone network.
thin cracks more correctly, which leads to lower FN penalty and higher recall.
CrackNetII: The Need for Spatial Information between Pixels
CrackNetII did not perform well. Fig. 7 shows that it segmented cracks correctly with precise crack width for wide pavement cracks. However, for thinner cracks it generated disjointed crack pixels, leading to many FN pixels. Also, disjointed FP pixels were generated when positions in the images were darker. The disjointed pixels may have been caused by the multiple convolutional layers with the 1 × 1 kernel used in CrackNetII, which led to less information on the spatial relationship between pixels being obtained when extracting the feature maps. For models that do not apply the encoder-decoder structure, the spatial relationship of cracks is an important factor to be considered when interpolation and skip architectures are removed.
Fig. 5. Crack segmentation results of FCN models with different backbone networks.
Fig. 6. Crack segmentation results of U-Net and FCN with ResNet. The circles in the result of U-Net indicate FN cracks not detected by U-Net. The
circles in the result of FCN indicate FP pixels caused by the FCN model’s inability to segment cracks with precise crack width.
Image Translation: Toward More Precise Crack Width
Encoder-decoder models, such as FCN and U-Net, usually generate coarse segmentation when the object is too small or thin because of up-sampling. CrackNetII tries to solve this problem by removing the pooling operation and thus the up-sampling. However, as mentioned earlier, CrackNetII has FP and FN problems. To obtain crack segmentation with a more precise crack width, the Pix2Pix model was explored. As shown in Fig. 8, it generated crack segmentation with a more precise width than U-Net. However, it experienced more FN problems for thin cracks that were nearly 1-pixel wide, especially when U-Net was used as the generator. Also, more FP and FN pixels arose because the crack pixels were not accurately located. These problems can be seen in Table 3. With more precise crack widths, Pix2Pix had the highest precision values, but the FP and FN problems led to low recall values. It also had the highest F1 scores, but because the EHD score has a buffer region with no score penalty, U-Net and FCN with ResNet still managed to score higher on coarse crack segmentation. Therefore, if crack width needs to be considered when measuring model performance, precision and recall and not just EHD score are important metrics that should be used together.
Effect of Pavement Crack Types and Pavement Boundary
Table 4 shows the performance of the two models with the highest EHD scores, U-Net and FCN with ResNet, on different pavement crack types. The models performed best in the noncrack category, as only slight FP occurred. Both models got very low EHD scores in the Other Distress category, because they predicted other distresses as cracks, especially when there were scratches on the pavement. Therefore, a high FP penalty can be seen in the table for this category. For the alligator, block, longitudinal, and transverse crack categories, both models had EHD scores over 90. This indicates that both models can correctly segment cracks in normal cases, and that cracks with more complex patterns, such as alligator cracks, do not necessarily negatively affect model performance.
As indicated in Table 3, most models had a higher FP penalty than FN penalty. Most FPs were caused by patterns that were similar to cracks in the images. Other pavement distresses could have been a source of patterns that confused the models, or the patterns may have been pavement boundaries because of their nearly identical structure. The models often generated FP pixels at the location
Fig. 8. Crack segmentation results of U-Net and Pix2Pix models. The circles in the result of Pix2Pix indicate Pix2Pix’s localization problem. With
predicted crack pixels not located accurately, both FP and FN pixels were generated.
Table 4. Evaluation result of the models for different pavement crack types
Model | Metric | Alligator | Longitudinal | Transverse | Block | Noncrack | Other distress
U-Net | EHD score | 95.4811 | 92.4068 | 91.6501 | 91.8085 | 99.814 | 74.8066
U-Net | FP penalty | 0.0416 | 0.0438 | 0.0541 | 0.0448 | 0.0019 | 0.2397
U-Net | FN penalty | 0.0373 | 0.0612 | 0.0506 | 0.0706 | 0 | 0.0895
FCN-ResNet | EHD score | 90.3525 | 90.7444 | 90.2685 | 91.1525 | 99.9499 | 78.3495
FCN-ResNet | FP penalty | 0.0965 | 0.0803 | 0.0798 | 0.0804 | 0.0005 | 0.2165
FCN-ResNet | FN penalty | 0.0431 | 0.0548 | 0.0565 | 0.0442 | 0 | 0.0859
Fig. 9. Crack segmentation results of U-Net and FCN with ResNet on images with pavement boundary and other pavement distress.
of the pavement boundary. As shown in Fig. 9, ResNet backbone and U-Net solved the problem to some extent but were still largely confused by the patterns of other distresses. Experiments using training data containing such confusing patterns can be used to better train models to distinguish them from cracks.
Conclusions and Recommendations
ML and DL have become mainstream technologies for developing enhanced pavement crack detection algorithms. In this paper, the authors organized and provided up-to-date information on research
ment cracks with a more precise crack width. This indicates that localization information is important in improving crack segmentation models.
• Image translation models largely improve the precision of segmented crack width. However, more FP and FN problems are introduced because the predicted pixels are not accurately located. Postprocessing steps that use information from the input range images can be explored to address this problem.
• Pavement boundaries and other pavement distresses that have patterns similar to crack patterns are the major causes of FP predictions. A possible solution is to collect and add more data containing these confusing patterns while training the models.
• Table 4 shows that state-of-the-art models can achieve EHD scores over 90 for all pavement crack categories except Other Distress. This shows the effectiveness of DL models in crack segmentation. Solving the FP problem is an important step in further improvement.
Based on this study, the following are recommendations for future research:
• A performance evaluation system should be established to qualitatively and objectively evaluate crack detection algorithms. Key components should include a consistent performance metric and consistent pavement data sets with diverse conditions.
• Computation speed, as well as accuracy, is an important aspect to be considered.
• Data quantity and quality are important factors in ML model improvement. Fast, robust, and accurate data acquisition and ground-truth labeling should be further developed.
Data Availability Statement
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request (testing data, codes for implemented models, and evaluation metrics).
Acknowledgments
The authors would like to thank the support provided by the Georgia Department of Transportation and US Department of Transportation. In addition, the authors would like to thank the Georgia Tech research team, including Geoffrey Price and Zhongyu Yang for collecting the 3D pavement surface data, and Dr. Chenglong Jiang, Dr. Anirban Chatterjee, and Arindam Duttagupta for making crack digitization and segmentation tools available, and Shuho Chou for initiating the FCN for crack detection.
References
Amhaz, R., S. Chambon, J. Idier, and V. Baltazart. 2016. “Automatic crack detection on two-dimensional pavement images: An algorithm based on
Structural Monitoring Systems (EESMS), 1–5. New York: IEEE.
Cha, Y.-J., W. Choi, and O. Büyüköztürk. 2017. “Deep learning-based crack damage detection using convolutional neural networks.” Comput.-Aided Civ. Infrastruct. Eng. 32 (5): 361–378. https://doi.org/10.1111/mice.12263.
Chatterjee, A., and Y.-C. Tsai. 2018. “A fast and accurate automated pavement crack detection algorithm.” In Proc., 26th European Signal Processing Conf. (EUSIPCO), 2140–2144. New York: IEEE.
Chen, F.-C., and M. R. Jahanshahi. 2017. “NB-CNN: Deep learning-based crack detection using convolutional neural network and naive Bayes data fusion.” IEEE Trans. Ind. Electron. 65 (5): 4392–4400. https://doi.org/10.1109/TIE.2017.2764844.
Chen, F.-C., M. R. Jahanshahi, R.-T. Wu, and C. Joffe. 2017. “A texture-based video processing methodology using Bayesian data fusion for autonomous crack detection on metallic surfaces.” Comput.-Aided Civ. Infrastruct. Eng. 32 (4): 271–287. https://doi.org/10.1111/mice.12256.
Chen, K., A. Yadav, A. Khan, Y. Meng, and K. Zhu. 2019. “Improved crack detection and recognition based on convolutional neural network.” In Modelling and simulation in engineering. New York: Hindawi.
Cheng, H., J. Wang, Y. Hu, C. Glazier, X. Shi, and X. Chen. 2001. “Novel approach to pavement cracking detection based on neural network.” Transp. Res. Rec. 1764 (1): 119–127. https://doi.org/10.3141/1764-13.
Cheng, J. C., and M. Wang. 2018. “Automated detection of sewer pipe defects in closed-circuit television images using deep learning techniques.” Autom. Constr. 95 (Nov): 155–171. https://doi.org/10.1016/j.autcon.2018.08.006.
Chou, J., W. A. O’Neill, and H. Cheng. 1994. “Pavement distress classification using neural networks.” In Vol. 1 of Proc., IEEE Int. Conf. on Systems, Man and Cybernetics, 397–401. New York: IEEE.
Daniel, A., and V. Preeja. 2014. “Automatic road distress detection and analysis.” Int. J. Comput. Appl. 101 (10): 18–23. https://doi.org/10.5120/17723-8018.
Dorafshan, S., R. J. Thomas, C. Coopmans, and M. Maguire. 2018. “Deep learning neural networks for SUAS-assisted structural inspections: Feasibility and application.” In Proc., Int. Conf. on Unmanned Aircraft Systems (ICUAS), 874–882. New York: IEEE.
Dung, C. V. 2019. “Autonomous concrete crack detection using deep fully convolutional neural network.” Autom. Constr. 99 (Mar): 52–58. https://doi.org/10.1016/j.autcon.2018.11.028.
Eisenbach, M., R. Stricker, D. Seichter, K. Amende, K. Debes, M. Sesselmann, D. Ebersbach, U. Stoeckert, and H.-M. Gross. 2017. “How to get pavement distress detection ready for deep learning? A systematic approach.” In Proc., Int. Joint Conf. on Neural Networks (IJCNN), 2039–2047. New York: IEEE.
Fan, R., M. J. Bocus, Y. Zhu, J. Jiao, L. Wang, F. Ma, S. Cheng, and M. Liu. 2019. “Road crack detection using deep convolutional neural network and adaptive thresholding.” Preprint, submitted April 18, 2019. http://arxiv.org/abs/1904.08582.
Fan, Z., Y. Wu, J. Lu, and W. Li. 2018. “Automatic pavement crack detection based on structured prediction with the convolutional neural network.” Preprint, submitted February 1, 2018. http://arxiv.org/abs/1802.02208.
Fei, Y., K. C. Wang, A. Zhang, C. Chen, J. Q. Li, Y. Liu, G. Yang, and B. Li. 2019. “Pixel-level cracking detection on 3D asphalt pavement images through deep-learning-based CrackNet-V.” IEEE Trans. Intell. Transp. Syst. 21 (1): 273–284.
ment method using three-dimensional pavement data.” Transp. Res. Rec. 2672 (40): 41–49. https://doi.org/10.1177/0361198118759951.
Gopalakrishnan, K., H. Gholami, A. Vidyadharan, A. Choudhary, and A. Agrawal. 2018. “Crack damage detection in unmanned aerial vehicle images of civil infrastructure using pre-trained deep learning model.” Int. J. Traffic Transp. Eng. 8 (1): 1–14. https://doi.org/10.7708/ijtte.2018.8(1).01.
Gopalakrishnan, K., S. K. Khaitan, A. Choudhary, and A. Agrawal. 2017. “Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection.” Constr. Build. Mater. 157 (Dec): 322–330. https://doi.org/10.1016/j.conbuildmat.2017.09.110.
He, K., X. Zhang, S. Ren, and J. Sun. 2015. “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification.” In Proc., IEEE Int. Conf. on Computer Vision, 1026–1034. New York: IEEE.
He, K., X. Zhang, S. Ren, and J. Sun. 2016. “Deep residual learning for image recognition.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 770–778. New York: IEEE.
Hoang, N.-D. 2018. “An artificial intelligence method for asphalt pavement pothole detection using least squares support vector machine and neural network with steerable filter-based feature extraction.” Adv. Civ. Eng. 2018: 12. https://doi.org/10.1155/2018/7419058.
Huang, H.-W., Q.-T. Li, and D.-M. Zhang. 2018. “Deep learning based image recognition for crack and leakage defects of metro shield tunnel.” Tunnelling Underground Space Technol. 77 (Jul): 166–176. https://doi.org/10.1016/j.tust.2018.04.002.
Islam, M., and J.-M. Kim. 2019. “Vision-based autonomous crack detection of concrete structures using a fully convolutional encoder-decoder network.” Sensors 19 (19): 4251. https://doi.org/10.3390/s19194251.
Isola, P., J.-Y. Zhu, T. Zhou, and A. A. Efros. 2017. “Image-to-image translation with conditional adversarial networks.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 1125–1134. New York: IEEE.
Jenkins, M. D., T. A. Carr, M. I. Iglesias, T. Buggy, and G. Morison. 2018. “A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks.” In Proc., 26th European
/10298436.2018.1485917.
Li, N., X. Hou, X. Yang, and Y. Dong. 2009. “Automation recognition of pavement surface distress based on support vector machine.” In Proc., 2nd Int. Conf. on Intelligent Networks and Intelligent Systems, 346–349. New York: IEEE.
Li, S., and X. Zhao. 2019. “Image-based concrete crack detection using convolutional neural network and exhaustive search technique.” Adv. Civ. Eng. 2019: 19. https://doi.org/10.1155/2019/6520620.
Li, Y., H. Li, and H. Wang. 2018. “Pixel-wise crack detection using deep local pattern predictor for robot application.” Sensors 18 (9): 3042. https://doi.org/10.3390/s18093042.
Liu, S.-W., J. H. Huang, J.-C. Sung, and C. Lee. 2002. “Detection of cracks using neural networks and computational mechanics.” Comput. Methods Appl. Mech. Eng. 191 (25–26): 2831–2845. https://doi.org/10.1016/S0045-7825(02)00221-9.
Liu, W., D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. 2016. “SSD: Single shot multibox detector.” In Proc., European Conf. on Computer Vision, 21–37. New York: Springer.
Liu, Y., J. Yao, X. Lu, R. Xie, and L. Li. 2019. “DeepCrack: A deep hierarchical feature learning architecture for crack segmentation.” Neurocomputing 338 (Apr): 139–153. https://doi.org/10.1016/j.neucom.2019.01.036.
Long, J., E. Shelhamer, and T. Darrell. 2015. “Fully convolutional networks for semantic segmentation.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 3431–3440. New York: IEEE.
Ma, K., M. Hoai, and D. Samaras. 2017. “Large-scale continual road inspection: Visual infrastructure assessment in the wild.” In Proc., British Machine Vision Conf. (BMVC). Durham, UK: British Machine Vision Association Press.
Maeda, H., Y. Sekimoto, T. Seto, T. Kashiyama, and H. Omata. 2018. “Road damage detection and classification using deep neural networks with smartphone images.” Comput.-Aided Civ. Infrastruct. Eng. 33 (12): 1127–1141. https://doi.org/10.1111/mice.12387.
Maguire, M., S. Dorafshan, and R. J. Thomas. 2018. SDNET2018: A concrete crack image dataset for machine learning applications. Logan, UT: Utah State Univ.
Mandal, V., L. Uong, and Y. Adu-Gyamfi. 2018. “Automated road crack detection using deep convolutional neural networks.” In Proc., IEEE
Signal Processing Conf. (EUSIPCO), 2120–2124. New York: IEEE. Int. Conf. on Big Data (Big Data), 5212–5215. New York: IEEE.
Ji, J., L. Wu, Z. Chen, J. Yu, P. Lin, and S. Cheng. 2018. “Automated pixel- Mirza, M., and S. Osindero. 2014. “Conditional generative adversarial
level surface crack detection using U-Net.” In Proc., Int. Conf. on Multi- nets.” Preprint, submitted November 6, 2014. http://arxiv.org/abs
disciplinary Trends in Artificial Intelligence, 69–78. New York: /1411.1784.
Springer. Moon, H. G., and J. H. Kim. 2011. “Intelligent crack detecting algorithm on
Jiang, C. 2015. “A crack detection and diagnosis methodology for auto- the concrete crack image using neural network.” In Proc., 28th ISARC,
mated pavement condition evaluation.” Ph.D. thesis, Dept. of Civil 1461–1467. London: International Association for Automation and Ro-
and Environmental Engineering, Georgia Institute of Technology. botics in Construction.
Jiang, C., Y. Tsai, and Z. Wang. 2016. “Use of three-dimensional pavement Moussa, G., and K. Hussain. 2011. “A new technique for automatic detec-
surface data to analyze crack deterioration: Pilot study on Georgia State tion and parameters estimation of pavement crack.” In Proc., 4th Int.
Route 26.” Transp. Res. Rec. 2589 (1): 154–161. https://doi.org/10 Multi-Conf. on Engineering Technology Innovation, IMETI. Orlando,
.3141/2589-17. FL: Multilingual Europe Technology Alliance.
Jiang, C., and Y. J. Tsai. 2016. “Enhanced crack segmentation algorithm Nguyen, N. T. H., T. H. Le, S. Perry, and T. T. Nguyen. 2018. “Pavement
using 3D pavement data.” J. Comput. Civ. Eng. 30 (3): 04015050. crack detection using convolutional neural network.” In Proc., 9th Int.
https://doi.org/10.1061/(ASCE)CP.1943-5487.0000526. Symp. on Information and Communication Technology, 251–256.
Kaseko, M. S., and S. G. Ritchie. 1993. “A neural network-based method- New York: ACM.
ology for pavement crack detection and classification.” Transp. Res. Nie, M., and K. Wang. 2018. “Pavement distress detection based on transfer
Part C: Emerging Technol. 1 (4): 275–291. https://doi.org/10.1016 learning.” In Proc., 5th Int. Conf. on Systems and Informatics (ICSAI),
/0968-090X(93)90002-W. 435–439. New York: IEEE.
Civ. Eng. 33 (3): 04019017. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000831.
Pauly, L., D. Hogg, R. Fuentes, and H. Peel. 2017. “Deeper networks for pavement crack detection.” In Proc., 34th ISARC, 479–485. Berlin: IAARC.
Ren, S., K. He, R. Girshick, and J. Sun. 2015. “Faster R-CNN: Towards real-time object detection with region proposal networks.” In Advances in neural information processing systems, 91–99. Cambridge, MA: MIT Press.
Ronneberger, O., P. Fischer, and T. Brox. 2015. “U-Net: Convolutional networks for biomedical image segmentation.” In Proc., Int. Conf. on Medical Image Computing and Computer-Assisted Intervention, 234–241. New York: Springer.
Schmugge, S. J., L. Rice, J. Lindberg, R. Grizzi, C. Joffe, and M. C. Shin. 2017. “Crack segmentation by leveraging multiple frames of varying illumination.” In Proc., IEEE Winter Conf. on Applications of Computer Vision (WACV), 1045–1053. New York: IEEE.
Schmugge, S. J., L. Rice, N. R. Nguyen, J. Lindberg, R. Grizzi, C. Joffe, and M. C. Shin. 2016. “Detection of cracks in nuclear power plant using spatial-temporal grouping of local patches.” In Proc., IEEE Winter Conf. on Applications of Computer Vision (WACV), 1–7. New York: IEEE.
Shi, Y., L. Cui, Z. Qi, F. Meng, and Z. Chen. 2016. “Automatic road crack detection using random structured forests.” IEEE Trans. Intell. Transp. Syst. 17 (12): 3434–3445. https://doi.org/10.1109/TITS.2016.2552248.
Simonyan, K., and A. Zisserman. 2014. “Very deep convolutional networks for large-scale image recognition.” Preprint, submitted September 4, 2014. http://arxiv.org/abs/1409.1556.
Stricker, R., M. Eisenbach, M. Sesselmann, K. Debes, and H.-M. Gross. 2019. “Improving visual road condition assessment by extensive experiments on the extended GAPs dataset.” In Proc., Int. Joint Conf. on Neural Networks (IJCNN), 1–8. New York: IEEE.
Tabernik, D., S. Šela, J. Skvarč, and D. Skočaj. 2020. “Segmentation-based deep-learning approach for surface-defect detection.” J. Intell. Manuf. 31 (3): 759–776. https://doi.org/10.1007/s10845-019-01476-x.
Tsai, Y.-C., and A. Chatterjee. 2017. “Comprehensive, quantitative crack detection algorithm performance evaluation system.” J. Comput. Civ. Eng. 31 (5): 04017047. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000696.
Tsai, Y.-C., and A. Chatterjee. 2018. “Pothole detection and classification using 3D technology and watershed method.” J. Comput. Civ. Eng. 32 (2): 04017078. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000726.
Tsai, Y.-C. J., and F. Li. 2012. “Critical assessment of detecting asphalt pavement cracks under different lighting and low intensity contrast conditions using emerging 3D laser technology.” J. Transp. Eng. 138 (5): 649–656. https://doi.org/10.1061/(ASCE)TE.1943-5436.0000353.
Tsai, Y. J., F. Li, and Y. Wu. 2013. “A new rutting measurement method using emerging 3D line-laser-imaging system.” Int. J. Pavement Res. Technol. 6 (5): 667–672.
Tsai, Y. J., and Z. Wang. 2015. Development of an asphalt pavement raveling detection algorithm using emerging 3D laser technology and macrotexture analysis. Final Rep. No. NCHRP IDEA Project 163. Washington, DC: Transportation Research Board.
Tsai, Y. J., Y. Wu, and C. Ai. 2011. “Feasibility study of measuring concrete joint faulting using 3D continuous pavement profile data 2.” In Proc., 90th Annual Meeting on Transportation Research Board, 23–27. Washington, DC: Transportation Research Board.
Xu, H., X. Su, Y. Wang, H. Cai, K. Cui, and X. Chen. 2019. “Automatic bridge crack detection using a convolutional neural network.” Appl. Sci. 9 (14): 2867. https://doi.org/10.3390/app9142867.
Xue, Y., and Y. Li. 2018. “A fast detection method via region-based fully convolutional neural networks for shield tunnel lining defects.” Comput.-Aided Civ. Infrastruct. Eng. 33 (8): 638–654. https://doi.org/10.1111/mice.12367.
Yang, F., L. Zhang, S. Yu, D. Prokhorov, X. Mei, and H. Ling. 2019. “Feature pyramid and hierarchical boosting network for pavement crack detection.” IEEE Trans. Intell. Transp. Syst. 21 (4): 1525–1535.
Yang, X., H. Li, Y. Yu, X. Luo, T. Huang, and X. Yang. 2018. “Automatic pixel-level crack detection and measurement using fully convolutional network.” Comput.-Aided Civ. Infrastruct. Eng. 33 (12): 1090–1109. https://doi.org/10.1111/mice.12412.
Yokoyama, S., and T. Matsumoto. 2017. “Development of an automatic detector of cracks in concrete using machine learning.” Procedia Eng. 171: 1250–1255. https://doi.org/10.1016/j.proeng.2017.01.418.
Yusof, N., M. Osman, M. Noor, A. Ibrahim, N. Tahir, and N. Yusof. 2018. “Crack detection and classification in asphalt pavement images using deep convolution neural network.” In Proc., 8th IEEE Int. Conf. on Control System, Computing and Engineering (ICCSCE), 227–232. New York: IEEE.
Zhang, A., K. C. Wang, Y. Fei, Y. Liu, C. Chen, G. Yang, J. Q. Li, E. Yang, and S. Qiu. 2019. “Automated pixel-level pavement crack detection on 3D asphalt surfaces with a recurrent neural network.” Comput.-Aided Civ. Infrastruct. Eng. 34 (3): 213–229. https://doi.org/10.1111/mice.12409.
Zhang, A., K. C. Wang, Y. Fei, Y. Liu, S. Tao, C. Chen, J. Q. Li, and B. Li. 2018a. “Deep learning–based fully automated pavement crack detection on 3D asphalt surfaces with an improved CrackNet.” J. Comput. Civ. Eng. 32 (5): 04018041. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000775.
Zhang, A., K. C. Wang, B. Li, E. Yang, X. Dai, Y. Peng, Y. Fei, Y. Liu, J. Q. Li, and C. Chen. 2017. “Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network.” Comput.-Aided Civ. Infrastruct. Eng. 32 (10): 805–819. https://doi.org/10.1111/mice.12297.
Zhang, K., H. Cheng, and B. Zhang. 2018b. “Unified approach to pavement crack and sealed crack detection using preclassification based on transfer learning.” J. Comput. Civ. Eng. 32 (2): 04018001. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000736.
Zhang, K., Y. Zhang, and H. Cheng. 2020. “Self-supervised structure learning for crack detection based on cycle-consistent generative adversarial networks.” J. Comput. Civ. Eng. 34 (3): 04020004. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000883.
Zhang, L., F. Yang, Y. D. Zhang, and Y. J. Zhu. 2016. “Road crack detection using deep convolutional neural network.” In Proc., IEEE Int. Conf. on Image Processing (ICIP), 3708–3712. New York: IEEE.
Zimmerman, K. A. 2017. Pavement management systems: Putting data to work. Washington, DC: Transportation Research Board.
Zou, Q., Y. Cao, Q. Li, Q. Mao, and S. Wang. 2012. “CrackTree: Automatic crack detection from pavement images.” Pattern Recognit. Lett. 33 (3): 227–238. https://doi.org/10.1016/j.patrec.2011.11.004.
Zou, Q., Z. Zhang, Q. Li, X. Qi, Q. Wang, and S. Wang. 2018. “DeepCrack: Learning hierarchical convolutional features for crack detection.” IEEE Trans. Image Process. 28 (3): 1498–1512. https://doi.org/10.1109/TIP.2018.2878966.
Wang, K., A. Zhang, J. Q. Li, Y. Fei, C. Chen, and B. Li. 2017a. “Deep learning for asphalt pavement cracking recognition using convolutional