Machine Learning for Crack Detection:
Review and Model Performance Comparison

Yung-An Hsieh 1 and Yichang James Tsai, M.ASCE 2
Abstract: With the advancement of machine learning (ML) and deep learning (DL), there is a great opportunity to enhance the development
of automatic crack detection algorithms. In this paper, the authors organize and provide up-to-date information on ML-based crack detection
algorithms for researchers to more efficiently seek potential focus and direction. The authors first reviewed 68 ML-based crack
detection methods to identify the current trend of development, pixel-level crack segmentation. The authors then conducted a performance
evaluation on 8 ML-based crack segmentation models using consistent evaluation metrics and three-dimensional (3D) pavement images with
diverse conditions to identify remaining challenges and potential directions for future development. Based on the comparison results, deeper
backbone networks in FCN models and skip connections in U-Net both improved the performance. Within different categories of pavement
images, except for the Other Distress category, FCN and U-Net scored over 90 on the enhanced Hausdorff distance metric. Results showed
that solving the false-positive problem is an important step in further improving ML-based crack detection models.
DOI: 10.1061/(ASCE)CP.1943-5487.0000918. © 2020 American Society of Civil Engineers.
Introduction
Cracking appears in different kinds of structures, such as pavements, buildings, and bridges. Since cracking can accelerate the deterioration process, occurrence and severity of cracking serve as important indicators that maintenance is required. Therefore, crack evaluation is a critical task to ensure public safety.
Traditionally, crack evaluation is conducted manually through human field surveys. However, these manual survey methods have poor repeatability and reproducibility, need excessive time, consume great amounts of labor, and put surveyors in hazardous situations. Also, the data collected may vary with different raters due to subjectivity. To overcome the shortcomings of manual surveys, there have been substantial research efforts to develop automated crack survey methods. Automatic crack evaluation consists of three components: data acquisition, crack detection, and crack diagnosis. Data acquisition provides two- or three-dimensional (2D or 3D) input data for the crack detection algorithm. Crack detection, which is the focus of this paper, is the detection of cracks in the 2D and/or 3D data provided by data acquisition through computerized means, minimizing human involvement. Crack detection provides essential inputs necessary for crack diagnosis, including classification information such as type, severity, and extent, that are crucial for maintenance decision making. Crack diagnosis can be used to determine the optimal timing and level of treatment through the management process.
Having the ability to perform various tasks with outstanding performance, machine learning (ML) has become a popular technique in almost every field. By providing a sufficient amount of data, ML algorithms can automatically digest intrinsic knowledge of the data, such as hidden structures or relationships. Traditional ML techniques require a predefined feature extraction stage to reduce the complexity of the data and make patterns more visible to learning algorithms. However, this also limits the performance of the models, even if more data are provided. In recent years, deep learning (DL), a subset of ML techniques that use multilayer neural networks, has been developing rapidly. Compared with traditional ML, DL techniques are more intelligent as the features of the data are automatically learned through the training process. That is, DL does not require a predefined feature extraction stage, and a more general and robust model can be trained by providing more data.
With the success of the ML techniques, a large number of research efforts on ML-based crack detection have been conducted. ML, especially DL, has become a mainstream technology for developing enhanced crack detection algorithms. To further accelerate development, the objective of this paper is to organize and provide up-to-date information on research into ML-based crack detection algorithms to provide researchers with new focus and direction. The authors attained the objective in two steps. First, a literature review of 68 papers was conducted to identify the development trend in ML-based crack detection algorithms. With the identified current trend, the authors then conducted a consistent performance evaluation on 8 state-of-the-art models to provide details on the remaining challenges and potential improvements for future development.
This paper is organized as follows. The section “Introduction” briefly introduces the background, research needs, and objectives for this study. The section “Literature Review” describes the current development direction of ML-based crack detection research. The section “Methodology” discusses how a consistent comparison of DL crack detection models, which includes the data, metrics, and models used, was carried out. The section “Results and Discussion” evaluates the results of the models and discusses outcomes. The section “Conclusions and Recommendations” presents the study conclusions and recommendations for future research.
1 Ph.D. Candidate, Dept. of Computer and Electrical Engineering, Georgia Institute of Technology, Atlanta, GA 30332 (corresponding author). ORCID: https://orcid.org/0000-0001-8964-2912. Email: yhsieh37@gatech.edu
2 Professor, Dept. of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332. ORCID: https://orcid.org/0000-0002-6650-2279. Email: james.tsai@ce.gatech.edu
Note. This manuscript was submitted on January 23, 2020; approved on April 30, 2020; published online on July 13, 2020. Discussion period open until December 13, 2020; separate discussions must be submitted for individual papers. This paper is part of the Journal of Computing in Civil Engineering, © ASCE, ISSN 0887-3801.
© ASCE 04020038-1 J. Comput. Civ. Eng.
J. Comput. Civ. Eng., 2020, 34(5): 04020038

Literature Review
In this section, 68 papers are reviewed and divided into traditional ML and DL methods. After an introduction and summary of the research in each category, the overall trend of the research is discussed.
Traditional ML Methods
Traditional ML methods use techniques other than deep learning. To deal with large quantities of nonstructured image data, a predefined feature extraction stage is required before training the models. Beginning almost 30 years ago (Kaseko and Ritchie 1993), research in this category has been continuously conducted.
The two most popular techniques in traditional ML-based methods are support vector machine (Li et al. 2009; Gavilán et al. 2011; Moussa and Hussain 2011; O’Byrne et al. 2013; Daniel and Preeja 2014; Fujita et al. 2017; Wang et al. 2017b; Chen et al. 2017) and artificial neural network (Kaseko and Ritchie 1993; Chou et al. 1994; Cheng et al. 2001; Liu et al. 2002; Lee and Lee 2004; Moon and Kim 2011; Hoang 2018; Wang et al. 2018). Other traditional ML techniques, such as random forest (Shi et al. 2016), are also used. These techniques are mainly involved in two tasks: (1) crack detection and crack type classification, and (2) learning of optimal parameters for non-ML crack detection [e.g., threshold values (Cheng et al. 2001)]. Image-processing steps are required to perform predefined feature extraction, and various features have been used in different research efforts, such as statistical values of images, vertical and horizontal projections of feature maps, and properties of defined crack objects.
The main problem with traditional ML methods is that they contain only shallow learning techniques. Without learning higher-level features, those techniques are not able to deal with the complex information contained in the images. For example, the background of pavement images is largely affected by illumination and the environment.
DL Methods
DL methods use multilayer neural networks, which can automatically learn features from the data. In computer vision tasks, convolutional neural networks (CNNs), a specific DL model with convolution layers, have been widely used. DL methods appeared in 2016, and already 50 papers on them have been published. This shows that the ability of DL to perform various tasks with outstanding performance has caught the attention of researchers working on image-based crack detection. DL crack detection can be divided into three task categories (Fig. 1).
Fig. 1. (a) Input image of crack detection models; (b) classification: image patches classified as crack or noncrack; (c) object detection: bounding boxes generated around areas that contain cracks; and (d) segmentation: pixels classified as crack or noncrack.
Classification
The goal of classification is to determine whether an image/image patch contains cracks and, if so, to classify them. DL is in an early stage of use in crack classification, having been developed in 2016 (Zhang et al. 2016; Schmugge et al. 2016), but it became the most popular DL method in 2017. CNN models with fully connected (FC) layers are used, and lightweight models with fewer than 10 layers are usually used because the task is considered simple binary classification.
A few researchers have performed image-level classification (Ma et al. 2017; Gopalakrishnan et al. 2017, 2018; Xu et al. 2019) in which the whole image from the data set is classified. On the other hand, patch-level classification (Schmugge et al. 2016; Zhang et al. 2016; Cha et al. 2017; Chen and Jahanshahi 2017; Eisenbach et al. 2017; Feng et al. 2017; Pauly et al. 2017; Wang et al. 2017a; Wang and Hu 2017; Yokoyama and Matsumoto 2017; Yusof et al. 2018; Dorafshan et al. 2018; Kim and Cho 2018; Li et al. 2020; Nguyen et al. 2018; Zhang et al. 2018b; Chen et al. 2019; Li and Zhao 2019; Park et al. 2019), which separates each image into small image patches, is usually performed because it has two advantages. First, more data can be generated by dividing images into small patches. Second, by classifying each small patch of the original image, localization information of cracks in the original image can be obtained [Fig. 1(b)]. Thus, the results of patch-level classification can be further used in crack type classification.
For research and engineering purposes, important features about cracks, such as length, width, and branches, are needed. Although patch-level classification can generate localization information, the results are coarse and blocky and so cannot be used to estimate crack features. Therefore, more recent studies have focused on pixel-wise prediction.
Object Detection
Object detection tasks (Cha et al. 2017; Carr et al. 2018; Cheng and Wang 2018; Maeda et al. 2018; Mandal et al. 2018; Nie and Wang 2018; Xue and Li 2018) generate bounding boxes around areas that
contain cracks [Fig. 1(c)]. Usually, crack type classification is also performed for each generated bounding box. Existing object detection models, such as Faster-RCNN (Ren et al. 2015) and SSD (Liu et al. 2016), are being used in this category. Experiments on different backbone networks of the models, such as VGG (Simonyan and Zisserman 2014) and ResNet (He et al. 2016), and different training strategies have been conducted.
Similar to patch-level classification, object detection can generate crack localization information. However, important features of cracks cannot be estimated from the generated bounding boxes.
Segmentation
Segmentation generates a pixel-wise prediction of cracks in the images; in other words, each pixel is classified as crack or noncrack [Fig. 1(d)]. The precise crack location and structure generated by crack segmentation can be used both to classify crack type and to obtain important crack features. Also, because of advances in sensing technologies, 2D and 3D data with higher resolution can be obtained. For these reasons, crack segmentation has become the current trend in DL crack detection.
DL crack segmentation can be performed via five different approaches. The most popular is to use and modify encoder-decoder-based models (Schmugge et al. 2017; Huang et al. 2018; Jenkins et al. 2018; Ji et al. 2018; Yang et al. 2018; Zou et al. 2018; Bang et al. 2019; Dung et al. 2019; Liu et al. 2019; Tabernik et al. 2020; Yang et al. 2019), such as FCN (Long et al. 2015) and SegNet (Badrinarayanan et al. 2017). For general semantic segmentation tasks, researchers have been developing models that maintain spatial resolution throughout the hidden layers to avoid the loss of spatial information caused by the down-sampling process. Following this direction, the second approach to crack segmentation uses models without pooling layers to preserve spatial resolution (Zhang et al. 2017, 2018a; Fei et al. 2019; Islam and Kim 2019). The third approach uses CNN models with FC layers to perform classification tasks and combines them with other proposed methods, such as adaptive thresholding (Fan et al. 2019), to perform crack segmentation (Fan et al. 2018; Li et al. 2018; Fan et al. 2019). Another type of DL network, the recurrent neural network (RNN), has also been used in crack segmentation (Zhang et al. 2019). Finally, generative adversarial networks (GANs) have been used to perform self-supervised structure learning to achieve crack segmentation (Zhang et al. 2020).
Overall Trend in ML Crack Detection
Fig. 2 shows the distribution of published papers each year. As shown in Fig. 2(a), traditional ML methods date back to 1993. Research on traditional ML methods has been continuously conducted, but publications have increased slightly in the past decade. DL-based methods appeared in 2016, and the number of publications on them has grown rapidly. This shows that DL-based methods have proven effective in crack detection.
Fig. 2(b) is a closer look at the number of papers in each DL-based category. Two main categories can be seen in the figure: patch-level classification and segmentation. With the ability to provide crack localization in the images using only lightweight CNN models, patch-level classification received most of the attention in 2016 and 2017. Then segmentation quickly replaced classification because it provides higher-resolution data and more precise crack localization and structure prediction. The amount of research on crack segmentation is still increasing, which shows that the pixel-wise prediction of cracks, although still in the development stage, is the current trend in ML crack detection.
Fig. 2. Distribution of ML-based crack detection research: (a) machine learning based crack detection (traditional ML, DL classification, DL object detection, and DL segmentation); and (b) deep learning based crack detection (image-level classification, patch-level classification, object detection, and segmentation). Larger dots represent a higher number of papers published that year.
Methodology
To gain insight into current ML crack detection, the authors performed a comparison of eight DL segmentation models using consistent real-world pavement image data, evaluation metrics, software, and hardware. This section introduces the preparation and implementation of the comparison.
Data Collection and Ground-Truth Generation
The reviewed literature on DL-based crack segmentation lacks comparisons between different proposed models because different data sets with varying complexity have been used to evaluate the performance of algorithms. This makes it difficult to achieve consistent performance evaluation. Therefore, the authors established a data set consisting of 1,562 3D pavement images with diverse conditions to obtain a performance measure that could be applied to all algorithms.
With the advancement of sensor technologies, 3D laser technology has become the mainstream method to acquire high-resolution, full-coverage 3D pavement surface data for pavement condition assessment. Compared with 2D intensity-based imaging, the 3D laser is not sensitive to lighting effects when measuring range (i.e., elevation). Therefore, noise such as oil stains and poor intensity contrast can be removed, making cracks more distinguishable (Tsai and Li 2012). Three-dimensional pavement surface data have been used for detecting and measuring cracking (Tsai and Li 2012; Jiang and Tsai 2016) and its deterioration (Jiang et al. 2016), rutting
(Tsai et al. 2013), and concrete joint faulting (Tsai et al. 2011); for automated raveling detection and classification (Tsai and Wang 2015); for automated pothole detection (Tsai and Chatterjee 2018); and for a new area-based faulting measurement with enhanced accuracy (Geary et al. 2018). Three-dimensional images have also been used in research on ML crack detection (Wang et al. 2017a; Zhang et al. 2017; Li et al. 2020; Zhang et al. 2018a; Fei et al. 2019; Zhang et al. 2019). A survey reported in 2017 shows that 18 US states use 3D automated data collection and 17 have a plan to use it within two years (Zimmerman 2017). Therefore, 3D pavement images were used in this study.
The Georgia Tech Sensing Vehicle (GTSV) was used to collect 3D pavement surface images. Two 3D line laser sensors were mounted at the rear of the GTSV to collect 4-m full-lane-width 3D pavement images at a maximum speed of 60 mi/h. For each scanner, one frame of 3D pavement point cloud data is composed of 1,000 (longitudinal) by 2,080 (transverse) points, with a 5-mm interval in the longitudinal direction and a 1-mm interval in the transverse direction. Points in the longitudinal direction are interpolated to also have a 1-mm resolution by commercial software. Height value, which is the distance between the pavement surface and the laser scanner, is stored in each cloud point with a resolution of 0.5 mm. Raw point clouds are transformed into 3D pavement images, or range images, through compression and rectification by the software. Compression rescales the height value of the point cloud to between 0 and 255 to generate a grayscale image. The pixel values are then inverted so the deeper surface appears darker on the range image. Rectification applies a Gaussian high-pass filter to remove the gradual change in range values caused by rutting and cross-slope on the pavement surface. In this study, the 3D pavement images were down-sampled by a factor of 4 to reduce the size of the data set for training DL algorithms, which resulted in a size of 520 × 1,250 pixels. In total, 1,152 images collected from US Route 80 (US 80) in the state of Georgia were used as the training data. To avoid properties similar to those of the training data, 200 images from Georgia State Route 275 (SR 275) northbound were used as the validation data and 210 images from SR 275 southbound were used as the testing data. To achieve a consistent comparison, all models in this study were trained and evaluated using the same data set.
Design and performance of DL models rely heavily on the data set used. However, there are currently no large-scale public data sets with crack segmentation ground truth to serve as a benchmark. The main reason is that manually annotating pixel-level segmentation ground truth is very time-consuming and labor-intensive. As shown in Table 1, the available crack segmentation data sets are smaller than the data sets for crack classification and object detection, for which the ground truths are much easier to annotate. To address this problem, the authors proposed a less time-consuming and labor-intensive semiautomatic method for generating crack segmentation ground truth.
Table 1. Publicly available data sets for crack detection
Data set | Description | Task
Özgenel and Sorguç (2018); Zhang et al. (2016) | 40,000 RGB images with concrete cracks | Classification
SDNET2018 (Maguire et al. 2018) | 56,000 images of cracked and noncracked concrete bridge decks, walls, and pavements acquired from a camera | Classification
Xu et al. (2019) | 6,069 bridge crack images acquired from a camera | Classification
Maeda et al. (2018) | 9,053 pavement images acquired from a smartphone, with 8 damage types annotated | Object detection
GAPs (Eisenbach et al. 2017; Stricker et al. 2019) | 2,468 grayscale pavement surface images with 6 damage types annotated | Object detection
Amhaz et al. (2016) | 269 pavement intensity images acquired from 5 different systems: AigleRN, ESAR, LCMS, LRIS, and TEMPEST2, with a total of 68 annotated images | Segmentation
Liu et al. (2019) | 537 RGB images with cracks in multiple scales and scenes | Segmentation
Kolektor (Tabernik et al. 2020) | 399 images of microscopic fractions or cracks on the surface of the plastic embedding in electrical commutators | Segmentation
CrackIT (Oliveira and Correia 2014) | 84 grayscale pavement surface images, acquired by an optical device | Segmentation
CrackTree (Zou et al. 2012) | 206 pavement images with various types of cracks | Segmentation
CFD (Shi et al. 2016) | 118 road crack images acquired from a smartphone | Segmentation
Yang et al. (2018) | Around 800 images of pavement cracks and cracks on concrete walls | Segmentation
The semiautomatic method consists of three parts. First, a crack-labeling tool, CrackDigitizer, is used for manually labeling cracks in range images with a one-pixel-wide annotation called “crack curve segmentation.” CrackDigitizer outputs XML files containing coordinates of the annotated crack curve data points [Fig. 3(b)]. To make the labeling process simpler, a minimal path algorithm (Chatterjee and Tsai 2018; Jiang 2015) is used. By manually labeling two points of the crack, the intermediate points are automatically generated using the algorithm. In the second part, the coordinates of the crack in each XML file are read to generate a crack curve binary image in which the crack coordinates appear white on a black background [Fig. 3(c)]. Finally, both the range images and the crack curve binary images are used to generate the crack area binary images that contain pixel-wise labeling of cracks called “crack area segmentation” [Fig. 3(d)]. This process is based on image processing that performs thresholding and connected components analysis on range images, followed by data point matching on crack curves and a final refinement stage. The crack area binary images are used as the ground truth.
Fig. 3. (a) Range image with pavement crack; (b) manually labeled points of crack overlaid on the range image; (c) binary image of crack curve segmentation; and (d) binary image of crack area segmentation. Notice the difference in crack width between the crack curve and crack area binary images.
Testing Data Categorization
To gain insight into how the implemented crack segmentation models would perform, the authors categorized the testing images based on pavement crack type. Fig. 4 shows the six categories of the testing images; the distribution of each category is provided in Table 2. The Other Distress category indicates other pavement distresses, such as scratches, in the image.
Fig. 4. Example range images of each category in the testing data: Alligator, Longitudinal, Transverse, Block, Non-Crack, and Other Distress.
Evaluation Metrics
Given a predicted and a ground-truth crack segmentation image, the following components can be obtained by looking at each pixel at the same location:
• True positive (TP): An actual crack pixel is predicted correctly.
• False positive (FP): An actual noncrack pixel is predicted as a crack pixel.
• False negative (FN): An actual crack pixel is predicted as a noncrack pixel.
• True negative (TN): An actual noncrack pixel is predicted correctly.
Based on these components, the following metrics can be calculated:
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
These binary classifier metrics are commonly used in crack detection, but they do not consider the subjectivity of manually labeled ground truth mentioned in Tsai and Chatterjee (2017). Different researchers or experts can label slightly different ground truths on the same image because of differences in the perceived location of the crack. These ground truths can all be viable, but a predicted crack segmentation can achieve different scores using the above metrics on these ground truths. Therefore, the enhanced Hausdorff distance (EHD) metric proposed by Tsai and Chatterjee (2017) was also used in this study.
Table 2. Testing images in each category
Category | Number of images
Alligator | 4
Longitudinal | 66
Transverse | 52
Block | 54
Noncrack | 10
Other distress | 24
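The pixel-wise metrics in Eqs. (1)-(4) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the names `Confusion`, `count_pixels`, `safe_div`, and `metrics`, the nested-list mask format, and the convention of returning 0 when a denominator is zero are assumptions made here for illustration. F1 is computed in its standard harmonic-mean form.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Pixel counts for one predicted/ground-truth mask pair."""
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0


def count_pixels(pred, truth):
    """Count TP/FP/FN/TN over two same-sized binary masks (1 = crack, 0 = noncrack)."""
    c = Confusion()
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                c.tp += 1      # crack pixel predicted as crack
            elif p:
                c.fp += 1      # noncrack pixel predicted as crack
            elif t:
                c.fn += 1      # crack pixel predicted as noncrack
            else:
                c.tn += 1      # noncrack pixel predicted as noncrack
    return c


def safe_div(num, den):
    # Assumption: report 0.0 when the denominator is zero
    # (e.g., precision on an image with no predicted crack pixels).
    return num / den if den else 0.0


def metrics(c):
    """Accuracy, precision, recall, and F1 from the pixel counts."""
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy 2 x 4 masks, purely illustrative.
truth = [[0, 1, 1, 0],
         [0, 1, 0, 0]]
pred = [[0, 1, 0, 0],
        [1, 1, 0, 0]]

c = count_pixels(pred, truth)
result = metrics(c)
print(c, result)
```

On the toy masks, one crack pixel is missed and one noncrack pixel is flagged, so precision and recall are both 2/3. Because every pixel must match exactly, a prediction shifted by a single pixel from a viable ground truth would already lose credit, which is the sensitivity to labeling subjectivity that motivates the EHD metric.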

The Hausdorff distance between two sets of points A and B can be calculated by
$$HD(A, B) = \max\left(h_{HD}(A, B),\, h_{HD}(B, A)\right) \tag{5}$$
where the penalty is determined by the pixel with the maximum Euclidean distance from the other set of pixels as
$$h_{HD}(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert \tag{6}$$
Based on the Hausdorff distance, Kaul et al. (2010) proposed the buffered Hausdorff distance metric for crack segmentation algorithms. This version takes the mean instead of the maximum of the shortest distances from each pixel to the other set of pixels when calculating the penalty. Also, an upper limit u is used to ensure that no further penalty is added after it is determined that a predicted crack is not an actual crack. The penalty for the buffered Hausdorff distance metric is calculated by
$$h_{BHD}(A, B) = \frac{1}{|A|} \sum_{a \in A} \operatorname{sat}_u\left(\min_{b \in B} \lVert a - b \rVert\right) \tag{7}$$
To consider the subjectivity of ground-truth labeling, Tsai and Chatterjee (2017) incorporated a penalty-free buffer around the ground truth and predicted crack when calculating the penalty,
$$h_{EHD}(A, B) = \frac{1}{|A|} \sum_{a \in A} f\left(\lVert a - b \rVert\right), \quad \text{where } f(x) = \begin{cases} u - l & \text{for } u < x \\ x - l & \text{for } l \le x \le u \\ 0 & \text{for } x \le l \end{cases} \tag{8}$$
where l = lower limit of the Euclidean distance after which a penalty is applied. The lower limit provides a penalty-free region around the crack to address the subjectivity issue of ground-truth labeling. On the other hand, u is the upper limit of the Euclidean distance after which the penalty does not increase. In this study, the lower limit was set to 15 mm; the upper limit, to 30 mm. Thus, l was set to 4 pixels and u to 8. The set of data points A represents the predicted crack pixels, and the set of data points B represents the ground-truth crack pixels. With Eq. (8), the EHD score is given by
$$\text{score}(A, B) = 100 - \frac{\max\left(h_{EHD}(A, B),\, h_{EHD}(B, A)\right)}{u - l} \times 100 \tag{9}$$
Also, the FP and FN penalties are given by
$$\text{FP penalty} = \frac{h_{EHD}(A, B)}{u - l} \tag{10}$$
$$\text{FN penalty} = \frac{h_{EHD}(B, A)}{u - l} \tag{11}$$
Crack Segmentation Models
This section briefly describes the models that were implemented and compared. All models were implemented using PyTorch, an open-source machine learning library.
Fully Convolutional Network
The fully convolutional network (FCN) model (Long et al. 2015) is the most popular model in crack segmentation. It consists of a feature extraction backbone as in conventional CNN models, but the fully connected layers are replaced by fully convolutional layers to output a spatial map for each class. Among the different backbone networks, VGG16 (Huang et al. 2018; Bang et al. 2019; Dung et al. 2019), VGG19 (Yang et al. 2018), and ResNet50 (Bang et al. 2019) are frequently used in crack segmentation and were used here for comparison.
U-Net
U-Net (Ronneberger et al. 2015) is also a popular DL model that has been used and modified for crack segmentation (Jenkins et al. 2018; Ji et al. 2018; Zou et al. 2018). Its encoder extracts feature maps from the input image; its decoder generates the predicted image segmentation from those feature maps. U-Net includes skip connections, which concatenate the encoder feature maps with the decoder feature maps. Skip connections deliver localization information on objects that is captured in the initial layers and required for constructing image segmentation in the decoder. The U-Net model developed by Jenkins et al. (2018) was implemented in this study.
DeepCrack
DeepCrack (Liu et al. 2019) is similar to FCN, which applies convolutional layers to output spatial maps. It is different in that it generates multiscale predictions from different layers of the model. These predictions are then fused to generate the final segmentation result. In this study, DeepCrack with the VGG16 backbone was implemented.
CrackNetII
CrackNet (Zhang et al. 2017) generates a more precise segmentation of cracks. By removing the pooling layers and preserving the spatial resolution of the input images throughout the hidden layers, it avoids the loss of localization information caused by down-sampling, which appears in most segmentation models. Based on CrackNet, CrackNetII (Zhang et al. 2018a) has several improvements, although the main idea of preserving the spatial resolution throughout the model is retained.
Image Translation Model: Pix2Pix
Image-to-image translation changes a given image to another image in a controlled way. It can be used to translate the range image to the corresponding crack segmentation. Pix2Pix (Isola et al. 2017) is a conditional generative adversarial network (Mirza and Osindero 2014) designed for image-to-image translation. It consists of a generator and a discriminator. The generator, which generates a crack segmentation from a range image, is trained to fool the discriminator, which is trained to classify images as real (ground truth) or false (generated) segmentation conditioned on the range image. After training, only the generator is used when evaluating the model on testing images. Pix2Pix models with U-Net- and ResNet-based generators were implemented in this study.
Model-Training Setup
All models were trained on a single GeForce RTX 2080Ti GPU. Stochastic gradient descent (SGD), used as the optimizer, had a learning rate of 0.0015, a momentum of 0.9, and a weight decay of 0.0005. No pretrained models were used; all models were trained from scratch with the Kaiming initialization method (He et al. 2015) and bias set to 0. The models were trained for 20 epochs with a batch size of 2. Early stopping was applied to select the model weights to be used in the evaluation stage.
Several data-preprocessing steps were taken before feeding the images to the models. First, the images were normalized by the mean and standard deviation values of the training data set. The images were then resized to 512 × 1,024 pixels so that height and width would be divisible by 32 to better fit the models with skip connections. Finally, the training data were randomly flipped in
both horizontal and vertical directions with a probability of 0.5 for data augmentation.

Results and Discussion
Table 3 summarizes the performance of the crack segmentation models. Based on this performance, important observations are offered.

FCN Models: The Importance of a Backbone Network
By comparing FCN models with different backbone networks, the authors observed that a deeper backbone leads to better performance. This indicates that even with 3D pavement data, which remove much of the noise present in 2D data, pavement crack segmentation still requires deeper networks to extract more complex features. With a deeper backbone, models can not only segment cracks more accurately, especially for hard cases such as thin cracks, but also better distinguish cracks from similar patterns such as pavement boundaries (Fig. 5). In the images of the crack segmentation results, both FP and FN pixels are also shown. The penalty-free buffer was used when showing the result images, so the TP pixels appear wider than the ground truth.
Table 3 indicates that DeepCrack with the VGG16 backbone performed better than FCN with the VGG16 and VGG19 backbones. This shows the usefulness of multiscale predictions and fusion in DeepCrack for crack segmentation. However, FCN with a ResNet backbone still performed better, which again shows the benefit of a deeper backbone network.

U-Net: Refined Crack Segmentation with Skip Connections
Among the models, U-Net had the best EHD score, 90.27. However, it had a slightly higher FN penalty and lower recall than FCN with the ResNet backbone. The reason can be seen in Fig. 6. With the skip connections in U-Net, feature maps in the encoder, which usually contain more precise localization information, can be used in the decoder for generating crack segmentation. This makes U-Net able to generate crack segmentation with a more precise crack width, which means a lower FP penalty and higher precision. On the other hand, FCN with the ResNet backbone can segment thin cracks more correctly, which leads to a lower FN penalty and higher recall.

CrackNetII: The Need for Spatial Information between Pixels
CrackNetII did not perform well. Fig. 7 shows that it segmented cracks correctly with precise crack width for wide pavement cracks. However, for thinner cracks it generated disjointed crack pixels, leading to many FN pixels. Also, disjointed FP pixels were generated at darker positions in the images. The disjointed pixels may have been caused by the multiple convolutional layers with the 1 × 1 kernel used in CrackNetII, which led to less information on the spatial relationship between pixels being obtained when extracting the feature maps. For models that do not apply the encoder-decoder structure, the spatial relationship of cracks is an important factor to be considered when interpolation and skip architectures are removed.
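The remark about 1 × 1 kernels can be made concrete with the standard receptive-field recurrence for stacked convolution and pooling layers. The sketch below is illustrative only: the layer stacks are hypothetical and are not the published CrackNetII, FCN, or U-Net architectures, and the function name `receptive_field` is an assumption introduced here.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv/pool layers.

    Each layer is a (kernel_size, stride) pair. Standard recurrence:
    the field grows by (kernel_size - 1) * jump, and the jump (distance
    between neighbouring outputs, in input pixels) is multiplied by stride.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks, chosen only to illustrate the effect of kernel size.
stack_1x1 = [(1, 1)] * 4                                     # four 1x1 convolutions
stack_3x3 = [(3, 1)] * 4                                     # four 3x3 convolutions
stack_pooled = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]      # 3x3 convs with 2x2 pooling

print(receptive_field(stack_1x1))     # 1: each output still sees a single input pixel
print(receptive_field(stack_3x3))     # 9
print(receptive_field(stack_pooled))  # 18
```

Under these assumptions, layers with 1 × 1 kernels never widen the receptive field, so they mix channels without adding information about neighbouring pixels, which is consistent with the disjointed predictions described above.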
Table 3. Evaluation result of the crack segmentation models

Models          EHD       FP penalty  FN penalty  Accuracy  Precision  Recall   F1
FCN-VGG16       75.0195   0.2319      0.1257      0.9900    0.1292     0.8218   0.2199
FCN-VGG19       78.0620   0.2078      0.0963      0.9902    0.1364     0.8514   0.2313
FCN-ResNet      89.6477   0.0931      0.0537*     0.9900    0.1905     0.8940*  0.2989
DeepCrack       80.0456   0.1853      0.0883      0.9934    0.2007     0.8357   0.3115
U-Net           90.3755*  0.0676*     0.0614      0.9942    0.2732     0.8673   0.4010
CrackNetII      40.7643   0.5909      0.2096      0.9886    0.1143     0.6279   0.1856
Pix2Pix-UNet    84.1488   0.0819      0.1184      0.9975*   0.4025*    0.4605   0.4032
Pix2Pix-ResNet  86.2458   0.1044      0.0835      0.9975*   0.3948     0.4735   0.4071*
Note: An asterisk (*) marks the best score of each metric (column); for the FP and FN penalties, lower is better.
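Table 3 shows accuracy close to 0.99 for every model while precision stays well below 0.5. A minimal sketch of the standard pixel-level definitions shows how that can happen when crack pixels are rare. The pixel counts below are hypothetical and only assume a 512 × 1,024 image; how the authors aggregated the metrics over images is not described here, so the sketch is not expected to reproduce the table values.

```python
def pixel_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from pixel-level confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for one 512 x 1,024 image (524,288 pixels):
# 2,000 true crack pixels, of which 1,700 are found, plus 4,000 false alarms.
total_pixels = 512 * 1024
tp, fp, fn = 1700, 4000, 300
tn = total_pixels - tp - fp - fn

metrics = pixel_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy is about 0.992, precision about 0.298, recall 0.85, F1 about 0.442
```

Because background pixels dominate, accuracy barely separates the models, which is why the discussion relies on EHD, the FP and FN penalties, precision, and recall instead.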
Fig. 5. Crack segmentation results of FCN models with different backbone networks. (Panels: Range Image, Ground-truth, FCN_VGG16, FCN_VGG19, FCN_ResNet; legend: FN, FP.)
Fig. 6. Crack segmentation results of U-Net and FCN with ResNet. The circles in the result of U-Net indicate FN cracks not detected by U-Net. The circles in the result of FCN indicate FP pixels caused by the FCN model’s inability to segment cracks with precise crack width. (Panels: Range Image, Ground-truth, U-Net, FCN_ResNet.)
Fig. 7. Crack segmentation results of CrackNetII. (Panels: two pairs of Range Image and CrackNetII result.)
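The training images were resized to 512 × 1,024 so that height and width are divisible by 32, which matters for models with skip connections such as the U-Net compared in Fig. 6. The sketch below illustrates why, assuming five stride-2 downsampling stages (32 = 2^5) with floor rounding and 2× up-sampling in the decoder. The stage count and the function names are assumptions for illustration, not details taken from the implemented models.

```python
def encoder_sizes(size, stages=5):
    """Spatial size after each stride-2 downsampling stage (floor division)."""
    sizes = [size]
    for _ in range(stages):
        size //= 2
        sizes.append(size)
    return sizes


def skip_shapes_match(size, stages=5):
    """True if 2x up-sampling from the bottleneck reproduces every encoder size.

    A skip connection combines an encoder map with a decoder map, so the
    two must have the same spatial size at each level.
    """
    enc = encoder_sizes(size, stages)
    up = enc[-1]
    for skip in reversed(enc[:-1]):
        up *= 2
        if up != skip:
            return False
    return True


print(encoder_sizes(512))       # [512, 256, 128, 64, 32, 16]
print(skip_shapes_match(512))   # True
print(skip_shapes_match(1024))  # True
print(skip_shapes_match(500))   # False: 500 -> 250 -> 125 -> 62, and 62 * 2 != 125
```

With a side length that is not a multiple of 32, an odd intermediate size is rounded down, and the up-sampled decoder map no longer lines up with the encoder map it should be combined with.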
Image Translation: Toward More Precise Crack Width
Encoder-decoder models, such as FCN and U-Net, usually generate coarse segmentation when the object is too small or thin because of up-sampling. CrackNetII tries to solve this problem by removing the pooling operation and thus the up-sampling. However, as mentioned earlier, CrackNetII has FP and FN problems. To obtain crack segmentation with a more precise crack width, the Pix2Pix model was explored. As shown in Fig. 8, it generated crack segmentation with a more precise width than U-Net. However, it experienced more FN problems for thin cracks that were nearly 1 pixel wide, especially when U-Net was used as the generator. Also, more FP and FN pixels arose because the crack pixels were not accurately located. These problems can be seen in Table 3. With more precise crack widths, Pix2Pix had the highest precision values, but the FP and FN problems led to low recall values. It also had the highest F1 scores, but because the EHD score has a buffer region with no score penalty, U-Net and FCN with ResNet still managed to score higher on coarse crack segmentation. Therefore, if crack width needs to be considered when measuring model performance, precision and recall, and not just the EHD score, are important metrics that should be used together.

Effect of Pavement Crack Types and Pavement Boundary
Table 4 shows the performance of the two models with the highest EHD scores, U-Net and FCN with ResNet, on different pavement crack types. The models performed best in the noncrack category, as only a few FP pixels occurred. Both models had their lowest EHD scores in the Other Distress category, because they predicted other distresses as cracks, especially when there were scratches on the pavement. Therefore, a high FP penalty can be seen in the table for this category. For the alligator, block, longitudinal, and transverse crack categories, both models had EHD scores over 90. This indicates that both models can correctly segment cracks in normal cases, and that cracks with more complex patterns, such as alligator cracks, do not necessarily negatively affect model performance.
As indicated in Table 3, most models had a higher FP penalty than FN penalty. Most FPs were caused by patterns that were similar to cracks in the images. Other pavement distresses could have been a source of patterns that confused the models, or the patterns may have been pavement boundaries because of their nearly identical structure. The models often generated FP pixels at the location
Fig. 8. Crack segmentation results of U-Net and Pix2Pix models. The circles in the result of Pix2Pix indicate Pix2Pix’s localization problem. With predicted crack pixels not located accurately, both FP and FN pixels were generated. (Panels: Range Image, Ground-Truth, U-Net, Pix2Pix_ResNet, Pix2Pix_U-Net.)
Table 4. Evaluation result of the models for different pavement crack types

Model       Metric      Alligator  Longitudinal  Transverse  Block    Noncrack  Other distress
U-Net       EHD score   95.4811    92.4068       91.6501     91.8085  99.814    74.8066
            FP penalty  0.0416     0.0438        0.0541      0.0448   0.0019    0.2397
            FN penalty  0.0373     0.0612        0.0506      0.0706   0         0.0895
FCN-ResNet  EHD score   90.3525    90.7444       90.2685     91.1525  99.9499   78.3495
            FP penalty  0.0965     0.0803        0.0798      0.0804   0.0005    0.2165
            FN penalty  0.0431     0.0548        0.0565      0.0442   0         0.0859
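The EHD scores in Tables 3 and 4 include a buffer region with no score penalty, whereas precision and recall are strict pixel-level measures. The sketch below is not the EHD metric. It is a simplified stand-in that counts a pixel as matched when it lies within a square buffer of the other mask, and it shows why a coarse (too wide) segmentation is penalized by strict precision but not by a buffered score. The masks, the buffer shape, and the function names are all assumptions for illustration.

```python
def within_buffer(pixel, mask, buffer):
    """True if any pixel of `mask` lies within `buffer` rows/columns of `pixel`."""
    r, c = pixel
    return any((r + dr, c + dc) in mask
               for dr in range(-buffer, buffer + 1)
               for dc in range(-buffer, buffer + 1))


def precision_recall(pred, truth, buffer=0):
    """Precision and recall with an optional tolerance buffer (0 = strict)."""
    matched_pred = sum(within_buffer(p, truth, buffer) for p in pred)
    matched_truth = sum(within_buffer(t, pred, buffer) for t in truth)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_truth / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical masks stored as sets of (row, column) pixels.
truth = {(5, c) for c in range(20)}                      # 1-pixel-wide crack
coarse = {(r, c) for r in (4, 5, 6) for c in range(20)}  # 3-pixel-wide prediction

print(precision_recall(coarse, truth, buffer=0))  # strict: precision 1/3, recall 1.0
print(precision_recall(coarse, truth, buffer=1))  # buffered: precision 1.0, recall 1.0
```

Under these assumptions the over-wide prediction loses two-thirds of its strict precision yet is treated as perfect once a 1-pixel buffer is allowed, which mirrors the recommendation to report precision and recall together with the EHD score when crack width matters.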
Fig. 9. Crack segmentation results of U-Net and FCN with ResNet on images with pavement boundary and other pavement distress. (Panels: Range Image with Pavement Boundary, FCN_ResNet, U-Net; Range Image with Other Distress, FCN_ResNet, U-Net.)
of the pavement boundary. As shown in Fig. 9, the ResNet backbone and U-Net solved the problem to some extent but were still largely confused by the patterns of other distresses. Experiments using training data containing such confusing patterns can be used to better train models to distinguish them from cracks.

Conclusions and Recommendations
ML and DL have become mainstream technologies for developing enhanced pavement crack detection algorithms. In this paper, the authors organized and provided up-to-date information on research
into ML crack detection algorithms, reviewing 68 ML-based crack detection papers to identify the current development trend, pixel-level crack segmentation. Performance comparisons among 8 DL crack segmentation models were then conducted using consistent evaluation metrics and real-world 3D pavement images under diverse conditions. Finally, the authors critically assessed the revealed differences in performance. Based on this assessment, the following conclusions are offered:
• Different backbone networks affect model performance. For FCN models, deeper backbone networks, which can extract more complex features, lead to better performance.
• Skip connections in U-Net improve the model’s ability to segment cracks with a more precise crack width. This indicates that localization information is important in improving crack segmentation models.
• Image translation models largely improve the precision of segmented crack width. However, more FP and FN problems are introduced because the predicted pixels are not accurately located. Postprocessing steps that use information from the input range images can be explored to address this problem.
• Pavement boundaries and other pavement distresses that have patterns similar to crack patterns are the major causes of FP predictions. A possible solution is to collect and add more data containing these confusing patterns while training the models.
• Table 4 shows that state-of-the-art models can achieve EHD scores over 90 for all pavement crack categories except Other Distress. This shows the effectiveness of DL models in crack segmentation. Solving the FP problem is an important step in further improvement.
Based on this study, the following are recommendations for future research:
• A performance evaluation system should be established to qualitatively and objectively evaluate crack detection algorithms. Key components should include a consistent performance metric and consistent pavement data sets with diverse conditions.
• Computation speed, as well as accuracy, is an important aspect to be considered.
• Data quantity and quality are important factors in ML model improvement. Fast, robust, and accurate data acquisition and ground-truth labeling should be further developed.

Data Availability Statement
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request (testing data, code for the implemented models, and evaluation metrics).

Acknowledgments
The authors would like to thank the Georgia Department of Transportation and the US Department of Transportation for their support. In addition, the authors would like to thank the Georgia Tech research team, including Geoffrey Price and Zhongyu Yang for collecting the 3D pavement surface data, Dr. Chenglong Jiang, Dr. Anirban Chatterjee, and Arindam Duttagupta for making crack digitization and segmentation tools available, and Shuho Chou for initiating the FCN for crack detection.

References
Amhaz, R., S. Chambon, J. Idier, and V. Baltazart. 2016. “Automatic crack detection on two-dimensional pavement images: An algorithm based on minimal path selection.” IEEE Trans. Intell. Transp. Syst. 17 (10): 2718–2729. https://doi.org/10.1109/TITS.2015.2477675.
Badrinarayanan, V., A. Kendall, and R. Cipolla. 2017. “SegNet: A deep convolutional encoder-decoder architecture for image segmentation.” IEEE Trans. Pattern Anal. Mach. Intell. 39 (12): 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615.
Bang, S., S. Park, H. Kim, and H. Kim. 2019. “Encoder-decoder network for pixel-level road crack detection in black-box images.” Comput.-Aided Civ. Infrastruct. Eng. 34 (8): 713–727. https://doi.org/10.1111/mice.12440.
Carr, T. A., M. D. Jenkins, M. I. Iglesias, T. Buggy, and G. Morison. 2018. “Road crack detection using a single stage detector based deep neural network.” In Proc., IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems (EESMS), 1–5. New York: IEEE.
Cha, Y.-J., W. Choi, and O. Büyüköztürk. 2017. “Deep learning-based crack damage detection using convolutional neural networks.” Comput.-Aided Civ. Infrastruct. Eng. 32 (5): 361–378. https://doi.org/10.1111/mice.12263.
Chatterjee, A., and Y.-C. Tsai. 2018. “A fast and accurate automated pavement crack detection algorithm.” In Proc., 26th European Signal Processing Conf. (EUSIPCO), 2140–2144. New York: IEEE.
Chen, F.-C., and M. R. Jahanshahi. 2017. “NB-CNN: Deep learning-based crack detection using convolutional neural network and naive Bayes data fusion.” IEEE Trans. Ind. Electron. 65 (5): 4392–4400. https://doi.org/10.1109/TIE.2017.2764844.
Chen, F.-C., M. R. Jahanshahi, R.-T. Wu, and C. Joffe. 2017. “A texture-based video processing methodology using Bayesian data fusion for autonomous crack detection on metallic surfaces.” Comput.-Aided Civ. Infrastruct. Eng. 32 (4): 271–287. https://doi.org/10.1111/mice.12256.
Chen, K., A. Yadav, A. Khan, Y. Meng, and K. Zhu. 2019. “Improved crack detection and recognition based on convolutional neural network.” In Modelling and simulation in engineering. New York: Hindawi.
Cheng, H., J. Wang, Y. Hu, C. Glazier, X. Shi, and X. Chen. 2001. “Novel approach to pavement cracking detection based on neural network.” Transp. Res. Rec. 1764 (1): 119–127. https://doi.org/10.3141/1764-13.
Cheng, J. C., and M. Wang. 2018. “Automated detection of sewer pipe defects in closed-circuit television images using deep learning techniques.” Autom. Constr. 95 (Nov): 155–171. https://doi.org/10.1016/j.autcon.2018.08.006.
Chou, J., W. A. O’Neill, and H. Cheng. 1994. “Pavement distress classification using neural networks.” In Vol. 1 of Proc., IEEE Int. Conf. on Systems, Man and Cybernetics, 397–401. New York: IEEE.
Daniel, A., and V. Preeja. 2014. “Automatic road distress detection and analysis.” Int. J. Comput. Appl. 101 (10): 18–23. https://doi.org/10.5120/17723-8018.
Dorafshan, S., R. J. Thomas, C. Coopmans, and M. Maguire. 2018. “Deep learning neural networks for SUAS-assisted structural inspections: Feasibility and application.” In Proc., Int. Conf. on Unmanned Aircraft Systems (ICUAS), 874–882. New York: IEEE.
Dung, C. V. 2019. “Autonomous concrete crack detection using deep fully convolutional neural network.” Autom. Constr. 99 (Mar): 52–58. https://doi.org/10.1016/j.autcon.2018.11.028.
Eisenbach, M., R. Stricker, D. Seichter, K. Amende, K. Debes, M. Sesselmann, D. Ebersbach, U. Stoeckert, and H.-M. Gross. 2017. “How to get pavement distress detection ready for deep learning? A systematic approach.” In Proc., Int. Joint Conf. on Neural Networks (IJCNN), 2039–2047. New York: IEEE.
Fan, R., M. J. Bocus, Y. Zhu, J. Jiao, L. Wang, F. Ma, S. Cheng, and M. Liu. 2019. “Road crack detection using deep convolutional neural network and adaptive thresholding.” Preprint, submitted April 18, 2019. http://arxiv.org/abs/1904.08582.
Fan, Z., Y. Wu, J. Lu, and W. Li. 2018. “Automatic pavement crack detection based on structured prediction with the convolutional neural network.” Preprint, submitted February 1, 2018. http://arxiv.org/abs/1802.02208.
Fei, Y., K. C. Wang, A. Zhang, C. Chen, J. Q. Li, Y. Liu, G. Yang, and B. Li. 2019. “Pixel-level cracking detection on 3D asphalt pavement images through deep-learning-based CrackNet-V.” IEEE Trans. Intell. Transp. Syst. 21 (1): 273–284.
Feng, C., M.-Y. Liu, C.-C. Kao, and T.-Y. Lee. 2017. “Deep active learning for civil infrastructure defect detection and classification.” In Proc., Computing in Civil Engineering 2017, 298–306. Reston, VA: ASCE.
Fujita, Y., K. Shimada, M. Ichihara, and Y. Hamamoto. 2017. “A method based on machine learning using hand-crafted features for crack detection from asphalt pavement surface images.” In Vol. 10338 of Proc., 13th Int. Conf. on Quality Control by Artificial Vision 2017. Bellingham, WA: International Society for Optics and Photonics.
Gavilán, M., D. Balcones, O. Marcos, D. F. Llorca, M. A. Sotelo, I. Parra, M. Ocaña, P. Aliseda, P. Yarza, and A. Amrola. 2011. “Adaptive road crack detection system by pavement classification.” Sensors 11 (10): 9628–9657. https://doi.org/10.3390/s111009628.
Geary, G. M., Y. Tsai, and Y. Wu. 2018. “An area-based faulting measurement method using three-dimensional pavement data.” Transp. Res. Rec. 2672 (40): 41–49. https://doi.org/10.1177/0361198118759951.
Gopalakrishnan, K., H. Gholami, A. Vidyadharan, A. Choudhary, and A. Agrawal. 2018. “Crack damage detection in unmanned aerial vehicle images of civil infrastructure using pre-trained deep learning model.” Int. J. Traffic Transp. Eng. 8 (1): 1–14. https://doi.org/10.7708/ijtte.2018.8(1).01.
Gopalakrishnan, K., S. K. Khaitan, A. Choudhary, and A. Agrawal. 2017. “Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection.” Constr. Build. Mater. 157 (Dec): 322–330. https://doi.org/10.1016/j.conbuildmat.2017.09.110.
He, K., X. Zhang, S. Ren, and J. Sun. 2015. “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification.” In Proc., IEEE Int. Conf. on Computer Vision, 1026–1034. New York: IEEE.
He, K., X. Zhang, S. Ren, and J. Sun. 2016. “Deep residual learning for image recognition.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 770–778. New York: IEEE.
Hoang, N.-D. 2018. “An artificial intelligence method for asphalt pavement pothole detection using least squares support vector machine and neural network with steerable filter-based feature extraction.” Adv. Civ. Eng. 2018: 12. https://doi.org/10.1155/2018/7419058.
Huang, H.-W., Q.-T. Li, and D.-M. Zhang. 2018. “Deep learning based image recognition for crack and leakage defects of metro shield tunnel.” Tunnelling Underground Space Technol. 77 (Jul): 166–176. https://doi.org/10.1016/j.tust.2018.04.002.
Islam, M., and J.-M. Kim. 2019. “Vision-based autonomous crack detection of concrete structures using a fully convolutional encoder-decoder network.” Sensors 19 (19): 4251. https://doi.org/10.3390/s19194251.
Isola, P., J.-Y. Zhu, T. Zhou, and A. A. Efros. 2017. “Image-to-image translation with conditional adversarial networks.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 1125–1134. New York: IEEE.
Jenkins, M. D., T. A. Carr, M. I. Iglesias, T. Buggy, and G. Morison. 2018. “A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks.” In Proc., 26th European Signal Processing Conf. (EUSIPCO), 2120–2124. New York: IEEE.
Ji, J., L. Wu, Z. Chen, J. Yu, P. Lin, and S. Cheng. 2018. “Automated pixel-level surface crack detection using U-Net.” In Proc., Int. Conf. on Multi-disciplinary Trends in Artificial Intelligence, 69–78. New York: Springer.
Jiang, C. 2015. “A crack detection and diagnosis methodology for automated pavement condition evaluation.” Ph.D. thesis, Dept. of Civil and Environmental Engineering, Georgia Institute of Technology.
Jiang, C., Y. Tsai, and Z. Wang. 2016. “Use of three-dimensional pavement surface data to analyze crack deterioration: Pilot study on Georgia State Route 26.” Transp. Res. Rec. 2589 (1): 154–161. https://doi.org/10.3141/2589-17.
Jiang, C., and Y. J. Tsai. 2016. “Enhanced crack segmentation algorithm using 3D pavement data.” J. Comput. Civ. Eng. 30 (3): 04015050. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000526.
Kaseko, M. S., and S. G. Ritchie. 1993. “A neural network-based methodology for pavement crack detection and classification.” Transp. Res. Part C: Emerging Technol. 1 (4): 275–291. https://doi.org/10.1016/0968-090X(93)90002-W.
Kaul, V., Y. Tsai, and R. M. Mersereau. 2010. “Quantitative performance evaluation algorithms for pavement distress segmentation.” Transp. Res. Rec. 2153 (1): 106–113. https://doi.org/10.3141/2153-12.
Kim, B., and S. Cho. 2018. “Automated vision-based detection of cracks on concrete surfaces using a deep learning technique.” Sensors 18 (10): 3452. https://doi.org/10.3390/s18103452.
Lee, B. J., and H. D. Lee. 2004. “Position-invariant neural network for digital pavement crack analysis.” Comput.-Aided Civ. Infrastruct. Eng. 19 (2): 105–118. https://doi.org/10.1111/j.1467-8667.2004.00341.x.
Li, B., K. C. Wang, A. Zhang, E. Yang, and G. Wang. 2020. “Automatic classification of pavement crack using deep convolutional neural network.” Int. J. Pavement Eng. 21 (4): 457–463. https://doi.org/10.1080/10298436.2018.1485917.
Li, N., X. Hou, X. Yang, and Y. Dong. 2009. “Automation recognition of pavement surface distress based on support vector machine.” In Proc., 2nd Int. Conf. on Intelligent Networks and Intelligent Systems, 346–349. New York: IEEE.
Li, S., and X. Zhao. 2019. “Image-based concrete crack detection using convolutional neural network and exhaustive search technique.” Adv. Civ. Eng. 2019: 19. https://doi.org/10.1155/2019/6520620.
Li, Y., H. Li, and H. Wang. 2018. “Pixel-wise crack detection using deep local pattern predictor for robot application.” Sensors 18 (9): 3042. https://doi.org/10.3390/s18093042.
Liu, S.-W., J. H. Huang, J.-C. Sung, and C. Lee. 2002. “Detection of cracks using neural networks and computational mechanics.” Comput. Methods Appl. Mech. Eng. 191 (25–26): 2831–2845. https://doi.org/10.1016/S0045-7825(02)00221-9.
Liu, W., D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. 2016. “SSD: Single shot multibox detector.” In Proc., European Conf. on Computer Vision, 21–37. New York: Springer.
Liu, Y., J. Yao, X. Lu, R. Xie, and L. Li. 2019. “DeepCrack: A deep hierarchical feature learning architecture for crack segmentation.” Neurocomputing 338 (Apr): 139–153. https://doi.org/10.1016/j.neucom.2019.01.036.
Long, J., E. Shelhamer, and T. Darrell. 2015. “Fully convolutional networks for semantic segmentation.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 3431–3440. New York: IEEE.
Ma, K., M. Hoai, and D. Samaras. 2017. “Large-scale continual road inspection: Visual infrastructure assessment in the wild.” In Proc., British Machine Vision Conf. (BMVC). Durham, UK: British Machine Vision Association Press.
Maeda, H., Y. Sekimoto, T. Seto, T. Kashiyama, and H. Omata. 2018. “Road damage detection and classification using deep neural networks with smartphone images.” Comput.-Aided Civ. Infrastruct. Eng. 33 (12): 1127–1141. https://doi.org/10.1111/mice.12387.
Maguire, M., S. Dorafshan, and R. J. Thomas. 2018. SDNET2018: A concrete crack image dataset for machine learning applications. Logan, UT: Utah State Univ.
Mandal, V., L. Uong, and Y. Adu-Gyamfi. 2018. “Automated road crack detection using deep convolutional neural networks.” In Proc., IEEE Int. Conf. on Big Data (Big Data), 5212–5215. New York: IEEE.
Mirza, M., and S. Osindero. 2014. “Conditional generative adversarial nets.” Preprint, submitted November 6, 2014. http://arxiv.org/abs/1411.1784.
Moon, H. G., and J. H. Kim. 2011. “Intelligent crack detecting algorithm on the concrete crack image using neural network.” In Proc., 28th ISARC, 1461–1467. London: International Association for Automation and Robotics in Construction.
Moussa, G., and K. Hussain. 2011. “A new technique for automatic detection and parameters estimation of pavement crack.” In Proc., 4th Int. Multi-Conf. on Engineering Technology Innovation, IMETI. Orlando, FL: Multilingual Europe Technology Alliance.
Nguyen, N. T. H., T. H. Le, S. Perry, and T. T. Nguyen. 2018. “Pavement crack detection using convolutional neural network.” In Proc., 9th Int. Symp. on Information and Communication Technology, 251–256. New York: ACM.
Nie, M., and K. Wang. 2018. “Pavement distress detection based on transfer learning.” In Proc., 5th Int. Conf. on Systems and Informatics (ICSAI), 435–439. New York: IEEE.
O’Byrne, M., F. Schoefs, B. Ghosh, and V. Pakrashi. 2013. “Texture analysis based damage detection of ageing infrastructural elements.” Comput.-Aided Civ. Infrastruct. Eng. 28 (3): 162–177. https://doi.org/10.1111/j.1467-8667.2012.00790.x.
Oliveira, H., and P. L. Correia. 2014. “CrackIT—An image processing toolbox for crack detection and characterization.” In Proc., IEEE Int. Conf. on Image Processing (ICIP), 798–802. New York: IEEE.
Özgenel, Ç. F., and A. G. Sorguç. 2018. “Performance comparison of pretrained convolutional neural networks on crack detection in buildings.” In Vol. 35 of Proc., Int. Symp. on Automation and Robotics in Construction, 1–8. Berlin: IAARC.
Park, S., S. Bang, H. Kim, and H. Kim. 2019. “Patch-based crack detection in black box images using convolutional neural networks.” J. Comput. Civ. Eng. 33 (3): 04019017. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000831.
Pauly, L., D. Hogg, R. Fuentes, and H. Peel. 2017. “Deeper networks for pavement crack detection.” In Proc., 34th ISARC, 479–485. Berlin: IAARC.
Ren, S., K. He, R. Girshick, and J. Sun. 2015. “Faster R-CNN: Towards real-time object detection with region proposal networks.” In Advances in neural information processing systems, 91–99. Cambridge, MA: MIT Press.
Ronneberger, O., P. Fischer, and T. Brox. 2015. “U-Net: Convolutional networks for biomedical image segmentation.” In Proc., Int. Conf. on Medical Image Computing and Computer-Assisted Intervention, 234–241. New York: Springer.
Schmugge, S. J., L. Rice, J. Lindberg, R. Grizziy, C. Joffey, and M. C. Shin. 2017. “Crack segmentation by leveraging multiple frames of varying illumination.” In Proc., IEEE Winter Conf. on Applications of Computer Vision (WACV), 1045–1053. New York: IEEE.
Schmugge, S. J., L. Rice, N. R. Nguyen, J. Lindberg, R. Grizzi, C. Joffe, and M. C. Shin. 2016. “Detection of cracks in nuclear power plant using spatial-temporal grouping of local patches.” In Proc., IEEE Winter Conf. on Applications of Computer Vision (WACV), 1–7. New York: IEEE.
Shi, Y., L. Cui, Z. Qi, F. Meng, and Z. Chen. 2016. “Automatic road crack detection using random structured forests.” IEEE Trans. Intell. Transp. Syst. 17 (12): 3434–3445. https://doi.org/10.1109/TITS.2016.2552248.
Simonyan, K., and A. Zisserman. 2014. “Very deep convolutional networks for large-scale image recognition.” Preprint, submitted September 4, 2014. http://arxiv.org/abs/1409.1556.
Stricker, R., M. Eisenbach, M. Sesselmann, K. Debes, and H.-M. Gross. 2019. “Improving visual road condition assessment by extensive experiments on the extended gaps dataset.” In Proc., Int. Joint Conf. on Neural Networks (IJCNN), 1–8. New York: IEEE.
Tabernik, D., S. Šela, J. Skvarč, and D. Skočaj. 2020. “Segmentation-based deep-learning approach for surface-defect detection.” J. Intell. Manuf. 31 (3): 759–776. https://doi.org/10.1007/s10845-019-01476-x.
Tsai, Y.-C., and A. Chatterjee. 2017. “Comprehensive, quantitative crack detection algorithm performance evaluation system.” J. Comput. Civ. Eng. 31 (5): 04017047. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000696.
Tsai, Y.-C., and A. Chatterjee. 2018. “Pothole detection and classification using 3D technology and watershed method.” J. Comput. Civ. Eng. 32 (2): 04017078. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000726.
Tsai, Y.-C. J., and F. Li. 2012. “Critical assessment of detecting asphalt pavement cracks under different lighting and low intensity contrast conditions using emerging 3D laser technology.” J. Transp. Eng. 138 (5): 649–656. https://doi.org/10.1061/(ASCE)TE.1943-5436.0000353.
Tsai, Y. J., F. Li, and Y. Wu. 2013. “A new rutting measurement method using emerging 3D line-laser-imaging system.” Int. J. Pavement Res. Technol. 6 (5): 667–672.
Tsai, Y. J., and Z. Wang. 2015. Development of an asphalt pavement raveling detection algorithm using emerging 3D laser technology and macrotexture analysis. Final Rep. No. NCHRP IDEA Project 163. Washington, DC: Transportation Research Board.
Tsai, Y. J., Y. Wu, and C. Ai. 2011. “Feasibility study of measuring concrete joint faulting using 3D continuous pavement profile data 2.” In Proc., 90th Annual Meeting on Transportation Research Board, 23–27. Washington, DC: Transportation Research Board.
Wang, K., A. Zhang, J. Q. Li, Y. Fei, C. Chen, and B. Li. 2017a. “Deep learning for asphalt pavement cracking recognition using convolutional neural network.” In Proc., Int. Conf. on Airfield Highway Pavements, 166–177. Reston, VA: ASCE.
Wang, L., L. Zhuang, and Z. Zhang. 2018. “Automatic detection of rail surface cracks with a superpixel-based data-driven framework.” J. Comput. Civ. Eng. 33 (1): 04018053. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000799.
Wang, S., S. Qiu, W. Wang, D. Xiao, and K. C. Wang. 2017b. “Cracking classification using minimum rectangular cover–based support vector machine.” J. Comput. Civ. Eng. 31 (5): 04017027. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000672.
Wang, X., and Z. Hu. 2017. “Grid-based pavement crack analysis using deep learning.” In Proc., 4th Int. Conf. on Transportation Information and Safety (ICTIS), 917–924. New York: IEEE.
Xu, H., X. Su, Y. Wang, H. Cai, K. Cui, and X. Chen. 2019. “Automatic bridge crack detection using a convolutional neural network.” Appl. Sci. 9 (14): 2867. https://doi.org/10.3390/app9142867.
Xue, Y., and Y. Li. 2018. “A fast detection method via region-based fully convolutional neural networks for shield tunnel lining defects.” Comput.-Aided Civ. Infrastruct. Eng. 33 (8): 638–654. https://doi.org/10.1111/mice.12367.
Yang, F., L. Zhang, S. Yu, D. Prokhorov, X. Mei, and H. Ling. 2019. “Feature pyramid and hierarchical boosting network for pavement crack detection.” IEEE Trans. Intell. Transp. Syst. 21 (4): 1525–1535.
Yang, X., H. Li, Y. Yu, X. Luo, T. Huang, and X. Yang. 2018. “Automatic pixel-level crack detection and measurement using fully convolutional network.” Comput.-Aided Civ. Infrastruct. Eng. 33 (12): 1090–1109. https://doi.org/10.1111/mice.12412.
Yokoyama, S., and T. Matsumoto. 2017. “Development of an automatic detector of cracks in concrete using machine learning.” Procedia Eng. 171: 1250–1255. https://doi.org/10.1016/j.proeng.2017.01.418.
Yusof, N., M. Osman, M. Noor, A. Ibrahim, N. Tahir, and N. Yusof. 2018. “Crack detection and classification in asphalt pavement images using deep convolution neural network.” In Proc., 8th IEEE Int. Conf. on Control System, Computing and Engineering (ICCSCE), 227–232. New York: IEEE.
Zhang, A., K. C. Wang, Y. Fei, Y. Liu, C. Chen, G. Yang, J. Q. Li, E. Yang, and S. Qiu. 2019. “Automated pixel-level pavement crack detection on 3D asphalt surfaces with a recurrent neural network.” Comput.-Aided Civ. Infrastruct. Eng. 34 (3): 213–229. https://doi.org/10.1111/mice.12409.
Zhang, A., K. C. Wang, Y. Fei, Y. Liu, S. Tao, C. Chen, J. Q. Li, and B. Li. 2018a. “Deep learning–based fully automated pavement crack detection on 3D asphalt surfaces with an improved CrackNet.” J. Comput. Civ. Eng. 32 (5): 04018041. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000775.
Zhang, A., K. C. Wang, B. Li, E. Yang, X. Dai, Y. Peng, Y. Fei, Y. Liu, J. Q. Li, and C. Chen. 2017. “Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network.” Comput.-Aided Civ. Infrastruct. Eng. 32 (10): 805–819. https://doi.org/10.1111/mice.12297.
Zhang, K., H. Cheng, and B. Zhang. 2018b. “Unified approach to pavement crack and sealed crack detection using preclassification based on transfer learning.” J. Comput. Civ. Eng. 32 (2): 04018001. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000736.
Zhang, K., Y. Zhang, and H. Cheng. 2020. “Self-supervised structure learning for crack detection based on cycle-consistent generative adversarial networks.” J. Comput. Civ. Eng. 34 (3): 04020004. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000883.
Zhang, L., F. Yang, Y. D. Zhang, and Y. J. Zhu. 2016. “Road crack detection using deep convolutional neural network.” In Proc., IEEE Int. Conf. on Image Processing (ICIP), 3708–3712. New York: IEEE.
Zimmerman, K. A. 2017. Pavement management systems: Putting data to work. Washington, DC: Transportation Research Board.
Zou, Q., Y. Cao, Q. Li, Q. Mao, and S. Wang. 2012. “CrackTree: Automatic crack detection from pavement images.” Pattern Recognit. Lett. 33 (3): 227–238. https://doi.org/10.1016/j.patrec.2011.11.004.
Zou, Q., Z. Zhang, Q. Li, X. Qi, Q. Wang, and S. Wang. 2018. “DeepCrack: Learning hierarchical convolutional features for crack detection.” IEEE Trans. Image Process. 28 (3): 1498–1512. https://doi.org/10.1109/TIP.2018.2878966.
J. Comput. Civ. Eng., 2020, 34(5): 04020038
