Measurement: Jiale Hou, Huachen Jiang, Chunfeng Wan, Letian Yi, Shuai Gao, Youliang Ding, Songtao Xue

Measurement 196 (2022) 111206
Deep learning and data augmentation based data imputation for structural health monitoring system in multi-sensor damaged state

Jiale Hou a,1, Huachen Jiang a,1, Chunfeng Wan a,*, Letian Yi a, Shuai Gao a, Youliang Ding a, Songtao Xue b,c,*

a Southeast University, Key Laboratory of Concrete and Prestressed Concrete Structure of Ministry of Education, Nanjing, China
b Research Institute of Structural Engineering and Disaster Reduction, College of Civil Engineering, Tongji University, Shanghai, China
c Department of Architecture, Tohoku Institute of Technology, Sendai, Japan
ARTICLE INFO

Keywords: Missing data imputation; Deep learning; Data augmentation; Generative adversarial network; Long short-term memory network

ABSTRACT

Sensors, as an important part of structural health monitoring systems (SHMSs), sometimes behave abnormally due to deterioration or environmental effects, which results in data loss during the health monitoring of structures. Data loss often happens in real monitoring applications, especially in wireless monitoring systems. Missing data, especially long-term continuous missing data, have a great impact on structural damage detection and condition evaluation. Usually, the long-term continuous missing data of the sensors are interpolated by traditional methods such as correlation methods, which use a large amount of normal monitoring data to build models and impute the missing data. However, in practice, many SHMSs in China have been in service for about 20 years or more, and many of the installed sensors have become faulty. It is usually difficult to obtain a dataset large enough for the above methods. In this paper, a novel data imputation framework based on deep learning and a data augmentation technique is therefore proposed, which enables data modeling and missing data imputation based on the small amount of remaining data when multiple sensors fail. Data imputation can be made between the same type of sensors (STSs) and also between different types of sensors (DTSs). A generative adversarial network (GAN) based deep learning method and a data augmentation technique are used for the imputation between STSs, while a long short-term memory (LSTM) network is used for data imputation between DTSs. The proposed methods are verified on the dataset of a real concrete bridge located in China, and the results show that the proposed method achieves good performance.
1. Introduction

Evaluating the health condition of civil engineering structures and ensuring their safety is an important task [1–3]. In recent decades, structural health monitoring systems (SHMSs) have been installed on various important infrastructures, especially long-span bridges [4,5]. SHMSs contain different types of sensors to capture the static and dynamic responses, external excitation loads and service environmental conditions of structures. The installed strain and displacement sensors continuously collect a large amount of structural response data for online structural damage detection and condition evaluation. However, many sensors suffer from severe environmental noise, extreme weather, and other factors, and therefore easily become faulty, and so do the data they collect. Among all types of data abnormality, the most common one is data loss, especially in the wireless transmission case.

Missing data lead to the loss of important monitoring information in an SHMS and seriously reduce its stability and reliability. In addition, many machine learning based structural damage detection algorithms require a complete dataset for calculation, and a dataset with missing values causes the analysis models and algorithms to fail, such as the early warning model based on linear regression [6] and the machine learning based algorithms for structural modal identification [7]. Therefore, repairing data loss or corruption is a vital task.

In recent years, many researchers have developed data imputation
* Corresponding authors at: Southeast University, Key Laboratory of Concrete and Prestressed Concrete Structure of Ministry of Education, Nanjing, China (Chunfeng Wan); Research Institute of Structural Engineering and Disaster Reduction, College of Civil Engineering, Tongji University, Shanghai, China (Songtao Xue).
E-mail addresses: wan@seu.edu.cn (C. Wan), xue@tongji.edu.cn (S. Xue).
1 Jiale Hou and Huachen Jiang contributed equally to this work and should be considered co-first authors.
https://doi.org/10.1016/j.measurement.2022.111206
Received 26 December 2021; Received in revised form 18 March 2022; Accepted 14 April 2022
Available online 20 April 2022
0263-2241/© 2022 Elsevier Ltd. All rights reserved.
methods to reduce the negative effects of data loss on structural health monitoring systems. Most of these data imputation methods are based on correlation models between sensors. Zhang and Luo et al. [8] used linear regression and multiple linear regression between structural responses to recover the lost data in building structures. Chen et al. [9] proposed a copula method to capture the relationship between strain sensors at two different locations and reconstruct the missing data. The imputation results had a high accuracy, but a delicate model is required. Wan et al. [10] reconstructed missing temperature data and acceleration data using a Bayesian multi-task learning approach to overcome the Gaussian process-based Bayesian approach based on the selected covariance function. Bao et al. [11] proposed a data recovery method based on compressive sampling technology for moving wireless sensing systems. Fairly high recovery accuracy can be obtained if the original data are sparse in some orthonormal basis. Zhang et al. [12] reconstructed bridge deflection data with a finite-element model combined with the partial least-squares regression method. Some data imputation methods, however, are based on non-iterative methods. Ivan and Roman et al. introduced a GRNN-based cascade scheme [13] and a GRNN-SGTM ensemble method [14,15] to impute missing data for air pollution monitoring datasets.

With the rapid development of computer technologies such as deep learning and big data, some related algorithms have achieved promising results in the structural health monitoring field in recent years, such as crack detection and damage detection by convolutional neural networks [16]. Researchers have also proposed some data imputation methods based on deep learning. However, most of them are used to impute dynamic acceleration data or velocity data [17–20], and only a few of them concern strain and displacement data [21]. Li et al. [17] developed a hybrid method to convert the missing data imputation task into a time series prediction task. The acceleration data were decomposed by empirical mode decomposition and the long short-term memory (LSTM) network was used to remember long-range correlations of subsequences. Zhao et al. [22] established nonlinear models between multi-source monitoring data by LSTM, including temperature to temperature-induced strain, vehicle-induced strain to dynamic displacement, etc. Fan and Li et al. [18,23] proposed a fully feed-forward convolutional neural network (CNN) and segment based generative adversarial networks (SegGAN), which reconstructed dynamic acceleration data even when the loss ratio reached 90%. Based on the CNN, skip connection and dense block techniques were applied to increase the training efficiency and the accuracy of feature extraction with fewer parameters in the network [19]. Lei et al. [20] proposed a deep convolutional generative adversarial network which included a generator with an encoder-decoder structure and an adversarial discriminator to reconstruct the lost acceleration data. Jiang et al. [21] proposed a generative adversarial network (GAN) for imputing missing strain response using only the incomplete dataset, even during the training process.

The deep learning based data imputation methods proposed in the above papers utilized more than 3 months of historical data [22] or more than 7 sensors of the same type [18–21]. However, many SHMSs installed on civil engineering structures in China have been in operation for more than 20 years, and the sensors of the early 21st century lacked strong stability and robustness. Therefore, these SHMSs are facing the challenge of multi-sensor failure with only a few sensors still working normally. How to utilize limited normal sensor data to repair missing data is an important problem. A more complicated problem arises when there are no normal STSs: then the limited DTSs, which have a stronger nonlinear relationship with the faulty sensors, must be used for data imputation. Faced with limited data in machine learning or deep learning algorithms, on the one hand, some scholars use transfer learning to achieve cross-domain damage diagnosis, from a rich dataset to a limited dataset. Wan et al. [24] achieved effective cross-domain bearing fault diagnosis with superior transfer capability. Lin et al. [25] designed a domain adaptation (DA) neural network for structural damage detection and trained it on data from both the numerical model and the real structure, where no data of the real structure in the damaged state is needed. On the other hand, data augmentation is a method of expanding the amount of data, which means that data scarcity scenarios, where deep learning techniques may fail, are alleviated to some extent so that the imputation result can be improved [26]. It has been widely used in time-series data analysis in many fields [27,28]. Long et al. [29] adopted a stacked autoencoder and a back propagation network to enhance the one-dimensional signal, and the enhanced data was used to train a conventional neural network for mechanical fault diagnosis. Hu et al. [30] proposed a data augmentation algorithm based on the core assumption of Order Tracking and presented a self-adaptive convolutional neural network for fault diagnosis. Through data augmentation technology, these methods realized fault diagnosis based on limited sample data and achieved fairly high accuracy.

Inspired by these methods, a data imputation method based on deep learning and data augmentation is proposed in this paper, which aims to impute missing data between DTSs or STSs. With this method, missing data can be imputed with only 1–2 days of incomplete data, and it outperforms traditional models such as support vector regression (SVR), the radial basis function network (RBFN), and the general regression neural network (GRNN). An adaptive approach is also proposed for different scenarios, imputing missing data between STSs with GAN and data augmentation while imputing data between DTSs with LSTM, to achieve both high accuracy and efficiency. Considering the limited data in the STS scenario, data augmentation is introduced to work together with GAN to improve the imputation effect. Compared with LSTM, GAN with data augmentation requires fewer parameter settings. However, LSTM can establish sound, complicated nonlinear correlation models between DTSs, which can hardly be realized by GAN.

In the following sections, the proposed imputation framework is established and the GAN, LSTM and data augmentation techniques are introduced. Then the datasets obtained from the SHMS of a real box girder bridge are used to verify the proposed imputation method. In the analysis, the imputation results between STSs using GAN and data augmentation, and those between DTSs using LSTM, are presented separately. Discrete data loss can be easily imputed using interpolation or correlation based methods, and other types of anomalies, such as data drift, gain and offset, can be repaired by removing them and treating them as missing data. Therefore, only continuous missing data are considered in this article.

2. Methodology

2.1. Missing data imputation framework

The proposed deep learning and data augmentation based framework for data imputation, shown in Fig. 1, divides the data imputation problem into two cases. In one case, sensors of the same type (STSs) are available, and the GAN is used to establish the nonlinear model between STSs. If the nonlinear model is difficult to obtain because few STSs remain, the data augmentation method can be used to solve this problem. This paper focuses on the missing data problem in bridge health monitoring. Two cases of imputation between STSs are considered: one is missing strain data imputed from normal strain data, and the other is missing displacement data imputed from normal displacement data. It is noted that the strain data of a bridge include two parts: the vehicle-induced response and the temperature-induced response. The two parts of the strain data are imputed separately, and finally added to obtain the imputation result.

In the other case, all STSs are faulty while only DTSs work normally, and LSTM is then used to establish a more strongly nonlinear model between DTSs. Since GANs are prone to mode collapse when the nonlinearity between input data and output data is strong, they are not suitable for imputation between DTSs. Compared with GAN, LSTM can capture the temporal characteristics of data, so it can capture more complex nonlinear relationships between DTSs, but more parameters, such as
Fig. 1. The proposed data imputation framework based on deep learning and data augmentation.
number of layers, the units of each layer, and the parameters in the forget gate, input gate and output gate, need to be adjusted, and more training time is needed to obtain good performance. This imputation framework, compared with other imputation methods, is suitable for incomplete datasets. The dataset containing missing data can be input directly into the imputation network to obtain the complete dataset.

2.2. GAN-based imputation method between STSs

In 2014, Goodfellow et al. [31] proposed the generative adversarial network, which provides a powerful modeling framework for studying the distribution characteristics of complex high-dimensional data. Unlike likelihood-based methods, GANs are referred to as implicit probabilistic models. In practice, GANs have been shown to be very successful in civil engineering, for example in the automated structural design of shear wall buildings [32] and bridge damage detection [33,34]. For the sensor data loss problem, some data imputation methods based on GAN have been proposed and good results have been achieved [20,35]. However, these methods utilize the historical data of more than 7 STSs to impute the strain or acceleration data of only one sensor, where the fault ratio of sensors in the SHMS is no more than 35%. In this article, displacement sensors are considered, and data augmentation is combined with GAN to impute the lost data in a multi-sensor failure state in which 85% of the STSs are faulty and only three remaining sensors are still working normally.

GAN is an unsupervised generative model that consists of two parts, a generator and a discriminator. In this article, the generator learns from the
Fig. 2. Architecture of the proposed GAN.
original observed sensor data distribution, a random noise matrix, and a mask matrix which indicates which part is real or missing, to generate data that are as realistic as possible to "confuse" the discriminator. The discriminator is a classifier, which classifies the generated data according to the generator output and a hint matrix which reveals the imputed and real parts of the data. The goal of the discriminator is to distinguish the generated data from the original data as well as possible. Through this adversarial training process, the performance of the generator and the discriminator is constantly improved so that data are generated with high accuracy. Fig. 2 shows the proposed imputation architecture based on GAN.

For the convenience of the following description, the sensor with missing data is named the "target sensor", while the sensor used to establish the relationship with the target sensor is named the "reference sensor". The target sensor data at a time point $j$ can be expressed as $X_{t,j} = \{X_{t,j}^{1}, X_{t,j}^{2}, \cdots, X_{t,j}^{k}\}$, where $X_{t,j}^{i}$ denotes the data of the $i$-th faulty sensor at time point $j$ and $k$ denotes the total number of target sensors. Similarly, $X_{r,j} = \{X_{r,j}^{1}, X_{r,j}^{2}, \cdots, X_{r,j}^{n}\}$ indicates the reference sensor data at time point $j$, where $X_{r,j}^{i}$ denotes the data of the $i$-th reference sensor and $n$ denotes the total number of reference sensors. Thus, all sensor data at the same time point $j$ can be expressed as $X_j = \{X_{t,j}, X_{r,j}\} = \{X_{t,j}^{1}, X_{t,j}^{2}, \cdots, X_{t,j}^{k}, X_{r,j}^{1}, X_{r,j}^{2}, \cdots, X_{r,j}^{n}\}$. Suppose there are $k$ target sensors; each of them can be modeled by GAN with the reference sensors separately, in the same way. If we focus on a certain sensor, then $k = 1$. The sensor data at the same time point can then be arranged as $X_j = \{X_{t,j}, X_{r,j}^{1}, X_{r,j}^{2}, \cdots, X_{r,j}^{n}\}$ and the dataset can be expressed as $D = \{X_1, X_2, \cdots, X_l\}^{T}$, where $l$ is the total number of sampling points. The mask vector at time point $j$, $M_j = \{M_{t,j}, M_{r,j}^{1}, M_{r,j}^{2}, \cdots, M_{r,j}^{n}\}$, can be obtained from the dataset $D$ and is used to indicate the location of the missing data in the dataset. The incomplete dataset is defined as $\tilde{D} = \{\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_l\}^{T}$, where $\tilde{X}_j = \{\tilde{X}_{t,j}, \tilde{X}_{r,j}^{1}, \tilde{X}_{r,j}^{2}, \cdots, \tilde{X}_{r,j}^{n}\}$ is $X_j$ masked by $M_j$ as:

$$\tilde{X}_{i,j} = \begin{cases} X_{i,j} & \text{if } M_{i,j} = 1 \\ \text{nan} & \text{if } M_{i,j} = 0 \end{cases} \tag{1}$$

where nan means "not a number" and indicates missing data. Imputing the missing data in each $\tilde{X}_j$ using the incomplete dataset $\tilde{D}$ is the goal of this paper. By attempting to model the conditional distribution of $X_j$ given $\tilde{X}_j = \tilde{x}_j$, written as $P(X_j \mid \tilde{X}_j = \tilde{x}_j)$, where $\tilde{x}_j$ is one realization of $\tilde{X}_j$, we can generate samples to fill the missing data in $\tilde{D}$. The original complete dataset is only used to test the effect of the data imputation method based on deep learning.

Data augmentation. The data augmentation technique is used to generate time series data for deep learning networks when the amount of data is small. It has been widely used for time-series data, such as in time-series classification [28] and construction equipment activity recognition [27]. In our case, multiple sensors are deemed damaged, which means only a small amount of sensor data can be used in deep learning modeling. Therefore, adding noise, the most common time series data enhancement, is used to expand the reference sensor data and overcome the problem of the small amount of reference sensor data.

As mentioned earlier, $X_{r,j} = \{X_{r,j}^{1}, X_{r,j}^{2}, \cdots, X_{r,j}^{n}\}$ represents the reference sensor data at time point $j$. The noise $z$ for data augmentation and the augmented data $X'_{r,j}$ at the same time point $j$ are defined as follows:

$$z_j = \{z_{1,j}, z_{2,j}, \cdots, z_{m,j}\} = \{z_{1,j}^{1}, z_{1,j}^{2}, \cdots, z_{1,j}^{n}, z_{2,j}^{1}, z_{2,j}^{2}, \cdots, z_{2,j}^{n}, \cdots\cdots, z_{m,j}^{1}, z_{m,j}^{2}, \cdots, z_{m,j}^{n}\} \tag{2}$$

$$X'_{r,j} = \{X'_{1,j}, X'_{2,j}, \cdots, X'_{m,j}\} = \{X_{1,j}^{1}, X_{1,j}^{2}, \cdots, X_{1,j}^{n}, X_{2,j}^{1}, X_{2,j}^{2}, \cdots, X_{2,j}^{n}, \cdots\cdots, X_{m,j}^{1}, X_{m,j}^{2}, \cdots, X_{m,j}^{n}\} \tag{3}$$

where $m$ denotes the number of augmentations of the reference sensors, and $z_{m,j}^{n}$ and $X_{m,j}^{n}$ are the noise and the augmented data, respectively, of the $n$-th reference sensor at the $m$-th augmentation. It is noted that the noise $z$ follows a Gaussian distribution with a mean value of 0 and a very small variance of about 0.001, to prevent $z$ from affecting the characteristics of the original signal. $X_{m,j}^{n}$ is calculated by $X_{m,j}^{n} = X_{r,j}^{n} + z_{m,j}^{n}$, and this formula can also be expressed in matrix form as below:

$$X'_{i,j} = z_{i,j} \oplus X_{r,j} \tag{4}$$

where $\oplus$ means element-wise addition. The augmented data are added to the dataset $D$, which can be expressed as $D = \{X_1, X_2, \cdots, X_l\}^{T}$ with each $X_j = \{X_{t,j}, X_{r,j}, X'_{r,j}\}$. Then, the new mask matrix $M$ and the new random noise variable $Z$ (different from $z$) can be obtained from the new dataset $D$ and input into the GAN architecture by the above process, as shown in Fig. 3.

Generator network. Before the incomplete data $\tilde{D}$ is input into the generator network, the missing data of the target sensor $\tilde{X}_{t,j}$ must be filled with the random noise variable $Z = \{Z_1, Z_2, \cdots, Z_l\}$. The generator $G$ takes $\tilde{X}_j$, $M_j$ and $Z_j$ as input and outputs $\bar{X}_j$, a vector of imputations. Then $\bar{X}_j$ and $\hat{X}_j$ are defined as follows:

$$\bar{X}_j = G\left(\tilde{X}_j, M_j, (1 - M_j) \odot Z_j\right) \tag{5}$$

$$\hat{X}_j = M_j \odot \tilde{X}_j + (1 - M_j) \odot \bar{X}_j \tag{6}$$

where $\odot$ denotes element-wise multiplication. $\bar{X}_j$ represents the vector of generated values and $\hat{X}_j$ corresponds to the complete data vector obtained by replacing the missing data in the incomplete vector $\tilde{X}_j$ with the corresponding values of $\bar{X}_j$. The architecture of the generator $G$ is composed of an input layer and two fully connected layers whose activation functions are the Rectified Linear Unit (ReLU) and sigmoid respectively, as shown in Fig. 2. ReLU and sigmoid can be formulated as follows:

$$\text{ReLU}: \; f(x) = \max(0, x) = \begin{cases} x_i, & \text{if } x_i \geq 0 \\ 0, & \text{if } x_i < 0 \end{cases} \tag{7}$$

$$\text{Sigmoid}: \; f(x) = \frac{1}{1 + \exp(-x)} \tag{8}$$

Discriminator network. As in the traditional GAN framework, the discriminator $D$ is used as an adversary network to the generator $G$. However, the output of the generator $G$ in this GAN is neither completely real nor completely fake data, because data are generated only where data are missing. Therefore, the output vector of the discriminator is not simply true or false, but identifies which part of the output data of $G$ is true (observed) and which part is fake (imputed).

The hint matrix $H$ is predefined and depends on the mask matrix according to the hint mechanism. The discriminator $D$ takes the output $\hat{X}_j$ of the generator and the hint matrix $H$, and then predicts the mask matrix $\hat{M}$. $D$ consists of an input layer activated by the ReLU function and two fully connected layers whose activation functions are the Rectified Linear Unit (ReLU) and sigmoid respectively, as shown in Fig. 2.

Hint mechanism. The discriminator needs to discriminate each value generated by $G$ from the remaining ground truth data, but the continuous loss of a large amount of data without any "hint" will confuse the discriminator. The hint matrix is a random variable $H$, which contains most of the information of the mask matrix and provides a hint for the discriminator. The hint matrix, as shown in Fig. 3(d), which is
Fig. 3. Simple example of each matrix: (a) dataset D, (b) mask matrix M, (c) random noise matrix and (d) hint matrix.
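The matrices illustrated in Fig. 3 can be reproduced for a single time point with a few lines of Python. The following is only a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the sensor readings are made up, the uniform range used for the noise Z is an assumption (the text does not state its distribution), the 0.5 hint entries are drawn independently with probability 0.1 rather than as an exact proportion, and the generator output is replaced by a constant stand-in.

```python
import math
import random

def augment_reference(x_ref, times, variance=0.001, rng=random):
    """Eq. (4): return `times` copies of x_ref, each with fresh Gaussian noise added."""
    std = math.sqrt(variance)
    return [[x + rng.gauss(0.0, std) for x in x_ref] for _ in range(times)]

def mask_of(row):
    """Mask entry: 1 where a value is observed, 0 where it is missing (nan)."""
    return [0 if math.isnan(v) else 1 for v in row]

def fill_missing(row, mask, rng=random):
    """Put random noise Z into the missing slots before the row enters the generator.
    The range [0, 0.01] is an assumption made for this sketch."""
    return [v if m == 1 else rng.uniform(0.0, 0.01) for v, m in zip(row, mask)]

def hint_of(mask, unknown_prob=0.1, rng=random):
    """Hint entry: the mask value, or 0.5 with probability `unknown_prob`."""
    return [0.5 if rng.random() < unknown_prob else float(m) for m in mask]

def combine(row, mask, generated):
    """Eq. (6): keep observed values and take generated ones where data are missing."""
    return [v if m == 1 else g for v, m, g in zip(row, mask, generated)]

rng = random.Random(0)
x_ref = [0.42, 0.40, 0.45]             # made-up readings of three reference sensors
row = [float("nan")] + x_ref           # the target sensor reading is missing
for noisy_copy in augment_reference(x_ref, times=2, rng=rng):
    row += noisy_copy                  # X_j = {X_t, X_r, X'_r}
mask = mask_of(row)
generator_input = fill_missing(row, mask, rng)
hint = hint_of(mask, rng=rng)
generated = [0.41] * len(row)          # stand-in for the generator output
imputed = combine(row, mask, generated)
```

With two augmentations of three reference sensors, each row holds ten values: one target reading, three reference readings and six augmented readings. Only the first entry of `imputed` comes from the stand-in generator output.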
dependent on the distribution $H \mid M$, tells the discriminator most of the answers: 0 represents missing data, 1 represents observed data, and 0.5 represents the data that needs to be discriminated by the discriminator itself. The proportion of "0.5" entries in the hint matrix $H$ needs to be set manually and defaults to 0.1 in this paper. After repeated iterations, the discriminator finally learns the distribution of the data and outputs a satisfactory result $\hat{M}$.

Objective function. Through the adversarial training process, the discriminator $D$ is trained to increase the accuracy of predicting the mask matrix, and the generator $G$ is trained to "confuse" $D$ and minimize the accuracy with which $D$ predicts the mask matrix. The objective function can be defined as follows:

$$\min_{G} \max_{D} \; \mathbb{E}_{\hat{X}, M, H}\left[ M^{T} \log \hat{M} + (1 - M)^{T} \log\left(1 - \hat{M}\right) \right] \tag{9}$$

where $\log$ is the element-wise logarithm and the dependence on $G$ is via $\hat{X}$.

2.3. LSTM-based imputation method between DTSs

The long short-term memory (LSTM) network, proposed by Hochreiter and Schmidhuber in 1997, is a variant of the recurrent neural network (RNN), which can map all historical data to each prediction and thereby remembers the long-term correlation of time series [36]. The greatest contribution of LSTM is to mitigate the gradient vanishing and explosion problems of RNNs through the development of three gates, the input gate, the forget gate, and the output gate, as shown in Fig. 4(a). The forget gate $f_t$ determines which information from the previous cell should be deleted, and $o_t$ determines what to output by filtering the input and the previous hidden state. Suppose the input time series data is $x = (x_1, x_2, \cdots, x_n)$ and the
Fig. 4. Architecture of LSTM network (a) LSTM unit structure and (b) the proposed LSTM network for data imputation.
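The LSTM unit of Fig. 4(a), written out in Eqs. (10)-(15), can be sketched as a single time step in plain Python. This is an illustrative sketch only: the function names, the dictionary layout of the parameters and the all-zero weights in the usage lines are assumptions chosen so that the result is easy to check by hand. It does not reproduce the three-layer network of Fig. 4(b) or its training.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, v, b):
    """Return W v + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (10)-(15)."""
    hx = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], hx, params["b_f"])]        # Eq. (10)
    i_t = [sigmoid(a) for a in affine(params["W_i"], hx, params["b_i"])]        # Eq. (11)
    c_tilde = [math.tanh(a) for a in affine(params["W_C"], hx, params["b_C"])]  # Eq. (12)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]      # Eq. (13)
    o_t = [sigmoid(a) for a in affine(params["W_o"], hx, params["b_o"])]        # Eq. (14)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                          # Eq. (15)
    return h_t, c_t

# Usage with two hidden units, one input feature and all-zero parameters.
hidden, inputs = 2, 1
zero_weights = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zero_weights for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_C", "b_o")})
h_t, c_t = lstm_step([0.3], [0.0, 0.0], [1.0, -1.0], params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the step simply halves the previous cell state. That makes the roles of the forget and output gates visible without any training.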
predicted value is $y$, the LSTM mechanism can be represented by the following equations:

$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \tag{10}$$

$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \tag{11}$$

$$\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right) \tag{12}$$

$$C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t \tag{13}$$

$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \tag{14}$$

$$h_t = o_t \cdot \tanh(C_t) \tag{15}$$

where $\sigma$ and $\tanh$ denote the sigmoid and hyperbolic tangent functions; $i_t$ and $h_t$ denote the activation vectors of the input gate and the hidden state; $C_t$ and $\tilde{C}_t$ are the cell state and the updated (candidate) state at time step $t$; $W_i$, $W_C$, $W_f$, $W_o$ are the corresponding weight matrices; $b_i$, $b_C$, $b_f$, $b_o$ are the corresponding biases; and the symbol "$\cdot$" represents element-wise multiplication.

An LSTM-based method for imputation between DTSs is proposed in this paper. The architecture of the LSTM network is shown in Fig. 4(b) and consists of five layers: an input layer, three LSTM hidden layers, and a fully connected layer. The network is trained with the Adaptive Moment Estimation (Adam) optimizer. The number of hidden layer units is generally set by weighing accuracy against efficiency and by minimizing the possibility of gradient explosion during training. The loss function of the LSTM network is mean
Fig. 5. (a) Lieshihe bridge; (b) strain sensor; (c) dynamic displacement sensor; (d) sensor deployment of the midspan cross section.
squared error (MSE), which can be defined as follows:

$$\text{loss} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \tag{16}$$

where $y_i$ and $\hat{y}_i$ represent the ground truth value and the predicted value respectively.

3. Application and verification

3.1. Data pre-processing

The data imputation method is applied to the monitoring data of the Lieshihe bridge, a real box girder bridge located in China as shown in Fig. 5(a), to validate its effect and efficiency. The sensor deployment of the middle cross section of the bridge is shown in Fig. 5(d), and a large amount of complete data of all sensors is collected. It can be seen that there are two types of sensors in the SHM system, strain sensors and dynamic displacement sensors, as shown in Fig. 5(b) and (c), and that 3 sensors are faulty, including 2 strain sensors and 1 dynamic displacement sensor. In this case, monitoring data of all normal sensors on two days in July 2017 are used for the algorithm verification. The sampling frequency of these sensors is 10 Hz.

The strain sensors can capture the quasi-static behavior caused by temperature, so their data distribution is obviously different from that of the dynamic displacement sensors. Therefore, the strain sensor data can be divided into two parts: the temperature-induced response and the vehicle-induced response. The vehicle-induced response is related to moving vehicles and occurs at random times, while the temperature-induced response has a strong regularity and changes with the ambient temperature over a one-day period. The captured dynamic displacement data only contain the vehicle response part. In order to reduce the influence of the temperature-induced response on the modeling, the strain data are decomposed into vehicle-induced response data, which are used to model the dynamic displacement sensor data, and quasi-static temperature-induced response data. The vehicle-induced strain response and the temperature-induced strain response can be separated by wavelet packet decomposition. Wavelet packet decomposition, developed from the wavelet transform, is very accurate in signal analysis and has the advantage of resolving detailed high-frequency information. In our case, the Daubechies wavelet is applied with a filter length of 16. Since the vehicle-induced response is a high-frequency component and the temperature-induced response is a low-frequency component, it is easy to separate the two responses, as shown in Fig. 6.

Before decomposition, it is necessary to determine the missing data range of the original strain and displacement data by identifying the data that remain 0 or nan for a long time period. It is noted that the strain sensor data are first divided into one-hour segments; the complete hourly data are then decomposed by wavelet packet decomposition, while the hourly data with missing parts are not decomposed, but directly imputed. That is because the sensor data with data loss are incomplete, and it is impossible to apply wavelet decomposition to a whole day's data directly.

After wavelet transform decomposition, data normalization is also necessary to improve the imputation efficiency and to evaluate the imputation performance. The vehicle-induced strain data, temperature-induced strain data, and dynamic displacement data are normalized and mapped into the range 0 to 1. The normalization equation is as follows:

$$y_{\text{normalized}} = \frac{y - \min(y)}{\max(y) - \min(y)} \tag{17}$$

where $y$ represents the sensor data.

3.2. GAN-based imputation method between STSs

The distribution characteristics of the STS data on the same section are often similar; however, the nonlinear mapping model between DTSs, such as dynamic displacement sensors and strain sensors, is much more complex than that between STSs. The two cases will be discussed separately. In this section, two randomly selected days of monitoring data from 18 strain sensors and 4 dynamic displacement sensors of the Lieshihe Bridge in September 2018 are used to validate the GAN-based imputation method between STSs. Assuming that some data of a certain sensor are missing, the proposed GAN imputation framework is applied to establish the nonlinear mapping model between the target sensor and the reference sensors, and the missing data can then be imputed.

3.2.1. Imputation between strain sensors

In order to validate the effectiveness of the GAN imputation framework, it is assumed that the data of the No.77 strain sensor from 1:00 a.m. to 3:00 a.m. are missing. The proposed GAN-based imputation method utilizes the remaining 17 strain sensors to establish a nonlinear mapping model between STSs, and imputes the missing data of the No.77 strain sensor. As mentioned before, the strain sensor data contain two parts, the vehicle-induced response and the temperature-induced response. These two parts should be imputed separately, and then added to obtain the imputation result for the missing strain data.

However, many SHMSs installed on bridges in China have been in operation for more than 20 years, and many sensors in these SHMSs have actually failed. Therefore, it is important to consider multi-sensor damage in SHMSs. For this reason, another case is also considered, in which only three sensors (the No.69, 78 and 79 strain sensors) are still working, which means that 85% of the sensors in this SHMS have become faulty. In this case, the proposed GAN-based imputation method utilizes the data of only three strain sensors to establish a mapping model with the No.77 strain sensor. Firstly, the vehicle-induced response imputation is considered. All the vehicle-induced response data decomposed by the wavelet transform are input into the GAN without data augmentation, and the number of training epochs is set to 10000.
Fig. 6. Displacement and strain monitoring data in a single day: (a) No.100 displacement data; (b) No.77 strain data and temperature-induced part; (c) vehicle-induced part of No.77 strain data.
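Two of the pre-processing steps of Section 3.1, locating long runs of 0 or nan and the min-max normalization of Eq. (17), can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names and the `min_len` threshold are hypothetical (the text only speaks of "a long time period"), the sample values are made up, and the normalization assumes the segment is not constant. The wavelet packet decomposition itself is not shown.

```python
import math

def is_missing(v):
    """A sample counts as missing when it is exactly 0 or nan."""
    return v == 0 or math.isnan(v)

def missing_ranges(signal, min_len):
    """Return (start, end) index pairs, end exclusive, for runs of 0/nan
    that last at least `min_len` samples."""
    ranges, start = [], None
    for idx, v in enumerate(signal):
        if is_missing(v):
            if start is None:
                start = idx
        elif start is not None:
            if idx - start >= min_len:
                ranges.append((start, idx))
            start = None
    if start is not None and len(signal) - start >= min_len:
        ranges.append((start, len(signal)))
    return ranges

def min_max_normalize(signal):
    """Eq. (17), computed over the observed (non-nan) samples only."""
    observed = [v for v in signal if not math.isnan(v)]
    lo, hi = min(observed), max(observed)
    return [v if math.isnan(v) else (v - lo) / (hi - lo) for v in signal]

nan = float("nan")
raw = [2.0, 0.0, 4.0, nan, nan, 0.0, 0.0, 6.0, 8.0]   # made-up samples
gaps = missing_ranges(raw, min_len=3)                  # the single 0 at index 1 is too short
normalized = min_max_normalize([2.0, 4.0, 6.0, 8.0])   # a complete segment
```

At the 10 Hz sampling frequency stated above, a threshold of, say, one minute would correspond to `min_len=600`. That value is only an example and is not taken from the paper.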
The imputation result with observed data when 17 sensors are used as reference sensors is shown in Fig. 7(a). It can be seen that the algorithm has excellent performance and most of the imputation data almost coincide with the original data, except for a few peaks. On these peaks, the absolute value of the imputation data is generally a little smaller than the original real data, which is defined as a peak weakening phenomenon (PWP). Fig. 7(b) shows the imputation results when No.69, 78, and 79 strain sensors are used as reference sensors. In this case, most of the imputation data also coincide with the original data; however, the PWP becomes more evident. The PWP will cause the SHMSs to underestimate the real vehicle-induced response on the bridge structures, and then affect the evaluation of bridge structural state. Therefore, it is very important to study the reasons of PWP and put forward correction methods.

Fig. 7. Imputation result for vehicle-induced part of No.77 strain sensor: (a) when 17 sensors are used as reference sensors; (b) when only 3 sensors (No.69, 78 and 79 strain sensors) are used as reference sensors.

The PWP is likely to be related to the complex nonlinear relationship between sensor data. Take No.77 and 79 strain sensors as an example. Fig. 8 shows the data of No.77 and 79 strain sensors, which are located on two different box girder bottom plates of the same section, as shown in Fig. 5. It can be seen that some values at the peak of No.77 sensor are only slightly smaller than those of No.79 sensor, some are much smaller (nearly half), and some are even greater than those of No.79 sensor. This is because when a vehicle passes through the rightmost lane of the bridge section, it will cause greater displacement and strain in the right lane than that in the left lane. In this case, No.77 strain sensor is close to the right lane and it will produce a large vehicle-induced response, especially a large peak. However, when the vehicle passes through the left lane of the bridge, it has the opposite effect. Therefore, there is a strong nonlinear relationship between the two strain sensors. When GAN uses fewer reference sensors to impute the lost data of the target sensor, this strong nonlinear relationship is too complex to be captured, because vehicles sometimes drive through different lanes randomly. Only when the number of reference sensors is enough to identify the approximate position of the vehicle passing through the bridge section, can the mapping relationship, especially at the peak, be obtained.

Fig. 8. Vehicle-induced response of No.77 and 79 strain sensors.

Due to the poor performance of GAN-based imputation method in the case of fewer reference sensors, adding noise, a kind of data augmentation technique, is considered to augment the inherent characteristics of input data and strengthen the discriminator, so that GAN can better capture the nonlinear relationship between target sensor data and reference sensor data. The augmented reference sensor data, which are augmented 1, 3, 6, 9, 12, 15, 18, 36, 72 and 100 times respectively by adding random Gaussian noise, are input into GAN's generator together with the original reference sensor data and simulated faulty sensor data. The imputation results of missing data of strain sensor 77 are shown in Fig. 9. It can be seen that the imputation result of one-time augmentation is significantly better than that without data augmentation and the PWP is significantly reduced. With the increase of data augmentation times, the reduction of PWP is more and more obvious, and the imputation results gradually reach the original data at the peak.

Fig. 9. Imputation result for vehicle-induced part of No.77 strain sensor under different augmentation times.

The mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination (R²) between the imputed data and observed data are chosen as the evaluating indicators to judge the quality of the model.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2 \tag{18}$$

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \tag{19}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{20}$$

where $y_i$ and $\hat{y}_i$ are the observed data and imputed data respectively, $n$ is the length of missing data, and $\bar{y}$ is the mean value of observed vehicle-induced data. It is easy to conclude that the smaller the MSE and RMSE, or the larger the R², the better the results of data imputation.

The error of the data imputation result at the peak is much more important than that at other time points, because PWP will lead to underestimation of vehicle-induced response on bridge. However, MSE, RMSE and R² treat the error of the data imputation result at all time points equally. Therefore, in order to evaluate PWP of the data imputation result, amplitude ratio f is defined as follows:

$$f = \left|\frac{\mathrm{max}_i - \mathrm{min}_i}{\mathrm{max}_t - \mathrm{min}_t} - 1\right| \tag{21}$$

where $\mathrm{max}_i$ and $\mathrm{min}_i$ are the maximum and minimum values of the imputation data respectively, and $\mathrm{max}_t$ and $\mathrm{min}_t$ are the maximum and minimum values of the observed data respectively. The smaller the value of f, the less peak clipping or exceeding occurs. J is taken as the overall evaluation index to comprehensively consider f and MSE as follows:

$$J = f \cdot \mathrm{MSE} \tag{22}$$

With the increase of number of data augmentation times, the evaluating indicators are shown in Fig. 10.

Fig. 10. The line chart of evaluating indicators of vehicle-induced response imputation effect under different augmentation times.

It can be seen from the results that with the increase of augmentation times, MSE, RMSE and f decrease rapidly, while R² increases rapidly. However, when the number of data augmentation times reaches 6, the data imputation has achieved an excellent result. Therefore, when the number of data augmentation times continues to increase, the imputation results will not be greatly improved. However, the training time will be increased dramatically with the increase of input data, which makes it difficult to ensure real-time data imputation.

The data imputation of the temperature-induced response of strain data is different from the vehicle-induced response. Compared with the
randomness of vehicle-induced response, the temperature-induced response has obvious periodicity, generally taking one day as a complete cycle. Incomplete one-day reference sensor data, like those used in the data imputation of vehicle-induced response, cannot enable GAN to capture the periodicity of temperature-induced response of one day. Thus, in order to ensure that the GAN can recognize the periodicity of the temperature-induced response, two days' temperature-induced data of reference sensor are used to impute the missing data. It is also assumed that the data of No.77 strain sensor is missing from 1:00 a.m. to 3:00 a.m. and only three sensors are still working normally (No.69, 78 and 79 strain sensors), which are used as reference sensors. MSE, RMSE and R² are chosen as the evaluating indicators to judge the quality of the model. The results are shown in Fig. 11.

Fig. 11. The line chart of evaluating indicators of temperature-induced response imputation effect under different augmentation times.

It can be seen from Fig. 11 that MSE and RMSE show a downward trend as the number of data augmentation times increases. When the reference sensor data is augmented more than 6 times, MSE and RMSE are reduced by about 60% and reach the stable stage. As shown in Fig. 12, the imputation results of all numbers of data augmentation times meet the engineering requirements.

Fig. 12. Imputation result for temperature-induced part of No.77 strain sensor under different augmentation times.

Fig. 13 shows the missing data imputation result for No.77 strain sensor, with the vehicle-induced responses being augmented 6 times and temperature-induced responses being augmented 60 times respectively. It can be found that with only three reference sensors excellent imputation can be achieved. Both the vehicle-induced response and temperature-induced response of No.77 strain sensor can be imputed accurately, which verifies the effectiveness of GAN-based imputation method and data augmentation when multiple sensors fail.

3.2.2. Imputation between dynamic displacement sensors

The dynamic displacement sensor is different from the strain sensor. The strain sensor mainly monitors the influence of the local stress of the structure under the environmental load, while the dynamic displacement sensor reflects the global response. Besides, the dynamic displacement sensor cannot capture the semi-static temperature-induced response sensitively, so the data characteristics of the dynamic displacement sensor are significantly different from those of the vehicle-induced response. In order to verify the effectiveness of the proposed GAN-based imputation method with dynamic displacement data, the 24-hour data of No.100, 101, 102, and 103 dynamic displacement sensors of the Lieshihe bridge, which are still working normally, are used in this section. It is assumed that the data of the No.102 dynamic displacement sensor from 1:00 a.m. to 3:00 a.m. is missing, and the remaining data of four other displacement sensors in that day are used for imputation. Based on the imputation experience of the strain sensor, the data augmentation method is used to improve the imputation results. No.100, 101, and 103 reference sensors are augmented with different numbers of augmentation times respectively, and the imputation results of missing data of dynamic displacement sensor 102 can be obtained as shown in Fig. 14. The dynamic displacement sensor imputation result is evaluated by the same evaluation index as the vehicle-induced strain, because PWP will occur in the imputation of both. The evaluation indicators of data imputation of dynamic displacement data with different data augmentation times
are shown in Fig. 15.

Figs. 14 and 15 show that the imputation results by dynamic displacement are similar to those by vehicle-induced strain. There is a common and obvious peak weakening phenomenon (PWP) without data augmentation, and the peak values of the imputation results are reduced by about 50% compared with those of the measured data. However, with the increasing number of data augmentation times, as seen from Figs. 14 and 15, MSE, f, and J show a downward trend, while R² shows an upward trend, and the PWP decreases significantly. It is noted that with 100 times of data augmentation the MSE is 78.4%, f is 86% and J is 97% lower than that of no data augmentation. When the reference sensor is augmented 6 times, f and J reach the stable stage, and the data imputation results will not be greatly improved. The network structure used in the training is the same, so the training time depends entirely on the size of the input dataset. With the increase of augmentation times, the dataset input into GAN continues to increase, and then the training time of the imputation network increases significantly. Therefore, in practical engineering, the reference sensor can be augmented 6 times and then the collected data can be input into the GAN to impute missing data of target sensors. Through the above tests, it is proved that the data augmentation technique is effective when fewer STSs data are available and is effective for both strain sensors and dynamic displacement sensors.

Fig. 13. Imputation result for No.77 strain sensor.

3.3. LSTM-based imputation between DTSs

In the previous sections, the STSs are used as reference sensors to impute missing data, and it is verified that the combination of the GAN-based imputation method and data augmentation technique gives very good effect even when fewer STSs are available. However, dynamic displacement sensors monitor the global response of the bridge structure, so the number of displacement sensors installed on bridge is relatively very small compared with the strain sensors, which monitor the local response of the bridge. Therefore, the dynamic displacement
sensors sometimes face the situation that there are no STSs to be the reference sensor. In this case, the local response data can be considered to impute the missing data in the overall response, that is, using the strain sensor as the reference sensor, and establishing the imputation model between DTSs. The dynamic displacement sensor only has the vehicle-induced response part, so the temperature-induced part of the strain sensor should be removed first, and then the model between DTSs can be established. Compared with STSs, the nonlinearity of the relationship between DTSs is much stronger. Fig. 16 shows the vehicle-induced part of No.77 strain sensor and No.100 dynamic displacement sensor data.

Fig. 14. Imputation result for No.102 dynamic displacement sensor under different augmentation times.

Fig. 15. The line chart of evaluating indicators of dynamic displacement imputation effect under different augmentation times.

It is assumed that No.102 dynamic displacement sensor data from 1:00 a.m. to 3:00 a.m. is missing, and only one day's vehicle-induced response data of No.77 and No.78 strain sensors are available to impute the missing data of No.102 displacement sensor. There are only 2 normal sensors out of 20 strain sensors, and the sensor loss ratio of the imputation between DTSs in this situation is 18/20, or 90%. After 10,000 epochs of GAN training, the loss function converges as shown in Fig. 17, and the imputation performance is poor. As shown in Fig. 18, the data imputation of GAN obviously does not meet the engineering requirements.

The poor performance of GAN-based imputation method between DTSs is mainly from the fact that the imputed data is generated according to the data distributional characteristics between sensors, and this generation method does not make use of the time characteristics of time series data, so it is difficult to capture the strong nonlinear relationship between DTSs. On the other hand, in the case of highly nonlinear mappings between DTSs, there may be mode collapse in GAN model [37], which leads to poor imputation effect. Therefore, even if the sensor data is augmented 6 times, the imputation effect is still poor as shown in Fig. 19. It can be concluded that GAN-based imputation method is not suitable for DTS situation.

In order to make full use of the time characteristics between DTSs, LSTM-based imputation method is used to impute the missing data. The verification of LSTM-based imputation method is carried out under the same simulated situation as above, with 2 h of data of No.102 dynamic displacement sensor missing, and one day's vehicle-induced response data of No.77 and No.78 strain sensors are used as reference data. The remaining data is input into LSTM and after adjusting parameters several times, an accurate mapping model of strain and dynamic displacement based on LSTM is established to impute the missing data of dynamic displacement sensor. The unit numbers of the three LSTM layers are 128, 32, and 4 respectively. The learning rate is 0.01 and the number of training epochs is 40. The results obtained after 40 epochs of training are shown in Fig. 20. It can be seen from the results that LSTM successfully captures the nonlinear relationship between different types of sensors, with less peak clipping, and the final regularized MSE is only 0.000489.

It is found that using No.77 and No.78 as reference sensors can also impute the missing data of other dynamic displacement sensors, such as No.101 dynamic displacement sensor as shown in Fig. 21 and No.103 dynamic displacement sensor as shown in Fig. 22. The unit numbers of the three LSTM layers are also 128, 32, and 4 respectively and the number of training epochs is 40. It can be seen from Figs. 21 and 22 that LSTM successfully established the relationship between the dynamic displacement sensor and vehicle-induced part of strain sensor. The final regularized MSE of
No.101 dynamic displacement sensor is 0.000555, and that of No.103 dynamic displacement sensor is 0.000535.

4. Discussion

In this paper, a missing data imputation framework to deal with different missing data scenarios is proposed. GAN is used to impute missing data between STSs, with data augmentation technique being incorporated to reduce PWP. LSTM is adopted to establish the nonlinear correlation between DTSs to impute missing data from dynamic displacement sensor. In fact, some non-iterative models have also achieved fairly good imputation effect, such as GRNN based methods. Therefore, it is necessary to compare the imputation effects between these models. The traditional methods based on GRNN, SVR, and RBFN are used to compare with the method proposed in this paper.

Fig. 16. Correlation between the vehicle-induced part of No.77 strain sensor and No.100 dynamic displacement sensor: (a) original data; (b) the vehicle-induced part of No.77 strain sensor versus No.100 dynamic displacement sensor.

For the verification of imputation between STSs, the vehicle-induced strain data of three sensors (No.69, 78 and 79 strain sensors), from 0:00 a.m. to 1:00 a.m., are used for training and establishing a
mapping model with No.77 strain sensor to impute the assumed missing vehicle-induced data from 1:00 a.m. to 3:00 a.m. The parameters of the SVR, GRNN and RBFN models are determined by 4-fold cross validation. MSE and R² results of different methods are shown in Table 1. It can be seen from the table that MSE of GAN with 3 times data augmentation is lower than that of GRNN, RBFN, and SVR based methods, which means that the imputation effect of GAN with 3 times data augmentation is better than other methods. For the GAN-based method, with the increase of data enhancement times, MSE is decreasing and R² is increasing, which corresponds to the results in section 'GAN-based imputation method between STSs'.

Table 1
MSE and R² results of GAN and other methods for comparison.

Method  Parameters setting                                  MSE     R²
GRNN    Spread of radial basis functions: 0.1               0.2249  0.9491
RBFN    Spread of radial basis functions: 1.0               0.0515  0.9883
        MSE goal: 0.0001
        Maximum number of neurons: 4
        Number of neurons to add between displays: 25
SVR     Kernel function: RBF                                0.0709  0.9839
        Degree in kernel: 3
        P in loss function: 0.02
GAN     Data enhancement times: 0                           0.1586  0.9641
        Data enhancement times: 1                           0.1305  0.9704
        Data enhancement times: 3                           0.0345  0.9922
        Data enhancement times: 6                           0.0308  0.9930
        Data enhancement times: 12                          0.0307  0.9930
        Data enhancement times: 24                          0.0269  0.9939
        Data enhancement times: 36                          0.0237  0.9946

For the imputation between DTSs, the vehicle-induced strain data (No.77, 78 strain sensors), from 0:00 a.m. to 1:00 a.m., are used for training and to impute the assumed missing data of No.102 displacement sensor, from 1:00 a.m. to 3:00 a.m. MSE and R² results of different models are shown in Table 2. It can be concluded that MSE of LSTM is evidently lower than those of GRNN, RBFN, and SVR based methods, which verifies that LSTM can better capture the time characteristics.

Table 2
MSE and R² results of LSTM and other methods for comparison.

Method  Parameters setting                                  MSE     R²
GRNN    Spread of radial basis functions: 0.1               0.1055  0.2905
RBFN    Spread of radial basis functions: 1.0               0.0930  0.3745
        MSE goal: 0.001
        Maximum number of neurons: 4
        Number of neurons to add between displays: 25
SVR     Kernel function: RBF                                0.0930  0.3744
        Degree in kernel: 3
        P in loss function: 0.02
LSTM    LSTM layer 1 units: 256                             0.0345  0.7682
        LSTM layer 2 units: 64
        LSTM layer 3 units: 32
        LSTM layer 4 units: 4
        Fully connected layer

It can be found that the imputation effect of GAN and LSTM is better than the traditional methods. Furthermore, compared with the traditional methods, fewer parameters are required for this proposed method. Although these findings show that the proposed method works successfully and effectively, there are still limitations. Firstly, a key problem is to develop an imputation method between DTSs with fewer parameters and strong generalization ability through advanced ML/DL methods, such as GAN. However, as mentioned before, it is difficult for the GAN model to capture the strong nonlinear relationship between DTSs because of the mode collapse. Adding gradient penalty or modifying the loss function of GAN may solve the mode collapse problem, and therefore it may be possible to replace LSTM by GAN with fewer parameters and more generalization ability in the future. Secondly, it is also important to verify the effectiveness of the proposed algorithm on other types of datasets. These problems will be further studied in the future.

Fig. 17. Training loss over 10,000 epochs.

Fig. 18. Imputation result for No.102 dynamic displacement sensor using GAN.

Fig. 19. Imputation result for No.102 dynamic displacement sensor using GAN and data augmentation technique.

Fig. 20. Imputation result for No.102 dynamic displacement sensor using LSTM.

5. Conclusion

In this paper, a deep learning and data augmentation based data imputation method for missing data imputation for SHMSs under multi-sensor damaged state is proposed. The effectiveness and efficiency of the proposed method are verified in a practical study based on the measured strain and dynamic displacement sensors. Main conclusions are remarked as follows:

1) GAN-based data imputation method between STSs can achieve an excellent performance when there are many reference sensors. However, for multi-sensor damaged SHMS, only a few sensors are still working normally and can be used as reference sensors, and the fewer the reference sensors are, the worse the imputation results will be. For the vehicle-induced strain, when there are only three same-type sensors as the reference sensors, the imputation results show a significant peak weakening phenomenon (PWP) at the peak. PWP will cause SHMS to underestimate the vehicle-induced response on the bridge structure, which will affect the evaluation of the state of the bridge structure. PWP also occurs in the data imputation of dynamic displacement sensor when there are only three sensors of the same type as the reference sensor.

2) Data augmentation technique can significantly reduce the PWP in the data imputation of the STSs, because it significantly augments the characteristics of input data and strengthens the training of discriminator. From the verification results on the measured data, the more data augmentation times, the better the effect. In the data imputation of dynamic displacement sensor based on the STSs, with 100 times of data augmentation the MSE is 78.4%, f is 86.0% and J is 97.0% lower than that of GAN alone. In the data imputation of vehicle-induced part of strain sensor based on the STSs, with 100 times of data augmentation the MSE is 82.9%, f is 92.1% and J is 98.6% lower than that of GAN alone.

3) GAN-based missing data imputation method only considers the distribution characteristics of different sensor data, and therefore is no longer suitable for data imputation between DTSs, whose distribution characteristics are quite different. The LSTM-based imputation method, which can capture the time characteristics between DTSs, is proposed in this article and is verified. From the analysis results it can be found that LSTM successfully captures the nonlinear relationship between DTSs with less peak clipping, even with only one day's incomplete data used. The final regularized MSE is only 0.00052. At the same time, the effectiveness of the method to impute other dynamic displacement sensors is also verified.

The imputation method proposed in this paper makes full use of the advantages of the two deep learning networks to deal with different scenarios. GAN and data augmentation technique can quickly impute data between STSs with less parameter adjustment, while LSTM can capture time characteristics of time series data and impute data between DTSs. It is verified that the imputation framework can use only 1–2 days' incomplete dataset to impute missing data in the SHMS in multi-sensor failure state, with good imputation result obtained. The proposed method will have a wide application prospect in long term health monitoring of real structures, especially in multi-sensor damaged state, and this research is expected to be further improved in the future.

CRediT authorship contribution statement

Jiale Hou: Writing – original draft, Investigation, Software. Huachen Jiang: Methodology, Visualization, Investigation. Chunfeng Wan: Supervision, Conceptualization, Writing – review & editing. Letian Yi: Validation. Shuai Gao: Visualization. Youliang Ding: Data curation. Songtao Xue: Funding acquisition, Conceptualization, Review.
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported by the National Key R&D Program of China (2018YFB1600200), Key Program of Intergovernmental International Scientific and Technological Innovation Cooperation (2021YFE0112200), the Japan Society for Promotion of Science (Kakenhi No. 18K04438), the Tohoku Institute of Technology research Grant, the Fund for Distinguished Young Scientists of Jiangsu Province (BK20190013) and Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX21_0053, KYCX21_0113).

References

[1] Y.J. Kim, L.B. Queiroz, Big Data for condition evaluation of constructed bridges, Eng. Struct. 141 (2017) 217–227.
[2] Y.S. Park, S. Kim, N. Kim, J.J. Lee, Evaluation of bridge support condition using bridge responses, Struct. Health Monit.-An Int. J. 18 (3) (2019) 767–777.
[3] K. Feng, A. Gonzalez, M. Casero, A kNN algorithm for locating and quantifying stiffness loss in a bridge from the forced vibration due to a truck crossing at low speed, Mech. Syst. Sig. Process. 154 (2021), 107599.
[4] H.F. Zhou, Y.Q. Ni, J.M. Ko, Structural health monitoring of the Jiangyin Bridge: system upgrade and data analysis, Smart Struct. Syst. 11 (6) (2013) 637–662.
[5] J.M. Ko, Y.Q. Ni, Technology developments in structural health monitoring of large-scale bridges, Eng. Struct. 27 (12) (2005) 1715–1725.
[6] F.N. Catbas, M. Susoy, D.M. Frangopol, Structural health monitoring and reliability estimation: Long span truss bridge application with environmental monitoring data, Eng. Struct. 30 (9) (2008) 2347–2359.
[7] D. Liu, Z. Tang, Y. Bao, H. Li, Machine-learning-based methods for output-only structural modal identification, Struct. Control Health Monit. (2021).
[8] Z.Y. Zhang, Y.Z. Luo, Restoring method for missing data of spatial structural stress monitoring based on correlation, Mech. Syst. Sig. Process. 91 (2017) 266–277.
[9] Z.C. Chen, H. Li, Y.Q. Bao, Analyzing and modeling inter-sensor relationships for strain monitoring data and missing data imputation: a copula and functional data-analytic approach, Struct. Health Monit.-An Int. J. 18 (4) (2019) 1168–1188.
[10] H.P. Wan, Y.Q. Ni, Bayesian multi-task learning methodology for reconstruction of structural health monitoring data, Struct. Health Monit.-An Int. J. 18 (4) (2019) 1282–1309.
[11] Y.Q. Bao, H. Li, X.D. Sun, Y. Yu, J.P. Ou, Compressive sampling-based data loss recovery for wireless sensor networks used in civil structural health monitoring, Struct. Health Monit.-An Int. J. 12 (1) (2013) 78–95.
[12] W. Zhang, L.M. Sun, S.W. Sun, Bridge-Deflection Estimation through Inclinometer Data Considering Structural Damages, J. Bridge Eng. 22 (2) (2017).
[13] R. Tkachenko, I. Izonin, I. Dronyuk, M. Logoyda, P. Tkachenko, Recovery of Missing Sensor Data with GRNN-based Cascade Scheme, Int. J. Sens., Wireless Commun. Control 11 (5) (2021) 531–541.
[14] I. Izonin, R. Tkachenko, V. Verhun, K. Zub, An approach towards missing data management using improved GRNN-SGTM ensemble method, Eng. Sci. Technol., an Int. J. 24 (3) (2021) 749–759.
[15] R. Tkachenko, I. Izonin, N. Kryvinska, I. Dronyuk, K. Zub, An Approach towards Increasing Prediction Accuracy for the Recovery of Missing IoT Data Based on the GRNN-SGTM Ensemble, Sensors (Basel) (2020).
[16] Y.P. Ren, J.S. Huang, Z.Y. Hong, W. Lu, J. Yin, L.J. Zou, X.H. Shen, Image-based concrete crack detection in tunnels using deep fully convolutional networks, Constr. Build. Mater. 234 (2020).
[17] L. Li, H. Zhou, H. Liu, C. Zhang, J. Liu, A hybrid method coupling empirical mode decomposition and a long short-term memory network to predict missing measured signal data of SHM systems, Struct. Health Monit.-An Int. J. 20 (4) (2021) 1778–1793.
[18] G. Fan, J. Li, H. Hao, Lost data recovery for structural health monitoring based on convolutional neural networks, Struct. Control & Health Monit. 26 (10) (2019).
[19] G. Fan, J. Li, H. Hao, Dynamic response reconstruction for structural health monitoring using densely connected convolutional networks, Struct. Health Monit.-An Int. J. 20 (4) (2021) 1373–1391.
[20] X. Lei, L. Sun, Y. Xia, Lost data reconstruction for structural health monitoring using deep convolutional generative adversarial networks, Struct. Health Monit. 20 (4) (2021) 2069–2087.
[21] H.C. Jiang, C.F. Wan, K. Yang, Y.L. Ding, S.T. Xue, Continuous missing data imputation with incomplete dataset by generative adversarial networks-based unsupervised learning for long-term bridge health monitoring, Struct. Health Monit.-An Int. J.
[22] H.W. Zhao, Y.L. Ding, A.Q. Li, W. Sheng, F.F. Geng, Digital modeling on the nonlinear mapping between multi-source monitoring data of in-service bridges, Struct. Control & Health Monit. 27 (11) (2020).
[23] G. Fan, J. Li, H. Hao, Y. Xin, Data driven structural dynamic response reconstruction using segment based generative adversarial networks, Eng. Struct. 234 (2021), 111970.
[24] L. Wan, Y. Li, K. Chen, K. Gong, C. Li, A novel deep convolution multi-adversarial domain adaptation model for rolling bearing fault diagnosis, Measurement 191 (2022), 110752.
[25] Y.-Z. Lin, Z.-H. Nie, H.-W. Ma, Dynamics-based cross-domain structural damage detection through deep transfer learning, Comput.-Aided Civ. Infrastruct. Eng. 37 (1) (2022) 24–54.
[26] B. Li, Y. Hou, W. Che, Data Augmentation Approaches in Natural Language Processing: A Survey, 2021.
[27] K.M. Rashid, J. Louis, Times-series data augmentation and deep learning for construction equipment activity recognition, Adv. Eng. Inf. 42 (2019).
[28] K. Kamycki, T. Kapuscinski, M. Oszust, Data Augmentation with Suboptimal Warping for Time-Series Classification, Sensors 20 (1) (2020).
[29] Y. Long, W. Zhou, Y. Luo, A fault diagnosis method based on one-dimensional data enhancement and convolutional neural network, Measurement 180 (2021), 109532.
[30] T. Hu, T. Tang, R. Lin, M. Chen, S. Han, J. Wu, A simple data augmentation algorithm and a self-adaptive convolutional architecture for few-shot fault diagnosis under different working conditions, Measurement 156 (2020), 107539.
[31] I.J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative Adversarial Nets, 28th Conference on Neural Information Processing Systems (NIPS), Montreal, Canada, 2014, pp. 2672–2680.
[32] W. Liao, X. Lu, Y. Huang, Z. Zheng, Y. Lin, Automated structural design of shear wall residential buildings using generative adversarial networks, Autom. Constr. 132 (2021).
[33] Y. Gao, B. Kong, K.M. Mosalam, Deep leaf-bootstrapping generative adversarial network for structural image data augmentation, Comput.-Aided Civ. Infrastruct. Eng. 34 (9) (2019) 755–773.
[34] H. Maeda, T. Kashiyama, Y. Sekimoto, T. Seto, H. Omata, Generative adversarial network for road damage detection, Comput.-Aided Civ. Infrastruct. Eng. 36 (1) (2021) 47–60.
[35] H. Jiang, C. Wan, K. Yang, Y. Ding, S. Xue, Modeling relationships for field strain data under thermal effects using functional data analysis, Measurement 177 (2021).
[36] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780.
[37] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN, arXiv:1701.07875, 2017.