Journal of Asian Earth Sciences


Journal of Asian Earth Sciences 259 (2024) 105837

Contents lists available at ScienceDirect

Journal of Asian Earth Sciences


journal homepage: www.elsevier.com/locate/jseaes

Detection and characterization of geomagnetic anomaly waveforms


Zongxuan Wu (a, b, c, f), Jiening Xia (a, b, c, f, *), Benyan Tan (a, b, c, f), Bin Wang (d), Qian Zhao (e), Shaopeng He (d)

a: Institute of Seismology, China Earthquake Administration, Wuhan, China
b: Hubei Earthquake Administration, Wuhan, China
c: Hubei Key Laboratory of Earthquake Early Warning, Institute of Seismology, Wuhan, China
d: Hebei Earthquake Administration, Shijiazhuang, China
e: Liaoning Earthquake Administration, Shenyang, China
f: Wuhan Institute of Seismologic Instrument Co., LTD, Wuhan, China

A R T I C L E I N F O

Keywords:
Windowed Weighted Correlation (WWC)
Geomagnetic anomaly
Machine learning
Earthquake prediction
Artificial intelligence

A B S T R A C T

Geomagnetic anomalies are abnormal changes in the geomagnetic field caused by changes in the stress of the underground rock. Research on geomagnetic anomaly signals can help explore earthquake prediction. This study proposes a window-weighted correlation degree (WWC) method to detect geomagnetic vertical component anomaly waveforms, addressing the limitations of traditional methods. The WWC method calculates the similarity between geomagnetic and reference data in daytime and nighttime windows, identifying abnormal signals based on change values. This method achieves a 0.976 accuracy in identifying geomagnetic anomaly waveforms, effectively detecting and recognizing complex waveform features in the geomagnetic field. Further, based on this method, a dataset of geomagnetic anomaly waveforms under various earthquake events was built, identifying seven typical anomalies. Five types of features were selected and constructed to measure the non-random deviation of diurnal geomagnetic variation from the baseline signal, improving the precision and recall of the random forest classification model. The machine learning method can classify and predict geomagnetic anomaly waveforms related to earthquake occurrence with superior precision and recall compared to traditional methods. This approach offers a productive means for identifying geomagnetic anomaly waveforms associated with earthquake occurrence and analyzing the potential correlations between anomalies and seismic events. It also provides a possibility for exploring the physical mechanism involved in geomagnetic anomalies' generation and evolution process. It provides effective data support for further analysis of seismic-magnetic relationships and helps to promote the deep integration of artificial intelligence and earthquake prediction research.

1. Introduction

Earthquakes result from fault instability caused by long-term accumulation and change of underground stress (Chen et al., 2009). This natural occurrence can produce massive disasters, resulting in loss of life, economic losses, and environmental damage. In order to effectively prevent the impact of earthquake disasters, people have taken various measures, one of which is to try to achieve earthquake prediction (Uyeda et al., 2009). Among them, the seismic electromagnetic method, as a critical predictive method, has played an essential role in observing pre-seismic anomaly precursors and is regarded as one of the primary methods that may produce advances in earthquake prediction first (Hayakawa & Molchanov, 2007; Zhao et al., 2022). The process of an electromagnetic method for earthquake prediction is to set up instruments at each station, observe the electromagnetic data of the covered area, and extract anomalies from it. Among them, the geomagnetic field data anomalies have a higher sensitivity in reflecting seismic activity (Ding et al., 2004; Xie et al., 2018). Therefore, the extraction and analysis of geomagnetic anomaly signals are fundamental for studying earthquake early warning and forecasting, as well as other spatial electromagnetic anomaly phenomena (Xu et al., 2018).

However, geomagnetic anomaly signals are often difficult to identify from the background noise because of interference from solar activity, atmospheric disturbance, artificial noise, and other factors. In order to suppress the impact of external field signals and highlight the abnormal disturbances in the seismic source area, scholars have proposed various

* Corresponding author at: Institute of Seismology, China Earthquake Administration, Wuhan, China.
E-mail address: time_xjn@whu.edu.cn (J. Xia).

https://doi.org/10.1016/j.jseaes.2023.105837
Received 29 June 2023; Received in revised form 16 August 2023; Accepted 26 August 2023
Available online 28 August 2023
1367-9120/© 2023 Elsevier Ltd. All rights reserved.

Fig. 1. (a) The preprocessed minute-level data of the vertical component of the geomagnetic field from the magnetic flux gate magnetometer at a geomagnetic
station in Sichuan Province in 2021 is shown in blue, along with its trend component in orange. (b) The waveform of the preprocessed minute-level data after
removing the trend component. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. The reference signals for some geomagnetic stations.

methods for extracting and analyzing geomagnetic anomaly information. These include the geomagnetic vertical intensity polarization method (Hayakawa et al., 1996; Hayakawa et al., 2000; Molchanov et al., 2003; Li et al., 2015; He & Feng, 2017; Liu et al., 2017; He, 2018; Li et al., 2019; Liao et al., 2019; Liao et al., 2021; Feng et al., 2021; Fan et al., 2022), the principal component analysis method (Gotoh et al., 2002; Uyeda et al., 2002; Serita et al., 2005; Hattori et al., 2006; Mursula & Holappa, 2017; Zhu et al., 2019), and the short-term average/long-term average signal processing method (STA/LTA) (Kappler et al., 2019).

In earthquake prediction, these methods reflect the changes in the geomagnetic field to a certain extent before the earthquake. However, according to Li et al. (2015), the polarization method has drawbacks for detecting pre-seismic anomalies, most notably in two areas. First, it is affected by the quality of observation data and the deployment spacing of stations, and the anomaly signal typically manifests only on a single station close to the epicenter. Second, it is challenging to exclude data interference factors, and the reliability of abnormal signals is difficult to judge. In Kappler's study, the STA/LTA method is sensitive to the selection of window length, trigger threshold, and feature function, and these parameters may need to be adjusted according to different signal and noise conditions (Liu et al., 2017). Furthermore, all of the above methods require complex signal processing techniques for the observed data (Smirnova & Hayakawa, 2007; He & Liao, 2019; Feng et al., 2021), which could lead to the loss of some seismic information features in the raw data. In many cases, it is challenging to establish a valid link between geomagnetic signal anomalies and earthquake occurrence. As a result, there is still a great deal of uncertainty when utilizing these techniques to anticipate earthquakes, which limits our ability to investigate and analyze geomagnetic data one by one.

In view of the problems and limitations of traditional methods in extracting and analyzing geomagnetic anomaly information, this paper proposes a method for detecting geomagnetic vertical component anomaly waveforms based on window-weighted correlation. The method uses window-weighted correlation as a judgment criterion, segments the observed data by sliding windows, and calculates the correlation between the data within each window and the reference data. When the correlation is lower than a certain threshold, it is considered that there is an abnormal waveform in the window. This method effectively identifies anomalous waveforms in the geomagnetic field and is not affected by the choice of station location or azimuth. The geomagnetic anomaly dataset was then constructed based on this method, and seven types of typical anomalies were identified. Anomaly signals were classified and predicted using machine learning techniques. Based on the time-domain and frequency-domain features constructed with reference to previous studies, this paper mines the seismic waveform records in the dataset. Then, we constructed five types of features that can accurately characterize the details of typical anomalies. This provides the possibility to realize earthquake prediction through geomagnetic anomalies and explore the physical mechanisms involved in generating and evolving electromagnetic anomalies.

The first section of this paper introduces the research background, significance, problems, and challenges of traditional geomagnetic anomaly signal extraction methods. The second section introduces the principles of data pre-processing and geomagnetic anomaly waveform detection methods and describes how to construct a geomagnetic anomaly waveform dataset based on these methods. In the third section, through the in-depth analysis of the anomalous dataset, we classify and summarize the characteristics of geomagnetic anomaly waveforms, combine feature engineering, and propose five types of features that can


Fig. 3. The comparison of the identification performance of geomagnetic anomalies using correlation coefficients and WWC is shown in the figure. The blue curve
represents the reference signals for each station, while the orange curve represents the geomagnetic anomaly waveforms deviating from the reference signals. Pcorr
denotes the waveform similarity calculated based on correlation coefficients, while Pw represents the similarity calculated based on WWC. (a) Pcorr = 0.87, Pw =
0.57;(b) Pcorr = 0.71, Pw = 0.39;(c) Pcorr = 0.72, Pw = 0.45;(d) Pcorr = 0.64, Pw = 0.34. (For interpretation of the references to colour in this figure legend, the
reader is referred to the web version of this article.)

Table 1
Precision and recall of different methods for detecting geomagnetic anomalies. (Precision is the proportion of correctly classified positive signals among all predicted positive signals, and recall is the proportion of positive samples that are correctly predicted as positive among all actual positive samples (Zhou, 2016). KNN indicates K-Nearest Neighbors; LOF refers to Local Outlier Factor; COF stands for Connectivity-based Outlier Factor; DBSCAN indicates Density-Based Spatial Clustering of Applications with Noise.)

Method                     Precision   Recall
3-sigma                    0.095       0.056
KNN                        0.238       0.270
LOF                        0.048       0.333
COF                        0.214       0.243
DBSCAN                     0.333       0.169
Isolation Forest           0.310       0.302
Euclidean Distance         0.595       0.455
Correlation Coefficient    0.690       0.763
WWC                        0.976       0.911

characterize the details of anomaly waveforms. In the fourth section, we apply machine learning techniques to classify and predict the dataset, compare the classification effects under different datasets, try to improve the quality of the dataset, and improve the classification accuracy and recall rate by combining the five types of features proposed in the third part. The fifth section deliberates upon the strengths of this research and acknowledges potential shortcomings. Ultimately, the sixth section comprehensively summarizes the undertaken work and resultant achievements.

2. Methodology

2.1. Data pre-processing method

The geomagnetic field is the superposition of various magnetic field components generated by the magnetic rocks inside the earth and the current systems distributed inside and outside the earth (Zhang et al., 2021). It reflects the changes in physical processes inside and outside the earth. Although the geomagnetic field's distribution and variations show considerable regularity (Xiao et al., 2006), they are frequently influenced by various random factors, which causes noise and outliers in the data. In order to extract the abnormal signals reflecting pre-seismic precursor information from geomagnetic data, remove the linear trend of geomagnetic signal, and highlight its periodic variation, it is necessary to preprocess the original data reasonably and effectively.

Singular spectrum analysis (SSA) is a method for processing


Fig. 4. Trend anomalies.

nonlinear time series data (Vautard et al., 1992; Golyandina and Zhigljavsky, 2013). SSA is not constrained by the sine wave assumption, does not require prior information, and has advantages such as stable identification and enhancement of periodic signals (Lu et al., 2015). By decomposing and reconstructing the trajectory matrix of time series, different component series such as long-term trend, seasonal trend, and noise are extracted to analyze or denoise time series and use them for other tasks.

This paper removed the linear trend component from the preprocessed minute-level data of the vertical component of the geomagnetic field from each station using the singular spectrum analysis (SSA) method. The resulting residual sequence contains the primary modes and periodic components of the geomagnetic field variations and any possible abnormal fluctuations or mutation phenomena in the geomagnetic field variations. This part can show the local anomalous fluctuations of the geomagnetic data, which paves the way for later abnormal data identification and station reference signal generation. The linear trend was extracted and eliminated using SSA, as shown in Fig. 1, using the preprocessed minute-level data of the vertical component of the geomagnetic field from the magnetic flux gate magnetometer at a specific geomagnetic station in Sichuan Province in 2021 as an example.

The figure shows that, after using the SSA method to remove the linear trend from the geomagnetic data, the periodic fluctuations of the geomagnetic signal and the abnormal local fluctuations manifest. This opens up the possibility of later determining how similar the daily geomagnetic data are to the reference signal using WWC.

2.2. A geomagnetic anomaly waveform detection method based on WWC

The geomagnetic diurnal variation anomaly is defined as the non-random deviation from the baseline signal, and the pre-seismic vertical component diurnal variation anomaly mainly manifests in amplitude and phase changes. In order to extract the possible geomagnetic diurnal variation anomaly data related to earthquakes and pay attention to the changes of waveform phase and amplitude, from the perspective of similarity, we constructed the baseline signal of geomagnetic data. Then we extracted the possible geomagnetic anomaly data. Based on the SSA method, we removed the linear trend of each station vertical component preprocessed minute data. The average value of each half-hour data was taken as the baseline signal of that day. Subsequently, the mean of the daily baseline signals was calculated to suppress noise interference, enhance the signal-to-noise ratio and smoothness, and establish the reference signals for all stations. The reference signals for some stations are shown in Fig. 2.

During the anomaly analysis process, it is found that the method based on similarity, when using correlation coefficient to measure anomaly, only pays attention to the change of geomagnetic data waveform trend and ignores the absolute amplitude of peak and valley. As a result, anomalous waveforms with significant amplitude fluctuations that differed from the reference signals were often overlooked. Therefore, the WWC is defined.
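The reference-signal construction described in Section 2.2 (half-hour averages per day, then a mean over days) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the 1440-samples-per-day layout, and the assumption that the input has already been detrended are hypothetical choices, not the authors' implementation.

```python
from statistics import mean

MINUTES_PER_DAY = 1440
BIN_MINUTES = 30  # half-hour averaging, as described in the text


def daily_baseline(day_values, bin_minutes=BIN_MINUTES):
    """Average one day of detrended minute data into half-hour bins."""
    if len(day_values) != MINUTES_PER_DAY:
        raise ValueError("expected 1440 minute samples for one day")
    return [
        mean(day_values[start:start + bin_minutes])
        for start in range(0, MINUTES_PER_DAY, bin_minutes)
    ]


def station_reference(days):
    """Average the daily baselines bin by bin into one reference curve."""
    baselines = [daily_baseline(day) for day in days]
    return [mean(bin_values) for bin_values in zip(*baselines)]
```

With this layout each station reference has 48 points, one per half hour. Days with missing samples would need to be filled or skipped before calling `daily_baseline`, which the sketch does not attempt.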


Fig. 5. Low point deviation (The blue curve represents the reference signal for the station, while the orange curve represents the geomagnetic anomaly waveform).
(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

$$P_w = \frac{\sum_{i=1}^{K}\sum_{j=1}^{N} \omega_i x_j y_j}{\sqrt{\sum_{j=1}^{N} \omega_i x_j^2}\,\sqrt{\sum_{j=1}^{N} \omega_i y_j^2}} \tag{1}$$

It has the following properties:

(1) $P_w \le 1$;
(2) $P_w(X, Y) = P_w(Y, X)$;
(3) $P_w(X, X) = 1$

Among them, $\omega_i$ is the weight of the $i$th time window, $K$ is the total number of time windows, $x_j$ is the $j$th data point of the reference signal in the current sliding window, and $y_j$ is the $j$th data point of the geomagnetic data to be recognized in the current sliding window ($j = 1, 2, \ldots, N$). $N$ is the total number of data points in the current window.

The weighting factors $\omega_1 = 0.375$ and $\omega_2 = 0.625$ are chosen by defining data windows for daytime and nighttime and considering that the noise level is roughly 2 nT during the day and around 1.2 nT during the night (Lei & Jia, 2020). The calculated values fall within [−1, 1], and the closer a value is to 1, the less likely the result is anomalous.

This method is sensitive to offset translation, amplitude scaling, compression, and stretching of geomagnetic data and can also reflect the similarity of waveforms well. Compared with the correlation coefficient, it can identify anomalies with similar trends to the reference signal but significantly weaker changes than the reference signal. In contrast to the correlation coefficient, WWC demonstrates a greater detection rate for geomagnetic anomaly waveforms, as illustrated in Fig. 3.

The degree of deviation from the reference signals was determined by manually annotating the preprocessed minute-level data for each day from the magnetic flux gate magnetometer at a geomagnetic station in Sichuan Province in 2016. A total of 42 anomalous waveform signals were identified. Based on this, the identification precision and recall rate of different detection methods for geomagnetic anomalies were compared, as shown in Table 1.

In traditional anomaly detection methods based on probability, proximity, and clustering, anomalies are determined by the distance between data points in space. However, using distance measurement to detect geomagnetic anomalies has the following drawbacks: (1) it cannot distinguish waveform similarity; (2) it cannot reflect the similarity of dynamic trend changes; (3) for geomagnetic data with a fixed order, it cannot take into account the continuity of time dimension. The WWC-based method for geomagnetic anomaly waveform detection effectively addresses these limitations. As shown in Table 1, combined with mean square error to analyze and extract geomagnetic data, the window weighted correlation degree has a precision of 0.976 for identifying geomagnetic anomaly waveforms. The anomalies extracted by this method also include the daily ratio anomalies of geomagnetic diurnal amplitude variation (Rui et al., 2019), making it more effective


Fig. 6. Peak/valley anomalies.

in identifying geomagnetic anomalies and achieving high recognition accuracy for geomagnetic anomaly waveform signals.

2.3. Construction of anomalous datasets

According to the formula for seismic electromagnetic anomaly magnitude (M) and epicentral distance (D) statistics derived by Rikitake (1997):

$$D_{max} = 10^{\frac{M + 0.87}{2.67}} \tag{2}$$

where $D_{max}$ is the maximum detectable distance of the electromagnetic emission precursor and $M$ is the magnitude of the seismic event.

In this paper, we calculate the maximum detectable distance of the geomagnetic signals of earthquakes of magnitude three or higher occurring in China from 2009 to 2021, select all the stations that may observe the geomagnetic signals in this range, construct an earthquake catalog corresponding to the earthquakes and stations, and realize the selection of geomagnetic anomalous signals by comparing with this earthquake catalog.

Various geomagnetic methods indicate that anomalous seismic magnetic disturbances exist several days to three months before an earthquake (Yao & Feng, 2018; Marchetti et al., 2020; Özsöz & Ankaya Pamukçu, 2021). Therefore, it is possible that the geomagnetic anomaly disturbance data that occurred within 90 days before the target earthquake is highly correlated with the occurrence of the earthquake and is a possible seismo-magnetic anomaly, which has more value for further analysis. This type of data is labeled as 1. Otherwise, it is not considered a seismo-magnetic anomaly and is marked as −1.

The recorded data from 45 geomagnetic stations in provinces such as Gansu, Inner Mongolia, Ningxia, Qinghai, Sichuan, Tibet, and Xinjiang were analyzed using singular spectrum analysis, and linear trends were removed. Next, we constructed the reference signal for each station, constructed the allowable upper and lower deviation reference based on the reference signal, calculated the window-weighted correlation degree, and took the average. Then, by monitoring changes in similarity and setting a dynamic threshold based on the 3-sigma rule, the anomalies were detected and extracted, and the dataset GAW was obtained. Each data sample records the preprocessed minute-level data for the geomagnetic Z component on a given day. The GAW dataset contains 9268 data points, including 3573 positive samples with geomagnetic storm data removed and 5695 negative samples (where anomalous data with a daily geomagnetic storm ring current index Dst < −30 are uniformly labeled as −1). In addition, ignoring the limitation of the maximum propagation distance of electromagnetic signals, strong earthquakes in mainland China were selected. The same method was used to construct a strong earthquake dataset for the Sichuan region, denoted as SSW. It contains a total of 1402 data points, including 619 positive samples and 783 negative samples.
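The catalog-matching rule of Section 2.3 can be sketched in a few lines, assuming Eq. (2), the magnitude-three cutoff, the 90-day window, and the Dst < −30 storm rule stated in the text. The function names, the event tuple layout, the distance unit, and the inclusion of same-day events are illustrative assumptions and not the authors' implementation.

```python
from datetime import date

LEAD_DAYS = 90            # pre-earthquake window used in the text
MIN_MAGNITUDE = 3.0       # catalog cutoff used in the text
DST_STORM_LIMIT = -30.0   # days with Dst below this are labeled -1


def max_detectable_distance(magnitude):
    """Eq. (2): D_max = 10 ** ((M + 0.87) / 2.67)."""
    return 10 ** ((magnitude + 0.87) / 2.67)


def label_anomaly(anomaly_day, events, dst=None):
    """Return 1 for a possible seismo-magnetic anomaly, otherwise -1.

    events: iterable of (event_day, magnitude, station_distance) tuples
    (hypothetical layout); distances use the same unit as D_max.
    """
    if dst is not None and dst < DST_STORM_LIMIT:
        return -1  # geomagnetic storm day
    for event_day, magnitude, distance in events:
        if magnitude < MIN_MAGNITUDE:
            continue
        lead = (event_day - anomaly_day).days
        in_window = 0 <= lead <= LEAD_DAYS  # same-day events included (assumption)
        in_range = distance <= max_detectable_distance(magnitude)
        if in_window and in_range:
            return 1
    return -1
```

For example, `label_anomaly(date(2021, 4, 1), [(date(2021, 6, 1), 6.0, 100.0)])` returns 1 for this made-up event, because the anomaly falls 61 days before it and the station lies inside the Eq. (2) radius for M = 6.0.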


Fig. 7. Sudden change anomalies.
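The detection step of Section 2 (a window-weighted similarity against the station reference, followed by a 3-sigma threshold on the similarity series) can be sketched as below. Eq. (1) as printed leaves open how the window index i enters the denominator, so the sketch assumes one reading that satisfies the three stated properties: the window weights are applied inside the numerator and inside both normalizing sums. The function names, the window boundaries, and the use of a single global threshold rather than a rolling one are assumptions as well.

```python
import math
from statistics import mean, pstdev


def wwc(reference, observed, windows, weights):
    """Window-weighted similarity between two equally long series.

    windows: list of (start, stop) index pairs, e.g. day and night spans.
    weights: one weight per window, e.g. (0.375, 0.625) from the text.
    """
    if len(reference) != len(observed):
        raise ValueError("series must have the same length")
    num = sum_x = sum_y = 0.0
    for (start, stop), w in zip(windows, weights):
        for x, y in zip(reference[start:stop], observed[start:stop]):
            num += w * x * y
            sum_x += w * x * x
            sum_y += w * y * y
    if sum_x == 0.0 or sum_y == 0.0:
        raise ValueError("zero-energy signal")
    return num / (math.sqrt(sum_x) * math.sqrt(sum_y))


def flag_low_similarity(similarities, n_sigma=3.0):
    """Flag entries lying more than n_sigma standard deviations below the mean."""
    threshold = mean(similarities) - n_sigma * pstdev(similarities)
    return [s < threshold for s in similarities]
```

Because the sums are not mean-centered, adding a constant offset to one series lowers the value, which the Pearson coefficient would not register. The weights quoted in Section 2.2 are consistent with normalizing the two noise levels: 1.2/(2 + 1.2) = 0.375 and 2/(2 + 1.2) = 0.625.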

3. Characterization of the datasets

3.1. The abnormality in the waveform of positive samples

As the baseline geomagnetic signals of each station in Fig. 2 show, the daily amplitude of geomagnetic signals varies across distinct time intervals. The signal reaches its daily minimum between 3 and 6 AM, recovers to its 0:00 level between 9 and 11 AM, and maintains a nearly constant amplitude until 0:00 the next day. The magnitude of the daily amplitude change is around 15–25 nT. Through the comparative analysis of the positive samples in the dataset and the baseline signals for each station, it was found that the anomalies can be classified as trend anomalies, low point deviation, peak/valley anomalies, sudden change anomalies, amplitude compression, daytime low-value, shape anomalies, as well as some data of poor quality.

Trend anomalies: This type of anomaly is mainly manifested by the fact that, within the low point time range, the geomagnetic signal shows a peak fluctuation instead. The daily variation distortion anomaly and significant amplitude and phase changes are apparent (see Fig. 4).

Low point deviation: The waveform resembles the reference signal, but the low point of the geomagnetic signal deviates significantly from the low point range of the reference signal (see Fig. 5).

Peak/valley anomalies: The waveform is similar to the reference signal, but the maximum amplitude change is 35 nT or higher (see Fig. 6).

Sudden change anomalies: This anomaly is manifested by a sudden and significant drop or rise in the geomagnetic signal. The main reasons for forming this type of anomaly are instrument restart, vehicle stopping, or high-voltage DC transmission interference (Zou et al., 2015; Chen et al., 2020), which have minimal correlation with earthquakes and should be removed from the dataset (see Fig. 7).

Amplitude compression: The maximum daily amplitude variation of the signal is far below 20 nT (see Fig. 8).

Shape anomalies: This kind of signal is entirely dissimilar to the baseline signal and has no regularity. The geomagnetic diurnal variation has significant amplitude and phase changes and noticeable distortion in shape. The low point time of the day does not exist, and the data is relatively scattered (see Fig. 9).

Daytime low-value: The amplitude of this kind of signal remained unrecovered after 11 AM and did not return to the level seen at 0:00 on the same day (see Fig. 10).

Error data: Such data may have poor quality due to environmental interference or instrument malfunctions, where noise can mask changes in the data waveform (see Fig. 11). Therefore, they should be removed from the dataset.

From Figs. 4-10, it is evident that the abnormal daily variation of the vertical component before the earthquake reflects the instability and nonlinearity of the geomagnetic field before seismic events. These anomalies mainly manifest themselves in the non-random deviation of amplitude and phase from the baseline signal. Among them, the trend and shape anomalies manifest as daily distortion and missing low point


Fig. 8. Amplitude compression.

time. The peak/valley anomalies, amplitude compression anomalies, and low point deviation anomalies show waveforms similar to the reference signal. However, the maximum daily amplitude variation of peak/valley anomalies and amplitude compression anomalies differs significantly from the reference signal. The low point deviation anomalies are characterized by the deviation of the occurrence time (low point time) of the minimum value of the daily variation amplitude of the geomagnetic Z component at the station. A sudden change anomaly occurs when the geomagnetic signal abruptly and significantly decreases or increases. A daytime low-value anomaly occurs when the signal does not recover to its 0:00 level of the same day.

After analyzing various anomalies and removing error data and sudden change anomalies, the GAW dataset contains 9206 data points, including 3542 positive and 5664 negative samples. The SSW dataset contains 1398 samples, including 617 positive and 781 negative. The dataset inevitably contains noise signals from the environment, but after removing sudden change anomalies and error data, the electromagnetic disturbances do not reflect the characteristics of noise changes. They are inconsistent with the known ambient noise and human noise characteristics and instead point to other sources, which may be related to the occurrence of earthquakes. Fig. 12a and b show the geomagnetic anomalies before the 6.4-magnitude earthquake in Menyuan, Qinghai. In contrast, Fig. 12c and d show the geomagnetic anomalies before the 6.3-magnitude earthquake in Zaduo, Qinghai. Studies by Xie, Li, and Rui (Xie et al., 2018; Rui et al., 2019; Li et al., 2021) have shown that such anomalous signals may be related to earthquake occurrences.

Although some studies on the anomaly of pre-earthquake geomagnetic fields have already been done by earlier scholars, systematic identification and classification analysis of these anomalies are still uncommon. These anomalies show regularities in spatial distribution and time series and are closely related to seismic activity. Therefore, systematically identifying and categorizing these anomalous types can provide a significant foundation and references for earthquake prediction.

3.2. Feature engineering

With the development of observational technologies and the expansion of data volumes in recent years, some researchers have attempted to extract features from a large amount of observational data, construct mathematical models, and achieve prediction of the probability or scale of earthquake occurrence. In 2007, Panakkat and Adeli (2007; 2009) proposed an effective method for earthquake prediction using seismic activity indices. These indices are calculated from the temporal distribution of earthquakes and represent the potential seismic state of the region. Based on geophysical facts such as Gutenberg-Richter inversion law, earthquake occurrence frequency, foreshock frequency, and earthquake magnitude distribution, Asim et al. (2017) constructed eight features by mathematical modeling in 2017. Asim et al. (2018) collected 60 seismic activity indicators the following year to predict


Fig. 9. Shape anomalies.

earthquakes of magnitude M5.0 and above. However, seismic activity indicators based on conventional seismic catalogs typically only discretely record information about unstable ruptures in fault zones, missing more subtle or unconventional fault zone activities, such as fault tremor or creep. This results in coarser features.

Therefore, to better reflect the subtle changes and complexity of fault zone activity, some researchers have begun exploring new methods to achieve earthquake prediction. In 2019, based on the data provided by the multi-component seismic monitoring system AETA, Lv constructed and selected 25 general statistical features based on the time domain, energy, and frequency characteristics as the prediction basis for the classifier. Lv then predicted destructive earthquakes (magnitude ≥ Ms5.0) occurring within five days and within 200 km of the seismic station. In the retrospective experiment, the pre-earthquake prediction model achieved a precision rate of 0.66 and a recall rate of 0.75. In 2021, Wei (2021) also built time, frequency, and sliding window features using waveform data of the Earth's electric field from 20 days before an earthquake. The NS component's prediction accuracy was 0.692 at the Pingliang Earth electric field monitoring station.

However, the time-domain features cannot reflect the periodic information in the signal and cannot effectively distinguish or identify some signals with periodic or frequency variations. The frequency-domain features cannot reflect the changes in the signal over time, and they cannot effectively capture or describe time-varying or non-smooth signals. The sliding window features are sensitive to the selection of windows. For periodic non-smooth geomagnetic signals, the above features therefore do not reflect the non-random deviation of the daily geomagnetic variation from the reference signal, nor do they measure its amplitude and waveform phase variation.

Based on the time domain and frequency domain features constructed by previous studies, this paper aims at the waveform characteristics of geomagnetic anomaly signals and mines the seismic waveform records in the dataset. On top of this foundation, the study has included wavelet features to assess signal details or mutations, energy features to evaluate signal strength or amplitude, and aggregated features to measure signal similarity or dissimilarity. The study considers six typical anomalous waveforms after removing sudden change anomalies in Section 3.1: trend anomalies, low point offset anomalies, peak/valley anomalies, amplitude compression anomalies, daytime low-value anomalies, and shape anomalies. The variations in amplitude and time for amplitude compression, daytime low-value, and peak/valley anomalies are described using energy and temporal features. For low point offsets, frequency domain and wavelet features describe the variation in frequency and scale. The spatiotemporal distribution of trend and shape anomalies is described using temporal and aggregation features. These five types of features are used to describe typical geomagnetic anomalies. Therefore, an alternative comprehensive feature set (Christ et al., 2018) is constructed, which is expected to reflect the characteristics of the anomaly signal more accurately (see Table 2).
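To make the feature families named above more concrete, the following sketch computes a few temporal, energy, and frequency-domain descriptors for one day of minute data. It is only an illustration under stated assumptions: the function names and the particular descriptors are hypothetical, wavelet and aggregated features are omitted, and this is not the feature set used in the paper.

```python
import cmath
import math
from statistics import mean, pstdev


def time_features(day):
    """Temporal descriptors: level, spread, daily range, and low point time."""
    low_minute = min(range(len(day)), key=lambda i: day[i])
    return {
        "mean": mean(day),
        "std": pstdev(day),
        "daily_range": max(day) - min(day),
        "low_point_minute": low_minute,
    }


def energy(day):
    """Mean squared value of the series, a simple energy descriptor."""
    return sum(v * v for v in day) / len(day)


def harmonic_amplitude(day, k):
    """Amplitude of the k-th harmonic of the day, from a direct DFT sum."""
    n = len(day)
    total = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(day))
    return 2.0 * abs(total) / n


def dominant_harmonic(day, max_harmonic=6):
    """Index of the strongest harmonic among the first max_harmonic."""
    return max(range(1, max_harmonic + 1), key=lambda k: harmonic_amplitude(day, k))
```

In such a sketch, `daily_range` relates to the amplitude thresholds of Section 3.1 (far below 20 nT for amplitude compression, 35 nT or higher for peak/valley anomalies), and `low_point_minute` relates to the low point time used to describe low point deviation.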


Fig. 10. Daytime low-value.

4. Machine learning methods for classification experiments

This section uses machine learning methods to classify and predict the GAW and SSW datasets. Furthermore, based on the features constructed in Section 3.2, the SSW dataset is analyzed in depth to achieve feature selection and model construction. The objective is to promote the deep integration of artificial intelligence and earthquake prediction research, further integrate machine learning techniques with earthquake electromagnetics, and create the possibility for more effective delineation of earthquake-prone areas and realization of earthquake forecasting.

4.1. Classification experiment based on the GAW dataset

The following classifiers are constructed for the GAW dataset, which is composed of geomagnetic anomaly signals from 45 stations (see Table 3).
By applying different processing methods to the GAW dataset, such as data differencing or data augmentation, classifiers A to I were trained, and classifier J was obtained by a simple voting ensemble strategy. Table 4 shows the performance of these classifiers on the GAW dataset.
Classifier J achieved the best precision of 78.0 % and an improved AUC of 0.769, as shown in Table 4 and Fig. 13. However, the increase in classification precision on the GAW dataset is insignificant. This might be because some abnormalities in the GAW dataset are caused by geomagnetic anomalies at a single station. Single-station magnetic anomalies have a complicated daily variation and are significantly influenced by external magnetic fields. Even if there are noticeable changes, they cannot be taken as pre-earthquake anomalies.

4.2. Classification experiment for multi-station synchronous anomalies

Based on the GAW dataset, the data of the Sichuan region from 2016 to 2019 were selected, and a subset of GAW was obtained, denoted as SGAW. By selecting from this subset the anomaly signals that appeared synchronously at multiple stations before an earthquake, the MSW dataset was obtained. Both datasets were classified using Random Forest (n_estimators = 80), and the results were compared (see Table 5).
The confusion matrices for the two datasets are compared in Fig. 14.
After selecting the multi-station synchronous anomalies, the precision and recall improved markedly under the same model (from 0.72 to 0.79 and from 0.72 to 0.78, respectively; Table 5). This indicates that replacing the anomaly data in GAW with multi-station synchronous anomalies is a critical way to improve the quality of the dataset and the classification precision in the next step.


Fig. 11. Error data.

4.3. Feature selection and classification experiment based on the SSW dataset

Some strong earthquake precursors produce an electromagnetic emission source in the epicenter area, from which the electromagnetic disturbance propagates over very long distances through the air (Rikitake, 1997). Therefore, the seismic electromagnetic anomaly signals that appear before a strong earthquake can be detected by multiple observation points at the same time, and the strong earthquake anomaly signal dataset, that is, the SSW dataset, meets the requirements of the multi-station synchronous anomaly to a certain extent.
For the features constructed in Section 3.3, the distribution of these features in the training and test sets of SSW is viewed and compared by plotting the KDE (Kernel Density Estimation) distribution, and feature variables with uneven distribution in the two datasets are found, as shown in Fig. 15.
The distributions in Fig. 15 show that, in contrast to the other features, the distributions of cv, first_location_of_maximum, and div differ noticeably between the training and testing sets. Such differences make the model less generalizable. As a result, these features are removed from the dataset.
Next, following the idea of the embedded feature selection method (Zhou, 2016), the random forest method ranks the remaining features based on their importance (see Fig. 16).
By sorting the features based on their random forest importance and selecting the top 14 features that account for 80 % of the importance, we arrive at the following features: median, cwt_coefficients, abs_energy, mean, fft_coefficient_angle, absolute_sum_of_changes, abs_energy_diff, fft_aggregated_skew, cor120, skew, per1, cor60, fft_coefficient_abs, and f_agg_mean_maxlag_60. These features can be classified into five categories (see Table 6).
Different types of features can better characterize the six typical anomalies. Energy features quantify the signal's power or amplitude and represent the signal's size. Signal periodicity or harmonic components are measured using frequency domain features, representing the signal's frequency distribution. Wavelet features measure the signal's details or mutations by reflecting the local variations of the signal at various scales and positions. Time-domain features quantify the shape or trend of the signal by assessing the regularity of the signal's change over time. The relationships between signals are reflected in aggregation features, which compare and contrast signals. These features represent changes in amplitude and waveform phase and are better at measuring the non-random deviation of daily geomagnetic variations from the reference signal than seismic activity features. They can more accurately find the geomagnetic anomalies that correlate more strongly with the occurrence of earthquakes and thus support further analysis of the potential correlation with earthquakes.
We can improve the model's capacity to detect inherent structure and patterns in the data by choosing the right combination of features and their ordering. By using these five kinds of features, we can lower the dimensionality of the data, eliminate noise and redundancy, and enhance the model's performance and generalizability. The original dataset is converted using these features, and the transformed dataset is used


Fig. 12. (a) The abnormal waveform was observed 17 days before the M6.4 earthquake in Menyuan (amplitude compression). (b) The abnormal waveform was
observed 9 days before the M6.4 earthquake in Menyuan (trend anomalies). (c) The abnormal waveform was observed 40 days before the M6.3 earthquake in Zaduo
(low point deviation). (d) The abnormal waveform was observed 28 days before the M6.3 earthquake in Zaduo (low point deviation).

Table 2
Alternative comprehensive feature set.
Feature name | Feature meaning or calculation formula | Detailed information
abs_energy | $\text{abs\_energy} = \sum_{i=1}^{n} x_i^2$ | The absolute energy describes the squared distance of the time series from the origin, indicating the level of fluctuation (energy) present in the data.
abs_energy_diff | $\text{abs\_energy\_diff} = \sum_{i=1}^{n-60} |x_{i+60} - x_i|^2$ | The absolute energy of the differenced data, computed after differencing the original data (l = 60).
ADF_pvalue | Augmented Dickey-Fuller test | Hypothesis test result (p-value).
ADF_teststat | Augmented Dickey-Fuller test | T-test, hypothesis test value.
ADF_usedlag | Augmented Dickey-Fuller test | The order of lag used.
binned_entropy | $-\sum_{k=0}^{\min(\text{max\_bins},\,\text{len}(x))} p_k \log(p_k) \cdot \mathbf{1}_{(p_k > 0)}$ | Grouping entropy: divide the whole sequence into max_bins bins by value, put each value into the corresponding bin, and then calculate the entropy.
median | $\text{median} = X\left[\frac{N+1}{2}\right]$ if $N$ is odd; $\text{median} = \frac{1}{2}\left(X\left[\frac{N}{2}\right] + X\left[\frac{N}{2}+1\right]\right)$ if $N$ is even | The median of the data, where X is the data sorted from smallest to largest and N is the data length.
mean | $\text{mean} = \frac{X_1 + X_2 + \cdots + X_N}{N}$ | Average of the data.
cwt_coefficients | $r = \frac{2}{\sqrt{3a}\,\pi^{1/4}} \left(1 - \frac{x^2}{a^2}\right) \exp\left(-\frac{x^2}{2a^2}\right)$ | Ricker wavelet analysis (Wang, 2015); a is the width parameter in the wavelet transform function.


Table 2 (continued)
Feature name | Feature meaning or calculation formula | Detailed information
skew | $\text{skew} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{X_i - \mu}{\sigma}\right)^3$ | Skewness measures the asymmetry of the probability distribution of a random variable, where μ is the mean value and σ is the standard deviation.
fft_aggregated_skew | $\text{fft\_skew} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{\mathrm{FFT}(X_i) - \mu_f}{\sigma_f}\right)^3$ | Skewness of the original data after the Fourier transform, where $\mu_f$ and $\sigma_f$ are the mean value and the standard deviation of the Fourier-transformed data.
fft_coefficient_abs, fft_coefficient_angle | $A_k = \sum_{m=0}^{n-1} a_m \exp\left(-2\pi i \frac{mk}{n}\right), \quad k = 0, \cdots, n-1$ | Coefficients of the one-dimensional discrete Fourier transform, computed using the fast Fourier transform algorithm.
absolute_sum_of_changes | $\sum_{i=1}^{n-1} |x_{i+1} - x_i|$ | Absolute sum of the first-order difference, i.e. the sum of the absolute values of the first-order differences of the time series.
f_agg_mean_maxlag_60 | $R(l) = \frac{1}{(n-l)\sigma^2} \sum_{t=1}^{n-l} (X_t - \mu)(X_{t+l} - \mu)$; $f_{agg}(R(1), \cdots, R(m))$, $m = 60$ | The aggregated (mean) statistic of the autocorrelation coefficients up to lag 60. Here, $X_t$ represents the values of the time series, n its length, and l the lag order. The autocorrelation values form a vector, and the mean of this vector is calculated.
cor60, cor120 | $R(l) = \frac{1}{(n-l)\sigma^2} \sum_{t=1}^{n-l} (X_t - \mu)(X_{t+l} - \mu)$ | The autocorrelation coefficients for a lag of 60 and 120 data points.
large_standard_deviation | $\text{std}(data) > r \cdot [\max(data) - \min(data)]$ | Whether the standard deviation exceeds a fraction r of the range (output 0 or 1). According to the rule of thumb, the standard deviation should be one-quarter of the range of values; r is the fraction of the range to be compared.
coeff_3_k_10 | $X_t = \varphi_0 + \sum_{i=1}^{k} \varphi_i X_{t-i} + \varepsilon_t$ | Coefficient of the autoregressive equation (autoregressive order 10, third coefficient), where k is the order of the autoregressive equation.
per1 | $Q = Q_3 - Q_1$ | Quartile difference; $Q_3$ is the upper quartile and $Q_1$ is the lower quartile.
cv | $cv = \frac{\text{std}}{\text{mean}}$ | Coefficient of variation, reflecting the degree of variation in a set of data.
div | $div = \frac{\sum_{i=1}^{m} |x_i|}{\sum_{i=m+1}^{n} |x_i|}$ | n is the length of the time series, m < n.
first_location_of_maximum | $location = \frac{\text{index}[\max(data)]}{\text{len}(data)}$ | The position of the maximum value relative to the length of the sequence.
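A few of the simpler features in Table 2 can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names mirror the table, the population standard deviation is assumed for σ, the autocorrelation is taken at lag l as in the usual definition, and the arguments `lag` and `m` are illustrative parameters.

```python
from statistics import mean, pstdev


def abs_energy(x):
    """Sum of squared values."""
    return sum(v * v for v in x)


def abs_energy_diff(x, lag=60):
    """Energy of the lag-differenced series: sum of (x[i+lag] - x[i])^2."""
    return sum((x[i + lag] - x[i]) ** 2 for i in range(len(x) - lag))


def absolute_sum_of_changes(x):
    """Sum of absolute first-order differences."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def autocorrelation(x, lag):
    """Lag-l autocorrelation R(l); cor60 and cor120 would use lag=60 and lag=120."""
    n, mu = len(x), mean(x)
    var = pstdev(x) ** 2  # population variance (assumed meaning of sigma^2)
    cov = sum((x[t] - mu) * (x[t + lag] - mu) for t in range(n - lag))
    return cov / ((n - lag) * var)


def cv(x):
    """Coefficient of variation: std / mean."""
    return pstdev(x) / mean(x)


def div(x, m):
    """Ratio of summed absolute values of the first m points to the rest."""
    return sum(abs(v) for v in x[:m]) / sum(abs(v) for v in x[m:])


def first_location_of_maximum(x):
    """Index of the first maximum, relative to the series length."""
    return x.index(max(x)) / len(x)


series = [1.0, 2.0, 4.0, 3.0]  # toy data, not geomagnetic observations
print(abs_energy(series), absolute_sum_of_changes(series))
```

The functions expect a plain Python list; a real daily Z-component record sampled once per minute would have 1440 points, which is why lags such as 60 and 120 are meaningful there.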

Table 3
Classifiers.
Classifier | Methods of data set processing | Classification model selection
A | Undersampling | SVM
B | – | BalancedRandomForestClassifier (Chen, 2004) (n_estimators = 150)
C | Reverse Differential (l = -50) | BalancedRandomForestClassifier (n_estimators = 200)
D | Forward Differential (l = 50) | BalancedRandomForestClassifier (n_estimators = 200)
E | Forward Differential (l = 100) | BalancedRandomForestClassifier (n_estimators = 180)
F | Random Oversampling | SVM
G | SMOTE | SVM
H | ADASYN | SVM
I | SMOTEENN | SVM
J | Integrating the above weak classifiers
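Classifier J is described only as a simple voting ensemble of the weak classifiers in Table 3. One common reading is hard majority voting over the predicted labels, sketched below. The function name `majority_vote` and the toy predictions are hypothetical, and the paper does not state whether votes were weighted or based on probabilities.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions by hard majority voting.

    predictions: list with one entry per classifier, each a list of
    predicted labels for the same samples (all of equal length).
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        # most_common(1) returns the label with the highest vote count
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy example with three classifiers and three samples (1 = anomaly, 0 = normal)
preds = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
print(majority_vote(preds))
```

With nine base classifiers (A to I) and two classes, a tie cannot occur, so no tie-breaking rule is needed in this setting.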

to retrain a random forest model. The results of the random forest method on both the SSW dataset and the feature-engineered SSW dataset are shown in Table 7, Fig. 17, and Fig. 18.
The AUC value in SGAW is 0.705, MSW is 0.786, and SSW is 0.762.

Fig. 13. The ROC curves and the corresponding AUC for each classifier.

Table 4
The performance of each classifier on the GAW dataset. (The definitions of Precision and Recall can be found in Table 1. F1-Score is the harmonic mean of Precision and
Recall. Specificity, also known as the true negative rate, represents the proportion of true negative samples in the actual negative samples. AUC stands for Area Under
Curve, which is defined as the area enclosed by the ROC curve and the coordinate axes. The Receiver Operating Characteristic curve is referred to as ROC.).
Classifier | Precision (−) | Precision (+) | Precision (Avg/Total) | Recall (−) | Recall (+) | Recall (Avg/Total) | F1 score (−) | F1 score (+) | F1 score (Avg/Total) | Specificity (−) | Specificity (+) | Specificity (Avg/Total) | AUC
A | 0.83 | 0.63 | 0.76 | 0.73 | 0.76 | 0.74 | 0.78 | 0.69 | 0.75 | 0.76 | 0.73 | 0.75 | 0.744
B | 0.84 | 0.62 | 0.76 | 0.72 | 0.77 | 0.74 | 0.78 | 0.69 | 0.74 | 0.77 | 0.72 | 0.75 | 0.745
C | 0.72 | 0.50 | 0.64 | 0.66 | 0.56 | 0.62 | 0.69 | 0.53 | 0.63 | 0.56 | 0.66 | 0.60 | 0.612
D | 0.72 | 0.49 | 0.63 | 0.66 | 0.56 | 0.62 | 0.68 | 0.53 | 0.63 | 0.56 | 0.66 | 0.60 | 0.609
E | 0.73 | 0.52 | 0.65 | 0.67 | 0.58 | 0.64 | 0.70 | 0.55 | 0.64 | 0.58 | 0.67 | 0.62 | 0.629
F | 0.81 | 0.68 | 0.76 | 0.81 | 0.69 | 0.76 | 0.81 | 0.68 | 0.76 | 0.69 | 0.81 | 0.73 | 0.747
G | 0.81 | 0.68 | 0.76 | 0.81 | 0.68 | 0.76 | 0.81 | 0.68 | 0.76 | 0.68 | 0.81 | 0.73 | 0.747
H | 0.81 | 0.67 | 0.76 | 0.81 | 0.67 | 0.76 | 0.81 | 0.67 | 0.75 | 0.67 | 0.81 | 0.72 | 0.739
I | 0.82 | 0.61 | 0.74 | 0.72 | 0.73 | 0.72 | 0.77 | 0.66 | 0.73 | 0.73 | 0.72 | 0.73 | 0.725
J | 0.83 | 0.69 | 0.78 | 0.81 | 0.73 | 0.78 | 0.82 | 0.71 | 0.78 | 0.73 | 0.81 | 0.76 | 0.769
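The metric definitions in the caption of Table 4 can be made concrete with a short sketch that derives them from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are illustrative assumptions, not values taken from the paper's experiments.

```python
def binary_metrics(tp, fp, tn, fn):
    """Precision, recall, F1 score and specificity for the positive class.

    tp, fp, tn, fn: counts of true positives, false positives,
    true negatives and false negatives.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    # Specificity is the true negative rate
    specificity = tn / (tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts for illustration only
print(binary_metrics(tp=8, fp=2, tn=6, fn=4))
```

Swapping the roles of the two classes gives the values for the negative class, which is why the specificity of one class equals the recall of the other.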


Table 5
Different performances of Random Forest on the SGAW and MSW datasets.
Dataset | Precision | Recall | F1 score | Specificity | data
SGAW | 0.72 | 0.72 | 0.72 | 0.69 | 1039
MSW | 0.79 | 0.78 | 0.78 | 0.79 | 843

The classifier's AUC after feature engineering and feature selection is 0.827.
Exploring the waveform recordings in SSW and performing feature engineering enhanced the performance of the same model on the strong earthquake geomagnetic anomaly dataset when compared to using raw time-series data directly as input for machine learning. Therefore, the subsequent work must eliminate variations in the normal background field. Starting from GAW, the data are compared with those of geomagnetic stations at the same latitude and longitude and with all geomagnetic stations in China, changes that are not local or regional are removed, and feature engineering is performed on this basis. By combining machine learning methods, possible geomagnetic anomaly waveforms related to earthquake occurrences can be identified, and the correlation with earthquakes can be analyzed to achieve a one-to-one correspondence between electromagnetic anomalies and earthquake occurrences. This can lead to earthquake prediction through geomagnetic anomalies, and further research can be conducted to investigate the causes of electromagnetic anomalies and provide more possibilities for summarizing the characteristics of electromagnetic anomalies caused by earthquakes.

5. Discussion

In the field of earthquake prediction, the traditional methods for detecting geomagnetic anomalies include the polarization method for vertical geomagnetic intensity, the daily variation amplitude and its day-to-day ratio for the vertical component of geomagnetism, and the amplitude-phase method. Zhang et al. (2007) used 25 years of hourly average data for the vertical component of geomagnetism at the Urumqi station, from 1980 to 2004, to determine the amplitude of daily variation and its day-to-day ratio. They compared the earthquake reflection ability of the daily variation amplitude and the day-to-day ratio method and found that the Z component daily amplitude anomaly had a precision of 0.692 and a recall of 0.5. In contrast, the Z component daily amplitude ratio anomaly had a precision of 0.687 and a recall of 0.733. In 2016, Li et al. (2016) used the daily variation amplitude and load-unload response ratio approach for the Z component to analyze some geomagnetic observation data collected in Qinghai and neighboring regions (Gansu, Sichuan) between 2007 and 2014, and obtained an earthquake reflection rate of 0.643 and a recall of 0.450. The following year, He and Feng (2017) used the second-sampled observation data produced by the geomagnetic combination observation system at the Chengdu station in Sichuan Province as the research object. When the polarization method was applied to extract ultra-low frequency magnetic anomaly signals before several moderate to strong earthquakes near this station, the precision of anomalies that correspond to earthquakes was 0.625, with a recall rate of 0.714. Huang et al. (2020) calculated the low-point displacement anomalies between 2008 and 2019 using the amplitude-phase method. The earthquake reflection rate was found to be 0.696.
Therefore, this paper's geomagnetic anomaly dataset is constructed based on the window-weighted correlation method. Through feature engineering of the dataset SSW, the final classification precision and recall both reach 0.83 (Table 7), higher than the seismicity mapping rate of the traditional geomagnetic anomaly detection method. This study introduces the machine learning method and provides a new perspective and exploration path for analyzing the potential correlation between geomagnetic anomalies and earthquakes. The in-depth analysis of the anomaly signals provides effective data support for further analysis of the seismo-magnetic relationship. It provides strong support for integrating and developing artificial intelligence and earthquake prediction research. However, the method is sensitive to data quality, reference data selection, and time window size, which need further optimization. The anomaly signal dataset established based on the method covers multiple scales of seismic events, but the number of samples is still limited. In addition, the imbalance of positive and negative samples and the existence of single-station geomagnetic day-variation anomalies have some impact on the training effect of the machine learning model.

6. Conclusion

This paper proposes a method for detecting geomagnetic anomaly waveforms based on window-weighted correlation to solve the problems

Fig. 14. (a) Confusion matrix of the classifier on the SGAW dataset. (b) Confusion matrix of the classifier on the MSW dataset.


Fig. 15. The distribution of each feature in the training and testing sets.

Fig. 16. Ranking of the importance of the feature.


Table 6
Categorization of features.
Types of features | Feature names
Energy feature | abs_energy, abs_energy_diff
Frequency domain feature | fft_coefficient_angle, fft_coefficient_abs, fft_aggregated_skew
Wavelet feature | cwt_coefficients
Time domain feature | cor120, cor60, absolute_sum_of_changes, median, mean, skew, per1
Aggregation feature | f_agg_mean_maxlag_60
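The selection step of Section 4.3 that leads to the features in Table 6 (rank by random forest importance, keep the top features that together account for 80 % of the importance) can be sketched as follows. This is a hedged illustration: the importance scores are assumed to come from an already fitted random forest, and the function name, the `threshold` parameter, and the toy scores are hypothetical.

```python
def select_by_cumulative_importance(importances, threshold=0.80):
    """Keep the top-ranked features until their share of the total
    importance reaches the threshold.

    importances: dict mapping feature name -> importance score.
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold * total:
            break  # the already selected features cover the threshold
        selected.append(name)
        cumulative += score
    return selected


# Toy scores for illustration only (not the values behind Fig. 16)
scores = {"median": 50, "abs_energy": 30, "skew": 15, "per1": 5}
print(select_by_cumulative_importance(scores, threshold=0.9))
```

Cutting on cumulative importance rather than on a fixed feature count lets the number of retained features adapt to how concentrated the importance is; in the paper this cut happens to fall at 14 features.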

and limitations of traditional methods in geomagnetic anomaly information extraction and analysis. Compared with the traditional anomaly detection method, the method can identify geomagnetic anomalies more effectively, with an identification accuracy of 97.6 %, demonstrating an efficient detection capability for weak and complex changing regularity waveform features. By comparing with the reference signals from various magnetic stations, it is found that the daily amplitude of the geomagnetic signals varies in different periods, and the daily amplitude varies in the range of about 15–25 nT. In addition, based on the window-weighted similarity, the geomagnetic anomaly dataset and the strong earthquake dataset of the Sichuan region for earthquakes above magnitude 3 were constructed, and seven typical types of geomagnetic anomaly waveforms were summarized, as well as the regularity of each type of anomaly in spatial distribution and time series. Finally, machine learning methods were employed to extract five types of distinct features and conduct classification and prediction experiments on the dataset. The final model achieved an AUC of 0.827, with better earthquake reflection ability than traditional geomagnetic anomaly detection methods. Identifying possible geomagnetic anomaly waveforms that may be related to earthquake occurrences and analyzing their potential correlation with earthquakes provides the possibility of earthquake prediction through geomagnetic anomalies. Furthermore, it provides effective data support for further analysis of the seismo-magnetic relationship.

Fig. 18. ROC curve and area under the curve (AUC) of the Random Forest method on SGAW, SSW, MSW, and SSW after feature engineering transformation.

Table 7
The performance of Random Forest on SSW and SSW after feature engineering.
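Dataset | Precision (−) | Precision (+) | Precision (Avg/Total) | Recall (−) | Recall (+) | Recall (Avg/Total) | F1 score (−) | F1 score (+) | F1 score (Avg/Total) | Specificity (−) | Specificity (+) | Specificity (Avg/Total) | AUC
SSW | 0.80 | 0.72 | 0.76 | 0.75 | 0.77 | 0.76 | 0.78 | 0.74 | 0.76 | 0.77 | 0.75 | 0.76 | 0.705
After feature engineering | 0.86 | 0.79 | 0.83 | 0.81 | 0.84 | 0.83 | 0.84 | 0.81 | 0.83 | 0.84 | 0.81 | 0.83 | 0.827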

Fig. 17. (a) Confusion matrix of the classifier on SSW. (b) Confusion matrix of the classifier on SSW after feature engineering.


CRediT authorship contribution statement

Zongxuan Wu: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Visualization, Project administration. Jiening Xia: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Supervision, Project administration. Benyan Tan: Validation, Formal analysis, Investigation, Resources, Visualization, Supervision. Bin Wang: Formal analysis, Investigation, Supervision. Qian Zhao: Formal analysis, Investigation, Supervision. Shaopeng He: Formal analysis, Investigation, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data that has been used is confidential.

Acknowledgements

This work was supported by the Double Innovation Fund Research Project of Zhong Zhen (No. ZZSC2022G03), and the 2022 Hubei High-Value IPR Cultivation Project (Patent Category). We thank Geomagnetic Network Center of China, Institute of Geophysics, China Earthquake Administration for providing geomagnetic data.

References

Adeli, H., Panakkat, A., 2009. A probabilistic neural network for earthquake magnitude prediction. Neural Netw. 22 (7), 1018–1024.
Asim, K.M., Martínez-Álvarez, F., Basit, A., Iqbal, T., 2017. Earthquake magnitude prediction in Hindukush region using machine learning techniques. Nat. Hazards 85 (1), 471–486.
Asim, K.M., Idris, A., Iqbal, T., Martínez-Álvarez, F., 2018. Earthquake prediction model using support vector regressor and hybrid neural networks X. PLoS One 13 (7), e0199004.
Chen, X., Huang, E.X., Cheng, W.L., Yu, S.J., 2020. Typical interference identification and data processing of FHD geomagnetic data in Xinyang Platform. Ground Water (in Chinese) 42 (2), 101–103.
Chen, C.H., Liu, J.Y., Yang, W.H., Yen, H.Y., Hattori, K., Lin, C.R., Yeh, Y.H., 2009. SMART analysis of geomagnetic data observed in Taiwan. Phys. Chem. Earth, Parts A/B/C 34 (6–7), 350–359.
Chen, C., 2004. Using Random Forest to Learn Imbalanced Data.
Christ, M., Braun, N., Neuffer, J., Kempa-Liehr, A.W., 2018. Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307, 72–77.
Ding, J.H., Liu, J., Yu, S.R., Xiao, W.J., 2004. Geomagnetic Diurnal-Variation Anomalies and Their Relation to Strong Earthquakes. Acta Seismol. Sin. (in Chinese) S1, 79–87.
FAN Wenjie, FENG Lili, Li Xia, et al. Characteristic of geomagnetic vertical intensity polarization anomalies before the Menyuan, Qinghai Ms6.9 earthquake on January 8, 2022. China Earthq. Eng. J. (in Chinese), 44(3): 744–750.
Feng, L.L., Guan, Y.L., Fan, W.J., He, M.Q., Li, X., et al., 2021. Geomagnetic Vertical Component Polarization Anomaly In October 15TH 2020 Before Maduo M7.4 Earthquake. Plateau Earthquake (in Chinese) 33, 1–6.
Golyandina, N. & Zhigljavsky, A. 2013. Basic SSA. In N. Golyandina & A. Zhigljavsky, eds. Singular Spectrum Analysis for Time Series. SpringerBriefs in Statistics. Berlin, Heidelberg: Springer: 11–70. https://doi.org/10.1007/978-3-642-34913-3_2 5 March 2023.
Gotoh, K., Akinaga, Y., Hayakawa, M., Hattori, K., 2002. Principal component analysis of ULF geomagnetic data for Izu islands earthquakes in July 2000. J. Atmospheric Elect. 22 (1), 1–12.
Hattori, K., Serita, A., Yoshino, C., Hayakawa, M., Isezaki, N., 2006. Singular spectral analysis and principal component analysis for signal discrimination of ULF geomagnetic data associated with 2000 Izu Island Earthquake Swarm. Phys. Chem. the Earth, Parts A/B/C 31 (4–9), 281–291.
Hayakawa, M., Molchanov, O.A. 2007. Seismo-Electromagnetics as a New Field of Radiophysics: Electromagnetic Phenomena Associated with Earthquakes., (320): 10.
Hayakawa, M., Kawate, R., Molchanov, O.A., Yumoto, K., 1996. Results of ultra-low-frequency magnetic field measurements during the Guam earthquake of 8 August 1993. Geophys. Res. Lett. 23 (3), 241–244.
Hayakawa, M., Itoh, T., Hattori, K., Yumoto, K., 2000. ULF electromagnetic precursors for an earthquake at Biak, Indonesia on February 17, 1996. Geophys. Res. Lett. 27 (10), 1531–1534.
He, C., Feng, Z.S., 2017. Application of polarization method to geomagnetic data from the station Chengdu. Acta Seismologica Sinica (in Chinese), 39(4): 558-564+633.
He, C., Liao, X.F., 2019. Study on anomaly characteristics of geomagnetic vertical intensity polarization method. Int. Seismic Dynamics (in Chinese) 8, 130–131.
He, C., 2018. Application of geomagnetic polarization method in Sichuan area. Chinese Geophysical Society, Chinese Seismological Society, Organizing Committee of National Symposium on Petrology and Geodynamics, Tectonic Geology and Geodynamics Committee of Geological Society of China, Regional Geology and Metallogenic Committee of Geological Society of China, and Earth Science Department of National Natural Science Foundation of China (in Chinese): 12. https://kns.cnki.net/KCMS/detail/detail.aspx?dbcode=CPFD&dbname=CPFDLAST2019&filename=ZGDW201810042009&v=.
Huang, S., Yao, L., Jiang, C.F., 2020. Validity Analysis of Geomagnetic Low-point Displacement Based on Amplitude-phase Method. Earthquake (in Chinese) 40 (3), 131–141.
Kappler, K.N., Schneider, D.D., MacLean, L.S., Bleier, T.E., Lemon, J.J., 2019. An algorithmic framework for investigating the temporal relationship of magnetic field pulses and earthquakes applied to California. Comput. Geosci. 133, 104317. https://doi.org/10.1016/j.cageo.2019.104317.
Lei, Q., & Jia, L. NOISE SUPPRESSION OF FGM-01 GEOMAGNETIC OBSERVATION DATA BY HILBERT-HUANG TRANSFORM. Inland Earthquake (in Chinese), 34(4): 363–377.
Li, X., Ma, Y.H., Ma, Z., et al. 2015. APPLICATION OF Z COMPONENT GEOMAGNETIC LOAD-UNLOADING RESPONSE RATIO IN QINGHAI AREA. Plateau Earthquake (in Chinese), 28(1): 14–18.
Li, X., Feng, L.L., Feng, Z.S., Zhao, Y.H., Liu, L., 2019. The characteristics of ULF magnetic field in Qinghai were analyzed by polarization method. Int. Seismic Dynam. (in Chinese) 8, 69–70.
Li, J.H., Jiang, C.F., Feng, L.L., He, K., 2021. Analysis on Geomagnetic Diurnal Variation Anomaly Before the Ms7.4 Maduo Earthquake. Sichuan Earthq. (in Chinese) 4, 7–11.
Li, Q., Yang, X., Cai, S.P., 2015. Case Study of Applying Polarization Method to Geomagnetic Array Data. Earthq. Prevent. Technol. (in Chinese) 10 (2), 412–417.
Liao, X.F., Feng, L.L., Qi, Y.P., Li, X., 2019. Application of Geomagnetic Polarization Method in the Alashan M5.0 Earthquake. Earthquake (in Chinese) 39 (4), 127–135.
Liao, X.F., Fan, W.J., Qiu, G.L., Li, X.H., Yang, P., 2021. Analysis on Short Term Characteristics of Geomagnetic Vertical Intensity Polarization Anomaly Before Jiuzhaigou 7.0 Earthquake on August 8 2017. Earthquake (in Chinese) 41 (4), 68–77.
Liu, S.Z., Qin, Z.M., Zhang, L.E., 2017a. Dynamic Variation Characteristics Analysis of Geomagnetic ULF in Taiyuan Station by polarization method. Shanxi Earthquake (in Chinese) 4, 5–7.
Liu, X.M., Zhao, J.J., Wang, Y.M., Peng, P.A., 2017b. Automatic microseismic P-wave pickup based on improved STA/LTA method. J. Northeastern Univ. (Natural Science Edition) (in Chinese) 38, 740–745.
Lu, C.L., Kuang, C.L., Yi, C.H., Zhang, Z.T., 2015. Singular Spectrum Analysis Filter Method for Mitigation of GPS Multipath Error. J. Wuhan University (Information Science Edition) (in Chinese), 40(7): 924–931.
Marchetti, D., De Santis, A., D’Arcangelo, S., Poggio, F., Jin, S., Piscini, A., Campuzano, S.A., 2020. Magnetic Field and Electron Density Anomalies from Swarm Satellites Preceding the Major Earthquakes of the 2016–2017 Amatrice-Norcia (Central Italy) Seismic Sequence. Pure Appl. Geophys. 177 (1), 305–319.
Molchanov, O., Schekotov, A., Fedorov, E., Belyaev, G., Gordeev, E., 2003. Preseismic ULF electromagnetic effect from observation at Kamchatka. Nat. Hazards Earth Syst. Sci. 3 (3/4), 203–209.
Mursula, K., Holappa, L., 2017. Principal Component Analysis of Geomagnetic Activity 13, 197–200.
Özsöz, İ., Ankaya Pamukçu, O., 2021. Detection and interpretation of precursory magnetic signals preceding October 30, 2020 Samos earthquake. Turk. J. Earth Sci. 30 (SI-1), 748–757.
Panakkat, A., Adeli, H., 2007. NEURAL NETWORK MODELS FOR EARTHQUAKE MAGNITUDE PREDICTION USING MULTIPLE SEISMICITY INDICATORS. Int. J. Neural Syst. 17 (01), 13–33.
Rikitake, T., 1997. Nature of Electromagnetic Emission Precursory to an Earthquake. J. Geomag. Geoelec. 49 (9), 1153–1163.
Rui, X.L., Liao, X.F., Yang, P., Huang, L.J., 2019. Analysis of Geomagnetic Anomalies of Sichuan Associated with the Jiuzhaigou M7.0 earthquake. Sichuan Earthquake (in Chinese) 3, 28–31.
Serita, A., Hattori, K., Yoshino, C., Hayakawa, M., Isezaki, N., 2005. Principal component analysis and singular spectrum analysis of ULF geomagnetic data associated with earthquakes. Nat. Hazards Earth Syst. Sci. 5 (5), 685–689.
Smirnova, N.A., Hayakawa, M., 2007. Fractal characteristics of the ground-observed ULF emissions in relation to geomagnetic and seismic activities. J. Atmos. Sol. Terr. Phys. 69 (15), 1833–1841.
Uyeda, S., Hayakawa, M., Nagao, T., Molchanov, O., Hattori, K., Orihara, Y., Gotoh, K., Akinaga, Y., Tanaka, H., 2002. Electric and magnetic phenomena observed before the volcano-seismic activity in 2000 in the Izu Island Region, Japan. Proc. Nat. Acad. Sci. 99 (11), 7352–7355.
Uyeda, S., Nagao, T., Kamogawa, M., 2009. Short-term earthquake prediction: Current status of seismo-electromagnetics. Tectonophysics 470 (3–4), 205–213.
Vautard, R., Yiou, P., Ghil, M., 1992. Singular-spectrum analysis: A toolkit for short, noisy chaotic signals. Physica D 58 (1–4), 95–126.
Wang, Y., 2015. Frequencies of the Ricker wavelet. GEOPHYSICS, 80(2): A31–A37.
Wei, 2021. Research on anomaly detection and classification of ground electric field based on Machine learning. Lanzhou Institute of Seismology, China Earthquake Administration (in Chinese). https://kns.cnki.net/KCMS/detail/detail.aspx?dbcode=CMFD&dbname=CMFD202102&filename=1021616293.nh&v=.


Xiao, W.J., Yu, S.R., Ding, J.H., 2006. The anomalous phenomenon of geomagnetism before strong earthquake occurrence. Earthquake (in Chinese) 4, 52–58.
Xie, T., Liu, J., Lu, J., et al., 2018. Retrospective analysis on electromagnetic anomalies observed by ground fixed station before the 2008 Wenchuan Ms8.0 earthquake. Chinese J. Geophys. (in Chinese) 61 (5), 1922–1937.
Xu, P.S., Teng, Y.T., Yu, Z.Y., Wang, X.M., Wu, Q., Hu, X.X., 2018. Electromagnetic anomaly identification algorithm based on signal fingerprinting. Acta Seismol. Sin. (in Chinese) 40 (1), 79–88.
YAO Xiu-yi & FENG Zhi-sheng. 2018. Review on the recent development of analysis methods on magnetic disturbance associated with earthquakes. Progress in Geophysics (in Chinese), 33(2): 511–520.
Zhang Zhihong, Guo Anning, Li Mengying, et al. 2021. Geomagnetic anomalies analysis of 2018 Songyuan Ms5.7 earthquake in Jilin Province. Science Technology and Engineering (in Chinese), 21(29): 12406–12414.
Zhang, Y., Yang, F.X., He, R., 2007. THE OMEN ANALYSIS OF DAY AMPLITUDE AND ITS DAY TO DAY RATIO ABOUT THE GEOMAGNETISM Z COMPONENT IN URUMQI STATION. Inland Earthquake (in Chinese) 1, 78–85.
Zhao, G., Zhang, X., Cai, J., Zhan, Y., Ma, Q., Tang, J., Du, X., Han, B., Wang, L., Chen, X., Xiao, Q., Sun, X., Dong, Z., Wang, J., Zhang, J., Fan, Y., Ye, T., 2022. A review of seismo-electromagnetic research in China. Sci. China Earth Sci. (in Chinese). https://doi.org/10.1007/s11430-021-9930-5.
Zhou, Z.H., 2016. Machine Learning. Tsinghua University Press, 2 March 2023.
Zhu, K., Li, K., Fan, M., Chi, C., Yu, Z., 2019. Precursor Analysis Associated With the Ecuador Earthquake Using Swarm A and C Satellite Magnetic Data Based on PCA. IEEE Access 7, 93927–93936.
Zou, G., Gao, S.Q., Zhao, G., Feng, Y., Sheng, Y., 2015. INTERFERENCE ANALYSIS ON FHD-2B PROTON MAGNETOMETER DATA IN WENQUAN SEISMIC STATION. Inland Earthquake (in Chinese) 29 (4), 371–377.
