Spatio-temporal Graph Learning for Epidemic Prediction

SHUO YU, Dalian University of Technology, China


FENG XIA, RMIT University, Australia
SHIHAO LI and MINGLIANG HOU, Dalian University of Technology, China
QUAN Z. SHENG, Macquarie University, Australia

The COVID-19 pandemic has posed great challenges to public health services, government agencies, and
policymakers, raising huge social conflicts between public health and economic resilience. Policies such as
reopening or closure of business activities are formulated based on scientific projections of infection risks
obtained from infection dynamics models. Though most parameters in epidemic prediction service models
can be set with domain knowledge of COVID-19, a key parameter, namely, human mobility, is often
challenging to estimate due to complex spatio-temporal correlations and social contexts under escalating
COVID-19 facilities. Moreover, how to integrate the various implicit features to accurately predict infectious
cases is still an open issue. To address this challenge, we formulate the problem as a spatio-temporal network
representation problem and propose STEP, a Spatio-Temporal Epidemic Prediction framework, to estimate
pandemic infection risk of a city by integrating various real-world conditions (e.g., City Risk Index, climate,
and medical conditions) into graph-structured data. We also employ a multi-head attention mechanism in
representation learning to extract implicit features for a given city. Extensive experiments have been
conducted upon the real-world dataset for 51 states (50 states and Washington, D.C.) of the USA.
Experimental results show that STEP can yield more accurate pandemic infection risk estimation than
baseline methods. Moreover, STEP outperforms other methods in both short-term and long-term prediction.
CCS Concepts: • Information systems → Spatial-temporal systems; • Applied computing → Life and
medical sciences;
Additional Key Words and Phrases: Epidemic prediction, infection risk evaluation, spatial-temporal network,
graph learning, representation learning
ACM Reference format:
Shuo Yu, Feng Xia, Shihao Li, Mingliang Hou, and Quan Z. Sheng. 2023. Spatio-temporal Graph Learning for
Epidemic Prediction. ACM Trans. Intell. Syst. Technol. 14, 2, Article 36 (February 2023), 25 pages.
https://doi.org/10.1145/3579815

This work is partially supported by National Natural Science Foundation of China under Grant No. 62102060 and the
Fundamental Research Funds for the Central Universities under Grant No. DUT22RC(3)060.
Authors’ addresses: S. Yu, School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024,
China; email: shuo.yu@ieee.org; F. Xia, (corresponding author) School of Computing Technologies, RMIT University,
Melbourne, VIC 3000, Australia; email: f.xia@ieee.org; S. Li and M. Hou, School of Software, Dalian University of
Technology, Dalian, 116620, China; emails: {shihao_leee, teemhold}@outlook.com; Q. Z. Sheng, School of Computing,
Macquarie University, Sydney, NSW 2109, Australia; email: michael.sheng@mq.edu.au.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be
honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
2157-6904/2023/02-ART36 $15.00
https://doi.org/10.1145/3579815

ACM Transactions on Intelligent Systems and Technology, Vol. 14, No. 2, Article 36. Publication date: February 2023.

1 INTRODUCTION
Limited treatment options for the sudden outbreak of COVID-19 have led to enormous loss of life, and
the pandemic has had a strong negative influence on the worldwide economy. Social distancing is an
effective way to control the spread of the epidemic, yet its cost to economic recovery cannot be
avoided [1, 43]. With the gradual rollout of effective vaccines, it is generally believed that COVID-19
will be well controlled in the near future. However, recovering from its serious consequences for human
health and the worldwide economy will still take considerable time [5]. Consequently, to reduce economic
loss and respond quickly to COVID-19-like epidemics, evaluating and predicting infection risk is
important for formulating corresponding policies [2, 60]. Previous epidemic prediction models generally
decompose the task into profiling the epidemic and predicting the spread trend. A variety of studies
have sought to precisely predict the infection rate [19, 44], turning point [10, 11], hazard areas [15],
and the number of infections [6, 35]. Some studies focus on mining the underlying reasons for the
epidemic’s quick transmission with infectious disease transmission models. Social networks are also
employed to analyze close contacts and public opinion. However, few studies have paid attention to
quantifying and predicting the infection risk of specific regions (e.g., cities) to assist local
policymakers in better decision-making.
Cities provide fundamental living environments for human beings and are considered to be the
centers for the creation and dissemination of modern human culture [26, 47, 51]. Cities differ
(e.g., in climate, population, and sanitation), which calls for different policies. To curb the spread
of COVID-19 while recovering the economy, it is very important to measure the infection risks for
specific cities. In this way, proper policies can be made and justified in time. A city’s infection risk
is related to a variety of factors. In our previous work [31], multiple factors are integrated into a
compound metric called CRI (City Risk Index). CRI considers four factors of a city, i.e., economy
(gross domestic product (GDP) and fitness complexity index (FCI)), technology (i.e., education degree
and innovation degree), population, and geographical position.
Though quantitative metrics help with formulating city features in epidemic prediction [30, 62],
precise models are equally important [3, 42, 48, 61]. The rapid development of deep learning makes
it possible to mine the potential value of spatio-temporal data [36, 59, 63, 64]. Spatio-temporal data
mining tasks, especially prediction, are generally handled well by Spatio-Temporal Graph Neural
Networks (STGNNs), which offer two main advantages: automatic feature representation learning and
powerful function approximation ability [20, 28, 54, 57]. Therefore, in this work we build a
spatio-temporal network with cities’ information to provide an accurate city epidemic prediction service.
This work is based on our previous work [31], which proposed an effective metric named CRI.
In the previous work, Pearson’s correlation coefficient is first used to indicate the correlation
between the above-mentioned factors of a city. Then, regression analyses are implemented to show
that the CRI indicator is linearly related to the city risk in infection. Experimental results show that
the proposed CRI is effective in measuring city risk in infection amid COVID-19. In this full version
article, a comprehensive and effective forecasting framework called STEP (Spatio-Temporal
Epidemic Prediction) is presented. To be more specific, a spatio-temporal network is constructed and
then represented to accurately predict the pandemic infection risk of a city. A multi-head attention
mechanism is also employed to enhance representation learning. Within this framework, CRI is employed
only as a critical node attribute to better profile a city’s integrated information. In a nutshell,
the main contributions of this work are summarized as follows:
• Accurate epidemic prediction: We propose an epidemic prediction algorithm called STEP,
which utilizes the characteristics of certain areas. We construct a spatio-temporal network
with cities’ information and use STEP to predict infectious disease cases. By combining
spatio-temporal graph learning with attention mechanisms, STEP can accurately predict the
number of confirmed cases during the epidemic.
• Implicit features and mobility pattern formulation: We formulate implicit features of
certain areas as node attributes and node structural features. Node attributes include the
proposed CRI, climate, and medical conditions, while the structural feature refers to geographical
position. Additionally, GRU is used to explore the valuable temporal features from the
spatio-temporal graph. Since different adjacent positions have different influences on the target
area during the epidemic, a multi-head attention mechanism is employed to better formulate
such implicit features and human mobility patterns in the network representation learning
process. Both node attributes and node structural features are integrated into the
spatio-temporal network representation.
• Real-world data verification: We use data of confirmed cases from 51 states of the U.S. to
investigate the effectiveness of our proposed STEP method. Compared with the baseline methods,
including SIR (Susceptible Infected Recovered model), the Regression model, ARIMA
(Autoregressive Integrated Moving Average model), and LSTM (Long Short-Term
Memory), STEP shows better performance with higher accuracy. Meanwhile, STEP also
outperforms other methods in both short-term and long-term prediction.
• Emergency management policymaking enhancement: The effectiveness of CRI has
been verified in our previous work. Combined with CRI, our proposed prediction framework
can better guide policymakers to implement suitable policies for certain cities, thus
enhancing emergency service management for policymaking.
The rest of this article is organized as follows: Section 2 introduces previous studies about
epidemic prediction and deep graph learning methods in prediction. Section 3 defines the epidemic
prediction problem. Section 4 presents the technical details of our proposed STEP framework.
Section 5 reports the experimental results, and Section 6 concludes the article.

2 RELATED WORK
2.1 Epidemic Prediction
Epidemics have been regarded as one of the most fundamental global health problems [21, 65].
Offering an accurate prediction service for epidemic spread is crucial for effective health intervention.
Traditional epidemic prediction research can be classified into two categories. The first category is
the causal model, including compartmental models and agent-based models. They employ the SIR
model or its variants to simulate the epidemic spread mechanism [18, 37]. Compartmental models
focus on the mathematical modeling of population-level dynamics. Wong et al. [50] applied the
SIR model with a stochastic framework to consider the spreading of epidemics and indicated that
large super-spreading events should be the targets of interventions. Agent-based models simulate
the propagation process at the level of individual and collective entities, with a view to assessing
their effects on the whole epidemic. Shuvo et al. [44] used artificial agent-based simulation modeling to
identify the importance of social distancing and the capacity of hospitals and found that shorter
social isolation activation and greater hospital capacity have a higher impact on the control of
an epidemic. The second category is the statistical model. Autoregressive (AR) models are one
family of statistical models, and they can effectively make predictions from historical time-series
data. Perrotta et al. [40] applied linear autoregressive models to predict the reports from Influweb
and significantly improved the performance. Gaussian Process Regression (GPR) models are
another important family of statistical models, best known for their superior performance in
low-dimensional settings. Lampos et al. [27] applied a nonlinear regression framework, based on a
composite Gaussian Process, to expand their previous model and augment the query-only predictions
with an autoregressive model. The results indicated that the nonlinear modeling approach
could obtain a better performance. Due to their simplicity, the above models require only a few
parameters, which makes them popular in epidemic prediction but also limits their expressive
ability.

2.2 Deep Graph Learning for Prediction


Traditional prediction methods often adopt Bayesian approaches. For example, Xu et al. [55] proposed
an interpretable Bayesian method for traffic flow prediction. Graphs are ubiquitous in the real world,
and their complicated structure allows them to contain rich information [22, 23, 45, 52]. In the past
few years, graph-based deep learning methods have attracted the attention of many researchers. In
the traffic flow prediction task, some researchers formulate geographical locations between cities
as graphs with regular shapes. Yao et al. [56] adopted CNN and LSTM to capture spatial and
temporal correlation. Since a graph is non-Euclidean structured data, to apply convolutional neural
networks to graphs, Kipf and Welling [25] proposed the Graph Convolutional Network (GCN), which
operates directly on graphs and has achieved state-of-the-art results in prediction and
classification tasks [13, 53]. Chiang et al. [7] proposed Cluster-GCN, which exploits the graph
clustering structure to design the batches. At each training step, it chooses the nodes associated
with a dense subgraph identified by a graph clustering algorithm, which avoids the need for a large
space to keep the graph and the embeddings. Luan et al. [32] proposed two GCN architectures that
replace the spectral filter with a block Krylov matrix and a learnable parameter matrix and have
stronger abilities to extract richer representations of graph-structured data.
Graph-based deep learning methods for prediction are used in various applications such as traffic
flow prediction, social event prediction, and air quality prediction. Qiu et al. [41] designed a
framework based on convolutional and attention networks called DeepInf, which learns the user’s
latent feature representation from both network structures and user-specific features in social
networks for predicting social influence. Lv et al. [33] proposed LC-RNN for traffic speed prediction. It
integrates both RNN and CNN models to capture complex traffic patterns to achieve more accurate
speed prediction. Li et al. [29] proposed the Diffusion Convolutional Recurrent Neural Network
(DCRNN) to predict traffic flow. It captures the spatial dependency using bidirectional random
walks and the temporal dependency using an encoder-decoder architecture with scheduled sampling.
Deng et al. [12] developed a graph neural network to capture event context for forecasting events. It
encodes the dynamic graph structure of words from the past to forecast the events in the future.
Yi et al. [58] employed spatial transformation to simulate the pollutant sources and a deep
distribution fusion network to capture the factors affecting air quality.
Additionally, various graph learning methods have been proposed for epidemic prediction.
Kapoor et al. [24] proposed a GNN-based approach for COVID-19 forecasting, where the region-level
human mobility and inter-region connectivity are modelled as nodes and spatial edges in the
spatio-temporal graph, respectively. Based on a similar idea, Guo et al. [17] used characteristics
such as daily deaths to build a multivariate high-resolution spatio-temporal graph and then
utilized STGNNs to predict and verify infections. Davahli et al. [9] developed two types of GNN
models to predict the dynamics of the epidemic in the U.S. based on the effective reproduction
number. Considering that machine learning approaches generally lack intuitive interpretability,
Fritz et al. [14] combined distributional regression with GNN and used rich types of data to
predict COVID-19 cases in Germany. Panagopoulos et al. [39] used meta learning to transfer an
epidemic forecasting model from one country to another with limited data. Tomy et al. [46] focused
on analyzing the capability of GNNs in estimating the epidemic spreading state. The experimental
results of the above-mentioned approaches
demonstrate the usefulness of GNNs in epidemic prediction. However, GNNs alone perform poorly in
capturing temporal correlation. To address this, Mahmud et al. [34] proposed a GCN-based epidemic
prediction approach that leverages the ability of RNNs in modeling temporal dependencies.
However, this approach ignores that different adjacent positions have different influences on the
target area during the epidemic. As a result, multiple human mobility patterns between different
cities are not well explored, which may reduce the expressiveness of the model.
In this article, we apply deep graph learning to spatio-temporal data for epidemic prediction,
which addresses the low-dimensional and few-factor problems. We use a spatio-temporal
network as input, integrating two dimensions (i.e., spatial and temporal) and considering the
impact of different factors on epidemic transmission at the same time. Most previous approaches
focus on leveraging GNNs to model the spatial correlation between different regions, while
valuable interaction features between different timesteps are not considered. Compared to these
approaches, we further use GRU to model the epidemic information in time series. At the same time,
we use the multi-head attention mechanism to explore the complex human mobility patterns
between different cities during the epidemic, which was not considered in most previous approaches.
In addition, we have verified the effectiveness of CRI in our previous work [31]. CRI considers four
factors of a city, including economy, technology, population, and geographical position. In
this work, CRI is employed as a critical node attribute to better profile a city’s integrated
information. Therefore, our work is more capable of modelling the factors associated with the spread
of the epidemic from a holistic perspective.

3 PROBLEM DEFINITION
Epidemic prediction is a classic time series prediction task. The task can be defined as using the
profiles of a region from a past time $h$ up to time $h + t - 1$ to predict the profile at time
$h + t$, as summarized in Equation (1):
\[ P_{h+t} = \mathrm{Predict}(P_h, P_{h+1}, \ldots, P_{h+t-1}), \tag{1} \]
wherein $P$ represents the profile of a city. In this work, we define $P$ as an undirected graph
$G = \langle V, E, X \rangle$, where $V$ is the set of nodes that represents all cities and $E$ is the
set of edges that represents the relationships between cities.
A pair of cities that are geographically adjacent have a higher probability of spreading the
epidemic to each other. Therefore, there is an edge between two nodes in $G$ when the cities they
represent are geographically adjacent. Since people flow between cities in both directions, we use
an undirected graph in this work. $X$ is the set of features in the time series. At each timestamp,
$X = [x_1, x_2]$, wherein $x_1$ is the static feature of the city and $x_2$ is the dynamic feature
of the city. The overall process can be expressed as shown in Equation (2):
\[ N_{t+h} = \mathrm{Predict}(G_t, G_{t+1}, \ldots, G_{t+h-1}), \tag{2} \]
wherein $N_{t+h}$ is the prediction target at time $t + h$ and $G_t$ is the graph constructed at
timestamp $t$. The necessary mathematical notations are summarized in Table 1.
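
To make the input and output of Equation (2) concrete, the following minimal Python sketch pairs each
run of $h$ consecutive graph snapshots with the target that follows it. The function name
`make_windows` and the toy data are hypothetical and serve only as an illustration; the snapshots are
stand-in strings rather than real graphs.

```python
from typing import List, Sequence, Tuple


def make_windows(graphs: Sequence[object], targets: Sequence[float],
                 h: int) -> List[Tuple[list, float]]:
    """Pair snapshots G_t, ..., G_{t+h-1} with the prediction target N_{t+h}."""
    if h <= 0:
        raise ValueError("h must be positive")
    samples = []
    for t in range(len(graphs) - h):
        samples.append((list(graphs[t:t + h]), targets[t + h]))
    return samples


# Toy data (assumption): five daily snapshots and one case count per day.
snapshots = ["G0", "G1", "G2", "G3", "G4"]
cases = [10.0, 12.0, 15.0, 19.0, 24.0]
windows = make_windows(snapshots, cases, h=3)
```

With five snapshots and $h = 3$, only two complete input and target pairs exist, which is why short
histories leave few training samples.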

4 DESIGN OF STEP
In this section, we illustrate the details of our proposed STEP method. The overall framework
of STEP is shown in Figure 1. STEP first uses a graph convolutional neural network to represent
the state at each timestep in the spatial dimension. After the representation of each timestep is
obtained, the graph representations are updated along the temporal dimension. Finally, the
prediction results are output through an MLP layer. Our method contains three procedures: spatial
epidemic propagation, time series embedding, and prediction.

Table 1. Notations

Notation     Description
$G$          City spatio-temporal network
$V$          The node set
$E$          The edge set
$X$          The feature set
$N$          Number of nodes
$A$          Adjacency matrix of $G$
$T$          Number of timestamps
$C$          Number of features
$D$          Degree matrix
$C_t$        Number of infected people at time $t$
$\hat{Z}$    Attention value

Fig. 1. The overall framework of STEP.

4.1 Graph Construction


For graph construction, we determine whether two cities share a geographic border. We
define the adjacency matrix $A$ as follows:

\[ a_{ij} = \begin{cases} 1, & \text{if city } i \text{ and city } j \text{ border}, \\ 0, & \text{otherwise.} \end{cases} \tag{3} \]
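
As a small illustration of Equation (3), the sketch below builds a symmetric 0/1 adjacency matrix
from a list of bordering city pairs. The function name `build_adjacency` and the four-city border
list are hypothetical and are not taken from the dataset used in this article.

```python
def build_adjacency(n, borders):
    """Return the n x n matrix with a_ij = 1 if cities i and j border, else 0."""
    A = [[0] * n for _ in range(n)]
    for i, j in borders:
        if i == j:
            continue            # a city does not border itself in Equation (3)
        A[i][j] = 1
        A[j][i] = 1             # undirected graph: keep the matrix symmetric
    return A


# Toy example (assumption): cities 0-1 and 1-2 share a border, city 3 is isolated.
A = build_adjacency(4, [(0, 1), (1, 2)])
```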

4.2 Transmission Probability Calculation


The transmission probability of the epidemic has an important influence on the formulation of the
epidemic situation. Brockmann et al. [4] illustrated that complex spatio-temporal patterns could
be reduced to surprisingly simple, homogeneous wave propagation patterns. Specifically,
conventional geographic distance is replaced by a probabilistically motivated effective distance. Inspired
by this, we take city-level factors into account when computing the transmission probability. For
example, if a city has sufficient medical resources, then the transmission of the epidemic may be
better controlled, whereas warmer cities may promote the transmission of the epidemic. A city with a
large population and a large number of confirmed cases may have a greater impact on its neighbor
cities and a greater probability of transmission. We use the attention mechanism to calculate the
transmission probability between cities. The attention mechanism learns the hidden embedding of each
node by iteratively using node features and relations between nodes to calculate similarity; that
is, it can estimate a city’s infectious ability. The attention coefficients $e_{ij}$ are defined as
shown in Equation (4):
\[ e_{ij} = a(W_1 h_i, W_2 h_j), \tag{4} \]
wherein $a$ is a single-layer feedforward neural network, and $W_1$ and $W_2$ are learnable matrices.
We use masked attention to apply the graph structure to this mechanism, so that coefficients are
computed only between a node and its neighbor nodes. Node $i$ itself is added to its neighbor node
set $\mathcal{N}_i$. To make the attention coefficients easy to compare across nodes, we normalize
them with softmax:
\[ \alpha_{ij} = \mathrm{softmax}(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{n \in \mathcal{N}_i} \exp(e_{in})}. \tag{5} \]

Combining Equations (4) and (5), the complete attention mechanism is given in Equation (6):
\[ \alpha_{ij} = \frac{\exp(a[W_1 h_i \,\|\, W_2 h_j])}{\sum_{n \in \mathcal{N}_i} \exp(a[W_1 h_i \,\|\, W_2 h_n])}. \tag{6} \]
Through the above operation, the attention coefficients between different nodes are obtained,
and they are used to compute the output feature of each node:
\[ h_i = \sigma\Bigl( \sum_{j \in \mathcal{N}_i} \alpha_{ij} W h_j \Bigr). \tag{7} \]
To get better output results, multi-head attention is used to optimize the learning process of
the attention mechanism. Specifically, multi-head attention is defined in Equation (8):
\[ h_i = \sigma\Bigl( \frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j \Bigr), \tag{8} \]
wherein $K$ is the number of heads in multi-head attention, $\alpha_{ij}^{k}$ is the attention
coefficient computed by head $k$, and $W^{k}$ is the learnable matrix of head $k$. The general
process of transmission probability calculation between cities is presented in Algorithm 1, and the
diagram is shown in Figure 2(a).
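
The attention computation of Equations (4) to (8) can be sketched in plain Python as follows. This is
an illustrative sketch under stated assumptions rather than the STEP implementation: the scoring
network $a$ is modeled as a dot product with a weight vector `a` (no nonlinearity), $\sigma$ is taken
to be the logistic sigmoid, and all function names and toy values are hypothetical.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def attention_coefficients(h, neighbors, W1, W2, a):
    """Masked attention of Eqs. (4)-(6): alpha[i][j] for j in N_i, with i added to N_i."""
    alpha = {}
    for i, nbrs in neighbors.items():
        nodes = sorted(set(nbrs) | {i})            # node i joins its own neighbor set
        left = matvec(W1, h[i])
        scores = []
        for j in nodes:
            concat = left + matvec(W2, h[j])       # [W1 h_i || W2 h_j]
            scores.append(sum(w * c for w, c in zip(a, concat)))
        top = max(scores)                          # shift scores for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(nodes, exps)}
    return alpha


def multi_head_update(h, alphas, Ws):
    """Eq. (8): average the K heads, then apply sigma (assumed: logistic sigmoid)."""
    K = len(Ws)
    out = []
    for i in range(len(h)):
        acc = [0.0] * len(Ws[0])
        for alpha, W in zip(alphas, Ws):
            for j, a_ij in alpha[i].items():
                acc = [s + a_ij * x for s, x in zip(acc, matvec(W, h[j]))]
        out.append([1.0 / (1.0 + math.exp(-s / K)) for s in acc])
    return out


# Toy example (assumption): three cities on a path 0 - 1 - 2 with 2-dimensional features.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
I2 = [[1.0, 0.0], [0.0, 1.0]]
alpha = attention_coefficients(h, neighbors, I2, I2, a=[0.5, -0.5, 0.25, 0.25])
new_h = multi_head_update(h, [alpha, alpha], [I2, I2])   # K = 2 identical heads
```

Because of the softmax in Equation (5), the coefficients of every city sum to one over its masked
neighborhood, so each output feature is a weighted average of transformed neighbor features.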

4.3 Graph Convolutional Layer Propagation


One of the key problems in epidemic prediction is capturing complex spatial dependence. By
simulating the propagation between neighboring cities, more accurate predictions can be made
for the target city. The Convolutional Neural Network (CNN) can obtain local spatial features,
but it can only be used in Euclidean space. The city network, however, has a complex topology, so
traditional CNN models cannot be used to capture spatial dependence. Therefore, in our work,
we use the Graph Convolutional Network (GCN) to obtain the topological relationship between
a city and its neighbors. The GCN model is an effective model for dealing with graph-related
problems and has been applied in many applications. The GCN model constructs a filter in
the Fourier domain, and the filter acts on each node of the graph and its first-order neighborhood to
capture spatial features between the nodes. The main idea of GCN is to aggregate the information
of neighbor nodes in the process of node representation. With every additional iteration,
the information of higher-order neighbor nodes is aggregated. As the number of GCN layers
increases, the aggregation radius becomes larger, thus enabling the information from multi-hop
neighbors to be aggregated. The effectiveness of GCN in modeling complex spatial dependence is
directly linked to the number of GCN layers. With too few layers, the target node aggregates
information only from a limited set of lower-order neighbors and may fail to learn valuable structural
information. With too many layers, the target node may learn useless information from
high-order neighbors, which may instead reduce the performance of GCN.

Fig. 2. Epidemic spread formulation in spatio transmission.

ALGORITHM 1: Transmission probability calculation
Input: City spatio-temporal network.
Output: Transmission risk index.
for $k = 1$; $k \le K$; $k$++ do
    for each node $i$ in $G$ do
        for each node $j$ in $\mathcal{N}_i$ do
            $e_{ij} = a(W_1 h_i, W_2 h_j)$;  $\alpha_{ij} = \mathrm{softmax}(e_{ij}) = \exp(e_{ij}) / \sum_{n \in \mathcal{N}_i} \exp(e_{in})$
        $h_i = \sigma(\sum_{j \in \mathcal{N}_i} \alpha_{ij} W h_j)$
$h_i = \sigma(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j)$

Figure 2(b) shows a diagram of GCN simulating the spread of a city epidemic. Assuming that $C_1$ is
the target city, the GCN model can obtain the topological relationship between the target city and
its neighboring cities. The core of GCN is the graph convolution operation based on spectral graph
theory, which is expressed as follows:
\[ g_\theta \times_G x = \theta \bigl( I_n + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \bigr) x. \tag{9} \]
Herein, $x$ represents the graph feature vector, $\theta$ represents the weight parameter vector, and
$\times_G$ represents the graph convolution operation. To improve the stability of the model, the
following change is made to GCN [49]:
\[ I_n + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \rightarrow \hat{A} = \bar{D}^{-\frac{1}{2}} \bar{A} \bar{D}^{-\frac{1}{2}}, \tag{10} \]
wherein $\bar{A} = I_n + A$ is the adjacency matrix with self-loops and $\bar{D}$ is its degree matrix,
$\bar{D}_{ii} = \sum_j \bar{A}_{ij}$. The overall propagation can be expressed as follows:
\[ H^{(l+1)} = \sigma\bigl( \bar{D}^{-\frac{1}{2}} \bar{A} \bar{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \bigr), \tag{11} \]
wherein $H^{(l)}$ is the feature matrix in layer $l$, and $H^{(l)} = [h^{l}_{1} \ldots h^{l}_{i}]$.
Through this method, we can formulate the propagation process.
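
The propagation rule of Equation (11) can be illustrated with the following minimal pure-Python
sketch of a single layer. It assumes ReLU as the activation $\sigma$, which the text does not
specify, and the names `matmul` and `gcn_layer` as well as the two-city example are hypothetical.

```python
import math


def matmul(A, B):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def gcn_layer(A, H, W):
    """One propagation step: sigma(D^-1/2 (A + I) D^-1/2 H W), with sigma = ReLU (assumed)."""
    n = len(A)
    # Add self-loops so every city also keeps its own features.
    A_bar = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_bar]              # degrees of A_bar, always >= 1
    # Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j).
    A_hat = [[A_bar[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]


# Toy example (assumption): two bordering cities with one feature each.
A = [[0, 1], [1, 0]]
H0 = [[1.0], [3.0]]
W0 = [[1.0]]
H1 = gcn_layer(A, H0, W0)
```

In this example every entry of the normalized matrix equals 0.5, so each city ends up with the
average of its own feature and its neighbor's feature.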

4.4 Time Series Embedding


The obtained node embedding contains the spatial features extracted from the graph. Our proposed
STEP method uses the Gated Recurrent Unit (GRU) to model the epidemic information in time series
by retaining the key information and selectively forgetting unimportant information. Like LSTM,
GRU alleviates the vanishing gradient problem of plain recurrent networks, while using fewer gates
and parameters. We adopt maximum pooling to integrate the embeddings on the nodes and reduce the
dimension. GRU has two gates, namely, a reset gate and an update gate. At timestep $t$, we first
calculate the update gate $z_t$ using Equation (12):

\[ z_t = \sigma(W^{(z)} x_t + U^{(z)} h_{t-1}), \tag{12} \]

wherein $x_t$ is the input vector at timestep $t$, $h_{t-1}$ is the hidden state at timestep $t-1$,
and $W^{(z)}$ and $U^{(z)}$ are weight matrices. The update gate helps the model decide how much
information from the past should be passed to the future. The reset gate mainly determines how much
past information needs to be forgotten and is calculated using Equation (13):

\[ r_t = \sigma(W^{(r)} x_t + U^{(r)} h_{t-1}), \tag{13} \]


wherein $W^{(r)}$ and $U^{(r)}$ are weight matrices. The new memory content $\tilde{h}_t$ uses the
reset gate to keep the relevant information from the past, as expressed by Equation (14):

\[ \tilde{h}_t = \tanh(W x_t + r_t \odot U h_{t-1}). \tag{14} \]
The input vector $x_t$ and the previous timestep information $h_{t-1}$ are linearly transformed by
the matrices $W$ and $U$, respectively. In the last step, the network retains the information of the
current unit and passes it to the next unit. The update gate determines how much is taken from the
current memory content $\tilde{h}_t$ and how much from the previous timestep $h_{t-1}$. This
process can be expressed in Equation (15):

\[ h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t, \tag{15} \]
where $h_t$ is the final result and $z_t$ is the output of the update gate, which controls the inflow
of information in the form of gating. The Hadamard product of $z_t$ and $h_{t-1}$ represents the
information from the previous timestep that is retained in the final memory.
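
A single GRU update following Equations (12) to (15) can be sketched as below. The sketch omits bias
terms, as the equations do, and the parameter dictionary keys (`Wz`, `Uz`, `Wr`, `Ur`, `W`, `U`) and
the function name `gru_step` are hypothetical names chosen for illustration.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sigmoid(v):
    """Element-wise logistic sigmoid."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def gru_step(x_t, h_prev, p):
    """Return h_t from the input x_t and the previous hidden state h_prev."""
    # Eq. (12): update gate.
    z = sigmoid([a + b for a, b in zip(matvec(p["Wz"], x_t), matvec(p["Uz"], h_prev))])
    # Eq. (13): reset gate.
    r = sigmoid([a + b for a, b in zip(matvec(p["Wr"], x_t), matvec(p["Ur"], h_prev))])
    # Eq. (14): candidate memory, with the reset gate applied to U h_{t-1}.
    Uh = matvec(p["U"], h_prev)
    h_tilde = [math.tanh(a + ri * b) for a, ri, b in zip(matvec(p["W"], x_t), r, Uh)]
    # Eq. (15): blend the previous state and the candidate memory.
    return [zi * hp + (1.0 - zi) * ht for zi, hp, ht in zip(z, h_prev, h_tilde)]


# Toy check (assumption): all-zero weights give z = 0.5 and a zero candidate memory,
# so the new state is exactly half of the previous one.
zeros_in = [[0.0], [0.0]]                  # 2 x 1 input weights
zeros_hid = [[0.0, 0.0], [0.0, 0.0]]       # 2 x 2 recurrent weights
params = {"Wz": zeros_in, "Uz": zeros_hid, "Wr": zeros_in,
          "Ur": zeros_hid, "W": zeros_in, "U": zeros_hid}
h_next = gru_step([1.0], [2.0, -4.0], params)
```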

4.5 Output Layer—Prediction


The $h_t$ finally obtained is regarded as the final embedding of a specific city at time $t$, which
contains the important temporal and spatial features from the real-world dataset. A multi-layer
perceptron (MLP) is used in the output layer to predict the increase in the number of confirmed
cases at a given timestep:
\[ \Delta C_t = \mathrm{MLP}(h_t). \tag{16} \]
The final predicted number of confirmed cases can be calculated according to Equation (17):
\[ C_t = C_m + \sum\nolimits_{lr}(\Delta C_t), \tag{17} \]
wherein $\sum_{lr}$ represents the calculation of the cumulative sum from left to right, and $C_m$
represents the number of confirmed cases at a certain timestep before the current forecast.
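
Equation (17) amounts to a running sum of the predicted daily increases on top of the last known
count. A minimal sketch, with the hypothetical function name `cumulative_cases` and made-up numbers:

```python
from itertools import accumulate


def cumulative_cases(c_m, deltas):
    """Add the left-to-right cumulative sum of predicted increases to the base count c_m."""
    return [c_m + s for s in accumulate(deltas)]


# Toy example (assumption): 100 confirmed cases so far, predicted increases of 5, 7, and 3.
forecast = cumulative_cases(100, [5, 7, 3])
```

Predicting increases rather than totals keeps the MLP output on a smaller scale, while the
cumulative sum restores the total count for each forecast step.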

ALGORITHM 2: STEP
Input: City spatio-temporal network.
Output: The predicted number of confirmed cases.
for $m = 1$; $m \le T$; $m$++ do
    $t = T_m$;
    Spread of epidemic formulation.
    for each layer in Graph Convolutional networks do
        for $i = 0$; $i < N$; $i$++ do
            for each $j \in \mathrm{Neighbor}\{i\}$ do
                $\alpha_{ij} = \exp(a[W_1 h_i \| W_2 h_j]) / \sum_{k \in \mathcal{N}_i} \exp(a[W_1 h_i \| W_2 h_k])$
            $h_i = \sigma(\sum_{j \in \mathcal{N}_i} \alpha_{ij} W h_j)$
            $h_i = \sigma(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j)$
        $H = h_i$;
        $A = A + I$;
        $D_{ii} = \sum_j a_{ij}$;
        $H^{(l+1)} = \sigma(D^{-\frac{1}{2}} A D^{-\frac{1}{2}} H^{(l)} W^{(l)})$
    Time series embedding.
    for $p = 0$; $p < N$; $p$++ do
        $z_{tp} = \sigma(W^{(z)} x_{tp} + U^{(z)} h_{(t-1)p})$
        $r_{tp} = \sigma(W^{(r)} x_{tp} + U^{(r)} h_{(t-1)p})$
        $\tilde{h}_{tp} = \tanh(W x_{tp} + r_{tp} \odot U h_{(t-1)p})$
        $h_{tp} = z_{tp} \odot h_{(t-1)p} + (1 - z_{tp}) \odot \tilde{h}_{tp}$
Predicted increased number of confirmed cases $= \mathrm{MLP}(h_{tp})$

4.6 Implementation of STEP


To better predict the number of people infected in a city, we first construct a city spatio-temporal network from real datasets. At each timestamp, we take the city’s own attributes into consideration, including medical conditions, economic conditions, and climate conditions.
Combining GCN with the attention mechanism, we formulate the spread of the epidemic. We calculate the possibility of epidemic spread in each city through the multi-head attention mechanism, replacing the geographical distance with a probabilistically motivated effective distance, and use GCN to aggregate the information of the target city with that of its neighbor cities. The result therefore accounts for the spread of the epidemic not only within a city but also between cities. For the temporal features, a GRU is used to process the results obtained at each time point, and the result obtained at the last time point is used as the embedding of the nodes. Finally, an MLP is used to predict the increase in the number of confirmed cases. The overall process is presented in Algorithm 2.
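The temporal step described above can be sketched as a single GRU update applied along the time axis. This is a hedged illustration: it uses the standard GRU candidate state $\tanh(W x_t + r_t \odot U h_{t-1})$, which is an assumption about the intended grouping of terms, and the parameter names, dimensions, and random inputs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU update for a single node; returns the new hidden state."""
    z = sigmoid(params["Wz"] @ x_t + params["Uz"] @ h_prev)  # update gate
    r = sigmoid(params["Wr"] @ x_t + params["Ur"] @ h_prev)  # reset gate
    # Candidate state in the standard GRU form (assumption).
    h_cand = np.tanh(params["W"] @ x_t + r * (params["U"] @ h_prev))
    # Blend the previous state and the candidate, as in Algorithm 2.
    return z * h_prev + (1.0 - z) * h_cand

rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
params = {name: rng.normal(size=(d_hid, d_in)) for name in ("Wz", "Wr", "W")}
params.update({name: rng.normal(size=(d_hid, d_hid)) for name in ("Uz", "Ur", "U")})

h = np.zeros(d_hid)
for x_t in rng.normal(size=(5, d_in)):   # five hypothetical timesteps
    h = gru_step(x_t, h, params)
# h now plays the role of the node embedding taken at the last time point.
```

Only the hidden state at the last timestep is kept, matching the description that the result at the last time point serves as the node embedding.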

5 EXPERIMENTS
5.1 Datasets and Graph Construction
We collect and preprocess a real-world epidemic dataset to verify the effectiveness of our method. The statistical details of the datasets are shown in Table 2. To observe the changes in the number of confirmed cases in each state more intuitively, we visualize the number of confirmed cases in 51 states, wherein the abscissa, ordinate, and applicate are the time series, the states, and the number of confirmed cases, respectively (see Figure 3).
• City Risk Index (CRI): CRI is proven to be an effective indicator to measure city infection risk amid COVID-19. The following is the definition of CRI:


Table 2. Detailed Information of Datasets

Fig. 3. The number of confirmed cases in 51 states.

Definition 5.1. CRI: Considering a city’s economy (i.e., GDP and FCI), technology (i.e., education and innovation), population, and geographical position (i.e., latitude and longitude), CRI is calculated by:
$\mathrm{CRI} = \frac{1}{N} \sum_{i=1}^{N} \langle X \cdot P_i \rangle + \mu$ (18)
$X, \mu = \arg\min_{X \in \mathbb{R}^6,\, \mu \in \mathbb{R}} \frac{1}{N} \sum_{i=1}^{N} (\langle X \cdot P_i \rangle + \mu - Q_i)$ (19)
In Equation (18), the vector $X$ and the scalar $\mu$ are learned parameters. Equation (19) shows how the parameters are solved, where $P_i$ is the ith row of matrix $P$, $Q_i$ is the ith entry of vector $Q$, and $\langle X \cdot P_i \rangle$ denotes the inner product of the two vectors. $P$ is a matrix containing the data above, and $Q$ is the data in the first column of matrix $P$. Figure 4 compares the Pearson product-moment correlation coefficients (PCCs) between each pair of influence factors, indicating the degree of linear correlation between the two factors. The result shows that there is a strong linear relationship among infections, GDP, education, innovation, and population, and that there is no linear relationship between location and the other factors.
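One way to read Equations (18) and (19) is as a linear fit of $X$ and $\mu$ followed by averaging the fitted values. The sketch below assumes a squared-residual objective solved by ordinary least squares, which is an assumption and is not stated in the text; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_cri(P, Q):
    """Fit X (length 6) and mu by least squares, then average the fitted values."""
    n = P.shape[0]
    design = np.hstack([P, np.ones((n, 1))])        # extra column for mu
    solution, *_ = np.linalg.lstsq(design, Q, rcond=None)
    X, mu = solution[:-1], solution[-1]
    cri = float(np.mean(P @ X + mu))                # Equation (18)
    return X, mu, cri

# Synthetic data: 20 cities, 6 factors each, generated from known parameters.
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 6))
true_X, true_mu = np.arange(1.0, 7.0), 0.5
Q = P @ true_X + true_mu
X, mu, cri = fit_cri(P, Q)
```

On this noise-free synthetic example the fit recovers the generating parameters, which makes it a convenient sanity check before applying the same routine to real factor data.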

1 2019ncov.chinacdc.cn/.
2 https://www.nytimes.com/interactive/2021/us/covid-cases.html/.
3 data.stats.gov.cn/.
4 maps.google.com/.
5 www.ncei.noaa.gov/.


Fig. 4. PCCs between influence factors.

• City medical capacity: More than half of the cities and counties in the United States do not have Intensive Care Units (ICUs), whose complex equipment can effectively treat COVID-19. The shortage of ICUs raises concerns about the further spread of COVID-19 and the cure rate, because areas without ICUs will delay the treatment of infected people. This work uses the ICU bed data of 51 states in the United States provided by Kaiser Health News (KHN) to reflect the medical capabilities of a state.
• City climate conditions: Humidity and temperature may affect the spread of COVID-19; the socio-economic and health demographics of warmer regions tend to differ from those of colder regions. The climate data include the daily average, minimum, and maximum temperatures for each state. Comparing the accuracy of the experimental results under the three temperature attributes shows that the lowest temperature yields the most accurate results. Therefore, this work uses the city’s daily lowest temperature to measure the city’s climate conditions.
In addition, we consider S (Susceptible) and R (Recovered) as additional features of the nodes, inspired by the underlying principle of the SIR model.

5.2 Baselines
We compare our model with several state-of-the-art spatio-temporal representation methods to evaluate STEP:
• SIR (Susceptible Infected Recovered): The classic SIR model consists of three ordinary differential equations, which can be used to understand how COVID-19 spreads among different groups of people over time.
• Regression: By observing the time-series epidemic data of each state, we find that the number of infected people in each state shows a gradually increasing trend over time. We therefore hypothesize a linear relationship between the number of infected people in a city and the duration of the epidemic, and we use a linear regression model to predict changes in infection trends.
• ARIMA: The ARIMA model is a statistical analysis method that uses time series data to predict future values by examining the differences between values in the time series.
• LSTM: In recent years, deep learning methods have performed well on prediction problems. The LSTM model retains the information that meets the conditions and forgets the information that does not.
• GCN: GCN can effectively solve graph-based problems, and better results can be obtained with lower time and space complexity. We use the data from May 1 to August 31, 2020, as the training set and use the softmax function to activate the embedding result and output the final prediction.
• STP-TrellisNets [38]: STP-TrellisNets is a spatio-temporal prediction network proposed to forecast metro station passenger flows. It employs multiple TrellisNets in series to model both the short-term and long-term correlations of metro station passenger flows. A GCN model integrated with TrellisNet is also utilized to capture the spatial correlation between the flows.
• DCRNN [29]: In DCRNN, traffic prediction on the road network is treated as a spatio-temporal forecasting problem. The spatial dependency is captured through bidirectional random walks, and the temporal dependency is captured by using an encoder-decoder architecture with scheduled sampling.
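For readers unfamiliar with the SIR baseline listed above, its three ordinary differential equations can be integrated with a simple forward Euler scheme. The sketch below is illustrative only: the function name, the transmission rate `beta`, the recovery rate `gamma`, and the initial population are hypothetical values and are not the settings used in the experiments.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=1.0):
    """Forward-Euler integration of dS/dt, dI/dt, and dR/dt."""
    n = s0 + i0 + r0                       # total population stays constant
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt   # flow from S to I
        new_recoveries = gamma * i * dt          # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Hypothetical population of 1,000 with 10 initial infections, simulated for 30 days.
history = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, days=30)
final_s, final_i, final_r = history[-1]
```

The three compartments always sum to the initial population, which is a quick way to check that the update rules are consistent.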

5.3 Experimental Settings


In the city spatio-temporal network of this work, we set N = 51 to be the number of US states, the static feature number $c_1$ to 2, the dynamic feature number $c_2$ to 2, and the number of timestamps T to 123. The data between May 1 and August 31 are used as the training set, and the data from September 1 to September 30 and from September 1 to October 30 are used as the 30-day and 60-day test sets, respectively. The learning rate is set to $1 \times 10^{-4}$, the number of training epochs is set to 2,000, and the number K of heads in the multi-head attention mechanism is set to 3.
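The value T = 123 and the two test horizons follow directly from the date ranges above when both endpoints are counted. A short check, assuming the year 2020 for all dates (variable and function names are illustrative):

```python
from datetime import date

def inclusive_days(start, end):
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1

T = inclusive_days(date(2020, 5, 1), date(2020, 8, 31))            # training period
short_term = inclusive_days(date(2020, 9, 1), date(2020, 9, 30))   # first test set
long_term = inclusive_days(date(2020, 9, 1), date(2020, 10, 30))   # second test set
print(T, short_term, long_term)  # 123 30 60
```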

5.4 Evaluation Metrics


In the experiments, we adopt the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) as the evaluation metrics:
$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$, (20)
$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$, (21)
wherein $\hat{y}_i$ is the value predicted by the evaluated method and $y_i$ is the corresponding observed value.
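Equations (20) and (21) translate directly into a few lines of NumPy. The sketch below uses made-up numbers purely to show the arithmetic; the function names are illustrative.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error, Equation (20)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root Mean Square Error, Equation (21)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical observed and predicted values for three timesteps.
observed = [3.0, 5.0, 8.0]
predicted = [2.0, 5.0, 11.0]
mae_value = mae(observed, predicted)     # (1 + 0 + 3) / 3
rmse_value = rmse(observed, predicted)   # sqrt((1 + 0 + 9) / 3)
```

Because RMSE squares the residuals before averaging, the single large error of 3 pulls it above the MAE in this example.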

5.5 Results and Analysis


In this work, we use the SIR model, regression model, ARIMA model, LSTM model, GCN, STP-TrellisNets, DCRNN, and our STEP model to predict the number of COVID-19 confirmed cases in 51 states in the United States. We verify the forecast performance of our method in both the short term and the long term (30 days and 60 days). Detailed experimental results can be found in Appendix A. Tables 3 and 4 show the forecast results of STEP and the baselines for the 30-day and 60-day time series data, respectively. To better illustrate the results, we divide all values by 1,000 and round to two decimal places. The best results are in bold. To show the experimental results more intuitively, we also present them as heat maps in Figures 5, 6, 7, and 8. The best results are in the darkest color.
As Table 3 shows, STEP achieves the best MAE in 30 states and the best RMSE in 29 states, and its MAE and RMSE values both follow a normal distribution. The MAEs of 48 states (94.1%) are between 2 and 14, of which the MAEs of 25 states are between 2 and 4. The MAEs of the remaining three states are between 22 and 24. The RMSEs of 49 states (96.1%) are between 2 and 18. The RMSEs of the remaining two states are between 22 and 24. With MAE as the evaluation metric, the average reduction rates over the 51 states relative to the baselines are 23.7%, 23.9%, 19.5%, 12.6%, 12.9%, 4.3%, and 3.8%. For RMSE, they are 25.9%, 25.8%, 25.9%, 15.8%, 15.1%, 7.3%, and 6.9%, respectively. We analyze the states where


Fig. 5. The prediction results in 30 days on MAE.

Fig. 6. The prediction results in 30 days on RMSE.

STEP does not achieve the best results and find that the baselines achieve very low errors in these states. For example, in DE, regression achieves an RMSE of 0.12 and LSTM achieves an MAE of 0.55. After analyzing the changes in the number of confirmed cases in these states, we find that the number of confirmed cases shows a linear growth trend. The fact that regression achieves the best results in 7 of the remaining 14 states supports this conjecture.
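The text does not give a formula for the average reduction rate. One plausible reading, stated here as an assumption, is the per-state relative error reduction (baseline minus STEP, divided by baseline) averaged over states. The sketch below applies that reading to the 30-day MAE values of only three states (AK, HI, and WY) from Table 3, so its output is not comparable to the 51-state averages reported above.

```python
def average_reduction_rate(baseline_errors, model_errors):
    """Mean of (baseline - model) / baseline across states (assumed definition)."""
    rates = [(b - m) / b for b, m in zip(baseline_errors, model_errors)]
    return sum(rates) / len(rates)

# 30-day MAE values for AK, HI, and WY taken from Table 3.
sir_mae = [0.55, 0.44, 0.37]
step_mae = [0.36, 0.42, 0.31]
rate = average_reduction_rate(sir_mae, step_mae)
print(f"{rate:.1%}")
```

A positive rate means STEP has the lower error on average; a state where the baseline wins contributes a negative term.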
As shown in Table 4, STEP achieves the best MAE in 27 states and the best RMSE in 29 states. The MAEs of 46 states are between 2 and 14, of which 22 are between 2 and 4, and the MAEs of the remaining 5 states are between 22 and 26. With MAE as the evaluation metric, compared with SIR, ARIMA, regression, LSTM, and GCN, the average reduction rates in 51 states are 64.3%, 65.5%, 67.5%, 61.6%, 34.9%, 29.6%, 27.9%, 23.3%, and 19.8%. With RMSE as the evaluation metric, the average reduction rates in 51 states are 66.2%, 65.8%, 71.7%, 59.9%, 34%, 29.3%, 22.7%,

Fig. 7. The prediction results in 60 days on MAE.

Fig. 8. The prediction results in 60 days on RMSE.

19.2%, and 10.8%, respectively. The comparison between STEP and the two traffic flow prediction methods shows that traffic flow prediction methods cannot achieve good results when they are applied directly to epidemic prediction. Our analysis suggests that this is because traffic flow prediction tasks usually have only one or two node attributes, whereas the nodes in epidemic prediction tasks have higher-dimensional attributes, so our method achieves better results.
We analyze the differences between STEP and the other methods from two aspects: prediction accuracy and prediction stability.
(1) Higher prediction accuracy: We find that neural network methods such as STEP and LSTM usually have better prediction accuracy than the other baselines. For example, in Kansas (KS), the SIR model, ARIMA model, and regression model do not perform well because the number of confirmed cases fluctuates continuously. This may be because mathematical methods such as


Fig. 9. Case study of confirmed cases prediction. Case 1 shows the actual confirmed cases and our prediction results for three states: Maryland, South Carolina, and Virginia. Case 2 includes Colorado, New York, and Oregon. Case 3 includes Idaho, Kansas, and New Jersey. Case 4 includes Maine and Montana.

SIR and ARIMA have difficulty handling complex non-stationary time series data. In addition, the prediction accuracy of the ARIMA model is relatively low, mainly because its error is computed for each node and then averaged. If some of the data fluctuate, the final overall error also increases.
(2) Stable prediction results: Regardless of the changes in the number of confirmed cases and across different states, the STEP model obtains the best prediction performance from the time series features, and the variation in its prediction results is small. This indicates that our model is particularly sensitive to changes in the number of confirmed cases and, through the influence of a variety of feature factors, has a certain degree of stability.
Four common curve patterns from the experiment are shown in Figure 9, with several representative states selected as examples. The solid line represents the actual values, and the dashed line represents the forecast results.
In the first group (Case 1), the forecast results fit the actual values well. This pattern is observed in states with more confirmed cases, for which our proposed STEP model generally fits well. The number of confirmed cases in Maryland, South Carolina, and Virginia exceeds the average number of infections across the 51 states.
In the second group (Case 2), the two curves fit in the first half but deviate in the second half. We investigate the policy decisions of these three states and find that Colorado opened certain public places on September 28, 2020, and resumed certain activities to reduce the impact of the epidemic on tenants. New York announced on September 10, 2020, that indoor dining in New York would resume from September 30 (time series 30 in the figure). Oregon announced on October 6 that it would deploy a large number of rapid COVID-19 tests. These events may lead to a sudden increase in the number of confirmed cases, and our model does not fit well in the later stage because such influences are not considered.


In the third group (Case 3), the two ends of the curves partially fit, but the mid-term deviation is significant. The state of Idaho announced on August 25 that it would continue to implement the Idaho Rebounds plan for two weeks and remain in Stage 4 during the subsequent period. Governor Brad Little announced that Idaho would remain in Stage 4 of the Idaho Rebounds plan for another two weeks (the fourth stage represents a lower degree of openness) and would be restored to the third stage in February 2021. Between August 24 and September 8, several counties in the state of Kansas issued emergency public health orders. Between August 18 and September 23, New Jersey continued to add states and regions to its travel restriction list. These policies have limited the spread of COVID-19 to some extent, so the number of cases forecast by the model is higher than the actual number.
In the fourth group (Case 4), there is a large difference between the forecast curve and the actual curve, and the forecast value is higher than the actual value. Maine extended the State of Civil Emergency in September. In September, Montana announced measures for K-12 schools to respond to confirmed or suspected cases. These policies may inhibit the spread of COVID-19, leading to deviations in our forecast results.6

5.6 Ablation Study


We conduct an ablation study to verify the effectiveness of the key strategies we propose. The visualizations of the experimental results are shown in Figures 10, 11, 12, and 13. The detailed experimental results are presented in Tables 5 and 6 in Appendix A.
STEP improves the forecast accuracy of the model by introducing CRI coefficients. In Tables 5 and 6, we summarize the results with CRI (STEP) and without CRI (STEP-NC). In the 30-day time series data, compared with STEP, the MAE of STEP-NC increases by 30.3% and the RMSE increases by 46.8%. In the 60-day time series data, the MAE of STEP-NC increases by 9.2% and the RMSE increases by 6.7%. The results show that CRI improves the short-term forecast performance significantly, but the improvement in the long-term forecast is not obvious. This may be because the impact of CRI on the model decreases as the amount of data increases, resulting in a small difference between using and not using CRI in the long-term forecast.
To verify the effectiveness and efficiency of the GRU module, we replace the GRU module in the STEP model with LSTM and name the variant STEP-LS. As shown in Tables 5 and 6, STEP achieves better results than STEP-LS. In the 30-day time series data, compared with STEP, the MAE and RMSE of STEP-LS increase by 26% and 27%, respectively. In the 60-day time series data, compared with STEP, the MAE of STEP-LS increases by 16.9%. It can be concluded that the two modules have almost the same influence on the prediction results. However, the time complexity of GRU is lower than that of LSTM, so the time cost of STEP is also lower than that of STEP-LS. We do not design experiments to prove this conclusion, as it has been confirmed by previous work [8, 16].

6 CONCLUSION
Provisioning services that can predict the number of confirmed cases is of great significance in combating the COVID-19 epidemic. Accurate predictions can alert people to prepare in advance. In this article, we have proposed STEP (Spatio-Temporal Epidemic Prediction) to forecast the number of confirmed cases. First, we take CRI (City Risk Index), medical conditions, and weather conditions as node features and geographical adjacency information as edges, and construct a spatio-temporal network with the day as the time step. After using GCN and GRU to embed the nodes in the network, we use an MLP (Multilayer Perceptron) to obtain the prediction result. Compared with traditional prediction algorithms and
6 Information on the above policies can be found at: https://www.huschblackwell.com.


Fig. 10. Ablation results in 30 days on MAE.

Fig. 11. Ablation results in 30 days on RMSE.

Fig. 12. Ablation results in 60 days on MAE.

Fig. 13. Ablation results in 60 days on RMSE.


machine learning algorithms, our approach greatly reduces the prediction error. STEP can accurately predict the number of confirmed cases. By using our method, people can prepare well in advance to deal with an outbreak of the epidemic.
The outbreaks of SARS and MERS in recent years have had a great impact on the global economy and on world development. Many countries have called these outbreaks a ‘disaster’. Governments and authoritative organizations say that the most effective way to combat them is social distancing, but this also affects social development and people’s normal lives. Without an adequate predictive method to prepare beforehand, an infection in one city may spread rapidly through the transportation network to the entire country. Therefore, a reasonable, accurate, and long-term effective forecast service for the spread of the epidemic is helpful for policy release, epidemic prevention and control, and economic stability, and it has far-reaching significance. Our work provides a prediction method for this problem that is superior to other algorithms in both the short and long term and greatly reduces the error, thus providing society with more accurate information on the spread of the epidemic to better combat COVID-19. In future work, we will consider transportation information in the spread of the epidemic for more precise prediction.


APPENDIX
A DETAILED EXPERIMENTAL RESULTS

Table 3. The Prediction Results in 30 Days

Model Metric AK HI WY DE WV NH DC ME VT MT MI AL NE ID NM ND NV
SIR MAE 0.55 0.44 0.37 5.07 1.31 1.23 1.42 4.08 1.52 1.60 21.32 34.01 1.57 9.56 8.22 1.28 16.28
SIR RMSE 0.65 0.51 0.39 5.08 1.34 1.23 1.43 4.08 1.52 1.73 21.36 34.14 1.58 9.67 8.27 1.39 19.11
ARIMA MAE 0.37 0.05 0.70 1.13 3.11 0.84 1.59 0.22 1.22 3.15 8.56 8.12 2.94 4.05 1.29 6.19 2.06
ARIMA RMSE 0.39 0.06 0.81 1.17 3.26 0.84 1.62 0.22 1.22 3.34 8.86 8.20 3.25 4.09 1.42 6.60 2.57
Regression MAE 0.56 0.52 0.72 2.19 2.76 1.82 1.05 0.55 1.14 2.24 1.06 1.17 1.28 4.49 2.61 1.82 2.15
Regression RMSE 0.61 0.53 0.73 0.12 3.12 1.84 1.15 0.52 1.71 2.28 1.34 1.59 1.19 4.56 3.10 1.96 2.39
LSTM MAE 0.46 0.53 0.17 0.55 0.64 1.20 1.06 1.44 1.53 1.55 13.60 41.63 1.31 1.22 2.39 1.65 22.48
LSTM RMSE 0.58 0.51 0.12 0.48 0.60 1.19 1.05 1.14 1.52 1.46 13.45 41.51 1.10 1.96 2.35 1.55 22.26
GCN MAE 0.45 0.52 2.49 1.46 1.11 1.31 1.60 1.31 2.82 1.64 2.05 1.18 2.52 2.03 2.28 2.78 2.99
GCN RMSE 0.42 0.52 2.46 1.47 1.33 1.59 1.75 1.46 2.79 2.02 2.14 1.25 2.45 2.46 2.68 2.75 2.45
STP MAE 0.62 0.93 0.78 0.74 3.51 1.52 1.31 2.23 1.22 1.28 1.68 1.29 3.63 1.36 1.16 4.20 2.12
STP RMSE 0.87 0.64 0.85 2.43 2.59 1.81 3.05 1.55 2.00 2.95 1.41 1.14 4.31 2.14 2.22 5.34 1.91
DCRNN MAE 0.54 0.90 0.18 0.42 0.95 1.15 1.74 3.16 1.27 2.06 1.33 2.48 2.72 2.26 2.24 7.89 2.76
DCRNN RMSE 0.84 0.39 0.09 1.65 1.00 1.11 3.87 1.04 1.38 2.65 0.99 1.55 8.54 3.68 1.87 6.62 2.94
STEP MAE 0.36 0.42 0.31 0.92 0.92 0.74 0.90 1.25 1.86 1.50 1.34 0.63 1.64 1.16 1.85 2.06 2.06
STEP RMSE 0.32 0.45 0.31 1.19 0.91 0.77 1.02 1.05 1.86 1.43 2.16 0.91 1.56 1.72 2.06 2.52 2.02

Model Metric AR IN MS OR SC SD PA NC UT CO KY CT RI MO LA WA OK
SIR MAE 1.86 9.56 2.90 3.50 32.39 3.07 15.09 7.39 2.80 4.52 5.51 8.23 5.43 33.27 6.24 17.42 9.93
SIR RMSE 1.86 9.67 3.48 3.83 33.78 3.17 15.17 7.58 3.39 5.60 6.55 8.27 5.54 34.18 7.46 18.49 10.58
ARIMA MAE 6.54 4.05 4.18 3.03 1.48 5.08 3.09 6.44 3.40 1.51 12.07 2.10 5.60 28.61 4.65 6.19 15.23
ARIMA RMSE 6.90 4.09 4.46 3.05 1.72 5.43 3.27 6.63 3.63 1.90 12.38 2.11 5.13 29.50 5.86 6.54 15.87
Regression MAE 1.07 4.49 2.03 3.69 1.85 3.07 3.54 3.92 4.58 5.11 6.26 3.24 2.41 5.88 3.90 3.62 7.80
Regression RMSE 1.49 4.56 2.47 4.16 2.44 3.28 3.61 4.72 4.17 5.20 6.34 3.14 3.09 5.58 4.77 3.34 8.31
LSTM MAE 1.08 1.22 4.50 2.76 8.71 3.55 3.05 3.77 3.16 5.82 10.22 3.24 4.99 17.25 2.03 7.14 2.72
LSTM RMSE 1.19 1.96 4.02 3.53 7.39 3.44 3.83 3.07 2.32 5.46 9.93 3.08 4.90 14.82 1.51 6.28 2.89
GCN MAE 3.98 2.03 1.45 4.84 3.65 2.81 4.37 2.39 3.65 4.06 1.50 5.54 4.18 2.27 11.70 3.14 2.44
GCN RMSE 4.65 2.46 1.27 4.76 3.85 2.79 4.86 2.57 3.90 4.54 1.71 7.45 4.28 2.35 12.49 3.35 3.16
STP MAE 3.62 1.50 2.45 2.21 1.59 4.85 3.72 2.61 7.28 3.00 2.45 6.26 4.15 2.32 10.30 6.44 13.03
STP RMSE 7.44 4.48 0.83 7.99 3.49 2.00 5.52 5.14 2.67 2.36 2.17 11.03 5.65 3.62 23.86 4.31 5.07
DCRNN MAE 3.11 2.94 4.70 1.99 13.50 6.53 3.11 3.10 5.25 1.86 1.59 7.32 7.49 3.61 8.55 5.00 3.43
DCRNN RMSE 5.13 5.95 1.10 5.22 8.87 2.61 5.09 9.71 3.74 3.97 1.59 11.69 8.13 6.01 36.74 5.97 3.47
STEP MAE 0.79 1.16 1.59 2.37 2.51 2.62 2.43 2.11 2.78 0.76 1.43 0.92 2.51 1.46 1.31 2.88 2.29
STEP RMSE 1.17 1.72 2.33 2.97 2.47 2.52 2.12 2.46 2.78 1.25 1.64 1.42 3.27 2.34 1.50 2.96 2.84

Model Metric NJ IA WI KS OH TN MN IL MA VA MD FL GA TX CA NY AZ
SIR MAE 37.21 1.36 10.79 14.46 3.91 7.47 9.25 58.72 11.27 15.73 16.38 170.33 46.53 40.95 124.08 79.60 32.28
SIR RMSE 37.23 1.46 11.17 14.88 3.99 9.80 9.25 59.85 11.68 18.42 18.34 207.58 47.85 54.55 174.21 79.65 37.44
ARIMA MAE 5.65 12.18 16.10 9.65 10.09 16.73 6.80 23.86 12.04 6.67 1.68 29.28 16.40 27.20 33.25 9.38 38.29
ARIMA RMSE 5.67 12.59 18.47 9.88 10.14 16.78 7.12 24.55 12.26 6.76 1.93 35.81 17.19 28.49 33.98 9.41 40.25
Regression MAE 5.02 1.37 6.83 6.89 3.85 1.08 4.39 5.49 10.77 8.06 1.85 2.75 21.14 6.39 20.56 5.37 30.73
Regression RMSE 5.67 1.70 6.21 7.24 4.06 1.16 4.94 5.25 14.51 8.39 1.35 3.33 21.35 6.33 24.61 5.26 31.12
LSTM MAE 12.31 7.31 4.57 6.73 3.75 13.58 4.55 7.89 10.83 26.65 38.14 59.93 24.50 25.86 31.60 35.09 27.32
LSTM RMSE 11.80 7.23 5.48 6.01 3.64 12.56 4.36 7.15 10.60 26.07 37.76 57.55 23.20 24.71 31.59 34.43 27.15
GCN MAE 11.09 5.98 9.02 3.60 11.94 7.57 9.47 11.42 12.37 4.56 9.47 13.26 20.62 22.20 27.53 26.14 32.47
GCN RMSE 11.96 7.53 9.19 4.53 11.79 7.82 9.73 11.78 12.04 4.90 9.37 14.36 20.75 25.79 23.74 28.76 33.69
STP MAE 6.10 6.22 5.81 5.76 23.76 2.11 10.99 12.22 18.18 15.96 15.25 7.69 29.90 3.51 17.07 32.94 58.45
STP RMSE 20.21 5.80 3.17 8.65 18.98 2.06 6.32 22.15 11.80 4.45 9.37 18.96 20.13 4.75 34.19 28.76 35.04
DCRNN MAE 3.66 9.20 2.38 11.12 28.99 9.23 9.56 9.41 24.73 16.79 15.25 12.92 23.92 39.57 10.75 49.08 75.40
DCRNN RMSE 11.52 5.97 3.29 14.97 17.65 21.23 5.12 40.53 9.44 43.02 17.71 11.18 18.32 19.27 43.42 14.67 53.96
STEP MAE 3.09 1.20 2.98 1.19 2.29 2.65 1.47 6.84 1.33 2.86 1.25 11.01 11.96 4.70 11.54 5.00 10.71
STEP RMSE 3.79 1.36 3.14 1.96 2.74 4.58 2.21 7.05 2.16 2.95 1.47 11.57 11.58 4.60 11.02 5.33 11.07


Table 4. The Prediction Results in 60 Days

Model Metric AK VT MT NE DE ME ND SD HI DC IN NH CO MS WV WY ID
SIR MAE 1.31 2.36 3.04 1.68 5.23 3.68 5.02 7.53 3.32 9.75 4.26 6.16 3.35 3.39 4.49 4.15 7.41
SIR RMSE 1.45 2.37 3.38 1.70 5.23 3.70 5.60 7.81 2.93 9.86 5.47 6.19 3.75 3.87 4.86 4.24 8.09
ARIMA MAE 0.90 2.01 4.17 2.68 3.19 3.54 5.54 3.30 2.98 8.72 17.22 5.96 4.99 18.21 3.85 4.96 11.23
ARIMA RMSE 1.49 2.02 4.57 3.24 3.23 3.55 6.63 4.44 2.59 7.90 18.71 6.00 5.15 18.37 4.23 5.07 11.46
Regression MAE 2.53 2.23 1.36 8.98 2.22 3.57 1.19 1.30 6.12 7.65 33.90 3.26 4.07 39.73 4.25 5.56 4.93
Regression RMSE 2.68 2.31 2.05 9.30 3.17 3.72 1.62 2.16 6.81 7.11 41.17 3.76 5.34 42.00 4.06 5.72 5.89
LSTM MAE 1.23 2.22 1.21 1.52 1.80 4.04 2.11 1.04 5.38 3.22 3.34 4.04 16.96 5.06 4.27 4.11 3.99
LSTM RMSE 1.17 2.42 1.60 1.73 1.95 4.05 2.44 1.25 5.08 3.23 3.66 4.05 17.91 5.60 4.32 4.16 4.05
GCN MAE 0.77 3.83 3.26 2.89 2.40 2.14 3.43 4.44 3.95 2.30 2.35 3.16 3.68 2.92 3.31 3.12 3.82
GCN RMSE 0.86 3.96 3.51 2.63 2.75 2.58 3.73 4.16 4.56 2.72 2.86 3.78 3.79 3.57 3.76 3.76 4.00
STP MAE 1.59 2.22 2.17 2.77 3.60 2.50 2.05 5.57 3.39 3.80 3.71 6.79 21.20 4.35 5.75 3.53 4.35
STP RMSE 1.87 1.47 1.22 3.11 1.91 6.80 2.07 6.33 3.75 6.14 2.67 7.17 13.79 4.31 5.64 2.25 6.80
DCRNN MAE 0.64 1.35 2.84 5.40 3.89 2.16 5.25 1.68 3.61 2.62 2.75 1.86 2.94 3.65 2.77 5.85 1.99
DCRNN RMSE 1.89 2.14 4.77 5.02 3.19 2.43 6.23 4.66 2.51 3.48 2.23 6.27 7.20 3.00 4.57 5.93 4.28
STEP MAE 0.48 3.12 2.34 2.37 1.67 2.11 2.64 3.00 2.31 1.65 1.92 2.44 2.50 2.27 3.18 3.28 2.86
STEP RMSE 0.49 3.87 2.75 2.79 1.73 2.30 3.76 4.58 2.03 1.74 2.09 2.96 2.61 2.56 3.20 3.32 3.04

Model Metric UT KY OH OK OR NM WI AR CT IA AL LA KS NV RI MD WA
SIR MAE 31.99 2.65 4.35 7.97 14.12 7.86 8.78 7.94 10.00 11.01 27.84 7.68 9.68 9.15 17.23 8.55 12.06
SIR RMSE 32.93 4.34 5.11 8.61 14.66 7.96 9.20 8.51 10.16 11.15 29.42 8.92 10.84 13.04 17.28 12.35 13.60
ARIMA MAE 4.52 16.15 20.41 23.11 5.87 8.11 19.19 12.20 4.81 10.74 26.55 22.44 12.09 16.76 8.46 11.07 10.22
ARIMA RMSE 5.10 17.51 21.07 24.67 5.98 8.36 21.93 12.77 4.95 12.72 26.85 22.70 13.24 16.83 8.47 11.28 10.26
Regression MAE 3.36 6.39 40.78 5.78 3.64 8.12 5.02 9.74 3.67 11.84 32.88 73.30 4.48 7.16 8.95 6.79 6.40
Regression RMSE 4.43 8.78 47.64 6.01 4.79 7.84 6.45 12.87 5.06 13.97 45.62 118.53 6.11 10.68 8.55 10.18 8.79
LSTM MAE 4.20 34.07 6.11 0.97 15.24 4.49 5.37 4.02 9.82 5.09 5.97 7.47 17.79 5.29 5.06 28.00 10.54
LSTM RMSE 4.33 37.56 6.17 1.36 15.52 4.52 5.38 4.08 9.71 6.42 6.87 9.03 19.14 5.87 5.31 29.11 11.30
GCN MAE 4.43 2.35 5.02 3.71 3.66 3.58 6.40 4.64 4.13 6.75 3.53 7.26 5.16 8.57 4.48 7.45 4.72
GCN RMSE 4.36 2.77 5.05 4.41 4.13 3.20 6.96 4.45 4.99 6.29 3.12 8.88 5.47 8.88 4.26 7.74 5.05
STP MAE 34.87 26.92 3.85 4.70 10.03 2.83 5.88 3.42 10.80 3.26 7.76 9.04 33.62 8.78 18.09 28.28 15.44
STP RMSE 27.66 51.08 7.10 9.56 16.57 8.86 7.27 4.90 9.61 3.66 11.89 17.07 10.14 5.64 19.01 17.17 16.46
DCRNN MAE 4.93 1.69 4.57 23.34 5.34 6.84 23.03 8.58 7.81 3.38 2.19 7.26 7.33 9.77 10.49 10.80 11.24
DCRNN RMSE 6.63 5.26 2.73 14.31 7.53 6.14 13.38 3.29 3.29 10.00 3.49 8.35 9.03 12.17 4.24 6.35 8.00
STEP MAE 3.14 2.02 3.78 2.79 2.83 2.49 5.25 3.45 3.61 5.96 3.19 7.11 3.97 4.40 3.91 6.12 3.15
STEP RMSE 4.76 2.22 3.81 4.02 4.19 3.18 5.22 3.39 3.74 5.14 3.14 7.25 3.13 4.83 4.21 6.36 5.01

Model Metric MN PA MA MI NC NJ MO VA TN SC GA AZ IL TX NY CA FL
SIR MAE 9.04 11.68 13.27 18.44 8.45 39.12 25.74 19.20 17.49 17.53 46.53 46.48 38.16 33.24 86.52 128.08 89.01
SIR RMSE 9.05 12.19 13.63 18.66 9.35 39.17 27.09 22.69 19.54 17.81 47.85 50.27 43.61 43.60 86.79 128.23 139.03
ARIMA MAE 7.94 5.71 13.00 10.66 18.71 10.90 38.44 20.00 41.60 18.07 16.40 21.43 21.09 143.08 38.05 75.57 117.38
ARIMA RMSE 8.85 5.99 14.41 11.72 19.00 11.31 41.65 20.64 42.53 18.09 17.19 25.67 25.08 143.08 38.77 100.06 119.17
Regression MAE 15.81 25.58 72.61 35.16 14.39 12.86 10.24 18.72 21.17 37.14 21.14 23.11 28.18 62.54 56.77 124.87 60.65
Regression RMSE 15.74 31.18 73.76 49.58 14.54 18.21 14.48 19.70 29.38 41.38 21.35 31.77 33.96 84.53 59.23 126.10 85.25
LSTM MAE 11.21 12.74 9.84 10.30 11.49 9.82 21.40 15.43 12.68 58.33 24.50 42.35 47.19 28.49 60.29 83.99 163.53
LSTM RMSE 11.60 13.08 10.00 10.60 12.15 9.03 25.43 16.93 14.96 59.35 23.20 41.93 41.81 25.51 58.84 82.34 158.29
GCN MAE 12.94 13.22 12.59 12.71 14.13 45.80 12.48 12.91 23.85 42.90 20.62 14.28 34.02 15.16 38.99 27.53 13.85
GCN RMSE 13.49 14.98 12.63 12.99 14.50 44.98 13.01 13.60 24.56 43.78 20.75 15.03 38.46 17.66 40.98 23.74 13.98
STP MAE 18.61 12.03 11.71 13.80 10.11 15.81 27.39 19.58 10.67 15.43 43.37 71.57 23.60 29.92 30.15 46.19 322.15
STP RMSE 19.14 12.56 8.60 10.81 23.69 11.65 49.84 21.56 15.24 10.69 39.44 60.80 22.16 54.06 50.60 85.63 248.52
DCRNN MAE 10.48 5.02 12.97 20.21 24.44 44.88 12.11 23.00 23.71 20.60 27.01 25.56 53.07 134.50 21.05 42.12 8.45
DCRNN RMSE 7.55 7.13 7.20 14.81 14.36 53.53 16.26 26.01 22.54 14.47 28.43 18.94 54.23 84.42 79.09 23.74 21.67
STEP MAE 6.23 8.86 7.16 8.17 12.66 13.87 12.30 13.12 13.01 12.92 11.96 11.36 22.53 13.01 22.62 20.48 11.77
STEP RMSE 6.60 9.22 7.43 8.48 13.64 12.09 12.64 15.09 14.65 14.35 11.58 12.38 23.12 14.79 23.43 16.49 11.81


Table 5. Ablation Results in 30 Days

Model Metric WY AK HI AL CO AR DE NH DC WV CT KS IA MD ID MI IN
STEP MAE 0.31 0.36 0.42 0.63 0.76 0.79 0.92 0.74 0.90 0.92 0.92 1.19 1.20 1.25 1.16 1.34 1.16
STEP RMSE 0.31 0.32 0.45 0.91 1.25 1.17 1.19 0.77 1.02 0.91 1.42 1.96 1.36 1.47 1.72 2.16 1.72
STEP-NC MAE 1.18 0.76 0.88 0.76 1.03 0.94 1.26 0.97 1.19 1.07 1.17 1.44 1.36 1.47 1.32 1.55 1.37
STEP-NC RMSE 1.33 1.02 1.15 1.15 1.57 1.44 1.84 1.04 1.82 1.31 1.28 2.21 2.07 2.30 2.03 2.39 2.10
STEP-LS MAE 0.45 0.59 0.60 0.72 0.70 0.67 0.86 1.04 1.02 1.16 1.27 1.17 1.14 1.01 1.29 1.30 1.40
STEP-LS RMSE 0.60 0.61 0.90 1.23 1.48 1.56 1.62 0.83 1.25 0.93 1.59 1.96 1.44 1.75 2.18 3.11 1.57

Model Metric LA KY ME MA MN MO MT NE NM NV MS NC ND VT OR OH OK
STEP MAE 1.31 1.43 1.25 1.33 1.47 1.46 1.50 1.64 1.85 2.06 1.59 2.11 2.06 1.86 2.37 2.29 2.29
STEP RMSE 1.50 1.64 1.05 2.16 2.21 2.34 1.43 1.56 2.06 2.02 2.33 2.46 2.52 1.86 2.96 2.74 2.84
STEP-NC MAE 1.44 1.41 1.44 1.56 1.61 1.69 1.78 1.84 2.14 1.88 4.18 2.22 2.36 2.88 2.37 2.39 2.47
STEP-NC RMSE 2.24 2.19 1.23 2.36 2.43 2.59 1.70 2.78 2.29 1.87 4.46 3.58 3.75 2.89 2.97 3.79 2.97
STEP-LS MAE 1.39 2.00 1.55 1.44 1.76 1.80 1.85 2.31 1.61 2.88 2.08 1.77 2.12 2.23 3.37 2.91 4.37
STEP-LS RMSE 1.55 1.53 1.19 2.66 1.83 2.13 1.26 2.32 2.84 1.90 3.29 2.46 2.14 3.11 4.26 4.03 3.41

Model Metric PA SC RI SD TN UT VA WA WI NJ TX NY IL AZ FL GA CA
STEP MAE 2.43 2.51 2.51 2.62 2.65 2.78 2.86 2.88 2.98 3.09 4.70 5.00 6.84 10.71 11.01 11.96 11.54
STEP RMSE 2.12 2.47 3.27 2.52 4.58 2.78 2.95 2.96 3.14 3.79 4.60 5.33 7.05 11.07 11.57 11.58 11.02
STEP-NC MAE 2.59 2.66 2.72 2.73 2.80 2.95 3.02 3.05 3.10 4.36 5.84 6.23 7.06 12.89 12.12 13.16 33.25
STEP-NC RMSE 4.20 4.43 3.33 3.55 4.66 2.87 4.97 4.12 5.24 5.08 5.72 6.44 7.15 12.39 11.83 13.84 33.98
STEP-LS MAE 3.23 4.99 3.84 4.01 4.77 4.78 4.52 3.92 4.05 4.42 5.55 6.00 8.34 10.88 11.45 13.40 15.12
STEP-LS RMSE 2.12 3.78 5.69 4.96 5.18 3.67 5.31 5.68 4.40 3.56 7.08 4.96 9.31 11.25 15.74 11.35 11.90

Table 6. Ablation Results in 60 Days

Model    Metric  AK    DC    DE    IN    ME    KY    HI    MS    NE    MT    NH    CO    NM    ND    OK    OR    ID
STEP     MAE     0.48  1.65  1.67  1.92  2.11  2.02  2.31  2.27  2.37  2.34  2.44  2.50  2.49  2.64  2.79  2.83  2.86
STEP     RMSE    0.49  1.74  1.73  2.09  2.30  2.22  2.03  2.56  2.79  2.75  2.96  2.61  3.18  3.76  4.02  4.19  3.04
STEP-NC  MAE     1.26  1.89  1.90  2.12  2.27  2.21  3.16  2.47  2.57  2.51  2.62  2.71  2.75  2.86  2.94  2.95  3.07
STEP-NC  RMSE    1.46  2.25  2.21  2.40  2.50  2.46  3.26  2.63  2.66  2.64  2.69  3.12  2.72  2.80  2.85  2.96  3.38
STEP-LS  MAE     0.47  1.83  2.17  2.50  2.15  2.63  2.26  2.36  2.18  2.93  2.78  2.73  3.24  3.12  3.38  3.85  3.58
STEP-LS  RMSE    0.66  2.28  2.37  2.09  2.67  3.04  2.68  2.56  2.59  3.19  3.29  3.21  2.93  4.17  4.06  3.94  3.37

Model    Metric  SD    VT    UT    AL    WA    WV    WY    AR    OH    CT    RI    KS    NV    WI    IA    MD    MN
STEP     MAE     3.00  3.12  3.14  3.19  3.15  3.18  3.28  3.45  3.78  3.61  3.91  3.97  4.40  5.25  5.96  6.12  6.23
STEP     RMSE    4.58  3.87  4.76  3.14  5.01  3.20  3.32  3.39  3.81  3.74  4.21  3.13  4.83  5.22  5.14  6.36  6.60
STEP-NC  MAE     3.07  3.21  3.24  3.33  3.35  3.36  3.40  3.58  4.86  3.83  4.06  4.19  4.59  5.36  5.15  6.31  6.42
STEP-NC  RMSE    2.98  3.04  3.02  3.76  3.08  3.21  3.15  4.03  4.81  4.17  4.92  3.42  4.69  6.11  5.41  6.58  6.61
STEP-LS  MAE     4.41  3.12  3.49  4.37  3.84  4.23  3.58  4.49  3.48  4.58  3.75  4.53  4.93  6.72  8.22  7.28  6.79
STEP-LS  RMSE    6.14  3.95  6.57  3.39  4.96  4.22  3.69  4.44  3.96  5.16  4.00  4.35  6.23  5.48  5.55  6.30  8.12

Model    Metric  LA    MA    MI    PA    FL    MO    NC    TN    TX    SC    AZ    VA    NJ    GA    CA    IL    NY
STEP     MAE     7.11  7.16  8.17  8.86  11.77 12.30 12.66 13.01 13.01 12.92 11.36 13.12 13.87 11.96 20.48 22.53 22.62
STEP     RMSE    7.25  7.43  8.48  9.22  11.81 12.64 13.64 14.65 14.79 14.35 12.38 15.09 12.09 11.58 16.49 23.12 23.43
STEP-NC  MAE     7.26  7.38  8.34  9.01  11.94 12.48 12.79 13.15 13.13 13.04 13.52 13.30 12.13 13.96 25.00 22.68 22.79
STEP-NC  RMSE    7.50  7.60  8.56  9.88  12.28 12.66 12.78 12.98 16.98 15.93 15.95 13.07 12.37 12.29 20.50 22.70 27.25
STEP-LS  MAE     9.60  8.02  8.01  12.76 13.54 14.51 13.80 11.84 13.01 17.18 13.18 14.56 13.32 16.39 22.32 28.84 25.33
STEP-LS  RMSE    7.83  8.25  7.80  13.18 14.64 14.41 17.73 21.10 16.56 13.35 15.48 21.58 11.85 13.78 20.94 29.59 24.84
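
The ablation tables report two error metrics per state, MAE and RMSE. As a minimal sketch of how these metrics are conventionally defined, the following Python snippet computes both for one series of predictions. The function names and the sample values are hypothetical illustrations; they are not taken from the authors' code or dataset.

```python
import math


def _check(y_true, y_pred):
    # Both series must be non-empty and aligned day by day.
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")


def mae(y_true, y_pred):
    """Mean absolute error: the average of |observed - predicted|."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    _check(y_true, y_pred)
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical observed and predicted values for four consecutive days.
observed = [10.0, 12.0, 15.0, 11.0]
predicted = [9.0, 14.0, 15.0, 8.0]

print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
```

Because RMSE squares each error before averaging, it weights large individual misses more heavily than MAE does, which is why the two metrics are usually reported together.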

ACKNOWLEDGMENTS
The authors would like to thank Qihang Lei and Mingliang Liu for their help with experiments.


REFERENCES
[1] Laura Alessandretti, Ulf Aslak, and Sune Lehmann. 2020. The scales of human mobility. Nature 587, 7834 (2020),
402–407.
[2] Javier Andreu-Perez, Humberto Perez-Espinosa, Eva Timonet, Mehrin Kiani, Manuel Ivan Giron-Perez, Alma B.
Benitez-Trinidad, Delaram Jarchi, Alejandro Rosales, Nick Gkatzoulis, Orion F. Reyes-Galaviz, Alejandro Torres, Car-
los Alberto Reyes-Garcia, Zulfiqar Ali, and Francisco Rivas. 2022. A generic deep learning based cough analysis sys-
tem from clinically validated samples for point-of-need Covid-19 test and severity levels. IEEE Transactions on Service
Computing 15, 3 (2022), 1220–1232.
[3] Hayat Dino Bedru, Shuo Yu, Xinru Xiao, Da Zhang, Liangtian Wan, He Guo, and Feng Xia. 2020. Big networks: A
survey. Comput. Sci. Rev. 37 (2020), 1–1.
[4] Dirk Brockmann and Dirk Helbing. 2013. The hidden geometry of complex, network-driven contagion phenomena.
Science 342, 6164 (2013), 1337–1342.
[5] Darlan S. Candido, Ingra M. Claro, Jaqueline G. de Jesus, William M. Souza, Filipe R. R. Moreira, Simon Dellicour,
Thomas A. Mellan, Louis Du Plessis, Rafael H. M. Pereira, Flavia C. S. Sales et al. 2020. Evolution and epidemic spread
of SARS-CoV-2 in Brazil. Science 369, 6508 (2020), 1255–1260.
[6] Wei Koong Chai and George Pavlou. 2016. Path-based epidemic spreading in networks. IEEE/ACM Trans. Netw. 25,
1 (2016), 565–578.
[7] Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh. 2019. Cluster-GCN: An efficient algo-
rithm for training deep and large graph convolutional networks. In Proceedings of the 25th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining. 257–266.
[8] Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recur-
rent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014).
[9] Mohammad Reza Davahli, Krzysztof Fiok, Waldemar Karwowski, Awad M. Aljuaid, and Redha Taiar. 2021. Predict-
ing the dynamics of the COVID-19 pandemic in the United States using graph theory-based neural networks. Int. J.
Environ. Res. Pub. Health 18, 7 (2021), 3834.
[10] Jonas Dehning, Johannes Zierenberg, F. Paul Spitzner, Michael Wibral, Joao Pinheiro Neto, Michael Wilczek, and
Viola Priesemann. 2020. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions.
Science 369, 6500 (2020).
[11] Shohreh Deldari, Daniel V. Smith, Hao Xue, and Flora D. Salim. 2021. Time series change point detection with self-
supervised contrastive predictive coding. In Proceedings of the Web Conference. 3124–3135.
[12] Songgaojun Deng, Huzefa Rangwala, and Yue Ning. 2019. Learning dynamic context graphs for predicting social
events. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
1007–1016.
[13] Songgaojun Deng, Shusen Wang, Huzefa Rangwala, Lijing Wang, and Yue Ning. 2020. Cola-GNN: Cross-location
attention based graph neural networks for long-term ILI prediction. In Proceedings of the 29th ACM International
Conference on Information & Knowledge Management. 245–254.
[14] Cornelius Fritz, Emilio Dorigatti, and David Rügamer. 2021. Combining graph neural networks and spatio-temporal
disease models to predict Covid-19 cases in Germany. arXiv preprint arXiv:2101.00661 (2021).
[15] Zhenxin Fu, Yu Wu, Hailei Zhang, Yichuan Hu, Dongyan Zhao, and Rui Yan. 2020. Be aware of the hot zone: A
warning system of hazard area prediction to intervene novel coronavirus COVID-19 outbreak. In Proceedings of the
43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2241–2250.
[16] Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. 2016. LSTM: A search
space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28, 10 (2016), 2222–2232.
[17] Lihao Guo and Yuxin Yang. 2021. Research on the forecast of the spread of COVID-19. In Proceedings of the 11th
International Conference on Biomedical Engineering and Technology. 47–51.
[18] Tiberiu Harko, Francisco S. N. Lobo, and M. K. Mak. 2014. Exact analytical solutions of the susceptible-infected-
recovered (SIR) epidemic model and of the SIR model with equal death and birth rates. Appl. Math. Comput.
236 (2014), 184–194.
[19] Mikhail Hayhoe, Fady Alajaji, and Bahman Gharesifard. 2019. Curing epidemics on networks using a Polya contagion
model. IEEE/ACM Trans. Netw. 27, 5 (2019), 2085–2097.
[20] Suining He and Kang G. Shin. 2019. Spatio-temporal capsule-based reinforcement learning for mobility-on-demand
network coordination. In Proceedings of the World Wide Web Conference. 2806–2813.
[21] Solomon Hsiang, Daniel Allen, Sébastien Annan-Phan, Kendon Bell, Ian Bolliger, Trinetta Chong, Hannah Druck-
enmiller, Luna Yue Huang, Andrew Hultgren, Emma Krasovich et al. 2020. The effect of large-scale anti-contagion
policies on the COVID-19 pandemic. Nature 584, 7820 (2020), 262–267.
[22] Ling Huang, Xing-Xing Liu, Shu-Qiang Huang, Chang-Dong Wang, Wei Tu, Jia-Meng Xie, Shuai Tang, and Wendi
Xie. 2021. Temporal hierarchical graph attention network for traffic prediction. ACM Trans. Intell. Syst. Technol. 12,
6 (2021).
ACM Transactions on Intelligent Systems and Technology, Vol. 14, No. 2, Article 36. Publication date: February 2023.
36:24 S. Yu et al.

[23] Yantao Jia, Yuanzhuo Wang, Xiaolong Jin, and Xueqi Cheng. 2016. Location prediction: A temporal-spatial Bayesian
model. ACM Trans. Intell. Syst. Technol. 7, 3 (2016).
[24] Amol Kapoor, Xue Ben, Luyang Liu, Bryan Perozzi, Matt Barnes, Martin Blais, and Shawn O’Banion. 2020. Examining
COVID-19 forecasting using spatio-temporal graph neural networks. arXiv preprint arXiv:2007.03113 (2020).
[25] Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In Pro-
ceedings of the International Conference on Learning Representations (ICLR).
[26] Xiangjie Kong, Menglin Li, Kai Ma, Kaiqi Tian, Mengyuan Wang, Zhaolong Ning, and Feng Xia. 2018. Big trajectory
data: A survey of applications and services. IEEE Access 6 (2018), 58295–58306.
[27] Vasileios Lampos, Andrew C. Miller, Steve Crossan, and Christian Stefansen. 2015. Advances in nowcasting influenza-
like illness rates using search query logs. Sci. Rep. 5, 1 (2015), 1–10.
[28] Ting Li, Junbo Zhang, Kainan Bao, Yuxuan Liang, Yexin Li, and Yu Zheng. 2020. AutoST: Efficient neural architecture
search for spatio-temporal prediction. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining. 794–802.
[29] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion convolutional recurrent neural network: Data-driven
traffic forecasting. In Proceedings of the 6th International Conference on Learning Representations.
[30] Jiaying Liu, Hansong Nie, Shihao Li, Xiangtai Chen, Huazhu Cao, Jing Ren, Ivan Lee, and Feng Xia. 2021. Tracing the
pace of COVID-19 research: Topic modeling and evolution. Big Data Res. 25 (2021), 100236.
[31] Mingliang Liu, Shuo Yu, Xinbei Chu, and Feng Xia. 2020. CRI: Measuring city infection risk amid COVID-19. In
Proceedings of the IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE). IEEE, 1–6.
[32] Sitao Luan, Mingde Zhao, Xiao-Wen Chang, and Doina Precup. 2019. Break the Ceiling: Stronger Multi-scale Deep
Graph Convolutional Networks. Curran Associates Inc.
[33] Zhongjian Lv, Jiajie Xu, Kai Zheng, Hongzhi Yin, Pengpeng Zhao, and Xiaofang Zhou. 2018. LC-RNN: A deep learning
model for traffic speed prediction. In Proceedings of the 27th International Joint Conference on Artificial Intelligence.
3470–3476.
[34] Shohaib Mahmud, Haiying Shen, Ying Natasha Zhang Foutz, and Joshua Anton. 2021. A human mobility data driven
hybrid GNN+ RNN based model for epidemic prediction. In Proceedings of the IEEE International Conference on Big
Data (Big Data). IEEE, 857–866.
[35] C. Jessica E. Metcalf and Justin Lessler. 2017. Opportunities and challenges in modeling emerging infectious diseases.
Science 357, 6347 (2017), 149–152.
[36] Yue Ning, Liang Zhao, Feng Chen, Chang-Tien Lu, and Huzefa Rangwala. 2019. Spatio-temporal event forecasting
and precursor identification. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery
& Data Mining. 3237–3238.
[37] Julia V. Noble. 1974. Geographic and temporal development of plagues. Nature 250, 5469 (1974), 726–729.
[38] Junjie Ou, Jiahui Sun, Yichen Zhu, Haiming Jin, Yijuan Liu, Fan Zhang, Jianqiang Huang, and Xinbing Wang. 2020.
STP-TrellisNets: Spatial-temporal parallel TrellisNets for metro station passenger flow prediction. In Proceedings of
the 29th ACM International Conference on Information & Knowledge Management. 1185–1194.
[39] George Panagopoulos, Giannis Nikolentzos, and Michalis Vazirgiannis. 2021. Transfer graph neural networks for pan-
demic forecasting. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, 33rd Conference on Innovative
Applications of Artificial Intelligence, 11th Symposium on Educational Advances in Artificial Intelligence. AAAI Press,
4838–4845.
[40] Daniela Perrotta, Michele Tizzoni, and Daniela Paolotti. 2017. Using participatory web-based surveillance data to
improve seasonal influenza forecasting in Italy. In Proceedings of the 26th International Conference on World Wide Web.
303–310.
[41] Jiezhong Qiu, Jian Tang, Hao Ma, Yuxiao Dong, Kuansan Wang, and Jie Tang. 2018. DeepInf: Social influence prediction
with deep learning. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data
Mining. 2110–2119.
[42] Marian-Andrei Rizoiu, Swapnil Mishra, Quyu Kong, Mark Carman, and Lexing Xie. 2018. SIR-Hawkes: Linking epi-
demic models and Hawkes processes to model diffusions in finite populations. In Proceedings of the World Wide Web
Conference. 419–428.
[43] Markus Schläpfer, Lei Dong, Kevin O’Keeffe, Paolo Santi, Michael Szell, Hadrien Salat, Samuel Anklesaria, Moham-
mad Vazifeh, Carlo Ratti, and Geoffrey B. West. 2021. The universal visitation law of human mobility. Nature 593,
7860 (2021), 522–527.
[44] Shaon Bhatta Shuvo, Bonaventure C. Molokwu, and Ziad Kobti. 2020. Simulating the impact of hospital capacity and
social isolation to minimize the propagation of infectious diseases. In Proceedings of the 26th ACM SIGKDD Interna-
tional Conference on Knowledge Discovery & Data Mining. 3451–3457.
[45] Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, and Feng Xia. 2020. Network representation learning: From
traditional feature learning to deep learning. IEEE Access 8 (2020), 205600–205617.

ACM Transactions on Intelligent Systems and Technology, Vol. 14, No. 2, Article 36. Publication date: February 2023.
Spatio-temporal Graph Learning for Epidemic Prediction 36:25

[46] Abhishek Tomy, Matteo Razzanelli, Francesco Di Lauro, Daniela Rus, and Cosimo Della Santina. 2022. Estimating the
state of epidemics spreading with graph neural networks. Nonlin. Dynam. 109, 1 (2022), 249–263.
[47] Jinzhong Wang, Xiangjie Kong, Feng Xia, and Lijun Sun. 2019. Urban human mobility: Data-driven modeling and
prediction. ACM SIGKDD Explor. Newslett. Arch. 21, 1 (2019), 1–19.
[48] Jingyuan Wang, Xiaojian Wang, and Junjie Wu. 2018. Inferring metapopulation propagation network for intra-city
epidemic control and prevention. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining. 830–838.
[49] Max Welling and Thomas N. Kipf. 2016. Semi-supervised classification with graph convolutional networks. In Pro-
ceedings of the International Conference on Learning Representations (ICLR’17).
[50] Felix Wong and James J. Collins. 2020. Evidence that coronavirus superspreading is fat-tailed. Proc. Nat. Acad. Sci. 117,
47 (2020), 29416–29418.
[51] Feng Xia, Azizur Rahim, Xiangjie Kong, Meng Wang, Yinqiong Cai, and Jinzhong Wang. 2018. Modeling and analysis
of large-scale urban mobility for green transportation. IEEE Trans. Industr. Inform. 14, 4 (2018), 1469–1481.
[52] Feng Xia, Ke Sun, Shuo Yu, Abdul Aziz, Liangtian Wan, Shirui Pan, and Huan Liu. 2021. Graph learning: A survey.
IEEE Trans. Artif. Intell. 2, 2 (2021), 109–127.
[53] Qinge Xie, Tiancheng Guo, Yang Chen, Yu Xiao, Xin Wang, and Ben Y. Zhao. 2020. Deep graph convolutional networks
for incident-driven traffic speed prediction. In Proceedings of the 29th ACM International Conference on Information &
Knowledge Management. 1665–1674.
[54] Jin Xu, Shuo Yu, Ke Sun, Jing Ren, Ivan Lee, Shirui Pan, and Feng Xia. 2020. Multivariate relations aggregation learning
in social networks. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. 77–86.
[55] Yanyan Xu, Qing-Jie Kong, Reinhard Klette, and Yuncai Liu. 2014. Accurate and interpretable Bayesian MARS for
traffic flow prediction. IEEE Trans. Intell. Transport. Syst. 15, 6 (2014), 2457–2469.
[56] Huaxiu Yao, Yiding Liu, Ying Wei, Xianfeng Tang, and Zhenhui Li. 2019. Learning from multiple cities: A meta-learning
approach for spatial-temporal prediction. In Proceedings of the World Wide Web Conference. 2181–2191.
[57] Junchen Ye, Leilei Sun, Bowen Du, Yanjie Fu, Xinran Tong, and Hui Xiong. 2019. Co-prediction of multiple transporta-
tion demands based on deep spatio-temporal neural network. In Proceedings of the 25th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining. 305–313.
[58] Xiuwen Yi, Junbo Zhang, Zhaoyuan Wang, Tianrui Li, and Yu Zheng. 2018. Deep distributed fusion network for air
quality prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data
Mining. 965–973.
[59] Shuo Yu, Hayat Dino Bedru, Ivan Lee, and Feng Xia. 2019. Science of scientific team science: A survey. Comput. Sci.
Rev. 31 (2019), 72–83.
[60] Shuo Yu, Qing Qing, Chen Zhang, Ahsan Shehzad, Giles Oatley, and Feng Xia. 2021. Data-driven decision-making in
COVID-19 response: A survey. IEEE Transactions on Computational Social Systems 8, 4 (2021), 1016–1029.
[61] Shuo Yu, Feng Xia, Yuchen Sun, Tao Tang, Xiaoran Yan, and Ivan Lee. 2020. Detecting outlier patterns with query-
based artificially generated searching conditions. IEEE Trans. Computat. Soc. Syst. 8, 1 (2020), 134–147.
[62] Jun Zhang, Wei Wang, Feng Xia, Yu-Ru Lin, and Hanghang Tong. 2020. Data-driven computational social science: A
survey. Big Data Res. 21 (2020), 2214–5796.
[63] Yingxue Zhang, Yanhua Li, Xun Zhou, Xiangnan Kong, and Jun Luo. 2020. Curb-GAN: Conditional urban traffic estima-
tion through spatio-temporal generative adversarial networks. In Proceedings of the 26th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining. 842–852.
[64] Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, and Haifeng Li. 2019. T-GCN: A temporal
graph convolutional network for traffic prediction. IEEE Trans. Intell. Transport. Syst. 21, 9 (2019), 3848–3858.
[65] Peng Zhou, Xing-Lou Yang, Xian-Guang Wang, Ben Hu, Lei Zhang, Wei Zhang, Hao-Rui Si, Yan Zhu, Bei Li, Chao-Lin
Huang et al. 2020. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 7798
(2020), 270–273.

Received 19 December 2021; revised 27 November 2022; accepted 5 December 2022
