Physics-Informed Neural Networks (PINNs) For Integer - and Fractional-Order Models.
1 Introduction
The Supplementary Information contains additional results that provide more information about the motivation of the work,
definition and details of the nine epidemiological models considered in this study, detailed discussion on different sets of results,
new results on extrapolation and forecasting, an identifiability study, and detailed formulation of physics-informed neural
networks (PINNs) for integer- and fractional-order models.
2 Model uncertainty
Quantifying parametric input uncertainty is not sufficient; the effect of model structure must also be properly studied1.
Supplementary Figure 1 shows the uncertainty associated with several different models in analysing and predicting the
dynamics of this complex system. The figure is obtained from the COVID-19 Forecast Hub2, a public online server
that serves as a central repository of forecasts and predictions from over 50 international research groups.
● Huge model uncertainty in predicting the incident death
Supplementary Figure 1. Prediction uncertainty associated with different models. The figure is obtained from the
COVID-19 Forecast Hub2 , which is a public online server that serves as a central repository of forecasts and predictions from
over 50 international research groups.
Supplementary Table 1. Parameters of epidemiological models.
Symbol | Definition | Value/Range | Note/Reference
βI | Community transmission rate | (0, 1) | Fixed/Fitted
q | Proportion of disease-related deaths from the H class | (0, 1) | Fixed/Fitted
p | Proportion of hospitalized individuals | (0, 1) | Fixed/Fitted
r | Infection rate | (0, 1) | Fitted
a | Recovery rate | (0, 1) | Fixed
b | Death rate | (0, 1) | Fixed
d | Number of delay days in time-delay model | [0, 10) | Fitted
ε | Infectivity ratio of mildly infected to severely infected | 0.75 | 3
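Supplementary Table 2. Integer-order models. Definitions and values/ranges for parameters are given in Supplementary
Table 1.
Model (I1):
\[
\begin{aligned}
\frac{d}{dt}S &= -\frac{\beta_I [I+\varepsilon J]}{N} S - \frac{v}{N} S, \\
\frac{d}{dt}E &= \frac{\beta_I [I+\varepsilon J]}{N} S - \alpha E, \\
\frac{d}{dt}I &= \delta \alpha E - \gamma I, \\
\frac{d}{dt}J &= (1-\delta)\alpha E - \gamma_a J, \\
\frac{d}{dt}D &= q \varphi_D H, \\
\frac{d}{dt}H &= p \gamma I - q \varphi_D H - (1-q)\varphi_R H, \\
\frac{d}{dt}R &= \gamma_a J + (1-p)\gamma I + (1-q)\varphi_R H + \frac{v}{N} S, \\
\frac{d}{dt}I^c &= \delta \alpha E, \\
\frac{d}{dt}H^c &= p \gamma I.
\end{aligned}
\]
Model (I2):
\[
\begin{aligned}
\frac{d}{dt}S &= -\frac{\beta_I [\chi P+I+\varepsilon J]}{N} S - \frac{v}{N} S, \\
\frac{d}{dt}E &= \frac{\beta_I [\chi P+I+\varepsilon J]}{N} S - \alpha_1 E, \\
\frac{d}{dt}P &= \alpha_1 E - \alpha_2 P, \\
\frac{d}{dt}I &= \delta \alpha_2 P - \gamma I, \\
\frac{d}{dt}J &= (1-\delta)\alpha_2 P - \gamma_a J, \\
\frac{d}{dt}D &= q \varphi_D H, \\
\frac{d}{dt}H &= p \gamma I - q \varphi_D H - (1-q)\varphi_R H, \\
\frac{d}{dt}R &= \gamma_a J + (1-p)\gamma I + (1-q)\varphi_R H + \frac{v}{N} S, \\
\frac{d}{dt}I^c &= \delta \alpha_2 P, \\
\frac{d}{dt}H^c &= p \gamma I.
\end{aligned}
\]
Model (I3):
\[
\begin{aligned}
\frac{d}{dt}S &= -\frac{\beta_I [I+\varepsilon P+\varepsilon J]}{N} S - \frac{v}{N} S, \\
\frac{d}{dt}E &= \frac{\beta_I [I+\varepsilon P+\varepsilon J]}{N} S - \alpha_1 E, \\
\frac{d}{dt}P &= \alpha_1 E - \alpha_2 P, \\
\frac{d}{dt}I &= \delta \alpha_2 P - d_I I, \\
\frac{d}{dt}J &= (1-\delta)\alpha_2 P - \gamma_a J, \\
\frac{d}{dt}D &= q d_H H, \\
\frac{d}{dt}H &= p d_I I - d_H H, \\
\frac{d}{dt}Q &= (1-p) d_I I - d_Q Q, \\
\frac{d}{dt}R &= \gamma_a J + (1-q) d_H H + d_Q Q + \frac{v}{N} S, \\
\frac{d}{dt}I^c &= \delta \alpha_2 P, \\
\frac{d}{dt}H^c &= p d_I I.
\end{aligned}
\]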
Supplementary Table 3. Fractional-order models. Definitions and values/ranges for parameters are given in Supplementary
Table 1.
Model (F1 ) Model (F2 ) Model (F3 )
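Supplementary Table 4. Time-delay model. Definitions and values/ranges for parameters are given in Supplementary
Table 1. Here $\lambda(t) = \frac{\beta_I [I+\varepsilon J]}{N}$.
Model (T1):
\[
\begin{aligned}
\frac{d}{dt}S &= -\lambda(t-d) S(t-d), \\
\frac{d}{dt}I &= \delta \lambda(t-d) S(t-d) - \gamma I, \\
\frac{d}{dt}J &= (1-\delta) \lambda(t-d) S(t-d) - \gamma_a J, \\
\frac{d}{dt}H &= p \gamma I - d_H H, \\
\frac{d}{dt}D &= q d_H H, \\
\frac{d}{dt}R &= \gamma_a J + (1-p)\gamma I + (1-q) d_H H, \\
\frac{d}{dt}I^c &= \delta \lambda(t-d) S(t-d), \\
\frac{d}{dt}H^c &= p \gamma I.
\end{aligned}
\]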
$\{S, I, J, H, D, R, I^c, H^c\}$. The governing equation of the dynamics can be written as
$\frac{d}{dt} U^{(T_1)}(t) = F^{(T_1)}(U^{(T_1)}, t; \lambda)$; see Supplementary Table 4.
Similar to the integer-order model I1, the effective reproduction number for model T1 is given by
\[
R_c = \beta_I \left[ \frac{(1-\delta)\varepsilon}{\gamma_a} + \frac{\delta}{\gamma} \right].
\]
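As a quick numerical illustration of the reproduction number formula above, the following minimal sketch evaluates $R_c = \beta_I[(1-\delta)\varepsilon/\gamma_a + \delta/\gamma]$. The function name and all parameter values except ε = 0.75 (Supplementary Table 1) are hypothetical and chosen only for illustration.

```python
def effective_reproduction_number(beta_i, delta, eps, gamma, gamma_a):
    """R_c = beta_I * ((1 - delta) * eps / gamma_a + delta / gamma)."""
    return beta_i * ((1.0 - delta) * eps / gamma_a + delta / gamma)


# eps = 0.75 is taken from Supplementary Table 1; the other values are
# hypothetical placeholders, not fitted values from the study.
rc_example = effective_reproduction_number(
    beta_i=0.3, delta=0.6, eps=0.75, gamma=0.2, gamma_a=0.1
)
print(f"R_c = {rc_example:.3f}")
```

With these placeholder values both bracketed contributions equal 3, so the result is 0.3 × 6 = 1.8.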
where a similar definition holds if the order is constant. Let u(t) denote the dynamics of a complex system defined on the
half-time axis t ∈ [0, ∞). The left-sided Caputo-Taylor series of u(t) is given as
\[
u(t+\Delta t) = u(t) + \frac{(\Delta t)^{\kappa}}{\Gamma(1+\kappa)}\, {}^{C}D_t^{\kappa} u(t)
 + \frac{(\Delta t)^{2\kappa}}{\Gamma(1+2\kappa)}\, {}^{C}D_t^{\kappa}\, {}^{C}D_t^{\kappa} u(t) + \cdots, \tag{3}
\]
where κ ∈ R+ and the expansion includes the sequential fractional derivative. The fractional Taylor series is only valid for
the Caputo form. The main distinguishing feature of the Caputo fractional derivative is that, like the integer order derivative,
the Caputo fractional derivative of a constant is zero. This property is critical for a fractional Taylor series. Note that the
third term in the expansion involves the fractional derivative of the κ fractional derivative, which is not the same as the 2κ
fractional derivative. It has been shown in17 that if the change in u(t) follows a power-law of order κ that matches the order of
the fractional Taylor series approximation, then the two-term fractional Taylor series approximation to this function is exact. By
truncating the expansion after the second term, we can approximate the rate of change of u(t) by
\[
\frac{u(t+\Delta t) - u(t)}{\Delta t} \approx \frac{(\Delta t)^{\kappa-1}}{\Gamma(1+\kappa)}\, {}^{C}D_t^{\kappa} u(t), \tag{4}
\]
which is equal to the net mass flux of the system. Therefore, for each compartment in the epidemiological model, we can write
\[
\frac{(\Delta t)^{\kappa-1}}{\Gamma(1+\kappa)}\, {}^{C}D_t^{\kappa} u(t) = \text{inflow} - \text{outflow}. \tag{5}
\]
The time scale ∆t that appears in (5) denotes a characteristic time of observation, which amounts to a built-in scale effect. It is
usually taken to be the total time of observation22. The scale effect vanishes as κ → 1, since $(\Delta t)^{\kappa-1} \to (\Delta t)^{0} = 1$, and the
equation then recovers the classical continuity equation. This scaling factor is also useful in practice as it ensures the dimensional
consistency of units in fractional models. Here, we take ∆t = 7.0 in our simulation since we pre-process the data by taking a
seven-day average. We consider three fractional models. We note that the fractional operators can be of different orders for
each compartment, and they can be either fixed or time-dependent parameters.
• Model F1 : Fractional-order SIR. We use Caputo fractional derivatives of different fractional orders for each compartment in
the integer-order classical SIR model. We let $U^{(F_1)}(t) = \{S, I, R, I^c\}$, and the governing equation of the dynamics takes the form
\[
\frac{(\Delta t)^{\vec{\kappa}-1}}{\Gamma(\vec{\kappa}+1)}\, {}^{C}_{0}D_t^{\vec{\kappa}}\, U^{(F_1)}(t) = F^{(F_1)}(U^{(F_1)}, t; \lambda);
\]
see Supplementary Table 3, first column. Here, $\vec{\kappa} = (\kappa_1, \kappa_2, \kappa_3) \in (0, 1)^3$ with
κ1, κ2, and κ3 being the fractional derivative orders for the compartments S, I, and R, respectively. The effective reproduction
number is given by $R_c = \beta_I / \gamma_1$.
• Model F2 : Fractional-order SIDR. We include the additional compartment D in model F1 and let $U^{(F_2)}(t) = \{S, I, D, R, I^c\}$.
The governing equation of the dynamics takes the form
\[
\frac{(\Delta t)^{\vec{\kappa}(t)-1}}{\Gamma(\vec{\kappa}(t)+1)}\, {}^{C}_{0}D_t^{\vec{\kappa}(t)}\, U^{(F_2)}(t) = F^{(F_2)}(U^{(F_2)}, t; \lambda);
\]
see Supplementary Table 3, second column. Here, $\vec{\kappa}(t) = (\kappa_1(t), \kappa_2(t), \kappa_3(t), \kappa_4(t)) \in (0, 1)^4$ with κ1(t), κ2(t), κ3(t), and κ4(t) being the fractional
derivative orders for the compartments S, I, D, and R, respectively. The effective reproduction number is given by $R_c = r / b$.
• Model F3 : Fractional-order SIHDR. We include two additional compartments H and D in the fractional-order model F1.
Therefore, we let $U^{(F_3)} = \{S(t), I(t), H(t), D(t), R(t), I^c(t), H^c(t)\}$. The governing equation of the dynamics takes the form
\[
\frac{(\Delta t)^{\vec{\kappa}(t)-1}}{\Gamma(\vec{\kappa}(t)+1)}\, {}^{C}_{0}D_t^{\vec{\kappa}(t)}\, U^{(F_3)}(t) = F^{(F_3)}(U^{(F_3)}, t; \lambda);
\]
see Supplementary Table 3, third column. Here, $\vec{\kappa}(t) = (\kappa_1(t), \kappa_2(t), \kappa_3(t), \kappa_4(t), \kappa_5(t)) \in (0, 1)^5$
with κ1(t), κ2(t), κ3(t), κ4(t), and κ5(t) being the fractional derivative orders for the compartments S, I, H, D, and R,
respectively. The effective reproduction number is given by $R_c = \beta_I / \gamma$.
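The prefactor $(\Delta t)^{\kappa-1}/\Gamma(1+\kappa)$ that multiplies the Caputo derivative in (5) and in the three fractional models can be tabulated directly. The sketch below is only an illustration with ∆t = 7.0 as stated in the text; the function name `fractional_scale` is hypothetical.

```python
import math


def fractional_scale(kappa, dt=7.0):
    """Prefactor (dt)^(kappa - 1) / Gamma(1 + kappa) from equation (5)."""
    return dt ** (kappa - 1.0) / math.gamma(1.0 + kappa)


# The factor tends to 1 as kappa -> 1, recovering the integer-order balance.
for kappa in (0.5, 0.9, 0.99, 1.0):
    print(f"kappa = {kappa:4.2f}   scale = {fractional_scale(kappa):.4f}")
```

At κ = 1 the factor is exactly 1, so the fractional balance equation reduces to the classical one regardless of the chosen ∆t.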
day (V(t)) by the following piece-wise linear function
\[
V(t) = \begin{cases}
0, & 0 \le t < 260, \\
500\,(t - 260), & 260 \le t < 300, \\
20000, & t \ge 300,
\end{cases} \tag{6}
\]
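A minimal sketch of the vaccination schedule (6) follows. It assumes the ramp 500(t − 260), i.e., the slope-500 line that rises from 0 at t = 260 to 20000 at t = 300 so that V(t) is continuous and non-negative; the function name is hypothetical.

```python
import numpy as np


def vaccinations_per_day(t):
    """Piece-wise linear V(t): 0 before day 260, linear ramp on [260, 300), 20000 afterwards.

    Assumption: the ramp is 500 * (t - 260), which joins the two plateaus continuously.
    """
    t = np.asarray(t, dtype=float)
    ramp = 500.0 * (t - 260.0)
    return np.where(t < 260.0, 0.0, np.where(t < 300.0, ramp, 20000.0))


print(vaccinations_per_day([0, 260, 280, 300, 400]))
```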
• Time-dependent versus fixed parameter: A study based on model I3 . Epidemiological models should accommodate
time-dependent parameters because disease symptoms and the concentration and behavior of the pathogen change throughout the
course of the disease27, 28 , especially in long-lasting pandemics. Here, we highlight the importance of time-dependent
parameters in epidemiological models for studying the multi-rate dynamics of long-lasting pandemics. We make the comparison
on the NYC data set by considering model I3 with time-dependent and with fixed parameters. We use the PINN formulation in
both cases to fit the data and infer the parameters. Supplementary Figure 4a and Supplementary Figure 4b show the
fit to the data and the inferred fixed and time-dependent parameters. The correct way to compare these two models
is to solve them using an ODE solver (any accurate time-marching scheme) starting from the onset of the pandemic. Supplementary
Figure 4c shows, however, that the two models produce completely different dynamics, and that the model
with fixed parameters has insufficient capacity to reproduce the data.
• Forecasting accuracy of integer-order models I1 , I2 , I3 : Here, we discuss the accuracy of prediction of the three different
integer-order models I1 , I2 , I3 based on NYC data. We split the available data into training and testing part and train PINNs
using the training part to infer the parameters of models. The dynamics can be forecast by considering three different approaches
as follows:
• P1 : Prediction by using the inferred models I1 , I2 , I3 . We infer the parameters βI (t), p(t), and q(t) in the training
window. Then, in the prediction window, we fix the parameters at their final values from the training window and advance
in time with that fixed value.
• P2 : Prediction by extrapolating the inferred parameters using model I3 . We first use the neural networks βI (t), p(t),
q(t) (shown in Supplementary Figure 9) to extrapolate the values of these parameters in near future after the training
window. Then, we employ the model to forecast the dynamics by using the extrapolated parameter.
• P3 : Prediction by extrapolating the states using model I3 . After training PINNs, we can use the neural network U(t)
to extrapolate the state dynamics in future. In this case, we do not need to solve the model and directly employ the
trained PINNs.
[Figure panels. (a) Fitting available data (Inew, H, Dnew, q(t)). Further panels compare models 1 to 3 for the inferred parameters and the unobserved compartments (including I, J, Q, R), and show predictions of Inew and Dnew with legend entries Data-7davg, Prediction, and Prediction (higher rate vacc.).]
Supplementary Figure 2. PINNs inference using the integer-order models I1 , I2 , I3 and time-delay model T1 for MI.
a: Fitting to the available data of daily infectious and death cases and the current hospitalized cases. b: Inference of
time-dependent parameters (βI (t), p(t)). c: Inference of unobserved dynamics. d: Prediction and uncertainty quantification of
daily infectious and death cases. Here the inferred delay is d = 3.21 for the time-delay model T1 .
[Figure panels. (a) Fitting available data (Inew, H, Dnew, q(t)); legend: Data-7davg, PINN. (b) Inference of time-dependent parameters (βI(t), p(t)); models 1 to 3. (c) Inference of unobserved dynamics (S, E, P, I, J, Q, R); models 1 to 3. (d) Prediction and uncertainty quantification for Model I1, Model I2, Model I3, and Model T1 (Inew, Hnew, Dnew); legend: PINN-Training, Prediction, Prediction-std-(15%), Prediction-std-(30%).]
[Figure panels. (a) PINN: fitting to available data (training). (c) Solving the ODE with inferred parameter from PINN: daily hospitalized cases (Hnew).]
Supplementary Figure 4. Time-dependent parameter versus constant parameter based on NYC data. Panel a shows
the fitting results by PINN. Using time-dependent parameter leads to better fitting results. Panel b shows the inferred fixed and
time-dependent parameters (βI , p, q). Panel c shows the solution of ODE with inferred parameter from PINN.
In approaches P1 and P2, which use the models, we postulate that the effect of several different control measures on the community
transmission rate βI (t) can be captured by adding an uncertainty bound to its mean value in the prediction window. Using
standard forward propagation techniques, i.e., Monte Carlo29 and probabilistic collocation methods30 , we propagate the
uncertainty into the prediction window.
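The Monte Carlo propagation described above can be sketched on a toy problem. The example below is not one of the models I1, I2, I3: it uses a plain SIR system integrated with forward Euler, freezes β at a hypothetical mean value (in the spirit of approach P1), and samples β uniformly within a relative bound of 15% or 30%. The sampling distribution, all numerical values, and the function names are assumptions made only for illustration.

```python
import numpy as np


def simulate_sir(beta, gamma, s0, i0, r0, n_days, dt=0.1):
    """Forward-Euler SIR run; returns the daily incidence rate beta*S*I/N at the start of each day."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = int(round(1.0 / dt))
    new_cases = np.empty(n_days)
    for day in range(n_days):
        new_cases[day] = beta * s * i / n
        for _ in range(steps_per_day):
            infections = beta * s * i / n * dt
            recoveries = gamma * i * dt
            s, i, r = s - infections, i + infections - recoveries, r + recoveries
    return new_cases


def propagate_beta_uncertainty(beta_mean, rel_bound, n_samples=200, seed=0, **sir_kwargs):
    """Sample beta in [beta_mean*(1-rel_bound), beta_mean*(1+rel_bound)] and collect the forecasts."""
    rng = np.random.default_rng(seed)
    betas = beta_mean * (1.0 + rng.uniform(-rel_bound, rel_bound, size=n_samples))
    runs = np.stack([simulate_sir(b, **sir_kwargs) for b in betas])
    return runs.mean(axis=0), runs.std(axis=0)


# Hypothetical two-week prediction window with 15% and 30% bounds on beta.
settings = dict(gamma=0.1, s0=9900, i0=100, r0=0, n_days=14)
mean15, std15 = propagate_beta_uncertainty(0.25, 0.15, **settings)
mean30, std30 = propagate_beta_uncertainty(0.25, 0.30, **settings)
print(std15[-1], std30[-1])
```

The mean and standard deviation over the sampled runs play the role of the prediction and its uncertainty band; a probabilistic collocation method would replace the random samples by quadrature nodes.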
In Supplementary Figure 5, we compare the performance of approaches P1 –P3 in making a two-week prediction of the
transmission dynamics at the early stage of the pandemic, which exhibits large variations in time. The predictive power of the
second approach (extrapolating parameters) turns out to be the highest. The first approach (fixing parameters) gives Hnew and
Dnew which do not match the data well and leads to larger uncertainty during the prediction. The predicted Dnew of the third
approach deviates from the ground truth by a large margin.
Next, in Supplementary Figure 6 we compare the performance of models I1 , I2 , I3 using approach P1 in long-term prediction
of the dynamics (≈ 3 months). We consider the data up to the beginning of June and divide it into a training part up until March and a
testing part from March to June. We fix the parameters βI (t), p(t), and q(t) at their final values from the training window and
add an uncertainty bound of 15% and 30% to the mean value of βI (t) in the prediction window. In the close-up panel b, we
see that model I3 gives the most robust prediction, as the uncertainty bounds around the predicted values completely cover the
unseen data.
• Identifiability of fractional model F1 : We discuss the identifiability of fractional model F1 based on the NYC data26 and
show the results in Supplementary Figure 7. As a comparison, we plot the results of fractional model F1 against the results of
its corresponding integer-order model I4 with time-dependent parameter. In the first three rows of each column, black color
shows the results of model F1 and red color shows the results of corresponding integer-order model I4 . The first column
shows the inferred dynamics and fractional orders κi , i = 1, 2, 3, based on the only available data Inew . Although with a small
uncertainty bound, the inferred compartment I shows an erroneous trend with a sharp increase at the end of the training. The
uncertainty bounds for S, R, and the fractional orders in this case are also larger compared to the other cases due to lack of
data. In the second column, we assume that in addition to daily infectious cases Inew , the compartment S is available from the
corresponding integer-order model. It can be observed that the integer-order model results cannot be recovered even though the
[Figure panels. Inferred parameters (including q) over 04-2020 to 05-2020 with legend PINN-Training, Extrapolating parameters, Fixed parameters; data panels with legend Data-7davg (training).]
Supplementary Figure 5. Short term prediction using the model and PINN itself. We use three approaches P1 –P3 to
make two-week prediction.
fractional orders converge to 1 in this case. In the third column, we assume that in addition to the daily infectious cases Inew , the
compartment I is available from the corresponding integer-order model. In the last column, we assume that in addition to
Inew , the compartments S, I, and R are available. We observe that the uncertainty bounds are reduced when we have more data.
for all Φ(t) in the same space as Θ(t), for all t ∈ [0, T ]. To the best of our knowledge, most existing literature and software
focus on the case when Θ and Φ are constants, i.e., independent of t. Despite the lack of a rigorous formulation for the
structural identifiability of systems with time-dependent parameters, it is still possible to determine the identifiable parameters in
some cases by inspection. Assuming that: 1) Ic , Hc and D are given as output variables y, 2) the parameters to fit are Θ = (βI , p, q),
and 3) the rest of the parameters and the initial values of all state variables X(0) are given, we observe that all three
integer-order models I1 , I2 , I3 are structurally globally identifiable. Here, we carry out the analysis for model I3 , which has the
largest number of compartments among these models and thus may be more susceptible to lack of identifiability.
[Figure panel (a) Training and forecasting: full time window; rows: Model I1, Model I2, Model I3.]
Supplementary Figure 6. Forecasting of dynamics using integer-order models I1 , I2 , and I3 based on NYC data.
Panel a shows the full window of training and prediction. Panel b shows the close-up of the prediction window. Left column:
daily (new) infectious cases (Inew ). Middle column: daily (new) hospitalized cases (Hnew ). Right column: daily (new) death
cases (Dnew ).
[Figure panels. Title: Inference based on available data Inew and simulated data. Columns: only Inew; Inew & Int. model (S); Inew & Int. model (I); Inew & Int. model (S & I). Rows: S, I, and R over 03-2020 to 03-2021 (legend: SIR Model, fPINN), and model parameters versus iteration (legend: mean values of the three fractional orders, betaI (fixed)).]
Supplementary Figure 7. Identifiability study for PINNs for the fractional order model F1 for NYC. The parameters
β = 0.25 and γ = 0.0365 are fixed in the fractional model. Instead of time-dependent parameters, here we aim to infer
unobserved dynamics and different fractional operators for κi , i = 1, 2, 3. First column: Inference based on only available data
Inew ; Second column: Inference based on available data Inew and S from the corresponding integer-order SIR model; Third
column: Inference based on available data Inew and I from the corresponding integer-order SIR model; Fourth column:
Inference based on available data Inew , S, I, and R from the corresponding integer-order SIR model.
Since $H^c$ and $d_I$ are given and $\frac{d}{dt}H^c = p\,d_I I$, the product $pI$ can be uniquely determined. Since $d_H$ is known and
$\frac{d}{dt}H = p\,d_I I - d_H H$, $H$ can be uniquely determined. Since $D$ is known and $\frac{d}{dt}D = q\,d_H H$, $q$ can be uniquely determined and thus
is structurally identifiable. From $\frac{d}{dt}I^c = \delta\alpha_2 P$, $P$ can be uniquely determined, and from $\frac{d}{dt}I = \delta\alpha_2 P - d_I I$, $I$ can be uniquely
determined. Since $pI$ can be uniquely determined, as shown previously, $p$ is structurally identifiable. Since $\frac{d}{dt}P = \alpha_1 E - \alpha_2 P$,
$E$ can be uniquely determined, and since $\frac{d}{dt}E = \frac{\beta_I [I+\varepsilon P+\varepsilon J]}{N} S - \alpha_1 E$, $\beta_I S$ can be uniquely determined. Finally, since
$\frac{d}{dt}S = -\frac{\beta_I [I+\varepsilon P+\varepsilon J]}{N} S - \frac{v}{N} S$, $S$ can be uniquely determined and therefore $\beta_I$ is structurally identifiable. Thus, model I3 is
structurally globally identifiable. The same procedure can be applied to models I1 and I2 , indicating that these two models are
also structurally identifiable. In the case of fixed parameters, we have used the SIAN software, which is based on differential algebra and
Taylor series expansion, to study the structural identifiability of model I3 ; see32 .
1. Solve Eq. (8) with Θ = (β , p, q) equal to the estimated value Θ̃ = (β̃ , p̃, q̃) to get the calibrated observables. We assume
ỹ = [Inew , Hnew , Dnew ].
2. Multiply the calibrated observables $\tilde{y}_i$ by independent and identically distributed Gaussian random noise
$\xi_i^{(j)} \sim \mathcal{N}(1, \sigma^2 I_T)$, 1 ≤ i ≤ m, 1 ≤ j ≤ M, where T is the number of time stamps, $I_T$ is the T × T identity matrix, and
M is the number of Monte Carlo steps. By this method we generate a dataset of size M:
\[
Y^{(j)} = \tilde{y} \odot \xi^{(j)}, \qquad j = 1, \dots, M,
\]
where $\odot$ is the elementwise product and $\xi^{(j)} = [\xi_1^{(j)}, \cdots, \xi_m^{(j)}]$.
3. The parameters are estimated again using these perturbed observables $\{Y^{(j)}\}_{j=1}^{M}$. The estimated value for the ith unknown
parameter $\theta_i$ using $Y^{(j)}$ is denoted as $\hat{\theta}_i^{(j)}$. The average relative error (ARE) for $\theta_i$ is defined as:
\[
\mathrm{ARE}(\theta_i) = \frac{1}{M \sigma T} \sum_{j=1}^{M} \sum_{t=1}^{T} \frac{\left| \hat{\theta}_i^{(j)}(t) - \theta_i(t) \right|}{\theta_i(t)}.
\]
Finally, we define the maximum average relative error (MARE) of the model to be the largest ARE across all the model
parameters:
\[
\mathrm{MARE} = \max_i \, \mathrm{ARE}(\theta_i).
\]
If MARE < 1, then the model parameters are not very sensitive to the perturbation in the data. We use this as a definition
of practical identifiability.
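The bookkeeping of steps 2 and 3 can be sketched as follows. The sketch covers only the noise generation and the ARE/MARE evaluation; the re-estimation of the parameters from each perturbed dataset (a PINN fit in the study) is not reproduced and is replaced here by a synthetic stand-in. The absolute value inside the ARE sum, the function names, and all numbers are assumptions for illustration.

```python
import numpy as np


def perturb_observables(y_tilde, sigma, n_mc, seed=0):
    """Return Y with shape (n_mc, m, T): y_tilde (m, T) times i.i.d. N(1, sigma^2) noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=1.0, scale=sigma, size=(n_mc,) + y_tilde.shape)
    return y_tilde[None, :, :] * noise


def average_relative_error(theta_hat, theta, sigma):
    """ARE = 1/(M*sigma*T) * sum_j sum_t |theta_hat_j(t) - theta(t)| / theta(t).

    theta_hat has shape (M, T); theta has shape (T,).
    """
    n_mc, n_t = theta_hat.shape
    rel_err = np.abs(theta_hat - theta[None, :]) / theta[None, :]
    return rel_err.sum() / (n_mc * sigma * n_t)


def maximum_average_relative_error(theta_hats, thetas, sigma):
    """MARE: the largest ARE over all model parameters."""
    return max(average_relative_error(h, t, sigma) for h, t in zip(theta_hats, thetas))


# Synthetic stand-in for re-estimated parameters: every estimate is off by 0.5*sigma
# in relative terms, so the ARE equals 0.5 by construction.
sigma = 0.1
theta_true = np.full(5, 0.3)
theta_refit = np.tile(theta_true * (1.0 + 0.5 * sigma), (4, 1))
are_value = average_relative_error(theta_refit, theta_true, sigma)
y_perturbed = perturb_observables(np.ones((3, 5)), sigma, n_mc=4)
print(are_value, y_perturbed.shape)
```

Dividing by σ expresses the parameter error in units of the noise level, which is why the threshold MARE < 1 can be read as "the relative parameter error stays below the relative data perturbation".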
[Figure panels. Inferred parameters (including p and q) versus Days (2020-06 to 2021-04); legend: Noiseless inference, σ = 0.05, σ = 0.1, σ = 0.2. Lower panels: MARE for σ = 0.1 and σ = 0.2.]
Supplementary Figure 8. Practical identifiability analysis of model I3 based on NYC data. We perturb the calibrated
observables by random noises at three levels of standard deviation σ = {0.05, 0.1, 0.2}. More significant perturbation
corresponds to a larger uncertainty band during the data inference procedure.
We follow the aforementioned procedure to study the practical identifiability of I3 , as shown in Supplementary Figure 8.
Three levels of noise with standard deviation σ = {0.05, 0.1, 0.2} are chosen; the number of Monte Carlo steps is M = 100. We
find that this integer order model is practically identifiable, since MARE < 1 in all three cases. Here, q has a large uncertainty
band when σ = 0.2, indicating higher sensitivity to large perturbations in the data, and therefore we should be more conservative
about the inferred values of q.
[Schematic. The input t feeds neural networks (NN) with activations σ that output the states U(t) and the parameters βI(t), p(t), q(t), and d(t) (model T1). Blocks: Fitting data; Inferring unobserved dynamics; Inferring parameters; Solving system of ODEs; ODEs; ODE Residuals; Loss, with stopping check "< ε?".]
Supplementary Figure 9. Schematic of physics-informed neural networks. NN denotes different neural networks
representing the states U(t) (green-shaded area) and the model parameters βI (t), p(t), q(t) in integer-order models and d in
time-delay model (red-shaded area). ODE Residuals: computes the residual of models. Loss: is comprised of both terms from
the mismatch between data and NN and the ODE residuals.
The unknown time-dependent parameters in the governing equation are also parametrized by separate neural networks; see
the red-shaded area in Supplementary Figure 9. In the PINN formulation, we define two finite sets of training points $\{t_u^j\}_{j=1}^{N_u}$
and residual points $\{t_r^j\}_{j=1}^{N_r}$.
where MSE stands for mean squared error. The loss function of the PINN contains two terms. MSEu measures
the mismatch between the solution $U_{NN}$ and the data $U^{(D)}$ at the training points, and depends on the availability of data on the
epidemiological classes, as discussed in Section 6.3. MSEr penalizes violations of the governing equation at the residual points. We
provide the detailed expansion of this term in Section 6.4, equation (17), for model I1 . Similar expansions can be obtained for
the other integer-order models I2 and I3 and for the time-delay model T1 . For models I1 , I2 , I3 , and T1 , we denote the output of the network as
$U_{NN}^{(I_1)}(t)$, $U_{NN}^{(I_2)}(t)$, $U_{NN}^{(I_3)}(t)$, and $U_{NN}^{(T_1)}(t)$, respectively. We consider three separate networks to represent the time-dependent parameters
βI (t), p(t) and q(t). In the time-delay model, we use a separate network to represent the time-delay parameter. In all cases, the
derivatives of the network with respect to the input t and all network parameters Θ are computed by automatic differentiation36 ,
which applies the chain rule to compositions of functions. In particular, we use TensorFlow37 ,
a popular and well-documented open-source software library for automatic differentiation and deep learning
computations. Other formulations of PINNs combined with other networks have also been studied in the literature; see e.g. the
work by Long et al.38 .
6.2 Fractional Physics-Informed Neural Networks (fPINNs)
Fractional PINNs (fPINNs)39 extend PINNs to solve forward and inverse problems with fractional differential operators.
Since the standard chain rule for integer-order derivatives is not valid for fractional ones, the fPINN formulation does not use
automatic differentiation to compute the fractional derivatives. Instead, the residual in the loss function of fPINNs adopts
a numerical discretization of the fractional operators. This requires an additional set of “auxiliary” points to numerically
compute the fractional derivatives. In particular, the Caputo fractional derivative used in this paper is numerically approximated
by the L1 scheme40, 41 on a uniform mesh {tn = nτ} as
\[
{}^{C}_{0}D_t^{\kappa(t)} f(t_n) = \delta_t^{\kappa(t)} f(t_n) + O(\tau^{2-\kappa(t)}), \qquad
\delta_t^{\kappa(t)} f(t_n) = \sum_{k=0}^{n-1} b_{n,n-k-1} \left[ f(t_{k+1}) - f(t_k) \right], \tag{11}
\]
where κ(t) ∈ (0, 1), τ is the (sufficiently small) time step, and the coefficients are given by
$b_{n,k} = \frac{\tau^{-\kappa(t_n)}}{\Gamma(2-\kappa(t_n))} \left[ (k+1)^{1-\kappa(t_n)} - k^{1-\kappa(t_n)} \right]$.
The formulation of fPINNs is very similar to PINNs. We let $U_{NN}^{(F)}(t; \Theta)$ be a deep neural network with input t,
parameterized by Θ as weights and biases of the network. We approximate the solution of the fractional model by $U(t) \approx U_{NN}^{(F)}(t; \Theta)$
and define the residual as
\[
R_{NN}^{(F)}(t) = \frac{(\Delta t)^{\kappa(t)-1}}{\Gamma(\kappa(t)+1)}\, {}^{C}_{0}D_t^{\kappa(t)} U_{NN}^{(F)}(t) - F^{(F)}(U_{NN}^{(F)}, t; \lambda)
\approx \frac{(\Delta t)^{\kappa(t)-1}}{\Gamma(\kappa(t)+1)}\, \delta_t^{\kappa(t)} U_{NN}^{(F)}(t) - F^{(F)}(U_{NN}^{(F)}, t; \lambda). \tag{12}
\]
Similar to PINNs, in the fPINN formulation, we define two finite sets of training points $\{t_u^j\}_{j=1}^{N_u}$ and residual points $\{t_r^j\}_{j=1}^{N_r}$.
The training points are the points where data are available, and the residual points are the points where the residual (12)
is penalized; they can be placed freely over the computational domain. We define the loss function of fPINN as
\[
L^{(F)}(\Theta, \lambda) = \omega_u\, \mathrm{MSE}_u^{(F)} + \omega_r\, \mathrm{MSE}_r^{(F)}
= \omega_u \frac{1}{N_u} \sum_{j=1}^{N_u} \left\| U_{NN}^{(F)}(t_u^j) - U^{(D)}(t_u^j) \right\|^2
+ \omega_r \frac{1}{N_r} \sum_{j=1}^{N_r} \left\| R_{NN}^{(F)}(t_r^j) \right\|^2, \tag{13}
\]
where MSE stands for mean squared error. $\mathrm{MSE}_u^{(F)}$ is defined similarly to PINNs. The detailed expansions of $\mathrm{MSE}_r^{(F)}$ for
different fractional models are given in Section 6.4. We note that in fPINNs, the number of residual points depends on the
time step τ. While a small time step reduces the discretization error, it increases the number of residual points, which
imposes extra computational cost. The computational bottleneck is the for loop in computing the fractional derivative.
We show in Sec. 6.5 a vectorization technique that we implemented to reduce the computational cost.
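To make the L1 scheme (11) and the cost of its loop concrete, the sketch below implements the discrete operator twice: once with explicit loops and once as a lower-triangular weight matrix applied to the increments of f. It assumes a constant order κ for brevity (the text allows κ(t_n) to vary per mesh point), and the matrix form is a generic vectorization, not the coefficient matrix W of Sec. 6.5. Function names are hypothetical.

```python
import math

import numpy as np


def caputo_l1_loop(f_vals, tau, kappa):
    """L1 approximation of the Caputo derivative (0 < kappa < 1) at each t_n = n * tau."""
    n_pts = len(f_vals)
    out = np.zeros(n_pts)
    scale = tau ** (-kappa) / math.gamma(2.0 - kappa)
    for n in range(1, n_pts):
        acc = 0.0
        for k in range(n):
            j = n - k - 1
            weight = (j + 1) ** (1.0 - kappa) - j ** (1.0 - kappa)
            acc += weight * (f_vals[k + 1] - f_vals[k])
        out[n] = scale * acc
    return out


def caputo_l1_vectorized(f_vals, tau, kappa):
    """Same sum as a lower-triangular matrix acting on the increments f(t_{k+1}) - f(t_k)."""
    n_pts = len(f_vals)
    j = np.arange(n_pts - 1)
    b = (j + 1.0) ** (1.0 - kappa) - j ** (1.0 - kappa)
    lag = j[:, None] - j[None, :]  # lag[n-1, k] = n - 1 - k
    weights = np.where(lag >= 0, b[np.clip(lag, 0, None)], 0.0)
    out = np.zeros(n_pts)
    out[1:] = tau ** (-kappa) / math.gamma(2.0 - kappa) * (weights @ np.diff(f_vals))
    return out


# Check on f(t) = t, whose Caputo derivative of order kappa is t^(1-kappa) / Gamma(2-kappa).
tau, kappa = 0.01, 0.6
t = tau * np.arange(101)
exact = t ** (1.0 - kappa) / math.gamma(2.0 - kappa)
loop_result = caputo_l1_loop(t, tau, kappa)
vec_result = caputo_l1_vectorized(t, tau, kappa)
print(np.max(np.abs(vec_result - exact)))
```

The L1 scheme interpolates f linearly on each subinterval, so it reproduces the derivative of a linear function up to round-off; this makes f(t) = t a convenient sanity check. The matrix form trades the O(n²) Python loop for a single dense matrix product.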
We apply fPINNs to solve the fractional models F1 , F2 , and F3 with the network output denoted by $U_{NN}^{(F_1)}(t)$, $U_{NN}^{(F_2)}(t)$, and
$U_{NN}^{(F_3)}(t)$, respectively. In the fractional models, we fix the model parameters and infer the fractional orders κi (t), as well as the
unobserved dynamics. We use a separate network to represent each time-dependent fractional order.
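For the data in Set 1, if only H(t) is given then we have $q(t) = \frac{D^{new}(t)}{\varphi_D H(t)}$ and then MSEu becomes
\[
\begin{aligned}
\mathrm{MSE}_u ={}& \frac{1}{N_u} \sum_{j=1}^{N_u} \left| I^{new}_{NN}(t_j) - I^{new}(t_j) \right|^2
+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| D^{new}_{NN}(t_j) - D^{new}(t_j) \right|^2 \\
&+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| I^{c}_{NN}(t_j) - I^{c}(t_j) \right|^2
+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| D^{c}_{NN}(t_j) - D^{c}(t_j) \right|^2 \\
&+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| H_{NN}(t_j) - H(t_j) \right|^2
+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| q_{NN}(t_j) - q(t_j) \right|^2.
\end{aligned} \tag{15}
\]
For the data in Set 1, if both $H^c(t)$ and H(t) are given then the MSEu becomes the combination of both (14) and (15). For the
data in Set 2, the MSEu becomes
\[
\mathrm{MSE}_u = \frac{1}{N_u} \sum_{j=1}^{N_u} \left| I_{NN}(t_j) - I(t_j) \right|^2
+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| D^{c}_{NN}(t_j) - D^{c}(t_j) \right|^2
+ \frac{1}{N_u} \sum_{j=1}^{N_u} \left| R_{NN}(t_j) - R(t_j) \right|^2. \tag{16}
\]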
In any case, if the data for one of the compartments is not available we omit the corresponding term from MSEu .
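The rule of omitting unavailable compartments from MSEu can be sketched as a sum over whichever data series are present. The dictionary layout, the compartment keys, and the function name below are hypothetical; they only illustrate how terms drop out of (15) and (16).

```python
import numpy as np


def mse_u(predictions, data):
    """Sum of per-compartment mean squared errors.

    `predictions` maps compartment names to network outputs at the training points;
    `data` maps names to observations. Compartments missing from `data`, or set to
    None, contribute no term.
    """
    total = 0.0
    for name, observed in data.items():
        if observed is None:
            continue
        diff = np.asarray(predictions[name], dtype=float) - np.asarray(observed, dtype=float)
        total += float(np.mean(diff ** 2))
    return total


# Toy values: data exist for I and D only, so the R term is skipped.
pred = {"I": [1.0, 2.0, 3.0], "D": [0.0, 0.0, 0.0], "R": [5.0, 5.0, 5.0]}
obs = {"I": [1.0, 2.0, 5.0], "D": [1.0, 1.0, 1.0], "R": None}
loss_value = mse_u(pred, obs)
print(loss_value)
```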
with SNN (t) = N − ENN (t) − INN (t) − JNN (t) − DNN (t) − HNN (t) − RNN (t). Detailed expansions of the ODE residual for
models I2 , I3 , and T1 can be similarly derived.
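In view of the ODE systems for fractional models in Supplementary Table 3, the ODE residual for model F1 takes the form
\[
\begin{aligned}
\mathrm{MSE}_r^{(F_1)} ={}& \frac{1}{N_f} \sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_1-1}}{\Gamma(\kappa_1+1)}\, \delta_t^{\kappa_1} S_{NN}(t_j) + \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f} \sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_2-1}}{\Gamma(\kappa_2+1)}\, \delta_t^{\kappa_2} I_{NN}(t_j) - \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) + \gamma_{1NN}(t_j) I_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f} \sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_3-1}}{\Gamma(\kappa_3+1)}\, \delta_t^{\kappa_3} R_{NN}(t_j) - \gamma_{1NN}(t_j) I_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f} \sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_2-1}}{\Gamma(\kappa_2+1)}\, \delta_t^{\kappa_2} I^{c}_{NN}(t_j) - \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f} \sum_{j=1}^{N_f} \left| N - S_{NN}(t_j) - I_{NN}(t_j) - R_{NN}(t_j) \right|^2,
\end{aligned} \tag{18}
\]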
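and the ODE residual for the fractional model $F_3$ takes the form
\[
\begin{aligned}
MSE_r^{(F_3)} ={}& \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{1,NN}(t_j) - 1}}{\Gamma(\kappa_{1,NN}(t_j) + 1)}\, \delta_t^{\kappa_{1,NN}(t_j)} S_{NN}(t_j) + \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{2,NN}(t_j) - 1}}{\Gamma(\kappa_{2,NN}(t_j) + 1)}\, \delta_t^{\kappa_{2,NN}(t_j)} I_{NN}(t_j) - \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) + \gamma_{1,NN}(t_j) I_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{3,NN}(t_j) - 1}}{\Gamma(\kappa_{3,NN}(t_j) + 1)}\, \delta_t^{\kappa_{3,NN}(t_j)} H_{NN}(t_j) - p_{NN}(t_j)\gamma I_{NN}(t_j) + q_{NN}(t_j)\phi_D H_{NN}(t_j) + (1 - q_{NN}(t_j))\phi_R H_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{4,NN}(t_j) - 1}}{\Gamma(\kappa_{4,NN}(t_j) + 1)}\, \delta_t^{\kappa_{4,NN}(t_j)} D_{NN}(t_j) - q_{NN}(t_j)\phi_D H_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{5,NN}(t_j) - 1}}{\Gamma(\kappa_{5,NN}(t_j) + 1)}\, \delta_t^{\kappa_{5,NN}(t_j)} R_{NN}(t_j) - (1 - p_{NN}(t_j))\gamma_{NN}(t_j) I_{NN}(t_j) - (1 - q_{NN}(t_j))\phi_R H_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{2,NN}(t_j) - 1}}{\Gamma(\kappa_{2,NN}(t_j) + 1)}\, \delta_t^{\kappa_{2,NN}(t_j)} I^{c}_{NN}(t_j) - \frac{\beta_I}{N} I_{NN}(t_j) S_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| \frac{(\Delta t)^{\kappa_{3,NN}(t_j) - 1}}{\Gamma(\kappa_{3,NN}(t_j) + 1)}\, \delta_t^{\kappa_{3,NN}(t_j)} H^{c}_{NN}(t_j) - p_{NN}(t_j)\gamma I_{NN}(t_j) \right|^2 \\
&+ \frac{1}{N_f}\sum_{j=1}^{N_f} \left| N - S_{NN}(t_j) - I_{NN}(t_j) - H_{NN}(t_j) - D_{NN}(t_j) - R_{NN}(t_j) \right|^2 .
\end{aligned}
\tag{19}
\]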
Detailed expansions of the ODE residual for model F2 can be similarly derived.
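To make the structure of such a residual concrete, the following NumPy sketch evaluates the five terms of the $F_1$ residual on a uniform grid. It rests on several assumptions that are not confirmed by the text: the discrete operator $\delta_t^{\kappa}$ is taken to be the standard L1 approximation of the Caputo derivative for a constant order $0 < \kappa < 1$, the recovery term is taken as $\gamma I$ with a constant rate, and the function names `l1_caputo` and `mse_r_f1` are hypothetical. The first grid point is included in the averages with a zero fractional derivative.

```python
import numpy as np
from math import gamma

def l1_caputo(u, kappa, dt):
    """L1 approximation of the Caputo derivative (0 < kappa < 1), uniform grid."""
    u = np.asarray(u, dtype=float)
    M = u.size - 1
    c = 1.0 - kappa
    m = np.arange(M + 1)
    b = (m + 1.0) ** c - m ** c          # weights b_m = (m+1)^c - m^c
    du = np.diff(u)                      # increments u_{k+1} - u_k
    out = np.zeros(M + 1)
    for j in range(1, M + 1):            # the loop that vectorization removes
        out[j] = np.dot(b[:j][::-1], du[:j])
    return out * dt ** (-kappa) / gamma(2.0 - kappa)

def mse_r_f1(S, I, R, Ic, kappa, beta, gam, N, dt):
    """Residual of the fractional SIR-type system with orders kappa = (k1, k2, k3)."""
    S, I, R, Ic = (np.asarray(a, dtype=float) for a in (S, I, R, Ic))
    k1, k2, k3 = kappa

    def lhs(u, k):
        # (dt)^(k-1) / Gamma(k+1) times the discrete fractional derivative
        return dt ** (k - 1.0) / gamma(k + 1.0) * l1_caputo(u, k, dt)

    incidence = beta / N * I * S
    terms = [
        lhs(S, k1) + incidence,
        lhs(I, k2) - incidence + gam * I,
        lhs(R, k3) - gam * I,
        lhs(Ic, k2) - incidence,
        N - S - I - R,                   # conservation of the total population
    ]
    return sum(np.mean(np.abs(t) ** 2) for t in terms)
```

In a PINN setting the arrays would be network outputs at the residual points and the same expression would be written with differentiable tensor operations; the NumPy version is only meant to show how the terms combine.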
where $\odot$ and $\times$ denote the Hadamard product and matrix multiplication, respectively. The coefficient matrix $W$ can be obtained
by $W = [W_0 \ast\ast C - W_1 \ast\ast C] + [W_2 \ast\ast C - 2W_3 \ast\ast C + W_4 \ast\ast C] + I$, where $\ast\ast$ denotes the point-wise power operator and the
matrices on the right-hand side are given by
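\[
C = \begin{bmatrix}
1-\kappa_0 & 1-\kappa_0 & 1-\kappa_0 & \cdots & 1-\kappa_0 & 1-\kappa_0 \\
1-\kappa_1 & 1-\kappa_1 & 1-\kappa_1 & \cdots & 1-\kappa_1 & 1-\kappa_1 \\
1-\kappa_2 & 1-\kappa_2 & 1-\kappa_2 & \cdots & 1-\kappa_2 & 1-\kappa_2 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
1-\kappa_{M-1} & 1-\kappa_{M-1} & 1-\kappa_{M-1} & \cdots & 1-\kappa_{M-1} & 1-\kappa_{M-1} \\
1-\kappa_M & 1-\kappa_M & 1-\kappa_M & \cdots & 1-\kappa_M & 1-\kappa_M
\end{bmatrix},
\tag{21}
\]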
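\[
I = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix},
\quad
W_0 = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
(M-2) & 0 & 0 & \cdots & 0 & 0 \\
(M-1) & 0 & 0 & \cdots & 0 & 0
\end{bmatrix},
\tag{22}
\]
\[
W_1 = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & \cdots & 0 & 0 \\
2 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
(M-1) & 0 & 0 & \cdots & 0 & 0 \\
M & 0 & 0 & \cdots & 0 & 0
\end{bmatrix},
\quad
W_2 = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & (M-1) & (M-2) & \cdots & 1 & 0 \\
0 & M & (M-1) & \cdots & 2 & 0
\end{bmatrix},
\tag{23}
\]
\[
W_3 = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & (M-2) & (M-3) & \cdots & 1 & 0 \\
0 & (M-1) & (M-2) & \cdots & 1 & 0
\end{bmatrix},
\quad
W_4 = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & (M-3) & (M-4) & \cdots & 1 & 0 \\
0 & (M-2) & (M-3) & \cdots & 0 & 0
\end{bmatrix}.
\tag{24}
\]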
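The construction of $W$ from these matrices can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the original implementation: the nonzero entries are assumed to follow the index pattern $(j-1)$ and $j$ in the first column of $W_0$ and $W_1$, and $(j-k+1)$, $(j-k)$, $(j-k-1)$ for columns $1 \le k \le j-1$ of $W_2$, $W_3$, $W_4$, which reproduces the weights of the L1 scheme; each order is assumed to satisfy $0 < \kappa_j < 1$; and the names `l1_weight_matrix` and `l1_weights_loop` are hypothetical. Any scalar prefactor multiplying $W$ is omitted.

```python
import numpy as np

def l1_weight_matrix(kappa):
    """Vectorized W = [W0**C - W1**C] + [W2**C - 2*W3**C + W4**C] + I."""
    kappa = np.asarray(kappa, dtype=float)   # kappa_0, ..., kappa_M (one per row)
    M = kappa.size - 1
    j = np.arange(M + 1)[:, None]            # row index
    k = np.arange(M + 1)[None, :]            # column index
    C = np.broadcast_to(1.0 - kappa[:, None], (M + 1, M + 1))
    first = (k == 0) & (j >= 1)              # first column, rows 1..M
    inner = (k >= 1) & (k <= j - 1)          # strictly below the diagonal
    W0 = np.where(first, j - 1, 0).astype(float)
    W1 = np.where(first, j, 0).astype(float)
    W2 = np.where(inner, j - k + 1, 0).astype(float)
    W3 = np.where(inner, j - k, 0).astype(float)
    W4 = np.where(inner, j - k - 1, 0).astype(float)
    I = np.diag(np.r_[0.0, np.ones(M)])      # identity with a zero first entry
    return (W0 ** C - W1 ** C) + (W2 ** C - 2.0 * W3 ** C + W4 ** C) + I

def l1_weights_loop(kappa):
    """Reference version with explicit loops, for comparison."""
    M = len(kappa) - 1
    W = np.zeros((M + 1, M + 1))
    for j in range(1, M + 1):
        c = 1.0 - kappa[j]
        W[j, 0] = (j - 1) ** c - j ** c
        for k in range(1, j):
            W[j, k] = (j - k + 1) ** c - 2 * (j - k) ** c + (j - k - 1) ** c
        W[j, j] = 1.0
    return W
```

Because every matrix is built by broadcasting, a time-dependent order enters only through the rows of $C$, and the double loop of the reference version is replaced by element-wise powers. Applying either matrix to the grid values of a linear function $u_k = k$ returns $j^{1-\kappa_j}$ in row $j$, which is a convenient sanity check.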
References
1. Edeling, W. et al. The impact of uncertainty on predictions of the CovidSim epidemiological code. Nat. Comput. Sci. 1,
128–135 (2021).
2. Cramer, E. et al. reichlab/covid19-forecast-hub: release for zenodo, 20210816. DOI: 10.5281/zenodo.5208210 (2020).
3. CDC. COVID-19 pandemic planning scenarios. Available at https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html.
4. Oran, D. P. & Topol, E. J. Prevalence of asymptomatic SARS-CoV-2 infection: A narrative review. Annals Intern. Medicine
173, 362–367 (2020).
5. Hao, X. J. et al. Reconstruction of the full transmission dynamics of COVID-19 in Wuhan. Nature 584, 420–424 (2020).
6. Barlow, D. A. & Baird, J. K. Modeling the spring 2020 New York City COVID-19 epidemic: New criteria and methods for
prediction. medRxiv 10.1101/2020.06.12.20130005 (2020).
7. McAloon, C. et al. Incubation period of COVID-19: a rapid systematic review and meta-analysis of observational research.
BMJ open 10, e039652 (2020).
8. Li, R. Y. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2).
Science 368, 489–493 (2020).
9. Cheng, H. Y. et al. Contact tracing assessment of COVID-19 transmission dynamics in Taiwan and risk at different
exposure periods before and after symptom onset. JAMA Intern. Medicine 180, 1156–1163 (2020).
10. He, X., Lau, E., Wu, P. et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat. Medicine 26,
672–675 (2020).
11. Wölfel, R. et al. Virological assessment of hospitalized patients with COVID-2019. Nature 581, 465–469 (2020).
12. Richardson, S. et al. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with
COVID-19 in the New York City area. Jama 323, 2052–2059 (2020).
13. CDC. Considerations for wearing masks: Help slow the spread of COVID-19. Available at https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/cloth-face-cover-guidance.html.
14. Angstmann, C. N. et al. A general framework for fractional order compartment models. SIAM Rev. 63, 375–392 (2021).
15. Angstmann, C. N., Henry, B. I. & McGann, A. V. A fractional-order infectivity SIR model. Phys. A: Stat. Mech. its Appl.
452, 86–93 (2016).
16. Angstmann, C. N., Henry, B. I. & McGann, A. V. A fractional order recovery SIR model from a stochastic process. Bull.
mathematical biology 78, 468–499 (2016).
17. Wheatcraft, S. W. & Meerschaert, M. M. Fractional conservation of mass. Adv. Water Resour. 31, 1377–1381 (2008).
18. Odibat, Z. M. & Shawagfeh, N. T. Generalized Taylor’s formula. Appl. Math. Comput. 186, 286–293 (2007).
19. Cai, M. & Li, C. P. Numerical approaches to fractional integrals and derivatives: A review. Mathematics 8, 43 (2020).
20. Li, C. P. & Cai, M. Theory and Numerical Approximations of Fractional Integrals and Derivatives (SIAM, 2019).
21. Li, C. P. & Zeng, F. H. Numerical Methods for Fractional Calculus (CRC Press, 2015).
22. Leszczynskii, J. S. An Introduction to Fractional Mechanics (Publishing Office of Czestochowa University of Technology,
2011).
23. Michigan overview. Available at https://covidtracking.com/data/state/michigan.
24. COVID-19 vaccine dashboard. Available at https://www.michigan.gov/coronavirus/0,9753,7-406-98178 103214-547150--,
00.html.
25. COVID-19 Rhode Island data. Available at https://docs.google.com/spreadsheets/d/1c2QrNMz8pIbYEKzMJL7Uh2dtThOJa2j1sSMwiDo5Gz4/edit#gid=1592746937.
26. NYC Coronavirus Disease 2019 (COVID-19) Data. Available at https://github.com/nychealth/coronavirus-data.
27. Fine, P., Eames, K. & Heymann, D. L. “herd immunity”: a rough guide. Clin. infectious diseases 52, 911–916 (2011).
28. Peak, C. M., Childs, L. M., Grad, Y. H. & Buckee, C. O. Comparing nonpharmaceutical interventions for containing
emerging epidemics. Proc. Natl. Acad. Sci. 114, 4023–4028 (2017).
29. Fishman, G. S. Monte Carlo: Concepts, Algorithms, and Applications (Springer, New York, 1996).
30. Xiu, D. B. & Hesthaven, J. S. High-order collocation methods for differential equations with random inputs. SIAM J. on
Sci. Comput. 27, 1118–1139 (2005).
31. Roda, W. C., Varughese, M. B., Han, D. & Li, M. Y. Why is it difficult to accurately predict the COVID-19 epidemic? Infect.
Dis. Model. 5, 271–281 (2020).
32. Zhang, S., Ponce, J., Zhang, Z., Lin, G. & Karniadakis, G. E. An integrated framework for building trustworthy data-driven
epidemiological models: Application to the COVID-19 outbreak in New York City. medRxiv (2021).
33. Miao, H. Y., Xia, X. H., Perelson, A. S. & Wu, H. L. On identifiability of nonlinear ode models and applications in viral
dynamics. SIAM review 53, 3–39 (2011).
34. Tuncer, N. & Le, T. T. Structural and practical identifiability analysis of outbreak models. Math. biosciences 299, 1–18
(2018).
35. Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: A deep learning framework for solving
forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).
36. Baydin, A., Pearlmutter, B. A., Radul, A. A. & Siskind, J. M. Automatic differentiation in machine learning: a survey. The
J. Mach. Learn. Res. 18, 5595–5637 (2017).
37. Abadi, M. et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems.
arXiv: 1603.04467 (2016).
38. Long, J., Khaliq, A. Q. M. & Furati, K. M. Identification and prediction of time-varying parameters of COVID-19 model:
a data-driven deep learning approach. Int. J. Comput. Math. 1–19 (2021).
39. Pang, G. F., Lu, L. & Karniadakis, G. E. fPINNs: Fractional physics-informed neural networks. SIAM J. on Sci. Comput.
41, A2603–A2626 (2019).
40. Lin, Y. M. & Xu, C. J. Finite difference/spectral approximations for the time-fractional diffusion equation. J. computational
physics 225, 1533–1552 (2007).
41. Sun, Z. Z. & Wu, X. N. A fully discrete difference scheme for a diffusion-wave system. Appl. Numer. Math. 56, 193–209
(2006).