Professional Documents
Culture Documents
2019-Liu-Machine Learning For Predicting Thermodynamic Properties of Pure Fluids and Their Mixtures
2019-Liu-Machine Learning For Predicting Thermodynamic Properties of Pure Fluids and Their Mixtures
2019-Liu-Machine Learning For Predicting Thermodynamic Properties of Pure Fluids and Their Mixtures
Energy
journal homepage: www.elsevier.com/locate/energy
a r t i c l e i n f o a b s t r a c t
Article history: Establishing a reliable equation of state for largely non-ideal or multi-component liquid systems is
Received 6 June 2019 challenging because the complex effects of molecular configurations and/or interactions on the ther-
Received in revised form modynamic properties must generally be taken into account. In this regard, machine learning holds great
2 September 2019
potential for directly learning the thermodynamic mappings from existing data, thereby bypassing the
Accepted 7 September 2019
Available online 9 September 2019
use of equations of state. The present study outlines a general machine learning framework based on
high-efficiency support vector regression for predicting the thermodynamic properties of pure fluids and
their mixtures. The proposed framework is adopted in conjunction with training data obtained from a
Keywords:
Thermodynamic properties
high-fidelity database to successfully predict the thermodynamic properties of three common pure
Machine learning fluids. The predictions demonstrate extremely low mean square errors. Moreover, little loss in the
Support vector regression prediction accuracy is obtained for ternary mixtures of the pure fluids at the cost of a modest increase in
Mixtures the volume of training data provided by state-of-the-art molecular dynamics simulations. Our results
Molecular dynamics simulation demonstrate the promising potential of machine learning for building accurate thermodynamic map-
pings of pure fluids and their mixtures. The proposed methodology may pave the way in the future for
the rapid exploration of novel or complex systems with potentially exceptional thermodynamic
properties.
© 2019 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.energy.2019.116091
0360-5442/© 2019 Elsevier Ltd. All rights reserved.
2 Y. Liu et al. / Energy 188 (2019) 116091
space yielding the lowest CV error. then employed to optimize the objective function of our n-SVR
implementation with respect to the model parameters using the
2.2. Machine learning framework Gaussian kernel. As a result, a predictive model is constructed.
After constructing the predictive model, its feasibility can be
We consider mixtures consisting of n distinct molecular com- evaluated by comparing its ML mappings with the original training
ponents and seek to define the target property P according to the data, and the testing set can be employed to validate the accuracy
system attributes D ¼ [Vm T x1 x2 … xn1], where Vm is the molar and generalization ability of the predictive model. Two evaluation
volume of the mixture, T is the temperature, and xj is the molar criteria are commonly adopted for conducting quantitative as-
fraction of the jth molecular component, for j ¼ 1, 2,…, n 1. The sessments of the feasibility, accuracy, and generalization ability of a
input vector contains the molar fractions of only (n 1) compo- predictive model. The first is the mean square error (MSE)
nents because the last component xn is independent of the others
under the constraint xn ¼ 1 ðx1 þ x2 þ … þ xn1 Þ. In addition, we
1 XN
note that the attributes of a pure fluid are reduced to D ¼ [Vm T]. MSE ¼ ðf ðyi Þ zi Þ2 ; (8)
N i¼1
The proposed ML framework for predicting the thermodynamic
properties of pure fluids or their mixtures is illustrated in Fig. 1
based on the above discussion. This process is also compared in which is employed to represent the training error of the predictive
Fig. 1 with that employed when adopting equations of state for model with respect to the training set and the generalization error
generating mappings among thermodynamic parameters. For the of the predictive model with respect to the testing set. The other is
ML framework, the sample space of the thermodynamic parame- the squared correlation coefficient r 2 , given as
ters D must first be obtained from some high-fidelity databases, P P 2
P
experiments, or MD simulations. Then, the data within the sample N N i¼1 f y i zi N i¼1 f yi N
i¼1 zi
space must be rescaled to decrease the differences in their mag- r2 ¼ P 2 P P 2 ;
P 2
nitudes by various means, such as normalization, regularization, or N N
i¼1 f ðy i Þ N
i¼1 f ðy i Þ N N
z2
i¼1 i
N
z
i¼1 i
logarithmic transformation. This is a significant step prior to sub-
mitting the data for training to avoid the impacts of data with (9)
extremely imbalanced magnitudes on the training process. The
rescaling process, however, does not apply to the molar fractions which is used to quantify the strength of the linear correlations
because they are naturally normalized. We note from preliminary between the labels and the training results with respect to the
experiments that logarithmic transformation tends to maintain the training set and the prediction results with respect to the testing
relative magnitudes of PVT data better than normalization and set. It worth noting that there are several significant factors that
regularization and therefore provides better results for our tasks. will bring in errors with v-SVR: (a) regions of low data density, this
The logarithmic transformation equations are given as follows: could be avoided by adopting the relatively uniform data distri-
bution to train the model; (b) high variance of magnitude of inputs,
P 0 ¼ log10 ðPÞ; (5) this could be solved by the scaling of inputs; (c) underfitting, it
might be overcome by appropriately decreasing the regularization
strength; (d) overfitting, it might be settled by providing sufficient
V 0m ¼ log10 ð1000 Vm Þ (6)
training data and appropriately selecting the regularization
strength and the kernel width with the fivefold cross-validation.
T 0 ¼ log10 ðTÞ: (7)
Here, P is given in MPa, Vm in L/mol, and T in K. Accordingly, the 2.3. Details on molecular dynamics simulations
input vector y and the label z employed by SVR are determined in
training as [V'm T0 x1 x2 … xn1] and P 0 , respectively. Afterward, the The parameters of the potential models employed in our MD
rescaled sample space is divided into a training set and a testing set. simulations have been carefully chosen. Accordingly, the TIP4P [45],
Then, the grid search method is adopted with fivefold cross- EMP2 [46], and two-site models [47] were employed for simulating
validation (CV) to determine the optimal values of g and C in n- H2O, CO2, and H2 molecules, respectively. A series of MD simula-
SVR. The entire training set with the optimal values of g and C is tions are performed by using the LAMMPS package [48]. Simulated
Fig. 1. Outline of the two methods for generating mappings among thermodynamic parameters. One method involves establishing an empirical equation of state by fitting
experimental or simulated data, as illustrated in the upper half of the figure. The other method is based on the proposed machine learning framework, as illustrated in the lower half
of the figure.
4 Y. Liu et al. / Energy 188 (2019) 116091
Table 1
Details regarding the sample spaces. Here, N1 and N2 denote the sizes of the training sets and the testing sets, respectively.
systems contain 2500 molecules for the ternary mixtures. All ternary H2O-CO2-H2 mixtures required by the present study have
molecules are located in a cubic box with periodic boundary con- been obtained by conducting MD simulations ranging from the
ditions. The long-range Coulombic interactions are handled by the near-critical region to the supercritical region, which yielded 660
particle-particle/particle-mesh (PPPM) whereas the short-range datasets. Details regarding the adopted sample spaces of the
interaction potential is cut off beyond 12 Å. A Nose -Hoover ther- aforementioned systems are provided in Table 1.
mostat [49] is coupled to systems to control temperatures for the The ML framework was first applied to the pure fluids, where
canonical (NVT) ensembles along which the pressure of systems is y ¼ [V'm T'] and z ¼ P'. The results for H2O are described in detail,
calculated. The time step is of 1 fs in all simulations. Initial equili- whereas only the main results for CO2 and H2 are presented owing
bration periods are 600 ps, followed by simulation runs of 600 ps to to space limitations. The coarse- and fine-grained grid searches for
record meaningful data. the optimal parameters g and C for H2O are illustrated in Figs. 2(a)
and (b), respectively, as a function of log2 ðgÞ and log2 ðCÞ. Choosing
such exponentially increasing grid sizes is a typical strategy
3. Results and discussion
employed to increase the efficiency of the grid search method. In
addition, this strategy ensures that the grid space can span a suf-
In the following sections, we apply the proposed ML framework
ficiently wide range of both g and C. The total calculation times
to predict the thermodynamic properties of three pure fluids and
involved in the grid search method can be further reduced by first
ternary mixtures of the three. The pure fluids include water (H2O),
conducting a coarse grid search to determine an approximate re-
carbon dioxide (CO2), and hydrogen (H2) because these are the
gion for which the optimal parameters may be located prior to
most common substances and also include polar and non-polar
conducting searches over finer-grained grids. This approximate
molecules. Moreover, these pure fluids play important roles in
region for H2O is enclosed by the dashed black line in Fig. 2(a).
recently emerging technologies such as the extraction of super-
Then, smaller grid sizes are employed, as shown in Fig. 2(b), to
critical fluids [50] and coal gasification in supercritical water [51].
pinpoint the optimal parameters. The results indicate that the two
Predictions for all cases are conducted over a wide range of T, which
optimal parameters lie within the grid space, which confirms that
include near-critical and supercritical regions of their phase space.
our original grid spans are sufficiently large to include the optimal
The thermodynamic properties of H2O, CO2, and H2 have been
parameters. The contour lines in Fig. 2(b) suggest that multiple
extensively studied via experiments and simulations, and this has
parameter values correspond with an equivalent CV accuracy. Un-
generated an abundance of high-fidelity datasets. Accordingly, the
der this condition, the optimal parameters may be determined as
sample spaces for these pure fluids have been obtained directly
those lying with the minima of C to reduce the possibility of
from the database provided by the prestigious National Institute of
overfitting. The above-discussed procedure yields optimal g and C
Standards and Technology (NIST). However, experimental and
values of 16 and 8, respectively, for the H2O model. The optimiza-
simulation data regarding the thermodynamic properties of H2O-
tion results for other models are listed in Table 2.
CO2-H2 ternary mixtures are presently scarce. Therefore, sufficient
The optimal values of g and C are then utilized with the training
training data pertaining to the thermodynamic properties of
10 MSE
10 MSE
0.325 0.005
0.292
0.228
log2( )
log2( )
0.195 0.003
-2 0.163 -2
0.130 0.002
0.0975
-8 0.0650
-8 0.001
0.0325
set derived from the NIST basis sets for training the n-SVR, and the mappings for reproducing the pressure of H2O, CO2, and H2 in
final ML mappings are generated according to the n-SVR with the Table 3. It is noted that the training and prediction errors of ML
Gaussian kernel. Figs. 3(a) and (b) present comparisons between mappings are rather small. In addition, all values of r2 are quite
the NIST basis sets with the data generated from the ML mappings close to 1, indicating that a strongly linear relationship exists be-
for H2O based on a training set and a testing set, respectively. The tween the basis sets and the mapping and prediction data. These
NIST training set data are marked by the “þ” symbols in Fig. 3(a) results verify the applicability of our ML framework for various
whose colors represent the values of T based on the color scale pure fluids.
given to the right of the figure, while the corresponding ML map- In contrast to the pure liquids, the input vector y of the H2O-
pings are described by the smooth grey curves. Similarly, the NIST CO2-H2 mixtures would be rendered as [V'm T' xH2 O xCO2 ], [V'm T'
testing set data in Fig. 3(b) are marked by various symbols ac- xH2 O xH2 ], or [V'm T' xCO2 xH2 ], which are mutually equivalent
cording to the values of Vm pertaining to the data, while the cor- because the last molar fraction component is independent of the
responding ML mappings are described by the curves. The accuracy others under the above-discussed constraint. Despite the increased
of the ML mappings based on the training set serves as a necessary number of dimensions, the ML procedures implemented here are
factor for validating the feasibility of the data-driven ML frame- identical to those employed for the pure liquids. Specific compo-
work, whereas the prediction accuracy of the ML mappings based sition for the ternary mixtures and corresponding dataset size are
on the testing set serves the role of estimating the generalization presented in Fig. 4. Again, the optimized values of g and C are ob-
performance of the framework for data not included in the original tained, as listed in Table 2. From Fig. 5(a), we can observe that the
dataset. It is observed that all ML outputs for H2O agree remarkably ML mappings for the ternary mixtures achieve good thermody-
well with the training and testing set data over a wide range of T. namic mapping and high prediction accuracies from the near-
Furthermore, we quantify and summarize the precision of the ML critical region to the supercritical region. This is also indicated by
the results listed in Table 3 for the ternary mixtures, where the
mapping and prediction errors are extremely low, and the values of
Table 2 r2 are all nearly equal to 1. We also compare the results of Fig. 5(a)
Optimal values of parameters g and C obtained for n-SVR by fivefold CV.
with the results obtained from the three most typical equations of
System g C state for the H2O-CO2-H2 mixtures in Fig. 5(b), namely PR-EOS, SRK-
H2O 16 8 EOS, and DMW EOS. Here, extending an EOS to mixtures requires
CO2 8 16 the use of a mixing rule. We adopted the commonly used van der
H2 2 4 Waals mixing rule for the PR-EOS and SRK-EOS, and the corre-
H2O-CO2-H2 2 32
sponding equation parameters were obtained from the available
literature [52e54]. We note from the figure that the prediction
T (K)
2000
(a) Vm : L/mol NIST
05
3 Table 3
0.
SVR
=
1
m
0.
Accuracies and correlations for H2O, CO2, H2, and their ternary mixtures.
15
V
0.
25
0.
m
1600
V
0.
35
3
=
0.
m
45
4
V
0.
55
0.
m
0.
65
m
0.
75
6
0.
V
85
m
0.
=
0.
V
8
0.
95
m
0.
9
=
m
V
0.
0.
log10(P)
=
m
0.
=
0.
V
=
m
V
=
V
r2 r2
m
=
m
V
MSE MSE
m
=
V
m
V
m
V
m
V
m
V
m
V
1200
1 H2O 2.52 104 0.999223 2.41 104 0.999115
CO2 9.65 106 0.999958 5.18 106 0.999978
800 H2 4.44 107 0.999997 2.47 107 0.999999
0
H2O-CO2-H2 1.70 106 0.999992 2.52 105 0.999894
-1 400
0 100 200 300 400 500 600 700
N xCO2 (%)
80 60 40 20 10 20 30 40 40 30 20 10
80
x H2 = 1 xH2O xCO2
240
(b) NIST Vm = 0.125 L/mol
NIST
Vm = 0.275 L/mol
SVR SVR 60
180 NIST Vm = 0.525 L/mol NIST Vm = 0.88 L/mol
SVR SVR
P (MPa)
40
N
120
60
20
0
400 800 1200 1600 2000
0
T (K) 10 20 30 40 10 20 30 40 20 40 60 80
Fig. 3. Comparisons of the NIST basis sets with the data generated from the ML xH2O (%)
mappings for H2O: (a) ML mappings based on the training set as a function of the
training set size N; and (b) ML mappings based on the testing set as a function of T. Fig. 4. Chosen specific composition and corresponding dataset size for the H2O-CO2-
Both datasets are derived from the NIST database. H2 ternary mixtures.
6 Y. Liu et al. / Energy 188 (2019) 116091
400
P PMD
MARD ¼
model
100%; (10)
(b) Ideal curve PMD
PR EOS
300 where the subscripts ‘MD’ and ‘model’ represent the results from
SRK EOS
the MD simulations and the ML mappings or the EOS models,
DMW EOS
PEOS (MPa)
respectively. We can be observed from Fig. 5(c) that the DMW EOS
exhibits a minimum MARD among those three EOS models. It is
200 because unlike the cubic EOS truncated at the third virial coeffi-
cient, the DMW EOS has a rather complex form with more co-
efficients, generally presenting better predictability for
100 supercritical fluid mixtures. However, the ML mappings can give a
prediction with extremely low MARD of 1.7%, an order of magni-
tude improvement over those three equations of state. At this stage
0 in the development of equations of state, the proposed data-driven
ML method, which requires no knowledge of the motions and
intrinsic interactions of molecules, may hold greater promise for
0 100 200 300 400 generating accurate thermodynamic mappings for multi-
PMD (MPa) component mixtures.
80
(c)
59.40% 4. Conclusions
60
In summary, we introduced a general ML framework for pre-
MARD (%)
Acknowledgements
X
n
B¼ xj Bj (25)
This work was supported by the National Key R&D Program of j¼1
China (Grant No. 2016YFB0600100).
where djq is the binary interaction parameter for the components j
and q. All parameters in the above mixing rule for the H2O-CO2-H2
Appendix mixtures can be found in other literature [52e54].
Duan-Møller-Weare equation of state. Compared with cubic
Soave modified Redlich-Kwong equation of state. The SRK-EOS for equations of state, the DMW-EOS has a more complicated form
pure fluids is given by Ref. [11]. with fourteen coefficients [15].
RT A P z Vz a þ a2 =T 2z þ a3 =T 3z a4 þ a5 =T 2z þ a6 =T 3z
P¼ ; (11) Z¼ ¼1þ 1 þ
Vm B Vm ðVm þ BÞ RTz Vz V 2z
with
a þ a8 =T 2z þ a9 =T 3z a10 þ a11 =T 2z þ a12 =T 3z
þ 7 þ
A ¼ Ac b; (12) V 4z V 5z
! !
a a a
R2 T 2c þ 3132 1 þ 14 exp 14 ;
Ac ¼ 0:42747 ; (13) Tz Vz Vz2
V 2z
Pc
(26)
h pffiffiffiffiffi i2
b ¼ 1 þ k 1 Tr ; (14) with
3:0626t3 P
k ¼ 0:48508 þ 1:55171u 0:15613u2 ; (15) Pz ¼ ; (27)
m
RTc
B ¼ 0:08664 ; (16) 154T
Pc Tz ¼ ; (28)
m
where R is the ideal gas constant; the subscript ‘c’ denotes the
t 3
properties at the critical point; Tr represents a reduced tempera- Vz ¼ Vm ; (29)
ture; u is the acentric factor of molecules. 3:691
Peng-Robinson equation of state. The PR-EOS takes the form for
pure fluids [12]. where Z is the compression factor; t and m are the Lennard-Jones
parameters; Vm is in L/mol. For mixture applications of the DMW-
RT A EOS, the Lorentz-Berthelot rules is used to mix the parameters m
P¼ ; (17) and t
Vm B Vm ðVm þ BÞ þ BðVm BÞ
X
n X
n
pffiffiffiffiffiffiffiffiffiffi
with m¼ xj xq C1;jq mj mq ; (30)
j¼1 q¼1
A ¼ Ac b; (18)
X
n X
n
R2 T 2c t¼ xj xq C2;jq tj þ tq =2; (31)
Ac ¼ 0:45724 ; (19)
Pc j¼1 q¼1
h pffiffiffiffiffi i2 where C1;jq and C2;jq are the mixing parameters describing the bi-
b¼ 1þk 1 Tr ; (20) nary interaction between components j and q.
References
k ¼ 0:37464 þ 1:54226u 0:26992u2 ; (21)
[1] Zhang ZG, Duan ZH. An optimized molecular potential for carbon dioxide.
RTc J Chem Phys 2005;122(21).
B ¼ 0:07780 : (22) [2] Feng X-J, Liu Q, Zhou M-X, Duan Y-Y. Gaseous PVTx properties of mixtures of
Pc carbon dioxide and propane with the burnett isochoric method. J Chem Eng
Data 2010;55(9):3400e9.
van der Waals mixing rule. Mixing rules extend the applications of [3] Zhang XR, Yamaguchi H, Fujima K, Enomoto M, Sawada N. Theoretical analysis
equations of state from pure fluids to mixtures. In this work, van der of a thermodynamic cycle for power and heat production using supercritical
carbon dioxide. Energy 2007;32(4):591e9.
Waals mixing rule is used as follows: [4] Fan J, Hong H, Jin H. Power generation based on chemical looping combustion:
will it qualify to reduce greenhouse gas emissions from life-cycle assessment?
X
n X
n ACS Sustainable Chem Eng 2018;6(5):6730e7.
A¼ xj xq Ajq (23) [5] Chakraborty A, Saha BB, Koyama S, Ng KC. Thermodynamic modelling of a
j¼1 q¼1 solid state thermoelectric cooling device: temperatureeentropy analysis. Int J
Heat Mass Transf 2006;49(19):3547e54.
qffiffiffiffiffiffiffiffiffiffi
[6] Bang-Møller C, Rokni M. Thermodynamic performance study of biomass
gasification, solid oxide fuel cell and micro gas turbine hybrid systems. Energy
Ajq ¼ 1 djq Aj Aq (24) Convers Manag 2010;51(11):2330e9.
[7] Gilfillan SMV, Lollar BS, Holland G, Blagburn D, Stevens S, Schoell M, et al.
Solubility trapping in formation water as dominant CO2 sink in natural gas
8 Y. Liu et al. / Energy 188 (2019) 116091
fields. Nature 2009;458:614. autoregressive fractionally integrated moving average and least square sup-
[8] Gaillard F, Scaillet B, Arndt NT. Atmospheric oxygenation caused by a change port vector machine. Energy 2017;129:122e37.
in volcanic degassing pressure. Nature 2011;478:229. [31] Hansen K, Montavon G, Biegler F, Fazli S, Rupp M, Scheffler M, et al. Assess-
[9] Duan Z, Zhang Z. Equation of state of the H2O, CO2, and H2OeCO2 systems up ment and validation of machine learning methods for predicting molecular
to 10 GPa and 2573.15K: molecular dynamics simulations with ab initio po- atomization energies. J Chem Theory Comput 2013;9(8):3404e19.
tential surface. Geochem Cosmochim Acta 2006;70(9):2311e24. [32] Schutt KT, Glawe H, Brockherde F, Sanna A, Muller KR, Gross EKU. How to
[10] Guardia E, Martí J. Density and temperature effects on the orientational and represent crystal structures for machine learning: towards fast prediction of
dielectric properties of supercritical water. Phys Rev E 2004;69(1):011502. electronic properties. Phys Rev B 2014;89(20).
[11] Soave G. Equilibrium constants from a modified Redlich-Kwong equation of [33] Hansen K, Biegler F, Ramakrishnan R, Pronobis W, von Lilienfeld OA, Müller K-
state. Chem Eng Sci 1972;27(6):1197e203. R, et al. Machine learning predictions of molecular properties: accurate many-
[12] Peng D-Y, Robinson DB. A new two-constant equation of state. Ind Eng Chem body potentials and nonlocality in chemical space. J Phys Chem Lett
Fundam 1976;15(1):59e64. 2015;6(12):2326e31.
[13] Wilczek-Vera G, Vera JH. Understanding cubic equations of state: a search for [34] Deringer VL, Csanyi G. Machine learning based interatomic potential for
the hidden clues of their success. AIChE J 2015;61(9):2824e31. amorphous carbon. Phys Rev B 2017;95(9).
[14] Lopez-Echeverry JS, Reif-Acherman S, Araujo-Lopez E. Peng-Robinson equa- [35] Seko A, Maekawa T, Tsuda K, Tanaka I. Machine learning with systematic
tion of state: 40 years through cubics. Fluid Phase Equilib 2017;447:39e71. density-functional theory calculations: application to melting temperatures of
[15] Duan Z, Møller N, Weare JH. A general equation of state for supercritical fluid single- and binary-component solids. Phys Rev B 2014;89(5):054303.
mixtures and molecular dynamics simulation of mixture PVTx properties. [36] Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal
Geochem Cosmochim Acta 1996;60(7):1209e16. problems. Technometrics 1970;12(1):55e67.
[16] Ma J, Li J, He C, Peng C, Liu H, Hu Y. Thermodynamic properties and [37] Faber F, Lindmaa A, von Lilienfeld OA, Armiento R. Crystal structure repre-
vaporeliquid equilibria of associating fluids, PengeRobinson equation of state sentations for machine learning models of formation energies. Int J Quantum
coupled with shield-sticky model. Fluid Phase Equilib 2012;330:1e11. Chem 2015;115(16):1094e101.
[17] Arvelos S, Rade LL, Watanabe EO, Hori CE, Romanielo LL. Evaluation of [38] Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B
different contribution methods over the performance of PengeRobinson and Stat Methodol 1996;58(1):267e88.
CPA equation of state in the correlation of VLE of triglycerides, fatty esters and [39] Rasmussen CKIW CE. Gaussian processes for machine learning. Cambridge:
glycerolþCO2 and alcohol. Fluid Phase Equilib 2014;362:136e46. MIT Press; 2006.
[18] Zhao H, Lvov SN. Phase behavior of the CO2eH2O system at temperatures of [40] Vapnik VN. The nature of Statistical learning theory. Springer; 1995.
273e623 K and pressures of 0.1e200 MPa using Peng-Robinson-Stryjek-Vera [41] Scho€lkopf B, Smola AJ, Williamson RC, Bartlett PL. New support vector algo-
equation of state with a modified Wong-Sandler mixing rule: an extension to rithms. Neural Comput 2000;12(5):1207e45.
the CO2eCH4eH2O system. Fluid Phase Equilib 2016;417:96e108. [42] Chang C-C, Lin C-J. Training v-support vector regression: theory and algo-
[19] Rupp M, Tkatchenko A, Müller K-R, von Lilienfeld OA. Fast and Accurate rithms. Neural Comput 2002;14(8):1959e77.
modeling of molecular atomization energies with machine learning. Phys Rev [43] Libsvm Chang C-C, Lin C-J. ACM Trans Intell Syst Technol 2011;2(3):1e27.
Lett 2012;108(5):058301. [44] Scho€lkopf B, Smola AJ, Bach F. Learning with kernels: support vector ma-
[20] Rossouw D, Burdet P, de la Pen ~ a F, Ducati C, Knappett BR, Wheatley AEH, et al. chines, regularization, optimization, and beyond. MIT press; 2002.
Multicomponent signal unmixing from nanoheterostructures: overcoming [45] Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison
the traditional challenges of nanoscale x-ray analysis via machine learning. of simple potential functions for simulating liquid water. J Chem Phys
Nano Lett 2015;15(4):2716e20. 1983;79(2):926e35.
[21] Raccuglia P, Elbert KC, Adler PDF, Falk C, Wenny MB, Mollo A, et al. Machine- [46] Harris JG, Yung KH. Carbon dioxide's liquid-vapor coexistence curve and
learning-assisted materials discovery using failed experiments. Nature critical properties as predicted by a simple molecular model. J Phys Chem
2016;533:73. 1995;99(31):12021e4.
[22] Huan TD, Batra R, Chapman J, Krishnan S, Chen L, Ramprasad R. A universal [47] Yang Q, Zhong C. Molecular simulation of adsorption and diffusion of
strategy for the creation of machine learning-based atomistic force fields. npj hydrogen in metalorganic frameworks. J Phys Chem B 2005;109(24):
Comput Mater 2017;3(1):37. 11862e4.
[23] Ramprasad R, Batra R, Pilania G, Mannodi-Kanakkithodi A, Kim C. Machine [48] Plimpton S. Fast parallel algorithms for short-range molecular dynamics.
learning in materials informatics: recent applications and prospects. npj J Comput Phys 1995;117(1):1e19.
Comput Mater 2017;3(1):54. [49] Hoover WG. Canonical dynamics: equilibrium phase-space distributions. Phys
[24] Brockherde F, Vogt L, Li L, Tuckerman ME, Burke K, Müller K-R. Bypassing the Rev A 1985;31(3):1695e7.
Kohn-Sham equations with machine learning. Nat Commun 2017;8(1):872. [50] Mouahid A, Crampon C, Toudji S-AA, Badens E. Supercritical CO2 extraction of
[25] Choi S, Shin JH, Lee J, Sheridan P, Lu WD. Experimental demonstration of neutral lipids from microalgae: experiments and modelling. J Supercrit Fluids
feature extraction and dimensionality reduction using memristor networks. 2013;77:7e16.
Nano Lett 2017;17(5):3113e8. [51] Li Y, Guo L, Zhang X, Jin H, Lu Y. Hydrogen production from coal gasification in
[26] Chmiela S, Tkatchenko A, Sauceda HE, Poltavsky I, Schütt KT, Müller K-R. supercritical water with a continuous flowing system. Int J Hydrogen Energy
Machine learning of accurate energy-conserving molecular force fields. Sci 2010;35(7):3036e45.
Adv 2017;3(5). [52] Yang X, Xu J, Wu S, Yu M, Hu B, Cao B, et al. A molecular dynamics simulation
[27] Liu Z, Zhu D, Rodrigues SP, Lee K-T, Cai W. Generative model for the inverse study of PVT properties for H2O/H2/CO2 mixtures in near-critical and super-
design of metasurfaces. Nano Lett 2018;18(10):6570e6. critical regions of water. Int J Hydrogen Energy 2018;43(24):10980e90.
[28] Hang Z, Hippalgaonkar K, Buonassisi T, Løvvik O M, Sagvolden E, Ding D. [53] Meng L, Duan Y-Y, Wang X-D. Binary interaction parameter kij for calculating
Machine learning for novel thermal-materials discovery: early successes, the second cross-virial coefficients of mixtures. Fluid Phase Equilib
opportunities, and challenges. ES Energy Environ 2018;2:1e8. 2007;260(2):354e8.
[29] Geng Z, Yang X, Han Y, Zhu Q. Energy optimization and analysis modeling [54] Nishiumi H, Gotoh H. Generalization of binary interaction parameters of Peng-
based on extreme learning machine integrated index decomposition analysis: Robinson equation of state for systems containing hydrogen. Fluid Phase
application to complex chemical processes. Energy 2017;120:67e78. Equilib 1990;56:81e8.
[30] Yuan X, Tan Q, Lei X, Yuan Y, Wu X. Wind power prediction using hybrid