
Journal of Applied Physics · ARTICLE · scitation.org/journal/jap

A machine learning based approach for phononic crystal property discovery
Cite as: J. Appl. Phys. 128, 025106 (2020); doi: 10.1063/5.0006153
Submitted: 28 February 2020 · Accepted: 23 June 2020 · Published Online: 13 July 2020

Seid M. Sadat and Robert Y. Wang a)

AFFILIATIONS
School for Engineering of Matter, Transport, and Energy, Arizona State University, Tempe, Arizona 85287, USA

Note: This paper is part of the special collection on Machine Learning for Materials Design and Discovery
a) Author to whom correspondence should be addressed: rywang@asu.edu

ABSTRACT
Phononic crystals are artificially structured materials that can possess special vibrational properties that enable advanced manipulations of
sound and heat transport. These special properties originate from the formation of a bandgap that prevents the excitation of entire frequency
ranges in the phononic band diagram. Unfortunately, identifying phononic crystals with useful bandgaps is a problematic process because
not all phononic crystals have bandgaps. Predicting whether a phononic crystal structure has a bandgap and, if so, the gap’s center frequency and
width is a computationally expensive process. Herein, we explore machine learning as a rapid screening tool for expedited discovery of
phononic bandgap presence, center frequency, and width. We test three different machine learning algorithms (logistic/linear regression,
artificial neural network, and random forests) and show that the random forests model performs the best. For example, we show that a random
phononic crystal selection has only a 17% probability of having a bandgap, whereas after incorporating rapid screening with the random
forests model, this probability increases to 89%. When predicting the bandgap center frequency and width, this model achieves coefficients
of determination of 0.66 and 0.85, respectively. If the model has a priori knowledge that a bandgap exists, the coefficients of determination
for center and width improve to 0.97 and 0.85, respectively. We show that most of the model’s performance gains are achieved for training
datasets as small as ∼5000 samples. Training the model with just 500 samples led to reduced performance but still yielded algorithms with
predictive value.

Published under license by AIP Publishing. https://doi.org/10.1063/5.0006153

INTRODUCTION
Phononic crystals are artificially synthesized materials with periodic variations in acoustic properties.1–3 This periodicity can lead to advantageous vibrational properties that can be used to manipulate the transport of sound and heat. The vibrational properties of a phononic crystal are best described by its phononic band diagram (also known as the phonon dispersion relationship). The phononic band diagram relates a given phonon wave vector to its corresponding frequencies and is analogous to electronic and photonic band diagrams. Within a given phononic band diagram, there can be entire phonon frequency ranges that are forbidden from excitation. These frequency ranges are known as “phononic bandgaps” and are described by their bandgap center frequency and bandgap width. These phononic bandgaps can be used to create novel phononic devices such as phonon waveguides, cavities, filters, sensors, switches, and rectifiers.4–9 For a more in-depth discussion on phononic crystals and their applications, we refer the reader to some review articles and books.10–12
Figure 1(a) illustrates a two-dimensional phononic crystal that consists of periodic cylinders in a square lattice that is embedded within a host material. Figure 1(b) shows the phononic crystal’s unit cell, and Fig. 1(c) illustrates the key symmetry points in reciprocal space. The full band diagram of a two-dimensional phononic crystal is a three-dimensional surface. For ease of visualization, the band diagram is often illustrated as a two-dimensional plot that is graphed along the key directions in reciprocal space [i.e., line connecting Γ–M–X–Γ in Fig. 1(c)]. Figure 1(d) shows the band diagram for a phononic crystal that has a bandgap. The presence of such a bandgap along with the bandgap’s center frequency and width are among the primary characteristics of what makes a phononic crystal useful for the creation of phononic devices. Not all phononic crystals have a bandgap and Fig. 1(e) illustrates an example of a phononic band diagram that does not have a bandgap.


For a given phononic crystal structure, it is not possible to


know a priori what the bandgap center frequency or bandgap
width will be. In fact, many phononic crystal structures do not
yield a phononic bandgap at all. For example, only 17% of the
14 112 phononic crystals investigated in this paper yielded band-
gaps. Calculating the phononic band diagram is the typical way to
determine if there is a bandgap, and if so, what are the correspond-
ing bandgap center frequency and width. Unfortunately, calculating
the phononic band diagram is a computationally expensive process
that depends on many parameters. These parameters include the
phononic crystal’s dimensionality, unit cell symmetry, number of
materials, and material characteristics (i.e., shape, size, density,
elastic modulus, and Poisson’s ratio). Calculating the phononic
band diagram begins with the elastic wave equation for a locally
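isotropic medium,

$$\frac{\partial^2 u_i}{\partial t^2} = \frac{1}{\rho}\left\{ \frac{\partial}{\partial x_i}\left( \lambda \frac{\partial u_l}{\partial x_l} \right) + \frac{\partial}{\partial x_l}\left[ \mu \left( \frac{\partial u_i}{\partial x_l} + \frac{\partial u_l}{\partial x_i} \right) \right] \right\}, \tag{1}$$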

where t is the time, i and l are the Cartesian coordinate indices, and $u_i$, $u_l$, $x_i$, and $x_l$ are the Cartesian components of a displacement vector, u(r), and a position vector, r, respectively. The
symbols λ(r), μ(r), and ρ(r) represent spatially varying material
properties corresponding to the first Lamé coefficient, second Lamé
coefficient, and density, respectively. The Lamé coefficient mechan-
ical property set can be transformed into the commonly used
mechanical property set of elastic modulus, E, and Poisson’s ratio,
ν, via the following relations:

$$\lambda = \frac{E\nu}{(1+\nu)(1-2\nu)}, \tag{2}$$

$$\mu = \frac{E}{2(1+\nu)}. \tag{3}$$
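As a quick numerical check of Eqs. (2) and (3), the conversion from elastic modulus and Poisson's ratio to the Lamé coefficients can be sketched in a few lines. The function name `lame_from_engineering` and the example inputs are illustrative assumptions and are not taken from the paper's own scripts.

```python
def lame_from_engineering(E, nu):
    """Return (lambda, mu) from elastic modulus E and Poisson's ratio nu, per Eqs. (2) and (3)."""
    # Both relations diverge or change sign outside this range for an isotropic solid.
    if not -1.0 < nu < 0.5:
        raise ValueError("Poisson's ratio must lie in the open interval (-1, 0.5)")
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))  # first Lame coefficient, Eq. (2)
    mu = E / (2.0 * (1.0 + nu))                     # second Lame coefficient, Eq. (3)
    return lam, mu


# Illustrative inputs: E = 1 GPa with the Poisson's ratio of 0.33 used in this study.
lam, mu = lame_from_engineering(1.0e9, 0.33)
print(f"lambda = {lam:.4e} Pa, mu = {mu:.4e} Pa")
```

Inverting the pair with the standard identity E = μ(3λ + 2μ)/(λ + μ) should return the original modulus, which is a convenient sanity test for any implementation.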

Since a phononic crystal has a periodic structure, Bloch’s


theorem tells us that the eigensolutions of the wave equation are
modulated sinusoids of the form

$$\mathbf{u}(\mathbf{r}) = \mathbf{u}_{\mathbf{K}}(\mathbf{r})\, e^{j\mathbf{K}\cdot\mathbf{r}}, \tag{4}$$

where K is a phonon wave vector, u_K(r) is a function with the same periodicity as the phononic crystal, and j is the imaginary unit. Each phonon wave vector is an eigenvector, and the corresponding frequencies are the eigenvalues. The phononic band diagram is a graph of these eigenvectors and eigenvalues (Fig. 1).

FIG. 1. (a) Schematic of a two-dimensional phononic crystal with a square lattice. The phononic crystal consists of a host material with cylindrical inclusions. The cylindrical inclusions are infinitely long into and out of the plane of the paper. (b) Schematic of the phononic crystal’s unit cell in real space with relevant parameters labeled: lattice constant, a, inclusion diameter, d, inclusion material (blue), and host material (white). (c) Schematic illustrating the key symmetry points within the reciprocal space unit cell (e.g., first Brillouin zone). (d) Phononic band diagram of a two-dimensional square lattice with its phononic bandgap highlighted. This particular phononic crystal in part (d) has the following parameters: Ehost = 1 GPa, ρhost = 2000 kg/m³, Einclusion = 1000 GPa, ρinclusion = 4000 kg/m³, and d/a = 0.9. (e) Phononic band diagram of a two-dimensional square lattice that does not have a phononic bandgap. Despite their periodicity, many phononic crystals do not produce a bandgap. This particular phononic crystal in part (e) has the following parameters: Ehost = 1 GPa, ρhost = 8000 kg/m³, Einclusion = 100 GPa, ρinclusion = 500 kg/m³, and d/a = 0.9.

Phononic band diagrams have historically been calculated using a variety of computational methods. In 1992, Sigalas and Economou13 used the plane wave expansion method to calculate the phononic band diagram for periodic spherical inclusions inside a homogeneous material. In 2000, Tanaka et al.14 used the finite-difference time-domain method to calculate the phononic band diagram for two-dimensional phononic crystals. One drawback of the plane wave expansion and finite-difference time-domain methods is that they are computationally expensive. Recognizing the need for faster algorithms, researchers have recently utilized


reduced-order models to more quickly compute phononic band diagrams.15–17 Recently, commercial finite element method software packages (e.g., COMSOL) have also been used for phonon bandgap modeling of complicated structures with inclusion materials of various sizes and shapes.18 Although rigorous calculations like these are the best way to accurately determine the band diagram for a given phononic crystal structure, these approaches are computationally expensive and time-consuming. Furthermore, there are an infinite number of structural possibilities for phononic crystals and exploring a broad range of parameters can be impractical. In addition, the inverse process of designing a phononic crystal to yield a desired bandgap center frequency and width is an even more challenging task.
In order to facilitate phononic crystal design, researchers have relied on simplified cases such as one-dimensional crystals19–21 or by considering only longitudinal waves for two-dimensional phononic crystals.22,23 Both of these simplified approaches result in an easily solved equation with a derivative that can be used to optimize the bandgap characteristics. Researchers have also used two-dimensional phononic crystal design approaches that incorporate both longitudinal and transverse modes and are thus more accurate.24–27 These studies combine band diagram calculations with an iterative process that steers the algorithm toward a phononic crystal with the desired properties. Another phononic crystal design approach is topology optimization, which simplifies the design process by focusing only on the unit cell shape and not the component materials themselves.20,22,24–26,28–30 Using this approach, researchers successfully created phononic crystals with very wide bandgaps.29,30 However, one drawback that all of the works in this paragraph have is that they still rely upon rigorous calculations of the phononic band diagram and are hence computationally expensive and time-consuming.28,31 Consequently, computationally inexpensive methods, even if less accurate, would benefit the field by speeding up the phononic crystal search process. The task of rigorously calculating the band diagram could then be reserved for only the most promising phononic crystal structures.
In this work, we propose to use machine learning as a fundamentally different and potentially efficient approach to determining the bandgap characteristics of a given phononic crystal structure. During machine learning, computer systems build their own mathematical models to perform a specific task such as classification, prediction, and/or decision-making. No explicit instructions are given to the computer system as it creates the mathematical model. The computer system instead creates the model using algorithms, statistical analysis, and training data. In recent years, machine learning has been employed in several areas, including advanced object detection and classification,32–34 natural language processing,35–37 optimization in strategy games,38–41 cancer detection,42,43 optical device design,44,45 metamaterial structure optimization,46 and thermoelectric materials.47 During the past year, researchers have started using machine learning for phononics by applying artificial neural network-based approaches to design one- and two-dimensional phononic crystals.48,49
To further explore the potential for machine learning in phononic crystal property discovery, we test the use of three different machine learning algorithms (linear/logistic regression, artificial neural network, and random forests) and report their corresponding performances. We test these algorithms in three different ways. We first test these algorithms with respect to “classification,” which corresponds to a simple binary yes/no output as to whether or not a bandgap exists for a given phononic crystal structure. Next, we test these algorithms with respect to “prediction,” which yields a quantitative output for the bandgap center frequency and bandgap width. Finally, we test the prediction capabilities of these algorithms for a hypothetical special case. This special case corresponds to the situation where it is known a priori that a bandgap exists but that the bandgap center frequency and width are unknown. We find that out of the investigated machine learning algorithms, the random forests model performs the best. For classification, this model achieves an accuracy, precision, and recall of 0.94, 0.89, and 0.77, respectively. When predicting bandgap center frequency and width, the random forests model achieves coefficients of determination of 0.66 and 0.85, respectively. If the machine learning model has a priori knowledge that a bandgap exists, these center and width coefficients of determination improve to 0.97 and 0.85, respectively. These results demonstrate the potential for machine learning algorithms as rapid screening tools for expedited discovery of phononic crystal properties.

METHODOLOGY
We studied the machine learning potential for phononic crystal discovery by applying three different algorithms: linear/logistic regression, artificial neural network, and random forests. Linear/logistic regressions are suitable for linear systems, whereas artificial neural networks and random forests are more suitable for nonlinear systems. We used the scikit-learn package to conduct the linear/logistic regression and random forests models. TensorFlow was used to conduct the artificial neural network model. Within the machine learning community, inputs are often referred to as “features” and outputs are often referred to as “labels.” These terms of input or feature and output or label will be used interchangeably throughout this paper.
Detailed descriptions of linear/logistic regression, artificial neural network, and random forest machine learning approaches can be found in Refs. 37, 41, 50, and 51. In brief, linear/logistic regression approaches perform a least-squares fit on the data to yield a simple, linear, and easily interpretable equation that describes the machine learning predictions.50 The output for logistic regression is a binary value (e.g., yes or no), and hence we use this analysis for our classification tests. The output for linear regression is a continuously variable value, and hence we use this analysis for our prediction tests. Artificial neural networks attempt to mimic the neural network in biological brains. In this approach, layers of connected nodes (or “artificial neurons”) process incoming data and then feed their output to subsequent layers of artificial neurons. As the artificial neural network is trained, the network weighting at each node evolves and changes to minimize error.37,41 The random forests method uses a collection of decision trees to determine the final output. In this method, each decision tree operates on a subset of data inputs and generates a corresponding output vote. The final output corresponds to the output that received the most votes among the ensemble of decision trees.51
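The voting idea behind random forests can be illustrated with a toy ensemble. The sketch below is not the scikit-learn model used in this work: the three one-split "trees", their thresholds, and the feature names (`E_inclusion`, `rho_inclusion`, `d_over_a`) are hypothetical and chosen only to show how individual votes are combined into one output.

```python
from collections import Counter

# Hypothetical one-split "trees". Each looks at a single feature; the thresholds
# are made up for illustration and are not fitted to any data.
def tree_density(x):
    return 1 if x["rho_inclusion"] >= 4000 else 0   # kg/m^3

def tree_stiffness(x):
    return 1 if x["E_inclusion"] >= 300 else 0      # GPa

def tree_fill(x):
    return 1 if x["d_over_a"] >= 0.7 else 0         # diameter-to-lattice ratio

TREES = [tree_density, tree_stiffness, tree_fill]

def forest_predict(x, trees=TREES):
    """Collect one vote per tree (1 = bandgap, 0 = no bandgap) and return the majority label."""
    votes = [tree(x) for tree in trees]
    label, _count = Counter(votes).most_common(1)[0]
    return label, votes

# Two illustrative inputs (inclusion properties only).
stiff_dense = {"E_inclusion": 1000, "rho_inclusion": 4000, "d_over_a": 0.9}
soft_light = {"E_inclusion": 100, "rho_inclusion": 500, "d_over_a": 0.9}

print(forest_predict(stiff_dense))
print(forest_predict(soft_light))
```

In a trained forest each tree is grown on a resampled subset of the training data and has many splits. As far as we are aware, scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than by counting hard votes, so the toy above is a conceptual picture only.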


For the purposes of providing training data for the machine learning algorithms, we used COMSOL to calculate the band diagram of 14 112 phononic crystal structures. The most basic description of a phononic crystal is its number of components, overall dimensionality, and unit cell symmetry. For the purposes of simplicity and efficiency in creating the training data, we focus on two-dimensional phononic crystals with square symmetry that consist of two components (i.e., cylindrical inclusions embedded in a host matrix). This type of phononic crystal is schematically illustrated in Fig. 1(a). To further reduce the design space of possible phononic crystal structures, we fix the unit cell length to 10 nm, the host Poisson’s ratio to be 0.33, and inclusion Poisson’s ratio to be 0.33 (our prior work shows that Poisson’s ratio has a minimal effect on bandgap formation).52 We then calculated the phononic band diagrams for varying structures within this parameter space. We varied host and inclusion elastic moduli values throughout a range representative of polymer to diamond (1, 3, 10, 30, 100, 300, and 1000 GPa). We also varied host and inclusion density values throughout a range representative of wood to heavy metals such as tantalum (500, 1000, 2000, 4000, 8000, 16 000 kg/m³). Finally, we varied the diameter of the cylindrical inclusions to be 2, 3, 4, 5, 6, 7, 8, and 9 nm (i.e., diameter-to-lattice constant ratios of 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9). Collectively, these calculations yield a total of 14 112 pieces of sample data (phononic band diagrams) that we refer to as our raw data.
Each of these 14 112 pieces of sample data includes five feature values (host elastic modulus, Ehost, inclusion elastic modulus, Einclusion, host density, ρhost, inclusion density, ρinclusion, and cylinder diameter-to-lattice constant ratio, d/a). We then created three label values for each band diagram. The first label is either 0 (bandgap does not exist) or 1 (bandgap exists), and this is the output for the classification analysis. The other two labels represent the bandgap center frequency and bandgap width. These labels were created for each of the 14 112 phononic band diagrams by using a scripted search algorithm that identified whether or not a bandgap exists and what are the corresponding bandgap center frequency and bandgap width. When no bandgap exists, the search algorithm returned values of 0 for both bandgap center frequency and bandgap width. We limited the search algorithm to bandgaps that existed within the first eight phonon branches and to those that are 5 GHz or wider. We chose this minimum bandgap width of 5 GHz because the median bandgap center frequency was 178 GHz and so bandgaps narrower than 5 GHz are of little use.
In order to evaluate the performance of each machine learning algorithm, we split the 14 112 pieces of sample data into “training” and “test” sets. The training dataset is used to train the algorithm so that it can create a mathematical model that relates the data features to the data labels. The test dataset is unseen by the algorithm during the training process and then is used after training to evaluate the model’s performance. During our training and test process, we normalized all features and labels to yield values less than 1. This normalization process helps the machine learning algorithms converge more easily. In the case of artificial neural networks, normalization also helps prevent the formation of large gradients that can inhibit convergence. To more clearly illustrate the machine learning results, these normalized values are converted back into absolute values when presenting results throughout this paper.
To account for bias in the random sampling of the training and test datasets, we used a k-fold cross-validation scheme. In this scheme, the data are partitioned into k bags, where k − 1 bags are used for training and 1 bag is used for testing the model performance. This process is repeated k times for different combinations of training and test data. The mean, standard deviation, and the coefficient of determination (R²) are then calculated to describe model accuracy.53 For our k-fold cross-validation scheme, we used a value of k = 10 (we note this k is not to be confused with K, which we use to represent wave vector). This means that we trained our models using 90% of our sample data and tested them using 10% of the sample data. We then repeated this process a total of ten times, wherein for each repetition a different random subset of our sample data was used for the training and test datasets.
We also use precision and recall to gauge the performance of the machine learning classification tests. These metrics are important for situations in which the raw data are highly imbalanced, and this is the case in this paper. Of the 14 112 calculated band diagrams in this work, 17% had bandgaps (i.e., 2373 band diagrams) and 83% did not have bandgaps. In this situation, a machine learning model that always predicts no bandgap would technically be 83% accurate but does not have any real predictive power. Precision and recall are performance metrics that are less affected by this imbalance. Precision gauges what fraction of the selected items is relevant. High precision means that the algorithm selects many more relevant results than irrelevant results. With respect to this paper, precision tells us that out of all the phononic crystals predicted to have a bandgap, what fraction actually do have a bandgap [i.e., precision = true positive/(true positive + false positive)]. Recall gauges what fraction of the actual relevant items was selected. High recall means that most of the relevant results are successfully selected. With respect to this paper, recall tells us that out of all the phononic crystals that actually have a bandgap, what fraction was correctly identified [i.e., recall = true positive/(true positive + false negative)].

RESULTS
Figure 2 illustrates the distribution characteristics of the raw data (i.e., 14 112 phononic band diagrams calculated by COMSOL). These distributions provide useful context that can help assess machine learning algorithm performance. More specifically, the figure shows how each of the five features affects the probability of forming a bandgap in a given phononic crystal structure. For example, Fig. 2(d) illustrates the effect of inclusion density on the likelihood of phononic bandgap appearance. This figure separates the 14 112 pieces of raw data into their corresponding bins for each inclusion density value. Inspecting this figure, it is clear that a large majority of the phononic crystal structures do not possess a bandgap (yellow portion of columns). It is also apparent that increasing the inclusion material density increases the likelihood of bandgap formation.
A common rule of thumb for predicting whether a bandgap exists in phononic crystals is that the inclusion material should be dense and stiff.54–56 This rule is captured within Fig. 2 as it can be seen that larger inclusion elastic moduli and larger inclusion density [Figs. 2(b) and 2(d), respectively] yield a higher probability for bandgap formation. Figure 2(e) illustrates the effect of the inclusion diameter-to-lattice constant ratio, which is effectively inclusion volume fraction. These data distributions show that large inclusion volume fractions are favorable for bandgap formation; however, this effect is not as pronounced as the effect of inclusion density.54–56
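The label-generation step described in the Methodology (a scripted search for a bandgap of at least 5 GHz within the first eight phonon branches, returning zeros when none exists) can be sketched as follows. The paper does not give the script itself, so this is an assumption-laden illustration: the function name `find_bandgap`, the `bands[n][k]` data layout, and the choice to report the widest qualifying gap are all hypothetical.

```python
def find_bandgap(bands, min_width=5.0, max_branches=8):
    """Search for a complete gap between consecutive branches.

    bands[n][k] is the frequency (GHz) of branch n at sampled wave vector k, with
    branches ordered from lowest to highest frequency at every k (assumed layout).
    Returns (has_gap, center, width), or (0, 0.0, 0.0) if no gap qualifies.
    """
    best = (0, 0.0, 0.0)
    branches = bands[:max_branches]
    for lower, upper in zip(branches, branches[1:]):
        top = max(lower)      # highest frequency reached by the lower branch
        bottom = min(upper)   # lowest frequency reached by the upper branch
        width = bottom - top
        # Assumption: when several gaps qualify, keep the widest one.
        if width >= min_width and width > best[2]:
            best = (1, 0.5 * (top + bottom), width)
    return best


# Made-up band data (three branches, three k-points), in GHz.
with_gap = [[0.0, 50.0, 80.0], [60.0, 90.0, 100.0], [130.0, 150.0, 140.0]]
without_gap = [[0.0, 50.0, 80.0], [60.0, 90.0, 100.0], [95.0, 120.0, 140.0]]
too_narrow = [[0.0, 10.0, 20.0], [23.0, 30.0, 40.0]]

print(find_bandgap(with_gap))
print(find_bandgap(without_gap))
print(find_bandgap(too_narrow))
```

The three returned values map directly onto the three labels used in this work: the binary classification label, the bandgap center frequency, and the bandgap width.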

CLASSIFICATION RESULTS
We first examine the ability of the machine learning algo-
rithms to make a binary yes/no prediction as to whether or not a
phononic crystal structure possesses a bandgap. Table I summarizes
the results for these classification tests and shows that the random
forests model yielded the best results.
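The evaluation protocol behind these results, a full parameter sweep split by tenfold cross-validation, can be sketched with the standard library alone. The parameter values are those listed in the Methodology; the helper name `k_fold_indices`, the shuffling seed, and the strided fold assignment are illustrative assumptions rather than the authors' implementation.

```python
import itertools
import random

E_VALUES = [1, 3, 10, 30, 100, 300, 1000]             # elastic modulus, GPa
RHO_VALUES = [500, 1000, 2000, 4000, 8000, 16000]     # density, kg/m^3
RATIOS = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]     # d/a

# Every combination of (E_host, E_inclusion, rho_host, rho_inclusion, d/a).
grid = list(itertools.product(E_VALUES, E_VALUES, RHO_VALUES, RHO_VALUES, RATIOS))

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized bags
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

splits = list(k_fold_indices(len(grid), k=10))
print(len(grid), "structures;", len(splits), "train/test splits")
```

The size of this grid, 7 × 7 × 6 × 6 × 8, reproduces the 14 112 structures reported in the paper, and each split trains on roughly 90% of the samples while testing on the remaining 10%.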
The logistic regression model had an accuracy, precision, and
recall of 0.84, 0.57, and 0.16, respectively. Although this accuracy
appears high, such an interpretation would be misleading due to
the characteristics of the raw data (i.e., 83% of the raw data did not
have bandgaps and so accuracies near this value are not necessarily
highly predictive). While these performance metrics are not great,
they do successfully demonstrate some predictive value. For
example, the precision is 0.57, which means that a phononic crystal positively identified by the logistic regression model has a 0.57 chance of actually having a bandgap. In contrast, a random selection out of the 14 112 samples has only a 0.17 chance of having a bandgap.
Consequently, using the logistic regression model as a rapid screen-
ing tool can significantly improve the chances of finding a pho-
nonic crystal with a bandgap by a factor of 0.57/0.17 ≈ 3.4.
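The arithmetic behind these statements can be reproduced from confusion-matrix counts. The sketch below uses the dataset totals quoted in this paper (14 112 samples, 2373 with bandgaps); the function name `classification_metrics` and the convention of returning 0 when a denominator is empty are assumptions made for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # no positives predicted -> report 0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


n_total, n_gap = 14112, 2373

# A trivial model that always answers "no bandgap" never produces a positive.
baseline = classification_metrics(tp=0, fp=0, fn=n_gap, tn=n_total - n_gap)
print("always-negative baseline (accuracy, precision, recall):", baseline)

# Screening gain of logistic regression: precision relative to the rounded
# 0.17 chance of a random pick having a bandgap.
screening_gain = 0.57 / 0.17
print("screening gain:", round(screening_gain, 1))
```

The baseline reaches an accuracy of about 0.83 with zero recall, which is why accuracy alone is a poor guide for this imbalanced dataset.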
It is not surprising that the logistic regression model yielded
an overall poor performance because this model works best for
linear systems. The output of the logistic regression model is a simple linear equation and a corresponding decision function (see the supplementary material). In reality, predicting whether a phononic bandgap exists is a complex problem that cannot be adequately captured with linear equations. If a simple linear equation
could make these predictions accurately, then identifying phononic
bandgaps would likely have been a research problem that was
solved long ago.
The performance of the artificial neural network was superior to logistic regression for all three performance metrics. The artificial neural network had an accuracy, precision, and recall of 0.87, 0.65, and 0.33, respectively. This improvement is not surprising since artificial neural networks are better at approaching nonlinear problems. The precision value of 0.65 means that using this model as a rapid screening tool can improve the chances of finding a phononic crystal with a bandgap by a factor of 3.8 relative to a random selection. The artificial neural network’s recall value of 0.33 is approximately double that of the logistic regression model. This means that the artificial neural network correctly identified approximately one third of the phononic crystal structures that actually have bandgaps.

TABLE I. Accuracy, precision, and recall for identifying phononic crystals with bandgaps using logistic regression, artificial neural network, and random forests machine learning models. Each model was implemented ten times with random variations of the training and test dataset. The uncertainty bars reflect ±1 standard deviations of these ten implementations.

Performance metric    Logistic regression    Artificial neural network    Random forests
Accuracy              0.84 ± 0.01            0.87 ± 0.06                  0.94 ± 0.01
Precision             0.57 ± 0.04            0.65 ± 0.06                  0.89 ± 0.02
Recall                0.16 ± 0.02            0.33 ± 0.04                  0.77 ± 0.02

FIG. 2. Bar graphs illustrating the likelihood of phonon bandgap formation within the 14 112 pieces of raw data. The number of phononic crystals with bandgaps is shown in blue, and the number of phononic crystals without bandgaps is shown in yellow. Each part of this figure utilizes the same data (i.e., the sum of the bars equals 14 112 samples) but uses a different feature on the x axis. The likelihood of phonon bandgap formation is shown for (a) varying host elastic modulus, (b) varying inclusion elastic modulus, (c) varying host density, (d) varying inclusion density, and (e) varying ratio of inclusion diameter-to-lattice constant.

The random forests model yielded the best performance of all
the algorithms. It had an accuracy, precision, and recall of 0.94,
0.89, and 0.77, respectively. These metrics indicate a high degree of
confidence for the random forest model’s ability to identify pho-
nonic crystals with bandgaps. As mentioned earlier, 83% of the
phononic crystal samples do not have bandgaps and so accuracies
near this value are not necessarily highly predictive (which is the
case for the logistic regression and artificial neural network
models). In contrast, the accuracy of the random forests model is
significantly higher at a value of 94%. Furthermore, the precision
value of 0.89 means that nearly 9 out of 10 selected phononic crys-
tals actually have a bandgap. Consequently, using the random
forests model as a rapid screening tool represents a huge improve-
ment over a random phononic crystal selection by a factor of 5.2.
The recall of 0.77 means that the random forests model correctly identified approximately three quarters of the phononic crystals that actually have bandgaps.
Although the results of the random forests model are encouraging, it is important to acknowledge the limitations of the machine learning approach. One limitation is that machine learning requires a large amount of training data and generating this training data requires calculations of the actual phononic band diagrams (which is in some sense what this machine learning approach is trying to avoid). In this study, we used finite element software to calculate the band diagrams for 14 112 two-dimensional phononic crystals. A reasonable question to ask is how performance would be affected by reduced quantities of training data.
Figure 3 illustrates the trade-off between training dataset size and performance of the random forest model. This figure displays accuracy, precision, and recall as the training dataset is varied from 500 to 11 000 samples. As the training dataset size is increased, the model performance increases in an asymptotic-like manner. Most of the performance gains are achieved by the time the training data set reaches 5000 samples. However, even for a small training dataset of 500 samples, the precision is already approximately 0.70. Remarkably, this precision value of 0.70 obtained with 500 samples is already better than the results of the logistic regression and artificial neural network models that were trained with the entire dataset (Table I). This once again reinforces the superiority of the random forest model for addressing this problem. The uncertainty bars in Fig. 3 represent ±1 standard deviation on the performance metrics for our tenfold cross-validation scheme (i.e., the standard deviation of ten different training and test sets). It can be seen that these uncertainty bars decrease as the training dataset is increased, which means that the performance consistency of the random forests model also increases with training dataset size.

FIG. 3. Accuracy, precision, and recall values for the random forests model as a function of the training dataset size. Each data point represents the average of ten different model implementations with random variations of the training and test dataset. The uncertainty bars reflect ±1 standard deviations of these ten implementations.

Based on these classification results, we conclude that machine learning algorithms are a powerful tool that can augment phononic crystal property discovery. We also acknowledge that machine learning is by no means a complete replacement for rigorous band diagram calculations. The first and most obvious reason for this is that machine learning relies on these calculations to provide training data. Furthermore, even if a training set already exists, rigorous band diagram calculations will still be needed to confirm the predictions of machine learning algorithms. For example, while the precision of the random forests model is a high 89%, this is still less than 100%. Consequently, we cannot be fully confident that an identified phononic crystal will in fact have a phononic bandgap. The precision of a rigorous band diagram calculation is effectively 100% and this should be done to confirm machine learning predictions. However, the value of the machine learning prediction is that it greatly expedites the speed at which phononic crystal properties can be determined. Our sample set size of 14 112 calculated phononic band diagrams is incredibly small in comparison to the infinite number of possible phononic crystal structures. Once a training dataset is established, a machine learning model can be used as a rapid screening tool to greatly increase the probability of finding a phononic crystal with a bandgap.
Finally, we note that our total data set size of 14 112 samples is not particularly large when compared to other machine learning studies. One main reason that we are able to get good results with this relatively small dataset size is that our study only focused on a simplified subset of the phononic crystal parameter space. Every sample in this study consisted of just two materials arranged in a two-dimensional square lattice of cylinders embedded in a host

material. A more generalized parameter space would allow for greater flexibility such as more than two materials, arbitrary shapes other than cylinders, and different unit cell symmetries. This larger parameter space would likely require a much larger dataset size to obtain the performance benchmarks achieved in this paper. It is also worth noting that the performance of most traditional machine learning algorithms (e.g., regression and random forests) scales differently with dataset size than that of artificial neural networks. The performance of traditional machine learning algorithms tends to plateau at a certain threshold dataset size; however, the performance of artificial neural networks can continue to improve as the dataset size is increased beyond that threshold.57 So while our random forests algorithm performed better than our artificial neural network algorithm at a dataset size of 14 112 samples, it is possible that artificial neural network algorithms could consistently outperform random forests at much larger dataset sizes.

PREDICTION RESULTS
We next examine the ability of the machine learning algorithms to quantitatively predict the bandgap center frequency and bandgap width of a given phononic crystal structure. During this test, phononic crystals that do not have a bandgap were labeled as having center frequencies of 0 and widths of 0. Each machine learning model was tested using a tenfold cross-validation scheme.

Figure 4 illustrates the prediction results by graphing the predicted bandgap center frequencies and widths vs the actual bandgap center frequencies and widths. Perfect predictions would fall directly along the diagonal red line shown in each figure. The performance of these tests is captured via their coefficient of determination (R²) values. Of the three different algorithms, the random forest model yielded the best results. The coefficient of determination values exhibited some variation across each of the ten implementations, and this variation is reflected via the uncertainties listed in each figure.

The bandgap center frequency and width predictions progressively improve for the linear regression, artificial neural network, and random forests models, respectively. The linear regression model did the worst with R² values of 0.11 and 0.12 for the center frequency and width, respectively. As in the case of classification, this poor performance likely results from the fact that determining band diagram characteristics is a complex nonlinear problem. Artificial neural networks are better at addressing these types of problems, and this is reflected via improved R² values of 0.61 and 0.62 for the center frequency and width, respectively. The random forest model performed the best and had R² values of 0.66 and 0.85 for the bandgap center frequency and width, respectively.
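
As a point of reference for the R² values quoted above, the following is a minimal sketch of how a coefficient of determination can be computed, assuming the standard definition R² = 1 − SS_res/SS_tot. The function name `r_squared` and the toy labels are illustrative assumptions and are not taken from the authors' code or data.

```python
from statistics import fmean


def r_squared(actual, predicted):
    """Coefficient of determination, assuming the standard 1 - SS_res / SS_tot form."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = fmean(actual)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviation of the labels from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Made-up labels: three structures without a bandgap (labeled 0) and two with one.
actual_width = [0.0, 0.0, 0.0, 2.0, 4.0]
perfect = list(actual_width)
always_mean = [fmean(actual_width)] * len(actual_width)

print(r_squared(actual_width, perfect))      # every prediction matches its label
print(r_squared(actual_width, always_mean))  # predictor that always returns the label mean
```

Under this definition, a model that always predicts the mean label scores zero, so the reported values of 0.11 and 0.12 for linear regression sit only slightly above that baseline.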

FIG. 4. Prediction results for bandgap centers [(a), (c), and (e)] and widths [(b), (d), and (f )] via the linear regression [(a) and (b)], artificial neural network [(c) and (d)],
and random forests [(e) and (f )] machine learning models. The diagonal red line indicates perfect prediction and the vertical distance from this line indicates the error of
the prediction. The coefficients of determination for each case are shown directly on the plots. The uncertainties on the coefficient of determination represent ±1 standard
deviation over ten different implementations of the model with randomized training and test sets. Note that there are many points directly on the y axis that are not
visible (i.e., phononic crystals that were predicted to have a bandgap, but in reality, do not). In a similar fashion, there are data points directly on the x axis that are not
visible as well.
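
The ±1 standard deviation uncertainties described in this caption come from repeating the train/test procedure ten times. A minimal sketch of one way to organize such a tenfold scheme is given below. It assumes a strict partition into ten disjoint test folds and the sample standard deviation, neither of which is confirmed by the text; `kfold_indices`, `cross_validate`, and `dummy_score` are hypothetical names, and the scorer is a placeholder rather than a real model.

```python
import random
from statistics import fmean, stdev


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Return (mean, sample standard deviation) of the score over k train/test splits."""
    scores = []
    for test_idx in kfold_indices(n_samples, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        scores.append(fit_and_score(train_idx, test_idx))
    return fmean(scores), stdev(scores)


# Placeholder scorer (hypothetical): reports the fraction of samples held out for testing.
# A real scorer would train a model on train_idx and return, e.g., R^2 on test_idx.
def dummy_score(train_idx, test_idx):
    return len(test_idx) / (len(train_idx) + len(test_idx))


mean_score, spread = cross_validate(14112, dummy_score)
print(f"{mean_score:.3f} +/- {spread:.3f}")
```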

For all three models, the predicted center frequency and width exhibit a bias toward values that are smaller than the actual center frequency and width. This bias can be seen in Fig. 4 via the fact that the majority of the data points fall below the red line representing perfect prediction. This underprediction bias is caused by the fact that the vast majority of the phononic crystal structures do not have bandgaps (i.e., 83%). As mentioned earlier, during training, the phononic crystal structures without bandgaps were labeled with bandgap center and width values of 0 GHz. Having 83% of their training data labeled with values of 0 GHz caused the algorithms to become biased toward smaller values. This underprediction bias could likely be solved by separating the prediction problem into two separate steps, and we explore this possibility later in this paper (see the section titled “Prediction Results with A Priori Knowledge of Bandgap Presence”).

Figure 5 illustrates the trade-off between training dataset size and performance of the random forest model during prediction of the bandgap center frequency and width. Figure 5 shows that most of the performance gains are achieved by the time the training dataset reaches approximately 4000 samples. For a small training dataset of 500 samples, the coefficients of determination are approximately half of the values obtained with the full training dataset. Hence, even these small training set sizes can provide some predictive value.

FIG. 5. Coefficient of determination results during the random forests prediction of the bandgap center and width as a function of training dataset size. Each data point represents the average of ten different model implementations with random variations of the training and test dataset. The uncertainty bars reflect ±1 standard deviation of these ten implementations.

PREDICTION RESULTS WITH A PRIORI KNOWLEDGE OF BANDGAP PRESENCE
Determining the bandgap center frequency and width of a phononic crystal is a two-step process. The first step is to ask, “Does a bandgap exist?” The answer to this first step is a binary yes or no value and was addressed in our classification tests. If the answer to this first step is “no,” then proceeding to the second step is pointless. The second step is to ask, “If there is a bandgap, then what is the center frequency and width?” The answer to this second step is a positive and continuously variable number and was addressed in our prediction tests. However, the prediction tests in the prior section were forced to address both of these questions simultaneously, and this resulted in an underprediction bias for bandgap center and width. This underprediction bias arose because approximately 83% of the training dataset was populated with phononic crystals that had a center frequency and width of “0” (i.e., no bandgap). In a sense, the training and test data were polluted with a large quantity of irrelevant data.

In this section, we examine if and how prediction results would improve if the machine learning algorithms were trained and tested with only phononic crystals that actually have bandgaps (i.e., a priori knowledge that the phononic crystal has a bandgap). In practice, this type of situation is unlikely to occur during phononic crystal property discovery. This is because users would not know if the phononic structures they are entering into the machine learning algorithm actually have bandgaps. Nonetheless, this line of inquiry is worth exploring as an academic exercise. To examine this scenario, we trained and tested the machine learning algorithms using only the 2373 phononic crystal structures that had bandgaps (as opposed to all 14 112 phononic crystal structures). As in our earlier tests, we used a tenfold cross-validation scheme during model implementation, which means we tested each model ten times with random variations on the test and training dataset.

Figure 6 shows the prediction results for the machine learning algorithms that were trained and tested using only phononic crystals with bandgaps. Most notable in this figure is that the underprediction bias observed in Fig. 4 is no longer observed in Fig. 6. The data in Fig. 6 are much more symmetric around the perfect prediction line (i.e., diagonal red line). This supports our earlier explanation that the underprediction bias in Fig. 4 originates from the abundance of phononic crystals with no bandgap in the training and test datasets.

Another notable result in Fig. 6 is that the R² values for the center frequency are significantly improved relative to Fig. 4. The R² values for the artificial neural network and random forests are 0.90 and 0.97, which indicate a high degree of confidence in these center frequency predictions. It is worth acknowledging that the R² values for the bandgap width do not appear to have improved between Figs. 4 and 6. The fact that we achieved substantial improvements in center frequency prediction and no observable improvement in width is not entirely surprising. The effect of phononic crystal structure on center frequency is better understood than that on bandgap width.13,54–56,58 It is well known that stiffer and lighter materials lead to increased center frequency and it should be easy for machine learning algorithms to learn this trend. In contrast, the bandgap width is much less predictable because it depends on the relative curvature of all the phonon branches within the band structure.

Finally, it is worth noting and reinforcing that the generally improved results of Fig. 6 were achieved with less training data than that of Fig. 4. The studies in Fig. 4 used all 14 112 calculated phononic band diagrams, whereas the studies in Fig. 6 used only 2373 band diagrams. Achieving better predictions with less data is clearly desirable, and this points to the importance of training data quality. The training data for the studies in Fig. 6 were not contaminated with phononic crystals that lacked a bandgap. Hence, these model implementations were able to learn both better and faster.
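
The two-step framing in this section (first decide whether a bandgap exists, then estimate its center frequency and width) can be sketched as follows. This is only an illustration under stated assumptions: `two_step_predict`, `bandgap_only`, the toy models, and the `contrast` feature are all hypothetical, and the paper itself supplied the bandgap knowledge a priori rather than chaining a classifier in front of the regressor.

```python
def two_step_predict(features, has_bandgap, predict_center_width):
    """Step 1: classify. Step 2: regress only when a bandgap is predicted."""
    if not has_bandgap(features):
        return None  # no bandgap predicted, so center and width are not meaningful
    return predict_center_width(features)


def bandgap_only(samples):
    """Keep only training samples whose labeled bandgap width is positive."""
    return [s for s in samples if s["width"] > 0]


# Hypothetical stand-in models and made-up samples, for illustration only.
def toy_classifier(features):
    return features["contrast"] > 1.0


def toy_regressor(features):
    return (2.0 * features["contrast"], 0.5 * features["contrast"])


samples = [
    {"contrast": 0.5, "center": 0.0, "width": 0.0},  # no bandgap, labeled with zeros
    {"contrast": 3.0, "center": 6.0, "width": 1.5},  # has a bandgap
]

print(bandgap_only(samples))
print(two_step_predict({"contrast": 0.5}, toy_classifier, toy_regressor))
print(two_step_predict({"contrast": 3.0}, toy_classifier, toy_regressor))
```

Filtering with something like `bandgap_only` before fitting the regressor mirrors the restriction to the 2373 structures with bandgaps, so the zero-valued labels no longer pull predictions toward smaller values.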

FIG. 6. Prediction results for bandgap centers [(a), (c), and (e)] and widths [(b), (d), and (f )] via the linear regression [(a) and (b)], artificial neural network [(c) and (d)],
and random forests [(e) and (f )] machine learning models. Only phononic crystals with bandgaps were used during training and testing of these machine learning models.

We still expect there to be a trade-off between training dataset size and performance of the model, and this is shown in Fig. 7. This figure shows that high quality predictions of bandgap center with R² > 0.9 can be achieved with just 600 training samples. Even if only 100 training samples are used, the R² for bandgap center is approximately 0.7, which is equivalent to the accuracy obtained when using all 14 112 phononic band diagrams for training and testing (Fig. 4). The slope of the R² curve for bandgap width in Fig. 7 is steeper than that in Fig. 5 and demonstrates that the algorithm learns more quickly when all of the phononic crystal structures have bandgaps.

FIG. 7. Coefficient of determination results during the random forests prediction of the bandgap center and width as a function of training dataset size. Only phononic crystals with bandgaps were used during training and testing of this model.

CONCLUSION
This paper demonstrates the potential for machine learning models as rapid screening tools for the expedited discovery of phononic crystal properties. We investigated three different machine learning algorithms (logistic/linear regression, artificial neural network, and random forests) and found that random forests yielded the best performance metrics. For classification, this model achieves an accuracy, precision, and recall of 0.94, 0.89, and 0.77, respectively. When predicting bandgap center frequency and width, this model achieves R² values of 0.66 and 0.85, respectively. If the model has a priori knowledge that a bandgap exists, these R² values improve to 0.97 and 0.85, respectively. These performance metrics indicate a high degree of confidence and demonstrate the utility of machine learning for phononic crystal property discovery.

Future promising directions include more in-depth explorations of the phononic crystal parameter space such as more complex topologies, different unit cell symmetries, greater than two constituent materials, etc. In addition, machine learning performance could be potentially improved via feature engineering. For
example, rather than directly using material properties as features, non-dimensional groupings could be utilized [e.g., $E_{\mathrm{host}}/E_{\mathrm{inclusion}}$, $\rho_{\mathrm{host}}/\rho_{\mathrm{inclusion}}$, $(\sqrt{E/\rho})_{\mathrm{host}}/(\sqrt{E/\rho})_{\mathrm{inclusion}}$, etc.]. Finally, exploring the performance of various machine learning algorithms for inverse problems (i.e., inputting bandgap characteristics and outputting a particular design) would also be a worthwhile endeavor.

SUPPLEMENTARY MATERIAL
See the supplementary material for the text describing normalization of the features and labels, programming specifics, algorithm training process and checks on overfitting, output equations for logistic and linear regression, and COMSOL simulation details.

ACKNOWLEDGMENTS
This work was supported by the National Science Foundation CAREER Program through Award No. DMR-1654337.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.

REFERENCES
1 M.-H. Lu, L. Feng, and Y.-F. Chen, Mater. Today 12, 34 (2009).
2 T. Gorishnyy, C. K. Ullal, M. Maldovan, G. Fytas, and E. L. Thomas, Phys. Rev. Lett. 94, 115501 (2005).
3 T. Gorishnyy, M. Maldovan, C. Ullal, and E. Thomas, Phys. World 18, 24 (2005).
4 R. H. Olsson III and I. El-Kady, Meas. Sci. Technol. 20, 012002 (2009).
5 A. Khelif, A. Choujaa, B. Djafari-Rouhani, M. Wilm, S. Ballandras, and V. Laude, Phys. Rev. B 68, 214301 (2003).
6 S. Yang, J. H. Page, Z. Liu, M. L. Cowan, C. T. Chan, and P. Sheng, Phys. Rev. Lett. 93, 024301 (2004).
7 N. Boechler, G. Theocharis, and C. Daraio, Nat. Mater. 10, 665 (2011).
8 M. Maldovan, Phys. Rev. Lett. 110, 025902 (2013).
9 R. Martínez-Sala, J. Sancho, J. V. Sánchez, V. Gómez, J. Llinares, and F. Meseguer, Nature 378, 241 (1995).
10 M. I. Hussein, M. J. Leamy, and M. Ruzzene, Appl. Mech. Rev. 66, 040802 (2014).
11 Acoustic Metamaterials and Phononic Crystals, edited by P. A. Deymier (Springer-Verlag, Berlin, 2013).
12 Dynamics of Lattice Materials, edited by A. S. Phani and M. I. Hussein (John Wiley & Sons, Ltd, 2017).
13 M. M. Sigalas and E. N. Economou, J. Sound Vib. 158, 377 (1992).
14 Y. Tanaka, Y. Tomoyasu, and S. Tamura, Phys. Rev. B 62, 7387 (2000).
15 M. I. Hussein, Proc. R. Soc. A 465, 2825 (2009).
16 D. Krattiger and M. I. Hussein, J. Comput. Phys. 357, 183 (2018).
17 A. Palermo and A. Marzani, Int. J. Solids Struct. 191–192, 601 (2020).
18 Y. Sun, Y. Yu, Y. Zuo, L. Qiu, M. Dong, J. Ye, and J. Yang, Results Phys. 13, 102200 (2019).
19 P. Zhang, Z. Wang, Y. Zhang, and X. Liang, Sci. China Phys. Mech. Astron. 56, 1253 (2013).
20 M. I. Hussein, K. Hamza, G. M. Hulbert, R. A. Scott, and K. Saitou, Struct. Multidisc. Opt. 31, 60 (2006).
21 Y. Huang, S. Liu, and J. Zhao, Acta Mech. Solida Sin. 29, 429 (2016).
22 G. A. Gazonas, D. S. Weile, R. Wildman, and A. Mohan, Int. J. Solids Struct. 43, 5851 (2006).
23 Y. Lai, X. Zhang, and Z.-Q. Zhang, Appl. Phys. Lett. 79, 3224 (2001).
24 K. Wang, Y. Liu, and B. Wang, Phys. B Condens. Matter 571, 263 (2019).
25 O. Sigmund and J. Søndergaard Jensen, Philos. Trans. R. Soc. London A 361, 1001 (2003).
26 Y. Li, X. Huang, and S. Zhou, Materials 9, 186 (2016).
27 S. Mukherjee, F. Scarpa, and S. Gopalakrishnan, Smart Mater. Struct. 25, 054011 (2016).
28 Y. fan Li, X. Huang, F. Meng, and S. Zhou, Struct. Multidisc. Opt. 54, 595 (2016).
29 Y. Lu, Y. Yang, J. K. Guest, and A. Srivastava, Sci. Rep. 7, 43407 (2017).
30 O. R. Bilal and M. I. Hussein, Phys. Rev. E 84, 065701 (2011).
31 Y. F. Li, F. Meng, S. Li, B. Jia, S. Zhou, and X. Huang, Phys. Lett. A 382, 679 (2018).
32 Z.-Q. Zhao, P. Zheng, S. Xu, and X. Wu, arXiv:1807.05511 [Cs] (2018).
33 R. Girshick, arXiv:1504.08083 [Cs] (2015).
34 S. Ren, K. He, R. Girshick, and J. Sun, arXiv:1506.01497 [Cs] (2015).
35 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, arXiv:1810.04805 [Cs] (2018).
36 Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, arXiv:1906.08237 [Cs] (2019).
37 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, in Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Curran Associates, Inc, 2017), pp. 5998–6008.
38 Z. Ghahramani, Nature 521, 452 (2015).
39 D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, Nature 529, 484 (2016).
40 D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. van den Driessche, T. Graepel, and D. Hassabis, Nature 550, 354 (2017).
41 Y. LeCun, Y. Bengio, and G. Hinton, Nature 521, 436 (2015).
42 J. F. Mccarthy, K. A. Marx, P. E. Hoffman, A. G. Gee, P. O’neil, M. L. Ujwal, and J. Hotchkiss, Ann. N. Y. Acad. Sci. 1020, 239 (2004).
43 S. Khan, N. Islam, Z. Jan, I. Ud Din, and J. J. P. C. Rodrigues, Pattern Recognit. Lett. 125, 1 (2019).
44 Z. Liu, D. Zhu, S. P. Rodrigues, K.-T. Lee, and W. Cai, Nano Lett. 18, 6570 (2018).
45 M. H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, and K. Parsons, Sci. Rep. 9, 1368 (2019).
46 W. Ma, F. Cheng, and Y. Liu, ACS Nano 12, 6326 (2018).
47 T. Wang, C. Zhang, H. Snoussi, and G. Zhang, Adv. Funct. Mater. 30, 1906041 (2020).
48 X. Li, S. Ning, Z. Liu, Z. Yan, C. Luo, and Z. Zhuang, Comput. Method. Appl. Mech. Eng. 361, 112737 (2020).
49 C.-X. Liu, G.-L. Yu, and G.-Y. Zhao, AIP Adv. 9, 085223 (2019).
50 J. Fox, Applied Regression Analysis and Generalized Linear Models, 3rd ed. (SAGE Publications, Inc, Los Angeles, 2015).
51 Y. L. Pavlov, Random Forests (VSP, Utrecht, 2000).
52 S. M. Sadat and R. Y. Wang, RSC Adv. 6, 44578 (2016).
53 C. M. Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2006).
54 M. S. Kushwaha, P. Halevi, L. Dobrzynski, and B. Djafari-Rouhani, Phys. Rev. Lett. 71, 2022 (1993).
55 M. S. Kushwaha, P. Halevi, G. Martínez, L. Dobrzynski, and B. Djafari-Rouhani, Phys. Rev. B 49, 2313 (1994).
56 Y. Pennec, J. O. Vasseur, B. Djafari-Rouhani, L. Dobrzyński, and P. A. Deymier, Surf. Sci. Rep. 65, 229 (2010).
57 I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (The MIT Press, Cambridge, MA, 2016).
58 E. N. Economou and M. Sigalas, J. Acoust. Soc. Am. 95, 1734 (1994).
