Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Center for Computational Science and Engineering
May 19, 2021
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Caitlin Mueller
Associate Professor, Civil and Environmental Engineering
Thesis Supervisor
Accepted by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Nicolas Hadjiconstantinou
Professor, Mechanical Engineering
Co-Director, Center for Computational Science and Engineering
Enhancing surrogate models of engineering structures with
graph-based and physics-informed learning
by
Eamon Jasper Whalen
Abstract
This thesis addresses several opportunities in the development of surrogate models
used for structural design. Though surrogate models have become an indispensable
tool in the design and analysis of structural systems, their scope is often limited by the
parametric design spaces on which they were built. In response, this work leverages
recent advancements in geometric deep learning to propose a graph-based surrogate
model (GSM). The GSM learns directly on the geometry of a structure and thus can
learn on designs from multiple sources without the typical restrictions of a parametric
design space.
Engineering surrogate models are often limited by data availability, since designs
and performance data can be expensive to produce. This work shows that transfer
learning, through which training data of varying topology, complexity, loads and
applications are repurposed for new predictive tasks, can be used to improve the data
efficiency of surrogates, often reducing the required amount of training data by one or
two orders of magnitude. This work also explores new potential sources for training
data, namely engineering design competitions, and presents SimJEB, a new public
dataset of simulated engineering components designed specifically for benchmarking
surrogate models. Finally, this work explores the emerging technology of physics-
informed neural networks (PINNs) for structural surrogate modeling, proposing two
new heuristics for improving the convergence and accuracy of PINNs in practice.
Combined, these contributions advance the generalizability and data efficiency of
surrogate models used in structural design.
Acknowledgments
I would like to extend a major thank you to my advisor, Dr. Caitlin Mueller, for
her technical expertise, academic guidance, and moral support. Thank you, along
with the other members of the Digital Structures Group, for creating an exciting and
supportive environment in which to learn and grow.
A special thanks to Fatma Kocer, Brett Chouinard, and many others at Altair for
encouraging me to attend graduate school and helping me make that dream a reality.
Thank you to Renaud Danhaive, Yijiang Huang, Joe Pajot, Jonathan Ollar, and
David Xu for their mentorship and technical expertise.
Thank you also to Simon Ganeles and Azariah Beyene for their many technical con-
tributions.
Last but not least, thank you to my partner, Janene, for her guidance, encouragement
and love.
This research was supported by the Engineering Data Science group at Altair Engi-
neering Inc. and is based upon work supported by the National Science Foundation
under Grant No. 1854833.
Contents
1 Introduction 11
1.1 Engineering surrogate modeling . . . . . . . . . . . . . . . . . . . . . 11
1.1.1 Physical simulations in engineering . . . . . . . . . . . . . . . 11
1.1.2 Surrogate modeling . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.1 Existing models are restrictive . . . . . . . . . . . . . . . . . . 13
1.2.2 Training data is scarce . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Opportunities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.1 Graph representations . . . . . . . . . . . . . . . . . . . . . . 15
1.3.2 Transfer learning . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.3 New datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.4 Physics-informed learning . . . . . . . . . . . . . . . . . . . . 17
2.3.2 Convolutions on graphs . . . . . . . . . . . . . . . . . . . . . . 25
2.3.3 The graph-based surrogate model (GSM) . . . . . . . . . . . . 26
2.3.4 A naive alternative: the pointwise surrogate . . . . . . . . . . 27
2.3.5 A baseline: predicting the mean . . . . . . . . . . . . . . . . . 27
2.4 Characterizing the GSM . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.1 Data generation and filtering . . . . . . . . . . . . . . . . . . . 28
2.4.2 Training and tuning . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.3 Comparing the GSM to the pointwise surrogate . . . . . . . . 30
2.4.4 Studying generalizability . . . . . . . . . . . . . . . . . . . . . 30
2.5 Transfer learning: repurposing the GSM . . . . . . . . . . . . . . . . 34
2.5.1 Effects on generalizability . . . . . . . . . . . . . . . . . . . . 34
2.5.2 Studying data efficiency . . . . . . . . . . . . . . . . . . . . . 37
2.6 Conclusions and future work . . . . . . . . . . . . . . . . . . . . . . . 39
3.7 Conclusions and future work . . . . . . . . . . . . . . . . . . . . . . . 55
5 Conclusion 71
Chapter 1
Introduction
This work presents new strategies for augmenting surrogate models used in structural
design. The following sections present a broad overview of surrogate modeling as well
as existing challenges and opportunities in the field. Four opportunities in particular,
graph representations, transfer learning, new datasets, and physics-informed learning,
are addressed in subsequent chapters with novel methods and insights. Each chapter
also contains an in-depth introduction and literature review of the state of the art of
each topic.
1.1 Engineering surrogate modeling

1.1.1 Physical simulations in engineering

Physical simulations have become ubiquitous in nearly every engineering field, including structural design. Computational methods such as the finite element, finite difference, and finite volume methods can be used to solve partial differential equations (PDEs) on arbitrarily complex domains, for which analytical solutions rarely exist. Simulation is
an increasingly popular alternative to physical testing, where physical tests may be
prohibitively expensive, time consuming, or simply not possible. Simulations allow
engineers to quickly evaluate "what if" scenarios. The result is often faster design
iterations and higher-performance outcomes. The parametric nature of engineering
simulations naturally lends itself to a wide array of invaluable computational experi-
ments, including design optimization, uncertainty quantification, main effects analy-
sis, and parameter estimation, to name a few. In structural design, the finite element
method is most commonly used to solve for displacement given the geometry, loads,
supports, and material properties as inputs.
1.1.2 Surrogate modeling

Surrogate models, also known as metamodels, response surfaces, reduced order models, approximation models, or emulators, are mathematical approximations of a system's behavior. In contrast with physical simulations, surrogate models are almost
always data-driven, meaning that they are trained on a set of observations in a su-
pervised manner. There are many potential uses for surrogate models, including
generating smooth approximations of noisy systems, studying the effects of input pa-
rameters, and sharing system behavior while protecting intellectual property details;
however, the most common use of surrogate models is speeding up design evalua-
tion. Surrogate models are traditionally simple regression models which can make
performance predictions several orders of magnitude faster than physical simulation.
Surrogate models are invaluable for applications where physical simulations are prohibitively slow (e.g. simulating a full-vehicle car crash), where a prohibitively large number of simulations is required (e.g. design optimization, uncertainty quantification, generating dense visualizations), or both. In a typical workflow, observations are
generated by creating a parametric simulation model, sampling the parameters sev-
eral times in a computational experiment, and collecting key performance indicators
(KPIs). A supervised machine learning model is then trained to predict KPIs given
design parameters as inputs. The surrogate model can then replace the engineering
simulation in a variety of tasks.
Generally, surrogate models take a design as input and output a performance
prediction; however, the forms that these inputs and outputs take often depend on
the specific algorithm being used. Traditional surrogate models rely on standard
regression techniques that operate over fixed-length vectors. This framework, where
each design is represented by a vector, fits conveniently with the parametric modeling
strategies often used to generate training data. In a typical workflow, the engineer
manually designates one or more parameters in a physical model to be studied (i.e.
the "design variables"). These design variables form a design space: the theoretical
space containing all possible designs. A design of experiment (DOE) is then used to
efficiently sample the design space, resulting in a dataset of designs and accompa-
nying performance labels. Since each design is already represented by a parameter
vector, it is convenient to use this same vector as the input to a surrogate model.
Surrogate models based on hand-crafted design parameters are simple to implement
and understand because they rely on time-tested regression algorithms.
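The workflow above can be made concrete with a toy sketch. Everything here is illustrative: the "simulation" is a textbook cantilever tip-deflection formula standing in for an expensive FEA run, and the surrogate is a one-feature least-squares fit; none of these values come from this thesis.

```python
import random

P = 1000.0   # applied tip load [N] (illustrative value)
E = 200e9    # Young's modulus [Pa] (illustrative value)

def simulate(L, b, h):
    """Stand-in for an expensive physics simulation (KPI: tip deflection)."""
    I = b * h**3 / 12.0               # second moment of a rectangular section
    return P * L**3 / (3.0 * E * I)   # analytic cantilever tip deflection

# 1. Sample the design space: a crude DOE over three design variables.
random.seed(0)
doe = [(random.uniform(1.0, 3.0),     # length L [m]
        random.uniform(0.05, 0.15),   # width  b [m]
        random.uniform(0.05, 0.15))   # height h [m]
       for _ in range(50)]
kpis = [simulate(*d) for d in doe]    # collect KPIs for each sampled design

# 2. Fit a surrogate: closed-form least squares on the derived feature L^3/I.
feats = [L**3 / (b * h**3 / 12.0) for (L, b, h) in doe]
w = sum(f * y for f, y in zip(feats, kpis)) / sum(f * f for f in feats)

def surrogate(L, b, h):
    return w * L**3 / (b * h**3 / 12.0)

# 3. The surrogate now replaces the simulation for fast design evaluation.
err = abs(surrogate(2.0, 0.1, 0.1) - simulate(2.0, 0.1, 0.1))
```

Because the toy KPI is exactly linear in the chosen feature, the fitted surrogate reproduces the "simulation" almost perfectly; real KPIs are rarely this cooperative, which is why flexible regression models are used in practice.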
1.2 Challenges
Though engineering surrogate modeling has been applied successfully for decades,
fundamental challenges regarding surrogate modeling workflows still restrict their
adoption and impact.
1.2.1 Existing models are restrictive

While hand-crafted design parameters simplify the learning algorithm, they also limit
the design process in several ways. To start, designing a parametric design space is a
nontrivial task. Knowing which design changes will result in interesting or valuable
outcomes often requires experience, and yet the selection of a suitable design space is
critical to the success of the subsequent design exploration. Even when an effective
design space can be known a priori, a list of design parameters is far less expressive
than the geometry representations typically used in Computer Aided Design (CAD),
like splines and triangular meshes, because it limits the user to a fixed design space.
Finally, the use of design parameters as surrogate model inputs couples the surrogate
to the data generation process. A surrogate model trained on one design space is
effectively useless in another, and data from different design spaces (or data created
by other means) cannot be easily combined during training. Ideally, surrogate models
would operate on more organic representations of geometry that do not limit design freedom or unnecessarily couple the surrogate model to the data source.
1.2.2 Training data is scarce

One of the primary reasons why deep learning has disrupted fields like computer
vision and natural language processing is that there exists an abundance of training
data. Images, video, audio, and tabular data are collected by increasingly affordable
sensors and distributed on the web at impressive rates. This is not the case with
engineering design. A single CAD design may take hours to days for an engineer
to create by hand and, once created, the physical simulation of its behavior can be
equally expensive. Furthermore, intellectual property concerns regarding engineering
designs often limit the open sharing of engineering data on the web. The result is that
the engineering equivalents of massive, labeled databases like ImageNet and AudioSet
are few to none.
1.3 Opportunities
1.3.1 Graph representations

Recent advancements in deep learning have led to the ability to learn on more organic representations of shape, including multi-view images, voxels, point clouds, and
meshes (graphs). Graphs in particular are a promising representation for engineering
surrogate models. Graphs are a natural representation of both polygonal meshes and
space frames, and thus little to no preprocessing is required to convert from a deep
learning representation to that used by designers. Graph representations avoid the
lossy rasterization required for Euclidean representations like images and voxels. Unlike point clouds, graphs encode the topology as well as the geometry of the shape.
The emerging field of geometric deep learning has produced a variety of algorithms
for learning classification, segmentation, and regression tasks on graphical domains.
The integration of geometric deep learning techniques into structural surrogate
modeling has potential to revolutionize how surrogates are trained and deployed in
practice. No longer confined to a design space, graph-based surrogate models support
arbitrary shape changes, increasing design freedom. Since the geometry representa-
tion is decoupled from the design generation process, they also accommodate training
on data from multiple sources, including different parametric studies, and data from
previous design iterations, projects, or domains. This work presents a graph-based
surrogate model (GSM) for predicting the structural performance of space frame
structures in chapter 2. The GSM can accurately predict the deflection of space
frames using only the structure’s geometry, loads and supports as inputs. It is shown
that the GSM can learn on data from multiple design studies simultaneously, and that
doing so is often advantageous compared to training on data from a single source.
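The graph representation sketched above can be written down directly. The following is a minimal, hypothetical three-joint example (real space frames in this thesis are far larger): each joint carries its spatial coordinates plus binary load and support indicators, and each bar is stored as an undirected edge.

```python
# Each joint is a vertex: (x, y, z, loaded, supported).
joints = [
    (0.0, 0.0, 0.0, 0, 1),   # supported base joint
    (1.0, 0.0, 0.0, 0, 1),   # supported base joint
    (0.5, 1.0, 0.0, 1, 0),   # loaded apex joint
]
bars = [(0, 1), (1, 2), (0, 2)]   # member connectivity (vertex index pairs)

# Vertex feature matrix: one 5-dimensional feature row per joint.
features = [list(j) for j in joints]

# Undirected graph: store each edge in both directions, as graph
# learning libraries typically expect.
edge_index = [(i, j) for (i, j) in bars] + [(j, i) for (i, j) in bars]
```

Because nothing in this encoding refers to a parametric design space, designs with different node counts and topologies can coexist in one training set.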
1.3.2 Transfer learning

Besides improving design freedom, graph-based surrogate modeling also enables transfer learning. The advantages of repurposing previously trained models are well documented in the deep learning community, yet transfer learning is rarely applied to
engineering surrogate models. Engineering design is a natural candidate for trans-
fer learning. The incremental nature of engineering design results in many design
variants which are all created for the same functional purpose. Furthermore, tra-
ditional surrogate modeling requires that a training set be generated for each new
parametric model. The result is often several small, disjoint design studies which
differ slightly in geometry, topology, or loading conditions. Transfer learning has the
potential to significantly reduce the amount of new training data required to train
graph-based surrogate models by leveraging this existing data. Chapter 2 presents a
transfer learning methodology for the GSM and demonstrates how positive transfer
(i.e. improved prediction accuracy) can be achieved while learning across designs of
varying topologies, loads, complexities and applications. The implication is that the
amount of training data required to achieve a desired accuracy is reduced by one or
two orders of magnitude.
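The mechanics of transfer learning can be illustrated with a deliberately tiny sketch: a linear model pretrained on an abundant synthetic "source" task is fine-tuned on a scarce, slightly shifted "target" task and compared against training from scratch on the same scarce data. The tasks are toy stand-ins and bear no relation to the actual GSM experiments.

```python
def gd(w, b, data, lr=0.02, steps=20):
    """Plain gradient descent on mean squared error for y = w*x + b."""
    for _ in range(steps):
        dw = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        db = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * dw, b - lr * db
    return w, b

def mse(w, b, data):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

source = [(x, 2.0 * x + 1.0) for x in range(20)]           # abundant source data
target = [(x, 2.2 * x + 0.9) for x in (0.0, 2.0, 4.0)]     # scarce target data

w0, b0 = gd(0.0, 0.0, source, lr=0.005, steps=500)  # pretrain on source task
w1, b1 = gd(w0, b0, target, steps=20)               # fine-tune on target
w2, b2 = gd(0.0, 0.0, target, steps=20)             # train from scratch

# Positive transfer: the pretrained model fits the target better
# after the same small fine-tuning budget.
better = mse(w1, b1, target) < mse(w2, b2, target)
```

The pretrained weights start near the target optimum, so a few fine-tuning steps suffice; the from-scratch model, given the same budget and data, lags behind. This is the effect the GSM transfer-learning trials quantify at scale.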
1.3.3 New datasets

Since graph-based representations decouple the data generation process from the surrogate model, training data no longer necessarily has to come from parametric models
at all. An alternative potential data source involves collecting engineering designs
from the web (i.e. "the wild"). Though a few large collections of 3D models do exist, they are rarely of engineering components, and those that are tend to be random grab bags of CAD designs without any information about their intended load
conditions or functional purpose. One notable exception is online engineering design
contests. Design contest submissions are all designed for the same functional purpose
and so they can be more readily used by surrogate models. Since the contest sub-
missions are created by hand by various engineers, they exhibit significantly higher
geometric diversity and complexity than can be produced from a parametric design
space. Design contest data is therefore a middle ground between generated data and
data collected from the wild.
Chapter 3 presents SimJEB: a collection of 381 engineering brackets collected
from an online design contest. The brackets have been cleaned, oriented, meshed,
and simulated according to the original competition load conditions. The bracket
geometry and accompanying simulation results form a non-parametric dataset for
evaluating graph-based surrogate models. Chapter 3 also proposes a methodology
for using SimJEB as a benchmark, including guidelines for training surrogate models
and quantifying their performance in a consistent way. The SimJEB dataset has been
released for public use by researchers in geometric machine learning and engineering
surrogate modeling.
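The kind of consistent evaluation protocol proposed for SimJEB can be sketched as follows. The split rule and metric here are hypothetical illustrations, not the dataset's official ones: assigning designs to train/test by a deterministic hash of their ID guarantees every user reproduces the same split, and a single agreed-upon error metric keeps reported numbers comparable.

```python
import hashlib

def split(design_id, test_fraction=0.2):
    """Deterministically assign a design to 'train' or 'test' by hashing its ID."""
    h = int(hashlib.md5(str(design_id).encode()).hexdigest(), 16)
    return "test" if (h % 100) < test_fraction * 100 else "train"

def mae(pred, true):
    """Mean absolute error: a single, consistent benchmark metric."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

# SimJEB contains 381 brackets; hash-splitting them is fully reproducible.
splits = [split(i) for i in range(381)]
n_test = splits.count("test")
```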
Chapter 2
2.1 Introduction
Surrogate models, also known as metamodels, response surfaces, reduced order mod-
els, approximation models, or emulators, are used extensively in engineering to ap-
proximate computationally-intensive processes. In a typical workflow, training data
is produced by running a design of experiment (DOE) of physics-based simulations,
after which a surrogate model is trained in a supervised manner to predict one or
more of the simulated quantities. The trained surrogate model might then be used
to perform fast optimizations or provide real-time performance predictions. Gener-
ally, these methods require that each design be represented as a fixed-length vector
of design parameters (e.g. design variables). This requirement restricts the surrogate
model to a single design space, requiring the user to train a new surrogate model
every time the parametrization changes.
possible shapes. How might one determine which inputs are “safe” and which are not
likely to yield quality predictions?
This work explores the use of graph neural networks as surrogate models for space
frame structures. The proposed Graph-based Surrogate Model (GSM) learns to pre-
dict a displacement field given only the geometry, supports, and loads as inputs. It is
shown that the GSM can be trained on data from multiple design models simultane-
ously, often outperforming GSMs trained on a single source. Transfer learning is then
explored as an effective method to repurpose previously trained GSMs to new tasks.
Both the generalizability and data efficiency of the GSM are improved with trans-
fer learning, with positive transfer being observed across varying topologies, loads,
complexities, and even different applications.
The key contributions of this chapter are as follows:
2. Transfer learning is shown to improve the GSM's data efficiency and generaliz-
ability, leveraging historical data to reduce the required number of simulations
by one or two orders of magnitude
The remainder of this chapter is organized as follows: section 2.2 reviews related
work, section 2.3 introduces the methodology of the GSM and a few naive alternatives
used for comparison, section 2.4 outlines data generation methods and presents exper-
imental results, section 2.5 introduces transfer learning and presents further results,
and section 2.6 contains conclusions and ideas for future work.
The following terminology is used throughout the chapter: Let design refer to a
specific design concept of a structure (i.e. something that could be built), design
model (DM) refer to a hand-parametrized design space which can be sampled to
generate designs, and surrogate model refer to a data-driven predictive model that
learns to predict a structure’s engineering performance.
2.2 Related work

Surrogate models have been used in engineering design for several decades (see [84, 25,
61] for a review). Some of the most common surrogate modeling algorithms include
polynomial regression [70], kriging (also known as Gaussian processes) [17], radial ba-
sis functions [23], random forest [78] and neural networks [57]. [79] compared several of
these algorithms for civil engineering problems. Dimensionality reduction techniques
have been used to derive more suitable parametrizations [11, 21] and quantities of
interest [91]. All of the aforementioned methods require that a design be represented
as a fixed-length vector of parametric design features, restricting the feasible designs
to some pre-determined space. This work proposes a surrogate model that operates
on the geometry directly and is thus not limited to a particular parametrization.
Recently, a few surrogate models have been proposed that do not rely on handcrafted
design parameters. [92] proposed using "knowledge-based" characteristics, which are
independent of design variables, as features. While this may enable the combination
of training data from multiple design spaces, it still relies heavily on the user to
craft useful characteristics. Other approaches have sought to learn on the geometry
itself. The pursuit of deep learning methods for shape data has led to the ability
to learn on several geometry representations, including shape descriptors, images,
voxels, polycubes, signed distance functions, point clouds, and graphs (see [68, 4] for
a review). Surrogate models have been trained on images [35, 48, 93, 45, 26, 30], voxels
[94, 86] and polycubes [80, 6]. Images and voxels suffer from resolution problems and
data loss due to rasterization. Polycubes solve this problem by mapping the geometry
to a regular grid but are limited to fixed-topology data sets.
The advent of geometric deep learning techniques has enabled learning on non-Euclidean domains which are generally more natural representations of geometry.
[18] trained a surrogate model to predict lift and drag coefficients from 3D point
clouds. While potentially useful for solid bodies, point clouds are not an adequate
representation of space frames because they lack topological information. Other works
have represented designs as graphs. [6] used a graph-based convolutional model to
learn fluid dynamics on meshed surfaces, [20] used a similar approach to learn the
structural behavior of a thin shell, and [83] learned material properties from graph-
based microstructures. The closest existing work to this one is probably [15], in which
graph representations of space frames were used to optimize cross section sizes for
structural loads. The structures in [15] had constant loads and geometry (apart from
the cross sections), whereas this study explores the flexibility of graph-based networks
to generalize across various geometries, topologies and loads.
Other notable engineering applications of graph-based learning include feature
recognition on 3D CAD [13], shape correspondence for additive manufacturing [33],
and generation of design decision sequences [63]; however, these do not directly ad-
dress surrogate modeling.
Graph-based learning, both for shape analysis as well as other tasks, has recently
received a lot of attention. [10] introduced the term geometric deep learning to mean
learning from non-Euclidean data structures such as graphs and point clouds. See
[90, 95] for a general survey on graph neural networks (GNNs). MoNet [46] was
the first framework to apply a GNN to meshed surfaces by leveraging convolutions
over local geodesic patches. ACNN [8] defined similar patches based on anisotropic
heat kernels, while GCNN [50] generalized these patches to user-defined pseudo co-
ordinates. FeaStNet [81] introduced an attention mechanism to perform "feature
steering" which acts as dynamic filtering over neighbors. Other notable extensions
of GNNs to shapes include MeshCNN [32] which introduced learnable edge pooling
and StructureNet [49] which introduced a graph-based encoder for hierarchical part
representations. The aforementioned frameworks were applied to geometry process-
ing tasks including shape correspondence, classification, and segmentation, whereas
this work focuses on structural surrogate modeling.
Transfer learning, where predictive models previously trained on source data are re-
trained on target data from a different domain, task, or distribution, is a widely
applied concept in machine learning [56]. Deep learning models in particular often
benefit from transfer learning due to their data-intensive nature [77]. [42] addressed
some of the particular challenges of transfer learning in graph neural networks. A few
works have explored transfer learning in the context of engineering design. [93] trained
a convolutional autoencoder on 2D wheel designs before retraining the encoder as a
surrogate model, reducing the required number of simulations. [45] first trained a
model to predict the original parametric design features of an artery before retraining
it to predict the location of maximum stress. [43] used a clustering algorithm to
identify which designs would make for useful source data when applying transfer
learning to microprocessor performance prediction. [6] trained a surrogate model to
predict the drag coefficient of 2,000 primitive shapes before tuning the model on 54
car designs. This chapter differs from previous works in that it seeks to systematically
quantify the effects of transfer learning on data efficiency and generalizability across
several common source/target pairs in structural design.
2.3 Methodology: surrogate modeling with graphs
The following section presents a new graph-based surrogate model (GSM) for pre-
dicting the displacement of space frame structures.
Figure 2-1: The graph-based surrogate model (GSM) learns to predict nodal displace-
ments given only geometry, supports and loads as inputs. Structures are represented
as undirected graphs, where each vertex is assigned a feature vector consisting of a
joint’s spatial coordinates and binary variables indicating the presence of supports or
loads. Graph convolutional layers utilize the FeaStNet operator [81].
made transformation invariant in feature space. The latter implies that raw spatial
coordinates can be used directly as input features without having to learn spatial
invariance or transform all designs to a common pose. Geometric deep learning is an
active field; it is likely that other graph-based learning methods are also suitable for
this context and should be considered as future research.
2.3.3 The graph-based surrogate model (GSM)

The proposed surrogate model learns to predict joint displacements given the geometry, supports and loads as inputs. It does so by learning a map from an input graph
𝐺0 = (𝑉0, 𝐸) to a topologically identical output graph 𝐺𝐻 = (𝑉𝐻, 𝐸). The surrogate
model is implemented as a graph-based convolutional neural network built from a
single sequence of 𝐻 linear and FeaStNet convolutional layers (Fig. 2-1). All layers
except the final one are followed by a rectified linear unit (ReLU) activation function. It is
observed that batch normalization applied to the input and after each convolutional
operation significantly improves prediction accuracy. The network architecture, layer
dimensions, and number of attention heads per FeaStNet layer dictate the total num-
ber of learnable parameters.
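The message-passing idea underlying such graph convolutions can be sketched without any deep learning framework. The layer below is a generic neighborhood-averaging convolution for illustration only; it is not the attention-based FeaStNet operator the GSM actually uses, and the weights here are fixed rather than learned.

```python
def matvec(W, x):
    """Multiply a small weight matrix W by a feature vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def graph_conv(features, edges, W_self, W_neigh):
    """One convolution step: combine each vertex's features with the
    mean of its neighbors' features, then apply a ReLU."""
    n = len(features)
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:                 # undirected edges
        neighbors[i].append(j)
        neighbors[j].append(i)
    out = []
    for i in range(n):
        nb = neighbors[i]
        dim = len(features[i])
        agg = ([sum(features[j][k] for j in nb) / len(nb) for k in range(dim)]
               if nb else [0.0] * dim)
        h = [a + b for a, b in zip(matvec(W_self, features[i]),
                                   matvec(W_neigh, agg))]
        out.append([max(0.0, v) for v in h])   # ReLU activation
    return out

# Tiny example: 3 vertices with 2-d features, identity weight matrices.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
I2 = [[1.0, 0.0], [0.0, 1.0]]
out = graph_conv(feats, edges, I2, I2)
```

Stacking several such layers lets information propagate across the structure, which is how the GSM relates loads and supports at some joints to displacements at others.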
2.3.4 A naive alternative: the pointwise surrogate

A second, simpler type of surrogate model was used to compare against the proposed
graph-based method. This pointwise surrogate consists of several simple regression
models, which each take the spatial coordinates of the structure’s joints (flattened into
a vector) as inputs and predict a single scalar quantity. For a 2D truss with 15 nodes,
this corresponds to training 30 regression models (for the x and y displacement of each
node). The random forest algorithm was selected for this study, but any regression
technique (e.g. kriging, polynomials, radial basis functions) could be used. Note
that the pointwise surrogate relies on a fixed ordering of joints and thus cannot be
extended to multi-topology data sets. Also, note that in the case where all designs are
identically loaded, there is no benefit to including support or load information in the
input, since the designs are represented by a single vector. The pointwise surrogate
was implemented with the scikit-learn [58] random forest class using default settings.
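The pointwise surrogate's structure can be sketched as follows, with a 1-nearest-neighbor lookup standing in for the random forest (toy data throughout; in the actual study each scalar output gets its own scikit-learn random forest, which would slot into the same per-output loop).

```python
def nn1_predict(train_X, train_y, x):
    """Predict with the label of the closest training vector (toy regressor)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

# Toy data: 3 designs, each with 2 joints flattened to (x1, y1, x2, y2),
# and a displacement label per scalar output (4 outputs here).
designs = [[0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0, 0.5],
           [0.0, 0.0, 2.0, 0.0]]
displacements = [[0.0, 0.0, 0.0, -0.1],
                 [0.0, 0.0, 0.0, -0.2],
                 [0.0, 0.0, 0.0, -0.4]]

def pointwise_surrogate(x):
    """One independent regressor per scalar displacement output."""
    return [nn1_predict(designs, [d[k] for d in displacements], x)
            for k in range(len(displacements[0]))]

pred = pointwise_surrogate([0.0, 0.0, 1.0, 0.1])
```

Note how the fixed joint ordering is baked into the flattened input vector; this is exactly why the pointwise surrogate cannot handle multi-topology data sets.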
2.3.5 A baseline: predicting the mean

As an additional reference point, consider an even simpler predictive model that simply predicts the mean displacement across each joint in the training set. Throughout
the chapter, the performance of this naive model is referred to as the baseline. Models
that fail to beat the baseline effectively have no predictive value.
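In code, the baseline is trivial (the displacement values below are toy numbers for illustration):

```python
# Per-joint displacements of three toy training designs.
train = [[0.1, 0.4, 0.2],
         [0.3, 0.2, 0.2],
         [0.2, 0.0, 0.8]]

# The baseline prediction: the mean displacement at each joint
# over the training set.
n_joints = len(train[0])
baseline = [sum(d[j] for d in train) / len(train) for j in range(n_joints)]

def predict_baseline(_design):
    # Ignores the input entirely: no predictive value beyond the mean.
    return baseline
```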
2.4 Characterizing the GSM

The following section presents a series of trials designed to characterize the prediction accuracy and generalizability of the proposed graph-based surrogate model.
2.4.1 Data generation and filtering
A GSM was trained to predict joint displacements given a truss design as input.
The truss designs were randomly partitioned such that 68% were used for training,
12% were used for validation, and 20% were reserved for testing. The GSM was
implemented with PyTorch Geometric [24] and trained for 100 epochs on a Tesla K80 GPU using the Adam optimizer [37] and a mean squared error (MSE) loss
function. Through a series of grid searches, the optimal architecture was found to
be L16/C32/C64/C128/C256/C512/C256/C128/L64/L2, where L denotes a linear
layer, C denotes a FeaStNet convolutional layer, and the numbers represent the length
of the vertex feature vectors after passing through a given layer. Similarly, the optimal
learning rate was found to be 1e-3 and the optimal number of FeaStNet heads was
Figure 2-2: A parametric design model of a truss. Data sets are created by perturbing
design variables 𝑝1 -𝑝5 . Each design is loaded with a uniformly distributed vertical load
across the top and simply supported on the bottom. This particular design model is
referred to as DM7.
found to be 8. The resulting model has 2.7 million training parameters. Throughout
all trials, batch normalization was applied to the input and after each convolutional layer, the Adam weight decay was set to 1e-3, and the batch size was set to 256.
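For reference, the tuned settings reported above can be collected into a single configuration object. This is purely a restatement of those values in code form, not code from the original implementation.

```python
# Reported GSM training configuration (values from the grid searches above).
# L = linear layer, C = FeaStNet convolutional layer; numbers are the
# vertex feature dimension after each layer.
gsm_config = {
    "architecture": ["L16", "C32", "C64", "C128", "C256",
                     "C512", "C256", "C128", "L64", "L2"],
    "feastnet_heads": 8,
    "learning_rate": 1e-3,
    "adam_weight_decay": 1e-3,
    "batch_size": 256,
    "epochs": 100,
    "loss": "mse",
    "batch_norm": "input and after each conv layer",
}

# Sanity check: the reported architecture has 7 convolutional layers.
conv_layers = sum(1 for l in gsm_config["architecture"] if l.startswith("C"))
```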
Figure 2-3: The GSM and pointwise surrogate achieve comparable predictive performance on the test designs. Both error distributions are right-skewed, with 85% of designs producing a mean absolute error of less than 0.1 cm on either model.
2.4.3 Comparing the GSM to the pointwise surrogate

Both the GSM and pointwise surrogate successfully learn to predict a wide range
of structural behaviors. Figure 2-3 shows the distribution of prediction errors for
both models evaluated on the test set. The predictive performance of the two models
is roughly comparable: the mean absolute error (MAE) over the entire test set is
0.049 cm for the GSM and 0.053 cm for the pointwise surrogate (30% and 33% of
the baseline respectively). The error distributions for both models are skewed right,
implying that the models perform well on most of the designs but poorly on a few.
Interestingly, it is observed that many of the designs for which prediction accuracies
are low tend to also exhibit poor structural performance (i.e. large displacements).
2.4.4 Studying generalizability

Effective surrogate models should generalize well to unseen designs. For surrogate
models that rely on bounded, handcrafted design parameters, one might assess gen-
eralizability simply by sampling the design space with sufficient density. In contrast,
graph representations span the set of all conceivable space frames and thus a bounded
[Figure 2-4 image: one row of sample designs per design model: DM 5, DM 6, DM 7, DM 8, DM 9, End loads, Tower, Bridge.]
Figure 2-4: Each row shows a few designs generated from one of the eight design
models used in this chapter. Loads and supports are omitted on all but the first
column for clarity. The design models were selected to test specific scenarios that
commonly arise in engineering design.
[Figure 2-5 image: trial diagrams, including C. Generalization to unseen design models (DM 5-DM 9), D. Transfer learning across design models (DM 5-DM 9), and G. Transfer learning across complexities, bridge (DM 7, Bridge).]
Figure 2-5: An overview of the trials used to assess the GSM’s generalizability across
seven specific scenarios. The first three trials involve a single training, while the
remainder of the trials leverage transfer learning to repurpose a previously trained
GSM for new tasks.
Figure 2-6: The GSM can learn on data from multiple design models at once (Trial
B), and doing so is sometimes advantageous even for cases when only a single design
model is of interest. The GSM does not seem to generalize well to unseen design
models (Trial C); however, transfer learning is an effective remedy (Trial D) and
requires a fraction of the data required to train a GSM from scratch.
Figure 2-7: Left: A previously trained Graph-based Surrogate Model (GSM) can be
re-trained on a new data set with differing geometry, loads or topology. Right: Pre-
training significantly increases the data efficiency of the GSM. In these results from
Trial F, a pre-trained GSM trained on 20 designs (N = 20) outperforms a fresh GSM
trained on 500.
design space does not exist. Developing practical intuition regarding the extent to
which graph-based surrogate models generalize to new designs is an open challenge.
Towards this end, a series of data sets and trials were designed to test the gen-
eralizability of the GSM under a variety of conditions. The truss design model from
section 2.4.1 (DM7) was modified to create four new design models. The new design
models, named DM5, DM6, DM8 and DM9 after the number of bars along the top,
have the same outer profile as DM7 but differing topologies (Fig. 2-4).
The following trials were designed to test the generalizability of the GSM. The
reader is referred to Figure 2-5 for an overview of the trials used throughout the rest
of the chapter. Let the term target refer to the design model of interest to the user,
that is, the design model from which the test set was generated. In Trial A, a GSM
was trained and tested on designs generated from the target design model. Note that
there is no overlap between the training and testing sets. In Trial B, training data
from all of the design models was combined to train the GSM. The GSM was then
tested on designs from the target design model as in Trial A. Trial B thus quantifies
the GSM’s ability to learn on multiple design models simultaneously. Note that this
would be impossible with the pointwise surrogate, which is limited to fixed-topology
data. In Trial C, designs originating from the target design model were removed from
the training set, thus testing the GSM’s ability to generalize to unseen design models.
Trials A-C were repeated with each of the five design models (DMs 5-9) as the target,
the results of which can be seen in Figure 2-6.
In Trial A, the GSM achieves an MAE of less than 0.1 cm for all design models,
confirming the previous conclusion that the GSM effectively approximates single de-
sign model data. Trial B also produced MAEs less than 0.1 cm across each design
model, indicating that the GSM can learn on data from multiple design models si-
multaneously. Interestingly, for three of the design models (DMs 6-8), the inclusion
of data from other design models actually improved predictions on the target. These
results indicate that it is sometimes beneficial to add designs to the training data
even if they are not from the design model of interest. Note that this did not hold
true for DMs 5 or 9 which might be considered the most different from the rest of the
design models in that they have the fewest and most bars, respectively. The degree to
which including off-target designs in the training data benefits training may therefore
depend on how similar those designs are to the target.
Since the GSM is able to learn on multiple design models simultaneously, one
might hope that the model generalizes well to previously unseen design models; how-
ever, this was not the case. The MAEs produced in Trial C were on average 76%
higher than those in Trial B. While the mid-range topologies (DMs 6-8) showed better
generalization than the extremes (DMs 5,9), the general trend was that removing all
target designs from the training data significantly reduces predictive performance. In
other words, the GSM does not seem to generalize well to unseen design models.
Consider a GSM that has been trained to predict the performance of one or more
design models as described in sections 2.3 and 2.4. Let these design models now
be referred to as the source. The subsequent trials demonstrate the performance of
this GSM when re-trained on a small training set from a new target design model
(Fig. 2-7). Multiple strategies exist for applying transfer learning to neural networks.
[Figure 2-8: density of designs vs. mean absolute error (cm).]
This study employs what is perhaps the most basic: simply retraining all learnable
parameters on the target data set for an additional 100 epochs. Further exploration
of transfer learning strategies for engineering design, for example those that freeze
parameters or add new ones, is encouraged as future work.
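For concreteness, this retraining strategy can be sketched with a toy NumPy regressor standing in for the GSM; the model, data, and hyperparameters below are illustrative placeholders, not the actual graph neural network.

```python
import numpy as np

def train(weights, X, y, epochs, lr=0.01):
    """Gradient descent on a linear least-squares model (stand-in for the GSM)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0])

# Pre-train on a large "source" set (designs from related design models).
X_src = rng.normal(size=(1000, 2))
y_src = X_src @ w_true + 0.1 * rng.normal(size=1000)
w_pretrained = train(np.zeros(2), X_src, y_src, epochs=100)

# Transfer learning: re-train ALL learnable parameters on a small "target"
# set for an additional 100 epochs, starting from the pre-trained weights.
X_tgt = rng.normal(size=(50, 2))
y_tgt = X_tgt @ w_true + 0.1 * rng.normal(size=50)
w_transferred = train(w_pretrained, X_tgt, y_tgt, epochs=100)

# Comparison: the same small target set, trained from scratch.
w_scratch = train(np.zeros(2), X_tgt, y_tgt, epochs=100)
```

With the same epoch budget, the pre-trained model starts closer to the solution and therefore ends closer to it; the real effect, of course, depends on how related the source and target data are.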
In Trial D, the GSMs that were previously trained (pre-trained) on all design
models but the target (Trial C) are re-trained on a small dataset (N = 200) from the
target model. The results can be seen in the final series of Figure 2-6. The re-trained
GSMs produce significantly better predictions than those in Trial C. In fact, the re-
trained GSMs on average produce 5.5% lower errors than those in Trial B and use less
than a third of the training data. To further analyze the effects of transfer learning
on prediction accuracy, the error distribution from the pre-trained GSM in Trial D
was directly compared to that of a GSM trained only on the 200 design training set
(without pre-training). The distributions can be seen in Figure 2-8. Pre-training on
related source models reduces the average MAE across the test set by 70% and the
standard deviation by 54%, resulting in a more accurate and robust surrogate.
Figure 2-9: Transfer learning consistently improved the GSM's data efficiency, reduc-
ing the amount of training data required to achieve a given prediction accuracy. The
baseline refers to a naive model which always predicts the mean displacement from
the 1,000-design training set.
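The naive baseline can be computed in a few lines; the displacement values below are synthetic stand-ins for the simulation results.

```python
import numpy as np

rng = np.random.default_rng(1)
# Maximum nodal displacement (cm) per design; synthetic stand-ins.
train_disp = rng.gamma(shape=2.0, scale=0.05, size=1000)  # 1,000-design training set
test_disp = rng.gamma(shape=2.0, scale=0.05, size=200)

# Naive baseline: always predict the mean displacement of the training set.
baseline_pred = train_disp.mean()
baseline_mae = np.abs(test_disp - baseline_pred).mean()

# Surrogate errors are then reported relative to this baseline
# (e.g. the GSM's 0.049 cm MAE corresponds to 30% of its baseline).
def relative_to_baseline(model_mae):
    return model_mae / baseline_mae
```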
Figure 2-10: Pre-trained (prtn) GSMs converge faster and to a lower loss value than
those trained from scratch, particularly when the training size (N) is small. All curves
taken from Trial D, design model 7 (DM7).
2.5.2 Studying data efficiency
Effective surrogate models should achieve a useful level of prediction accuracy with
a minimum amount of training data. Data efficiency is particularly important in
engineering design, where quality design data is often scarce or prohibitively expensive
to generate. On the other hand, deep learning methods, with their large number of
trainable parameters, are notorious for requiring large data sets. This section explores
the effects of transfer learning on the GSM’s data efficiency.
Trial D was repeated for a variety of target data set sizes. Each data set was
generated as in section 2.4.1, and the full 1,000-design set was reserved for testing.
To ensure that all data sets were similarly distributed, any designs with maximum
displacements exceeding the 90th percentile from the test set were discarded. A
different random seed was used in sampling the 1,000-design training set to ensure
that training and testing sets did not overlap. In addition to the pre-trained GSM
from Trial D, a second (not pre-trained) GSM and a pointwise surrogate were trained
on the target sets for comparison.
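The percentile-based filtering step can be sketched as follows; the displacement values are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)
# Maximum displacement per design (synthetic stand-ins).
test_max_disp = rng.gamma(2.0, 0.05, size=1000)   # reserved 1,000-design test set
candidate_train = rng.gamma(2.0, 0.08, size=500)  # sampled target training designs

# Discard candidate designs whose maximum displacement exceeds the
# 90th percentile of the test set, keeping the distributions comparable.
cutoff = np.percentile(test_max_disp, 90)
train_set = candidate_train[candidate_train <= cutoff]
```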
The mean absolute prediction errors as a function of training set size (N ) can
be seen in the top row of Figure 2-9. For all models, prediction error correlates
negatively with training size, which is expected. In nearly all cases, the pre-trained
GSM achieves the lowest prediction errors, followed by the pointwise surrogate and
finally by the GSM trained from scratch. Transfer learning improves prediction MAEs
by 48.6%, 40.0% and 34.1% for DMs 5, 7, and 9 respectively. The implication is that
the amount of training data required to achieve a given predictive performance is
reduced by roughly one or two orders of magnitude. For DM 5, a pre-trained GSM
requires only 200 designs to achieve an MAE that is within 10% of the MAE produced
by training on 1,000 designs. For DM7, just 100 designs were sufficient to achieve a
similar result.
Interestingly, transfer learning was most beneficial for the medium-sized training
sets. It is presumed that the smallest training sets do not sufficiently represent the
differences between source and target distributions, while the largest training sets are
sufficiently large to train a GSM to its predictive limit from scratch. Positive transfer
was observed across all design models and training sizes, with the exception of the
largest training set for DM9 in which transfer learning increased MAE by 13.7%. This
was the only observed case of negative transfer throughout all trials.
The loss histories of both GSMs reveal further insights about the effects of transfer
learning. Figure 2-10 shows the evolution of training and validation losses for both
GSMs, plotted for four training set sizes. Note that the validation losses for the
transfer-learned GSM at epoch zero are initially high and comparable to those of an
untrained model. At this point, the conditions are quite similar to those in Trial C:
the model is attempting to generalize to an unseen topology. However, as training
progresses, the transfer-learned validation losses converge faster and to a lower value than those of
the models trained from scratch. Roughly 30 epochs are sufficient to re-train a model,
compared to 100 epochs without transfer learning, representing further computational
savings.
Encouraged by the positive transfer observed in Trial D, one might ask “for which
source and target data sets is transfer learning useful?” The design models DM5-9
differ in topology but have the same outer profile, supports and loads. The following
trials were designed to test other source/target differences that might occur in a
design process. In Trial E, a GSM is pre-trained on 1,000 designs from DM 7 and
re-trained on identical geometry but with different loads (point-loads at the ends as
opposed to a uniform load across the top). Thus Trial E tests the ability to transfer-
learn across load cases. The remaining two trials test the ability to transfer-learn
across domains. In Trial F, a GSM is again pre-trained on DM 7 and retrained on
a set of trussed towers. The towers were generated by sampling three handcrafted
design parameters. Each is pinned at the bottom and loaded horizontally on the
remaining joints. The spanning trusses (DM5-9) and towers differ in topology and
outer profile, but have a similar number of bars (27 and 26, respectively). In Trial G,
the GSM is pre-trained on DM 7 and re-trained on a set of densely trussed bridges.
The bridges each consist of 404 bars, making them significantly more complex than
the trusses. The bridges are uniformly loaded across the top and simply supported at
Table 2.1: The difference in mean absolute error (∆ MAE) between the pre-trained
GSM and a GSM trained from scratch, averaged across all training sizes. Transfer
learning improved prediction accuracy by 19-48%.
the bottom. The hyperparameters described in section 2.4.2 were used for all trials
with the exception of Trial G, which used a batch size of 128 and learning rate of
5e-4.
The results from Trials E, F and G are shown in the bottom row of Figure 2-
9. In Trial E, pre-training on the same geometry but different load cases improved
MAE by an average of 25.5% across all training sizes. In Trial F, pre-training on
trusses improved MAE predictions on towers by an average of 54.1%, and in Trial
G, the same process improved MAE predictions on bridges by 19.8%. The result is a
significant reduction in the amount of required training data. For example, in Trial
F, a GSM pre-trained on trusses achieves better prediction accuracy when re-trained
on 20 tower designs than a GSM trained on 500 towers from scratch. Table
2.1 summarizes the findings from Trials D-G. Positive transfer was observed across
all trials and training sizes, although to varying degrees. As before, the medium-
sized training sets generally showed the largest benefit and the smallest and largest data
sets showed the least. These results further motivate the use of transfer learning to
repurpose design data and surrogate models for new tasks.
not rely on handcrafted design parameters, it can be trained on data from multiple
design spaces simultaneously, and often benefits from doing so. Transfer learning
was presented as an effective method for repurposing GSMs to new tasks by leveraging
historical data. GSMs that are pre-trained on a related data set achieve 19-48%
lower prediction errors than those trained from scratch. The result is a more flexible,
general and data-efficient surrogate model for space frame structures.
Future work could consider the increasingly wide array of graph-based learning
methods and assess their suitability for space frames. A similar analysis could be
performed for surface and volumetric meshes. Though both are easily represented
as graphs, meshes differ from space frames in that the topology is not physically
meaningful. In terms of transfer learning, further work is required to be able to predict
the most effective sources for a given target. One might also explore alternative
transfer learning strategies in which learnable parameters are added or frozen during
re-training. Finally, the ability to learn across designs of varying complexity (Trial
G) might support hierarchical learning strategies in which models are progressively
trained on higher complexity designs.
Chapter 3
Recent advancements in geometric deep learning have enabled a new class of en-
gineering surrogate models; however, few existing shape datasets are well-suited to
evaluate them. This chapter introduces the Simulated Jet Engine Bracket Dataset
(SimJEB): a new, public collection of crowdsourced mechanical brackets and high-
fidelity structural simulations designed specifically for surrogate modeling. SimJEB
models are more complex, diverse, and realistic than the synthetically generated
datasets commonly used in parametric surrogate model evaluation. In contrast to
existing engineering shape collections, SimJEB’s models are all designed for the same
engineering function and thus have consistent structural loads and support conditions.
The models in SimJEB were collected from the original submissions to the GrabCAD
Jet Engine Bracket Challenge: an open engineering design competition with over 700
hand-designed CAD entries from 320 designers representing 56 countries. Each model
has been cleaned, categorized, meshed, and simulated with finite element analysis ac-
cording to the original competition specifications. The result is a collection of diverse,
high-quality and application-focused designs for advancing geometric deep learning
and engineering surrogate models. This chapter also appeared as a standalone paper
accessible at https://arxiv.org/abs/2105.03534.
3.1 Introduction
realistic shapes. As a result, most synthetically generated collections suffer from ei-
ther excessive homogeneity or a lack of realism. On the other hand are shape datasets
collected from various public repositories (i.e. "the wild"). Collected shapes do not
suffer from the realism problem and have been invaluable for developing tasks like
classification and segmentation; however, the lack of control over shape variation and
function typically makes these datasets ill-suited for surrogate modeling, where each
shape should be designed for the same engineering task.
This work explores a third source of shape data: online design competitions.
Design competition entries occupy the sweet spot between generated and collected
datasets. The designs are complex, diverse, and realistic since each one is hand-
designed by a different domain expert, and yet each conforms to the functional
engineering requirements enforced by the competition. Furthermore, the participating
engineers typically create their CAD models with structural simulation in mind,
resulting in cleaner, higher-quality models than one might encounter in the wild.
This chapter introduces the Simulated Jet Engine Bracket Dataset (SimJEB),
a new public shape collection for testing geometric machine learning methods with
an emphasis on surrogate modeling (Figure 3-1). The bracket designs in SimJEB
originate from the GE Jet Engine Bracket Challenge: an open engineering design
competition hosted in 2013 by GrabCAD.com [36] (Figure 3-2). The original compe-
tition featured over 700 entries, representing 320 designers from 56 countries whose
work is estimated to have taken 14 person-years [52]. The diversity of the entries
reflects that of their creators, employing a broad range of design strategies, styles
and structural behaviors. As mandated by the competition, each bracket has the
same four bolt holes and interface point so that they might all be used for the same
engineering task. In SimJEB, each design has been cleaned, meshed, and simulated
according to the competition’s original structural load cases by the author of this
thesis. For additional analytical potential, each design is labeled as belonging to one
of six design categories as determined by a domain expert. Although these particular
brackets were designed for use in a jet engine, their structural behavior and design
objectives are representative of most structural engineering tasks in civil, mechanical
and biomedical engineering. The primary contributions are summarized as follows:
Section 3.2 describes how SimJEB differs from existing datasets, section 3.3 de-
scribes the geometry processing pipeline used to create the dataset, section 3.4 char-
acterizes the dataset in terms of shape and structural behavior, section 3.5 addresses
licensing, access and attributions, section 3.6 proposes how SimJEB might be used
as a surrogate modeling benchmark, and section 3.7 summarizes the conclusions and
offers suggestions for future work.
Figure 3-2: The GE Jet Engine Bracket Competition hosted by GrabCAD.com drew
contributions from engineers of many backgrounds and experience levels to compete
for cash prizes.
3.2 Related work
Many techniques exist for generating 3D shapes. The graphics community has made
extensive use of procedural models for generating content like buildings [54], space-
ships [67] and indoor scenes [60] (see [74] for a review of procedural modeling
techniques). While effective for digital content generation, procedural modeling can be
difficult to apply to engineering design where manufacturing constraints or package
space requirements necessitate more precise control over allowable shapes. Paramet-
ric CAD models allow engineers to precisely explore shape parameters (i.e. a design
space) [73]. Parametric models can also be constructed via mesh morphing [40],
a technique used extensively in mechanical engineering for shape optimization [76].
Both procedural and parametric generation require the user to manually codify all of
the ways in which shapes may vary. Put eloquently by Krispel et al., "Shape design
becomes rule design" [39]. This constraint limits the diversity and realism of shapes
that can be generated.
More recently, deep learning methods have been used to learn shape generation
schemes from a collection of training shapes (see [16] for a recent review). Deep shape
generation has even been applied to engineering design. [80] used an autoencoder
to learn shape parameters from collected vehicle designs. [55] trained a generative
adversarial model on samples from a parametric topology optimization model. While
learning is a promising direction for shape generation, models require large training
sets for each new application. Datasets like SimJEB can play a critical role in learning
more realistic and practical shape generation models for mechanical design.
Several large shape datasets have been released in recent years, including the Prince-
ton Shape Benchmark [75], ModelNet [89], ShapeNet [14], Thingi10k [96], ABC [38]
and Fusion360 [87]. These datasets have been impactful on a wide range of geometry
Figure 3-3: The semi-automated pipeline used to filter, clean, mesh and simulate
the raw CAD contest submissions. Tasks such as orienting, meshing, and checking
cleanliness are relatively easy to automate, while tasks that require engineering intu-
ition like assessing part relevancy are best left up to a domain expert.
Engineering design competitions have long been a source of innovation (e.g. The Lon-
gitude Act of 1714 [12] and the Tower of London competition of 1890 [82]). Though
modern design competitions, like those hosted by NASA [22], and ASME [3], con-
tinue to yield useful designs, they are a relatively untapped source for functional
shape data. Previous works have utilized data from the GE Jet Engine Bracket Chal-
lenge, including several that have used the geometry and loads for testing topology
optimization methods [27, 19, 53, 51]. These works used the competition package
space and loads but did not consider the dataset as a whole. [52] studied 10 of the
challenge entries in detail as part of a case study in sustainable design but did not
perform physical simulation or release the data. [47] trained a voxel-based surrogate
model to predict support material and print time for additive manufacturing using
300 voxelized bracket designs. The data were not made public.
SimJEB is a public collection of 381 cleaned, meshed and simulated designs from
the GE Jet Engine Bracket Challenge. The collection is a step towards advancing
geometric deep learning methods for realistic, functional engineering components.
The GE Jet Engine Bracket Challenge was a large engineering design competition
hosted by General Electric and GrabCAD.com in 2013 [36]. The competition at-
tracted 700 submissions from 320 designers representing 56 countries. By one esti-
mate, the total amount of human-hours required to design the brackets was 700 work
weeks or 14 human years [52]. Participants were challenged to design the lightest pos-
sible lifting bracket for a jet engine, subject to the constraint that the maximum stress
in the part did not exceed the yield stress of Ti-6Al-4V titanium over four specified
load cases. Entries were limited to designs that fit within a provided package space,
and were required to have four bolt holes and an interface hole in specific locations.
The bracket designs were also required to be manufacturable via additive manufactur-
ing. Cash prizes totalling $30,000 USD were distributed among multiple winners
selected by a panel of mechanical engineers from GE and GrabCAD.com.
Figure 3-4: The "leaky" pipeline used to filter, clean, orient, mesh, and simulate
bracket models. Often models can be repaired manually, but with diminishing returns.
On the date of access (June 4th, 2020), the GE Jet Engine Bracket Competition
website had 629 entries. While most entries contained a single CAD file, some entries
contained redundant designs in different CAD file formats, some contained multiple
design variations, and still others were missing a CAD file entirely. In the first pass,
files were filtered programmatically. If entries contained multiple CAD files with dif-
ferent names (e.g. "bracket_v1.stp", "bracket_v2.stp") then both files were retained
as they were assumed to be different design variants. If entries contained multiple
CAD files with the same name save the extension (e.g. "model1.stp", "model1.igs"),
only the file with the highest priority extension was retained, where the priority was
defined as follows: .catpart, .sldasm, .sldprt, .prt, .stp, .step, .x_b, .x_t, .iges, .igs,
.ipt. Note that native formats were preferred over neutral ones. In the second pass,
the remaining entries with multiple CAD files were manually screened for obvious
redundancies (e.g. "GE_Bracket.stp", "GE_Bracket_color_changed.stp"). At the
end of the file selection process, 56 entries had more than one valid CAD file, 518
entries had exactly one, and 55 entries did not have any, resulting in a total of 635
raw CAD files.
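The file-selection rules above can be sketched programmatically; the helper name is illustrative, and the extension priorities are taken directly from the list above.

```python
from pathlib import Path

# Priority order from the text: native CAD formats before neutral ones.
PRIORITY = [".catpart", ".sldasm", ".sldprt", ".prt", ".stp", ".step",
            ".x_b", ".x_t", ".iges", ".igs", ".ipt"]

def select_files(filenames):
    """Keep files with distinct basenames; among same-named files,
    keep only the one with the highest-priority extension."""
    best = {}
    for name in filenames:
        stem, ext = Path(name).stem, Path(name).suffix.lower()
        if ext not in PRIORITY:
            continue  # not a recognized CAD format
        if stem not in best or PRIORITY.index(ext) < PRIORITY.index(best[stem]):
            best[stem] = ext
    return sorted(stem + ext for stem, ext in best.items())
```

Differently named files (assumed to be design variants) are all retained, while `model1.stp` wins over `model1.igs` because the neutral STEP format outranks IGES in the priority list.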
Prior to performing structural analysis, each CAD model was cleaned, tagged, scaled,
and oriented to a canonical pose through the following semi-automated process. First,
an automated check was performed to get the total number of closed volumes in the
model. Next, the user was prompted to review the model and optionally assign one of
the following tags: duplicate, irrelevant, non-repairable. The model was considered to
be clean if it contained exactly one closed volume and was not assigned a tag. 447 of
the 635 models were determined to be clean after the first pass. The units of length
for each clean model were automatically inferred from the volume of its bounding box
and the appropriate scale was applied such that all models were defined in millimeters.
The user was then prompted to select three reference points on the model which were
used to translate and rotate it to a canonical pose. The models that were considered
unclean were either manually cleaned, by deleting extraneous geometry and sliver
surfaces or patching non-watertight volumes, or determined to be non-repairable in
a second pass, resulting in a total of 514 clean CAD models (Figure 3-4).
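The unit-inference step might be implemented as follows; the expected package volume and the two-orders-of-magnitude tolerance are illustrative assumptions, not the exact heuristic used in the thesis.

```python
# Assumed, for illustration only: a bracket-scale bounding box of roughly
# 180 x 110 x 60 in millimeters. A model authored in meters or inches will
# have a bounding-box volume off by a known factor.
def infer_scale_to_mm(bbox_volume):
    """Return the factor that converts the model's lengths to millimeters."""
    target = 180 * 110 * 60                # expected volume in mm^3 (illustrative)
    for factor in (1.0, 25.4, 1000.0):     # mm, inches, meters
        ratio = bbox_volume * factor**3 / target
        if 0.01 < ratio < 100:             # within two orders of magnitude
            return factor
    raise ValueError("could not infer units from bounding-box volume")
```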
A similar semi-automated process was used to build the finite element models using
commercial software [1, 2]. First, the user was prompted to select the surfaces defin-
ing all four bolt holes and the interface hole. Next, a first-order tetrahedral mesh was
generated with an average element size of 2 mm. Each bolt was modeled by a rigid
RBE2 spider element connected to each mesh node on the selected bolt surfaces and
constrained by a Single Point Constraint (SPC) at the center. An RBE3 spider ele-
ment was used to distribute each of the four loads across the surfaces of the interface
hole. As specified in the original competition, the bracket material was modeled as
Ti-6Al-4V titanium (E = 113.8 GPa, ν = 0.342, ρ = 4.47e-3 g/mm³). Finally, each
of the four load conditions was simulated using linear-static FEA and the resulting
displacements and von Mises stresses were recorded for each node (Figure 3-5). Note
that the structural analysis performed for SimJEB may use slightly different assump-
tions than those of the original competition and is not meant to replace or correct
any simulations performed by the original designers.
Figure 3-5: Each bracket was simulated according to the four load conditions speci-
fied by the competition. Five vertex-valued scalar fields were extracted for each load
case: the displacement in X,Y,Z directions, the displacement magnitude, and the von
Mises stress.
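Of the five fields, the displacement magnitude is simply the Euclidean norm of the three components; a sketch with synthetic nodal values:

```python
import numpy as np

# Nodal displacement components for one load case (synthetic stand-ins).
rng = np.random.default_rng(3)
ux, uy, uz = rng.normal(size=(3, 100))

# The displacement magnitude field is the Euclidean norm of the components.
u_mag = np.sqrt(ux**2 + uy**2 + uz**2)
```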
The broad range of designer backgrounds, experience levels, and software tools behind
the bracket designs is reflected in the design diversity. While each design has the
same bolt holes and interface point mandated by the competition, the remainder of the
shape was left to the engineer’s imagination. The topology, complexity, and structural
design strategy thus vary significantly (Figure 3-6). In SimJEB, each bracket has been
manually assigned to one of six general design categories: block, flat, arch, butterfly,
beam and other. Block designs were defined as those that occupy a large portion of the
allotted package space. Flat designs were considered to be those that have mostly flat
regions between the bolt holes and interface point, while arch and butterfly designs
have positive and negative curvature in these regions, respectively. Beam designs
were considered to be those with long, slender beam-like regions. Designs that did not
fit well into any of these categories were labeled as other. The above categorization
serves two purposes: 1) it provides a convenient way to partition the data into more
homogeneous subsets, and 2) it can be used as labels for classification tasks. Note
that both the definition and assignment of these categories are subjective and imperfect
but may be useful in practice for certain modeling or geometry processing tasks.
Figure 3-6: Top left: all brackets are manually labeled as belonging to one of six gen-
eral design categories. Top right: The volume was bounded above by the competition-
specified package space (apart from three designs which violate the rule). Bottom:
brackets range in the number of triangular and tetrahedral elements in the surface
and simulation meshes, respectively.
3.4.2 Characterization of structural performance
Figure 3-7: Multi-objective plots for each of the design categories and for the Pareto
optimal designs. Maximum displacements are taken over all vertices and load cases.
Optimal designs, which are lightweight and stiff, are located closer to the origin.
Note the variety of structural performance within each design category and across
the dataset.
8). Access requires making a free Harvard Dataverse account. The following data
are available for each bracket design: clean CAD (.stp), finite element model (.fem),
tetrahedral mesh (.vtk), triangular surface mesh (.obj) and simulation results (.csv).
Each file type is packaged into a separate zip file to facilitate use cases requiring
only a subset of file types. A single metadata file provides summary statistics for
each bracket and three train/test splits for a benchmark (further explained in section
3.6). Additionally, a sample zip file containing one of each file type is provided for
convenience. Models are identified by an integer; the files 0.stp, 0.fem, 0.vtk, 0.obj
and 0.csv thus all belong to model 0. A global README and local README in
each zip file attribute each model to its original designer and provide a link to the
original submission on GrabCAD.com. Any questions regarding the dataset should
be directed to ewhalen@mit.edu.
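This naming convention makes it straightforward to assemble the paths for a given model programmatically (the helper and root directory below are illustrative):

```python
from pathlib import Path

# File naming in SimJEB: model <id> yields <id>.stp, <id>.fem, <id>.vtk,
# <id>.obj and <id>.csv, each distributed in its own zip archive.
def simjeb_files(model_id, root="."):
    """Paths to all five artifacts for one bracket design."""
    exts = [".stp", ".fem", ".vtk", ".obj", ".csv"]
    return [Path(root) / f"{model_id}{ext}" for ext in exts]
```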
Figure 3-8: The SimJEB dataset is available for public use and hosted through
Harvard Dataverse. See section 3.5 for access instructions.
Table 3.1: The Mean Absolute Error (MAE) of the naive surrogate model averaged
over three train/test splits. These values can be used as a reference point for bench-
marking future surrogate models.
Note that MAE is typically preferred over other potential quality metrics because the
units are easily interpretable.
Chapter 4
Physics-informed neural networks (PINNs) have the potential to improve the data-
efficiency of structural surrogate models by leveraging governing equations; however,
the training process is more complex and less robust than that of purely data-driven
methods. This work proposes two heuristics that aid in the training process. The first
concerns the normalization of each term in the loss function. The second concerns
a multi-step refinement strategy in which the magnitudes of the predictions in one
step are used to scale the outputs of the next. Both heuristics are demonstrated on
a canonical linear elastic problem, for both hard and soft boundary conditions. The
proposed methods have the potential to improve the accuracy and convergence of
PINNs used for structural applications.
4.1 Introduction
The challenges associated with training PINNs are well-documented [85]. Since
the loss function is typically formulated as a weighted sum of residuals, one challenge
concerns choosing effective coefficients for each term. Improper weighting of the loss
terms can result in poor accuracy and even a failure to converge to the correct solution.
Though some heuristics exist for choosing loss weights, they are often chosen through
trial and error. The second challenge concerns output scaling. Previous works have
noted that scaling the PINN's outputs to be the same order of magnitude as the
desired solution aids training, but this requires knowing the solution magnitudes
a priori. While one might argue that the magnitude of the solution in structural
problems can often be inferred from engineering judgement, this is not always the
case. For new types of materials, geometries, or loads, guessing the magnitude of the
solution is nontrivial.
This chapter presents new heuristics for addressing two existing challenges in
training PINNs. The first challenge concerns choosing effective coefficients for each
term in the loss function. By noting that training performance improves when all loss
terms are roughly of equal magnitude, this work proposes normalizing each term in
the loss function by dividing it by the loss values produced during the first training step. The
second challenge concerns output scaling. Rather than guessing solution magnitudes,
this work suggests a multi-step training process, in which the output magnitudes of an
unscaled first network become the output scales of a second. The result is shown to improve
prediction accuracy and convergence on a canonical linear elastic problem. Finally,
this work concludes with a quantitative comparison between hard and soft implemen-
tations of the boundary conditions in the refinement step. The key contributions of
this chapter are as follows:
1. A multi-step training procedure for heuristically selecting loss weights and out-
put scales for physics-informed neural networks that results in improved con-
vergence and accuracy
2. A quantitative comparison of the heuristics for both hard and soft constraint
enforcement
The remainder of the chapter is organized as follows: section 4.2 briefly reviews
previous works, section 4.3 proposes the heuristics for training PINNs, section 4.4
presents experimental results, and section 4.5 concludes the chapter and offers sug-
gestions for future work.
4.2 Related work

The use of neural networks for approximating the solution to PDEs dates back to the
1990s [41]; however, the PINN was developed and popularized by [64], who suggested
using the network's automatic differentiation [7] capabilities for solving PDEs in the strong
form. Since then, PINNs have been proposed for a wide range of PDEs and appli-
cations, including fluid dynamics [65, 5, 88], fracture mechanics [29], and linear and
nonlinear structural mechanics [66, 31, 71]. Rao et al. [66] introduced the concept
of hard boundary conditions for elastic body problems, guaranteeing that Dirichlet
boundary conditions are exactly satisfied. This work utilizes a similar PINN formu-
lation to that in Rao et al. but applies the new heuristics for weighting loss terms
and scaling outputs.
This is not the first work to propose heuristics for training PINNs. Wang et al.
[85] explored the numerical instabilities that occur when training PINNs, drawing
on intuition from forward Euler integrators. Wang et al. propose a heuristic for
dynamically annealing the loss terms based on the concept of momentum. In contrast,
the method proposed in this work is simpler to implement and only needs to be
calculated at the beginning of training.
4.3 Methodology
The following section briefly outlines PINNs for static, plane-stress, linear elastic
problems and then introduces two heuristics for improving the convergence and ac-
curacy of PINNs in practice.
The behavior of elastic bodies under load is governed by the following system of
PDEs, known as the elasticity equations:

∇ · 𝜎 + 𝐹 = 𝜌𝑢𝑡𝑡 (4.1)

𝜀 = ½ (∇𝑢 + (∇𝑢)ᵀ) (4.2)

𝜎 = 𝐶𝜀 (4.3)
where 𝜎 is the Cauchy stress tensor, 𝐹 is the body force, 𝜌 is the density, 𝑢𝑡𝑡 is
the acceleration vector, 𝜀 is the strain tensor, 𝑢 is the displacement vector, and
𝐶 is the fourth-order constitutive tensor. Equation 4.1 is known as the equation of
motion, equation 4.2 is the strain-displacement relationship, and 4.3 is the constitutive
equation. Under the assumptions of equilibrium, plane stress, and linear isotropic
materials, and in the absence of body forces, the problem can be rewritten as follows:
𝜕𝜎𝑥𝑥/𝜕𝑥 + 𝜕𝜏𝑥𝑦/𝜕𝑦 = 0 (4.4)

𝜕𝜎𝑦𝑦/𝜕𝑦 + 𝜕𝜏𝑥𝑦/𝜕𝑥 = 0 (4.5)

𝐸/(1 − 𝜈²) (𝜕𝑢𝑥/𝜕𝑥 + 𝜈 𝜕𝑢𝑦/𝜕𝑦) − 𝜎𝑥𝑥 = 0 (4.6)

𝐸/(1 − 𝜈²) (𝜈 𝜕𝑢𝑥/𝜕𝑥 + 𝜕𝑢𝑦/𝜕𝑦) − 𝜎𝑦𝑦 = 0 (4.7)

𝐸/(2(1 + 𝜈)) (𝜕𝑢𝑥/𝜕𝑦 + 𝜕𝑢𝑦/𝜕𝑥) − 𝜏𝑥𝑦 = 0 (4.8)
where 𝐸 is the Young’s modulus, 𝜈 is the Poisson’s ratio, 𝜎𝑥𝑥 and 𝜎𝑦𝑦 are the normal
stresses in the 𝑥 and 𝑦 directions, and 𝜏𝑥𝑦 is the shear stress.
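The constitutive relations above (equations 4.6–4.8, rearranged to give stresses from strains) can be evaluated directly; a small NumPy sketch with illustrative material constants:

```python
import numpy as np

def plane_stress_from_strain(eps_xx, eps_yy, gamma_xy, E, nu):
    """Plane-stress constitutive relations for a linear isotropic material
    (equations 4.6-4.8 rearranged: stresses from strains).
    gamma_xy = du_x/dy + du_y/dx is the engineering shear strain."""
    sig_xx = E / (1 - nu**2) * (eps_xx + nu * eps_yy)
    sig_yy = E / (1 - nu**2) * (nu * eps_xx + eps_yy)
    tau_xy = E / (2 * (1 + nu)) * gamma_xy
    return sig_xx, sig_yy, tau_xy

# Illustrative values: steel-like E = 200 GPa, nu = 0.3, uniaxial strain
s = plane_stress_from_strain(1e-3, 0.0, 0.0, E=200e9, nu=0.3)
```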
The loads and supports in a structural analysis problem define boundary conditions,
which in turn specify a unique solution to the elasticity equations. The two most
common types of boundary conditions in structural problems are enforced displace-
ments and enforced tractions. Supports are a special case of enforced displacement
where the displacement is zero.
Enforced displacements can be described by the following Dirichlet boundary con-
ditions:
𝑢𝑥 − 𝑑𝑥 = 0 on 𝜕Ω (4.9)
𝑢𝑦 − 𝑑𝑦 = 0 on 𝜕Ω (4.10)
Applied loads can be described by the following Neumann boundary conditions, expressed
as a traction:

𝜎𝑥𝑥 𝑛𝑥 + 𝜏𝑥𝑦 𝑛𝑦 − 𝑇𝑥 = 0 on 𝜕Ω (4.11)

𝜏𝑥𝑦 𝑛𝑥 + 𝜎𝑦𝑦 𝑛𝑦 − 𝑇𝑦 = 0 on 𝜕Ω (4.12)
where 𝑇 is the traction in units of force per area, and 𝑛𝑥 and 𝑛𝑦 are the 𝑥 and 𝑦
components of the surface normal. Note that when the applied load is normal to
the surface the traction conditions impose a normal stress. Similarly, loads that are
tangent to the surface impose a shear stress.
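The traction conditions can be checked numerically; a small sketch (the helper name is an assumption for illustration):

```python
import numpy as np

def traction(sig_xx, sig_yy, tau_xy, n):
    """Traction vector on a surface with unit normal n = (n_x, n_y):
    T_x = sigma_xx*n_x + tau_xy*n_y, T_y = tau_xy*n_x + sigma_yy*n_y."""
    nx, ny = n
    return np.array([sig_xx * nx + tau_xy * ny,
                     tau_xy * nx + sig_yy * ny])

# On a surface with normal +x, a pure normal stress gives a purely
# normal traction, while a pure shear stress gives a tangential one.
t_normal = traction(5.0, 0.0, 0.0, (1.0, 0.0))
t_shear = traction(0.0, 0.0, 2.0, (1.0, 0.0))
```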
A neural network can approximate the solution to a PDE by learning a map between
spatial or temporal variables and state variables (Figure 4-1). For the case of a static,
plane stress problem, the network can take the following form:

(𝑢𝑥, 𝑢𝑦, 𝜎𝑥𝑥, 𝜎𝑦𝑦, 𝜏𝑥𝑦) = 𝒩𝜃(𝑥, 𝑦) (4.13)

where 𝒩𝜃 is a fully connected neural network with learnable parameters 𝜃. Note that
while the displacement vector alone is enough to fully define the state of an elastic
body problem, this work additionally includes stresses in the output. This mixed
variable formulation was shown by [66] to improve accuracy and convergence.
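A minimal sketch of such a mixed-variable network, written as a plain-NumPy forward pass rather than the DeepXDE implementation used in the experiments (layer sizes and helper names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Glorot-uniform-style initialization for a fully connected network."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        limit = np.sqrt(6.0 / (n_in + n_out))
        params.append((rng.uniform(-limit, limit, (n_in, n_out)),
                       np.zeros(n_out)))
    return params

def forward(params, xy):
    """Map 2D coordinates (x, y) to (u_x, u_y, sig_xx, sig_yy, tau_xy)."""
    h = xy
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)   # hyperbolic tangent activations
    W, b = params[-1]
    return h @ W + b             # linear output layer

net = init_mlp([2, 64, 64, 5])   # two coordinates in, five fields out
out = forward(net, np.array([[0.5, 0.0]]))
print(out.shape)  # (1, 5)
```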
Training a neural network equates to solving an optimization problem, where
the optimal network parameter values are those that minimize a loss function. In
a traditional, purely data-driven neural network, the most common loss function for
regression tasks is the mean squared error (MSE) between the predictions and the
ground truth:
ℒ𝑑𝑎𝑡𝑎 = (1/𝑁) ∑ᵢ₌₁ᴺ (𝑠𝑖 − 𝑠̂𝑖)² = ‖𝑠 − 𝑠̂‖² (4.14)
where 𝑠ˆ𝑖 denotes the ground truth value for observation 𝑖 in a training set of size 𝑁 .
All that is required to convert a data-driven network into a physics-informed network
Figure 4-1: The PINN learns a map between 2D spatial coordinates and the displace-
ment and stress field by minimizing a loss function containing PDE residuals. Partial
derivatives are computed via automatic differentiation.
is to add terms to the loss function that enforce the PDE and boundary conditions.
Note that equations 4.4 – 4.12 evaluate to zero for the desired solution, so a natural
loss function is to minimize the MSE of these terms:
ℒ𝑝𝑖𝑛𝑛 = ℒ𝑑𝑎𝑡𝑎
+ 𝜆𝑚𝑥 ‖𝜕𝜎𝑥𝑥/𝜕𝑥 + 𝜕𝜏𝑥𝑦/𝜕𝑦‖²Ω
+ 𝜆𝑚𝑦 ‖𝜕𝜎𝑦𝑦/𝜕𝑦 + 𝜕𝜏𝑥𝑦/𝜕𝑥‖²Ω
+ 𝜆𝜀𝑥 ‖𝐸/(1 − 𝜈²) (𝜕𝑢𝑥/𝜕𝑥 + 𝜈 𝜕𝑢𝑦/𝜕𝑦) − 𝜎𝑥𝑥‖²Ω
+ 𝜆𝜀𝑦 ‖𝐸/(1 − 𝜈²) (𝜈 𝜕𝑢𝑥/𝜕𝑥 + 𝜕𝑢𝑦/𝜕𝑦) − 𝜎𝑦𝑦‖²Ω
+ 𝜆𝜀𝑥𝑦 ‖𝐸/(2(1 + 𝜈)) (𝜕𝑢𝑥/𝜕𝑦 + 𝜕𝑢𝑦/𝜕𝑥) − 𝜏𝑥𝑦‖²Ω (4.15)

plus, when boundary conditions are enforced softly, one analogous residual term per
boundary condition, each evaluated on its boundary 𝜕Ω𝑖. Here 𝜕Ω𝑖 is the 𝑖th boundary
and the 𝜆s are scalar weights that determine the priority of each term. Note that the
data loss ℒ𝑑𝑎𝑡𝑎 is optional and that without it,
the PINN effectively becomes a solver.
Boundary conditions can instead be imposed in a hard manner, following Rao et al.
[66]: each output is constructed as a blend of the prescribed boundary value 𝑠𝑖 (𝑥, 𝑦)
and the raw network prediction, weighted by 𝐷𝑖 (𝑥, 𝑦), the shortest distance from a
point (𝑥, 𝑦) to the boundary 𝜕Ω𝑖 on which the value is imposed. This formulation
guarantees that predictions on the boundary will match the prescribed value. Note
also that boundary conditions imposed in a hard manner no longer need to be included
in the loss function.
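The effect of hard enforcement can be illustrated in one dimension: the prediction is the prescribed boundary value plus a distance-weighted network correction, so it matches the boundary value exactly on the boundary. A sketch (simplified relative to the 2D formulation; names are illustrative):

```python
import numpy as np

def hard_bc_output(x, prescribed, net, boundary_x=0.0):
    """Hard Dirichlet BC in 1D: u(x) = prescribed + D(x) * net(x),
    where D is the distance to the boundary. At x = boundary_x the
    second term vanishes, so the BC is satisfied exactly."""
    D = np.abs(x - boundary_x)
    return prescribed + D * net(x)

net = lambda x: np.sin(3 * x)      # stand-in for the raw network output
x = np.linspace(0.0, 1.0, 5)
u = hard_bc_output(x, prescribed=0.0, net=net)
print(u[0])  # 0.0 exactly at the boundary, regardless of the network
```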
This work proposes an automated method for selecting the coefficients 𝜆 in the
weighted loss function (equation 4.15). Based on the observation that PINNs converge
faster when the loss terms are roughly of equal magnitude, this work proposes nor-
malizing each loss term by selecting 𝜆s such that the terms are scaled to one. In
practice, this can easily be achieved by training for a single step, recording the value
of each term in the loss function, and setting the coefficients equal to the reciprocal:
𝜆𝑛 = 1 / ℒ⁰𝑛 (4.17)
where 𝜆𝑛 is the coefficient for the 𝑛th term in the loss function and ℒ⁰𝑛 is the value
of that term after the first training step. In practice, this loss term normalization
works well, although there are some cases where a term needs to be further adjusted
manually, as demonstrated in section 4.4.
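The normalization heuristic amounts to a single bookkeeping step; a sketch, assuming the per-term loss values from the first training step are available as an array (names are illustrative):

```python
import numpy as np

def normalization_weights(first_step_losses, eps=1e-12):
    """Set each lambda_n to the reciprocal of that term's loss after the
    first training step (equation 4.17), so that every weighted term
    starts at roughly 1."""
    losses = np.asarray(first_step_losses, dtype=float)
    return 1.0 / np.maximum(losses, eps)   # eps guards against zero losses

# Example: raw loss terms with wildly different magnitudes...
raw = np.array([3.2e4, 1.5e-2, 8.0e2])
lam = normalization_weights(raw)
print(lam * raw)  # all weighted terms now start equal to 1
```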
Figure 4-2: The loss term normalization is applied before training to ensure that each
objective is weighted equally. After the initial training, additional refinement trainings
can be used to improve the prediction. Refinement trainings scale the outputs by the
magnitude of the predictions from the previous training.
This work also proposes a method for selecting the values by which the outputs of
the network are scaled. In the absence of output scaling, PINNs generally achieve
poor accuracy; however, the magnitude of the predicted solution is frequently correct.
This work thus proposes that training occur in two steps: an initial training with
unscaled outputs, used only to estimate the magnitude of the solution, followed by a
refinement training in which each output is scaled by the maximum absolute value of
the corresponding prediction from the initial training.
The proposed training procedure is depicted in Figure 4-2. Note that, since the initial
training only needs to produce a good initial guess of the solution, it does not have
to be trained for as long as the refinement step.
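The refinement scaling can be sketched as follows (helper names are assumptions; each output channel's scale is the maximum absolute prediction of that field from the initial training):

```python
import numpy as np

def refinement_scales(initial_predictions):
    """Per-output scale factors for the refinement training: the maximum
    absolute value of each predicted field from the initial training."""
    return np.max(np.abs(initial_predictions), axis=0)

def scaled_output(raw_network_output, scales):
    """In the refinement step the raw network output is multiplied by the
    scales, so the network only has to learn O(1) quantities."""
    return raw_network_output * scales

# Initial training predicted displacements near 2e-3 and stresses near 1e3:
initial = np.array([[1.8e-3, 9.0e2],
                    [-2.0e-3, 1.0e3]])
scales = refinement_scales(initial)
print(scaled_output(np.array([0.5, -0.3]), scales))
```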
4.4 Results
The proposed heuristics were tested on a simple linear elastic problem. The PINN
was implemented using the DeepXDE Python library [44]. For each trial, the PINN
was implemented using a fully connected network with four hidden layers, each with
64 neurons, hyperbolic tangent activation functions, and Glorot uniform parameter
initialization [28]. Unless otherwise noted, all trainings used the following optimiza-
tion sequence: 10,000 steps of the ADAM optimizer [37] with a learning rate of 1e-2,
5,000 steps of the L-BFGS optimizer [97], 1,000 steps of ADAM with a learning rate
of 1e-3, and finally 1,000 steps of L-BFGS. The L-BFGS steps used an early stop-
ping mechanism with default hyperparameters. All models were trained without any
labeled data.
The initial training was performed using the hard boundary conditions described in
section 4.3.4. Loss term normalization was applied after the first step. Figure 4-4
depicts the weighted value of each term in the loss function over time. Note that loss
values are only approximately equal to one due to the random initialization of the
network. Interestingly, all terms except the equation of motion in the x-direction,
(𝑚𝑥), decrease during the first phase of ADAM optimization. The 𝑚𝑥 term then
Figure 4-3: The cantilever beam problem used for testing. Left: the beam is fully
constrained at the wall and loaded on the end with a distributed shear load. Two
point clouds are sampled: one for testing and one for training. Right: the analytical
solution.

Figure 4-4: Results from the initial training with all loss terms normalized. The
displacement predictions are off by one or two orders of magnitude.
decreases during L-BFGS, after which the training converges. Figure 4-4 also shows
the predicted solution. While the heat maps appear to have the correct patterns,
they are off by one or two orders of magnitude.
Since the 𝑚𝑥 term was clearly the limiting factor, the initial training was repeated
by scaling 𝜆𝑚𝑥 by 0.1, the results of which can be seen in Figure 4-5. Though the loss
history looks similar, the predicted results are now on the correct order of magnitude.
The mean absolute errors (MAE) of the displacements are 5e-4 m and 8e-3 m for 𝑢𝑥
and 𝑢𝑦 , respectively.
Figure 4-5: The initial training repeated with 𝜆𝑚𝑥 scaled by 0.1. The predictions are
now close to the analytical solution and can be used in future refinement steps.
Figure 4-6: Results from the refinement step with soft boundary conditions. Predic-
tions show good agreement with the analytical solution.
In the refinement training, the outputs are scaled by the maximum absolute value of
the predictions from the initial training. The predictions and loss histories for the soft
refinement step can be seen in Figure 4-6. Note that the loss function now has ten
terms since boundary conditions are included. Following similar logic to the previous
section, the 𝜆𝑇𝑦,𝑒𝑛𝑑 coefficient was scaled by 10. The predictions have improved by
almost an order of magnitude over the initial training: the MAE in displacement is
now 7e-5 m and 1e-3 m for 𝑢𝑥 and 𝑢𝑦 , respectively.
Figure 4-7: Results from the refinement step with hard boundary conditions. The
results are slightly better than those produced by the soft boundary conditions, sug-
gesting that hard boundary conditions should be used in the refinement step.
Figure 4-8: Prediction errors compared across the initial training and the two refine-
ment trainings. Both refinements improved over the initial training, with the hard
enforcement slightly outperforming soft enforcement in displacement prediction.
The same refinement step was repeated using hard boundary conditions to directly
compare the multi-step refinement process on both boundary condition types. The
results from the refinement step with hard boundary conditions can be seen in Figure
4-7. A slight improvement is observed in the prediction: the MAE in displacement
is now 5e-5 m and 7e-4 m for 𝑢𝑥 and 𝑢𝑦 , respectively. A comparison of the prediction
errors from all three trainings is shown in Figure 4-8. These results suggest that a
refinement step with hard boundary conditions is an effective strategy for improving
PINN prediction accuracy.
4.5 Conclusions and future work
This work proposed two heuristics for improving the accuracy and convergence of
PINNs trained on linear elastic problems. The first heuristic normalizes each term
in the loss function, while the second uses a multi-step refinement technique to scale
the network outputs. Both heuristics can be implemented in a few lines of code and
have been demonstrated to improve the training performance on a canonical elastic
problem. More research is required to verify whether these heuristics are effective
on other PDEs and domains. Future work may also explore whether more than two
refinement steps are advantageous, possibly using shorter training cycles in each step.
While PINNs show promise for improving the data efficiency of structural surrogate
models, more work is required to improve the ease and robustness of the training
process.
Chapter 5
Conclusion
This work makes four main contributions towards advancing the state of surrogate
modeling for structural engineering applications:
1. A graph-based surrogate model (GSM) is proposed which can predict the struc-
tural behavior of space frames given only their geometry, loads, and supports
as inputs. Since the GSM does not rely on hand-crafted design parameters to
make predictions, it can be trained on designs from multiple sources, often with
a performance advantage.
4. Two heuristics are presented for improving the accuracy and convergence of
physics-informed neural networks (PINNs) for structural applications. These
methods, demonstrated on a canonical problem, represent an important step
towards making PINNs easier to train and use in practice.
Combined, the proposed methods have potential to improve the generalizability and
data efficiency of surrogate models used to design engineering structures.
References
[4] Eman Ahmed, Alexandre Saint, Abd El Rahman Shabayek, Kseniya Cherenkova,
Rig Das, Gleb Gusev, Djamila Aouada, and Bjorn Ottersten. A survey on Deep
Learning Advances on Different 3D Data Representations. arXiv:1808.01462 [cs],
April 2019. arXiv: 1808.01462.
[6] Pierre Baque, Edoardo Remelli, Francois Fleuret, and Pascal Fua. Geodesic
Convolutional Shape Optimization. In Jennifer Dy and Andreas Krause, ed-
itors, Proceedings of the 35th International Conference on Machine Learning,
volume 80 of Proceedings of Machine Learning Research, pages 472–481, Stock-
holmsmässan, Stockholm Sweden, July 2018. PMLR.
[7] Atilim Gunes Baydin, Barak A Pearlmutter, Alexey Andreyevich Radul, and
Jeffrey Mark Siskind. Automatic differentiation in machine learning: a survey.
Journal of machine learning research, 18, 2018. Publisher: Journal of Machine
Learning Research.
[8] Davide Boscaini, Jonathan Masci, Emanuele Rodolà, and Michael Bronstein.
Learning Shape Correspondence with Anisotropic Convolutional Neural Net-
works. In Proceedings of the 30th International Conference on Neural Infor-
mation Processing Systems, NIPS’16, pages 3197–3205, Red Hook, NY, USA,
2016. Curran Associates Inc. event-place: Barcelona, Spain.
[9] Paul Bratley and Bennett L Fox. Algorithm 659: Implementing Sobol’s quasiran-
dom sequence generator. ACM Transactions on Mathematical Software (TOMS),
14(1):88–100, 1988. Publisher: ACM New York, NY, USA.
[10] Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre
Vandergheynst. Geometric deep learning: going beyond Euclidean data. IEEE
Signal Processing Magazine, 34(4):18–42, July 2017. arXiv: 1611.08097.
[11] Nathan C Brown and Caitlin T Mueller. Design variable analysis and genera-
tion for performance-based parametric modeling in architecture. International
Journal of Architectural Computing, 17(1):36–52, March 2019.
[12] M Diane Burton and Tom Nicholas. Prizes, patents and the search for longitude.
Explorations in Economic History, 64:21–36, 2017. Publisher: Elsevier.
[13] Weijuan Cao, Trevor Robinson, Yang Hua, Flavien Boussuge, Andrew R. Colli-
gan, and Wanbin Pan. Graph Representation of 3D CAD Models for Machining
Feature Recognition With Deep Learning. In Volume 11A: 46th Design Au-
tomation Conference (DAC), page V11AT11A003, Virtual, Online, August 2020.
American Society of Mechanical Engineers.
[14] Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing
Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianx-
iong Xiao, Li Yi, and Fisher Yu. ShapeNet: An Information-Rich 3D Model
Repository. arXiv:1512.03012 [cs], December 2015. arXiv: 1512.03012.
[15] Kai-Hung Chang and Chin-Yi Cheng. Learning to simulate and design for struc-
tural engineering. arXiv:2003.09103 [cs, stat], August 2020. arXiv: 2003.09103.
[16] Siddhartha Chaudhuri, Daniel Ritchie, Jiajun Wu, Kai Xu, and Hao Zhang.
Learning Generative Models of 3D Structures. Computer Graphics Forum,
39(2):643–666, May 2020.
[17] Noel Cressie. Spatial prediction and ordinary kriging. Mathematical Geology,
20(4):405–421, May 1988.
[19] Asmaa Ibrahem Dallash and Amr Ali Abdelmonaem. Optimal design of jet
engine bracket. Military Technical College, Cairo, Egypt, July 2017.
[20] Renaud Danhaive. Structural Design Synthesis Using Machine Learning. PhD
thesis, Massachusetts Institute of Technology, September 2020.
[21] Renaud Danhaive and Caitlin Mueller. Design subspace learning: Structural de-
sign space exploration using performance-conditioned generative modeling. Au-
tomation in Construction (in press), 2021.
[22] Brian Dunbar and Lillian Gipson. NASA Design Challenges and Competitions,
November 2019.
[23] Nira Dyn, David Levin, and Samuel Rippa. Numerical Procedures for Surface
Fitting of Scattered Data by Radial Functions. SIAM Journal on Scientific and
Statistical Computing, 7(2):639–659, April 1986.
[24] Matthias Fey and Jan E. Lenssen. Fast Graph Representation Learning with
PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs
and Manifolds, 2019.
[26] Anthony P. Garland, Benjamin C. White, Scott C. Jensen, and Brad L. Boyce.
Pragmatic generative optimization of novel structural lattice metamaterials with
machine learning. Materials & Design, page 109632, March 2021.
[27] Aboma Wagari Gebisa and Hirpa G Lemu. A case study on topology optimized
design for additive manufacturing. In IOP Conference Series: Materials Science
and Engineering, volume 276, page 012026. IOP Publishing, 2017. Issue: 1.
[28] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep
feedforward neural networks. In Yee Whye Teh and Mike Titterington, editors,
Proceedings of the Thirteenth International Conference on Artificial Intelligence
and Statistics, volume 9 of Proceedings of Machine Learning Research, pages
249–256, Chia Laguna Resort, Sardinia, Italy, May 2010. PMLR.
[29] Somdatta Goswami, Cosmin Anitescu, and Timon Rabczuk. Adaptive fourth-
order phase field analysis using deep energy minimization. Theoretical and Ap-
plied Fracture Mechanics, 107:102527, June 2020.
[30] Xiaoxiao Guo, Wei Li, and Francesco Iorio. Convolutional Neural Networks
for Steady Flow Approximation. In Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, pages 481–
490, San Francisco California USA, August 2016. ACM.
[31] Ehsan Haghighat, Maziar Raissi, Adrian Moure, Hector Gomez, and Ruben
Juanes. A physics-informed deep learning framework for inversion and surro-
gate modeling in solid mechanics. Computer Methods in Applied Mechanics and
Engineering, 379:113741, June 2021.
[32] Rana Hanocka, Amir Hertz, Noa Fish, Raja Giryes, Shachar Fleishman, and
Daniel Cohen-Or. MeshCNN: A Network with an Edge. ACM Transactions on
Graphics, 38(4):1–12, July 2019. arXiv: 1809.05910.
[33] Jida Huang, Hongyue Sun, Tsz-Ho Kwok, Chi Zhou, and Wenyao Xu. Ge-
ometric Deep Learning for Shape Correspondence in Mass Customization by
Three-Dimensional Printing. Journal of Manufacturing Science and Engineer-
ing, 142(6):061003, June 2020.
[34] Yijiang Huang. pyconmech - https://pypi.org/project/pyconmech/, 2020.
[35] Haoliang Jiang, Zhenguo Nie, Roselyn Yeo, Amir Barati Farimani, and Lev-
ent Burak Kara. StressGAN: A Generative Deep Learning Model for 2D Stress
Distribution Prediction. In ASME 2020 International Design Engineering Tech-
nical Conferences and Computers and Information in Engineering Conference.
American Society of Mechanical Engineers Digital Collection, 2020.
[36] Kaspar Kiis, Jared Wolfe, Gregg Wilson, David Abbott, and William Carter.
GE Jet Engine Bracket Challenge, 2013. https://grabcad.com/challenges/ge-
jet-engine-bracket-challenge.
[37] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimiza-
tion. arXiv:1412.6980 [cs], January 2017. arXiv: 1412.6980.
[38] Sebastian Koch, Albert Matveev, Zhongshi Jiang, Francis Williams, Alexey Arte-
mov, Evgeny Burnaev, Marc Alexa, Denis Zorin, and Daniele Panozzo. ABC: A
Big CAD Model Dataset for Geometric Deep Learning. In 2019 IEEE/CVF Con-
ference on Computer Vision and Pattern Recognition (CVPR), pages 9593–9603,
Long Beach, CA, USA, June 2019. IEEE.
[39] Ulrich Krispel, Christoph Schinko, and Torsten Ullrich. The Rules Behind –
Tutorial on Generative Modeling. Proceedings of Symposium on Geometry Pro-
cessing / Graduate School, 12:2:1–2:49, 2014.
[40] Aaron W. F. Lee, David Dobkin, Wim Sweldens, and Peter Schröder. Mul-
tiresolution mesh morphing. In Proceedings of the 26th annual conference on
Computer graphics and interactive techniques - SIGGRAPH ’99, pages 343–350,
Not Known, 1999. ACM Press.
[41] Hyuk Lee and In Seok Kang. Neural algorithm for solving differential equations.
Journal of Computational Physics, 91(1):110–131, 1990. Publisher: Elsevier.
[42] Jaekoo Lee, Hyunjae Kim, Jongsun Lee, and Sungroh Yoon. Transfer Learning
for Deep Learning on Graph-Structured Data. page 7.
[43] Dandan Li, Senzhang Wang, Shuzhen Yao, Yu-Hang Liu, Yuanqi Cheng, and
Xian-He Sun. Efficient Design Space Exploration by Knowledge Transfer.
page 10.
[44] Lu Lu, Xuhui Meng, Zhiping Mao, and George E. Karniadakis. DeepXDE:
A deep learning library for solving differential equations. arXiv:1907.04502
[physics, stat], February 2020. arXiv: 1907.04502.
[45] Ali Madani, Ahmed Bakhaty, Jiwon Kim, Yara Mubarak, and Mohammad R. K.
Mofrad. Bridging Finite Element and Machine Learning Modeling: Stress Predic-
tion of Arterial Walls in Atherosclerosis. Journal of Biomechanical Engineering,
141(8):084502, August 2019.
[46] Jonathan Masci, Davide Boscaini, Michael M. Bronstein, and Pierre Van-
dergheynst. Geodesic convolutional neural networks on Riemannian manifolds.
arXiv:1501.06297 [cs], June 2018. arXiv: 1501.06297.
[48] Mark C. Messner. Convolutional Neural Network Surrogate Models for the
Mechanical Properties of Periodic Structures. Journal of Mechanical Design,
142(2):024503, February 2020.
[49] Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, and
Leonidas J. Guibas. StructureNet: Hierarchical Graph Networks for 3D Shape
Generation. arXiv:1908.00575 [cs], August 2019. arXiv: 1908.00575.
[50] Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svo-
boda, and Michael M. Bronstein. Geometric deep learning on graphs and mani-
folds using mixture model CNNs. arXiv:1611.08402 [cs], December 2016. arXiv:
1611.08402.
[51] Sourena Moosavi, Dominique Chamoret, S. Tie Bi, Naoual Sabkhi, and
Yannick Culnard. Topology Optimization Considering Additive Manufacturing
Constraints In An Industrial Context. In 14th WCCM-ECCOMAS Congress
2020, volume 1000, 2021.
[54] Pascal Müller, Peter Wonka, Simon Haegler, Andreas Ulmer, and Luc Van Gool.
Procedural modeling of buildings. In ACM SIGGRAPH 2006 Papers, pages
614–623. 2006.
[55] Sangeun Oh, Yongsu Jung, Seongsin Kim, Ikjin Lee, and Namwoo Kang. Deep
Generative Design: Integration of Topology Optimization and Generative Mod-
els. arXiv:1903.01548 [cs], May 2019. arXiv: 1903.01548.
[56] Sinno Jialin Pan and Qiang Yang. A Survey on Transfer Learning. IEEE Trans-
actions on Knowledge And Data Engineering, 22(10):15, 2010.
[58] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel,
M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos,
D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine
Learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[59] Tobias Pfaff, Meire Fortunato, Alvaro Sanchez-Gonzalez, and Peter W Battaglia.
Learning Mesh-Based Simulation with Graph Networks. arXiv preprint
arXiv:2010.03409, 2020.
[60] Siyuan Qi, Yixin Zhu, Siyuan Huang, Chenfanfu Jiang, and Song-Chun Zhu.
Human-Centric Indoor Scene Synthesis Using Stochastic Grammar. In Pro-
ceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), June 2018.
[61] Nestor V. Queipo, Raphael T. Haftka, Wei Shyy, Tushar Goel, Rajkumar
Vaidyanathan, and P. Kevin Tucker. Surrogate-based analysis and optimization.
Progress in Aerospace Sciences, 41(1):1–28, January 2005.
[62] Abdel-Rahman A Ragab and Salah Eldin Ahm Bayoumi. Engineering solid
mechanics: fundamentals and applications. Routledge, 2018.
[63] Ayush Raina, Christopher McComb, and Jonathan Cagan. Learning to Design
From Humans: Imitating Human Designers Through Deep Learning. In Volume
2A: 45th Design Automation Conference, Anaheim, California, USA, August
2019. American Society of Mechanical Engineers.
[65] Maziar Raissi, Alireza Yazdani, and George Em Karniadakis. Hidden fluid me-
chanics: Learning velocity and pressure fields from flow visualizations. Science,
367(6481):1026–1030, 2020. Publisher: American Association for the Advance-
ment of Science.
[66] Chengping Rao, Hao Sun, and Yang Liu. Physics informed deep learning for com-
putational elastodynamics without labeled data. arXiv:2006.08472 [cs, math],
June 2020. arXiv: 2006.08472.
[67] Daniel Ritchie, Ben Mildenhall, Noah D. Goodman, and Pat Hanrahan. Control-
ling procedural modeling programs with stochastically-ordered sequential Monte
Carlo. ACM Transactions on Graphics, 34(4):1–11, July 2015.
[68] Charles Ruizhongtai Qi. Deep Learning on 3D Data. In Yonghuai Liu, Nick
Pears, Paul L. Rosin, and Patrik Huber, editors, 3D Imaging, Analysis and
Applications, pages 513–566. Springer International Publishing, Cham, 2020.
[69] David Rutten. Grasshopper 3D. v6. Robert McNeel & Associates.
[70] Jerome Sacks, William J. Welch, Toby J. Mitchell, and Henry P. Wynn. De-
sign and Analysis of Computer Experiments. Statistical Science, 4(4):409–423,
November 1989.
[71] Esteban Samaniego, Cosmin Anitescu, Somdatta Goswami, Vien Minh Nguyen-
Thanh, Hongwei Guo, Khader Hamdia, Timon Rabczuk, and Xiaoying Zhuang.
An Energy Approach to the Solution of Partial Differential Equations in Compu-
tational Mechanics via Machine Learning: Concepts, Implementation and Appli-
cations. Computer Methods in Applied Mechanics and Engineering, 362:112790,
April 2020. arXiv: 1908.10407.
[72] Teseo Schneider, Yixin Hu, Xifeng Gao, Jeremie Dumas, Denis Zorin, and
Daniele Panozzo. A Large Scale Comparison of Tetrahedral and Hexahedral
Elements for Finite Element Analysis. arXiv preprint arXiv:1903.09332, 2019.
[73] Adriana Schulz, Jie Xu, Bo Zhu, Changxi Zheng, Eitan Grinspun, and Wojciech
Matusik. Interactive design space exploration and optimization for CAD models.
ACM Transactions on Graphics, 36(4):1–14, July 2017.
[74] Noor Shaker, Julian Togelius, and Mark J Nelson. Procedural content generation
in games. Springer, 2016.
[75] Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser. The
princeton shape benchmark. In Proceedings Shape Modeling Applications, 2004.,
pages 167–178. IEEE, 2004.
[77] Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, and Chun-
fang Liu. A Survey on Deep Transfer Learning. arXiv:1808.01974 [cs, stat],
August 2018. arXiv: 1808.01974.
[78] Tin Kam Ho. Random decision forests. In Proceedings of 3rd International
Conference on Document Analysis and Recognition, volume 1, pages 278–282,
Montreal, Que., Canada, 1995. IEEE Comput. Soc. Press.
[79] Stavros Tseranidis, Nathan C Brown, and Caitlin T Mueller. Data-driven ap-
proximation algorithms for rapid performance evaluation and optimization of
civil structures. Automation in Construction, 72:279–293, 2016. Publisher: El-
sevier.
[80] Nobuyuki Umetani. Exploring generative 3D shapes using autoencoder networks.
In SIGGRAPH Asia 2017 Technical Briefs on - SA ’17, pages 1–4, Bangkok,
Thailand, 2017. ACM Press.
[81] Nitika Verma, Edmond Boyer, and Jakob Verbeek. FeaStNet: Feature-Steered
Graph Convolutions for 3D Shape Analysis. In 2018 IEEE/CVF Conference on
Computer Vision and Pattern Recognition, pages 2598–2606, Salt Lake City, UT,
June 2018. IEEE.
[83] Nikolaos Vlassis, Ran Ma, and WaiChing Sun. Geometric deep learning for
computational mechanics Part I: Anisotropic Hyperelasticity. Computer Meth-
ods in Applied Mechanics and Engineering, 371:113299, November 2020. arXiv:
2001.04292.
[85] Sifan Wang, Yujun Teng, and Paris Perdikaris. Understanding and mitigating
gradient pathologies in physics-informed neural networks. arXiv:2001.04536 [cs,
math, stat], January 2020. arXiv: 2001.04536.
[87] Karl DD Willis, Yewen Pu, Jieliang Luo, Hang Chu, Tao Du, Joseph G
Lambourne, Armando Solar-Lezama, and Wojciech Matusik. Fusion 360 Gallery:
A Dataset and Environment for Programmatic CAD Reconstruction. arXiv
preprint arXiv:2010.02392, 2020.
[88] Jin-Long Wu, Heng Xiao, and Eric Paterson. Physics-Informed Machine Learning
Approach for Augmenting Turbulence Models: A Comprehensive Framework.
Physical Review Fluids, 3(7):074602, July 2018.
[89] Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou
Tang, and Jianxiong Xiao. 3D ShapeNets: A Deep Representation for Volumetric
Shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), June 2015.
[90] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and
Philip S. Yu. A Comprehensive Survey on Graph Neural Networks. IEEE
Transactions on Neural Networks and Learning Systems, pages 1–21, 2020.
[91] Jiayang Xu and Karthik Duraisamy. Multi-level convolutional autoencoder
networks for parametric prediction of spatio-temporal dynamics. Computer Methods
in Applied Mechanics and Engineering, 372:113379, 2020.
[92] Zack Xuereb Conti and Sawako Kaijima. A flexible simulation metamodel for
exploring multiple design spaces. In Proceedings of IASS Annual Symposia,
volume 2018, issue 2, pages 1–8. International Association for Shell and Spatial
Structures (IASS), 2018.
[93] Soyoung Yoo, Sunghee Lee, Seongsin Kim, Kwang Hyeon Hwang, Jong Ho Park,
and Namwoo Kang. Integrating Deep Learning into CAD/CAE System:
Generative Design and Evaluation of 3D Conceptual Wheel. arXiv preprint
arXiv:2006.02138, 2021.
[94] Zhibo Zhang, Prakhar Jaiswal, and Rahul Rai. FeatureNet: Machining feature
recognition based on 3D Convolution Neural Network. Computer-Aided Design,
101:12–22, August 2018.
[95] Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang,
Changcheng Li, and Maosong Sun. Graph Neural Networks: A Review of
Methods and Applications. arXiv preprint arXiv:1812.08434, 2019.
[96] Qingnan Zhou and Alec Jacobson. Thingi10K: A dataset of 10,000 3D-printing
models. arXiv preprint arXiv:1605.04797, 2016.
[97] Ciyou Zhu, Richard H Byrd, Peihuang Lu, and Jorge Nocedal. Algorithm 778:
L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization.
ACM Transactions on Mathematical Software (TOMS), 23(4):550–560, 1997.