
Technology|Architecture + Design

ISSN: 2475-1448 (Print) 2475-143X (Online) Journal homepage: https://www.tandfonline.com/loi/utad20

Generative Deep Learning in Architectural Design

David Newton

To cite this article: David Newton (2019) Generative Deep Learning in Architectural Design,
Technology|Architecture + Design, 3:2, 176-189, DOI: 10.1080/24751448.2019.1640536

To link to this article: https://doi.org/10.1080/24751448.2019.1640536

Published online: 25 Oct 2019.


David Newton
University of Nebraska-Lincoln


PEER REVIEW / OPEN


Generative adversarial networks (GANs) are an emerging research area in deep learning and have demonstrated impressive abilities to synthesize designs; however, their application in architectural design has been limited. This research provides a survey of GAN technologies and contributes new knowledge on their application in select architectural design tasks involving the creation and analysis of 2D and 3D designs from specific architectural styles. Experimental results demonstrate how the curation of training data can be used to control the fidelity and diversity of generated designs. Techniques for working with small training sets are introduced and shown to improve the visual quality of synthesized designs. Lastly, experiments demonstrate how GANs might be used analytically to gain insight into specific architectural oeuvres.
Introduction

Researchers in computational generative design have long sought approaches that could realize the goal of elevating the computer from a dumb drafting machine to that of an artificially intelligent agent capable of participating in the creation of designs as a collaborating partner, or as the primary author. Russell and Norvig (2010, 1–4) define this agent-based type of artificial intelligence (AI) as the ability to learn from and interpret experience to meet goals adaptively. The impact of creating such an agent could be transformative for the discipline of architecture. As a collaborating partner in the design process, these agents could drastically reduce labor time, improve design quality, and allow for more affordable architecture by helping architects explore designs more efficiently. In more autonomous roles, they may redefine the role of the architect, as certain design tasks become completely automated. Although progress in the development of these artificially intelligent agents has been limited, recent advancements in AI research have the potential to move the discipline significantly closer to their realization.

Machine learning (ML) is a subfield of AI focused on the development of techniques to recognize patterns in data (Russell and Norvig 2010, 1–4). The emergence of deep neural networks (DNNs) in the field of ML has revolutionized the automation of tasks involving recognizing qualitative patterns in data. These deep discriminative models have performed so well that they have even been able to outperform human experts on classification tasks as varied as image and sound recognition. In addition to discriminative tasks, DNNs are also capable of generative tasks in which data patterns can be produced. These “deep generative models” have been less studied but are emerging as a major field of study. One deep generative model that has demonstrated impressive results in the generation of 2-D and 3-D designs is the generative adversarial network (GAN) (Goodfellow et al. 2014). GANs have demonstrated a level of generative sophistication that previous work in computational generative design has not equaled. Their ability to learn from examples and extrapolate that learning into the creation of new instances has the potential to impact many design-related disciplines, but their application in architecture has only been explored in a limited fashion in the generation of 2-D images of plans and facades.

This research addresses this gap in previous work and contributes knowledge on the potentials and limitations of the GAN deep generative model in 2-D and 3-D architectural design generation tasks. One contribution of the work is the use of GANs to generate 3-D building massing models within a specific urban and stylistic context. Another is their use to generate architectural plans and facade designs from particular stylistic movements in the history of architecture. Lastly, the research provides insight into the curation of architectural datasets for GAN use, as well as techniques to deal with small datasets.

Previous Approaches in Generative Design
Previous research on computational generative design has explored a wide range of both deterministic (i.e., always produces the same output from the same input) and nondeterministic (i.e., usually produces a different output from the same input) approaches to the problem of automating design production. These approaches can be divided into four categories: optimization and search; physically based algorithms; generative grammars; and probabilistic algorithms.

Optimization and Search Algorithms
Optimization and search algorithms have been used extensively in a number of fields in engineering for design generation (Deb 2012, 8–32), as well as in architectural design on a variety of design tasks such as plan generation (Caldas 2008), building massing (Besserud and Cotten 2008; Frazer 1995, 65–75), and building envelope optimization (Gange and Andersen 2010; Turrin et al. 2011). Their application in the discipline has demonstrated the ability to find designs through both deterministic and nondeterministic methods (Wortmann 2017). These approaches show an ability to learn from a short window of past experience, but can be easily misled because of their short memory (Whitley 1991). DNNs have also been combined with these methods in order to improve their speed (Snoek et al. 2015) and to represent qualitative objectives (Newton 2018).

Physically Based Algorithms
Physically based algorithms abstract form-generating processes found in nature as the basis to generate architectural forms. Some early work in this area used physics-based computer animation techniques to create schematic building massing from simulated forces (Lynn and Kelly 1999). More recent work combines sophisticated physics simulation with mathematical growth models to generate structural designs (Klemmt and Bollinger 2017). Other researchers have explored the simulation of agents modeled on biological processes, such as reaction-diffusion processes (Erioli and Zomparelli 2012) and cellular automata (Dincer et al. 2014), to create architectural forms. This work has demonstrated an ability to simulate specific form-generating processes in nature, but not an ability to learn from and interpret experience.

Generative Grammars
Generative grammars use the iterative application of a set of replacement rules to create 2-D and 3-D designs (Stiny 1980). They have been used in the creation of architectural plans (Stiny and Mitchell 1978), master planning (Halatsch et al. 2008), and to generate architectural facades (Müller et al. 2006). They have also been used in the generation of 3-D building designs (Koning and Eizenberg 1981). This previous work demonstrates an ability to automate design creation from a set of predefined replacement rules defined by the designer, but does not demonstrate the ability to learn.

Probabilistic Algorithms
Probabilistic generative approaches use probability distributions generated from example designs to guide the creation of new ones. Bayesian networks have been used to generate architectural plans from examples (Merrell et al. 2010). Markov chains have been used to generate urban plans (Swahn 2018).
In addition, they have been used for recognition tasks involving 3-D objects (Li et al. 2003). Research in this area demonstrates the ability to learn from experience and to interpret data, but the precision with which probabilistic models are able to accomplish these tasks is outperformed significantly by approaches that use DNNs.

In summary, the majority of surveyed generative approaches demonstrate little to no artificial intelligence by the definition given by Russell and Norvig (2010, 1–4). Specifically, they demonstrate little ability to learn from experience and to interpret that experience adaptively towards the goal of generating architectural designs.

Opening Figure: Facade designs created by a generative adversarial network (GAN) at different stages of training as it learns to synthesize new designs from examples.

Deep Generative Models for Design
In contrast to previously described approaches, deep generative models have demonstrated the ability to learn from and interpret data to synthesize 2-D and 3-D designs with a level of sophistication and adaptability that outperforms competing approaches (Creswell et al. 2018). Deep generative models accomplish this through the use of multiple layers of artificial neurons to create a piece-wise high-dimensional function able to approximate the probability distributions that underlie the qualitative features of 2-D and 3-D design examples (Xu et al. 2015). These learned abstractions have even been visualized and used to analyze the deep organizational nature of specific 2-D images (Radford et al. 2015).

These networks are trained through either supervised (i.e., all training data is labeled, which requires significant labor), semi-supervised (i.e., some training data is labeled), or unsupervised (i.e., no training data is labeled) methods. Unsupervised methods are the best suited for researchers in generative design because they require less labor for data preparation than other approaches. They have also been the most explored approach among ML researchers due to these capabilities, and their application has been explored on a variety of 2-D and 3-D design tasks. A full list of unsupervised approaches is found in Goodfellow et al. (2016, 651–716).

Figure 1. The diagram shows the generator and discriminator networks that make up a GAN. The GAN architecture shown here is for the DCGAN.

Generative adversarial networks (GANs), proposed by Goodfellow et al. (2014), are the most widely used unsupervised deep generative model. They use two competing DNNs (i.e., a generator network and a discriminator network) to competitively learn from examples and create new data instances related to those examples. Figure 1 shows the basic architecture of a GAN. Early uses of GANs involved the synthesis of simple 2-D images, such as handwritten digits (Goodfellow et al. 2014). Radford et al. (2015) then proposed the deep convolutional GAN (DCGAN) to improve the GAN’s ability to represent, recognize, and synthesize such images. An architecturally relevant example of image synthesis they demonstrate involves using DCGAN to learn from example photographs of real bedrooms in order to generate images of new bedroom configurations (Radford et al. 2015). Wu et al. (2016) adapt DCGAN into three dimensions and propose 3D-GAN for the synthesis of 3-D objects, using it to create new designs of furniture, vehicles, and guns. This previous work demonstrated that GANs could be used for 2-D and 3-D design synthesis, but it also revealed that the output of that synthesis was difficult to control. For example, generating a chair with predefined features (e.g., specific armrest styles, etc.) was not possible.
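The competition between generator and discriminator can be made concrete with the loss values each network tries to minimize. The sketch below is illustrative only and is not taken from the article's implementation: it assumes the binary cross-entropy formulation of Goodfellow et al. (2014) with the commonly used non-saturating generator loss, and the function names and the `eps` guard are hypothetical choices made for this example.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Cross-entropy the discriminator minimizes.

    d_real: discriminator outputs D(x) on training examples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated examples, each in (0, 1).
    eps is an assumed numerical guard against log(0).
    """
    real_term = sum(-math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: small when the discriminator
    scores generated examples close to 1 (i.e., is fooled)."""
    return sum(-math.log(p + eps) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # A confident, correct discriminator has a low loss ...
    print(discriminator_loss([0.9, 0.9], [0.1, 0.1]))
    # ... while the generator's loss is high because its samples are rejected.
    print(generator_loss([0.1, 0.1]))
```

When the discriminator can no longer tell real from generated examples it outputs 0.5 for everything, and its loss settles at 2 ln 2; loss curves like those the article plots per epoch are read against this kind of reference point.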
Chen et al. (2016) proposed the information GAN (infoGAN) to address this issue. InfoGAN uses conditional GANs (cGANs) in order to allow users to control image synthesis. Isola et al. (2017) extended the use of cGANs in a software tool called “pix2pix” to interpret the real-time sketching of designers in order to generate new 2-D designs for products and building facades. Zhu et al. (2017) proposed cycleGANs as a generalized form of cGANs able to work with unpaired data to perform 2-D and 3-D style transfer tasks (Zhang et al. 2018) in which the qualitative features (e.g., brush strokes, color palette, geometric language, etc.) from one design are transferred to another. In order to improve the control and fidelity of GAN synthesis in 2-D, Li et al. (2017) proposed the adversarially learned inference with conditional entropy (ALICE) GAN. This model, in addition to the adversarial variational Bayes (AVB) GAN proposed by Mescheder et al. (2017), demonstrated improvement in both control and fidelity of synthesized 2-D images over previous approaches, but their application in 3-D synthesis has not yet been explored.

GANs are able to generate the highest resolution images of any deep generative model (Creswell et al. 2018). This property, together with the range of synthesis problems to which they can be applied, makes them useful for architectural design. Their use in the field, however, has been limited to the synthesis of 2-D images and graph structures. Huang and Zheng (2018) applied pix2pix to generate 2-D images of floor plans for a single-family house. Zheng (2018) also explored pix2pix to generate aerial images of cities from sketches. Chaillou (n.d.) used pix2pix in a nested fashion to create multiunit residential plans nested within site boundaries. As, Pal, and Basu (2018) used infoGAN to generate graph structures representing the layout of single-family houses.

Figure 2. Left: Training losses per epoch for DCGAN on Le Corbusier plans training set. Right: Training losses per epoch for WGAN on Le Corbusier plans training set.

Figure 3. Images of plans generated with the WGAN from a training set of 100 example images of Le Corbusier plans after 1000 epochs of training.
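The control that conditional GANs offer comes from giving the generator a condition alongside its random noise. A minimal sketch of that idea follows, assuming the simplest conditioning scheme (a one-hot label concatenated to the noise vector); it does not reproduce infoGAN, pix2pix, or any other model cited above, and the style labels, function names, and noise dimension are hypothetical.

```python
import random

# Hypothetical condition labels, chosen only for illustration.
STYLES = ("gothic", "modernist", "classical")


def one_hot(index, size):
    """Return a list of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def conditioned_latent(style, noise_dim, rng):
    """Build a generator input: Gaussian noise followed by a one-hot style code.

    The noise supplies variety; the appended code tells a conditional
    generator which kind of output is being requested.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(STYLES.index(style), len(STYLES))


if __name__ == "__main__":
    vector = conditioned_latent("modernist", noise_dim=8, rng=random.Random(0))
    print(len(vector), vector[-3:])
```

During training the discriminator would see the same condition, so the generator is penalized when its output does not match the requested label.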
Figure 4. Top row: Examples of Gothic facades generated with DCGAN after 500 epochs with no data augmentation. Bottom row: Examples of same process with augmentation of the training set from 239 to 1000 images.

Table 1. Parameter Settings for the 2D and 3D GAN Synthesis Experiments

| Generative Task | GAN Used | Training Set Size | Optimizer | Learning Rate (Generator / Discriminator) | Epochs | Training Time | Batch Size | Augmentation | Input/Output Resolution |
|---|---|---|---|---|---|---|---|---|---|
| 2D Plans – Le Corbusier | DCGAN | 100 | ADAM | 0.002 / 0.00005 | 300 | 1 hour | 32 | None | 256 x 256 / 256 x 256 |
| 2D Plans – Le Corbusier | WGAN | 100 | ADAM | 0.002 / 0.00005 | 1000 | 3 hours | 32 | None | 256 x 256 / 256 x 256 |
| 2D Facade – Gothic Style | DCGAN | 239 | ADAM | 0.002 / 0.00005 | 500 | 2 hours | 32 | None | 256 x 256 / 256 x 256 |
| 2D Facade – Gothic Style | DCGAN | 1000 | ADAM | 0.002 / 0.00005 | 500 | 10 hours | 32 | Reflect, Skew, Tilt | 256 x 256 / 256 x 256 |
| 2D Facade – 20th–21st Century Styles | DCGAN | 606 | ADAM | 0.002 / 0.00005 | 500 | 4 hours | 32 | None | 256 x 256 / 256 x 256 |
| 2D Facade – 20th–21st Century Styles | DCGAN | 1000 | ADAM | 0.002 / 0.00005 | 500 | 10 hours | 32 | Reflect, Skew, Tilt | 256 x 256 / 256 x 256 |
| 3D Building Massing in NYC | 3D-IWGAN | 7500 | ADAM | 0.002 / 0.00005 | 1000 | 24 hours | 100 | Rotation | 32 x 32 x 32 / 32 x 32 x 32 |
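Table 1 lists reflect, skew, and tilt as the operations used to grow the facade training sets to 1,000 images. The article does not show its augmentation scripts, so the following is only a sketch of how such operations might look on a grayscale image stored as a list of rows. The function names, the nearest-neighbour sampling, the fill value, and the shear and rotation ranges are all assumptions made for this example.

```python
import math
import random


def reflect(img):
    """Mirror a grayscale image (list of rows) left to right."""
    return [row[::-1] for row in img]


def warp(img, a, b, c, d, fill=0):
    """Resample img through the 2x2 matrix [[a, b], [c, d]].

    The matrix maps each output pixel offset (measured from the image
    centre) to a source offset; the nearest source pixel is copied, and
    positions that fall outside the image receive `fill`.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = int(round(a * dx + b * dy + cx))
            sy = int(round(c * dx + d * dy + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def skew(img, k):
    """Horizontal shear: rows shift sideways in proportion to their height."""
    return warp(img, 1.0, k, 0.0, 1.0)


def tilt(img, degrees):
    """Small rotation about the image centre."""
    t = math.radians(degrees)
    return warp(img, math.cos(t), -math.sin(t), math.sin(t), math.cos(t))


def augment(images, target, rng):
    """Keep the originals and append random variants until `target` is reached.

    The parameter ranges below (shear of +/-0.2, tilt of +/-10 degrees)
    are assumed values, not settings reported in the article.
    """
    out = list(images)
    while len(out) < target:
        src = rng.choice(images)
        op = rng.choice(("reflect", "skew", "tilt"))
        if op == "reflect":
            out.append(reflect(src))
        elif op == "skew":
            out.append(skew(src, rng.uniform(-0.2, 0.2)))
        else:
            out.append(tilt(src, rng.uniform(-10.0, 10.0)))
    return out
```

Called as `augment(gothic_images, 1000, random.Random(0))` on a list of 239 images, a routine of this kind would return the originals plus 761 transformed variants. In practice an image library would normally replace the pure-Python loops, which are written out here only to keep the example self-contained.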
Figure 5. Left: Training losses per epoch for DCGAN on Gothic facade training set without data augmentation. Right: Training losses per epoch for DCGAN on Gothic facade training set with data augmentation.

Methodology
This research will contribute to the previous work discussed in a number of ways. First, it will explore the potential of GANs to generate architectural plans relating to particular stylistic movements in architecture from limited data. Specifically, experiments will investigate how small training sets affect GAN synthesis and how different loss functions can be used to improve results. Next, their ability to generate 2-D images of building facades related to specific architectural styles will be demonstrated, and augmentation techniques for small training sets will be tested. Experiments will also explore how the DNN layers of GANs can be visualized to provide analytic insight. Lastly, the use of 3-D GANs to generate 3-D building designs within a specific urban context will be explored. The results of these generative tasks are evaluated qualitatively through the visual comparison of synthesized 2-D and 3-D designs and their respective training sets by an expert in the field of architecture.

GANs for Stylistic Architectural Plan Generation and Analysis
The first generative task explored in this research was the generation of 2-D architectural plans in the International Style of architecture represented by the work of Le Corbusier. The training data for this task consisted of 100 scanned images of architectural plans by Le Corbusier from a variety of projects. This training set size was quite small relative to norms for working with GANs, but this choice provided a chance to test the capabilities of GANs on training sets which are more the norm for architectural subjects. As a point of reference, training set sizes that have been demonstrated to achieve peak performance for image synthesis tasks can contain 10,000 to 50,000 images (Im et al. 2018). The plans were grayscale and scaled to fit within a 256 x 256-pixel area. The plans were therefore at different scales relative to one another, creating a training set that exhibited high diversity in scale and content. This data was prepared by hand and took about sixteen hours.

Two different GANs were used in this generative task. The deep convolutional generative adversarial network (DCGAN) (Radford et al. 2015) was used first and then the Wasserstein generative adversarial network (WGAN) (Arjovsky et al. 2017) was applied. Both GANs were coded in Python using the TensorFlow and Keras libraries for deep learning. These libraries were chosen because of their extensive documentation and relative ease of use compared to other available options (e.g., PyTorch; Caffe; Microsoft Cognitive Toolkit). Implementations of both GAN architectures are available through GitHub. The GANs were trained on a workstation with one GTX 1080 NVIDIA GPU using the parameters listed in Table 1. These settings were determined based on what the authors of the various GAN architectures recommended. Due to the computationally expensive nature of working with GANs, training on machines without a GPU is not advisable (Shi et al. 2016).

The results of early experiments with DCGAN were not very legible due to training instability. Dealing with instability during training is common when working with GANs, and there are a couple of primary problems (Goodfellow et al. 2014). One issue, called “mode collapse,” occurs when the generator network converges quickly and is only able to produce a small set of similar examples with little diversity (Goodfellow et al. 2014). Another issue involves the discriminator network converging before the generator can learn to synthesize data properly (Salimans et al. 2016). Figure 2 (left) shows the training losses for DCGAN per epoch (i.e., one full pass over the training set) on the Le Corbusier plan training set and reveals that this latter issue was the main culprit, most likely due to the small training set. Creswell et al. (2018) provide an overview of heuristics to address both problems. In this work, the problem was mitigated through the use of WGAN. WGAN uses a different loss function than DCGAN that has been shown to address some of its stability issues (Arjovsky et al. 2017). The parameter settings for the training with WGAN can be seen in Table 1. Figure 2 (right) shows that the premature convergence issue was reduced but not eliminated with the use of WGAN. The long-term loss trajectory of the generator network is still not showing signs of decreasing through the training epochs. Ideally, this loss would decrease through time until it gets lower than the discriminator’s, but a small dataset with high diversity
in content requires additional training examples to improve performance further.

The results generated with the WGAN can be seen in Figure 3. They have a sketchy quality owing to the small size of the curated training set and the diversity of the plans chosen, yet they reveal common organizational themes found in Le Corbusier’s plans. In the top row of Figure 3, a pinwheel pattern becomes apparent as an organizing device. This pinwheel pattern is found in several Le Corbusier projects (e.g., the Venice Hospital, the planned Palais des Congrès in Strasbourg). Parts d and e show spiraling forms negotiating between low and high spatial density to organize what looks like circulation cores around their center of mass. The generated plans also reveal a hierarchical set of axial cores that create an asymmetric spatial counterpoint and provide a scaffold for clustered cellular growths. Part f in the figure seems to reflect Le Corbusier’s diverse use of geometric types as it mixes curvilinear geometry at the perimeter of the composition with a latticework of interior orthogonal partitions.

The results from this generative task imply that even small training set sizes can generate useful results to inform architects. The low resolution of the synthesized plans can actually be a benefit in two key ways. Firstly, they can provide an analytic lens to reveal organizational features that are implicit in a body of architectural work. Secondly, the unfinished and open-ended quality of the images can inspire interpretation and lead to new design ideas. Previous research on the use of “glitches” in digital images for the generation of architectural designs relies on a similar interpretive method (Austin and Matthews 2018). This point contrasts with previous work that typically assumes GANs should be used to produce photo-realistic images.

Figure 6. Top row: Examples of facades generated with DCGAN from the CMP facades dataset after 500 epochs with no data augmentation. Bottom row: Examples of same process with augmentation of the training set from 606 to 1000 images.

GANs for Stylistic Facade Generation and Analysis
The second 2-D generation task involved the generation of building facades from two different architectural styles. A key area of investigation was evaluating the effect of techniques to increase the size of the curated training sets. Specifically, image augmentation techniques were explored. Image augmentation can increase the number of training examples by applying a variety of operations (e.g., rotate, translate, skew, tilt, reflect, scale, add noise, etc.) to existing training examples. Another important area of investigation involved studying how the diversity of the training examples impacted the diversity of the synthesized images.

The first training set used here was curated with exterior photographs of buildings from the Gothic architectural style. The Gothic training dataset consisted of 239 exterior photos of multiple Gothic buildings scraped from Google’s image search using a Python script. The second training set was the Center for Machine Perception (CMP) facade dataset (Tyleček and Šára 2013), which contains 606 photos of facades from a number of different architectural styles largely from the twentieth and twenty-first centuries. Python scripts were used to augment both training sets. Specifically, reflect, tilt, and skew operations were used to produce a training set size of 1,000 images for each training set. These training sets were tested against versions with no augmentation to evaluate the effects of augmentation on image synthesis.

The GAN models used for this generation task and the training parameters are shown in Table 1. Due to the larger size and content (i.e., photographs versus drawings) of the training sets, the DCGAN performed better than WGAN for this generation task. The results for the generation of facades in the Gothic style are shown in Figure 4. In the top row of the figure, sample outputs are shown from the training set without augmentation. As with the plan generation, the small training set tended to
produce images that were more abstract in appearance. Their partially finished contours suggest new Gothic compositions, but leave enough openness that the mind can make interpretive connections. In the bottom row of the figure, samples are shown from the training set with augmentation. The resolution of these samples is qualitatively much better than that of the samples from the training set without augmentation. In these images, entrances, buttresses, archways, windows, and spires come into full view. The graphs of the training losses in Figure 5 confirm that the generator is doing a slightly better job at fooling the discriminator with the augmented training set as seen on the right, as compared to the left. The figure also shows that the networks are more stable in their competition with one another as compared to those in Figure 2. This is most likely due to the larger training set size in relation to the diversity of the training set content. These results demonstrate that augmentation techniques can be applied to small architectural datasets to improve the visual quality of synthesized images significantly.

Figure 7. Sample activation patterns from the artificial neuron layers of the generator network for the CMP dataset.

The results from the CMP facade dataset can be seen in Figure 6. In the top row of the figure, the results from the non-augmented training set can be seen. Pictured here, the rhythmic grids from more modern architectural movements can be visually intuited as if emerging from a dream. This more abstract and painterly quality could be useful for architects as generative tools, by allowing designers to make creative associations from these partial, atmospheric, and expressionistic forms. In the bottom row of the figure, which uses augmented images in its dataset, the generated compositions take on a new level of realism as Neoclassical, Romanesque, and Modernist style facades seem to blend together, creating new hybrid conditions. This blending is the direct result of the curation of the training set. The many styles in it allowed the GAN to generate hybrid conditions within a larger design space. If an architect wanted to synthesize facade images from a particular movement, then the training set could be curated with images of that style, such as the Gothic.

In addition to these synthesized results, the trained GANs can also provide analytic insights into the architectural styles they are generating. This can be done by visualizing the internal representations they have learned during the training. Figure 7 shows a visualization of the features the generator network of the GAN has learned that help it to create new instances from a given training set. In the top of the figure, sample activation patterns from the artificial neuron layers of the generator network for the CMP dataset are shown. The first layer of the network shows high-level features being learned that are part of the deep organizing logic of the dataset, specifically, the organizing grids of the more modern facades. The second layer shows lower-level features related to shading. The bottom of the figure shows the final synthesized image that relates to the activation patterns. Figure 8 shows the activation patterns of the discriminator network for the CMP dataset. Layer 1 in the figure shows low-level features being learned, such as corners and edges, while layers 2–3 begin to reflect more compact representations of higher-level features (e.g., window arrangements) that combine several lower-level features. For brevity, the figures only show select samples of activation patterns from each layer, but there are dozens of features learned for each layer that could be used for analytic insight into these architectural styles. Choosing which layers might be most relevant is an open problem and requires more research.

GANs for 3-D Architectural Design Generation
The generation of 3-D objects using GANs is a more involved process than the generation of 2-D images. The process requires the choice of a representational strategy for these 3-D objects. There are several representational strategies available for architects to consider, such as point clouds (Li et al. 2018), the use of 2-D views (Su et al. 2015), and mixed methods (Hegde and Zadeh 2016). A voxel-based representational strategy was used for this research because of its straightforward implementation and its potential to represent properties relevant for the specificities of architectural design in addition to shape, such as materiality, atmospheric qualities, and socioeconomic factors (Prado 2019). Voxel-based approaches represent geometric objects with a 3-D matrix of pixels (Maturana and Scherer 2015; Brock et al.
NEWTON 185

The results for this 3-D generation task are pictured in Figure
10. These generated massing models reflect the limitations of
the low resolution available with voxel-based approaches. The
process generated several strange hybrid massing forms. Two of
these can be seen in Figure 10. On the left, a multitude of thin
towers sprout from a large plinth. On the right, a large ellipti-
cal tower form seems to fragment into tiny towers. As with the
experiments presented with 2-D image synthesis, the curation
of the dataset was crucial in the control of the diversity of forms
produced. The wide spectrum of building styles found in down-
town Manhattan led to a GAN synthesis that was quite diverse,
blending several styles.
Limited as they are, in terms of their blocky resolutions, these
results become suggestive of new organization ideas. Through
the use of higher resolution representational approaches, such
as point clouds, 3-D GANs could be useful for a number of other
3-D generation tasks, such as urban, landscape, facade, and
detail design. Materiality could also be brought into the genera-
tive learning by associating voxels, or points in a point cloud, with
particular material properties. 3-D GANs could also be used to
generate 3-D models that explore more abstract organization-
al issues such as program and circulation organization. Further,
they could be used to generate expressionistic and conceptual

PEER REVIEW / OPEN


models that could be creatively interpreted by architects, such
as with the examples of 2-D image synthesis covered previously.

Figure 8. Sample activation patterns from the artificial neuron layers of the discriminator network of the CMP dataset.

2016; Wu et al. 2016). One major drawback of this approach is that their maximum resolution when used with GANs tops out at around 64 x 64 x 64 voxels before they require significant computing resources to train (Smith and Meger 2017). This means that generated designs tend to look blocky, and fine-grained detail may be lost. Point cloud approaches do not have these drawbacks, but their implementation is more challenging, and they cannot represent nongeometric data with the same flexibility as voxels.
The training data consisted of 500 building massing models located in downtown New York City translated into a voxel representation with a resolution of 32 x 32 x 32 voxels. The models were downloaded from an online supplier of CAD models (e.g., Cadmapper, cityGML). The dataset was augmented through the use of rotation to create a final training set size of 7,500. The models were transformed into a voxel-based representation with a custom Python script and then used to train a 3-D GAN implemented in Python using the same libraries and computer setup mentioned previously. Several 3-D GAN architectures were explored, but the most successful was the improved Wasserstein generative adversarial network (3D-IWGAN) (Smith and Meger 2017). The parameters for the training can be seen in Table 1 and a diagram of the basic 3-D GAN architecture used is pictured in Figure 9.

Discussion
The results of the research demonstrate that the curation of architectural datasets is important when working with GANs. In curating the type (e.g., 2-D image; 3-D model; text; etc.), size (i.e., number of training examples), and content (e.g., plans, perspective images, architectural styles, etc.) of the training set, the architect directly impacts the generative output of the GAN. For example, if an architect curates a training collection that is small in size and focused on a particular style, such as Gothic architecture, the GAN will produce images of Gothic architecture that are more atmospheric and abstract in nature. If that training set size is increased, the results will become much more photorealistic. In addition, if the variety of images in the training set is high, the output will be more diverse, while a training set with less diversity will produce more focused results. For example, a training set with images of Classical facades will tend to generate new images of Classical-looking facades. A training set that has fifty percent Classical and fifty percent Modernist examples will create facade images that look classical, modern, and/or a blend between the two. Through these curation choices, the architect can control how the GAN explores a space of possibilities and whether the synthesized images provide more literal or conceptual design prompts.
This research also demonstrated several ways that GANs might play a role in the architectural design process. First, they have value as analytic tools capable of revealing hidden organizational principles in the architectural work introduced as precedents. Second, they can be used to create diagrammatic, conceptual, and expressionistic images that can serve as the basis for further analysis and ideation by the designer. Third, they can be used to produce specific designs. All of these
applications require a designer to be in the loop. GANs, therefore, may lend themselves to more collaborative roles in the design process, rather than autonomous ones.

Figure 9. The diagram shows the generator and discriminator networks that make up the 3-D IWGAN.

The challenges and limitations for architects working with these tools are significant and worthy of some brief discussion. Curating the training set of a GAN provides some control of GAN synthesis, but it has limits. For example, if an architect wanted to use GANs to explore facade variations in which specific design features (e.g., fenestration pattern, materiality, etc.) were varied while others were held constant, another mode of control would be needed. To address this issue, researchers have proposed several GAN variants described previously (e.g., InfoGAN, ALICE, AVB) that can allow for some of this fine-grained control.
Another challenge GANs pose is that they cannot produce design examples that may be truly novel, if novelty is defined as generating a design outside the implicit probability distribution of the training data (Cherti et al. 2017). This in-distribution sampling generates objects from the category of the objects in the training set. This synthesis therefore often creates look-alikes that resemble the original training examples. Research in this area is developing fast, however, and Cherti et al. (2017) propose one way to address this issue through out-of-distribution sampling from dynamic probability distributions.
Training deep generative models, such as GANs, also requires significant computational resources and time, and it presents a very steep learning curve for architects without any coding or ML experience. Once trained, many deep generative models are only capable of creating 2-D and 3-D designs at fairly low resolutions due to computational bottlenecks. These challenges are being addressed, however, through the availability of high-performance cloud computing services and deep learning libraries that are increasingly less costly and easier to use. Furthermore, partnerships between architectural offices and university research groups also have the potential to address the learning curve issue.
The scarcity and limited accessibility of 2-D and 3-D architecturally relevant training data are also a major roadblock that needs to be addressed. For example, some architectural subjects may have a relatively low number of 2-D or 3-D examples (e.g., Gothic architecture, Le Corbusier’s plans, etc.) when compared to training set sizes (e.g., 10,000–50,000) more typical of peak DNN and GAN performance (Im et al. 2018). In other circumstances, there may be ample data, but it may not be easily accessible without large investments in labor. This limited availability of 2-D and 3-D architectural datasets has held back exploration of ML technologies in the discipline. Data augmentation techniques can play a role in addressing some of these issues. The development of accessible databases could have an even larger impact for the architectural community, however, by allowing researchers across disciplines to explore architectural datasets.

Conclusions
This research explored the application of the GAN deep generative model on 2-D and 3-D architectural design tasks and contributed knowledge on its use with architectural datasets, its potential as a design tool, and its limitations. Specifically, experiments demonstrated how the curation of 2-D and 3-D training data can be used to control the visual quality and diversity of synthesized designs from different stylistic movements in architecture for architectural plans, facades, and 3-D massing models. Techniques for working with small training sets were also introduced and shown to improve the fidelity of generated designs, implying that small datasets are not a roadblock to using deep generative models in architecture. Further, experiments showed how GANs might be used for analytic purposes on precedent architectural work and to create conceptual images for design ideation. These results point towards a number of important avenues for future research.
One area of future development is evaluating the performance of alternative GAN architectures (e.g., InfoGAN; ALICE; AVB) and deep generative models (e.g., variational autoencoders and autoregressive models) for architectural design. The evaluation of deep generative models, however, remains an open problem in the field of ML (Creswell et al. 2018). Due to the visual
nature of architectural design, qualitative assessments may make the most sense as a mode of evaluation for the discipline. Using multiple human assessors in the form of “Mechanical Turks” or focus groups to evaluate the quality of synthesized designs can improve the rigor of such measures and aid in the comparison of deep generative models.

Figure 10. 3-D GAN examples showing more aberrant and hybrid conditions.

Another area of development is investigating how GANs might be further integrated within various stages of the architectural design process for synthesis tasks at varying scales (e.g., site to detail scale) and levels of abstraction (e.g., conceptual diagrams to specific designs). Further, their development as analytic tools is particularly important. In addition, investigating their efficacy on different types of design problems is crucial. For example, their application to historic restoration seems particularly relevant. There are also a number of challenges that future research will need to address in relation to developing better ways of controlling GAN synthesis, data availability, and ease of use.
Deep generative models are a leading AI technology that may comprise the computer-aided design tools of tomorrow. Their successful development will require the engagement of the discipline to address these challenges and creatively purpose these tools.

Acknowledgments
This research was supported by funding from the College of Architecture at the University of Nebraska-Lincoln. This work was completed utilizing the Holland Computing Center of the University of Nebraska, which receives support from the Nebraska Research Initiative. Krishnamohan Sunkara served as the primary research assistant in the project.

References
Arjovsky, M., S. Chintala, and L. Bottou. 2017. “Wasserstein Generative Adversarial Networks.” In the International Conference on Machine Learning, 214–223. Sydney, Australia, August 7–9.
As, I., S. Pal, and P. Basu. 2018. “Artificial Intelligence in Architecture: Generating Conceptual Design via Deep Learning.” International Journal of Architectural Computing 16, no. 4: 306–327. https://doi.org/10.1177/1478077118800982.
Austin, M., and L. Matthews. 2018. “Drawing Imprecision: The Digital Drawing as Bits and Pixels.” In Recalibration on Imprecision and Infidelity: Proceedings of the Thirty-Eighth Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA), 36–45. Mexico City, Mexico, October 18–20.
Besserud, K., and J. Cotten. 2008. “Architectural Genomics.” In Proceedings of the Twenty-Eighth Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA), 238–245. Minneapolis, MN, October 16–19.
Brock, A., T. Lim, J. M. Ritchie, and N. Weston. 2016. “Generative and Discriminative Voxel Modeling with Convolutional Neural Networks.” Preprint, submitted August 15, 2016, pp. 1–10. http://arxiv.org/abs/1608.04236.
Caldas, L. 2008. “Generation of Energy-efficient Architecture Solutions Applying GENE_ARCH: An Evolution-based Generative Design System.” Advanced Engineering Informatics 22, no. 1: 59–70.
Chaillou, S. March 13, 2019. “AI & Architecture.” AI Time Journal (website). https://www.aitimejournal.com/@stanislas.chaillou/ai-architecture.
Chen, X., Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. 2016. “InfoGAN: Interpretable Representation Learning.” In the Thirtieth Conference on Neural Information Processing Systems (NIPS 2016), 2172–2180. Barcelona, Spain.
Cherti, M., B. Kégl, and A. Kazakçı. 2017. “Out-of-Class Novelty Generation: An Experimental Foundation.” In IEEE Twenty-Ninth International Conference on Tools with Artificial Intelligence (ICTAI), 1312–1319. Boston, Massachusetts, November 6–8.
Creswell, A., T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath. 2018. “Generative Adversarial Networks: An Overview.” IEEE Signal Processing Magazine 35, no. 1: 53–65. https://doi.org/10.1109/MSP.2017.2765202.
Deb, K. 2012. Optimization for Engineering Design: Algorithms and Examples. 2nd edition. New Delhi: PHI Learning Private Limited.
Dincer, A. E., G. Çağdaş, and H. Tong. 2014. “A Digital Tool for Customized Mass Housing Design.” In Proceedings of the Thirty-Second International Conference on Education and Research in Computer Aided Architectural Design, 1:10–12. Newcastle upon Tyne, September 10–12.
Erioli, A., and A. Zomparelli. 2012. “Emergent Reefs.” In Proceedings of the Thirty-Second Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA), 139–148. San Francisco, CA, October 18–21.
Frazer, J. 1995. An Evolutionary Architecture. London: Architectural Association Publications.
Gange, J. M. L., and M. Andersen. 2010. “Multi-Objective Facade Optimization for Daylighting Design Using a Genetic Algorithm.” In the Fourteenth National Conference of IBPSA USA SimBuild, 4.1: 110–117. New York City, NY, August 11–13.
Goodfellow, I., Y. Bengio, and A. Courville. 2016. Deep Learning. Cambridge, MA: MIT Press. https://doi.org/10.1016/B978-0-12-391420-0.09987-X. ISBN 9780262035613.
Goodfellow, I. J., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. “Generative Adversarial Nets.” In Proceedings of the Twenty-Seventh International Conference on Neural Information Processing Systems (NIPS), 2:2672–2680. Montreal, Canada, December 8–13.
Halatsch, J., A. Kunze, and G. Schmitt. 2008. “Using Shape Grammars for Master Planning.” In Proceedings of the Third International Conference on Design Computing and Cognition, 1:655–673. Atlanta, Georgia, June 23–25.
Hegde, V., and R. Zadeh. 2016. “FusionNet: 3D Object Classification Using Multiple Data Representations.” Preprint, submitted July 16, 2016, pp. 1–9. http://arxiv.org/abs/1607.05695.
Huang, W., and H. Zheng. 2018. “Architectural Drawings Recognition and Generation Through Machine Learning.” In Recalibration on Imprecision and Infidelity: Proceedings of the Thirty-Eighth Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA), 156–165. Mexico City, Mexico, October 18–20.
Im, D. J., H. Ma, G. Taylor, and K. Branson. 2018. “Quantitatively Evaluating GANs with Divergences Proposed for Training.” Paper presented at the International Conference on Learning Representations, Vancouver, BC, April 30–May 3.
Isola, P., J. Zhu, T. Zhou, and A. A. Efros. 2017. “Image-to-Image Translation with Conditional Adversarial Networks.” In Proceedings of the Thirtieth IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 1125–1134. Honolulu, HI, July 21–26. https://doi.org/10.1109/CVPR.2017.632.
Klemmt, C., and K. Bollinger. 2017. “Angiogenesis as a Model for the Generation of Load-Bearing Networks.” International Journal of Architectural Computing 15, no. 1: 18–37. https://doi.org/10.1177/1478077117691599.
Koning, H., and J. Eizenberg. 1981. “The Language of the Prairie: Frank Lloyd Wright’s Prairie Houses.” Environment and Planning B: Planning and Design 8, no. 3: 295–323. https://doi.org/10.1068/b080295.
Li, C., H. Liu, C. Chen, Y. Pu, L. Chen, R. Henao, and L. Carin. 2017. “ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching.” In the Thirty-First Conference on Neural Information Processing Systems (NIPS 2017), 5495–5503. Long Beach, CA, December 4–9.
Li, F., R. Fergus, and P. Perona. 2003. “A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories.” In Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV 2003), 1134–1141. Nice, France, October 13–16.
Li, J., B. M. Chen, and G. H. Lee. 2018. “SO-Net: Self-Organizing Network for Point Cloud Analysis.” In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 9397–9406. Salt Lake City, Utah, June 19–21. https://doi.org/10.1109/CVPR.2018.00979.
Lynn, G., and T. Kelly. 1999. Animate Form. New York: Princeton Architectural Press.
Maturana, D., and S. Scherer. 2015. “Voxnet: A 3D Convolutional Neural Network for Real-Time Object Recognition.” In the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 922–928. Hamburg, Germany, September 28–October 2. https://doi.org/10.1109/IROS.2015.7353481.
Merrell, P., E. Schkufza, and V. Koltun. 2010. “Computer-Generated Residential Building Layouts.” ACM Transactions On Graphics 29, no. 6: 181. https://doi.org/10.1145/1882261.1866203.
Mescheder, L., S. Nowozin, and A. Geiger. 2017. “Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks.” In Proceedings of the Thirty-Fourth International Conference on Machine Learning, 2391–2400. Sydney, Australia, August 7–9.
Müller, P., P. Wonka, S. Haegler, A. Ulmer, and L. V. Gool. 2006. “Procedural Modeling of Buildings.” ACM Transactions On Graphics 25, no. 3: 614–623. https://doi.org/10.1145/1179352.1141931.
Newton, D. 2018. “Multi-Objective Qualitative Optimization (MOQO) in Architectural Design.” In Computing for a Better Tomorrow: Proceedings of the Education and Research in Computer Aided Architectural Design in Europe (ECAADe), Faculty of Civil Engineering, Architecture and Environmental Engineering, 1:187–196. Lodz, Poland, September 17–21.
Prado, M. 2019. “Morphogenic Spatial Analysis: A Novel Approach for Visualizing Volumetric Urban Conditions and Generating Analytical Morphology.” Technology|Architecture + Design 3, no. 1: 65–75.
Radford, A., L. Metz, and S. Chintala. 2015. “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.” In Proceedings of the Fourth International Conference on Learning Representations, 1–16. San Juan, Puerto Rico, May 2–4.
Russell, S. J., and P. Norvig. 2016. Artificial Intelligence: A Modern Approach. 3rd ed. Boston: Pearson Education.
Salimans, T., I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen. 2016. “Improved Techniques for Training GANs.” In the Thirtieth Conference on Neural Information Processing Systems (NIPS), 2234–2242. Barcelona, Spain, December 8–10.
Shi, S., Q. Wang, P. Xu, and X. Chu. 2016. “Benchmarking State-of-the-Art Deep Learning Software Tools.” In Proceedings of the Seventh International Conference on Cloud Computing and Big Data (CCBD), 99–104. Taipa, Macau, China, November 16–18. https://doi.org/10.1109/CCBD.2016.029.
Smith, E. J., and D. Meger. 2017. “Improved Adversarial Systems for 3D Object Generation and Reconstruction.” In Proceedings of Machine Learning from the Conference on Robot Learning (PMLR), 78:87–96. Mountain View, California, November 13–15.
Snoek, J., O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, M. Patwary, M. Prabhat, and R. P. Adams. 2015. “Scalable Bayesian Optimization Using Deep Neural Networks.” In Proceedings of the Thirty-Second International Conference on Machine Learning, 2171–2180. Lille, France, July 6–11.
Stiny, G. 1980. “Introduction to Shape and Shape Grammars.” Environment and Planning B: Planning and Design 7, no. 3: 343–351.
Stiny, G., and W. J. Mitchell. 1978. “The Palladian Grammar.” Environment and Planning B: Urban Analytics and City Science 5, no. 1: 5–18. https://doi.org/10.1068/b050005.
Su, H., S. Maji, E. Kalogerakis, and E. Learned-Miller. 2015. “Multi-View Convolutional Neural Networks for 3D Shape Recognition.” In Proceedings of the IEEE International Conference on Computer Vision, 945–953. Santiago, Chile, December 13–16.
Swahn, E. 2018. “Markovian Drift: Iterative Substitutional Synthesis of 2D and 3D Design Data Using Markov Models of Source Data.” In Computing for a Better Tomorrow: Proceedings of the Thirty-Sixth ECAADe Conference, 113–120. Lodz, Poland, September 19–21.
Turrin, M., P. Buelow, and R. Stouffs. 2011. “Design Explorations of Performance Driven Geometry in Architectural Design Using Parametric Modeling and Genetic Algorithms.” Advanced Engineering Informatics 25, no. 4: 656–675. https://doi.org/10.1016/j.aei.2011.07.009.
Tyleček, R., and R. Šára. 2013. “Spatial Pattern Templates for Recognition of Objects with Regular Structure.” In the Thirty-Sixth German Conference on Pattern Recognition, 364–374. Saarbrücken, Germany, September 3–6. https://doi.org/10.1007/978-3-642-40602-7_39.
Whitley, L. D. 1991. “Fundamental Principles of Deception in Genetic Search.” Foundations of Genetic Algorithms 1: 221–241.
Wortmann, T. 2017. “Model-Based Optimization for Architectural Design: Optimizing Daylight and Glare in Grasshopper.” Technology|Architecture + Design 1, no. 2: 176–185. https://doi.org/10.1080/24751448.2017.1354615.
Wu, J., C. Zhang, T. Xue, W. T. Freeman, and J. B. Tenenbaum. 2016. “Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling.” In the Thirtieth Conference on Neural Information Processing Systems (NIPS 2016), 82–90. Barcelona, Spain, December 5–10.
Xu, J., H. Li, and S. Zhou. 2015. “An Overview of Deep Generative Models.” IETE Technical Review 32, no. 2: 131–139. https://doi.org/10.1080/02564602.2014.987328.
Zhang, Z., L. Yang, and Y. Zheng. 2018. “Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network.” In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 9242–9251. Salt Lake City, Utah, June 19–21. https://doi.org/10.1109/CVPR.2018.00963.
Zheng, H. 2018. “Drawing with Bots: Human-Computer Collaborative Drawing Experiments.” In Learning, Prototyping and Adapting, Short Paper Proceedings of the Twenty-Third International Conference on Computer-Aided Architectural Design Research in Asia (CAADRIA), 127–132. Beijing, China, May 17–19.
Zhu, J. Y., T. Park, P. Isola, and A. A. Efros. 2017. “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks.” In Proceedings of the IEEE International Conference on Computer Vision, 2242–2251. Venice, Italy, October 22–29. https://doi.org/10.1109/ICCV.2017.244.

David Newton is an Assistant Professor and leads the
Computational Architecture Research Lab (CARL) in the
College of Architecture at the University of Nebraska-Lincoln.
He holds degrees in architecture and computer science. This
background informs a research and teaching agenda that is
cross-disciplinary, bridging computer science and the allied
design fields.
