Using White-Box Nonlinear Optimization Methods in Policy Design
Abstract
We present a new strategy for the direct optimization of the values of policy functions. This
approach is particularly well suited to model actors with a global perspective on the system and
relies heavily on modern mathematical white-box optimization methods. We demonstrate our
strategy on two classical models: market growth and World2. Each model is first transformed
into an optimization problem by defining how the actor can influence the models’ dynamics and
by choosing objective functions to measure improvements. To improve comparability between
different runs, we also introduce a comparison measure for possible interventions. We solve the
optimization problems, discuss the resulting policies and compare them to the existing results
from the literature. In particular, we present a run of the World2 model which significantly
improves the published “towards a global equilibrium” run with equal cost of intervention.
Copyright © 2018 System Dynamics Society
Additional Supporting Information may be found online in the supporting information tab for
this article.
Introduction
a Helmut Schmidt University/University of the Federal Armed Forces Hamburg, Holstenhofweg 85, 22043 Hamburg, Germany
b Zuse Institute Berlin, Takustraße 7, 14195 Berlin, Germany
c Bern University of Applied Science, School of Engineering, Quellgasse 10, 2105 Biel, Switzerland
* Correspondence to: Ingmar Vierhaus. E-mail: vierhaus@zib.de
Accepted by Andreas Größler, Received 21 July 2016; Revised 23 May 2017, 22 September 2017 and
20 October 2017; Accepted 29 October 2017
I.Vierhaus et al.: Using White-box Optimization Methods in Policy Design 139
free variables. In the case of direct parameter policy design, the parameters
of the policy function are the free variables. Consequently, the goal of the
policy improvement is to find a set of parameter values within the given
range that defines a policy function, which in turn improves the value of the
objective function. We call the set of all possible solutions of an optimization
problem the search space. In direct parameter policy design, the search
space is defined by the parameters of the policy function. Consider the
S-shaped logistic function S(x) = 1/(1 + a·e^(−x)) as an example. The function is parametrized by a single parameter a. If the goal is to find the best value for this
parameter, the search space of the corresponding optimization problem is
only one dimensional and the problem would be easy to solve. On the other
hand, only S-shaped policy functions can be the result of such an optimiza-
tion. The expectations of the modeler on the shape of the policy limit the
possible results. By “modeler,” we refer to the person developing the model
and within this process also describing the actor. If a software package offers
parameter optimization capabilities, it is usually possible to attempt the
solution of such direct parameter policy design problems. In Yücel and
Barlas (2011) the “pattern-oriented parameter specifier (POPS)” routine was
presented, which aims to find parameter settings that produce a desired pat-
tern of behavior in one of the model variables. In this routine, the resulting
optimization problem is solved using a genetic algorithm. The core idea here, optimization by repeated simulation, is one of the conventional approaches to SD optimization (Liu et al., 2012). For the optimization algorithm, the underlying model is a black box, which receives the current parameter values and then returns the objective value after carrying out a single simulation run
of the model. Such black-box approaches have the advantage that any model
that can be simulated can also be optimized since there are no requirements
on the properties of the model equations. Examples of the application of
black-box optimization to SD can be found, for instance, in Sterman (2000).
However, approaches using repeated simulation suffer from the “curse of
dimensionality” (Bellman, 2003), where the significant dimension is that of
the space of free variables. An additional free variable adds a dimension to
the optimization algorithm’s search space. Solving optimization problems
with a large number of free variables therefore quickly becomes impractical.
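To make the black-box setting concrete, the following sketch (a hypothetical toy model, not one of the paper's models) treats a full simulation run as an opaque function and searches the parameter space by repeated simulation. The number of samples needed to cover a d-dimensional search space grows roughly exponentially in d, which is the curse of dimensionality in this context.

```python
import random

def simulate(params, steps=100, dt=1.0):
    """Black box: one full simulation run of a toy one-stock model.
    The optimizer only sees parameters in and an objective value out."""
    growth, decay = params
    stock = 100.0
    for _ in range(steps):
        stock += dt * (growth * stock - decay * stock)
    return stock  # objective: final stock value

def random_search(n_params, samples=1000, seed=0):
    """Derivative-free search by repeated simulation. Covering a
    d-dimensional box at resolution k needs on the order of k**d samples,
    so each additional free variable multiplies the effort."""
    rng = random.Random(seed)
    best_params, best_value = None, float("-inf")
    for _ in range(samples):
        params = [rng.uniform(0.0, 0.05) for _ in range(n_params)]
        value = simulate(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value
```

Note that the search never inspects the model equations; it only compares objective values of completed runs.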
Table function policy design is a possible way to generalize direct parame-
ter policy design, by defining a parametrized table function instead of an
analytic function (Keloharju and Wolstenholme, 1989). In this case, the mod-
eler has to define the number of data points of the table function and two
intervals that define valid values of the data points on the x- and y-axes. This
approach removes the modeler’s expectations of the shape of the policy from
the process. However, the possible policies are reduced to the space of the
piecewise linear functions with the selected number of points. If the data
points are required to have a predefined distance on the y-axis, the possible
solutions are reduced further, but the number of parameters and thus the
number of free variables decrease. As in the previous case, the goal of the
policy improvement is to find parameter values (i.e. data points of the table
function) that improve the value of the objective function. A software pack-
age that supports table function policy design is the Powersim Studio plug-
in SOPS (Moxnes and Krakenes, 2005).
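A table function in this sense is simply a piecewise-linear interpolant whose y-values are the free variables of the optimization; a minimal sketch, with an illustrative grid:

```python
def table_function(x, xs, ys):
    """Piecewise-linear table function: xs is the fixed grid of data
    points on the x-axis, ys are the free variables the optimizer sets."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            # linear interpolation between the two enclosing data points
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

With a fixed x-grid, only the ys enter the search space, so the dimension equals the chosen number of data points.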
The policy function is a model about what information an actor uses to
make decisions in a system. If the actor has only a bounded view of the sys-
tem, then the policy will only depend on the variables and information that
are available to this actor (Sterman, 2000). An improved policy will enable
this actor to make better decisions based on the limited information available
to him. Recent work has focused on improving policies for such actors,
using, for instance, co-evolutionary analysis (Liu et al., 2012). We will con-
sider a different kind of actor. Our actor has a global perspective on the
model, i.e. he or she has information on all the state variables at all times
within the simulation time horizon. Modeling a policy of such a comprehen-
sively aware actor with the conventional approaches to policy analysis is a
difficult endeavor. One way would be to define a table function for each
state, which depends only on that state. A mixed policy function that
depends on all states can then be defined as a sum of these table functions
(Keloharju and Wolstenholme, 1989). As a consequence of the “curse of
dimensionality,” the degrees of freedom of a mixed policy function are lim-
ited from a practical perspective, if an optimization of the policy by repeated
simulation is attempted.
We follow a different route and directly optimize the values of the policy
function. This is equivalent to defining the policy as a time-dependent table
function with one data point for each discrete time step within the time hori-
zon. In the context of physical systems, this type of problem is known as an
“optimal control problem” (Betts, 2010). In this approach, no assumptions
on the properties of the policy function are made a priori. It is only neces-
sary to select the free variables. In a conventional approach, these free vari-
ables contain the values of the policy functions. For each of these variables,
the range of valid values must be defined. It is then the task of the optimiza-
tion to find the optimal value for each free variable at each time, where the
continuous time is discretized into time steps. This leads to very large
parameter spaces compared to the conventional approaches. In particular, to
optimize np policy functions in a model with nt time steps, the search space
is of dimension np × nt. We propose and demonstrate this approach by
employing state-of-the-art nonlinear optimization algorithms to solve the
resulting optimization problems (Hanson et al., 2009; Betts, 2010) that are
not as affected by the curse of dimensionality as the conventional methods.
Here, the optimization algorithm works directly on the model equations and
does not require a full simulation of the model at each iteration. In contrast
to the conventional black-box approaches, this can be seen as a white-box
142 System Dynamics Review
first partial derivatives of the SD model, which provides hints about the
direction of the steepest descent and ascent. When no further ascent is possi-
ble, the second derivatives of the model reveal the curvature at this point.
With this information, a local optimum can be guaranteed. Let us express
the white-box approach with the hiker analogy. Now, the hiker has in addi-
tion to the GPS device also a map of the local neighborhood at any given
point. So the hiker can use the information from the map to plan the next
steps with greater care, and falling off a cliff can be avoided.
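The contrast can be illustrated with a toy linear stock model (not one of the paper's models): because the equations are visible to the optimizer, the exact gradient of the objective with respect to every control value is available in closed form, and a single projected-gradient step already lands on the bound-constrained optimum.

```python
def final_stock(z, d=0.05, dt=1.0, s0=10.0):
    """Forward simulation of the toy model ds/dt = z(t) - d*s."""
    s = s0
    for zt in z:
        s += dt * (zt - d * s)
    return s

def gradient(z, d=0.05, dt=1.0):
    """White-box gradient: since s(t+1) = (1 - d*dt)*s(t) + dt*z(t),
    the sensitivity of the final stock to z(t) is dt*(1 - d*dt)**(n-1-t)."""
    n = len(z)
    return [dt * (1.0 - d * dt) ** (n - 1 - t) for t in range(n)]

def optimize(n=20, lo=0.0, hi=1.0, step=100.0):
    """One projected-gradient step; the objective is linear in z, so the
    gradient points straight at the bound-constrained optimum."""
    z = [0.5] * n
    g = gradient(z)
    return [min(hi, max(lo, zt + step * gt)) for zt, gt in zip(z, g)]
```

A black-box method would have to estimate these sensitivities by extra simulation runs; here they come for free from the model equations.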
The black-box optimization approach has been the focus of the existing literature on the optimization of SD models. The white-box approach, by contrast, has received considerable attention in the mathematical control community (Betts, 2010), but so far has left no significant footprint in the literature on optimization in SD. Our work addresses this gap. Although the mathematical
methods such as interior point – first introduced by Karmarkar (1984)—or
sequential quadratic programming—first introduced by Wilson (1963)—have
been around for quite some time, they were vastly refined over the last
decade, and new ideas entered the stage that made it possible also to tackle
large-scale problems; some milestones of these developments can be found
in the work of Han (1976), Powell (1978), Fletcher (1982a, 1982b), Fletcher
and Leyffer (2002). Needless to say, those SD problems with hundreds of
stocks and flows and relations between them fall into the large-scale
category.
in the direction of the steepest ascent that leads to the nearest peak. However, this is not sufficient to find the highest mountain of all. For that, the hiker would have to carry a global map of all mountains, but such a heavy map would slow the hiker down significantly.
A solution found by a local method is only optimal within a certain neigh-
borhood of this solution. This is a much more tractable problem, since it
only needs to be shown that no better solution exists for any possible
descent direction from the computed incumbent solution. Karush (1939) and
independently Kuhn and Tucker (1951) provide a mathematical description,
i.e. a criterion, to verify whether a given solution is indeed locally optimal.
These conditions are called KKT conditions, taken from the initials of the
inventors. Existing software packages, e.g. CONOPT (Drud, 1985) and IPOPT
(Wächter and Biegler, 2006), verify this criterion and thus terminate with
proven local optimal solutions.
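As a one-dimensional example, minimizing x² subject to x ≥ 1 has the KKT point x* = 1 with multiplier λ* = 2. A sketch of the kind of check such solvers perform (illustrative helper, not solver code):

```python
def kkt_check(x, lam, tol=1e-8):
    """KKT conditions for: min x**2  subject to  g(x) = 1 - x <= 0.
    Gradients: grad f = 2x, grad g = -1."""
    stationarity = abs(2 * x - lam) <= tol       # 2x + lam*(-1) = 0
    primal = (1 - x) <= tol                      # g(x) <= 0
    dual = lam >= -tol                           # lam >= 0
    complementarity = abs(lam * (1 - x)) <= tol  # lam * g(x) = 0
    return stationarity and primal and dual and complementarity
```

All four conditions must hold simultaneously; a feasible but non-stationary point, or a stationary but infeasible one, fails the check.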
Methods such as Box’s and Powell’s, as well as genetic algorithms, are
found in a further category of methods that may or may not find feasible
solutions. These methods, however, do not provide a formal, mathematical
guarantee of their local optimality. Moreover, such methods also do not pro-
vide a proof of global optimality (also called “certificate of optimality”).
They just return the best solution according to the termination criterion of
the method that does not come with any guarantee or proof of optimality. To
summarize, in some cases these methods are able to find good solutions. However, they give no indication of how good the provided solution is. Hence modelers have to trust the results or rely on experience with similar problems.
Interior point methods address nonlinear programs of the general form: minimize f(x1, …, xn) subject to gk(x1, …, xn) ≤ 0, k = 1, …, m. Here x1, …, xn are free variables, f is the objective function and the gk are the constraint functions. The method is based on the KKT conditions, which are necessary (but not sufficient) for the local optimality of a solution to a constrained optimization problem. These conditions yield a nonlinear equality
system which is solved multiple times for a parameter that converges to zero
during an optimization run. Once the parameter is (approximately) zero, a
local optimal solution has been found. In order to compute the KKT condi-
tions, first and second derivative information is needed. Hence it must be
assumed that all functions f and gk are at least twice continuously
differentiable.1
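The mechanism can be seen on a one-line example: for min x subject to x ≥ 1, the barrier subproblem min x − μ·ln(x − 1) has the closed-form minimizer x = 1 + μ (set the derivative 1 − μ/(x − 1) to zero), so driving μ to zero traces a path of iterates converging to the constrained optimum. This is a sketch of the idea only, not IPOPT's actual update rule.

```python
def barrier_minimizer(mu):
    """Minimizer of the barrier subproblem phi(x) = x - mu*ln(x - 1)
    for: min x  subject to  x >= 1.  phi'(x) = 1 - mu/(x-1) = 0
    gives x = 1 + mu."""
    return 1.0 + mu

def interior_point_path(mu0=1.0, shrink=0.1, iters=6):
    """Solve a sequence of barrier subproblems while driving the
    barrier parameter mu toward zero."""
    xs, mu = [], mu0
    for _ in range(iters):
        xs.append(barrier_minimizer(mu))
        mu *= shrink
    return xs
```

Each iterate stays strictly inside the feasible region; the path approaches the boundary point x = 1 only in the limit μ → 0.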
Using this additional information, it is possible to apply the method to
large problem sizes. Wächter and Biegler implemented their method in the
publicly available software code IPOPT (Wächter and Biegler, 2006), and
applied it to a test set of problems, where the largest has about 250,000 vari-
ables and constraints. Interior point methods are, however, limited because
they always work on the full set of constraints. This problem is avoided by
sequential quadratic programming methods, which apply an active set strat-
egy, and which are explained in the following section.
Summary
Sequential quadratic programming and interior point methods both can typi-
cally solve larger problems than Box’s and Powell’s methods. While both
methods (Box’s and Powell’s) operate on f(x) as a black box, these two more
modern methods (SQP and IP) make use of derivative information, exploiting it to obtain descent directions and converge faster. Additionally, in this paper we consider constrained optimization problems, which cannot be solved using Powell's method. However, it is not entirely clear which of
the two, SQP or IP, is the faster method. Therefore, one has to implement the
model in both and then try out which of them is actually faster. For the two
system dynamics optimization models we use as test problems, we
attempted a solution of the optimization problem with the SQP solver CONOPT as well as with the interior point solver IPOPT. We report on our results
in the following sections.
Application
Market growth
First, we consider the market growth model as presented by Forrester (1968).
A stock and flow diagram is shown in Figure 2. The model describes the pol-
icies governing the growth of sales and production capacity in a new prod-
uct market. Forrester’s original model resulted from a case study of an
electronics manufacturer and represents the opinions of the company’s
senior management about the way that corporate growth was managed.
We use the version of the market growth model as published by
Richardson (2011).
In the well-known base run of the market growth model, the evolution of
the firm shows an unexpected stagnation. Over the first 40 months, sales
rise, fueled by the increasing number of salesmen. Then, sales suddenly
level off. This is because those sales are not being backed up with added Production capacity; this produces delivery delays, starting around month
20, and begins to affect sales a few months after that. The development of
Production capacity and Salesmen is shown in Figure 5.
In the original model, one of the key policies is modeled via the table
function CEF, which defines the Capacity expansion fraction depending on
the Delivery delay condition. Here, a production manager chooses by how
Fig. 2. Structure of the adjusted market growth model (Forrester, 1968). Source: Richardson (2011). Our modifications to the original model are shown in bold
Problem statement
As described in the previous section, the dynamics of the market growth
model lead to a decline in Production capacity as the company reacts to fluc-
tuations in demand using the policies for capacity expansion, hiring sales-
men and recognition of delivery delays. Our base model has the standard
values for the policies as described by Forrester (1968). The key control the
company has over the expansion or reduction of the Production capacity is
in setting the variable Capacity expansion fraction. In the original model,
this variable is defined in terms of a table function of the Delivery delay con-
dition. It is clear that setting the Capacity expansion fraction to a positive
value will lead to a continuous increase in the Production capacity. The
decline of the company would be avoided. However, the original paper does
not consider the costs that would be associated with an expansion of produc-
tion capacity and that might make a constant expansion strategy impossible.
When extending a pure simulation model to an optimization problem, this is
a common challenge: in the simulation case, the modeler has full control over
all variables and constantly checks the results for plausibility. In the case of opti-
mization, such plausibility checks need to be formulated as algebraic expres-
sions such that the optimization algorithm is able to take them into account.
In order to formalize this, we introduce a “cost of intervention.” In the
case of market growth, we consider the policy in question (i.e. the expansion
of the production capacity) as the intervention of interest. To quantify the
magnitude of this intervention, we assume that there is a cost associated
with the maintenance of the existing production capacity and another cost
associated with increasing the capacity. For our exemplary case, we assume
that adding one unit of new capacity is 16 times more expensive than main-
taining one unit of capacity. (If the method is to be applied to a concrete
company, this value has to be adjusted accordingly.)
To implement this concept, we extend the original market growth model
by introducing two cost variables c1(t), c2(t), two weights w1, w2 and one
new state variable, Accumulated Costs, integrating over the incurred cost:
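A minimal sketch of such bookkeeping, with hypothetical functional forms (maintenance cost proportional to the existing capacity, expansion cost 16 times larger per unit added, matching the assumption above; the weights w1 and w2 are illustrative placeholders):

```python
def cost_rates(production_capacity, capacity_ordered, w1=1.0, w2=16.0):
    """Hypothetical cost rates: c1 for maintaining the existing capacity,
    c2 for adding new capacity (16x as expensive per unit)."""
    c1 = w1 * production_capacity
    c2 = w2 * max(0.0, capacity_ordered)
    return c1, c2

def accumulate_costs(capacity_series, ordering_series, dt=1.0):
    """Bookkeeping stock: Accumulated Costs integrates c1 + c2 over time.
    Nothing here feeds back into the original model variables."""
    acc = 0.0
    for cap, order in zip(capacity_series, ordering_series):
        c1, c2 = cost_rates(cap, order)
        acc += dt * (c1 + c2)
    return acc
```

Because the cost variables only receive information, removing them would leave every original trajectory unchanged.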
Note that information flows only into these additional variables but never
back to the original model variables. Thus the dynamics of the model are not
changed by our modifications. Based on these definitions, we can now com-
pare two simulations for the magnitude of the intervention. In the base run,
the accumulated cost incurred after 100 months amounts to a value of 3.47 × 10^5. In the following, we consider two questions concerning the model: (1) Can the decline in Production capacity be avoided while spending a similar cost of intervention? (2) Can the resulting policy still be expressed as a function of the Delivery delay condition? In the next section, we answer the first question by formulating and solving an optimization problem. By analyzing the solution to this problem, we answer the second question.
The first statement (Eq. 3a) defines the objective function. We will com-
pare two simulations by comparing the value of Production capacity at the
end of the simulation at time T. We consider a maximization problem,
i.e. we prefer higher values of the objective function. We use z(t) = Capacity
expansion fraction (t) to denote the time-varying free variable. Note that we
added a constraint that limits the accumulated cost of intervention to the
cost incurred in the base run (Eq. 3c). This way, the magnitude of the inter-
vention in the base run and in the optimized run remain comparable. We
computed the Accumulated Costs (T) for the base run (i.e. a simulation of the original model) to be 3.5 × 10^5. Constraint (Eq. 3d) represents the model
equations of the base market growth model.
2 We distinguish between constant and time-varying variables. A constant variable takes on the same value at each time step of the simulation. Most model parameters are constant variables. A time-varying variable can have a different value at each time step in the simulation.
At first sight one might think that, by removing the link from the variable
Delivery delay condition to the Capacity expansion fraction, the capacity
expansion feedback loop is cut. However, the optimization algorithm uses the
information of all variables in the model, including the information about
delivery delay condition, to compute the optimal values for capacity expan-
sion fraction. This means that the mentioned feedback loop is still intact and
that additional feedback loops were created. In fact, by applying optimization
to the modified problem, we compute a more comprehensive policy function.
Locally optimal solutions of problems such as Eq. (3) often show so-called
“bang-bang behavior” (Sonneborn and Van Vleck, 1965; Artstein, 1980);
i.e. the value of the free variable remains at its upper bound for some time,
and then switches abruptly to its lower bound and vice versa. Even though
such a solution can theoretically be very efficient, it might be impossible to
realize in practice because the free variable cannot be changed infinitely fast.
To control how fast a free variable z can change in reality, we introduce an additional parameter zc limiting the amount of change from one time step to the next. This parameter has the unit [unit of free variable]/[unit of time] and is incorporated into the model via a constraint of the form

−zc ≤ (z(t + 1) − z(t)) / Δt ≤ zc    (3g)
In the case of the market growth model, the variable zc constrains the rate
of the free variable Capacity expansion fraction. The corresponding con-
straint is Eq. 3e.
We conducted experiments for different values of zc. For a constant Capac-
ity expansion fraction, i.e. zc = 0, no solution was found that respects the
cost constraint. Therefore, we varied zc from a value of 0.0001 (which leads
to a very slowly changing free variable) to zc = 0.4 (which allows the free
variable to jump from one bound to the other within one time step and is
therefore equivalent to no limit).
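Constraint (3g) can be sketched as a simple feasibility check, together with a clipping routine showing what the limit does to a proposed bang-bang trajectory (illustrative helpers; in the actual runs the solver enforces the constraint algebraically):

```python
def respects_rate_limit(z, zc, dt=1.0):
    """Check constraint (3g): |z(t+1) - z(t)| / dt <= zc for all t."""
    return all(abs(b - a) / dt <= zc for a, b in zip(z, z[1:]))

def rate_limit(z, zc, dt=1.0):
    """Clip a proposed trajectory so that each step honors the rate
    limit zc (a post-processing sketch of the constraint's effect)."""
    out = [z[0]]
    for target in z[1:]:
        step = max(-zc * dt, min(zc * dt, target - out[-1]))
        out.append(out[-1] + step)
    return out
```

A bang-bang trajectory jumping between its bounds violates the check for small zc, while the clipped version changes by at most zc per time step.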
The results are summarized in Table 1 and Figure 3. In the following, we
select a value of zc = 0.002, which leads to sufficiently slowly changing solutions for the market growth model. This constrained rate of change results in a reduction of 42 percent in the value of the objective function.
Optimization results
We solve the problem using the NLP solvers CONOPT and IPOPT. More
information on the necessary reformulation can be found in the Appendix.
Fig. 4. Broken lines show base run, solid lines show optimization solution. (a) Comparison of values of the free variable in
base run and optimal run. (b) Comparison of the ordered Production capacity as a result of the chosen expansion fraction.
(c) Plot of functional dependency between capacity expansion fraction and delivery delay condition
Fig. 5. Comparison of state variable values (a–c) of the market growth model in base run and optimized run: (a) Production capacity, (b) Backlog, (c) Salesmen, plotted over t [Months]. Broken lines show the base run values
second half of the considered time frame. Figure 4(b) shows how the solu-
tions result in adding or reducing Production capacity, and Figure 5(a) shows
the actual values of the Production capacity over time. At the beginning of
the time horizon, the Production capacity is declining in the base run as well
as in the optimized solution. Counterintuitively, in the optimized solution
the decline is more pronounced. A possible interpretation of this is that, in
the optimized solution, the company uses the first 40 months to reach a bet-
ter “starting position,” i.e. a better ratio of the Production capacity, backlog
and salesmen. Indeed, the Production capacity shows a regular u-shape, end-
ing with a higher Production capacity at the end of the time frame than at
t = 0. While in the base run the Production capacity drops to a value of
5260, in the optimized run the Production capacity at the end of the time
window reaches a value of 13,163. This represents an increase of 10 percent
compared to the start value, and an improvement of 250 percent compared
to the final value in the base run.
With the above considerations, we can answer question 1 with “Yes.” We
found a satisfying solution, spending a similar cost of intervention, but
avoiding the continuous decline in Production capacity. Indeed, the backlog
and the number of salesmen have also grown within the time frame
(Figure 5(b), (c)). Hence we do not have to expect a new strong oscillation
after the considered time frame.
To answer question 2, we refer to Figure 4(c). For the base run, we plotted
the table function CEF, i.e. the function has exactly one value on the second
axis for each value on the first axis. In the optimized solution, there is clearly
no function of the Delivery delay condition that would produce this plot.
The answer to question 2 is therefore “No.” Since, in the optimization proce-
dure, for each solution of the problem all variable values at all times are
known, the formulation and solution of the optimization problem (Eq. (3))
can be interpreted as modeling a decision maker who is aware of the full
model and its development over time. We showed that it is impossible to for-
mulate a policy function of one argument that reproduces this behavior.
Whether it would be possible to find a policy that depends on several state
variables and leads to a similar solution is an interesting question and
remains a topic for future research.
The main benefits of the policy computed with our approach can be sum-
marized as follows:
• Whereas in the base run the Production capacity was reduced, in the opti-
mized solution the Production capacity increases at the final time com-
pared to t = 0. The improvement compared to the base run amounts to
250 percent.
• We introduced a cost of intervention to account for the costs that result
from a different policy that were not accounted for in the original model.
Using the cost of intervention from the base run as a constraint of our
optimization problem, our optimized solution can be considered a redis-
tribution of effort rather than an increase of effort.
• With the exception of a permitted range of values, no assumptions about
the policy function, i.e. on the relationships between the policy function
values and other model variables, were made. Therefore, the search space
is much less limited than in a conventional policy improvement approach.
• The solution is guaranteed to be a local optimum within this larger search
space.
• In the original model, the Capacity expansion fraction was defined as a
function of the Delivery delay condition. In our approach we do not
assume this dependency and consequently the computed policy can no
longer be expressed as a function of this variable. Therefore, it could not have been computed with a conventional policy improvement approach, even one fitting an arbitrary function of the Delivery delay condition.
World2
The World2 model was introduced by Forrester (1971b). Figure 6 shows its
stock and flow diagram. The model is Forrester’s answer to the futility of
addressing world challenges in a piecemeal fashion. Instead the problem
should be addressed as a system of problems. The model consists of five
Fig. 6. Model structure of the modified World2 model (Forrester, 1971b). Source: teaching material by George Richardson, University at Albany, PAD 624. Free variables are shown in red. Additional bookkeeping variables are shown in green
Fig. 7. Comparison of variables of the World2 model in base run (dashed line) and optimized run (solid line) (a–c). Panels show quality of life, Population and Capital over t [Years]
Problem statement
In Forrester (1971b), several scenarios are considered for the World2 model.
In particular, in chapter 6, “Towards a Global Equilibrium” parameter set-
tings are presented, which lead to a sustainable state of the world system
within a relatively short time. We use this scenario as our base run. Selected
state variables of this run are shown in Figure 7 and the parameter changes
suggested by Forrester are listed in Table 2 in the column “Base run.” In the
base run we see a stabilized level of the Population, a high level of quality of
life and low level of Pollution. For our optimization, we intend to improve
the system behavior of the World2 model compared to the best policy run of Forrester. We want to identify a policy to achieve a more sustainable world, measured in quality of life, with the smallest costs necessary to achieve this. As in the market growth model, we quantify the magnitude of the intervention through cost functions

f_c,i(z_i(t), z_init,i) = α ( exp(β(z_i(t) − z_init,i)/w_i) − 1 + exp(−β(z_i(t) − z_init,i)/w_i) − 1 )    (4)
where we choose the parameter values α = 2.9 × 104 and β = 3.6. The value
of this function is zero if zi(t) = zinit,i, i.e. if the value of the free variable
remains at its initial value. This means there is no intervention, and there-
fore no costs are incurred. If the free variable is set to a different value from
the initial value a cost is computed for each timestep. The cost grows expo-
nentially with the difference between the new and the original value of the
variable. Each free variable z_i must remain within a given interval. The
weights w_i are chosen according to the width of the allowed interval. These
intervals, as well as the cost coefficients, are listed in Table 2. The
weights w_i normalize for the different interval widths and ensure an
adequate weighting of the exponential growth of the individual cost
components. Furthermore, since z_i and w_i have the same unit and we choose
α to be of unit 1, f_{c,i} is of unit 1 as well. We sum the costs of changing the free
variables from their initial value and accumulate them over time. The final
cost of intervention of a given run is then calculated as follows:
\text{Accumulated total costs}(T) = \Delta t \sum_{t=0}^{T} \sum_{i=1}^{5} f_{c,i}(z_i(t), z_{init,i})    (5)
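As an illustration, Eqs 4 and 5 can be sketched in a few lines of code. This is a minimal sketch, not part of the model implementation: only α and β are taken from the text, while the function names, sample trajectories and interval widths are hypothetical.

```python
import math

# alpha and beta as chosen in the text; everything else in this sketch
# (function names, sample trajectories, interval widths) is hypothetical.
ALPHA = 2.9e4
BETA = 3.6

def intervention_cost(z, z_init, w):
    """Per-timestep cost of holding a free variable at z instead of z_init (Eq. 4).

    Zero when z == z_init; grows exponentially with |z - z_init| / w."""
    d = BETA * (z - z_init) / w
    return ALPHA * (math.exp(d) - 1.0 + math.exp(-d) - 1.0)

def accumulated_total_costs(trajectories, z_inits, widths, dt):
    """Accumulated total costs over all timesteps and free variables (Eq. 5).

    trajectories[i] holds the values z_i(t), sampled every dt years."""
    total = 0.0
    for z_traj, z0, w in zip(trajectories, z_inits, widths):
        total += sum(intervention_cost(z, z0, w) for z in z_traj)
    return dt * total

# Leaving a variable at its initial value incurs no cost:
assert intervention_cost(1.0, 1.0, 0.5) == 0.0
```

Note that the cost is symmetric in the sign of the deviation, since the two exponential terms in Eq. 4 mirror each other around z_init,i.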
As in the market growth model, information flows only into our newly
introduced bookkeeping variables. Therefore, the model dynamics are not
changed by these modifications.
With the model adjusted as just described, we re-simulated the model
with Forrester’s parameter selection and derived the costs incurred by his
policy changes. We call this total accumulated cost the "Forrester budget,"
i.e. the cost necessary to implement Forrester's final policy for World2.
The computed value is 3.8 × 10^7. Note that the unitless cost of
intervention is useful for comparing simulations; its absolute value for a
single run is not meaningful on its own. We will therefore use this
Forrester budget as a reference value for the optimizations in the following
sections.
The desired result is a time-varying policy for all five free variables. In
terms of the objective function, we chose to accumulate the product of Popu-
lation and quality of life. In the base run, the value of this accumulated prod-
uct takes on a value of 610.14.
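This objective can be sketched as a discrete accumulation over the sampled run. The function name and the Δt-weighted sum (chosen to match the cost accumulation in Eq. 5) are our assumptions, not part of the original model.

```python
def accumulated_objective(population, quality_of_life, dt):
    """Accumulate the product of Population and quality of life over a run.

    Both lists are assumed to be sampled at the same timesteps, dt years apart."""
    return dt * sum(p * q for p, q in zip(population, quality_of_life))

# Two timesteps with Population 2.0 and quality of life 1.0, dt = 1 year:
# accumulated_objective([2.0, 2.0], [1.0, 1.0], 1.0) -> 4.0
```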
Optimization results
As before, we attempted the solution with CONOPT as well as IPOPT. How-
ever, IPOPT was unable to find a feasible solution within 10 minutes. The
solving time with CONOPT for this model was 142 seconds, and the locally
[Figure: three panels (a–c) plotting Birth rate normal 1970, Food coeff 1970 and Total Costs against t [Years].]
Fig. 8. Comparison of values of the selected free variables and intervention cost of the World2 model in base run (dashed
line) and optimized run (solid line) (a–c)
[Figure: two panels (a, b) plotting the achievable objective function value and the CPU time against Control Interval [Years].]
Fig. 9. The impact of changing the control interval in the World2 model on the achievable objective function (a) and the CPU
time to solve the resulting problems (b). The control interval defines how much time must pass before a control variable can
be assigned a new value. With a shorter control interval, the number of free variables increases. We would therefore expect
solving times to increase, which is in fact the case. The achieved objective value already increases significantly when
changing the control interval from 200 to 100 years. For further decreases of the control interval we see smaller but still
significant improvements
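The role of the control interval can be illustrated with a short sketch (a hypothetical helper, not from the paper): a piecewise-constant policy with one free value per control interval is expanded into per-timestep values, so shortening the interval increases the number of free variables the optimizer must determine.

```python
def expand_policy(control_values, control_interval, dt, horizon):
    """Expand a piecewise-constant policy into per-timestep values.

    A control variable may only be assigned a new value every
    control_interval years; in between, the last value is held."""
    n_steps = int(horizon / dt)
    expanded = []
    for step in range(n_steps):
        t = step * dt
        idx = min(int(t // control_interval), len(control_values) - 1)
        expanded.append(control_values[idx])
    return expanded

# A control interval of 100 years over a 200-year horizon leaves two free values:
# expand_policy([0.7, 1.0], 100, 50, 200) -> [0.7, 0.7, 1.0, 1.0]
```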
To gain the last few percent of optimality, one has to intervene almost instanta-
neously and adapt the policy values with a high frequency. Systems that are
controlled with such high frequency exist. For example, the U.S. economy
can be seen as a highly complex dynamical system, related to national
(e.g. inflation, employment rate, psychological factors) and international
effects (other economies and political systems). The Federal Open Market
Committee of the Federal Reserve Bank steers this system by setting the
value of a free variable, the short-term objective for the Fed’s open market
operations. This value has been changed between 1 and 11 times per year
since the year 2000 (Board of Governors of the Federal Reserve System,
2015; Figure 9).
optimization methods will allow many practitioners to find new and inter-
esting solutions in models that we considered completely familiar.
Acknowledgements
Biographies
References
Artstein Z. 1980. Discrete and continuous bang-bang and facial spaces, or: look for
the extreme points. SIAM Review 22(2): 172–185.
Bellman R. 2003. Dynamic Programming. Dover Books on Computer Science Series.
Dover: Mineola, NY.
Benson HY, Shanno DF, Vanderbei R. 2004. Interior-point methods for nonconvex
nonlinear programming: jamming and numerical testing. Mathematical Program-
ming 99(1): 35–48.
Betts JT. 2010. Practical Methods for Optimal Control Using Nonlinear Programming.
Advances in Design and Control. Society for Industrial and Applied Mathematics:
Philadelphia, PA.
Board of Governors of the Federal Reserve System. 2015. Open market operations
archive. Available: https://www.federalreserve.gov/monetarypolicy/openmarket_
archive.htm [8 March 2017].
Byrd RH, Nocedal J, Waltz RA. 2006. KNITRO: an integrated package for nonlinear
optimization. In Large-Scale Nonlinear Optimization, di Pillo G, Roma M (eds).
Springer: Heidelberg; 35–59.
Conn A. 2014. A trust region method for solving grey-box mixed integer nonlinear
problems. In International Workshop on MINLP 2014, CMU, Pittsburgh, PA, 2–5
June 2014 (lecture slides).
Dangerfield B, Roberts C. 1996. An overview of strategy and tactics in system dynam-
ics optimization. Journal of the Operational Research Society 47: 405–423.
Drud AS. 1985. CONOPT: a GRG code for large sparse dynamic nonlinear optimiza-
tion problems. Mathematical Programming 31(2): 153–191.
Drud AS. 1994. A large scale GRG code. ORSA Journal on Computing 6(2): 207–216.
Dynaplan AS. 2015. Dynaplan SMIA software. Available: http://dynaplan.com
[30 June 2015].
Fletcher R. 1982a. A model algorithm for composite nondifferentiable optimization
problems. Mathematical Programming 17: 67–76.
Fletcher R. 1982b. Second order corrections for nondifferentiable optimization. In
Numerical Analysis, Vol. 912 of Lecture Notes in Mathematics, Watson GA (ed).
Springer: Berlin; 85–114.
Fletcher R, Leyffer S. 2002. Nonlinear programming without a penalty function.
Mathematical Programming 91(2): 239–269.
Ford DN, Sterman JD. 1998. Dynamic modeling of product development processes.
System Dynamics Review 14(1): 31–68.
Forio Corporation. 2015. Forio software. Available: http://forio.com [30 June 2015].
Forrester JW. 1968. Market growth as influenced by capital investment. Industrial
Management Review 9(2): 83–105.
Forrester JW. 1969. Urban Dynamics. Pegasus Communications: Waltham, MA.
Forrester JW. 1971a. Counterintuitive behavior of social systems. Technology
Review 73(3): 52–68.
Forrester JW. 1971b. World Dynamics. Wright-Allen: Boston, MA.
Fortmann-Roe S. 2014. Insight maker: a general-purpose tool for web-based model-
ing & simulation. Simulation Modelling Practice and Theory 47: 28–45.
Gill PE, Murray W, Saunders MA. 2005. SNOPT: an SQP algorithm for large-scale
constrained optimization. SIAM Review 47(1): 99–131.
GoldSim Technology Group. 2015. Goldsim software. Available: http://goldsim.com
[30 June 2015].
Groesser SN, Jovy N. 2016. Business model analysis using computational modeling: a
strategy tool for exploration and decision-making. Journal of Management Control
27: 61–88.
Grösser SN. 2014. Co-evolution of legal and voluntary standards: development of
energy efficiency in Swiss residential building codes. Technological Forecasting
and Social Change 87(1): 1–16.
Han SP. 1976. Superlinearly convergent variable metric algorithms for general non-
linear programming problems. Mathematical Programming 11(3): 263–282.
Hanson DA, Kryukov Y, Leyffer S, Munson TS. 2009. Optimal control model of tech-
nology transition. International Journal of Global Energy Issues 33: 154–175.
Isee Systems, Inc. 2015. STELLA software. Available: http://iseesystems.com [30 June
2015].
Karmarkar NK. 1984. A new polynomial-time algorithm for linear programming.
Combinatorica 4: 373–395.
Karush W. 1939. Minima of functions of several variables with inequalities as side
constraints. Master’s thesis. Department of Mathematics, University of Chicago,
Chicago, IL.
Keloharju R, Wolstenholme E. 1988. The basic concepts of system dynamics optimi-
zation. Systemic Practice and Action Research 1(1): 65–86.
Keloharju R, Wolstenholme E. 1989. A case study in system dynamics optimization.
Journal of the Operational Research Society 40(3): 221–230.
Kuhn H, Tucker W. 1951. Nonlinear programming. In Proceedings of 2nd Berkeley
Symposium on Mathematical Statistics and Probability, Neyman J (ed). University
of California Press: Berkeley, CA; 481–492.
Kwakkel JH, Pruyt E. 2013. Exploratory modeling and analysis: an approach for
model-based foresight under deep uncertainty. Technological Forecasting and
Social Change 80(3): 419–431.
Linné CV. 1751. Philosophia Botanica In Qua Explicantur Fundamenta Botanica Cum
Definitionibus Partium, Exemplis Terminorum, Observationibus Rariorum. Adjec-
tis Figuris Aeneis. Kiesewetter: Stockholm. pp.1–362.
Liu H, Howley E, Duggan J. 2012. Co-evolutionary analysis: a policy exploration method
for system dynamics models. System Dynamics Review 28(4): 361–369.
Morecroft JDW. 1985. Rationality in the analysis of behavioral simulation models.
Management Science 31(7): 900–916.
Moxnes E, Krakenes A. 2005. SOPS: a tool to find optimal policies in stochastic
dynamic systems. In Proceedings of the 23rd International Conference of the Sys-
tem Dynamics Society, Sterman JD, Repenning NP, Langer RS, Rowe JI, Yanni JM
(eds). University of Bergen: Bergen. pp.1–16. http://www.systemdynamics.org/
conferences/2005/proceed/papers/MOXNE288.pdf
Murtagh BA, Saunders MA. 2003. MINOS 5.51 user’s guide. Technical report SOL
83-20R. Systems Optimization Laboratory, Department of Management Science
and Engineering, Stanford University, Stanford, CA.
Powell MJD. 1978. A fast algorithm for nonlinearly constrained optimization calcula-
tions. In Numerical Analysis, Vol. 630 of Lecture Notes in Mathematics,
Watson GA (ed). Springer: Heidelberg; 144–157.
Powersim Software AS. 2015. Powersim software. Available: http://powersim.com
[30 June 2015].
Rahmandad H, Repenning N. 2016. Capability erosion dynamics. Strategic Manage-
ment Journal 37(4): 649–672.
Repenning NP, Goncalves P, Black LJ. 2001. Past the tipping point: the persistence
of firefighting in product development. California Management Review 43(4): 44–63.
Richardson GP. 2011. Reflections on the foundations of system dynamics. System
Dynamics Review 27(3): 219–243.
Schwaninger M, Grösser S. 2008. System dynamics as model-based theory building.
Systems Research and Behavioral Science 25(4): 447–465.
Simon H. 1984. Models of Bounded Rationality, Vol. 1. Economic Analysis and Pub-
lic Policy, (1st ed.) MIT Press: Cambridge, MA.
Sonneborn L, Van Vleck F. 1965. The bang-bang principle for linear control systems.
SIAM Journal on Control 2: 151–159.
Sterman JD. 2000. Business Dynamics: Systems Thinking and Modeling for a Com-
plex World. Irwin/McGraw-Hill: Boston, MA.
The AnyLogic Company. 2015. Anylogic software. Available: http://anylogic.com
[30 June 2015].
Ventana Systems, Inc. 2015. Vensim software. Available: http://vensim.com [30 June
2015].
Wächter A, Biegler LT. 2006. On the implementation of an interior-point filter line-
search algorithm for large-scale nonlinear programming. Mathematical Program-
ming 106: 25–57.
Wilson RB. 1963. A simplicial method for convex programming. PhD thesis. Harvard
University, Cambridge, MA.
Yücel G, Barlas Y. 2011. Automated parameter specification in dynamic feedback
models based on behavior pattern features. System Dynamics Review 27(2):
195–215. https://doi.org/10.1002/sdr.457.
Supporting information