AND APPLICATIONS

A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

by
Chao Ning
August 2020

© 2020 Chao Ning
DATA-DRIVEN OPTIMIZATION UNDER UNCERTAINTY IN THE ERA OF BIG DATA AND DEEP LEARNING … AND APPLICATIONS
under uncertainty, including its modeling frameworks, solution algorithms, and a wide variety of applications. Specifically, three research aims are proposed: data-driven optimization frameworks for decision making under uncertainty, an online learning framework that accommodates real-time uncertainty data, and an efficient solution algorithm for multistage adaptive robust optimization.
There are two distinct research projects under the first research aim. In the first related project, we propose a data-driven mixed-integer nonlinear programming model for the optimal design of a biomass and agricultural waste-to-energy network under uncertainty. A data-driven uncertainty set of the uncertain feedstock parameters is constructed. In the second related project, we develop a novel deep learning based distributionally robust joint chance constrained economic dispatch framework, and the resulting ambiguous chance constraints are further tackled using a scenario approach. Additionally, we derive an a priori bound on the required number of synthetic wind power data generated by f-GAN to provide a probabilistic guarantee.
The second research aim addresses the online learning of real-time uncertainty data for decision making under uncertainty. An online learning based risk-averse stochastic model predictive control framework is proposed for linear time-invariant systems, in which chance constraints on system states are required to hold for an ambiguity set of disturbance distributions. By leveraging a Dirichlet process mixture model, the first- and second-moment information is extracted from disturbance data to construct the ambiguity set. As more data are gathered during the runtime of the controller, the ambiguity set is updated based on real-time data. We then develop a novel constraint tightening scheme that guarantees recursive feasibility and closed-loop stability of the proposed model predictive control.
The third research aim focuses on algorithm development for data-driven multistage adaptive robust optimization, for which we propose a transformation-proximal bundle algorithm. By partitioning recourse decisions into state and control decisions, affine decision rules are applied exclusively to the state decisions. In this way, the multistage problem is transformed into an equivalent form that can be handled by a proximal bundle method. The finite convergence of the proposed solution algorithm is guaranteed for multistage robust optimization problems with a generic uncertainty set, and solution quality is assessed with a lower bounding technique. The effectiveness and advantages of the proposed algorithm are demonstrated through applications including robust inventory control and process network planning.
Automation. He received the M.S. degree in Control Science and Engineering from Tsinghua University, China, in 2015. He joined Professor Fengqi You’s research group and later transferred to Cornell University with Professor You to continue his Ph.D. program. His research interests include optimization under uncertainty, dynamics and control, big data analytics and machine learning, and power systems.
ACKNOWLEDGMENTS
First and foremost, I would like to express my sincerest thanks to my advisor, Professor
Fengqi You, for his kind help, constant support, and heartfelt encouragement.
Throughout my Ph.D. study, he spent great effort helping me learn how to do high-impact research and go through many research challenges. Without his kind help and constant support of my research, I would never have completed this Ph.D. His vision on research directions, broad knowledge, unbounded energy, and great enthusiasm about research are always a true inspiration for me and will have a great impact on my future career. I feel greatly proud
Professor Oliver Gao, for their kind guidance and help. They have offered me very valuable suggestions.
My thanks also go to all my colleagues and friends in the PEESE group, who have made my Ph.D. study life wonderful and enjoyable. Dr. Dajun Yue helped me a lot with using the group’s modeling tools and also helped me get into the background of batch process scheduling. Dr. Jian Gong was always the go-to person when I encountered any type of question in the lab. He helped me a lot with manuscript writing. Dr. Jiyao Gao was always willing to help me and gave constructive suggestions; I learned a lot from him on biomass process networks. Dr. Karson Leperi helped me a lot by kindly providing critical comments on my presentations. Dr. Chao Shang and I had great discussions on data-driven optimization. Dr. Inkyu Lee kindly taught me how to draw
fantastic figures with PowerPoint and had great discussions with me on energy systems. Xueyu Tian, Wei-Han Chen, Yanqiu Tao, Ning Zhao, Jiwei Yao, Jack Nicoletti, Raaj Bora, Akshay Ajagekar, Abdulelah Alshehri, and Xiang Zhao, thank you all for your kind help and the wonderful time with me. Most importantly, you are amazing friends, and I will never forget the happy hours we spent together. It was a pleasure to discuss electric power systems with Haifeng Qiu, and I learned a lot from our discussions.
Many thanks to Natalia Lujan Juncua for her kind help in the unit commitment project. Visiting scholars including Dr. Minbo Yang, Dr. Yuting Tang, Dr. Hua Zhou, Dr. Na Luo, Dr. Zuwei Liao, Dr. Liang Zhao, Dr. Li Sun, and Dr. Runda Jia helped me with both life and research, and provided me with valuable guidance and suggestions on my future career.
Last but not least, I want to express my deepest gratitude to my father and mother for their unconditional love, support, and encouragement along the way.
TABLE OF CONTENTS
1.2 Existing methods for data-driven optimization under uncertainty ................ 10
1.3 Various types of deep learning techniques and their potentials..................... 29
3.2 Mathematical formulation.............................................................................. 87
3.3 Deep learning based ambiguous joint chance constrained economic dispatch
4.4 The theoretical properties of the proposed online learning based risk-averse
A TRANSFORMATION-PROXIMAL BUNDLE ALGORITHM FOR SOLVING
LARGE-SCALE MULTISTAGE ADAPTIVE ROBUST OPTIMIZATION
PROBLEMS ............................................................................................................... 156
5.1 Introduction .................................................................................................. 156
LIST OF FIGURES
Figure 1. The data-driven uncertainty model based on the Dirichlet process mixture
Figure 5. The empirical probability distributions of total cost for (a) the stochastic
programming method, (b) the proposed data-driven Wasserstein DRO approach. ..... 64
Figure 8. Cost breakdowns determined by (a) the stochastic programming method, (b)
method, (b) the proposed data-driven Wasserstein DRO approach. ............................ 69
Figure 10. Sensitivity analysis of discount rate for the data-driven Wasserstein DRO
Figure 11. Sensitivity analysis of the in-sample objective value, out-of-sample average
cost, and computational time with different radii of Wasserstein balls. ...................... 71
Figure 12. Upper and lower bounds in each iteration of the reformulation-based branch-
and-refine algorithm for global optimization of the (WDRO) problem in the case study.
...................................................................................................................................... 72
Wasserstein DRO method based on the testing of 100 uncertainty scenarios. ............ 74
Figure 14. The dependences of the average cost reduction and standard deviation
Figure 17. The cost breakdown of (a) the DRCCED method with moment information,
Figure 18. The power dispatch of each conventional generator determined by the
Figure 19. The spatial correlations of the ten wind farm energy outputs for (a) real wind
power data, and (b) wind power data generated by f-GAN. The color darkness of one
single cell represents the level of spatial correlation coefficient for corresponding two
wind farms. Comparison of spatial correlations can be made by focusing on the darkness
patterns of heat maps. The temporal correlations of WF10 for (c) real wind power data,
and (d) wind power data generated by f-GAN. The level of auto-correlation coefficient
the height of each bar for every time lag. ................................................................... 106
Figure 20. The empirical distribution of the wind power utilization efficiency for (a)
DRCCED with moment information, and (b) the proposed approach. ...................... 108
Figure 21. The pseudocode of the proposed online-learning based risk-averse stochastic
Figure 22. The average computational times of the proposed online learning based
risk-averse stochastic MPC method over 2,000 time steps. ....................... 141
Figure 23. (a): The closed-loop trajectories of system states for the proposed online
sequences, (b): The zoom-in view of state trajectories near the upper limit of x(2). . 143
Figure 24. The online adaption of constraint tightening parameters in the proposed MPC
Figure 26. Inventory profiles determined by different control policies under the worst-
Figure 27. Cost breakdowns determined by (a) the affine control policy, (b) the
Figure 28. Lower bounds of multi-period inventory cost determined by the proposed
Figure 29. The impacts of the number of uncertainty scenarios on the generated lower
bound of the original multistage ARO problem and computational time in the data-
Figure 31. The optimal design and planning decisions at the end of the planning horizon
determined by (a) the affine decision rule method, and (b) the transformation-proximal
Figure 32. Optimal capacity expansion decisions over the entire planning horizon
determined by (a) the affine decision rule method, and (b) the transformation-proximal
Figure 33. Revenues and cost break down determined by the affine decision rule method
Figure 35. Revenues and cost break down at each time period determined by the affine
decision rule method (denoted by ADR in the figure) and the transformation-proximal
Figure 36. Optimal capacity expansion decisions over the entire planning horizon
Figure 37. Optimal feedstock purchase at each time stage determined by (a) the affine
decision rule method, and (b) the transformation-proximal bundle algorithm. ......... 208
Figure 38. Spider charts showing optimal sale quantities (kt/y) of final products at each
time stage determined by (a) the affine decision rule method, and (b) the transformation-
LIST OF TABLES
Table 4. Comparisons of problem sizes and computational results for the DRCCED
Table 5. Comparisons of problem sizes and computational results for the DRCCED
robust inventory control problem under demand uncertainty for T=5. ...................... 210
robust inventory control problem under demand uncertainty for T=10. .................... 211
Table 10. Computational performances of different solution algorithms in the multistage
robust inventory control problem under demand uncertainty for T=15. .................... 212
CHAPTER 1
INTRODUCTION
disturbance [4]. Such uncertain parameters can be product demands in process planning and processing durations in batch process scheduling [7], among others. The issue of uncertainty can unfortunately render the solution of a deterministic optimization problem (i.e., the one disregarding uncertainty) suboptimal or even infeasible [8]. The infeasibility, i.e., the violation of constraints, can be even more detrimental than a loss of solution quality. Motivated by this practical concern, optimization under uncertainty has
attracted tremendous attention from both academia and industry [4, 9-11].
In the era of big data and deep learning, intelligent use of data has a great potential to
benefit many areas. Although there is no rigorous definition of big data [12], people
typically characterize big data with five Vs, namely, volume, velocity, variety, veracity
and value [13]. Torrents of data are routinely collected and archived in process
industries, and these data are becoming an increasingly important asset in process
control, operations and design [14-18]. Nowadays, a wide array of emerging machine
learning tools can be leveraged to analyze data and extract accurate, relevant, and useful information. Deep learning, in particular, demonstrates remarkable power in deciphering multiple layers of representations from raw data without any domain expertise in designing feature extractors [19]. More recently, dramatic progress in artificial intelligence [21], especially in deep learning over the past decade [22], has sparked a flurry of interest in data-driven optimization under uncertainty, in which the uncertainty model is formulated based on data, thus allowing uncertainty data to “speak” for themselves. In this way, uncertainty data can be harnessed in an automatic manner for smart and data-driven decision making.
optimization under uncertainty, highlight the current research trends, point out the research challenges, and introduce promising methodologies that can be used to tackle these challenges. We then review recent research papers on data-driven optimization under uncertainty and classify them into four categories according to their unique approaches for uncertainty modeling and distinct problem structures. Finally, we discuss promising research directions on optimization under uncertainty in the era of big data and deep learning.
The importance of optimization under uncertainty has been witnessed by various successful applications in process synthesis and design [10, 37], production scheduling and planning [7, 38], and process control [35, 39-42]. In this section, we briefly review three conventional paradigms: stochastic programming, chance constrained programming, and robust optimization. For extensive and detailed surveys in the field of conventional optimization under uncertainty methods, we refer the reader to the previous reviews on this topic.
Stochastic programming is a classical paradigm for optimization under uncertainty that aims to optimize the expected objective value across all the uncertainty realizations [45]. The key idea of the stochastic programming approach is to model the uncertainty through probability distributions and scenarios. This approach can effectively accommodate decision making processes with various time stages. In single-stage stochastic programs, there are no recourse variables and all the decisions must be made before the uncertainty is realized. In contrast, stochastic programming with recourse can take corrective actions after uncertainty is revealed.
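As a concrete illustration of recourse, the sketch below (pure Python, all numbers hypothetical) evaluates a toy capacity problem: choose capacity now at a unit cost, then cover any demand shortfall at a premium once demand is revealed, minimizing first-stage cost plus expected recourse cost by scenario enumeration.

```python
# Toy two-stage stochastic program with recourse, solved by scenario
# enumeration. All costs and demand scenarios are hypothetical.

def recourse_cost(x, demand, premium=3.0):
    """Second-stage cost Q(x, d): cover any shortfall at a premium price."""
    return premium * max(demand - x, 0.0)

def expected_total_cost(x, scenarios, probs, unit_cost=1.0):
    """First-stage cost plus the expectation of the recourse cost."""
    expected_q = sum(p * recourse_cost(x, d) for d, p in zip(scenarios, probs))
    return unit_cost * x + expected_q

scenarios = [20.0, 50.0, 80.0]   # demand realizations
probs     = [0.3, 0.4, 0.3]      # scenario probabilities

# The cost is piecewise linear in x, so an optimal capacity can be found
# among the scenario values themselves.
best_x = min(scenarios, key=lambda x: expected_total_cost(x, scenarios, probs))
print(best_x, expected_total_cost(best_x, scenarios, probs))
```

Here the capacity is the here-and-now decision, while the shortfall purchase is the wait-and-see decision taken after demand is observed.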
Among the stochastic programming approaches with recourse, the most widely used one is the two-stage stochastic program, in which decisions are partitioned into “here-and-now” decisions made before the uncertainty realization and “wait-and-see” recourse decisions made afterwards. Its general form is given by
min_{x∈X}  c^T x + E_ω[ Q(x, ω) ]
s.t.  Ax ≤ d    (1.1)

Q(x, ω) = min_{y(ω)∈Y}  b(ω)^T y(ω)
s.t.  W(ω) y(ω) ≥ h(ω) − T(ω) x    (1.2)
after observing the uncertainty realization. The objective of the two-stage stochastic programming model includes two parts: the first-stage objective c^T x and the expectation of the second-stage objective b(ω)^T y(ω). The constraints associated with the first-stage decisions are given by Ax ≤ d. When the expectation is approximated by a finite set of scenarios, the resulting model can become challenging to solve because of the growth of computational time with the number of scenarios. To this end, decomposition based algorithms have been developed in the existing literature, including Benders decomposition or the L-shaped method [48, 49], and Lagrangean decomposition [50]. The location of binary decision variables is also critical for the choice of solution algorithm. Stochastic programming has been widely applied in the design and operation of batch processes [55-57], optimization of flow sheets [58], energy systems [59, 60], and supply chain management [61-64]. Due to its wide applicability,
immense research efforts have been made on variants of the stochastic programming approach. For instance, the two-stage formulation in (1.1) can be readily extended to a multistage setting.
The chance constrained program was first introduced in the seminal work of [69], and has attracted sustained attention since chance constraints are flexible enough to quantify the trade-off between objective performance and system reliability [70].
The general form of a chance constrained program is given as follows:

min_{x∈X}  f(x)
s.t.  P_ξ{ G(x, ξ) ≤ 0 } ≥ 1 − ε    (1.3)
where x represents the vector of decision variables, X denotes the deterministic feasible set, G(x, ξ) collects the m uncertainty-affected constraint functions, and ε ∈ (0, 1) is the risk level.
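For the simplest scalar case G(x, ξ) = ξ − x, a data-driven reading of the chance constraint in (1.3) picks x as an empirical (1 − ε)-quantile of sampled data; the Gaussian samples below are purely illustrative.

```python
# Data-driven approximation of a scalar chance constraint: require
# xi <= x with probability at least 1 - eps, using sampled data.
import math
import random

random.seed(0)
eps = 0.05
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Smallest x covering at least a (1 - eps) fraction of the samples.
ordered = sorted(samples)
k = math.ceil((1 - eps) * len(ordered)) - 1   # 0-based index of the quantile
x = ordered[k]

violation = sum(s > x for s in samples) / len(samples)
print(f"x = {x:.3f}, in-sample violation frequency = {violation:.3f}")
```

In practice the in-sample frequency should be validated on held-out data, which is exactly what the scenario approach in Section 1.2.4 makes rigorous.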
The chance constraint P_ξ{G(x, ξ) ≤ 0} ≥ 1 − ε guarantees that decision x satisfies the constraints with a probability of at least 1 − ε. Note that when the number of constraints
m=1, the above optimization model is an individual chance constrained program; for
m>1, it is called a joint chance constrained program [71]. A salient merit of chance constrained programs is that they allow decision makers to choose their own risk levels for constraint satisfaction. Chance constrained optimization with recourse has also been studied recently.
Despite its promising modeling power, the resulting chance constrained program is generally computationally intractable for the following two main reasons. First, evaluating the probability in the chance constraint typically requires a multi-dimensional integral, which is believed to be computationally prohibitive. Second, the feasible region is not convex even if set X is convex and G(x, ξ) is convex in x for any given ξ. Accordingly, a large body of related literature is devoted to the development of solution algorithms for chance constrained programs [77]. Note that chance constrained programs admit convex reformulations for some very
special cases. For example, individual chance constrained programs are endowed with
tractable convex reformulations for normal distributions [45].
In the PSE community, chance constraints are usually employed for customer demand
satisfaction, product quality specification, service level, and reliability level of chemical
processes [78-81]. Due to its practical relevance, chance constrained optimization has
been applied in numerous applications, including model predictive control [82, 83],
process design and operation [84], refinery blend planning [85], and biopharmaceutical manufacturing. Uncertainty is modeled through an uncertainty set in the robust optimization framework [94]. Given a specific uncertainty set, the idea of
robust optimization is to hedge against the worst case within the uncertainty set. The worst case can be, for example, the realization giving rise to the largest constraint violation, the realization leading to the lowest asset return [95], or the one resulting in the highest regret [96].
The conventional box uncertainty set is often not a good choice, since it includes unlikely extreme scenarios. It takes the following form:

U_box = { u : u_i^L ≤ u_i ≤ u_i^U, ∀i }    (1.4)

where U_box is a box uncertainty set, u is a vector of uncertain parameters, and u_i is the i-th component of the uncertainty vector u. u_i^L and u_i^U represent the lower bound and the upper bound of uncertain parameter u_i, respectively. The box uncertainty set simply defines the range of each uncertain parameter in vector u. One cannot easily control the size of this uncertainty set to reflect his or her risk-averse attitude. To this end, researchers proposed the budgeted uncertainty set:
U_budget = { u : u_i = ū_i + û_i z_i, −1 ≤ z_i ≤ 1, Σ_i |z_i| ≤ Γ, ∀i }    (1.5)
where U_budget denotes a budgeted uncertainty set, u and u_i have the same definitions as in (1.4), ū_i is the nominal value of u_i, û_i is the largest possible deviation of uncertain parameter u_i, z_i denotes the extent and direction of the parameter deviation, and Γ is an uncertainty budget.
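To see how Γ limits conservatism, the sketch below (made-up coefficients) maximizes a linear function a'u over the budgeted set (1.5): since each |z_i| ≤ 1 and Σ_i |z_i| ≤ Γ, a worst case is obtained by spending the budget greedily on the largest per-unit contributions |a_i|·û_i.

```python
# Worst case of a linear function over the budgeted uncertainty set (1.5):
# u_i = nom_i + dev_i * z_i with |z_i| <= 1 and sum_i |z_i| <= gamma.
# All coefficients below are hypothetical.

def worst_case_linear(a, nom, dev, gamma):
    base = sum(ai * ni for ai, ni in zip(a, nom))
    # Per-unit-of-budget gain from pushing z_i in its worst direction.
    gains = sorted((abs(ai) * di for ai, di in zip(a, dev)), reverse=True)
    budget, extra = gamma, 0.0
    for g in gains:
        take = min(1.0, budget)       # each |z_i| is capped at 1
        extra += take * g
        budget -= take
        if budget <= 0.0:
            break
    return base + extra

a   = [2.0, -1.0, 0.5]
nom = [10.0, 10.0, 10.0]
dev = [1.0, 2.0, 4.0]
print(worst_case_linear(a, nom, dev, gamma=1.5))   # partial protection
print(worst_case_linear(a, nom, dev, gamma=3.0))   # recovers the box worst case
```

With Γ = 0 the set collapses to the nominal point, and with Γ equal to the dimension it recovers the box worst case, which is exactly the tunable conservatism the budget provides.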
Static robust optimization methods [98] make all the decisions at once. This modeling framework cannot well represent sequential decision-making problems [5, 99-105]. Adaptive robust optimization (ARO) incorporates recourse decisions and thereby produces less conservative solutions than static robust optimization [105, 107-109]. The general form of two-stage ARO is

min_{x∈X}  { c^T x + max_{u∈U} min_{y∈Y(x,u)}  b^T y }    (1.6)
where x is the first-stage decision made before the uncertainty u is realized, while the second-stage recourse decision y is made afterwards. x can include both continuous and integer variables, while y only includes continuous variables. c and b are the vectors of the cost coefficients, and U is an uncertainty set that characterizes the uncertainty realizations.
Besides the two-stage ARO framework, the multistage ARO method has attracted growing attention owing to its ability to accommodate the gradual revelation of uncertainties over time [110, 111]. In multistage ARO, decisions are made sequentially, and uncertainties are revealed gradually over stages, which can bring additional value relative to the two-stage formulation. The multistage ARO method has demonstrated applications in process scheduling and planning [104, 105, 112].
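The min-max-min structure of two-stage ARO can be made concrete by brute force on a toy inventory problem with small discrete decision and uncertainty sets; all costs below are hypothetical.

```python
# Brute-force evaluation of min_x max_{u in U} min_y cost(x, u, y)
# on a toy robust inventory problem. Illustrative numbers only.

def second_stage(x, u):
    """Recourse after demand u is revealed: choose the cheaper of an
    emergency order and a lost-sales penalty for the shortfall."""
    shortfall = max(u - x, 0)
    return min(5 * shortfall,   # emergency purchase at 5 per unit
               7 * shortfall)   # lost-sales penalty at 7 per unit

U = [1, 3, 5]                   # discrete uncertainty set for demand

def worst_case_cost(x):
    """First-stage ordering cost plus the worst-case recourse cost."""
    return 2 * x + max(second_stage(x, u) for u in U)

x_star = min(range(6), key=worst_case_cost)
print(x_star, worst_case_cost(x_star))
```

Real ARO problems replace the enumeration with duality-based reformulations or cutting-plane algorithms, but the nested min-max-min logic is the same.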
Despite the popularity of the above three leading paradigms for optimization under uncertainty, these approaches have their own limitations and specific application scopes. To this end, research efforts have been made on “hybrid” methods that leverage the complementary strengths of different paradigms. For example, stochastic programming was integrated with robust optimization for supply chain design and operation, and hybrid models along with global solution algorithms were developed and applied to process design.

1.2 Existing methods for data-driven optimization under uncertainty
In this section, we review the recent advances in optimization under uncertainty in the era of big data and deep learning. Recent years have witnessed rapidly growing research activity on this topic. The relevant literature covers various topics and can be roughly classified into four categories, namely data-driven stochastic programming and distributionally robust optimization, data-driven chance constrained programming, data-driven robust optimization, and the scenario optimization approach. None of these methods assumes that the uncertainty model is perfectly given a priori; rather, they all focus on the practical
1.2.1 Data-driven stochastic program and distributionally robust
optimization
In data-driven stochastic programming and distributionally robust optimization (DRO), uncertainty is modeled via a family of probability distributions that well capture the uncertainty data at hand. This set of probability distributions is referred to as an ambiguity set. We then present and analyze various types of ambiguity sets alongside their corresponding strengths and weaknesses.
The exact uncertainty distribution is rarely available in practice. Instead, what the decision maker has is a set of historical and/or real-time uncertainty data and possibly some prior distributional knowledge, from which one can construct an ambiguity set. Rather than assuming a single uncertainty distribution, the DRO approach hedges against the distribution errors and accounts for the input of uncertainty data. The general DRO problem is formulated as follows [122]:

min_{x∈X}  max_{P∈D}  E_P[ l(x, ξ) ]    (1.7)

where x is the vector of decision variables, X is the feasible set, l is the objective function, and ξ is the vector of uncertain parameters whose probability distribution P is only known to reside in an ambiguity set D. The DRO approach aims for optimal decisions under the worst-case distribution, and as a result offers a performance guarantee over the whole family of distributions.
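To make the min-max structure of (1.7) tangible, the sketch below enumerates a tiny ambiguity set of candidate distributions over a finite support; real ambiguity sets are continuous families, and all numbers here are made up.

```python
# DRO with a finite support and an explicitly enumerated ambiguity set:
# minimize the worst-case expected loss over the candidate distributions.

xi_support = [0.0, 1.0, 2.0]            # uncertainty realizations
ambiguity = [                           # candidate distributions (toy)
    [0.5, 0.3, 0.2],
    [0.4, 0.3, 0.3],
    [0.6, 0.2, 0.2],
]

def loss(x, xi):
    return (x - xi) ** 2                # a simple convex loss

def worst_case_expectation(x):
    return max(sum(p * loss(x, xi) for p, xi in zip(dist, xi_support))
               for dist in ambiguity)

# Crude one-dimensional search over candidate decisions.
grid = [i / 100 for i in range(201)]
x_star = min(grid, key=worst_case_expectation)
print(x_star, worst_case_expectation(x_star))
```

Because the worst-case expectation is a pointwise maximum of convex functions, it is itself convex in x, which is one reason DRO problems often stay tractable.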
The DRO or data-driven stochastic optimization framework enjoys two salient merits compared with the conventional stochastic programming approach. First, it allows the direct integration of uncertainty data into the optimization. As a result, the data-driven stochastic programming approach greatly mitigates the issue of the optimizer’s curse and improves the out-of-sample performance. Second, it partially inherits computational tractability from robust optimization, and some resulting problems can be solved exactly without scenario discretization. For example, optimization problem (1.7) for a convex program with a moment-based ambiguity set can often be reformulated as a tractable conic program.
The choice of ambiguity set plays a critical role in the performance of DRO. When choosing an ambiguity set, the decision maker needs to consider the following three factors, namely tractability, statistical meaning, and performance [123]. First, the data-driven optimization problem equipped with the ambiguity set should remain computationally tractable, e.g., reformulated as linear, quadratic or semidefinite programs. Second, the derived ambiguity set should have a clear statistical meaning; ambiguity sets constructed from finite uncertainty data were extensively studied [122, 124, 125]. Third, the devised ambiguity set should deliver good out-of-sample performance. One popular class is the moment-based approaches, in which first and second order information is extracted from uncertainty data using statistical inference [126]. The ambiguity set that specifies the support, first and second moments takes the following form:
D_M = { P : P{ ξ ∈ Ξ } = 1, E_P[ξ] = μ, E_P[(ξ − μ)(ξ − μ)^T] = Σ }    (1.8)

where ξ represents the uncertainty vector, Ξ is the support, P represents the probability distribution, and E_P[·] denotes the expectation with respect to distribution P. Parameters μ and Σ represent the mean vector and the covariance matrix, respectively.
The ambiguity set in (1.8) fails to account for the fact that the mean and covariance matrix are themselves subject to estimation errors. To this end, an ambiguity set was proposed based on the distribution’s support information as well as the confidence regions for the mean and second-moment matrix in the work of [122]. The resulting DRO problem could be solved in polynomial time. This ambiguity set takes the form

D_DY = { P : P{ ξ ∈ Ξ } = 1, (E_P[ξ] − μ)^T Σ^{−1} (E_P[ξ] − μ) ≤ ψ1, E_P[(ξ − μ)(ξ − μ)^T] ≼ ψ2 Σ }    (1.9)

where ξ represents the uncertainty vector, Ξ is the support, and P represents the probability distribution whose realizations reside in the support set Ξ. Parameters ψ1 and ψ2 are used to define the sizes of confidence regions for the first and second moment information, respectively.
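A one-dimensional sketch of this confidence-region idea: estimate the mean and variance from data, then admit a candidate distribution only if its moments satisfy conditions controlled by parameters in the spirit of ψ1 and ψ2. The thresholds below are arbitrary; in practice they come from statistical confidence bounds.

```python
# Moment-based membership test in one dimension, using empirical moments.
import random
import statistics

random.seed(1)
data = [random.gauss(5.0, 2.0) for _ in range(5000)]
mu = statistics.fmean(data)
sigma2 = statistics.pvariance(data, mu)

def in_ambiguity_set(cand_mean, cand_var, psi1=0.05, psi2=1.2):
    """Candidate mean must lie in a small standardized ball around mu,
    and the candidate second central moment must be bounded by psi2 * sigma2."""
    mean_ok = (cand_mean - mu) ** 2 / sigma2 <= psi1
    var_ok = cand_var + (cand_mean - mu) ** 2 <= psi2 * sigma2
    return mean_ok and var_ok

print(in_ambiguity_set(5.0, 4.0))   # moments close to the empirical ones
print(in_ambiguity_set(9.0, 4.0))   # mean far outside the confidence region
```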
Moment-based ambiguity sets have also been tailored for improved computational tractability. For example, DRO with the ambiguity set based on principal component analysis and first-order deviation functions was developed [125]. Additionally, the approach was applied to process network planning and batch production scheduling [125]. Recently, a data-driven DRO model was developed for the optimal design and operations of shale gas supply chains to hedge against uncertainties associated with shale well estimated ultimate recovery and product demand [127]. However, the moment-based ambiguity set is not guaranteed to converge to the true probability distribution as the number of uncertainty data goes to infinity. Consequently, this type of ambiguity set suffers from conservatism even with a moderate amount of uncertainty data. To address the above issue, metric-based ambiguity sets have been proposed, which take the general form

D = { P : d(P, P0) ≤ θ }    (1.10)
where P is the probability distribution of uncertain parameters, P0 represents the reference distribution (typically the empirical distribution of the data), d(P, P0) denotes the distance between the two distributions, and θ stands for the confidence level.
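One concrete choice of the metric d in (1.10) is the type-1 Wasserstein distance; for two equal-size empirical distributions on the real line it reduces to the mean absolute difference of the sorted samples. The sketch below uses synthetic data and an arbitrary radius θ.

```python
# 1-D Wasserstein distance between two empirical distributions with the
# same number of samples: average gap between sorted samples.
import random

random.seed(0)
p  = sorted(random.gauss(0.0, 1.0) for _ in range(2000))
p0 = sorted(random.gauss(0.5, 1.0) for _ in range(2000))

w1 = sum(abs(a - b) for a, b in zip(p, p0)) / len(p)
theta = 0.6                       # arbitrary ambiguity-set radius
print(f"W1 = {w1:.3f}, inside ambiguity set: {w1 <= theta}")
```

Since the two samples differ mainly by a mean shift of 0.5, the computed distance hovers near 0.5; shrinking θ with the sample size is what drives the statistical guarantees of Wasserstein DRO.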
The ambiguity set in (1.10) can be further classified based on the adopted distance metric, such as the ϕ-divergence or the Wasserstein metric. For example, a DRO model was proposed for the lot-sizing problem, in which the chi-square goodness-of-fit test and robust optimization were combined. The ambiguity set of demand was constructed from uncertainty data by using a hypothesis test in statistics, called the chi-square goodness-of-fit test [129]. This set is well defined by linear constraints and second order cone constraints. It is worth noting that the input of their model was an estimated probability vector used to characterize the distribution, and the adopted statistic belongs to the ϕ-divergence family. Research efforts have further extended the adaptive DRO method by incorporating recourse decision variables [131, 132]. A data-driven two-stage DRO model takes the following form:
min_{x∈X}  c^T x + max_{P∈D}  E_P[ Q(x, ξ) ]
s.t.  Ax ≤ d    (1.11)

Q(x, ξ) = min_{y∈Y}  b(ξ)^T y
s.t.  T(ξ) x + W(ξ) y ≥ h(ξ)
where x represents the vector of first-stage decision variables that need to be determined before the uncertainty is realized, y denotes the vector of second-stage decision variables that can be adjusted based on the realized uncertain parameters ξ, and the sets X and Y define the corresponding feasible regions. The objective is to minimize the worst-case expected cost with respect to all possible uncertainty distributions within the ambiguity set D. Based on the literature, multistage data-driven DRO has also begun to attract research attention.
Data-driven stochastic programming has several salient merits over the conventional stochastic programming approach. However, based on the existing literature, there are few papers on its PSE applications [125, 127]. In real world applications, the trend of big data has fueled the increasing popularity of data-driven stochastic programming in many areas, especially in power systems operation. Recently, DRO has emerged as a new paradigm that models uncertainty through an ambiguity set, and it has various applications in power systems, such as unit commitment and economic dispatch. Although both the data-driven chance constrained program and DRO adopt ambiguity sets in their uncertainty models, they have distinct model structures.
Specifically, the data-driven chance constrained program features constraints subject to uncertainty in probability distributions, while DRO typically only involves the worst-case expectation in the objective. The conventional chance constrained program assumes that the probability distribution information is perfectly known. However, the decision maker rarely has such complete knowledge; distributional information is typically estimated from uncertainty data or obtained from expert knowledge. On the other hand, even if the true distribution were available, working with it directly could be computationally cumbersome. In practice, one can only have partial information on the probability distribution. Consequently, the distributionally robust chance constrained program emerges as another paradigm for hedging against uncertainty in the era of big data. Its general form is given by
min_{x∈X}  f(x)
s.t.  min_{P∈D}  P_ξ{ G(x, ξ) ≤ 0 } ≥ 1 − ε    (1.12)
where x represents the vector of decision variables, X denotes the deterministic feasible set, D is the ambiguity set, and ε is the risk level. The computational tractability of (1.12) can vary depending on both the ambiguity sets and the structure of the optimization problem. In the following, we summarize the relevant papers according to the adopted ambiguity sets.
Distributionally robust individual linear chance constraints under the ambiguity set comprised of all distributions sharing the same known mean and covariance were studied for the distribution families of (a) independent random variables with box-type support and (b) radially symmetric non-increasing distributions over the orthotope support. The worst-case formulation of chance constraints was studied assuming known first and second moments [139], and unimodality was then incorporated into the ambiguity set, with the corresponding ambiguous chance constraints analyzed accordingly.
In real world applications, exact moment information can be challenging to obtain, and can only be estimated through confidence intervals from uncertainty realizations [122]. Related work includes constructing a moment-based ambiguity set [142], employing the Chebyshev ambiguity set with bounds on the second-order moment [143], and characterizing a family of distributions with upper bounds on both mean and covariance [144]. Ambiguous joint chance constraints were studied where the ambiguity set was characterized by the mean, convex support, and an upper bound on the dispersion [145], and the resulting constraints were conic representable for right-hand side uncertainty. Further developments of distributionally robust chance constraints were made under the ambiguity sets defined by mean and variance [147], convex moment constraints [148], and mean absolute deviation.
Although moment-based ambiguity sets have achieved certain success, they do not converge to the true probability distribution as the number of available uncertainty data increases. Metric-based alternatives have therefore been explored. The Prohorov metric was introduced into the distributionally robust chance constraints, and the resulting optimization problem was approximated by using a robust sampled problem [151]. Distributionally robust chance constraints with the ambiguity set based on the ϕ-divergence were cast as classical chance constraints with an adjusted risk level [128]. Data-driven chance constrained programs with ϕ-divergence based ambiguity sets were proposed [152], and further extensions were made using the kernel smoothing method
[27, 31]. Recently, data-driven chance constraints over Wasserstein balls were exactly reformulated [153]. Ambiguous chance constraints with the Wasserstein ambiguity set were also studied for linear constraints with both right and left hand side uncertainty. These data-driven chance constrained programs have found applications in various areas, such as power systems [158], stochastic control [159], and vehicle routing problems [160].
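The logic of (1.12) can be sketched with a finite support and a handful of explicitly enumerated candidate distributions: a decision is feasible only if the constraint holds with probability at least 1 − ε under every candidate, i.e., under the worst case. The distributions below are toys.

```python
# Distributionally robust chance constraint over a finite ambiguity set:
# feasibility requires worst-case violation probability <= eps.

support = [1.0, 2.0, 3.0, 4.0]           # realizations of xi
ambiguity = [                            # candidate distributions (toy)
    [0.40, 0.30, 0.20, 0.10],
    [0.30, 0.30, 0.25, 0.15],
    [0.35, 0.25, 0.25, 0.15],
]

def worst_case_violation(x):
    """Largest probability of G(x, xi) = xi - x > 0 over the ambiguity set."""
    return max(sum(p for p, xi in zip(dist, support) if xi > x)
               for dist in ambiguity)

eps = 0.2
# Smallest candidate decision meeting the DR chance constraint.
x_star = min(x for x in support if worst_case_violation(x) <= eps)
print(x_star, worst_case_violation(x_star))
```

Note how the worst-case requirement pushes the decision higher than any single nominal distribution would: it is the price paid for robustness against distributional ambiguity.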
Uncertainty sets directly determine robust optimal solutions and therefore should be devised with special care. Conventional uncertainty sets are typically set a priori using a fixed shape and/or model without providing sufficient flexibility to capture the structure and complexity of uncertainty data. For example, the geometric shapes of the uncertainty sets in (1.4) and (1.5) do not change with the intrinsic structure and complexity of uncertainty data. Furthermore, these uncertainty sets are specified without any direct input from the uncertainty data. To close this gap, a data-driven ARO framework that leverages the power of the Dirichlet process mixture model was proposed [32]. The data-driven approach for defining the uncertainty set was developed based on Bayesian machine learning. This machine learning model was then integrated with the ARO method through a four-level optimization framework. A salient feature is that multiple basic uncertainty sets are used to provide a high-fidelity description of the uncertainty data. Although this framework enjoys these features, it does not account for an important evaluation metric, known as regret, in robust decision making. Accordingly, an ARO framework was developed that effectively accounted for the conventional worst-case criterion together with the regret.
In some applications, uncertainty data in large datasets are usually collected under varied conditions, which makes it possible to attach labels to the uncertainty data [163]. Machine learning methods including the Dirichlet process mixture model can then be used to build the data-driven uncertainty model illustrated in Figure 1. A stochastic robust optimization framework was further proposed based on this data-driven uncertainty model, in which the outer problem followed the two-stage stochastic programming approach, while ARO was nested within it.
Figure 1. The data-driven uncertainty model based on the Dirichlet process mixture
model.
To mitigate the computational burden, research effort has been made on convex polyhedral data-driven uncertainty sets. A data-driven robust optimization framework that leveraged the power of principal component analysis and kernel smoothing for decision-making under uncertainty was studied [34]. In this approach, to accommodate asymmetric distributions, forward and backward deviation vectors were utilized in the uncertainty set, which was further integrated with robust optimization models. A data-driven static robust optimization framework based on support vector clustering, which aims to find the hypersphere with minimal volume enclosing the uncertainty data, was proposed [164]. The adopted piecewise linear kernel incorporates the covariance information, thus effectively capturing the correlation among uncertainties. These two data-driven robust optimization approaches utilized polyhedral uncertainty sets learned from data. Data-driven uncertainty sets were also developed for static robust optimization based on statistical hypothesis tests [165]. Furthermore, a robust optimization framework based on density M-estimation was developed [112]. The salient feature of the framework was its robustness to data outliers: robust kernel density estimation was employed to extract probability distributions from uncertainty data, so the constructed uncertainty set is immunized against outliers. An exact robust counterpart was developed for solving the resulting problem.
In recent years, data-driven robust optimization has been applied to a variety of areas, such as power systems [33], industrial steam systems [168], and planning and scheduling.
1.2.4 Scenario optimization approach for chance constrained programs
A salient feature of scenario-based optimization is that it does not require the explicit knowledge of the true probability distribution. Although it can be deemed a special type of robust optimization that has a discrete uncertainty set consisting of uncertainty data, it can provide a probabilistic guarantee for those unobserved uncertainty data in the testing data set. Note that the scenario-based optimization approach provides a viable approximation of chance constrained programs, in which uncertainty data are utilized in a more direct manner compared with other data-driven optimization methods. This data-driven optimization framework was first introduced in [170], and has gained great popularity within the systems and control community [171]. As in data-driven chance constrained programs, the goal is to ensure constraint satisfaction under unseen uncertainty realizations. Specifically, the scenario approach enforces the constraints for every sampled scenario:
min_{x∈X}  c^T x
s.t.  f(x, u^(i)) ≤ 0,  i = 1, …, N    (1.13)
where x is the vector of decision variables, X represents a deterministic convex and closed set unaffected by uncertainty, c is the vector of cost coefficients, and f denotes a constraint function that is convex in x. In the scenario optimization literature, ω = (u^(1), …, u^(N)) is referred to as the multi-sample or scenario that is drawn from the product probability space. Due to the random
nature of the multi-sample, the optimal solution of the scenario optimization problem
(1.13), denoted as x*(ω), is also random. One key merit of the scenario approach is that
the scenario optimization problem admits the same problem type as its deterministic
f(x, u) is convex in x [172]. Moreover, the optimal solution x*(ω) is guaranteed to satisfy
the constraints with other unseen uncertainty realizations with a high probability [173].
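As a one-variable toy instance of this construction, consider minimizing x subject to the uncertain constraint x ≥ u: the scenario approach keeps only the N sampled constraints, so the optimizer is simply the sample maximum. A minimal sketch (all names are illustrative):

```python
import random

def scenario_solution(samples):
    # Scenario approximation of: min x  s.t.  x >= u for uncertain u.
    # Only the N sampled constraints x >= u_i are enforced, so x* = max u_i.
    return max(samples)

random.seed(0)
data = [random.uniform(0.0, 1.0) for _ in range(100)]
x_star = scenario_solution(data)

# Empirical violation probability of x_star on fresh, unseen samples
fresh = [random.uniform(0.0, 1.0) for _ in range(10000)]
violation = sum(u > x_star for u in fresh) / len(fresh)
```

With a single decision variable and N = 100 samples, the violation probability of the scenario solution on fresh data is small, in line with the probabilistic guarantees of the scenario theory.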
For the sake of clarity, we revisit the following definition and theorem [173].
defined as follows:
V(x) = \mathbb{P}\{ u \in \Xi : f(x, u) > 0 \} \qquad (1.14)
where V(x) denotes the probability of violation for a given x, and Ξ represents the
Theorem 1.1 Assuming x*(ω) is the unique optimal solution of the scenario
optimization problem (1.13), it holds that
\mathbb{P}^N \{ V(x^*(\omega)) > \varepsilon \} \le \sum_{i=0}^{n-1} \binom{N}{i} \varepsilon^i (1 - \varepsilon)^{N-i} \qquad (1.15)
where n is the number of decision variables, N denotes the number of uncertainty data,
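The binomial tail on the right-hand side of (1.15) can be evaluated directly; a minimal sketch (function and variable names are illustrative):

```python
from math import comb

def violation_bound(n_vars, n_samples, eps):
    # Right-hand side of (1.15): the binomial tail summed over i = 0, ..., n-1,
    # bounding the probability that the scenario solution violates the chance
    # constraint with probability larger than eps.
    return sum(comb(n_samples, i) * eps**i * (1.0 - eps)**(n_samples - i)
               for i in range(n_vars))
```

The bound shrinks as the number of samples N grows, which is how one sizes N to reach a desired confidence level.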
The above theorem implies that the optimal solution x*(ω) satisfies the corresponding
chance constraint with a certain confidence level. The proof of this theorem depends on
the fundamental fact that the number of support constraints, the removal of which
changes the optimal solution, is upper bounded by the number of decision variables
[170]. Note that (1.15) holds with equality for the fully-supported convex optimization
problem [173], meaning that the probability bound is tight. Additionally, the result holds
By exploiting the structured dependence on uncertainty, the sample size required by the
considerable research efforts have been made on the degree of violation [175], expected
probability of constraint violation [176], and the performance bounds for objective
values [177]. To make a trade-off between feasibility and performance, the case was
studied where some of the sampled constraints were allowed to be violated for
improving the performance of the objective [178]. Subsequent work along this direction
optimization framework was proposed in which the level of robustness was assessed a
posteriori after the optimal solution was obtained [180]. Recently, the extension of
scenario-based optimization to the multistage decision making setting was made [181,
182].
While scenario optimization problems with continuous decision variables have been
extensively studied [171], mixed-integer scenario optimization is less developed.
An attempt to extend the scenario theory to random convex programs with mixed-
integer decision variables was made [183], and the Helly dimension in the mixed-integer
variables. This result suggests that the required sample size can be prohibitively large
for scenario programs with many discrete variables. Along this research direction, two
In some real-world applications, the required sample size can be very large, resulting in
a great computational burden for scenario optimization problems with a huge number of
developed for convex scenario optimization problems [185], and fell into the
these sequential algorithms is that validating a given solution with a large number of
feasibility check [187]. The trade-off between the sample size and the expected number
of repetitions was also revealed in the repetitive scenario design [187]. Note that the
one seeks to find the solution at one step. Another effective way to reduce the
can be efficiently solved via constraint consensus schemes [190]. Along this direction,
a distributed computing framework was developed for the scenario convex program
with multiple processors connected by a graph [188]. The major advantage of this
approach is that the computational cost for each processor becomes lower and the
procedure, i.e. the optimization step and detuning step [191]. As a consequence, the total
problems, in which the number of support constraints is upper bounded by the number
of decision variables. However, such upper bounds are no longer available in nonconvex
scenario theory to the nonconvex setting. To date, few works have considered
nonconvex uncertain programs using the scenario approach. One contribution is that of
manner through the concept of support sub-sample was proposed. The proposed
scenario optimization problems. Another attempt to address nonconvex scenario
optimization made use of the statistical learning theory for bounding the violation
probability, and devised a randomized solution algorithm [193]. The statistical learning
theory-based method provided the probabilistic guarantee for all feasible solutions, as
opposed to the convex scenario approach where such guarantee is valid only for the
optimal solution. This unique feature regarding probabilistic guarantees for all feasible
solutions granted by the statistical learning based method is of practical relevance [194],
global optimality. A class of non-convex scenario optimization problems, which have non-
convex objective functions and convex constraints, was recently studied [195]. Since
Helly’s dimension for the optimal solution of such a non-convex scenario program
theorem is impossible. To overcome the research challenge, the feasible region was
restricted to the convex hull of a few optimizers, thus enabling the application of sample
In this subsection, we present three types of deep learning techniques, including deep
belief networks, convolutional neural networks, and recurrent neural networks, and
Among deep learning techniques, deep belief networks (DBNs) are becoming
latent features [196]. DBNs essentially belong to probabilistic graphical models and are
network structure is designed based on the fact that a single RBM with only one hidden
layer falls short of capturing the intrinsic complexities in high-dimensional data. As the
building blocks for DBNs, RBMs are characterized as two layers of neurons, namely
hidden layer and visible layer. Note that the hidden layer can be regarded as the abstract
representation of the visible layer. There are undirected connections between these two
layers, while there exist no intra-connections within each layer. The training process of
scheme. Armed with multiple layers of hidden variables, DBNs enjoy unique power in
practical applications. As a result, DBNs have been applied in a wide spectrum of areas,
including fault diagnosis [197], soft sensor [198], and drug discovery [199]. DBNs can
Gaussian process model was proposed as a special type of DBN based Gaussian process
mappings. Due to its unique advantage in nonlinear regression, deep Gaussian process
Convolutional neural networks (CNNs) are one specialized version of deep neural
networks [200], and they have become increasingly popular in areas such as image
CNNs are designed to fully exploit the three main ideas, namely sparse connectivity,
weight sharing, and equivariant representations [19]. This kind of neural network is
suited for processing data in the form of multiple arrays, particularly two-dimensional
images. A typical CNN is composed of convolution layers,
nonlinear layers, and pooling layers. In convolution layers, feature maps are extracted
by performing convolutions between local patches of data and filters. The filters share the
same weights when moving across the dataset, leading to a reduced number of parameters
in networks. The obtained results are further passed through a nonlinear activation
function, such as rectified linear unit (ReLU). After that, pooling layers, such as max
pooling and average pooling, are applied to aggregate semantically similar features.
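The convolution, nonlinearity, and pooling operations described above can be sketched on toy one-dimensional data (a minimal illustration, not a full CNN; all names are invented for this sketch):

```python
def conv1d(signal, kernel):
    # Valid 1-D convolution (implemented as cross-correlation, as in CNN layers)
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(xs):
    # Rectified linear unit activation applied elementwise
    return [max(0.0, x) for x in xs]

def max_pool(xs, size):
    # Non-overlapping max pooling over windows of the given size
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

# A filter [1, 0, -1] responds to local differences; pooling then keeps
# the strongest response in each window, aggregating similar features.
feature_map = max_pool(relu(conv1d([1, -2, 3, -1, 2, 0, 1, -1], [1, 0, -1])), 2)
```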
Such different types of layers are alternately connected to extract hierarchical features
with various abstractions. For the purpose of classification, a fully connected layer is
stacked after extracting the high-level features. Although CNNs are mainly used for
image classification, they have been used to learn spatial features of traffic flow data at
nearby locations which exhibit strong spatial correlations [201]. Given its unique power
in spatial data modeling, CNNs hold the potential to model uncertainty data with large
addition, CNNs can be trained on labeled multi-class uncertainty data to perform
the task of classification. Therefore, the output of a CNN potentially acts as the
Besides the aforementioned models for spatial data, recurrent neural networks (RNNs)
are widely recognized as the state-of-the-art deep learning technique for processing time
series data, especially those from language and speech [202]. RNNs can be considered
as feedforward neural networks if they are unfolded in time. The architecture of
an RNN possesses a unique structure of directed cycles among hidden
units. In addition, the inputs of a hidden unit come from both the hidden unit at the
previous time step and the input unit at the current time step. Accordingly, these hidden units in the
architecture of RNNs constitute the state vectors and store the historical information of
past input data. With this special architecture, RNNs are well-suited for feature learning
for sequential data and demonstrate successful applications in various areas, including
natural speech recognition [202], and load forecasting [203]. However, one drawback
of RNNs is their weakness in storing long-term memory due to vanishing and exploding
gradient problems. To address this issue, research efforts have been made on variants
of RNNs, such as long short-term memory (LSTM) and gated recurrent unit (GRU)
[204]. By explicitly incorporating input, output and forget gates, LSTM enhances the
uncertain parameters are collected. Uncertainty data realized at different time stages
often exhibit temporal dynamics. To this end, deep learning techniques, such as deep
RNNs and LSTM, could be leveraged to decipher the temporal dynamics and
In Chapter 2, we propose a novel data-driven Wasserstein distributionally robust
optimization model for hedging against uncertainty in the optimal biomass with
which is utilized to quantify their distances from the data-based empirical distribution.
Equipped with this ambiguity set, the two-stage distributionally robust optimization
model not only accommodates the sequential decision making at design and operational
stages, but also hedges against the distributional ambiguity arising from finite amount
of uncertainty data. A solution algorithm is further developed to solve the resulting two-
Specifically, wind power data are utilized to train f-GAN, in which its discriminator
Based upon this ambiguity set, a data-driven joint chance constrained ED model is
regarding wind power utilization. To facilitate its solution process, the resulting
free chance constraints, which are further tackled using a scenario approach. Theoretical
a priori bound on the required number of synthetic wind power data generated by f-
Predictive Control (MPC) for linear time-invariant systems under additive stochastic
required to hold for a family of distributions called an ambiguity set. The ambiguity set
that is self-adaptive to the underlying data structure and complexity. Specifically, the
moment information of each mixture component is incorporated into the ambiguity set.
set. As more data are gathered during the runtime of the controller, the ambiguity set is
updated online using real-time disturbance data, which enables the risk-averse
recursive feasibility and closed-loop stability of the proposed MPC are established via
partitioning recourse decisions into state and control decisions, the proposed algorithm
applies affine control policy only to state decisions and allows control decisions to be
proposed multi-to-two transformation remains valid for other types of causal control
policies besides the affine one. The proximal bundle method is developed for the
CHAPTER 2
2.1 Introduction
With growing concerns over energy crisis and global warming, the utilization of
renewable energy sources is growing rapidly around the globe [205]. As a renewable
energy source, biomass can be easily stored until needed and has the potential to be
converted into a plethora of biofuels and bioproducts [206]. Biomass feedstock has
fractionation [211], have advanced significantly in recent years [212]. There are
biofuels typically feature the feedstocks of edible energy crops, such as corn and
sugarcane, and lead to the competition between food and fuel. To address this issue, the
generation is known to be produced from algae and can reduce land use compared with
Additionally, agricultural and organic waste sources, like animal manure and slurry
renewable energy demand [217]. For instance, food waste is considered as a valuable
organic content [218]. Given a myriad of possible feedstocks and technologies, unraveling
the optimal biomass processing routes from a process and product network in a
network design add more complexity to the decision-making process [220]. The
industry, and this data holds huge potential to support the network design. Recently,
employing the power of machine learning techniques [221] that include, but are not
limited to, Bayesian nonparametric models [32], kernel learning [164], principal
component analysis [34], and robust kernel density estimation [222]. Nowadays, a wide
information for better decisions in the bioconversion network design and operation.
Due to the significance of energy systems design, a growing body of literature leverages
[225], and life cycle optimization [226]. Nevertheless, the issue of uncertainty could
infeasible [8]. To this end, the bioenergy system design subject to uncertainty has been
extensively investigated in the existing literature [227]. There are various types of
[106], an adaptive robust optimization based network design method was proposed to
identify economical and efficient biofuel and bioproduct production pathways [108].
While robust optimization has achieved success in various applications, this method
popularity due to the fact that it can incorporate the probability distribution information
to alleviate the conservatism, yet it generally scales poorly in the problem dimensions
[45]. Recently, the biodiesel production model considering diversified raw materials
account for risk aversion, stochastic programming models based on conditional value at
risk and downside risk were proposed for the optimal network design of hydrocarbon
biorefinery under supply and demand uncertainties [63]. The stochastic programming
design problem, and the resulting optimization model aimed to minimize the
expectation of costs under a number of scenarios associated with biomass availability,
fuel demand, and technology evolution [230]. To address the design of sustainable
model was presented, in which uncertain purchase prices were assumed to follow
The employment of the stochastic programming method is widespread in this area, and
most existing studies typically use the Monte Carlo method to generate uncertainty data
known, and it is only observable through a finite number of uncertainty data. Due
to such limited amount of uncertainty data, the assumed probability distribution could
significantly deviate from the underlying true distribution. If the stochastic program for
when evaluating its optimal solution with a testing dataset [121]. The out-of-sample
solution evaluated at some uncertainty scenarios, which are different from the ones used
The moment-based ambiguity set in DRO is not guaranteed to converge to the true
probability distribution, as the number of uncertainty data increases. Therefore, this type
of ambiguity set suffers from the conservatism issue [124]. Thus, it is imperative to
develop a novel optimization method for biomass network design that can (a) effectively
hedge against the distributional ambiguity; (b) leverage the value of uncertainty data via
statistical machine learning; (c) lead to tractable model formulations that are amenable
for applications; and (d) provide optimal solutions with better out-of-sample
performance in terms of lower average cost and lower variance compared with
distributionally robust network design model, in which technology selection and sizing
are made at the first stage, while operation decisions are made at the second stage.
realistic setting where the true probability distribution can be inferred from a set of
historical uncertainty data. Based on the Wasserstein metric, we construct the data-
driven ambiguity set as a ball (a.k.a. Wasserstein ball) in the probability space centered
can be used to measure the distance between probability distributions, our research work
adopts the Wasserstein metric rather than using the Lévy-Prokhorov metric in the DRO
framework following the literature [38]. The ambiguity set based on the Wasserstein
Wasserstein ambiguity set has gained increasing popularity, and is widely adopted in
multistage adaptive DRO [232], adaptive robust stochastic optimization [233], and
stochastic programming model can be considered as a special case of the proposed DRO
model when the “radius” of the Wasserstein ball is tuned to be zero. Nonlinear scaling
technology’s capital cost associated with the corresponding capacity. Notably, this
research work involves all three generations of biofuels. According to the taxonomy
of uncertainty types [228], uncertainty can be classified into three categories, namely
randomness, epistemic, and deep uncertainty. In the studied problem, the uncertainty
address this type of uncertainty, because it hedges against the ambiguity of distribution
distributions. The data-driven Wasserstein DRO model harnesses the advantages of both
orientated approach regularizes the optimization problem and effectively hedges against
the worst-case distribution within the ambiguity set [235], thereby remedying the
drawback of the stochastic programming method. To the best of our knowledge, the
proposed model represents the first attempt to employ the data-driven Wasserstein DRO
to address the biomass network design problem under uncertainty. The resulting
which cannot be solved directly by any off-the-shelf optimization solvers. The “multi-
level” means that the resulting optimization problem has a “min-max-min” optimization
uncertainty to demonstrate the effectiveness of the proposed approach. The better out-
cost and lower variance is validated in a case study of a biomass with agricultural waste-
sensitivity analysis is also performed to evaluate the impact of the ambiguity set’s size
In this section, we formally state the problem of biomass with agricultural waste-to-
network. This network has various conversion pathways featuring a diversified portfolio
materials or feedstocks into sustainable energy and useful bioproducts such as biofuels
and biogas. Accordingly, the network holds great value for not only producing clean
energy, but also for managing agricultural waste. There is a total of 216 processing and
feedstocks include soybean, corn, sugarcane, hard wood, soft wood, switchgrass, algae,
cassava, brown grease, corn stover, tomato peels, potato peels, orange peels, olive
waste, municipal solid waste, dairy manure, poultry litter, and swine manure. These
various types of feedstocks in the network are converted to energy and bioproducts in
the following way. First, feedstocks are decomposed into some basic chemical
compounds via processing technologies, such as hydrothermal liquefaction [237]. Those
chemical compounds are then used for producing biofuels or bioproducts through
upgrading technologies. The final products are sold to the market. Note that some of the
potential pathways in this network are “waste-to-energy” pathways, meaning that they
convert waste materials into energy-rich bioproducts and biofuels [238]. One main
component in waste feedstocks is the agricultural waste, including food waste and
animal manure [239]. Specifically, tomato peels, potato peels, and orange peels can
serve as feedstocks to produce chemical materials, like beta carotene, chlorogenic acid,
caffeic acid, and pectin. Different types of anaerobic digesters (ADs), such as the mixed
plug AD and the horizontal plug flow AD, serve as technologies that convert dairy
manure, poultry litter, and swine manure into biogas [240]. As a fuel source, the
biogas can be further used to produce heat and electricity [241], thus providing immense
Figure 2. The structure of the biomass with agricultural waste-to-energy network
The most recognized type of uncertainty is the volatility in purchasing prices of biomass
resources. In this research work, the feedstock price uncertainty is considered for the
following reasons. On one hand, uncertain biomass feedstock prices typically fluctuate
due to policy changes and energy markets. Given the lifetime of equipment, the
the optimal bioconversion network design. On the other hand, real feedstock price data
are well documented and can be easily acquired to validate the effectiveness of the
proposed data-driven approach. Within the proposed Wasserstein DRO model, useful
statistical information embedded in the uncertainty data is leveraged, and then the
ambiguity set based on the Wasserstein metric is constructed. Note that policy changes
and energy markets could lead to time-variant price distributions, which further cause
the ambiguity of probability distributions. For these two sources of uncertainty, the
DRO approach works because it uses an ambiguity set to hedge against the distributional
the DRO method works, since their underlying true distributions can only be partially known
due to the limited number of uncertainty data. If the probability distribution can be
perfectly known to the decision maker, adding the distributional robustness is not
necessary.
annualized cost. This worst-case expected cost is taken with respect to all feedstock
price distributions within the Wasserstein ambiguity set. This data-driven ambiguity set
distribution on biomass feedstock price data. An illustrative figure on the biomass with
prior to the feedstock price uncertainty realization. The second-stage decisions are
operational decisions that are postponed in a “wait-and-see” manner after knowing the
ready for operation at the second stage. Details of these decisions are summarized as
follows:
These design and operation decisions are optimized based on the following given
parameters:
The upper and lower bounds of the capacity of each processing and upgrading
technology;
An initial capital cost corresponding to the base capacity for each technology;
Discount rate;
The fixed operating expense (OPEX) for each technology;
technology;
ambiguity set for the feedstock price uncertainty. With this ambiguity set, a two-stage
proposed for the biomass with agricultural waste-to-energy network design. Finally, a
solution strategy integrating the reformulation of worst-case expectation and the branch-
problem.
As mentioned in the problem statement, feedstock prices are subject to uncertainty and
the decision maker has access to the price dataset D_{train} = \{ \xi^{(1)}, \ldots, \xi^{(N)} \}. ξ(n) denotes the
n-th data vector of feedstock prices, i.e. \xi^{(n)} = \big[ c_{3,1}^{(n)}, \ldots, c_{3,I}^{(n)} \big]^T, and N represents the
number of data samples. For the stochastic programming-based network design, the
assumed probability distribution might deviate from the underlying true distribution due
to the finite amount of feedstock price data. In addition, relying on a single probability
measure the distance between feedstock price distributions, we define the Wasserstein
d_w(\mathbb{P}_1, \mathbb{P}_2) = \min_{\Pi} \int_{\Xi \times \Xi} \| \xi_1 - \xi_2 \| \, \Pi(d\xi_1, d\xi_2) \qquad (2.1)
where the minimum is taken over all joint distributions Π of (ξ1, ξ2) with marginals \mathbb{P}_1 and \mathbb{P}_2, \mathcal{M}(\Xi) represents the set of all probability distributions with support set Ξ, and \|\cdot\| denotes the norm of a vector. We adopt the l1 norm in this work due to its computational tractability.
From the definition, we can see that the Wasserstein metric is defined through an optimal transport plan that moves probability mass from distribution \mathbb{P}_1 to distribution \mathbb{P}_2.
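For empirical distributions on the real line with equally many atoms, the optimal transport plan simply matches order statistics, which gives a direct way to compute the type-1 Wasserstein distance (a one-dimensional sketch; the general multivariate case requires solving a transportation problem):

```python
def wasserstein_1d(xs, ys):
    # Type-1 Wasserstein distance between two equally sized empirical
    # distributions on the real line (l1 ground metric): the optimal plan
    # matches sorted samples, so the distance is the mean absolute
    # difference of the order statistics.
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```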
Based on the Wasserstein metric in (2.1), the data-driven ambiguity set for feedstock
price uncertainty is constructed as
\mathcal{D} = \big\{ \mathbb{P} \in \mathcal{M}(\Xi) : d_w(\mathbb{P}, \hat{\mathbb{P}}_N) \le \theta \big\} \qquad (2.2)
where \hat{\mathbb{P}}_N denotes the empirical distribution. The probability distribution \hat{\mathbb{P}}_N is the
uniform distribution on the N available feedstock price data, i.e. \hat{\mathbb{P}}_N = \frac{1}{N} \sum_{n=1}^{N} \delta_{\xi^{(n)}}, where
\delta_{\xi^{(n)}} represents the Dirac measure at the price data point ξ(n). Note that \hat{\mathbb{P}}_N is a discrete
distribution, and θ is the tuning parameter used for controlling the size of the data-driven ambiguity set \mathcal{D}. The support
set Ξ can be specified via the upper and lower bounds of uncertain parameters and is
shown as follows.
\Xi = \big\{ \xi : \xi_i^{\min} \le \xi_i \le \xi_i^{\max}, \; \forall i \big\} \qquad (2.3)
Wasserstein distances from the empirical distribution is no larger than θ. Therefore, the
empirical distribution ˆ N . Note that the size of the data-driven ambiguity set can be
adjusted by using the tuning parameter θ. Specifically, decreasing the value of parameter
θ reduces the size of the Wasserstein ambiguity set. The decision maker can utilize the
\theta_N(\beta) = \begin{cases} \big( \log(C_1 \beta^{-1}) / (C_2 N) \big)^{1/m}, & \text{if } \log(C_1 \beta^{-1}) / (C_2 N) \le 1 \\ \max\big( \log(C_1 \beta^{-1}) / (C_2 N),\, 1 \big), & \text{else} \end{cases} \qquad (2.4)
where β denotes the confidence level, m (m>2) represents the dimension of uncertainty
vector, and N is the number of feedstock price data. Here it is assumed that there exist
numbers. In general, Equation (2.4) is not a practicable way to obtain the Wasserstein
radius, because the constants C1 and C2 are difficult to estimate and could give loose
bounds. For this reason, cross-validation can be used as an empirical way to tune the
The steps of k-fold cross validation to tune the Wasserstein radius are given as follows
[246]. First, \xi^{(1)}, \ldots, \xi^{(N)} are partitioned into k subsets. For each holdout run, only one
subset is used as a training dataset, while the remaining subsets are merged as a
validation dataset. Second, the Wasserstein radius is tuned such that the corresponding
average cost for the validation dataset is minimized. Lastly, the optimal Wasserstein
radius from cross-validation is set to be the average of the optimal radii determined in
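The holdout procedure above can be sketched as follows; `avg_validation_cost` is a hypothetical placeholder for training the DRO model with a given radius and returning its average cost on the validation set:

```python
import random

def tune_radius(data, radii, k, avg_validation_cost):
    # k-fold holdout tuning of the Wasserstein radius theta: each run uses
    # one subset for training and the merged remaining subsets for
    # validation; the final radius is the average of the per-run minimizers.
    rng = random.Random(1)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    best_radii = []
    for f in range(k):
        train = [data[i] for i in folds[f]]
        valid = [data[i] for f2 in range(k) if f2 != f for i in folds[f2]]
        best_radii.append(min(radii,
                              key=lambda t: avg_validation_cost(t, train, valid)))
    return sum(best_radii) / k
```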
The data-driven ambiguity set for the feedstock price is cast as a set of possible
probability distributions that are “close” to the empirical distribution in the sense of the
Wasserstein metric. There are several merits of the Wasserstein ambiguity set. First, this
ambiguity set directly leverages the uncertainty data information via the empirical
distribution, while at the same time effectively hedges against the distributional
uncertainty based upon the Wasserstein metric. This feature is useful in the network
design problem where the distribution of uncertain feedstock prices is only observable
through a finite amount of price data. Second, there exists a statistical guarantee that the
Wasserstein ambiguity set contains the unknown true distribution with a certain
confidence level [245]. Specifically, with the Wasserstein radius in (2.4), it can be
guaranteed that
\mathbb{P}\big( d_w(\mathbb{P}_{\mathrm{true}}, \hat{\mathbb{P}}_N) \le \theta \big) \ge \begin{cases} 1 - C_1 e^{-C_2 N \theta^m}, & \text{if } \theta \le 1 \\ 1 - C_1 e^{-C_2 N \theta}, & \text{else} \end{cases}
This favorable feature
equips the resulting DRO solution with better out-of-sample performance in terms of
lower average cost and lower variance. Such out-of-sample performance is of practical
relevance, since price data different from the training dataset Dtrain are used to test the
data-driven network design decision. Third, the decision maker can readily adjust the
level of conservatism by tuning the radius θ of the Wasserstein ball. Lastly, the DRO
problem with the Wasserstein ambiguity set admits a tractable reformulation, which
grants the resulting biomass with agricultural waste-to-energy network design problem
model
biomass with agricultural waste-to-energy network design model using the data-driven
ambiguity set presented in the previous section. In a biomass with agricultural waste-
to-energy network, biomass feedstocks, such as microalgae [247], and dairy manure
[248], are converted into a variety of biofuels and bioproducts via different processing
and upgrading technologies [219]. One needs to make decisions on the selection of
technology pathway, capacity and operating level of each technology, purchase amounts
of feedstocks and quantities of products to sell. The objective is to minimize the worst-
case expected total annualized cost with regard to the Wasserstein ambiguity set. Since
the proposed model aggregates yearly operations, the issue of biomass feedstock
seasonality is not considered in this work. In this research work, we focus on the
selection of technologies for the biomass network design, and do not consider the issue
The data-driven Wasserstein DRO model for the network design under uncertainty can
selection and capacity decisions to be made before uncertainty realizations, while also
allowing for production, purchasing, and sale decisions to be made after uncertainty has
been realized. Specifically, the first-stage decision variables are decisions on the
levels, quantity of biomass to use, and amounts of products to sell. The objective
(2.5). The constraints include technology capacity constraint (2.6), production level
constraint (2.7), mass balance constraint (2.8), biomass feedstock availability constraint
Nomenclature section, where all parameters are denoted in lower-case symbols, and all
variables are denoted in upper-case symbols. The two-stage Wasserstein DRO (WDRO)
\text{(WDRO)} \quad \min_{Y, Q} \; \sum_{j \in J} c_{1,j} Q_j^{sf_j} + \max_{\mathbb{P} \in \mathcal{D}} \mathbb{E}_{\mathbb{P}} \Big[ \min_{W, P, S} \Big( \sum_{j \in J} c_{2,j} W_j + \sum_{i \in I} c_{3,i} P_i - \sum_{i \in I} c_{4,i} S_i \Big) \Big] \qquad (2.5)
\text{s.t.} \quad W_j \le Q_j, \; \forall j \in J \qquad (2.7)
P_i + \sum_{j \in J} \alpha_{ij} W_j - S_i = 0, \; \forall i \in I \qquad (2.8)
P_i \le b_i, \; \forall i \in I \qquad (2.9)
S_i \le d_i, \; \forall i \in I \qquad (2.10)
Q_j, P_i, S_i, W_j \ge 0, \; \forall i \in I, \; j \in J \qquad (2.11)
Y_j \in \{0, 1\}, \; \forall j \in J \qquad (2.12)
\mathcal{D} = \big\{ \mathbb{P} \in \mathcal{M}(\Xi) : d_w(\mathbb{P}, \hat{\mathbb{P}}_N) \le \theta \big\}, \quad \hat{\mathbb{P}}_N = \frac{1}{N} \sum_{n=1}^{N} \delta_{c_3^{(n)}} \qquad (2.13)
where c1,j, c2,j, c3,i, and c4,i respectively represent economic evaluation parameters for
the capital cost associated with technology j, the operating cost associated with
technology j, the purchase cost of biomass feedstock i, and the selling price of
bioproduct i. At the first stage (a.k.a. the design stage), “here-and-now” decisions
“here-and-now”, since they should be made prior to any uncertain feedstock price
realizations. At the second stage or the operational stage, the decision variables,
including the operating level of each technology Wj, the amount of feedstock purchased
Pi, and the amount of product sold Si, can be postponed in a “wait-and-see” manner after
The objective function can be roughly divided into two terms. The first term represents
the first-stage cost, namely the total capital cost. The nonlinearity arises within the
relation between the facility’s capital cost and its capacity [249]. Specifically, the
nonlinear functions, namely the power function Q_j^{sf_j}, are employed to evaluate technology
capital costs in the (WDRO) model. Following the literature [250], sfj is typically set to
be 0.6. The second term is the worst-case expectation of the second-stage costs, and as
a result, the proposed optimization model is capable of hedging against the worst-case
feedstock price distribution within the Wasserstein ball. Based on Constraint (2.7), it
becomes clear that the decision variable for operating level Wj can be adjusted in the
range from zero to the total capacity of technology j. Therefore, the (WDRO) model
appropriately accommodates the fact that real facilities do not always operate at the
maximum capacity.
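To build intuition for the worst-case expectation in the objective, consider a one-dimensional sketch with a purely linear cost c·ξ over a Wasserstein ball with bounded support (illustrative names and a scalar setting; this is not the full two-stage model):

```python
def worst_case_mean(data, c, lo, hi, theta):
    # Worst-case expectation of the linear cost c * xi over the 1-D
    # Wasserstein ball of radius theta around the empirical distribution,
    # with support [lo, hi]. For a linear cost, the adversary transports
    # probability mass toward the unfavorable boundary, gaining |c| per
    # unit of transport budget until all mass sits on that boundary.
    mean = sum(data) / len(data)
    room = (hi - mean) if c > 0 else (mean - lo)  # average movable distance
    return c * mean + abs(c) * min(theta, room)
```

With theta = 0 the ball collapses to the empirical distribution and the value reduces to the sample-average cost, mirroring the reduction of the distributionally robust model to stochastic programming.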
It is worth noting that the proposed (WDRO) model reduces to the stochastic
programming when the value of parameter θ is set to be 0, since the induced ambiguity
set changes to the singleton set ˆ N . In summary, we develop a data-driven two-stage
feedstock price can only be inferred from a finite training dataset. To effectively hedge
against the distributional uncertainty, the (WDRO) model employs the objective
ambiguity set. The proposed biomass with agricultural waste-to-energy network design
model has the following merits. First, it directly incorporates uncertain feedstock price
data into the optimization model. Second, the (WDRO) model effectively accounts for
the ambiguity of the feedstock price distribution, thereby enjoying a better out-of-
sample performance in terms of lower average cost and lower variance compared with
However, the multi-level optimization structure, coupled with nonconvex terms in the
method that works in solving the resulting Wasserstein distributionally robust MINLP
In this section, we develop a tailored solution method to globally optimize the (WDRO)
involved in the ambiguity set. The concave function Q_j^{sf_j} renders the optimization
existing solution methods for two-stage DRO problems cannot handle a mixed-integer
55
algorithm is then adopted to solve the resulting single-level optimization problem by
reformulate the (WDRO) problem into (WDRC) comes from the literature [124].
(WDRC)  min  Σ_{j∈J} c_{1,j} Q_j^{sf_j} + λθ + (1/N) Σ_{n=1}^{N} s_n   (2.14)

s.t.  W_{nj} ≤ Q_j, ∀j ∈ J, n ∈ N_d   (2.16)

P_{ni} ≤ b_i, ∀i ∈ I, n ∈ N_d   (2.18)

S_{ni} ≥ d_i, ∀i ∈ I, n ∈ N_d   (2.19)

Y_j ∈ {0, 1}, ∀j ∈ J   (2.21)

Σ_{j∈J} c_{2,j} W_{nj} + Σ_{i∈I} c_{3,i}^n P_{ni} − Σ_{i∈I} c_{4,i} S_{ni}
+ Σ_{i∈I} γ_{ni}^1 (c_{3,i}^max − c_{3,i}^n) + Σ_{i∈I} γ_{ni}^2 (c_{3,i}^n − c_{3,i}^min) ≤ s_n, ∀n ∈ N_d   (2.22)
56
where γ_n = [γ_n^1; γ_n^2], and γ_{ni}^1 and γ_{ni}^2 are the i-th entries of vectors γ_n^1 and γ_n^2, respectively.
c_{3,i}^max and c_{3,i}^min represent the upper and lower bounds for the price of feedstock i,
respectively.
After the reformulation, the resulting (WDRC) model is equivalent to the (WDRO) model. The (WDRC) for the biomass with agricultural waste-to-energy network design is a single-level optimization problem. A salient
feature of (WDRC) is that its model size, namely the number of decision variables and
the number of constraints, scales linearly with the number of price data N. Moreover,
the uncertain price data are directly incorporated into the proposed (WDRC) model as
witnessed in (2.22).
The resulting (WDRC) problem turns out to be a nonconvex MINLP with separable
concave terms in its objective function (2.14). These concave terms appear due to the
calculation of capital cost based on “six-tenths rule” scaling with technology capacity
[250]. Although this single-level MINLP can be solved directly using some off-the-shelf
global solvers, we develop a branch-and-refine algorithm based on successive piecewise
linear approximation to solve the (WDRC) problem to its global optimality [236]. The
key idea is to approximate the concave capital cost using a series of piecewise linear
underestimates that are formulated via special ordered sets of type 1 (SOS1) variables.
The piecewise linear under-estimator for the capital cost of technology j, denoted by Ej,
is formulated in (2.25)-(2.27).
E_j = Σ_{p=1}^{NP} fe_{jp} PW_{jp}, ∀j ∈ J   (2.25)

Q_j = Σ_{p=1}^{NP} fx_{jp} PW_{jp}, ∀j ∈ J   (2.26)

fe_{jp} = fx_{jp}^{0.6}, ∀j ∈ J, p ∈ P   (2.27)
where fxjp is the predefined partition point value, fejp denotes the corresponding power
function value, p is the index of partition point, NP is the total number of partition points,
Constraints on weighting factor PWjp and position indicator PEjp are defined in (2.28)-
(2.33).
Σ_{p=1}^{NP} PW_{jp} = 1, ∀j ∈ J   (2.28)

Σ_{p=1}^{NP} PE_{jp} = 1, ∀j ∈ J   (2.29)

PW_{j1} ≤ PE_{j1}, ∀j ∈ J   (2.30)

PW_{jp} ≤ PE_{jp} + PE_{j,p−1}, ∀j ∈ J, 2 ≤ p ≤ NP   (2.31)

PW_{jp} ≥ 0, PE_{jp} ∈ SOS1, ∀j ∈ J   (2.33)
where PE_{jp} is defined as an SOS1 variable such that only one interval is selected.
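Because the capital-cost function is concave, each secant segment defined by (2.25)-(2.27) lies below the true curve, which is what makes E_j a valid under-estimator. A minimal numeric check with hypothetical partition points:

```python
# Secant interpolation between partition points under-estimates the concave q**0.6.
fx = [0.0, 25.0, 50.0, 100.0]       # partition point values fx_jp (assumed)
fe = [x ** 0.6 for x in fx]         # fe_jp = fx_jp ** 0.6, as in (2.27)

def pwl_under(q):
    # locate the segment containing q (the role of the SOS1 position variables)
    for (x0, x1), (y0, y1) in zip(zip(fx, fx[1:]), zip(fe, fe[1:])):
        if x0 <= q <= x1:
            t = (q - x0) / (x1 - x0)
            return (1 - t) * y0 + t * y1

# the under-estimator never exceeds the true concave cost on [0, 100]
assert all(pwl_under(q) <= q ** 0.6 + 1e-12 for q in range(0, 101))
print("piecewise linear interpolation under-estimates q**0.6")
```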
Algorithm. The proposed solution algorithm
1: Set LB ← −∞, UB ← +∞, iter ← 0, and the tolerance ζ;
2: Reformulate problem (WDRO) to problem (WDRC);
3: While UB − LB > ζ
4:   iter ← iter + 1;
5:   Solve problem (WDRC) with the piecewise linear objective function, and obtain Q*_iter and objective value OBJ*;
6:   Update LB ← max{LB, OBJ*};
7:   Evaluate the original nonlinear objective value OBJ∆ using Q*_iter;
8:   Update UB ← min{UB, OBJ∆};
9:   Add a new partition point at the candidate solution Q*_iter;
10: End
11: Return the optimal solution
Since the capital cost is underestimated when substituting Q_j^{sf_j} with E_j, the resulting
MILP problem provides a valid lower bound. Note that the optimal solution of the MILP
relaxation is also a feasible solution to the original
(WDRC) problem. Accordingly, a valid upper bound for the total annualized cost can
be obtained by calculating the original nonconvex objective value with the candidate
solution. The gap is then computed as the difference between the upper and lower
bounds and is utilized for determining whether a new partition point is needed to further
refine the piecewise linear approximation. Partition points are added at the candidate
solutions iteratively until the gap between the upper and lower bounds falls below a
predefined tolerance. The pseudocode above presents the proposed solution
method for the global optimization of the (WDRO) problem in detailed steps. Note that
the tolerance for the optimality gap is denoted by ζ. The number of partition points
increases by one only for those selected technologies in each iteration of the solution
algorithm.
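The iterative logic can be sketched on a one-dimensional toy instance (all numbers below are hypothetical, and a grid search stands in for the MILP lower-bounding solve, purely for illustration):

```python
# Toy branch-and-refine: minimize 3*q**0.6 - 5*min(q, 50) over q in [0, 100].
# capital() is the concave cost; second_stage() mimics a recourse profit term.
def capital(q):
    return 3.0 * q ** 0.6

def second_stage(q):
    return -5.0 * min(q, 50.0)

def pwl_under(q, pts):
    # secant under-estimator of the concave capital cost on q's segment
    for lo, hi in zip(pts, pts[1:]):
        if lo <= q <= hi:
            t = 0.0 if hi == lo else (q - lo) / (hi - lo)
            return (1 - t) * capital(lo) + t * capital(hi)

pts = [0.0, 100.0]                      # initial partition points
grid = [i / 10 for i in range(1001)]    # grid search stands in for the MILP solve
LB, UB, it = float("-inf"), float("inf"), 0
while UB - LB > 1e-6:
    it += 1
    q_star = min(grid, key=lambda q: pwl_under(q, pts) + second_stage(q))
    LB = max(LB, pwl_under(q_star, pts) + second_stage(q_star))  # valid lower bound
    UB = min(UB, capital(q_star) + second_stage(q_star))         # original objective
    if q_star not in pts:
        pts = sorted(pts + [q_star])    # refine at the candidate solution
print(it, q_star)  # converges after 2 iterations at q = 50.0
```

Adding the candidate solution as a partition point makes the under-estimator exact there, so the gap closes finitely, mirroring the algorithm above.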
To demonstrate the effectiveness of the proposed network design approach and the solution algorithm, we consider a specific biomass
with agricultural waste-to-energy network design in this section. Optimal solutions are
found and validated through the optimization process of the network design problem.
In the considered biomass with agricultural waste-to-energy network, there are 216
candidate processing technologies. In this case study, we consider the energy market, which involves the demands for biodiesel,
gasoline, ethanol, methane, and biogas. The problem parameters for technologies, such
as mass balance coefficients, generally are not influenced by the geographical region,
whereas the price parameters hold for the geographical region of USA. Note that the
datasets of problem parameters used in this work can be found in the recently published
papers [240, 252]. Thus, the data reflect recent market conditions. Because of
market fluctuations, feedstock prices are subject to uncertainty. Since the feedstock
price data are well-documented [253], we use real price data for the case study.
We also implement the deterministic optimization method and the conventional two-
stage stochastic programming method using the same price data as scenarios, in addition
to the proposed data-driven Wasserstein DRO approach for the purpose of comparison.
All optimization problems are modelled in GAMS 25.0.3 [254]. The computational
experiments are performed on a computer with an Intel (R) Core (TM) i7-6700 CPU @
3.40 GHz and 32 GB RAM. In each iteration of the developed solution algorithm, an
MILP problem is solved with the solver CPLEX 12.8.0. The optimality gap for CPLEX
12.8.0 is set to be 0, and the optimality tolerance for the reformulation-based branch-and-refine
algorithm is 10^−6. In the case studies, the radius of the Wasserstein ball θ is
obtained through cross validation [124]. The lower and upper bounds of uncertain
parameters in support set Ξ are estimated using the empirical bounds, which are directly
obtained from the training data. Note that the average price is used as a nominal value
for the deterministic optimization method. For the stochastic programming and DRO
approaches, price uncertainty data, which represent possible price realizations, are used
in their corresponding optimization problems. At the end of the case study, a sensitivity
analysis is performed to investigate how the value of θ influences the Wasserstein DRO
solution.
The problem sizes and computational results of different methods are summarized in
Table 1. A total of 12 training samples is used for these optimization methods. From the
table, it can be observed that the number of continuous decision variables and the
number of constraints in the reformulated problem (WDRC) are both larger than those
in the two-stage stochastic program. This is because auxiliary variables and constraints
are introduced to reformulate the worst-case expectation problem. Although the mixed-
integer optimization problem for biomass network design is NP-hard, it can be solved
within a reasonable amount of time empirically. From Table 1, we can see that the data-driven
Wasserstein DRO problem takes about 50% longer computational time to solve
than the two-stage stochastic program.
Although the stochastic program and the DRO problem have a similar number of
decision variables and constraints, their model structures are quite different, leading to
different optimal solutions. The objective value determined by the stochastic programming
method is $16.37 MM, whereas the objective value determined by the proposed data-driven
Wasserstein DRO approach is $21.67 MM. The reason is that the conventional
stochastic programming method minimizes the expected total cost based on a single
empirical distribution, while the data-driven Wasserstein DRO approach aims for the
lowest worst-case expected cost with respect to a family of candidate feedstock price
distributions. The expected value of perfect information (EVPI) is a widely used concept
in decision making under uncertainty, and is used to measure the largest amount a
decision maker would be willing to pay in return for perfect information [45]. Given its
definition, EVPI is suitable for the conventional stochastic programming method instead
of DRO, because the true probability distribution required to calculate the expectation
is not known. In this case, the EVPI is $0.42MM for the stochastic programming
method. Note that the stochastic programming method can be
considered as a special case of the proposed data-driven DRO approach, when the radius
of the Wasserstein ball shrinks to zero. To evaluate the out-of-sample performance of each method, we run a
simulation with the testing feedstock price data. The testing dataset consists of different
feedstock price realizations which are obtained from the same source [253]. The number
of testing samples is 60 in this case study. For each method, we calculate their average
cost, worst-case cost, best-case cost, and standard deviation of cost under different
testing price realizations.
It can be seen from the table that the average cost and the worst-case cost of the proposed
data-driven DRO approach using the Wasserstein metric are 5.7% and 17.4% lower than
those of the stochastic programming method, respectively. The proposed approach also yields a
network design that is less sensitive to feedstock price variations. Specifically, the costs
of the DRO solution exhibit a smaller standard deviation
than its stochastic programming counterpart. The simulation results clearly demonstrate
that the data-driven Wasserstein DRO approach compares favorably against the
stochastic programming approach. To further compare
the stochastic programming and DRO, we present the empirical probability distributions
of total cost determined by each approach.
Figure 5. The empirical probability distributions of total cost for (a) the stochastic
programming method and (b) the proposed data-driven WDRO approach.
The optimal network designs determined by the stochastic programming method
and the proposed data-driven WDRO approach are presented in Figure 6 and Figure 7,
respectively. Note that the deterministic optimization method, which uses the average
value of feedstock prices, generates the same optimal network design as the stochastic
programming method. The identical designs of the deterministic
and stochastic programming methods are ascribed to the specific parameter setup and
ranges of uncertainties in this specific case study. For all these methods, the optimal biomass
network includes a process producing glycerol, which can be used to synthesize PHB. The pyrolysis of
switchgrass is selected in the network because of its ability to produce raw bio-oil. This
type of bio-oil can be transformed into a number of products [255], boosting the
flexibility of downstream production of fuels such as gasoline and
ethanol. Note that the anaerobic digester (AD) converts dairy manure into biogas, which
can be used as a fuel source or further utilized as a material in other chemical reactions
[256]. As shown in Figure 6 and Figure 7, the best way to produce biogas from dairy
manure is through horizontal plug flow AD. Meanwhile, municipal solid waste is used
in a processing pathway that is
selected only in the optimal network determined by the stochastic programming method.
Figure 6. The optimal bioconversion network design determined by the stochastic programming method.
Figure 7. The optimal bioconversion network design determined by the data-driven Wasserstein DRO approach.
The details on cost breakdowns, including capital cost, operating cost, and feedstock
cost are shown in Figure 8. From the donut charts, we can see that more than half of the
total annualized cost comes from purchasing feedstocks for both the stochastic
programming method and the Wasserstein DRO approach. Additionally, the percentage
of the feedstock cost is higher for the Wasserstein DRO approach,
because a larger quantity of soybeans is purchased in the optimal network design. For
both optimization methods, the capital cost contributes to the second largest portion,
meaning that the selection and capacities of technologies play a critical role in lowering
the total annualized cost.
Figure 8. Cost breakdowns determined by (a) the stochastic programming method, (b) the data-driven Wasserstein DRO approach.
To take a closer look at the capital costs of different approaches, we present the capital
cost breakdowns of the stochastic programming and Wasserstein DRO
methods in Figure 9 (a) and (b), respectively. From Figure 9 (a), we can see that the
largest share of the capital cost
comes from landfill methane extraction and the glycerol to isobutanol process. It indicates
that the processing pathways used to produce methane and isobutanol are expensive to
build. Switchgrass pyrolysis accounts for 6.8% of the capital cost, showing that this
technology requires a notable investment for producing fuels such as
gasoline. As for the capital cost distribution determined by the data-driven Wasserstein
DRO approach, we can see from Figure 9 (b) that the landfill methane extraction
contributes to 61.7% of the capital cost and accounts for the largest portion, which is
similar to the result of the stochastic programming method. The second largest cost
(9.4%) comes from switchgrass pyrolysis, thus again showing the significance of this
technology. The processes related to cassava, including cassava peeling and crushing, cassava fermentation, and cassava
distillation, together account for merely 2.2% of the capital cost, implying that the
pathway producing ethanol from cassava is economically favorable. Note that the ratios
between capital investment costs for the two optimization methods can be different. The
reason for different ratios is that the optimal capacities of some technologies obtained
by the stochastic programming method and the DRO approach are not the same.
Following the existing literature [252], the discount rate is set to be 10%. To investigate
the impact of the discount rate on the computational results of the Wasserstein DRO
approach, we conduct a sensitivity analysis and present the result in Figure 10. From
the figure, we can see that the objective value of the DRO approach increases by 18.0%
when the discount rate changes from 5% to 10%. Additionally, the objective value
grows by 17.3% if we further increase the discount rate from 10% to 15%. Note that the
optimal investment decisions, i.e., technology selection and capacities, do not change with the discount rate.
Figure 10. Sensitivity analysis of discount rate for the data-driven Wasserstein DRO
approach.
To investigate how the in-sample objective value, out-of-sample average cost, and
computational time of (WDRO) change with the radius of Wasserstein ball, we perform
a sensitivity analysis and present results under different values of parameter θ in Figure
11. The value of θ specifies the size of the Wasserstein ambiguity set, so the decision
maker can use it to adjust the level of conservatism. The ambiguity set encapsulates
more candidate distributions as θ grows, so more
distributions are hedged against in the (WDRO) model. Since the (WDRO) model
optimizes the worst-case expected cost with respect to the ambiguity set, increasing the
radius θ increases the in-sample objective value, as shown in Figure
11. Additionally, we can observe that the out-of-sample average cost corresponding to
the testing samples decreases from $22.53MM to $21.24MM, when the radius of
Wasserstein ball changes from 0.01 to 0.03. When the radius further increases from 0.03
to 0.15, the out-of-sample performance in terms of average cost remains the same. The
optimal Wasserstein radius obtained from cross-validation is 0.25, which results in the
same out-of-sample average cost ($21.24MM) as the radii that perform best on the
testing data in Figure 11. From the orange line in Figure 11, we can see that increasing
the radius of Wasserstein ball does not add computational burden, and that the
computational time for solving the corresponding (WDRO) problem varies from 7.1 s
to 15.8 s.
Figure 11. Sensitivity analysis of the in-sample objective value, out-of-sample average cost, and computational time with respect to the Wasserstein radius θ.
To demonstrate the efficiency of the proposed solution algorithm, we display the upper
and lower bounds in each iteration of the algorithm for the instance with Wasserstein
radius of 0.1 in Figure 12. In this figure, the green dots represent the upper bounds, and
the yellow circles stand for the lower bounds. The X-axis represents the iteration
number, and the Y-axis denotes the objective function values. From the figure, it can be
seen that the relative optimality gap decreases significantly from 57.9% to 9.8% during
the first two iterations. The reformulation-based branch-and-refine algorithm takes only
three iterations to reduce the relative optimality gap to 0.0%. The result demonstrates
that this solution algorithm is effective in solving the (WDRO) network design problem.
Figure 12. Upper and lower bounds in each iteration of the reformulation-based branch-and-refine algorithm in the case study.
To further explore the impacts of the number of training uncertainty data on the
computational results, we resolve the problem with larger training datasets for the different
optimization methods. In this case study, the number of training samples N increases
from 12 to 100. Specifically, we consider a case study in which 100 feedstock price
uncertainty realizations are used in the optimization problems, and another 100
uncertainty data are utilized for testing their out-of-sample performances. As the number
of training data increases, the computational times of both stochastic programming and
data-driven Wasserstein DRO methods grow to 14.9 seconds and 30.5 seconds,
respectively. The size of training samples does not influence the problem size of
deterministic optimization, which utilizes the average of training data as the nominal
values of parameters. By contrast, the size of training data has an impact on the problem
sizes of stochastic programming and data-driven Wasserstein DRO. This is because the
number of constraints and continuous variables for both methods increases, as the
amount of training samples grows. The results of the out-of-sample performance for
different methods are plotted in Figure 13, where the green diamonds denote the
Wasserstein DRO solution, and the orange circles represent the stochastic programming
solution. For clear visualization, their average costs over all testing scenarios are
represented as the horizontal lines in the figure. Additionally, the statistics of out-of-
sample performances under the larger amount of uncertainty data are summarized in
Table 3. As can be seen from the table, the data-driven Wasserstein DRO approach still
outperforms its counterparts, reducing the average
cost by 3.8% for the testing dataset. We investigate the dependence of this average cost
reduction, together with the standard deviation reduction, on the number of
testing samples, and present the results in Figure 14. It can be observed from the figure
that both the average cost reduction and the standard deviation reduction change only slightly
with the number of testing samples.
Figure 13. Out-of-sample performance of stochastic programming and the proposed data-driven Wasserstein DRO approach.
Figure 14. The dependences of the average cost reduction and standard deviation reduction on the number of testing samples.
2.6 Summary
In this chapter, we proposed a data-driven two-stage distributionally robust optimization model for the biomass with agricultural waste-to-energy network design
subject to feedstock price uncertainty. Based on the Wasserstein metric and support set,
the data-driven ambiguity set was constructed that encompassed all candidate
distributions of feedstock price. This ambiguity set was formulated as the Wasserstein
ball with a variable radius, which granted more flexibility in adjusting the level of
conservatism. The proposed model not only determined the
operational decisions with full adaptability, but also hedged against the distributional
ambiguity. Leveraging strong duality, we
derived an equivalent distributionally robust counterpart for the network design problem.
A case study on a specific biomass with agricultural waste-to-energy network
design was presented. Computational results showed that increasing the size of
ambiguity set did not result in more computational time, and that the proposed method
remained efficient under a large amount of training data. As for the out-of-sample performance,
the proposed approach compared favorably against both deterministic optimization and
conventional stochastic programming. Through cross-validation, the optimal Wasserstein radius was tuned to be 0.25, which generated the
best out-of-sample average cost of $21.24MM on the testing samples. The advantage of
the distributionally robust optimization approach lies in its robustness to hedge against
distributional ambiguity. If the underlying true distribution is invariant and the number
of training samples is large, the distributional ambiguity becomes
insignificant. Therefore, the advantage of using an ambiguity set, whose size is nonzero, is
not evident. However, when the number of training samples is limited or the probability
distribution shifts over time, the
merit of the distributionally robust optimization approach becomes more manifest. The
dependence results of average cost reduction and standard deviation reduction on the
number of testing samples showed that these reduction values remained relatively stable.
2.7 Appendix: Derivation of Wasserstein distributionally robust
counterpart
For ease of exposition, we present the (WDRO) model for biomass with agricultural
waste-to-energy network design in the following abstract form, in which the vectors and
matrices collect the corresponding model coefficients:

min_x f(x) + max_{P∈D} E_P[l(x, ξ)]
s.t. Ax ≥ g   (2.34)

with the second-stage problem

l(x, ξ) = min_y { c_y^T y + (Gξ)^T y : Wy ≥ h(x) }

where x denotes the vector of all first-stage decisions including Y_j and Q_j; y is the vector
of second-stage decisions W_j, P_i, and S_i; f(x) represents the first-stage cost; ξ is the vector
of uncertain parameters c_{3,i}; and l(x, ξ) represents the second-stage cost. Note that the
second-stage cost is divided into two parts, namely the deterministic cost c_y^T y
unaffected by uncertainty and the random cost (Gξ)^T y depending on the specific
uncertainty realization.
Based on the Wasserstein ambiguity set in (2.2), we can re-express the worst-case expectation max_{P∈D} E_P[l(x, ξ)]
as the following generalized moment problem over the conditional distributions Π_n [245]:

max_{Π_n} (1/N) Σ_{n=1}^{N} ∫_Ξ l(x, ξ) Π_n(dξ)   (2.35)

s.t. (1/N) Σ_{n=1}^{N} ∫_Ξ ‖ξ − ξ^n‖ Π_n(dξ) ≤ θ   (2.36)

(1/N) ∫_Ξ Π_n(dξ) = 1/N, ∀n ∈ N_d   (2.37)
According to the strong duality of the generalized moment problem [257], we can obtain
the following dual problem:

min_{λ, s_n} λθ + (1/N) Σ_{n=1}^{N} s_n   (2.38)

s.t. l(x, ξ) − λ‖ξ − ξ^n‖ ≤ s_n, ∀ξ ∈ Ξ, n ∈ N_d   (2.39)

λ ≥ 0   (2.40)

where λ and s_n are the dual variables corresponding to constraints (2.36) and (2.37),
respectively.
Since constraint (2.39) must hold for any uncertainty realization within the support set Ξ,
it can be rewritten as

max_{ξ∈Ξ} { l(x, ξ) − λ‖ξ − ξ^n‖ } ≤ s_n, ∀n ∈ N_d   (2.41)

Using the dual-norm identity λ‖ξ − ξ^n‖ = max_{‖z_n‖_* ≤ λ} z_n^T(ξ − ξ^n), the left-hand side satisfies

max_{ξ∈Ξ} { l(x, ξ) − λ‖ξ − ξ^n‖ }
= max_{ξ∈Ξ} min_{‖z_n‖_* ≤ λ} { l(x, ξ) − z_n^T(ξ − ξ^n) }
= min_{‖z_n‖_* ≤ λ} max_{ξ∈Ξ} { l(x, ξ) − z_n^T(ξ − ξ^n) }   (2.42)

where ‖·‖_* denotes the dual norm and z_n are the introduced decision variables. Since the l1
norm is adopted in this work, the corresponding dual norm is the l∞ norm. Constraint (2.41) can therefore be replaced by

max_{ξ∈Ξ} { l(x, ξ) − z_n^T(ξ − ξ^n) } ≤ s_n, ∀n ∈ N_d   (2.43)

‖z_n‖_* ≤ λ, ∀n ∈ N_d   (2.44)
For ease of derivation, we express the support set (2.3) in the following compact
matrix form:

Ξ = { ξ : Cξ ≤ d }   (2.45)

where C = [I; −I] and d = [ξ^max; −ξ^min].
According to the definition of l(x, ξ) in the abstract form above, we further reformulate the left-hand side
of constraint (2.43) as follows:

max_{ξ: Cξ≤d} min_{y^n: Wy^n ≥ h(x)} { c_y^T y^n + (Gξ)^T y^n − z_n^T(ξ − ξ^n) }
= min_{y^n: Wy^n ≥ h(x)} max_{ξ: Cξ≤d} { c_y^T y^n + (Gξ)^T y^n − z_n^T(ξ − ξ^n) }   (2.46)

Note that the equality in (2.46) is due to the minimax theorem [124], and the inner
maximization over ξ is a linear program whose dual introduces the nonnegative variables γ_n associated with Cξ ≤ d.
By substituting z_n = G^T y^n − C^T γ_n into constraints (2.43) and (2.44), and replacing the
left-hand side of constraint (2.43) with (2.46), we equivalently reformulate the (WDRO)
problem as

min_{x, λ, s_n, y^n, γ_n} f(x) + λθ + (1/N) Σ_{n=1}^{N} s_n   (2.47)

s.t. Ax ≥ g, x ∈ X

(Gξ^n)^T y^n + c_y^T y^n + (d − Cξ^n)^T γ_n ≤ s_n, ∀n ∈ N_d

Wy^n ≥ h(x), ∀n ∈ N_d

‖G^T y^n − C^T γ_n‖_* ≤ λ, ∀n ∈ N_d

γ_n ≥ 0, ∀n ∈ N_d
2.8 Nomenclature
Sets
Parameters
bi availability of compound i
c4,i price of bioproduct i
Binary variable
Yj selection of technology j
SOS1 variable
Continuous variables
CHAPTER 3
DEEP LEARNING BASED AMBIGUOUS JOINT CHANCE CONSTRAINED
ECONOMIC DISPATCH UNDER SPATIAL-TEMPORAL CORRELATED WIND
POWER UNCERTAINTY
3.1 Introduction
Economic dispatch (ED) plays a central role in power systems operations [258]. It seeks to determine the optimal power output of
available generators for serving electricity demand with the minimum operating cost
[259]. With a pressing need to reduce carbon emissions from fossil fuels, the penetration
of renewable energy sources, especially wind power, into power grids has increased
rapidly in recent years. However, such a high penetration poses a threat to the security
and reliability of large-scale power systems due to the intermittency of renewable power
generation. Consequently, considerable research efforts have been devoted to developing
uncertainty models for an effective utilization of wind power. Methods along this
direction can be broadly categorized into three paradigms, namely, robust optimization,
stochastic optimization, and distributionally robust optimization. In robust optimization, a multi-period
ED model was presented based on a dynamic uncertainty set, which modeled the
temporal and spatial correlations of wind output using linear systems [262]. In addition,
the storage device dynamics and renewable energy variability were considered in an
affinely adjustable robust multi-period power dispatch. A two-level robust method was
developed to address the multi-microgrid ED problem subject to wind power and tie-line
power uncertainties. An output schedule of wind farms and a set-point schedule of conventional generators were
jointly determined in a robust optimization
formulation [267]. The aforementioned robust ED methods typically suffer from the
issue of conservatism, because they aim to immunize against the worst-case realization
An alternative approach is stochastic ED [268, 269], which mitigates the conservatism
by optimizing the expected cost and enforcing chance
constraints [270]. In [271, 272], versatile distribution models were suggested and
applied to stochastic ED problems. To capture the
spatial correlations among multiple wind farms, the Gaussian mixture model was
introduced to chance constrained ED problems [273, 274]. In [275], beta kernel density
estimation was adopted to solve the ED
problem with individual chance constraints. To account for joint chance constraints in
power dispatch, an iterative bounding technique was developed using support vector
machines, such that the risk level of joint
constraint violation was guaranteed with a sufficiently large number of uncertainty data
drawn from the underlying true distribution. The above research works on stochastic
ED presume exact knowledge of the wind power distribution.
However, such perfect information on the probability distribution of uncertain wind
power is far-fetched in practice. Instead, power system operators typically only have
access to a finite amount of historical wind power data. Distributionally robust
optimization leverages such data to hedge against distributional ambiguity
[221], and it emerges as a promising paradigm for decision making in electric power
systems. Distributionally robust dispatch
is one of those attempts, aiming to take advantage of both robust optimization and
stochastic programming; it
constructs a family of probability distributions, called the ambiguity set, from a finite
number of data. For distributionally robust power dispatch problems, ambiguity sets
have been constructed based on moment information as
well as additional unimodality information [288]. By using both mean and covariance
to describe the ambiguity set of wind power, a distributionally robust ED model was
developed. To improve the operational
flexibility of power systems, the co-optimization of ED and the do-not-exceed limit was
cast as a distributionally robust joint chance constrained program, which focused on the
uncertainty of renewable generation. Another study considered a
distributionally robust chance constraint on the transmission line capacity limit, which was
reformulated into a tractable form. Furthermore, a
chance constrained power dispatch problem was investigated based on an ambiguity set
consisting of Gaussian distributions whose mean and variance were within certain
ranges [292]. The co-dispatch of energy, reserve and storage was suggested, and the
statistics of the uncertain variable were used for constructing ambiguity sets [293]. In addition, another widely
adopted means of constructing an ambiguity set is via the notion of statistical distance.
However, existing ambiguity sets can hardly capture the spatial-temporal correlations of wind power, which are important
for the reliability of power systems, or incorporate the complicated, and likely
nonlinear, correlation structure into dispatch
solutions.
To fill the knowledge gap, we propose a novel data-driven ambiguous joint chance
constrained ED framework based on deep generative adversarial networks (GANs), which can learn complex data distributions without assuming
specific forms of probability distributions [298]. These salient features of GANs make
them desirable for optimization under uncertainty [221, 299]. Based on the extracted
distributional information, the ambiguity set is constructed as an
f-divergence ball in the probability space centered around the distribution embodied in
the generator network. Note that the f-divergence not only plays a key role in designing
a unified framework of deep GANs, but also in providing a natural way to characterize
the distance-based ambiguity set. Rather than disregarding uncertainty correlations, the
proposed framework guarantees that the worst-case probability of
violating constraints with regard to wind utilization is below a tunable risk level. To
solve the resulting problem efficiently, a tailored solution method is
developed through the exploitation of the ED problem structure. An illustrative six-bus
system and the IEEE 118-bus test system are used to demonstrate the effectiveness of the proposed
ED framework.
The novel contributions of this chapter are summarized as follows:
• A deep learning based ambiguous joint chance constrained ED framework under spatial-temporal correlated wind power;
• An f-divergence based ambiguity set for wind power distributions using f-GAN, of which the training objective intimately aligns with the choice of divergence in the ambiguity set;
• A tailored solution method for the ambiguous joint chance constrained ED problem based on the proposed approach;
• A theoretical bound on the data complexity of f-GAN for the ambiguous joint chance constrained ED problem.
In the multi-period ED problem, one schedules energy production and allocation at each time period to minimize
the total cost. The decisions include conventional thermal energy dispatch, wind power
dispatch, and load shedding amount. The available wind power is assumed to be
uncertain.
The ambiguous joint chance constrained ED problem is formulated as follows. The objective of the ED problem is to minimize the total cost in
eq. (3.1). The total cost includes the operating cost of thermal units, as well as electricity
load shedding cost. Eq. (3.2) enforces the energy balance for each time period. The
minimum and maximum power output limits of each thermal unit are specified in
Constraint (3.3). Constraints (3.4)-(3.5) enforce the ramping rate limits of thermal units
for each time period. The capacity constraints for transmission lines are described in
Constraints (3.6)-(3.7).
The ambiguous joint chance constraint (3.9) requires that, with a worst-case probability
of at least 1 − ε, the power outputs of wind farms cannot exceed the available random wind power
[133], and that the percentage of wind utilization is at least β. The satisfaction of these
two requirements is a random event because the available wind power is random and
wind power dispatch w_bt is a scheduled quantity [270]. The available wind power W_bt is subject to
distributional uncertainty. The proposed
ambiguous joint chance constraint enjoys the following advantages. First, it provides a
stronger guarantee on overall power systems security than individual chance constraints.
Second, it enables a systematic trade-off between economic performance and the risk level of
constraint violation. Third, compared with conventional joint chance
constraints, it is well capable of hedging against the distributional ambiguity arising from the finite
amount of wind power data.
min_{p_it, w_bt, q_bt} Σ_i Σ_t C_i^V p_it + Σ_b Σ_t C_b^LS q_bt   (3.1)

s.t. Σ_b Σ_{i∈G_b} p_it + Σ_b w_bt = Σ_b (D_bt − q_bt), ∀t   (3.2)

Σ_b K_bl (Σ_{i∈G_b} p_it + w_bt − D_bt + q_bt) ≤ F_l, ∀l, t   (3.6)

Σ_b K_bl (Σ_{i∈G_b} p_it + w_bt − D_bt + q_bt) ≥ −F_l, ∀l, t   (3.7)

inf_{P∈D} P{ w_bt ≤ W_bt, ∀b, t;  Σ_b Σ_t w_bt ≥ β Σ_b Σ_t W_bt } ≥ 1 − ε   (3.9)
The ED problem is cast as the above ambiguous joint chance constrained program.
Notably, the joint formulation of the ambiguous chance constraint not only provides a stronger
overall security guarantee, but also captures the correlations
among uncertain wind power at different buses and time periods. By contrast, with the
ambiguous individual chance constraints below, the
constraints for each bus and each time period are respected separately.
inf_{P∈D} P{ w_bt ≤ W_bt } ≥ 1 − ε̂_bt, ∀b, t   (3.10)

inf_{P∈D} P{ Σ_b Σ_t w_bt ≥ β Σ_b Σ_t W_bt } ≥ 1 − ε̂   (3.11)
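The relationship between a joint constraint and its individual counterparts can be illustrated with a small Monte Carlo sketch (the bus/period count, the schedules, and the Gaussian wind model below are hypothetical, chosen only to show that the joint satisfaction probability never exceeds any individual one):

```python
import random

random.seed(0)
w = [30.0, 28.0, 32.0, 27.0]     # scheduled wind dispatch per (bus, period) pair
beta = 0.7                        # minimum wind-utilization fraction
# sampled available wind power W_bt (a stand-in for the unknown true distribution)
samples = [[random.gauss(35.0, 4.0) for _ in w] for _ in range(20000)]

# joint event: every dispatch stays below available wind AND utilization >= beta
joint = sum(
    all(wi <= Wi for wi, Wi in zip(w, W)) and sum(w) >= beta * sum(W)
    for W in samples
) / len(samples)
# individual events: each bus/period constraint checked on its own
individual = [sum(wi <= W[k] for W in samples) / len(samples)
              for k, wi in enumerate(w)]

print(round(joint, 3), [round(p, 3) for p in individual])
```

Since the joint event is the intersection of the individual events plus the utilization requirement, its probability is bounded above by every individual probability, which is why enforcing it at level 1 − ε yields the stronger system-wide guarantee.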
Given the ambiguous individual chance constraints, the existing works typically
construct ambiguity set by employing the mean and variance of wind power
generation. The knowledge gap to fill is a data-driven ambiguity set that is capable of
capturing the spatial-temporal correlations of uncertain wind power from renewable
sources. Such an informative ambiguity set can be seamlessly integrated with the
ambiguous joint chance constrained ED problem.
3.3 Deep learning based ambiguous joint chance constrained ED framework
In this section, we propose a novel deep learning based ambiguous joint chance
constrained ED framework. We
first present an introduction to the f-divergence and f-GANs. Then, we develop a deep
learning based ambiguity set of wind power distributions with the f-divergence. Finally,
we integrate the proposed ambiguity set into the ED model.
In this subsection, we first present the f-divergence. To measure the discrepancy between
two probability distributions P and Q, one can employ the f-divergence, defined as follows:

D_f(P‖Q) = ∫ f(dP/dQ) dQ   (3.12)

where f is a convex function satisfying f(1) = 0.
The f-divergence is widely used in the fields of information theory and machine learning. It is
a general family of divergences that includes the Kullback–Leibler divergence as a special case.
The roles of the f-divergence in this work are two-fold. On one hand, the divergence
represented by eq. (3.12) provides a powerful way to characterize ambiguity sets. On the
other hand, it plays a critical role in defining the objective function for a f-GAN model.
In this sense, the f-divergence offers a unique vehicle to link the ambiguity set used in the optimization model with the deep generative model.
The promise of GANs is to decipher rich and hierarchical structures to characterize the
underlying data distribution. As a
powerful generalization of the vanilla GAN, the f-GAN has achieved great success in
machine learning and computer vision [300]. The f-GAN model leverages the variational
representation of the f-divergence:

D_f(P‖Q) = sup_φ { E_{x∼P}[φ(x)] − E_{x∼Q}[f*(φ(x))] }   (3.13)

where E denotes the expectation, and the supremum in the above variational formulation
is taken over all possible functions φ. The set of functions can be well approximated by
the expressive class of deep neural networks. Note that f* in eq. (3.13) represents the
convex conjugate of function f.
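A minimal numeric check of this variational representation for the KL case (f(u) = u log u, so f*(t) = exp(t − 1)) on hypothetical discrete distributions: plugging the optimal critic φ*(x) = f′(dP/dQ) into the right-hand side recovers the divergence computed directly from the defining integral.

```python
import math

P = [0.5, 0.3, 0.2]          # hypothetical discrete distributions
Q = [0.4, 0.4, 0.2]

# direct definition: D_f(P||Q) = sum_x Q(x) * f(P(x)/Q(x)) with f(u) = u*log(u)
f = lambda u: u * math.log(u)
D_direct = sum(q * f(p / q) for p, q in zip(P, Q))

# variational form evaluated at the optimal critic phi*(x) = f'(P/Q) = log(P/Q) + 1
fstar = lambda t: math.exp(t - 1.0)   # convex conjugate of u*log(u)
phi = [math.log(p / q) + 1.0 for p, q in zip(P, Q)]
D_var = (sum(p * v for p, v in zip(P, phi))
         - sum(q * fstar(v) for q, v in zip(Q, phi)))

print(round(D_direct, 6), round(D_var, 6))  # both equal the KL divergence
```

In the f-GAN, a neural network takes the place of the optimal critic, so the trained discriminator yields a lower bound on the true divergence.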
In the f-GAN architecture, there are two deep neural networks, namely the generator network G and the discriminator network T.
Generator: The generator network takes the noise vector Z with a known distribution P_Z
as input and transforms it through a series of up-sampling
layers and fully connected layers. Note that P_Z is typically selected as a uniform or Gaussian distribution. The generator network attempts to fool the
discriminator network by generating data samples that mimic the real wind power data.
The loss function of the generator network is given as follows:

L_G = −E_Z[f*(T_ω(G_θ(Z)))]   (3.14)

where L_G is utilized to update parameter θ. A small value of L_G implies that the data
samples generated by network G resemble the real wind power data.
Discriminator: The generative model is pitted against the discriminator model. Given a
generator network, the real data samples and the samples generated by network G are fed
into the discriminator network, which aims to
distinguish the generated wind power data from real ones. It employs a series of down-sampling
operations to generate a scalar value T_ω(x), where x is either a generated sample or a
random variable drawn from the true data distribution P_r. The loss function that the
discriminator minimizes is given as follows:

L_T = −E_X[T_ω(X)] + E_Z[f*(T_ω(G_θ(Z)))]   (3.15)

where L_T represents the loss function of the discriminator, and is employed to update
parameter ω. In light of eq. (3.13), the discriminator network T serves as a critic that
estimates the divergence between the generated distribution and the real data distribution.
Formally, the f-GAN is framed as the following two-player minimax game with value
function V(G, T):

min_θ max_ω V(G_θ, T_ω) = E_X[T_ω(X)] − E_Z[f*(T_ω(G_θ(Z)))]   (3.16)

where function T_ω(x) = g_f(H_ω(x)). The output activation function g_f: ℝ → dom f* is
chosen according to the specific f-divergence employed.
The competition between the generator network and the discriminator network in the
above minimax game drives both deep neural networks to refine their model parameters.
Notably, f-GAN is general enough to incorporate several types of GANs. For example, the vanilla GAN can be regarded as a special case in which the f-divergence is chosen as the Jensen-Shannon divergence. The generator network produces data from a probability distribution induced by the weights of the feedforward neural network.
In this subsection, we develop a novel ambiguity set for wind power distributions based on f-GAN. Given an unlimited amount of wind power data, distribution \mathbb{P}_G is exactly the same as distribution \mathbb{P}_r from a theoretical point of view [302], and their f-divergence reduces to zero accordingly. In practice, however, the learned distribution might not be perfect due to the finite amount of available training wind power data.
Therefore, one needs to construct a family of distributions based on the information learned by the f-GAN. To this end, we propose a novel ambiguity set of wind power distributions using the f-divergence and the deep generator network in f-GAN, which is expressed by,
\mathcal{D} = \big\{ \mathbb{P}_w \in \mathcal{P} : D_f(\mathbb{P}_w \,\|\, \mathbb{P}_G) \le \rho \big\} \qquad (3.17)
where \mathbb{P}_w denotes the probability distribution for the random vector of wind power W_{bt}, \mathcal{P} represents the set of all probability distributions, and ρ is the divergence tolerance or radius of the divergence ball. Parameter ρ can be used to adjust the size of the ambiguity set. Unlike conventional distance-based ambiguity sets that adopt the discrete empirical distribution as a reference distribution, the proposed ambiguity set utilizes the continuous wind power distribution learned by the deep generative model as its reference.
The proposed deep learning based ambiguity set enjoys several advantages. First,
compared with conventional moment-based ambiguity set, the proposed deep learning
based ambiguity set can accurately capture the spatial-temporal correlations of wind
power at different buses and time periods. Second, the proposed method makes no
assumption on the specific form of wind power distributions owing to the power of deep
generative modeling. Another nice feature is stated as follows: if the χ²-divergence is employed in training the f-GAN, we know from the previous subsection that the learned distribution \mathbb{P}_G should be quite close to the true data distribution \mathbb{P}_r in terms of χ²-divergence. Thus, employing the χ²-divergence to characterize the ambiguity set, rather than using other types of f-divergences, is a natural and consistent choice.
Equipped with the data-driven ambiguity set (3.17), a deep learning based ambiguous joint chance constrained ED model is formulated as follows. For the ease of exposition, the model is represented in a compact form.
\min\; c^{T} x \qquad (3.18)
\text{s.t.}\; x \in S \qquad (3.19)
\mathbb{P}\big\{ a_i(x)^{T} \xi \le b_i(x),\; \forall i \big\} \ge 1 - \varepsilon \qquad (3.20)
where x denotes the vector of decision variables including p_{it}, w_{bt}, and q_{bt}; c denotes the vector of cost coefficients, and ξ is the vector of uncertain wind power. The objective function (3.18) represents eq. (3.1), while set S stands for a domain defined by deterministic constraints.
The chance constraint (3.20) is then robustified by leveraging the deep learning based ambiguity set \mathcal{D}. Without it, (3.18)-(3.20) is a classical joint chance constrained program; however, the resulting optimization problem cannot guarantee the prescribed risk level, because the true wind power distribution is not known exactly.
For constraint (3.20), the corresponding joint chance constraints must be satisfied for all
probability distributions within the deep learning based ambiguity set. Therefore, we
consider the worst-case distribution, and ensure that the joint chance constraints are
\inf_{\mathbb{P} \in \mathcal{D}} \mathbb{P}\big\{ a_i(x)^{T} \xi \le b_i(x),\; \forall i \big\} \ge 1 - \varepsilon \qquad (3.21)
where the infimum is taken over the wind power distributions in the deep learning based
ambiguity set.
Since the χ2-divergence is appropriate for small risk levels [280], we employ it for
training f-GAN and constructing the ambiguity set for the rest of this chapter. The
ambiguous joint chance constraint (3.21) can be further reformulated based on the following proposition.
Proposition 3.1: Given a χ2-divergence based ambiguity set, the ambiguous joint chance
constraint (3.21) is respected if and only if the following classical joint chance constraint
(3.22) is satisfied.
\mathbb{P}_G\big\{ a_i(x)^{T} \xi \le b_i(x),\; \forall i \big\} \ge 1 - \varepsilon' \qquad (3.22)
where the adjusted risk level \varepsilon' is
\varepsilon' = \varepsilon - \frac{\sqrt{\rho^{2} + 4\rho(\varepsilon - \varepsilon^{2})} - (1 - 2\varepsilon)\rho}{2\rho + 2} \qquad (3.23)
The scenario program is accordingly built upon the constraints given in eqs. (3.18), (3.19) and (3.22), with probability distribution \mathbb{P}_G induced by the deep generator network.
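The risk adjustment of eq. (3.23) is easy to evaluate numerically; a small helper sketching the χ²-divergence case (`eps` and `rho` are hypothetical parameter names for ε and ρ):

```python
import math

def adjusted_risk_level(eps, rho):
    # Adjusted risk level eps' for a chi^2-divergence ball of radius rho,
    # following the closed form in eq. (3.23); rho -> 0 recovers eps.
    return eps - (math.sqrt(rho**2 + 4.0*rho*(eps - eps**2))
                  - (1.0 - 2.0*eps)*rho) / (2.0*rho + 2.0)

print(adjusted_risk_level(eps=0.10, rho=0.05))  # a stricter level below 0.10
```

A larger radius ρ yields a smaller ε′, i.e. the chance constraint under the reference distribution must be enforced more strictly to cover the whole divergence ball.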
However, it remains challenging to derive a tractable analytical reformulation for constraint (3.22). To address this challenge, we leverage the scenario approach.
The scenario approach, a.k.a. constraint sampling, has been widely used in a variety of
optimization problems. The key idea of the scenario approach is to draw independently
identically distributed random scenarios from the probability distribution and enforce
the constraints with respect to all sampled uncertainty scenarios. It is worth noting that
the scenario approach is well suited for the deep learning based joint chance constraint
(3.22), because it only requires scenario sampling from \mathbb{P}_G rather than an explicit expression of the wind power distribution. Therefore, the scenario approach facilitates the integration of deep generative models with chance constrained optimization.
Suppose \xi_G^{(1)}, \ldots, \xi_G^{(K)} are the generated wind power data produced by the generator network in f-GAN, where K represents the number of data samples. Thus, constraint (3.22) is approximated by the sampled constraints
a_i(x)^{T} \xi_G^{(k)} \le b_i(x), \quad \forall i, \; k = 1, \ldots, K \qquad (3.24)
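The mechanics of (3.24) can be illustrated with a one-dimensional toy scenario program, in which the maps a_i, b_i of the ED model are abstracted into a single reserve-type constraint; all distributions and names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy scenario program: pick the smallest reserve level x satisfying
# x >= xi_k for every generated scenario -- a one-row instance of the
# sampled constraints (3.24).
K = 1000
xi_train = rng.weibull(2.0, K) * 5.0   # stand-in generated wind samples
x_opt = xi_train.max()                 # the binding (support) scenario

# Out-of-sample check: the empirical violation frequency on fresh samples
# should fall well below a 10% risk level when K is large, mirroring the
# guarantee behind the sample-size bound.
xi_test = rng.weibull(2.0, 100_000) * 5.0
violation = np.mean(xi_test > x_opt)
print(x_opt, violation)
```

Only one of the K sampled constraints is binding at the optimum, which previews the support-constraint argument used in Proposition 3.2.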
The scenario approach provides a theoretical guarantee that, with a sufficiently large value of K, the optimal solution of the corresponding scenario program satisfies the ambiguous joint chance constraint with high confidence. By exploiting the structure of the multi-period ED problem, we explicitly derive the data complexity for the scenario program in the following proposition.
Proposition 3.2: Given a confidence level 1 - \alpha \in (0,1) and risk level \varepsilon \in (0,1) for the ambiguous joint chance constraint (3.21), it suffices to generate
N(\alpha, \varepsilon) = \left\lceil \frac{4(\rho + 1)\big(\ln\frac{1}{\alpha} + B_W N_T + 1\big)}{(2\rho + 2)\varepsilon + (1 - 2\varepsilon)\rho - \sqrt{\rho^{2} + 4\rho(\varepsilon - \varepsilon^{2})}} \right\rceil \qquad (3.25)
wind power samples,
where BW represents the number of buses having wind farms, NT denotes the total
number of time periods, and ρ is the radius of the deep learning based ambiguity set.
Proof. The required number of wind power samples is dependent on the number of
support constraints. Note that the support constraints are defined as those constraints,
the removal of which changes the optimal solution. For a fixed b and t, constraint
w_{bt} \le W_{bt} is active only for the scenario with the smallest wind power value at bus b and time period t, contributing at most B_W N_T support constraints. In addition, the wind utilization constraint is supported by the scenario with the largest total wind power over all buses and time periods. Hence, there are at most B_W N_T + 1 support constraints in the multi-period ED problem.
Based on the notion of support constraints [277, 303], we obtain the following bound on the required sample size:
N(\alpha, \varepsilon') \ge \frac{2}{\varepsilon'} \left( \ln\frac{1}{\alpha} + B_W N_T + 1 \right) \qquad (3.26)
According to eq. (3.23), we further have eq. (3.25) when the χ²-divergence is employed, which completes the proof. ∎
Remark 3.1 The sample complexity in eq. (3.25) is for the generator network in f-GAN. In contrast to the finite amount of real historical wind power data, the unique merit of f-GAN lies in its capability of efficiently generating as many data samples as required in eq. (3.25).
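The bound in eq. (3.25) can be evaluated directly by combining the scenario bound (3.26) with the risk adjustment of eq. (3.23); the function and argument names below are hypothetical, and the instance mirrors a six-bus style setting with 3 wind buses and 24 periods (α and ρ chosen for illustration):

```python
import math

def adjusted_risk_level(eps, rho):
    # eps' of eq. (3.23) for a chi^2-divergence ball of radius rho
    return eps - (math.sqrt(rho**2 + 4.0*rho*(eps - eps**2))
                  - (1.0 - 2.0*eps)*rho) / (2.0*rho + 2.0)

def sample_size(alpha, eps, rho, bw, nt):
    # N(alpha, eps) of eq. (3.25): the scenario bound (3.26) evaluated at
    # the adjusted risk level eps', with BW*NT + 1 support constraints.
    eps_p = adjusted_risk_level(eps, rho)
    return math.ceil((2.0 / eps_p) * (math.log(1.0 / alpha) + bw * nt + 1))

print(sample_size(alpha=0.01, eps=0.10, rho=0.05, bw=3, nt=24))
```

As expected, the required sample size grows with the ambiguity radius ρ, the number of wind buses B_W, and the number of time periods N_T.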
Remark 3.2 We can further leverage a prescreening technique described as follows. After sampling N(α, ε) wind power scenarios from the generator network, instead of putting all scenarios into (3.24), a prescreening technique can be leveraged to select the most critical uncertainty scenarios. Specifically, the lowest levels of wind power at each bus and time period, as well as the highest level of total wind power over all B_W buses and N_T time periods, are selected beforehand among the N(α, ε) generated data. This technique substantially reduces the number of constraints in the scenario program.
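A sketch of the prescreening rule, assuming the generated scenarios are stored as an N × B_W × N_T array (all names and values below are illustrative): it keeps, per (bus, period) pair, the scenario with the lowest wind power, plus the single scenario with the highest total wind power.

```python
import numpy as np

rng = np.random.default_rng(2)
BW, NT, N = 3, 24, 5000
# scenarios[k, b, t]: k-th generated wind power sample at bus b, period t
scenarios = rng.weibull(2.0, (N, BW, NT)) * 5.0

flat = scenarios.reshape(N, -1)
idx_low = flat.argmin(axis=0)            # lowest sample per (bus, period)
idx_high = flat.sum(axis=1).argmax()     # highest total wind power sample
critical = np.unique(np.append(idx_low, idx_high))
print(critical.size)  # at most BW*NT + 1 = 73 scenarios are retained
```

Only the retained critical scenarios are passed to the scenario program, which shrinks the constraint set from N scenarios to at most B_W N_T + 1.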
In this section, case studies on the six-bus and IEEE 118-bus systems are presented. To benchmark the proposed method, we compare it against the DRCCED method with an ambiguity set constructed from the first and second-order moment information of W_{bt}. All optimization problems are solved with CPLEX 12.8.0, implemented on a computer with an Intel (R) Core (TM) i7-6700 CPU @ 3.40 GHz and 32 GB RAM. The optimality tolerance for CPLEX 12.8.0 is set to 0. In the case studies, the risk level ε is set to 10%.
The six-bus system has three conventional thermal generators and 11 transmission lines
[27]. Additionally, three wind farms are installed at Buses 1-3. To promote a high
penetration of wind energy, the percentage of wind utilization β is set to be 30%. The
load shedding cost is set to be $5/MW [270]. The wind power data we use comes from
As illustrated in Figure 15, the system comprises the three conventional thermal generators, 11 transmission lines, and the three wind farms installed at Buses 1-3.
The training process of the f-GAN in case study on the six-bus system is shown in Figure
16. In Figure 16, the x-axis represents the number of iterations, while the y-axis denotes
the value of losses. From Figure 16, we can readily observe that the f-GAN achieves a fast training speed at the beginning of the training process. As the training proceeds, the f-divergence between the real wind power distribution and the generated distribution approaches zero. This implies that the wind data generated by the generator neural
network look as realistic as the real ones, and that they cannot be distinguished by the
discriminator network.
Due to the high wind utilization percentage of 30%, the resulting DRCCED with moment information turns out to be infeasible when both constraint (3.10) and constraint (3.11) are enforced; we therefore relax constraint (3.11) regarding the efficiency of wind utilization. The number of wind power scenarios is determined by eq. (3.25): N(α, ε) = 3,102.48, so the number of scenarios is chosen to be 3,103.
The computational results are provided in Table 4. Compared with the DRCCED with moment information, the proposed ED method yields a larger problem size, because it introduces a set of constraints for each generated wind power scenario. As a result, it consumes 2.6 more CPU seconds of solution time. In terms of economic
performance, the proposed ED method is more cost-effective than the DRCCED method
with moment information via slashing the total cost by 33.3%. As can be observed from
the results in Table 4, the proposed method with the prescreening technique significantly
reduces its memory and computational time, which are comparable with those of the
Table 4. Comparisons of problem sizes and computational results for the DRCCED method with moment information, the proposed ED method, and the proposed ED method with prescreening.
The constraint violation probability of the DRCCED method with moment information is 28% based on 100 testing wind power scenarios generated by f-GAN. This constraint violation probability is much higher than the prescribed risk level of 10%, thus jeopardizing the security of power systems under intermittent wind energy. By contrast, the proposed method maintains a violation probability below the prescribed 10%, which satisfies the requirement on risk level. Based on the comparison results, the benefit of the derived theoretical bound lies in that it quantitatively dictates the required number of wind power scenarios.
To take a closer look at the cost breakdowns in case study on the six-bus system, we
present the cost distributions determined by the DRCCED method with moment
information and the proposed approach in Figure 17(a) and Figure 17(b), respectively.
From Figure 17(a) and Figure 17(b), we can readily see that the load shedding costs
account for more than 25% of the total costs for both methods. This is ascribed to the
relatively low load-shedding price. Notably, the percentage of load shedding determined
by the DRCCED method with moment information is 23% higher than that of the
proposed approach. The reason for this is described as follows. The DRCCED method with moment information is less effective in wind power utilization compared with the proposed approach, thereby incurring more load shedding.
Figure 17. The cost breakdown of (a) the DRCCED method with moment information, and (b) the proposed approach.
When the load-shedding price increases from $5/MW to $15/MW, no electricity load is
shed over the entire time horizon (NT=24). The power outputs of conventional
generation units are displayed in Figure 18. From Figure 18, we can see that most of the load is served by Generator 1, which contributes 61.24% of the total cost for the proposed ED approach.
Figure 18. The power dispatch of each conventional generator determined by the
proposed ED approach.
The IEEE 118-bus system is further studied to demonstrate the scalability and effectiveness of the proposed deep learning based ED approach. This
system consists of 118 buses, 54 thermal generators, 186 transmission lines, and 91
loads [305]. Moreover, ten wind farms (denoted by WF1-WF10) are installed at Buses
8, 12, 23, 36, 42, 56, 69, 77, 88 and 93. In this case study, the percentage of wind power utilization is set to be 15%.
Figure 19. The spatial correlations of the ten wind farm energy outputs for (a) real
wind power data, and (b) wind power data generated by f-GAN. The color darkness of
one single cell represents the level of spatial correlation coefficient for corresponding
two wind farms. Comparison of spatial correlations can be made by focusing on the
darkness patterns of heat maps. The temporal correlations of WF10 for (c) real wind
power data, and (d) wind power data generated by f-GAN. The level of auto-correlation is indicated by the bar height, so comparison of temporal correlations can be done by considering the height of each bar for every time lag.
To demonstrate that the adopted f-GAN is able to capture spatial and temporal correlations, we first calculate the spatial correlation coefficients of wind power outputs between
the wind farms located at different buses. Figure 19(a) and 19(b) visualize the spatial
correlation results of the real wind power data and generated data, respectively. Note
that darker colors indicate stronger correlations in these heat maps. By comparing
Figure 19(a) and 19(b), we can easily identify the resembling patterns of spatial
correlation for the real data and generated data. For example, from Figure 19(b), we
observe that the wind power output at WF1 has strong positive correlations with the ones at WF4, WF7 and WF10, which is exactly consistent with the underlying true correlations shown in Figure 19(a). To further examine the temporal correlations in wind power time series, the autocorrelation coefficients are calculated
for each wind site. The results of real data and generated data for WF10 are displayed
in Figure 19(c) and 19(d). By inspection, the generated wind output series of WF10 has
similar auto-correlation coefficients as the real ones. Note that the auto-correlation coefficients of generated wind data are close to the real ones for the other wind farms as well. Thus, the wind scenarios generated by f-GAN retain both the spatial and temporal correlations of the real wind power data.
The problem sizes and computational results are summarized in Table 5. By setting \hat{\varepsilon}_{bt} = \hat{\varepsilon} = 0.1/241 based on the Bonferroni approximation, the corresponding DRCCED problem with moment information becomes infeasible; therefore, the results for the DRCCED method with moment information with \hat{\varepsilon}_{bt} = \hat{\varepsilon} = 0.1 are provided. Note that N(α, ε) equals 9,747.19 based on eq. (3.25). Thus, the number of wind power scenarios is chosen to be 9,748.
Table 5. Comparisons of problem sizes and computational results for the DRCCED method with moment information, the proposed ED method, and the proposed ED method with prescreening.
Figure 20. The empirical distribution of the wind power utilization efficiency for (a) the DRCCED method with moment information, and (b) the proposed approach.
To examine the wind energy utilization of the DRCCED method with moment information and the proposed approach, we calculate the percentage of wind utilization for each generated wind power scenario and obtain its empirical probability distribution, as shown in Figure 20. As can be observed, the wind utilization percentage for the DRCCED method with moment information in Figure 20(a) concentrates at lower values, whereas the wind utilization of the proposed approach in Figure 20(b) is much higher than the prescribed percentage of wind power utilization of 15%. Moreover, the probability distribution in Figure 20(b) is lopsided, with more probability mass located at higher values. This observation again illustrates that the proposed approach utilizes wind energy more effectively.
3.6 Summary
In this work, a novel f-GAN based ambiguous joint chance constrained ED optimization framework was proposed. The deep learning based ambiguity set well captured the wind power distribution, and an a priori bound was derived on the required number of generated wind power data, which depended on the number of installed wind farms and the number of time periods in the ED problem. The comparison results with an arbitrarily chosen number of scenarios showed that the developed theoretical bound guaranteed the prescribed risk level. The prescreening technique could be further leveraged to speed up the solution process, thus facilitating
the scalability of the proposed approach in the large-scale IEEE 118 bus system.
3.7 Nomenclature
Parameters
Kbl Power flow distribution factor for the transmission line l due to the power injection at bus b
RDi Ramp down rate of generator i
Decision Variables
CHAPTER 4
ONLINE LEARNING BASED RISK-AVERSE STOCHASTIC MODEL
PREDICTIVE CONTROL OF CONSTRAINED LINEAR UNCERTAIN SYSTEMS
4.1 Introduction
Over the past few decades, model predictive control (MPC) has established itself as a modern control strategy with theoretical grounding and a wide variety of applications. It is particularly well suited for multivariable systems subject to control input and state constraints. Implemented in a receding horizon fashion, MPC solves a finite-horizon optimal control problem at each sampling instant and only performs the first control action. This procedure is repeated at the next instant with a new measurement update. However, the presence of uncertainty could degrade the prediction quality and lead to constraint violations.
Motivated by this fact, a lot of research efforts have been made on designing MPC that
accounts for the uncertainty of prediction. Robust MPC strategies, in which disturbances
are modeled using a bounded and deterministic set, aim to satisfy the hard constraints
of states and control inputs for all possible uncertainty realizations [311-314]. Designing
robust MPC is necessary and efficient when state and input constraints must be satisfied for all uncertainty realizations; however, it can be overly conservative. Stochastic MPC, in contrast, allows constraint violations in a systematic way [315]. Additionally, stochastic MPC can increase the region of attraction by means of chance constraints, thus allowing for a systematic trade-off between control performance and constraint violation.
Due to its attractive feature, stochastic MPC has stimulated considerable research
interest from the control community [315-317]. The existing literature in stochastic
MPC can be typically grouped into two main categories depending on whether the disturbance distribution is assumed to be exactly known. The first category handles chance constraints via the explicit use of uncertainty distributions [318-325]. The conventional stochastic MPC strategies rely heavily on the assumption that the probability distribution of the disturbance is perfectly known; in practice, however, only an estimate of the disturbance distribution can be inferred from data. In such data-driven settings, chance constraints in these stochastic MPC frameworks are no longer ensured because there always exists a gap between the underlying true distribution and the estimated one. For the case with unknown disturbance distribution, the probabilistic constraints can be enforced in a distribution-free manner. For example, the Chebyshev
inequality based method guarantees constraint satisfaction for any distributions sharing
the same mean-covariance information, which is in the same spirit as the emerging paradigm of distributionally robust optimization. One such method was developed for linear systems with probabilistic constraints, assuming that the first and second-order moments of the disturbance are known. However, only the global mean and covariance information is utilized, and the existing ambiguity set is not updated in an online fashion.
A natural idea is to leverage the information embedded within data to improve control performance. The remarkable progress in machine learning and big data analytics leads to a broad range of opportunities for data-driven control. Additionally, the dramatic growth of computing power has enabled such organic
integration. Recently, learning-based MPC has attracted increasing attention from the
control community [335-338]. One such method leveraged statistical identification tools to obtain an approximate model with uncertainty bounds for constraint tightening in robust tube MPC [335, 339]. Along the same research direction, a learning-based robust MPC was
developed to integrate control design with offline system model learning via a set-based identification scheme. To cope with model uncertainty, an adaptive dual MPC was designed [341], where control played a probing role to learn the system model [342, 343]. To learn system nonlinearities from data,
several MPC strategies leveraged Gaussian process regression which provides system
dynamic model as well as residual bounds [344, 345]. Learning-based MPC is well
suited for the exploitation of data value to enhance control performance [336], and as such it is a practical and appealing control tool in the era of big data [346].
Most of these learning-based MPC methods rely on deterministic uncertainty models, which essentially learn a function by means of regression techniques along
with their error bounds. Although some of these learning-based MPC methods allow for
online or adaptive model learning, most of them focus on the robust control framework.
Few studies have organically integrated online learning with stochastic MPC.
To fill this research gap, there are several computational and theoretical challenges that must be addressed. The first challenge is how to develop an uncertainty model that is adaptive to data complexity automatically. In the context of online learning, one can hardly pin down the complexity of the disturbance model at the beginning, since more data
stream in over the runtime of MPC and data complexity can grow over time. Another
key research challenge is how to develop a framework that organically integrates online
learning with MPC for intelligent control. In particular, with more and more data
collected, the online learning method needs to be scalable with sample size in terms of
both memory and computational time. The third challenge lies in the development of a tractable constraint tightening method that hedges against the ambiguity in the uncertainty distribution. This calls for a theory extension in distributionally robust
optimization with the nonparametric ambiguity set, since there are no theoretical results
available for direct use. The fourth key research challenge is how to guarantee recursive
feasibility and stability of online learning-based stochastic MPC. This challenge arises
from the integration of online learning with stochastic MPC. Specifically, an arbitrary update of the ambiguity set during the runtime of the controller could compromise recursive feasibility.
This work proposes an online learning-based risk-averse stochastic MPC framework for constrained linear systems subject to additive disturbance, in a setting where the distribution can be partially inferred from data. To immunize the control performance against distributional ambiguity, we construct the ambiguity set based on the structural property, namely multimodality, along with local first and second-order moment information of each mixture component, the number of which is automatically derived from disturbance data. During the runtime of the controller, real-time disturbance data are exploited to adapt the uncertainty model, by learning the ambiguity set from real-time disturbance data and controlling the system with the updated uncertainty information. We further derive a tractable reformulation of the resulting distributionally robust CVaR constraints over the DPMM-based ambiguity set.
Additionally, we introduce a safe online update scheme for ambiguity set such that the
recursive feasibility and closed-loop stability are ensured. Numerical simulation and
comparison studies show that the proposed MPC method enjoys less-conservative
control performance compared with the conventional distributionally robust control that
uses global mean and covariance information of the disturbance. Additionally, thanks to the online learning scheme, the proposed MPC is advantageous in terms of low constraint violation probability. The main contributions of this chapter are summarized as follows:
A novel online Bayesian learning based risk-averse stochastic MPC framework that organically integrates online learning with control.
An online data-driven approach with DPMM to devise ambiguity sets that are self-adaptive to data complexity.
A novel constraint tightening method for risk-averse stochastic MPC, in which data-driven CVaR constraints over the DPMM-based ambiguity set are equivalently reformulated into computationally tractable constraints.
Theoretical guarantees on recursive feasibility and closed-loop stability, with the introduction of a novel safe update scheme for ambiguity sets.
One important contribution of this work is a novel and organic integration of online Bayesian learning with stochastic MPC. In addition, the nonparametric ambiguity set based on the DPMM is first developed in this manuscript.
Note that the objective of the constraint tightening method is to facilitate the solution of
the proposed MPC. To the best of our knowledge, few research works investigate
theoretical guarantees on recursive feasibility and stability of stochastic MPC in the face
of time-varying disturbance distribution. Therefore, the establishment of recursive feasibility and stability with the safe update scheme is another novel contribution of this work.
Notation: The notation used in this chapter is standard. For sets \mathbb{A} and \mathbb{B}, \mathbb{A} \oplus \mathbb{B} = \{a + b : a \in \mathbb{A}, b \in \mathbb{B}\} and \mathbb{A} \ominus \mathbb{B} = \{a : a + b \in \mathbb{A}, \forall b \in \mathbb{B}\} denote the Minkowski set addition and the Pontryagin set difference, respectively. [A]_j represents the j-th row of the
matrix A, and [a]_j denotes the j-th entry of the vector a. For matrices A and B, \mathrm{diag}(A, B) denotes the block-diagonal matrix formed from them.
Consider the following linear time-invariant uncertain system with additive stochastic disturbance:
x_{k+1} = A x_k + B u_k + w_k \qquad (4.1)
where x_k, u_k, and w_k denote the system state, the control input, and the additive disturbance. Let \mathbb{X} = \{x \in \mathbb{R}^n : Hx \le h\} and \mathbb{U} = \{u \in \mathbb{R}^m : Gu \le g\} be the polytopic constraints on state and input, both of which contain the origin in the interior. We make the following assumptions regarding the system and the additive disturbance.
Assumption 4.1 The measurement of system state x_k is available at time k.
This is a common assumption. At each sampling time t+1, the realization of the disturbance can be computed from the state measurements as w_t = x_{t+1} - A x_t - B u_t. This provides access to real-time disturbance data for online learning, which refines the knowledge of uncertainty.
Assumption 4.3 The additive disturbance w has a bounded and convex support set \mathbb{W} = \{w : Ew \le f\}.
Given the state measurement at time k, the prediction model in MPC is given by,
x_{l+1|k} = A x_{l|k} + B u_{l|k} + w_{l+k}, \quad x_{0|k} = x_k \qquad (4.2)
where x_{l|k} and u_{l|k} represent the l-step ahead state and control input predicted at time k, respectively. Stochastic MPC tolerates a small probability of constraint violation via the use of chance constraints. Following the literature [318, 348, 349], we consider the chance constraints on states, as in (4.3). The advantage of such a probabilistic formulation is discussed below.
\mathbb{P}\big\{ [H]_i x_{l+1|k} \le [h]_i \,\big|\, x_{l|k} \big\} \ge 1 - [\varepsilon]_i, \quad i = 1, \ldots, p \qquad (4.3)
where H \in \mathbb{R}^{p \times n}, h \in \mathbb{R}^{p}, and [\varepsilon]_i is a pre-specified risk level for the i-th constraint on the system state. Note that (4.3) becomes a hard constraint when the probability mass of all disturbances is strictly greater than zero and [\varepsilon]_i is set to zero.
Chance constraints ensure constraint satisfaction with a probability of at least 1 - [\varepsilon]_i, provided the disturbance distribution is known exactly. Stochastic MPC can enlarge the feasible region of the corresponding finite-horizon optimal control problem via tunable risk levels in chance constraints, thus improving control performance. In practice, however, such precise knowledge of probability is rarely available, and only
partial information can be inferred from historical as well as real-time disturbance data.
Due to the finite amount of uncertainty data, the assumed disturbance probability distribution could deviate from the underlying true distribution. Consequently, the actual constraint violation resulting from conventional stochastic MPC could become worse than the pre-specified one. Additionally, the chance constraints per se (in the form of (4.3)) focus on
the frequency of constraint violation and fail to account for the violation magnitude
[159, 350]. Therefore, we introduce the definition of CVaR and the distributionally robust CVaR constraints as follows.
Definition 4.1 (Conditional value-at-risk). The conditional value-at-risk (CVaR) of a random loss function L at level \varepsilon with respect to probability distribution \mathbb{P} is defined below.
\mathbb{P}\text{-CVaR}_{\varepsilon}(L) = \inf_{\beta \in \mathbb{R}} \left\{ \beta + \frac{1}{\varepsilon} \mathbb{E}_{\mathbb{P}}\big[ (L - \beta)^{+} \big] \right\} \qquad (4.4)
where \mathbb{E}_{\mathbb{P}} denotes the expectation with respect to probability \mathbb{P}. The CVaR can be interpreted as the conditional expectation of the loss in the \varepsilon-tail of the probability distribution of L.
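As a concrete illustration of eq. (4.4), an empirical (plug-in) CVaR estimator can fix β at the sample (1 − ε)-quantile, i.e. the value-at-risk; this is a standard estimator, not a construction taken from the chapter:

```python
import numpy as np

def empirical_cvar(losses, eps):
    # Plug-in estimate of P-CVaR_eps(L) from eq. (4.4), with beta fixed at
    # the empirical (1 - eps)-quantile (the value-at-risk).
    beta = np.quantile(losses, 1.0 - eps)
    return beta + np.mean(np.maximum(losses - beta, 0.0)) / eps

rng = np.random.default_rng(3)
losses = rng.normal(0.0, 1.0, 200_000)
print(empirical_cvar(losses, 0.05))  # ~2.06 for a standard normal loss
```

The estimate always dominates the corresponding VaR, which reflects that CVaR penalizes not just the frequency but also the magnitude of large losses.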
We consider the distributionally robust CVaR version of the constraints \mathbb{P}\{[H]_i x_{l+1|k} \le [h]_i\} \ge 1 - [\varepsilon]_i, given by
\sup_{\mathbb{P} \in \mathcal{D}(k)} \mathbb{P}\text{-CVaR}_{[\varepsilon]_i}\big( [H]_i x_{l+1|k} - [h]_i \big) \le 0, \quad i = 1, \ldots, p \qquad (4.5)
The distributionally robust CVaR constraints (4.5) not only hedge against the distributional ambiguity, but also penalize severe constraint violations that could be detrimental to system safety.
Meanwhile, hard constraints are imposed on the control inputs due to physical limitations of the actuators:
G u_{l|k} \le g \qquad (4.6)
The predicted state is decomposed into a nominal part and a stochastic error part,
x_{l|k} = z_{l|k} + e_{l|k} \qquad (4.7)
where z_{l|k} and e_{l|k} denote the nominal part and the stochastic error part of the predicted state x_{l|k}, respectively.
Using an error feedback parameterization, the predicted input for the uncertain system can be represented by
u_{l|k} = K e_{l|k} + v_{l|k} \qquad (4.8)
where K is a stabilizing feedback gain and v_{l|k} represents the predicted control input for the nominal system, given as follows.
v_{l|k} = K z_{l|k} + c_{l|k} \qquad (4.9)
With the predictive control laws (4.8) and (4.9), the system dynamics decompose as
z_{l+1|k} = (A + BK) z_{l|k} + B c_{l|k} \qquad (4.10)
e_{l+1|k} = (A + BK) e_{l|k} + w_{l+k}, \quad e_{0|k} = 0 \qquad (4.11)
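The decomposition (4.7) with the dynamics (4.10)-(4.11) can be checked numerically; in the sketch below, A, B, K, the disturbance bounds, and the perturbation sequence c are illustrative choices, not values from the chapter:

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
K = np.array([[-0.4, -1.2]])         # some stabilizing feedback gain
Ak = A + B @ K                       # closed-loop matrix A + BK

rng = np.random.default_rng(4)
x = np.zeros(2); z = np.zeros(2); e = np.zeros(2)
for c in (0.3, 0.1, 0.0, 0.0, 0.0):          # perturbations c_{l|k}
    w = rng.uniform(-0.05, 0.05, 2)          # bounded disturbance sample
    u = (K @ (z + e)).item() + c             # u = K e + v with v = K z + c
    x = A @ x + B.ravel() * u + w            # true dynamics
    z = Ak @ z + B.ravel() * c               # nominal dynamics (4.10)
    e = Ak @ e + w                           # error dynamics (4.11)
    assert np.allclose(x, z + e)             # decomposition (4.7) holds
print(x, z + e)
```

The nominal trajectory z depends only on the free perturbations c, while all of the stochasticity is pushed into the error e, which is exactly what enables the constraint tightening on z.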
We consider the cost for the nominal system in this work. Specifically, the control objective is to minimize the infinite-horizon cost at sampling time k, as given below.
J = \sum_{l=0}^{\infty} \left( z_{l|k}^{T} Q z_{l|k} + v_{l|k}^{T} R v_{l|k} \right) \qquad (4.12)
Furthermore, the following assumption on detectability is made such that there exists
an LQR solution.
Suppose the feedback gain matrix K is chosen to be LQR-optimal; then we can rewrite the cost as
J = \sum_{l=0}^{N-1} c_{l|k}^{T} \left( R + B^{T} P B \right) c_{l|k} + x_k^{T} P x_k \qquad (4.13)
where P is the solution of the corresponding discrete algebraic Riccati equation. Note that the second term x_k^{T} P x_k is a constant. A quadratic finite-horizon cost is thus given by
J_N(\mathbf{c}_{N|k}) = \sum_{l=0}^{N-1} c_{l|k}^{T} \left( R + B^{T} P B \right) c_{l|k} \qquad (4.14)
where \mathbf{c}_{N|k} = \big[ c_{0|k}^{T}, \ldots, c_{N-1|k}^{T} \big]^{T} represents the decision vector at time k with a horizon of length N.
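The ingredients of (4.13)-(4.14) can be computed with a plain Riccati fixed-point iteration; the matrices A, B, Q, R below are illustrative values, not from the chapter:

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve the discrete algebraic Riccati equation by fixed-point iteration
P = np.eye(2)
for _ in range(500):
    S = R + B.T @ P @ B
    P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.inv(S) @ B.T @ P @ A

K = -np.linalg.inv(R + B.T @ P @ B) @ B.T @ P @ A   # LQR-optimal gain

# Finite-horizon cost (4.14) for a decision vector c_{N|k} with N = 3, m = 1
c = np.array([0.2, 0.1, 0.0])
W = (R + B.T @ P @ B)[0, 0]          # scalar weight R + B^T P B
J_N = float(W * np.sum(c**2))
print(J_N)
```

With this LQR-optimal K, the closed-loop matrix A + BK is Schur stable, and the finite-horizon cost depends only on the perturbation sequence c.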
In the following, we first devise the data-driven ambiguity set for the stochastic disturbance based on the DPMM. Then, an efficient constraint tightening method for the CVaR constraints on system states over the ambiguity set is developed for the synthesis of the stochastic predictive controller. Finally, based on an online safe update scheme, the predictive control algorithm that organically integrates online learning with control is presented. We begin with the DPMM, which is used to learn the disturbance distribution from real-time data, reviewed briefly as follows.
The Dirichlet process (DP) constitutes a fundamental building block for the DPMM, denoted as G \sim \mathrm{DP}(\alpha, G_0). For any fixed partition (A_1, \ldots, A_r) of \Theta_0, we have the following:
\big( G(A_1), \ldots, G(A_r) \big) \sim \mathrm{Dir}\big( \alpha G_0(A_1), \ldots, \alpha G_0(A_r) \big) \qquad (4.15)
Following the stick-breaking procedure [351], a random draw from the DP can be expressed as G = \sum_{k=1}^{\infty} \beta_k \delta(\varphi_k), where each atom \varphi_k is independently sampled from G_0, and \delta(\varphi_k) denotes the Dirac delta function at \varphi_k. The weight is constructed as \beta_k = \bar{\beta}_k \prod_{j=1}^{k-1} (1 - \bar{\beta}_j), where the parameter \bar{\beta}_k represents the proportion being broken from the remaining stick and follows a Beta distribution, \bar{\beta}_k \sim \mathrm{Beta}(1, \alpha).
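A truncated stick-breaking draw is easy to simulate; the sketch below uses a normal base measure G₀ as an example (all names and values illustrative):

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    # Truncated stick-breaking construction of G = sum_k beta_k * delta(phi_k):
    # beta_bar_k ~ Beta(1, alpha) is the fraction broken off the remaining
    # stick, beta_k = beta_bar_k * prod_{j<k} (1 - beta_bar_j), phi_k ~ G0.
    beta_bar = rng.beta(1.0, alpha, n_atoms)
    remaining = np.cumprod(np.concatenate(([1.0], 1.0 - beta_bar[:-1])))
    beta = beta_bar * remaining
    phi = rng.normal(0.0, 1.0, n_atoms)   # atoms drawn from G0 = N(0, 1)
    return beta, phi

rng = np.random.default_rng(5)
beta, phi = stick_breaking(alpha=2.0, n_atoms=200, rng=rng)
print(beta.sum())  # close to 1 for a long enough truncation
```

Smaller concentration parameters α break off larger fractions early, so the probability mass concentrates on fewer atoms, which is what induces clustering in the DPMM.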
The Bayesian nonparametric model, i.e. DPMM, employs k as the parameters of some
data distribution. Based on the DP, we summarize the basic form of a DPMM as follows
[352, 353]:
\{\beta_k, \varphi_k\}_{k=1}^{\infty} \sim \mathrm{DP}(\alpha, G_0)
l_n \sim \mathrm{Mult}(\beta) \qquad (4.16)
o_n \sim F(\varphi_{l_n})
where Mult denotes a multinomial distribution, l_n is the label indicating the component or cluster of observation o_n, n is an index ranging from 1 to N_d, and o_1, \ldots, o_{N_d} are the observed data. A random draw G from a DP is discrete with probability one. Such discreteness further induces the clustering of data.
Due to its computational efficiency, the variational inference has become a method of
choice for approximating the conditional distribution of latent variables in the DPMM
given observed data [353]. In the variational inference, the problem of computing the posterior distribution is converted into an optimization problem, which is then solved using a coordinate ascent method. Following the literature [352], we use mixtures of Gaussians in this work; accordingly, the base measure is chosen to be the normal-Wishart (NW) distribution, where \varphi_k = (\mu_k, H_k) includes the mean vector and the precision matrix.
For the online learning setting, suppose that the real-time data wt is collected for the
control system. To learn from the streaming data, we employ an online variational
inference algorithm in this work [347]. This algorithm features faster computation and
bounded memory requirement for each round of learning. It is well-suited to learning the
distribution online from real-time data over the runtime of MPC. The algorithm iterates
between the model building phase and the compression phase. In the model building
phase, a clump C_s is a set of indices such that, for all i, j \in C_s, the disturbance data w_i and w_j are generated from the same mixture component. Disturbance data within
the same clump are summarized via the average sufficient statistics, which encapsulates
all the information needed for the purpose of inference [347]. The new disturbance data at time t are used to update the inference results and are then discarded, which bounds the memory requirement and keeps the computation fast. By introducing the compression phase, the
algorithm not only is computationally efficient, but also requires bounded memory
space. The clump constraints are determined in the compression phase in a top-down
recursive fashion. Specifically, the computation burden at each model update using new
disturbance data does not grow with the processed disturbance data amount. For more
details on this online learning algorithm for the DPMM, we refer the readers to [347].
Definition 4.2 (Data-driven ambiguity set). The ambiguity set based on the DPMM, learned from the disturbance data available up to time k, is defined as
\mathcal{D}(k) = \left\{ \sum_{j=1}^{m(k)} \gamma_j^{(k)} \mathbb{P}_j : \mathbb{P}_j \in \mathcal{P}_j\big( \mu_j^{(k)}, \Sigma_j^{(k)} \big), \; \forall j \right\} \qquad (4.17)
where \mathbb{W} = \{w : Ew \le f\} is the support set of the additive disturbance under Assumption 4.3, m(k) denotes the number of mixture components, and the mixing weight \gamma_j^{(k)}, mean \mu_j^{(k)}, and covariance \Sigma_j^{(k)} are the estimates for the j-th mixture component obtained from the online learning algorithm.
Note that the support set plays a key role in ensuring recursive feasibility by the means
of terminal set.
The data-driven ambiguity set for the stochastic disturbance based on the DPMM is devised as a weighted Minkowski sum of several basic ambiguity sets, the number of which is automatically determined from disturbance data using the online variational inference algorithm. Each basic ambiguity set is of the moment-based form
\mathcal{P}_j\big( \mu_j^{(k)}, \Sigma_j^{(k)} \big) = \left\{ \mathbb{P}_j : \; \mathbb{P}_j\{ w \in \mathbb{W} \} = 1, \;\; \mathbb{E}_{\mathbb{P}_j}[w] = \mu_j^{(k)}, \;\; \mathbb{E}_{\mathbb{P}_j}\big[ (w - \mu_j^{(k)})(w - \mu_j^{(k)})^{T} \big] = \Sigma_j^{(k)} \right\} \qquad (4.18)
where \mathbb{P}_j denotes a probability measure supported on \mathbb{W}.
There are several highlights of the proposed data-driven ambiguity set. First, the ambiguity set is self-adaptive to the complexity of the disturbance data. Second, each basic ambiguity set is devised using the mean and covariance information, which endows the resulting stochastic MPC with enormous computational
benefits. Third, the proposed ambiguity set leverages the fine-grained distribution
information, namely local moment information. This feature implies that the resulting
stochastic MPC enjoys a less conservative control performance compared against the
control method with a conventional ambiguity set based on global moment information.
We now address the constraint tightening for the distributionally robust CVaR constrained optimization. The purpose of constraint tightening is to obtain constraints on the states of the nominal system such that the states of the uncertain system are satisfied with the prescribed probabilistic guarantee.
For state constraints, the corresponding distributionally robust CVaR constraints can be reformulated into deterministic tightened constraints on the nominal states, as stated below.
Theorem 4.1: The uncertain system satisfies the distributionally robust CVaR constraints (4.5) if and only if the nominal system satisfies the tightened constraints z_{l+1|k} \in \mathbb{Z}(k), with
\mathbb{Z}(k) = \left\{ z \in \mathbb{R}^{n} : Hz \le h - \eta(k) \right\} \qquad (4.19)
where [\eta(k)]_i is the optimal objective value of the following problem (4.20).
\begin{aligned}
[\eta(k)]_i = \min \;\; & \eta_i \\
\text{s.t.} \;\; & [\varepsilon]_i \eta_i - \sum_{j=1}^{m} \gamma_j \left( t_{ij} + \mu_j^{T} \nu_{ij} + \left\langle \Sigma_j + \mu_j \mu_j^{T}, \Lambda_{ij} \right\rangle \right) \ge 0 \\
& \begin{bmatrix} \Lambda_{ij} & \tfrac{1}{2}\big( \nu_{ij} - E^{T} \tau_{ij} \big) \\ \tfrac{1}{2}\big( \nu_{ij} - E^{T} \tau_{ij} \big)^{T} & t_{ij} + f^{T} \tau_{ij} \end{bmatrix} \succeq 0, \quad \forall j \qquad (4.20) \\
& \begin{bmatrix} \Lambda_{ij} & \tfrac{1}{2}\big( \nu_{ij} - [H]_i^{T} - E^{T} \tilde{\tau}_{ij} \big) \\ \tfrac{1}{2}\big( \nu_{ij} - [H]_i^{T} - E^{T} \tilde{\tau}_{ij} \big)^{T} & t_{ij} + \eta_i + f^{T} \tilde{\tau}_{ij} \end{bmatrix} \succeq 0, \quad \forall j \\
& \Lambda_{ij} \succeq 0, \quad \tau_{ij} \ge 0, \quad \tilde{\tau}_{ij} \ge 0, \quad \forall j
\end{aligned}
Note that we drop the index k from m(k), \lambda_j^{(k)}, \mu_j^{(k)}, and \Sigma_j^{(k)} for notational simplicity.
The constraints in (4.20) are LMI constraints. The proof of Theorem 4.1 is provided in
Appendix B.
Remark 4.1 The support set is assumed to be a polyhedron in Assumption 4.3 because this yields robust counterparts in the form of LMIs. The polyhedral support can be relaxed to a general compact convex set; in this case, those constraints still admit robust counterparts, yet more complicated than LMIs. Problem (4.20) is a semidefinite program (SDP), which can be solved efficiently using off-the-shelf optimization solvers, such as SeDuMi, and is therefore well suited to being solved repeatedly in an online fashion.
For hard constraints on the control inputs, G u_{l|k} \le g, the corresponding tightened constraint is

v_{l|k} \in \mathcal{V} \ominus K \bigoplus_{i=0}^{l-1} \Phi^{i} \mathcal{W}   (4.21)

where \mathcal{V} = \{ u \in \mathbb{R}^m : Gu \le g \}.
The proposed online learning based risk-averse stochastic MPC

In this section, we first introduce the finite horizon optimal control problem of the proposed MPC with a safe ambiguity set update scheme. Then, the overall online learning based risk-averse stochastic MPC algorithm is described.
In the online learning based risk-averse stochastic MPC paradigm, the finite horizon optimal control problem is solved repeatedly online. The optimal control problem (OL-SMPC_k) is formulated as follows.

\min_{c_N(k)} \ J_N(c_N(k))   (4.22)

\text{s.t.} \quad z_{l+1|k} = A z_{l|k} + B v_{l|k}, \quad l = 0, \ldots, N-1   (4.23)

v_{l|k} = K z_{l|k} + c_l(k)   (4.24)

z_{l+1|k} \in \mathcal{Z}_{l+1}^{(k)}, \quad l = 0, \ldots, N-1   (4.25)

v_{l|k} \in \mathcal{V}_{l}^{(k)}, \quad l = 0, \ldots, N-1   (4.26)

z_{N|k} \in \mathcal{Z}_{f}^{(k)}   (4.27)
where the tightened constraint sets and the terminal set are defined using a safe update scheme described as follows. In the developed update scheme, a condition is identified under which it remains safe to utilize the tightened constraints with the updated ambiguity set in the risk-averse stochastic MPC. If no feasibility issue results from the current results of online variational inference, one can safely incorporate the newly-learned uncertainty information into the control problem; otherwise, the previously adopted constraints are retained.
To describe the condition checked by the safe update scheme, we need the definition of the candidate solution. Given the optimal solution c_N^*(k) = \{c_0^*(k), c_1^*(k), \ldots, c_{N-1}^*(k)\} to the MPC problem at time k, the candidate solution at time k+1 is

\tilde{c}_N(k+1) = \{c_1^*(k), \ldots, c_{N-1}^*(k), 0\}   (4.28)

Note that it is common to employ the candidate solution as the shifted optimal input sequence augmented by zero [314, 355]. This work employs the dual mode prediction paradigm, in which the terminal controller for the nominal system z_{l+1|k} = A z_{l|k} + B v_{l|k} is able to steer the nominal system state (in the terminal set) to the origin. The last term of zero in the candidate solution actually comes from the dual mode prediction paradigm, which requires the last term to be zero. The explicitly given candidate solution plays a critical role not only in establishing recursive feasibility, but also in proving closed-loop stability [314, 355].
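The shifted candidate solution in (4.28) can be sketched in a few lines; the horizon length and perturbation values below are illustrative.

```python
import numpy as np

# Shifted candidate solution (4.28): drop the first perturbation, append zero.
def candidate_solution(c_star):
    """Given the optimal sequence [c_0*, ..., c_{N-1}*], return [c_1*, ..., c_{N-1}*, 0]."""
    return np.vstack([c_star[1:], np.zeros((1, c_star.shape[1]))])

c_star = np.array([[1.0], [0.5], [0.2]])   # N = 3, m = 1 (illustrative values)
c_tilde = candidate_solution(c_star)
print(c_tilde.ravel())                     # -> [0.5 0.2 0. ]
```

The trailing zero corresponds to the dual mode paradigm: beyond the horizon the terminal controller v = Kz takes over, so no perturbation is applied.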
Based on the candidate solution, the safe update scheme checks the following conditions:

\tilde{z}_{l|k+1} \in \hat{\mathcal{Z}}_{l}^{(k+1)}   (4.29)

\tilde{z}_{N|k+1} \in \hat{\mathcal{Z}}_{f}^{(k+1)}   (4.30)

where \tilde{z}_{l|k+1} denotes the predicted state of the nominal system corresponding to the candidate solution, and \hat{\mathcal{Z}}_{l}^{(k+1)} and \hat{\mathcal{Z}}_{f}^{(k+1)} denote the tightened state constraints and the terminal set computed from the updated ambiguity set, respectively.
Definition 4.4 (Robust positively invariant set) A set Ω is a robust positively invariant set for a disturbed linear system x^+ = \Phi x + w, w \in \mathcal{W}, if \Phi x + w \in Ω for all x \in Ω and all w \in \mathcal{W}.
Definition 4.5 (Maximal Robust Positively Invariant set) A set Ω is a maximal robust positively invariant (MRPI) set if Ω is a robust positively invariant set and contains all robust positively invariant sets.
Remark 4.2 The MRPI sets are computed by using a standard approach based on set recursions.
Definition 4.6 (Terminal set) The terminal set \hat{\mathcal{Z}}_{f}^{(k)} is defined as the Maximal Robust Positively Invariant (MRPI) set for the system z_{N|k+1} = \Phi z_{N|k} + \Phi^{N} w_k that satisfies the tightened constraints z_{N|k} \in \hat{\mathcal{Z}}_{N}^{(k)} and K z_{N|k} \in \mathcal{V}_{N}^{(k)}.

Based on Definition 4.6, the terminal set should be an MRPI set that satisfies \hat{\mathcal{Z}}_{f}^{(k)} \subseteq \hat{\mathcal{Z}}_{N}^{(k)} and, for all z \in \hat{\mathcal{Z}}_{f}^{(k)} and all w \in \mathcal{W}, \Phi z + \Phi^{N} w \in \hat{\mathcal{Z}}_{f}^{(k)} with K z \in \mathcal{V}_{N}^{(k)}.
Note that the terminal controller with feedback gain K respects the state and input constraints inside the terminal set. It is safe to update the ambiguity set when an indicator called flag equals 1; otherwise, when flag = 0, updating the ambiguity set could jeopardize the recursive feasibility of the proposed MPC. Therefore, the corresponding tightened constraints adopted in the online optimization are updated as follows:

\mathcal{Z}_{l}^{(k+1)} = \text{flag} \cdot \hat{\mathcal{Z}}_{l}^{(k+1)} + (1 - \text{flag}) \cdot \mathcal{Z}_{l}^{(k)}   (4.31)
Similarly, we can safely update the terminal set in the following way:

\mathcal{Z}_{f}^{(k+1)} = \text{flag} \cdot \hat{\mathcal{Z}}_{f}^{(k+1)} + (1 - \text{flag}) \cdot \mathcal{Z}_{f}^{(k)}   (4.32)
The proposed MPC algorithm is detailed in Figure 21. The online-learning based risk-
averse stochastic MPC algorithm can be roughly divided into two blocks: (i) offline
computation of the sets, ambiguity set construction based on historical disturbance data,
and (ii) online learning from real-time disturbance data and online optimization. At each
time, the MPC strategy only implements the first control action. From Figure 21, we
can see that it alternates between online optimal control of the system and online learning from real-time disturbance data. If the candidate solution satisfies the conditions \tilde{z}_{l|k+1} \in \hat{\mathcal{Z}}_{l}^{(k+1)} and \tilde{z}_{N|k+1} \in \hat{\mathcal{Z}}_{f}^{(k+1)}, the newly learned uncertainty information is incorporated into the predictive control strategy to improve the control performance over its runtime. In the receding horizon implementation, the optimal control problem is solved at each time step, and only the first control action is performed as the control input u_k. The corresponding first control action is u_0^*(k) = K e_{0|k} + v_0^*(k). Since e_{0|k} = 0, we further have u_0^*(k) = 0 + v_0^*(k) = v_0^*(k). Therefore, the control input is u_k = v_0^*(k), as in Step 5 of the algorithm.
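The overall alternation between online control and online learning can be sketched schematically as follows; the solver, learner, and safe-update check are placeholder stubs standing in for (OL-SMPC_k), the variational DPMM update, and conditions (4.29)-(4.30), and the scalar dynamics are purely illustrative.

```python
import numpy as np

# Schematic receding-horizon loop of the online learning based MPC; the three
# callables are stand-in placeholders, not the dissertation's actual routines.
def run_mpc(x0, steps, solve_ocp, learn, safe_to_update):
    x, history = x0, []
    sets = "initial tightened sets"            # computed offline from historical data
    for k in range(steps):
        c_star = solve_ocp(x, sets)            # solve (OL-SMPC_k) for c_N(k)
        u = c_star[0]                          # u_k = v*_0(k) since e_{0|k} = 0
        w = np.random.uniform(-0.1, 0.1)       # realized disturbance (illustrative)
        x = x + u + w                          # placeholder scalar dynamics
        candidate = list(c_star[1:]) + [0.0]   # shifted sequence augmented by zero
        new_sets = learn(w, sets)              # online variational update of the DPMM
        if safe_to_update(candidate, new_sets):  # flag = 1: conditions (4.29)-(4.30) hold
            sets = new_sets                    # safely adopt the updated tightening
        history.append(x)
    return history

# Minimal stubs so the sketch runs end to end.
traj = run_mpc(0.0, 5,
               solve_ocp=lambda x, s: [-0.5 * x] * 3,
               learn=lambda w, s: s,
               safe_to_update=lambda c, s: True)
print(len(traj))  # -> 5
```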
Remark 4.3 The upper bound on the memory requirement of the online learning algorithm is \left( \frac{n^2 + 3n}{2} + 1 \right) N_c + n N_s, where N_c is the number of clumps, N_s is the number of singlets, and n denotes the data dimension. Note that the cost of storing the sufficient statistics for each clump is \frac{n^2 + 3n}{2} + 1. Compared with batch learning, the memory requirement of the online learning algorithm is only O(K + N_c + N_s + 1) during the model building phase [347], where K denotes the number of mixture components.
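As a quick numeric check of the bound in Remark 4.3 (the clump and singlet counts below are illustrative):

```python
# Numeric check of the memory bound in Remark 4.3.
def clump_cost(n):
    """Sufficient statistics stored per clump: (n^2 + 3n)/2 + 1 numbers."""
    return (n * n + 3 * n) // 2 + 1

def memory_bound(n, n_clumps, n_singlets):
    """Upper bound ((n^2 + 3n)/2 + 1) * N_c + n * N_s on stored numbers."""
    return clump_cost(n) * n_clumps + n * n_singlets

print(clump_cost(2))           # 2-D disturbance: 6 numbers per clump
print(memory_bound(2, 10, 5))  # N_c = 10 clumps, N_s = 5 singlets -> 70
```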
In this section, the properties of the proposed online learning based risk-averse stochastic MPC are analyzed. In particular, we show that if the optimal control problem is initially feasible, it remains feasible at all subsequent sampling instants.
As pointed out in Section 4.1, the important property of MPC, namely recursive feasibility, is nontrivial to guarantee under an online updated ambiguity set of disturbance distributions. To this end, a novel safe update scheme for the ambiguity set is developed for the MPC framework, along with an explicitly constructed candidate solution.
By employing the safe ambiguity set update scheme developed in Section 4.3.3, recursive feasibility and closed-loop stability are ensured even though the disturbance distribution is time-varying. The support set is assumed not time-varying for ease of exposition, as indicated in Assumption 4.3, where matrices E and f are not indexed by time k. For a time-varying support, the derivation and conclusion of the proposed constraint tightening in Theorem 4.1 are still valid. The only issue with a varying support is that the MRPI set could become empty when the support becomes sufficiently large, which further leads to the infeasibility issue.
A standard assumption underpinning tube based MPC on the terminal set is made as follows.

Assumption 4.5 There exists a nonempty terminal set \mathcal{Z}_f^{*} for the tightened constraints \mathcal{Z}^{*} = \{ z \in \mathbb{R}^n : Hz \le h - \bar{\eta} \}, where [\bar{\eta}]_i = \max_k [\eta(k)]_i.

Proposition 4.1. Under Assumption 4.5, the terminal MRPI set \mathcal{Z}_f^{(k)} is always nonempty.
Remark 4.4 Based on the proof in Appendix C, we can see that set \mathcal{Z}_f^{*} is a robust positively invariant set, not necessarily the MRPI set, for the updated tightened constraints. Therefore, in the case where a very limited computational budget is imposed, the offline-computed set \mathcal{Z}_f^{*} can serve as the terminal set to guarantee recursive feasibility and stability without re-computing the MRPI set at each time step.
First, we prove the recursive feasibility of the proposed MPC, as given in the following theorem.

Theorem 4.2 (Recursive feasibility). Let \mathcal{X}_N(k) denote the feasible region of the finite horizon optimal control problem (OL-SMPC_k) for state x_k. If x_0 \in \mathcal{X}_N(0), then problem (OL-SMPC_k) remains feasible for all k \ge 0.
Remark 4.5 Since the distribution of the stochastic disturbance can be arbitrarily time-varying, the feasibility of the candidate solution based on adaptively tightened constraints (without using the safe update scheme) cannot hold universally. To the best of our knowledge, the developed safe update scheme presents the first attempt to successfully guarantee recursive feasibility under online updated ambiguity sets of disturbance distributions.
Before proving the stability of the proposed MPC, we provide the definition of the minimal robust positively invariant set.

Definition 4.8 (Minimal robust positively invariant set) The minimal robust positively invariant set is

R_{\infty} = \lim_{l \to \infty} \bigoplus_{i=0}^{l} \Phi^{i} \mathcal{W}   (4.33)
The following theorem establishes the stability of the closed-loop system under the proposed MPC. In the numerical experiments, we implement the risk-averse stochastic MPC using global mean and covariance in the ambiguity set and the risk-averse stochastic MPC without online learning, in addition to the proposed MPC approach, for the purpose of comparison.
experiments are performed on a computer with an Intel (R) Core (TM) i7-6700 CPU @
3.40 GHz and 32 GB RAM. We use the YALMIP toolbox in MATLAB R2018a [359].
The GUROBI 8.0 solver is adopted to solve the finite-horizon optimal control problem,
and SeDuMi 1.3 is employed to solve the constraint tightening problem (4.20). The
related sets, including terminal sets and the robust positively invariant set, are obtained
via the Multi-Parametric Toolbox 3.0 [360]. The prediction horizon N is set to nine. Note that the distributionally robust control method (Van Parys et al., 2016) considers a setting similar to ours, namely a risk-averse stochastic control setting. We adopt a benchmark problem, the sampled double integrator [312], to demonstrate the effectiveness of the proposed MPC.
The state and control constraints in the risk-averse stochastic MPC are given as follows.

\sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{0.2}^{P} \left( [0 \ \ 1] \, x_{l+1|k} - 2 \right) \le 0   (4.35)

-5 \le u \le 5   (4.36)

The initial condition of the system states is x_0 = [-5, \ 2]^T, and the support set of the disturbance is \mathcal{W} = \{ w : \|w\|_{\infty} \le 0.6 \}. In the numerical example, the feedback gain K is chosen as the unconstrained LQR solution with Q = I_2 and R = 0.01; specifically, K = [-0.6609 \ \ -1.3261], so that

\Phi = A + BK = \begin{bmatrix} 0.6696 & 0.3370 \\ -0.6609 & -0.3261 \end{bmatrix}

Set R_{\infty} can then be computed accordingly. Closed-loop simulations are performed with a simulation horizon of 20 time steps. The closed-loop cost function is J_{cost} = \sum_{k=1}^{T_s} \left( x_k^T Q x_k + u_k^T R u_k \right), where the simulation horizon length is T_s = 20.
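The example setup can be reproduced approximately with a numpy-only Riccati iteration; the matrices A and B below are an assumed standard discretization of the double integrator (they are not listed in this excerpt), with Q = I and R = 0.01 as in the text.

```python
import numpy as np

# Riccati iteration for the LQR gain of the sampled double integrator, assuming
# A = [[1, 1], [0, 1]], B = [[0.5], [1.0]]; Q = I and R = 0.01 follow the text,
# which reports gain magnitudes 0.6609 and 1.3261 for this setup.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R = np.eye(2), np.array([[0.01]])

P = np.eye(2)
for _ in range(1000):  # fixed-point iteration of the discrete Riccati equation
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + K.T @ R @ K + (A + B @ K).T @ P @ (A + B @ K)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # u = K x
Phi = A + B @ K
print(np.abs(np.linalg.eigvals(Phi)).max() < 1)  # Phi = A + BK is Schur

# Closed-loop cost J = sum of x'Qx + u'Ru over Ts = 20 steps with a bounded disturbance
rng = np.random.default_rng(0)
x, J = np.array([[-5.0], [2.0]]), 0.0
for _ in range(20):
    u = K @ x
    J += (x.T @ Q @ x + u.T @ R @ u).item()
    x = A @ x + B @ u + rng.uniform(-0.1, 0.1, size=(2, 1))
print(J > 0)
```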
Compared with the risk-averse stochastic MPC with a mean-covariance ambiguity set, the proposed MPC method exploits fine-grained uncertainty information and is less conservative, reducing the closed-loop cost by an average of 9.77% over all simulation runs. To take a closer look at the computational time breakdown of the proposed MPC, we present the average computational times for online learning, constraint tightening through solving (4.20), online control via solving (OL-SMPC_k), and computing the MRPI set in Figure 22. The average values of computational time are calculated over 2,000 (20 × 100) time steps. From the results, we can see that, at each time step, the proposed MPC not only enables the incorporation of updated uncertainty information, but also maintains a modest computational cost. Since the example is a benchmark problem, the computational time can be expected to grow for larger-scale systems; the results nevertheless demonstrate the practicality of the proposed approach.
Remark 4.6 The adopted approach of computing the MRPI set is computationally efficient, relying on standard set recursions [358]. In the above numerical example, it takes only 0.057 s on average to compute the MRPI set, and the longest time for computing the MRPI set is 0.105 s.
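The set recursion mentioned in Remarks 4.2 and 4.6 can be illustrated in one dimension, where every set is a symmetric interval and each recursion step reduces to a scalar update; this is a conceptual sketch, not the polytopic computation performed by the Multi-Parametric Toolbox.

```python
# One-dimensional sketch of the standard set recursion for an MRPI set:
# dynamics x+ = phi*x + w with |w| <= wbar, constraint |x| <= c0. All sets are
# symmetric intervals [-c, c], so each recursion step is a scalar update.
def mrpi_bound(phi, wbar, c0, tol=1e-12, max_iter=1000):
    """Iterate Omega_{k+1} = {x in Omega_k : phi*x + w in Omega_k for all |w| <= wbar}."""
    c = c0
    for _ in range(max_iter):
        # phi*x + w stays in [-c, c] for all |w| <= wbar iff |x| <= (c - wbar)/|phi|
        c_new = min(c, (c - wbar) / abs(phi))
        if c_new < 0:
            return None          # MRPI set is empty (disturbance support too large)
        if c - c_new < tol:
            return c_new         # converged: [-c, c] is the MRPI set
        c = c_new
    return c

print(mrpi_bound(0.5, 0.2, 1.0))  # small disturbance: nonempty MRPI set
print(mrpi_bound(0.5, 0.6, 1.0))  # large disturbance support: empty set -> None
```

The second call illustrates the infeasibility issue noted above: once the support grows past a threshold, the recursion collapses to the empty set.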
Figure 22. The average computational times of the proposed online learning based risk-averse stochastic MPC.

The standard deviation used to generate historical disturbance data is 0.005, while the standard deviation increases to 0.3 for the real-time disturbance data generation. In this example, Q = I_2, R = 1, the support set of the disturbance is \mathcal{W} = \{ w : \|w\|_{\infty} \le 0.10 \}, and the distributionally robust CVaR constraint is given below.

\sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{0.15}^{P} \left( [0 \ \ 1] \, x_{l+1|k} - 1.2 \right) \le 0   (4.37)
To take a closer look at constraint violations under time-varying disturbance distributions, we perform repeated closed-loop simulations with randomly generated disturbance sequences. Figure 23 shows a set of state trajectories using the proposed MPC for a simulation horizon of 20 time steps. Note that the prediction horizon is N = 9. For the proposed MPC strategy, the average constraint violation in the first nine steps over all simulations is 7.2%, even though the disturbance variation becomes significant. For the risk-averse stochastic MPC using distributionally robust CVaR constraints without online learning, the constraint violation exceeds the prescribed tolerance of 15.0%, thus jeopardizing the safety of the control system. Notably, the risk-averse stochastic MPC without the online learning scheme implements constraint tightening offline using historical disturbance data only.
Figure 23. (a): The closed-loop trajectories of the system states for the proposed online learning based MPC under randomly generated disturbance sequences, (b): The zoomed-in view of the state trajectories near the upper limit of x(2).
By leveraging the online update of the ambiguity set, the time-varying distribution is well captured by the proposed MPC method, and the constraint tightening is adaptive to the disturbance distribution. Figure 24 shows the evolution of the constraint tightening of the proposed MPC over time in a simulation run. From Figure 24, we can readily see that the effect of adaptation in the proposed MPC is evident. Specifically, η increases from 0.016 to 0.10 based on the updated information of the stochastic disturbance. The values of flag equal one over the entire horizon, which indicates that the ambiguity set is safely updated at every time step.

Figure 24. The online adaptation of the constraint tightening parameters in the proposed MPC.
4.6 Summary
This chapter proposed an online learning based risk-averse stochastic MPC framework for linear systems subject to a time-varying stochastic disturbance. It incorporated online learning into MPC with desirable theoretical control guarantees. A systematic approach to construct the ambiguity set from real-time disturbance data was developed, which leveraged the structural property of multimodality and local moment information. Additionally, the exact reformulation of the distributionally robust CVaR constraints was derived as LMI constraints to facilitate constraint tightening. The online update of the ambiguity set was coupled with a safe update scheme to ensure the recursive feasibility and stability of the resulting MPC. The computational results demonstrated that the control performance of the developed MPC is less conservative compared with the one using global mean and covariance information.
The cost of the nominal system under the perturbed control law can be analyzed via the autonomous formulation given below [356].

y_{l+1|k} = \Psi y_{l|k}, \quad l \ge 0   (4.38)

where the initial state is y_{0|k} = \left[ z_{0|k}^T, \ c_0(k)^T, \ldots, c_{N-1}(k)^T \right]^T, and the system matrix is

\Psi = \begin{bmatrix} \Phi & B \mathcal{E} \\ 0 & \mathcal{T} \end{bmatrix}, \qquad \mathcal{E} = \begin{bmatrix} I_m & 0 & \cdots & 0 \end{bmatrix}, \qquad \mathcal{T} = \begin{bmatrix} 0 & I_m & & \\ & \ddots & \ddots & \\ & & 0 & I_m \\ & & & 0 \end{bmatrix}

in which \mathcal{E} selects the first perturbation and \mathcal{T} is the block upward-shift matrix.
Based on the autonomous system dynamics, the stage cost of J = \sum_{l \ge 0} \left( z_{l|k}^T Q z_{l|k} + v_{l|k}^T R v_{l|k} \right) can be written as below.

z_{l|k}^T Q z_{l|k} + v_{l|k}^T R v_{l|k} = z_{l|k}^T Q z_{l|k} + \left( K z_{l|k} + c_l(k) \right)^T R \left( K z_{l|k} + c_l(k) \right) = y_{l|k}^T \hat{Q} y_{l|k}   (4.39)

where, acting on the pair \left( z_{l|k}, c_l(k) \right), the weighting matrix is

\hat{Q} = \begin{bmatrix} Q + K^T R K & K^T R \\ R K & R \end{bmatrix}

Summing the stage costs along the autonomous dynamics gives

J = \sum_{l=0}^{\infty} y_{l|k}^T \hat{Q} y_{l|k} = y_{0|k}^T \Theta y_{0|k}   (4.40)

where \Theta solves the Lyapunov equation \Psi^T \Theta \Psi + \hat{Q} = \Theta. We express matrix \Theta in block form as \Theta = \begin{bmatrix} \Theta_z & \Theta_{zc} \\ \Theta_{cz} & \Theta_c \end{bmatrix}. By substituting \Psi and \hat{Q} into the equation \Psi^T \Theta \Psi + \hat{Q} = \Theta, we have

\Theta_z = \Phi^T \Theta_z \Phi + Q + K^T R K   (4.41)

\Theta_{cz} = \mathcal{T}^T \Theta_{cz} \Phi + \mathcal{E}^T \left( B^T \Theta_z \Phi + R K \right)   (4.42)

\Theta_c = \mathcal{T}^T \Theta_c \mathcal{T} + \mathcal{T}^T \Theta_{cz} B \mathcal{E} + \mathcal{E}^T B^T \Theta_{zc} \mathcal{T} + \mathcal{E}^T \left( R + B^T \Theta_z B \right) \mathcal{E}   (4.43)

Since the feedback gain K is the LQR optimal solution, by (4.41), \Theta_z is the unique solution of the Riccati equation and \Theta_z = P. Based on the expression of the LQR solution K = -\left( B^T P B + R \right)^{-1} B^T P A, we have

B^T \Theta_z \Phi + R K = B^T P A + \left( B^T P B + R \right) K = B^T P A - \left( B^T P B + R \right) \left( B^T P B + R \right)^{-1} B^T P A = 0   (4.44)

According to (4.42) and (4.44), we have \Theta_{cz} = \mathcal{T}^T \Theta_{cz} \Phi, which implies \Theta_{cz} = 0 is the solution. Since matrix \Theta is symmetric, we can further have \Theta_{zc} = 0. Therefore, (4.43) reduces to

\Theta_c = \mathcal{T}^T \Theta_c \mathcal{T} + \mathcal{E}^T \left( R + B^T P B \right) \mathcal{E}   (4.45)

Let matrix \Theta_c = \begin{bmatrix} \Theta_c^{1,1} & \cdots & \Theta_c^{1,N} \\ \vdots & \ddots & \vdots \\ \Theta_c^{N,1} & \cdots & \Theta_c^{N,N} \end{bmatrix}. By plugging the block matrix into (4.45), we have

\Theta_c^{i,j} = \begin{cases} R + B^T P B, & i = j \\ 0, & i \ne j \end{cases}   (4.46)

Based on (4.40) and (4.46), we have J = \sum_{l=0}^{N-1} c_l(k)^T \left( R + B^T P B \right) c_l(k) + x_k^T P x_k.
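The key identity (4.44), which forces \Theta_{cz} = 0, can be checked numerically with a small value-iteration sketch; the system matrices below are illustrative, not taken from the dissertation.

```python
import numpy as np

# Numerical check of identity (4.44): with P the Riccati solution and K the
# LQR-optimal gain, B'P(A + BK) + RK = 0, which in turn forces Theta_cz = 0.
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [0.2]])
Q, R = np.eye(2), np.array([[0.5]])

P = np.eye(2)
for _ in range(2000):  # value iteration converging to the stabilizing solution
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + K.T @ R @ K + (A + B @ K).T @ P @ (A + B @ K)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
Phi = A + B @ K

residual = B.T @ P @ Phi + R @ K          # left-hand side of (4.44); should vanish
riccati_gap = P - (Q + K.T @ R @ K + Phi.T @ P @ Phi)  # P is a fixed point
print(np.abs(residual).max() < 1e-8)      # -> True
print(np.abs(riccati_gap).max() < 1e-6)   # -> True
```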
x_{l+1|k} = z_{l+1|k} + e_{l+1|k}, \qquad e_{l+1|k} = \Phi e_{l|k} + w_{l|k}   (4.47)

Note that the conditional distributionally robust CVaR constraints need to hold for any admissible prediction error, which yields

z_{l+1|k} \in \mathcal{Z}^{(k)} \ominus \bigoplus_{i=1}^{l} \Phi^{i} \mathcal{W}   (4.48)

The tightened set \mathcal{Z}^{(k)} is characterized row-wise by

\mathcal{Z}^{(k)} = \left\{ z : \sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i z - h_i + H_i w_{l|k} \right) \le 0, \ i = 1, \ldots, p \right\}   (4.49)

Consider a single constraint

\sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i z - h_i + H_i w_{l|k} \right) \le 0   (4.50)

The following reformulation builds on the definition of CVaR and the stochastic minimax theorem [139, 362, 363].
\sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i z - h_i + H_i w_{l|k} \right) = \sup_{P \in \mathcal{P}^{(k)}} \inf_{\beta} \left\{ \beta + \frac{1}{\varepsilon_i} \mathbb{E}_{P} \left[ \left( H_i z - h_i + H_i w_{l|k} - \beta \right)_{+} \right] \right\}

= \inf_{\beta} \left\{ \beta + \frac{1}{\varepsilon_i} \sup_{P \in \mathcal{P}^{(k)}} \mathbb{E}_{P} \left[ \left( H_i z - h_i + H_i w_{l|k} - \beta \right)_{+} \right] \right\}   (4.51)

where \mathbb{E}_{P} represents the expectation with respect to probability measure P. The first equality follows from the definition of CVaR, and the second equality follows from the stochastic minimax theorem.
The worst-case expectation over the DPMM-based ambiguity set decomposes by mixture component:

\sup_{P_1, \ldots, P_m} \ \sum_{j=1}^{m} \lambda_j \int_{\mathcal{W}} \left( H_i z - h_i + H_i w - \beta \right)_{+} dP_j   (4.52)

\text{s.t.} \quad \int_{\mathcal{W}} dP_j = 1, \quad \int_{\mathcal{W}} w \, dP_j = \mu_j, \quad \int_{\mathcal{W}} w w^T dP_j \preceq \Sigma_j + \mu_j \mu_j^T, \quad \forall j

By duality, each inner moment problem admits the dual

\min_{t_{ij}, \omega_{ij}, \Omega_{ij}} \ \sum_{j=1}^{m} \lambda_j \left( t_{ij} + \mu_j^T \omega_{ij} + \left\langle \Sigma_j + \mu_j \mu_j^T, \Omega_{ij} \right\rangle \right)   (4.53)

\text{s.t.} \quad w^T \Omega_{ij} w + \omega_{ij}^T w + t_{ij} \ge 0, \quad \forall w \in \mathcal{W}, \ \forall j   (4.54)

w^T \Omega_{ij} w + \omega_{ij}^T w + t_{ij} \ge H_i z - h_i + H_i w - \beta, \quad \forall w \in \mathcal{W}, \ \forall j   (4.55)

\Omega_{ij} \succeq 0, \quad \forall j   (4.56)

where t_{ij}, \omega_{ij}, and \Omega_{ij} are the dual variables corresponding to the constraints in the ambiguity set. Since constraints (4.54)-(4.55) are semi-infinite constraints, some further reformulation is needed; they can be equivalently expressed as the following LMI constraints, based on the duality of convex quadratic programs and the polyhedral description of the support set.
\begin{bmatrix} \Omega_{ij} & \tfrac{1}{2}\left( \omega_{ij} - E^T \tau_{ij} \right) \\ \tfrac{1}{2}\left( \omega_{ij} - E^T \tau_{ij} \right)^T & t_{ij} + f^T \tau_{ij} \end{bmatrix} \succeq 0, \quad j = 1, \ldots, m   (4.58)

\begin{bmatrix} \Omega_{ij} & \tfrac{1}{2}\left( \omega_{ij} - H_i^T - E^T \tilde{\tau}_{ij} \right) \\ \tfrac{1}{2}\left( \omega_{ij} - H_i^T - E^T \tilde{\tau}_{ij} \right)^T & t_{ij} + \beta + h_i - H_i z + f^T \tilde{\tau}_{ij} \end{bmatrix} \succeq 0, \quad \forall j   (4.59)

where \tau_{ij} \ge 0 and \tilde{\tau}_{ij} \ge 0 are the multipliers associated with the polyhedral support set \mathcal{W} = \{ w : Ew \le f \}. Therefore, the distributionally robust CVaR constraint \sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i z - h_i + H_i w_{l|k} \right) \le 0 over the DPMM-based ambiguity set is reformulated as the existence of \beta, t_{ij}, \omega_{ij}, \Omega_{ij}, \tau_{ij}, and \tilde{\tau}_{ij} such that

\beta + \frac{1}{\varepsilon_i} \sum_{j=1}^{m} \lambda_j \left( t_{ij} + \mu_j^T \omega_{ij} + \left\langle \Sigma_j + \mu_j \mu_j^T, \Omega_{ij} \right\rangle \right) \le 0   (4.60)

together with the LMIs (4.58)-(4.59) and \Omega_{ij} \succeq 0, \ \tau_{ij} \ge 0, \ \tilde{\tau}_{ij} \ge 0 for all j.
[\eta(k)]_i = \min \ \eta \quad \text{s.t.} \quad \sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i w_{l|k} - \eta \right) \le 0   (4.62)

The constraint in (4.62) can be converted into constraints of the form (4.20). Then, we convert each state constraint into

\max H_i z_{l+1|k} \le h_i - [\eta(k)]_i, \quad \text{with} \quad \sup_{P \in \mathcal{P}^{(k)}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i w_{l|k} - [\eta(k)]_i \right) \le 0   (4.63)

We reformulate constraint (4.63) into (4.60) using the same reformulation technique. Analogously, over the support-only ambiguity set \mathcal{P}^{*}, define

\bar{\eta}_i = \min \ \eta \quad \text{s.t.} \quad \sup_{P \in \mathcal{P}^{*}} \text{CVaR}_{\varepsilon_i}^{P} \left( H_i w_{l|k} - \eta \right) \le 0   (4.64)

which can be rewritten as

\bar{\eta}_i = \min \ \eta \quad \text{s.t.} \quad \beta_i + \frac{1}{\varepsilon_i} t_i - \eta \le 0, \quad t_i \ge 0, \quad t_i \ge \max_{w \in \mathcal{W}} H_i w - \beta_i   (4.65)

According to the constraints in (4.65), the optimum is attained at t_i = 0 with \beta_i = \max_{w \in \mathcal{W}} H_i w, so \bar{\eta}_i = \max_{w \in \mathcal{W}} H_i w, which is finite.
Since \mathcal{P}^{(k)} \subseteq \mathcal{P}^{*}, it holds that 0 \le [\eta(k)]_i \le \bar{\eta}_i for all i and k. Given this inequality, set \mathcal{Z}_f^{*} is a robust positively invariant set (not necessarily the MRPI set) for the updated tightened constraints. Since \mathcal{Z}_f^{*} is nonempty under Assumption 4.5, there always exists a nonempty MRPI set according to Definition 4.5. This completes the proof. □
Proof. Suppose c_N^*(k) = \{c_0^*(k), c_1^*(k), \ldots, c_{N-1}^*(k)\} is the optimal solution to problem (OL-SMPC_k) at time k. We construct the candidate solution \tilde{c}_N(k+1) = \{c_1^*(k), \ldots, c_{N-1}^*(k), 0\} using Definition 4.3. The nominal state under control input \tilde{c}_N(k+1) is shown as follows.

\tilde{z}_{1|k+1} = \Phi \left( z_{1|k}^* + w_k \right) + B c_1^*(k) = z_{2|k}^* + \Phi w_k   (4.68)

According to \tilde{z}_{0|k+1} = z_{1|k}^* + w_k and \tilde{z}_{1|k+1} = z_{2|k}^* + \Phi w_k, we have the following relation by induction.

\tilde{z}_{l|k+1} = z_{l+1|k}^* + \Phi^{l} w_k

Similarly, we have the relationship between the optimal solution at time k and the candidate inputs at time k+1: \tilde{v}_{l|k+1} = v_{l+1|k}^* + K \Phi^{l} w_k.

For the optimal control problem (OL-SMPC_{k+1}), we can consider two scenarios, namely flag = 0 and flag = 1. For the scenario in which flag = 0, we have \mathcal{Z}_l^{(k+1)} = \mathcal{Z}_l^{(k)} according to (4.31). Based on \tilde{z}_{l|k+1} = z_{l+1|k}^* + \Phi^{l} w_k and the definition of the tightened sets, the candidate states satisfy the state constraints. Next, we derive the candidate nominal state at the end of the horizon as follows.

\tilde{z}_{N|k+1} = A \left( z_{N|k}^* + \Phi^{N-1} w_k \right) + B \left( K z_{N|k}^* + K \Phi^{N-1} w_k \right) = \Phi z_{N|k}^* + \Phi^{N} w_k   (4.73)

Up to now, we have checked all the constraints, and we can conclude that the candidate solution is feasible. For the scenario where flag = 1, the conditions (4.29)-(4.30) checked by the safe update scheme imply that the constructed solution satisfies the updated tightened constraints, so feasibility again holds. □
Proof. We define the optimal objective value for the problem (OL-SMPC_k) as J_k below.

J_k = V_N^*(x_k) = \sum_{l=0}^{N-1} \left\| c_l^*(k) \right\|_{R + B^T P B}^2   (4.75)

Since the candidate solution is feasible and the objective function remains the same for both scenarios, we have

J_{k+1} \le \tilde{V}_N(x_{k+1}) = \sum_{l=0}^{N-1} \left\| \tilde{c}_l(k+1) \right\|_{R + B^T P B}^2 = \sum_{l=1}^{N-1} \left\| c_l^*(k) \right\|_{R + B^T P B}^2 = J_k - \left\| c_0^*(k) \right\|_{R + B^T P B}^2   (4.76)

where \tilde{V}_N(x_{k+1}) represents the objective value corresponding to the candidate solution. By rearranging (4.76), we have the following inequality.

J_{k+1} - J_k \le - \left\| c_0^*(k) \right\|_{R + B^T P B}^2   (4.77)

By summing (4.77) from k = 0, we have the following.

\sum_{k=0}^{\infty} \left\| c_0^*(k) \right\|_{R + B^T P B}^2 \le J_0 - J_{\infty} < \infty   (4.78)

Based on (4.78), we have \lim_{k \to \infty} c_0^*(k) = 0 for both scenarios where flag = 0 and flag = 1. Additionally, the convergence of the state to a neighborhood of the origin is further established.

\lim_{k \to \infty} x_k = \lim_{k \to \infty} \left( \Phi^{k} x_0 + \sum_{i=1}^{k} \Phi^{i-1} B c_0^*(k-i) + \sum_{i=1}^{k} \Phi^{i-1} w_{k-i} \right) = \lim_{k \to \infty} \sum_{i=1}^{k} \Phi^{i-1} w_{k-i}   (4.79)

Note that the second equality holds because \Phi is Schur and \lim_{k \to \infty} c_0^*(k) = 0. According to (4.79), the state converges to the set R_{\infty}, namely the minimal RPI set, under the proposed online-learning based risk-averse stochastic MPC. □
CHAPTER 5
A TRANSFORMATION-PROXIMAL BUNDLE ALGORITHM FOR SOLVING
LARGE-SCALE MULTISTAGE ADAPTIVE ROBUST OPTIMIZATION
PROBLEMS
5.1 Introduction
Robust optimization approaches can be classified into three categories: static robust optimization, two-stage Adaptive Robust Optimization (ARO), and multistage ARO. In static robust optimization, all the decisions are made prior to observing uncertainty realizations [368]. By contrast, two-stage ARO allows recourse decisions to be adaptive to realized uncertainties [106], thus typically generating less conservative solutions than static robust optimization [369].
As a result, the two-stage ARO method has found a variety of applications [370]. To overcome the limitation of two-stage structures, multistage ARO emerges as a practical yet more flexible paradigm for sequential decision making under uncertainty [110]. In the multistage setting, the decision maker can dynamically adjust recourse decisions as uncertainties are revealed stage by stage. Multistage ARO problems are prevalent in various control problems, including constrained robust finite-horizon optimal control [364], multiperiod portfolio optimization [372-374], and robust inventory control.
Despite its attractiveness for modeling dynamic decision making under uncertainty, ARO problems in general are notoriously demanding to solve [377]. To this end, extensive research effort has been made toward solution techniques for ARO problems. One popular approach is the affine control policy (the so-called affine decision rule), which restricts the recourse decisions to be affine functions of uncertainty realizations [378, 379]. In this way, the ARO problem reduces to a static (single-stage) robust optimization problem, which can be further addressed efficiently using standard robust optimization techniques. However, the affine control policy sacrifices optimality for tractability [182, 381-383]. Instead of relying on
control policies, the K-adaptability method devises K contingency plans beforehand and picks the best one among these preselected plans after observing uncertainty realizations. Benders decomposition and extreme point enumeration were proposed as two exact solution techniques exclusively suitable for two-stage ARO problems [103, 388-390]. Despite the broad application scope of the multistage setting, solution techniques for multistage ARO problems are limited in the existing literature, and they usually suffer from conservatism or a heavy computational burden.
for multistage ARO and demonstrate its use in robust optimal control.
The recourse decisions are partitioned into two different groups, namely state decision variables and control/local decision variables [391, 392]. We first propose a novel multi-to-two transformation scheme that converts the multistage ARO problem into an equivalent two-stage counterpart. Specifically, by enforcing only state decision variables to be affine functions of uncertainty, the original multistage problem is transformed into a two-stage one. This transformation frees the control variables from the affine control policy restriction, thereby leading to a higher-quality robust optimization solution [393]. We perform theoretical analysis to prove that such transformation is valid if state decisions follow causal control policies, such as the affine and piecewise affine control policies [106, 381]. The multi-to-two transformation scheme
is general enough to be combined with existing two-stage ARO solution algorithms for solving MARMILPs. Specifically, we adopt a proximal bundle algorithm for the exact solution of the resulting TARMILP. Since the worst-case recourse function in the two-stage ARO problem lacks an analytical expression and can be non-smooth, the bundle method is employed with an oracle evaluating the function value and its sub-gradients at query points. The convergence of the proposed algorithm is established for any type of uncertainty set. Compared with existing multistage ARO solution methods, including the affine control policy method [106] and the piecewise affine control policy approach [396], the proposed algorithm relies on weaker assumptions while retaining computational tractability. The affine control policy method assumes that both state and control
decision variables are affine functions of uncertainty [106], which is stronger than the assumption adopted in this work. Chen & Zhang (2009) split the uncertainty into its positive and negative parts, and apply affine control policies to the parameterized uncertainties. Thus, the piecewise affine control policy developed by Chen & Zhang (2009) essentially assumes a piecewise affine dependence on uncertainty for both state and control decision variables. Additionally, these approaches do not require an oracle for evaluating the function value and its sub-gradients [106, 396], whereas the proposed method needs such an oracle to obtain cutting planes. To test and evaluate the performance of the proposed method, an application to the constrained robust optimal control of dynamic inventory systems is presented. Although this chapter only presents this application, the work focuses on the development of a general methodology for multistage ARO, which can be applied in a variety of other settings. The main contributions include:

A novel multi-to-two transformation scheme along with its theoretical analysis for multistage ARO problems;

Application to the constrained robust optimal control of dynamic inventory systems under demand uncertainty alongside comparisons with affine and piecewise affine control policy methods.
This section presents the proposed multi-to-two transformation scheme for multistage ARO problems. By employing the affine control policy only for state decision variables, the proposed scheme transforms the original multistage ARO problem into its two-stage counterpart, in which control policies are only applied to state decision variables. Finally, a theoretical analysis of the validity of the transformation is provided.
In multistage ARO problems, decisions are made sequentially, and uncertainties are revealed over time stages. The MARMILP in its general form is shown as follows:

\min_{x, \, s_t(\cdot), \, y_t(\cdot)} \max_{u \in U} \ c' x + \sum_{t=1}^{T} \left[ d_t' s_t(u^t) + f_t' y_t(u^t) \right]

\text{s.t.} \quad T_t x + A_t s_t(u^t) + B_t s_{t-1}(u^{t-1}) + W_t y_t(u^t) \le h_t^0 + H_t u_t, \quad \forall u \in U, \ \forall t   (5.1)

L_t x + E_t s_t(u^t) + G_t y_t(u^t) \le m_t^0 + M_t u_t, \quad \forall u \in U, \ \forall t
where T is the total number of time stages, u_1, \ldots, u_T are uncertainties revealed over the T time stages, s_1, \ldots, s_T are adjustable state decision variables, and y_1, \ldots, y_T are adjustable control decision variables. Note that the "here-and-now" decisions x include continuous and integer variables, while the adjustable or recourse decisions involve continuous decision variables. The prime symbol ′ stands for the transpose of a generic vector. Let vector u^t = [u_1', \ldots, u_t']' be the concatenated vector of past uncertainty realizations, upon which the stage-t recourse decisions may depend. d_t and f_t are the vectors of cost coefficients corresponding to state decisions and control decisions made at stage t, respectively. The state decision variables link optimization problems of successive stages, while control decisions are only involved in the current time stage [391]. Also note that a large class of multistage ARO problems can be reformulated in this form through the introduction of additional variables and constraints [392].
Remark 5.1 Constrained robust optimal control problems of linear systems can be cast in the form of (5.1): the first family of constraints in (5.1) can describe the state dynamics of a discrete-time linear system subject to additive disturbance.
Decisions s_t(·) and y_t(·) are general control policies or mappings, enabling the recourse decisions to adapt to uncertainty realizations. However, optimizing over general mappings is computationally intractable due to the infinite-dimensional nature of the mappings or policies. To this end, the affine control policy is resorted to as a tractable approximation technique that restricts both s_t(·) and y_t(·) to be affine functions of uncertainty, at the expense of solution quality. Note that, in conventional robust optimal control, the control policies s_t(·) and y_t(·) are restricted to be affine with respect to the uncertainty or disturbance. The
key idea of the proposed multi-to-two transformation scheme is to restrict only the state decisions s_t(·) to follow an affine control policy, as shown in (5.2), while endowing the control variables y_t(·) with full adaptability:

s_t(u^t) = P_t u^t + q_t   (5.2)

where P_t and q_t are the coefficients of the affine function and must be determined before any uncertainty is realized. The control policy (5.2) is causal, because it only depends on the past uncertainty realizations u^t instead of the future ones. After plugging the control policy (5.2) into the multistage ARO problem (1.6), the resulting optimization problem is shown as follows:
\min_{x, \, P_t, \, q_t, \, y_t(\cdot)} \max_{u \in U} \ c' x + \sum_{t=1}^{T} \left[ d_t' q_t + d_t' P_t u^t + f_t' y_t(u^t) \right]   (5.3)
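Evaluating the causal affine policy (5.2) amounts to one matrix-vector product on the concatenated history; the following sketch uses illustrative dimensions and coefficient values.

```python
import numpy as np

# Causal affine decision rule (5.2): s_t depends only on past uncertainty u^t.
def affine_state_policy(P_t, q_t, u_history):
    """Evaluate s_t(u^t) = P_t u^t + q_t on the concatenated past realizations."""
    u_t = np.concatenate(u_history)   # u^t = [u_1', ..., u_t']'
    return P_t @ u_t + q_t

# Stage t = 2 with scalar uncertainties: P_2 is 1 x 2, q_2 is a scalar offset.
P2 = np.array([[0.5, -1.0]])
q2 = np.array([2.0])
s2 = affine_state_policy(P2, q2, [np.array([1.0]), np.array([3.0])])
print(s2)  # -> [-0.5]  since 0.5*1 - 1.0*3 + 2 = -0.5
```

Causality is encoded in the argument list: the policy never sees realizations beyond stage t, so P_t and q_t can be fixed before the future uncertainty unfolds.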
For ease of exposition, we present the nested formulation of the multistage ARO problem:

\min_{\hat{x} \in \Omega_0} f_0(\hat{x}) + \max_{u_1 \in U_1} \min_{y_1 \in \Omega_1(\hat{x}, u^1)} f_1(y_1) + \cdots + \max_{u_T \in U_T} \min_{y_T \in \Omega_T(\hat{x}, u^T)} f_T(y_T)   (5.4)

where \hat{x} = (x, P_t, q_t) is the aggregated "here-and-now" decision, set \Omega_0 represents its feasible region, and set \Omega_t(\hat{x}, u^t) is the feasible region of the adjustable control decisions at stage t:

\Omega_t(\hat{x}, u^t) = \left\{ y_t : \begin{array}{l} W_t y_t \le h_t^0 + H_t u_t - T_t x - A_t \left( P_t u^t + q_t \right) - B_t \left( P_{t-1} u^{t-1} + q_{t-1} \right) \\ G_t y_t \le m_t^0 + M_t u_t - L_t x - E_t \left( P_t u^t + q_t \right) \end{array} \right\}   (5.5)
The objective functions in the nested multistage ARO formulation (5.4) at different time stages are

f_0(\hat{x}) = c' x + \sum_{t=1}^{T} d_t' q_t, \qquad f_t(y_t) = f_t' y_t + d_t' P_t u^t, \quad t = 1, \ldots, T   (5.6)

The uncertainty set U in the MARMILP can be treated as a "joint" uncertainty set. In this sense, the stage-wise uncertainty set is

U_t = \text{Proj}_{u_t} U(u_1, \ldots, u_{t-1})   (5.7)

where U_t is defined as the projection of uncertainty set U onto u_t given the values of u_1 to u_{t-1}.
The proposed transformation scheme converts (1.6) into problem (5.4). The following theorem proves that the multistage ARO problem (5.4) is equivalent to a two-stage ARO problem; the multistage ARO problem is thereby reduced to a two-stage one.

Theorem. By applying the affine control policy (5.2) to the state decisions, the multistage ARO problem (1.6) is transformed into the two-stage ARO problem given below.

\min_{\hat{x} \in \Omega_0} \ c' x + \sum_{t=1}^{T} d_t' q_t + \max_{u \in U} \ \min_{\{ y_t \in \Omega_t(\hat{x}, u^t), \forall t \}} \ \sum_{t=1}^{T} \left( d_t' P_t u^t + f_t' y_t \right)   (5.8)
Proof. Since the multistage ARO problem (1.6) is reformulated as (5.4) by applying the affine control policy in (5.2), we only need to establish the equivalence between optimization problems (5.4) and (5.8). Considering the max-min optimization problems at stages T-1 and T, we have

\max_{u_{T-1} \in U_{T-1}} \min_{y_{T-1} \in \Omega_{T-1}(\hat{x}, u^{T-1})} \left[ f_{T-1}(y_{T-1}) + \max_{u_T \in U_T} \min_{y_T \in \Omega_T(\hat{x}, u^T)} f_T(y_T) \right]

= \max_{u_{T-1} \in U_{T-1}} \left[ \min_{y_{T-1} \in \Omega_{T-1}(\hat{x}, u^{T-1})} f_{T-1}(y_{T-1}) + \max_{u_T \in U_T} \min_{y_T \in \Omega_T(\hat{x}, u^T)} f_T(y_T) \right]   (5.9)

= \max_{u_{T-1} \in U_{T-1}} \max_{u_T \in U_T} \left[ \min_{y_{T-1} \in \Omega_{T-1}(\hat{x}, u^{T-1})} f_{T-1}(y_{T-1}) + \min_{y_T \in \Omega_T(\hat{x}, u^T)} f_T(y_T) \right]

= \max_{(u_{T-1}, u_T) \in \text{Proj}_{u_{T-1}, u_T} U(u_1, \ldots, u_{T-2})} \ \sum_{t=T-1}^{T} \min_{y_t \in \Omega_t(\hat{x}, u^t)} f_t(y_t)

The first equality in (5.9) is based on the fact that the optimization problem at t = T does not involve control decisions at stage T-1. The second equality in (5.9) is valid because the feasible region of y_{T-1} and f_{T-1}(y_{T-1}) do not depend on u_T. The above derivation can be performed backward until t = 1, and as a result, the nested formulation collapses.
\min_{\hat{x} \in \Omega_0} f_0(\hat{x}) + \max_{(u_1, \ldots, u_T) \in \text{Proj}_{u_1, \ldots, u_T} U} \ \sum_{t=1}^{T} \min_{y_t \in \Omega_t(\hat{x}, u^t)} f_t(y_t)

= \min_{\hat{x} \in \Omega_0} f_0(\hat{x}) + \max_{u \in U} \ \sum_{t=1}^{T} \min_{y_t \in \Omega_t(\hat{x}, u^t)} f_t(y_t)   (5.10)

= \min_{\hat{x} \in \Omega_0} f_0(\hat{x}) + \max_{u \in U} \ \min_{\{ y_t \in \Omega_t(\hat{x}, u^t), \forall t \}} \ \sum_{t=1}^{T} f_t(y_t)

The first equality in (5.10) is due to the definition of projection. The second equality is valid because the inner minimization problem can be decoupled by stage, given \hat{x} and u. According to (5.6) and (5.10), the multistage ARO problem (5.4) is equivalent to the two-stage ARO problem (5.8). This completes the proof. □
Remark 5.2 Following a similar procedure, we can readily prove that such a transformation scheme is still valid if the adjustable state decision variables follow other
types of causal control policies. The proposed scheme is general enough to embrace
more advanced control policies, such as piecewise affine and polynomial control
policies.
Remark 5.3 One highlight of the proposed transformation scheme lies in its capability of being employed in conjunction with existing two-stage ARO solution algorithms.

In this section, a proximal bundle method is first adopted for solving the resulting two-stage ARO problem (5.8). We then propose an algorithmic framework for the solution of MARMILPs that integrates the transformation scheme with the proximal bundle method, and present the convergence analysis of the overall algorithm.
The proximal bundle algorithm has proved to be an efficient solution method in various areas of nonsmooth optimization, such as stochastic programming [399]. In the following, we present the proximal bundle method for the two-stage problem. We first define the worst-case recourse function as shown in (5.11).

Q(\hat{x}) = \max_{u \in U} \ \min_{\{ y_t \in \Omega_t(\hat{x}, u^t), \forall t \}} \ \sum_{t=1}^{T} \left( d_t' P_t u^t + f_t' y_t \right)   (5.11)

Note that evaluating Q(\hat{x}) entails solving a max-min optimization problem. Based on the definition of the worst-case recourse function, the two-stage ARO problem (5.8) can be stated compactly as minimizing the function F(\hat{x}) over \hat{x} \in \Omega_0, with F given in (5.12).

F(\hat{x}) = c' x + \sum_{t=1}^{T} d_t' q_t + Q(\hat{x})   (5.12)
Due to the multi-level optimization structure, the objective function F(\hat{x}) does not admit an analytical expression and is in general non-smooth; bundle methods are well suited for addressing this type of optimization setting [400]. In the proximal bundle method, bundle information includes the past query points \hat{x}^l (l = 1, \ldots, k), their corresponding function values F(\hat{x}^l), and sub-gradients of function F at these query points. We need to solve the max-min optimization problem in (5.11) to obtain the function value and a sub-gradient at a query point. To this end, the two-level max-min problem is recast as a single-level problem by replacing the inner minimization problem in (5.11) with its KKT conditions [401]. The resulting problem is given below.

\max_{u \in U, \, y_t, \, \varphi_t, \, \pi_t} \ \sum_{t=1}^{T} \left( d_t' P_t u^t + f_t' y_t \right)

\text{s.t.} \quad W_t' \varphi_t + G_t' \pi_t = f_t, \quad \forall t

\pi_t \ge 0, \quad \forall t \qquad (\text{SUP})

y_t \in \Omega_t(\hat{x}, u^t), \quad \forall t

\left[ \pi_t \right]_i \left[ m_t^0 + M_t u_t - L_t x - E_t \left( P_t u^t + q_t \right) - G_t y_t \right]_i = 0, \quad \forall t, i

where \varphi_t and \pi_t are the dual variables corresponding to the constraints in (5.5) at stage t. Since the complementary slackness conditions are nonconvex, we introduce binary variables and linearize these constraints into (5.13) by using the big-M method.

\left[ \pi_t \right]_i \le M \left[ w_t \right]_i, \quad \forall t, i   (5.13)

\left[ m_t^0 + M_t u_t - L_t x - E_t \left( P_t u^t + q_t \right) - G_t y_t \right]_i \le M \left( 1 - \left[ w_t \right]_i \right), \quad \forall t, i

where w_t represents a vector of binary decision variables, and M is a large positive constant.
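The effect of the big-M linearization (5.13) can be sanity-checked on a single complementarity pair: the binary w forces at least one of the multiplier and the constraint slack to zero.

```python
# Sketch of the big-M linearization (5.13) for one complementarity pair
# pi * slack = 0 with pi, slack >= 0: a binary w selects which factor vanishes.
def bigm_feasible(pi, slack, w, M=1e3):
    """Check pi <= M*w and slack <= M*(1 - w) for binary w in {0, 1}."""
    return 0 <= pi <= M * w and 0 <= slack <= M * (1 - w)

print(bigm_feasible(5.0, 0.0, 1))   # active constraint, positive multiplier -> True
print(bigm_feasible(0.0, 7.0, 0))   # inactive constraint, zero multiplier  -> True
print(bigm_feasible(5.0, 7.0, 1))   # both strictly positive is cut off     -> False
```

Note that M must exceed any attainable multiplier or slack value; an M chosen too small would cut off valid KKT points, while an excessively large M weakens the MILP relaxation.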
With sub-gradients and function values, we build the optimality cutting plane model for the objective function F as follows.

\check{F}_k(\hat{x}) = \max_{l = 1, \ldots, k} \left\{ F(\hat{x}^l) + \left\langle g^l, \hat{x} - \hat{x}^l \right\rangle \right\}   (5.14)

where \check{F}_k(\hat{x}) is the optimality cutting plane model at the k-th iteration, and g^l is a sub-gradient of the objective function F at the l-th query point, which can be obtained using optimal dual variables in the same way as in Benders decomposition [389]. Notably, \check{F}_k(\hat{x}) is a piecewise linear lower approximation of function F. Note that the two-stage ARO problem (5.8) may not satisfy the relatively complete recourse assumption. Therefore, for some query point \hat{x}^l, there may exist certain uncertainty realizations that render the second-stage optimization problem infeasible. This implies F(\hat{x}^l) = +\infty, in which case one cannot generate a supporting hyperplane of function F (optimality cut), and instead obtains a cutting plane that separates \hat{x}^l from dom F (feasibility cut) [398]. To check whether \hat{x}^l \in \text{dom} F or not, the following Feasibility Problem (FP) is solved.
$$
\begin{aligned}
\max_{u \in U} \; \min_{y_t, \alpha_t^{+}, \alpha_t^{-}, \beta_t} \quad & \sum_{t=1}^{T} \left( \mathbf{1}^{\mathrm{T}} \alpha_t^{+} + \mathbf{1}^{\mathrm{T}} \alpha_t^{-} + \mathbf{1}^{\mathrm{T}} \beta_t \right) \\
\text{s.t.} \quad & A_t \left( P_t u_t + q_t \right) + B_t \left( P_{t-1} u_{t-1} + q_{t-1} \right) + W_t y_t + \alpha_t^{+} - \alpha_t^{-} = h_t^0 + H_t u_t + T_t \hat{x}, \; \forall t \\
& E_t \left( P_t u_t + q_t \right) + G_t y_t + \beta_t \ge m_t^0 + M_t u_t + L_t \hat{x}, \; \forall t \\
& \alpha_t^{+}, \alpha_t^{-}, \beta_t \ge 0, \; \forall t
\end{aligned} \tag{FP}
$$
where α_t^+, α_t^-, and β_t are slack variables, and 1 denotes the all-ones vector of appropriate dimension. Problem (FP) can be recast from its max-min form into a single-level optimization problem using the KKT conditions, and the resulting complementary slackness constraints can again be linearized using the big-M method [402]. Let θ(x̂^l) denote the optimal value of problem (FP) associated with a query point x̂^l. If θ(x̂^l) = 0, there exist feasible second-stage decisions for all uncertainty realizations in uncertainty set U. Thus, we have x̂^l ∈ dom F and only need optimality cuts. If θ(x̂^l) > 0, the worst-case uncertainty realization can lead to the nonexistence of feasible recourse decisions.
In the proximal bundle method, the cutting plane model is augmented with a quadratic regularization term, as shown in (5.15).

$$G(\hat{x}) = F_k(\hat{x}) + \frac{1}{2 t_k} \left\| \hat{x} - z^k \right\|^2 \tag{5.15}$$

where z^k is the stability center at the k-th iteration and t_k is a positive proximal parameter [394, 400]. Note that the stability center represents the best current iterate. The proximal bundle method uses the regularization term to make sure that the next iterate is not far away from the stability center. In the proximal bundle algorithm, we iteratively refine the cutting plane models by adding new query points on the fly. The optimal solution of the following master problem (MP) provides the next query point.
$$
\begin{aligned}
\min_{\hat{x} \ge 0,\, \eta} \quad & \eta + \frac{1}{2 t_k} \left\| \hat{x} - z^k \right\|^2 \\
\text{s.t.} \quad & \eta \ge F(\hat{x}^l) + \left\langle g^l, \hat{x} - \hat{x}^l \right\rangle, \; \forall l \in L_o \\
& 0 \ge \theta(\hat{x}^l) + \left\langle g_f^l, \hat{x} - \hat{x}^l \right\rangle, \; \forall l \in L_f
\end{aligned} \tag{MP}
$$

where η is an auxiliary variable, and L_o and L_f denote the index sets of optimality and feasibility cuts, respectively. The constraint 0 ≥ θ(x̂^l) + ⟨g_f^l, x̂ - x̂^l⟩ is a feasibility cut, in which g_f^l is a subgradient of θ at x̂^l. Besides the cuts derived in the dual space, optimality cuts in the primal space can be added as well [103].
In the proximal bundle method, the expected decrease δ_k defined in (5.16) is used to determine whether to update the stability center or remain at the current stability center. The expected decrease is also used to check the stopping criterion of the algorithm.

$$\delta_k = F(z^k) - F_k(\hat{x}^{k+1}) - \frac{1}{2 t_k} \left\| \hat{x}^{k+1} - z^k \right\|^2 \tag{5.16}$$

The proximal bundle method updates the stability center only when the objective is sufficiently decreased, i.e. F(x̂^{k+1}) ≤ F(z^k) - m δ_k, where m ∈ (0, 1) is a fixed parameter.
The proximal bundle method is adapted to the two-stage ARO problem by reformulating the max-min subproblems into single-level problems. The cutting-plane models are refined gradually in each iteration, and the stability center is guaranteed to converge to the optimal solution. The overall framework of the proposed algorithm for multistage ARO problems is shown in Figure 25. The proposed algorithmic framework is comprised of two primary blocks connected in series. The first block is the multi-to-two transformation step to convert the multistage ARO problem into a two-stage ARO problem. The second block is the proximal bundle method, which is employed to address the resulting two-stage ARO problem. The proposed algorithm iteratively solves a master problem, a feasibility problem, and a subproblem, until the expected decrease δ_k falls below a prescribed tolerance δ_tol.
Algorithm. Transformation-proximal bundle algorithm
1: Step 1 (Initialization)
2:   Set k ← 0, flag ← 0, and choose m, t_k, and δ_tol;
3: Step 2 (Transformation step)
4:   Substitute adjustable state decisions with the affine control policy in (2);
5: While flag = 0
6:   Step 3 (Master problem)
7:     Solve master problem (MP) to obtain x̂^{k+1} and η^{k+1};
8:     Update δ_k ← F(z^k) − F_k(x̂^{k+1}) − (1 / (2 t_k)) ‖x̂^{k+1} − z^k‖²;
       ⋮
24:     Update F_{k+1}(x̂) ← max{ F_{k+1}(x̂), F(x̂^{k+1}) + ⟨g^{k+1}, x̂ − x̂^{k+1}⟩ };
25:   else
26:     Update the feasibility cutting plane model;
27:   end
28:   k ← k + 1;
29: end
30: Return z^k and F(z^k);
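The main loop above can be sketched on a toy one-dimensional convex function. The grid-based master solve, the test function, and all parameter values below are illustrative simplifications rather than the reformulated (MP) of this chapter; feasibility cuts are omitted.

```python
# A minimal 1-D sketch of the proximal bundle iteration: solve the regularized
# master problem, compute the expected decrease delta_k of (5.16), and take a
# serious step only if the actual decrease is at least m * delta_k.

def F(x):
    return abs(x - 1.0) + 0.5 * x * x      # convex, minimized at x = 1

def subgrad(x):
    return (1.0 if x > 1.0 else -1.0) + x  # one subgradient of F at x

m, t_k = 0.1, 1.0
z = 5.0                                    # stability center (initial point)
cuts = [(z, F(z), subgrad(z))]             # bundle: (point, value, subgradient)

def master(z, cuts, t):
    """Grid-search the 1-D master problem  min F_k(x) + |x - z|^2 / (2 t)."""
    grid = [i / 100.0 - 10.0 for i in range(2001)]     # [-10, 10]
    def obj(x):
        Fk = max(f + g * (x - xl) for (xl, f, g) in cuts)
        return Fk + (x - z) ** 2 / (2.0 * t)
    return min(grid, key=obj)

for _ in range(15):
    x_new = master(z, cuts, t_k)
    Fk_new = max(f + g * (x_new - xl) for (xl, f, g) in cuts)
    delta = F(z) - Fk_new - (x_new - z) ** 2 / (2.0 * t_k)  # expected decrease
    if F(x_new) <= F(z) - m * delta:       # sufficient decrease: serious step
        z = x_new
    cuts.append((x_new, F(x_new), subgrad(x_new)))          # refine the model

assert abs(z - 1.0) < 0.05    # stability center approaches the minimizer
```

In the dissertation's algorithm, evaluating F at x_new corresponds to the oracle call to the reformulated (SUP), and the model refinement corresponds to Line 24.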
5.3.2 Convergence analysis

Proposition 5.1 F(x̂) is a convex function.

Proof. Based on (5.12), we can see that F(x̂) is the sum of a linear function in x̂ and the worst-case recourse function Q(x̂), which can be rewritten as (5.17).

$$
\begin{aligned}
Q(\hat{x}) &= \max_{u \in U} \left\{ \sum_{t=1}^{T} d_t^{\mathrm{T}} P_t u_t + \min_{y_t \in Y_t(\hat{x}, u_t),\, \forall t} \sum_{t=1}^{T} f_t^{\mathrm{T}} y_t \right\} \\
&= \max_{u \in U} \left\{ \sum_{t=1}^{T} d_t^{\mathrm{T}} P_t u_t + R(\hat{x}, u) \right\}
\end{aligned} \tag{5.17}
$$

For fixed u, let y_{1t}* be the optimal solution for the minimization problem involved in R(x̂_1, u), and y_{2t}* be the optimal solution for R(x̂_2, u). For any λ ∈ [0, 1], we have (5.18).

$$
\begin{aligned}
R\left( \lambda \hat{x}_1 + (1 - \lambda) \hat{x}_2, u \right) &\le \sum_{t=1}^{T} f_t^{\mathrm{T}} \left( \lambda y_{1t}^{*} + (1 - \lambda) y_{2t}^{*} \right) \\
&= \lambda R(\hat{x}_1, u) + (1 - \lambda) R(\hat{x}_2, u)
\end{aligned} \tag{5.18}
$$

The inequality in (5.18) is based on the fact that λ y_{1t}* + (1 - λ) y_{2t}* is feasible for the minimization problem defining R(λ x̂_1 + (1 - λ) x̂_2, u). Hence R(·, u) is convex for each u, Q(x̂) is convex as a pointwise maximum of convex functions, and F(x̂) is a convex function. □
The linearization error e_l of the l-th cut at the stability center z^k is defined in (5.19).

$$e_l = F(z^k) - F(\hat{x}^l) - \left\langle g^l, z^k - \hat{x}^l \right\rangle, \; \forall l \tag{5.19}$$

Using the linearization errors, we can rewrite the cutting plane model F_k(x̂) in (5.20).

$$
\begin{aligned}
F_k(\hat{x}) &= \max_{l = 1, \dots, k} \left\{ F(z^k) - e_l - \left\langle g^l, z^k - \hat{x}^l \right\rangle + \left\langle g^l, \hat{x} - \hat{x}^l \right\rangle \right\} \\
&= F(z^k) + \max_{l = 1, \dots, k} \left\{ -e_l + \left\langle g^l, \hat{x} - z^k \right\rangle \right\}
\end{aligned} \tag{5.20}
$$
To prove the convergence of the proposed algorithm, we first present five lemmas and then establish the main convergence results. Consider the following regularized problem (ROP), whose optimal solution is the next query point x̂^{k+1}.

$$\min_{\hat{x}} \; F_k(\hat{x}) + \frac{1}{2 t_k} \left\| \hat{x} - z^k \right\|^2 \tag{ROP}$$

Lemma 5.1 The dual problem of (ROP) is given by (5.21), where Δ_k = {α ∈ ℝ_+^k : Σ_{l=1}^k α_l = 1} denotes the unit simplex.

$$\max_{\alpha \in \Delta_k} \; F(z^k) - \sum_{l=1}^{k} \alpha_l e_l - \frac{t_k}{2} \left\| \sum_{l=1}^{k} \alpha_l g^l \right\|^2 \tag{5.21}$$

Proof. Using (5.20), problem (ROP) can be written in the epigraph form (5.22).

$$
\begin{aligned}
\min_{\hat{x}, \eta} \quad & \eta + \frac{1}{2 t_k} \left\| \hat{x} - z^k \right\|^2 \\
\text{s.t.} \quad & \eta \ge F(z^k) - e_l + \left\langle g^l, \hat{x} - z^k \right\rangle, \; l = 1, \dots, k
\end{aligned} \tag{5.22}
$$

Associating multipliers α_l ≥ 0 with the constraints, the Lagrangian is

$$L = \eta + \frac{1}{2 t_k} \left\| \hat{x} - z^k \right\|^2 + \sum_{l=1}^{k} \alpha_l \left( F(z^k) - e_l + \left\langle g^l, \hat{x} - z^k \right\rangle - \eta \right)$$

The stationarity conditions are

$$\frac{\partial L}{\partial \hat{x}} = \frac{1}{t_k} \left( \hat{x} - z^k \right) + \sum_{l=1}^{k} \alpha_l g^l = 0, \qquad \frac{\partial L}{\partial \eta} = 1 - \sum_{l=1}^{k} \alpha_l = 0$$

Substituting x̂ = z^k - t_k Σ_{l=1}^k α_l g^l back into the Lagrangian yields (5.23), which is exactly the dual objective in (5.21). □

$$
\begin{aligned}
& F(z^k) - \sum_{l=1}^{k} \alpha_l e_l + \frac{t_k}{2} \left\| \sum_{l=1}^{k} \alpha_l g^l \right\|^2 - t_k \left\| \sum_{l=1}^{k} \alpha_l g^l \right\|^2 \\
&\quad = F(z^k) - \sum_{l=1}^{k} \alpha_l e_l - \frac{t_k}{2} \left\| \sum_{l=1}^{k} \alpha_l g^l \right\|^2
\end{aligned} \tag{5.23}
$$
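The strong duality between (ROP) and its dual (5.21) can be checked numerically on a toy bundle; all numbers below are illustrative.

```python
# Numeric sanity check of strong duality between the regularized master (ROP),
# written in the epigraph form (5.22), and its dual (5.21), on a 1-D bundle
# with two cuts.

F_z = 10.0                      # F(z^k)
e = [0.0, 2.0]                  # linearization errors e_l
g = [3.0, -1.0]                 # subgradients g_l
t = 0.5                         # proximal parameter t_k

# Primal: min over x of  max_l {F_z - e_l + g_l * (x - z)} + (x - z)^2 / (2 t)
def primal_obj(d):              # d = x - z
    return F_z + max(-el + gl * d for el, gl in zip(e, g)) + d * d / (2.0 * t)

primal = min(primal_obj(i / 1000.0 - 3.0) for i in range(6001))

# Dual: max over simplex weights of
#   F_z - sum_l alpha_l e_l - (t / 2) * (sum_l alpha_l g_l)^2
def dual_obj(a):                # a = weight on the first cut
    e_bar = a * e[0] + (1 - a) * e[1]
    g_bar = a * g[0] + (1 - a) * g[1]
    return F_z - e_bar - 0.5 * t * g_bar * g_bar

dual = max(dual_obj(i / 1000.0) for i in range(1001))

assert abs(primal - dual) < 1e-6   # both equal 8.75 on this toy instance
```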
Lemma 5.2 Let α* denote an optimal solution of the dual problem (5.21), and define the aggregate subgradient ĝ^k = Σ_{l=1}^k α_l* g^l and the aggregate linearization error ê_k = Σ_{l=1}^k α_l* e_l. Then, we have
(i) ĝ^k ∈ ∂F_k(x̂^{k+1});
(ii) ê_k = F(z^k) - F_k(x̂^{k+1}) - t_k ‖ĝ^k‖²;
(iii) δ_k = ê_k + (t_k / 2) ‖ĝ^k‖²;
(iv) F(x̂) ≥ F(z^k) + ⟨ĝ^k, x̂ - z^k⟩ - ê_k for all x̂.

Proof. (i) From the stationarity condition in the proof of Lemma 5.1, the optimal solution x̂^{k+1} of (ROP) satisfies (5.24).

$$\hat{g}^k = \frac{1}{t_k} \left( z^k - \hat{x}^{k+1} \right) \tag{5.24}$$

Since x̂^{k+1} is an optimal solution to (ROP), we have 0 ∈ ∂F_k(x̂^{k+1}) + (1 / t_k)(x̂^{k+1} - z^k). Therefore, based on (5.24), we arrive at (i).
(ii) By strong duality between (ROP) and (5.21), the optimal values of the two problems coincide, which gives (5.25).

$$F_k(\hat{x}^{k+1}) + \frac{1}{2 t_k} \left\| \hat{x}^{k+1} - z^k \right\|^2 = F(z^k) - \hat{e}_k - \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 \tag{5.25}$$

Since (5.24) implies (1 / (2 t_k)) ‖x̂^{k+1} - z^k‖² = (t_k / 2) ‖ĝ^k‖², we obtain (5.26), and (ii) follows.

$$F_k(\hat{x}^{k+1}) = F(z^k) - \hat{e}_k - \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 - \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 = F(z^k) - \hat{e}_k - t_k \left\| \hat{g}^k \right\|^2 \tag{5.26}$$

(iii) Substituting (5.26) into the definition of δ_k in (5.16) yields (5.27).

$$
\begin{aligned}
\delta_k &= F(z^k) - F_k(\hat{x}^{k+1}) - \frac{1}{2 t_k} \left\| \hat{x}^{k+1} - z^k \right\|^2 \\
&= F(z^k) - \left( F(z^k) - \hat{e}_k - t_k \left\| \hat{g}^k \right\|^2 \right) - \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 \\
&= \hat{e}_k + \frac{t_k}{2} \left\| \hat{g}^k \right\|^2
\end{aligned} \tag{5.27}
$$

(iv) Since F(x̂) ≥ F_k(x̂) by convexity of F (Proposition 5.1) and, based on Lemma 5.2 (i), F_k(x̂) ≥ F_k(x̂^{k+1}) + ⟨ĝ^k, x̂ - x̂^{k+1}⟩, we have the following.

$$
\begin{aligned}
F(\hat{x}) &\ge F_k(\hat{x}^{k+1}) + \left\langle \hat{g}^k, \hat{x} - \hat{x}^{k+1} \right\rangle \\
&= F(z^k) - t_k \left\| \hat{g}^k \right\|^2 - \hat{e}_k + \left\langle \hat{g}^k, \hat{x} - z^k \right\rangle + \left\langle \hat{g}^k, z^k - \hat{x}^{k+1} \right\rangle \\
&= F(z^k) + \left\langle \hat{g}^k, \hat{x} - z^k \right\rangle - \hat{e}_k
\end{aligned} \tag{5.28}
$$

The first equality is based on Lemma 5.2 (ii), and the second equality is based on equation (5.24), which gives ⟨ĝ^k, z^k - x̂^{k+1}⟩ = t_k ‖ĝ^k‖². □
Definition 5.1 (Serious Steps). For the proposed algorithm, serious steps refer to those iterations in which the stability center is updated, i.e. z^{k+1} = x̂^{k+1}. The index set of serious steps is denoted by L_s.

Lemma 5.3 Suppose F* is the optimal value of min F(x̂) and F* > -∞. Then, we have inequality (5.29).

$$\sum_{k \in L_s} \delta_k \le \frac{F(z^0) - F^*}{m} \tag{5.29}$$

Proof. For each serious step k, the sufficient decrease condition gives F(z^k) - F(x̂^{k+1}) = F(z^k) - F(z^{k+1}) ≥ m δ_k. Summing over all serious steps yields (5.30).

$$F(z^0) - F^* \ge \sum_{k \in L_s} \left( F(z^k) - F(z^{k+1}) \right) \ge m \sum_{k \in L_s} \delta_k \tag{5.30}$$

By rearranging (5.30) and noting that F* > -∞, we have (5.29), which completes the proof. □
Assumption 5.1 For an infinite number of serious steps, i.e. |L_s| = ∞, the sequence {F(z^k)}_{k∈L_s} is assumed to converge, with lim_{k∈L_s} F(z^k) = F_*. Note that here we do not assume the converged value F_* is the optimal value of the two-stage ARO problem (5.8), which is denoted by another symbol F*. Theorem 5.2 will address the relationship between these two values.
Lemma 5.4 Suppose there is an infinite number of serious steps.
(i) If Σ_{k∈L_s} t_k = ∞, then liminf_{k∈L_s} ‖ĝ^k‖ = 0.

Proof. Combining Lemma 5.2 (iii) with Lemma 5.3 gives (5.31).

$$\sum_{k \in L_s} \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 \le \sum_{k \in L_s} \delta_k \le \frac{F(z^0) - F^*}{m} \tag{5.31}$$

Since Σ_{k∈L_s} t_k = ∞, we can conclude that zero is a cluster point of {‖ĝ^k‖}_{k∈L_s}.
(ii) Let x̂* be a minimizer of F. For any serious step k, we have z^{k+1} = x̂^{k+1} = z^k - t_k ĝ^k, and hence (5.32).

$$
\begin{aligned}
\left\| \hat{x}^* - z^{k+1} \right\|^2 &= \left\| \hat{x}^* - z^k \right\|^2 + \left\| z^k - z^{k+1} \right\|^2 + 2 \left\langle \hat{x}^* - z^k, z^k - z^{k+1} \right\rangle \\
&= \left\| \hat{x}^* - z^k \right\|^2 + t_k^2 \left\| \hat{g}^k \right\|^2 + 2 t_k \left\langle \hat{x}^* - z^k, \hat{g}^k \right\rangle \\
&\le \left\| \hat{x}^* - z^k \right\|^2 + 2 t_k \left( F(\hat{x}^*) - F(z^k) + \hat{e}_k + \frac{t_k}{2} \left\| \hat{g}^k \right\|^2 \right) \\
&= \left\| \hat{x}^* - z^k \right\|^2 + 2 t_k \left( F(\hat{x}^*) - F(z^k) + \delta_k \right) \\
&\le \left\| \hat{x}^* - z^k \right\|^2 + 2 t_k \delta_k
\end{aligned} \tag{5.32}
$$

The first inequality follows from Lemma 5.2 (iv), the subsequent equality uses Lemma 5.2 (iii), and the last inequality holds because F(x̂*) ≤ F(z^k). Applying (5.32) recursively and letting c be an upper bound on t_l, we obtain (5.33), so the sequence of stability centers remains bounded.

$$\left\| \hat{x}^* - z^{k+1} \right\|^2 \le \left\| \hat{x}^* - z^0 \right\|^2 + 2 \sum_{l \in L_s,\, l \le k} t_l \delta_l \le \left\| \hat{x}^* - z^0 \right\|^2 + 2 c \sum_{l \in L_s} \delta_l \tag{5.33}$$
Definition 5.2 (Null Steps). For the proposed algorithm, null steps are those iterations in which the stability center remains unchanged, i.e. z^{k+1} = z^k.

Lemma 5.5 If there is a finite number of serious steps, i.e. |L_s| < ∞, let k_0 be the index of the last serious step, {x̂^k}_{k > k_0} be the sequence of null steps, and z^{k_0} be the stability center for all subsequent iterations. Then, for any x̂, equality (5.34) holds.
$$F(z^{k_0}) - \delta_k + \frac{1}{2 t_k} \left\| \hat{x} - \hat{x}^{k+1} \right\|^2 = F_k(\hat{x}^{k+1}) + \left\langle \hat{g}^k, \hat{x} - \hat{x}^{k+1} \right\rangle + \frac{1}{2 t_k} \left\| \hat{x} - z^{k_0} \right\|^2 \tag{5.34}$$

Proof. Starting from the left-hand side, we have (5.35).

$$
\begin{aligned}
& F(z^{k_0}) - \delta_k + \frac{1}{2 t_k} \left\| \hat{x} - \hat{x}^{k+1} \right\|^2 \\
&\quad = F_k(\hat{x}^{k+1}) + \frac{1}{2 t_k} \left( \left\| \hat{x} - \hat{x}^{k+1} \right\|^2 + \left\| \hat{x}^{k+1} - z^{k_0} \right\|^2 \right) \\
&\quad = F_k(\hat{x}^{k+1}) + \frac{1}{2 t_k} \left( \left\| \hat{x} - z^{k_0} \right\|^2 - 2 \left\langle \hat{x} - \hat{x}^{k+1}, \hat{x}^{k+1} - z^{k_0} \right\rangle \right) \\
&\quad = F_k(\hat{x}^{k+1}) + \frac{1}{2 t_k} \left\| \hat{x} - z^{k_0} \right\|^2 + \left\langle \hat{x} - \hat{x}^{k+1}, \hat{g}^k \right\rangle
\end{aligned} \tag{5.35}
$$

The first equality is based on (5.16), the second expands ‖x̂ - z^{k_0}‖², and the third is based on equation (5.24), since x̂^{k+1} - z^{k_0} = -t_k ĝ^k. □
Lemma 5.6 For the proposed algorithm, the following equality and inequality hold.
(i) F_k(x̂^{k+1}) + ⟨ĝ^k, x̂^{k+2} - x̂^{k+1}⟩ = F(z^k) + ⟨ĝ^k, x̂^{k+2} - z^k⟩ - ê_k
(ii) F_k(x̂^{k+1}) + ⟨ĝ^k, x̂^{k+2} - x̂^{k+1}⟩ ≤ F_{k+1}(x̂^{k+2})

Proof. (i) Starting from the right-hand side of Lemma 5.6 (i), we have (5.36).

$$
\begin{aligned}
& F(z^k) + \left\langle \hat{g}^k, \hat{x}^{k+2} - z^k \right\rangle - \hat{e}_k \\
&\quad = F(z^k) + \left\langle \hat{g}^k, \hat{x}^{k+2} - \hat{x}^{k+1} \right\rangle + \left\langle \hat{g}^k, \hat{x}^{k+1} - z^k \right\rangle - \hat{e}_k \\
&\quad = F(z^k) + \left\langle \hat{g}^k, \hat{x}^{k+2} - \hat{x}^{k+1} \right\rangle - t_k \left\| \hat{g}^k \right\|^2 - \hat{e}_k \\
&\quad = F_k(\hat{x}^{k+1}) + \left\langle \hat{g}^k, \hat{x}^{k+2} - \hat{x}^{k+1} \right\rangle
\end{aligned} \tag{5.36}
$$

where the second equality holds according to (5.24), and the third equality follows from Lemma 5.2 (ii).

(ii) Based on the expressions of ĝ^k and ê_k in Lemma 5.2, we have (5.37).

$$
\begin{aligned}
& F(z^k) + \left\langle \hat{g}^k, \hat{x}^{k+2} - z^k \right\rangle - \hat{e}_k \\
&\quad = F(z^k) + \sum_{l=1}^{k} \alpha_l^* \left( -e_l + \left\langle g^l, \hat{x}^{k+2} - z^k \right\rangle \right) \\
&\quad \le F(z^k) + \max_{l = 1, \dots, k} \left\{ -e_l + \left\langle g^l, \hat{x}^{k+2} - z^k \right\rangle \right\} \\
&\quad = F_k(\hat{x}^{k+2}) \le F_{k+1}(\hat{x}^{k+2})
\end{aligned} \tag{5.37}
$$

The first inequality is based on the fact that α* ∈ ℝ_+^k and Σ_{l=1}^k α_l* = 1, the second equality is based on (5.20), and the last inequality holds because the cuts in F_k are retained in F_{k+1}. Based on Lemma 5.6 (i) and (5.37), we have Lemma 5.6 (ii). □
Proposition 5.2 If there is a finite number of serious steps, let k_0 be the index of the last serious step, {x̂^k}_{k > k_0} be the sequence of null steps, and z^{k_0} be the stability center generated at the last serious step. Then δ_k → 0.

Proof. Applying Lemma 5.5 with x̂ = x̂^{k+2}, and then Lemma 5.6, we have (5.38).

$$
\begin{aligned}
& F(z^{k_0}) - \delta_k + \frac{1}{2 t_k} \left\| \hat{x}^{k+2} - \hat{x}^{k+1} \right\|^2 \\
&\quad = F_k(\hat{x}^{k+1}) + \left\langle \hat{g}^k, \hat{x}^{k+2} - \hat{x}^{k+1} \right\rangle + \frac{1}{2 t_k} \left\| \hat{x}^{k+2} - z^{k_0} \right\|^2 \\
&\quad \le F_{k+1}(\hat{x}^{k+2}) + \frac{1}{2 t_k} \left\| \hat{x}^{k+2} - z^{k_0} \right\|^2 \\
&\quad \le F_{k+1}(\hat{x}^{k+2}) + \frac{1}{2 t_{k+1}} \left\| \hat{x}^{k+2} - z^{k_0} \right\|^2 \\
&\quad = F(z^{k_0}) - \delta_{k+1}
\end{aligned} \tag{5.38}
$$

where the first equality is based on Lemma 5.5, the first inequality is according to Lemma 5.6, the second inequality is valid because t_k is nonincreasing, and the last equality is based on (5.16). By rearranging (5.38), we have δ_k - δ_{k+1} ≥ (1 / (2 t_k)) ‖x̂^{k+2} - x̂^{k+1}‖², so the sequence {δ_k} is nonincreasing.

Using Lemma 5.5 one more time with x̂ = z^{k_0}, we have (5.39).

$$
\begin{aligned}
F(z^{k_0}) - \delta_k + \frac{1}{2 t_k} \left\| z^{k_0} - \hat{x}^{k+1} \right\|^2 &= F_k(\hat{x}^{k+1}) + \left\langle \hat{g}^k, z^{k_0} - \hat{x}^{k+1} \right\rangle \\
&\le F_k(z^{k_0}) \le F(z^{k_0})
\end{aligned} \tag{5.39}
$$

Therefore, we have ‖z^{k_0} - x̂^{k+1}‖² ≤ 2 δ_k t_k ≤ 2 δ_{k_0} t_{k_0} due to the fact that δ_k is decreasing and t_k is nonincreasing. Thus, {x̂^k} is bounded. Since the serious steps fail for any steps beyond k_0, we have m δ_k > F(z^{k_0}) - F(x̂^{k+1}). Based on (5.16), we have δ_k ≤ F(z^{k_0}) - F_k(x̂^{k+1}). Combining the two inequalities yields (5.40), where Λ denotes a common Lipschitz constant of F and F_k on a bounded set containing {x̂^k}.

$$
\begin{aligned}
(1 - m)\, \delta_k &< F(\hat{x}^{k+1}) - F_k(\hat{x}^{k+1}) \\
&= F(\hat{x}^{k+1}) - F(\hat{x}^k) + F_k(\hat{x}^k) - F_k(\hat{x}^{k+1}) \le 2 \Lambda \left\| \hat{x}^{k+1} - \hat{x}^k \right\|
\end{aligned} \tag{5.40}
$$

The equality in (5.40) is based on the fact that F(x̂^k) = F_k(x̂^k). Therefore, we can obtain (5.41).

$$\delta_k - \delta_{k+1} \ge \frac{1}{2 t_k} \left\| \hat{x}^{k+2} - \hat{x}^{k+1} \right\|^2 \ge \frac{(1 - m)^2}{8 \Lambda^2 t_k} \delta_{k+1}^2 \ge \frac{(1 - m)^2}{8 \Lambda^2 t_{k_0}} \delta_{k+1}^2 \tag{5.41}$$

Since {δ_k} is nonincreasing and bounded below by zero, the left-hand side of (5.41) converges to zero. Thus, δ_k → 0. □
Theorem 5.1 For δ_tol = 0, the transformation-proximal bundle algorithm asymptotically converges to a globally optimal solution of the two-stage ARO problem (5.8); for δ_tol > 0, the algorithm terminates after a finite number of iterations.

Proof. For δ_tol = 0, the transformation-proximal bundle algorithm loops forever. There are two exclusive scenarios: (1) the algorithm implements an infinite number of serious steps; (2) after a finite number of serious steps, the algorithm implements only null steps. In both scenarios, δ_k → 0 along a subsequence, so ‖ĝ^k‖ → 0 and ê_k → 0 according to Lemma 5.2 (iii). Thus, the algorithm still converges to a globally optimal solution of (5.8) asymptotically. For δ_tol > 0, suppose the algorithm does not terminate. Then the stopping condition is never met, and the algorithm loops forever; by the preceding argument, δ_k eventually falls below δ_tol, a contradiction. Therefore, the algorithm terminates finitely. □

Remark 5.4 Since e_l defined in (5.19) is the linearization error of a convex function, both e_l and ê_k = Σ_{l=1}^k α_l* e_l (α_l* ≥ 0) are nonnegative. According to Lemma 5.2 (iii), we have δ_k = ê_k + (t_k / 2) ‖ĝ^k‖² ≥ 0, so the expected decrease is always a well-defined stopping measure.
5.4 The lower bounding technique
In this section, we devise a lower bounding technique, which serves to assess the solution quality of multistage ARO solution algorithms. Both the affine control policy and the proposed transformation-proximal bundle algorithm are approaches for solving computationally intractable MARMILPs, and they both yield upper bounds on the optimal value of the original multistage robust optimization problem. To measure the loss of optimality, we leverage the proposed solution algorithm developed in the previous section in conjunction with the scenario-tree based method.

There are in general two types of lower bounds, namely a priori bounds and a posteriori bounds. A priori lower bounding methods evaluate the worst-case bound for any problem instance of MARMILPs. However, this type of lower bound might be too pessimistic. We therefore focus on a posteriori techniques, which can provide a lower bound for the optimal value of a specific MARMILP instance. A posteriori results fit our purpose of assessing and comparing the loss of optimality of different control policies. The scenario-tree based method replaces the uncertainty set in MARMILPs with a finite number of uncertainty scenarios. The resulting scenario-tree problem yields a lower bound, because it is a relaxation of the original MARMILP.

It is worth noting that the quality of lower bounds depends heavily on the choice of the scenario set. Motivated by this observation, we resort to the uncertainty scenarios identified by the proposed solution algorithm itself. The algorithm solves the subproblem (SUP) and the feasibility problem (FP) during the oracle calling. This yields optimality or feasibility cuts, which are then fed back to the master problem in each iteration. When the proposed solution algorithm converges, the scenario set can be obtained by collecting the identified worst-case uncertainty realizations. The scenario-tree counterpart of the MARMILP (STMARMILP) is formulated as follows.
$$
\begin{aligned}
\min_{x, s_t(\cdot), y_t(\cdot)} \; \max_{u \in \bar{U}} \quad & c^{\mathrm{T}} x + \sum_{t=1}^{T} \left( d_t^{\mathrm{T}} s_t(u^t) + f_t^{\mathrm{T}} y_t(u^t) \right) \\
\text{s.t.} \quad & A_t s_t(u^t) + B_t s_{t-1}(u^{t-1}) + W_t y_t(u^t) = h_t^0 + H_t u_t + T_t x, \; \forall u \in \bar{U}, \forall t \\
& E_t s_t(u^t) + G_t y_t(u^t) \ge m_t^0 + M_t u_t + L_t x, \; \forall u \in \bar{U}, \forall t \\
& u^{t(i)} = u^{t(j)} \Rightarrow s_t(u^{t(i)}) = s_t(u^{t(j)}), \; y_t(u^{t(i)}) = y_t(u^{t(j)}), \; \forall i, j, \forall t \\
& \bar{U} = \left\{ u^{(1)}, \dots, u^{(N)} \right\}
\end{aligned} \tag{5.42}
$$
where u^(i) is an element of the scenario set Ū and N denotes the total number of uncertainty scenarios. Note that additional constraints are introduced to model the non-anticipativity of the recourse decisions. To be specific, if the trajectories of two uncertainty scenarios are the same up to stage t, the corresponding recourse decisions at stage t must coincide. The scenario-tree problem can be equivalently written as the deterministic optimization problem in (5.43).
$$
\begin{aligned}
\min_{x, s_t, y_t, \eta} \quad & c^{\mathrm{T}} x + \eta \\
\text{s.t.} \quad & \eta \ge \sum_{t=1}^{T} \left( d_t^{\mathrm{T}} s_t(u^{t(i)}) + f_t^{\mathrm{T}} y_t(u^{t(i)}) \right), \; i = 1, \dots, N \\
& A_t s_t(u^{t(i)}) + B_t s_{t-1}(u^{(t-1)(i)}) + W_t y_t(u^{t(i)}) = h_t^0 + H_t u_t^{(i)} + T_t x, \; i = 1, \dots, N, \forall t \\
& E_t s_t(u^{t(i)}) + G_t y_t(u^{t(i)}) \ge m_t^0 + M_t u_t^{(i)} + L_t x, \; i = 1, \dots, N, \forall t \\
& u^{t(i)} = u^{t(j)} \Rightarrow s_t(u^{t(i)}) = s_t(u^{t(j)}), \; y_t(u^{t(i)}) = y_t(u^{t(j)}), \; \forall i, j, \forall t
\end{aligned} \tag{5.43}
$$
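The non-anticipativity constraints in (5.43) amount to identifying scenarios that share a history; a minimal sketch of this bookkeeping, with illustrative scenario data, is as follows.

```python
# Sketch of the non-anticipativity bookkeeping in (5.43): at each stage t,
# scenarios whose uncertainty trajectories agree up to t must share the same
# recourse decision, so one decision variable is created per distinct prefix.

from collections import defaultdict

def nonanticipativity_groups(scenarios, T):
    """Map each stage t to groups of scenario indices sharing a length-t prefix."""
    groups = {}
    for t in range(1, T + 1):
        by_prefix = defaultdict(list)
        for i, scen in enumerate(scenarios):
            by_prefix[tuple(scen[:t])].append(i)
        groups[t] = list(by_prefix.values())
    return groups

# Three 3-stage demand scenarios; the first two agree in stage 1.
scenarios = [(10, 20, 30), (10, 25, 35), (12, 20, 30)]
groups = nonanticipativity_groups(scenarios, T=3)

assert groups[1] == [[0, 1], [2]]        # scenarios 0 and 1 share stage-1 history
assert groups[2] == [[0], [1], [2]]      # all prefixes differ from stage 2 on
assert sum(len(g) for g in groups[3]) == len(scenarios)
```

In an MILP implementation, each group would receive a single copy of s_t and y_t, which enforces the implication constraints of (5.43) by construction.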
The above scenario-based problem constitutes an MILP, which can be solved efficiently by state-of-the-art optimization solvers like CPLEX and GUROBI. In this sense, obtaining the lower bound is computationally cheap once the critical uncertainty realizations are identified through the proposed solution algorithm. We quantitatively assess the solution quality of different algorithms using the relative optimality gap defined by (UB - LB) / (0.5 (UB + LB)), where UB denotes the upper bound, and LB represents the lower bound obtained via the STMARMILP. Note that this gap is an indication of solution quality: a small gap implies a near-optimal solution, while a large gap suggests a significant loss of optimality. Before closing this section, we summarize the relationships among these bounds in Theorem 5.3.
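The relative optimality gap can be computed directly from the two bounds; the function name below is an illustrative choice.

```python
# The relative optimality gap used to compare control policies, as defined in
# the text: (UB - LB) / (0.5 * (UB + LB)).

def relative_gap(ub: float, lb: float) -> float:
    """Symmetric relative gap between an upper and a lower bound."""
    return (ub - lb) / (0.5 * (ub + lb))

# A policy whose upper bound is 10% above the lower bound:
gap = relative_gap(110.0, 100.0)
assert abs(gap - 10.0 / 105.0) < 1e-12
```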
Theorem 5.3 For any specific problem instance of MARMILPs, the inequalities ν_S ≤ ν* ≤ ν_TPB ≤ ν_ADR hold, where ν_S, ν*, ν_TPB, and ν_ADR represent the optimal values of the STMARMILP, the MARMILP, the transformation-proximal bundle algorithm, and the affine decision rule method, respectively.

Proof. Since the scenario set is a subset of the uncertainty set (Ū ⊆ U), the scenario-tree problem is a relaxation of the original MARMILP and thus provides a lower bound for the original multistage ARO problem (ν_S ≤ ν*).
In the original MARMILP, the recourse decisions are general functions of uncertainty. In both the affine control policy and the proposed transformation-proximal bundle algorithm, all or some of the recourse variables are restricted to a fixed functional form of uncertainty realizations, thus providing upper bounds on the optimal value of the original multistage ARO problem (ν* ≤ ν_ADR and ν* ≤ ν_TPB). Additionally, any feasible solution of the affinely adjustable robust counterpart is also feasible for the TARMILP, because in the TARMILP only the state decisions are restricted to the affine control policy while the control decisions remain fully adjustable. Hence ν_TPB ≤ ν_ADR, which completes the proof. □
Remark 5.6 The proposed algorithm provides an upper bound on ν* for the following reasons. First, the upper bound is based on F(z^k), instead of the lower piecewise linear approximation F_k(z^k). Second, although we add the feasibility cuts on-the-fly into problem (MP) to obtain a candidate solution x̂^k, this candidate solution must be feasible in order to become a stability center. This is because if it is not feasible, i.e. its corresponding objective value of (FP) satisfies θ(x̂^k) > 0, then F(x̂^k) = +∞ and the condition for the serious step (Line 17 of the algorithm pseudocode) cannot be met.
5.5 Applications
In this section, we present applications to demonstrate the comparison between the affine control policy [106], the piecewise affine control policy [396, 405], and the proposed approach in terms of solution quality and computational efficiency. All optimization problems are solved with CPLEX 12.8.0, implemented on a computer with an Intel (R) Core (TM) i7-6700 CPU @ 3.40 GHz and 32 GB RAM. The same optimality tolerance for CPLEX 12.8.0 is used across all experiments.

Effective inventory management is essential for reducing operational costs and boosting profits. Due to the market fluctuations, customer demands are inevitably uncertain over the entire time horizon. In this application, we consider a single-item multiperiod
robust optimal inventory control problem under demand uncertainty [381, 409, 410]. In
such a problem, a decision maker needs to serve customer demand as far as possible at
a minimum cost. There are two types of orders, standard orders and express orders, that
can be placed after knowing uncertainty realization at the beginning of each period. A
standard order of product arrives at the end of the time period, while the costlier express
orders arrive immediately. Any excess inventories are stored in a warehouse and incur
the holding cost. If customer demands are backlogged, the backlog cost should be paid.
The robust finite-horizon optimal inventory control problem under demand uncertainty is shown as follows. The objective is to minimize the total cost, which is given in (5.45). The total cost includes ordering, holding, and backlog costs incurred over all time periods. The constraints can be classified into inventory dynamics constraints (5.46) and nonnegativity constraints (5.47)-(5.48).

$$\min_{x_t(\cdot), y_t(\cdot), I_t(\cdot)} \; \max_{\xi \in U} \; \sum_{t=1}^{T} \left( c_1 x_t(\xi^t) + c_2 y_t(\xi^t) + c_H \left[ I_t(\xi^t) \right]^{+} + c_B \left[ -I_t(\xi^t) \right]^{+} \right) \tag{5.45}$$

$$\text{s.t.} \quad I_t(\xi^t) = I_{t-1}(\xi^{t-1}) + x_{t-1}(\xi^{t-1}) + y_t(\xi^t) - \xi_t, \; \forall \xi \in U, \forall t \tag{5.46}$$

$$x_t(\xi^t) \ge 0, \; \forall \xi \in U, \forall t \tag{5.47}$$

$$y_t(\xi^t) \ge 0, \; \forall \xi \in U, \forall t \tag{5.48}$$

$$I_t(\xi^t),\, x_t(\xi^t),\, y_t(\xi^t) \in \mathbb{R}, \; \forall t \tag{5.49}$$
$$U = \left\{ \xi : l_t \le \xi_t \le u_t, \; \forall t, \; \sum_{t=1}^{T} \xi_t \le \frac{\xi^{\max}\, T}{2} \right\} \tag{5.50}$$
where x_t is a decision variable for the standard order of the product at the beginning of time period t, y_t denotes a decision on the express order of the product at the beginning of time period t, and I_t is the inventory level at time period t. Moreover, ξ_t denotes the uncertain demand at time period t, and ξ^t = [ξ_1′, …, ξ_t′]′ represents the uncertainty realizations available up to time period t. T denotes the total length of the time horizon. c_1 and c_2 represent the unit costs of standard and express orders, respectively. c_H and c_B are the unit holding and unit backlogging costs, respectively. In the uncertainty set, the lower and upper bounds of the uncertain product demand are denoted by l_t and u_t, respectively. Constant ξ^max represents the highest possible level of product demand for each time period. The operator [·]+ in (5.45) represents max(·, 0) and can be handled by introducing auxiliary variables η_t^H and η_t^B for the holding and backlog terms. After this reformulation, the robust optimal inventory control problem assumes the same formulation as the multistage ARO problem. As a result, Î_t is a state decision variable, while x_t, y_t, η_t^H and η_t^B are control decision variables.
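The objective (5.45) and the dynamics (5.46) can be made concrete for fixed order decisions along a single demand path. The following sketch uses illustrative numbers and is a cost evaluator, not a solution method; a standard order placed in period t becomes available in period t + 1, consistent with (5.46).

```python
# A worked sketch of the inventory objective (5.45) under the dynamics (5.46):
# a standard order x_t arrives one period later, an express order y_t arrives
# immediately, and [.]+ = max(., 0) splits the inventory level into holding
# and backlog parts.

def total_cost(x, y, demand, c1, c2, cH, cB, I0=0.0):
    """Ordering + holding + backlog cost over the horizon for fixed decisions."""
    cost, I = 0.0, I0
    for t in range(len(demand)):
        x_prev = x[t - 1] if t > 0 else 0.0      # standard order from period t-1
        I = I + x_prev + y[t] - demand[t]        # inventory balance (5.46)
        holding = max(I, 0.0)                    # [I_t]+
        backlog = max(-I, 0.0)                   # [-I_t]+
        cost += c1 * x[t] + c2 * y[t] + cH * holding + cB * backlog
    return cost

demand = [40.0, 60.0]
x = [50.0, 0.0]          # standard orders (a period-2 one would arrive too late)
y = [40.0, 10.0]         # express orders cover what standard orders cannot
cost = total_cost(x, y, demand, c1=2.0, c2=6.0, cH=1.0, cB=8.0)

# t=1: I = 0 + 0 + 40 - 40 = 0;  t=2: I = 0 + 50 + 10 - 60 = 0
assert cost == 2.0 * 50.0 + 6.0 * 50.0   # ordering cost only, no holding/backlog
```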
We randomly generate problem instances to compare the performance of different control strategies. The number of time periods T is set to 5. The initial inventory of the product is assumed to be zero. The unit costs for standard order, express order, backlog, and holding are chosen randomly following the uniform distributions: c1 ~ Unif(0, 5), c2 ~ Unif(5, 10), cB ~ Unif(0, 10), and cH ~ Unif(0, 5). Lower and upper bounds of the product demand are generated according to the following distributions: lt ~ Unif(0, 15) and ut ~ Unif(75, 100). Note that the notation Unif denotes the uniform distribution.
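One draw from this instance-generation scheme can be sketched as follows; the seed and the dictionary layout are illustrative assumptions, the distributions are those stated above.

```python
# Sketch of drawing one random problem instance, following the distributions
# stated in the text: c1 ~ Unif(0, 5), c2 ~ Unif(5, 10), cB ~ Unif(0, 10),
# cH ~ Unif(0, 5), lt ~ Unif(0, 15), ut ~ Unif(75, 100).

import random

def draw_instance(T: int, seed: int = 0) -> dict:
    rng = random.Random(seed)
    return {
        "c1": rng.uniform(0.0, 5.0),          # standard-order unit cost
        "c2": rng.uniform(5.0, 10.0),         # express-order unit cost
        "cB": rng.uniform(0.0, 10.0),         # unit backlog cost
        "cH": rng.uniform(0.0, 5.0),          # unit holding cost
        "l": [rng.uniform(0.0, 15.0) for _ in range(T)],   # demand lower bounds
        "u": [rng.uniform(75.0, 100.0) for _ in range(T)], # demand upper bounds
    }

inst = draw_instance(T=5, seed=42)
assert 5.0 <= inst["c2"] <= 10.0
assert all(l < u for l, u in zip(inst["l"], inst["u"]))  # the demand box is nonempty
```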
The computational results are summarized in Appendix A. For each problem instance, the relative gap is calculated as (UB - LB) / (0.5 (UB + LB)). Note that LB is the lower bound
obtained using the proposed scenario-tree-based lower bounding technique, so it is the
same for different control policies in a specific instance. Accordingly, a large value of
the relative gap implies a high value of UB, which means a large loss of optimality
incurred by the corresponding control policy. In the application, the affine control policy
suffers from severe suboptimality. Its largest relative gap can reach as high as 53.43%,
and the average relative gap is 25.72%. By contrast, the control policy determined by the proposed algorithm outperforms the affine control policy and the piecewise affine control policy consistently across all the
problem instances. More specifically, the control policy resulting from the proposed
algorithm has a relative gap of 1.33% on average, while its highest relative gap is merely
4.27%. Additionally, it can yield near-optimal control strategies for Instances 13, 16, 17
and 21 with relative gaps below 0.02%. In terms of computational time, the robust
optimal inventory control problems using affine control policy and piecewise affine
control policy are more efficient to solve compared to the proposed approach, since they
involve solving only one linear programming problem. However, the proposed
approach solves the robust optimal inventory control problem instances within only 20.8
seconds on average. Note that the average computational times for solving the
reformulated (SUP) and (FP) are 0.41 seconds and 0.25 seconds, respectively. It is worth
noting that the inventory plan is typically made in a large time scale of days and weeks
[408]. Therefore, the computational time difference between affine control policy and
the proposed approach is insignificant. The solution quality in terms of optimality gap, however, differs substantially across the policies. In this sense, the proposed approach provides an attractive trade-off between solution quality and computational tractability.
To gain more insights, we compare the control policies for a single problem instance (Instance 13) determined by the affine control policy and by the proposed algorithm in Figure 26. In this particular instance, we show the inventory profiles over the entire time horizon. From the figure, we can observe that the affine control policy tends to keep much higher inventory levels than the proposed control policy. Specifically, the inventory levels at period 3 and period 4 determined by the affine control policy are more than double those of the proposed control policy, respectively. As a result, the excessive inventory incurs additional costs, rendering the induced robust solution suboptimal.

Figure 26. Inventory profiles determined by different control policies under the worst-case demand realization.
We present the cost breakdowns determined by the affine control policy and the control policy determined by the proposed algorithm in Figure 27. From the pie charts, we can observe that a major part of the total cost comes from ordering standard delivery of products for both control policies. Although express orders can serve the customer demands more promptly, they incur higher unit costs. Moreover, the percentage of holding cost determined by the affine control policy is 14% higher than that of the proposed one due to their different inventory levels.
We further benchmark the proposed lower bounding technique against a data-driven approach that samples uncertainty scenarios from the uncertainty set following the uniform distribution [404, 411]. It is worth noting that the data-driven approach, which relies on scenario sampling, only provides a lower bound of the original multistage ARO problem due to its relaxation. To guarantee a fair comparison, we employ the STMARMILP in the proposed framework, since it also provides a lower bound. Additionally, the same number of uncertainty scenarios is used in the data-driven approach and the STMARMILP. We present the computational results in Figure 28, where the X-axis denotes the index of instances and the Y-axis represents the lower bounds of total cost in multiperiod inventory control. As can be observed from the figure, the proposed method consistently generates tighter lower bounds than the data-driven approach.

Figure 27. Cost breakdowns determined by (a) the affine control policy, and (b) the control policy from the proposed algorithm.

Figure 28. Lower bounds of multi-period inventory cost determined by the proposed approach and the data-driven approach.
To investigate the impact of the number of sampled scenarios on the data-driven approach, we select Instance 1 and plot lower bound ratios and computational times under different numbers of scenarios in Figure 29. Note that the lower bound ratio is defined as the ratio between the lower bounds generated by the data-driven approach and the STMARMILP. From the figure, we can see that the computational time of the data-driven approach grows rapidly with the number of uncertainty scenarios. Although its corresponding lower bound becomes tighter when using more uncertainty scenarios, the data-driven approach consumes 27.1 times more computational time than the proposed method and still generates a less tight lower bound (lower bound ratio below one).

Figure 29. The impacts of the number of uncertainty scenarios on the generated lower bound of the original multistage ARO problem and computational time in the data-driven approach.
To examine the scalability of the different methods with respect to the number of time stages, we implement computational experiments with T = 10 and T = 15.
For each value of T, 25 randomly generated robust optimal inventory control instances
are used to evaluate and compare different control policies as before. The computational
results for each problem instance with T=10 and T=15 are presented in Table A2 and
Table A3 of Appendix A, respectively. From these tables, we can see that the solution
qualities of both the affine control policy and piecewise affine control policy deteriorate
remarkably as the number of time stages increases. Specifically, their average relative
gaps soar significantly from 25.72% to 34.88% when the value of T changes from 5 to
15, while the largest relative gap changes from 53.43% to 111.20%. In stark contrast,
the average gap of the proposed control policy is increased by only 0.35%, which demonstrates its scalability with respect to the number of time stages.
Notably, the largest relative gap of the proposed solution algorithm becomes 6.29%
from 4.27% when the value of T increases from 5 to 15. It is worth mentioning that the
proposed control policy compares favorably against the other two control policies in all
problem instances. Moreover, the average computational time of the proposed algorithm
increases from 20.8s to 493.2s, which is still a reasonable amount of time for inventory
control problems. Since the obtained solution from the proposed algorithm is a stability
center, it indicates the feasibility of the obtained inventory control policy according to
Remark 5.6.
The second application addresses the strategic planning of chemical process networks that consist of interconnected processes and various chemicals [412]. The objective of the process network planning is to maximize the net present value (NPV) over the strategic planning horizon. The considered chemical process network, which is shown in Figure 30, consists of five chemicals (A-E) and three processes (P1, P2, and P3). In the figure, chemicals A-C represent raw materials, which can be either purchased from suppliers or produced within the network, while the remaining chemicals are final products, which are sold to the markets. In this application, we consider five time periods over the 10-year planning horizon, and the duration of each period is two years. It is assumed that all the processes do not have initial capacities, and they can be installed at the beginning of the planning horizon. For the demand uncertainty set, d_jt^0 = 100, ρ_jt = 0.85, and Γ = 0.6. The mass balance relationships are given in Table 6.
Table 6. Mass balance relationships for different processes.
Process 2 0.64 A → D
The process network planning determines the purchase levels of feedstocks, sales of final products, capacity expansion, and production profiles of processes at each time period, in order to maximize the NPV over the strategic planning horizon. The multistage ARO model for process network planning under demand uncertainty is formulated as follows. The objective is to maximize the NPV, which is given in (5.53). The constraints can be classified into capacity expansion constraints (5.54)-(5.55), mass balance constraints (5.56), production level constraints (5.57), and supply and demand constraints; the model parameters are set following the literature [413]. The "here-and-now" decision is the binary decision Y_it, while all the other continuous decisions constitute the "wait-and-see" decisions. Based on the definitions of state and control decision variables, Q_it is the adjustable state decision, while QE_it, W_it, P_jt, and S_jt are the adjustable control decisions.
$$
\begin{aligned}
\max_{\substack{QE_{it}, Q_{it}, Y_{it}, \\ W_{it}, P_{jt}, S_{jt}}} \; \min_{d \in U} \quad & \sum_{j} \sum_{t} \gamma_{jt} S_{jt}(d^t) - \sum_{i} \sum_{t} c1_{it}\, QE_{it}(d^t) - \sum_{i} \sum_{t} c2_{it}\, Y_{it} \\
& - \sum_{i} \sum_{t} c3_{it}\, W_{it}(d^t) - \sum_{j} \sum_{t} c4_{jt}\, P_{jt}(d^t)
\end{aligned} \tag{5.53}
$$

$$S_{jt}(d^t) \le d_{jt}, \; \forall d \in U, \forall j, t \tag{5.59}$$

$$S_{jt}(d^t) \ge 0, \; \forall d \in U, \forall j, t \tag{5.63}$$

$$U = \left\{ d : \left( 1 - \rho_{jt} \right) d_{jt}^0 \le d_{jt} \le \left( 1 + \rho_{jt} \right) d_{jt}^0, \; \forall j, t, \; \sum_{j} \sum_{t} \frac{\left| d_{jt} - d_{jt}^0 \right|}{\rho_{jt} d_{jt}^0} \le \Gamma J T \right\} \tag{5.66}$$

where γ_jt denotes the unit price of product j in period t, c1_it through c4_jt are cost coefficients, and J is the number of products.
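A membership check for the demand uncertainty set can be sketched as follows. Note that the exact budget normalization in (5.66) is reconstructed, so the set definition encoded below is an assumption for illustration; d0, rho, and Gamma follow the values quoted in the text.

```python
# A membership check for a budget-style demand uncertainty set of the form
# in (5.66): each demand lies in a box around its nominal value, and the
# total normalized deviation is capped by a budget proportional to the
# number of (j, t) entries.

def in_uncertainty_set(d, d0=100.0, rho=0.85, Gamma=0.6):
    """d: flat list of demands, one entry per (j, t) pair."""
    normalized = [abs(dj - d0) / (rho * d0) for dj in d]
    box_ok = all(z <= 1.0 + 1e-9 for z in normalized)
    budget_ok = sum(normalized) <= Gamma * len(d) + 1e-9
    return box_ok and budget_ok

nominal = [100.0, 100.0, 100.0]
assert in_uncertainty_set(nominal)                    # nominal demand is inside
assert in_uncertainty_set([142.5, 100.0, 100.0])      # deviation of 0.5 * rho * d0
assert not in_uncertainty_set([300.0, 100.0, 100.0])  # outside the box
```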
The above multistage adaptive robust process network planning problem is tackled with the proposed transformation scheme, in which only the state decision Q_it is restricted to follow the affine control policy. In this way, the multistage problem is transformed into a two-stage ARO problem. In contrast, the affine and piecewise affine control policies restrict all the adjustable decisions Q_it, QE_it, W_it, P_jt, and S_jt to be affine functions of the uncertainty realizations.
The computational results are provided in Table 7. In this application, the proposed approach achieves a better solution quality compared with the affine control policy. In terms of solution quality, the proposed solution algorithm demonstrates a superior performance over the other two approaches and generates a high-quality solution with a relative gap of 3.36%. Notably, the proposed computational algorithm can solve this multistage ARO problem within merely 24.2 seconds, which is a reasonable amount of time given its high solution quality. The optimal design and planning decisions at time period 5 determined by the affine decision rule method and the proposed solution method are shown in Figure 31 (a) and Figure 31 (b), respectively. In Figure 31, the optimal total capacities are displayed under the operating processes.
Table 7. Computational results of different methods in the process network planning
application.
To illustrate the optimal capacity expansion activities, we present the capacity profiles during the entire planning horizon determined by the affine decision rule method and the developed solution algorithm in Figure 32 (a) and Figure 32 (b), respectively. As can be observed from Figure 32 (a), Process 3 is expanded at the beginning of time period 1, and it is further expanded at the second time period in the solution determined by the affine decision rule method. The optimal capacities determined by the two solution methods are different. For example, the optimal total capacity of Process 2 is 123.4 kt/y at the end of the planning horizon determined by the affine decision rule approach, while the corresponding capacity is 30.6 kt/y larger for the proposed method.
Figure 31. The optimal design and planning decisions at the end of the planning horizon determined by (a) the affine decision rule method, and (b) the transformation-proximal bundle algorithm.
Figure 32. Optimal capacity expansion decisions over the entire planning horizon
determined by (a) the affine decision rule method, and (b) the transformation-proximal
bundle algorithm.
The details on revenues and the cost breakdown, including fixed investment cost, variable investment cost, operating cost, and purchase cost, are shown in Figure 33. As can be observed from the bar charts, the proposed approach generates $27.53MM higher revenues than the conventional affine decision rule method, which demonstrates that the proposed method keeps the sale decisions fully adjustable to demand uncertainty realizations. From the pie charts in Figure 33, we can see that more than 40% of the total cost comes from purchasing feedstock for both methods. The investment cost for the developed solution algorithm is 5% higher than that determined by the affine decision rule method, because the optimal process capacities determined by the proposed transformation-proximal bundle method are larger in its optimal network structure.
Figure 33. Revenues and cost breakdown determined by the affine decision rule method and the proposed approach.

We further consider a larger process network with more chemicals, six processes, four suppliers, and six markets [412]. The detailed network schematic is depicted in Figure 34, where the specific chemical names are listed. Chemicals A-D are raw materials, some of which can also be produced by
certain processes. For instance, Chemical D (Nitric Acid) can be either purchased from
a supplier or produced by Process 3. Chemicals E-J are products, which are sold to
markets for earning revenue. This complex process network has such flexibility that
many manufacturing options are available. For example, Chemical F is a type of product that also serves as a feedstock to Process 3, Process 4, and Process 6. In this case study, we consider four time periods over the planning horizon, and the duration of each time period is two years. It is assumed that all processes can be installed at the beginning of time period 1. For the demand uncertainty set, the same parameter settings as in the previous application are adopted.
Unlike the affine decision rule method, which restricts all adaptive decisions to be affine functions of uncertainty, the proposed method allows for full adjustability in the local control decisions, thus boosting the NPV by 4.43% relative to the affine decision rule method. Figure 35 shows more details on the NPV results, including the revenue and cost breakdown in each time period, for the affine decision rule method (represented by ADR in the figure) and the proposed transformation-proximal bundle algorithm (denoted by TPB in the figure), respectively. We can observe from Figure 35 that
investment costs occupy 48.6% of total costs in the first time period for both solution
methods. This result can be well explained by the fact that most chemical processes are
expanded or built within the first time period. During the last three time periods, the
majority of costs come from process operation and feedstock purchase. Notably, the
revenues determined by the optimal planning decisions of the proposed solution method
are 3.27%, 5.02%, and 4.48% higher in the last three time periods, respectively, compared with the affine decision rule approach.
Figure 35. Revenues and cost breakdown at each time period determined by the affine decision rule method (denoted by ADR in the figure) and the transformation-proximal bundle algorithm (denoted by TPB in the figure).
To illustrate the optimal capacity expansion activity, we present the capacity profiles
during the entire planning horizon for the proposed approach in Figure 36. From Figure
36, we can see that a total of five processes are selected to be built at time period 1 in
capacity expansions of different processes, we can conclude that the optimal expansion
frequency of Process 4 is the highest among all processes. This is partially ascribed to
the fact that a total of three products (Chemical J, Chemical H, and Chemical I) are
manufactured via Process 4.
Figure 36. Optimal capacity expansion decisions over the entire planning horizon
In Figure 37 (a) and Figure 37 (b), we further present the optimal purchase levels of
feedstock determined by the conventional affine decision rule method and the proposed
solution algorithm, respectively. From the bar charts, we can observe a similar trend for
both solution methods that the purchase level of a feedstock increases as the time period
advances, in order to satisfy the growing manufacturing need when process capacities expand over the planning horizon. By
comparing Figure 37 (a) and Figure 37 (b), a notable difference lies in that the total
purchase level of the proposed solution algorithm is higher than that of the affine decision
rule method. In addition, the proposed solution algorithm increases the total purchase
amount of Chemical A during the entire planning horizon.
Figure 37. Optimal feedstock purchase at each time stage determined by (a) the affine
decision rule method, and (b) the proposed approach.
To take a closer look at the optimal adjustable decisions on product sale, we present the
results for the affine decision rule method and the proposed approach as spider charts
shown in Figure 38 (a) and Figure 38 (b), respectively. Among all the products
(Chemicals E-J), there are significant increases in the sale amount of Chemical G at Period
2, Period 3, and Period 4. The sale level of Chemical G could reasonably be expected
to rise when the corresponding feedstock to Process 6 increases (as shown in Figure 37).
Compared with the optimal sale level of Chemical E in Figure 38 (a), the optimal sale
level in Figure 38 (b) increases at Period 4.
Figure 38. Spider charts showing optimal sale quantities (kt/y) of final products at
each time stage determined by (a) the affine decision rule method, and (b) the
proposed approach.
5.6 Summary
In this chapter, recourse decisions were partitioned into state and control decisions, and
only the state decisions were restricted to be affine functions. By employing the proposed
scheme, the original multistage ARO problem was equivalently transformed into a two-stage
ARO problem. The proximal bundle algorithm was further developed as an efficient
global optimization algorithm of the resulting two-stage ARO problem. Since the local
decisions were exempt from the affine decision rule restriction, the proposed solution
algorithm sacrificed less optimality for computational tractability compared with
conventional decision rule methods. The computational results showed that the
proposed transformation-proximal bundle algorithm significantly outperformed the
conventional affine decision rule method.
Table 8. Computational performances of different solution algorithms in the
multistage robust inventory control problem under demand uncertainty for T=5.
Table 9. Computational performances of different solution algorithms in the
multistage robust inventory control problem under demand uncertainty for T=10.
Table 10. Computational performances of different solution algorithms in the
multistage robust inventory control problem under demand uncertainty for T=15.
5.8 Nomenclature
Sets/indices
Parameters
cB unit backlog cost
Continuous variables
Sets/indices
Parameters
Binary variables
Yit binary variable that indicates whether process i is chosen for expansion
in time period t
Continuous variables
CHAPTER 6
CONCLUSIONS
Data-driven optimization under uncertainty has been investigated with emphasis on
four main aspects, namely two-stage adaptive distributionally robust optimization, the
deep learning based distributionally robust optimization framework, the online learning
based model predictive control framework, and algorithm design for large-scale
multistage robust optimization, together with applications. We believe that the research in this dissertation lays a solid foundation
for future studies in this area. Additionally, the proposed frameworks and solution
algorithms can be applied to a variety of decision-making problems under uncertainty,
such as those for supply chain management, energy systems, and process control. The
summary of the dissertation as well as future research directions are presented in this
chapter.
In the first project, a data-driven two-stage distributionally robust optimization model
is proposed for hedging against uncertainty in the optimal biomass with agricultural
waste-to-energy network design. The ambiguity set restricts candidate probability
distributions based on their distances from the data-based empirical distribution. Equipped with this ambiguity
set, the two-stage distributionally robust optimization model not only accommodates
the sequential decision making at design and operational stages, but also hedges against
the distributional ambiguity arising from a finite amount of uncertainty data. A solution
algorithm is further developed to solve the resulting two-stage distributionally robust
optimization problem. The proposed model is applied to a biomass processing
network including 216 technologies and 172 compounds. Computational results show
that the proposed model achieves better out-of-sample performance, in terms of a 5.7%
lower average cost and a 37.1% lower standard deviation, compared with the conventional
stochastic programming method.
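To make the ambiguity-set idea concrete, consider the toy sketch below. It is illustrative only: the actual model in this work uses its own distance-based ambiguity set, whereas the sketch takes a finite-support empirical distribution and computes the worst-case expected cost over a total-variation ball of radius eps around it. For a linear objective this worst case has a simple exact greedy solution: shift eps probability mass from the cheapest scenarios onto the costliest one.

```python
def worst_case_expectation(costs, p_hat, eps):
    """Worst-case expected cost over the ambiguity set
    {p : 0.5 * ||p - p_hat||_1 <= eps, p >= 0, sum(p) = 1}.
    For a linear objective, the maximizer moves eps probability mass from
    the cheapest atoms onto the single costliest atom (greedy is exact)."""
    n = len(costs)
    top = max(range(n), key=lambda i: costs[i])          # costliest atom gains mass
    p = list(p_hat)
    budget = eps
    for i in sorted(range(n), key=lambda i: costs[i]):   # cheapest atoms first
        if i == top or budget <= 0:
            continue
        take = min(p[i], budget)                          # remove mass where cost is low
        p[i] -= take
        p[top] += take
        budget -= take
    return sum(pi * ci for pi, ci in zip(p, costs))

# Empirical distribution from three equally likely cost scenarios:
print(worst_case_expectation([1.0, 2.0, 3.0], [1/3, 1/3, 1/3], 0.0))  # nominal cost 2.0
print(worst_case_expectation([1.0, 2.0, 3.0], [1/3, 1/3, 1/3], 1/3))  # robust cost 8/3
```

As more uncertainty data are collected, the radius eps can be shrunk, so the worst-case expectation approaches the nominal one; this mirrors how distance-based ambiguity sets become less conservative with more data.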
In the second project, wind power data are utilized to train an f-GAN, whose
discriminator provides a variational estimate of the f-divergence used to construct the
ambiguity set.
Consequently, the proposed framework closely links the training objective of deep
learning with the characterization of ambiguity set via the same type of divergence.
Additionally, the GAN is well suited for capturing the complicated temporal and spatial
correlations among renewable energy sources. Based upon this ambiguity set, a data-driven
distributionally robust chance-constrained optimization model is developed. To
facilitate its solution process, the resulting distributionally robust chance constraints are
tackled using a scenario approach. This scenario approach leverages the sampling
efficiency of the generator network due to the feedforward nature of neural networks.
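The scenario approach comes with an a priori sample-size requirement. Purely as an illustration (the dissertation derives its own bound tailored to f-GAN-generated data, which may differ), the classical sufficient bound for a convex scenario program with d decision variables, violation probability eps, and confidence 1 - beta is N >= (2/eps) * (ln(1/beta) + d):

```python
import math

def scenario_sample_bound(eps, beta, d):
    """Classical sufficient scenario count N >= (2/eps) * (ln(1/beta) + d),
    guaranteeing with confidence 1 - beta that the scenario solution violates
    the chance constraint with probability at most eps."""
    if not (0 < eps < 1 and 0 < beta < 1 and d >= 1):
        raise ValueError("require 0 < eps < 1, 0 < beta < 1, and d >= 1")
    return math.ceil((2.0 / eps) * (math.log(1.0 / beta) + d))

# e.g. 5% violation level, 99.9% confidence, 10 decision variables:
print(scenario_sample_bound(0.05, 1e-3, 10))  # -> 677 scenarios
```

Because each synthetic scenario is produced by one feedforward pass of the generator network, even sample counts in the hundreds or thousands are cheap to obtain, which is the sampling efficiency referred to above.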
A theoretical a priori bound on the required number of synthetic wind power data is
derived. The effectiveness and scalability of the proposed approach are demonstrated
through the six-bus and IEEE 118-bus systems.
We investigate the problem of designing data-driven stochastic MPC for linear time-invariant
systems, in which the disturbance distribution is unknown but can be partially
inferred from data. We propose a novel online learning-based MPC framework, in which
distributionally robust chance constraints on system states are required to hold for a
family of distributions called an ambiguity set. The ambiguity set is learned from
disturbance data via a Dirichlet process mixture model that is self-adaptive to the
underlying data structure and complexity.
Specifically, the structural property of multimodality is exploited, so that the first and
second-order moment information of each mixture component is incorporated into the
ambiguity set. As more data are gathered during the runtime of the controller, the
ambiguity set is updated online using real-time disturbance data, which enables the
risk-averse control to become gradually less conservative. The online variational
inference algorithm obviates learning all collected data from scratch, and therefore the
proposed MPC is endowed with guaranteed computational efficiency. Recursive
feasibility and closed-loop stability of the proposed MPC are established via a safe
update scheme. Numerical
examples are used to illustrate the effectiveness and advantages of the proposed MPC.
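The constraint tightening can be illustrated with the classical moment-based special case; the mixture-model-based tightening in this work is more refined, so the function below is only a sketch under the assumption that a single mean vector mu and covariance matrix Sigma of the constrained quantity are known. In that case, requiring Pr(a^T x <= b) >= 1 - eps for every distribution with these moments is equivalent to the tightened deterministic constraint a^T mu + sqrt((1 - eps)/eps) * sqrt(a^T Sigma a) <= b:

```python
import math

def dr_tightened_lhs(a, mu, Sigma, eps):
    """Left-hand side of the moment-based distributionally robust chance
    constraint: the chance constraint Pr(a^T x <= b) >= 1 - eps holds for all
    distributions with mean mu and covariance Sigma if and only if
    a^T mu + k(eps) * sqrt(a^T Sigma a) <= b, with k(eps) = sqrt((1-eps)/eps)."""
    n = len(a)
    a_mu = sum(a[i] * mu[i] for i in range(n))
    # quadratic form a^T Sigma a
    a_Sig_a = sum(a[i] * Sigma[i][j] * a[j] for i in range(n) for j in range(n))
    k = math.sqrt((1.0 - eps) / eps)
    return a_mu + k * math.sqrt(a_Sig_a)

# Unit-variance disturbance direction, 20% allowed violation probability:
lhs = dr_tightened_lhs([1.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.2)
print(lhs)  # -> 2.0: the nominal constraint is backed off by two standard deviations
```

In a multimodal ambiguity set, an analogous back-off term would be applied per mixture component, which is what makes the data-driven tightening less conservative than this single-moment-pair version.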
A transformation-proximal bundle algorithm is developed for solving large-scale
multistage adaptive robust optimization problems. By partitioning recourse decisions
into state and control decisions, the proposed algorithm applies the affine control policy
only to state decisions and allows control decisions to be fully adjustable. The proposed
transformation scheme remains valid for other types of causal control policies besides
the affine one.
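The state/control partition can be sketched compactly. In illustrative notation (these symbols are not the dissertation's exact ones), the state decisions follow a causal affine policy in the revealed uncertainty, while the control decisions remain fully adjustable:

```latex
% s_t: state decisions, u_t: control decisions,
% \xi_{[t]} = (\xi_1, \ldots, \xi_t): uncertainty revealed up to stage t
s_t(\xi_{[t]}) = s_t^{0} + \sum_{\tau=1}^{t} S_{t,\tau}\, \xi_\tau ,
\qquad
u_t(\xi_{[t]}) \ \text{unrestricted (fully adjustable)}
```

Because only the state decisions are restricted, the approximation error of the decision rule is confined to them, which is why this scheme sacrifices less optimality than applying affine rules to all recourse decisions.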
It also makes it possible to leverage existing two-stage ARO solution algorithms for
MARMILPs, thus opening a new avenue for multistage robust optimization. The
proposed generic approach can be applied to a variety of control problems, such as the
multistage robust inventory control problem. In the computational study, the
conventional affine decision rule method exhibits suboptimality with an average gap of
34.88%, while the proposed algorithm generates solutions of significantly higher quality.
The future research directions include closed-loop data-driven optimization and the
incorporation of prior knowledge. Closed-loop data-driven optimization is a paradigm
that integrates the data-driven system, based on machine learning, to extract useful and
relevant information from data, and the model-based system, based on mathematical
programming, to derive the optimal decisions from the information. Existing data-driven
optimization frameworks could be further improved by introducing feedback steps from
the model-based system to the data-driven system. Besides uncertainty data, prior
knowledge could serve as another informative input to the data-driven system. Relying
solely on the data to develop the uncertainty model could unfavorably influence the
quality of the resulting decisions. Prior knowledge captures what the decision maker
knows about the uncertainty, and it can come in different forms. Incorporating such
knowledge into the data-driven optimization framework could be substantially useful
and provide more reliable decisions.
Additionally, imbalanced data sources and a small data regime beget new challenges to
the existing data-driven decision making under uncertainty frameworks. The imbalance
of datasets would lead to biased uncertainty models. The data imbalance has two main
adverse effects. First, decision makers would lose much information embedded within
the majority data class, if they synthesize both minority and majority uncertainty data
through down-sampling. Based on the research works in this thesis, the inefficient use
of uncertainty data information would negatively influence the quality of the resulting
decisions. Second, if one builds data-driven uncertainty models for the minority and
majority datasets separately, the corresponding data-driven methods tend to suffer
severely from the issue of "small data", which has a direct impact on uncertainty set
construction. The small data regime could under-fit machine learning models. With
more emerging applications and more brand-new systems employed, this type of small
data regime can be frequently encountered in practice. Data augmentation provides a
promising general framework for coping with the limited amount of uncertainty data
by generating synthetic uncertainty data from the existing uncertainty data. As these
newly generated data are totally unseen, an uncertainty model built from augmented
data is more likely to have a better generalization capability.
Data augmentation can also increase the volume of the minority data class, and therefore
is well suited for addressing the imbalanced uncertainty data issue. In the literature of
machine learning, data augmentation has two main applications [414, 415]. First, data
augmentation is useful for mitigating data imbalance during training. Second, real data
could convey some private information, and therefore using artificially generated data
is capable of protecting data privacy. One potential method for
data augmentation to use is resampling techniques. The resampled uncertainty data are
used to augment the minority data class and to make the overall dataset balanced.
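The resampling route can be sketched in a few lines. The helper below is generic and illustrative (its name and interface are not from the dissertation): it bootstraps the minority uncertainty-data class with replacement until both classes have equal size.

```python
import random

def balance_by_oversampling(majority, minority, seed=0):
    """Resample the minority uncertainty-data class with replacement
    (a bootstrap) until it matches the majority class size, so the
    combined dataset is balanced."""
    rng = random.Random(seed)
    if not minority:
        raise ValueError("minority class must be non-empty")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra

maj = list(range(100))       # e.g. 100 majority-class uncertainty samples
mino = [1000, 1001, 1002]    # only 3 minority-class samples
maj_b, mino_b = balance_by_oversampling(maj, mino)
print(len(maj_b), len(mino_b))  # -> 100 100
```

Plain bootstrapping only repeats observed samples; the deep generative route discussed next can instead produce genuinely new minority-class samples.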
Another promising way is employing deep learning, especially deep generative models,
to generate synthetic uncertainty data for the purpose of data augmentation. The
complicated and unseen data patterns can be potentially captured by the powerful deep
generative models. To be more specific, a data-driven uncertainty set would be constructed from a hybrid use of
the majority dataset and the augmented minority dataset. Then, this data-driven
uncertainty set could be further integrated into dynamic robust optimization.
REFERENCES
[14] S. J. Qin, "Process data analytics in the era of big data," AIChE J., vol. 60, no.
9, pp. 3092-3100, 2014. [Online]. Available:
http://dx.doi.org/10.1002/aic.14523.
[15] V. Venkatasubramanian, "DROWNING IN DATA: Informatics and modeling
challenges in a data-rich networked world," AIChE J., vol. 55, no. 1, pp. 2-8,
2009. [Online]. Available: http://dx.doi.org/10.1002/aic.11756.
[16] J. Li et al., "Data-driven mathematical modeling and global optimization
framework for entire petrochemical planning operations," AIChE J., vol. 62, no.
9, pp. 3020-3040, 2016, doi: 10.1002/aic.15220.
[17] S. Yin, X. Li, H. Gao, and O. Kaynak, "Data-Based Techniques Focused on
Modern Industry: An Overview," IEEE Transactions on Industrial Electronics,
vol. 62, no. 1, pp. 657-667, 2015, doi: 10.1109/TIE.2014.2308133.
[18] V. Venkatasubramanian, "The promise of artificial intelligence in chemical
engineering: Is it here, finally?," AIChE J., vol. 65, no. 2, pp. 466-478, 2019, doi:
10.1002/aic.16489.
[19] I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep learning. MIT
press Cambridge, 2016.
[20] I. E. Grossmann, "Advances in mathematical programming models for
enterprise-wide optimization," Comput. Chem. Eng., vol. 47, pp. 2-18, 2012.
[Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0098135412002220.
[21] M. I. Jordan and T. M. Mitchell, "Machine learning: Trends, perspectives, and
prospects," Science, vol. 349, no. 6245, pp. 255-260, 2015. [Online]. Available:
http://science.sciencemag.org/content/sci/349/6245/255.full.pdf.
[22] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, p. 436,
2015, doi: 10.1038/nature14539.
[23] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," arXiv
preprint arXiv:1401.0212, 2013.
[24] D. Bertsimas and A. Thiele, "Robust and data-driven optimization: Modern
decision-making under uncertainty," INFORMS tutorials in operations research:
models, methods, and applications for innovative decision making, pp. 95-122,
2006.
[25] B. A. Calfa, A. Agarwal, S. J. Bury, J. M. Wassick, and I. E. Grossmann, "Data-
Driven Simulation and Optimization Approaches To Incorporate Production
Variability in Sales and Operations Planning," Ind. Eng. Chem. Res., vol. 54, no.
29, pp. 7261-7272, 2015. [Online]. Available:
http://dx.doi.org/10.1021/acs.iecr.5b01273.
[26] B. A. Calfa, A. Agarwal, I. E. Grossmann, and J. M. Wassick, "Data-driven
multi-stage scenario tree generation via statistical property and distribution
matching," Comput. Chem. Eng., vol. 68, pp. 7-23, 2014. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S009813541400129X.
[27] B. A. Calfa, I. E. Grossmann, A. Agarwal, S. J. Bury, and J. M. Wassick, "Data-
driven individual and joint chance-constrained optimization via kernel
smoothing," Comput. Chem. Eng., vol. 78, pp. 51-69, Jul 2015, doi:
10.1016/j.compchemeng.2015.04.012.
[28] T. Campbell and J. P. How, "Bayesian nonparametric set construction for robust
optimization," in American Control Conference (ACC), 2015, 1-3 July 2015
2015, pp. 4216-4221, doi: 10.1109/ACC.2015.7171991.
[29] R. Jiang and Y. Guan, "Data-driven chance constrained stochastic program,"
Mathematical Programming, journal article vol. 158, no. 1, pp. 291-327, 2015,
doi: 10.1007/s10107-015-0929-7.
[30] R. Levi, G. Perakis, and J. Uichanco, "The data-driven newsvendor problem:
new bounds and insights," Operations Research, vol. 63, no. 6, pp. 1294-1306,
2015.
[31] Y. Zhang, Y. Feng, and G. Rong, "Data-driven chance constrained and robust
optimization under matrix uncertainty," Ind. Eng. Chem. Res., vol. 55, no. 21,
pp. 6145-6160, 2016. [Online]. Available:
http://dx.doi.org/10.1021/acs.iecr.5b04973.
[32] C. Ning and F. You, "Data-driven adaptive nested robust optimization: General
modeling framework and efficient computational algorithm for decision making
under uncertainty," AIChE J., vol. 63, no. 9, pp. 3790-3817, 2017, doi:
10.1002/aic.15717.
[33] C. Ning and F. You, "Data-Driven Adaptive Robust Unit Commitment under
Wind Power Uncertainty: A Bayesian Nonparametric Approach," IEEE Trans.
Power Syst., 2019, doi: 10.1109/TPWRS.2019.2891057.
[34] C. Ning and F. You, "Data-driven decision making under uncertainty integrating
robust optimization with principal component analysis and kernel smoothing
methods," Comput. Chem. Eng., vol. 112, pp. 190-210, 2018, doi:
https://doi.org/10.1016/j.compchemeng.2018.02.007.
[35] C. Shang and F. You, "A data-driven robust optimization approach to stochastic
model predictive control," Journal of Process Control, vol. 75, pp. 24-39, 2019.
[36] C. Shang, W.-H. Chen, A. D. Stroock, and F. You, "Robust Model Predictive
Control of Irrigation Systems with Active Uncertainty Learning and Data
Analytics," arXiv preprint arXiv:1810.05947, 2018.
[37] W. C. Rooney and L. T. Biegler, "Optimal process design with model parameter
uncertainty and process variability," AIChE J., vol. 49, no. 2, pp. 438-449, Feb
2003, doi: 10.1002/aic.690490214.
[38] P. M. Verderame, J. A. Elia, J. Li, and C. A. Floudas, "Planning and Scheduling
under Uncertainty: A Review Across Multiple Sectors," Ind. Eng. Chem. Res.,
vol. 49, no. 9, pp. 3993-4017, May 2010, doi: 10.1021/ie902009k.
[39] A. Mesbah, "Stochastic Model Predictive Control AN OVERVIEW AND
PERSPECTIVES FOR FUTURE RESEARCH," IEEE Control Systems
Magazine, vol. 36, no. 6, pp. 30-44, Dec 2016, doi: 10.1109/mcs.2016.2602087.
[40] A. Krieger and E. N. Pistikopoulos, "Model predictive control of anesthesia
under uncertainty," Comput. Chem. Eng., vol. 71, pp. 699-707, Dec 2014, doi:
10.1016/j.compchemeng.2014.07.025.
[41] D. W. Griffith, V. M. Zavala, and L. T. Biegler, "Robustly stable economic
NMPC for non-dissipative stage costs," Journal of Process Control, vol. 57, pp.
116-126, Sep 2017, doi: 10.1016/j.jprocont.2017.06.016.
[42] T. Y. Chiu and P. D. Christofides, "Robust control of particulate processes using
uncertain population balances," AIChE J., vol. 46, no. 2, pp. 266-280, Feb 2000,
doi: 10.1002/aic.690460207.
[43] N. V. Sahinidis, "Optimization under uncertainty: state-of-the-art and
opportunities," Comput. Chem. Eng., vol. 28, no. 6-7, pp. 971-983, Jun 2004,
doi: 10.1016/j.compchemeng.2003.09.017.
[44] I. E. Grossmann, R. M. Apap, B. A. Calfa, P. Garcia-Herreros, and Q. Zhang,
"Recent advances in mathematical programming techniques for the optimization
of process systems under uncertainty," Comput. Chem. Eng., vol. 91, pp. 3-14,
Aug 2016, doi: 10.1016/j.compchemeng.2016.03.002.
[45] J. R. Birge and F. Louveaux, Introduction to stochastic programming. Springer
Science & Business Media, 2011.
[46] J. R. Birge, "State-of-the-Art-Survey—Stochastic Programming: Computation
and Applications," INFORMS J. Comput., vol. 9, no. 2, pp. 111-133, 1997, doi:
10.1287/ijoc.9.2.111.
[47] A. Gupta and C. D. Maranas, "Managing demand uncertainty in supply chain
planning," Comput. Chem. Eng., vol. 27, no. 8, pp. 1219-1227, 2003, doi:
http://dx.doi.org/10.1016/S0098-1354(03)00048-6.
[48] R. M. Vanslyke and R. Wets, "L-SHAPED LINEAR PROGRAMS WITH
APPLICATIONS TO OPTIMAL CONTROL AND STOCHASTIC
PROGRAMMING," SIAM Journal on Applied Mathematics, vol. 17, no. 4, pp.
638-663, 1969, doi: 10.1137/0117061.
[49] G. Laporte and F. V. Louveaux, "THE INTEGER L-SHAPED METHOD FOR
STOCHASTIC INTEGER PROGRAMS WITH COMPLETE RECOURSE,"
Oper. Res. Lett., vol. 13, no. 3, pp. 133-142, Apr 1993, doi: 10.1016/0167-
6377(93)90002-x.
[50] F. Oliveira, V. Gupta, S. Hamacher, and I. E. Grossmann, "A Lagrangean
decomposition approach for oil supply chain investment planning under
uncertainty with risk considerations," Comput. Chem. Eng., vol. 50, pp. 184-195,
Mar 2013, doi: 10.1016/j.compchemeng.2012.10.012.
[51] S. Küçükyavuz and S. Sen, "An introduction to two-stage stochastic mixed-
integer programming," in Leading Developments from INFORMS Communities:
INFORMS, 2017, pp. 1-27.
[52] C. C. Caroe and R. Schultz, "Dual decomposition in stochastic integer
programming," Oper. Res. Lett., vol. 24, no. 1-2, pp. 37-45, Feb-Mar 1999, doi:
10.1016/s0167-6377(98)00050-9.
[53] S. Ahmed, M. Tawarmalani, and N. V. Sahinidis, "A finite branch-and-bound
algorithm for two-stage stochastic integer programs," Math. Program., vol. 100,
no. 2, pp. 355-377, Jun 2004, doi: 10.1007/s10107-003-0475-6.
[54] C. Li and I. E. Grossmann, "An improved L-shaped method for two-stage
convex 0–1 mixed integer nonlinear stochastic programs," Comput. Chem. Eng.,
vol. 112, pp. 165-179, 2018/04/06/ 2018, doi:
https://doi.org/10.1016/j.compchemeng.2018.01.017.
[55] M. G. Ierapetritou and E. N. Pistikopoulos, "DESIGN OF MULTIPRODUCT
BATCH PLANTS WITH UNCERTAIN DEMANDS," Comput. Chem. Eng.,
vol. 19, pp. S627-S632, 1995, doi: 10.1016/0098-1354(95)00130-t.
[56] A. Bonfill, M. Bagajewicz, A. Espuña, and L. Puigjaner, "Risk Management in
the Scheduling of Batch Plants under Uncertain Market Demand," Ind. Eng.
Chem. Res., vol. 43, no. 3, pp. 741-750, 2004, doi: 10.1021/ie030529f.
[57] A. Bonfill, A. Espuña, and L. Puigjaner, "Addressing Robustness in Scheduling
Batch Processes with Uncertain Operation Times," Ind. Eng. Chem. Res., vol.
44, no. 5, pp. 1524-1534, 2005, doi: 10.1021/ie049732g.
[58] J. Steimel and S. Engell, "Conceptual design and optimization of chemical
processes under uncertainty by two-stage programming," Comput. Chem. Eng.,
vol. 81, pp. 200-217, Oct 2015, doi: 10.1016/j.compchemeng.2015.05.016.
[59] P. Liu, E. N. Pistikopoulos, and Z. Li, "Decomposition Based Stochastic
Programming Approach for Polygeneration Energy Systems Design under
Uncertainty," Ind. Eng. Chem. Res., vol. 49, no. 7, pp. 3295-3305, 2010, doi:
10.1021/ie901490g.
[60] X. Peng, T. W. Root, and C. T. Maravelias, "Optimization-based process
synthesis under seasonal and daily variability: Application to concentrating solar
power," AIChE J., doi: 10.1002/aic.16458.
[61] J. Y. Gao and F. Q. You, "Deciphering and handling uncertainty in shale gas
supply chain design and optimization: Novel modeling framework and
computationally efficient solution algorithm," AIChE J., vol. 61, no. 11, pp.
3739-3755, Nov 2015, doi: 10.1002/aic.15032.
[62] F. Q. You, J. M. Wassick, and I. E. Grossmann, "Risk Management for a Global
Supply Chain Planning Under Uncertainty: Models and Algorithms," AIChE J.,
vol. 55, no. 4, pp. 931-946, Apr 2009, doi: 10.1002/aic.11721.
[63] B. H. Gebreslassie, Y. Yao, and F. You, "Design under uncertainty of
hydrocarbon biorefinery supply chains: Multiobjective stochastic programming
models, decomposition algorithm, and a Comparison between CVaR and
downside risk," AIChE J., vol. 58, no. 7, pp. 2155-2179, 2012, doi:
10.1002/aic.13844.
[64] L. J. Zeballos, C. A. Méndez, and A. P. Barbosa-Povoa, "Design and Planning
of Closed-Loop Supply Chains: A Risk-Averse Multistage Stochastic
Approach," Ind. Eng. Chem. Res., vol. 55, no. 21, pp. 6236-6249, 2016, doi:
10.1021/acs.iecr.5b03647.
[65] X. Li, A. Tomasgard, and P. I. Barton, "Nonconvex Generalized Benders
Decomposition for Stochastic Separable Mixed-Integer Nonlinear Programs," J.
Optim. Theory Appl., vol. 151, no. 3, pp. 425-454, Dec 2011, doi:
10.1007/s10957-011-9888-1.
[66] V. Gupta and I. E. Grossmann, "A new decomposition algorithm for multistage
stochastic programs with endogenous uncertainties," Comput. Chem. Eng., vol.
62, pp. 62-79, Mar 2014, doi: 10.1016/j.compchemeng.2013.11.011.
[67] V. Goel and I. E. Grossmann, "A Class of stochastic programs with decision
dependent uncertainty," Math. Program., vol. 108, no. 2-3, pp. 355-394, Jan
2007, doi: 10.1007/s10107-006-0715-7.
[68] A. Prékopa, "Stochastic programming, volume 324 of Mathematics and its
Applications," ed: Kluwer Academic Publishers Group, Dordrecht, 1995.
[69] A. Charnes and W. W. Cooper, "CHANCE-CONSTRAINED
PROGRAMMING," Manage. Sci., vol. 6, no. 1, pp. 73-79, 1959, doi:
10.1287/mnsc.6.1.73.
[70] P. Li, H. Arellano-Garcia, and G. Wozny, "Chance constrained programming
approach to process optimization under uncertainty," Comput. Chem. Eng., vol.
32, no. 1, pp. 25-45, 2008/01/01/ 2008, doi:
https://doi.org/10.1016/j.compchemeng.2007.05.009.
[71] B. L. Miller and H. M. Wagner, "CHANCE CONSTRAINED
PROGRAMMING WITH JOINT CONSTRAINTS," Oper. Res., vol. 13, no. 6,
pp. 930-945, 1965, doi: 10.1287/opre.13.6.930.
[72] X. Liu, S. Kucukyavuz, and J. Luedtke, "Decomposition algorithms for two-
stage chance-constrained programs," Math. Program., vol. 157, no. 1, pp. 219-
243, May 2016, doi: 10.1007/s10107-014-0832-7.
[73] M. A. Quddus, S. Chowdhury, M. Marufuzzaman, F. Yu, and L. K. Bian, "A
two-stage chance-constrained stochastic programming model for a bio-fuel
supply chain network," International Journal of Production Economics, vol. 195,
pp. 27-44, Jan 2018, doi: 10.1016/j.ijpe.2017.09.019.
[74] J. Luedtke and S. Ahmed, "A SAMPLE APPROXIMATION APPROACH FOR
OPTIMIZATION WITH PROBABILISTIC CONSTRAINTS," SIAM J. Optim.,
vol. 19, no. 2, pp. 674-699, 2008, doi: 10.1137/070702928.
[75] L. J. Hong, Y. Yang, and L. W. Zhang, "Sequential Convex Approximations to
Joint Chance Constrained Programs: A Monte Carlo Approach," Oper. Res., vol.
59, no. 3, pp. 617-630, May-Jun 2011, doi: 10.1287/opre.1100.0910.
[76] F. E. Curtis, A. Wachter, and V. M. Zavala, "A SEQUENTIAL ALGORITHM
FOR SOLVING NONLINEAR OPTIMIZATION PROBLEMS WITH
CHANCE CONSTRAINTS," SIAM J. Optim., vol. 28, no. 1, pp. 930-958, 2018,
doi: 10.1137/16m109003x.
[77] A. Nemirovski and A. Shapiro, "Convex approximations of chance constrained
programs," SIAM J. Optim., vol. 17, no. 4, pp. 969-996, 2006, doi:
10.1137/050622328.
[78] C. D. Maranas, "Optimization accounting for property prediction uncertainty in
polymer design," Comput. Chem. Eng., vol. 21, pp. S1019-S1024, 1997.
[79] A. Gupta, C. D. Maranas, and C. M. McDonald, "Mid-term supply chain
planning under demand uncertainty: customer demand satisfaction and
inventory management," Comput. Chem. Eng., vol. 24, no. 12, pp. 2613-2621,
Dec 2000, doi: 10.1016/s0098-1354(00)00617-7.
[80] F. Q. You and I. E. Grossmann, "Stochastic Inventory Management for Tactical
Process Planning Under Uncertainties: MINLP Models and Algorithms," AIChE
J., vol. 57, no. 5, pp. 1250-1277, May 2011, doi: 10.1002/aic.12338.
[81] D. J. Yue and F. Q. You, "Planning and Scheduling of Flexible Process
Networks Under Uncertainty with Stochastic Inventory: MINLP Models and
Algorithm," AIChE J., vol. 59, no. 5, pp. 1511-1532, May 2013, doi:
10.1002/aic.13924.
[82] W. Shen, Z. Li, B. Huang, and N. M. Jan, "Chance-Constrained Model
Predictive Control for SAGD Process Using Robust Optimization
Approximation," Ind. Eng. Chem. Res., 2018/11/01 2018, doi:
10.1021/acs.iecr.8b03207.
[83] M. Cannon, B. Kouvaritakis, and X. J. Wu, "Probabilistic Constrained MPC for
Multiplicative and Additive Stochastic Uncertainty," IEEE Trans. Autom.
Control., vol. 54, no. 7, pp. 1626-1632, Jul 2009, doi: 10.1109/tac.2009.2017970.
[84] P. Li, H. Arellano-Garcia, and G. Wozny, "Chance constrained programming
approach to process optimization under uncertainty," Comput. Chem. Eng., vol.
32, no. 1-2, pp. 25-45, Jan-Feb 2008, doi: 10.1016/j.compchemeng.2007.05.009.
[85] Y. Yang, P. Vayanos, and P. I. Barton, "Chance-Constrained Optimization for
Refinery Blend Planning under Uncertainty," Ind. Eng. Chem. Res., vol. 56, no.
42, pp. 12139-12150, Oct 2017, doi: 10.1021/acs.iecr.7b02434.
[86] S. S. Liu, S. S. Farid, and L. G. Papageorgiou, "Integrated Optimization of
Upstream and Downstream Processing in Biopharmaceutical Manufacturing
under Uncertainty: A Chance Constrained Programming Approach," Ind. Eng.
Chem. Res., vol. 55, no. 16, pp. 4599-4612, Apr 2016, doi:
10.1021/acs.iecr.5b04403.
[87] K. Mitra, R. D. Gudi, S. C. Patwardhan, and G. Sardar, "Midterm supply chain
planning under uncertainty: A multiobjective chance constrained programming
framework," Ind. Eng. Chem. Res., vol. 47, no. 15, pp. 5501-5511, Aug 2008,
doi: 10.1021/ie0710364.
[88] J. Yang, H. Gu, and G. Rong, "Supply Chain Optimization for Refinery with
Considerations of Operation Mode Changeover and Yield Fluctuations," Ind.
Eng. Chem. Res., vol. 49, no. 1, pp. 276-287, Jan 2010, doi: 10.1021/ie900968x.
[89] F. Q. You and I. E. Grossmann, "Balancing Responsiveness and Economics in
Process Supply Chain Design with Multi-Echelon Stochastic Inventory," AIChE
J., vol. 57, no. 1, pp. 178-192, Jan 2011, doi: 10.1002/aic.12244.
[90] F. Q. You and I. E. Grossmann, "Mixed-Integer Nonlinear Programming Models
and Algorithms for Large-Scale Supply Chain Design with Stochastic Inventory
Management," Ind. Eng. Chem. Res., vol. 47, no. 20, pp. 7802-7817, Oct 2008,
doi: 10.1021/ie800257x.
[91] Y. Yuan, Z. Li, and B. Huang, "Robust optimization under correlated uncertainty:
Formulations and computational study," Comput. Chem. Eng., vol. 85, pp. 58-
71, 2016. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0098135415003464.
[92] A. Ben-Tal and A. Nemirovski, "Robust solutions of Linear Programming
problems contaminated with uncertain data," Math. Programming, vol. 88, p.
411, 2000.
[93] D. Bertsimas and M. Sim, "The price of robustness," Oper. Res., vol. 52, no. 1,
p. 35, 2004.
[94] A. Ben-Tal, L. E. Ghaoui, and A. Nemirovski, Robust Optimization. Princeton
University Press, 2009.
[95] C. Gregory, K. Darby-Dowman, and G. Mitra, "Robust optimization and
portfolio selection: The cost of robustness," Eur. J. Oper. Res., vol. 212, no. 2,
pp. 417-428, 2011, doi: http://dx.doi.org/10.1016/j.ejor.2011.02.015.
[96] T. Assavapokee, M. J. Realff, and J. C. Ammons, "Min-Max Regret Robust
Optimization Approach on Interval Data Uncertainty," J. Optim. Theory Appl.,
journal article vol. 137, no. 2, pp. 297-316, 2008, doi: 10.1007/s10957-007-
9334-6.
[97] A. L. Soyster, "Technical Note—Convex Programming with Set-Inclusive
Constraints and Applications to Inexact Linear Programming," Oper. Res., vol.
21, no. 5, pp. 1154-1157, 1973, doi: 10.1287/opre.21.5.1154.
[98] D. Bertsimas, D. B. Brown, and C. Caramanis, "Theory and applications of
robust optimization," SIAM Rev., vol. 53, no. 3, pp. 464-501, 2011.
[99] Á. Lorca, X. A. Sun, E. Litvinov, and T. Zheng, "Multistage adaptive robust
optimization for the unit commitment problem," Operations Research, vol. 64,
no. 1, pp. 32-51, 2016.
[100] A. Lorca and X. A. Sun, "Adaptive robust optimization with dynamic
uncertainty sets for multi-period economic dispatch under significant wind,"
Power Systems, IEEE Transactions on, vol. 30, no. 4, pp. 1702-1713, 2015.
[101] A. Atamtürk and M. Zhang, "Two-stage robust network flow and design under
demand uncertainty," Oper. Res., vol. 55, no. 4, pp. 662-673, 2007.
[102] D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng, "Adaptive Robust
Optimization for the Security Constrained Unit Commitment Problem," IEEE
Trans. Power Syst., vol. 28, no. 1, pp. 52-63, 2013.
[103] B. Zeng and L. Zhao, "Solving two-stage robust optimization problems using a
column-and-constraint generation method," Oper. Res. Lett., vol. 41, no. 5, pp.
457-461, 2013. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0167637713000618.
[104] Q. Zhang, M. F. Morari, I. E. Grossmann, A. Sundaramoorthy, and J. M. Pinto,
"An adjustable robust optimization approach to scheduling of continuous
industrial processes providing interruptible load," Comput. Chem. Eng., vol. 86,
pp. 106-119, 2016, doi: http://dx.doi.org/10.1016/j.compchemeng.2015.12.018.
[105] N. H. Lappas and C. E. Gounaris, "Multi-stage Adjustable Robust Optimization
for Process Scheduling under Uncertainty," AIChE Journal, vol. 62, no. 5, pp.
1646-1667, 2016, doi: 10.1002/aic.15183.
[106] A. Ben-Tal, A. Goryashko, E. Guslitzer, and A. Nemirovski, "Adjustable robust
solutions of uncertain linear programs," Math. Program., vol. 99, no. 2, pp. 351-
376, 2004, doi: 10.1007/s10107-003-0454-y.
[107] H. Shi and F. You, "A computational framework and solution algorithms for
two-stage adaptive robust scheduling of batch manufacturing processes under
uncertainty," AIChE J., vol. 62, no. 3, pp. 687-703, 2016. [Online]. Available:
http://dx.doi.org/10.1002/aic.15067.
[108] J. Gong, D. J. Garcia, and F. You, "Unraveling optimal biomass processing
routes from bioconversion product and process networks under uncertainty: An
adaptive robust optimization approach," ACS Sustain. Chem. Eng., vol. 4, no. 6,
pp. 3160-3173, 2016, doi: 10.1021/acssuschemeng.6b00188.
[109] J. Gong and F. You, "Optimal processing network design under uncertainty for
producing fuels and value-added bioproducts from microalgae: Two-stage
adaptive robust mixed integer fractional programming model and
computationally efficient solution algorithm," AIChE J., vol. 63, no. 2, pp. 582-
600, 2017, doi: 10.1002/aic.15370.
[110] E. Delage and D. A. Iancu, "Robust multistage decision making." Catonsville,
MD: INFORMS Tutorials in Operations Research, 2015, pp. 20-46.
[111] C. Ning and F. You, "A Transformation-Proximal Bundle Algorithm for Solving
Large-Scale Multistage Adaptive Robust Optimization Problems," arXiv
preprint arXiv:1810.05931, 2018.
[112] C. Ning and F. You, "A Data-Driven Multistage Adaptive Robust Optimization
Framework for Planning and Scheduling under Uncertainty," AIChE J., vol. 63,
no. 10, pp. 4343–4369, 2017, doi: 10.1002/aic.15792.
[113] K. McLean and X. Li, "Robust Scenario Formulations for Strategic Supply
Chain Optimization under Uncertainty," Ind. Eng. Chem. Res., vol. 52, no. 16,
pp. 5721-5734, 2013, doi: 10.1021/ie303114r.
[114] D. Yue and F. You, "Optimal supply chain design and operations under multi-
scale uncertainties: Nested stochastic robust optimization modeling framework
and solution algorithm," AIChE J., vol. 62, no. 9, pp. 3041-3055, 2016, doi:
10.1002/aic.15255.
[115] C. Liu, C. Lee, H. Chen, and S. Mehrotra, "Stochastic Robust Mathematical
Programming Model for Power System Optimization," IEEE Trans. Power Syst.,
vol. 31, no. 1, pp. 821-822, 2016, doi: 10.1109/TPWRS.2015.2394320.
[116] L. Baringo and A. Baringo, "A Stochastic Adaptive Robust Optimization
Approach for the Generation and Transmission Expansion Planning," IEEE
Trans. Power Syst., vol. 33, no. 1, pp. 792-802, Jan 2018, doi:
10.1109/tpwrs.2017.2713486.
[117] C. Y. Zhao and Y. P. Guan, "Unified Stochastic and Robust Unit Commitment,"
IEEE Trans. Power Syst., vol. 28, no. 3, pp. 3353-3361, Aug 2013, doi:
10.1109/tpwrs.2013.2251916.
[118] G. D. Liu, Y. Xu, and K. Tomsovic, "Bidding Strategy for Microgrid in Day-
Ahead Market Based on Hybrid Stochastic/Robust Optimization," IEEE
Transactions on Smart Grid, vol. 7, no. 1, pp. 227-237, Jan 2016, doi:
10.1109/tsg.2015.2476669.
[119] E. Keyvanshokooh, S. M. Ryan, and E. Kabir, "Hybrid robust and stochastic
optimization for closed-loop supply chain network design using accelerated
Benders decomposition," Eur. J. Oper. Res., vol. 249, no. 1, pp. 76-92, Feb 2016,
doi: 10.1016/j.ejor.2015.08.028.
[120] P. Parpas, B. Rustem, and E. Pistikopoulos, "Global optimization of robust
chance constrained problems," Journal of Global Optimization, vol. 43, no. 2-3,
pp. 231-247, Mar 2009, doi: 10.1007/s10898-007-9244-z.
[121] J. E. Smith and R. L. Winkler, "The optimizer's curse: Skepticism and
postdecision surprise in decision analysis," Manage. Sci., vol. 52, no. 3, pp. 311-
322, 2006, doi: 10.1287/mnsc.1050.0451.
[122] E. Delage and Y. Y. Ye, "Distributionally Robust Optimization Under Moment
Uncertainty with Application to Data-Driven Problems," Oper. Res., vol. 58, no.
3, pp. 595-612, May-Jun 2010, doi: 10.1287/opre.1090.0741.
[123] G. A. Hanasusanto, V. Roitch, D. Kuhn, and W. Wiesemann, "A distributionally
robust perspective on uncertainty quantification and chance constrained
programming," Math. Program., vol. 151, no. 1, pp. 35-62, Jun 2015, doi:
10.1007/s10107-015-0896-z.
[124] P. M. Esfahani and D. Kuhn, "Data-driven distributionally robust optimization
using the Wasserstein metric: performance guarantees and tractable
reformulations," Math. Program., vol. 171, no. 1-2, pp. 115-166, 2018, doi:
10.1007/s10107-017-1172-1.
[125] C. Shang and F. Q. You, "Distributionally robust optimization for planning and
scheduling under uncertainty," Comput. Chem. Eng., vol. 110, pp. 53-68, Feb
2018, doi: 10.1016/j.compchemeng.2017.12.002.
[126] G. C. Calafiore and L. El Ghaoui, "On distributionally robust chance-
constrained linear programs," J. Optim. Theory Appl., vol. 130, no. 1, pp. 1-22,
Jul 2006, doi: 10.1007/s10957-006-9084-x.
[127] J. Gao, C. Ning, and F. You, "Data-driven distributionally robust optimization
of shale gas supply chains under uncertainty," AIChE J., 2018, doi:
10.1002/aic.16488.
[128] Z. Hu and L. J. Hong, "Kullback-Leibler divergence constrained distributionally
robust optimization," Available at Optimization Online, 2013.
[129] D. Klabjan, D. Simchi-Levi, and M. Song, "Robust Stochastic Lot-Sizing by
Means of Histograms," Production and Operations Management, vol. 22, no. 3,
pp. 691-710, May-Jun 2013, doi: 10.1111/j.1937-5956.2012.01420.x.
[130] G. Bayraksan and D. K. Love, "Data-Driven Stochastic Programming Using
Phi-Divergences," in The Operations Research Revolution, 2015, pp. 1-19.
[131] G. A. Hanasusanto and D. Kuhn, "Conic Programming Reformulations of Two-
Stage Distributionally Robust Linear Programs over Wasserstein Balls," Oper.
Res., vol. 66, no. 3, pp. 849-869, May-Jun 2018, doi: 10.1287/opre.2017.1698.
[132] D. Bertsimas, M. Sim, and M. Zhang, "Adaptive Distributionally Robust
Optimization," Manage. Sci., vol. 65, no. 2, pp. 604-618, 2019, doi:
10.1287/mnsc.2017.2952.
[133] P. Xiong, P. Jirutitijaroen, and C. Singh, "A Distributionally Robust
Optimization Model for Unit Commitment Considering Uncertain Wind Power
Generation," IEEE Trans. Power Syst., vol. 32, no. 1, pp. 39-49, Jan 2017, doi:
10.1109/tpwrs.2016.2544795.
[134] Y. W. Chen, Q. L. Guo, H. B. Sun, Z. S. Li, W. C. Wu, and Z. H. Li, "A
Distributionally Robust Optimization Model for Unit Commitment Based on
Kullback-Leibler Divergence," IEEE Trans. Power Syst., vol. 33, no. 5, pp.
5147-5160, Sep 2018, doi: 10.1109/tpwrs.2018.2797069.
[135] C. Duan, L. Jiang, W. L. Fang, and J. Liu, "Data-Driven Affinely Adjustable
Distributionally Robust Unit Commitment," IEEE Trans. Power Syst., vol. 33,
no. 2, pp. 1385-1398, Mar 2018, doi: 10.1109/tpwrs.2017.2741506.
[136] C. Y. Zhao and Y. P. Guan, "Data-Driven Stochastic Unit Commitment for
Integrating Wind Generation," IEEE Trans. Power Syst., vol. 31, no. 4, pp. 2587-
2596, Jul 2016, doi: 10.1109/tpwrs.2015.2477311.
[137] C. Wang, R. Gao, F. Qiu, J. Wang, and L. Xin, "Risk-Based Distributionally
Robust Optimal Power Flow With Dynamic Line Rating," IEEE Trans. Power
Syst., vol. 33, no. 6, pp. 6074-6086, 2018, doi: 10.1109/TPWRS.2018.2844356.
[138] Y. Guo, K. Baker, E. Dall'Anese, Z. Hu, and T. Summers, "Stochastic Optimal
Power Flow Based on Data-Driven Distributionally Robust Optimization," in
2018 Annual American Control Conference (ACC), 2018, pp.
3840-3846, doi: 10.23919/ACC.2018.8431542.
[139] S. Zymler, D. Kuhn, and B. Rustem, "Distributionally robust joint chance
constraints with second-order moment information," Math. Program., vol. 137,
no. 1-2, pp. 167-198, 2013.
[140] B. Li, R. Jiang, and J. L. Mathieu, "Ambiguous risk constraints with moment
and unimodality information," Math. Program., 2017, doi: 10.1007/s10107-017-
1212-x.
[141] Z. Chen, S. Peng, and J. Liu, "Data-Driven Robust Chance Constrained
Problems: A Mixture Model Approach," J. Optim. Theory Appl., vol. 179, no. 3,
pp. 1065-1085, 2018, doi: 10.1007/s10957-018-1376-4.
[142] L. El Ghaoui, M. Oks, and F. Oustry, "Worst-case Value-at-Risk and robust
portfolio optimization: A conic programming approach," Oper. Res., vol. 51, no.
4, pp. 543-556, Jul-Aug 2003.
[143] J. Cheng, E. Delage, and A. Lisser, "Distributionally Robust Stochastic
Knapsack Problem," SIAM J. Optim., vol. 24, no. 3, pp. 1485-1506, 2014, doi:
10.1137/130915315.
[144] Y. Zhang, R. Jiang, and S. Shen, "Ambiguous Chance-Constrained Binary
Programs under Mean-Covariance Information," SIAM J. Optim., vol. 28, no. 4,
pp. 2922-2944, 2018, doi: 10.1137/17m1158707.
[145] G. A. Hanasusanto, V. Roitch, D. Kuhn, and W. Wiesemann, "Ambiguous Joint
Chance Constraints Under Mean and Dispersion Information," Oper. Res., vol.
65, no. 3, pp. 751-767, May-Jun 2017, doi: 10.1287/opre.2016.1583.
[146] W. Wiesemann, D. Kuhn, and M. Sim, "Distributionally Robust Convex
Optimization," Oper. Res., vol. 62, no. 6, pp. 1358-1376, Nov-Dec 2014, doi:
10.1287/opre.2014.1314.
[147] W. Z. Yang and H. Xu, "Distributionally robust chance constraints for non-linear
uncertainties," Math. Program., vol. 155, no. 1-2, pp. 231-265, Jan 2016, doi:
10.1007/s10107-014-0842-5.
[148] W. J. Xie and S. Ahmed, "On Deterministic Reformulations of Distributionally
Robust Joint Chance Constrained Optimization Problems," SIAM J. Optim., vol.
28, no. 2, pp. 1151-1182, 2018, doi: 10.1137/16m1094725.
[149] K. Postek, A. Ben-Tal, D. den Hertog, and B. Melenberg, "Robust Optimization
with Ambiguous Stochastic Constraints Under Mean and Dispersion
Information," Oper. Res., vol. 66, no. 3, pp. 814-833, May-Jun 2018, doi:
10.1287/opre.2017.1688.
[150] J. Lasserre and T. Weisser, "Distributionally robust polynomial chance-
constraints under mixture ambiguity sets," 2018.
[151] E. Erdogan and G. Iyengar, "Ambiguous chance constrained problems and
robust optimization," Math. Program., vol. 107, no. 1-2, pp. 37-61, Jun 2006,
doi: 10.1007/s10107-005-0678-0.
[152] R. W. Jiang and Y. P. Guan, "Data-driven chance constrained stochastic
program," Math. Program., vol. 158, no. 1-2, pp. 291-327, Jul 2016, doi:
10.1007/s10107-015-0929-7.
[153] Z. Chen, D. Kuhn, and W. Wiesemann, "Data-Driven Chance Constrained
Programs over Wasserstein Balls," arXiv preprint arXiv:1809.00210, 2018.
[154] R. Ji and M. Lejeune, "Data-Driven Distributionally Robust Chance-
Constrained Programming with Wasserstein Metric," 2018.
[155] R. Gao and A. J. Kleywegt, "Distributionally robust stochastic optimization with
Wasserstein distance," arXiv preprint arXiv:1604.02199, 2016.
[156] W. Xie, "On Distributionally Robust Chance Constrained Program with
Wasserstein Distance," arXiv preprint arXiv:1806.07418, 2018.
[157] A. R. Hota, A. Cherukuri, and J. Lygeros, "Data-Driven Chance Constrained
Optimization under Wasserstein Ambiguity Sets," arXiv preprint
arXiv:1805.06729, 2018.
[158] W. Xie and S. Ahmed, "Distributionally Robust Chance Constrained Optimal
Power Flow with Renewables: A Conic Reformulation," IEEE Trans. Power
Syst., vol. 33, no. 2, pp. 1860-1867, 2018, doi: 10.1109/TPWRS.2017.2725581.
[159] B. P. G. Van Parys, D. Kuhn, P. J. Goulart, and M. Morari, "Distributionally
Robust Control of Constrained Stochastic Systems," IEEE Trans. Autom.
Control., vol. 61, no. 2, pp. 430-442, Feb 2016, doi: 10.1109/tac.2015.2444134.
[160] S. Ghosal and W. Wiesemann, "The Distributionally Robust Chance
Constrained Vehicle Routing Problem," Available on Optimization Online, 2018.
[161] D. E. Bell, "Regret in Decision Making under Uncertainty," Oper. Res., vol. 30,
no. 5, pp. 961-981, 1982, doi: 10.1287/opre.30.5.961.
[162] C. Ning and F. You, "Adaptive robust optimization with minimax regret
criterion: Multiobjective optimization framework and computational algorithm
for planning and scheduling under uncertainty," Comput. Chem. Eng., vol. 108,
no. Supplement C, pp. 425-447, 2018, doi:
https://doi.org/10.1016/j.compchemeng.2017.09.026.
[163] C. Ning and F. You, "Data-driven stochastic robust optimization: General
computational framework and algorithm leveraging machine learning for
optimization under uncertainty in the big data era," Comput. Chem. Eng., vol.
111, pp. 115-133, 2018, doi:
https://doi.org/10.1016/j.compchemeng.2017.12.015.
[164] C. Shang, X. Huang, and F. You, "Data-driven robust optimization based on
kernel learning," Comput. Chem. Eng., vol. 106, pp. 464-479, 2017, doi:
https://doi.org/10.1016/j.compchemeng.2017.07.004.
[165] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Math.
Program., vol. 167, no. 2, pp. 235-292, 2018, doi: 10.1007/s10107-017-1125-8.
[166] Y. Zhang, X. Z. Jin, Y. P. Feng, and G. Rong, "Data-driven robust optimization
under correlated uncertainty: A case study of production scheduling in ethylene
plant (Reprinted from Computers and Chemical Engineering, vol. 109, pp. 48-67,
2017)," Comput. Chem. Eng., vol. 116, pp. 17-36, Aug 2018, doi:
10.1016/j.compchemeng.2017.10.039.
[167] Y. Zhang, Y. P. Feng, and G. Rong, "Data-driven rolling-horizon robust
optimization for petrochemical scheduling using probability density contours,"
Comput. Chem. Eng., vol. 115, pp. 342-360, Jul 2018, doi:
10.1016/j.compchemeng.2018.04.013.
[168] L. Zhao, C. Ning, and F. You, "Operational optimization of industrial steam
systems under uncertainty using data-driven adaptive robust optimization,"
AIChE J., doi: 10.1002/aic.16500.
[169] F. Miao et al., "Data-Driven Robust Taxi Dispatch Under Demand
Uncertainties," IEEE Transactions on Control Systems Technology, vol. 27, no.
1, pp. 175-191, Jan 2019, doi: 10.1109/tcst.2017.2766042.
[170] G. Calafiore and M. C. Campi, "Uncertain convex programs: randomized
solutions and confidence levels," Math. Program., vol. 102, no. 1, pp. 25-46,
2005, doi: 10.1007/s10107-003-0499-y.
[171] M. C. Campi, S. Garatti, and M. Prandini, "The scenario approach for systems
and control design," Annual Reviews in Control, vol. 33, no. 2, pp. 149-157, Dec
2009, doi: 10.1016/j.arcontrol.2009.07.001.
[172] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge university press,
2004.
[173] M. C. Campi and S. Garatti, "The Exact Feasibility of Randomized Solutions of
Uncertain Convex Programs," SIAM J. Optim., vol. 19, no. 3, pp. 1211-1230,
2008, doi: 10.1137/07069821x.
[174] X. J. Zhang, S. Grammatico, G. Schildbach, P. Goulart, and J. Lygeros, "On the
sample size of random convex programs with structured dependence on the
uncertainty," Automatica, vol. 60, pp. 182-188, Oct 2015, doi:
10.1016/j.automatica.2015.07.013.
[175] T. Kanamori and A. Takeda, "Worst-Case Violation of Sampled Convex
Programs for Optimization with Uncertainty," J. Optim. Theory Appl., vol. 152,
no. 1, pp. 171-197, Jan 2012, doi: 10.1007/s10957-011-9923-2.
[176] G. Calafiore, "On the Expected Probability of Constraint Violation in Sampled
Convex Programs," J. Optim. Theory Appl., vol. 143, no. 2, pp. 405-412, Nov
2009, doi: 10.1007/s10957-009-9579-3.
[177] P. M. Esfahani, T. Sutter, and J. Lygeros, "Performance Bounds for the Scenario
Approach and an Extension to a Class of Non-Convex Programs," IEEE Trans.
Autom. Control., vol. 60, no. 1, pp. 46-58, Jan 2015, doi:
10.1109/tac.2014.2330702.
[178] G. C. Calafiore, "Random convex programs," SIAM J. Optim., vol. 20,
no. 6, pp. 3427-3464, 2010, doi: 10.1137/090773490.
[179] M. C. Campi and S. Garatti, "A Sampling-and-Discarding Approach to Chance-
Constrained Optimization: Feasibility and Optimality," J. Optim. Theory Appl.,
vol. 148, no. 2, pp. 257-280, Feb 2011, doi: 10.1007/s10957-010-9754-6.
[180] M. C. Campi and S. Garatti, "Wait-and-judge scenario optimization," Math.
Program., vol. 167, no. 1, pp. 155-189, Jan 2018, doi: 10.1007/s10107-016-
1056-9.
[181] N. Kariotoglou, K. Margellos, and J. Lygeros, "On the computational
complexity and generalization properties of multi-stage and stage-wise coupled
scenario programs," Systems & Control Letters, vol. 94, pp. 63-69, Aug 2016,
doi: 10.1016/j.sysconle.2016.05.009.
[182] P. Vayanos, D. Kuhn, and B. Rustem, "A constraint sampling approach for
multi-stage robust optimization," Automatica, vol. 48, no. 3, pp. 459-471,
2012, doi: 10.1016/j.automatica.2011.12.002.
[183] G. Calafiore, D. Lyons, and L. Fagiano, "On mixed-integer random convex
programs," in 51st IEEE Conference on Decision and Control (CDC), 2012, pp.
3508-3513, doi: 10.1109/CDC.2012.6426905.
[184] J. A. De Loera, R. N. La Haye, D. Oliveros, and E. Roldan-Pensado, "Chance-
Constrained Convex Mixed-Integer Optimization and Beyond: Two Sampling
Algorithms within S-Optimization," Journal of Convex Analysis, vol. 25, no. 1,
pp. 201-218, 2018.
[185] M. Chamanbaz, F. Dabbene, R. Tempo, V. Venkataramanan, and Q. G. Wang,
"Sequential Randomized Algorithms for Convex Optimization in the Presence
of Uncertainty," IEEE Trans. Autom. Control., vol. 61, no. 9, pp. 2565-2571,
Sep 2016, doi: 10.1109/tac.2015.2494875.
[186] T. Alamo, R. Tempo, A. Luque, and D. R. Ramirez, "Randomized methods for
design of uncertain systems: Sample complexity and sequential algorithms,"
Automatica, vol. 52, pp. 160-172, Feb 2015, doi:
10.1016/j.automatica.2014.11.004.
[187] G. Calafiore, "Repetitive Scenario Design," IEEE Trans. Autom. Control., vol.
62, no. 3, pp. 1125-1137, Mar 2017, doi: 10.1109/tac.2016.2575859.
[188] K. You, R. Tempo, and P. Xie, "Distributed Algorithms for Robust Convex
Optimization via the Scenario Approach," IEEE Trans. Autom. Control., pp. 1-
1, 2018, doi: 10.1109/TAC.2018.2828093.
[189] K. Margellos, A. Falsone, S. Garatti, and M. Prandini, "Distributed Constrained
Optimization and Consensus in Uncertain Networks via Proximal
Minimization," IEEE Trans. Autom. Control., vol. 63, no. 5, pp. 1372-1387,
May 2018, doi: 10.1109/tac.2017.2747505.
[190] L. Carlone, V. Srivastava, F. Bullo, and G. C. Calafiore, "Distributed Random
Convex Programming via Constraints Consensus," SIAM Journal on Control
and Optimization, vol. 52, no. 1, pp. 629-662, 2014, doi: 10.1137/120885796.
[191] A. Carè, S. Garatti, and M. C. Campi, "FAST-Fast Algorithm for the Scenario
Technique," Oper. Res., vol. 62, no. 3, pp. 662-671, May-Jun 2014, doi:
10.1287/opre.2014.1257.
[192] M. C. Campi, S. Garatti, and F. A. Ramponi, "A General Scenario Theory for
Nonconvex Optimization and Decision Making," IEEE Trans. Autom. Control.,
vol. 63, no. 12, pp. 4067-4078, 2018, doi: 10.1109/TAC.2018.2808446.
[193] T. Alamo, R. Tempo, and E. F. Camacho, "Randomized Strategies for
Probabilistic Solutions of Uncertain Feasibility and Optimization Problems,"
IEEE Trans. Autom. Control., vol. 54, no. 11, pp. 2545-2559, Nov 2009, doi:
10.1109/tac.2009.2031207.
[194] G. Calafiore, F. Dabbene, and R. Tempo, "Research on probabilistic methods
for control system design," Automatica, vol. 47, no. 7, pp. 1279-1293, Jul 2011,
doi: 10.1016/j.automatica.2011.02.029.
[195] S. Grammatico, X. J. Zhang, K. Margellos, P. Goulart, and J. Lygeros, "A
Scenario Approach for Non-Convex Control Design," IEEE Trans. Autom.
Control., vol. 61, no. 2, pp. 334-345, Feb 2016, doi: 10.1109/tac.2015.2433591.
[196] A. R. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic Modeling Using Deep
Belief Networks," IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, no. 1, pp. 14-22, Jan 2012, doi: 10.1109/tasl.2011.2109382.
[197] Z. P. Zhang and J. S. Zhao, "A deep belief network based fault diagnosis model
for complex chemical processes," Comput. Chem. Eng., vol. 107, pp. 395-407,
Dec 2017, doi: 10.1016/j.compchemeng.2017.02.041.
[198] C. Shang, F. Yang, D. X. Huang, and W. X. Lyu, "Data-driven soft sensor
development based on deep learning technique," Journal of Process Control,
vol. 24, no. 3, pp. 223-233, Mar 2014, doi: 10.1016/j.jprocont.2014.01.012.
[199] E. Gawehn, J. A. Hiss, and G. Schneider, "Deep learning in drug discovery,"
Molecular informatics, vol. 35, no. 1, pp. 3-14, 2016.
[200] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with
Deep Convolutional Neural Networks," Communications of the ACM, vol. 60, no.
6, pp. 84-90, Jun 2017, doi: 10.1145/3065386.
[201] Y. Wu, H. Tan, L. Qin, B. Ran, and Z. Jiang, "A hybrid deep learning based
traffic flow prediction method and its understanding," Transportation Research
Part C: Emerging Technologies, vol. 90, pp. 166-180, 2018/05/01/ 2018, doi:
https://doi.org/10.1016/j.trc.2018.03.001.
[202] A. Graves, A. R. Mohamed, and G. Hinton, "Speech recognition with deep
recurrent neural networks," in 2013 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), 2013, pp. 6645-6649.
[203] J. Vermaak and E. C. Botha, "Recurrent neural networks for short-term load
forecasting," IEEE Trans. Power Syst., vol. 13, no. 1, pp. 126-132, 1998, doi:
10.1109/59.651623.
[204] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural
computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[205] J. Potočnik, "Renewable Energy Sources and the Realities of Setting an Energy
Agenda," Science, vol. 315, no. 5813, pp. 810-811, 2007, doi:
10.1126/science.1139086.
[206] H. Kopetz, "Build a biomass energy market," Nature, Comment vol. 494, pp.
29-31, 2013, doi: 10.1038/494029a.
[207] D. Yue, F. You, and S. W. Snyder, "Biomass-to-bioenergy and biofuel supply
chain optimization: Overview, key issues and challenges," Comput. Chem. Eng.,
vol. 66, pp. 36-56, 2014, doi:
http://dx.doi.org/10.1016/j.compchemeng.2013.11.016.
[208] Z. Hu, Y. Wang, and Z. Wen, "Alkali (NaOH) pretreatment of switchgrass by
radio frequency-based dielectric heating," Appl. Biochem. Biotechnol., vol. 148,
no. 1-3, pp. 71-81, 2008, doi: 10.1007/s12010-007-8083-1.
[209] M. Safar et al., "Catalytic effects of potassium on biomass pyrolysis, combustion
and torrefaction," Applied Energy, vol. 235, pp. 346-355, 2019, doi:
https://doi.org/10.1016/j.apenergy.2018.10.065.
[210] V. Benedetti, F. Patuzzi, and M. Baratieri, "Characterization of char from
biomass gasification and its similarities with activated carbon in adsorption
applications," Applied Energy, vol. 227, pp. 92-99, 2018, doi:
https://doi.org/10.1016/j.apenergy.2017.08.076.
[211] W. Zhang, J. R. Barone, and S. Renneckar, "Biomass Fractionation after
Denaturing Cell Walls by Glycerol Thermal Processing," ACS Sustain. Chem.
Eng., vol. 3, no. 3, pp. 413-420, 2015, doi: 10.1021/sc500564g.
[212] T. Damartzis and A. Zabaniotou, "Thermochemical conversion of biomass to
second generation biofuels through integrated process design—A review,"
Renewable and Sustainable Energy Reviews, vol. 15, no. 1, pp. 366-378, 2011,
doi: https://doi.org/10.1016/j.rser.2010.08.003.
[213] K. Dutta, A. Daverey, and J.-G. Lin, "Evolution retrospective for alternative
fuels: First to fourth generation," Renewable Energy, vol. 69, pp. 114-122, 2014,
doi: https://doi.org/10.1016/j.renene.2014.02.044.
[214] R. A. Lee and J.-M. Lavoie, "From first- to third-generation biofuels: Challenges
of producing a commodity from a biomass of increasing complexity," Animal
Frontiers, vol. 3, no. 2, pp. 6-11, 2013, doi: 10.2527/af.2013-0010.
[215] L. Gil-Carrera, J. D. Browne, I. Kilgallon, and J. D. Murphy, "Feasibility study
of an off-grid biomethane mobile solution for agri-waste," Applied Energy, vol.
239, pp. 471-481, 2019, doi: https://doi.org/10.1016/j.apenergy.2019.01.141.
[216] J. Lee et al., "Pyrolysis process of agricultural waste using CO2 for waste
management, energy recovery, and biochar fabrication," Applied Energy, vol.
185, pp. 214-222, 2017, doi: https://doi.org/10.1016/j.apenergy.2016.10.092.
[217] M. Rajinipriya, M. Nagalakshmaiah, M. Robert, and S. Elkoun, "Importance of
Agricultural and Industrial Waste in the Field of Nanocellulose and Recent
Industrial Developments of Wood Based Nanocellulose: A Review," ACS
Sustain. Chem. Eng., vol. 6, no. 3, pp. 2807-2828, 2018, doi:
10.1021/acssuschemeng.7b03437.
[218] W. H. Chen et al., "A comprehensive analysis of food waste derived liquefaction
bio-oil properties for industrial application," Applied Energy, vol. 237, pp. 283-
291, 2019, doi: 10.1016/j.apenergy.2018.12.084.
[219] D. J. Garcia and F. You, "Multiobjective optimization of product and process
networks: General modeling framework, efficient global optimization algorithm,
and case studies on bioconversion," AIChE J., vol. 61, no. 2, pp. 530-554, 2015,
doi: 10.1002/aic.14666.
[220] A. Soroudi and T. Amraee, "Decision making under uncertainty in energy
systems: State of the art," Renewable and Sustainable Energy Reviews, vol. 28,
pp. 376-384, 2013, doi: https://doi.org/10.1016/j.rser.2013.08.039.
[221] C. Ning and F. You, "Optimization under uncertainty in the era of big data and
deep learning: When machine learning meets mathematical programming,"
Comput. Chem. Eng., vol. 125, pp. 434-448, 2019, doi:
https://doi.org/10.1016/j.compchemeng.2019.03.034.
[222] C. Ning and F. You, "A data-driven multistage adaptive robust optimization
framework for planning and scheduling under uncertainty," AIChE J., vol. 63,
no. 10, pp. 4343-4369, 2017, doi: 10.1002/aic.15792.
[223] P. Daoutidis, W. A. Marvin, S. Rangarajan, and A. I. Torres, "Engineering
Biomass Conversion Processes: A Systems Perspective," AIChE J., vol. 59, no.
1, pp. 3-18, 2013, doi: 10.1002/aic.13978.
[224] S. Rangarajan, A. Bhan, and P. Daoutidis, "Rule-Based Generation of
Thermochemical Routes to Biomass Conversion," Ind. Eng. Chem. Res., vol. 49,
no. 21, pp. 10459-10470, 2010, doi: 10.1021/ie100546t.
[225] J. Kim, S. M. Sen, and C. T. Maravelias, "An optimization-based assessment
framework for biomass-to-fuel conversion strategies," Energy & Environmental
Science, vol. 6, no. 4, pp. 1093-1104, 2013, doi: 10.1039/c3ee24243a.
[226] J. Gong and F. You, "Global Optimization for Sustainable Design and Synthesis
of Algae Processing Network for CO2 Mitigation and Biofuel Production Using
Life Cycle Optimization," AIChE J., vol. 60, no. 9, pp. 3195-3210, 2014, doi:
10.1002/aic.14504.
[227] J. Gong and F. You, "Sustainable design and synthesis of energy systems,"
Current Opinion in Chemical Engineering, vol. 10, pp. 77-86, 2015, doi:
10.1016/j.coche.2015.09.001.
[228] S. Bairamzadeh, M. Saidi-Mehrabad, and M. S. Pishvaee, "Modelling different
types of uncertainty in biofuel supply network design and planning: A robust
optimization approach," Renewable Energy, vol. 116, pp. 500-517, 2018, doi:
https://doi.org/10.1016/j.renene.2017.09.020.
[229] C. Caldeira, O. Swei, F. Freire, L. C. Dias, E. A. Olivetti, and R. Kirchain,
"Planning strategies to address operational and price uncertainty in biodiesel
production," Applied Energy, vol. 238, pp. 1573-1581, 2019, doi:
10.1016/j.apenergy.2019.01.195.
[230] K. Tong, J. Gong, D. Yue, and F. You, "Stochastic Programming Approach to
Optimal Design and Operations of Integrated Hydrocarbon Biofuel and
Petroleum Supply Chains," ACS Sustain. Chem. Eng., vol. 2, no. 1, pp. 49-61,
2014, doi: 10.1021/sc4002671.
[231] A. Osmani and J. Zhang, "Economic and environmental optimization of a large
scale sustainable dual feedstock lignocellulosic-based bioethanol supply chain
in a stochastic environment," Applied Energy, vol. 114, pp. 572-587, 2014, doi:
10.1016/j.apenergy.2013.10.024.
[232] D. Bertsimas, S. Shtern, and B. Sturt, "A Data-Driven Approach for Multi-Stage
Linear Optimization," Optimization Online, 2019.
[233] Z. Chen, M. Sim, and P. Xiong, "Robust Stochastic Optimization: The Synergy
of Robust Optimization and Stochastic Programming," Optimization Online,
2019.
[234] W. Xie, "On Distributionally Robust Chance Constrained Programs with
Wasserstein Distance," arXiv preprint arXiv:1806.07418, 2018.
[235] E. Delage and Y. Ye, "Distributionally Robust Optimization Under Moment
Uncertainty with Application to Data-Driven Problems," Oper. Res., vol. 58, no.
3, pp. 595-612, 2010, doi: 10.1287/opre.1090.0741.
[236] K. L. Hoffman, "A method for globally minimizing concave functions over
convex sets," Math. Program., vol. 20, no. 1, pp. 22-32, 1981, doi:
10.1007/bf01589330.
[237] S. S. Toor, L. Rosendahl, and A. Rudolf, "Hydrothermal liquefaction of biomass:
A review of subcritical water technologies," Energy, vol. 36, no. 5, pp. 2328-
2342, 2011, doi: https://doi.org/10.1016/j.energy.2011.03.013.
[238] D. J. Garcia and F. You, "Systems engineering opportunities for agricultural and
organic waste management in the food-water-energy nexus," Current Opinion
in Chemical Engineering, vol. 18, pp. 23-31, 2017, doi:
10.1016/j.coche.2017.08.004.
[239] P. Morone, A. Koutinas, N. Gathergood, M. Arshadi, and A. Matharu, "Food
waste: Challenges and opportunities for enhancing the emerging bio-economy,"
Journal of Cleaner Production, vol. 221, pp. 10-16, 2019, doi:
https://doi.org/10.1016/j.jclepro.2019.02.258.
[240] J. Nicoletti, C. Ning, and F. You, "Incorporating Agricultural Waste-to-Energy
Pathways into Biomass Product and Process Network through Data-Driven
Nonlinear Adaptive Robust Optimization," Energy, vol. 180, pp. 556-571, 2019.
[241] R. Hakawati, B. M. Smyth, G. McCullough, F. De Rosa, and D. Rooney, "What
is the most energy efficient route for biogas utilization: Heat, electricity or
transport?," Applied Energy, vol. 206, pp. 1076-1087, 2017, doi:
10.1016/j.apenergy.2017.08.068.
[242] Y. Y. Jin, T. Chen, X. Chen, and Z. X. Yu, "Life-cycle assessment of energy
consumption and environmental impact of an integrated food waste-based
biogas plant," Applied Energy, vol. 151, pp. 227-236, 2015, doi:
10.1016/j.apenergy.2015.04.058.
[243] R. Campuzano and S. González-Martínez, "Characteristics of the organic
fraction of municipal solid waste and methane production: A review," Waste
Manage. (Oxford), vol. 54, pp. 3-12, 2016, doi:
https://doi.org/10.1016/j.wasman.2016.05.016.
[244] C. Villani, Optimal transport: old and new. Springer Science & Business Media,
2008.
[245] C. Zhao and Y. Guan, "Data-driven risk-averse stochastic optimization with
Wasserstein metric," Oper. Res. Lett., vol. 46, no. 2, pp. 262-267, 2018, doi:
https://doi.org/10.1016/j.orl.2018.01.011.
[246] T. A. Reddy, Applied data analysis and modeling for energy engineers and
scientists. Springer Science & Business Media, 2011.
[247] M. Rizwan, J. H. Lee, and R. Gani, "Optimal design of microalgae-based
biorefinery: Economics, opportunities and challenges," Applied Energy, vol. 150,
pp. 69-79, 2015, doi: https://doi.org/10.1016/j.apenergy.2015.04.018.
[248] Z. Zheng et al., "Effect of dairy manure to switchgrass co-digestion ratio on
methane production and the bacterial community in batch anaerobic digestion,"
Applied Energy, vol. 151, pp. 249-257, 2015, doi:
https://doi.org/10.1016/j.apenergy.2015.04.078.
[249] C. G. Gutierrez-Arriaga, M. Serna-Gonzalez, J. M. Ponce-Ortega, and M. M. El-
Halwagi, "Sustainable Integration of Algal Biodiesel Production with Steam
Electric Power Plants for Greenhouse Gas Mitigation," ACS Sustain. Chem. Eng.,
vol. 2, no. 6, pp. 1388-1403, 2014, doi: 10.1021/sc400436a.
[250] J. Seader, W. D. Seider, and D. R. Lewin, Product and process design principles:
synthesis, analysis and evaluation. Wiley, 2004.
[251] D. Bertsimas, M. Sim, and M. Zhang, "Adaptive Distributionally Robust
Optimization," Manage. Sci., vol. 65, no. 2, pp. 604-618, 2019, doi:
10.1287/mnsc.2017.2952.
[252] D. J. Garcia and F. You, "Network-Based Life Cycle Optimization of the Net
Atmospheric CO2-eq Ratio (NACR) of Fuels and Chemicals Production from
Biomass," ACS Sustain. Chem. Eng., vol. 3, no. 8, pp. 1732-1744, 2015, doi:
10.1021/acssuschemeng.5b00262.
[253] "Commodity Prices," Index Mundi, 2019. [Online]. Available:
https://www.indexmundi.com/commodities/.
[254] R. E. Rosenthal, GAMS: A User's Guide. Washington, DC: GAMS Development
Corporation, 2008.
[255] J. Remon, P. Arcelus-Arrillaga, L. Garcia, and J. Arauzo, "Simultaneous
production of gaseous and liquid biofuels from the synergetic co-valorisation of
bio-oil and crude glycerol in supercritical water," Applied Energy, vol. 228, pp.
2275-2287, 2018, doi: 10.1016/j.apenergy.2018.07.093.
[256] I. Ullah Khan et al., "Biogas as a renewable energy fuel – A review of biogas
upgrading, utilisation and storage," Energy Convers. Manage., vol. 150, pp. 277-
294, 2017, doi: https://doi.org/10.1016/j.enconman.2017.08.035.
[257] A. Shapiro, "On Duality Theory of Conic Linear Problems," in Semi-Infinite
Programming: Recent Advances, M. Á. Goberna and M. A. López Eds. Boston,
MA: Springer US, 2001, pp. 135-165.
[258] A. J. Conejo and L. Baringo, Power system operations. Springer, 2018.
[259] X. Xia and A. M. Elaiw, "Optimal dynamic economic dispatch of generation: A
review," Electric Power Systems Research, vol. 80, no. 8, pp. 975-986, 2010,
doi: https://doi.org/10.1016/j.epsr.2009.12.012.
[260] J. Hetzer, D. C. Yu, and K. Bhattarai, "An Economic Dispatch Model
Incorporating Wind Power," IEEE Transactions on Energy Conversion, vol. 23,
no. 2, pp. 603-611, 2008, doi: 10.1109/TEC.2007.914171.
[261] A. Alqurashi, A. H. Etemadi, and A. Khodaei, "Treatment of uncertainty for next
generation power systems: State-of-the-art in stochastic optimization," Electric
Power Systems Research, vol. 141, pp. 233-245, 2016, doi:
https://doi.org/10.1016/j.epsr.2016.08.009.
[262] Á. Lorca and X. A. Sun, "Adaptive Robust Optimization With Dynamic Uncertainty
Sets for Multi-Period Economic Dispatch Under Significant Wind," IEEE Trans.
Power Syst., vol. 30, no. 4, pp. 1702-1713, 2015, doi:
10.1109/TPWRS.2014.2357714.
[263] J. Zhao, T. Zheng, and E. Litvinov, "Variable Resource Dispatch Through Do-
Not-Exceed Limit," IEEE Trans. Power Syst., vol. 30, no. 2, pp. 820-828, 2015,
doi: 10.1109/TPWRS.2014.2333367.
[264] R. A. Jabr, S. Karaki, and J. A. Korbane, "Robust Multi-Period OPF With
Storage and Renewables," IEEE Trans. Power Syst., vol. 30, no. 5, pp. 2790-
2799, 2015, doi: 10.1109/TPWRS.2014.2365835.
[265] H. Qiu, B. Zhao, W. Gu, and R. Bo, "Bi-Level Two-Stage Robust Optimal
Scheduling for AC/DC Hybrid Multi-Microgrids," IEEE Transactions on Smart
Grid, vol. 9, no. 5, pp. 5455-5466, 2018, doi: 10.1109/TSG.2018.2806973.
[266] W. Wu, J. Chen, B. Zhang, and H. Sun, "A Robust Wind Power Optimization
Method for Look-Ahead Power Dispatch," IEEE Transactions on Sustainable
Energy, vol. 5, no. 2, pp. 507-515, 2014, doi: 10.1109/TSTE.2013.2294467.
[267] Z. Li, W. Wu, B. Zhang, and B. Wang, "Adjustable Robust Real-Time Power
Dispatch With Large-Scale Wind Power Integration," IEEE Transactions on
Sustainable Energy, vol. 6, no. 2, pp. 357-368, 2015, doi:
10.1109/TSTE.2014.2377752.
[268] Z. Lin, H. Chen, Q. Wu, W. Li, M. Li, and T. Ji, "Mean-tracking model based
stochastic economic dispatch for power systems with high penetration of wind
power," Energy, vol. 193, p. 116826, 2020, doi:
https://doi.org/10.1016/j.energy.2019.116826.
[269] R. Lu, T. Ding, B. Qin, J. Ma, X. Fang, and Z. Y. Dong, "Multi-Stage Stochastic
Programming to Joint Economic Dispatch for Energy and Reserve with
Uncertain Renewable Energy," IEEE Transactions on Sustainable Energy, pp.
1-1, 2019, doi: 10.1109/TSTE.2019.2918269.
[270] F. Qiu and J. Wang, "Chance-Constrained Transmission Switching With
Guaranteed Wind Power Utilization," IEEE Trans. Power Syst., vol. 30, no. 3,
pp. 1270-1278, 2015, doi: 10.1109/TPWRS.2014.2346987.
[271] Z. Zhang, Y. Sun, D. W. Gao, J. Lin, and L. Cheng, "A Versatile Probability
Distribution Model for Wind Power Forecast Errors and Its Application in
Economic Dispatch," IEEE Trans. Power Syst., vol. 28, no. 3, pp. 3114-3125,
2013, doi: 10.1109/TPWRS.2013.2249596.
[272] C. Tang et al., "Look-Ahead Economic Dispatch With Adjustable Confidence
Interval Based on a Truncated Versatile Distribution Model for Wind Power,"
IEEE Trans. Power Syst., vol. 33, no. 2, pp. 1755-1767, 2018, doi:
10.1109/TPWRS.2017.2715852.
[273] Z. Wang, C. Shen, F. Liu, X. Wu, C. Liu, and F. Gao, "Chance-Constrained
Economic Dispatch With Non-Gaussian Correlated Wind Power Uncertainty,"
IEEE Trans. Power Syst., vol. 32, no. 6, pp. 4880-4893, 2017, doi:
10.1109/TPWRS.2017.2672750.
[274] Y. Yang, W. Wu, B. Wang, and M. Li, "Analytical Reformulation for Stochastic
Unit Commitment Considering Wind Power Uncertainty with Gaussian Mixture
Model," IEEE Trans. Power Syst., pp. 1-1, 2019, doi:
10.1109/TPWRS.2019.2960389.
[275] B. Khorramdel, A. Zare, C. Y. Chung, and P. Gavriliadis, "A Generic Convex
Model for a Chance-Constrained Look-Ahead Economic Dispatch Problem
Incorporating an Efficient Wind Power Distribution Modeling," IEEE Trans.
Power Syst., vol. 35, no. 2, pp. 873-886, 2020, doi:
10.1109/TPWRS.2019.2940288.
[276] K. Baker and A. Bernstein, "Joint Chance Constraints in AC Optimal Power
Flow: Improving Bounds Through Learning," IEEE Transactions on Smart Grid,
vol. 10, no. 6, pp. 6376-6385, 2019, doi: 10.1109/TSG.2019.2903767.
[277] M. S. Modarresi et al., "Scenario-Based Economic Dispatch With Tunable Risk
Levels in High-Renewable Power Systems," IEEE Trans. Power Syst., vol. 34,
no. 6, pp. 5103-5114, 2019, doi: 10.1109/TPWRS.2018.2874464.
[278] H. Ming, L. Xie, M. C. Campi, S. Garatti, and P. R. Kumar, "Scenario-Based
Economic Dispatch With Uncertain Demand Response," IEEE Transactions on
Smart Grid, vol. 10, no. 2, pp. 1858-1868, 2019, doi:
10.1109/TSG.2017.2778688.
[279] X. Geng and L. Xie, "Data-driven decision making in power systems with
probabilistic guarantees: Theory and applications of chance-constrained
optimization," Annual Reviews in Control, vol. 47, pp. 341-363, 2019, doi:
https://doi.org/10.1016/j.arcontrol.2019.05.005.
[280] O. Ciftci, M. Mehrtash, and A. Kargarian, "Data-Driven Nonparametric Chance-
Constrained Optimization for Microgrid Energy Management," IEEE
Transactions on Industrial Informatics, vol. 16, no. 4, pp. 2447-2457, 2020, doi:
10.1109/TII.2019.2932078.
[281] W. Sun, M. Zamani, M. R. Hesamzadeh, and H. Zhang, "Data-Driven
Probabilistic Optimal Power Flow With Nonparametric Bayesian Modeling and
Inference," IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1077-1090,
2020, doi: 10.1109/TSG.2019.2931160.
[282] C. Ning and F. You, "Data-Driven Adaptive Robust Unit Commitment Under
Wind Power Uncertainty: A Bayesian Nonparametric Approach," IEEE Trans.
Power Syst., vol. 34, no. 3, pp. 2409-2418, 2019, doi:
10.1109/TPWRS.2019.2891057.
[283] W. Wei, F. Liu, and S. Mei, "Distributionally Robust Co-Optimization of
Energy and Reserve Dispatch," IEEE Transactions on Sustainable Energy, vol.
7, no. 1, pp. 289-300, 2016, doi: 10.1109/TSTE.2015.2494010.
[284] Y. L. Zhang, S. Q. Shen, and J. L. Mathieu, "Distributionally Robust Chance-
Constrained Optimal Power Flow With Uncertain Renewables and Uncertain
Reserves Provided by Loads," IEEE Trans. Power Syst., vol. 32, no. 2, pp. 1378-
1388, Mar 2017, doi: 10.1109/tpwrs.2016.2572104.
[285] M. Shahidehpour, Y. Zhou, Z. Wei, S. Chen, Z. Li, and G. Sun, "Distributionally
Robust Co-optimization of Energy and Reserve for Combined Distribution
Networks of Power and District Heating," IEEE Trans. Power Syst., pp. 1-1,
2019, doi: 10.1109/TPWRS.2019.2954710.
[286] Z. Shi, H. Liang, S. Huang, and V. Dinavahi, "Distributionally Robust Chance-
Constrained Energy Management for Islanded Microgrids," IEEE Transactions
on Smart Grid, vol. 10, no. 2, pp. 2234-2244, 2019, doi:
10.1109/TSG.2018.2792322.
[287] X. Lu, K. W. Chan, S. Xia, B. Zhou, and X. Luo, "Security-Constrained
Multiperiod Economic Dispatch With Renewable Energy Utilizing
Distributionally Robust Optimization," IEEE Transactions on Sustainable
Energy, vol. 10, no. 2, pp. 768-779, 2019, doi: 10.1109/TSTE.2018.2847419.
[288] B. Li, R. Jiang, and J. L. Mathieu, "Distributionally Robust Chance-Constrained
Optimal Power Flow Assuming Unimodal Distributions With Misspecified
Modes," IEEE Transactions on Control of Network Systems, vol. 6, no. 3, pp.
1223-1234, 2019, doi: 10.1109/TCNS.2019.2930872.
[289] Y. Chen, W. Wei, F. Liu, and S. Mei, "Distributionally robust hydro-thermal-
wind economic dispatch," Applied Energy, vol. 173, pp. 511-519, 2016, doi:
https://doi.org/10.1016/j.apenergy.2016.04.060.
[290] H. Ma, R. Jiang, and Z. Yan, "Distributionally Robust Co-Optimization of
Power Dispatch and Do-Not-Exceed Limits," IEEE Trans. Power Syst., vol. 35,
no. 2, pp. 887-897, 2020, doi: 10.1109/TPWRS.2019.2941635.
[291] W. J. Xie and S. Ahmed, "Distributionally Robust Chance Constrained Optimal
Power Flow with Renewables: A Conic Reformulation," IEEE Trans. Power
Syst., vol. 33, no. 2, pp. 1860-1867, Mar 2018, doi:
10.1109/tpwrs.2017.2725581.
[292] M. Lubin, Y. Dvorkin, and S. Backhaus, "A Robust Approach to Chance
Constrained Optimal Power Flow With Renewable Generation," IEEE Trans.
Power Syst., vol. 31, no. 5, pp. 3840-3849, 2016, doi:
10.1109/TPWRS.2015.2499753.
[293] C. Duan, L. Jiang, W. Fang, J. Liu, and S. Liu, "Data-Driven Distributionally
Robust Energy-Reserve-Storage Dispatch," IEEE Transactions on Industrial
Informatics, vol. 14, no. 7, pp. 2826-2836, 2018, doi: 10.1109/TII.2017.2771355.
[294] H. Zhang, Z. Hu, E. Munsing, S. J. Moura, and Y. Song, "Data-Driven Chance-
Constrained Regulation Capacity Offering for Distributed Energy Resources,"
IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2713-2725, 2019, doi:
10.1109/TSG.2018.2809046.
[295] Y. Guo, K. Baker, E. Dall'Anese, Z. Hu, and T. Summers, "Data-based
distributionally robust stochastic optimal power flow, Part I: Methodologies,"
IEEE Trans. Power Syst., pp. 1-1, 2018, doi: 10.1109/TPWRS.2018.2878385.
[296] C. Ordoudis, V. A. Nguyen, D. Kuhn, and P. Pinson, "Energy and Reserve
Dispatch with Distributionally Robust Joint Chance Constraints," 2018.
[297] Y. Chen, Q. Guo, H. Sun, Z. Li, W. Wu, and Z. Li, "A Distributionally Robust
Optimization Model for Unit Commitment Based on Kullback–Leibler
Divergence," IEEE Trans. Power Syst., vol. 33, no. 5, pp. 5147-5160, 2018, doi:
10.1109/TPWRS.2018.2797069.
[298] Y. Z. Chen, Y. S. Wang, D. Kirschen, and B. S. Zhang, "Model-Free Renewable
Scenario Generation Using Generative Adversarial Networks," IEEE Trans.
Power Syst., vol. 33, no. 3, pp. 3265-3275, May 2018, doi:
10.1109/tpwrs.2018.2794541.
[299] S. Zhao and F. You, "Distributionally Robust Chance Constrained Programming
with Generative Adversarial Networks (GANs)," AIChE J., vol. 66, no. 6, p.
e16963, 2020.
[300] S. Nowozin, B. Cseke, and R. Tomioka, "f-GAN: Training generative neural
samplers using variational divergence minimization," in Advances in Neural
Information Processing Systems, 2016, pp. 271-279.
[301] X. Nguyen, M. J. Wainwright, and M. I. Jordan, "Estimating Divergence
Functionals and the Likelihood Ratio by Convex Risk Minimization," IEEE
Transactions on Information Theory, vol. 56, no. 11, pp. 5847-5861, 2010, doi:
10.1109/TIT.2010.2068870.
[302] I. J. Goodfellow et al., "Generative Adversarial Nets," in Advances in Neural
Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes,
N. D. Lawrence, and K. Q. Weinberger, Eds., 2014.
[303] G. Schildbach, L. Fagiano, and M. Morari, "Randomized Solutions to Convex
Programs with Multiple Chance Constraints," SIAM J. Optim., vol. 23, no. 4, pp.
2479-2501, 2013, doi: 10.1137/120878719.
[304] C. Draxl, A. Clifton, B.-M. Hodge, and J. McCaa, "The Wind Integration
National Dataset (WIND) Toolkit," Applied Energy, vol. 151, pp. 355-366, 2015,
doi: https://doi.org/10.1016/j.apenergy.2015.03.121.
[305] G. Morales-España, "Unit commitment: computational performance, system
representation and wind uncertainty management," Comillas Pontifical
University, 2014.
[306] D. Q. Mayne, "Model predictive control: Recent developments and future
promise," Automatica, vol. 50, no. 12, pp. 2967-2986, 2014, doi:
https://doi.org/10.1016/j.automatica.2014.10.128.
[307] M. Morari and J. Lee, "Model predictive control: past, present and future,"
Comput. Chem. Eng., vol. 23, no. 4, pp. 667-682, 1999, doi:
https://doi.org/10.1016/S0098-1354(98)00301-9.
[308] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, "Constrained
model predictive control: Stability and optimality," Automatica, vol. 36, no. 6,
pp. 789-814, 2000, doi: https://doi.org/10.1016/S0005-1098(99)00214-9.
[309] S. J. Qin and T. A. Badgwell, "A survey of industrial model predictive control
technology," Control Engineering Practice, vol. 11, no. 7, pp. 733-764, 2003,
doi: https://doi.org/10.1016/S0967-0661(02)00186-7.
[310] J. B. Rawlings and D. Q. Mayne, Model predictive control: Theory and design.
Nob Hill Pub., 2009.
[311] A. Bemporad and M. Morari, "Robust model predictive control: A survey," in
Robustness in identification and control: Springer, 1999, pp. 207-226.
[312] D. Q. Mayne, M. M. Seron, and S. V. Raković, "Robust model predictive control
of constrained linear systems with bounded disturbances," Automatica, vol. 41,
no. 2, pp. 219-224, 2005, doi:
https://doi.org/10.1016/j.automatica.2004.08.019.
[313] W. Langson, I. Chryssochoos, S. V. Raković, and D. Q. Mayne, "Robust model
predictive control using tubes," Automatica, vol. 40, no. 1, pp. 125-133,
2004, doi: https://doi.org/10.1016/j.automatica.2003.08.009.
[314] L. Chisci, J. A. Rossiter, and G. Zappa, "Systems with persistent disturbances:
predictive control with restricted constraints," Automatica, vol. 37, no. 7, pp.
1019-1028, 2001, doi: https://doi.org/10.1016/S0005-1098(01)00051-6.
[315] A. Mesbah, "Stochastic Model Predictive Control: An Overview and
Perspectives for Future Research," IEEE Control Systems Magazine, vol. 36, no.
6, pp. 30-44, 2016, doi: 10.1109/MCS.2016.2602087.
[316] M. Farina, L. Giulioni, and R. Scattolini, "Stochastic linear Model Predictive
Control with chance constraints – A review," Journal of Process Control, vol.
44, pp. 53-67, 2016, doi: https://doi.org/10.1016/j.jprocont.2016.03.005.
[317] D. Mayne, "Robust and stochastic model predictive control: Are we going in the
right direction?," Annual Reviews in Control, vol. 41, pp. 184-192, 2016, doi:
https://doi.org/10.1016/j.arcontrol.2016.04.006.
[318] M. Lorenzen, F. Dabbene, R. Tempo, and F. Allgöwer, "Constraint-Tightening
and Stability in Stochastic Model Predictive Control," IEEE Trans. Autom.
Control., vol. 62, no. 7, pp. 3165-3177, 2017, doi: 10.1109/TAC.2016.2625048.
[319] D. Muñoz-Carpintero, G. Hu, and C. J. Spanos, "Stochastic Model Predictive
Control with adaptive constraint tightening for non-conservative chance
constraints satisfaction," Automatica, vol. 96, pp. 32-39, 2018, doi:
https://doi.org/10.1016/j.automatica.2018.06.026.
[320] B. Kouvaritakis, M. Cannon, S. V. Raković, and Q. Cheng, "Explicit use of
probabilistic distributions in linear predictive control," Automatica, vol. 46, no.
10, pp. 1719-1724, 2010, doi: https://doi.org/10.1016/j.automatica.2010.06.034.
[321] M. Cannon, B. Kouvaritakis, S. V. Raković, and Q. Cheng, "Stochastic Tubes
in Model Predictive Control With Probabilistic Constraints," IEEE Trans.
Autom. Control., vol. 56, no. 1, pp. 194-200, 2011, doi:
10.1109/TAC.2010.2086553.
[322] L. Dai, Y. Xia, Y. Gao, B. Kouvaritakis, and M. Cannon, "Cooperative
distributed stochastic MPC for systems with state estimation and coupled
probabilistic constraints," Automatica, vol. 61, pp. 89-96, 2015, doi:
https://doi.org/10.1016/j.automatica.2015.07.025.
[323] M. Korda, R. Gondhalekar, F. Oldewurtel, and C. N. Jones, "Stochastic MPC
Framework for Controlling the Average Constraint Violation," IEEE Trans.
Autom. Control., vol. 59, no. 7, pp. 1706-1721, 2014, doi:
10.1109/TAC.2014.2310066.
[324] D. Chatterjee and J. Lygeros, "On Stability and Performance of Stochastic
Predictive Control Techniques," IEEE Trans. Autom. Control., vol. 60, no. 2, pp.
509-514, 2015, doi: 10.1109/TAC.2014.2335274.
[325] D. Chatterjee, P. Hokayem, and J. Lygeros, "Stochastic Receding Horizon
Control With Bounded Control Inputs: A Vector Space Approach," IEEE Trans.
Autom. Control., vol. 56, no. 11, pp. 2704-2710, 2011, doi:
10.1109/TAC.2011.2159422.
[326] G. Schildbach, L. Fagiano, C. Frei, and M. Morari, "The scenario approach for
Stochastic Model Predictive Control with bounds on closed-loop constraint
violations," Automatica, vol. 50, no. 12, pp. 3009-3018, 2014, doi:
https://doi.org/10.1016/j.automatica.2014.10.035.
[327] G. C. Calafiore and L. Fagiano, "Stochastic model predictive control of LPV
systems via scenario optimization," Automatica, vol. 49, no. 6, pp. 1861-1866,
2013, doi: https://doi.org/10.1016/j.automatica.2013.02.060.
[328] M. Lorenzen, F. Dabbene, R. Tempo, and F. Allgöwer, "Stochastic MPC with
offline uncertainty sampling," Automatica, vol. 81, pp. 176-183, 2017, doi:
https://doi.org/10.1016/j.automatica.2017.03.031.
[329] M. Farina, L. Giulioni, L. Magni, and R. Scattolini, "An approach to output-
feedback MPC of stochastic linear discrete-time systems," Automatica, vol. 55,
pp. 140-149, 2015, doi: https://doi.org/10.1016/j.automatica.2015.02.039.
[330] M. Farina and R. Scattolini, "Model predictive control of linear systems with
multiplicative unbounded uncertainty and chance constraints," Automatica, vol.
70, pp. 258-265, 2016, doi: https://doi.org/10.1016/j.automatica.2016.04.008.
[331] J. A. Paulson and A. Mesbah, "An efficient method for stochastic optimal
control with joint chance constraints for nonlinear systems," International
Journal of Robust and Nonlinear Control, 2017.
[332] P. Sopasakis, D. Herceg, A. Bemporad, and P. Patrinos, "Risk-averse model
predictive control," Automatica, vol. 100, pp. 281-288, 2019, doi:
https://doi.org/10.1016/j.automatica.2018.11.022.
[333] S. Singh, Y. Chow, A. Majumdar, and M. Pavone, "A Framework for Time-
Consistent, Risk-Sensitive Model Predictive Control: Theory and Algorithms,"
IEEE Trans. Autom. Control., vol. 64, no. 7, pp. 2905-2912, 2019, doi:
10.1109/TAC.2018.2874704.
[334] I. Yang, "A dynamic game approach to distributionally robust safety
specifications for stochastic systems," Automatica, vol. 94, pp. 94-101, Aug
2018, doi: 10.1016/j.automatica.2018.04.022.
[335] A. Aswani, H. Gonzalez, S. S. Sastry, and C. Tomlin, "Provably safe and robust
learning-based model predictive control," Automatica, vol. 49, no. 5, pp. 1216-
1226, 2013, doi: https://doi.org/10.1016/j.automatica.2013.02.003.
[336] U. Rosolia and F. Borrelli, "Learning Model Predictive Control for Iterative
Tasks. A Data-Driven Control Framework," IEEE Trans. Autom. Control., vol.
63, no. 7, pp. 1883-1896, 2018, doi: 10.1109/TAC.2017.2753460.
[337] U. Rosolia, X. Zhang, and F. Borrelli, "Data-Driven Predictive Control for
Autonomous Systems," Annual Review of Control, Robotics, and Autonomous
Systems, vol. 1, no. 1, pp. 259-286, 2018, doi: 10.1146/annurev-control-060117-
105215.
[338] T. Koller, F. Berkenkamp, M. Turchetta, and A. Krause, "Learning-Based
Model Predictive Control for Safe Exploration," in 2018 IEEE Conference on
Decision and Control (CDC), 2018, pp. 6059-6066, doi:
10.1109/CDC.2018.8619572.
[339] D. Limon, J. Calliess, and J. M. Maciejowski, "Learning-based Nonlinear Model
Predictive Control," IFAC-PapersOnLine, vol. 50, no. 1, pp. 7769-7776, 2017,
doi: https://doi.org/10.1016/j.ifacol.2017.08.1050.
[340] E. Terzi, L. Fagiano, M. Farina, and R. Scattolini, "Learning-based predictive
control for linear systems: A unitary approach," Automatica, vol. 108, p. 108473,
2019, doi: https://doi.org/10.1016/j.automatica.2019.06.025.
[341] T. A. N. Heirung, B. E. Ydstie, and B. Foss, "Dual adaptive model predictive
control," Automatica, vol. 80, pp. 340-348, 2017, doi:
https://doi.org/10.1016/j.automatica.2017.01.030.
[342] N. M. Filatov and H. Unbehauen, "Survey of adaptive dual control methods,"
IEE Proceedings Control Theory and Applications, vol. 147, no. 1, pp. 118-128,
2000.
[343] L. Hewing, K. P. Wabersich, M. Menner, and M. N. Zeilinger, "Learning-Based
Model Predictive Control: Toward Safe Learning in Control," Annual Review of
Control, Robotics, and Autonomous Systems, vol. 3, no. 1, pp. 269-296, 2020,
doi: 10.1146/annurev-control-090419-075625.
[344] L. Hewing, J. Kabzan, and M. N. Zeilinger, "Cautious model predictive control
using Gaussian process regression," arXiv preprint arXiv:1705.10702, 2017.
[345] R. Soloperto, M. A. Müller, S. Trimpe, and F. Allgöwer, "Learning-Based
Robust Model Predictive Control with State-Dependent Uncertainty," IFAC-
PapersOnLine, vol. 51, no. 20, pp. 442-447, 2018, doi:
https://doi.org/10.1016/j.ifacol.2018.11.052.
[346] Z. Wu, D. Rincon, and P. D. Christofides, "Real-Time Adaptive Machine-
Learning-Based Predictive Control of Nonlinear Processes," Ind. Eng. Chem.
Res., 2019, doi: 10.1021/acs.iecr.9b03055.
[347] R. Gomes, M. Welling, and P. Perona, "Incremental learning of nonparametric
Bayesian mixture models," in 2008 IEEE Conference on Computer Vision and
Pattern Recognition, 2008, pp. 1-8, doi: 10.1109/CVPR.2008.4587370.
[348] J. A. Paulson, T. L. M. Santos, and A. Mesbah, "Mixed stochastic-deterministic
tube MPC for offset-free tracking in the presence of plant-model mismatch,"
Journal of Process Control, 2018, doi:
https://doi.org/10.1016/j.jprocont.2018.04.010.
[349] M. Korda, R. Gondhalekar, J. Cigler, and F. Oldewurtel, "Strongly feasible
stochastic model predictive control," in 2011 50th IEEE Conference on Decision
and Control and European Control Conference, 2011, pp. 1245-1251, doi:
10.1109/CDC.2011.6161250.
[350] S. Samuelson and I. Yang, "Safety-Aware Optimal Control of Stochastic
Systems Using Conditional Value-at-Risk," in 2018 Annual American Control
Conference (ACC), 2018, pp. 6285-6290, doi: 10.23919/ACC.2018.8430957.
[351] J. Sethuraman, "A constructive definition of Dirichlet priors," Statistica
Sinica, vol. 4, no. 2, pp. 639-650, Jul 1994.
[352] T. Campbell and J. P. How, "Bayesian Nonparametric Set Construction for
Robust Optimization," in 2015 American Control Conference, 2015, pp. 4216-4221.
[353] D. M. Blei and M. I. Jordan, "Variational Inference for Dirichlet Process
Mixtures," Bayesian Anal., vol. 1, no. 1, pp. 121-143, 2006, doi: 10.1214/06-
ba104.
[354] K. Kurihara, M. Welling, and N. Vlassis, "Accelerated variational Dirichlet
process mixtures," in Advances in neural information processing systems, 2007,
pp. 761-768.
[355] F. D. Brunner, W. Heemels, and F. Allgöwer, "Robust event-triggered MPC
with guaranteed asymptotic bound and average sampling rate," IEEE
Transactions on Automatic Control, vol. 62, no. 11, pp. 5694-5709, 2017.
[356] M. Cannon and B. Kouvaritakis, Model Predictive Control: Classical, Robust
and Stochastic. New York, NY, USA: Springer, 2016.
[357] F. Blanchini, "Set invariance in control," Automatica, vol. 35, no. 11, pp. 1747-
1767, 1999, doi: https://doi.org/10.1016/S0005-1098(99)00113-2.
[358] I. Kolmanovsky and E. G. Gilbert, "Theory and computation of disturbance
invariant sets for discrete-time linear systems," Mathematical problems in
engineering, vol. 4, no. 4, pp. 317-367, 1998.
[359] J. Lofberg, "YALMIP: A toolbox for modeling and optimization in MATLAB,"
in 2004 IEEE international conference on robotics and automation (IEEE Cat.
No. 04CH37508), 2004: IEEE, pp. 284-289.
[360] M. Herceg, M. Kvasnica, C. N. Jones, and M. Morari, "Multi-parametric
toolbox 3.0," in 2013 European control conference (ECC), 2013: IEEE, pp. 502-
510.
[361] S. V. Rakovic, E. C. Kerrigan, K. I. Kouramas, and D. Q. Mayne, "Invariant
approximations of the minimal robust positively invariant set," IEEE
Transactions on automatic control, vol. 50, no. 3, pp. 406-410, 2005.
[362] A. Shapiro and A. Kleywegt, "Minimax analysis of stochastic problems,"
Optimization Methods and Software, vol. 17, no. 3, pp. 523-542, 2002.
[363] G. A. Hanasusanto, D. Kuhn, S. W. Wallace, and S. Zymler, "Distributionally
robust multi-item newsvendor problems with multimodal demand
distributions," Math. Program., vol. 152, no. 1-2, pp. 1-32, 2015.
[364] X. J. Zhang, M. Kamgarpour, A. Georghiou, P. Goulart, and J. Lygeros, "Robust
optimal control with adjustable uncertainty sets," Automatica, vol. 75, pp. 249-
259, Jan 2017, doi: 10.1016/j.automatica.2016.09.016.
[365] I. R. Petersen and R. Tempo, "Robust control of uncertain systems: Classical
results and recent developments," Automatica, vol. 50, no. 5, pp. 1315-1335,
May 2014, doi: 10.1016/j.automatica.2014.02.042.
[366] C. Z. Wu, K. L. Teo, and S. Y. Wu, "Min-max optimal control of linear systems
with uncertainty and terminal state constraints," Automatica, vol. 49, no. 6, pp.
1809-1815, Jun 2013, doi: 10.1016/j.automatica.2013.02.052.
[367] M. E. Villanueva, R. Quirynen, M. Diehl, B. Chachuat, and B. Houska, "Robust
MPC via min-max differential inequalities," Automatica, vol. 77, pp. 311-321,
Mar 2017, doi: 10.1016/j.automatica.2016.11.022.
[368] D. Bertsimas and M. Sim, "The price of robustness," Oper. Res., vol. 52, no. 1,
pp. 35-53, Jan-Feb 2004, doi: 10.1287/opre.1030.0065.
[369] C. Ning and F. You, "Data-driven adaptive nested robust optimization: General
modeling framework and efficient computational algorithm for decision making
under uncertainty," AIChE J., vol. 63, no. 9, pp. 3790-3817, 2017, doi:
10.1002/aic.15717.
[370] İ. Yanıkoğlu, B. L. Gorissen, and D. den Hertog, "A Survey of Adjustable Robust
Optimization," Eur. J. Oper. Res., 2018, doi: 10.1016/j.ejor.2018.08.031.
[371] D. Bertsimas and I. Dunning, "Multistage Robust Mixed-Integer Optimization
with Adaptive Partitions," Oper. Res., vol. 64, no. 4, pp. 980-998, 2016, doi:
10.1287/opre.2016.1515.
[372] G. C. Calafiore, "Multi-period portfolio optimization with linear control
policies," Automatica, vol. 44, no. 10, pp. 2463-2473, Oct 2008, doi:
10.1016/j.automatica.2008.02.007.
[373] H. Bannister, B. Goldys, S. Penev, and W. Wu, "Multiperiod mean-standard-
deviation time consistent portfolio selection," Automatica, vol. 73, pp. 15-26,
Nov 2016, doi: 10.1016/j.automatica.2016.06.021.
[374] G. C. Calafiore, "Direct data-driven portfolio optimization with guaranteed
shortfall probability," Automatica, vol. 49, no. 2, pp. 370-380, Feb 2013, doi:
10.1016/j.automatica.2012.11.012.
[375] F. Oldewurtel, R. Gondhalekar, C. N. Jones, and M. Morari, "Blocking
Parameterizations for Improving the Computational Tractability of Affine
Disturbance Feedback MPC Problems," in Proceedings of the 48th IEEE
Conference on Decision and Control, held jointly with the 2009 28th
Chinese Control Conference, 2009, pp. 7381-7386.
[376] P. J. Goulart, E. C. Kerrigan, and J. A. Maciejowski, "Optimization over state
feedback policies for robust control with constraints," Automatica, vol. 42, no.
4, pp. 523-533, Apr 2006, doi: 10.1016/j.automatica.2005.08.023.
[377] K. Postek and D. den Hertog, "Multistage Adjustable Robust Mixed-Integer
Optimization via Iterative Splitting of the Uncertainty Set," INFORMS J.
Comput., vol. 28, no. 3, pp. 553-574, 2016, doi: 10.1287/ijoc.2016.0696.
[378] D. Bertsimas and F. de Ruiter, "Duality in Two-Stage Adaptive Linear
Optimization: Faster Computation and Stronger Bounds," INFORMS J. Comput.,
vol. 28, no. 3, pp. 500-511, 2016, doi: 10.1287/ijoc.2016.0689.
[379] G. C. Calafiore, "An affine control method for optimal dynamic asset allocation
with transaction costs," SIAM Journal on Control and Optimization, vol. 48, no.
4, pp. 2254-2274, 2009, doi: 10.1137/080723776.
[380] A. Lorca, X. A. Sun, E. Litvinov, and T. Zheng, "Multistage adaptive robust
optimization for the unit commitment problem," Oper. Res., vol. 64, no. 1, pp.
32-51, 2016.
[381] D. Bertsimas and A. Georghiou, "Design of near optimal decision rules in
multistage adaptive mixed-integer optimization," Oper. Res., vol. 63, no. 3, pp.
610-627, 2015, doi: 10.1287/opre.2015.1365.
[382] D. Bertsimas and V. Goyal, "On the power and limitations of affine policies in
two-stage adaptive optimization," Math. Program., vol. 134, no. 2, pp. 491-531,
2012, doi: 10.1007/s10107-011-0444-4.
[383] D. Bertsimas, D. A. Iancu, and P. A. Parrilo, "A Hierarchy of Near-Optimal
Policies for Multistage Adaptive Optimization," IEEE Trans. Autom. Control.,
vol. 56, no. 12, pp. 2803-2818, Dec 2011, doi: 10.1109/tac.2011.2162878.
[384] D. Bertsimas and C. Caramanis, "Finite Adaptability in Multistage Linear
Optimization," IEEE Trans. Autom. Control., vol. 55, no. 12, pp. 2751-2766,
2010, doi: 10.1109/TAC.2010.2049764.
[385] G. A. Hanasusanto, D. Kuhn, and W. Wiesemann, "K-Adaptability in Two-
Stage Robust Binary Programming," Oper. Res., vol. 63, no. 4, pp. 877-891,
2015, doi: 10.1287/opre.2015.1392.
[386] A. Ardestani-Jaafari and E. Delage, "Linearized robust counterparts of two-stage
robust optimization problem with applications in operations management,"
Manuscript, HEC Montreal, 2016.
[387] G. Xu and S. Burer, "A copositive approach for two-stage adjustable robust
optimization with uncertain right-hand sides," Comput. Optim. Appl., vol. 70,
no. 1, pp. 33-59, 2018, doi: 10.1007/s10589-017-9974-x.
[388] A. Takeda, S. Taguchi, and R. H. Tütüncü, "Adjustable Robust Optimization
Models for a Nonlinear Two-Period System," J. Optim. Theory Appl., vol. 136,
no. 2, pp. 275-295, 2008, doi: 10.1007/s10957-007-9288-8.
[389] A. Thiele, T. Terry, and M. Epelman, "Robust linear optimization with
recourse," Tech. Rep., pp. 4-37, 2009.
[390] A. Georghiou, A. Tsoukalas, and W. Wiesemann, "A Primal-Dual Lifting
Scheme for Two-Stage Robust Optimization," Optimization Online, 2017.
[391] M. Bodur and J. Luedtke, "Two-stage Linear Decision Rules for Multi-stage
Stochastic Programming," arXiv preprint arXiv:1701.04102, 2017.
[392] J. Zou, S. Ahmed, and X. A. Sun, "Stochastic dual dynamic integer
programming," Math. Program., 2018, doi: 10.1007/s10107-018-1249-5.
[393] C. Ning and F. You, "A Transformation-Proximal Bundle Algorithm for
Solving Multistage Adaptive Robust Optimization Problems," in 2018 IEEE
57th Conference on Decision and Control (CDC), Miami Beach, FL, USA,
Dec. 2018, pp. 2439-2444.
[394] J.-B. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization
algorithms I: Fundamentals. Springer science & business media, 2013.
[395] C. Lemarechal and C. Sagastizabal, "Practical aspects of the Moreau-Yosida
regularization: Theoretical preliminaries," SIAM J. Optim., vol. 7, no. 2, pp. 367-
385, May 1997, doi: 10.1137/s1052623494267127.
[396] X. Chen and Y. Zhang, "Uncertain Linear Programs: Extended Affinely
Adjustable Robust Counterparts," Oper. Res., vol. 57, no. 6, pp. 1469-1482,
Nov-Dec 2009, doi: 10.1287/opre.1080.0605.
[397] K. C. Kiwiel, "An Inexact Bundle Approach to Cutting-Stock Problems,"
INFORMS J. Comput., vol. 22, no. 1, pp. 131-143, 2010, doi:
10.1287/ijoc.1090.0326.
[398] W. van Ackooij, N. Lebbe, and J. Malick, "Regularized decomposition of large
scale block-structured robust optimization problems," Comput. Manag. Sci., vol.
14, no. 3, pp. 393-421, 2017, doi: 10.1007/s10287-017-0281-x.
[399] A. Ruszczynski and A. Swietanowski, "Accelerating the regularized
decomposition method for two stage stochastic linear problems," Eur. J. Oper.
Res., vol. 101, no. 2, pp. 328-342, Sep 1997, doi: 10.1016/s0377-
2217(96)00401-8.
[400] K. C. Kiwiel, "A Proximal Bundle Method with Approximate Subgradient
Linearizations," SIAM J. Optim., vol. 16, no. 4, pp. 1007-1023, 2006, doi:
10.1137/040603929.
[401] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, Robust optimization. Princeton
University Press, 2009.
[402] L. A. Wolsey, Integer programming. Wiley, 1998.
[403] A. Belloni, "Lecture Notes for IAP 2005 Course Introduction to Bundle
Methods," Operation Research Center, MIT, Version of February, 2005.
[404] M. J. Hadjiyiannis, P. J. Goulart, and D. Kuhn, "A scenario approach for
estimating the suboptimality of linear decision rules in two-stage robust
optimization," in 2011 50th IEEE Conference on Decision and Control and
European Control Conference, Dec. 2011, pp. 7386-7391, doi:
10.1109/CDC.2011.6161342.
[405] A. Ardestani-Jaafari and E. Delage, "The Value of Flexibility in Robust
Location-Transportation Problems," Transp. Sci., vol. 52, no. 1, pp. 189-209,
Jan-Feb 2018, doi: 10.1287/trsc.2016.0728.
[406] A. Ben-Tal, B. Golany, and S. Shtern, "Robust multi-echelon multi-period
inventory control," Eur. J. Oper. Res., vol. 199, no. 3, pp. 922-935, Dec 2009,
doi: 10.1016/j.ejor.2009.01.058.
[407] D. Bertsimas and A. Thiele, "A robust optimization approach to inventory
theory," Oper. Res., vol. 54, no. 1, pp. 150-168, Jan-Feb 2006, doi:
10.1287/opre.1050.0238.
[408] J. D. Schwartz, W. L. Wang, and D. E. Rivera, "Simulation-based optimization
of process control policies for inventory management in supply chains,"
Automatica, vol. 42, no. 8, pp. 1311-1320, Aug 2006, doi:
10.1016/j.automatica.2006.03.019.
[409] A. Georghiou, A. Tsoukalas, and W. Wiesemann, "Robust Dual Dynamic
Programming," Available on Optimization Online, 2016.
[410] C.-T. See and M. Sim, "Robust Approximation to Multiperiod Inventory
Management," Oper. Res., vol. 58, no. 3, pp. 583-594, 2010, doi:
10.1287/opre.1090.0746.
[411] F. Maggioni, M. Bertocchi, F. Dabbene, and R. Tempo, "Sampling methods for
multistage robust convex optimization problems," arXiv preprint
arXiv:1611.00980, 2016.
[412] F. You and I. E. Grossmann, "Stochastic inventory management for tactical
process planning under uncertainties: MINLP models and algorithms," AIChE
J., vol. 57, no. 5, pp. 1250-1277, 2011, doi: 10.1002/aic.12338.
[413] C. Ning and F. You, "A Data-Driven Multistage Adaptive Robust Optimization
Framework for Planning and Scheduling under Uncertainty," AIChE J., vol. 63,
no. 10, pp. 4343–4369, 2017, doi: 10.1002/aic.15792.
[414] D. A. Van Dyk and X.-L. Meng, "The art of data augmentation," Journal of
Computational and Graphical Statistics, vol. 10, no. 1, pp. 1-50, 2001.
[415] S. Hauberg, O. Freifeld, A. B. L. Larsen, J. Fisher, and L. Hansen, "Dreaming
more data: Class-dependent distributions over diffeomorphisms for learned data
augmentation," in Artificial Intelligence and Statistics, 2016, pp. 342-350.